Handbook of Quantitative Finance and Risk Management
Editors
Cheng-Few Lee, Rutgers University, USA
Alice C. Lee, State Street Corp., USA
John Lee, Center for PBBEF Research, USA

Advisory Board
Ivan Brick, Rutgers University, USA
Stephen Brown, New York University, USA
Charles Q. Cao, Penn State University, USA
Chun-Yen Chang, National Chiao Tung University, Taiwan
Wayne Ferson, Boston College, USA
Lawrence R. Glosten, Columbia University, USA
Martin J. Gruber, New York University, USA
Hyley Huang, Wintek Corporation, Taiwan
Richard E. Kihlstrom, University of Pennsylvania, USA
E. H. Kim, University of Michigan, USA
Robert McDonald, Northwestern University, USA
Ehud I. Ronn, University of Texas at Austin, USA
Disclaimer: Any views or opinions presented in this publication are solely those of the authors and do not necessarily represent those of State Street Corporation. State Street Corporation is not associated in any way with this publication and accepts no liability for the contents of this publication.
Editors

Cheng-Few Lee
Department of Finance and Economics, Rutgers University
Janice H. Levin Bldg., 94 Rockafeller Road
New Brunswick, NJ 08854-8054, USA
[email protected]

John Lee
Center for PBBEF Research
North Brunswick, NJ, USA
[email protected]

Alice C. Lee
State Street Corp.
Boston, MA, USA
[email protected]
ISBN 978-0-387-77116-8
e-ISBN 978-0-387-77117-5
DOI 10.1007/978-0-387-77117-5
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2010921816

© Springer Science+Business Media, LLC 2010
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Quantitative finance and risk management combines economics, accounting, statistics, econometrics, mathematics, stochastic processes, and computer science and technology. This handbook is the most comprehensive handbook in quantitative finance and risk management, integrating theory, methodology, and application. Because of the importance of quantitative finance and risk management in the finance industry, the subject has become one of the most popular in business schools and in departments of mathematics, operations research, and statistics. In addition, the finance industry offers many job opportunities for people with good training in quantitative finance and risk management. This handbook should therefore have a broad audience and be of interest to academics, educators, students, and practitioners.

Based on our years of experience in industry, teaching, research, textbook writing, and journal editing on the subject of quantitative finance and risk management, this handbook reviews, discusses, and integrates theoretical, methodological, and practical issues in the field. It is organized into five parts:

Part I. Overview of Quantitative Finance and Risk Management Research
Part II. Portfolio Theory and Investment Analysis
Part III. Options and Option Pricing Theory
Part IV. Risk Management
Part V. Theory, Methodology, and Applications

Part I covers three chapters: "Chapter 1. Theoretical Framework of Finance," "Chapter 2. Investment, Dividend, Financing, and Production Policies," and "Chapter 3. Research Methods in Quantitative Finance and Risk Management." Part II covers 18 chapters on portfolio theory and investment analysis. Part III includes 21 chapters on options and option pricing theory. Part IV includes 23 chapters on theory and practice in risk management. Finally, Part V covers 44 chapters on theory, methodology, and applications in quantitative finance and risk management.

In preparing this handbook, we would first like to thank the members of the advisory board and the contributors. In addition, we note and appreciate the extensive help of our editor, Ms. Judith Pforr, our research assistants Hong-Yi Chen, Wei-Kang Shih, and Shin-Ying Mai, and our secretary, Ms. Miranda Mei-Lan Luo. Finally, we would like to thank the Wintek Corporation and the Polaris Financial Group for the financial support that allowed us to write this book.

There are undoubtedly some errors in the finished product, both typographical and conceptual. We invite readers to send suggestions, comments, criticisms, and corrections to Professor Cheng-Few Lee at the Department of Finance and Economics, Rutgers University, Janice H. Levin Building Room 141, Rockafeller Road, Piscataway, NJ 08854-8054.

New Brunswick, NJ    Cheng-Few Lee
Boston, MA           Alice C. Lee
North Brunswick, NJ  John Lee
About the Editors
Cheng-Few Lee is Distinguished Professor of Finance at Rutgers Business School, Rutgers University, and was chairperson of the Department of Finance from 1988 to 1995. He has also served on the faculty of the University of Illinois (IBE Professor of Finance) and the University of Georgia. He has maintained academic and consulting ties in Taiwan, Hong Kong, China, and the United States for the past three decades. He has been a consultant to many prominent groups, including the American Insurance Group, the World Bank, the United Nations, The Marmon Group Inc., Wintek Corporation, and Polaris Financial Group.

Professor Lee founded the Review of Quantitative Finance and Accounting (RQFA) in 1990 and the Review of Pacific Basin Financial Markets and Policies (RPBFMP) in 1998, and serves as managing editor of both journals. He was also a co-editor of the Financial Review (1985–1991) and the Quarterly Review of Economics and Business (1987–1989). In the past 36 years, Dr. Lee has written numerous textbooks on subjects ranging from financial management, corporate finance, security analysis, and portfolio management to financial analysis, planning and forecasting, and business statistics. In addition, he edited a popular book entitled Encyclopedia of Finance (with Alice C. Lee). Dr. Lee has also published more than 170 articles in more than 20 different journals in finance, accounting, economics, statistics, and management. Professor Lee was ranked the most published finance professor worldwide for the period 1953–2008.

Professor Lee was the intellectual force behind the creation of the new Masters of Quantitative Finance program at Rutgers University. This program began in 2001 and has been ranked as one of the top ten quantitative finance programs in the United States. These top ten programs are located at Carnegie Mellon University, Columbia University, Cornell University, New York University, Princeton University, Rutgers University, Stanford University, University of California at Berkeley, University of Chicago, and University of Michigan.

Alice C. Lee is currently a Director in the Model Validation Group, Enterprise Risk Management, at State Street Corporation. Most recently, she was an Assistant Professor of Finance at San Francisco State University. She has more than 20 years of experience and a diverse background that includes academia, engineering, sales, and management consulting. Her primary areas of teaching and research are corporate finance and financial institutions. She is coauthor of Statistics for Business and Financial Economics, 2e (with Cheng F. Lee and John C. Lee) and Financial Analysis, Planning and Forecasting, 2e (with Cheng F. Lee and John C. Lee). In addition, she has co-edited other annual publications, including Advances in Investment Analysis and Portfolio Management (with Cheng F. Lee).

John C. Lee is a Microsoft Certified Professional in Microsoft Visual Basic and Microsoft Excel VBA. He has bachelor's and master's degrees in accounting from the University of Illinois at Urbana-Champaign. John has more than 20 years' experience in both the business and technical fields as an accountant, auditor, systems analyst, and business software developer. He has authored a book on how to use MINITAB and Microsoft Excel to do statistical analysis; this book is a companion text to Statistics for Business and Financial Economics, of which he is one of the co-authors. John has been a senior technology officer at Chase Manhattan Bank and an assistant vice president at Merrill Lynch. He is currently Director of the Center for PBBEF Research.
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part I
v
Overview of Quantitative Finance and Risk Management Research
1
Theoretical Framework of Finance : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Discounted Cash-Flow Valuation Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3 M and M Valuation Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4 Markowitz Portfolio Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.5 Capital Asset Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.6 Arbitrage Pricing Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.7 Option Valuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.8 Futures Valuation and Hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3 3 3 6 10 10 12 14 15 22 22
2
Investment, Dividend, Financing, and Production Policies: Theory and Implications : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :
23
2.1 2.2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Investment and Dividend Interactions: The Internal Versus External Financing Decision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3 Interactions Between Dividend and Financing Policies . . . . . . . . . . . . . . . . . . 2.4 Interactions Between Financing and Investment Decisions . . . . . . . . . . . . . . . 2.5 Implications of Financing and Investment Interactions for Capital Budgeting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.6 Implications of Different Policies on the Beta Coefficient . . . . . . . . . . . . . . . . 2.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 2A Stochastic Dominance and its Applications to Capital-Structure Analysis with Default Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2A.2 Concepts and Theorems of Stochastic Dominance . . . . . . . . . . . . . . . 2A.3 Stochastic-Dominance Approach to Investigating the Capital-Structure Problem with Default Risk . . . . . . . . . . . . . . . . . . . 2A.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
23 23 25 28 30 34 36 36 38 38 38 39 40
ix
x
3
Contents
Research Methods in Quantitative Finance and Risk Management : : : : : : : : : : 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3 Econometrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4 Mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5 Other Disciplines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Part II
41 41 41 43 46 48 49 50
Portfolio Theory and Investment Analysis
4
Foundation of Portfolio Theory : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 5: 3 Cheng-Few Lee, Alice C. Lee, and John Lee 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4.2 Risk Classification and Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4.3 Portfolio Analysis and Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.4 The Efficient Portfolio and Risk Diversification . . . . . . . . . . . . . . . . . . . . . . . . 60 4.5 Determination of Commercial Lending Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4.6 The Market Rate of Return and Market Risk Premium . . . . . . . . . . . . . . . . . . . 66 4.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5
Risk-Aversion, Capital Asset Allocation, and Markowitz Portfolio-Selection Model : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Cheng-Few Lee, Joseph E. Finnerty, and Hong-Yi Chen 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Measurement of Return and Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3 Utility Theory, Utility Functions, and Indifference Curves . . . . . . . . . . . . . . . 5.4 Efficient Portfolios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6
7
Capital Asset Pricing Model and Beta Forecasting : : : : : : : : : : : : : : : : : : : : : : : : Cheng-Few Lee, Joseph E. Finnerty, and Donald H. Wort 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 A Graphical Approach to the Derivation of the Capital Asset Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 Mathematical Approach to the Derivation of the Capital Asset Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4 The Market Model and Risk Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5 Growth Rates, Accounting Betas, and Variance in EBIT . . . . . . . . . . . . . . . . . 6.6 Some Applications and Implications of the Capital Asset Pricing Model . . . 6.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 6A Empirical Evidence for the Risk-Return Relationship . . . . . . . . . . . . . Appendix 6B Anomalies in the Semi-strong Efficient-Market Hypothesis . . . . . . . . Index Models for Portfolio Selection : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Cheng-Few Lee, Joseph E. Finnerty, and Donald H. Wort 7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2 The Single-Index Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3 Multiple Indexes and the Multiple-Index Model . . . . . . . . . . . . . . . . . . . . . . . . 7.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
69 69 69 71 77 91 91 93 93 93 96 97 100 104 105 105 106 109 111 111 111 118 121 122
Contents
xi
Appendix 7A A Linear-Programming Approach to Portfolio-Analysis Models . . . . 122 Appendix 7B Expected Return, Variance, and Covariance for a Multi-index Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 Performance-Measure Approaches for Selecting Optimum Portfolios : : : : : : : : Cheng-Few Lee, Hong-Yi Chen, and Jessica Shin-Ying Mai 8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 Sharpe Performance-Measure Approach with Short Sales Allowed . . . . . . . . 8.3 Treynor-Measure Approach with Short Sales Allowed . . . . . . . . . . . . . . . . . . . 8.4 Treynor-Measure Approach with Short Sales Not Allowed . . . . . . . . . . . . . . . 8.5 Impact of Short Sales on Optimal-Weight Determination . . . . . . . . . . . . . . . . 8.6 Economic Rationale of the Treynor Performance-Measure Method . . . . . . . . 8.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 8A Derivation of Equation (8.6) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 8B Derivation of Equation (8.10) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 8C Derivation of Equation (8.15) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
125
The Creation and Control of Speculative Bubbles in a Laboratory Setting : : : : James S. Ang, Dean Diavatopoulos, and Thomas V. Schwarz 9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2 Bubbles in the Asset Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.3 Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.4 Results and Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
137
10 Portfolio Optimization Models and Mean–Variance Spanning Tests : : : : : : : : : : Wei-Peng Chen, Huimin Chung, Keng-Yu Ho, and Tsui-Ling Hsu 10.1 Introduction of Markowitz Portfolio-Selection Model . . . . . . . . . . . . . . . . . . . 10.2 Measurement of Return and Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3 Efficient Portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.4 Mean–Variance Spanning Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.5 Alternative Computer Program to Calculate Efficient Frontier . . . . . . . . . . . . 10.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
165
11 Combining Fundamental Measures for Stock Selection : : : : : : : : : : : : : : : : : : : : Kenton K. Yee 11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Bayesian Triangulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.3 Triangulation in Forensic Valuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.4 Bayesian Triangulation in Asset Pricing Settings . . . . . . . . . . . . . . . . . . . . . . . 11.5 The Data Snooping Trap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.6 Using Guidance from Theory to Mitigate Data Snooping . . . . . . . . . . . . . . . . 11.7 Avoiding Data-Snooping Pitfalls in Financial Statement Analysis . . . . . . . . . 11.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 11A Proof of Theorem 11.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11A.1 Generalization of Theorem 11.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
185
8
9
125 125 128 130 132 132 133 133 133 134 135
137 139 140 145 161 163
165 166 166 172 175 182 184
185 187 189 190 194 195 197 199 200 201 201
xii
Contents
12 On Estimation Risk and Power Utility Portfolio Selection : : : : : : : : : : : : : : : : : : Robert R. Grauer and Frederick C. Shen 12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.3 The Multiperiod Investment Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.4 The Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.5 Alternative Ways of Estimating the Joint Return Distribution . . . . . . . . . . . . . 12.6 Alternate Ways of Evaluating Investment Performance . . . . . . . . . . . . . . . . . . 12.7 The Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.9 Addendum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
203
13 International Portfolio Management: Theory and Method : : : : : : : : : : : : : : : : : : Wan-Jiun Paul Chiou and Cheng-Few Lee 13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.2 Overview of International Portfolio Management . . . . . . . . . . . . . . . . . . . . . . . 13.3 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.4 Forming the Optimal Global Portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.5 The Benefits of International Diversification Around the World . . . . . . . . . . . 13.6 The Optimal Portfolio Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
221
14 The Le Chatelier Principle in the Markowitz Quadratic Programming Investment Model: A Case of World Equity Fund Market : : : : : : : : : : : : : : : : : : Chin W. Yang, Ken Hung, and Jing Cui 14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14.2 Data and Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14.3 The Le Châtelier Principle in the Markowitz Investment Model . . . . . . . . . . . 14.4 An Application of the Le Châtelier Principle in the World Equity Market . . . 14.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
203 203 205 206 206 208 210 216 217 218
221 222 226 226 227 229 232 233 235 235 236 236 237 245 245
15 Risk-Averse Portfolio Optimization via Stochastic Dominance Constraints : : : : Darinka Dentcheva and Andrzej Ruszczy´nski 15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.2 The Portfolio Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.3 Stochastic Dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.4 The Dominance-Constrained Portfolio Problem . . . . . . . . . . . . . . . . . . . . . . . . 15.5 Optimality and Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.6 Numerical Illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
247
16 Portfolio Analysis : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Jack Clark Francis 16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.2 Inputs for Portfolio Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.3 The Security Analyst’s Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.4 Four Assumptions Underlying Portfolio Analysis . . . . . . . . . . . . . . . . . . . . . . . 16.5 Different Approaches to Diversification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.6 A Portfolio’s Expected Return Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.7 The Quadratic Risk Formula for a Portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.8 The Covariance Between Returns from Two Assets . . . . . . . . . . . . . . . . . . . . .
259
247 248 249 252 254 256 257 257
259 259 259 260 260 261 261 262
Contents
xiii
16.9 Portfolio Analysis of a Two-Asset Portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.10 Mathematical Portfolio Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16.11 Calculus Minimization of Risk: A Three-Security Portfolio . . . . . . . . . . . . . . 16.12 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
262 265 265 266 266
17 Portfolio Theory, CAPM and Performance Measures : : : : : : : : : : : : : : : : : : : : : : Luis Ferruz, Fernando Gómez-Bezares, and María Vargas 17.1 Portfolio Theory and CAPM: Foundations and Current Application . . . . . . . 17.2 Performance Measures Related to Portfolio Theory and the CAPM: Classic Indices, Derivative Indices, and New Approaches . . . . . . . . . . . . . . . . . . . . . . . 17.3 Empirical Analysis: Performance Rankings and Performance Persistence . . . 17.4 Summary and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
267
18 Intertemporal Equilibrium Models, Portfolio Theory and the Capital Asset Pricing Model : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Stephen J. Brown 18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18.2 Intertemporal Equilibrium Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18.3 Relationship to Observed Security Returns . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18.4 Intertemporal Equilibrium and the Capital Asset Pricing Model . . . . . . . . . . . 18.5 Hansen Jagannathan Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18.6 Are Stochastic Discount Factors Positive? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
267 274 277 280 280 283 283 283 284 285 285 286 286 287
19 Persistence, Predictability, and Portfolio Planning : : : : : : : : : : : : : : : : : : : : : : : : Michael J. Brennan and Yihong Xia 19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.2 Detecting and Exploiting Predictability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.3 Stock Price Variation and Variation in the Expected Returns . . . . . . . . . . . . . . 19.4 Economic Significance of Predictability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.5 Forecasts of Equity Returns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 19A The Optimal Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 19B The Unconditional Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 19C The Myopic Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 19D The Optimal Buy-and-Hold Strategy . . . . . . . . . . . . . . . . . . . . . . . . . .
289
20 Portfolio Insurance Strategies: Review of Theory and Empirical Studies : : : : : Lan-chih Ho, John Cadle, and Michael Theobald 20.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.2 Theory of Alternative Portfolio Insurance Strategies . . . . . . . . . . . . . . . . . . . . 20.3 Empirical Comparison of Alternative Portfolio Insurance Strategies . . . . . . . 20.4 Recent Market Developments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.5 Implications for Financial Market Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
319
289 290 296 298 303 314 314 315 316 317 317
319 319 324 329 331 332 332
xiv
Contents
21 Security Market Microstructure: The Analysis of a Non-Frictionless Market : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Reto Francioni, Sonali Hazarika, Martin Reck, and Robert A. Schwartz 21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 Microstructure’s Challenge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.3 The Perfectly Liquid Environment of CAPM . . . . . . . . . . . . . . . . . . . . . . . . . . 21.4 What Microstructure Analysis Has to Offer: Personal Reflections . . . . . . . . . 21.5 From Theory to Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.6 Deutsche Börse: The Emergence of a Modern, Electronic Market . . . . . . . . . 21.7 Conclusion: The Roadmap and the Road . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 21A Risk Aversion and Risk Premium Measures . . . . . . . . . . . . . . . . . . . . 21A.1 Risk Aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21A.2 Risk Premiums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 21B Designing Xetra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21B.1 Continuous Trading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21B.2 Call Auction Trading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21B.3 Electronic Trading for Less Liquid Stocks . . . . . . . . . . . . . . . . . . . . . . 21B.4 Xetra’s Implementation and the Migration of Liquidity to Xetra Since 1997 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part III
333 333 334 335 339 344 345 347 347 349 349 349 350 350 351 351 352
Options and Option Pricing Theory
22 Options Strategies and Their Applications : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 355 :: Cheng Few Lee, John Lee, and Wei-Kang Shih 22.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 22.2 The Option Market and Related Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 22.3 Put-Call Parity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 22.4 Risk-Return Characteristics of Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363 22.5 Examples of Alternative Option Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372 22.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375 23 Option Pricing Theory and Firm Valuation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Cheng Few Lee, Joseph E. Finnerty, and Wei-Kang Shih 23.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23.2 Basic Concepts of Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23.3 Factors Affecting Option Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23.4 Determining the Value of Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23.5 Option Pricing Theory and Capital Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 23.6 Warrants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Applications of the Binomial Distribution to Evaluate Call Options : : : : : : : : : : Alice C. Lee, John Lee, and Jessica Shin-Ying Mai 24.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24.2 What Is an Option? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24.3 The Simple Binomial Option Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 24.4 The Generalized Binomial Option Pricing Model . . . . . . . . . . . . . . . . . . . . . . . 24.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
377 377 377 380 384 387 390 391 392 393 393 393 393 395 397 397
Contents
xv
25 Multinomial Option Pricing Model : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Cheng Few Lee and Jack C. Lee 25.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25.2 Multinomial Option Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25.3 A Lattice Framework for Option Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 25A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 Two Alternative Binomial Option Pricing Model Approaches to Derive Black-Scholes Option Pricing Model : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Cheng-Few Lee and Carl Shu-Ming Lin 26.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26.2 The Two-State Option Pricing Model of Rendleman and Bartter . . . . . . . . . . 26.3 The Binomial Option Pricing Model of Cox, Ross, and Rubinstein . . . . . . . . 26.4 Comparison of the Two Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 26A The Binomial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Normal, Lognormal Distribution and Option Pricing Model : : : : : : : : : : : : : : : : Cheng Few Lee, Jack C. Lee, and Alice C. Lee 27.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27.2 The Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27.3 The Lognormal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27.4 The Lognormal Distribution and Its Relationship to the Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27.5 Multivariate Normal and Lognormal Distributions . . . . . . . . . . . . . . . . . . . . . . 27.6 The Normal Distribution as an Application to the Binomial and Poisson Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27.7 Applications of the Lognormal Distribution in Option Pricing . . . . . . . . . . . . 27.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 Bivariate Option Pricing Models : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Cheng Few Lee, Alice C. Lee, and John Lee 28.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28.2 The Bivariate Normal Density Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
28.3 American Call Option and the Bivariate Normal CDF . . . . . . . . . . . . . . . . . . . 28.4 Valuating American Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28.5 Non-Dividend-Paying Stocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28.6 Dividend-Paying Stocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 Displaced Log Normal and Lognormal American Option Pricing: A Comparison : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Ren-Raw Chen and Cheng-Few Lee 29.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29.2 The American Option Pricing Model Under the Lognormal Process . . . . . . . 29.3 The Geske-Roll-Whaley Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 29A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
399 399 399 402 406 406 406 409 409 409 415 417 418 418 419 421 421 421 422 422 423 425 426 428 428 429 429 429 430 431 433 433 438 438 439 439 439 440 442 442 443
xvi
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model : George Chalamandaris and A.G. Malliaris 30.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30.2 The ITÔ Process and Financial Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30.3 ITÔ’S Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30.4 Stochastic Differential-Equation Approach to Stock-price Behavior . . . . . . . 30.5 The Pricing of an Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30.6 A Reexamination of Option Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30.7 Extending the Risk-Neutral Argument: The Martingale Approach . . . . . . . . . 30.8 Remarks on Option Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 30A An Alternative Method To Derive the Black–Scholes Option-Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30A.1 Assumptions and the Present Value of the Expected Terminal Option Price . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30A.2 Present Value of the Partial Expectation of the Terminal Stock Price . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30A.3 Present Value of the Exercise Price under Uncertainty . . . . . . . . . . . . 31 Constant Elasticity of Variance Option Pricing Model: Integration and Detailed Derivation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Y.L. Hsu, T.I. Lin, and C.F. Lee 31.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.2 The CEV Diffusion and Its Transition Probability Density Function . . . . . . . 31.3 Review of Noncentral Chi-Square Distribution . . . . . . . . . . . . . . . . . . . . . . . . . 31.4 The Noncentral Chi-square Approach to Option Pricing Model . . . . . . . . . . . 31.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 31A Proof of Feller’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Contents
447 447 447 451 452 454 455 458 463 465 465 466 466 467 469 471 471 471 473 474 478 478 478
32 Stochastic Volatility Option Pricing Models : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Cheng Few Lee and Jack C. Lee 32.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32.2 Nonclosed-Form Type of Option Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . 32.3 Review of Characteristic Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32.4 Closed-Form Type of Option Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 32.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 32A The Market Price of the Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
481
33 Derivations and Applications of Greek Letters: Review and Integration : : : : : : Hong-Yi Chen, Cheng-Few Lee, and Weikang Shih 33.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33.2 Delta () . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33.3 Theta .‚/ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33.4 Gamma ./ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33.5 Vega ./ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33.6 Rho ./ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33.7 Derivation of Sensitivity for Stock Options Respective with Exercise Price . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33.8 Relationship Between Delta, Theta, and Gamma . . . . . . . . . . . . . . . . . . . . . . . 33.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
491
481 481 485 485 489 489 489
491 491 494 496 498 500 501 502 503 503
Contents
xvii
34 A Further Analysis of the Convergence Rates and Patterns of the Binomial Models : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : San-Lin Chung and Pai-Ta Shih 34.1 Brief Review of the Binomial Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34.2 The Importance of Node Positioning for Monotonic Convergence . . . . . . . . . 34.3 The Flexibility of GCRR Model for Node Positioning . . . . . . . . . . . . . . . . . . . 34.4 Numerical Results of Various GCRR Models . . . . . . . . . . . . . . . . . . . . . . . . . . 34.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 34A Extrapolation Formulas for Various GCRR Models . . . . . . . . . . . . . . 35 Estimating Implied Probabilities from Option Prices and the Underlying : : : : : Bruce Mizrach 35.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.2 Black Scholes Baseline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.3 Empirical Departures from Black Scholes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.4 Beyond Black Scholes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.5 Histogram Estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.6 Tree Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.7 Local Volatility Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.8 PDF Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.9 Inferences from the Mixture Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.10 Jump Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35.11 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Are Tails Fat Enough to Explain Smile : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Ren-Raw Chen, Oded Palmon, and John Wald 36.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36.3 The Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36.4 Data and Empirical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 36A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
36A.1 The Derivation of the Lognormal Model Under No Rebalancing . . . 36A.2 Continuous Rebalancing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36A.3 Smoothing Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36A.4 Results of Sub-Sample Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Option Pricing and Hedging Performance Under Stochastic Volatility and Stochastic Interest Rates : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : Gurdip Bakshi, Charles Cao, and Zhiwu Chen 37.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37.2 The Option Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37.3 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37.4 Empirical Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix 37A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
505 505 506 507 507 510 513 513 515 515 516 517 518 518 520 522 522 524 526 528 528 531 531 532 533 537 541 541 542 542 543 543 544
547 547 549 556 557 571 571 572
xviii
Contents
38 Application of the Characteristic Function in Financial Research : : : : : : : : : : : H.W. Chuang, Y.L. Hsu, and C.F. Lee 38.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38.2 The Characteristic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38.3 CEV Option Pricing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38.4 Options with Stochastic Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
575
39 Asian Options ..... 583
Itzhak Venezia
39.1 Introduction ..... 583
39.2 Valuation ..... 584
39.3 Conclusion ..... 586
References ..... 586

40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution ..... 587
Kehluh Wang and Ming-Feng Hsu
40.1 Introduction ..... 587
40.2 Definitions and the Basic Binomial Model ..... 588
40.3 Edgeworth Binomial Model for Asian Option Valuation ..... 589
40.4 Upper Bound and Lower Bound for European Asian Options ..... 591
40.5 Upper Bound and Lower Bound for American Asian Options ..... 593
40.6 Numerical Examples ..... 594
40.7 Conclusion ..... 602
References ..... 602

41 The Valuation of Uncertain Income Streams and the Pricing of Options ..... 605
Mark Rubinstein
41.1 Introduction ..... 605
41.2 Uncertain Income Streams: General Case ..... 606
41.3 Uncertain Income Streams: Special Case ..... 608
41.4 Options ..... 611
41.5 Conclusion ..... 613
References ..... 613
Appendix 41A The Bivariate Normal Density Function ..... 614

42 Binomial OPM, Black-Scholes OPM and Their Relationship: Decision Tree and Microsoft Excel Approach ..... 617
John Lee
42.1 Introduction ..... 617
42.2 Call and Put Options ..... 617
42.3 One Period Option Pricing Model ..... 618
42.4 Two-Period Option Pricing Model ..... 621
42.5 Using Microsoft Excel to Create the Binomial Option Trees ..... 622
42.6 Black-Scholes Option Pricing Model ..... 624
42.7 Relationship Between the Binomial OPM and the Black-Scholes OPM ..... 625
42.8 Decision Tree Black-Scholes Calculation ..... 626
42.9 Conclusion ..... 626
References ..... 627
Appendix 42A Excel VBA Code: Binomial Option Pricing Model ..... 627
Part IV Risk Management
43 Combinatorial Methods for Constructing Credit Risk Ratings ..... 639
Alexander Kogan and Miguel A. Lejeune
43.1 Introduction ..... 639
43.2 Logical Analysis of Data: An Overview ..... 641
43.3 Absolute Creditworthiness: Credit Risk Ratings of Financial Institutions ..... 643
43.4 Relative Creditworthiness: Country Risk Ratings ..... 648
43.5 Conclusions ..... 659
References ..... 660
Appendix 43A ..... 662

44 The Structural Approach to Modeling Credit Risk ..... 665
Jing-zhi Huang
44.1 Introduction ..... 665
44.2 Structural Credit Risk Models ..... 665
44.3 Empirical Evidence ..... 668
44.4 Conclusion ..... 671
References ..... 671

45 An Empirical Investigation of the Rationales for Integrated Risk-Management Behavior ..... 675
Michael S. Pagano
45.1 Introduction ..... 675
45.2 Theories of Risk-Management, Previous Research, and Testable Hypotheses ..... 677
45.3 Data, Sample Selection, and Empirical Methodology ..... 685
45.4 Empirical Results ..... 689
45.5 Conclusion ..... 694
References ..... 694
46 Copula, Correlated Defaults, and Credit VaR ..... 697
Jow-Ran Chang and An-Chi Chen
46.1 Introduction ..... 697
46.2 Methodology ..... 698
46.3 Experimental Results ..... 703
46.4 Conclusion ..... 710
References ..... 711

47 Unspanned Stochastic Volatilities and Interest Rate Derivatives Pricing ..... 713
Feng Zhao
47.1 Introduction ..... 713
47.2 Term Structure Models with Spanned Stochastic Volatility ..... 716
47.3 LIBOR Market Models with Stochastic Volatility and Jumps: Theory and Estimation ..... 723
47.4 Nonparametric Estimation of the Forward Density ..... 734
47.5 Conclusion ..... 746
References ..... 746
Appendix 47A The Derivation for QTSMs ..... 748
Appendix 47B The Implementation of the Kalman Filter ..... 750
Appendix 47C Derivation of the Characteristic Function ..... 751
48 Catastrophic Losses and Alternative Risk Transfer Instruments ..... 753
Jin-Ping Lee and Min-Teh Yu
48.1 Introduction ..... 753
48.2 Catastrophe Bonds ..... 753
48.3 Catastrophe Equity Puts ..... 757
48.4 Catastrophe Derivatives ..... 760
48.5 Reinsurance with CAT-Linked Securities ..... 763
48.6 Conclusion ..... 764
References ..... 766

49 A Real Option Approach to the Comprehensive Analysis of Bank Consolidation Values ..... 767
Chuang-Chang Chang, Pei-Fang Hsieh, and Hung-Neng Lai
49.1 Introduction ..... 767
49.2 The Model ..... 768
49.3 Case Study ..... 771
49.4 Results ..... 775
49.5 Conclusions ..... 777
References ..... 777
Appendix 49A The Correlations Between the Standard Wiener Process Generated from a Bank’s Net Interest Income ..... 778
Appendix 49B The Risk-Adjusted Processes ..... 778
Appendix 49C The Discrete Version of the Risk-Adjusted Process ..... 778

50 Dynamic Econometric Loss Model: A Default Study of US Subprime Markets ..... 779
C.H. Ted Hong
50.1 Introduction ..... 779
50.2 Model Framework ..... 780
50.3 Default Modeling ..... 782
50.4 Prepayment Modeling ..... 792
50.5 Delinquency Study ..... 797
50.6 Conclusion ..... 800
References ..... 802
Appendix 50A Default and Prepayment Definition ..... 802
Appendix 50B General Model Framework ..... 803
Appendix 50C Default Specification ..... 803
Appendix 50D Prepayment Specification ..... 805

51 The Effect of Default Risk on Equity Liquidity: Evidence Based on the Panel Threshold Model ..... 807
Huimin Chung, Wei-Peng Chen, and Yu-Dan Chen
51.1 Introduction ..... 807
51.2 Data and Methodology ..... 808
51.3 Empirical Results ..... 812
51.4 Conclusion ..... 815
References ..... 815
Appendix 51A ..... 816

52 Put Option Approach to Determine Bank Risk Premium ..... 819
Dar Yeh Hwang, Fu-Shuen Shie, and Wei-Hsiung Wu
52.1 Introduction ..... 819
52.2 Evaluating Insurer’s Liability by Option Pricing Model: Merton (1977) ..... 820
52.3 Extensions of Merton (1977) ..... 820
52.4 Applications for Merton (1977) ..... 823
52.5 Conclusion ..... 825
References ..... 826
Appendix 52A ..... 826
Appendix 52B ..... 827

53 Keiretsu Style Main Bank Relationships, R&D Investment, Leverage, and Firm Value: Quantile Regression Approach ..... 829
Hai-Chin Yu, Chih-Sean Chen, and Der-Tzon Hsieh
53.1 Introduction ..... 829
53.2 Literature Review ..... 831
53.3 Data and Sample ..... 831
53.4 Empirical Results and Analysis ..... 836
53.5 Conclusions and Discussion ..... 840
References ..... 841
54 On the Feasibility of Laddering ..... 843
Joshua Ronen and Bharat Sarath
54.1 Introduction ..... 843
54.2 The Model ..... 845
54.3 Results ..... 849
54.4 Conclusion ..... 851
References ..... 851

55 Stock Returns, Extreme Values, and Conditional Skewed Distribution ..... 853
Thomas C. Chiang and Jiandong Li
55.1 Introduction ..... 853
55.2 The AGARCH Model Based on the EGB2 Distribution ..... 854
55.3 Data ..... 855
55.4 Empirical Evidence ..... 856
55.5 Distributional Fit Test ..... 859
55.6 The Implication of the EGB2 Distribution ..... 859
55.7 Conclusion ..... 861
References ..... 862

56 Capital Structure in Asia and CEO Entrenchment ..... 863
Kin Wai Lee and Gillian Hian Heng Yeo
56.1 Introduction ..... 863
56.2 Prior Research and Hypothesis ..... 864
56.3 Data and Method ..... 865
56.4 Results ..... 867
56.5 Conclusion ..... 871
References ..... 871
Appendix 56A Variables Definition ..... 872

57 A Generalized Model for Optimum Futures Hedge Ratio ..... 873
Cheng-Few Lee, Jang-Yi Lee, Kehluh Wang, and Yuan-Chung Sheu
57.1 Introduction ..... 873
57.2 GIG and GH Distributions ..... 876
57.3 Futures Hedge Ratios ..... 877
57.4 Estimation and Simulation ..... 879
57.5 Conclusion ..... 880
References ..... 880
Appendix 57A ..... 881
58 The Sensitivity of Corporate Bond Volatility to Macroeconomic Announcements ..... 883
Nikolay Kosturov and Duane Stock
58.1 Introduction ..... 883
58.2 Theory and Hypotheses ..... 884
58.3 Data and Return Computations ..... 886
58.4 Descriptive Statistics of Daily Excess Returns ..... 886
58.5 OLS Regressions of Volatility and Excess Returns ..... 897
58.6 Conditional Variance Models ..... 899
58.7 Alternative GARCH Models ..... 903
58.8 Conclusion ..... 910
References ..... 912
Appendix 58A ..... 913

59 Raw Material Convenience Yields and Business Cycle ..... 915
Chang-Wen Duan and William T. Lin
59.1 Introduction ..... 915
59.2 Characteristics of Study Commodities ..... 917
59.3 The Model ..... 919
59.4 Data ..... 921
59.5 Empirical Results ..... 922
59.6 Conclusion ..... 930
References ..... 931

60 Alternative Methods to Determine Optimal Capital Structure: Theory and Application ..... 933
Sheng-Syan Chen, Cheng-Few Lee, and Han-Hsing Lee
60.1 Introduction ..... 933
60.2 The Traditional Theory of Optimal Capital Structure ..... 934
60.3 Optimal Capital Structure in the Contingent Claims Framework ..... 936
60.4 Recent Development of Capital Structure Models ..... 941
60.5 Application and Empirical Evidence of Capital Structure Models ..... 948
60.6 Conclusion ..... 950
References ..... 950

61 Actuarial Mathematics and Its Applications in Quantitative Finance ..... 953
Cho-Jieh Chen
61.1 Introduction ..... 953
61.2 Actuarial Discount and Accumulation Functions ..... 953
61.3 Actuarial Mathematics of Insurance ..... 955
61.4 Actuarial Mathematics of Annuity ..... 958
61.5 Actuarial Premiums and Actuarial Reserves ..... 959
61.6 Applications in Quantitative Finance ..... 961
61.7 Conclusion ..... 963
References ..... 963

62 The Prediction of Default with Outliers: Robust Logistic Regression ..... 965
Chung-Hua Shen, Yi-Kai Chen, and Bor-Yi Huang
62.1 Introduction ..... 965
62.2 Literature Review of Outliers in Conventional and in Logit Regression ..... 966
62.3 Five Validation Tests ..... 967
62.4 Source of Data and Empirical Model ..... 969
62.5 Empirical Results ..... 969
62.6 Conclusion ..... 973
References ..... 976
63 Term Structure of Default-Free and Defaultable Securities: Theory and Empirical Evidence ..... 979
Hai Lin and Chunchi Wu
63.1 Introduction ..... 979
63.2 Definitions and Notations ..... 980
63.3 Bond Pricing in Dynamic Term Structure Model Framework ..... 980
63.4 Dynamic Term Structure Models ..... 981
63.5 Models of Defaultable Bonds ..... 988
63.6 Interest Rate and Credit Default Swaps ..... 996
63.7 Concluding Remarks ..... 1001
References ..... 1001

64 Liquidity Risk and Arbitrage Pricing Theory ..... 1007
Umut Çetin, Robert A. Jarrow, and Philip Protter
64.1 Introduction ..... 1007
64.2 The Model ..... 1009
64.3 The Extended First Fundamental Theorem ..... 1011
64.4 The Extended Second Fundamental Theorem ..... 1012
64.5 Example (Extended Black–Scholes Economy) ..... 1015
64.6 Discontinuous Supply Curve Evolutions ..... 1016
64.7 Conclusion ..... 1017
References ..... 1017
Appendix 64A ..... 1018

65 An Integrated Model of Debt Issuance, Refunding, and Maturity ..... 1025
Manak C. Gupta and Alice C. Lee
65.1 Introduction ..... 1025
65.2 The Model ..... 1026
65.3 Operationalizing the Model ..... 1029
65.4 Numerical Illustration ..... 1032
65.5 Conclusions ..... 1036
References ..... 1037

Part V Theory, Methodology, and Applications
66 Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution ..... 1041
Thomas S. Y. Ho and Sang Bin Lee
66.1 Introduction ..... 1041
66.2 The Model Assumptions ..... 1042
66.3 Simulation Results of the Capital Budgeting Decisions ..... 1045
66.4 Relative Valuation of Equity ..... 1048
66.5 Equity Return Attribution ..... 1050
66.6 Conclusion ..... 1051
References ..... 1051
Appendix 66A Derivation of the Risk Neutral Probability ..... 1052
Appendix 66B The Model for the Fixed Operating Cost at Time T ..... 1052
Appendix 66C The Valuation Model Using the Recombining Lattice ..... 1053
Appendix 66D Input Data of the Model ..... 1054
67 Dividends Versus Reinvestments in Continuous Time: A More General Model ..... 1055
Ren-Raw Chen, Ben Logan, Oded Palmon, and Larry Shepp
67.1 Introduction ..... 1055
67.2 The Model ..... 1055
67.3 The Solution ..... 1057
67.4 Expected Bankruptcy Time ..... 1058
67.5 Further Remarks ..... 1059
67.6 Conclusion ..... 1059
References ..... 1060

68 Segmenting Financial Services Market: An Empirical Study of Statistical and Non-parametric Methods ..... 1061
Kenneth Lawrence, Dinesh Pai, Ronald Klimberg, Stephen Kudbya, and Sheila Lawrence
68.1 Introduction ..... 1061
68.2 Methodology ..... 1062
68.3 Evaluating the Classification Function ..... 1064
68.4 Experimental Design ..... 1065
68.5 Results ..... 1065
68.6 Conclusions ..... 1066
References ..... 1066

69 Spurious Regression and Data Mining in Conditional Asset Pricing Models ..... 1067
Wayne Ferson, Sergei Sarkissian, and Timothy Simin
69.1 Introduction ..... 1067
69.2 Spurious Regression and Data Mining in Predictive Regressions ..... 1068
69.3 Spurious Regression, Data Mining, and Conditional Asset Pricing ..... 1069
69.4 The Data ..... 1069
69.5 The Models ..... 1071
69.6 Results for Predictive Regressions ..... 1073
69.7 Results for Conditional Asset Pricing Models ..... 1080
69.8 Solutions to the Problems of Spurious Regression and Data Mining ..... 1086
69.9 Robustness of the Asset Pricing Results ..... 1087
69.10 Conclusions ..... 1088
References ..... 1089

70 Issues Related to the Errors-in-Variables Problems in Asset Pricing Tests ..... 1091
Dongcheol Kim
70.1 Introduction ..... 1091
70.2 The Errors-in-Variables Problem ..... 1092
70.3 A Correction for the Errors-in-Variables Bias ..... 1094
70.4 Results ..... 1099
70.5 Conclusions ..... 1108
References ..... 1108

71 McMC Estimation of Multiscale Stochastic Volatility Models ..... 1109
German Molina, Chuan-Hsiang Han, and Jean-Pierre Fouque
71.1 Introduction ..... 1109
71.2 Multiscale Modeling and McMC Estimation ..... 1110
71.3 Simulation Study ..... 1113
71.4 Empirical Application: FX Data ..... 1113
71.5 Implication on Derivatives Pricing and Hedging ..... 1118
71.6 Conclusions ..... 1118
References ..... 1119
Appendix 71A Proof of Independent Factor Equivalence ..... 1119
Appendix 71B Full Conditionals ..... 1120

72 Regime Shifts and the Term Structure of Interest Rates ..... 1121
Chien-Chung Nieh, Shu Wu, and Yong Zeng
72.1 Introduction ..... 1121
72.2 Regime-Switching and Short-Term Interest Rate ..... 1122
72.3 Regime-Switching Term Structure Models in Discrete Time ..... 1126
72.4 Regime-Switching Term Structure Models in Continuous Time ..... 1128
72.5 Conclusion ..... 1133
References ..... 1133

73 ARM Processes and Their Modeling and Forecasting Methodology ..... 1135
Benjamin Melamed
73.1 Introduction ..... 1135
73.2 Overview of ARM Processes ..... 1136
73.3 The ARM Modeling Methodology ..... 1139
73.4 The ARM Forecasting Methodology ..... 1140
73.5 Example: ARM Modeling of an S&P 500 Time Series ..... 1145
73.6 Summary ..... 1148
References ..... 1149

74 Alternative Econometric Methods for Information-Based Equity-Selling Mechanisms ..... 1151
Cheng-Few Lee and Yi Lin Wu
74.1 Introduction ..... 1151
74.2 The Information Contents of Equity-Selling Mechanisms ..... 1152
74.3 Alternative Econometric Methods for Information-Based Equity-Selling Mechanisms ..... 1153
74.4 Conclusions ..... 1161
References ..... 1162

75 Implementation Problems and Solutions in Stochastic Volatility Models of the Heston Type ..... 1165
Jia-Hau Guo and Mao-Wei Hung
75.1 Introduction ..... 1165
75.2 The Transform-Based Solution for Heston’s Stochastic Volatility Model ..... 1165
75.3 Solutions to the Discontinuity Problem of Heston’s Formula ..... 1168
75.4 Conclusion ..... 1170
References ..... 1171

76 Revisiting Volume vs. GARCH Effects Using Univariate and Bivariate GARCH Models: Evidence from U.S. Stock Markets ..... 1173
Zhuo Qiao and Wing-Keung Wong
76.1 Introduction ..... 1173
76.2 The Mixture of Distribution Hypothesis ..... 1175
76.3 Data and Methodology ..... 1175
76.4 Empirical Findings in NYSE ..... 1176
76.5 Conclusion ..... 1178
References ..... 1179
Appendix 76A ..... 1180
77 Application of Fuzzy Set Theory to Finance Research: Method and Application ..... 1183
Shin-Yun Wang and Cheng-Few Lee
77.1 Introduction ..... 1183
77.2 Fuzzy Set ..... 1184
77.3 Applications of Fuzzy Set Theory ..... 1190
77.4 An Example of Fuzzy Binomial OPM ..... 1194
77.5 An Example of Real Options ..... 1196
77.6 Fuzzy Regression ..... 1197
77.7 Conclusion ..... 1198
References ..... 1199

78 Hedonic Regression Analysis in Real Estate Markets: A Primer ..... 1201
Ben J. Sopranzetti
78.1 Introduction ..... 1201
78.2 The Theoretical Foundation ..... 1201
78.3 The Data ..... 1202
78.4 The Linear Model ..... 1202
78.5 Empirical Specification ..... 1203
78.6 The Semi-Log Model ..... 1204
78.7 The Box-Cox Model ..... 1205
78.8 Problems with Hedonic Modeling ..... 1205
78.9 Recent Developments ..... 1206
78.10 Conclusion ..... 1207
References ..... 1207

79 Numerical Solutions of Financial Partial Differential Equations ..... 1209
Gang Nathan Dong
79.1 Introduction ..... 1209
79.2 The Model ..... 1209
79.3 Discretization ..... 1210
79.4 Finite Difference ..... 1210
79.5 Finite Volume ..... 1217
79.6 Finite Element ..... 1218
79.7 Empirical Result ..... 1219
79.8 Conclusion ..... 1220
References ..... 1220
Further Reading ..... 1221

80 A Primer on the Implicit Financing Assumptions of Traditional Capital Budgeting Approaches ..... 1223
Ivan E. Brick and Daniel G. Weaver
80.1 Introduction ..... 1223
80.2 Textbook Approaches to NPV ..... 1224
80.3 Theoretical Valuation of Cash Flows ..... 1226
80.4 An Example ..... 1228
80.5 Personal Tax and Miller Equilibrium ..... 1229
80.6 Conclusion ..... 1231
References ..... 1232

81 Determinants of Flows into U.S.-Based International Mutual Funds ..... 1235
Dilip K. Patro
81.1 Introduction ..... 1235
81.2 Motivation and Hypotheses ..... 1236
81.3 Data ..... 1237
81.4 Methodology and Empirical Results ..... 1238
81.5 Conclusion ..... 1247
References ..... 1253
Appendix 81A Econometric Analysis of Panel Data ..... 1253

82 Predicting Bond Yields Using Defensive Forecasting ..... 1257
Glenn Shafer and Samuel Ring
82.1 Introduction ..... 1257
82.2 Game-Theoretic Probability ..... 1260
82.3 Defensive Forecasting ..... 1265
82.4 Predicting Bond Yields ..... 1269
82.5 Conclusion ..... 1271
References ..... 1271

83 Range Volatility Models and Their Applications in Finance ..... 1273
Ray Yeutien Chou, Hengchih Chou, and Nathan Liu
83.1 Introduction ..... 1273
83.2 The Price Range Estimators ..... 1274
83.3 The Range-Based Volatility Models ..... 1276
83.4 The Realized Range Volatility ..... 1278
83.5 The Financial Applications and Limitations of the Range Volatility ..... 1279
83.6 Conclusion ..... 1279
References ..... 1280

84 Examining the Impact of the U.S. IT Stock Market on Other IT Stock Markets ..... 1283
Zhuo Qiao, Venus Khim-Sen Liew, and Wing-Keung Wong
84.1 Introduction ..... 1283
84.2 Data and Methodology ..... 1284
84.3 Empirical Results ..... 1285
84.4 Conclusions ..... 1289
References ..... 1289
Appendix 84A ..... 1290

85 Application of Alternative ODE in Finance and Economics Research ..... 1293
Cheng-Few Lee and Junmin Shi
85.1 Introduction ..... 1293
85.2 Ordinary Differential Equation ..... 1294
85.3 Applications of ODE in Deterministic System ..... 1295
85.4 Applications of ODE in Stochastic System ..... 1297
85.5 Conclusion ..... 1300
References ..... 1300

86 Application of Simultaneous Equation in Finance Research ..... 1301
Carl R. Chen and Cheng-Few Lee
86.1 Introduction ..... 1301
86.2 Two-Stage and Three-Stage Least Squares Method ..... 1302
86.3 Application of Simultaneous Equation in Finance Research ..... 1305
86.4 Conclusion ..... 1305
References ..... 1306
87 The Fuzzy Set and Data Mining Applications in Accounting and Finance ..... 1307
Wikil Kwak, Yong Shi, and Cheng-Few Lee
87.1 Introduction ..... 1307
87.2 A Fuzzy Approach to International Transfer Pricing ..... 1307
87.3 A Fuzzy Set Approach to Human Resource Allocation of a CPA Firm ..... 1312
87.4 A Fuzzy Set Approach to Accounting Information System Selection ..... 1316
87.5 Fuzzy Set Formulation to Capital Budgeting ..... 1319
87.6 A Data Mining Approach to Firm Bankruptcy Predictions ..... 1324
87.7 Conclusion ..... 1329
References ..... 1329

88 Forecasting S&P 100 Volatility: The Incremental Information Content of Implied Volatilities and High-Frequency Index Returns ..... 1333
Bevan J. Blair, Ser-Huang Poon, and Stephen J. Taylor
88.1 Introduction ..... 1333
88.2 Data ..... 1334
88.3 Methodology for Forecasting Volatility ..... 1336
88.4 Results ..... 1338
88.5 Conclusion ..... 1343
References ..... 1344

89 Detecting Structural Instability in Financial Time Series ..... 1345
Derann Hsu
89.1 Introduction ..... 1345
89.2 Genesis of the Literature ..... 1345
89.3 Problems of Multiple Change Points ..... 1347
89.4 Here Came the GARCH and Its Brethrens ..... 1348
89.5 Examples of Structural Shift Analysis in Financial Time Series ..... 1349
89.6 Implications of Structural Instability to Financial Theories and Practice ..... 1352
89.7 Direction of Future Research and Developments ..... 1353
89.8 Epilogue ..... 1354
References ..... 1354

90 The Instrument Variable Approach to Correct for Endogeneity in Finance ..... 1357
Chia-Jane Wang
90.1 Introduction ..... 1357
90.2 Endogeneity: The Statistical Issue ..... 1358
90.3 Instrumental Variables Approach to Endogeneity ..... 1358
90.4 Validity of Instrumental Variables ..... 1361
90.5 Identification and Inferences with Weak Instruments ..... 1364
90.6 Empirical Applications in Corporate Finance ..... 1366
90.7 Conclusion ..... 1368
References ..... 1368

91 Bayesian Inference of Financial Models Using MCMC Algorithms ..... 1371
Xianghua Liu, Liuling Li, and Hiroki Tsurumi
91.1 Introduction ..... 1371
91.2 Bayesian Inference and MCMC Algorithms ..... 1371
91.3 CKLS Model with ARMA-GARCH Errors ..... 1374
91.4 Copula Model for FTSE100 and S&P500 ..... 1376
91.5 Conclusion ..... 1379
References ..... 1380
92 On Capital Structure and Entry Deterrence ..... 1381
Fathali Firoozi and Donald Lien
92.1 Introduction ..... 1381
92.2 The Setting ..... 1382
92.3 Equilibrium ..... 1384
92.4 Capital Structure and Entry Deterrence ..... 1386
92.5 Conclusion ..... 1388
References ..... 1389

93 VAR Models: Estimation, Inferences, and Applications ..... 1391
Yangru Wu and Xing Zhou
93.1 Introduction ..... 1391
93.2 A Brief Discussion of VAR Models ..... 1391
93.3 Applications of VARs in Finance ..... 1393
93.4 Conclusion ..... 1397
References ..... 1397

94 Signaling Models and Product Market Games in Finance: Do We Know What We Know? ..... 1399
Kose John and Anant K. Sundaram
94.1 Introduction ..... 1399
94.2 Supermodularity: Definitions ..... 1400
94.3 Supermodularity in Signaling Models ..... 1400
94.4 Supermodularity in Product Market Games ..... 1403
94.5 Empirical Evidence ..... 1406
94.6 Conclusion ..... 1407
References ..... 1407

95 Estimation of Short- and Long-Term VaR for Long-Memory Stochastic Volatility Models ..... 1409
Hwai-Chung Ho and Fang-I Liu
95.1 Introduction ..... 1409
95.2 Long Memory in Stochastic Volatility ..... 1410
95.3 VaR Calculation ..... 1411
95.4 Conclusions ..... 1414
References ..... 1414

96 Time Series Modeling and Forecasting of the Volatilities of Asset Returns ..... 1417
Tze Leung Lai and Haipeng Xing
96.1 Introduction ..... 1417
96.2 Conditional Heteroskedasticity Models ..... 1417
96.3 Regime-Switching, Change-Point and Spline-GARCH Models of Volatility ..... 1421
96.4 Multivariate Volatility Models and Applications to Mean–Variance Portfolio Optimization ..... 1424
96.5 Conclusion ..... 1425
References ..... 1425

97 Listing Effects and the Private Company Discount in Bank Acquisitions ..... 1427
Atul Gupta and Lalatendu Misra
97.1 Introduction ..... 1427
97.2 Why Acquiring Firms May Pay Less for Unlisted Targets ..... 1428
97.3 Sample Characteristics ..... 1430
97.4 Event Study Analysis ..... 1431
97.5 Findings Based on Multiples ..... 1433
97.6 Cross-Sectional Analysis ..... 1439
97.7 Conclusions ..... 1442
References ..... 1443

98 An ODE Approach for the Expected Discounted Penalty at Ruin in Jump Diffusion Model (Reprint) ..... 1445
Yu-Ting Chen, Cheng-Few Lee, and Yuan-Chung Sheu
98.1 Introduction ..... 1445
98.2 Integro-Differential Equation ..... 1446
98.3 Explicit Formula for Φ – ODE Method ..... 1448
98.4 The Constant Vector Q: Second Method ..... 1453
98.5 Conclusion ..... 1457
References ..... 1458
Appendix 98A Proofs ..... 1458
Appendix 98B Toolbox for Phase-Type Distributions ..... 1462
Appendix 98C First Order Derivative of Φ at Zero ..... 1462

99 Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers ..... 1465
Alice C. Lee and J. David Cummins
99.1 Introduction ..... 1465
99.2 Prior Work ..... 1466
99.3 Model Specification and Estimation ..... 1467
99.4 Data Description and Cost of Equity Capital Estimates ..... 1470
99.5 Evaluations of Simulations and Estimates ..... 1476
99.6 Summary and Conclusion ..... 1480
References ..... 1481

100 Implementing a Multifactor Term Structure Model ..... 1483
Ren-Raw Chen and Louis O. Scott
100.1 Introduction ..... 1483
100.2 A Multifactor Term Structure Model ..... 1483
100.3 Pricing Options in the Multifactor Model ..... 1485
100.4 Calibrating a Multifactor Model ..... 1487
100.5 Conclusion ..... 1488
References ..... 1488

101 Taking Positive Interest Rates Seriously ..... 1489
Enlin Pan and Liuren Wu
101.1 Introduction ..... 1489
101.2 Background ..... 1490
101.3 The Model ..... 1491
101.4 The Hump-Shaped Forward Rate Curve ..... 1494
101.5 Fitting the US Treasury Yields and US Dollar Swap Rates ..... 1495
101.6 Extensions: Jumps in Interest Rates ..... 1498
101.7 Conclusion ..... 1500
References ..... 1500
Appendix 101A Factor Representation ..... 1501
Appendix 101B Extended Kalman Filter and Quasilikelihood ..... 1502
102 Positive Interest Rates and Yields: Additional Serious Considerations ..... 1503
Jonathan Ingersoll
102.1 Introduction ..... 1503
102.2 A Non-Zero Bound for Interest Rates ..... 1503
102.3 The Cox–Ingersoll–Ross and Pan–Wu Term Structure Models ..... 1504
102.4 Bubble-Free Prices ..... 1506
102.5 Multivariate Affine Term-Structure Models with Zero Bounds on Yields ..... 1511
102.6 Non-Affine Term Structures with Yields Bounded at Zero ..... 1514
102.7 Non-Zero Bounds for Yields ..... 1516
102.8 Conclusion ..... 1517
References ..... 1517
Appendix 102A ..... 1517
102A.1 Derivation of the Probability and State Price for r_T = 0 for the PW Model ..... 1517
102A.2 Bond Price When r_t = 0 Is Accessible for Only the Risk-Neutral Process ..... 1519
102A.3 Properties of the Affine Exponentially Smoothed Average Model ..... 1520
102A.4 Properties of the Three-Halves Power Interest Rate Process ..... 1521

103 Functional Forms for Performance Evaluation: Evidence from Closed-End Country Funds ..... 1523
Cheng-Few Lee, Dilip K. Patro, and Bo Liu
103.1 Introduction and Motivation ..... 1523
103.2 Literature Review ..... 1524
103.3 Model Estimation ..... 1526
103.4 Data and Methodology ..... 1527
103.5 Empirical Results ..... 1534
103.6 Conclusion ..... 1545
References ..... 1553

104 A Semimartingale BSDE Related to the Minimal Entropy Martingale Measure ..... 1555
Michael Mania, Marina Santacroce, and Revaz Tevzadze
104.1 Introduction ..... 1555
104.2 Some Basic Definitions, Conditions, and Auxiliary Facts ..... 1556
104.3 Backward Semimartingale Equation for the Value Process ..... 1558
104.4 Conclusions ..... 1564
References ..... 1565

105 The Density Process of the Minimal Entropy Martingale Measure in a Stochastic Volatility Model with Jumps (Reprint) ..... 1567
Fred Espen Benth and Thilo Meyer-Brandis
105.1 Introduction ..... 1567
105.2 The Market ..... 1568
105.3 The Minimal Entropy Martingale Measure ..... 1569
105.4 The Density Process ..... 1571
105.5 The Entropy Price of Derivatives and Integro-Partial Differential Equations ..... 1573
105.6 Conclusions ..... 1574
References ..... 1575
106 Arbitrage Detection from Stock Data: An Empirical Study ..... 1577
Cheng-Der Fuh and Szu-Yu Pai
106.1 Introduction ..... 1577
106.2 Arbitrage Detection: Volatility Change ..... 1579
106.3 Arbitrage Detection: Mean Change ..... 1583
106.4 Empirical Studies ..... 1586
106.5 Conclusions and Further Researches ..... 1590
References ..... 1591

107 Detecting Corporate Failure ..... 1593
Yanzhi Wang, Lin Lin, Hsien-Chang Kuo, and Jenifer Piesse
107.1 Introduction ..... 1593
107.2 The Possible Causes of Bankruptcy ..... 1594
107.3 The Methods of Bankruptcy ..... 1594
107.4 Prediction Model for Corporate Failure ..... 1596
107.5 The Selection of Optimal Cutoff Point ..... 1603
107.6 Recent Development ..... 1604
107.7 Conclusion ..... 1604
References ..... 1604

108 Genetic Programming for Option Pricing ..... 1607
N.K. Chidambaran
108.1 Introduction ..... 1607
108.2 Genetic Program Elements ..... 1608
108.3 Black–Scholes Example ..... 1611
108.4 Extensions ..... 1613
108.5 Conclusion ..... 1613
References ..... 1614

109 A Constant Elasticity of Variance (CEV) Family of Stock Price Distributions in Option Pricing, Review, and Integration ..... 1615
Ren-Raw Chen and Cheng-Few Lee
109.1 Introduction ..... 1615
109.2 The CEV Diffusion and Its Transition Density ..... 1616
109.3 The CEV Option Pricing Models ..... 1619
109.4 Computing the Non-Central Chi-Square Probabilities ..... 1622
109.5 Conclusion ..... 1623
Appendix 109A ..... 1623
References ..... 1625

References ..... 1627
Author Index ..... 1685
Subject Index ..... 1709
List of Contributors
James S. Ang, Florida State University, Tallahassee, FL, USA
Gurdip Bakshi, University of Maryland, College Park, MD, USA
Hamid Beladi, University of Texas at San Antonio, San Antonio, TX, USA
Fred Espen Benth, University of Oslo and Agder University College, Kristiansand, Norway
Bevan J. Blair, Ingenious Asset Management, London, UK
Michael J. Brennan, University of California at Los Angeles, Los Angeles, CA, USA
Ivan Brick, Rutgers University, Newark, NJ, USA
Stephen J. Brown, New York University, New York, NY, USA
John Cadle, University of Birmingham, Birmingham, UK
Charles Cao, Department of Finance, Smeal College of Business, Pennsylvania State University, University Park, PA, USA
Umut Çetin, Technische Universität Wien, Vienna, Austria
George Chalamandaris, Athens University of Economics and Business, Athens, Greece
Chuang-Chang Chang, National Central University, Taipei, Taiwan, ROC
Jow-Ran Chang, National Tsing Hua University, Hsinchu, Taiwan, ROC
An-Chi Chen, KGI Securities Co. Ltd., Taipei, Taiwan, ROC
Carl R. Chen, University of Dayton, Dayton, OH, USA
Chih-Sean Chen, Chung Yuan University, Taoyuan County, Taiwan, ROC
Cho-Jieh Chen, University of Alberta, Edmonton, AB, Canada
Hong-Yi Chen, Rutgers University, Newark, NJ, USA
Ren-Raw Chen, Fordham University, New York, NY, USA
Sheng-Syan Chen, National Taiwan University, Taipei, Taiwan, ROC
Wei-Peng Chen, Shih Hsin University, Taipei, Taiwan, ROC
Yi-Kai Chen, National University of Kaohsiung, Kaohsiung, Taiwan, ROC
Yu-Dan Chen, National Chiao Tung University, Hsinchu, Taiwan, ROC
Yu-Ting Chen, National Chiao Tung University, Hsinchu, Taiwan, ROC
Zhiwu Chen, Yale University, New Haven, CT, USA
Thomas C. Chiang, Drexel University, Philadelphia, PA, USA
N. K. Chidambaran, Fordham University, New York, NY, USA
Wan-Jiun Paul Chiou, Shippensburg University, Shippensburg, PA, USA
Heng-chih Chou, Ming Chuan University, Taipei, Taiwan, ROC
Ray Y. Chou, Academia Sinica, Taipei, Taiwan, ROC
H.W. Chiang, National Taiwan University, Taipei, Taiwan, ROC
Huimin Cheng, National Chiao Tung University, Hsinchu, Taiwan, ROC
San-Lin Cheng, National Taiwan University, Taipei, Taiwan, ROC
Jing Cui, Clarion University of Pennsylvania, Clarion, PA, USA
J. David Cummins, Temple University, Philadelphia, PA, USA
Darinka Dentcheva, Stevens Institute of Technology, Hoboken, NJ, USA
Dean Diavatopoulos, Villanova University, Philadelphia, PA, USA
Gang Nathan Dong, Rutgers University, Newark, NJ, USA
Chang-Wen Duan, Tamkang University, Taipei, Taiwan, ROC
Luis Ferruz, University of Zaragoza, Zaragoza, Spain
Wayne Ferson, University of Southern California, Los Angeles, CA, USA
Joseph E. Finnerty, University of Illinois at Urbana-Champaign, Champaign, IL, USA
Fathali Firoozi, University of Texas at San Antonio, San Antonio, TX, USA
Jean-Pierre Fouque, University of California, Santa Barbara, CA, USA
Reto Francioni, Deutsche Börse, Frankfurt, Germany
Jack Clark Francis, Baruch College, New York, NY, USA
Cheng-Der Fuh, National Central University and Academia Sinica, Taipei, Taiwan, ROC
Fernando Gómez-Bezares, University of Deusto, Bilbao, Spain
Robert R. Grauer, Simon Fraser University, Burnaby, BC, Canada
Jia-Hau Guo, Soochow University, Taipei, Taiwan, ROC
Atul Gupta, Bentley University, Waltham, MA, USA
Manak C. Gupta, Temple University, Philadelphia, PA, USA
Chuan-Hsiang Han, National Tsing Hua University, Hsinchu, Taiwan, ROC
Sonali Hazarika, Baruch College, New York, NY, USA
Hwai-Chung Ho, Academia Sinica and National Taiwan University, Taipei, Taiwan, ROC
Keng-Yu Ho, National Taiwan University, Taipei, Taiwan, ROC
Lan-chih Ho, Central Bank of the Republic of China, Taipei, Taiwan, ROC
Thomas S. Y. Ho, Thomas Ho Company, Ltd, New York, NY, USA
C.H. Ted Hong, Beyondbond Inc., New York, NY, USA
Tsui-Ling Hseu, National Chiao Tung University, Hsinchu, Taiwan, ROC
Der-Tzon Hsieh, National Taiwan University, Taipei, Taiwan, ROC
Pei-Fang Hsieh, Department of Finance, National Central University, Chung Li City, Taiwan, ROC
Derann Hsu, University of Wisconsin–Milwaukee, Milwaukee, WI, USA
Ming-Feng Hsu, Tatung University, Taipei, Taiwan, ROC
Ying Lin Hsu, National Chung Hsing University, Taichung, Taiwan, ROC
Bor-Yi Huang, Department of Business Management, Shih Chien University, Taipei, Taiwan, ROC
Dar-Yeh Huang, National Taiwan University, Taipei, Taiwan, ROC
Jingzhi Huang, Pennsylvania State University, University Park, PA, USA
Ken Hung, Texas A&M International University, Laredo, TX, USA
Mao-Wei Hung, National Taiwan University, Taipei, Taiwan, ROC
Jonathan E. Ingersoll, Jr., Yale School of Management, New Haven, CT, USA
Robert A. Jarrow, Cornell University, Ithaca, NY, USA
Kose John, New York University, New York, NY, USA
Dongcheol Kim, Korea University Business School, Seoul, Korea
Ronald Klimberg, St. Joseph's University, Philadelphia, PA, USA
Alexander Kogan, Rutgers University, Newark, NJ, USA
Nikolay Kosturov, University of Oklahoma, Norman, OK, USA
Stephen Kudbya, New Jersey Institute of Technology, Newark, NJ, USA
Hsien-Chang Kuo, National Chi-Nan University and Takming University of Science and Technology, Nantou Hsien, Taiwan, ROC
Wikil Kwak, University of Nebraska at Omaha, Omaha, NE, USA
Hung-Neng Lai, Department of Finance, National Central University, Chung Li City, Taiwan, ROC
Tze Leung Lai, Stanford University, Stanford, CA, USA
Kenneth Lawrence, New Jersey Institute of Technology, Newark, NJ, USA
Sheila Lawrence, Rutgers University, Newark, NJ, USA
Alice C. Lee, State Street Corp., Boston, MA, USA
Cheng-Few Lee, Rutgers University, New Brunswick, NJ, USA and National Chiao Tung University, Hsinchu, Taiwan, ROC
Han-Hsing Lee, National Chiao Tung University, Hsinchu, Taiwan, ROC
Jack C. Lee, National Chiao Tung University, Hsinchu, Taiwan, ROC
Jang-Yi Lee, Tunghai University, Taichung, Taiwan, ROC
Jin-Ping Lee, Feng Chia University, Taichung, Taiwan, ROC
John Lee, Center for PBBEF Research, Hackensack, NJ, USA
Kin Wai Lee, Nanyang Technological University, Singapore, Singapore
Sang Bin Lee, Hanyang University, Seoul, Korea
Miguel A. Lejeune, George Washington University, Washington, DC, USA
Jiandong Li, Central University of Finance and Economics, P.R. China
Liuling Li, Rutgers University, New Brunswick, NJ, USA
Donald Lien, University of Texas at San Antonio, San Antonio, TX, USA
Venus Khim-Sen Liew, Universiti Malaysia Sabah, Sabah, Malaysia
Carle Shu Ming Lin, Rutgers University, New Brunswick, NJ, USA
Hai Lin, Xiamen University, Xiamen, Fujian, China
Lin Lin, Department of Banking and Finance, National Chi-Nan University, 1 University Rd., Puli, Nantou Hsien, Taiwan 545, ROC
T. I. Lin, National Chung Hsing University, Taichung, Taiwan, ROC
William T. Lin, Tamkang University, Taipei, Taiwan, ROC
Bo Liu, Citigroup Global Market Inc., New York, NY, USA
Fang-I Liu, National Taiwan University, Taipei, Taiwan, ROC
Nathan Liu, National Chiao Tung University, Hsinchu, Taiwan, ROC
Xianghua Liu, Rutgers University, Piscataway, NJ, USA
Ben Logan, Bell Labs, USA
Jessica Mai, Rutgers University, Newark, NJ, USA
A.G. Malliaris, Loyola University Chicago, Chicago, IL, USA
Michael Mania, A. Razmadze Mathematical Institute, Georgia and Georgian-American University, Tbilisi, Georgia
Benjamin Melamed, Rutgers Business School, Newark and New Brunswick, NJ, USA
Thilo Meyer-Brandis, University of Oslo, Oslo, Norway
Lalatendu Misra, University of Texas at San Antonio, San Antonio, TX, USA
Bruce Mizrach, Rutgers University, New Brunswick, NJ, USA
German Molina, Statistical and Applied Mathematical Sciences Institute, NC, USA
Chien-Chung Nieh, Tamkang University, Taipei, Taiwan, ROC
Michael S. Pagano, Villanova University, Philadelphia, PA, USA
Dinesh Pai, Rutgers University, Newark, NJ, USA
Szu-Yu Pai, National Taiwan University, Taipei, Taiwan, ROC
Oded Palmon, Rutgers University, New Brunswick, NJ, USA
Enlin Pan, Chicago Partners, Chicago, IL, USA
Dilip K. Patro, Office of the Comptroller of the Currency, Washington, DC, USA
Jenifer Piesse, University of London, London, UK
Ser-Huang Poon, University of Manchester, Manchester, UK
Philip Protter, Cornell University, Ithaca, NY, USA
Zhuo Qiao, University of Macau, Macau, China
Martin Reck, Deutsche Börse, Frankfurt, Germany
Samuel Ring, Rutgers University, Newark, NJ, USA
Joshua Ronen, New York University, New York, NY, USA
Mark Rubinstein, University of California, Berkeley, CA, USA
Andrzej Ruszczynski, Rutgers University, Newark, NJ, USA
Marina Santacroce, Politecnico di Torino, Department of Mathematics, C.so Duca degli Abruzzi 24, 10129 Torino, Italy
Bharat Sarath, Baruch College, New York, NY, USA
Sergei Sarkissian, McGill University, Montreal, QC, Canada
Robert A. Schwartz, Baruch College, New York, NY, USA
Thomas V. Schwarz, Grand Valley State University, Allendale, MI, USA
Louis O. Scott, Morgan Stanley, New York, NY, USA
Glenn Shafer, Rutgers University, Newark, NJ, USA
Chung-Hua Shen, National Taiwan University, Taipei, Taiwan, ROC
Frederick C. Shen, Coventree Inc, Toronto, ON, Canada
Larry Shepp, Rutgers University, Piscataway, NJ, USA
Yuan-Chung Sheu, National Chiao Tung University, Hsinchu, Taiwan, ROC
Junmin Shi, Rutgers University, Newark, NJ, USA
Yong Shi, University of Nebraska at Omaha, Omaha, NE, USA and Chinese Academy of Sciences, Beijing, China
Fu-Shuen Shie, National Taiwan University, Taipei, Taiwan, ROC
Pai-Ta Shih, Department of Finance, National Taiwan University, Taipei 106, Taiwan, ROC
Wei-Kang Shih, Rutgers University, Newark, NJ, USA
Timothy Simin, Pennsylvania State University, University Park, PA, USA
Ben J. Sopranzetti, Rutgers University, Newark, NJ, USA
Duane Stock, University of Oklahoma, Norman, OK, USA
Anant K. Sundaram, Tuck School, Hanover, NH, USA
Stephen J. Taylor, Lancaster University, Lancaster, UK
Revaz Tevzadze, Institute of Cybernetics, Georgia and Georgian-American University, Tbilisi, Georgia
Michael Theobald, Accounting and Finance Subject Group, University of Birmingham, Birmingham, UK
Hiroki Tsurumi, Rutgers University, New Brunswick, NJ, USA
María Vargas, University of Zaragoza, Zaragoza, Aragon, Spain
Itzhak Venezia, Hebrew University, Jerusalem, Israel
John Wald, Pennsylvania State University, University Park, PA, USA
Chia-Jane Wang, Manhattan College, New York, NY, USA
Kehluh Wang, National Chiao Tung University, Hsinchu, Taiwan, ROC
Shin-Yun Wang, National Dong Hwa University, Hualien, Taiwan, ROC
Yanzhi Wang, Yuan Ze University, Taoyuan, Taiwan, ROC
Daniel Weaver, Rutgers University, Piscataway, NJ, USA
Wing-Keung Wong, Hong Kong Baptist University, Hong Kong, Kowloon Tong, Hong Kong
Donald H. Wort, California State University East Bay, Hayward, CA, USA
ChunChi Wu, University of Missouri, Columbia, MO, USA
Liuren Wu, Baruch College, New York, NY, USA
Shu Wu, The University of Kansas, Lawrence, KS, USA
Wei-Hsiung Wu, National Taiwan University, Taipei, Taiwan, ROC
Yangru Wu, Rutgers Business School, Newark and New Brunswick, NJ, USA
Yi Lin Wu, National Tsing Hua University, Hsinchu, Taiwan, ROC
Yihong Xia, Wharton School, University of Pennsylvania, Philadelphia, PA, USA
Haipeng Xing, SUNY at Stony Brook, Stony Brook, NY, USA
Chin W. Yang, Clarion University of Pennsylvania, Clarion, PA, USA
Kenton K. Yee, Columbia Business School, New York, NY, USA
Gillian Hian Heng Yeo, Nanyang Technological University, Singapore, Singapore
Hai-Chin Yu, Chung Yuan University, Taoyuan, Taiwan, ROC
Min-Teh Yu, Providence University, Taichung, Taiwan, ROC
Yong Zeng, The University of Missouri at Kansas City, Kansas City, MO, USA
Feng Zhao, Rutgers University, Newark, NJ, USA
Xing Zhou, Rutgers Business School, Newark and New Brunswick, NJ, USA
Chapter 1
Theoretical Framework of Finance
Abstract The main purpose of this chapter is to explore important finance theories. First, we discuss discounted cash-flow valuation theory (classical financial theory). Second, we discuss the Modigliani and Miller (M and M) valuation theory. Third, we examine Markowitz portfolio theory. We then move on to the capital asset pricing model (CAPM), followed by the arbitrage pricing theory. Finally, we will look at the option pricing theory and futures valuation and hedging.

Keywords Discounted cash-flow valuation · M and M valuation theory · Markowitz portfolio theory · Capital asset pricing model · Arbitrage pricing theory · Option pricing model · Futures valuation and hedging
1.1 Introduction Value determination of financial instruments is important in security analysis and portfolio management. Valuation theories are the basic tools for determining the intrinsic value of alternative financial instruments. This chapter provides a general review of the financial theory that most students of finance would have already received in basic corporate finance and investment classes. Synthesis and integration of the valuation theories are necessary for the student of investments in order to have a proper perspective of security analysis and portfolio management. The basic policy areas involved in the management of a company are (1) investment policy, (2) financial policy, (3) dividend policy, and (4) production policy. Since the determination of the market value of a firm is affected by the way management sets and implements these policies, they are of critical importance to the security analyst. The security analyst must evaluate management decisions in each of these areas and convert information about company policy into price estimates of the firm’s securities. This chapter examines these policies within a financial theory framework, dealing with valuation models.
There are six alternative but interrelated valuation models of financial theory that might be useful for the analysis of securities and the management of portfolios:
1. Discounted cash-flow valuation theory (classical financial theory)
2. M and M valuation theory
3. Capital asset pricing model (CAPM)
4. Arbitrage pricing theory (APT)
5. Option-pricing theory (OPT)
6. Futures valuation and hedging

The discounted cash-flow valuation and M and M theories are discussed in the typical required corporate-finance survey course for both bachelor's and master's programs in business. The main purpose of this chapter is to review these theories and discuss their interrelationships. The discounted cash-flow model and some basic valuation concepts are first reviewed in Sect. 1.2. Section 1.3 discusses the four alternative valuation methods developed by M and M in their 1961 article; their propositions and their revision with taxes are explored, including possible applications of their theories in security analysis, and Miller's inclusion of personal taxes is also covered there. Section 1.4 discusses Markowitz portfolio theory. Section 1.5 includes a brief overview of CAPM concepts. Section 1.6 introduces the arbitrage pricing theory (APT). Sections 1.7 and 1.8 discuss option-pricing theory and futures valuation and hedging. The conclusion is presented in Sect. 1.9.
1.2 Discounted Cash-Flow Valuation Theory Discounted cash-flow valuation theory is the basic tool for determining the theoretical price of a corporate security. The price of a corporate security is equal to the present value of future benefits of ownership. For example, for common stock, these benefits include dividends received while the stock is owned plus capital gains earned during the ownership period. If we assume a one-period investment and a world of certain cash flows, the price paid for a share of
stock, P_0, will equal the sum of the present value of a certain dividend per share, d_1 (assumed to be paid as a single flow at year end), and the selling price per share, P_1:

P_0 = \frac{d_1 + P_1}{1 + k}    (1.1)

in which k is the rate of discount assuming certainty. P_1 can be similarly expressed in terms of d_2 and P_2:

P_1 = \frac{d_2 + P_2}{1 + k}    (1.2)

If the expression for P_1 in Equation (1.2) is substituted into Equation (1.1), a two-period expression is derived:

P_0 = \frac{d_1}{1 + k} + \frac{d_2}{(1 + k)^2} + \frac{P_2}{(1 + k)^2}    (1.3)
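To make Equations (1.1)–(1.3) concrete, the short Python sketch below discounts a hypothetical dividend stream and terminal selling price; the dividend figures, terminal price, and discount rate are illustrative assumptions, not values from the text.

```python
# A minimal sketch of Equations (1.1)-(1.3): today's price is the present
# value of the dividends received while the stock is held plus the present
# value of the selling price.  All inputs below are hypothetical.

def price_from_dividends(dividends, terminal_price, k):
    """P0 = sum_t d_t/(1+k)^t + P_n/(1+k)^n, with d_t paid at year-ends."""
    n = len(dividends)
    pv_dividends = sum(d / (1 + k) ** (t + 1) for t, d in enumerate(dividends))
    return pv_dividends + terminal_price / (1 + k) ** n

# Two-period example in the spirit of Equation (1.3):
# d1 = $2.00, d2 = $2.10, selling price P2 = $40, k = 10%.
print(round(price_from_dividends([2.00, 2.10], 40.0, 0.10), 2))  # 36.61
```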
It can be seen, then, that an infinite time-horizon model can be expressed as:

P_0 = \sum_{t=1}^{\infty} \frac{d_t}{(1 + k)^t}    (1.4)

Since the total market value of the firm's equity is equal to the market price per share multiplied by the number of shares outstanding, Equation (1.4) may be re-expressed in terms of total market value MV_0:

MV_0 = \sum_{t=1}^{\infty} \frac{D_t}{(1 + k)^t}    (1.5)

in which D_t = total dollars of dividends paid during year t. Using this basic valuation approach as a means of expressing the appropriate objective of the firm's management, the valuation of a firm's securities can be analyzed in a world of certainty.

1.2.1 Bond Valuation

Bond valuation is a relatively easy process, as the income stream the bondholder will receive is known with a high degree of certainty. Barring a firm's default, the income stream consists of the periodic coupon payments and the repayment of the principal at maturity. These cash flows must be discounted to the present using the required rate of return for the bond. The basic principles of bond valuation are represented in the equation:

PV = \sum_{t=1}^{n} \frac{CF_t}{(1 + k_b)^t}    (1.6)

where: PV = the present value of the bond; n = the number of periods to maturity; CF_t = the cash flow (interest and principal) received in period t; and k_b = the required rate of return of the bondholders (equal to the risk-free rate i plus a risk premium).

1.2.1.1 Perpetuity

The first (and most extreme) case of bond valuation involves a perpetuity, a bond with no maturity date and perpetual interest payments. Such bonds do exist. In 1814, the English government floated a large bond issue to consolidate the various small issues it had used to pay for the Napoleonic Wars. Such bonds are called consols, and the owners are entitled to a fixed amount of interest income annually in perpetuity. In this case, Equation (1.6) collapses into the following:

PV = \frac{CF}{k_b}    (1.7)

Thus, the valuation depends directly on the periodic interest payment and the required rate of return for the bond. It can be seen that higher required rates of return, necessitated by a higher rate of inflation or an increase in the perceived risk of the bond, lower the present value, decreasing the bond's market value. For example, if the stated annual interest payment on the perpetuity bond is $50 and the required rate of return in the market is 10%, the price of the security is:

PV = $50/0.10 = $500

If its issuing price had been $1,000, it can be seen that the required rate of return would have been only 5% (k_b = CF/PV = $50/$1,000 = 0.05, or 5%).

1.2.1.2 Term Bonds

Most bonds are term bonds, which mature at some definite point in time. Thus, Equation (1.6) should be respecified to take this fact into account:

PV = \sum_{t=1}^{n} \frac{I_t}{(1 + k_b)^t} + \frac{P_n}{(1 + k_b)^n}    (1.8)

where: I_t = the annual coupon interest payment; P_n = the principal amount (face value) of the bond; and n = the number of periods to maturity.
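The following Python sketch evaluates Equations (1.7) and (1.8) numerically. The consol inputs reproduce the $50 coupon / 10% required-return example above; the term-bond inputs (a 20-year, 5% annual-coupon bond) are assumed for illustration.

```python
# Hypothetical valuations using Equations (1.7) and (1.8).

def perpetuity_price(coupon, k_b):
    """Equation (1.7): PV = CF / k_b."""
    return coupon / k_b

def term_bond_price(coupon, face, k_b, n):
    """Equation (1.8): PV = sum_t I_t/(1+k_b)^t + P_n/(1+k_b)^n."""
    pv_coupons = sum(coupon / (1 + k_b) ** t for t in range(1, n + 1))
    return pv_coupons + face / (1 + k_b) ** n

print(perpetuity_price(50, 0.10))                     # 500.0, the consol example
print(round(term_bond_price(50, 1000, 0.10, 20), 2))  # 574.32: a 5% coupon bond
                                                      # priced to yield 10%
```

The term bond trades well below par because its 5% coupon is less than the 10% required return, consistent with the inflation discussion that follows.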
Again, it should be noted that the market price, PV, of a bond is affected by changes in the rate of inflation. If inflation increases, the discount rate must also increase to compensate the investor for the resultant decrease in the value of the debt repayment. The present value of each period's interest payment thus decreases, and the price of the bond falls. The bondholder is always exposed to interest-rate risk, the variance of bond prices resulting from fluctuations in the level of interest rates. Interest-rate risk, or price volatility of a bond caused by changes in interest-rate levels, is directly related to the term to maturity.

There are two types of risk premiums associated with interest-rate risk as it applies to corporate bonds. The bond maturity premium refers to the net return from investing in long-term government bonds rather than short-term bills. Since corporate bonds generally possess default risk, another component of corporate bond rates of return is the default premium. The bond default premium is the net increase in return from investing in long-term corporate bonds rather than in long-term government bonds.

Additional features of a bond can affect its valuation. Convertible bonds, those with a provision for conversion into shares of common stock, are generally more valuable than a firm's straight bonds for several reasons. First, the investor receives the potential of positive gains from conversion, should the market price of a firm's common stock rise above the conversion price. If the stock price is greater than the conversion price, the convertible bond generally sells at or above its conversion value. Second, the bondholder also receives the protection of fixed income payments, regardless of the current price of the stock – assuring the investor that the price of the bond will be at least equal to that of a straight bond, should stock prices fail to increase sufficiently. Third, for any given firm the coupon rate of return from its bonds is generally greater than the dividend rate of return (dividend yield) from its common stock – thus causing a measure of superiority for a convertible bond over its conversion into common stock until stock dividends rise above the bond's coupon rate. Even then, the convertible bond may be preferred by investors because of the higher degree of certainty of interest payments versus dividends, which would decline if earnings fall.

A sinking-fund provision may also increase the value of a bond, at least at its time of issue. A sinking-fund agreement specifies a schedule by which the sinking fund will retire the bond issue gradually over its life. By providing cash to the sinking fund for use in redeeming the bonds, this provision ensures the investor some potential demand for the bond, thus increasing slightly the liquidity of the investment.

Finally, the possibility that the bond may be called will generally lower its value relative to a noncallable bond. A call provision stipulates that the bond may be retired by the issuer at a certain price, usually above par or face value. Therefore, in periods of large downward interest movements,
a company may be able to retire a high coupon bond and issue new bonds with a lower interest payment requirement. A call feature increases the risk to investors in that their expected high interest payments may be called away from them, if overall interest rate levels decline.
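As a rough sketch of the conversion-value floor described above, the snippet below compares an assumed straight-bond value with the conversion value at two hypothetical stock prices; the conversion ratio and all dollar figures are made up for illustration.

```python
# A convertible bond should sell at or above the larger of its straight-bond
# value and its conversion value.  All figures are hypothetical.

straight_value = 920.0   # assumed value of an otherwise identical straight bond
ratio = 20.0             # assumed number of shares received upon conversion

for stock_price in (40.0, 50.0):   # below and above the $46 break-even price
    conversion_value = ratio * stock_price
    floor = max(straight_value, conversion_value)
    print(f"stock = ${stock_price:.0f}: price floor = ${floor:.0f}")
# stock = $40: price floor = $920  (straight-bond value dominates)
# stock = $50: price floor = $1000 (conversion value dominates)
```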
1.2.2 Common-Stock Valuation

Common-stock valuation is complicated by an uncertainty of cash flows to the investor necessarily greater than that for bond valuation.¹ Not only might the dividends voted to shareholders each period change in response to management's assessment concerning the current level of earnings stability, future earnings prospects, or other factors, but the price of the stock may also either rise or fall – resulting in either capital gains or losses, if the shares are sold. Thus, the valuation process requires the forecasting of both capital gains and the stream of expected dividends. Both must also be discounted at the required rate of return of the common stockholders:

P_0 = \frac{d_1}{1 + k} + \frac{d_2}{(1 + k)^2} + \cdots + \frac{P_n}{(1 + k)^n}    (1.9)

where: P_0 = the present value, or price, of the common stock per share; d = the dividend payment per share; k = the required rate of return of the common stockholders; and P_n = the price of the stock in period n when sold.

¹This is true because foregoing interest puts the firm into default, while missing dividend payments does not.

However, P_n can also be expressed as the sum of all discounted dividends to be received from period n forward into the future. Thus, the value at the present time can be expressed as an infinite series of discounted dividend payments:

P_0 = \sum_{t=1}^{\infty} \frac{d_t}{(1 + k)^t}    (1.4)

in which d_t is the dividend payment in period t.

Several possibilities exist regarding the growth of dividend payments over time. First, dividends may be assumed to be a constant amount, and the formula for the stock's valuation is simply Equation (1.7), where CF is the constant dividend and k is the required rate of return of the common stockholder. Second, dividends may be expected to grow at some constant rate, g. In such a case, a dividend at time t is simply the compound value of the present dividend (i.e., d_t = (1 + g)^t d_0). Under this assumption, if g < k, the valuation equation can be simplified to the following:

P_0 = \frac{d_1}{k - g}    (1.10)

This equation represents the Gordon growth model. Note that a critical condition for this model is that the constant growth rate of dividends must be less than the constant required rate of return. The zero-growth situation is a special case of this model, in which:

P_0 = \frac{d_1}{k}    (1.11)

Finally, dividends can exhibit a period of supernormal growth (i.e., g is greater than k) before declining to the normal growth situation assumed in the Gordon model (g is less than k). Supernormal growth often occurs during the "takeoff" phase in a firm's life cycle. That is, a firm may experience a life cycle analogous to that of a product: first, a low-profit introductory phase, then a takeoff phase of high growth and high profits, leveling off at a plateau during its mature stage, perhaps followed by a period of declining earnings. Computer and electronics manufacturers experienced a period of supernormal growth during the 1960s, as did semiconductor firms during the 1970s. Bioengineering firms appear to be the super growth firms of the 1980s.

The valuation of a supernormal growth stock requires some estimate of the length of the supernormal growth period. The current price of the stock will then consist of two components: (1) the present value of the stock during the supernormal growth period, and (2) the present value of the stock price at the end of the supernormal growth period, as the sketch at the end of this section illustrates:

P_0 = \sum_{t=1}^{n} \frac{d_0 (1 + g_s)^t}{(1 + k)^t} + \frac{d_{n+1}/(k - g_n)}{(1 + k)^n}    (1.12)

where: g_s = the supernormal growth rate; n = the number of periods before the growth drops from supernormal to normal; k = the required rate of return of the stockholders; and g_n = the normal growth rate of dividends (assumed to be constant thereafter).

As we can see from our development of the discounted cash-flow financial theory, the primary determinant of value for securities is the cash flow received by the investors. Anything that affects the cash flow, such as the dividend policy, investment policy, financing policy, and production policy of the firm, needs to be evaluated in order to determine a market price. Some shortcomings of this approach include the overemphasis on the evaluation of the individual firm to the exclusion of portfolio concepts and the interrelationship with the overall market indexes. Most of the classical models are also static in nature, overlooking the concept of dynamic growth. Nevertheless, a fundamental approach to security valuation – the stream of dividends approach – has evolved from this theory.
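A minimal sketch of the two-stage valuation in Equation (1.12) follows, assuming hypothetical inputs: d_0 = $1, five years of 25% supernormal growth, 6% normal growth thereafter, and k = 12%.

```python
# Two-stage dividend discount model, Equation (1.12), with assumed inputs.

def two_stage_ddm(d0, g_s, g_n, k, n):
    # Present value of dividends over the supernormal period.
    pv_super = sum(d0 * (1 + g_s) ** t / (1 + k) ** t for t in range(1, n + 1))
    # Terminal value at year n from the Gordon model, Equation (1.10),
    # discounted back to the present.
    d_n1 = d0 * (1 + g_s) ** n * (1 + g_n)
    pv_terminal = (d_n1 / (k - g_n)) / (1 + k) ** n
    return pv_super + pv_terminal

print(round(two_stage_ddm(1.0, 0.25, 0.06, 0.12, 5), 2))  # about 37.63
```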
1.3 M and M Valuation Theory

Modigliani and Miller (M and M 1961) have proposed four alternative valuation methods to determine the theoretical value of common stocks. This section discusses these valuation approaches in some detail. M and M's four more or less distinct approaches to the valuation of common stock are as follows:
1. the discounted cash-flow approach;
2. the current earnings plus future investment opportunities approach;
3. the stream of dividends approach; and
4. the stream of earnings approach.

Working from a valuation expression referred to by M and M as the "fundamental principle of valuation":

P_0 = \frac{1}{1 + k}(d_1 + P_1)    (1.13)

M and M further developed a valuation formula to serve as a point of reference and comparison among the four valuation approaches:

V_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} (X_t - I_t)    (1.14)

where: V_0 = the current market value of the firm; X_t = net operating earnings in period t; and I_t = new investment during period t.

In this context, the discounted cash-flow approach can be expressed:

V_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} (R_t - O_t)    (1.15)
in which R_t is the stream of cash receipts by the firm and O_t is the stream of cash outlays by the firm. This fundamental principle is based on the assumptions of "perfect markets," "rational behavior," and "perfect certainty" as defined by M and M. Since X_t differs from R_t and I_t differs from O_t only by the cost of goods sold and depreciation expense, if (R_t - O_t) equals (X_t - I_t), then (1.15) is equivalent to (1.14) and the discounted cash-flow approach is an extension of Equation (1.13), the fundamental valuation principle. Hence, the security analyst must be well versed in generally accepted accounting principles in order to evaluate the worth of accounting earnings of (X_t - I_t).

The investment-opportunities approach seems in some ways the most natural approach from the standpoint of an investor. This approach takes into account the ability of the firm's management to issue securities at "normal" market rates of return and invest in opportunities providing a rate higher than the normal rate of return. From this framework, M and M developed the following expression, which they show can also be derived from Equation (1.14):

V_0 = \frac{X_0}{k} + \sum_{t=0}^{\infty} \frac{I_t (k_t - k)}{k (1 + k)^{t+1}}    (1.16)

in which X_0 is the perpetual net operating earnings and k_t is the "higher than normal" rate of return on new investment I_t. From this expression it can be seen that if a firm cannot generate a rate of return on its new investments higher than the normal rate, k, the price/earnings ratio applied to the firm's earnings will be equal to 1/k, thus implying simple expansion rather than growth over time.

An important variable for security analysis is a firm's P/E ratio (or earnings multiple), defined as:

P/E ratio = Market price / Earnings per share

Conceptually the P/E ratio is determined by three factors: (1) the investor's required rate of return (k); (2) the retention ratio of the firm's earnings, b, where b is equal to 1 minus the dividend payout ratio; and (3) the firm's expected return on investment (r). Using the constant-growth model, Equation (1.10):

P_0 = \frac{d_1}{k - g}    (1.17)

P_0 = \frac{E_1 (1 - b)}{k - br}, \qquad \frac{P_0}{E_1} = \frac{1 - b}{k - br}    (1.18)

in which b is the retention rate and E_1 is the next period's expected profit. The P_0/E_1 ratio is theoretically equal to the payout ratio of the firm divided by the difference between the investor's required return and the firm's growth rate. In this equation a direct relationship has been identified between the price/earnings ratio and the discounted cash-flow valuation model.
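A small numerical check of Equation (1.18), using assumed values for b, k, and r; it also illustrates the point made after Equation (1.16) that when r equals the normal rate k, the multiple collapses to 1/k.

```python
# Forward P/E implied by Equation (1.18), with hypothetical inputs.

def forward_pe(b, k, r):
    """P0/E1 = (1 - b)/(k - b*r); requires g = b*r < k."""
    g = b * r
    if g >= k:
        raise ValueError("constant-growth model requires b*r < k")
    return (1 - b) / (k - g)

print(round(forward_pe(0.40, 0.12, 0.12), 2))  # 8.33 = 1/k: r = k adds no growth premium
print(round(forward_pe(0.40, 0.12, 0.18), 2))  # 12.5: r > k raises the multiple
```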
The stream-of-dividends approach has been by far the most popular in the literature of valuation; it was developed in the pre-M and M period. Assuming an infinite time horizon, this approach defines the current market price of a share of common stock as equal to the discounted present value of all future dividends:

P_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} d_t

Restating in terms of total market value:

V_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} D_t    (1.19)

With no outside financing, it can be seen that D_t = X_t - I_t and:

V_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} (X_t - I_t)

which is Equation (1.14). With outside financing through the issuance of shares of new common stock, it can be shown that:

V_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} (D_t + V_{t+1} - m_{t+1} P_{t+1})    (1.20)

in which m_{t+1} is the number of new shares issued at price P_{t+1}. For the infinite horizon, the value of the firm is equal to the investments it makes and the new capital it raises, or:

m_{t+1} P_{t+1} = I_t - (X_t - D_t)

Thus, Equation (1.20) can also be written in the form of Equation (1.14):

V_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} (X_t - I_t)    (1.14)
Given the M and M ideal assumptions, the above result implies the irrelevance of dividends because the market value of the dividends provided to new stockholders must always be precisely the same as the increase in current dividends. This is in direct disagreement with the findings of the discounted cash-flow model, where dividends are a major determinant of value. In this case, dividends have no impact on value, and the firm's investment policy is the most important determinant of value. Security analysis should concern itself with the future investment opportunities of the firm and forget about dividends.

M and M also developed the stream-earnings approach, which takes account of the fact that additional capital must be acquired at some cost in order to maintain the stream of future earnings at its current level. The capital to be raised
is I_t and its cost is k percent per period thereafter; thus, the current value of the firm under this approach can be stated:

V_0 = \sum_{t=0}^{\infty} \frac{1}{(1 + k)^{t+1}} (X_t - I_t)
which, again, is Equation (1.14). Under none of these four theoretical approaches does the term D_t remain in the final valuation expression, and because X_t, I_t, and k are assumed to be independent of D_t, M and M conclude that the current value of a firm is independent of its current and future dividend decisions. The amount gained by stockholders is offset exactly by the decline in the market value of their stock. In the short run, this effect is observed when a stock goes ex-dividend – that is, the market price of the stock falls by the amount of the dividend on the last day the old shareholders are entitled to receive a dividend payment. The stock's value depends only on the expected future earnings stream of the firm. Security analysts spend much time and effort forecasting a firm's expected earnings.

While the above analysis ignores the case in which external financing is obtained through the issuance of debt, in such a situation M and M's position rests upon their indifference proposition with respect to leverage (M and M 1958), discussed elsewhere in this chapter. Since that analysis shows that, under a set of assumptions consistent with their "fundamental principle of valuation," the real cost of debt in a world of no taxation is equal to the real cost of equity financing, M and M conclude that the means of external financing used to offset the payment of dividends does not affect their hypothesis that dividends are irrelevant.

Prior to Miller and Modigliani's (1961) article, the classical view held that dividend policy was a major determinant of the value of the corporation and that firms should seek their "optimal payout ratios" to maximize their value. M and M's conclusions about the irrelevance of dividend policy given investment policy collided head-on with the existing classical view. The view that the value of the firm is independent of dividend policy also extends into a world with corporate taxes but without personal taxes.
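The dividend-irrelevance conclusion can be verified numerically. The sketch below rolls Equation (1.20) backward over a hypothetical three-period earnings and investment stream, letting new equity make up any shortfall via m_{t+1}P_{t+1} = I_t - (X_t - D_t); the end-of-period timing and zero terminal value are simplifying assumptions.

```python
# Numerical check that D_t drops out of firm value.  All figures hypothetical.

def firm_value(X, I, D, k, terminal_value=0.0):
    """Roll V_t = (D_t + V_{t+1} - m_{t+1}P_{t+1}) / (1+k) backward in time."""
    V = terminal_value
    for X_t, I_t, D_t in zip(reversed(X), reversed(I), reversed(D)):
        new_equity = I_t - (X_t - D_t)      # capital raised from new shares
        V = (D_t + V - new_equity) / (1 + k)
    return V

X = [100, 110, 120]          # net operating earnings
I = [30, 30, 30]             # new investment
low_payout = [10, 10, 10]
high_payout = [90, 90, 90]
print(round(firm_value(X, I, low_payout, 0.10), 2))   # 197.37
print(round(firm_value(X, I, high_payout, 0.10), 2))  # 197.37: dividends are irrelevant
```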
1.3.1 Review and Extension of M and M Proposition I

The existence of an optimal capital structure has become one of the important issues for academicians and practitioners in finance. While classical finance theorists argue that there is an optimal capital structure for a firm, the new classical financial theory developed by M and M (1958, 1963) has cast doubt upon the existence of such an optimal structure. The specific assumptions that they made, consistent with the dividend irrelevance analysis previously outlined, include the following:
1. Capital markets are perfect (frictionless).
2. Both individuals and firms can borrow and lend at the risk-free rate.
3. Firms use risk-free debt and risky equity.
4. There are only corporate taxes (that is, there are no wealth taxes or personal income taxes).
5. All cash flow streams are perpetuities (that is, no growth).

Developing the additional concepts of risk class and homemade leverage, M and M derived their well-known Proposition I, both with and without corporate taxes.² If all firms are in the same risk class, then their expected risky future net operating cash flows (\tilde{X}) vary only by a scale factor. Under this circumstance, the correlation between two firms' net operating income (NOI) within a risk class should be equal to 1.0. This implies that the rates of return will be equal for all firms in the same risk class, that is:

R_{it} = \frac{\tilde{X}_{it} - \tilde{X}_{i,t-1}}{\tilde{X}_{i,t-1}}    (1.21)

and because \tilde{X}_{it} = C \tilde{X}_{jt}, where C is the scale factor:

R_{jt} = \frac{C \tilde{X}_{jt} - C \tilde{X}_{j,t-1}}{C \tilde{X}_{j,t-1}} = R_{it}    (1.22)

in which R_{it} and R_{jt} are the rates of return for the ith and jth firms, respectively. Therefore, if two streams of cash flow differ by only a scale factor, they will have the same distributions of returns and the same risk, and they will require the same expected return.

²In 1985 Franco Modigliani won the Nobel Prize for his work on the life cycle of savings and his contribution to what has become known as the M and M theory, discussed in this section.

The concept of homemade leverage is used to refer to the leverage created by individual investors who sell their own debt, while corporate leverage is used to refer to the debt floated by the corporation. Using the assumption that the cost of homemade leverage is equal to the cost of corporate leverage, M and M (1958) derived their Proposition I both with and without taxes. However, the Proposition I with taxes was not correct, and they subsequently corrected this result in their 1963 paper. Mathematically, M and M's Proposition I can be defined:

V_j = (S_j + B_j) = \bar{X}_j / \rho_k    (1.23)

and Proposition I with taxes can be defined as:

V_j^L = \frac{(1 - \tau_j)\bar{X}_j}{\rho_k} + \frac{\tau_j I_j}{r} = V_j^U + \tau_j B_j    (1.24)
In Equation (1.23), B_j, S_j, and V_j are, respectively, the market values of the debt, the common shares, and the firm; \bar{X}_j is the expected profit before deduction of interest, and \rho_k is the required rate of return, or the cost of capital, in risk class k. In Equation (1.24), \rho_k is the required rate of return used to capitalize the expected returns net of tax for the unlevered firm with long-run average earnings before tax and interest of \bar{X}_j in risk class k; \tau_j is the corporate tax rate for the jth firm, I_j is the total interest expense for the jth firm, and r is the market interest rate used to capitalize the certain cash inflows generated by risk-free debt. B_j is the total risk-free debt floated by the jth firm, and V^L and V^U are the market values of the leveraged and unleveraged firms, respectively. By comparing these two equations, we find that leverage will increase the firm's value by \tau_j B_j – that is, the corporate tax rate times the total debt floated by that firm.

One of the important implications of this proposition is that there is no optimal capital structure for the firm unless there are bankruptcy costs associated with its debt flotation. If there are bankruptcy costs, then a firm will issue debt until its tax benefit is equal to the bankruptcy cost, thus providing, in such a case, an optimal capital structure for the firm. In addition to bankruptcy costs, information signaling (see Leland and Pyle 1977, and Ross 1977a) and differential expectations between shareholders and bondholders can be used to justify the possible existence of an optimal capital structure for a firm.

The existence of an optimal capital structure is an important issue for security analysts to investigate because it affects the value of the firm and the value of the firm's securities. Is a firm with a high level of debt more valuable than a similar firm with very little debt? M and M say either that it does not matter or that the highly leveraged firm is more valuable. The important assumptions used to prove the M and M Proposition I with taxes are that (1) there are no transaction costs, (2) homemade leverage is equal to corporate leverage, (3) corporate debt is riskless, and (4) there is no bankruptcy cost. Overall, M and M's Proposition I implies that there is no optimal capital structure. If there is a tax structure that systematically provides a lower after-tax real cost of debt relative to the after-tax real cost of equity, the corporation will maximize the proportion of debt in its capital structure and will issue as much debt as possible to maximize the tax shield associated with the deductibility of interest.

Stiglitz (1969) extends M and M's proposition using a general equilibrium state-preference framework. He is able to show that M and M's results do not depend on risk classes, competitive capital markets, or agreement by investors. The only two fundamental assumptions are that there is no bankruptcy and that individuals can borrow at the same rate as firms. Stiglitz (1974) develops the argument that there may be a determinate debt-equity ratio for the economy as a whole, but not for the individual firm.
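A short numerical illustration of Proposition I with corporate taxes, Equation (1.24); the EBIT, cost of capital, tax rate, and debt level below are hypothetical.

```python
# V_L = V_U + tau * B, where V_U capitalizes after-tax operating earnings.

def levered_value(X_bar, rho_k, tau, B):
    V_U = (1 - tau) * X_bar / rho_k   # unlevered value of after-tax earnings
    return V_U + tau * B              # leverage adds the tax shield tau*B

print(levered_value(1000, 0.10, 0.34, 0))     # 6600.0: unlevered
print(levered_value(1000, 0.10, 0.34, 2000))  # 7280.0: tax shield = 0.34*2000 = 680
```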
1.3.2 Miller's Proposition on Debt and Taxes

Miller (1977) argues that although there is no optimal capital structure for an individual firm, in the aggregate case there may be an optimal structure. In balancing bankruptcy costs against the tax shelter, an optimal capital structure is derived, just as the classical view has always maintained.

The Tax Reform Act of 1986 taxes dividends and long-term capital gains at the same top rate of 28%. This is a major change from the old 50% rate on dividends and 20% rate on long-term capital gains. The new tax bill has also shifted the major tax burden to corporations and away from individuals. Even though the maximum corporate tax rate will decrease to 34% from the current top rate of 46%, corporations will be paying more taxes because of the loss of the Investment Tax Credits and the Accelerated Cost Recovery System depreciation allowances. These changes in the tax code will shift the emphasis of corporate management from retaining earnings in order to generate price appreciation and capital gains to the payout of corporate funds in the form of dividends.

In his presidential address at the Annual Meeting of the American Finance Association, Merton Miller (1977) incorporates personal taxes into the Modigliani and Miller (1958, 1963) argument for the relationship between the firm's leverage and cost of capital. M and M's Proposition I shows that the value of the leveraged firm equals the value of the unleveraged firm plus the tax shield associated with interest payments, as shown by Equation (1.24):

V^L = V^U + t_c B    (1.25)

where: V^L = the value of the leveraged firm; V^U = the value of the unleveraged firm; t_c = the corporate tax rate; and B = the value of the firm's debt.

Miller generalizes the M and M relationship shown in Equation (1.25) to include personal taxes on dividends and capital gains as well as taxes on interest income, to yield:

V^L = V^U + \left[1 - \frac{(1 - t_c)(1 - t_{ps})}{(1 - t_{pB})}\right] B    (1.26)
in which $\tau_{ps}$ is the personal tax rate on income from stock and $\tau_{pB}$ is the personal tax rate on income from bonds.
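To make the comparative statics of Equation (1.26) concrete, the following minimal sketch computes the gain from leverage for a few tax-rate combinations; the rates are hypothetical illustrations, not empirical estimates.

```python
# Gain from leverage under Miller (1977), Equation (1.26):
# V_L = V_U + [1 - (1 - tc)(1 - tps) / (1 - tpB)] * B

def gain_from_leverage(tc, tps, tpB, B):
    """Dollar increase in firm value from carrying debt B."""
    return (1.0 - (1.0 - tc) * (1.0 - tps) / (1.0 - tpB)) * B

B = 100.0  # hypothetical amount of debt
# Equal personal rates: the gain collapses to the M and M result tc*B = 34.0.
print(gain_from_leverage(tc=0.34, tps=0.28, tpB=0.28, B=B))
# No personal taxes: again the M and M case, 34.0.
print(gain_from_leverage(tc=0.34, tps=0.00, tpB=0.00, B=B))
# (1 - tc)(1 - tps) = (1 - tpB): the gain is zero -- Miller's aggregate equilibrium.
print(gain_from_leverage(tc=0.34, tps=0.00, tpB=0.34, B=B))
```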
1.4 Markowitz Portfolio Theory

Professors Markowitz, Miller, and Sharpe shared the Nobel Prize in Economic Sciences in 1990. Section 1.3 briefly discussed the M and M theory, and Sect. 1.5 will discuss the CAPM. In this section we discuss Professor Markowitz's portfolio theory. In the paper entitled “Markowitz, Miller, and Sharpe: The First Nobel Laureates in Finance,” Lee (1991) discusses this historical event in detail. Markowitz suggests two constrained optimization approaches to obtain the optimal portfolio weights. The first approach is to minimize the risk, or variance, of the portfolio, subject to the portfolio's attaining some target expected rate of return, and also subject to the portfolio weights summing to one. The problem can be stated mathematically:

$\mathrm{Min}\; \sigma_P^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} W_i W_j \sigma_{ij}$   (1.27)

Subject to:

1. $\sum_{i=1}^{n} W_i E(R_i) = E^*$   (1.28)

where $E^*$ is the target expected return, and

2. $\sum_{i=1}^{n} W_i = 1.0$   (1.29)

The first constraint simply says that the expected return on the portfolio should equal the target return determined by the portfolio manager. The second constraint says that the weights of the securities invested in the portfolio must sum to one. The Lagrangian objective function can be written:

$\mathrm{Min}\; L = \sum_{i=1}^{n}\sum_{j=1}^{n} W_i W_j \sigma_{ij} + \lambda_1\left[\sum_{i=1}^{n} W_i E(R_i) - E^*\right] + \lambda_2\left(\sum_{i=1}^{n} W_i - 1\right)$   (1.30)

Taking the partial derivatives of this equation with respect to each of the variables $W_1, W_2, W_3, \lambda_1, \lambda_2$ (for a three-security portfolio) and setting the resulting five equations equal to zero yields the minimization of risk subject to the Lagrangian constraints. This system of five equations in five unknowns can be solved by the use of matrix algebra. Equation (1.30) minimizes the portfolio variance given the portfolio's targeted expected rate of return.

In the second approach, the maximization problem in Equation (1.30) can be rewritten as

$\mathrm{Max}\; L = \sum_{i=1}^{n} W_i \bar{R}_i + \lambda_1\left\{\sigma_P - \left[\sum_{i=1}^{n}\sum_{j=1}^{n} W_i W_j \mathrm{Cov}(R_i, R_j)\right]^{1/2}\right\} + \lambda_2\left(\sum_{i=1}^{n} W_i - 1\right)$   (1.31)

where $\bar{R}_i$ is the average rate of return of security $i$ and $\sigma_P$ is the targeted standard deviation of the portfolio. Essentially, Equation (1.31) maximizes the expected rate of return of the portfolio given the targeted standard deviation of the portfolio. In this book, a large portion is dedicated to portfolio theory and its applications. Chapter 10 discusses portfolio optimization models and mean-variance spanning tests. Chapter 12 discusses estimation risk and power utility portfolio selection. Chapter 13 discusses theory and methods in international portfolio management. Chapter 17 provides a discussion of portfolio theory, the CAPM, and performance measures. Chapter 18 discusses intertemporal equilibrium models, portfolio theory, and the capital asset pricing model.
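As noted above, the first-order conditions of the first approach form a linear system that can be solved with matrix algebra. The sketch below is a minimal numerical illustration of Equations (1.27)-(1.30); the expected returns, covariance matrix, and target return are hypothetical inputs chosen for the example.

```python
import numpy as np

# Minimum-variance portfolio for a target expected return (Equations (1.27)-(1.30)).
mu = np.array([0.08, 0.10, 0.13])            # E(R_i), assumed
cov = np.array([[0.0400, 0.0060, 0.0040],
                [0.0060, 0.0625, 0.0090],
                [0.0040, 0.0090, 0.1024]])   # sigma_ij, assumed
E_star = 0.10                                # target portfolio return E*

n = len(mu)
# First-order conditions of the Lagrangian: 2*cov@W + l1*mu + l2*1 = 0,
# plus the two constraints mu@W = E* and sum(W) = 1.
A = np.zeros((n + 2, n + 2))
A[:n, :n] = 2.0 * cov
A[:n, n] = mu
A[:n, n + 1] = 1.0
A[n, :n] = mu
A[n + 1, :n] = 1.0
b = np.zeros(n + 2)
b[n] = E_star
b[n + 1] = 1.0

sol = np.linalg.solve(A, b)   # the "matrix algebra" solution of the five-equation system
W = sol[:n]
print("weights:", W.round(4), "sum:", round(W.sum(), 4))
print("E(Rp):", float(mu @ W), "variance:", float(W @ cov @ W))
```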
1.5 Capital Asset Pricing Model
At about the same time as M and M were developing their work, developments in portfolio theory were leading to a model describing the formation of capital asset prices in a world of uncertainty: the capital asset pricing model (CAPM). The CAPM is a generalized version of M and M theory in which M and M theory is provided with a link to the market:

$E(R_j) = R_f + \beta_j\,[E(R_m) - R_f]$   (1.32)
where: $R_j$ = the rate of return for security $j$; $\beta_j$ = a volatility measure relating the rate of return on security $j$ with that of the market over time; $R_m$ = the rate of return for the overall market (typically measured by the rate of return reflected by a market index, such as the S&P 500); and $R_f$ = the risk-free rate available in the market (usually the rate of return on U.S. Treasury bills is used as a proxy). In the CAPM framework, the valuation of a company's securities is dependent not only on its cash flows but also on those of other securities available for investment. It is assumed that much of the total risk, as measured by standard
deviation of return, can be diversified away by combining the stock of the firm being analyzed with those of other companies. Unless the cash flows from these securities are perfectly positively correlated, smoothing or diversification will take place. Thus, the security return can be divided into two components: a systematic component that is perfectly correlated with the overall market return and an unsystematic component that is independent of the market return:

Security return = Systematic return + Unsystematic return   (1.33)

Since the systematic return is perfectly correlated with the market return, it can be expressed as a constant, beta, multiplied by the market return ($R_m$). The beta is a volatility index measuring the sensitivity of the security return to changes in the market return. The unsystematic return is the residual of the relationship of $R_j$ with $R_m$. As has been previously noted, the standard deviation of the probability distribution of a security's rate of return is considered to be an appropriate measure of the total risk of that security. This total risk can be broken down into systematic and unsystematic components, just as noted above for security return:

Total security risk = Systematic risk + Unsystematic risk   (1.34)

Diversification is achieved only when securities that are not perfectly correlated with one another are combined. The unsystematic risk components tend to cancel each other, as they are all residuals from the relationship of security returns with the overall market return. In the process, the portfolio risk measure declines without any corresponding lowering of portfolio return (see Fig. 1.1). It is assumed in this illustration that the selection of additional securities as the portfolio size is increased is performed in some random manner, although any selection process other than intentionally choosing perfectly correlated securities will suffice. Unsystematic risk is shown to be gradually eliminated until the remaining portfolio risk is completely market related. While for an actual portfolio the systematic risk will not remain constant as securities are added, the intent is to show that the unsystematic-risk portion can be diversified away, leaving the market-related systematic portion as the only relevant measure of risk. Empirical studies have shown that a portfolio of about 20 securities not highly correlated with one another will provide a high degree of diversification.

Fig. 1.1 Diversification process

Although capital-market theory assumes that all investors will hold the market portfolio, it is neither necessary nor realistic to assume that all investors will be satisfied with the market level of risk. There are basically two ways that investors can adjust their risk level within the CAPM theoretical framework. First, funds for investment can
be divided between the market portfolio and risk-free securities. The capital-market line (CML) is derived assuming such a tradeoff function. This is illustrated in Fig. 1.2, in which point M is the market portfolio and points on the CML below and above M imply lending and borrowing at the risk-free rate. The second way of adjusting the portfolio risk level is by investing in a fully diversified portfolio of securities (that is, the correlation coefficient of the portfolio with the market, $r_{pm}$, is equal to 1.0) that has a weighted-average beta equal to the systematic-risk level desired:

$\beta_p = \sum_{j=1}^{n} W_j \beta_j$   (1.35)
in which $W_j$ is the proportion of total funds invested in security $j$. In the CAPM, systematic risk as measured by beta is the only risk that need be undertaken; therefore, it follows that no risk premium should be expected for the bearing of unsystematic risk. With that in mind, the relationship between expected return and risk can be better defined through the illustration of the security-market line (SML) in Fig. 1.3, in which $\bar{R}_m$ and $\beta_m$ are the expected return and risk level of the market portfolio. In equilibrium, all securities and combinations of securities are expected to lie along this line. In contrast, only fully diversified portfolios would be expected to fall along the CML, because only with full diversification is total risk equal to systematic risk alone. In addition to the static CAPM developed by Sharpe (1964) and others, Merton has discussed an intertemporal CAPM. In Chap. 6 we will discuss the static CAPM and beta forecasting, and in Chap. 18 we will focus on the intertemporal CAPM.
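As a small numerical reading of Equations (1.32) and (1.35), the sketch below prices two securities off the SML and computes a portfolio beta; the risk-free rate, market return, betas, and weights are all hypothetical.

```python
# Security-market line, Equation (1.32): E(Rj) = Rf + beta_j * (E(Rm) - Rf)
def sml_expected_return(beta, rf=0.05, rm=0.11):   # rf and rm are assumed values
    return rf + beta * (rm - rf)

betas = [0.8, 1.3]                                 # hypothetical security betas
for b in betas:
    print(f"beta={b}: E(R)={sml_expected_return(b):.4f}")

# Portfolio beta, Equation (1.35): weighted average of the component betas.
weights = [0.6, 0.4]
beta_p = sum(w * b for w, b in zip(weights, betas))
print("portfolio beta:", beta_p, "-> E(Rp):", sml_expected_return(beta_p))
```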
Fig. 1.2 Capital market line

Fig. 1.3 Security market line

1.6 Arbitrage Pricing Theory

1.6.1 Ross's Arbitrage Model Specification

This section focuses on two related forms of the arbitrage pricing model (APM). The first is the model as originally proposed by Ross (1976). The initial and probably the most prominent assumption made by the APT concerns the return-generating process for assets. Specifically, individuals are assumed to believe (homogeneously) that the random returns on the set of assets being considered are governed by a k-factor generating model of the form:

$\tilde{r}_i = E_i + b_{i1}\tilde{\delta}_1 + \cdots + b_{ik}\tilde{\delta}_k + \tilde{\epsilon}_i \quad (i = 1, \ldots, n)$   (1.36)

where: $\tilde{r}_i$ = the random return on the $i$th asset; $E_i$ = the expected return on the $i$th asset; $\tilde{\delta}_j$ = the $j$th factor common to the returns of all assets under consideration, with a mean of zero – common factors that in essence capture the systematic component of risk in the model; $b_{ij}$ = a coefficient called a factor loading that quantifies the sensitivity of asset $i$'s returns to the movements in the common factor $\tilde{\delta}_j$ (and is analogous to the beta in the CAPM); and $\tilde{\epsilon}_i$ = an error term, or unsystematic risk component, idiosyncratic to the $i$th asset, with mean zero and variance equal to $\sigma_{\epsilon_i}^2$. Moreover, it is assumed that $\tilde{\epsilon}_i$ reflects the random influence of information that is unrelated to other assets. Thus, the following condition is assumed to hold:

$E\{\tilde{\epsilon}_i \mid \tilde{\delta}_j\} = 0$   (1.37)

as well as independence of $\tilde{\epsilon}_i$ and $\tilde{\epsilon}_j$ for all $i \neq j$. Also, for any two securities $i$ and $j$:

$E\{\tilde{\epsilon}_i \tilde{\epsilon}_j\} = 0$   (1.38)
for all $i$ and $j$, where $i \neq j$. If this last condition did not hold – that is, if there were too strong a dependence between $\tilde{\epsilon}_i$ and $\tilde{\epsilon}_j$ – it would be equivalent to saying that more than the $k$ hypothesized common factors existed. Finally, it is assumed that for the set of $n$ assets under consideration, $n$ is much greater than the number of factors $k$. Before developing Ross's riskless arbitrage argument, it is essential to examine Equation (1.36) more closely and draw some implications from its structure. First, consider the effect of omitting the unsystematic risk terms $\tilde{\epsilon}_i$. Equation (1.36) would then imply that each asset $i$ has returns $\tilde{r}_i$ that are an exact linear combination of the returns on a riskless asset (with constant return) and the returns on the $k$ other factors or assets (or column vectors) $\tilde{\delta}_1, \ldots, \tilde{\delta}_k$. Moreover, the riskless return and each of the $k$ factors can be expressed as a linear combination of $k+1$ other returns – for example, $r_1$ through $r_{k+1}$ – in this type of setting. Taking this logic one step further, since any other asset return is a linear combination of the factors, it must also be a linear combination of the returns of the first $k+1$ assets. Hence, portfolios composed from the first $k+1$ assets must be perfect substitutes for all other assets in the market. Consequently, there must be restrictions on the individual returns generated by the model, as perfect substitutes must be priced equivalently. This sequence of mathematical logic is the core of the APT: only a few systematic components of risk exist in the economy, and consequently many portfolios will be close substitutes, thereby demanding the same value. To initiate Ross's arbitrage argument, it is best to start with the assumption of Equation (1.36). Next, presume an investor who is contemplating an alteration of the currently held portfolio; the difference between any new portfolio and the old portfolio will be quantified by changes
in the investment proportions $x_i\ (i = 1, \ldots, n)$. The $x_i$ represents the dollar amount purchased or sold of asset $i$ as a fraction of total invested wealth. The investor's portfolio investment is constrained to hold to the following condition:

$\sum_{i=1}^{n} x_i = 0$   (1.39)
In words, Equation (1.39) says that additional purchases of assets must be financed by sales of others. Portfolios that require no net investment, such as $x = (x_1, \ldots, x_n)$, are called arbitrage portfolios. Now, consider an arbitrage portfolio chosen in the following manner. First, the portfolio must be chosen to be well diversified by keeping each element $x_i$ of order $1/n$ in size. Second, the $x_i$ of the portfolio must be selected in such a way as to eliminate all systematic risk (for each $h$):

$x b_h = \sum_{i=1}^{n} x_i b_{ih} = 0 \quad (h = 1, \ldots, k)$   (1.40)
The returns on any such arbitrage portfolio can be described:

$x\tilde{r} = (xE) + (xb_1)\tilde{\delta}_1 + \cdots + (xb_k)\tilde{\delta}_k + (x\tilde{\epsilon}) \approx xE + (xb_1)\tilde{\delta}_1 + \cdots + (xb_k)\tilde{\delta}_k = xE$

where $x\tilde{r} = \sum_{i=1}^{n} x_i \tilde{r}_i$ and $xE = \sum_{i=1}^{n} x_i E_i$. Note that the term $(x\tilde{\epsilon})$ is (approximately) eliminated by the effect of holding a well-diversified portfolio of $n$ assets where $n$ is large. Using the law of large numbers, if $\bar{\sigma}^2$ denotes the average variance of the $\tilde{\epsilon}_i$ terms, and assuming for simplicity that each $x_i$ approximately equals $1/n$ and that the $\tilde{\epsilon}_i$ are mutually independent:

$\mathrm{Var}(x\tilde{\epsilon}) = \mathrm{Var}\left(\dfrac{1}{n}\sum_{i}\tilde{\epsilon}_i\right) = \dfrac{\sum_i \mathrm{Var}(\tilde{\epsilon}_i)}{n^2} = \dfrac{\bar{\sigma}^2}{n}$   (1.41)
Thus, if $n$ is large, the variance of $x\tilde{\epsilon}$ will be negligible. Reconsidering the steps up to this point, note that a portfolio has been created that has no systematic or unsystematic risk and uses no wealth. Under conditions of equilibrium, it can be stated unequivocally that all portfolios of these $n$ assets that satisfy the conditions of using no wealth and having no risk must also earn no return on average. In other words, there are no free lunches in an efficient market, at least not for any extended period of time. Therefore, the expected return on the arbitrage portfolio can be expressed:

$x\tilde{r} = xE = \sum_{i=1}^{n} x_i E_i = 0$   (1.42)
Another way to state the preceding statements and results is through linear algebra. In general, any vector $x$ with elements on the order of $1/n$ that is orthogonal to the constant vector and to each of the coefficient vectors $b_h\ (h = 1, \ldots, k)$ must also be orthogonal to the vector of expected returns. A further algebraic consequence of this statement is that the expected return vector $E$ must be a linear combination of the constant vector and the $b$ vectors. Using algebraic terminology, there exist $k+1$ weights $(\lambda_0, \lambda_1, \ldots, \lambda_k)$ such that:

$E_i = \lambda_0 + \lambda_1 b_{i1} + \cdots + \lambda_k b_{ik}, \quad \text{for all } i$   (1.43)
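The orthogonality argument can be checked numerically: when expected returns satisfy the exact pricing of Equation (1.43), any portfolio that uses no wealth (Equation (1.39)) and carries no factor risk (Equation (1.40)) earns a zero expected return, as Equation (1.42) requires. The loadings and premia below are hypothetical, generated only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 2                               # many assets, few factors
B = rng.normal(size=(n, k))                # hypothetical factor loadings b_ih
lam0, lam = 0.03, np.array([0.04, 0.02])   # hypothetical lambda_0 and factor premia
E = lam0 + B @ lam                         # exact pricing, Equation (1.43)

# Build an arbitrage portfolio x: orthogonal to the constant vector and to
# every loading column, i.e., project a random candidate onto the null
# space of those constraints.
constraints = np.column_stack([np.ones(n), B])          # n x (k+1)
x = rng.normal(size=n)
x -= constraints @ np.linalg.lstsq(constraints, x, rcond=None)[0]

print("net investment:", float(np.ones(n) @ x))   # ~0, Equation (1.39)
print("factor exposures:", (B.T @ x).round(12))   # ~0, Equation (1.40)
print("expected return:", float(E @ x))           # ~0, Equation (1.42)
```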
In addition, if there exists a riskless asset with return $E_0$, which can be said to be the common return on all zero-beta assets – that is, $b_{ih} = 0$ (for all $h$) – then:

$E_0 = \lambda_0$

Utilizing this definition and rearranging:

$E_i - E_0 = \lambda_1 b_{i1} + \cdots + \lambda_k b_{ik}$   (1.44)
The pricing relationship depicted in Equation (1.44) is the central conclusion of the APT. Before exploring the consequences of this pricing model through a simple numerical example, it is best to first give some interpretation to the $\lambda_h$, the factor risk premia. If portfolios are formed with a systematic risk of 1 relative to factor $h$ and no risk on the other factors, then each $\lambda_h$ can be interpreted as:

$\lambda_h = \bar{E}_h - E_0$   (1.45)
In words, each $\lambda_h$ can be thought of as the excess return, or market risk premium, on portfolios with only systematic factor $h$ risk. Hence, Equation (1.44) can be rewritten:

$E_i - E_0 = (\bar{E}_1 - E_0)b_{i1} + \cdots + (\bar{E}_k - E_0)b_{ik}$   (1.46)
The implications that arise from the arguments concerning the APT constructed thus far can be summarized in the following statement: the APT yields a statement of relative pricing on subsets of the universe of assets. Moreover, note that the arbitrage pricing model of Equations (1.44) or (1.46) can be tested by examining only subsets of the set of all returns. Consequently, the market portfolio plays no special role in the APT, since any well-diversified portfolio could serve the same purpose. Hence, the theory can be empirically tested on any set of data, and the results should be generalizable to the entire market.
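A short numerical reading of Equation (1.46) follows, with hypothetical factor-portfolio returns and loadings: given the zero-beta return and the expected returns on the pure-factor portfolios, an asset's expected return follows directly from its loadings.

```python
# APT pricing, Equation (1.46):
# E_i - E_0 = (E1bar - E_0)*b_i1 + ... + (Ekbar - E_0)*b_ik
E0 = 0.03                 # zero-beta (riskless) return, assumed
E_factor = [0.08, 0.05]   # expected returns on pure-factor portfolios, assumed
b_i = [1.2, 0.5]          # hypothetical loadings of asset i

E_i = E0 + sum((Eh - E0) * b for Eh, b in zip(E_factor, b_i))
print(E_i)   # 0.03 + 0.05*1.2 + 0.02*0.5 = 0.10
```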
Even though the APT is very general and based on few assumptions, it provides little guidance concerning the identification of the priced factors. Hence, empirical research must achieve two goals:
1. Identify the number of factors.
2. Identify the economic forces underlying each factor.
Chapter 64 will discuss the relationship between liquidity risk and the arbitrage pricing theory.

Fig. 1.4 Theoretical and actual values of a call option
1.7 Option Valuation

Option contracts give their holders the right to buy or sell a specific asset at some specified price on or before a specified date. Since these contracts can be valued in relation to common stock, the basic concepts involved have a number of applications to financial theory and to the valuation of other financial instruments. While there are a variety of option contracts – for example, call options, put options, combinations of calls and puts, convertible securities, and warrants – this chapter's discussion is limited to call options. A call option gives the holder the right to buy a share of stock at a specified price, known as the exercise price, and the basic American option can be exercised at any time through the expiration date. The value of the option at expiration is the difference between the market price of the underlying stock and its exercise price (with a minimum value of zero, of course). While several factors affect the value of an option, the most important factor is the price volatility of the stock – the greater the volatility, the greater the value of the option, other things remaining the same. We will also note that the longer the time left before expiration and the higher the level of interest rates in the market, the greater the option value, all other things held the same. The theoretical value of a call option at expiration is the difference between the market price of the underlying common stock, $P_s$, and the exercise price of the option, $E$, or zero, whichever is greater:

$C = \mathrm{Max}(P_s - E,\, 0)$   (1.47)
When the price of the stock is greater than the exercise price, the option has a positive theoretical value that will increase dollar for dollar with the price of the stock. When the market price of the stock is equal to or less than the exercise price, the option has a theoretical value of zero, as shown in Fig. 1.4. Nevertheless, as long as some time remains before expiration, the actual market price of the option (referred to as the option premium at the time of issue) is likely to be greater than its theoretical value. This increment above the
theoretical value is called the time value or speculative value of the option, and its size will depend primarily on the perceived likelihood of a profitable move on the price of the stock before expiration of the option. The full range of possible values for the market price of the option is from the theoretical value on the low side to the market price of the stock itself on the high side. For the option price to be equal to the stock price, an infinite time to expiration would be implied. For the option price to be equal to the theoretical value only, imminent expiration would be implied. For virtually all options for which the value would be determined, however, the option price would fall somewhere between these two extremes. Because an option costs less than its underlying stock, the percentage change in option price is greater than the percentage change in stock price, given some increase in the market price of the stock. Thus, a leveraged rate of return can be earned by investment in the option rather than the stock. As stock price continues to increase, the difference between the percentage change in option price and the percentage change in stock price will tend to converge. Thus far it has been shown that the value of an option will be a function of the underlying stock price, the exercise price of the option, and the time to maturity. Yet there is still another factor that is probably the single most important variable affecting the speculative value of the option. That is the price volatility of the underlying stock. The greater the probability of significant change in the price of the stock, the more likely it is that the option can be exercised at a profit before expiration. There is another factor affecting the speculative premium for options. This is the level of interest rates in the market – specifically for option analysis, the call money rate charged by brokers for the use of margin in common-stock accounts. As this concept is discussed later, it is sufficient here to point out that the leverage achieved through option investment is similar to that achieved through direct margin purchase of the underlying common stock, but without the explicit interest cost involved in the latter. Thus, the higher the call money rate, the greater the savings from the use of options and the greater the speculative value of the option.
To summarize, there are five variables necessary to determine the value of an American call option (ignoring dividends on the common stock):

1. and 2. Stock price and exercise price: The relationship between these two prices determines whether the option has a positive theoretical value.
3. Time to maturity: The longer the time to maturity, the greater the speculative value of the option, because the chances for a profitable movement in the price of the stock are increased.
4. Volatility of stock price: There is a positive relationship between the volatility of the underlying stock price and the speculative value of the option, because with greater volatility there is greater potential for gain on the upside and greater benefit from the downside protection involved with the option.
5. Interest rate: The higher the call money rate for direct margin purchase of common stock, the greater the relative value of being able to achieve equal amounts of leverage through the alternative of option purchase.

The factors that affect the value of an option can be written in functional form:

$C = f(S, X, \sigma^2, T, r_f)$   (1.48)
where: $C$ = the value of the option; $S$ = the stock price; $X$ = the exercise price; $\sigma^2$ = the variance of the stock; $T$ = the time to expiration; and $r_f$ = the risk-free rate.

The value of the option increases as a function of the value of the stock for a given exercise price and maturity date. The lower the exercise price, the greater the value of the option. The longer the time to maturity, the higher the value of the option. The holder of an option will prefer more variance in the price of the stock to less: the greater the variance (price volatility), the greater the probability that the stock price will exceed the exercise price and thus benefit the holder. Considering two related financial securities – common stock and the option on the common stock – it is possible to illustrate how a risk-free hedged position can be developed, so that unprofitable price movements in one of the securities will be offset by profitable price movements in the other. The hedge ratio determines the portion of stock held long in relation to the options in the short position (or vice versa). With a complete hedge, the value of the hedged position can be the same regardless of the stock-price outcome. In efficient financial markets, the rate of return earned on perfectly hedged positions will be the risk-free rate. Consequently, it is then possible to determine
the appropriate option price at the beginning of the period. If the actual market price is above or below this value, arbitrage would then drive the option price toward its correct level. This process and the development of the Black-Scholes (1973) continuous-time option-pricing model will be discussed in Chaps. 23, 24, and 27. The Cox et al. (1979) discrete-time binomial option-pricing model will be analyzed in Chaps. 25, 26, and 28.
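Although the Black-Scholes model itself is developed in later chapters, a minimal sketch of its call-price formula illustrates how the five variables in Equation (1.48) combine; the inputs are hypothetical and dividends are ignored, as in the discussion above.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, X, sigma, T, rf):
    """European call value C = f(S, X, sigma^2, T, rf), no dividends."""
    d1 = (log(S / X) + (rf + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - X * exp(-rf * T) * norm_cdf(d2)

# Hypothetical inputs; the comparative statics match the text:
# the call value rises with S, sigma, T, and rf, and falls with X.
print(bs_call(S=100, X=95, sigma=0.25, T=0.5, rf=0.05))
print(bs_call(S=100, X=95, sigma=0.40, T=0.5, rf=0.05))  # higher volatility, higher value
```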
1.8 Futures Valuation and Hedging

A basic assumption of finance theory is that investors are risk averse. If we equate risk with uncertainty, can we question the validity of this assumption? What evidence is there? As living, functional proof of the appropriateness of the risk-aversion assumption, there exist entire markets whose sole underlying purpose is to allow investors to display their uncertainties about the future. These particular markets, whose primary focus is the future, are called just that: futures markets. These markets allow for the transfer of risk from hedgers (risk-averse individuals) to speculators (risk-seeking individuals). A key element necessary for the existence of futures markets is the balance between the number of hedgers and speculators who are willing to transfer and accept risk. A futures contract is a standardized legal agreement between a buyer and a seller, who promise now to exchange a specified amount of money for goods or services at a future time. Of course, there is nothing really unusual about a contract made in advance of delivery. For instance, whenever something is ordered rather than purchased on the spot, a futures (or forward) contract is involved. Although the price is determined at the time of the order, the actual exchange of cash for the merchandise takes place later. For some items the lag is a few days, while for others (such as a car) it may be months. Moreover, a futures contract imposes a legal obligation on both parties of the contract to fulfill its specifications. To guarantee fulfillment of this obligation, a "good-faith" deposit, also called margin, may be required from the buyer (and the seller, if he or she does not already own the product). To ensure consistency in the contracts and to help develop liquidity, futures exchanges have been established. These exchanges provide a central location and a standardized set of rules to enhance the credibility of these markets and thus generate an orderly, liquid arena for the price determination of individual commodities at distinct points in the future. A substantial increase in the number of types of futures contracts offered by the exchanges has occurred over the last decade. At the same time, the growth in futures trading volume has been phenomenal. Two explanations can be offered for this increase in futures activity. These increases
can be intuitively correlated with the growing levels of uncertainty in many facets of the economic environment – for example, inflation and interest rates. A second view is based on the argument that even though the world has not become any more uncertain, the increased integration of financial and real markets has increased the risk exposure of any given individual. The tremendous growth in the home-mortgage and consumer-debt financial markets has allowed the purchase of more expensive real assets. This rise in individual financial leverage has increased individual exposure to interest-rate fluctuations, thereby increasing the requirements for risk-sharing across markets or between individuals with varied portfolios. Futures markets have the potential to help people manage or transfer the uncertainties that plague the world today. This section examines the basic types of futures contracts offered and the functioning of futures markets. In addition, the uses of financial and index futures are illustrated, and the theoretical pricing concepts related to these financial instruments are discussed. The important terms associated with futures contracts and futures markets are defined, and an analysis of futures markets follows. A theory of valuation is introduced, and the section closes with a discussion of various hedging strategies and concepts.
1.8.1 Futures Markets: Overview

In the most general sense, the term commodity futures is taken to embrace all existing futures contracts. Nevertheless, for purposes of clarity and classification, its meaning here is restricted to a limited segment of the total futures markets. Accordingly, futures contracts can be classified into three main types:

1. Commodity futures
2. Financial futures
3. Index futures

Within this classification, commodity futures include all agriculturally related futures contracts with underlying assets such as corn, wheat, rye, barley, rice, oats, sugar, coffee, soybeans, frozen orange juice, pork bellies, live cattle, hogs, and lumber. Also within the commodity-futures framework are futures contracts written on precious metals, such as gold, silver, copper, platinum, and palladium, and contracts written on petroleum products, including gasoline, crude oil, and heating oil. Many of the futures contracts on metals and petroleum products were introduced as recently as the early 1980s. Producers, refineries, and distributors, to name only a few potential users, employ futures contracts to assure a particular
price or supply – or both – for the underlying commodity at a future date. Futures-market participants are divided into two broad classes: hedgers and speculators. Hedging refers to a futures-market transaction made as a temporary substitute for a cash-market transaction to be made at a later date. The purpose of hedging is to lock in current prices by using futures transactions. For example, banks and corporations can be hedgers when they use futures to fix future borrowing and lending rates. Futures-market speculation involves taking a short or long futures position solely to profit from price changes. If you think that interest rates will rise because of an increase in inflation, you can sell T-bill futures and make a profit if interest rates do rise and the value of T-bills falls. Financial futures are a trading medium initiated with the introduction of contracts on foreign currencies at the International Monetary Market (IMM) in 1972. In addition to futures on foreign currencies, financial futures include contracts based on Treasury bonds (T-bonds), Treasury bills (T-bills), Treasury notes (T-notes), bank certificates of deposit, Eurodollars, and GNMA mortgage securities. These latter types of financial futures contracts are also referred to as interest-rate futures, as their underlying asset is an interest-bearing security. While foreign-currency futures arose with the abolition of the Bretton Woods fixed exchange-rate system during the early 1970s, interest-rate futures surged in popularity and number following the change in U.S. monetary policy in October 1979. The effect of the Federal Open Market Committee's decision to deemphasize the traditional practice of "pegging" interest rates was to greatly increase the volatility of market interest rates. Thus, interest-rate changes have become a highly prominent risk to corporations, investors, and financial institutions. Index futures represent the newest and boldest innovation in the futures market to date. An index-futures contract is one for which the underlying asset is actually a portfolio of assets – for example, the Major Market Index (MMI) includes 20 stocks traded on the NYSE, and the S&P index includes 500 stocks. Contracts on more diverse types of indexes include a high-quality bond index, an interest-rate index composed of interest-bearing market securities, and the consumer price index. Requiring physical delivery of the 500 stocks constituting the S&P 500 stock index would certainly have dampened enthusiasm for this and similar index contracts. Because of this, an index-futures contract is settled on the basis of its cash value when the contract matures. The cash value of the contract is equal to the closing index value on its last trading day multiplied by a dollar amount of $500. Many portfolio managers are taking advantage of index futures to alter their portfolios' risk-return distributions.
1.8.2 The Valuation of Futures Contracts

The discussions of each of the three classifications of futures contracts have pointed out pricing idiosyncrasies and have examined specific pricing models for particular types of contracts. Nevertheless, the underlying tenets of any particular pricing model have their roots in a more general theoretical framework of valuation. Consequently, the focus is now on the traditional concepts of futures contract valuation.
1.8.2.1 The Arbitrage Argument

An instant before the futures contract matures, its price must be equal to the spot (cash) price of the underlying commodity, or:

$F_{t,T} = S_t$   (1.49)

where: $F_{t,T}$ = the price of the futures contract at time $t$ that matures at time $T$, where $T > t$ and $T - t$ is a very small interval of time; and $S_t$ = the spot price of the underlying commodity at time $t$. If Equation (1.49) did not hold, an arbitrage condition would prevail. More specifically, when $t = T$ at the maturity of the contract, all trading on the contract ceases and the futures price equals the spot price. If an instant before maturity $F_{t,T} < S_t$, one could realize a sure profit (an arbitrage profit) by simultaneously buying the futures contract (which is undervalued) and selling the spot commodity (which is overvalued). The arbitrage profit would equal:

$S_t - F_{t,T}$   (1.50)

However, if $F_{t,T} > S_t$ is the market condition an instant before maturity, smart traders would recognize this arbitrage condition and sell futures contracts and buy the spot commodity until $t = T$ and $F_{t,T} = S_t$. In fact, the effect of selling the futures and buying the spot would bid their prices down and up, respectively. Thus, the arbitrage process would alleviate any such pricing disequilibrium between the futures contract and its underlying spot commodity.

1.8.2.2 Interest Costs

The previous simplified argument demonstrated that the futures and spot prices must be equal an instant before the contract's maturity. This development assumes no costs in holding the spot commodity or carrying it (storing it) across time. If such a market condition held, Equation (1.49) could be extended to apply to any point of time where $t < T$. However, by having to buy or sell the spot commodity to carry out the arbitrage process, the trader would incur certain costs. For instance, if the spot commodity were purchased because it is undervalued relative to the futures, the trader or arbitrageur would incur an opportunity or interest cost. Any funds he or she tied up in the purchase of the commodity could alternatively be earning some risk-free interest rate $R_f$ through investment in an interest-bearing risk-free security. Therefore, the futures price should account for the interest cost of holding the spot commodity over time, and consequently Equation (1.49) can be modified to:

$F_{t,T} = S_t(1 + R_{f,T-t})$   (1.51)

where $R_{f,T-t}$ is the risk-free opportunity cost, or interest income, that is lost by tying up funds in the spot commodity over the interval $T - t$.
1.8.2.3 Carrying Costs

Since theories on the pricing of futures contracts were developed long before the introduction of financial or index futures, the costs of storing and insuring the spot commodity were considered relevant factors in the price of a futures contract. That is, someone who purchases the spot commodity to hold from time $t$ to a later period $T$ incurs the costs of actually housing the commodity and insuring it in case of fire or theft. In the case of livestock such as cattle or hogs, the majority of this cost would be in feeding. The holder of a futures contract avoids these costs borne by the spot holder, making the value of the contract relative to the spot commodity increase by the amount of these carrying costs. Therefore, Equation (1.51) can be extended:

$F_{t,T} = S_t(1 + R_{f,T-t}) + C_{T-t}$   (1.52)

where $C_{T-t}$ is the carrying cost associated with the spot commodity for the interval $T - t$.
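A small sketch of the cost-of-carry relationships in Equations (1.51) and (1.52), with hypothetical numbers; it also reports the basis defined later in Equation (1.55).

```python
# Cost-of-carry pricing of a futures contract.
def futures_fair_value(spot, rf_interval, carry_cost=0.0):
    """F_{t,T} = S_t * (1 + R_{f,T-t}) + C_{T-t}, Equations (1.51)-(1.52)."""
    return spot * (1.0 + rf_interval) + carry_cost

S_t = 100.0   # spot price, assumed
Rf = 0.02     # risk-free rate over the interval T - t, assumed
C = 1.50      # carrying (storage, insurance) costs over the interval, assumed

F = futures_fair_value(S_t, Rf, C)
print("fair futures price:", F)   # 103.5
print("basis F - S:", F - S_t)    # here equal to financial plus carrying costs
```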
1.8.2.4 Supply and Demand Effects

As for other financial instruments or commodities, the price of a futures contract is affected by expectations of future supply and demand conditions. The effects of supply and demand for the current spot commodity (as well as for the future spot commodity) have not yet been considered in this analysis. If the probability exists that future supplies of the spot commodity might significantly differ from current supplies, then this will affect the futures price. The discussion up to this point has assumed that the aggregate supply of the commodity was fixed over time and that demand remained
constant; however, for agricultural, financial, and index futures this is a very unrealistic assumption. For instance, if it is expected that the future available supply of wheat for time $T$ will decline because of poor weather, and demand is unchanged, one would then expect the future spot price of wheat to be higher than the current spot price. Furthermore, a futures contract on wheat that matures at time $T$ can also be considered to represent the expected spot price at time $T$, and consequently it should reflect the expected change in supply conditions. In a more extreme fashion, if there is no current supply of wheat, then the futures price would reflect only future supply conditions and the expected future spot price at time $T$. This can be expressed as:

$F_{t,T} = E_t(\tilde{S}_T)$   (1.53)

where $E_t(\tilde{S}_T)$ is the spot price at a future point $T$ expected at time $t$, where $t < T$. The tilde above $S_T$ indicates that the future spot price is a random variable, because future factors such as supply cannot presently be known with certainty. Equation (1.53) is called the unbiased-expectations hypothesis because it postulates that the current price of a futures contract maturing at time $T$ represents the market's expectation of the future spot price at time $T$. Which of these expressions for the price of a futures contract at time $t$ will hold in the market – the arbitrage pricing relationship in Equation (1.52) or the unbiased-expectations hypothesis in Equation (1.53)? As the markets are assumed to be efficient, the answer is that the market price of the futures contract will take on the minimum value of these two pricing relationships, or:

$F_{t,T} = \mathrm{Min}\left[E_t(\tilde{S}_T),\; S_t(1 + R_{f,T-t}) + C_{T-t}\right]$   (1.54)
For any storable commodity on a given day $t$, the futures price $F_{t,T}$ will be higher than the spot price $S_t$ on day $t$: $F_{t,T} > S_t$. The amount by which the futures price exceeds the spot price, $(F_{t,T} - S_t)$, is called the premium. In most cases this premium is equal to the sum of financial costs $S_t R_{f,T-t}$ and carrying costs $C_{T-t}$. The condition $F_{t,T} > S_t$ is associated with a commodity market called a normal carrying-charge market. In general, the difference between the futures price $F_{t,T}$ and the spot price $S_t$ is called the basis:

$\text{Basis} = F_{t,T} - S_t$   (1.55)
1.8.2.5 The Effect of Hedging Demand

John Maynard Keynes (1930), who studied the futures markets as a hobby, proposed that for some commodities there was a strong tendency for hedgers to be concentrated on the
short side of the futures market. That is, to protect themselves against the risk of a price decline in the spot commodity, the spot holder or producer (such as a farmer) would hedge the risk by selling futures contracts on his or her particular commodity. This demand for hedging, producing an abundant supply of futures contracts, would force the market price below the expected spot price at maturity (time $T$). Moreover, the hedgers would be transferring their price risk to speculators. The difference between $E_t(\tilde{S}_T)$ and $F_{t,T}$, when $F_{t,T} < E_t(\tilde{S}_T)$, can be thought of as a risk premium paid to the speculators for holding the long futures position and bearing the price risk of the hedgers. This risk premium can be formulated as:

$E_t(RP) = E_t(\tilde{S}_T) - F_{t,T}$   (1.56)
where $E_t(RP)$ is the expected risk premium paid to the speculator for bearing the hedger's price risk. Keynes described this pricing phenomenon as normal backwardation. When the opposite conditions exist – hedgers are concentrated on the long side of the market and bid up the futures price $F_{t,T}$ over the expected future spot price $E_t(\tilde{S}_T)$ – the pricing relationship is called contango (that is, $E_t(RP) = F_{t,T} - E_t(\tilde{S}_T)$). To reflect the effect of normal backwardation or contango on the current futures price, the $E_t(\tilde{S}_T)$ term in Equation (1.54) must be adjusted for the effects of hedging demand:

$F_{t,T} = \mathrm{Min}\left[E_t(\tilde{S}_T) + E_t(RP),\; S_t(1 + R_{f,T-t}) + C_{T-t}\right]$   (1.57)

Equation (1.57) expresses a broad pricing framework for the value of a futures contract. Over the life of the futures contract the futures price must move toward the cash price, because at the maturity of the futures contract the futures price will be equal to the current cash price. If hedgers are in a net short position, then futures prices must lie below the expected future spot price, and futures prices would be expected to rise over the life of the contract. However, if hedgers are net long, then the futures price must lie above the expected future spot price, and the price of the futures would be expected to fall. Either a rising futures price (normal backwardation) or a falling futures price (contango) determines the boundaries within which the actual futures price will be located. However, numerous other factors can alter and distort the relationship shown by Equation (1.57). For instance, the analysis implicitly assumes that interest rates remain constant from time $t$ to the contract's maturity date at time $T$. However, since market interest rates fluctuate, an increasing or decreasing term structure of interest rates would bias the price of the futures contract higher or lower. In fact, the more accurate one's forecast of future interest rates, the more accurate the current valuation of the futures contract.
Empirical research casts rather strong doubt on the size of the expected risk-premium component of futures prices, particularly for financial and index futures. In fact, the expectations of speculators, along with actual futures contract supply and demand conditions in the pit, can combine to reverse the effect expected by Keynes. This results in part from the makeup of futures users, a clear majority of whom are not hedgers as suggested by Keynes. Additionally, a futures contract for which an illiquid level of trading volume exists would put the bid and offer prices for the contract further apart. A seller of such a futures contract would require more than the theoretical fair price as compensation for the risk undertaken. The risk is that prices start to rise in an illiquid market in which the position cannot be immediately closed out. It costs money to maintain the position; therefore, a premium is required to cover this cost.
Table 1.1 The components of basis risk

Expiration-date risk: Futures contracts are not usually available for every month. If a hedger needed a futures contract for July and the only contracts available were for March, June, September, and December, the hedger would have to select either the June or the September contract. Either of these contracts would have a different price series than a July contract (if one existed). Hence, the hedger cannot form a perfect hedge and is faced with the chance that the basis may change.

Location risk: The hedger requires delivery of the futures contract in location Y, but the only futures contracts available are for delivery in location X. Hence, the hedger cannot form a perfect hedge because of the transportation costs from X to Y; this may cause the basis to change.

Quality risk: The exact standard or grade of the commodity required by the hedger is not covered by the futures contract. Therefore, the price movement of commodity grade A may differ from the price movement of commodity grade B, which will cause the basis to change and prevent the hedger from forming a perfect hedge.

Quantity risk: The exact amount of the commodity needed by the hedger is not available through a single futures contract or any integer multiple thereof. Hence, the amount of the commodity is not hedged exactly; this prevents the hedger from forming a perfect hedge, and the underhedged or overhedged amount is subject to risk.
1.8.3 Hedging Concepts and Strategies

The underlying motivation for the development of futures markets is to aid the holders of the spot commodity in hedging their price risk; consequently, the discussion now focuses on such an application of futures markets. Four methodologies based on various risk-return criteria are examined; moreover, to fully clarify the hedger's situation, some of the common problems and risks that arise in the hedging process are analyzed.
1.8.3.1 Hedging Risks and Costs

As mentioned previously, hedging refers to a process designed to alleviate the uncertainty of future price changes for the spot commodity. Typically this is accomplished by taking an opposite position in a futures contract on the same commodity that is held. If an investor owns the spot commodity, as is usually the case (a long position), the appropriate action in the futures market would be to sell a contract (a short position). However, disregarding for the moment the correct number of futures contracts to enter into, a problem arises if the prices of the spot commodity and the futures contract on this commodity do not move in a perfectly correlated manner. This nonsynchronicity of spot and futures prices is related to the basis and is called the basis risk. The basis has been defined as the difference between the futures and spot prices. Basis risk is the chance that this difference will not remain constant over time. Four types of risk contribute to basis risk; these are defined in Table 1.1. These four types of risks prevent the hedger from forming a perfect hedge (which would have zero risk). Even though the hedger
is reducing the amount of risk, it has not been reduced to zero. It is often said that hedging replaces price risk with basis risk. The potential causes of basis risk are not necessarily limited to those identified in Table 1.1. Hence, basis risk is the prominent source of uncertainty in the hedging process. Other potential causes are (1) supply-demand conditions and (2) cross-hedging consequences. Even if the futures contract is written on the exact commodity that the hedger holds, differing supply and demand conditions in the spot market and futures market could cause the basis to vary over time. Occasionally, speculators in the futures market will bid the futures price above or below its equilibrium position, due perhaps to the excitement induced by an unexpected news release. Of course, the market forces of arbitrage will eventually bring the spot and futures prices back in line. The limiting case is at the expiration of the futures contract, when its price must converge to the spot price.
Consequently, the disequilibrating influence on the basis stemming from supply-demand forces can be alleviated by entering a futures contract that matures on the exact day that the hedger intends to sell the spot commodity. But although most futures contracts are quite flexible, it is unlikely that any contract would correlate so precisely with the hedger’s needs. In some cases (such as for agricultural commodities, where futures contracts are offered that mature each month) the basis risk due to nonsimultaneous maturities is not so great. However, for other commodities, particularly financial instruments, futures contracts maturing 3 months apart are more typically offered. Thus, at the time the hedger needs to sell the spot commodity in the market, any protection in price risk over the hedging period could conceivably be wiped out by a temporary adverse change in the basis. Cross-hedging refers to hedging with a futures contract written on a nonidentical commodity (relative to the spot commodity). Although not often necessary with agricultural futures, cross-hedging is frequently the best that can be done with financial and index futures. Changes in the basis risk induced by the cross-hedge are caused by less-than-perfect correlation of price movements between the spot and futures prices – even at maturity. That is, because the spot commodity and futures contract commodity are different, their respective prices will tend to be affected (even though minutely at times) by differing market forces. While the futures price must equal the price of its underlying spot commodity at the contract’s maturity date, this condition does not necessarily hold when the hedger’s commodity is not the “true” underlying asset. Therefore, even when the liquidation of the spot commodity coincides with the maturity of the futures contract, there is no guarantee of obtaining the original price of the commodity that held at the initiation of the hedge.
1.8.3.2 The Johnson Minimum-Variance Hedge Strategy

Developed within the framework of modern portfolio theory, the Johnson hedge model (1960) retains the traditional objective of risk minimization but defines risk as the variance of return on a two-asset hedge portfolio. As in the two-parameter world of Markowitz (1959), the hedger is assumed to be infinitely risk averse (that is, the investor desires zero variance). Moreover, with the risk-minimization objective defined as the variance of return of the combined spot and futures position, the Johnson hedge ratio is expressed in terms of expectations of variances and covariances for price changes in the spot and futures markets. The Johnson hedge model can be expressed in regression form as:

$\Delta S_t = a + H \Delta F_t + e_t$   (1.58)

where: $\Delta S_t$ = the change in the spot price at time $t$; $\Delta F_t$ = the change in the futures price at time $t$; $a$ = a constant; $H$ = the hedge ratio; and $e_t$ = the residual term at time $t$. Furthermore, the hedge ratio can be better understood by defining it in terms of its components:

$H = \dfrac{X_f}{X_s} = \dfrac{\sigma_{S,F}}{\sigma_F^2}$   (1.59)

where: $X_f$ and $X_s$ = the dollar amounts invested in futures and spot, respectively; $\sigma_{S,F}$ = the covariance of spot and futures price changes; and $\sigma_F^2$ = the variance of futures price changes. Thus $H$, the minimum-variance hedge ratio, is also a measure of the relative dollar amount to be invested in futures per dollar of spot holdings. In a sense it is a localized beta coefficient, similar in concept to the beta of a stock à la capital asset pricing theory. As a measure of hedging effectiveness, Johnson utilizes the squared simple correlation coefficient between spot and futures price changes, $\rho^2$. More formally, Johnson's hedging-effectiveness measure can be ascertained by first establishing the following expression:

$HE = 1 - \dfrac{V_H}{V_u}$   (1.60)

where: $V_u$ = the variance of the unhedged spot position $= X_s^2 \sigma_S^2$; $\sigma_S^2$ = the variance of spot price changes; and $V_H$ = the variance of return for the hedged portfolio $= X_s^2 \sigma_S^2 (1 - \rho^2)$. By substituting the minimum-variance hedge position in the futures, $X_f$:

$HE = 1 - \dfrac{X_s^2 \sigma_S^2 (1 - \rho^2)}{X_s^2 \sigma_S^2} = \rho^2$   (1.61)

In simpler terms, then, the Johnson measure of hedging effectiveness is the $R^2$ of a regression of spot-price changes on futures-price changes. To utilize this hedging method, it is necessary to regress historical data of spot-price changes on futures-price changes. The resulting beta coefficient from the regression would be the localized Johnson hedge ratio, and the regression $R^2$ would represent the expected degree of
variance minimization using this hedge ratio over the hedging horizon. “Localized” and “expected” must be emphasized because, first of all, although the Johnson hedge ratio can be re-estimated, it nonetheless is a static measure based on historical data. What held for the past may not hold precisely for the future. Moreover, large price moves may distort this hedge ratio considerably. Hence, R2 is what can be expected based on the past in terms of variance reduction for the total hedge position. It should not be expected to hold exactly.
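Operationally, the Johnson hedge ratio of Equation (1.59) is the slope of the regression in Equation (1.58), and $\rho^2$ is the effectiveness measure of Equation (1.61). A minimal sketch on simulated (hypothetical) price-change data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 250
dF = rng.normal(0.0, 1.0, n)              # simulated futures price changes
dS = 0.9 * dF + rng.normal(0.0, 0.4, n)   # simulated spot changes, imperfectly correlated

# Minimum-variance hedge ratio, Equation (1.59): H = cov(dS, dF) / var(dF),
# i.e., the OLS slope of dS on dF in Equation (1.58).
H = np.cov(dS, dF, ddof=1)[0, 1] / np.var(dF, ddof=1)

# Hedging effectiveness, Equation (1.61): HE = rho^2, the regression R-squared.
rho = np.corrcoef(dS, dF)[0, 1]
print("hedge ratio H:", round(H, 4))            # near the true slope of 0.9
print("effectiveness rho^2:", round(rho**2, 4))
```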
1.8.3.3 The Howard-D'Antonio Optimal Risk-Return Hedge Strategy

The classic one-to-one hedge is a naïve strategy based upon a broadly defined objective of risk minimization. The strategy is naïve in the sense that a hedging coefficient of one is used regardless of past or expected correlations of spot- and futures-price changes. Working's strategy brings out the speculative aspects of hedging by analyzing changes in the basis and, accordingly, exercising discrete judgment about when to hedge and when not to hedge. The underlying objective of Working's decision rule for hedgers is one of profit maximization. Finally, Johnson (1960), in applying the mean-variance criteria of modern portfolio theory, emphasizes the risk-minimization objective but defines risk in terms of the variance of the hedged position. Although Johnson's method improves on the naïve strategy of a one-to-one hedge, it essentially disregards the return component associated with a particular level of risk. Rutledge (1972) uses both mean and variance information to derive the hedge ratio. In a paper by Howard and D'Antonio (1984), a hedge ratio and a measure of hedging effectiveness are derived in which the hedger's risk and return are both explicitly taken into account. Moreover, some of the variable relationships derived from their analysis help explain some of the idiosyncrasies of hedging that occur in practice. Using a mean-variance framework, the Howard-D'Antonio strategy begins by assuming that the "agent" is out to maximize the expected return for a given level of portfolio risk. With a choice of putting money into three assets – a spot position, a futures contract, and a risk-free asset – the agent's optimal portfolio will depend on the relative risk-return characteristics of each asset. For a hedger, the optimal portfolio may contain a short futures position, a long futures position, or no futures position at all. In general, the precise futures position to be entered into will be determined by (1) the risk-free rate, (2) the expected returns and the standard deviations for the spot and futures positions, and (3) the correlation between the return on the spot position and the return on the futures.
Howard and D'Antonio arrive at the following expressions for the hedge ratio and the measure of hedging effectiveness:

Hedge ratio: $H = \dfrac{(\lambda - \rho)}{\theta\gamma(1 - \lambda\rho)}$   (1.62)

and

Hedging effectiveness: $HE = \sqrt{\dfrac{1 - 2\lambda\rho + \lambda^2}{1 - \rho^2}}$   (1.63)

where:
$\theta = \sigma_f/\sigma_s$ = the relative variability of futures and spot returns;
$\alpha = \bar{r}_f/(\bar{r}_s - i)$ = the relative excess return on futures to that of spot;
$\gamma = P_f/P_s$ = the current price ratio of futures to spot;
$\lambda = \alpha/\theta = (\bar{r}_f/\sigma_f)/[(\bar{r}_s - i)/\sigma_s]$ = the risk-to-excess-return relative of futures versus the spot position;
$P_s, P_f$ = the current price per unit for the spot and futures, respectively;
$\rho$ = the simple correlation coefficient between the spot and futures returns;
$\sigma_s$ = the standard deviation of spot returns;
$\sigma_f$ = the standard deviation of futures returns;
$\bar{r}_s$ = the mean return on the spot over some recent past interval;
$\bar{r}_f$ = the mean return on the futures over some recent past interval; and
$i$ = the risk-free rate.

By analyzing the properties of $\lambda$, these authors discern some important insights into the coordinated use of futures in a hedge portfolio. Numerically, $\lambda$ expresses the relative attractiveness of investing in futures versus the spot position. When $\lambda < 1$, $\lambda = 1$, and $\lambda > 1$, the futures contract offers less, the same, and more excess return per unit of risk than the spot position, respectively. Since this analysis is being undertaken from a hedger's point of view, it is assumed that $\lambda < 1$. An assumption that $\lambda > 1$ would inappropriately imply that theoretically it is possible to hedge the futures position with the spot asset. From a practitioner's perspective it is also important to note that even when $\lambda \neq \rho$, a hedged position using the futures may not provide a real net improvement in the risk-return performance of a portfolio. Unless $HE$ is significantly greater than 1, other factors such as transaction costs, taxes, the potential for margin calls, and liquidity may negate the overall benefit of hedging with futures. This point, along with the previous results about hedging, helps explain why certain futures contracts highly correlated with their underlying
assets are not used as extensively as hedging vehicles as might be expected. This section has focused on the basic concepts of futures markets. Important terms were defined, and basic models to evaluate futures contracts were discussed. Finally, hedging concepts and strategies were analyzed, and alternative hedge ratios were investigated in detail. These concepts and valuation models can be used in security analysis and portfolio management related to futures and forward contracts. Please see Chap. 57 for a generalized model for the optimum futures hedge ratio.
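To close the section, a minimal sketch of the Howard-D'Antonio measures in Equations (1.62) and (1.63); the return moments, correlation, and unit prices below are hypothetical inputs chosen so that $\lambda < 1$, consistent with the hedger's point of view assumed in the text.

```python
from math import sqrt

def howard_dantonio(rs, rf_mean, i, sigma_s, sigma_f, rho, Ps, Pf):
    """Hedge ratio, Equation (1.62), and hedging effectiveness, Equation (1.63)."""
    theta = sigma_f / sigma_s                            # relative variability
    gamma = Pf / Ps                                      # futures-to-spot price ratio
    lam = (rf_mean / sigma_f) / ((rs - i) / sigma_s)     # risk-to-excess-return relative
    H = (lam - rho) / (theta * gamma * (1.0 - lam * rho))
    HE = sqrt((1.0 - 2.0 * lam * rho + lam**2) / (1.0 - rho**2))
    return H, HE

# Hypothetical moments: mean spot/futures returns, risk-free rate, volatilities,
# correlation, and unit prices.
H, HE = howard_dantonio(rs=0.010, rf_mean=0.002, i=0.004,
                        sigma_s=0.05, sigma_f=0.06, rho=0.9,
                        Ps=100.0, Pf=102.0)
print("hedge ratio:", round(H, 4))              # negative: a short futures position
print("hedging effectiveness:", round(HE, 4))   # > 1 indicates a risk-return improvement
```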
1.9 Conclusion

This chapter has reviewed and summarized alternative valuation theories – discounted cash flow, M&M, CAPM, APT, OPT, and futures valuation – and the Markowitz portfolio theory that are basic to introductory courses in financial management or investments. These theories can directly and indirectly become guidelines for further study of security analysis and portfolio management. Derivations and applications of these valuation models to security analysis and portfolio management are studied in detail in later parts of this handbook.
References

Barnea, A., R. A. Haugen, and L. W. Senbet. 1981. "Market imperfections, agency problems, and capital structure: a review." Financial Management 10, 7–22.
Beranek, W. 1981. "Research directions in finance." Quarterly Journal of Economics and Business 21, 6–24.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–654.
Brealey, R. and S. Myers. 1988. Principles of corporate finance, McGraw-Hill, New York.
Brigham, E. F. 1988. Financial management: theory and practice, 4th ed., Dryden Press, Hinsdale.
Copeland, T. E. and J. F. Weston. 1988. Financial theory and corporate policy, 3rd ed., Addison-Wesley, New York.
Cox, J., S. A. Ross, and M. Rubinstein. 1979. "Option pricing: a simplified approach." Journal of Financial Economics 7, 229–263.
Cox, J. C. and M. Rubinstein. 1985. Options markets, Prentice-Hall, Englewood Cliffs, NJ.
DeAngelo, H. and R. W. Masulis. 1980. "Optimal capital structure under corporate and personal taxation." Journal of Financial Economics 8, 3–29.
Fama, E. F. and M. H. Miller. 1972. The theory of finance, Holt, Rinehart and Winston, New York.
Galai, D. and R. W. Masulis. 1976. "The option pricing model and the risk factor of stock." Journal of Financial Economics 3, 53–81.
Haley, C. W. and L. D. Schall. 1979. Theory of financial decisions, 2nd ed., McGraw-Hill, New York.
Howard, C. T. and L. J. D'Antonio. 1984. "A risk-return measure of hedging effectiveness." Journal of Financial and Quantitative Analysis 19, 101–112.
Hsia, C. C. 1981. "Coherence of the modern theories of finance." The Financial Review 16, 27–42.
Jensen, M. C. and W. H. Meckling. 1978. "Can the corporation survive?" Financial Analysts Journal 34, 31–37.
Johnson, L. L. 1960. "The theory of hedging and speculation in commodity futures." Review of Economic Studies 27, 139–151.
Keynes, J. M. 1930. A treatise on money, Vol. 2, Macmillan & Co., London.
Lee, C. F. 1983. Financial analysis and planning: theory and application, a book of readings, Addison-Wesley, MA.
Lee, C. F. 1985. Financial analysis and planning: theory and applications, Addison-Wesley, MA.
Lee, C. F. 1991. "Markowitz, Miller, and Sharpe: the first Nobel laureates in finance." Review of Quantitative Finance and Accounting 1, 209–228.
Lee, C. F. and J. E. Finnerty. 1990. Corporate finance: theory, method, and applications, Harcourt Brace Jovanovich, Orlando, FL.
Lee, C. F. and J. C. Junkus. 1983. "Financial analysis and planning: an overview." Journal of Economics and Business 34, 257–283.
Leland, H. and D. Pyle. 1977. "Informational asymmetries, financial structure, and financial intermediation." Journal of Finance 32, 371–387.
Mao, J. C. T. 1969. Quantitative analysis of financial decisions, The Macmillan Company, New York.
Markowitz, H. 1959. Portfolio selection, Wiley, New York.
Miller, M. H. 1977. "Debt and taxes." Journal of Finance 32, 261–275.
Miller, M. H. 1988. "The Modigliani–Miller propositions after 30 years." Journal of Economic Perspectives 2, 99–120.
Miller, M. H. and F. Modigliani. 1961. "Dividend policy, growth, and the valuation of shares." Journal of Business 34, 411–433.
Modigliani, F. and M. Miller. 1958. "The cost of capital, corporation finance and the theory of investment." American Economic Review 48, 261–297.
Modigliani, F. and M. Miller. 1963. "Corporate income taxes and the cost of capital: a correction." American Economic Review 53, 433–443.
Pogue, G. A. and K. Lall. 1974. "Corporate finance: an overview." Sloan Management Review 15, 19–38.
Reilly, F. K. 1985. Investment analysis and portfolio management, 2nd ed., Dryden Press, Hinsdale.
Ross, S. 1976. "The arbitrage theory of capital asset pricing." Journal of Economic Theory 13, 341–360.
Ross, S. A. 1977a. "The determination of financial structure: the incentive signalling approach." Bell Journal of Economics 8, 23–40.
Ross, S. 1977b. "Return, risk and arbitrage." in Risk and return in finance, I. Friend and J. L. Bicksler (Eds.), vol. 1, Ballinger, Cambridge, MA.
Ross, S. 1978. "Mutual fund separation in financial theory – the separating distributions." Journal of Economic Theory 15, 254–286.
Rutledge, D. J. S. 1972. "Hedgers' demand for futures contracts: a theoretical framework with applications to the United States soybean complex." Food Research Institute Studies 11, 237–256.
Stiglitz, J. E. 1969. "A re-examination of the Modigliani-Miller theorem." The American Economic Review 59, 784–793.
Stiglitz, J. E. 1974. "On the irrelevance of corporate financial policy." The American Economic Review 64, 851–866.
Taggart, R. A., Jr. 1980. "Taxes and corporate capital structure in an incomplete market." Journal of Finance 35, 645–659.
Van Horne, J. C. 1985. Financial management and policy, 6th ed., Prentice-Hall, Englewood Cliffs, NJ.
Weston, J. F. 1981. "Developments in finance theory." Financial Management 10 (Tenth Anniversary Issue), 5–22.
Weston, J. F. and T. E. Copeland. 1986. Managerial finance, 8th ed., Dryden Press, Hinsdale.
Working, H. 1953. "Hedging reconsidered." Journal of Farm Economics 35, 544–561.
Chapter 2
Investment, Dividend, Financing, and Production Policies: Theory and Implications
Abstract The purpose of this chapter is to discuss the interaction between the investment, financing, and dividend policies of the firm. A brief introduction of the policy framework of finance is provided in Sect. 2.1. Section 2.2 discusses the interaction between investment and dividend policy. Section 2.3 discusses the interaction between dividend and financing policy. Section 2.4 discusses the interaction between investment and financing policy. Section 2.5 discusses the implications of financing and investment interactions for capital budgeting. Section 2.6 discusses the implications of different policies for the beta coefficient. The conclusion is presented in Sect. 2.7.
Keywords Investment policy · Financing policy · Dividend policy · Capital structure · Financial analysis · Financial planning · Default risk of debt · Capital budgeting · Systematic risk
2.1 Introduction

This chapter discusses the three-way interaction between investment, financing, and dividend decisions. As shown in the financial policy literature, there exists a set of ideal conditions under which there will be no interaction effects among the areas of concern. However, the ideal conditions imposed by academicians to analyze the effects dividend and financing policy have on the investment decision and the value of the firm are not realistic when one considers financial management in the real world. Thus, an overview of the interactions of financing, investment, and dividend decisions is important for those concerned with financial analysis and planning. This chapter sequentially addresses the three interaction effects. In Sect. 2.2 the relation between investment and dividend policy is explored through a discussion of internal vs. external financing, with the emphasis on the use of retained earnings as a substitute for new equity, or vice versa. The interaction between corporate financing and dividend policies will be covered in Sect. 2.3, where default risk on debt is recognized. Then in Sect. 2.4, the interaction between investment and financing policy is discussed; the role of the financing mix in analyzing investment opportunities will receive detailed treatment. Again, risky debt will be recognized, so as to lend further practicality to the analysis. The recognition of financing and investment effects is covered in Sect. 2.5, where capital-budgeting techniques are reviewed and analyzed with regard to their treatment of the financing mix. In this section several numerical comparisons are offered to emphasize that the differences in the techniques are nontrivial. Section 2.6 will discuss the implications of different policies for systematic-risk determination. Summary and concluding remarks are offered in Sect. 2.7.

2.2 Investment and Dividend Interactions: The Internal Versus External Financing Decision
Internal financing consists primarily of retained earnings and depreciation expense, while external financing comprises new equity and new debt, both long- and short-term. Decisions on the appropriate mix of these two sources for a firm are likely to affect both the payout ratio and the capital structure of the firm, and this in turn will generally affect its market value. In this section an overview of internal and external financing is provided, with the discussion culminating in a summary of the impacts that earnings retention or earnings payout (with or without supplemental financing from external sources) can have on a firm's value and on planning and forecasting.
Internal Financing Changes in equity accounts between balance-sheet dates are generally reported in the statement of retained earnings. Retained earnings are most often the major internal source of funds made available for investment by a firm. The cost of these retained earnings is generally less than the cost associated with raising capital through new common-stock issues.
Table 2.1 Payout ratio – composite for 500 firms

Year   Payout   Year   Payout   Year   Payout   Year   Payout
1962   0.580    1974   0.405    1986   0.343    1998   0.234
1963   0.567    1975   0.462    1987   0.559    1999   0.435
1964   0.549    1976   0.409    1988   0.887    2000   0.299
1965   0.524    1977   0.429    1989   0.538    2001   0.383
1966   0.517    1978   0.411    1990   0.439    2002   0.040
1967   0.548    1979   0.380    1991   0.418    2003   0.259
1968   0.533    1980   0.416    1992   0.137    2004   0.311
1969   0.547    1981   0.435    1993   0.852    2005   0.282
1970   0.612    1982   0.422    1994   0.340    2006   0.301
1971   0.539    1983   0.399    1995   0.346
1972   0.491    1984   0.626    1996   0.327
1973   0.414    1985   0.525    1997   0.774
It follows that retained earnings, rather than new equity, should be used to finance further investment if equity is to be used and dividend policy (dividends paid from retained earnings) is not seen to matter. The availability of retained earnings is then determined by the firm's profitability and the payout ratio, the latter being indicative of dividend policy. Thus we find that the decision to raise funds externally may be dependent on dividend policy, which in turn may affect investment decisions. The payout ratios in Table 2.1 show that, on average, firms in the S&P 500 retained more than 50% of their earnings after 1971 rather than paying them out in dividends. Firms do this because of the investment opportunities available to them, and it indicates that retained earnings are a major source of funds for the firm.
External Financing External financing usually takes one of two forms, debt financing or equity financing. We leave the debt vs. equity question to subsequent sections, and here directly confront only the decision whether to utilize retained earnings or new common stock to finance the firm. We have previously shown that the market value of the firm is unaffected by such factors if dividend policy and capital structure are irrelevant. It should also be clear that dividend policy and capital structure can affect market values. Therefore, the consideration of an optimal internal-external financing combination is important in the field of financial management. This optimal combination is a function of the payout ratio, the debt-to-equity ratio, and the interaction between these two decision variables. From this it can be shown (and this will be the topic of discussion in the following paragraphs) that different combinations of internal and external
financing will have different effects on the growth rate of earnings per share, of dividends per share, and, presumably, of the share price itself. Higgins (1977) considered the amount of growth a firm could maintain if it was subject to an external equity constraint and sought to maintain certain debt and payout ratios. He was able to show the sustainable growth rate in sales, S*, to be:

S^{*} = \frac{p(1 - d)(1 + L)}{t - p(1 - d)(1 + L)} = \frac{(rr)(ROE)}{1 - (rr)(ROE)} \qquad (2.1)
where p = profit margin on sales; d = dividend payout ratio; L = debt-to-equity ratio; t = total-assets-to-sales ratio; rr = retention rate; and ROE = return on equity. Higgins (1977) used 1974 U.S. manufacturing firms' composite financial-statement data as an example to calculate S*. Using the figures p = 5.5%, d = 33%, L = 88%, and t = 73%, he obtained:

S^{*} = \frac{(0.055)(1 - 0.33)(1 + 0.88)}{0.73 - (0.055)(1 - 0.33)(1 + 0.88)} = 10.5\%
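A minimal sketch of this calculation (the helper function and its name are ours) reproduces Higgins's 10.5% figure:

```python
def sustainable_growth(p: float, d: float, L: float, t: float) -> float:
    """Higgins's sustainable growth rate in sales, Eq. (2.1).
    p: profit margin; d: payout ratio; L: debt-to-equity; t: assets-to-sales."""
    retained = p * (1 - d) * (1 + L)   # equity added per dollar of sales
    return retained / (t - retained)

# Higgins's 1974 U.S. manufacturing composite figures
print(f"{sustainable_growth(p=0.055, d=0.33, L=0.88, t=0.73):.1%}")  # -> 10.5%
```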
Using this method, financial analysts can calculate the sustainable growth rate in sales for financial planning models. From this we can see that a firm with many valuable investment opportunities may be forced to forego some of these opportunities due to capital constraints, and the value of the firm will not be maximized as it could have been. Dividend and capital-structure decisions relate directly to investment decisions. While it may not be reasonable to assume that the firm could not issue new equity, the question is at what cost equity can be raised under such a constraint. In an even more practical vein, Higgins also incorporated inflation into his model, acknowledging the fact that most depreciation methods that serve to make depreciation a source of funds are founded on historical costs, and not the
replacement values the firm must pay to sustain operations at their current level. Introducing the new variables defined below, it is possible to define sustainable growth in real and nominal terms, the former being the item of interest here. Let c = nominal current assets to nominal sales; f = nominal fixed assets to real sales; and j = the inflation rate. The real sustainable growth rate S_r is then:

S_r = \frac{(1 + j)p(1 - d)(1 + L) - jc}{(1 + j)c + f - (1 + j)p(1 - d)(1 + L)} \qquad (2.2)
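The real-growth formula can be sketched the same way. Higgins's actual values of c and f are not reported in this passage, so the split of t = 0.73 into c = 0.35 and f = 0.38 below is purely our assumption for illustration:

```python
def real_sustainable_growth(p, d, L, c, f, j):
    """Real sustainable growth rate, Eq. (2.2); c, f, j as defined above."""
    retained = (1 + j) * p * (1 - d) * (1 + L)
    return (retained - j * c) / ((1 + j) * c + f - retained)

# 10% inflation with an assumed asset split; prints roughly 6.0%, visibly
# below the 10.5% nominal rate (Higgins's own inputs produced a larger drop).
print(f"{real_sustainable_growth(0.055, 0.33, 0.88, c=0.35, f=0.38, j=0.10):.1%}")
```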
Using figures from the manufacturing sector and an inflation rate of 10%, which at the time of writing was approximately the actual inflation rate then prevailing, it was found that real sustainable growth was only a third of the nominal figure; this serves to further emphasize the importance of the interaction between dividend policy and investment decisions, since the former acts as a constraint on the latter. Higgins's (1977) paper has since been generalized by Johnson (1981) and Higgins (1981).

The usual caveat associated with the dividend-irrelevance proposition is that investment policy is unaffected; that is, new equity is issued to replace those retained earnings that are paid out. Here we emphasize new equity with the intent of avoiding financing-policy questions. Knowing that the flotation costs involved with new issues could be avoided by employing retained earnings, the effect such a strategy has on firm value is largely an empirical question. Any such tests directed toward this issue must also recognize that there may be a preference for dividends that may dominate or mitigate the flotation-cost effects. With these factors in mind, Van Horne and McDonald (1971) ran cross-sectional regressions on samples of utility stocks and electronics manufacturers to see whether dividend payouts and the rates of new issues of common stock had significant effects on price-earnings ratios. The utility-industry results indicate that, at the lower end of the new-equity issue spectrum, the dividend-preference effect overshadowed the flotation-cost effect, a result that appeared to shift in direction when higher levels of new-equity financing were considered. Further analysis of the dividend-preference question has been performed by Litzenberger and Ramaswamy (1979). Using a generalized capital-asset-pricing model, they examined this trade-off between dividend preference and tax effects, all in an effort to justify the contention of an optimal dividend policy. The results presented by Van Horne and McDonald were admittedly tentative (especially since the electronics-industry sample yielded few corroborating results), and most of the t-statistics associated with the utility sample's new-issue coefficients were quite low. The strongest statement that could be made following this analysis is that there appears to be little detrimental effect on firm valuation from following a strategy of maintaining a high payout rate and financing further investment with new-equity issues.
2.3 Interactions Between Dividend and Financing Policies

If we were able to hold the capital structure of firms constant, then we would be able to determine the advantage or disadvantage of internal financing relative to external financing. Van Horne and McDonald, as briefly mentioned before, were able to test empirically for the effects created by either policy, providing the substance for a major part of the following discussion. If we go into more depth and specifically consider firms that issue risky debt while maintaining shareholders' limited-liability status, still keeping to the chosen internal- or external-equity plan, then the interactions between financing and investment decisions can affect the relative positions of stockholders and bondholders. Black (1976) argued that one strategy a firm could follow to transfer economic resources from bondholders to stockholders is to pay dividends as generous as possible. In this way the internal-external financing plan is predetermined, as paying large dividends jeopardizes the bondholders' position while assets are siphoned away from the firm, all to the gain of the shareholders. In this section the interactions between dividend and financing policy will be analyzed in terms of (1) the cost of equity capital and (2) the default-risk viewpoint of debt.
Cost of Equity Capital and Dividend Policy

Van Horne and McDonald (1971) chose to develop a cross-sectional model to test for the significance of dividend payouts in the valuation process in the electric utility industry. The use of the electric utility industry proved to be operationally less difficult than other industries because it was desired to hold interfirm differences constant, and independent-variable selection to this end was relatively straightforward. Since the authors were interested in capitalization rates, year-end P/E ratios were selected to be the dependent variables. This value was thought to be a function of three variables: growth in assets, dividend payout, and financial risk. The latter variable is considered sufficient for this sample due to the homogeneity of the other aspects of the included firms' risks. The model can be written out and more carefully defined as follows:

P/E = a_0 + a_1 g + a_2 p + a_3(\text{Lev}) + u, \qquad (2.3)
where g = compound growth of assets over the eight previous years; p = dividend payout ratio on an annual basis; Lev = interest charges/[operating revenues − operating expenses]; and u = error term. While the growth and risk factors do not correspond exactly to what most capital-market theory tells us is relevant, research with respect to this particular industry lends credence to these variables as defined here (see Malkiel 1970). Upon regressing the data for the 86 companies included in the sample, all three independent variables are found to be statistically significant. Of the 86 firms, all of which paid dividends, 37 firms also had new equity issues. Testing to see whether the 37 firms that issued new equity came from the same population as those that did not raise new equity, the researchers found no essential difference. Since we know there are nontrivial flotation costs associated with the issuance of new equity, the finding of no difference between the subsamples implies one of two things: either the costs associated with new issues are relatively too small compared to the total costs of the firm to be detected (despite their large absolute size, or size relative to the new issue by itself), or a net preference for dividends by holders of electric utility stocks exists that acts to offset the aforementioned expenses. The main thrust of this paper was to assess the impact of new-equity flotation costs on firm value. With the number of firms issuing new equity, Van Horne and McDonald were able to calculate new-issue ratios – that is, new issues/total shares outstanding at year-end – and separate these firms into four groups, leaving those firms issuing the most new equity in a separate classification, as indicated in the upper portion of Table 2.2. Adding dummy variables to Equation (2.3), the effect of new-issue rates could be analyzed:

P/E = a_0 + a_1 g + a_2 p + a_3(\text{Lev}) + a_4 F_1 + a_5 F_2 + a_6 F_3 + a_7 F_4. \qquad (2.4)
where g = compound growth rate; Lev = financial risk, measured by times interest earned; and F_1, F_2, F_3, F_4 = dummy variables representing levels of new-equity financing. We would expect negative coefficients on the dummy variables if flotation costs were relevant, but in fact all coefficients were positive, though only one was significant (possibly due to small sample size and small relative differences). Empirical results are indicated in the lower portion of Table 2.2.
Table 2.2 New-issue ratios of electric utility firms

F dummy-variable grouping       A       B            C          D          E
New-issue ratio interval        0       0.001–0.05   0.05–0.1   0.1–0.15   0.15 and up
Number of firms in interval     49      16           11         6          4
Mean dividend payout ratio      0.681   0.679        0.678      0.703      0.728
Dummy-variable coefficient      1.86    3.23         1.26       0.89       N.A.
Dummy-variable t-statistic      1.33    2.25         0.84       0.51       (N.A.)

From Van Horne and McDonald (1971). Reprinted by permission
However, by substituting the Group E dummy variable for any one of the four dummy variables discussed above, Van Horne and McDonald found that the estimate of its coefficient is negative. One question pertaining to the figures presented here stems from the interpretation of the significance of the dummy variables. As it turns out, the class B dummy-variable coefficient is greater than the class A coefficient, supposedly because of a preference for dividend payout, while actually that payout was lower in class B. The question is whether the dummy that is intended to substantiate the dividend-preference claim through the new-issue ratio actually tells us that investors instead attach a higher value to higher earnings retention. The finding of higher P/E ratios resulting from earnings retention would be consistent with the Litzenberger-Ramaswamy framework cited earlier, and the new-issue effect is somewhat confounded with the dividend effect. From the data we can see that the coefficients did decrease as the new-issue ratios increased, though they were not significant, and, not surprisingly, new-issue ratios were rather highly negatively correlated with dividend payout ratios. From this we can say, although with some hesitation, that external equity appears to be a more costly alternative to internal financing when pushed to rather extreme limits. Over more moderate ranges, or those more closely aligned with the industry averages, this claim cannot be made with the same degree of certainty, since we cannot be certain that the payout ratio does not have a positive influence on share price, and therefore an inverse relationship to the cost of equity capital.
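For readers who wish to experiment with specifications like Equation (2.4), the following sketch fits it by ordinary least squares on synthetic data. The coefficients and group proportions are loosely patterned on Table 2.2, but this is not the Van Horne-McDonald data set:

```python
"""OLS sketch of the cross-sectional model, Eq. (2.4), on synthetic data.
F1..F4 are 0/1 dummies for new-issue groups A-D (group E is the base)."""
import numpy as np

rng = np.random.default_rng(0)
n = 86
g = rng.uniform(0.02, 0.10, n)        # asset growth
p = rng.uniform(0.5, 0.9, n)          # dividend payout ratio
lev = rng.uniform(1.0, 4.0, n)        # financial-risk proxy
F = rng.multinomial(1, [0.57, 0.19, 0.13, 0.07, 0.04], size=n)[:, :4]
noise = rng.normal(0, 1.5, n)
pe = 8 + 40 * g + 4 * p - 0.5 * lev + F @ np.array([1.9, 3.2, 1.3, 0.9]) + noise

X = np.column_stack([np.ones(n), g, p, lev, F])
coef, *_ = np.linalg.lstsq(X, pe, rcond=None)
print(np.round(coef, 2))   # a0, a1 (g), a2 (p), a3 (Lev), a4..a7 (dummies)
```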
Default Risk and Dividend Policy

With the development of the option pricing model, Black and Scholes (1973) made available a new method of valuing corporate liabilities, or claims on the firm. Chen (1978) has reviewed some recent developments in the theory of risky debt and examined in a more systematic fashion the determinants of the cost of debt capital. Both Resek (1970) and Hellwig (1981) have shown that the Modigliani
and Miller (1958 and 1963) arbitrage process that renders capital structure irrelevant is generally invalidated if the debt under consideration is risky and the shareholders enjoy limited liability. From this and the 1963 M&M article, we can show that there do exist optimal capital structures for firms under the more realistic conditions mentioned above. In the option-pricing framework the two claimants on the firm, the debt holders and the equity holders, are easily seen to have conflicting interests; thus one group's claim can be put forth only at the expense of the other party. The assumption that total firm value is unchanged is utilized to highlight the wealth-transfer effects but, as we see in the following section, that need not be the case. It is now necessary to find a way to value corporate debt when default risk is introduced. If the total value of the firm is known or given (this value should be reasonably well known or approximated prior to debt valuation, as the former is an upper bound on the latter), then once the debt value is established the total equity value falls out as the residual. It is also required that we know how to perform the transfer of wealth (since the stated goal of corporate finance is to maximize the shareholders' wealth), and learn of any possible consequences that could result from such attempts. Merton (1974) sought to find the value of risky corporate debt, assuming that the term structure of riskless interest rates was flat and known with certainty. This was an indirect goal, since the true emphasis lay in finding a way to value the equity of the firm as a continuous option. The further assumption of consol-type debt was also employed, in an attempt to avoid the transaction-cost arguments involved with rolling over debt at maturity. Invoking Modigliani and Miller's Proposition I (M&M 1958) and allowing for risky debt, we find the value of the stock to be the difference between the value of the firm as a whole and the value of the total debt financing employed to support that whole. Explicitly,

S = V - \frac{C}{r}(1 - L), \qquad (2.5)
where S , total value of the firm’s stock; V , total firm value; C , constant coupon payment on the perpetual bonds; r, riskless rate of interest; L, a complicated risk factor associated with possible default on the required coupon payment. The last term of Equation (2.5) is more often represented by B, the value of the total debt claims outstanding against the firm, stated in a certainty-equivalent form. It should be apparent from this equation that by introducing a greater probability of default on the debt while maintaining firm value, the equity holders gain at the expense of the bondholders. Rendleman (1978) took Merton’s model and adapted it to allow for the tax deductibility of interest charges. As in the Merton article, debt is assumed to be of perpetual type, consols – a justifiable assumption as most firms roll over
their debt obligations at maturity. The tax benefit introduced is assumed to be always available to the firm. In instances where the interest expense is greater than earnings before taxes, the carryback provision of the tax code is used to obtain a refund for the amount of the previously unused tax shield and, in the event the three-year carryback provision is not sufficient to make use of the tax shield, Rendleman suggests that a firm could sell that advantage to another firm by means of a merger or an acquisition, when the other firm involved could use the tax benefit. Thus the net coupon to be paid each time period is given by C(1 − T). At first glance it is apparent that the equity holders gain from this revision, as they are alleged to in the 1963 tax-corrected model of M&M. But something else is at work as well: since the firm is subject to a lower debt-service requirement, it can build a larger asset base, which serves to act as insurance against default owing to the smaller net coupon payment. In short, the risk premium associated with debt is reduced and the value of the stock, given in Equation (2.5), can be rewritten as:

S = V - \frac{C(1 - T)}{r}(1 - L'), \qquad (2.6)
where all the variables are as defined before and L′ is less than the L given before because of the lessened default risk. It has been argued above that the bondholders benefit when interest is tax-deductible (excluding the possibility here that an overzealous attempt to lever the firm is quickly undertaken when the tax-deductibility feature is introduced), and Equations (2.5) and (2.6) seem to indicate that the value of the shareholders' claim also increases, although we have not yet provided any theoretical justification for such a statement. Rendleman's analysis gave us the L′ value, which is the clue to this seeming problem. Management can act in a way to jeopardize the bondholders' claims by issuing more debt, thus making all debt more risky, or by undertaking projects that are riskier than the average project undertaken by the firm. This is the subject of the next section. Another possibility is to pay large dividends so as to deplete the firm of its resources, thus paying off the shareholders but hurting the bondholders. Black (1976) goes so far as to suggest that the firm could liquidate itself, pay out the total as a dividend, and leave the bondholders holding the proverbial bag. Bond covenants, of course, prevent this sort of action (hence, agency theory), and if a firm could not sell its not-yet-exploited growth opportunities in the market, this may not maximize the shareholders' wealth either; but within bounds it does seem a reasonable possibility.
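A numerical sketch of Equations (2.5) and (2.6) follows. The firm value, coupon, and especially the risk factors L and L′ are simply assumed here for illustration, whereas in Merton's and Rendleman's models they are derived endogenously:

```python
# Hypothetical illustration of Eqs. (2.5)-(2.6): the interest tax shield
# lowers the net coupon, which (per Rendleman) also lowers default risk.
V, C, r, T = 1000.0, 60.0, 0.06, 0.46   # firm value, coupon, riskless rate, tax
L, L_prime = 0.20, 0.12                 # assumed risk factors, with L' < L

B_no_tax = (C / r) * (1 - L)                  # debt claim, Eq. (2.5)
S_no_tax = V - B_no_tax
B_tax = (C * (1 - T) / r) * (1 - L_prime)     # after-tax coupon burden, Eq. (2.6)
S_tax = V - B_tax
print(S_no_tax, S_tax)   # equity value rises from 200.0 to 524.8
```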
In their review article, Barnea, Haugen, and Senbet (BHS, 1981) discuss the issues related to market imperfections, agency problems, and the implications for the consideration of optimal capital structures. The authors make use of M&M's valuation theory and option-pricing theory to reconcile the differences between academicians and practitioners about the relevance of financing and dividend policies. They arrive at the conclusion that, without frictionless capital markets, agency problems can give rise to potential costs. These costs can be minimized through complex contractual arrangements between the conflicting parties. Potential agency costs may help to explain the evolution of certain complexities in capital structure, such as conversion privileges of corporate debt and call provisions. If these agency costs are real, then financial contracts that vary in their ability to reduce these costs may very well sell at different equilibrium prices or yields, even if the financial marketplaces are efficient. An optimal capital structure can be obtained when, for each class of contract, the costs associated with each agency problem are exactly balanced by the yield differentials and tax exposures. Overall, BHS show that optimal capital structures can exist and that this is still consistent with the mainstream of classical finance theory.
2.4 Interactions Between Financing and Investment Decisions

Myers (1974) has analyzed in detail the interactions of corporate financing and investment decisions and the implications therein for capital budgeting. He argues that the existence of these interaction effects may be attributable to the recognition of transaction costs, corporate taxes, or other market imperfections. Ignored in his analysis is the probability of default on debt obligations or, as it could otherwise be interpreted, changes in this default risk. As alluded to in the previous section, if we consider the possible effects of default risk on the firm's investment and financing decisions, then further analysis must be performed to determine how these interactions can affect the wealth positions of shareholders and bondholders. In this section we analyze this issue by considering both the risk-free debt case and the risky debt case. Chapters 92 and 60 discuss the theoretical and empirical issues of capital structure.
2.4.1 Risk-Free Debt Case

Following Myers (1974), the basic optimization framework is presented in accordance with well-accepted mathematical programming techniques as discussed in the capital-rationing literature. It is presented as a general formulation; it should be considered one approach to analyzing interactions and not the final word per se. Specific results derived by Chambers et al. (1982) will be used later to demonstrate the importance of considering alternative financing mixes when evaluating investment opportunities.
We identify a firm Q, which faces several investment opportunities of varying characteristics. The objective is to identify those projects that are in the stockholders' interest (i.e., they maximize the change in the firm's market value ex-dividend at the end of the successive time periods t = 0, 1, …, T) and undertake them in order of their relative values. We specify relative values so that project divisibility remains possible, as was assumed by Myers. Required is a financing plan that specifies the desired mix of earnings retained, debt outstanding, and proceeds from the issuance of new equity shares. Let dV be a general function of four factors in a direct sense: (i) the proportion of each project j accepted, x_j; (ii) the stock of debt outstanding in period t, y_t; (iii) the total cash dividends paid in period t, D_t; and (iv) the net proceeds from equity issued in period t, E_t. For future use, let Z_t denote the debt capacity of the firm in period t, defined as a limit on y_t imposed internally by management or by the capital markets, and let C_t be the expected net after-tax cash flow to the firm in period t. The problem can now be written as:

\text{Maximize } dV(x_j, y_t, D_t, E_t) = W, \qquad (2.7)

subject to the constraints

(a) U_j = x_j - 1 \le 0 \quad (j = 1, 2, \ldots, J);
(b) U_t^{F} = y_t - Z_t \le 0 \quad (t = 1, 2, \ldots, T);
(c) U_t^{C} = C_t - [y_t - y_{t-1}(1 + (1 - \tau)r)] + D_t - E_t \le 0,

where τ and r are the tax rate and borrowing rate, respectively. Both are assumed constant for simplicity, but in actuality both could be defined as functions of other variables. Constraints (a) and (b) specify that the percentage of a project undertaken cannot exceed 100% and that the debt outstanding cannot exceed the debt-capacity limit. Constraint (c) is the accounting identity indicating that outflows of funds equal inflows, and could be interpreted as the restriction that the firm maintain no excess funds. Constraint (b) can be used to investigate the interaction between the firm's financing and investment decisions by examining the necessary conditions for optimization. If we assign the symbols L_j, L_ft, and L_ct to be the shadow prices on constraints (a)–(c), respectively, and let A_j = ∂W/∂x_j, F_t = ∂W/∂y_t, Z_jt = ∂Z_t/∂x_j, and C_jt = ∂C_t/∂x_j, we can rewrite Equation (2.7) in Lagrangian form as:

\text{Max } W' = W - L_j(x_j - 1) - L_{ft}(y_t - Z_t) - L_{ct}\{C_t - [y_t - y_{t-1}(1 + (1 - \tau)r)] + D_t - E_t\}. \qquad (2.7')
The necessary first-order conditions for the optimum are shown as Equations (2.8) through (2.11), with accompanying explanations. For each project:

A_j + \sum_{t=0}^{T} [L_{ft} Z_{jt} + L_{ct} C_{jt}] - L_j \le 0. \qquad (2.8)
This can be interpreted as follows: the percentage of a project undertaken should be increased until its incremental cost exceeds the sum of the incremental values of the project, the latter consisting of the added debt capacity and the value of the cash flows generated by that project. The incremental increase in the value of the firm obtained by increasing x_j to this maximum point is termed the adjusted present value, APV ("adjusted" because of the consideration of interaction effects). We can examine each of these effects in turn. For the debt constraint in each period:

F_t - L_{ft} + L_{ct} - L_{c,t-1}[1 + (1 - \tau)r] \le 0. \qquad (2.9)
For the constraint implied on dividends:

\frac{dW}{dD_t} - L_{ct} \le 0; \qquad (2.10)
and for the new-equity constraint:

\frac{dW}{dE_t} + L_{ct} \le 0. \qquad (2.11)
While Equations (2.8)–(2.11) are all of interest, the focus is on Equation (2.8), which tells us that the net present value (NPV) rule commonly put forth should be replaced by the APV rule, which accounts for interaction effects. Specifically considering the financing constraint, Equation (2.9), we would like to be able to find the value of this constraint, which is most easily done by assuming for the moment that dividend policy is irrelevant. Combined with Equations (2.8) and (2.9) and the definition of APV, we obtain:

APV_j = A_j + \sum_{t=0}^{T} Z_{jt} F_t, \qquad (2.12)

which, in the spirit of the M&M with-tax firm-valuation model, tells us that the value of a project is given by the increase in value that would occur in an unlevered firm, plus the value of the debt the project is capable of supporting. This follows from the firm's ability to deduct interest expenses for tax purposes. This procedure will be further investigated in the following section, when we compare it with other well-accepted capital-budgeting techniques.

When dividend policy is not irrelevant, the standard argument arises as to whether the preference for dividends, given their general tax status, outweighs the transaction costs incurred by the firm when dividends are paid and new financing must be raised. In this framework, dividends are viewed as another cash flow and, as such, the issue centers on whether the cash flows from the project plus any increases in the debt capacity of the firm are adequate to cover the cost of financing, through whatever means. Although we are unable to discern the effect of dividend policy on the value of the firm, it is not outside the solution technique to incorporate such effects by including the related expenses as inputs to the numerical-solution procedure.

It is generally assumed that these interaction effects exist, so if we are to consider disregarding them, as some would insist, we must know what conditions are necessary for a lack of dependence between financing and investment decisions. As Myers points out, only in a world of perfect markets with no taxes is this the case. Otherwise, the tax deductibility of interest on debt suggests that the APV method gives a more accurate assessment of project viability than does the standard NPV method.
Risky Debt Case

Rendleman (1978) not only examined the risk premiums associated with risky debt, but also considered the impact that debt financing could have on equity values, with taxes and without. The argument is to some extent based on the validity (or lack thereof) of the perfect-market assumption often invoked, which, interestingly enough, turns out to be a double-edged sword. Without taxes, the original M&M article claims, the investment decision of a firm should be made independent of the financing decision. But the financing base of the firm supports all the firm's investment projects, not some specific project. From this we infer that the future investments of a firm and the risk premiums embodied in the financing costs must be considered when the firm takes on new projects. If, for example, the firm chooses to take on projects of higher-than-average risk, then this may have an adverse effect on the value of the outstanding debt (to the gain of the shareholders), and the converse holds true as well. It follows that the management of a firm should pursue more risky projects to transfer some of the firm's risk from the shareholders to
the bondholders, who do not receive a commensurate return for that risk. If the bondholders anticipate this action on the part of management, then it is all the more imperative that management take the action, because the bondholders are requiring and receiving a risk premium for which they are not incurring the "standard" or market-consensus level of risk. Myers (1977) presents an argument in which a firm should issue no risky debt. The rationale for this strategy is that a firm possesses certain real options, or investment opportunities, that unfold over time. With risky debt, some of these investment projects available only to the firm of interest may not be undertaken if it is in the interest of the shareholders to default on a near-term scheduled debt payment. In this way risky debt induces a suboptimal investment policy, and firm value is not maximized as it would be if the firm issued no risky debt. One problem here is that we do not have strict equivalence between equity-value maximization and firm-value maximization, so investment policy cannot be thought of entirely in terms of firm-value maximization. This is a quirk associated with the option-pricing framework when applied to firm valuation, and it clouds the determination of what is suboptimal in the finance area. Although we concede that we are not entirely sure how we should treat the interaction effects of financing and investment policy, we can state with some degree of certainty that, even in the absence of the tax deductibility of interest, these two finance-related decisions are interdependent; as a result, the financial manager should remain wary of those who subscribe to the idea that financing decisions do not matter. Allowing the tax deductibility of interest, it is ironic to note that the conclusions are not nearly as clear-cut as before. If the firm undertakes further, more risky projects, the value of the debt may actually increase if the firm does not issue further debt. The shareholders gain from the new project only on its own merits. If no additional debt financing is raised, it is impossible to obtain a larger tax shield, and the debt may actually become more secure as a result of the larger asset base. If, however, the firm does issue more debt and in that way acts to jeopardize the currently outstanding debt, the considerations multiply, and analysis becomes exceedingly difficult because the value of the project by itself, plus the value of the added tax shield, needs to be considered in light of the possible shifting of wealth due to transfers of risk among claimants of the firm. Thus it seems that, when we allow the real world to influence the model, as it well should, the only thing we can say for certain about financing and investment decisions is that each case must be considered separately.
2.5 Implications of Financing and Investment Interactions for Capital Budgeting

This section is intended to briefly review capital-budgeting techniques, explain how each in turn neglects accounting for financing influences in investment-opportunity analysis, and discuss the method by which each does incorporate this aspect of financial management into the decision process. We will draw heavily on the work of Chambers et al. (1982) in making particular distinctions between the methods most often covered in corporate-finance textbooks and presumably used in practice, presenting numerical comparisons derived from varying sets of circumstances. Chambers, Harris, and Pringle (CHP) examined four standard discounted-cash-flow models and considered the implications of using each as opposed to the other three. By way of simulation, they were able to deal with differences in financing projects as well as with possible differences in the risk underlying the operating cash flows generated by each project. The problem inherent in this and any project evaluation (notwithstanding other difficulties) lies in specifying the amount of debt used to finance the investment project. Project debt is therefore defined as the additional debt capacity afforded the firm as a result of accepting the project, and can alternatively be described as the difference between the firm's optimal debt level with the project and without it. Conceptually this is a fairly concrete construct, but it still leaves some vague areas when we are actually performing the computations. It is essential to arrive at some value estimate of the estimated cash flows of a project. The CHP analysis considered the following four methods: (i) the equity-residual method; (ii) the after-tax weighted-average cost-of-capital method; (iii) the Arditti-Levy weighted-average cost-of-capital method; and (iv) the Myers adjusted-present-value method. For simplicity, Table 2.3 contains the definitions of the symbols used in the following discussion, after which we briefly discuss the formulation of each method and then elaborate on the way in which each method incorporates the interacting effects of financing and investment decisions.
Table 2.3 Definitions of variables

R_t = pretax operating cash revenues of the project during period t;
C_t = pretax operating cash expenses of the project during period t;
dep_t = additional depreciation expense attributable to the project in period t;
τ_c = applicable corporate tax rate;
I = initial net cash investment outlay;
D_t = project debt outstanding during period t;
NP = net proceeds of issuing project debt at time zero;
r = interest rate of debt in period t;
k_e = cost of the equity financing of the project;
k_w = after-tax weighted-average cost of capital (i.e., debt cost is after-tax);
k_AL = weighted-average cost of capital, with debt cost considered before taxes;
ρ = required rate of return applicable to the unlevered cash-flow series, given the risk class of the project.
r, k_e, and ρ are all assumed to be constant over time.

Equity-Residual Method

The equity-residual method is formulated in Equation (2.13) in a manner that emphasizes the goal of financial management, the pursuit of the interests of shareholders:
NPV(ER) = \sum_{t=1}^{N} \frac{(R_t - C_t - dep_t - rD_t)(1 - \tau_c) + dep_t - (D_t - D_{t+1})}{(1 + k_e)^t} - [I - NP]. \qquad (2.13)

The formula presented above can be interpreted as stating that the benefit of the project to the shareholders is the present value of the cash flows not going to pay operating expenses or to service or repay debt obligations. With these flows identified as those going to shareholders, it is appropriate, and rather easy, to discount these flows at the cost of equity. The only difficulty involved is identifying this cost of equity, a problem embodied in all capital-budgeting methods.
After-Tax, Weighted-Average Cost-of-Capital Method

The after-tax, weighted-average cost-of-capital method, depicted in Equation (2.14), has two noticeable differences from the formulation of the equity-residual method:

NPV = \sum_{t=1}^{N} \frac{(R_t - C_t - dep_t)(1 - \tau_c) + dep_t}{(1 + k_w)^t} - I. \qquad (2.14)

First, no flows associated with the debt financing appear in the numerator, or as relevant cash flows. Second, the cost of capital is adjusted downward, with k_w being a weighted average of debt and equity costs, the debt expense accounted for on an after-tax basis. In that way debt financing is reckoned with in an indirect manner. With the assumption that r and k_e are constant over time, k_w can be affected only by the debt-to-equity ratio, a problem most often avoided by assuming a fixed debt-to-equity ratio.
Arditti-Levy Method

The Arditti-Levy method is most similar to the after-tax weighted-average cost-of-capital method. This formulation can be written as:

NPV(AL) = \sum_{t=1}^{N} \frac{(R_t - C_t - dep_t - rD_t)(1 - \tau_c) + dep_t + rD_t}{(1 + k_{AL})^t} - I. \qquad (2.15)
It was necessary to restate the after-tax weighted-average cost-of-capital formula in this manner because the tax payment to the government has an influence on the net cash flows, and for that reason the cash-flow figures would otherwise be misleading. The discount rate must now be adjusted, because the interest tax shield is already reflected in the cash flows and double counting would otherwise be involved. While the after-tax weighted-average cost of capital recognized the cost of debt in the discount rate, it was akin to the equity-residual method in considering only returns to equity. The Arditti-Levy formulation implies that a weighted-average discount rate including the lower cost of debt could only (or best) be used if all flows to all sources of financing were included; hence the term rD_t found at the end of the first term in the numerator. This can be rationalized if one considers the case of a firm with one owner. The total cash flow to the owner is the relevant figure, and the discount rate applicable is simply a weighted average of the two individual required rates of return.
Table 2.4 Application of four capital-budgeting techniques

Inputs: (1) k_e = 0.112; (2) r = 0.041; (3) τ_c = 0.46; (4) ρ = 0.0802; (5) w = 0.6

Method                   NPV result   Discount rate(s)
1. Equity-residual       $230.55      k_e = 0.112
2. After-tax WACC        270.32       k_w = 0.058
3. Arditti-Levy WACC     261.67       k_AL = 0.069
4. Myers APV             228.05       r = 0.041 and ρ = 0.0802

From Chambers et al. (1982). Reprinted by permission
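The figures in Table 2.4 can be checked with a short script. Treating the $300 as the unlevered after-tax operating flow and assuming the $600 of debt is repaid in five equal principal installments (both consistent with the example as described in the text below), the four formulas, Eqs. (2.13)–(2.16), give values that match the table up to rounding of the discount rates:

```python
"""Sketch reproducing Table 2.4 for the CHP example: $1,000 outlay, $300
after-tax operating flow for 5 years, $600 of debt repaid $120 per year."""
ke, r, tau, rho, w = 0.112, 0.041, 0.46, 0.0802, 0.6
I, X, N = 1000.0, 300.0, 5
D = [600 - 120 * t for t in range(N)]     # debt outstanding in years 1..5
kw = w * r * (1 - tau) + (1 - w) * ke     # after-tax WACC, about 0.058
kal = w * r + (1 - w) * ke                # Arditti-Levy WACC, about 0.069

pv = lambda flows, k: sum(cf / (1 + k) ** t for t, cf in enumerate(flows, 1))

npv_er = pv([X - r * D[t] * (1 - tau) - 120 for t in range(N)], ke) - (I - 600)
npv_wacc = pv([X] * N, kw) - I
npv_al = pv([X + tau * r * D[t] for t in range(N)], kal) - I
apv = pv([X] * N, rho) - I + pv([tau * r * D[t] for t in range(N)], r)
print(f"ER {npv_er:.2f}  WACC {npv_wacc:.2f}  AL {npv_al:.2f}  APV {apv:.2f}")
# -> roughly ER 230.55, WACC 270.32, AL 261.2, APV 228.3; the small gaps
#    from the published 261.67 and 228.05 reflect rounding of the rates.
```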
Myers Adjusted-Present-Value Method

This method, derived in Sect. 2.4, is closely related to the Arditti-Levy method except for the exclusion of the interest-expense flows to the bondholders. In treating the financing mix, Myers implicitly assumes that the tax shield, created by the interest payments and afforded to the equity holders, has the same risk for the equity holders as the coupon or interest payments have for the bondholders. Instead of aggregating all factors and attempting to arrive at a suitable weighted-average discount rate, Myers found it less difficult to leave operating- and financing-related flows separated, and to discount each at an appropriate rate. This formulation can be written as:

APV = \sum_{t=1}^{N} \frac{(R_t - C_t - dep_t)(1 - \tau_c) + dep_t}{(1 + \rho)^t} - I + \sum_{t=1}^{N} \frac{\tau_c r D_t}{(1 + r)^t}. \qquad (2.16)
This formulation appears to be closely related to the Modigliani and Miller with-tax firm-valuation model, and well it should, given Myers' motivation for this work. We choose not to discuss it here, but the reader should be aware that Myers' emphasis was not solely on the tax advantage of debt, as the last term in the above equation tends to imply. The four methods are obviously comparable, but usually give different figures for the net present value. The equity-residual, after-tax weighted-average cost-of-capital, and Arditti-Levy weighted-average cost-of-capital formulations are comparable if the value of the debt outstanding remains a constant proportion of the remaining cash flows; this is often taken to mean a constant debt ratio. This will not guarantee that the Myers APV method will yield the same results; the Myers method is equivalent to the other three only if the project life can be described as one period, or as infinite with constant perpetual flows in all periods. By way of numerical examples we now address the task of determining the factors that create the differences in the net-present-value figures generated by each method. The first example involves the simplest case, where a $1,000 investment generates operating flows of $300 per
year for all 5 years of the project's life. Debt financing comprises $600 of the project's total financing, and the principal is repaid over the five time periods in equal amounts. The discount and tax rates employed are included in Table 2.4 with the results for each of the four methods. Because the Arditti-Levy method recognizes the acquisition of the debt capital and uses a lower discount rate, as does the after-tax WACC, these two methods give the highest net-present-value figures. The equity-residual method recognizes only the $400 outflow at time zero, but the higher discount rate suppresses the net-present-value figure. The Myers APV method, though discounting financing-related flows at the cost of debt, also attains a low net-present-value figure because the majority of the flows are discounted at the higher unlevered cost-of-equity rate. Basically we can speak of three differences in these methods that create the large discrepancies in the bottom-line figures. The risk factor, reflected in discount rates that may vary over time, is a major element, as was evidenced in the example shown above. The pattern of debt and debt payments will also be an important factor, particularly when the debt repayment is not of the annuity form assumed earlier. Finally, the recognition and valuation of the debt tax shields will play an important role in net-present-value determination, especially when the constant-debt-repayment assumption is dropped and the interest expenses and associated tax shields grow larger. In the CHP study the valuation models described earlier were employed in a simulation procedure that would allow the assessment of the investment proposal. The inputs to the capital-budgeting procedures were varied across simulations, and the effects of the changes in each input were scrutinized from a sensitivity-analysis viewpoint. They considered, but did not dwell upon, the effects of changing discount rates over time or by some scaling factor on the bottom-line valuation figures; further discussion was deferred primarily because of the multitude of possible combinations that would be of interest. Compounding the problem, one would also be interested in different debt-equity combinations and the effects these would have with changing discount rates, as the weighting scheme of the appropriate discount rates plays an integral part in the analysis. In avoiding this aspect of the analysis, the projects evaluated in the forthcoming discussion will be viewed as being of equivalent risk, or as coming
Table 2.5 Inputs for simulation

Project   Net cash inflows per year                      Project life
1         $300 per year                                  5 years
2         $253.77 per year                               5 years
3         $124.95 per year                               20 years
4         $200 per year, years 1–4; $792.58 in year 5    5 years

For each project the initial outlay is $1,000 at time t = 0, with all subsequent outlays being captured in the yearly flows.

Debt schedules:
(1) Market value of debt outstanding remains a constant proportion of the project's market value
(2) Equal principal repayments in each year
(3) Level debt, total principal repaid at termination of project

Inputs: k_e = 0.112; r = 0.041; ρ(M&M) = 0.0986; ρ(M) = 0.085; k_w = 0.085; τ_c = 0.46; w = 0.3

From Chambers et al. (1982). Reprinted by permission
from the same risk class. Lest the reader feel shortchanged, it is suggested that one examine each method and verify what changes would be produced in the net-present-value figures with varying debt schedules. More in tune with the basic theme of this chapter, we go on to consider the effects that changing financing mixes, debt-payment patterns, and project lives have on the figures attained for each of the four methods considered, the first (financing mix) being the major issue involved. In confronting mix effects, we are required to select a model for valuing debt in the Myers APV method because ρ (the unlevered cost of equity capital) is unobservable. In this case we must choose between numerous alternatives, the most notable being those of Modigliani and Miller (1963), where debt provides an interest tax shield, and that of Miller (1977), where the inclusion of personal taxes on investor interest income has the effect of perfectly offsetting the tax shield the firm receives, rendering the debt tax advantage moot. In the simulation results of CHP presented in the following paragraphs, the Myers cost of unlevered equity capital is computed using each method, with subscripts denoting the particular method used. Four projects of varied cash-flow patterns and lives are presented and valued, each of which will be simulated with three different debt schedules. Brief descriptions of the projects and debt schedules can be found in Table 2.5, along with the fixed inputs to be used in the actual computations. The projects' cash flows, as listed in the table, were manipulated in a predetermined way so that the more interesting cases would be presented. Project 1 is simply a base figure, while Project 2 has a cash-flow pattern that makes the net present value 0 when using the after-tax weighted-average cost of capital. Project 3 has a longer life, and thus will serve as a means of determining the effects of increasing project life. Finally, Project 4 has four level payments
with a larger final payment in year 5, intended to simulate an ongoing project with terminal value in year 5. The results of the simulations are presented in Table 2.6. For purposes of comparison, the net-present-value figures of the after-tax weighted-average cost of capital are reported first, for two reasons. The after-tax weighted-average cost of capital is probably the best-known technique of capital budgeting, and its net-present-value figures are insensitive to the debt schedule. The initial weights of the debt and equity used to support the project are all that are used in calculating the weighted discount rate, rendering the repayment pattern of debt irrelevant. As indicated earlier, the after-tax weighted-average cost-of-capital, Arditti-Levy weighted-average cost-of-capital, and equity-residual methods are all equivalent if the debt ratio is held constant. It is also of interest here that the Myers APV figures, using the Miller method for determining the cost of capital, are constant over debt schedules; they are, for all intents and purposes, the same as the after-tax weighted-average cost-of-capital figures, a finding that is reasonable if one believes that the cost of debt capital is the same as that of unlevered equity. Of all the methods cited above, the equity-residual method is the most sensitive to the debt schedule employed, with both principal and interest payments included in the cash flows, a feature compounded by the higher discount rates included in the computations. The two methods that include the interest tax shield, the Arditti-Levy method and the Myers APV (M&M) method, are also sensitive to the amount of debt outstanding, the interest tax shield being of greater value the longer the debt is outstanding. In all cases the latter method gives the lowest net-present-value figures due to its treatment of the interest tax shield. In further simulations CHP showed (and it should be no surprise) that as higher levels of debt are employed and higher tax rates are encountered, the magnitude of the
Table 2.6 Simulation results: net present value under alternative debt schedules

Project  Capital-budgeting method   Constant debt ratio   Equal principal   Level debt
1        After-tax WACC                     182                 182            182
1        Arditti-Levy WACC                  182                 179            187
1        Equity residual                    182                 167            202
1        Myers APV (M&M)                    160                 157            166
1        Myers APV (M)                      182                 182            182
2        After-tax WACC                       0                   0              0
2        Arditti-Levy WACC                    0                   1              7
2        Equity residual                      0                   3             32
2        Myers APV (M&M)                    -18                 -19            -10
2        Myers APV (M)                        0                   0              0
3        After-tax WACC                     182                 182            182
3        Arditti-Levy WACC                  182                 169            186
3        Equity residual                    182                 128            194
3        Myers APV (M&M)                    138                 119            150
3        Myers APV (M)                      182                 182            182
4        After-tax WACC                     182                 182            182
4        Arditti-Levy WACC                  182                 174            182
4        Equity residual                    182                 147            183
4        Myers APV (M&M)                    155                 146            156
4        Myers APV (M)                      182                 182            182

From Chambers et al. (1982). Reprinted by permission
In further simulations CHP showed (and it should be no surprise) that as higher levels of debt are employed and higher tax rates are encountered, the magnitude of the differences is amplified, and the method employed in the capital-budgeting decision takes on greater and greater importance. Even so, it was argued that changes associated with changes in the inputs of the longest-lived project, where the changes in the net-present-value figures were the most pronounced, were not that great when compared with the outcomes associated with changing the estimates of the cash flows by as little as 5%. The importance of this finding is that projects that are of short duration and are financed with relatively little debt are not very sensitive to the capital-budgeting technique employed. But in the case of longer-lived projects, the method selected can have serious implications for the acceptance or rejection of a project, particularly when higher levels of debt financing are employed. Analysts undoubtedly possess their own views as to which method is stronger conceptually and, in evaluating capital-budgeting procedures, should be aware of the debt policy to be pursued. Even in light of these views, it may be prudent to use the Myers APV method with the M&M unlevered-equity cost-determination method as the first screening device for a project or set of projects. Since this method yields the most conservative figures, any project that appears profitable following this analysis should be undertaken, and any project failing this screening can be analyzed using the methods chosen by the financial manager, if further analysis is thought to be warranted.
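To make the contrast concrete, the following minimal sketch computes the net present value of a single hypothetical project under the after-tax WACC and under a Myers APV with an M&M-style interest tax shield (base-case flows discounted at the unlevered rate, tax shields at the debt rate). All inputs, including the project flows and the amortizing debt schedule, are illustrative and are not those of Table 2.5:

import numpy as np

def npv(rate, cashflows):
    # cashflows[0] is the time-0 outlay (negative); later entries are year-end flows
    t = np.arange(len(cashflows))
    return float(np.sum(np.asarray(cashflows) / (1.0 + rate) ** t))

outlay = -1000.0
after_tax_cf = [300.0] * 5                   # unlevered after-tax operating flows
rho, rd, tax = 0.12, 0.06, 0.40              # unlevered cost of equity, debt rate, tax rate
debt = [500.0, 400.0, 300.0, 200.0, 100.0]   # debt outstanding during each year

# After-tax WACC: only the initial weights matter, so this figure is
# insensitive to the repayment pattern of the debt.
wd = debt[0] / -outlay                                # initial debt weight B/V
re = rho + (rho - rd) * (1 - tax) * wd / (1 - wd)     # M&M levered cost of equity
wacc = wd * rd * (1 - tax) + (1 - wd) * re
print(f"NPV, after-tax WACC: {npv(wacc, [outlay] + after_tax_cf):7.1f}")

# Myers APV (M&M): base-case NPV at rho plus interest tax shields discounted
# at rd -- this figure does depend on the entire debt schedule.
base = npv(rho, [outlay] + after_tax_cf)
shield = sum(tax * rd * d / (1 + rd) ** (t + 1) for t, d in enumerate(debt))
print(f"APV, Myers (M&M)   : {base + shield:7.1f}")

With these illustrative numbers the APV figure comes out well below the WACC figure, in line with the conservatism of the Myers APV (M&M) method noted above.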
In this book, Chap. 56 discusses the capital structure and CEO entrenchment in Asia. Moreover, Chap. 60 discusses the theory and application of alternative methods to determine optimal capital structure. Finally, Chap. 92 provides discussion on the capital structure and entry deterrence.
2.6 Implications of Different Policies on the Beta Coefficient

Investment, financing, dividend, and production policies are four important policies in financial management decisions. In previous sections, we have discussed investment, financing, and dividend policies. In this section, we will discuss the impacts of financing, production, and dividend policies on the determination of the beta coefficient.
Impact of Financing Policy on Beta Coefficient Determination

Suppose that the security is a share in the common stock of a corporation, and let us assume that this corporation increases the proportion of debt in its capital structure, all other relevant factors remaining unchanged. How would you
expect this change to affect the firm’s beta? We have seen that an increase in the level of debt leads to an increase in the riskiness of stockholders’ future earnings. To compensate for this additional risk, stockholders will demand a higher expected rate of return. Therefore, from Equation (2.17), beta must rise for this company’s stock. We see that, all other things equal, the higher the proportion of debt in a firm’s capital structure, the higher the beta of shares of its common stock.
βL = βU [1 + (B/S)(1 − τc)]   (2.17)

where βL is the leveraged beta; βU is the unlevered operating beta; B is the amount of debt; S is the amount of equity; and τc is the corporate tax rate. When the market model is used to estimate a firm's beta, the resulting estimate reflects the market assessment of both operating and financial risk; it is therefore called the leveraged beta. Hamada (1972) and Rubinstein (1973) suggest that Equation (2.17) can be modified to calculate the unleveraged beta, which is an estimate of the firm's operating, or business, risk.
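As a simple illustration of Equation (2.17), the following sketch unlevers an observed beta and relevers it at a different capital structure; the numerical inputs are hypothetical:

def unlever_beta(beta_l, debt, equity, tax):
    # Equation (2.17) solved for the unlevered beta
    return beta_l / (1 + (debt / equity) * (1 - tax))

def relever_beta(beta_u, debt, equity, tax):
    # Equation (2.17) applied directly
    return beta_u * (1 + (debt / equity) * (1 - tax))

beta_l = 1.30                 # market-model estimate: operating + financial risk
b_u = unlever_beta(beta_l, debt=40.0, equity=60.0, tax=0.35)
print(f"unlevered beta: {b_u:.3f}")
# Beta implied by moving to a 50/50 capital structure, other things equal:
print(f"relevered beta: {relever_beta(b_u, 50.0, 50.0, 0.35):.3f}")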
Impact of Production Policy on Beta Coefficient Determination

In this section, we will first discuss production policy; we will then discuss its implications for the determination of the beta coefficient. Production policy refers to the input mix a company uses to produce its products. A company's production process can be classified as either capital intensive or labor intensive, depending on whether the capital-labor ratio (K/L ratio) is larger or smaller than one. We can use either the Cobb-Douglas production function or the variable elasticity of substitution (VES) production function to show how capital and labor affect the beta coefficient. The Cobb-Douglas production function in two factors, capital (K) and labor (L), can be defined as:

Q = K^a L^b   (2.18)

where Q is the firm's output and a and b are positive parameters. Lee et al. (1990) have derived the theoretical relationship between the beta coefficient and the capital-labor ratio in terms of the Cobb-Douglas production function, as shown in Equation (2.19):

β = (1 + r) Cov(ẽ, R̃m) / (Var(R̃m) Φ [1 − (1 − E)b])   (2.19)
where r is the risk-free rate; R̃m, the return on the market portfolio; ẽ, a random price disturbance with zero mean; E = (∂P/∂Q)(Q/P), an elasticity constant; b, the contribution of labor to total output; Φ = 1 − λ Cov(ṽ, R̃m); and λ, the market price of systematic risk. In addition, the VES production function in two factors, capital (K) and labor (L), can be defined as shown in Equation (2.20):

Q = K^(α(1−s)) [L + (ρ − 1)K]^(αs)   (2.20)

where Q is the firm's output and α, s, and ρ are parameters with the following constraints: α > 0; 0 < s < 1; 0 ≤ sρ ≤ 1; and L/K > (1 − ρ)/(1 − s). Lee et al. (1995) have derived the theoretical relationship between the beta coefficient and the capital-labor ratio in terms of the VES production function as follows:

β = (1 + r) {Cov(ẽ, R̃m) − [Φ(1 − E)αs]^(−1) w(pQ)^(−1) (1 − ρ)K Cov(ṽ, R̃m)} / (Var(R̃m) {Φ − [Φ(1 − E)αs]^(−1) w(pQ)^(−1) (1 − ρ)K})   (2.21)

where p is the expected price of output; ρ, the reciprocal of the price elasticity of demand, 0 ≤ ρ ≤ 1; w, the expected wage rate; ṽ, a random shock in the wage rate with zero mean; Φ = 1 − λ Cov(ẽ, R̃m); and r, R̃m, ẽ, E, and λ are as defined in Equation (2.19).
Impact of Dividend Policy on Beta Coefficient Determination

The impact of the payout ratio on the beta coefficient is one of the important issues in finance research. Using dividend signaling theory, Lee et al. (1993) have derived the relationship between the beta coefficient and the payout ratio as follows:

βi = βpi [1 + λ F(di)]   (2.22)

where βi is the firm's systematic risk when the market is informationally imperfect and the information asymmetry can be resolved by dividends; βpi, the firm's systematic risk when the market is informationally perfect; λ, a signaling cost incurred
if the firm's net after-tax operating cash flow X̃ falls below the promised dividend D; di, the firm's dividend payout ratio; and F(di), the cumulative normal density function in terms of the payout ratio. Equation (2.22) implies that the beta coefficient is a function of the payout ratio. In sum, this section has shown how financing, production, and dividend policies can affect the beta coefficient. The information discussed in this section is important for calculating the cost of equity capital.
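As a rough illustration of Equation (2.22), the sketch below computes the beta implied by different payout ratios. The normal-CDF parameterization of F(di) (its mean and spread) and the signaling-cost value λ used here are purely hypothetical:

import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def signaling_beta(beta_perfect, payout, lam, mu=0.5, sigma=0.2):
    # Equation (2.22): beta_i = beta_pi * (1 + lam * F(d_i)),
    # with F modeled here as a normal CDF in the payout ratio
    return beta_perfect * (1.0 + lam * normal_cdf((payout - mu) / sigma))

for d in (0.2, 0.4, 0.6, 0.8):
    print(f"payout {d:.1f} -> beta {signaling_beta(1.0, d, lam=0.15):.3f}")

Under this parameterization, beta rises monotonically with the payout ratio, which is the qualitative implication of Equation (2.22).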
2.7 Conclusion

In this chapter we have attacked many of the irrelevance propositions of the neoclassical school of finance theory, and in so doing have created a good news-bad news sort of situation. The good news is that by claiming that financial policies are important, we have justified the existence of academicians and a great many practicing financial managers. The bad news is that we have made their lot a great deal more difficult, as numerous tradeoffs were investigated, the more general of these comprising the title of the chapter. In the determination of dividend policy, we examined the relevance of the internal-external equity decision in the presence of nontrivial transaction costs. While the empirical evidence was found to be inconclusive because of the many variables that could not be controlled, there should be no doubt that flotation costs (incurred when issuing new equity to replace retained earnings paid out) by themselves have a negative impact on firm value. But if the retained earnings paid out are replaced in whole or in part by debt, the equity holders may stand to benefit, because risk is transferred to the existing bondholders – risk for which they do not receive a commensurate return. Thus, if the firm pursues a more generous dividend-payout policy while not changing its investment policy, the change in the value of the firm depends on the way in which the future investment is financed. The effect that debt financing has on the value of the firm was analyzed in terms of the interest tax shield it provides and the extent to which the firm can utilize that tax shield. In Myers' analysis we also saw that a limit on borrowing could be incorporated so that factors such as risk and the probability of insolvency would be recognized when making each capital-budgeting decision. When compared to other methods widely used in capital budgeting, Myers' APV formulation was found to yield more conservative benefit estimates. While we do not wish to discard the equity-residual, after-tax weighted cost-of-capital, or Arditti-Levy weighted cost-of-capital methods, we set forth Myers' method as the most appropriate starting point when a firm is first considering a project, reasoning that if the project
was acceptable following Myers' method, it would be acceptable using the other methods – to an even greater degree. If the project was not acceptable following the APV criterion, it could be reanalyzed with one of the other methods. We hope the biases of each method were made clear with the introduction of debt financing. In Sect. 2.6 we discussed how different policies can affect the determination of the beta coefficient. In essence, this chapter points out the vagaries and difficulties of financial management in practice. Virtually no decision concerning the finance function can be made independent of the other variables under management's control. Profitable areas of future research in this area are abundant; some have already begun to appear in the literature under the heading "simultaneous-equation planning models." Any practitioner would be well advised to stay abreast of developments in this area.
Notes
1. Alderson (1984) has applied this kind of concept to the problem of corporate pension funding. He finds grounds for the integration of pension funding with capital-structure decisions. In doing so, he argues that unfunded vested liabilities are an imperfect substitute for debt.
2. BHS (1981) discusses the relationship between debt capacity and the existence of an optimal capital structure.
3. If the dividend policy is irrelevant, then λct = λc,t−1 = 0; if dW/dyt = Ft is positive, then the constraints will always be binding, Equation (2.7b) will hold as strict equalities, and λft = Ft for all t. Substituting λct = 0 and λft = Ft into Equation (2.8), we obtain Equation (2.12).
4. This statement can be explained by agency theory, which was developed by Jensen and Meckling (1976) and extended by Barnea et al. (1981). According to agency theory and the findings of Stiglitz (1972), the manager might not act in the equityholders' best interest. For his own benefit, the manager might shift the wealth of the firm from the equityholders to the bondholders.
References Alderson, M. J. 1984. Unfunded pension liabilities and capital structure decisions: a theoretical and empirical investigation, PhD dissertation, The University of Illinois, Urbana-Champaign Arditti, F. D. 1980. “A survey of valuation and the cost of capital.” Research in Finance 2, 1–56 Arditti, F. D. and Y. C. Peles. 1977. “Leverage, bankruptcy, and the cost of capital.” in Natural resources, uncertainty and general equilibrium systems: essays in memory of Rafael Lusky, A. Blinder and P. Friedman (Eds.). Academic Press, New York
Arditti, F. D. and H. Levy. 1977. “The weighted average cost of capital as a cutoff rate: a critical analysis of the classical textbook weighted average.” Financial Management 6(3), 24–34 Ang, J. S. and J. H. Chua. 1982. “Mutual funds: different strokes for different folks.” The Journal of Portfolio Management 8, 43–47 Barnea, A., R. A. Haugen and L. W. Senbet. 1981. “Market imperfections, agency problems, and capital structure: a review.” Financial Management 10, 7–22 Baron, D. P. 1975. “Firm valuation, corporate taxes, and default risk.” Journal of Finance 30, 1251–1264 Baumol, W. and B. G. Malkiel. 1967. “The firm’s optimum debt-equity combination and the cost of capital.” Quarterly Journal of Economics 81(4), 546–578 Black, F. 1976. “The dividend puzzle.” Journal of Portfolio Management, 5–8 Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81(3), 637–654 Chambers, D. R., R. S. Harris, and J. J. Pringle. 1982. “Treatment of financing mix in analyzing investment opportunities.” Financial Management 11, 24–41 Chen, A. H. 1978. “Recent development in cost of debt capital.” Journal of Finance 33, 863–877 Conine, T. E. Jr. 1980. “Debt capacity and the capital budgeting decision: a comment.” Financial Management, 20–22 DeAngelo, H. and L. DeAngelo. 2006. “The irrelevance of the MM dividend irrelevance theorem.” Journal of Financial Economics 79, 293–315 Fama, E. F. and M. H. Miller. 1972. The theory of finance, Holt, Rinehart and Winston, New York Gahlon, J. M. and R. D. Stover. 1979. “Debt capacity and the capital budgeting decision: a caveat.” Financial Management 8, 55–59 Gupta, M. C. and A. C. Lee. 2006. “An integrated model of debt issuance, refunding, and maturity.” Review of Quantitative Finance and Accounting 26(2), 177–199 Gupta, M. C., A. C. Lee, and C. F. Lee. 2007. Effectiveness of dividend policy under the capital asset pricing model: a dynamic analysis. Working Paper, Rutgers University Hamada, R. S. 1972. “The effect of the firm’s capital structure on the systematic risk of common stocks.” Journal of Finance 27(2), 435–452 Hanoch, G. and H. Levy. 1969. “The efficiency analysis of choices involving risk.” Review of Economic Studies 36, 335–346 Hellwig, M. F. 1981. “Bankruptcy, limited liability, and the Modigliani-Miller theorem.” The American Economic Review 71, 155–170 Higgins, R. C. 1977. “How much growth can a firm afford?” Financial Management 6(3), 7–16 Higgins, R. C. 1981. “Sustainable growth under inflation.” Financial Management, 36–40 Hong, H. and A. Rappaport. 1978. “Debt capacity and the capital-budgeting decision: a caveat.” Financial Management, 7–11 Jean, W. 1975. “Comparison of moment and stochastic-dominance ranking methods.” Journal of Financial and Quantitative Analysis 10, 151–162 Jensen, M. and W. Meckling. 1976. “Theory of the firm: managerial behavior, agency costs, and ownership structure.” Journal of Financial Economics 3, 305–360 Johnson, D. J. 1981. “The behavior of financial structure and sustainable growth in an inflationary environment.” Financial Management 10, 30–35 Kim, E. H. 1978. “A mean-variance theory of optimal capital structure and corporate debt capacity.” Journal of Finance 33, 45–64 Kim, E. H. 1982. “Miller’s equilibrium, shareholder leverage clienteles, and optimal capital structure.” Journal of Finance 37, 301–319 Kim, E. H., J. J. McConnell, and P. R. Greenwood. 1977. 
“Capital-structure rearrangements and me-first rules in an efficient capital market.” Journal of Finance 32, 811–821
Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York Lee, C. F., W. Chunchi, and D. Hang. 1993. “Dividend policy under conditions of capital markets and signaling equilibrium.” Review of Quantitative Finance and Accounting 3, 47–59 Lee, C. F., S. Rahman, and K. Thomas Liaw. 1990. “The impacts of market power and capital-labor ratio on systematic risk: a Cobb-Douglas approach.” Journal of Economics and Business 42(3), 237–241 Lee, C. F., K. C. Chen, and K. Thomas Liaw. 1995. “Systematic risk, wage rates, and factor substitution.” Journal of Economics and Business 47(3), 267–279 Levy, H. and M. Sarnat. 1972. Investment and portfolio analysis, Wiley, New York Litzenberger, R. H. and K. Ramaswamy. 1979. “The effect of personal taxes and dividends on capital-asset prices: theory and empirical evidence.” Journal of Financial Economics 7, 163–195 Litzenberger, R. H., and H. B. Sosin. 1979. “A comparison of capital-structure decisions of regulated and nonregulated firms.” Financial Management 8, 17–21 Malkiel, B. G. 1970. “The valuation of public-utility equity.” Bell Journal of Economics and Management Science 1, 143–160 Martin, J. D. and D. F. Scott. 1976. “Debt capacity and the capital budgeting decision.” Financial Management 5(2), 7–14 Martin, J. D. and D. F. Scott Jr. 1980. “Debt capacity and the capital budgeting decision: a revisitation.” Financial Management 9(1), 23–26 Merton, R. C. 1974. “On the pricing of corporate debt: the risk structure of interest rates.” Journal of Finance 29, 449–470 Miller, M. H. 1977. “Debt and taxes.” Journal of Finance 32, 261–275 Modigliani, F. and M. H. Miller. 1958. “The cost of capital, corporation finance and the theory of investment.” American Economic Review 48(3), 261–297 Modigliani, F. and M. H. Miller. 1963. “Corporate income taxes and the cost of capital: a correction.” American Economic Review 53(3), 433–443 Myers, S. C. 1974. “Interaction of corporate financing and investment decision.” Journal of Finance 29, 1–25 Myers, S. C. 1977. “Determinants of corporate borrowing.” Journal of Financial Economics 5, 147–175 Porter, R. B. and J. E. Gaumnitz. 1972. “Stochastic dominance vs. mean-variance portfolio analysis: an empirical evaluation.” American Economic Review 57, 438–446 Rendleman, R. J. Jr. 1978. “The effects of default risk on firm’s investment and financing decisions.” Financial Management 7(1), 45–53 Reseck, R. W. 1970. “Multidimensional risk and the Modigliani-Miller hypothesis.” Journal of Finance 25, 47–51 Rhee, S. G. and F. L. McCarthy. 1982. “Corporate debt capacity and capital-budgeting analysis.” Financial Management 11, 42–50 Rubinstein, M. E. 1973. “A mean-variance synthesis of corporate financial theory.” Journal of Finance 28, 167–187 Stiglitz, J. 1972. “Some aspects of pure theory of corporate finance: Bankruptcies and take-overs.” Bell Journal of Economics and Management Science 3, 458–482 Tuttle, L., and R. H. Litzenberger. 1968. “Leverage, diversification, and capital-market effects on a risk-adjusted capital-budgeting framework.” Journal of Finance 23, 427–443 Van Horne, J. C. and J. D. McDonald. 1971. “Dividend policy and new equity financing.” Journal of Finance 26, 507–519
Appendix 2A Stochastic Dominance and its Applications to Capital-Structure Analysis with Default Risk
2A.1 Introduction

Mean-variance approaches were extensively used in the literature to derive alternative finance theories and perform related empirical studies. Besides mean-variance approaches, there is a more general approach, stochastic-dominance analysis, which can be used to choose a portfolio, to evaluate mutual-fund performance, and to analyze the optimal capital-structure problem. Levy and Sarnat (1972), Porter and Gaumnitz (1972), Jean (1975), and Ang and Chua (1982) have discussed the stochastic-dominance approach to portfolio choice and mutual-fund performance evaluation in detail. Baron (1975), Arditti and Peles (1977), and Arditti (1980) have used the theory of stochastic dominance to investigate the optimal capital-structure question. In this appendix we will discuss only how stochastic-dominance theory can be used to analyze the issue of optimal capital structure with default risk.

2A.2 Concepts and Theorems of Stochastic Dominance

The expected utility rule can be used to introduce the economics of choice under uncertainty. However, this decision rule has been based upon the principle of utility maximization, where either the investor's utility function is assumed to be a second-degree polynomial with a positive first derivative and a negative second derivative, or the probability function is assumed to be normal. A stochastic-dominance theory is an alternative approach to preference orderings that does not rely upon these restrictive assumptions. The stochastic-dominance technique assumes only that individuals prefer more wealth to less. An asset is said to be stochastically dominant over another if an individual receives greater wealth from it in every (ordered) state of nature. This definition is known as first-order stochastic dominance. Mathematically, it can be described by the relationship between two cumulative-probability distributions. If X symbolizes the investment dollar-return variable and F(X) and G(X) are two cumulative-probability distributions, then F(X) will be preferred to G(X) by every person who is a utility maximizer and whose utility is an increasing function of wealth if F(X) ≤ G(X) for all possible X, and F(X) < G(X) for some X. For the family of all monotonically nondecreasing utility functions, Levy and Sarnat (1972) show that first-order stochastic dominance implies that:

EF U(X) > EG U(X)   (2A.1)

where EF U(X) and EG U(X) are expected utilities. Mathematically, they can be defined as:

EF U(X) = ∫ U(X) f(X) dX   (2A.2a)

EG U(X) = ∫ U(X) g(X) dX   (2A.2b)

where U(X) is the utility function, X is the investment dollar-return variable, and f(X) and g(X) are the probability distributions of X. It should be noted that the above-mentioned first-order stochastic dominance does not depend upon the shape of the utility function beyond positive marginal utility. Hence, the investor can be risk-seeking, risk-neutral, or risk-averse. If utility functions are nondecreasing and strictly concave, then the second-order stochastic-dominance theorem can be defined. Mathematically, second-order dominance can be defined as:

∫ from x̲ to x of [G(t) − F(t)] dt ≥ 0 for all x, where G(t) ≠ F(t) for some t.   (2A.3)

Equation (2A.3) specifies a necessary and sufficient condition for an asset F to be preferred over a second asset G by all risk-averters. Conceptually, Equation (2A.3) means that in order for asset F to dominate asset G for all risk-averse investors, the accumulated area under the cumulative-probability distribution G must be greater than the accumulated area under F for any given level of wealth. This implies that, unlike first-order stochastic dominance, the cumulative density functions can cross. Stochastic dominance is an extremely important and powerful tool. It is properly founded on the basis of expected-utility maximization, and, even more important, it applies to any probability distribution, because it takes into account every point in the probability distribution. Furthermore, we can be sure that if an asset demonstrates second-order stochastic dominance, it will be preferred by all risk-averse investors, regardless of the specific shape of their utility functions. Assume that the density functions of earnings per share (EPS) for firm A and firm B are f(X) and g(X), respectively. Both f(X) and g(X) are normal distributions, and the shapes of these two distributions are described in Fig. 2A.1. Obviously the EPS of firm A will dominate the EPS of firm B if an investor is risk-averse, because they both
Fig. 2A.1 Probability distributions of asset F and asset G
offer the same expected level of wealth (μf = μg) and because the variance of g(X) is larger than that of f(X). Based on both the first-order and the second-order stochastic-dominance theorems mentioned earlier, a new theorem needed for investigating capital structure with default risk can be stated as:

Theorem 2A.1. Let F, G be two distributions with mean values μ1 and μ2, respectively, such that, for some x0 < ∞, F ≤ G for X ≤ x0 (and F < G for some X1 < x0) and F ≥ G for X ≥ x0. Then F dominates G (for concave utility functions) if and only if μ1 ≥ μ2.

The proof of this theorem can be found in either Hanoch and Levy (1969) or Levy and Sarnat (1972). Conceptually, this theorem states that if two cumulative distributions, F and G, intersect only once, then if F is below G to the left of the intersection point and has a higher mean than does G, the investment with cumulative distribution F dominates that with cumulative return distribution G on a second-degree stochastic-dominance basis.
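A simple numerical check of second-degree dominance along the lines of Equation (2A.3) can be sketched as follows. The simulated EPS distributions are illustrative (firm A is given a slightly higher mean and a lower variance, so dominance holds robustly in the simulated sample):

import numpy as np

def ssd_dominates(a, b, n_grid=2000, tol=1e-6):
    # Empirical check of Equation (2A.3): the running integral of
    # [G_b(t) - F_a(t)] must be nonnegative everywhere, positive somewhere.
    grid = np.linspace(min(a.min(), b.min()), max(a.max(), b.max()), n_grid)
    F = np.searchsorted(np.sort(a), grid, side="right") / a.size
    G = np.searchsorted(np.sort(b), grid, side="right") / b.size
    integral = np.cumsum(G - F) * (grid[1] - grid[0])
    return bool(np.all(integral >= -tol) and np.any(integral > tol))

rng = np.random.default_rng(0)
eps_a = rng.normal(2.2, 1.0, 100_000)   # firm A EPS: higher mean, lower variance
eps_b = rng.normal(2.0, 2.0, 100_000)   # firm B EPS
print(ssd_dominates(eps_a, eps_b))      # True: A dominates B for all risk-averters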
2A.3 Stochastic-Dominance Approach to Investigating the Capital-Structure Problem with Default Risk

The existence of default risk is one of the justifications for why an optimal capital structure might exist for a firm. To analyze this problem, Baron (1975), Arditti and Peles (1977), and Arditti (1980) have used the stochastic-dominance theorems described in the previous section to indicate the effects of debt financing on the relative values of levered and unlevered firms. Baron analyzed the bonds in terms of default risk, tax rates, and debt levels. Consider two firms, or the same firm before and after debt financing, with identical probability distributions of gross earnings X, before taxes and financial charges, such that, in any state of nature that occurs, both firms have the same earnings. In addition, it is assumed that this random variable X is associated with a cumulative-distribution function F(X). Firm A is assumed to be financed solely by equity, while firm B is financed by both debt and equity. The market value V1 of firm A equals E1, the value of its equity, while the market value V2 of firm B equals the market value of its equity (E2) plus the value of its debt (D2). Debt is assumed to sell at its par value and to carry a gross coupon rate of r (r > 0). The firm can generally use the coupon payments as a tax shield, and therefore the after-tax earnings for firms A and B can be defined as X(1 − T) and (X − rD2)(1 − T), respectively (where T is the corporate-profit tax rate). If an investor purchases a fraction α of firm A, this investment results in dollar returns of:

Y1 = 0, if X ≤ 0;
Y1 = αX(1 − T), if X > 0;   (2A.4)

with cumulative probability function:

G1(Y) = F(0), if Y = 0;
G1(Y) = F[Y/α(1 − T)], if Y > 0.   (2A.5)
If the investor purchased a fraction α of the equity and α of the debt of firm B, his dollar return would be:

Y2 = 0, if X ≤ 0;
Y2 = α(1 − k)X(1 − T), if 0 < X(1 − T) < D2 + rD2;
Y2 = α[X(1 − T) + TrD2], if X(1 − T) ≥ D2 + rD2;   (2A.6)

where k represents the percentage liquidation cost. Since these bonds comprise a senior lien on the earnings of the firm, failure to pay the promised amount of D2 + rD2 to bondholders can force liquidation of the firm's assets.
The cumulative distribution associated with Equation (2A.6) is:

G2(Y) = F(0), if Y = 0;
G2(Y) = F[Y/α(1 − k)(1 − T)], if 0 < Y ≤ α(1 − k)[D2 + rD2];
G2(Y) = F[(Y/α − TrD2)/(1 − T)], if Y ≥ α[D2 + rD2].   (2A.7)

Comparing Equation (2A.5) with Equation (2A.7), it can be found that:

G1(Y) < G2(Y), if 0 < Y < α(D2 + rD2);   (2A.8)
G1(Y) ≥ G2(Y), if Y ≥ α(D2 + rD2);   (2A.9)

where α(D2 + rD2) is the critical income level of bankruptcy. Equations (2A.8) and (2A.9) imply that the levered firm cannot dominate the unlevered firm on a first-degree or second-degree stochastic-dominance basis. From the theorem of stochastic dominance presented in the previous section, it can also be concluded that the unlevered firm cannot dominate the levered firm, because the expected return of the levered firm is generally higher than that of the unlevered firm (otherwise why lever?), and because the G2(Y) and G1(Y) curves intersect only once. Consequently, a general statement using stochastic dominance cannot be made with respect to the amount of debt that a firm should issue. However, these results have indicated the possibility of an interior (noncorner) optimal capital structure. By using a linear utility function, Arditti and Peles (1977) show that there are probability distributions for which an interior capital structure for a firm can be found. In addition, they also make some other market observations that are consistent with the implications of their model. These observations are (i) firms with low income variability, such as utilities, carry high debt-equity ratios; and (ii) firms that have highly marketable assets and therefore low liquidation costs, such as shipping lines, seem to rely heavily on debt financing relative to equity.
2A.4 Summary

In this appendix we have tried to show the basic concepts underlying stochastic dominance and its application to capital-structure analysis with default risk. By combining utility-maximization theory with cumulative-density functions, we are able to set up a decision rule without explicitly relying on individual statistical moments. This stochastic-dominance theory can then be applied to problems such as capital-structure analysis with risky debt, as was shown earlier.
Chapter 3
Research Methods in Quantitative Finance and Risk Management
Abstract The main purpose of this chapter is to discuss the important quantitative methods used in quantitative finance and risk management research. We first discuss statistics theory and methods. Second, we discuss econometric methods. Third, we discuss mathematics. Finally, we discuss other methods such as operations research, stochastic processes, computer science and technology, entropy, and fuzzy set theory. Keywords Statistics · Econometrics · Mathematics · Operations research · Stochastic processes · Computer science and technology · Entropy · Fuzzy set theory
3.1 Introduction

Quantitative finance and risk management research requires finance theories, finance policies, and methodology. Finance theories and finance policies have been discussed in Chaps. 1 and 2. This chapter briefly discusses the alternative methodologies used in quantitative finance and risk management research. In Sect. 3.2 we will review the statistical theory and methods used in quantitative finance and risk management research. The econometric and mathematical methods used in quantitative finance and risk management research will be presented in Sects. 3.3 and 3.4, respectively. In addition, in Sect. 3.5 we will briefly review other methods used in quantitative finance and risk management, such as operations research, stochastic processes, computer science and technology, entropy, and fuzzy set theory. Finally, our conclusion will be presented in Sect. 3.6.
3.2 Statistics Statistics is important in developing finance theories and in implementing empirical models. It is the prerequisite for studying financial economics, especially for researchers who encounter much uncertainty. The concepts of distribution,
such as the binomial distribution, multinomial distribution, normal distribution, lognormal distribution, and non-central chi-square distribution, help researchers to develop alternative option pricing models. In this book, Chap. 24 discusses applications of the binomial distribution to evaluating call options. Chapter 25 discusses the multinomial option pricing model. It extends the binomial option pricing model to a multinomial option pricing model, derives the multinomial option pricing model, and applies it to the limiting case of the Black and Scholes model. Finally, it introduces a lattice framework for option pricing and its application to option valuation. In Chaps. 27 and 28, normal-distribution, lognormal-distribution, and bivariate-normal-distribution option pricing models are discussed. Chapter 27 introduces the normal distribution, the lognormal distribution, and their relationship. Multivariate normal and lognormal distributions are also examined. Finally, both normal and lognormal distributions are applied to derive the Black-Scholes formula under the assumption that the stock price follows a lognormal distribution. Chapter 28 presents the American option pricing model on stock with and without dividend payments. A Microsoft Excel program for evaluating the American option pricing model is also presented. Chapter 29 compares American option prices with one known dividend under two alternative specifications of the underlying stock price: displaced lognormal and lognormal processes. Many option pricing models follow the standard assumption of the Black-Scholes (1973) model, in which the stock price follows a lognormal process. However, in order to reach a closed-form solution for the American option price with one known dividend, Roll (1977), Geske (1979), and Whaley (1981) assume a displaced lognormal process for the cum-dividend stock price, which results in a lognormal process for the ex-dividend stock price. The chapter compares the two alternative pricing results. Chapter 31 describes the non-central chi-square type of OPM. It reviews the renowned constant elasticity of variance (CEV) option pricing model and gives the detailed derivations. There are two purposes in this chapter. First, it shows the details of the formulas needed in deriving the option pricing model and bridges the gaps in deriving the necessary formulas for the
model. Second, it uses a result by Feller to obtain the transition probability density function of the stock price at time T given its price at time t, with t < T. In addition, some computational considerations are given that will facilitate the computation of the CEV option pricing formula. Chapter 34 analyzes the convergence rates and patterns of binomial option pricing models. It extends the generalized Cox-Ross-Rubinstein (hereafter GCRR) model of Chung and Shih (2007) and provides a further analysis of convergence rates and patterns based on various GCRR models. The numerical results indicate that the GCRR-XPC model and the GCRR-JR (p = 1/2) model (defined in Table 34.1) outperform the other GCRR models for pricing European calls and American puts. These results confirm that node positioning and the selection of the tree structure (mainly the asymptotic behavior of the risk-neutral probability) are important factors in determining the convergence rates and patterns of binomial models. Chapter 35 uses mixture distributions to estimate implied probabilities from option prices. It examines a variety of methods for extracting implied probability distributions from option prices and the underlying. The chapter first explores nonparametric procedures for reconstructing densities directly from options market data. It then considers local volatility functions, both through implied volatility trees and volatility interpolation, and then turns to alternative specifications of the stochastic process for the underlying. A mixture-of-lognormals model is estimated and applied to exchange-rate data, and it is illustrated how to conduct forecast comparisons. Finally, the chapter turns to the estimation of jump risk by extracting bipower variation. Chapter 57 uses the generalized hyperbolic distribution to develop a generalized model for the optimal futures hedge ratio. It proposes the generalized hyperbolic distribution as the joint log-return distribution of the spot and futures. Using the parameters of this distribution, it derives several of the most widely used optimal hedge ratios: minimum variance, maximum Sharpe measure, and minimum generalized semivariance. Under mild assumptions on the parameters, it finds that these hedge ratios are identical. Regarding the equivalence of these optimal hedge ratios, the analysis in this chapter suggests that the martingale property plays a much more important role than the joint distribution assumption. Multivariate statistical methods, such as factor analysis and discriminant analysis, are also important in performing empirical studies in quantitative finance and risk management. This naturally leads to the use of regression analysis, which includes single-equation and multiple-equation models. Maximum likelihood methods are also important in statistical estimation. Moreover, when the estimation requires fewer distributional assumptions about the variables, nonparametric estimation is preferred.
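As a minimal illustration of nonparametric estimation, the following sketch computes a Gaussian-kernel density estimate of a simulated fat-tailed return series; the data-generating mixture and the bandwidth are purely illustrative:

import numpy as np

def kde(x_grid, samples, h=0.02):
    # Gaussian-kernel density estimate: no parametric assumption on the data
    z = (x_grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (samples.size * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(5)
# mixture of a calm and a turbulent regime produces fat tails
returns = np.concatenate([rng.normal(0.0, 0.01, 900), rng.normal(0.0, 0.05, 100)])
grid = np.linspace(-0.10, 0.10, 9)
for x, d in zip(grid, kde(grid, returns)):
    print(f"density at {x:+.3f}: {d:8.3f}")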
Chapter 65 uses nonparametric methods to examine segmentation of the financial services market. It analyzes segmentation of financial markets based on general segmentation bases. In particular, it identifies potentially attractive market segments for financial services using a customer dataset. It develops a multigroup discriminant model to classify customers into three ordinal classes, prime customers, highly valued customers, and price shoppers, based on their income, loan activity, and demographics (age). The multi-group classification of customer segments uses both classical statistical techniques and a mathematical programming formulation. This study uses the characteristics of a real dataset to simulate multiple datasets of customer characteristics. The results of the proposed experiments show that the mathematical programming model in many cases consistently outperforms standard statistical approaches in attaining lower apparent error rates (APER) for 100 replications in both high- and low-correlation cases. Researchers also use Bayesian inference, in which the observations are used to update the probability that a hypothesis is true. Chapter 11 discusses Bayesian triangulation in asset pricing. It surveys some relevant academic research on the subject, including work on the combining of forecasts (Bates and Granger 1969), the Black-Litterman model (1991, 1992), the combining of Bayesian priors and regression estimates (Pastor 2000), model uncertainty and Bayesian model averaging (Hoeting et al. 1999; Cremers 2002), the theory of competitive storage (Deaton and Laroque 1992), and the combination of valuation estimates (Yee 2007). Despite its wide-ranging applicability, the Bayesian approach is not a license for data snooping. The second half of this chapter describes common pitfalls in fundamental analysis and comments on the role of theoretical guidance in mitigating these pitfalls. Chapter 91 proposes Bayesian inference for financial models using Markov chain Monte Carlo (MCMC) algorithms. It introduces the Gibbs sampler and MCMC algorithms. The marginal posterior probability densities obtained by the MCMC algorithms are compared to the exact marginal posterior densities. This chapter presents two financial applications of MCMC algorithms: the CKLS model of the Japanese uncollateralized call rate and the Gaussian copula model of the S&P 500 and FTSE 100.
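To give a flavor of the MCMC algorithms discussed above (Chap. 91), the following sketch implements a random-walk Metropolis sampler for the posterior mean of a normal sample; the prior, likelihood, and tuning values are illustrative toy choices, not those of the chapter:

import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(1.5, 1.0, 50)          # observed sample

def log_post(mu):
    # N(0, 10) prior on the mean, N(mu, 1) likelihood for each observation
    return -0.5 * mu**2 / 10.0 - 0.5 * np.sum((data - mu) ** 2)

mu, chain = 0.0, []
for _ in range(20_000):
    prop = mu + rng.normal(0.0, 0.5)     # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(mu):
        mu = prop                        # accept with Metropolis probability
    chain.append(mu)
print(f"posterior mean approx {np.mean(chain[5000:]):.3f}")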
The concept of stochastic dominance is important in portfolio optimization problems. Chapter 15 considers the problem of constructing a portfolio of finitely many assets whose return rates are described by a discrete joint distribution. This chapter presents a new approach to portfolio selection based on stochastic dominance. The portfolio return rate in the new model is required to stochastically dominate a random benchmark. It formulates optimality conditions and duality relations for these models and constructs equivalent optimization models with utility functions. Two different formulations of the stochastic dominance constraint, primal and inverse, lead to two dual problems that involve von Neumann–Morgenstern utility functions for the primal formulation and rank-dependent (or dual) utility functions for the inverse formulation. This study also discusses the relations of this approach to value at risk and conditional value at risk. A numerical illustration is provided. Characteristic functions and spectral analysis are important in option pricing research. Chapter 37 applies the characteristic function to derive a generalized option pricing model. Chapter 38 discusses the application of the characteristic function in financial research. It introduces the application of the characteristic function in financial research and considers the technique of the characteristic function useful for many option pricing models. Two option pricing models are derived in detail based on characteristic functions. Copula distributions are also important in default risk research. Chapter 46 provides a new approach to estimating future credit risk on a target portfolio based on the CreditMetrics™ framework of J.P. Morgan. However, it adopts the perspective of the factor copula and then brings the principal component analysis concept into the factor structure to construct a more appropriate dependence structure among credits. In order to examine the proposed method, this chapter uses real market data instead of virtual data. It also develops a tool for risk analysis that is convenient to use, especially for banking loan businesses. The results show that assuming normally distributed dependence structures will indeed lead to an underestimate of risk. On the other hand, the method proposed in this chapter captures the features of risk better and shows the fat-tail effects conspicuously, even when the factors are assumed to be normally distributed.
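A bare-bones illustration of the Gaussian copula underlying the Chap. 46 discussion can be sketched as follows: correlated normals are mapped to uniforms, which could then feed any choice of marginal distributions. The correlation value and tail threshold are illustrative:

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(7)
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal([0.0, 0.0], cov, size=100_000)
u = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))   # normal CDF maps z to U(0,1)

# Joint tail probability under the Gaussian copula vs. independence:
joint_tail = np.mean((u[:, 0] > 0.99) & (u[:, 1] > 0.99))
print(f"P(both above 99th pct): {joint_tail:.4f} (independence would give 0.0001)")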
3.3 Econometrics

Econometric models are necessary research tools in quantitative finance and risk management research. The major econometric methods used in modern finance research include, but are not limited to, the following subjects: linear regression models, time-series modeling, multiple-equation models, generalized method of moments, and panel data models. Many of the aforementioned subjects also contain well-developed sub-subjects. For example, the concepts of stationarity, autoregressive and moving-average models, and cointegration, developed within the framework of time series, are widely applied in finance research. Multiple-equation models include the application of vector autoregression (VAR), simultaneous equation models, and seemingly unrelated regression. Chapter 73 discusses ARM processes and their properties. It also presents ARM modeling and forecasting methodologies that try to capture a strong statistical signature of the empirical data. This chapter starts with a high-level discussion of stochastic model fitting and explains the fitting approach that motivates the introduction of ARM processes. It then continues with a review of ARM processes and their fundamental properties, including their construction, transition structure, and autocorrelation structure. Next, the chapter outlines the ARM modeling methodology by describing the key steps in fitting an ARM model to empirical data. It also describes in some detail the ARM forecasting methodology and the computation of the conditional expectations that serve as point estimates (forecasts of future values) and their underlying conditional distributions from which confidence intervals are constructed for the point estimates. Finally, the chapter concludes with an illustration of the efficacy of the ARM modeling and forecasting methodologies through an example utilizing a sample from the S&P 500 Index. Chapter 76 presents univariate and bivariate GARCH analyses of volume versus GARCH effects. It tests for the generalized autoregressive conditional heteroscedasticity (GARCH) effect on U.S. stock markets for different periods and re-examines the findings in Lamoureux and Lastrapes (1990) by using alternative proxies for information flow, which are included in the conditional variance equation of a GARCH model. It also examines the spillover effects of volume and turnover on conditional volatility using a bivariate GARCH approach. The findings show that volume and turnover have effects on conditional volatility and that the introduction of volume/turnover as exogenous variable(s) in the conditional variance equation reduces the persistence of GARCH effects as measured by the sum of the GARCH parameters. The results confirm the existence of the volume effect on volatility, consistent with the findings of Lamoureux and Lastrapes (1990) and others, suggesting that the earlier findings were not a statistical fluke. The findings in this chapter also suggest that, unlike other anomalies, the volume effect on volatility is not likely to be eliminated after discovery. In addition, this study rejects the pure random walk hypothesis for stock returns. The bivariate analysis also indicates that there are no volume or turnover spillover effects on conditional volatility among the companies, suggesting that the volatility of a company's stock return may not necessarily be influenced by the volume and turnover of other companies.
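As a minimal illustration of the GARCH machinery discussed above, the following sketch simulates a return series from a GARCH(1,1) model using the standard conditional-variance recursion sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}; the parameter values are illustrative:

import numpy as np

rng = np.random.default_rng(11)
omega, alpha, beta = 1e-5, 0.08, 0.90    # persistence alpha + beta = 0.98
n = 2000
r = np.empty(n)
sigma2 = np.empty(n)
sigma2[0] = omega / (1 - alpha - beta)   # unconditional variance as the start
for t in range(n):
    r[t] = np.sqrt(sigma2[t]) * rng.normal()
    if t + 1 < n:
        sigma2[t + 1] = omega + alpha * r[t] ** 2 + beta * sigma2[t]

# High persistence means volatility shocks die out slowly, producing clusters.
print(f"annualized average volatility: {np.sqrt(np.mean(sigma2) * 252):.3f}")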
Chapter 82 develops defensive forecasting methods to forecast stock prices. This study introduces a new method of forecasting, defensive forecasting, and illustrates its potential by showing how it can be used to predict the yields of corporate bonds in real time. Defensive forecasting identifies a betting strategy that succeeds if forecasts are inaccurate and then makes forecasts that will defeat this strategy. The theory of defensive forecasting is based on the game-theoretic framework for probability introduced by Shafer and Vovk in 2001. This study frames statistical tests as betting strategies. It rejects a hypothesis when a strategy for betting against it multiplies the capital it risks by a large factor. The study proves each classical theorem, such as the law of large numbers, by constructing a betting strategy that multiplies the capital it risks by a large factor if the theorem's prediction fails. Defensive forecasting plays against strategies of this type. Chapter 86 discusses the application of simultaneous equations in finance research. This chapter introduces the concept of a simultaneous equation system and the application of such a system to finance research. The study first discusses the order and rank conditions of the identification problem. It then introduces the two-stage least squares (2SLS) and three-stage least squares (3SLS) estimation methods. The pros and cons of these estimation methods are summarized. Finally, the results of a study on executive compensation structure and risk-taking are used to illustrate the difference between single-equation and simultaneous-equation methods. Chapter 89 proposes a method to detect structural instability of financial time series. It reviews the literature that deals with structural shifts in statistical models for financial time series. The issue of structural instability is at the heart of the capital asset pricing theories on which many modern investment portfolio concepts and strategies rely. The need to better understand the nature of structural evolution and to devise effective control of the risk in real-world investments has become ever more urgent in recent years. Since the early 2000s, concepts of financial engineering and the resulting product innovations have advanced at an astonishing pace, but unfortunately often on a faulty premise and without regulatory oversight keeping pace. The developments have threatened to destabilize worldwide financial markets. This chapter explains the principles and concepts behind these phenomena. Along the way it highlights developments related to these issues over the past few decades in the statistical, econometric, and financial literature. It also discusses recent events that illuminate the dynamics of structural evolution in the mortgage lending and credit industry.
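To illustrate the 2SLS procedure introduced in the Chap. 86 discussion above, the following sketch hand-rolls the two stages for a single endogenous regressor with one instrument; the data are simulated so that the OLS bias is visible, and all parameter values are illustrative:

import numpy as np

rng = np.random.default_rng(3)
n = 5000
z = rng.normal(size=n)                        # instrument
u = rng.normal(size=n)                        # structural error
x = 0.8 * z + 0.5 * u + rng.normal(size=n)    # endogenous: correlated with u
y = 1.0 + 2.0 * x + u                         # true coefficient on x is 2.0

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
x_hat = Z @ ols(Z, x)                         # first stage: project x on instruments
beta_2sls = ols(np.column_stack([np.ones(n), x_hat]), y)   # second stage
print(f"OLS  slope: {ols(X, y)[1]:.3f} (biased away from 2.0)")
print(f"2SLS slope: {beta_2sls[1]:.3f} (close to 2.0)")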
Chapter 93 discusses the theoretical VAR model and its applications. Chapter 96 uses GARCH and Spline-GARCH models to model and forecast volatilities and asset returns. Dynamic modeling of asset returns and their volatilities is a central topic in quantitative finance. This chapter reviews basic statistical models and methods for the analysis and forecasting of volatilities, and it surveys several recent developments in regime-switching, change-point, and multivariate volatility models. Chapter 99 uses combined forecasting methods to estimate the cost of equity capital for property/casualty insurers. This chapter estimates the cost of equity capital for property/casualty insurers by applying three alternative asset pricing models: the capital asset pricing model (CAPM), the arbitrage pricing theory (APT), and a unified CAPM/APT model (Wei 1988). The in-sample forecast ability of the models is evaluated by applying the mean squared error method, the Theil (1966) U2 statistic, and the Granger (1974) conditional efficiency evaluation. Based on these forecast evaluation procedures, the APT and Wei's unified CAPM/APT models perform better than the CAPM in estimating the cost of equity capital for the P/C insurers, and a combined forecast may outperform the individual forecasts. Chapter 103 uses Box-Cox functional-form transformations in closed-end country fund research. This chapter proposes a generalized functional-form CAPM model for international closed-end country fund performance evaluation. It examines the effect of heterogeneous investment horizons on portfolio choices in the global market. Empirical evidence suggests that there are some empirical anomalies that are inconsistent with the traditional CAPM. These inconsistencies arise because the specification of the CAPM ignores the discrepancy between observed and true investment horizons. A comparison between the functional forms for share returns and NAV returns of closed-end country funds suggests that foreign investors may have more heterogeneous investment horizons than their U.S. counterparts. Market segmentation and government regulation do have some effect on market efficiency. No matter which generalized functional model is used, the empirical evidence indicates that, on average, the risk-adjusted performance of international closed-end funds is negative even before expenses. The fixed-effect panel data model and the random-effect panel data model are also important in much empirical finance research. Chapter 51 uses a panel threshold model to examine the effect of default risk on equity liquidity. This chapter sets out to investigate the relationship between credit risk and equity liquidity. It posits that as the firm's default risk increases, informed trading increases in the firm's stock and uninformed traders exit the market. Market-makers widen spreads in response to the increased probability of trades against informed investors. Using the default likelihood measure calculated by Merton's method, this chapter investigates whether financially ailing firms do indeed have higher bid-ask spreads. The panel threshold regression model is employed to examine the possible nonlinear relationship between credit risk and equity liquidity. Since high default probability and worse economic prospects lead to greater expropriation by managers, and thus greater asymmetric information costs, liquidity providers will incur relatively higher costs and will therefore offer higher bid-ask spreads. This issue is further analyzed by investigating whether there is any evidence of increased vulnerability in the equity liquidity of firms with high credit risk. The results in this chapter show that an increase in default likelihood might precipitate investors' loss of confidence in equities trading and thus cause a decrease in liquidity, as was particularly evident during the Enron crisis period. Chapter 69 discusses spurious regression and data mining in conditional asset pricing models. Stock returns are not highly autocorrelated, but there is a spurious regression bias in predictive regressions for
stock returns, similar to the classic studies of Yule (1926) and Granger (1974). Data mining for predictor variables reinforces spurious regression bias, because more highly persistent series are more likely to be selected as predictors. In asset pricing regression models with conditional alphas and betas, the biases occur only in the estimates of conditional alphas. However, when time-varying alphas are suppressed and only time-varying betas are considered, the betas become biased. The analysis shows that the significance of many standard predictors of stock returns and time-varying alphas in the literature is often overstated. There are many other econometric methods discussed in this book as well. For example, Chap. 43 uses combinatorial methods for constructing credit risk ratings. This study uses a novel combinatorial method, the Logical Analysis of Data (LAD), to reverse-engineer and construct credit risk ratings that represent the creditworthiness of financial institutions and countries. The proposed approaches are shown to generate transparent, consistent, self-contained, and predictive credit risk rating models, closely approximating the risk ratings provided by some of the major rating agencies. The scope of applicability of the proposed method extends beyond the rating problems discussed in this study, and it can be used in many other contexts where ratings are relevant. The proposed methodology is applicable in the general case of inferring an objective rating system from archival data, given that the rated objects are characterized by vectors of attributes taking numerical or ordinal values. Chapter 50 uses a dynamic econometric loss model in a default study of the U.S. subprime market. It applies not only static loan and borrower characteristic variables, such as loan terms, combined loan-to-value ratio (CLTV), and Fair Isaac credit score (FICO), along with dynamic macroeconomic variables such as home price appreciation (HPA), to project defaults and prepayments, but also uses the spectrum of delinquencies as an error-correction term to add an additional 15% accuracy to the model projections. In addition to the delinquency-attribute findings, this chapter also shows that cumulative HPA and the change in HPA contribute various dimensions that greatly influence defaults. Another interesting finding is a significant long-term correlation between HPI and the disposable income level (DPI). Since DPI is more stable and easier to project, this suggests that HPI will eventually adjust to coincide with the DPI growth-rate trend and that HPI could potentially experience as much as a 14% decrease by the end of 2009. Chapter 62 uses robust logistic regression to study default prediction with outliers. This chapter suggests a Robust Logit method, which extends the conventional logit model by taking outliers into account, to implement forecasts of defaulted firms. The study employs five validation tests to assess the in-sample and out-of-sample forecast performances, respectively. With respect to in-sample forecasts, the Robust Logit method proposed in this chapter is substantially
superior to the logit method across all validation tools. With respect to out-of-sample forecasts, the superiority of Robust Logit is less pronounced. Issues related to errors-in-variables (EIV) problems in asset pricing tests are discussed in Chap. 70. This chapter develops a new method of correcting the EIV problem in the estimation of the price of beta risk within the two-pass estimation framework. The study performs the correction for the EIV problem by incorporating the extracted information about the relationship between the measurement error variance and the idiosyncratic error variance into the maximum likelihood estimation, under either homoscedasticity or heteroscedasticity of the disturbance term of the market model. The evidence shows that after correcting for the EIV problem, market beta has significant explanatory power for average stock returns, and the firm-size factor becomes much less important. This chapter also reexamines the explanatory power of beta and firm size for average stock returns using the proposed correction. If no correction for the EIV problem is implemented, then, consistent with recent empirical studies, there is no significant relation between beta and expected returns, especially when firm size is included as an additional explanatory variable. With the EIV correction, both the price and the significance of beta increase. The chapter documents that the EIV correction diminishes the impact of the firm-size variable; nevertheless, firm size remains a significant force in explaining the cross-section of average returns. While these results suggest a more important role for market beta, the evidence is still consistent with a misspecification in the capital asset pricing model. Chapter 72 discusses term structure models that incorporate Markov regime shifts. Both discrete-time and continuous-time regime-switching term structure models are discussed. This chapter briefly reviews models of the term structure of interest rates under regime shifts. These models retain the same tractability as the single-regime affine models on one hand, and allow more general and flexible specifications of the market prices of risk, the dynamics of the state variables, and the short-term interest rate on the other hand. These additional flexibilities can be important in capturing some salient features of the term structure of interest rates in empirical applications. The models can also be applied to analyze the implications of regime switching for bond portfolio allocations and the pricing of interest rate derivatives. Chapter 78 discusses hedonic regression analysis in real estate markets. This chapter provides an overview of the nature and variety of hedonic pricing models that are employed in the market for real estate. It explores the history of hedonic modeling and summarizes the field's utility-theory-based, microeconomic foundations. It also provides empirical examples of each of the three principal methodologies and a discussion of, and potential solutions for, common problems associated with hedonic modeling. Chapter 92 discusses the instrumental variable approach to correct for
endogeneity in corporate finance. The theoretical literature on the link between an incumbent firm’s capital structure (financial leverage, debt/equity ratio) and entry into its product market is based on two classes of arguments: the “limited liability” arguments and the “deep pocket” arguments. However, these two classes of arguments provide contradictory predictions. This study provides a distinct strategic model of the link between capital structure and entry that is capable of rationally producing both of the existing contradictory predictions. The focus is on the role of beliefs and the cost of adjusting capital structure, two of the factors that are absent in the existing models. The beliefs may be exogenously given or endogenized by a rational expectations criterion.
3.4 Mathematics

The mathematical methodology used in finance has its roots in research in economics. Financial researchers are interested in equilibrium analysis, optimization problems, and dynamic analysis. The use of linear and matrix algebra, real analysis, multivariate calculus, constrained and unconstrained optimization, nonlinear programming, and optimal control theory is indispensable. Research in portfolio theory and its applications frequently uses constrained maximization.

Chapter 5 discusses risk aversion, capital asset allocation, and the Markowitz portfolio selection model. It introduces the utility function and the indifference curve. Based on utility theory, it derives the Markowitz model and the efficient frontier through the creation of efficient portfolios of varying risk and return. It also includes methods of solving for the efficient frontier both graphically and mathematically, with and without explicitly incorporating short selling.

Chapter 7 demonstrates how index models can be used for portfolio selection. It discusses both the single-index model and the multiple-index portfolio selection model. It uses a constrained maximization rather than a minimization procedure to calculate the portfolio weights. We find that both single-index and multi-index models can be used to simplify the Markowitz model for portfolio selection.

Chapter 8 discusses the Sharpe measure, the Treynor measure, and optimal portfolio selection. Following Elton et al. (1976) and Elton et al. (2007), it introduces the performance-measure approaches to determining optimal portfolios. We find that the performance-measure approaches to optimal portfolio selection are complementary to the Markowitz full variance-covariance method and the Sharpe index-model method. The economic rationale of the Treynor method is also discussed in detail.

Chapter 10 discusses portfolio optimization models and mean-variance spanning tests. It introduces the theory of modern portfolio theory and its implementation in computer programs. The notion of diversification, captured in the age-old adage "don't put your eggs in one basket," obviously predates economic theory. However, a formal model showing how to make the most of the power of diversification was not devised until 1952, a feat for which Harry Markowitz eventually won the Nobel Prize in economics. The Markowitz model shows that as assets are added to an investment portfolio, the total risk of that portfolio, as measured by the variance (or standard deviation) of total return, declines continuously, while the expected return of the portfolio remains a weighted average of the expected returns of the individual assets. In other words, by investing in portfolios rather than in individual assets, investors can lower the total risk of investing without sacrificing return. The second part of the chapter introduces the mean-variance spanning test, which follows directly from the portfolio optimization problem.

Chapter 12 discusses estimation risk and power utility portfolio selection. It examines additional ways of estimating joint return distributions that allow us to explore the stationary/nonstationary tradeoffs implicit in "expanding" versus "moving" window estimation methods, the benefits of alternative methods of allowing for the "memory loss" inherent in the moving-window EPAA, and the possibility that weighting more recent observations more heavily may improve investment performance.

Chapter 13 discusses theory and method in international portfolio management. This chapter investigates the impact of various investment constraints on the benefits and asset allocation of the international optimal portfolio for domestic investors in various countries. The empirical results indicate that local investors in less developed countries, particularly in East Asia and Latin America, benefit more from global diversification. Though the global financial market is becoming more integrated, adding constraints reduces but does not completely eliminate the diversification benefits of international investment. The addition of portfolio bounds yields the following characteristics of asset allocation: a reduction in the temporal deviation of diversification benefits, a decrease in the time-variation of components in the optimal portfolio, and an expansion in the range of component assets. These findings are useful for asset management professionals in determining target markets in which to promote national and international funds and in managing risk in global portfolios.

In research on derivatives and option pricing, the use of Ito calculus and the derivation of ordinary and partial differential equations are important in arriving at closed-form solutions for option pricing models. Applications of Ito calculus in deriving option pricing models can be found in Chapter 30. That chapter develops certain relatively recent mathematical discoveries known generally as stochastic calculus, or more specifically as Itô's calculus, and illustrates their application in the pricing of options. The mathematical methods of stochastic calculus are illustrated in alternative derivations of the celebrated Black-Scholes-Merton model. The topic is motivated by a desire to provide an
intuitive understanding of certain probabilistic methods that have found significant use in financial economics.

Chapter 32 uses Ito calculus in stochastic volatility option pricing models. It assumes that the volatility in the option pricing model is stochastic rather than deterministic. The chapter applies this assumption to the non-closed-form solution developed by Scott (1987) and the closed-form solution of Heston (1993). In both cases, it considers a model in which the variance of stock price returns varies according to an independent diffusion process. For the closed-form option pricing model, the results are expressed in terms of the characteristic function.

Chapter 33 uses traditional calculus methods to derive the Greek letters and discuss their applications. It introduces the definitions of the Greek letters and provides their derivations for call and put options on both dividend-paying and non-dividend-paying stocks. It then discusses some applications of the Greek letters. Finally, it shows the relationships among the Greek letters; one example can be seen in the Black-Scholes partial differential equation.

Chapter 37 applies Ito calculus to option pricing and hedging performance under stochastic volatility and stochastic interest rates. Recent studies have extended the Black-Scholes model to incorporate either stochastic interest rates or stochastic volatility, but there is not yet any comprehensive empirical study demonstrating whether, and by how much, each generalized feature improves option pricing and hedging performance. This chapter fills that gap by first developing an implementable closed-form option model that admits both stochastic volatility and stochastic interest rates and is parsimonious in the number of parameters. The model includes many known models as special cases. Based on the model, both delta-neutral and single-instrument minimum-variance hedging strategies are derived analytically. Using S&P 500 option prices, the chapter then compares the pricing and hedging performance of this model with that of three existing models that respectively allow for (1) constant volatility and constant interest rates (the Black-Scholes model), (2) constant volatility but stochastic interest rates, and (3) stochastic volatility but constant interest rates. Overall, incorporating stochastic volatility and stochastic interest rates produces the best performance in pricing and hedging, with the remaining pricing and hedging errors no longer systematically related to contract features. The second performer in the horse race is the stochastic volatility model, followed by the stochastic interest rate model and then by the Black-Scholes model.

Chapter 44 uses Ito calculus in the structural approach to modeling credit risk. This chapter presents a survey of recent developments in the structural approach to the modeling of credit risk. It first reviews some models for measuring credit risk based on the structural approach, then discusses the empirical evidence in the literature on the performance of structural models of credit risk.

Chapter 85 demonstrates the applications of alternative ODEs in finance and economics research. This chapter introduces the ordinary
differential equation (ODE) and its application in finance and economics research. There are various approaches to solving an ordinary differential equation. This study mainly introduces the classical approach for a linear ODE and the Laplace transform approach. It illustrates the application of ODEs in both deterministic and stochastic systems. As an application in deterministic systems, the study applies ODEs to solve a simple Gross Domestic Product (GDP) model and an investment model of the firm that is well studied in Gould (1968). As an application in stochastic systems, this chapter employs ODEs in a jump-diffusion model and derives the expected discounted penalty given that the asset value follows a phase-type jump-diffusion process. In particular, this study examines a capital structure management model and derives the optimal default-triggering level. For related results, see Chen et al. (2007).

Chapter 98 discusses an ODE approach for the expected discounted penalty at ruin in a jump-diffusion model. Under the assumption that the asset value follows a jump diffusion, this chapter shows that the expected discounted penalty satisfies an ODE and obtains a general form for the expected discounted penalty. In particular, if only downward phase-type jumps are allowed, an explicit formula in terms of the penalty function and the jump distribution is obtained. On the other hand, if the downward jump distribution is a mixture of exponential distributions (and upward jumps are determined by a general Levy measure), closed-form solutions for the expected discounted penalty can be obtained. In this chapter, Leland's structural model with jumps is analyzed. For earlier and related results, see Gerber and Landry (1998), Hilberink and Rogers (2002), Mordecki (2002), Kou and Wang (2004), Asmussen et al. (2004), and others.

Chapters 101 and 102 use the stochastic differential equation approach to derive theoretical models of positive interest rates. Chapter 101 presents a dynamic term structure model in which interest rates of all maturities are bounded from below at zero. Positivity and continuity, combined with no arbitrage, result in only one functional form for the term structure with three sources of risk. This chapter casts the model into state-space form and extracts the three sources of systematic risk from both U.S. Treasury yields and U.S. dollar swap rates. It analyzes the different dynamic behaviors of the two markets during credit crises and liquidity squeezes. Chapter 102 explores what types of diffusion and drift terms forbid negative yields while nevertheless allowing any yield to be arbitrarily close to zero. This chapter shows that several models have these characteristics; however, they may also have other odd properties. In particular, the square root model of Cox-Ingersoll-Ross has such a solution, but only in a singular case. In other cases, bubbles will occur in bond prices, leading to unusually behaved solutions. Other models, such as the CIR three-halves power model, are free of such oddities.
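To give a concrete feel for the ODE techniques surveyed above, the short Python sketch below solves a first-order linear ODE numerically and checks the answer against its closed-form solution. The specification y' = ay + b and its coefficients are generic illustrations, not the particular GDP or investment models of Gould (1968) or Chapter 85:

```python
import math
from scipy.integrate import solve_ivp

a, b, y0, T = 0.03, 1.0, 100.0, 10.0   # illustrative coefficients and horizon

def rhs(t, y):
    # Right-hand side of the linear ODE y' = a*y + b.
    return [a * y[0] + b]

sol = solve_ivp(rhs, (0.0, T), [y0], t_eval=[T])
numeric = sol.y[0][-1]

# Closed-form solution of y' = a*y + b with y(0) = y0.
exact = (y0 + b / a) * math.exp(a * T) - b / a

print(round(numeric, 3), round(exact, 3))   # the two agree to solver tolerance
```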
Fuzzy set theory is also important in quantitative finance and risk management research. Chapter 77 discusses fuzzy set theory and its applications to finance research. Chapter 87 discusses fuzzy set and data mining applications in accounting and finance research. To reduce the complexity of computing tradeoffs among multiple objectives, this chapter adopts a fuzzy set approach to solve various accounting and finance problems such as international transfer pricing, human resource allocation, accounting information system selection, and capital budgeting. A solution procedure is proposed to systematically identify a satisfying selection of possible solutions that can reach the best compromise value for the multiple objectives and multiple constraint levels. The fuzzy solution can help top management make realistic decisions regarding various resource allocation problems, as well as the firm's overall strategic management, when environmental factors are uncertain. The purpose of the last part of the chapter is to test the effectiveness of a multiple criteria linear programming (MCLP) approach to data mining for bankruptcy prediction using financial data. Data mining applications have recently been receiving more attention in general business areas, but there is a need to use more of these applications in accounting and finance when dealing with large amounts of both financial and nonfinancial data. The results of the MCLP data mining approach in the bankruptcy prediction studies are promising and may extend to other countries.

Chapter 61 discusses actuarial mathematics and its applications in quantitative finance. First, it introduces the traditional actuarial interest functions and uses them to price different types of insurance contracts and annuities. Based on the equivalence principle, risk premiums for different payment schemes are calculated. After premium payments and the promised benefits are determined, actuarial reserves are calculated and set aside to cushion the expected losses. In a similar manner, actuarial mathematics is used to price the risky bond, the credit default swap, and the default digital swap, with a non-flat interest rate structure and with the time of default, a first passage time, taking the place of the future lifetime.

Moreover, product market games and signaling models in finance are discussed in Chap. 94. A crucial set of assumptions in a large class of models of financial market signaling and product market/financial market interactions relates to the super- or submodularity of the maximand. The assumption of one or the other can produce result reversals or indeterminacy. To the extent that data or the underlying economics cannot unambiguously choose between super- and submodularity, the theoretical robustness and empirical testability of such models become questionable. Based on three examples drawn from the recent finance literature, this study discusses the role that these assumptions play and raises issues of empirical identifiability in testing model predictions.
3.5 Other Disciplines

In addition to statistics, econometrics, and mathematics, many other disciplines are used in finance research. Operations research models are one example: linear programming and quadratic programming models are used in cash flow matching and portfolio immunization.

Chapter 14 discusses the Le Châtelier principle in the Markowitz quadratic programming investment model, using the case of the world equity fund market. Given the limited number of reliable international equity funds, the Markowitz investment model is well suited to constructing an international portfolio. Overinvestment in one or several fast-growing markets can be disastrous when political instability and exchange rate fluctuations reign supreme. This chapter applies the Le Châtelier principle to the international equity fund market with a set of upper limits. Tracing out a set of efficient frontiers, the study inspects the shifting phenomenon in the mean-variance space. The optimal investment policy can be easily implemented and risk minimized.

Chapter 16 discusses quadratic programming in portfolio analysis. Markowitz portfolio analysis delineates a set of highly desirable portfolios: those with the maximum return over a range of different risk levels. Conversely, Markowitz portfolio analysis can find the same set of desirable investments by delineating the portfolios that have the minimum risk over a range of different rates of return. The set of desirable portfolios is called the efficient frontier. Portfolio analysis uses rate-of-return statistics, risk statistics (standard deviations), and correlations from a list of candidate investments (stocks, bonds, and so forth) to determine which investments, in what proportions (weights), enter into the efficient portfolios. Markowitz's portfolio theory can be analyzed further to determine interesting asset pricing implications.

The role of risk measures such as Value at Risk (VaR) and Conditional Value at Risk (CVaR) in risk management is of interest to both academics and practitioners. Chapter 95 demonstrates the estimation of short- and long-term VaR for long-memory stochastic volatility models. The phenomenon of long-memory stochastic volatility (LMSV) has been extensively documented for speculative returns. This chapter investigates the effect of LMSV on estimating the value at risk (VaR), or quantile, of returns. The proposed model allows the return's volatility component to be short- or long-memory. The chapter derives various types of limit theorems that can be used to construct confidence intervals of VaR for both short- and long-term returns. For the latter case, the results are particularly of interest to financial institutions with exposure to long-term liabilities, such as pension funds and life insurance companies, which need a quantitative methodology to control market risk over longer horizons.
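For readers who want a baseline for the VaR discussion above, the simplest estimator is historical-simulation VaR, the empirical quantile of past returns. The sketch below is a deliberately naive illustration: simulated i.i.d. Gaussian returns stand in for real data, and the long-memory volatility refinements developed in Chapter 95 are left out:

```python
import random

random.seed(0)
# Simulated i.i.d. daily returns stand in for real data (~10 years).
returns = [random.gauss(0.0005, 0.01) for _ in range(2500)]

def historical_var(rets, alpha=0.01):
    """Historical-simulation VaR: the loss exceeded on only a fraction alpha of days."""
    ordered = sorted(rets)
    k = int(alpha * len(ordered))
    return -ordered[k]          # report VaR as a positive loss

print(round(historical_var(returns), 4))   # 1% one-day VaR, roughly 0.023 here
```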
Computer science is another important discipline in quantitative finance and risk management research. Program design, data structures, and algorithms help finance researchers implement their models with large data sets. The concepts of neural networks have also been applied in the security analysis and bankruptcy prediction literature. In the empirical implementation of derivatives pricing models, numerical analysis is required to compute the fair value of the derivatives.

Chapter 39 uses numerical methods to evaluate Asian options. It introduces Monte Carlo simulations and diverse mathematical and numerical techniques designed to estimate the distribution of the underlying asset, and then uses arbitrage arguments, binomial tree models, and the application of insurance formulas in the evaluation of options.

Chapter 40 demonstrates the numerical valuation of Asian options with higher moments in the underlying distribution. It develops a modified Edgeworth binomial model with higher-moment consideration for pricing European or American Asian options. As the number of time steps increases, the numerical algorithm is as precise as that of Chalasani et al. (1999), with a lognormal underlying distribution for benchmark comparison. If the underlying distribution displays negative skewness and leptokurtosis, as often observed for stock index returns, the estimates are better and very similar to the benchmarks in Hull and White (1993). The results show that the modified Edgeworth binomial model proposed in this chapter can value European and American Asian options with greater accuracy and speed given higher moments in the underlying distribution.

Chapter 79 presents numerical solutions of financial partial differential equations (PDEs). This chapter provides a general survey of important PDE numerical solutions and studies in detail certain numerical methods specific to finance, with programming samples. These numerical solutions for financial PDEs include the finite difference, finite volume, and finite element methods. Finite difference is simple to discretize and easy to implement; however, the explicit method is not guaranteed to be stable. Finite volume has the advantage over finite difference that it does not require a structured mesh. If a structured mesh is used, as in most cases of pricing financial derivatives, the finite volume and finite difference methods yield the same discretization equations. The finite difference method can be considered a special case of the finite element method. In general, the finite element method gives a better-quality approximation to the exact solution than the finite difference method. Since most PDEs in financial derivatives pricing have simple boundary conditions, the implicit finite difference method is preferred to the finite element method in financial engineering applications due to its low cost of implementation.
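As a concrete illustration of the Monte Carlo valuation mentioned above, the sketch below prices an arithmetic-average Asian call under lognormal (geometric Brownian motion) dynamics. It is a bare-bones estimator, not the Edgeworth binomial method of Chapter 40, and the contract parameters are invented for the example:

```python
import math, random

def asian_call_mc(s0, k, r, sigma, t, n_steps, n_paths, seed=42):
    """Arithmetic-average Asian call priced by Monte Carlo under GBM."""
    random.seed(seed)
    dt = t / n_steps
    drift = (r - 0.5 * sigma**2) * dt
    vol = sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        s, path_sum = s0, 0.0
        for _ in range(n_steps):
            s *= math.exp(drift + vol * random.gauss(0.0, 1.0))
            path_sum += s
        avg = path_sum / n_steps           # arithmetic average of the path
        total += max(avg - k, 0.0)         # call payoff on the average
    return math.exp(-r * t) * total / n_paths

# Illustrative contract: at-the-money, one year, 20% volatility.
print(round(asian_call_mc(100, 100, 0.05, 0.2, 1.0, 50, 20000), 2))
```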
Moreover, in constructing stochastic or probabilistic financial models, as opposed to traditional static and deterministic models, the Monte Carlo method is required to compute the results. Chapters 71 and 91 show how Markov chain Monte Carlo (MCMC) methods are utilized in finance research. Chapter 71 proposes using MCMC methods to estimate the parameters of stochastic volatility models with several factors varying at different time scales. The originality of this approach, in contrast with classical factor models, is the identification of two factors driving univariate series at well-separated time scales. This is tested with simulated data as well as foreign exchange data.

Other methodologies used in quantitative finance and risk management research include entropy methods. Chapters 104 and 105 show how entropy methods can be applied in finance research. Chapter 104 expresses the density of the minimal entropy martingale measure in terms of the value process of the related optimization problem and shows that this value process is determined as the unique solution of a semimartingale backward equation. The chapter also considers some extreme cases in which this equation admits an explicit solution. Chapter 105 derives the density process of the minimal entropy martingale measure in the stochastic volatility model proposed by Barndorff-Nielsen and Shephard. The density is represented by the logarithm of the value function for an investor with exponential utility and no claim issued, and a Feynman-Kac representation of this function is provided. The dynamics of the processes determining the price and volatility are explicitly given under the minimal entropy martingale measure, and the study derives a Black-Scholes equation with an integral term for the price dynamics of derivatives. It turns out that the minimal entropy price of a derivative is given by the solution of a coupled system of two integro-partial differential equations.
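To convey the mechanics behind the MCMC methods cited above, the sketch below runs a random-walk Metropolis sampler for the volatility of i.i.d. Gaussian returns. This toy posterior is far simpler than the multiscale stochastic volatility model of Chapter 71, but the accept/reject logic is the same:

```python
import math, random

random.seed(1)
data = [random.gauss(0.0, 0.02) for _ in range(500)]   # simulated returns, true sigma = 0.02
n, ss = len(data), sum(x * x for x in data)

def log_post(sigma):
    # Log posterior for sigma under i.i.d. N(0, sigma^2) returns, flat prior on sigma > 0.
    if sigma <= 0.0:
        return -math.inf
    return -n * math.log(sigma) - ss / (2.0 * sigma**2)

sigma, chain = 0.05, []
for _ in range(20000):
    proposal = sigma + random.gauss(0.0, 0.005)        # random-walk proposal
    if math.log(random.random()) < log_post(proposal) - log_post(sigma):
        sigma = proposal                               # Metropolis acceptance step
    chain.append(sigma)

burned = chain[5000:]                                  # discard burn-in
print(round(sum(burned) / len(burned), 4))             # posterior mean, close to 0.02
```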
3.6 Conclusion

This chapter discusses the methodology of quantitative finance and risk management research. The list of subjects that contribute to modern finance research includes statistics, econometrics, mathematics, and other disciplines such as operations research models and computer science. This list is definitely not exhaustive, as finance research requires a broad range of methodological tools from various disciplines. Detailed applications of the aforementioned subjects are discussed in later parts of this handbook.
References
Asmussen, S., F. Avram, and M. R. Pistorius. 2004. "Russian and American put options under exponential phase-type Levy models." Stochastic Processes and their Applications 109, 79–111.
Bates, J. and C. Granger. 1969. "The combination of forecasts." Operational Research Quarterly 20, 451–468.
Black, F. and R. Litterman. 1991. "Asset allocation: combining investor views with market equilibrium." Journal of Fixed Income 1(2), 7–18.
Black, F. and R. Litterman. 1992. "Global portfolio optimization." Financial Analysts Journal (September/October), 28–43.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–659.
Chalasani, P., S. Jha, F. Egriboyun, and A. Varikooty. 1999. "A refined binomial lattice for pricing American Asian options." Review of Derivatives Research 3(1), 85–105.
Chen, Y. T., C. F. Lee, and Y. C. Sheu. 2007. "An ODE approach for the expected discounted penalty at ruin in a jump-diffusion model." Finance and Stochastics 11, 323–355.
Chung, S. L. and P. T. Shih. 2007. "Generalized Cox–Ross–Rubinstein binomial models." Management Science 53, 508–520.
Cremers, K. 2002. "Stock return predictability: a Bayesian model selection perspective." Review of Financial Studies 15(4), 1223–1249.
Deaton, A. and G. Laroque. 1992. "On the behavior of commodity prices." Review of Economic Studies 59, 1–23.
Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2007. Modern portfolio theory and investment analysis, Wiley, New Jersey.
Elton, E. J., M. J. Gruber, and M. W. Padberg. 1976. "Simple criteria for optimal portfolio selection." Journal of Finance 31, 1341–1357.
Gerber, H. U. and B. Landry. 1998. "On the discounted penalty at ruin in a jump-diffusion and the perpetual put option." Insurance: Mathematics and Economics 22, 263–276.
Geske, R. 1979. "A note on an analytical valuation formula for unprotected American call options on stocks with known dividends." Journal of Financial Economics 7, 375–380.
Gould, J. P. 1968. "Adjustment costs in the theory of investment of the firm." The Review of Economic Studies 35(1), 47–55.
Granger, C. W. J. and P. Newbold. 1974. "Spurious regressions in econometrics." Journal of Econometrics 2, 111–120.
Heston, S. 1993. "A closed-form solution for options with stochastic volatility with application to bond and currency options." The Review of Financial Studies 6(2), 327–343.
Hilberink, B. and L. C. G. Rogers. 2002. "Optimal capital structure and endogenous default." Finance and Stochastics 6, 237–263.
Hoeting, J., D. Madigan, A. Raftery, and C. Volinsky. 1999. "Bayesian model averaging: a tutorial." Statistical Science 14(4), 382–417.
Hull, J. and A. White. 1993. "Efficient procedures for valuing European and American path-dependent options." The Journal of Derivatives 1, 21–31.
Kou, S. G. and H. Wang. 2004. "Option pricing under a double exponential jump diffusion model." Management Science 50, 1178–1192.
Lamoureux, C. G. and W. D. Lastrapes. 1990. "Heteroskedasticity in stock returns data: volume versus GARCH effects." Journal of Finance 45(1), 221–229.
Mordecki, E. 2002. "Optimal stopping and perpetual options for Levy processes." Finance and Stochastics 6, 473–493.
Pastor, L. 2000. "Portfolio selection and asset pricing models." Journal of Finance 55(1), 179–223.
Roll, R. 1977. "An analytical valuation formula for unprotected American call options on stocks with known dividends." Journal of Financial Economics 5, 251–258.
Scott, L. 1987. "Option pricing when the variance changes randomly: theory, estimation, and an application." Journal of Financial and Quantitative Analysis 22(4), 419–438.
Shafer, G. and V. Vovk. 2001. Probability and finance: it's only a game, Wiley, New York.
Theil, H. 1966. Applied economic forecasting, North-Holland, Amsterdam.
Wei, K. C. J. 1988. "An asset-pricing theory unifying the CAPM and APT." Journal of Finance 43(4), 881–892.
Whaley, R. 1981. "On the valuation of American call options on stocks with known dividends." Journal of Financial Economics 9, 207–211.
Yee, K. 2007. "Using accounting information for consumption planning and equity valuation." Review of Accounting Studies 12(2–3), 227–256.
Yule, G. U. 1926. "Why do we sometimes get nonsense correlations between time series? A study in sampling and the nature of time series." Journal of the Royal Statistical Society 89, 1–64.
Chapter 4
Foundation of Portfolio Theory Cheng-Few Lee, Alice C. Lee, and John Lee
Abstract In this chapter, we first define the basic concepts of risk and risk measurement. Using the relationship of risk and return, we introduce the efficient-portfolio concept and its implementation. Then the concepts of the dominance principle and performance measures are discussed and illustrated. Finally, the interest rate and the market rate of return are used as measurements to show how the commercial lending rate and the market risk premium can be calculated.

Keywords: Risk · Systematic risk · Unsystematic risk · Return · Efficient frontier · Efficient portfolio · Performance measure · Risk premium · Sharpe measure · Jensen measure · Treynor measure
4.1 Introduction

This chapter focuses on the various types of risk. An understanding of the sources of risk and their determination is very useful for doing meaningful security analysis and portfolio management; thus, discussing risk prior to an exploration of portfolio-selection models is essential. In this chapter methods of measuring risk are defined by using basic statistical methods, and the concepts and applications of the dominance principle and portfolio theory are discussed in some detail. A discussion of the different types of risk and their classification is followed by a review of the concepts and applications of portfolio analysis. The dominance principle, as well as the necessity of using performance measures as a means to compare performance among different portfolios, is treated. The determination of the commercial lending rate in accordance with risk and return concepts is then analyzed, and the final section of this chapter concerns calculations of the market rate of return and the market risk premium.
4.2 Risk Classification and Measurement

Risk is defined as the probability of success or failure. To be able to measure risk, it is necessary to define the range of outcomes and the probability that these outcomes will occur. A probability distribution may be determined either subjectively or objectively: a subjective determination is made by the individual about the likelihood of various future outcomes, while an objective determination involves measuring past data that are associated with the frequency of certain types of outcomes. In either case, a subjective or objective determination requires a quantitative measure of risk. The measure most commonly used is the standard deviation or the variance of the possible returns.

In considering two securities, A and B, it is possible to use a subjective approach, an objective approach, or a combination of the two to indicate the expected range of outcomes and the probability that each outcome will occur. Figure 4.1 illustrates the distribution of return possibilities for these two securities. While the standard deviation is stated in rates of return, the variance is stated in terms of the rate of return squared. As it is more natural to discuss rates of return rather than rates of return squared, risk is usually measured with the standard deviation of returns. Nevertheless, for statistical purposes it is usually more convenient to use the variance rather than the standard deviation. Either risk measure is appropriate, because the standard deviation is merely a simple mathematical transformation of the variance. It is the job of the security analyst to provide estimates of a financial asset's risk; supplying estimates of a security's variance or standard deviation is one method of estimating the total risk of a security.
Fig. 4.1 Probability distributions between securities A and B
Why do securities have differing levels of total risk? What factors cause the returns of securities to vary? Table 4.1 defines various types of risk, and in so doing offers answers to these questions; the following paragraphs provide further illustration of the risks defined there. At any point a security or a portfolio could be subject to a number of these risks. In general, the various risks are neither mutually exclusive nor additive. By mutually exclusive is meant that the fact that a portfolio contains a certain kind of risk does not mean that it also contains other types of risk; by additive is meant that if a portfolio contains a number of different types of risk, the total risk of the portfolio is simply the sum of the individual risks. Additionally, some of the risks defined in Table 4.1 are easily quantified or measured (for example, using beta to measure systematic risk), while other types, such as management risk or political risk, may not have well-defined quantitative risk measures. Finally, it should be noted that the types of risk listed in Table 4.1 are not mutually exclusive.
Table 4.1 Types of risk

Call risk: The variability of return caused by the repurchase of the security before its stated maturity.
Convertible risk: The variability of return caused when one type of security is converted into another type of security.
Default risk: The probability of a return of zero when the issuer of the security is unable to make interest and principal payments, or, for equities, the probability that the market price of the stock will go to zero when the firm goes bankrupt.
Interest-rate risk: The variability of return caused by the movement of interest rates.
Management risk: The variability of return caused by bad management decisions; this is usually a part of the unsystematic risk of a stock, although it can affect the amount of systematic risk.
Marketability risk: The variability of return caused by the commissions and price concessions associated with selling an illiquid asset.
Political risk: The variability of return caused by changes in laws, taxes, or other government actions.
Purchasing-power risk: The variability of return caused by inflation, which erodes the real value of the return.
Systematic risk: The variability of a single security's return caused by the general rise or fall of the entire market.
Unsystematic risk: The variability of return caused by factors unique to the individual security.
4.2.1 Call Risk
An investor who purchases a security, whether it is a debt instrument or equity, usually has a pretty good idea of his or her investment horizon, that is, how long he or she intends to hold the security. If the issuer calls the bond or repurchases the stock at a point in time prior to the end of the investor's investment horizon, the return earned by the investor may be less than expected. The investor is facing call risk (see Table 4.1 for a detailed definition). Generally, all corporate bonds and preferred stocks are callable at some point during the life of the security. Investors do not like this call feature, because bonds are most often called after interest rates have fallen, so that the issuers can replace them with a new issue of bonds at a lower interest rate. When a bond is called, instead of receiving the expected interest flows for the remainder of his or her investment horizon, the investor must reinvest the proceeds received from the issuer at a lower rate of interest. It is this reinvestment at a new, lower rate of interest that causes the yield or return of the bond to be different from what the investor expected. For this reason, in periods of high interest rates, when interest rates are expected to drop to lower levels in the future, bonds that are call protected for a certain number of years usually sell at a higher price than callable bonds. Thus, a deferred call feature can cause an interest spread or differential to develop vis-à-vis a similar issue with no call feature or an issue that is immediately callable. The size of the interest differential varies with the investor's expectations about the future movement of interest rates. Whether a bond is callable immediately after it is issued or after a deferred period, the actual call of the bond involves the payment of a premium over the par value of the bond by the issuer.

4.2.2 Convertible Risk
If a bond or a preferred stock is convertible into a stated number of shares of common stock of the corporation issuing the original security, the rate of return of the investment may vary because the value of the underlying common stock has increased or decreased. This kind of investment faces convertible risk, as described in Table 4.1. A convertible security normally has a lower coupon rate, or stated dividend (in the case of preferred stocks), because investors are willing to accept a lower contractual return from the company in order to be able to share in any rise in the price of the firm's common stock. The size of the yield spread between straight and convertible securities depends on the future prospects of the individual firm and the general level of interest rates. The difference between the value paid for the conversion right and the actual returns experienced over the life of the security increases the variability of returns associated with the investment.
4.2.3 Default Risk

The quality of the issue or the chances of bankruptcy for the issuer have a significant impact on the rate of return of a security. A lower-quality issue will sell at a lower price, and thus offer a higher yield, than a similar security issued by a higher-quality issuer. This is true in all segments of the securities markets. Hence, most investors hold diversified portfolios in order to reduce their exposure to any one security going into default or the issuer going bankrupt. The risk relating to default or bankruptcy is called default risk, as described in Table 4.1.
4.2.4 Interest-Rate Risk

As the general level of interest rates changes, the value of an individual security also changes. As interest rates rise, the value of existing securities falls, and vice versa. Additionally, longer-term securities are more affected by a given change in the general level of interest rates than are shorter-term securities. The fluctuation of security returns caused by the movement of interest rates is called interest-rate risk, as described in Table 4.1. For example, suppose Bond A has a 7% coupon and a maturity of 10 years, while Bond B has the same 7% coupon but a maturity of only 5 years. If interest rates go from 7 to 8%, the price of A will fall $6.75 for each $100 of par value, whereas the price of B will fall only $4.00 for each $100 of par value. Another factor that affects the amount of price change for a given interest-rate change is the size of the coupon. The lower the coupon rate of an issue, the more widely its price will change for a given change in interest rates. For example, Bond C is a 20-year bond with a 4% coupon and a market value of $68 when the general level of interest rates is 7%. Bond D is a 20-year bond with a 7% coupon, so its market value is par, $100. If interest rates go from 7 to 8%, the value of Bond C will fall to $60.38, a decrease in value of 11.2% based on its market price [(60.38 − 68)/68], whereas Bond D's value will fall by 9.8% [(90.2 − 100)/100]. In general, a shorter-maturity, higher-coupon security is subject to less interest-rate risk than a long-term, low-coupon security.
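The quoted price changes follow directly from discounting each bond's cash flows at the new yield. The Python sketch below reproduces them to within rounding; note that the 10-year maturity of Bond A, the annual coupon convention for Bonds A and B, and the semiannual convention for Bonds C and D are inferred from the quoted figures rather than stated explicitly in the text:

```python
def bond_price(coupon_rate, years, yield_rate, freq=1, face=100.0):
    """Present value of a level-coupon bond per `face` of par."""
    n = years * freq
    c = face * coupon_rate / freq
    y = yield_rate / freq
    annuity = (1 - (1 + y) ** -n) / y
    return c * annuity + face * (1 + y) ** -n

# Bonds A (10-year) and B (5-year), 7% annual coupons: drop when yields rise 7% -> 8%.
for years in (10, 5):
    drop = bond_price(0.07, years, 0.07) - bond_price(0.07, years, 0.08)
    print(years, round(drop, 2))   # ~6.71 and ~3.99 per $100 of par

# Bonds C (4% coupon) and D (7% coupon), both 20-year, semiannual coupons.
for cpn in (0.04, 0.07):
    p7 = bond_price(cpn, 20, 0.07, freq=2)
    p8 = bond_price(cpn, 20, 0.08, freq=2)
    print(round(p7, 2), round(p8, 2), round((p8 - p7) / p7, 3))
    # ~68 -> ~60.4 (about -11%) and 100 -> ~90.1 (about -10%)
```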
4.2.5 Management Risk

Management risk, as indicated in Table 4.1, is caused by errors of a firm's managers when they make business and financing decisions. Business risk refers to the degree of fluctuation of net income associated with different types of business operations; this kind of risk is related to different types of business and operating strategies. Financial risk refers to the variability of returns associated with leverage decisions: how much of the firm should be financed with equity and how much with debt? It should be noted that both business risk and financial risk are not necessarily constant over time; both can be affected by fluctuations of the business cycle or by changes in government policy. Jensen and Meckling have presented a theory, called agency theory, that deals with the problem of management errors and their impact on security owners. An agent-principal relationship exists when decision-making authority is delegated. The modern corporation is a good example of this phenomenon. The owners or shareholders of the
firm delegate the decision-making authority to the firm’s managers, who are in reality employees or agents of the owners. Other things being equal, the managers make decisions that satisfy the needs of the managers rather than the desires of the owners unless the firm’s owners can protect themselves from management decisions. Some have argued that it is the function of financial arrangements using bond covenants, options, and so on, to narrow the divergence of goals between a firm’s owners and its managers. In any case, it is the possibility of management error or divergent goals that is one of the causes of variability of return for securities.
4.2.6 Marketability (Liquidity) Risk

The marketability risk (or liquidity risk) of a security (see the description in Table 4.1) affects the rate of return received by its owner. Marketability is made up of two components: (1) the volume of securities that can be bought or sold in a short period of time without adversely affecting the price, and (2) the amount of time necessary to complete the sale of a given number of securities. Other things being equal, the less marketable a security, the lower its price or the higher its yield. A highly liquid security, for example IBM shares traded on the New York Stock Exchange (NYSE), can be purchased or sold in large quantities in a very short time; millions of shares of IBM are traded every day on the NYSE. The stock of a small firm traded over the counter (OTC) may trade only a few hundred shares a week and is said to be illiquid. One good measure of the marketability of a security is the spread between the bid and ask prices. The bid price is the current price at which the security can be sold and the ask price is the current price at which the security can be bought; the bid price is always lower than the ask price. The bid-ask spread is the cost of selling the asset quickly: it is the amount of margin required by the market maker to stand ready to buy or sell reasonable amounts of the security quickly. The more illiquid the security, the wider the bid-ask spread.
4.2.7 Political Risk International political risk (as described in Table 4.1) stems from the political climate and conditions of a foreign country. For instance, if a government is unstable, as is the case in some South American countries, there may be wild fluctuations in interest rates due to a lack of confidence by the population and the business community. There may even be some chance of sabotage of plant and equipment or other acts
of terrorism. Contracts may not be upheld or enforceable. These are all factors that may cause the rate of return on certain securities to vary. Domestic political risk takes the form of laws, taxes, and government regulations. As the government changes the tax law, the returns of securities can be greatly affected. For example, the Tax Reform Act of 1986 probably had a very important role in the bull market of January to August 1987, when the Dow-Jones Industrial Average went from less than 1,900 to above 2,700.
4.2.8 Purchasing-Power Risk Purchasing-power risk, as described in Table 4.1, is related to the possible shrinkage in the real value of a security even though its nominal value is increasing. For example, if the nominal value of a security goes from $100 to $200, the owner of this security is pleased because the investment has doubled in value. But suppose that, concurrent with the value increase of 100%, the rate of inflation is 200% – that is, a basket of goods costing $100 when the security was purchased now costs $300. The investor has a “money illusion” of being better off in nominal terms. The investment did increase from $100 to $200; nevertheless, in real terms, whereas the $100 at time zero could purchase a complete basket of goods, after the inflation only two-thirds of a basket can now be purchased. Hence, the investor has suffered a loss of value.
4.2.9 Systematic and Unsystematic Risk

In Table 4.1, total risk was defined as the sum of systematic and unsystematic risk. Total risk is also equal to the sum of all the risk components just discussed; however, the importance of, and contribution to, total risk depends on the type of security under consideration. The total risk of bonds contains a much larger fraction of interest-rate risk than the total risk of a stock. Each of the types of risk discussed may contain a systematic component that cannot be diversified away and an unsystematic component that can be reduced or eliminated if the securities are held in a portfolio. It is assumed throughout this text that rational investors are averse to risk, whatever its source; hence a knowledge of various risk-management techniques is essential. Toward that end this chapter now focuses on the management of risk through diversification.

4.3 Portfolio Analysis and Application

Essential to adequate diversification is a knowledge of the primary concepts of portfolio analysis and their application. This section focuses on basic analysis and applications; more advanced portfolio theory and methods will be discussed in the next chapters. A portfolio can be defined as any combination of assets or investments. Portfolio analysis is used to determine the return and risk for these combinations of assets. Portfolio concepts and methods are employed here to formally develop the idea of the dominance principle and some other measures of portfolio performance.

4.3.1 Expected Return on a Portfolio

The rate of return on a portfolio is simply the weighted average of the returns of the individual securities in the portfolio. For example, suppose 40% of the portfolio is invested in a security with a 10% expected return (security A), 30% is invested in security B with a 5% expected return, and 30% is invested in security C with a 12% expected return. The expected rate of return on this portfolio can be computed as:

$$\bar{R}_p = W_a \bar{R}_a + W_b \bar{R}_b + W_c \bar{R}_c = (0.4)(0.10) + (0.3)(0.05) + (0.3)(0.12) = 0.091$$
in which $W_a$, $W_b$, and $W_c$ are the percentages of the portfolio invested in securities A, B, and C, respectively. The summation of these weights is equal to one. $\bar{R}_p$ represents the expected rate of return for the portfolio and is the weighted average of the securities' expected rates of return. In general, the expected return on an n-asset portfolio is defined by Equation (4.1):

$$\bar{R}_p = \sum_{i=1}^{n} W_i \bar{R}_i \tag{4.1}$$

where:

$\sum_{i=1}^{n} W_i = 1$;
$W_i$ = the proportion of the individual's investment allocated to security i; and
$\bar{R}_i$ = the expected rate of return for security i.
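As a quick numerical check of Equation (4.1), the three-security example above can be reproduced in a few lines of Python:

```python
# Expected return of a portfolio as the weighted average of
# the individual expected returns (Equation (4.1)).
weights = [0.4, 0.3, 0.3]          # W_a, W_b, W_c
exp_returns = [0.10, 0.05, 0.12]   # expected returns of securities A, B, C

r_p = sum(w * r for w, r in zip(weights, exp_returns))
print(round(r_p, 3))               # 0.091, matching the text
```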
4.3.2 Variance and Standard Deviation of a Portfolio

The riskiness of a portfolio is measured by the standard deviation (or variance) of the portfolio. To calculate the standard deviation of a portfolio, we should first identify the covariance among the securities within the portfolio. The covariance between two securities used to formulate a portfolio can be defined as:

$$\operatorname{Cov}(W_1 R_1, W_2 R_2) = \sum_{t=1}^{N} \frac{(W_1 R_{1t} - W_1 \bar{R}_1)(W_2 R_{2t} - W_2 \bar{R}_2)}{N-1} = W_1 W_2 \sum_{t=1}^{N} \frac{(R_{1t} - \bar{R}_1)(R_{2t} - \bar{R}_2)}{N-1} = W_1 W_2 \operatorname{Cov}(R_1, R_2) \tag{4.2}$$

where:

$R_{1t}$ = the rate of return for the first security in period t;
$R_{2t}$ = the rate of return for the second security in period t;
$\bar{R}_1$ and $\bar{R}_2$ = the average rates of return for the first security and the second security, respectively; and
$\operatorname{Cov}(R_1, R_2)$ = the covariance between $R_1$ and $R_2$.

The covariance as indicated in Equation (4.2) can be used to measure the covariability between two securities (or assets) when they are used to formulate a portfolio. With this measure, the variance for a portfolio with two securities can be derived:

$$\operatorname{Var}(W_1 R_{1t} + W_2 R_{2t}) = \sum_{t=1}^{N} \frac{\left[(W_1 R_{1t} + W_2 R_{2t}) - (W_1 \bar{R}_1 + W_2 \bar{R}_2)\right]^2}{N-1}$$

Example 4.1 provides further illustration.

Example 4.1. If you have two securities, you can use Equation (4.2) to calculate the portfolio variance as follows:

Security 1: $W_1 = 40\%$; returns $R_{1t} = 10\%, 15\%, 20\%$ for $t = 1, 2, 3$; $\bar{R}_1 = 15\%$.
Security 2: $W_2 = 60\%$; returns $R_{2t} = 5\%, 10\%, 15\%$ for $t = 1, 2, 3$; $\bar{R}_2 = 10\%$.

$$\begin{aligned} \operatorname{Var}(W_1 R_{1t} + W_2 R_{2t}) &= \big[(0.4)(0.1) + (0.6)(0.05) - \{(0.4)(0.15) + (0.6)(0.1)\}\big]^2 / 2 \\ &\quad + \big[(0.4)(0.15) + (0.6)(0.1) - \{(0.4)(0.15) + (0.6)(0.1)\}\big]^2 / 2 \\ &\quad + \big[(0.4)(0.2) + (0.6)(0.15) - \{(0.4)(0.15) + (0.6)(0.1)\}\big]^2 / 2 \\ &= \frac{(0.07-0.12)^2}{2} + \frac{(0.12-0.12)^2}{2} + \frac{(0.17-0.12)^2}{2} \\ &= 0.0025 \end{aligned}$$

The riskiness of a portfolio can be measured by the standard deviation of returns, as indicated in Equation (4.3):

$$\sigma_p = \sqrt{\sum_{t=1}^{N} \frac{(R_{pt} - \bar{R}_p)^2}{N-1}} \tag{4.3}$$

where $\sigma_p$ is the standard deviation of the portfolio's return and $\bar{R}_p$ is the expected value of the N possible returns. Figure 4.2 illustrates possible distributions for two portfolios, assuming the same expected return but different levels of risk. Since portfolio B's variability is greater than that of portfolio A, investors regard portfolio B as riskier than portfolio A.

Fig. 4.2 Probability distributions of returns for two portfolios
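The arithmetic of Example 4.1 is easy to verify numerically. The sketch below computes the portfolio return period by period and applies the sample-variance formula with N − 1 in the denominator:

```python
# Verify Example 4.1: two securities, three periods.
r1 = [0.10, 0.15, 0.20]   # returns of security 1
r2 = [0.05, 0.10, 0.15]   # returns of security 2
w1, w2 = 0.4, 0.6

rp = [w1 * a + w2 * b for a, b in zip(r1, r2)]      # portfolio returns per period
mean_rp = sum(rp) / len(rp)
var_p = sum((r - mean_rp) ** 2 for r in rp) / (len(rp) - 1)
print(round(var_p, 4))                              # 0.0025, as in the text
```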
4.3.3 The Two-Asset Case

To explain the fundamental aspect of the risk-diversification process in a portfolio, consider the two-asset case. Following Equation (4.3):

$$\begin{aligned} \sigma_p &= \sqrt{\sum_{t=1}^{N} \frac{(R_{pt} - \bar{R}_p)^2}{N-1}} \\ &= \sqrt{\sum_{t=1}^{N} \frac{W_1^2 (R_{1t} - \bar{R}_1)^2 + W_2^2 (R_{2t} - \bar{R}_2)^2 + 2 W_1 W_2 (R_{1t} - \bar{R}_1)(R_{2t} - \bar{R}_2)}{N-1}} \\ &= \sqrt{W_1^2 \operatorname{Var}(R_{1t}) + W_2^2 \operatorname{Var}(R_{2t}) + 2 W_1 W_2 \operatorname{Cov}(R_1, R_2)} \end{aligned} \tag{4.4}$$

where $W_1 + W_2 = 1$. By the definition of the correlation coefficient between $R_1$ and $R_2$ ($\rho_{12}$), $\operatorname{Cov}(R_1, R_2)$ can be rewritten:

$$\operatorname{Cov}(R_1, R_2) = \rho_{12} \sigma_1 \sigma_2 \tag{4.5}$$

where $\sigma_1$ and $\sigma_2$ are the standard deviations of the first and second security, respectively. From Equations (4.4) and (4.5), the standard deviation of a two-security portfolio can be defined:

$$\sigma_p = \sqrt{\operatorname{Var}(W_1 R_{1t} + W_2 R_{2t})} = \sqrt{W_1^2 \sigma_1^2 + (1-W_1)^2 \sigma_2^2 + 2 W_1 (1-W_1) \rho_{12} \sigma_1 \sigma_2} \tag{4.6}$$

Example 4.2 provides further illustration.

Example 4.2. For securities 1 and 2 used in the previous example, applying Equation (4.6) should give the same result for the portfolio variance:

Security 1: $\sigma_1^2 = 0.0025$; $W_1 = 0.4$; $\rho_{12} = +1$.
Security 2: $\sigma_2^2 = 0.0025$; $W_2 = 0.6$.

$$\sigma_p = \sqrt{(0.4)^2(0.0025) + (0.6)^2(0.0025) + 2(0.4)(0.6)(1)(0.05)(0.05)} = \sqrt{0.0025}$$

Thus $\sigma_p = 0.05$, or $\operatorname{Var}_{\text{portfolio}} = 0.0025$, the same answer as for Example 4.1.

If $\rho_{12} = 1.0$, Equation (4.6) can be simplified to the linear expression:

$$\sigma_p = W_1 \sigma_1 + W_2 \sigma_2$$

where $W_2 = (1 - W_1)$. Since Equation (4.6) is a quadratic equation, some value of $W_1$ minimizes $\sigma_p$. To obtain this value, differentiate Equation (4.6) with respect to $W_1$ and set this derivative equal to zero:

$$\frac{\partial \sigma_p}{\partial W_1} = \frac{2 W_1 \sigma_1^2 - 2(1-W_1)\sigma_2^2 + 2(1-2W_1)\rho_{12}\sigma_1\sigma_2}{2\left[W_1^2 \sigma_1^2 + (1-W_1)^2 \sigma_2^2 + 2 W_1 (1-W_1)\rho_{12}\sigma_1\sigma_2\right]^{1/2}} = 0$$

Solving for $W_1$:

$$W_1 = \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2} \tag{4.7}$$

A condition needed for Equation (4.7) is that $W_1 \le 1$; that is, no more than 100% of the portfolio can be in any single security, and no negative positions (short positions) can be maintained in any security.

If $\rho_{12} = 1$, Equation (4.7) reduces to:

$$W_1 = \frac{\sigma_2(\sigma_2 - \sigma_1)}{(\sigma_2 - \sigma_1)(\sigma_2 - \sigma_1)} = \frac{\sigma_2}{\sigma_2 - \sigma_1} \tag{4.7a}$$

Returning to the example for securities 1 and 2, since the standard deviations are equal and the securities are perfectly positively correlated, Equation (4.7a) indicates that there is no value of $W_1$ that will minimize the portfolio variance: the weight for $W_1$ that gives the minimum-variance portfolio is 0.05/(0.05 − 0.05), or 0.05/0, which is undefined.

If $\rho_{12} = -1$, Equation (4.7) reduces to:

$$W_1 = \frac{\sigma_2(\sigma_2 + \sigma_1)}{(\sigma_1 + \sigma_2)(\sigma_1 + \sigma_2)} = \frac{\sigma_2}{\sigma_2 + \sigma_1} \tag{4.7b}$$

However, if the correlation coefficient between securities 1 and 2 is −1, then the minimum-variance portfolio must be divided equally between security 1 and security 2; that is:

$$W_1 = \frac{0.05}{0.05 + 0.05} = 0.5$$

As an expanded form of Equation (4.6), the standard deviation of an n-security portfolio can be written:

$$\sigma_p = \left[\sum_{i=1}^{n} W_i^2 \sigma_i^2 + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} W_i W_j \rho_{ij} \sigma_i \sigma_j\right]^{1/2} = \left[\sum_{j=1}^{n} \sum_{i=1}^{n} W_i W_j \operatorname{Cov}(R_{it}, R_{jt})\right]^{1/2} \tag{4.8}$$

where:

$W_i$ and $W_j$ = the investor's investment allocated to security i and security j, respectively;
$\rho_{ij}$ = the correlation coefficient between security i and security j; and
n = the number of securities included in the portfolio.
Since Equation (4.8) applies to n securities, there are n variance terms (that is, $W_i^2 \sigma_i^2$) and $(n^2 - n)$ covariance terms (that is, $W_i W_j \rho_{ij} \sigma_i \sigma_j$). If n = 200, Equation (4.8) will have 200 variance terms and 39,800 covariance terms, and any practical application will require a great amount of information as well as many computations. This issue will be discussed in more detail in Chap. 5; in the meanwhile, Example 4.3 provides further illustration.

Example 4.3. Consider two stocks, A and B, with $\bar{R}_A = 10$, $\bar{R}_B = 15$, $\sigma_A = 4$, and $\sigma_B = 6$. (1) If a riskless portfolio could be formed from A and B, what would be the expected return $\bar{R}_p$? (2) What would the expected return be if $\rho_{AB} = 0$?

Solution.

1. $\sigma_p = \left[W_A^2 \sigma_A^2 + (1-W_A)^2 \sigma_B^2 + 2 W_A (1-W_A) \rho_{AB} \sigma_A \sigma_B\right]^{1/2}$

If we let $\rho_{AB} = -1$:

$$\sigma_p = W_A \sigma_A - (1-W_A)\sigma_B$$

Setting $\sigma_p = 0 = 4 W_A - 6(1-W_A)$ gives $W_A = 3/5$, so

$$\bar{R}_p = W_A \bar{R}_A + (1-W_A)\bar{R}_B = \frac{3}{5}(10) + \frac{2}{5}(15) = 12\%$$

2. If we let $\rho_{AB} = 0$:

$$\sigma_p = \left[W_A^2 \sigma_A^2 + (1-W_A)^2 \sigma_B^2\right]^{1/2}$$

$$\frac{\partial \sigma_p}{\partial W_A} = \frac{1}{2}\left[2 W_A \sigma_A^2 + 2(1-W_A)\sigma_B^2(-1)\right]\left[W_A^2 \sigma_A^2 + (1-W_A)^2 \sigma_B^2\right]^{-1/2} = 0$$

so $W_A \sigma_A^2 - (1-W_A)\sigma_B^2 = 0$ and

$$W_A = \frac{\sigma_B^2}{\sigma_A^2 + \sigma_B^2} = \frac{36}{4^2 + 6^2} = \frac{9}{13}$$

$$\bar{R}_P = \frac{9}{13}(10) + \frac{4}{13}(15) = 11.54\%$$
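Example 4.3 can be checked with the two-asset formulas. The sketch below implements Equation (4.6) and the minimum-variance weight of Equation (4.7) and reproduces both parts of the example:

```python
def port_sigma(w_a, s_a, s_b, rho):
    """Two-asset portfolio standard deviation, Equation (4.6)."""
    var = (w_a**2 * s_a**2 + (1 - w_a)**2 * s_b**2
           + 2 * w_a * (1 - w_a) * rho * s_a * s_b)
    return var ** 0.5

def min_var_weight(s_a, s_b, rho):
    """Minimum-variance weight on asset A, Equation (4.7)."""
    return (s_b**2 - rho * s_a * s_b) / (s_a**2 + s_b**2 - 2 * rho * s_a * s_b)

ra, rb, sa, sb = 10.0, 15.0, 4.0, 6.0

# Part 1: with rho = -1, a riskless portfolio exists.
wa = min_var_weight(sa, sb, -1.0)           # 0.6 = 3/5
print(wa, port_sigma(wa, sa, sb, -1.0))     # sigma = 0 at W_A = 3/5
print(wa * ra + (1 - wa) * rb)              # expected return 12.0%

# Part 2: with rho = 0.
wa = min_var_weight(sa, sb, 0.0)            # 9/13
print(round(wa * ra + (1 - wa) * rb, 2))    # expected return 11.54%
```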
4.4 The Efficient Portfolio and Risk Diversification

Utilizing the definitions of the standard deviation and expected return of a portfolio discussed previously, this section discusses the concepts of the efficient portfolio and risk diversification.

4.4.1 The Efficient Portfolio

By definition, a portfolio is efficient if no other portfolio has the same expected return at a lower variance of returns; equally, a portfolio is efficient if no other portfolio has a higher expected return at the same level of risk. This suggests that, given two investments, A and B, investment A will be preferred to B if:

$$E(A) > E(B) \text{ and } \operatorname{Var}(A) = \operatorname{Var}(B)$$

or

$$E(A) = E(B) \text{ and } \operatorname{Var}(A) < \operatorname{Var}(B)$$

where $E(A)$ and $E(B)$ are the expected returns of A and B, and $\operatorname{Var}(A)$ and $\operatorname{Var}(B)$ are their respective variances or risk.

Fig. 4.3 The efficient frontier in portfolio analysis

The mean return and variance of every investment opportunity can be calculated and plotted as a single point on a mean-standard deviation diagram, as shown in Fig. 4.3. All points below curve EF represent portfolio combinations that are possible. Point D represents a portfolio of investments with return $R_D$ and risk $\sigma_D$. All points above EF are combinations of risk and return that do not exist; point B would therefore represent risk and return that cannot be obtained with any combination of investments. The EF curve is called the efficient frontier because every point below the curve is dominated by a point on the curve. For instance, suppose a firm is willing to assume a maximum level of risk $\sigma_D$. It can obtain a return of $R_D$ with portfolio D, or move to point C on the frontier and receive a higher return $R_C$ with that portfolio. Therefore, C dominates D, because for the same level of risk it offers a higher return and would be preferred. A similar argument could be made in terms of risk. If the firm wants to achieve a return of $R_A$, it will select portfolio
A over D, because A represents the same return at a smaller level of risk or standard deviation: $\sigma_A < \sigma_D$. Therefore, point D is not efficient, but points A and C are. A decision maker could therefore select any point on the frontier and be secure in knowing that a better portfolio is not available. Example 4.4 further illustrates this concept.

Example 4.4. To show how the portfolio concepts and methods discussed in this section can be used in practical analysis, monthly rates of return from January 1980 to December 1984 for Pennzoil and Coca Cola are used as examples. The basic statistical estimates for these two firms are the average monthly rates of return and the variance-covariance matrix. The average monthly rates of return for Pennzoil (PZ) and Coca Cola (CK) are 0.0093 and 0.0306, respectively. The variances and covariances are listed in the following variance-covariance matrix:

                     PZ           CK
Pennzoil (PZ)    0.0107391    −0.000290
Coke (CK)       −0.000290     0.0495952
From Equation (4.7), we have: 0:0495952 C 0:000290 0:0107391 C 0:0495952 C 0:00058 0:0498852 D 0:069143 D 0:8189
W1 D
W2 D 1:0 0:8189 D 0:1811 Using the weight estimates and Equations (4.2) and (4.3): E.RN P / D .0:8189/.0:0093/ C .0:1811/.0:0306/ D 0:01316 P2 D .0:8189/2.0:0107391/ C .0:1811/2.0:0495952/
4.4.2 Corporate Application of Diversification The effect of diversification is not necessarily limited to securities but may have wider applications at the corporate level. Frequently, managers will justify undertaking many product lines because of the effects of diversification. Instead of “putting all of the eggs in one basket,” the investment risks are spread out among many lines of services or products in hope of reducing the overall risks involved and maximizing returns. To what degree this diversification takes place in other types of corporate decisions depends on the decision maker’s preference for risk and return. The overall goal is to reduce business risk fluctuations of net income. However, it should be noted that investors can do their own homemade diversification, which generally reduces the need of corporate diversification. This type of corporate diversification can be taken to the multinational level. For example, General Motors has overseas divisions throughout the world. Although these divisions all produce the same product, autos and auto parts, GM’s status as a multinational corporation allows it to take advantage of the diversifying effects of different exchange rates and political and economic climates.
4.4.3 The Dominance Principle The dominance principle has been developed as a means of conceptually understanding the risk/return tradeoff. As with the efficient-frontier analysis, we must assume an investor prefers returns and dislikes risks. For example, as depicted in Fig. 4.4, if an individual is prepared to experience risk associated with A , he or she can obtain a higher expected return with portfolio A.RN A / than with portfolio B.RN B /. Thus A dominates B and would be preferred. Similarly, if an individual were satisfied with a return of RN B , he or she would select portfolio B over C because the risks associated with B
C2.0:8189/.0:1811/.0:000290/ D 0:0088 P D 0:0940 when 12 is less than 1.00 it indicates that the combination of the two securities will result in a total risk less than their added respective risks. This is the diversification effect. If 12 were equal to 1.00, this would mean that the combination of the two securities has no diversification effect at all. The correlation coefficient of 12 D 0:0125659 indicates that a portfolio combining Pennzoil and Coca Cola would show a diversification effect and a reduction in risk.
Fig. 4.4 The dominance principle in portfolio analysis
62
C.-F. Lee et al.
are less .B < C /. Therefore using the dominance principle reinforces the choice of an efficient portfolio and is also a method in determining it. Figure 4.4 makes clear that points A and B are directly comparable because they have a common standard deviation, A . Points B and C are directly comparable because of a common return, RN B . Now consider portfolio D. How does its risk versus return compare with the other portfolios shown in Fig. 4.4? It is difficult to say because the risk and return are not directly comparable using the dominance principle. This is the basic limitation of the dominance principle – that portfolios without a common risk or return factor are not directly comparable.
4.4.4 Three Performance Measures For such a situation it becomes necessary to use some other performance measure. There are basically three important portfolio performance measures taught in investment courses: (1) the Sharpe, (2) the Treynor, and (3) the Jensen measures. The Sharpe measure (SP) (Sharpe 1966) is of immediate concern. Given two of the portfolios depicted in Fig. 4.4, portfolios Band D, their relative risk-return performance can be compared using the equations: SPD D
RN D Rf RN B Rf and SPB D D B
where: SPD ; SPB RN D ; RN B Rf D ; B
D Sharpe performance measuresI D the average return of each portfolioI D risk-free rateI and D the respective standard deviation on risk of each portfolio:
Because the numerator is the average return reduced by the risk-free rate, it represents the average risk premium of each portfolio. Dividing the risk premium by the total risk per portfolio results in a measure of the return (premium) per unit of risk for each portfolio. The Sharpe performance measure equation will therefore allow a direct comparison of any portfolio, given its risk and returns. Consider Fig. 4.5; portfolio A is being compared to portfolio B. If a riskless rate exists, then all investors would prefer A to B because combinations of A and the riskless asset give higher returns for the same level of risk than combinations of the riskless asset and B. The preferred portfolio lies on the ray passing through Rf that is furthest in the counterclockwise direction (the ray that has the greatest slope). Example 4.5 further illustrates this concept.
Fig. 4.5 Combinations of portfolio and the risk-free investment
Example 4.5. An insurance firm is trying to decide between two investment funds. From past performance it was able to calculate the average returns and standard deviations for these funds. The current T-bill rate is 9.5% and the firm will use this as a measure of the risk-free rate.
Average return R (percent) Standard deviation (percent)
Smyth fund
Jones fund
18 20
16 15
Risk-free rate D Rf (percent) D 9:5 Using the Sharpe performance measure, the risk-return measurements for these two firms are: 0:18 0:095 D 0:425 0:20 0:16 0:095 D 0:433 D 0:15
SPSmyth D SPJones
It is clear that the Jones fund has a slightly better performance and would be the better alternative of the two. The Sharpe measure looks at the risk-return decision from the point of view of an investor choosing a portfolio to represent the majority of his or her investment. An investor choosing a portfolio to represent a large part of his or her wealth would likely be concerned with the full risk of the portfolio, and the standard deviation is a measure of that risk. On the other hand, if the risk level of the portfolio is already determined by the investor and what is important is to evaluate the performance of the portfolio over and above the total market performance, perhaps the proper risk measure would be the relationship between the return on the portfolio and the return on the market, or beta. All combinations of a riskless asset and a risky portfolio lie on a straight line con-
4 Foundation of Portfolio Theory
63
necting them. The slope of the line connecting the risky asset A and the risk-free rate is .RN A Rf /=A . Here, as in the Sharpe measure, an investor would prefer the portfolio on the most counterclockwise ray emanating from the riskless asset. This measure, called the Treynor measure (TP), developed by Treynor in 1965, examines differential return when beta is the risk measure. Example 4.6 provides further illustration.
SPA D 0:84
The Treynor performance measure uses the beta coefficient (systematic risk) instead of total risk for the j th portfolio .j / as a risk measure. Applications of the TP are similar to the SP as discussed previously.1 Jensen (1968, 1969) has proposed a measure referred to as the Jensen differential performance index (Jensen’s measure). The differential return can be viewed as the difference in return earned by the portfolio compared to the return that the capital asset pricing line implies should be earned. Consider the line connecting the riskless rate and the market portfolio. A manager could obtain any point along this line by investing in the market portfolio and mixing this with the riskless asset to obtain the desired risk level. If the constructed portfolio is actively managed, then one measure of performance is the difference in return earned by actively managing the portfolio, versus what would have been earned if the portfolio had been passively constructed of the market portfolio and the riskless asset to achieve the same risk level. The slope of the line connecting the riskless asset and the market portfolio is .RN M Rf /=ˇM , and the intercept must be the riskless rate. The beta on the market portfolio is one; therefore, the CAPM equation results in:
SPB D 0:73
RN P D Rf C .RN M Rf /ˇP :
Example 4.6. Rank the portfolios shown in the table based on the Sharpe measure. Assume Rf D 8%. If Rf D 5%, how does the order change? Portfolio
Return (percent)
Risk (percent)
A B C D E
50 19 12 9 8:5
50 15 9 5 1
Solution Sharpe measure SPM D
RN M Rf
SPC D 0:44 SPD D 0:20 SPE D 0:50 Ranked by the Sharpe measure, A > B > E > C > D. The Sharpe measure indicates that portfolio A is the most desirable: it has the highest return per unit of risk. For Rf D 5 percent SPA D 0:90 SPB D 0:933 SPC D 0:77 SPD D 0:80 SPE D 0:35
Jensen’s measure (JM) is the differential return of the managed portfolio’s actual return less the return on the portfolio of identical beta that lies on the line connecting the riskless asset and the market portfolio. Algebraically, Jensen’s measure is expressed: JM D RN P ŒRf C .RN M Rf /ˇp Figure 4.6 depicts portfolio rankings for the Jensen measure, and Example 4.7 provides further illustration. Example 4.7. Rank the portfolio in the table according to Jensen’s measure:
The order changes to E > B > A > D > C. E is now the best portfolio as it has the highest return per unit of risk. The Treynor measure can be expressed by the following: TP D
RN j Rf ˇj
where: RN j D average return of j th portfolioI Rf D ris free rateI and ˇj D beta coefficient for j th portfolio:
Fig. 4.6 Jensen’s measure for portfolio rankings
1
Discussion of the Treynor measure adapted from Treynor (1965). Adapted by permission.
64
1. 2. 3. 4.
C.-F. Lee et al.
Assuming RM D 10 percent and Rf D 8 percent Assuming RM D 12 percent and Rf D 8 percent Assuming RM D Rf D 8 percent Assuming RM D 12 percent and Rf D 4 percent
Portfolio
Ri (percent)
(percent)
ˇi
A B C D E
50 19 12 9 8:5
50 15 9 5 1
2.5 2.0 1.5 1.0 0.25
Solution
2.
3.
4.
D
ŒRN P Rf ŒRN M Rf P m
D SPP SPm .commom constant/l If the Jensen measure (JM) is divided by ˇP , it is equivalent TM to the Treynor measure plus some constant common to all portfolios: ŒRN P Rf ŒRN M Rf ˇP JM D ˇP ˇP ˇP D TMP ŒRN M Rf
Ri D Rf Cˇi .RM Rf /
1.
ŒRN P Rf ŒRN M Rf .pm / JM D p P P m m
JM D .Ri Rf /ˇi .RM Rf /
JMA D 37 percent JMB D 7 percent JMC D 1 percent JMD D 2 percent JME D 0 percent Ranked by the Jensen measure A > B > C > E > D. JMA D 32 percent JMB D 3 percent JMC D 2 percent JMD D 3 percent JME D 0:5 percent The rank change to A > B > E > C > D. JMA D 42 percent JMB D 11 percent JMC D 4 percent JMD D 1 percent JME D 0:5 percent The rank is A > B > C > D > E. JMA D 26 percent JMB D 1 percent JMC D 4 percent JMD D 3 percent JME D 2:5 percent The rank now is A > E > B > D > C.
Interrelationship among Three Performance Measure. It should be noted that all three performance measures are interrelated. For instance, if pm D pm =p m D 1, then the Jensen measure divided by p becomes equivalent to the Sharpe measure. Since ˇp D pm =m2 and pm D pm =p m the Jensen measure (JM) must be multiplied by 1=p to derive the equivalent Sharpe measure:
D TMP commom constant Example 4.8 provides further illustration.
Example 4.8. Continuing with the example used for the Sharpe performance measure in Example 4.5, assume that in addition to the information already provided, the market return is 10%, the beta of the Smyth Fund is 0.8, and the Jones Fund beta is 1.1. Then, according to the capital asset pricing line, the implied return earned should be: RN Smyth D 0:095 C .0:10 0:095/.0:8/ D 0:099 RN Jones D 0:095 C .0:10 0:095/.1:1/ D 0:1005 Using the Jensen measure, the risk-return measurements for these two firms are: JMSmyth D 0:18 0:099/ D 0:081 JMJones D 0:16 0:1005 D 0:0595 From these calculations, it is clear that the Smyth Fund has a better performance and would be the better alternative of the two. Note that this is the opposite of the results determined from the Sharpe performance measure in Example 4.5. Computing the Treynor measure would reinforce the Jensen results. More analysis of these performance measures will be undertaken in detail in next chapters.
4.5 Determination of Commercial Lending Rate This section concerns a process for estimating the lending rate a financial institution would extend to a firm or the borrowing rate a firm would think is reasonable based on economic, industry, and firm-specific factors.
4 Foundation of Portfolio Theory
65
As shown previously, part of the rate of return is based on the risk-free rate. The risk-free rate Rf must first be forecasted for three types of economic conditions – boom, normal, and poor. The second component of the lending rate is the risk premium .RP /. This can be calculated individually for each firm by examining the change in earnings before interest and taxes (EBIT) under the three types of economic conditions. The EBIT is used by the lender as an indicator of the ability of the potential borrower to repay borrowed funds. Table 4.2 has been constructed based on the methods discussed previously. In total there are nine possible lending rates under the three different economic conditions. The construction of these lending rates is shown in Table 4.3. This table shows that during a boom the risk-free rate is set at 12%, but the risk premium can taken on different values. These is a 40% chance that it will be 3.0%, a 30% chance it will be 5.0%, and a 30% chance it will be 8.0%. The products of the RP probabilities and the Rf probability are the joint probabilities of occurrence for the lending rates computed from these parameters. Therefore, there is a 10% chance that a firm will be faced with a 15% lending rate during a boom, a 7.5% chance of a 17% rate, and a 7.5% chance of an 18%
Table 4.2 Possible lending rates Economic conditions Rf (percent)
Probability
Boom
12.0
0.25
Normal
10.0
0.50
Poor
8.0
0.25
rate. This process applies for the other conditions, normal and poor, as well. Based upon the mean and variance Equations (4.1) and (4.2) it is possible to calculate the expected lending rate and its variance. Using the information provided in Table 4.3, the weighted average can be calculated: R D.0:100/.15%/ C .0:075/.17%/ C .0:075/.20%/ C .0:200/.13%/ C .0:150/.15%/ C .0:150/.18%/ C .0:100/.11%/ C .0:075/.13%/ C .0:075/.16%/ D15:1% With a standard deviation of: h D .0:100/.15 15:1/2 C .0:075/.17 15:1/2 C .0:075/.20 15:1/2 C .0:200/.13 15:1/2 C .0:150/.5 15:1/2
EBIT ($ millions)
Probability
Rp (percent)
2.5 1.5 0.5 2.5 1.5 0.5 2.5 1.5 0.5
0.40 0.30 0.30 0.40 0.30 0.30 0.40 0.30 0.30
3 5 8 3 5 8 3 5 8
Table 4.3 Construction of actual lending rates Economic conditions
(A) Rf (percent)
(B) Probability
(C) Rp (percent)
(D) Probability
.B D/ Joint probability of occurrence
.A C C/ Lending rate (percent)
Boom
12
0.25
Normal
10
0.50
Poor
8
0.25
3.0 5.0 8.0 3.0 5.0 8.0 3.0 5.0 8.0
0.40 0.30 0.30 0.40 0.30 0.30 0.40 0.30 0.30
0.100 0.075 0.075 0.200 0.150 0.150 0.100 0.075 0.075
15 17 20 13 15 18 11 13 16
66
C.-F. Lee et al.
Fig. 4.7 Probability of Xi in the intervals ˙1; ˙2; ˙3
C .0:150/.18 15:1/2 C .0:100/.11 15:1/2 C .0:075/.13 15:1/2 i1=2 C .0:075/.16 15:1/2 D.0:001 C 0:271 C 1:801 C 0:882 C 0:0015 C 1:2615 C 1:681
away. The remaining risk is systematic risk that which influences all risky assets. The market rate of return can be calculated using one of several types of market indicator series, such as the DowJones Industrial Average, the Standard and Poor (S&P) 500, or the New York Stock Exchange Index, using the following equation: It It 1 D Rmt It 1
C 0:331 C 0:061/ D2:51% If this distribution is indeed approximately normal, the mean and standard deviation can be employed to make some statistical inferences. Figure 4.7 makes it clear that 68.3% of the observations of a standard normal distribution are within one standard deviation of the mean, 95.4% are within three. Since the average lending rate is assumed to be normally distributed with a mean of 15.1% and a standard deviation of 2.51%, it is clear that almost all (99.7%) of the lending rates will lie in the range of 7.57–22.63%, because 7.57% is three standard deviations below 15.1 and 22.63% is three standard deviations above the mean. It is also clear that 68.3% of the rates will lie in the range of 12.59–17.61%.
4.6 The Market Rate of Return and Market Risk Premium The market rate of return is the return that can be expected from the market portfolio. This portfolio is of all risky assets – that is, stocks, bonds, real estate, coins, and so on. Because all risky assets are included, the market portfolio is a completely diversified portfolio. All unsystematic risks related to each individual asset would, therefore, be diversified
(4.9)
where: Rmt D market rate of return at time tI It D market index at tI and It 1 D market index at t 1: This equation calculates the percent change in the market index during period t and the previous period t 1. This change is the rate of return an investor would expected to receive in t had he or she invested in t 1. A risk-free investment is one in which the investor is sure about the timing and amount of income streams arising from that investment. However, for most types of investments, investors are uncertain about the timing and amount of income of their investments. The types of risks involved in investments can be quite broad, from the relatively riskless T-bills to highly risky speculative stocks. The reasonable investor dislikes risks and uncertainty and would, therefore, require an additional return on his investment to compensate for this uncertainty. This return, called the risk premium, is added to the nominal risk-free rate. The risk premium is derived from several major sources of uncertainty or risk, as was discussed at the beginning of this chapter. Table 4.4 illustrates this concept. In this table the market rate of return using the S&P 500 was calculated using
4 Foundation of Portfolio Theory
67
Table 4.4 Market returns and T-bill by quarters Year
Month
S&P 500
2005 2006
Dec Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1; 248:29 1; 280:08 1; 280:66 1; 294:87 1; 310:61 1; 270:09 1; 270:20 1; 276:66 1; 303:82 1; 335:85 1; 377:94 1; 400:63 1; 418:30 1; 438:24 1; 406:82 1; 420:86 1; 482:37 1; 530:62 1; 503:35 1; 455:27 1; 473:99 1; 526:75 1; 549:38 1; 481:14 1; 468:36 1; 378:55 1; 330:63 1; 322:70 1; 385:59 1; 400:38 1; 280:00 1; 267:38 1; 282:83 1; 164:74 968:75 896:24 887:88
2007
2008
Equation (4.9) to devise average monthly returns. Monthly T-bill rates are listed in the column B. The T-bill rate was deducted from the market return rate .Rm Rf / to devise the risk premium. In the last 2 months of 2007 and for most months of 2007 the market was in decline, with low returns resulting in each month. This allowed the T-bill investors to obtain a higher than the market return and resulted in negative risk premiums. The second half of 2006 demonstrated an increasing market level and higher market returns. In October 1982 the return was 3.15%, the highest in the past 10 months. This al-
(A) Market return (percent) 2:55 0:05 1:11 1:22 3:09 0:01 0:51 2:13 2:46 3:15 1:65 1:26 1:41 2:18 1:00 4:33 3:25 1:78 3:20 1:29 3:58 1:48 4:40 0:86 6:12 3:48 0:60 4:75 1:07 8:60 0:99 1:22 9:21 16:83 7:48 0:93
(B) T-bill rate (percent)
(A–B) Risk premium (percent)
0.34 0.36 0.37 0.38 0.39 0.39 0.41 0.42 0.41 0.40 0.41 0.41 0.41 0.42 0.42 0.41 0.40 0.39 0.40 0.40 0.36 0.32 0.31 0.25 0.27 0.17 0.14 0.12 0.12 0.15 0.15 0.14 0.14 0.07 0.04 0.01
2:21 0:32 0:74 0:84 3:48 0:38 0:10 1:71 2:05 2:75 1:23 0:85 0:99 2:60 0:58 3:92 2:86 2:17 3:60 0:89 3:22 1:16 4:71 1:11 6:38 3:65 0:74 4:64 0:95 8:75 1:14 1:08 9:35 16:90 7:53 0:94
lowed market rates to leap beyond the riskless T-bill rate, and the result was a positive risk premium. During the period of fluctuation in the level of stock market prices, the first half of 2007 revealed a fluctuated risk premium. Theoretically it is not possible for a risk premium required by investors to be negative. Taking on risk involves some positive cost. Nevertheless, using short-run estimators as in Table 4.4 may result in negative figures because they reflect the fluctuations of the market. The basic problem with using actual market data to assess risk premiums is the difference between expected returns (which are always positive)
68
and realized returns (which may be positive or negative). It becomes evident that investors’ expectations will not always be realized.
4.7 Conclusion This chapter has defined the basic concepts of risk and risk measurement. The efficient-portfolio concept and its implementation was demonstrated using the relationships of risk and return. The dominance principle and performance measures were also discussed and illustrated. Finally, the interest rate and market rate of return were used as measurements to show how the commercial lending rate and the market risk premium can be calculated. Overall, this chapter has introduced uncertainty analysis assuming previous exposure to certainty concepts. Further application of the concepts discussed in this chapter as related to security analysis and portfolio management are explored in later chapters.
References Ben-Horin, M. and H. Levy. 1980. “Total risk, diversifiable risk and non-diversifiable risk: a pedagogic note.” Journal of Financial and Quantitative Analysis 15, 289–295. Bodie, Z., A. Kane, and A. Marcus. 2006. Investments, 7th Edition, McGraw-Hill, New York. Bowman, R. G. 1979. “The theoretical relationship between systematic risk and financial (accounting) variables.” Journal of Finance 34, 617–630. Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2006. Modern portfolio theory and investment analysis, 7th Edition, Wiley, New York. Evans, J. L. and S. H. Archer. 1968. “Diversification and the reduction of dispersion: an empirical analysis.” Journal of Finance 23, 761–767.
C.-F. Lee et al. Francis, J. C. and S. H. Archer. 1979. Portfolio analysis, Prentice-Hall, Englewood Cliffs, NJ. Ibbotson, R. G. and R. A. Sinquefield. 1976. “Stocks, bonds, bills, and inflation: simulations of the future (1976–2000).” Journal of Business 49, 313–338. Jensen, M. C. 1968. “The performance of mutual funds in the period 1945–1964.” Journal of Finance 23, 389–416. Jensen, M. C. 1969. “Risk, the pricing of capital assets, and the evaluation of investment portfolios.” Journal of Business 42, 167–185. Lee, C. F. and S. N. Chen. 1981. “The sampling relationship between Sharpe’s performance measure and its risk proxy: sample size, investment horizon and market conditions.” Management Science 27(6), 607–618. Lee, C. F. and S. N. Chen. 1984. “On the measurement errors and ranking of composite performance measures.” Quarterly Review of Economics and Business 24, 6–17. Lee, C. F. and S. N. Chen. 1986. “The effects of the sample size, the investment horizon and market conditions on the validity of composite performance measures: a generalization,” Management Science 32(11), 1410–1421. Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York. Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics, World Scientific Publishing Co, Singapore. Maginn, J. L., D. L. Tuttle, J. E. Pinto, and D. W. McLeavey. 2007. Managing investment portfolios: a dynamic process, CFA Institute Investment Series, 3rd Edition, Wiley, New Jersey. Markowitz, H. M. 1959. Portfolio selection: efficient diversification of investments, Wiley, New York, NY. Modigliani, F. and G. A. Pogue. 1974. “An introduction to risk and return.” Financial Analysis Journal 30, 69–86. Robicheck, A. A. and R. A. Cohn. 1974. “The economic determinants of systematic risk.” Journal of Finance 29, 439–447. Schall, L. D. 1972. “Asset valuation, firm investment, and firm diversification.” Journal of Business 45, 11–28. Sharpe, W. F. 1966. “Mutual fund performance.” Journal of Business 39, 119–138. Thompson, D. J. 1976. “Sources of systematic risk in common stocks.” Journal of Business 49, 173–188. Tobin, J. 1958. “Liquidity preference as behavior toward risk.” Review of Economic Studies 25, 65–86. Treynor, J. 1965. “How to rate management of investment funds.” Harvard Business Review 43, 63–75. Wackerly, D., W. Mendenhall, and R. L. Scheaffer. 2007. Mathematical statistics with applications, 7th Edition, Duxbury Press, California.
Chapter 5
Risk-Aversion, Capital Asset Allocation, and Markowitz Portfolio-Selection Model Cheng-Few Lee, Joseph E. Finnerty, and Hong-Yi Chen
Abstract In this chapter, we first introduce utility function and indifference curve. Based on utility theory, we derive the Markowitz’s model and the efficient frontier through the creation of efficient portfolios of varying risk and return. We also include methods of solving for the efficient frontier both graphically and mathematically, with and without explicitly incorporating short selling. Keywords Markowitz model r Utility theory r Utility functions r Indifference curve r Risk averse r Short selling r Dyl model r Iso-return line r Iso-variance ellipse r Critical line r Lagrange multipliers
5.1 Introduction In this chapter, we address basic portfolio analysis concepts and techniques discussed in the Markowitz portfolioselection model and other related issues in portfolio analysis. Before Harry Markowitz (1952, 1959) developed his portfolio-selection technique into what is now modern portfolio theory (MPT), security-selection models focused primarily on the returns generated by investment opportunities. The Markowitz theory retained the emphasis on return, but it elevated risk to a coequal level of importance, and the concept of portfolio risk was born. Whereas risk had been considered an important factor and variance an accepted way of measuring risk, Markowitz was the first to clearly and rigorously show how the variance of a portfolio can be reduced through the impact of diversification. He demonstrated that by combining securities that are not perfectly positively correlated into a portfolio, the portfolio variance can be reduced. The Markowitz model is based on several assumptions concerning the behavior of investors:
1. A probability distribution of possible returns over some holding period can be estimated by investors. 2. Investors have single-period utility functions in which they maximize utility within the framework of diminishing marginal utility of wealth. 3. Variability about the possible values of return is used by investors to measure risk. 4. Investors use only expected return and risk to make investment decisions. 5. Expected return and risk as used by investors are measured by the first two moments of the probability distribution of returns-expected value and variance. 6. Return is desirable; risk is to be avoided. It follows, then, that a security or portfolio is considered efficient if there is no other investment opportunity with a higher level of return at a given level of risk and no other opportunity with a lower level of risk at a given level of return.
5.2 Measurement of Return and Risk This section focuses on the return and risk measurements utilized in applying the Markowitz model to efficient portfolio selection.
5.2.1 Return Using the probability distribution of expected returns for a portfolio, investors are assumed to measure the level of return by computing the expected value of the distribution. E.RP / D
H.-Y. Chen () and C.-F. Lee Rutgers University, New Brunswick, NJ, USA e-mail:
[email protected]
n X
Wi E .Ri /
(5.1)
i D1
where:
J.E. Finnerty University of Illinois, Urbana-Champaign, IL, USA e-mail:
[email protected] C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_5,
n X
Wi D 1:0I
i D1
69
70
C.-F. Lee et al.
n D the number of securities; Wi D the proportion of the funds invested in security i ; Ri , RP D the return on i th security and portfolio p; and E . / D the expectation of the variable in the parentheses.
Since rAA and rBB D 1:0 by definition, terms can be simplified and rearranged, yielding:
Thus, the return computation is nothing more than finding the weighted average return of the securities included in the portfolio. The risk measurement to be discussed in the next section is not quite so simple, however. For only in the case of perfect positive correlation among all its components is the standard deviation of the portfolio equal to the weighted average standard deviation of its component securities.
For a three-security portfolio, the variance of portfolio can be defined:
5.2.2 Risk Risk is assumed to be measurable by the variability around the expected value of the probability distribution of returns. The most accepted measures of this variability are the variance and standard deviation. The variance of a single security is the expected value of the sum of the squared deviations from the mean, and the standard deviation is the square root of the variance. The variance of a portfolio combination of securities is equal to the weighted average covariance of the returns on its individual securities: n n X X Wi Wj Cov Ri ; Rj Var Rp D
(5.2)
i D1 j D1
Covariance can also be expressed in terms of the correlation coefficient as follows: Cov Ri ; Rj D rij i j D ij
(5.3)
Where rij D correlation coefficient between the rates of return on security i; Ri , and the rates of return on security j; Rj , and i , and j represent standard deviations of Ri and Rj respectively. Therefore: n n X X Wi Wj rij i j Var Rp D
(5.4)
i D1 j D1
For a portfolio with two securities, A and B, the following expression can be developed: B B X X Var Rp D Wi Wj rij i j i DA j DA
D WA WA rAA A A C WA WB rAB A B CWB WA rBA B A C WB WB rBB B B
Var Rp D WA2 A2 C WB2 B2 C 2WA WB rAB A B
(5.5)
C P C P Var Rp D Wi Wj rij i j i DA j DA
D WA WA rAA A A C WA WB rAB A B CWA WC rAC A C C WB WA rAB B A CWB WB rBB B B C WB WC rBC B C CWC WA rCA C A C WC WB rCB C B CWC WC rC C C C Again simplifying and rearranging yields: Var Rp D WA2 A2 C WB2 B2 C WC2 C2 C 2.WA WB rAB A B CWA WC rAC A C C WB WC rBC B C / (5.6) Thus, as the number of securities increases from three to two, there is one more variance term and two more covariance terms. The general formula for determining the number of terms that must be computed (NTC) to determine the variance of a portfolio with N securities is NTC D N variancesC
N2 N covariances 2
For the two-security example two variances were involved ı A2 and B2 , and 22 2 2 D .4 2/=2 D 1 covariances, AB , or three computations. ı For the three-security case, three variances and 32 3 2 D .9 3/=2 D 3 covariances were needed, AB ; AC , and BC , or six computations. ıWith four securities, the total number would be 4 C 42 4 2 D 4 C .16 4/=2 D 4 C 6 D 10 computations. As shown from these examples, the number of covariance terms increases by N 1, and the number of variance terms increases by one, as N increases by 1 unit. If rAB D rAC D rBC D 1, then the securities are perfectly positively correlated with each other, and Equation (5.6) reduces to: Var Rp D .WA A C WB B C WC C /2
(5:60 )
This implies that the standard deviation of the portfolio is equal to the weighted average standard deviation of its component securities. In other words: Standard deviation of Rp D WA A C WB B C WC C
5 Risk-Aversion, Capital Asset Allocation, and Markowitz Portfolio-Selection Model
Table 5.1 Portfolio size and variance computations Number of securities Number of Var and Cov terms 2 3 4 5 10 15 20 25 50 75 100 250 500
3 6 10 15 55 120 210 325 1; 275 2; 850 5; 050 31; 375 125; 250
Table 5.1 clearly illustrates the tremendous estimation and computational load that exists using the Markowitz diversification approach. Later chapters will illustrate how this problem can be alleviated.
5.3 Utility Theory, Utility Functions, and Indifference Curves In this section, utility theory and functions, which are needed for portfolio analysis and capital asset models, will be discussed in detail. Utility theory is the foundation for the theory of choice under uncertainty. Following Henderson and Quandt (1980), cardinal and ordinal theories are the two major alternatives used by economists to determine how people and societies choose to allocate scarce resources and to distribute wealth among one another over time.1
5.3.1 Utility Functions Economists define the relationships between psychological satisfaction and wealth as “utility.” An upward-sloping relationship, as shown in Fig. 5.1, identifies the phenomena of increasing wealth and increasing satisfaction as being directly related. These relationships can be classified into linear, concave, and convex utility functions. 1
A cardinal utility implies that a consumer is capable of assigning to every commodity or combination of commodities a number representing the amount or degree of utility associated with it. An ordinary utility implies that a consumer needs not be able to assign numbers that represent (in arbitrary unit) the degree or amount of utility associated with commodity or combination of commodity. The consumer can only rank and order the amount or degree of utility associated with commodity.
71
In Fig. 5.1a, for each unit change in wealth, there is a linear utility function and equal increase in satisfaction or utility. A doubling of wealth will double satisfaction, and so on. If an investor’s utility function is a linear utility function, we call this kind of investor a risk-neutral investor. This is probably not very realistic: a dollar increase in wealth from $2 to $1 is probably more important than an increase from $2 million to $1 million, because the marginal utility diminishes with increased wealth. In Fig. 5.1b the concave utility function shows the relationship of an increase in wealth and a less-than-proportional increase in utility. In other words, the marginal utility of wealth decreases as wealth increases. As mentioned above, the $1 increase from $2 to $1 of wealth is more important to the individual than the increase from $1,000,001 to $1 million. Each successive increase in wealth adds less satisfaction as the level of wealth rises. We call an investor with a concave utility function a risk-averse investor. Finally, Fig. 5.1c is a convex utility function, which denotes a more than proportional increase in satisfaction for each increase in wealth. Behaviorally, the richer you are the more satisfaction you receive in getting an additional dollar of wealth. Investors with convex utility functions are called risk-seeking investors. The utility theory primarily used in finance is that developed by Von Neumann and Morgenstern (VNM 1947). VNM define investor utility as a function of rates of return or wealth. Mao (1969) points out that the VNM utility theory is really somewhere between the cardinal and ordinal utility theories. The function associated with the VNM’s utility theory in terms of wealth can be defined: U D f .w ; w / where w indicates expected future wealth and w represents the predicted standard deviation of the possible divergence of actual future wealth from w . Investors are expected to prefer a higher expected future wealth to a lower value. Moreover, they are generally risk averse as well. That is, they prefer a lower value of w to a higher value, given the level of w .2 These assumptions imply that the indifference curves relating w and w will be upward sloping, as indicated in Fig. 5.2. In Fig. 5.2, each indifference curve is an expected utility isoquant showing all the various combinations of risk and return that provide an equal amount of expected utility for the investor. In explaining how investment decisions or portfolio choices are made, utility theory is used here not to imply that individuals actually make decisions using a utility curve, but rather as an expository vehicle that helps explain how 2
Technically, these conditions can be represented mathematically by @U =@w > 0 and @U =@w < 0.
72
C.-F. Lee et al.
Fig. 5.1 Utility functions
utilities of the various possible returns. The weights are the probabilities of occurrence associated with each of the possible returns. It is calculated by the following formula: E .U / D
n X
U .wi /Pi
(5.7)
i
where: E .U / D expected utility; U .wi / D the utility of the i th outcome wi ; and Pi D the Probability of the i th outcome. Example 5.1 provides further illustration. Example 5.1. Given investments A and B as shown in the table, determine the utilities of A and B for the given utility functions. Fig. 5.2 Indifference curves of utility functions
A Outcome wi 10 5 1
Probability 2/5 2/5 1/5
B Outcome wi 9 3
Probability investors presumably act. In general, humans behave as if 2/3 more is better than less (the utility curve is upward slop1/3 ing) and marginal utility is decreasing (the utility curve is concave). In an uncertain environment it becomes necessary to ascertain how different individuals will react to risky situ- 1. U.w/ D wi ations. The risk is defined as the probability of success or 2. U.w/ D w2i failure. Alternatively, risk could be described as variability 3. U.w/ D w2i wi of outcomes, payoffs, or returns. This implies that there is a distribution of outcomes associated with each investment Solution decision. What is needed is a linkage between utility or expected utility and risk. Expected utility has been defined as 1. For U.w/ D wi the numerical value assigned to the probability distribution associated with a particular portfolio’s return. This numeriUtility A D 25 .10/ C 25 .5/ C 15 .1/ D 6 51 cal value is calculated by taking a weighted average of the Utility B D 23 .9/ C 13 .3/ D 7
5 Risk-Aversion, Capital Asset Allocation, and Markowitz Portfolio-Selection Model
73
Fig. 5.3 Risk-neutral investors and fair games
2. For U.w/ D w2i Utility A D 25 .100/ C 25 .25/ C 15 .1/ D 50 51 Utility B D 23 .81/ C 13 .9/ D 57 3. U.w/ D w2i wi (use results from 1 and 2) Utility A D50 51 6 15 D 44 Utility B D57 7 D 50 In all three cases, B has the higher degree of utility because it has a higher expected value as well as a smaller dispersion than A. Linear utility function and risk. It is useful now to consider how the shape of an individual’s utility function affects his or her reaction to risk. Assume that an individual who has $5,000 and whose behavior is a linear utility function (Fig. 5.1a) is offered a chance to gain $10,000 with a probability of half or to lose $10,000 with a probability of half. What should he or she pay for such an opportunity? The answer is nothing, for as shown in Fig. 5.3, this individual would be no better or worse off accepting or rejecting this opportunity. If he rejected the offer, his wealth would be $5,000 with utility U1 ; if he paid nothing for the opportunity, his wealth would remain as $5,000 with Utility U1 . Any payment for this chance would reduce his wealth and therefore be undesirable. This is so because the expected value of the fair game is zero: 1 2
.10; 000/ C 12 .10; 000/ D 0
Figure 5.3 illustrates this linear utility function concept. In the following section we will analyze the implication of concave utility function. Concave utility function and risk. Now consider an individual whose behavior is a concave utility function. If this individual participates and wins, her utility is shown by point UW in Fig. 5.4. But if she loses, her position is shown by UL . The expected value of this fair game, having a 50% chance of winning and a 50% chance of losing, is shown by point A.3 The utility of the fair game is UF . A comparison of UF with her initial position, Ui , shows that the investor should not accept this fair game. As shown in Fig. 5.4 the utility of winning .UW Ui / is less than the utility of losing .Ui UL /. Therefore, the utility of doing nothing is greater than the expected utility of accepting the fair game. In fact, the individual should be willing to pay up to the difference between the utility of winning and the utility of losing .Ui UL / .UW Ui / to avoid being involved in this situation. Alexander and Francis (1986) theoretically analyze this issue in more detail. Hence investors with concave utility functions are said to be risk averse. That is, they would reject a fair game because the utility derived from winning is less than the utility lost should they lose the game. In other words, the expected utility of participating in a fair game is negative. Example 5.2 will provide illustration of linear, concave, and convex utility function cases for different investment decisions. 3
Following Fama (1970) and Alexander and Francis (1986:177), a fair game means that the expected returns, given information set , equal the expected returns without the information set. Note that this does not mean the expected returns are zero or positive – they could be negative.
74
C.-F. Lee et al.
For a risk-averse investor QUi > E (UF) Þ E(UF)-Ui < 0 \Reject a fair game
Uw -Ui =
Ui -UL =
E(UF)
E(Ul) Concave Utility Function Linear Utility Function
Fig. 5.4 Risk-averse investors and fair games
E.UF / D 12 .Uw C UL / is the expected utility of the fair game under a concave utility function. E.U` / is the expected utility of the fair game under a linear utility function.
This implies the investor is risk averse and would reject a fair gamble. 3.
u.w/ D e 2w u .w/ D 2e 2w 00 u .w/ D 4e 2w < 0 0
Example 5.2. Given the following utility functions for four investors, what can you conclude about their reaction towards a fair game? 1. 2. 3. 4.
u.w/ D w C 4 u.w/ D w 12 w2 (quadratic utility function) u.w/ D e 2w (negative exponential utility function), and u.w/ D w2 4w
Evaluate the second derivative of the utility functions according to the following rules. 00
u .w/ < 0 implies risk averse 00 u .w/ D 0 implies risk neutral 00 u .w/ > 0 implies risk seeker Solution 1.
u.w/ D w C 4 u0 .w/ D 1 00 u .w/ D 0 This implies risk neutrality; it is indifference for the investor to accept or reject a fair gamble.
2.
u.w/ D w 12 w2 u0 .w/ D w2 00 u .w/ D 2w < 0 .We assume wealth is nonnegative:/
This implies risk adversity; the investor would reject a fair gamble. 4.
u.w/ D w2 4w u .w/ D 2w 4 00 u .w/ D 2 > 0 0
This implies risk preference; the investor would seek a fair gamble. The convex utility function is not realistic in real-world decisions; therefore, it is not further explored at this point. The following section discusses the implications of alternative utility functions in terms of indifference curves.
5.3.2 Risk Aversion and Utility Values But when risk increases along with return, the most attractive portfolio is not obvious. How can investors quantify the rate at which they are willing to trade off return against risk? By assuming that each investor can assign a welfare, or utility, score to competing investment portfolios based on the expected return and risk of those portfolios. Higher utility values are assigned to portfolios with more attractive risk-return
5 Risk-Aversion, Capital Asset Allocation, and Markowitz Portfolio-Selection Model
profiles. Portfolios receive higher utility scores for higher expected returns and lower scores for higher volatility. Assigning a portfolio with expected return and variance of returns, the following utility score: U D E.r/
1
A s2 2
Where U D utility E.r/ D expected return on the asset or portfolio A D coefficient of risk aversion s 2 D variance of returns Example: Consider three investors with different degrees of risk aversion: A1 D 2:0; A2 D 3:0, and A3 D 4:0, all of whom are evaluating the three portfolios in Table 5.2. Because the risk-free rate is assumed to be 5%, utility score implies that all three investors would assign a utility score of 0.05 to the risk-free alternative. Table 5.2 presents the utility scores that would be assigned by each investor to each portfolio. The portfolio with the highest utility score for each investor appears in bold. Notice that the high-risk portfolio, H, would be chosen only by the investor with the lowest degree of risk aversion, A1 D 2:0, while the low-risk portfolio, L. would be passed over even by the most risk-averse of our three investors. All three portfolios beat the risk-free alternative for the investors with levels of risk aversion given in Table 5.2.
5.3.3 Capital Allocation Across Risky and Risk-Free Portfolios From the previous discussion we know that the riskier investments offer higher average returns, while less risky investments offers lower average returns. Therefore, a rational investor makes optimal portfolio choice between risky and risk-free securities instead of making all-or-nothing choices from these investment classes. They can and do construct their portfolios using securities from all asset classes. For example, some of the portfolios may be in risk-free Treasury bills, some in high-risk stocks.
75
The most straightforward way to control the risk of the portfolio is through the fraction of the portfolio invested in Treasury bills and other safe money market securities versus risky assets. The capital allocation decision is an example of an asset allocation choice – a choice among broad investment classes, rather than among the specific securities within each asset class. Most investment professionals consider asset allocation the most important part of portfolio construction. Example 5.3. Assume that the total market value of a private fund is $500,000, of which $100,000 is invested in a risk-free asset for practical purposes. The remaining $400,000 is invested in risky securities, where $240,000 in equities (E) and $160,000 in long-term bonds (B). Under such assumption, the risky portfolio consists 60% of E and 40% of B, and the weight of the risky portfolio in the mutual fund is 80%. Suppose that the fund manager wishes to decrease risk by reducing the allocation to the risky portfolio from 80 to 70% and not change the proportion of each asset in the risky portfolio. The risky portfolio would then total only 0:7 $500; 000 D $350; 000, requiring the sale of $50,000 of the original $400,000 of risky holdings, with the proceeds used to purchase more shares in risk-free asset. Total holdings in the risk-free asset will increase $500; 000 .10:7/ D $150; 000, the original holdings ($100,000) plus the new contribution ($50,000). To leave the proportions of each asset in the risky portfolio unchanged. Because the weights of E and B in the risky portfolio are 60 and 40%, respectively, the fund manager should sell 0:6 $50; 000 D $30; 000 of E and 0:4 $50; 000 D $20; 000 of B. After the sale, the proportions of each asset in the risky portfolio are in fact unchanged: E W WE D
240; 000 30; 000 D 0:6 400; 000 50; 000
B W WB D
160; 000 20; 000 D 0:4 400; 000 50; 000
5.3.4 Indifference Curves Indifference (utility-function) curves are abstract theoretical concepts. They cannot as a practical matter be used to actually measure how individuals make investment deci-
Table 5.2 Utility scores of alternative portfolios for investors with varying degrees of risk aversion Investor Utility score portfolio L Utility score portfolio L Utility score portfolio L risk aversion (A) E.r/ D :07I D :05 E.r/ D :09I D :1 E.r/ D :13I D :2 2.0
:07
3.0
:07
4.0
:07
1 2 1 2 1 2
2 :052 D :0675 3 :052 D :0663 2
4 :05 D :0650
1 2 :12 2 :09 12 3 12 :09 12 4 :12
:09
D :0800
:13
D :0750
:13
D :0700
:13
1 2 1 2 1 2
2 :22 D :0900 3 :22 D :0700 4 :22 D :0500
76 Fig. 5.5 Indifference curves for various types of individuals
C.-F. Lee et al.
a
b
Risk-Averse investor d
c
Risk-Seeking Investor sions – or any other decisions for that matter. They are, however, useful tools for building models that illustrate the relationship between risk and return. An investor’s utility function can be utilized conceptually to derive an indifference curve, which shows individual preference for risk and return.4 An indifference curve can be plotted in the risk-return space such that the investor’s utility is equal all along its length. The investor is indifferent to various combinations of risk and return, hence the name indifference curve. Various types of investor’s indifference curves are shown in Fig. 5.5. In the same level of satisfaction, the risk-averse investor requires more return for an extra unit of risk than the return he asks for previous one unit increase of risk. Therefore, the indifference curve is convex. Figure 5.5a presents two indifference curves that are risk-averse and U1 > U2 (higher return and lower risk). For a risk-neutral investor, the indifference curve is a straight line. In the same level of satisfaction, the investor requires the same return for an extra unit of risk as the return he asks for previous one unit increase of risk. Figure 5.5b presents two indifference curves that are
4
Risk-Neutral Investor
By definition, an indifference curve shows all combinations of products (investments) A and B that will yield same level of satisfaction or utility to consume. This kind of analysis is based upon ordinal rather than cardinal utility theory.
Level of Risk-Aversion
risk-neutral, and U1 > U2 . For a risk-seeking investor, the indifference curve is a concave function. In the same level of satisfaction, the investor requires less return for an extra unit of risk than the return he asks for previous one unit increase of risk. Figure 5.5c presents two indifference curves that are risk-seeking and U1 > U2 . The more risk-averse individuals will claim more premiums when they face uncertainty (risk). In Fig. 5.5d, when facing 0 , investor 1 and investor 2 have the same expected return, E .R0 /. However, when the risk adds to 1 , the expected return E .R1 / investor 1 asks is higher than the expected return E .R2 / investor 2 asks. Thus, investor 1 is more risk-averse than investor 2. Therefore, in return-risk plane, the larger slope of indifference curve that the investor has, the higher risk-averse level the investor is. Later in this chapter, different levels of risk and return are evaluated for securities and portfolios when a decision must be made concerning which security or portfolio is better than another. It is at that point in the analysis that indifference curves of hypothetical individuals are employed to help determine which securities or portfolios are desirable and which ones are not. Basically, values for return and risk will be plotted for a number of portfolios as well as indifference curves for different types of investors. The investor’s optimal portfolio will then be the one identified with the highest level of utility for the various indifference curves.
5 Risk-Aversion, Capital Asset Allocation, and Markowitz Portfolio-Selection Model
Investors generally hold more than one type of investment asset in their portfolio. Besides securities an investor may hold real estate, gold, art, and so on. Thus, given the measures of risk and return for individual securities developed, the measures of risk and return may be used for portfolios of risky assets. Risk-averse investors hold portfolios rather than individual securities as a means of eliminating unsystematic risk; hence the examination of risk and return will continue in terms of portfolios rather than individual securities. Indifference curves can be used to indicate investors’ willingness to trade risk for return; now investors’ ability to trade risk for return needs to be represented in terms of indifference curves and efficient portfolios, as discussed in the next section.
5.4 Efficient Portfolios Efficient portfolios may contain any number of asset combinations. Two examples are shown, a two-asset combination and a three-asset portfolio; both are studied from graphical and mathematical solution perspectives. The degree to which a two-security portfolio reduces variance of returns depends on the degree of correlation between the returns of the securities. This can be best illustrated by expressing the variability in terms of standard deviation (the square root of the variance): p D
q
WA2 A2 C WB2 B2 C 2WA WB rAB A B
(5.8)
First, assume that rAB D 1:0, which would mean that securities A and B are perfectly positively correlated. Then: p D
q
WA2 A2 C WB2 B2 C 2WA WB A B
or:
q p D
.WA A C WB B /2
so: p D WA A C WB B
(5.9)
With the securities perfectly positively correlated, the standard deviation of the portfolio combination is equal to the weighted average of the standard deviation of the component securities. Since the correlation coefficient cannot be greater than 1.0, the weighted average represents the highest possible values of the portfolio standard deviation as discussed previously in this chapter. In this case, there is no diversification taking place. With any correlation coefficient less than 1.0, there will be a diversification effect, and this effect will be larger the lower the value of the correlation coefficient. The ultimate diversification impact occurs if rAB D 1:0, perfect negative correlation.
p D
77
q
WA2 A2 C WB2 B2 2WA WB A B
This can be reduced to: q p D .WA A WB B /2 D jWA A WB B j
(5.10)
Example 5.3 provides further illustration. Example 5.4. Two-security portfolio diversification. Assume that RA D 10 percent; RB D 8 percent; A D 4 percent, and B D 3 percent. Since the values for both RA and A are greater than those for RB and B , respectively, there is no dominant choice between the two securities. Higher return is associated with higher risk. Additionally, assume that rAB D 1:0 and that the securities are equally weighted in the portfolio. If follows then that: q .0:5/2 .4/2 C .0:5/2 .3/2 C .2/ .0:5/ .0:5/ .1:0/ .4/ .3/ p D .0:25/ .16/ C .0:25/ .9/ C .2/ .0:25/ .12/ p D 4 C 2:25 C 6 p D 12:25
p D
D 3:5
Notice that this is the same result that would have been achieved using Equation (5.9), p D .0:5/ .4/ C .0:5/ .3/ D 2 C 1:5 D 3:5 percent. If rAB D 0:5 there still is positive correlation, but as it is not perfect positive correlation, there is a diversification effect. q .0:5/2 .4/2 C .0:5/2 .3/2 C .2/ .0:5/ .0:5/ .0:5/ .4/ .3/ p D 4 C 2:25 C 3 p D 9:25 D 3:04
p D
This reduction of p from 3.50 to 3.04 is the effect of diversification. Notice that the first two terms under the square root radical were unaffected and that the third term was only half the size it was with no diversification. The diversification impact from a lower correlation coefficient occurs in the covariance term only. If security A and B were independent of one another, that is, rAB D 0, it is clear that the covariance term would equal zero. p p D p 4 C 2:25 C .2/ .0:5/ .0:5/ .0/ .4/ .3/ D p4 C 2:25 C 0 D 6:25 D 2:5 and the diversification impact reduces p to 2.50. If security A and B were perfectly and negatively correlated, that is, AB D 1, substituting the numbers from the example, first into Equation (5.8), gives
78
C.-F. Lee et al.
p p D p 4 C 2:25 C .2/ .0:5/ .0:5/ .1:0/ .4/ .3/ D p6:25 6 D 0:25 D 0:5 and into Equation (5.10): p D .0:5/ .4/ .0:5/ .3/ D 2 1:5 D 0:5 with the appropriate weighting factors, the p may be reduced to zero if there is negative correlation. For example, if WA D 3=7 and WB D 4=7 in this problem: p D .3=7/ .4/ .4=7/ .3/ D0
Fig. 5.6 The relationship between correlation and portfolio risk
5.4.1 Portfolio Combinations

Assume that the actual correlation coefficient is 0.50. By varying the weights of the two securities and plotting the combinations in the risk-return space, it is possible to derive a set of portfolios (see Table 5.3) that form an elliptical curve (see Fig. 5.6). The amount of curvature in the ellipse varies with the degree of correlation between the two securities. If there were perfect positive correlation, all of the combinations would lie on a straight line between points A and B. If the combinations were plotted under the assumption of perfect negative correlation, the curvature would be more pronounced and one of the points (W_A = 0.43, W_B = 0.57) would actually be on the vertical axis (a risk of zero). This process could be repeated for all combinations of investments; the result is graphed in Fig. 5.7.

In Fig. 5.7 the area within curve XVYZ is the feasible opportunity set representing all possible portfolio combinations. The curve YV represents all possible efficient portfolios and is the efficient frontier. The line segment VX is on the feasible opportunity set, but not on the efficient frontier; all points on VX represent inefficient portfolios. The portfolios and securities that lie on the frontier VX in Fig. 5.7 would not be likely candidates for investors to hold, because they do not meet the criteria of maximizing expected return for a given level of risk or minimizing risk for a given level of return.

Table 5.3 Possible risk-return combinations (R_A = 0.10, R_B = 0.08, σ_A = 0.04, σ_B = 0.03, ρ_AB = 0.5)

Portfolio   W_A     W_B     R_P (percent)   σ_P (percent)
1           1.00    0.00    10.0            4.00
2           0.75    0.25     9.5            3.44
3           0.50    0.50     9.0            3.04
4           0.25    0.75     8.5            2.88
5           0.00    1.00     8.0            3.00
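The entries of Table 5.3 follow directly from Equations (5.1) and (5.8); a sketch that regenerates them:

```python
import math

r_a, r_b = 10.0, 8.0          # expected returns, percent
sd_a, sd_b, corr = 4.0, 3.0, 0.5

for w_a in (1.0, 0.75, 0.5, 0.25, 0.0):
    w_b = 1.0 - w_a
    r_p = w_a * r_a + w_b * r_b
    sd_p = math.sqrt(w_a**2 * sd_a**2 + w_b**2 * sd_b**2
                     + 2 * w_a * w_b * corr * sd_a * sd_b)
    print(f"W_A={w_a:4.2f}  R_P={r_p:4.1f}%  sigma_P={sd_p:4.2f}%")
```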
Fig. 5.7 The minimum-variance set
This is easily seen by comparing the portfolio represented by point X with that at point X′. As investors always prefer more expected return for a given level of risk, X′ is always better than X. Using similar reasoning, investors would always prefer V to X because it has both a higher return and a lower level of risk. In fact, the portfolio at point V is identified as the minimum-variance portfolio, because no other portfolio exists that has a lower variance. For each of the combinations of individual securities and inefficient portfolios in Fig. 5.7 there is a corresponding portfolio along the efficient frontier that either has a higher return given the same risk or a lower risk given the same return. However, points on the efficient frontier do not dominate one another. While point Y has considerably higher return than point V, it also has considerably higher risk. The optimal portfolio along the efficient frontier is not unique with this model and depends upon the risk/return tradeoff utility function of each investor. Portfolio selection, then, is determined by plotting investors' utility functions together with the efficient-frontier set of available investment opportunities. No two investors will select the same portfolio except by chance or if their utility curves are identical.

Fig. 5.8 Indifference curves and the minimum-variance set

In Fig. 5.8, two sets of indifference curves labeled U and U′ are shown together with the efficient frontier. The U curves have a higher slope, indicating a greater level of risk aversion. The investor is indifferent to any combination of R_p and σ_p along a given curve, for example, U1, U2, or U3. The U′ curves would be appropriate for a less risk-averse investor – that is, one who would be willing to accept relatively higher risk to obtain higher levels of return. The optimal portfolio is the one that provides the highest utility – a point in the northwest direction (higher return and lower risk). This point will be at the tangency of a utility curve and the efficient frontier. For the less risk-averse investor in Fig. 5.8 the tangency point is X′; for the more risk-averse investor it is point Y. Each investor is logically selecting the optimal portfolio given his or her risk-return preference, and neither is more correct than the other.

Individual investors sometimes find it necessary to restrict their portfolios to include a relatively small number of securities. Mutual-fund portfolios, on the other hand, often contain securities from more than 500 different companies. To give some idea of the number of securities that is necessary to achieve a high level of diversification, Levy and Sarnat (1971) considered a naive strategy of equally weighted portfolios – that is, of dividing the total investment into equal proportions among component securities. Plotting the variance of the portfolios developed in this manner against the number of securities making up the portfolios will result in a function similar to that illustrated in Fig. 5.9. As shown, the additional reduction in portfolio variance rapidly levels off as the number of securities is increased beyond five. An earlier study by Evans and Archer (1968) shows that the percentage of diversification that can be achieved with randomly selected, equally weighted portfolios levels off rapidly beyond a portfolio size of about 15. A small simulation in the spirit of these studies is sketched below.
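The leveling-off pattern reported by these studies is easy to reproduce with simulated data. The following sketch (all parameter values are illustrative assumptions of ours, not estimates from the studies cited) draws equally weighted portfolios of increasing size from a universe of securities that share a common variance and pairwise correlation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_universe, var, corr = 100, 0.04, 0.3   # illustrative assumptions

# covariance matrix: common variance, common pairwise correlation
cov = np.full((n_universe, n_universe), corr * var)
np.fill_diagonal(cov, var)

for n in (1, 2, 5, 10, 15, 30):
    picks = rng.choice(n_universe, size=n, replace=False)
    w = np.full(n, 1.0 / n)                        # equal weights
    port_var = w @ cov[np.ix_(picks, picks)] @ w
    print(n, round(port_var, 4))
# variance falls quickly at first, then flattens toward corr*var (the
# systematic floor) as n grows beyond roughly 10-15 securities
```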
5.4.2 Short Selling

Short selling (or "going short") is a very regulated type of market transaction. It involves selling shares of a stock that are borrowed in expectation of a decline in the security's price. When and if the price declines, the investor buys an equivalent number of shares of the same stock at the new lower price and returns to the lender the stock that was borrowed. The Federal Reserve Board requires short-selling customers to deposit 50% of the net proceeds of such short sales with the brokerage firm carrying out the transaction. Another key requirement of a short sale, set by the Securities and Exchange Act of 1934, is that the short sale must occur at a price higher than the preceding sale – or at the same price as the preceding sale, if that took place at a higher price than a preceding price. This is the so-called uptick or zero-tick rule. It prevents the price of a security from successively falling because of continued short selling.

Relaxing the assumption of no short selling in this development of the efficient frontier involves a modification of the analysis of the previous section. The efficient frontier analyzed in the previous sections was bounded on its two ends by Y and the minimum-variance portfolio V, respectively, as shown in Fig. 5.7. Point Y is called the maximum-return portfolio, as there is no other portfolio with a higher return. This point is normally an efficient single security with the greatest level of risk and return; it could also be a portfolio of securities, all having the same highest levels of risk and return. Point Z is normally a single security with the lowest level of return, although it could be a portfolio of securities, all having the same low level of return.

The Black (1972) model is identical to the Markowitz model except that it allows for short selling.5 That is, the nonnegativity constraint on the amount that can be invested in each security is relaxed, so that W_A may be positive or negative. A negative value for the weight invested in a security is tantamount to a short sale of the security. The new efficient frontier that can be derived with short selling is shown in Fig. 5.10a.
5 Most texts do not identify the Markowitz model with restrictions on short sales. Markowitz (1952), in fact, excluded short sales.
Fig. 5.9 Naive diversification reduces risk to the systematic level in a randomly selected portfolio
The major difference between the frontier in Fig. 5.10a (short selling) and Fig. 5.10b (no short selling) is the disappearance of the end points Y and Z. An investor could short sell the lowest-return security (X). If the number of short sales is unrestricted, then by continuous short selling of X and reinvesting in Y the investor could generate an infinite expected return. Hence the upper bound of the highest-return portfolio would no longer be Y but infinity (shown by the arrow at the top of the efficient frontier). Likewise, the investor could short sell the highest-return security Y and reinvest the proceeds in the lowest-yield security X, thereby generating a return less than the return on the lowest-return security. Given no restriction on the amount of short selling, an infinitely negative return can be achieved, thereby removing the lower bound of X on the efficient frontier. But rational investors will not short sell a high-return stock in order to buy a low-return stock: the portfolios on VZ′ always dominate those of VX, as shown in Fig. 5.10a. Whether an investor engages in any of this short-selling activity depends on the investor's own unique set of indifference curves. Hence, short selling generally will increase the range of alternative investments from the minimum-variance portfolio to plus or minus infinity.

However, in the Black model with short selling, no provision was made for the SEC margin requirement. Dyl (1975) imposed the margin requirement on short selling and added it to the Markowitz development of the efficient frontier.

The Dyl model. Dyl introduced short selling with margin requirements by creating a new set of risky securities, the ones sold short, which are negatively correlated with the existing set of risky securities. These new securities greatly enhance the diversification effect when they are placed in portfolios. The Dyl model affects the efficient frontier in two ways: (1) if the investor were to combine in equal weight any long position in a security or portfolio with a short position in a security or a portfolio, the resulting portfolio would yield zero return and zero variance; and (2) any combination of unequally weighted long or short positions would yield
Fig. 5.10 (a) The efficient frontier with short selling. (b) Efficient frontiers with and without short selling and margin requirements
portfolios with higher returns and lower risk levels. Overall, these two effects will yield an efficient frontier that dominates the Markowitz efficient frontier. Figure 5.10b compares the Dyl and Markowitz efficient frontiers.
Even though the inclusion of short selling changes the location and boundaries of the efficient frontier, the concavity of the curve is still intact. This is important in that it preserves the efficient frontier as the locus of optimal portfolios for risk-averse investors. As long as the efficient frontier remains concave, by using indifference curves it will be possible to locate the optimal portfolio for each investor.

Techniques for calculating the efficient frontier with short selling. Since there are thousands of securities from which the investor can choose for portfolio formation, it can be very costly and time consuming to calculate the efficient-frontier set. One way of determining the optimal investment proportions in a portfolio is to hold the return constant and solve for the weighting factors (W1, ..., Wn for n securities) that minimize the variance, given the constraint that all weights sum to one and that the constant return equals the expected return developed from the portfolio. The optimal weights can then be obtained by minimizing the Lagrange function C for portfolio variance:

C = Σ_{i=1}^{n} Σ_{j=1}^{n} W_i W_j r_ij σ_i σ_j + λ1 (1 − Σ_{i=1}^{n} W_i) + λ2 [E* − Σ_{i=1}^{n} W_i E(R_i)]    (5.11)

in which λ1 and λ2 are the Lagrange multipliers, E* and E(R_i) are the targeted rate of return and the expected rate of return for security i, respectively, r_ij is the correlation coefficient between R_i and R_j, and other variables are as previously defined.

Function C has n + 2 unknowns: W1, ..., Wn, λ1, and λ2. By differentiating C with respect to W_i, λ1, and λ2 and equating the first derivatives to zero, n + 2 equations can be derived to solve for the n + 2 unknowns. As shown in the empirical example later in this chapter, matrix algebra is best suited to this solution process. By using this approach the minimum variance can be computed for any given level of expected portfolio return (subject to the other constraint that the weights sum to one). In practice it is best to use a computer because of the explosive increase in the number of calculations as the number of securities considered grows. The efficient set that is generated by this approach [Equation (5.11)] is sometimes called the minimum-variance set because of the minimizing nature of the Lagrangian solution. A minimal numerical sketch of this matrix approach is given below.
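To make the (n + 2)-equation system concrete, the following sketch (with illustrative data of our own, not the chapter's) builds the coefficient matrix for three securities from the covariances and solves it with NumPy; this is the same mechanics used in the empirical example of Section 5.4.3:

```python
import numpy as np

def min_variance_weights(cov, mu, target):
    """Solve the (n+2)-equation Lagrangian system for the weights that
    minimize portfolio variance at a given target expected return.
    First-order conditions: 2*cov*w - l1 - l2*mu = 0,
    plus sum(w) = 1 and mu'w = target."""
    n = len(mu)
    a = np.zeros((n + 2, n + 2))
    a[:n, :n] = 2.0 * cov
    a[:n, n] = -1.0          # lambda_1 column
    a[:n, n + 1] = -mu       # lambda_2 column
    a[n, :n] = 1.0           # budget constraint row
    a[n + 1, :n] = mu        # target-return constraint row
    k = np.zeros(n + 2)
    k[n], k[n + 1] = 1.0, target
    return np.linalg.solve(a, k)[:n]

# illustrative three-security inputs, monthly figures
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.06]])
mu = np.array([0.008, 0.012, 0.010])
print(min_variance_weights(cov, mu, 0.010))
```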
Thus far no specific distribution for measuring returns has been assumed. In most cases a specific distribution of returns will be needed when applying the portfolio model. The normal and log-normal are two of the most commonly used distributions.

The normal distribution. As you undoubtedly are aware, the normal (probability) distribution is a bell-shaped curve centered on the mean of a given population or sample distribution. The area under the curve is an accumulation of probabilities that sum to one. With half of the area lying to the left and half to the right of the mean, probability statements may be made about the underlying sample that makes up the distribution. Within the scope of the Markowitz model, the normal distribution may be applied because of the use of the mean-variance assumptions. The formation of probability statements from the underlying sample is a result of the standard deviation (square root of the variance) quantifying the spread of the distribution. If the sample is normally distributed, approximately 68% of the observations will lie within one standard deviation on either side of the mean, approximately 95% will lie within two standard deviations, and approximately 99% will lie within three standard deviations. This ability to ascertain intervals of confidence for the observations allows the utilization of the normal distribution to predict ranges within which the portfolio returns will lie.

Utilizing the standard normal distribution (mean of zero, standard deviation of one), any set of mean-variance data can be standardized to develop probability statements about returns on a given portfolio. Standardization is an algebraic operation in which the mean is subtracted from a given return and this difference is divided by the standard deviation. The resultant metric is then standard normal and can be compared with tabulated values to calculate probabilities of occurrence. Within this framework, any hypothesized level of return can be assigned a probability of occurrence based upon the nature of the sample.

For expository purposes, suppose the mean return on a particular investment is 10% for a given period, and that historically these returns have a variance of 0.16% (that is, 0.0016, or a standard deviation of 4%). It is of interest to know what the probabilities are of obtaining a 15% or greater return, or a return less than or equal to 8%. These probabilities may be evaluated by using the normal distribution, as stated in notation below:

P(X ≥ 0.15 | R̄_p = 0.1, σ_p² = 0.0016)
P(X ≤ 0.08 | R̄_p = 0.1, σ_p² = 0.0016)

By standardizing, the probability statements become z values:

P(z ≥ (0.15 − 0.1)/0.04) = P(z ≥ 1.25)
P(z ≤ (0.08 − 0.1)/0.04) = P(z ≤ −0.5)    (5.12)

These standardized values can then be compared with the tabulated z values found in tables of the standardized normal distribution. For z ≥ 1.25 the probability is 10.56% (50% − 39.44%), and for z ≤ −0.5 the probability is 0.3085 (50% − 19.15%). Therefore this investment has a 10.56% chance of obtaining a 15% or greater return, and a 30.85% chance of obtaining a return of 8% or less.
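A quick check of these tail probabilities using SciPy's standard normal distribution (a sketch; any normal CDF table gives the same numbers):

```python
from scipy.stats import norm

mean, sd = 0.10, 0.04                 # mean return 10%, sigma = 4%

p_above_15 = 1 - norm.cdf((0.15 - mean) / sd)   # P(z >= 1.25)
p_below_8 = norm.cdf((0.08 - mean) / sd)        # P(z <= -0.5)
print(round(p_above_15, 4), round(p_below_8, 4))  # 0.1056 0.3085
```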
There are, however, difficulties in making probabilistic statements when using the normal distribution. Given that the return data utilized are ex post, the predictive ability of the normal distribution based on historical data is limited. The past does not always predict the future; hence, this problem is somewhat alleviated by an assumption that the security under consideration is in a static state. Nevertheless, as shown by research and common sense, this assumption is rather bold, and the use of the normal distribution should be limited to comparisons of various past portfolio returns. Another difficulty with the normal distribution is that if the distribution of the sample is skewed in any way, the standard deviation will not properly reflect the equivalent areas under the curve on either side of the mean. This problem will be discussed and solved in the next section through the application of the log-normal distribution. A variable is log normally distributed if the logarithm of this variable is normally distributed.

The log-normal distribution. The log-normal distribution, discussed further in the next chapter, extends this section's approach. The approach outlined in the last section for delineating the efficient frontier assumes that the security-return data are normally distributed. In reality, most financial researchers would agree that security-return data tend to be positively skewed. This skewness can be a serious problem in accurately developing the efficient frontier with the Markowitz model because of the assumption that only the first two moments of the return distribution, mean and variance, are important. One reason for returns being positively skewed is the inability of the investor to lose more than 100% of his or her investment, effectively creating a lower bound to portfolio returns. This is called the limited-liability constraint. But since capital gains and dividends could conceivably be infinite, the upper tail of the distribution of returns has no upper bound. The range of probable returns is spread towards the positive side and therefore contains a potential bias when it is utilized in developing statistical estimates.

If the return distributions are significantly skewed, the efficient frontier can be more accurately determined with the use of logarithmically transformed holding-period returns. That is, each holding-period return for security i with holding period T, (1 + R_iT), is transformed by computing its natural logarithm, ln(1 + R_iT). The logarithmic transformation converts a data set of discretely compounded returns into continuously compounded returns. The distribution of discrete-time returns will be more positively skewed the larger the differencing interval used to measure the returns; that is,
if skewness exists in the return distribution, annual data will be more positively skewed than monthly data, which will be more skewed than weekly data. The continuous compounding implied in the logarithmically transformed data will virtually eliminate any positive skewness existing in the raw return data when rates of return are log normally distributed.

In practice, it is possible to directly derive the efficient frontier using the means and variances of logarithmically transformed data. Under this circumstance, assume that utility functions use continuous returns and variances, and then use the ln(1 + R_p) transformation on discrete data. To deal with the untransformed log-normally distributed data, Elton et al. (1976) use the mean and variance of the log-normal distribution as defined in Equations (5.13) and (5.14) to derive the efficient frontier. In other words, they first delineate the efficient frontier by using the means and variances of the untransformed data and then use Equations (5.13) and (5.14) to determine the subset of the [E(1 + R_p), σ(1 + R_p)] efficient frontier in terms of log-transformed data, ln(1 + R_p).

E(1 + R_p) = e^{E(r_p) + (1/2)σ²}    (5.13)

where r_p = ln(1 + R_p) and σ is the standard deviation of r_p.

σ²(1 + R_p) = e^{2E(r_p) + σ²} (e^{σ²} − 1)    (5.14)

where r_p = ln(1 + R_p) and σ² = Var(r_p). Use of the logarithmic transformation can also be shown to eliminate less desirable portfolios in the lowest-return segment of the efficient frontier computed using the raw data.6

Before moving on to an example, however, it will be useful to briefly summarize the portfolio-selection process. Again, the Markowitz model of portfolio selection is a mathematical approach for deriving optimal portfolios; that is, portfolios that satisfy the following conditions:

1. The least risk for a given level of expected return (minimum-variance portfolios).
2. The greatest expected return for a given level of risk (efficient portfolios).

How does a portfolio manager apply these techniques in the real world? The process would normally begin with a universe of securities available to the fund manager. These securities would be determined by the goals and objectives of the mutual fund. For example, a portfolio manager who runs a mutual fund specializing in health-care stocks would be required to select securities from the universe of health-care stocks. This would greatly reduce the analysis of the fund manager by limiting the number of securities available.
6 See Baumol (1963).
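To see Equations (5.13) and (5.14) in action, the following sketch (with illustrative parameter values of our choosing) converts the mean and variance of the continuously compounded return r_p = ln(1 + R_p) into the moments of 1 + R_p, and cross-checks them by simulation:

```python
import numpy as np

mu, var = 0.01, 0.0025        # illustrative E(r_p) and Var(r_p)

mean_gross = np.exp(mu + 0.5 * var)                      # Eq. (5.13)
var_gross = np.exp(2 * mu + var) * (np.exp(var) - 1.0)   # Eq. (5.14)

# Monte Carlo cross-check: draw normal r_p, exponentiate to 1 + R_p
r = np.random.default_rng(1).normal(mu, np.sqrt(var), 1_000_000)
print(mean_gross, np.exp(r).mean())   # should agree closely
print(var_gross, np.exp(r).var())
```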
The next step in the process would be to determine the proportions of each security to be included in the portfolio. To do this, the fund manager would begin by setting a target rate of return for the portfolio. After determining the target rate of return, the fund manager can determine the different proportions of each security that will allow the portfolio to reach this target rate of return. The final step in the process would be for the fund manager to find the portfolio with the lowest variance given the target rate of return. The next section uses a graphical approach to derive the optimal portfolios for three securities.
5.4.3 Three-Security Empirical Solution

To facilitate a realistic example, actual data have been taken from a set of monthly returns generated by the Dow-Jones 30 Industrials. This example focuses on the returns and risk of three of the industrial companies – AXP (American Express), XOM (Exxon Mobil), and JNJ (Johnson & Johnson) – for the period January 2001–September 2007. The data used are tabulated in Table 5.4. Both graphical and mathematical analyses are employed to obtain an empirical solution.

Table 5.4 Data for three securities
Company   E(r_i)    σ_i²      Cov(R_i, R_j)
JNJ       0.0053    0.0455    σ12 = 0.0009
AXP       0.0055    0.0614    σ23 = 0.0010
XOM       0.0126    0.0525    σ13 = 0.0004

Graphical analysis. To begin to develop the efficient frontier graphically, it is necessary to move from the three dimensions necessitated by the three-security portfolio to a two-dimensional problem by transforming the third security into an implicit solution from the other two.7 To do this it must be noted that since the summation of the weights of the three securities is equal to unity, then implicitly:

W3 = 1 − W1 − W2    (5.15)

Additionally, the above relation may be substituted into Equation (5.1):

E(R_p) = W1 E(R1) + W2 E(R2) + W3 E(R3)
       = W1 E(R1) + W2 E(R2) + (1 − W1 − W2) E(R3)
       = W1 E(R1) + W2 E(R2) + E(R3) − W1 E(R3) − W2 E(R3)
       = [E(R1) − E(R3)] W1 + [E(R2) − E(R3)] W2 + E(R3)    (5.16)

Finally, inserting the values for the first and second securities yields:

E(R_p) = (0.0053 − 0.0126) W1 + (0.0055 − 0.0126) W2 + 0.0126
       = −0.0073 W1 − 0.0071 W2 + 0.0126    (5.17)

As can be seen, Equation (5.17) is a linear function in two variables and as such is readily graphable. Since, given a certain level of portfolio return, the function solves jointly for the weights of securities 1 and 2, it solves indirectly for the weight of security 3. The variance formula shown in Equation (5.2) is converted in a similar manner by substituting in Equation (5.15) as follows:

σ_p² = Σ_{i=1}^{3} Σ_{j=1}^{3} W_i W_j Cov(R_i, R_j) = Var(R_p)
     = W1²σ11 + W2²σ22 + W3²σ33 + 2W1W2σ12 + 2W1W3σ13 + 2W2W3σ23
     = W1²σ11 + W2²σ22 + (1 − W1 − W2)²σ33 + 2W1W2σ12 + 2W1(1 − W1 − W2)σ13 + 2W2(1 − W1 − W2)σ23
     = (σ11 + σ33 − 2σ13)W1² + (2σ33 + 2σ12 − 2σ13 − 2σ23)W1W2 + (σ22 + σ33 − 2σ23)W2²
       + (−2σ33 + 2σ13)W1 + (−2σ33 + 2σ23)W2 + σ33    (5.18)

Inserting the covariances and variances of the three securities from Table 5.4:

σ_p² = [0.0455 + 0.0525 − 2(0.0004)]W1² + [2(0.0525) + 2(0.0009) − 2(0.0004) − 2(0.0010)]W1W2
       + [0.0614 + 0.0525 − 2(0.0010)]W2² + [−2(0.0525) + 2(0.0004)]W1
       + [−2(0.0525) + 2(0.0010)]W2 + 0.0525
     = 0.0972W1² + 0.104W1W2 + 0.1119W2² − 0.1042W1 − 0.103W2 + 0.0525    (5.19)

7 The process of finding the efficient frontier graphically described in this section was originally developed by Markowitz (1952). Francis and Archer (1979) have discussed this subject in detail.
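The collapse from Equation (5.18) to the numeric form (5.19) is mechanical and easy to verify; a sketch using the Table 5.4 inputs:

```python
s11, s22, s33 = 0.0455, 0.0614, 0.0525
s12, s13, s23 = 0.0009, 0.0004, 0.0010

coef = {
    "W1^2":  s11 + s33 - 2 * s13,
    "W1W2":  2 * s33 + 2 * s12 - 2 * s13 - 2 * s23,
    "W2^2":  s22 + s33 - 2 * s23,
    "W1":   -2 * s33 + 2 * s13,
    "W2":   -2 * s33 + 2 * s23,
    "const": s33,
}
print(coef)
# {'W1^2': 0.0972, 'W1W2': 0.104, 'W2^2': 0.1119,
#  'W1': -0.1042, 'W2': -0.103, 'const': 0.0525}
```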
Minimum-risk portfolio. Part of the graphical solution is the determination of the minimum-risk portfolio. Standard partial derivatives are taken of Equation (5.18) with respect to the directly solved weight factors as follows:

∂σ_p²/∂W1 = 2(σ11 + σ33 − 2σ13)W1 + (2σ33 + 2σ12 − 2σ13 − 2σ23)W2 + (−2σ33 + 2σ13) = 0
∂σ_p²/∂W2 = (2σ33 + 2σ12 − 2σ13 − 2σ23)W1 + 2(σ22 + σ33 − 2σ23)W2 + (2σ23 − 2σ33) = 0    (5.20)

When these two partial derivatives are set equal to zero and the unknown weight factors are solved for, the minimum-risk portfolio is derived. Using the numeric values from Table 5.4:

∂σ_p²/∂W1 = 2[0.0455 + 0.0525 − 2(0.0004)]W1 + [2(0.0525) + 2(0.0009) − 2(0.0004) − 2(0.0010)]W2 + [−2(0.0525) + 2(0.0004)]
          = 0.1944W1 + 0.104W2 − 0.1042 = 0
∂σ_p²/∂W2 = [2(0.0525) + 2(0.0009) − 2(0.0004) − 2(0.0010)]W1 + 2[0.0614 + 0.0525 − 2(0.0010)]W2 + [2(0.0010) − 2(0.0525)]
          = 0.104W1 + 0.2238W2 − 0.103 = 0    (5.21)

By solving these two equations simultaneously the weights of the minimum-risk portfolio are derived. This variance represents the lowest possible portfolio-variance level achievable, given the variance and covariance data for these stocks; it can be represented by the point V of Fig. 5.8. The solution is an algebraic exercise that yields W1 = 0.3857 and W2 = 0.2810 and, therefore, through Equation (5.15), W3 = 0.3333.

The iso-expected return line. The variance and return equations have been derived, and it is now time to complete the graphing procedure. To begin the graphing, the iso-expected return function lines must be delineated given various levels of expected return. The iso-expected return line is a line that has the same expected return at every point of the line. Utilizing Equation (5.17), three arbitrary returns are specified: 0.0082, 0.01008, and 0.01134 (these correspond to approximately 65, 80, and 90% of the monthly return of Exxon Mobil). These three monthly returns are then set equal to Equation (5.17), and graphing is now possible by setting W1 equal to zero to solve for W2, or W2 equal to zero to solve for W1. For example, if W2 = 0 and E(R_p) = 0.0082, then W1 can be solved as follows (see Table 5.5 for final results):

0.0082 = −0.0073W1 − 0.0071(0) + 0.0126
0.0073W1 = 0.0126 − 0.0082 = 0.0044
W1 = 0.6027

In a similar fashion, other points are calculated and are listed in Table 5.5. A line may be drawn between the points to develop the various iso-expected return function lines. There are a multitude of possible return functions, but only the three lines (IR1, IR2, and IR3) corresponding to the data listed in Table 5.5 are shown in Fig. 5.11. In Fig. 5.11 the vertical axis and horizontal axis represent W1 and W2, respectively.

Table 5.5 Iso-return lines
         Target return
         0.0082    0.01008   0.01134
W2       W1        W1        W1
−1.0     1.5753    1.3178    1.1452
−0.5     1.0890    0.8315    0.6589
 0.0     0.6027    0.3452    0.1726
 0.5     0.1164   −0.1411   −0.3137
 1.0    −0.3699   −0.6274   −0.8000

Each point on an iso-expected return line of Fig. 5.11 represents a different combination of weights placed in the three securities. The issue now is to determine the portfolio that lies on the iso-expected return line with the lowest variance. Table 5.6 shows the portfolio variances as we move along the iso-expected return line. For example, moving along the iso-expected return line associated with an expected return of 0.82%, note that the minimum-variance portfolio is associated with weights of 0.2577 in security 2 and 0.3521 in security 1.

Iso-variance ellipses. As shown in Table 5.6, the minimum-variance portfolio can be found by moving along the iso-expected return line until the minimum-variance portfolio is reached. To better visualize this, examine the variance of a portfolio as the proportions in security 1 and security 2 are changed. A plot of these points will allow the family of iso-variance ellipses to be traced. The iso-variance ellipses are ellipses that have the same variance at every point of the curve. It should be noted that an ellipse is an egg-shaped closed curve of points with a common center and orientation. The minimum-risk portfolio is the center of all possible ellipses as it has the least risk. It will be desirable, then, to find this minimum-risk value, as no other weighting scheme of these three securities will develop a lesser risk. The solutions of Equation (5.21) have determined that the weights associated with the minimum-risk portfolio are W1 = 0.3857, W2 = 0.2810, and W3 = 0.3333.
Fig. 5.11 Iso-return lines
Table 5.6 Portfolio variance along the iso-return line

          0.0082             0.01008            0.01134
w2        w1        Var      w1        Var      w1        Var
−1.00     1.5753    0.1806   1.3178    0.1618   1.1452    0.1564
−0.75     1.3322    0.1225   1.0747    0.1091   0.9021    0.1074
−0.50     1.0890    0.0771   0.8315    0.0693   0.6589    0.0713
−0.25     0.8459    0.0447   0.5884    0.0423   0.4158    0.0479
 0.00     0.6027    0.0250   0.3452    0.0281   0.1726    0.0374
 0.0795   0.5254    0.0214   0.2679    0.0263   0.0953    0.0368*
 0.151    0.4559    0.0194   0.1983    0.0258*  0.0257    0.0373
 0.25     0.3596    0.0182   0.1021    0.0268  −0.0705    0.0397
 0.2577   0.3521    0.0182*  0.0946    0.0269  −0.0780    0.0400
 0.50     0.1164    0.0242  −0.1411    0.0383  −0.3137    0.0549
 0.75    −0.1267    0.0431  −0.3842    0.0626  −0.5568    0.0829
 1.00    −0.3699    0.0748  −0.6274    0.0998  −0.8000    0.1238

Note: Variances marked with an asterisk indicate minimum-variance portfolios
Substituting this information from Table 5.4 into Equation (5.2), the variance of the minimum-risk portfolio is:

Var(R_p) = Σ_{i=1}^{n} Σ_{j=1}^{n} W_i W_j Cov(R_i, R_j)
 = W1²σ11 + W2²σ22 + W3²σ33 + 2W1W2σ12 + 2W2W3σ23 + 2W1W3σ13
 = (0.3857)²(0.0455) + (0.2810)²(0.0614) + (0.3333)²(0.0525)
   + 2(0.3857)(0.2810)(0.0009) + 2(0.2810)(0.3333)(0.0010) + 2(0.3857)(0.3333)(0.0004)
 = 0.017934

Note that the variance of the minimum-risk portfolio can be used as a base for graphing the iso-variance ellipses. This can be done by taking Equation (5.18) and holding one of the weights, say W2, constant. Bring the portfolio variance Var(R_p) to the right-hand side of the equation and notice that the equation is of quadratic form in W1 and can be solved using the quadratic formula:

W1 = [−b ± √(b² − 4ac)] / 2a    (5.22)

where a = all coefficients of W1²; b = all coefficients of W1; and c = all coefficients that are not multiplied by W1 or W1². That is:

a = σ11 + σ33 − 2σ13
b = (2σ33 + 2σ12 − 2σ13 − 2σ23)W2 − 2σ33 + 2σ13
c = (σ22 + σ33 − 2σ23)W2² + (−2σ33 + 2σ23)W2 + σ33 − Var(R_p)
Substituting the numbers from the data of Table 5.4 into Equation (5.18) yields:

Var(R_p) = 0.0972W1² + 0.104W1W2 + 0.1119W2² − 0.1042W1 − 0.103W2 + 0.0525
0 = 0.0972W1² + 0.104W1W2 + 0.1119W2² − 0.1042W1 − 0.103W2 + 0.0525 − Var(R_p)

where: a = 0.0972; b = 0.104W2 − 0.1042; and c = 0.1119W2² − 0.103W2 + 0.0525 − Var(R_p). When these expressions are plugged into the quadratic formula:

W1 = [−(0.104W2 − 0.1042) ± √(b² − 4ac)] / [2(0.0972)]    (5.23)

where b² = (0.104W2 − 0.1042)² and 4ac = 4{(0.0972)[0.1119W2² − 0.103W2 + 0.0525 − Var(R_p)]}.

This is a solution for the two points W11 and W12 of W1 for a given portfolio variance and weight of the second security (W2). When selecting the level of Var(R_p) it is best to choose a value that is slightly larger than the minimum-risk value. This assures the calculation of a possible portfolio, since no portfolio of these three securities may have less risk. Additionally, an initial selection of a value for W2 close to the minimum-risk portfolio's W2 will be desirable. In sum, Equation (5.23) can be used to construct the iso-variance ellipse for a given Var(R_p) in terms of an arbitrary W2.

By jointly considering iso-expected return lines and iso-variance ellipses, efficient portfolios can be identified; these are defined as the portfolios with the minimum variance given the expected rate of return. This task can be done by computer, as mentioned in the last section. To obtain the weights of an efficient portfolio, the following set of instructions must be entered into the computer:

1. Find the portfolio weights that minimize portfolio variance, subject to the target expected rate-of-return constraint. For this case, the target rates of return are 0.82, 1.008, or 1.134%. The sum of the portfolio weights for all stocks in the portfolio must be equal to one. Mathematically, this constraint is defined in Equation (5.15).
2. The expected return and variance for a three-security portfolio are defined in Equations (5.16) and (5.18), respectively.
3. The estimated expected returns, variances, and covariances as defined in Table 5.4 should be entered into the computer for estimation.

Using E(R_p) = 0.82 percent as an example, let W1 = 0; then W2 = 0.6197 and, from the relationship W3 = 1 − W1 − W2, W3 = 1 − 0 − 0.6197 = 0.3803. In other words, the point H represents three portfolio weights for the three stocks. The computer substitutes these weights and the variances and covariances (listed in Table 5.4) into Equation (5.18) to obtain the portfolio variance consistent with the weights of point H. The computer then moves by some predetermined distance either northeast or southwest along the 0.82% iso-expected return line. From this kind of search, the minimum variance associated with the 0.82% iso-expected return line is identified to be 0.0182, as indicated in Table 5.6. By a similar procedure, the minimum variances associated with the 1.008 and 1.134% iso-expected return lines are identified to be 0.0258 and 0.0368, respectively. When the minimum variances associated with the predefined iso-expected returns are identified, the optimal weights associated with expected returns equal to 1.008 and 1.134% in terms of the data indicated in Table 5.4 are also calculated. These results are indicated in Table 5.6.

Using the minimum variances associated with the iso-expected returns (0.82, 1.008, and 1.134%), three iso-variance ellipses can be drawn as IV1, IV2, and IV3. When Var(R_p) = 0.0182 and W2 = 0.3, the quadratic formula from Equation (5.23) above yields two roots:
W11 = [0.0730 + √((0.0730)² − 0.0052)] / [2(0.0972)] = 0.4247

W12 = [0.0730 − √((0.0730)² − 0.0052)] / [2(0.0972)] = 0.3263

To solve for W11 and W12 we need b² > 4ac. Solving for the variance:

(0.104W2 − 0.1042)² > 4{(0.0972)[0.1119W2² − 0.103W2 + 0.0525 − Var(R_p)]}

If W2 and Var(R_p) are not selected so that b² > 4ac, Equation (5.23) will require taking the square root of a negative number, and mathematically the solution will be an imaginary number. Solving for W1 in terms of an imaginary number is not meaningful; therefore we consider only the case when b² > 4ac.
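These two roots, and any other point on an iso-variance ellipse, can be generated directly from Equation (5.23); a sketch:

```python
import math

def iso_variance_roots(w2, target_var):
    """Both W1 roots on the iso-variance ellipse (Equation (5.23))."""
    a = 0.0972
    b = 0.104 * w2 - 0.1042
    c = 0.1119 * w2**2 - 0.103 * w2 + 0.0525 - target_var
    disc = b * b - 4 * a * c
    if disc < 0:
        return None                    # no real portfolio here
    root = math.sqrt(disc)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

print(iso_variance_roots(0.3, 0.0182))   # approx (0.4247, 0.3263)
```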
The variance chosen was the minimum variance of the portfolio associated with E(R_p) = 0.0082. Actually, any variance might realistically have been chosen as long as it exceeded the minimum-risk portfolio variance. Table 5.7 lists various W11 and W12 values for given levels of W2 and Var(R_p). In addition, Table 5.7 presents the values of b² and 4ac to check whether or not the root is a real number. It should be noted that all possible variances are higher than the minimum-risk portfolio variance. Data from Table 5.7 are used to draw three iso-variance ellipses, as indicated in Figs. 5.12 and 5.13.

Table 5.7 Various W1 given W2 and Var(R_p)
W2     Var(R_p)   b         b²       c         4ac       W11      W12
0.28   0.0182    −0.0751   0.0056    0.0142    0.0055    0.4385   0.3339
0.29   0.0182    −0.0740   0.0055    0.0138    0.0054    0.4325   0.3293
0.30   0.0182    −0.0730   0.0053    0.0135    0.0052    0.4247   0.3263
0.31   0.0182    −0.0720   0.0052    0.0131    0.0051    0.4149   0.3254
0.32   0.0182    −0.0709   0.0050    0.0128    0.0050    0.4025   0.3272
0.28   0.0258    −0.0751   0.0056    0.0066    0.0026    0.6707   0.1017
0.29   0.0258    −0.0740   0.0055    0.0062    0.0024    0.6652   0.0965
0.30   0.0258    −0.0730   0.0053    0.0059    0.0023    0.6594   0.0916
0.31   0.0258    −0.0720   0.0052    0.0055    0.0021    0.6534   0.0870
0.32   0.0258    −0.0709   0.0050    0.0052    0.0020    0.6470   0.0827
0.28   0.0368    −0.0751   0.0056   −0.0044   −0.0017    0.8268  −0.0543
0.29   0.0368    −0.0740   0.0055   −0.0048   −0.0019    0.8213  −0.0596
0.30   0.0368    −0.0730   0.0053   −0.0051   −0.0020    0.8157  −0.0647
0.31   0.0368    −0.0720   0.0052   −0.0055   −0.0021    0.8099  −0.0696
0.32   0.0368    −0.0709   0.0050   −0.0058   −0.0023    0.8039  −0.0742

Fig. 5.12 Iso-variance ellipses

The critical line and efficient frontier. After the iso-expected return functions and iso-variance ellipses have been plotted, it is an easy task to delineate the efficient frontier. By definition, the efficient portfolio is the portfolio with the highest return for any given risk. In Fig. 5.13, the efficient portfolios are those where a given iso-expected return line is just tangent to a variance ellipse. MRP-ABC is denoted as the critical line; all portfolios that lie between points MRP and C are said to be efficient, and the weights of these portfolios may be read directly from the graph. From the graph, the portfolio weights for the portfolios that minimize variance, given a 0.82, 1.008, or 1.134% expected rate of return, are those listed in Table 5.6. It is possible, given these various weights, to calculate the E(R_p) and the variances of these portfolios, as indicated in Table 5.8. The efficient frontier is then developed by plotting each risk-return combination, as shown in Fig. 5.14.

Table 5.8 Weights, R_p, and σ_p for efficient points
Portfolio   W1       W2       W3       E(R_p)    Var(R_p)   σ(R_p)
A           0.3521   0.2577   0.3902   0.0082    0.0182     0.1349
B           0.1983   0.1510   0.6507   0.01008   0.0258     0.1606
C           0.0953   0.0795   0.8252   0.01134   0.0368     0.1918

Portfolios A, B, and C represent expected returns of 0.82, 1.008, and 1.134%, respectively

Fig. 5.13 Iso-variance ellipses and iso-return lines

Fig. 5.14 The efficient frontier for the data of Table 5.8

In this section a step-wise graphical approach has been employed to obtain optimal weights for a three-security portfolio. In the next section a mathematical optimization approach is used to calculate the optimal weights for a three-security portfolio.

Mathematical analysis. The same efficient frontier can be obtained mathematically through the use of Lagrangian multipliers. The Lagrangian method allows the minimization
or maximization of an objective function when the objective function is subject to some constraints. One of the goals of portfolio analysis is minimizing the risk or variance of the portfolio, subject to the portfolio's attaining some target expected rate of return, and also subject to the portfolio weights' summing to one. The problem can be stated mathematically:

Min σ_p² = Σ_{i=1}^{n} Σ_{j=1}^{n} W_i W_j σ_ij

Subject to:

(a) Σ_{i=1}^{n} W_i E(R_i) = E*, where E* is the target expected return, and

(b) Σ_{i=1}^{n} W_i = 1.0
The first constraint simply says that the expected return on the portfolio should equal the target return determined by the portfolio manager. The second constraint says that the weights of the securities invested in the portfolio must sum to one. The Lagrangian objective function can be written:

C = Σ_{i=1}^{n} Σ_{j=1}^{n} W_i W_j Cov(R_i, R_j) + λ1 (1 − Σ_{i=1}^{n} W_i) + λ2 [E* − Σ_{i=1}^{n} W_i E(R_i)]    (5.11)
Taking the partial derivatives of this equation with respect to each of the variables, W1, W2, W3, λ1, λ2, and setting the resulting five equations equal to zero yields the minimization of risk subject to the Lagrangian constraints. This system of five equations and five unknowns can be solved by the use of matrix algebra. Briefly, the matrix form of these equations is

[ 2σ11    2σ12    2σ13    1   E(R1) ] [ W1 ]   [ 0  ]
[ 2σ21    2σ22    2σ23    1   E(R2) ] [ W2 ]   [ 0  ]
[ 2σ31    2σ32    2σ33    1   E(R3) ] [ W3 ] = [ 0  ]    (5.24)
[ 1       1       1       0   0     ] [ λ1 ]   [ 1  ]
[ E(R1)   E(R2)   E(R3)   0   0     ] [ λ2 ]   [ E* ]

Therefore, it is possible to premultiply both sides of the matrix Equation (5.24), AW = K, by the inverse of A (denoted A⁻¹) and solve for the W column. This is possible because all values in the A and K matrices are known or arbitrarily set. The first step is the inversion of the matrix of coefficients: first, derive the determinant of matrix A; next, develop the signed matrix of cofactors for A and divide it by the determinant, yielding the inverse of the original matrix; finally, premultiply the column vector K by A⁻¹ to obtain the investment weights. Plugging the data listed in Table 5.4 and E* = 0.00106 into the matrix above yields

[ 0.0910  0.0018  0.0008  1  0.0053 ] [ W1 ]   [ 0       ]
[ 0.0018  0.1228  0.0020  1  0.0055 ] [ W2 ]   [ 0       ]
[ 0.0008  0.0020  0.1050  1  0.0126 ] [ W3 ] = [ 0       ]    (5.25)
[ 1       1       1       0  0      ] [ λ1 ]   [ 1       ]
[ 0.0053  0.0055  0.0126  0  0      ] [ λ2 ]   [ 0.00106 ]

When matrix A is properly inverted and postmultiplied by K, the solution vector A⁻¹K is derived:

W = A⁻¹K:  [W1, W2, W3, λ1, λ2]′ = [0.9442, 0.6546, −0.5988, −0.1937, 20.1953]′    (5.26)

Using the matrix operation on the other two arbitrary target returns yields analogous results. As shown in the solution (5.26), the weight of security 3 is negative, implying that this security should be sold short in order to generate an efficient portfolio. Therefore, both graphical and mathematical methods can be used to calculate the optimal weights. With the knowledge of the efficient-portfolio weights, given that E(R_p) is equal to 0.00106, 0.00212, and 0.00318, the variances of the efficient portfolios may be derived by plugging the numbers into Equation (5.2). Taking the square root of the variances to derive the standard deviations, the various risk-return combinations can be plotted and the efficient frontier graphed, as shown in Fig. 5.13 in the previous section. As can be seen from comparing this figure to the graphical derivation of the efficient frontier, the results are almost the same.
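Rather than inverting A by hand, the system (5.25) can be handed directly to a linear solver; a sketch that reproduces the solution vector (5.26), up to the rounding in the printed figures:

```python
import numpy as np

a = np.array([
    [0.0910, 0.0018, 0.0008, 1.0, 0.0053],
    [0.0018, 0.1228, 0.0020, 1.0, 0.0055],
    [0.0008, 0.0020, 0.1050, 1.0, 0.0126],
    [1.0,    1.0,    1.0,    0.0, 0.0],
    [0.0053, 0.0055, 0.0126, 0.0, 0.0],
])
k = np.array([0.0, 0.0, 0.0, 1.0, 0.00106])

w1, w2, w3, lam1, lam2 = np.linalg.solve(a, k)
print(w1, w2, w3)   # roughly 0.94, 0.66, -0.60, matching (5.26) up to rounding
```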
5.4.4 Portfolio Determination with Specific Adjustment for Short Selling

By using a definition of short sales developed by Lintner (1965) the computation procedure for the efficient frontier can be modified. Lintner defines short selling as putting up an amount of money equal to the value of the security sold short. Thus the short sale is a use rather than a source of funds to the short seller. The total funds the investor invests short, plus the funds invested long, must add up to the original investment. The proportion of funds invested short is |X_i|, since X_i < 0. The constraint in the minimization problem concerning the weights of the individual securities needs to be modified to incorporate this fact. Additionally, the final portfolio weights (output of the matrix inversion) must be rescaled so that the sum of the absolute values of the weights equals one. By defining short sales in this manner, the efficient frontier does not extend to infinity (as shown in Fig. 5.10a) but resembles the efficient frontier in Fig. 5.10b.

Monthly rates of return for Johnson & Johnson, American Express, and Exxon Mobil are used to perform the analysis in this section. The sample period is from January
2001 to September 2007. The Markowitz model determines optimal asset allocation by minimizing portfolio variance using a constrained optimization procedure:
Min Var(R_p) = Σ_{i=1}^{3} Σ_{j=1}^{3} W_i W_j σ_ij    (5.2)

Subject to:

(a) Σ_{i=1}^{3} W_i E(R_i) = E*

(b) Σ_{i=1}^{3} |W_i| = 1.0

where E* is the investor's desired rate of return, and where the absolute value of the weights |W_i| allows a given W to be negative (sold short) but maintains the requirement that all funds are invested, that is, that the weights sum to one. The Lagrangian function is

Min L = Σ_{i=1}^{n} Σ_{j=1}^{n} W_i W_j σ_ij + λ1 [Σ_{i=1}^{n} W_i E(R_i) − E*] + λ2 (Σ_{i=1}^{n} W_i − 1)

Again, derivatives with respect to the W_i's and λ's are found. Setting these equations equal to zero leaves the following system of equations in matrix form:

C X = K

[ 2σ11    2σ12   ...  2σ1n    E(R1)  1 ] [ W1 ]   [ 0  ]
[ 2σ21    2σ22   ...  2σ2n    E(R2)  1 ] [ W2 ]   [ 0  ]
[  :       :           :       :     : ] [ :  ] = [ :  ]
[ 2σn1    2σn2   ...  2σnn    E(Rn)  1 ] [ Wn ]   [ 0  ]
[ E(R1)   E(R2)  ...  E(Rn)   0      0 ] [ λ1 ]   [ E* ]
[ 1       1      ...  1       0      0 ] [ λ2 ]   [ 1  ]

To solve this matrix:

CX = K
C⁻¹CX = C⁻¹K
IX = C⁻¹K
X = C⁻¹K

where C⁻¹ is the inverse of matrix C and I is the identity matrix. The solution to the above formula will give the weights in terms of E*. By using Johnson & Johnson, American Express, and Exxon Mobil, a three-security portfolio will be formed and the optimal weights solved for.

Security   Monthly E(R_i)   σ_i²     σ12       σ13       σ23
JNJ        0.0053           0.0455   0.0009    0.0004    0.0010
AXP        0.0055           0.0614
XOM        0.0126           0.0525

Substituting these values into the matrix, with a desired return of E* = 0.013:

[ 0.0910  0.0018  0.0008  0.0053  1 ] [ W1 ]   [ 0  ]
[ 0.0018  0.1228  0.0020  0.0055  1 ] [ W2 ]   [ 0  ]
[ 0.0008  0.0020  0.1050  0.0126  1 ] [ W3 ] = [ 0  ]
[ 0.0053  0.0055  0.0126  0       0 ] [ λ1 ]   [ E* ]
[ 1       1       1       0       0 ] [ λ2 ]   [ 1  ]

By solving using the identity-matrix technique the weights for the three securities can be obtained:

Johnson & Johnson:  W_JNJ = −0.0406
American Express:   W_AXP = −0.0146
Exxon Mobil:        W_XOM = 1.0552

The following relationship is used to rescale these weights so that the second constraint – that the sum of the absolute values of the weights equal one – is satisfied:

W_A′ = W_A / (|W_A| + |W_B| + |W_C|)    (5.27)

The rescaled weights are:

W_JNJ = −0.0406 / (0.0406 + 0.0146 + 1.0552) = −0.0366
W_AXP = −0.0146 / (0.0406 + 0.0146 + 1.0552) = −0.0131
W_XOM = 1.0552 / (0.0406 + 0.0146 + 1.0552) = 0.9503

The return on this portfolio is:

R_p = (−0.0366)(0.0053) + (−0.0131)(0.0055) + (0.9503)(0.0126) = 0.0117, or about 1.17%

The variance:

σ_p² = (−0.0366)²(0.0455) + (−0.0131)²(0.0614) + (0.9503)²(0.0525)
       + 2(−0.0366)(−0.0131)(0.0009) + 2(−0.0366)(0.9503)(0.0004) + 2(−0.0131)(0.9503)(0.0010)
     = 0.0474
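The whole Lintner-style computation can be scripted end to end; a NumPy sketch that solves the system above, rescales by the sum of absolute weights per Equation (5.27), and reports the resulting portfolio moments:

```python
import numpy as np

cov = np.array([[0.0455, 0.0009, 0.0004],
                [0.0009, 0.0614, 0.0010],
                [0.0004, 0.0010, 0.0525]])
mu = np.array([0.0053, 0.0055, 0.0126])
target = 0.013

n = len(mu)
c = np.zeros((n + 2, n + 2))
c[:n, :n] = 2 * cov
c[:n, n] = mu            # lambda_1 column
c[:n, n + 1] = 1.0       # lambda_2 column
c[n, :n] = mu            # target-return row
c[n + 1, :n] = 1.0       # budget row
k = np.concatenate([np.zeros(n), [target, 1.0]])

w = np.linalg.solve(c, k)[:n]          # approx (-0.041, -0.015, 1.055)
w_scaled = w / np.abs(w).sum()         # Lintner rescaling, Eq. (5.27)
print(w_scaled, w_scaled @ mu, w_scaled @ cov @ w_scaled)
```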
5.4.5 Portfolio Determination Without Short Selling

The minimization problem under study can be modified to include the restriction of no short selling by adding a third constraint:

W_i ≥ 0,  i = 1, ..., N

The addition of this nonnegativity constraint precludes negative values for the weights (that is, no short selling). The problem now is a quadratic programming problem similar to the ones solved so far, except that the optimal portfolio may fall in an infeasible region. In this circumstance the next-best optimal portfolio that meets all of the constraints is selected. An example of this situation is shown in Fig. 5.15.8

In Fig. 5.15a the optimal portfolio is located in the region of positive values for the weights W_i, and thereby satisfies the constraint W_i ≥ 0. In Fig. 5.15b the unconstrained optimal portfolio falls in a region where W1 is negative, and so the constraint is not satisfied. The next-best optimal portfolio is shown by B, and this is the solution of the problem that satisfies the nonnegativity constraint. Additional discussion of these problems is presented in later chapters. Most recently, Lewis (1988) has developed a simple algorithm for the portfolio-selection problem. His method is based on an interactive use of the Markowitz critical-line method for solving quadratic programs. In Fig. 5.15a, b the vertical axis is defined as (R̄_p − R_f)/σ_p; this is the Sharpe performance measure, as defined in Chap. 4, and can be regarded as the objective function for portfolio selection. A sketch of a numerical no-short-sale solution follows.

Fig. 5.15 Optimal portfolio in feasible and infeasible regions

8 In Fig. 5.15a and b the vertical axis is defined as (R̄_p − R_f)/σ_p, which is the Sharpe performance measure as defined in Chap. 4; it is the objective function for portfolio optimization and will be explored in Chap. 8.
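For completeness, a minimal numerical sketch of the no-short-selling quadratic program, using scipy.optimize (the inputs are the chapter's Table 5.4 figures; the target return is our illustrative choice):

```python
import numpy as np
from scipy.optimize import minimize

cov = np.array([[0.0455, 0.0009, 0.0004],
                [0.0009, 0.0614, 0.0010],
                [0.0004, 0.0010, 0.0525]])
mu = np.array([0.0053, 0.0055, 0.0126])
target = 0.0082                       # illustrative target return

res = minimize(
    lambda w: w @ cov @ w,            # portfolio variance
    x0=np.full(3, 1 / 3),
    bounds=[(0.0, None)] * 3,         # W_i >= 0: no short selling
    constraints=[
        {"type": "eq", "fun": lambda w: w.sum() - 1.0},
        {"type": "eq", "fun": lambda w: w @ mu - target},
    ],
)
print(res.x.round(4), res.x @ cov @ res.x)
```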
5.5 Conclusion

This chapter has focused on the foundations of Markowitz's model and on the derivation of the efficient frontier through the creation of efficient portfolios of varying risk and return. It has been shown that an investor can increase expected utility through portfolio diversification as long as there is no perfect positive correlation among the component securities. The extent of the benefit increases as the correlation is lower, and it also increases with the number of securities included. The Markowitz model can be applied to develop the efficient frontier that delineates the optimal portfolios matching the greatest return with a given amount of risk. Also, this frontier shows the dominant portfolios as having the lowest risk given a stated return. This chapter has included methods of solving for the efficient frontier both graphically and through a combination of calculus and matrix algebra, with and without explicitly incorporating short selling. The next chapter will illustrate how the crushing computational load involved in implementing the Markowitz model can be alleviated through the use of the index portfolio, and how the tenets of the Markowitz efficient frontier are still met.
References

Alexander, G. J. and J. C. Francis. 1986. Portfolio analysis, Prentice-Hall, Englewood Cliffs, NJ.
Baumol, W. J. 1963. "An expected gain-confidence limit criterion for portfolio selection." Management Science 10, 174–182.
Bertsekas, D. 1974. “Necessary and sufficient conditions for existence of an optimal portfolio.” Journal of Economic Theory 8, 235–247. Black, F. 1972. “Capital market equilibrium with restricted borrowing.” Journal of Business 45, 444–455. Blume, M. 1970. “Portfolio theory: a step toward its practical application.” Journal of Business 43, 152–173.
Bodie, Z., A. Kane, and A. Marcus. 2006. Investments, 7th Edition, McGraw-Hill, New York.
Brealey, R. A. and S. D. Hodges. 1975. "Playing with portfolios." Journal of Finance 30, 125–134.
Breen, W. and R. Jackson. 1971. "An efficient algorithm for solving large-scale portfolio problems." Journal of Financial and Quantitative Analysis 6, 627–637.
Brennan, M. J. 1975. "The optimal number of securities in a risky asset portfolio where there are fixed costs of transaction: theory and some empirical results." Journal of Financial and Quantitative Analysis 10, 483–496.
Cohen, K. and J. Pogue. 1967. "An empirical evaluation of alternative portfolio-selection models." Journal of Business 40, 166–193.
Dyl, E. A. 1975. "Negative betas: the attractions of selling short." Journal of Portfolio Management 1, 74–76.
Elton, E. J. and M. Gruber. 1974. "Portfolio theory when investment relatives are log normally distributed." Journal of Finance 29, 1265–1273.
Elton, E. J. and M. Gruber. 1978. "Simple criteria for optimal portfolio selection: tracing out the efficient frontier." Journal of Finance 33, 296–302.
Elton, E. J., M. Gruber, and M. E. Padberg. 1976. "Simple criteria for optimal portfolio selection." Journal of Finance 31, 1341–1357.
Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2006. Modern portfolio theory and investment analysis, 7th Edition, Wiley, New York.
Evans, J. and S. Archer. 1968. "Diversification and the reduction of dispersion: an empirical analysis." Journal of Finance 23, 761–767.
Fama, E. F. 1970. "Efficient capital markets: a review of theory and empirical work." Journal of Finance 25, 383–417.
Feller, W. 1968. An introduction to probability theory and its applications, Vol. 1, Wiley, New York.
Francis, J. C. and S. H. Archer. 1979. Portfolio analysis, Prentice-Hall, Englewood Cliffs, NJ.
Gressis, N., G. Philippatos, and J. Hayya. 1976. "Multiperiod portfolio analysis and the inefficiencies of the market portfolio." Journal of Finance 31, 1115–1126.
Henderson, J. and R. Quandt. 1980. Microeconomic theory: a mathematical approach, 3rd Edition, McGraw-Hill, New York.
Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York.
Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics, World Scientific Publishing Co, Singapore.
Levy, H. and M. Sarnat. 1971. "A note on portfolio selection and investors' wealth." Journal of Financial and Quantitative Analysis 6, 639–642.
Lewis, A. L. 1988. "A simple algorithm for the portfolio selection problem." Journal of Finance 43, 71–82.
Lintner, J. 1965. "The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets." Review of Economics and Statistics 47, 13–37.
Maginn, J. L., D. L. Tuttle, J. E. Pinto, and D. W. McLeavey. 2007. Managing investment portfolios: a dynamic process, CFA Institute Investment Series, 3rd Edition, Wiley, New Jersey.
Mao, J. C. F. 1969. Quantitative analysis of financial decisions, Macmillan, New York.
Markowitz, H. M. 1952. "Portfolio selection." Journal of Finance 7, 77–91.
Markowitz, H. M. 1959. Portfolio selection. Cowles Foundation Monograph 16, Wiley, New York.
Markowitz, H. M. 1976. "Markowitz revisited." Financial Analysts Journal 32, 47–52.
Markowitz, H. M. 1987. Mean-variance analysis in portfolio choice and capital markets. Blackwell, Oxford.
Martin, A. D., Jr. 1955. "Mathematical programming of portfolio selections." Management Science 1, 152–166.
Merton, R. 1972. "An analytical derivation of the efficient portfolio frontier." Journal of Financial and Quantitative Analysis 7, 1851–1872.
Mossin, J. 1968. "Optimal multiperiod portfolio policies." Journal of Business 41, 215–229.
Ross, S. A. 1970. "On the general validity of the mean-variance approach in large markets," in Financial economics: essays in honor of Paul Cootner, W. F. Sharpe and C. M. Cootner (Eds.). Prentice-Hall, Englewood Cliffs, NJ, pp. 52–84.
Sharpe, W. F. 1970. Portfolio theory and capital markets, McGraw-Hill, New York.
Von Neumann, J. and O. Morgenstern. 1947. Theory of games and economic behavior, 2nd Edition, Princeton University Press, Princeton, NJ.
Wackerly, D., W. Mendenhall, and R. L. Scheaffer. 2007. Mathematical statistics with applications, 7th Edition, Duxbury Press, California.
Chapter 6
Capital Asset Pricing Model and Beta Forecasting Cheng-Few Lee, Joseph E. Finnerty, and Donald H. Wort
Abstract In this chapter, using the concepts of portfolio analysis and the dominance principle, we derive the capital asset pricing model (CAPM). Then we show how total risk can be decomposed into systematic risk and unsystematic risk. Finally, we discuss the determination of beta and introduce different methods for forecasting the beta coefficient.

Keywords Capital asset pricing model · Beta · Capital market line · Market risk premium · Security market line · Market model · Capital labor ratio · Efficient market hypothesis · Market beta · Accounting beta · Systematic risk · Unsystematic risk
6.1 Introduction

One of the important financial theories is the capital asset pricing model, commonly referred to as the CAPM, which constitutes the major topic of this chapter; in addition, alternative methods for forecasting beta (systematic risk) are discussed. Using the concepts of basic portfolio analysis and the dominance principle discussed in the last two chapters, two alternative methods are employed to derive the CAPM. The discussion concerns how the market model can be used to decompose (that is, divide) risk into two components, systematic and unsystematic risk. In addition, applications of the beta coefficient and procedures for forecasting the beta coefficient are presented. A graphical approach is first utilized to derive the CAPM, after which a mathematical approach to the derivation is developed that illustrates how the market model can be used to decompose total risk into two components. This is followed

J.E. Finnerty (✉) University of Illinois at Urbana-Champaign, Champaign, IL, USA
e-mail: [email protected]

C.-F. Lee Rutgers University, New Brunswick, NJ, USA

D.H. Wort University of California, Hayward, CA, USA
by a discussion of the importance of beta in security analysis and further exploration of the determination and forecasting of beta. The discussion closes with the applications and implications of the CAPM. The appendix offers empirical evidence of the risk-return relationship.
6.2 A Graphical Approach to the Derivation of the Capital Asset Pricing Model

Following the risk-return tradeoff principle and the portfolio diversification process, Sharpe (1964), Lintner (1965), and Mossin (1966) have developed an asset pricing model that can determine both the market price of a portfolio and the price of an individual security. They focus on the pricing determination of those parts of security risk that can be eliminated through diversification as well as those that cannot. Systematic risk is that part of total risk that results from the common variability of stock prices and the subsequent tendency of stock prices to move together with the general market. The other portion of total risk is unsystematic risk, the result of variables peculiar to the firm or industry; for example, a labor strike or resource shortage. Systematic risk, also referred to as market risk, reflects the swings of the general market. Some stocks and portfolios can be very sensitive to movements in the market, while others show more independence and stability. The universally accepted notion for the measure of a stock's or a portfolio's relative sensitivity to the market based upon its past record is the Greek letter beta (β). The estimation, application, and forecasting of beta are discussed in the following sections.
6.2.1 The Lending, Borrowing, and Market Portfolios

The previous chapters assumed that the efficient frontier was constructed with risky assets only. If there are both risk-free
Fig. 6.1 The capital market line
and risky assets, investors would then have the choice of investing in one or the other, or in some combination of the two. The concept of the risk-free asset is generally proxied by a government security, such as the Treasury bill (T-bill). T-bills are backed by the Federal government and are default free, hence considered riskless. An investor's portfolio can be composed of different combinations of riskless and risky assets. Figure 6.1 is a graph of different sets of portfolio opportunities; it includes the risk-free asset with a return of R_f. Since the riskless asset has zero risk, it is represented by a point on the vertical axis. With the additional alternative to invest in risk-free assets that yield a return of R_f, the investor is able to create new combinations of portfolios that combine risk-free assets with the risky assets. The investor is thus able to achieve any combination of risk and return that lies along the line connecting R_f and a tangent point M_p, the market portfolio. All portfolios along the line R_f M_p C are preferred to the risky portfolio opportunities on the curve A M_p B. Therefore, the points on the line R_f M_p C represent the best attainable combinations of risk and return.

At point R_f the investor has all available funds invested in the riskless asset and expects to receive the return of R_f. The portfolios along the line R_f M_p are lending portfolios and contain combinations of investments in the risk-free asset and investments in a portfolio of risky assets M_p. These are called lending portfolios because investors are, in effect, lending money to the government at the risk-free rate when they invest in the risk-free asset. At point M_p the investor wants only risky assets and has put his wealth into the risky-asset portfolio, which is called the market portfolio. The investor at this point is neither lending nor borrowing. An alternative view is that at this point the investor may be lending and borrowing equal amounts that just offset each other. At M_p investors receive a rate of return R_m and undertake risk σ_m.
If it is assumed that the investor can borrow money at the risk-free rate R_f and invest this money in the risky portfolio M_p, he will be able to derive portfolios with higher rates of return, but with higher risks, along the line extending beyond M_p to C. The portfolios along line M_p C are borrowing portfolios. They are so called because they contain a negative amount of the risk-free asset. The negative amount invested in the risk-free asset can be viewed as borrowing funds at the risk-free rate and investing in risky assets. Borrowing for investment is called margin and is controlled by the government. Therefore, the new efficient frontier becomes R_f M_p C and is referred to as the capital market line (CML). The capital market line describes the relationship between expected return and total risk.
6.2.2 The Capital Market Line
An illustration and explanation of the capital market line have already been provided; the equation for the CML is

E(R_p) = R_f + \frac{E(R_m) - R_f}{\sigma_m}\,\sigma_p    (6.1)

where: R_f = the risk-free rate; R_m = the return on the market portfolio M_p; R_p = the return on the portfolio consisting of the risk-free asset and portfolio M_p; and σ_p, σ_m = the standard deviations of the portfolio and the market, respectively.
The implementation of Equation (6.1) can be explained graphically. An investor has three choices in terms of investments. He may invest in the riskless asset, in the market portfolio M_p, or in any other efficient portfolio along the capital market line, such as portfolio P of Fig. 6.1. If the investor puts his money into the riskless asset he can receive a certain return of R_f; if his investments are put into the market portfolio he can expect an average return of R_m and risk of σ_m; if he invests in portfolio P he can expect an average return of R_p with risk of σ_p. The difference between R_m and R_f, that is R_m − R_f, is called the market risk premium. The investor in portfolio P takes on only risk σ_p, so his risk premium is R_p − R_f. This is less than the market risk premium because the investor is taking on a smaller amount of risk, σ_p < σ_m. By geometric theory, triangles R_f PF and R_f M_p D are similar; consequently, they are directly proportional. Thus:

\frac{E(R_p) - R_f}{\sigma_p} = \frac{E(R_m) - R_f}{\sigma_m}

Solving this proportion for E(R_p) yields Equation (6.1).
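As a quick numerical check of Equation (6.1), the following minimal sketch computes the expected return of an efficient portfolio on the CML; all input values are hypothetical.

```python
# Expected return on the CML (Equation (6.1)); inputs are hypothetical.
def cml_expected_return(rf, e_rm, sigma_m, sigma_p):
    """E(Rp) = Rf + [(E(Rm) - Rf) / sigma_m] * sigma_p."""
    return rf + (e_rm - rf) / sigma_m * sigma_p

# A portfolio bearing half the market's total risk earns half the market risk premium:
print(round(cml_expected_return(rf=0.06, e_rm=0.10, sigma_m=0.20, sigma_p=0.10), 3))  # 0.08
```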
At equilibrium all investors will want to hold a combination of the risk-free asset and the tangency portfolio M_p. Since the market clears at equilibrium – that is, prices are such that the demand for all marketable assets equals their supply – the tangency portfolio M_p must represent the market portfolio. An individual security's proportional makeup X_i in the market portfolio will be the ratio of its market value (its equilibrium price times the total number of shares outstanding) to the total market value of all securities in the market, or:

X_i = \frac{\text{Market value of individual asset}}{\text{Market value of all assets}}    (6.2)
Thus, at equilibrium, prices are such that supply is equated to demand, and all securities in the market will be represented in the market portfolio according to their market value. Given the capital market line (CML), and assuming that investors are interested only in the mean and variance of return, the CML will dominate all other attainable portfolios and securities because of the dominance principle, as discussed in Chap. 4. This implies that investors will attempt to put some portion of their wealth into the market portfolio of risky assets, depending on their individual risk preference. Since it has been established that the market portfolio is the only relevant portfolio of risky assets, the relevant risk measure of any individual security is its contribution to the risk of the market portfolio. The beta coefficient relates the covariance between security and market to the market's total variance, and it is the relevant market-risk measure. This will be explored in the next two sections. Notationally, beta is:

\beta_i = \frac{\sigma_{im}}{\sigma_m^2} = \frac{\mathrm{Cov}(R_i, R_m)}{\mathrm{Var}(R_m)}
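A minimal sketch of this definition, computed with NumPy sample moments on hypothetical monthly return series:

```python
import numpy as np

# Hypothetical monthly returns for a security (ri) and the market (rm).
ri = np.array([0.021, -0.013, 0.034, 0.008, -0.022, 0.041])
rm = np.array([0.012, -0.018, 0.025, 0.010, -0.015, 0.030])

# Beta = Cov(Ri, Rm) / Var(Rm); ddof=1 keeps both moments on a sample basis,
# matching np.cov's default (n - 1) normalization.
beta = np.cov(ri, rm)[0, 1] / np.var(rm, ddof=1)
print(round(beta, 3))
```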
Example 6.1 provides further illustration about risk/return tradeoffs.
Example 6.1. Describe the kinds of assets that could characterize the risk/return tradeoffs depicted by A, B, and C in the figure.
Solution. A is a correctly priced asset, perhaps a stock or a portfolio. B is an overpriced asset, perhaps a bond selling at too large a premium. C is an asset with expected negative return – for example, a lottery ticket. The price of the lottery ticket is too high to be justified by the expected value of winning the jackpot. Even though the jackpot may be large, the probability of winning it is very small; hence, its expected value is small. Therefore, the cost of the ticket is greater than the expected value of winning, which yields a negative return.
6.2.3 The Security Market Line: The Capital Asset Pricing Model
An asset's systematic risk with the market, β, is the only relevant risk measure of capital asset pricing for the individual asset and the portfolio. Consequently, a derivation of the relationship between systematic risk and return can be made where the expected linear relationship between these two variables is referred to as the security market line (SML), illustrated by Fig. 6.2. To derive the graphical picture of the SML it is necessary to list a few assumptions concerning the investors and the securities market.
1. Investors are risk averse.
2. The CAPM is a one-period model because it is assumed that investors maximize the utility of their end-of-period wealth.
3. All investors have the same efficient frontier – that is, they have homogeneous expectations concerning asset returns and risk.
4. Portfolios can be characterized by their means and variances.
Fig. 6.2 The capital asset pricing model showing the security market line
5. There exists a risk-free asset with a return R_f, the rate at which all investors borrow or lend; the borrowing rate is equal to the lending rate.
6. All assets are marketable and perfectly divisible, and their supplies are fixed.
7. There are no transaction costs.
8. Investors have all information available to them at no cost.
9. There are no taxes or regulations associated with trading.
All individual assets or portfolios will fall along the SML. The position of an asset will depend on its systematic risk, or beta. The required risk premium for the market is R_m − R_f, and β_m is equal to one. Given the SML, the return on a risky asset is equal to:

E(R_i) = R_f + \beta_i [E(R_m) - R_f]    (6.3)
where: E(R_i) = the expected rate of return for asset i; R_f = the expected risk-free rate; β_i = the measure of normalized systematic risk (beta) of asset i; and E(R_m) = the expected return on the market portfolio. (Note: This security market line is also generally called the capital asset pricing model (CAPM). The mathematical derivation of this model will be shown in the next section.) The relationship between the CML and the SML can be seen by rearranging the definition of the beta coefficient:

\beta_i = \frac{\sigma_{im}}{\sigma_m^2} = \frac{\mathrm{Cov}(R_i, R_m)}{\mathrm{Var}(R_m)} = \frac{\rho_{i,m}\sigma_i\sigma_m}{\sigma_m^2} = \frac{\rho_{i,m}\sigma_i}{\sigma_m}    (6.4)

where: σ_i = the standard deviation of the security's rate of return; σ_m = the standard deviation of the market rate of return; ρ_{i,m} = the correlation coefficient of R_i and R_m; and

\rho_{i,m} = \frac{\sigma_{im}}{\sigma_i \sigma_m}

If ρ_{i,m} = 1, then Equation (6.3) reduces to:

E(R_i) = R_f + \frac{E(R_m) - R_f}{\sigma_m}\,\sigma_i    (6.3′)
If ρ_{i,m} = 1, this implies that the portfolio is an efficient portfolio. If i is an individual security, it implies that the returns and risks associated with the asset are perfectly correlated with the market as a whole. There are several implications:
1. Equation (6.3) is a generalized case of Equation (6.3′).
2. The SML, instead of the CML, should be used to price an individual security or an inefficient portfolio.
3. The CML prices the risk premium in terms of total risk, while the SML prices the risk premium in terms of systematic risk.
Example 6.2 provides further illustration.
Example 6.2. Suppose the expected return on the market portfolio is 10% and that R_f = 6%. Further, suppose that you are confronted with an investment opportunity to buy a security with an expected return of 12% and β = 1.2. Should you undertake this investment?
Solution. Write the SML as R_i = a + bβ_i. For the market portfolio, with β = 1, we have

0.10 = a + b(1) = a + b

and for the risk-free rate, with β = 0, we have

0.06 = a + b(0) = a

Solving these equations for a and b yields a = 0.06 and b = 0.04. For the security with β = 1.2, the expected return given by the SML is

R_i = 0.06 + 0.04(1.2) = 0.108

Since 12% > 10.8%, the security is undervalued; that is, the security should be purchased, since its expected return is above the equilibrium return of 10.8% for that level of risk.
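The arithmetic of Example 6.2 can be reproduced in a few lines; the numbers are those of the example itself.

```python
# Fit the SML through (beta = 0, 0.06) and (beta = 1, 0.10), then price beta = 1.2.
a = 0.06                 # intercept: the risk-free rate
b = 0.10 - 0.06          # slope: the market risk premium
required = a + b * 1.2   # equilibrium return for beta = 1.2
print(round(required, 3))                                   # 0.108
print("undervalued" if 0.12 > required else "overvalued")   # undervalued
```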
6.3 Mathematical Approach to the Derivation of the Capital Asset Pricing Model
Using the assumptions about efficient and perfect markets stated earlier in the chapter, we can show how Sharpe derived the capital asset pricing model (CAPM). Sharpe (1964) used a general risky asset that did not lie along the CML and dubbed it i. Risk and return for the possible combinations of security i with the market portfolio M are shown in Fig. 6.3. The average return and standard deviation for any i–M combination can be approached in the Markowitz fashion for a two-asset case:

(i)  E(R_p) = w_i E(R_i) + (1 - w_i) E(R_m)
(ii) \sigma(R_p) = [w_i^2 \sigma_i^2 + (1 - w_i)^2 \sigma_m^2 + 2 w_i (1 - w_i) \sigma_{im}]^{1/2}    (6.5)
Fig. 6.3 The opportunity set provided by combinations of risky asset i and the market portfolio M
in which w_i represents excess demand for i, or demand greater than its equilibrium weight in portfolio M. The changes in mean and standard deviation as the proportion w_i changes are represented by:

\frac{\partial E(R_p)}{\partial w_i} = E(R_i) - E(R_m)

\frac{\partial \sigma(R_p)}{\partial w_i} = \frac{1}{2}\,[w_i^2 \sigma_i^2 + (1 - w_i)^2 \sigma_m^2 + 2 w_i (1 - w_i) \sigma_{im}]^{-1/2}\,[2 w_i \sigma_i^2 - 2\sigma_m^2 + 2 w_i \sigma_m^2 + 2\sigma_{im} - 4 w_i \sigma_{im}]    (6.6)
When w_i = 0, the ith security is held in proportion to its total market value, and there is no excess demand for security i. This is the key insight of Sharpe's argument, for when w_i = 0, it is possible to equate the slope of the curve iMi′ with that of the capital market line, thus obtaining an expression for the return on any risky security i. At equilibrium, when w_i = 0, the slope along the iMi′ curve will be equal to
\frac{\partial E(R_p)}{\partial \sigma(R_p)} = \frac{\partial E(R_p)/\partial w_i}{\partial \sigma(R_p)/\partial w_i} = \frac{E(R_i) - E(R_m)}{(\sigma_{im} - \sigma_m^2)/\sigma_m}    (6.7)
The slope of the capital market line at point M is

\frac{E(R_m) - R_f}{\sigma_m}    (6.8)

Setting Equation (6.7) equal to Equation (6.8) and rearranging the terms to solve for E(R_i) gives the equation for the security market line, or CAPM:

E(R_i) = R_f + [E(R_m) - R_f]\,\frac{\sigma_{im}}{\sigma_m^2}    (6.9)

This represents the return on any risky asset i. At equilibrium, every risky asset will be priced so that it lies along the security market line. It should be noted that the term σ_{im}/σ_m² represents the beta coefficient for the regression of R_i versus R_m, so that Equation (6.9) can be rewritten as

E(R_i) = R_f + [E(R_m) - R_f]\,\beta_i    (6.10)

which is the formula for the CAPM.

6.4 The Market Model and Risk Decomposition
To use the capital asset pricing model, the market model must be employed to estimate beta (the systematic risk). In addition, the market model can be used to carry out risk decomposition. Both the market model and risk decomposition are discussed in this section.
6.4.1 The Market Model
As noted previously, the total risk of a portfolio or security can be conceived as the sum of its systematic and unsystematic risks. The equation that expresses these concepts states that the return on any asset at time t can be expressed as a linear function of the market return at time t plus a random error component. Thus, the market model is expressed as:

R_{i,t} = \alpha_i + \beta_i R_{m,t} + e_{i,t}    (6.11)

where: R_{i,t} = the return on the ith security in time t; α_i = the intercept of the regression; β_i = the slope; R_{m,t} = the market return at time t; and e_{i,t} = the random error term.
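A minimal sketch of estimating Equation (6.11) by ordinary least squares follows; the return series are hypothetical placeholders rather than actual data.

```python
import numpy as np

# Hypothetical monthly returns for security i and the market.
ri = np.array([0.030, -0.012, 0.022, 0.051, -0.024, 0.013])
rm = np.array([0.020, -0.020, 0.010, 0.040, -0.010, 0.000])

# Characteristic line: regress Ri,t on Rm,t.
# np.polyfit with degree 1 returns the coefficients (slope, intercept).
beta_i, alpha_i = np.polyfit(rm, ri, 1)
print(round(alpha_i, 4), round(beta_i, 4))
```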
6.4.2 Risk Decomposition
The regression model of Equation (6.11), which describes the risky asset's characteristics relative to the market portfolio, is also called the characteristic line. Since beta is the slope coefficient of this regression (the market model), it demonstrates how responsive returns for the individual earning asset are to the market portfolio. By using the market model, the total variance for security i, σ_i², can be decomposed into:

\sigma_i^2 = \beta_i^2 \sigma_m^2 + \sigma_{e_i}^2    (6.12)
in which β_i²σ_m² is the systematic-risk component of total risk and σ_{e_i}² is the unsystematic component. The CAPM as developed here is expressed in terms of expected values. Since expected values are not directly measured, we must transform the CAPM into an expression that uses observable variables. Assuming that on average expected returns E(R_{i,t}) for a security equal realized returns, returns can be expressed as:

R_{i,t} = E(R_{i,t}) + \beta_i [R_{m,t} - E(R_{m,t})] + e_{i,t}    (6.13)
in which e_{i,t} is a random error term. Assuming that the expected value of the error term is zero, that it is uncorrelated with the term [R_{m,t} − E(R_{m,t})], and that Cov(e_{i,t}, e_{i,t−1}) = 0, the expression for E(R_{i,t}) can be substituted from the CAPM Equation (6.10) into Equation (6.13):

R_{i,t} = R_{f,t} + [E(R_{m,t}) - R_{f,t}]\,\beta_i + [R_{m,t} - E(R_{m,t})]\,\beta_i + e_{i,t}

Simplifying this equation:

R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + e_{i,t}    (6.14)
where α_i is the intercept. All variables of Equation (6.14) can be estimated from observed data. Equation (6.14) is called the risk-premium version of the market model. It is similar to the market model indicated in Equation (6.11), except that instead of using the total returns R_i and R_m, it uses the risk-premium portion of the returns, R_i − R_f and R_m − R_f. Example 6.3 provides further illustration.
Example 6.3. To show how Equations (6.11) and (6.12) can be used, we will use monthly return data for Exxon Mobil and American Express. The time period covers January 2000 to November 2007. Their average return, beta coefficient, total variance, and residual variance are as indicated in the table below. We can see that American Express, with β = 1.26176, is more sensitive to the fluctuations of the market than Exxon Mobil, with β = 0.64647.
                    Return    Beta       Residual variance    Total variance
Exxon Mobil         0.0126    0.64647    0.00214              0.00275
American Express    0.0055    1.26176    0.00138              0.00377
Both the total variance and the beta of American Express were larger than those of Exxon Mobil, while the average rate of return for Exxon Mobil was higher than that of American Express. Hence, Exxon Mobil was the more desirable security, if those were the only two choices the investor had.
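The decomposition in Equation (6.12) can be verified numerically: with OLS estimates, the systematic and residual variances sum exactly to the total sample variance, since OLS residuals are uncorrelated with the regressor in sample. The series below are hypothetical.

```python
import numpy as np

ri = np.array([0.030, -0.012, 0.022, 0.051, -0.024, 0.013])  # hypothetical security returns
rm = np.array([0.020, -0.020, 0.010, 0.040, -0.010, 0.000])  # hypothetical market returns

beta_i, alpha_i = np.polyfit(rm, ri, 1)
residuals = ri - (alpha_i + beta_i * rm)

systematic = beta_i**2 * np.var(rm, ddof=1)   # beta^2 * sigma_m^2
unsystematic = np.var(residuals, ddof=1)      # residual variance
print(np.isclose(systematic + unsystematic, np.var(ri, ddof=1)))  # True
```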
6.4.3 Why Beta Is Important for Security Analysis
Implications and applications of beta coefficients in security analysis will be discussed in this section. Beta is used as a measure of risk: it gauges the sensitivity of a stock or portfolio relative to the market, as indicated in Equation (6.11). A beta of 1.0 is characteristic of a broad market index such as the NYSE index or the S&P 500. A beta of 2.0 indicates that the stock will swing twice as far, in either direction, as the market average. If the market gains 15%, a security with beta = 2 is expected to gain 30%. Conversely, should the market fall 15%, the stock is expected to fall 30%. A beta of 0.50 indicates that the stock is more stable than the market and will move only half as much as the market. For example, if the market loses 20%, a stock with beta = 0.50 will lose only 10%, and if the market should gain 20%, the stock will gain only 10%. High-beta stocks are classified as aggressive, while low-beta stocks are referred to as defensive. From the table in Example 6.3 it is clear that American Express is an aggressive stock while Exxon Mobil is a defensive stock.
The CAPM has illustrated the concept that risks are associated with portfolios, where the relevant (systematic) risk of an individual security depends upon the security's effect on portfolio risk. Therefore, the CAPM equation E(R_i) = R_f + β_i[E(R_m) − R_f] represents a description of how rates of return are established in the marketplace, assuming investors behave according to the assumptions of the model. Thus a stock's beta measures its contribution to the risk of the portfolio and is therefore a measure of the stock's riskiness relative to the market.
In the previous section it was found that the rates of return of stock i are presumed to bear a linear relationship with the market rate of return, as defined in Equation (6.11). Equation (6.11) is a fixed-coefficient market model. Following Fabozzi and Francis (1978), Sunder (1980), and Lee and Chen (1980), a random-coefficient market model can be defined as Equation (6.15a) or Equation (6.15b):

R_{it} = \alpha_i + \beta_{it} R_{mt} + e_{it}    (6.15a)
R_{it} - R_{ft} = \alpha_i' + \beta_{it}' (R_{mt} - R_{ft}) + e_{it}    (6.15b)
in which β_{it} = β_i + γ_{it}; γ_{it} represents the random fluctuation associated with the beta coefficient. Using the random-coefficient market model, the total risk can be decomposed into three components, as defined in Equation (6.16):

\sigma_i^2 = \beta_i^2 \sigma_m^2 + \sigma_{\varepsilon_i}^2 + \sigma_{\gamma}^2 \sigma_m^2    (6.16)
in which σ_γ²σ_m² represents an interaction risk between the market and the random fluctuation of the β. Equation (6.16)
is a generalized case of Equation (6.12). It contains systematic risk, unsystematic risk, and a risk term that reflects the interaction between the systematic and unsystematic risk. Both fixed-coefficient and random-coefficient market models can be used to do security analysis and portfolio selection, as discussed later in this text. The relationship between total risk, market risk, and firm-specific risk can be shown as:

Total risk = Market risk + Firm-specific risk

Since firm-specific risk can be eliminated by diversification:

Relevant risk = Market risk

Beta is important for the investment manager because it can be used to (1) select individual stocks for investment, (2) construct portfolios of financial assets with desired levels of risk and return, and (3) evaluate the performance of portfolio managers. Details of the various uses of beta and the CAPM are provided later in this text. Example 6.4 provides further illustration.
Example 6.4. Given the SML R_i = 0.06 + 0.08β_i, what should the expected return of a security be if it has a β twice as great as that of a similar security returning 18%?
Solution. Setting

R_i = 0.06 + 0.08\beta = 0.18

and solving for β yields β = 1.5. Therefore, the security's β is 2 × 1.5 = 3.0 and the required return is:

R_i = 0.06 + 0.08(3.0) = 0.30

A 30% return is required.
6.4.4 Determination of Systematic Risk
As mentioned above, systematic risk is nondiversifiable because it runs across industries and companies and affects all securities. Previous chapters discussed many factors that determine risk. At this point we are interested only in those management decisions that can be related to the degree of systematic risk a firm exhibits; namely, the operating decisions and the financing decisions of management.
There are two dimensions of risk that affect a firm's systematic risk. The first is financial risk, the additional risk placed on the firm and its stockholders due to the firm's decision to be leveraged – that is, to take on additional debt. The second, business risk, is the riskiness involved in a firm's operations if it takes on no debt.
A particular firm's capital structure affects the riskiness inherent in the company's common stock and thus affects its required rate of return and the price of the stock. A company's capital-structure policy requires choosing between risk and return. Taking on increasing levels of debt increases the riskiness of the firm's earnings stream, but it usually also results in a higher expected rate of return. High levels of risk tend to lower a stock's price, but a high level of expected rates of return tends to raise it. Therefore, striking a balance with the optimal capital structure maximizes the price of the stock.
Business risk is the risk inherent in a firm's operations. It can also be defined as the uncertainty inherent in projections of future operating income, or earnings before interest and taxes (EBIT). Fluctuations in EBIT can result from a number of factors. On the national level these can be economic factors such as inflationary or recessionary times. At the industry level some factors may be the level of competition between similar industries, natural or man-made catastrophes, labor strikes, price controls, and so on. There are a host of possibilities that affect EBIT by raising or lowering its level. Uncertainty regarding future income flows is a function of the company's business risk. It may fluctuate among industries, among firms, and across time. The extent of business risk is dependent on the firm and the industry. Cyclical industries such as steel production or automobile manufacture have especially high business risks because they are dependent on the strength of the economy. The retail food industry is considered to be quite stable because food is a necessary good that will be purchased regardless of the state of the economy. Business risk is dependent on several factors, the more important of which are listed here:
1. Demand variability: Stability in the levels of demand for the firm's product results in a reduction of business risk.
2. Sales price variability: Highly volatile prices result in high-risk, volatile markets; therefore, stability of prices results in reduction of business risk.
3. Suppliers' price variability: Firms whose input prices are highly variable are exposed to higher levels of risk.
4. Output price flexibility relative to input prices: As a result of inflation, a firm that is able to raise its output prices with increasing input costs minimizes its business risk.
When a firm uses debt or financial leverage, business risk and financial risk are concentrated on the stockholders. For
example, if a firm is capitalized only with common equity, then the investors all share the business risk in proportion to their ownership of stock. If, however, a firm is 50% levered (50% of the corporation is financed by debt, the other half by common equity), the investors who put up the equity then have to bear all the business risk and some financial risk. The effect of leverage on return on assets (ROA) and return on equity (ROE), and its effect on the stockholders, can be generalized as follows:
1. The use of leverage or debt generally increases ROE.
2. The standard deviation of ROA (σ_ROA) is a measure of business risk, while the standard deviation of ROE (σ_ROE) is a measure of the risk borne by stockholders. σ_ROA = σ_ROE if the firm is not levered; otherwise, with the use of debt, σ_ROE > σ_ROA, an indication that business risk is being borne by stockholders.
3. The difference between σ_ROE and σ_ROA is the actual risk stockholders face and a measure of the increased risk resulting from financial leverage. Thus:

Risk of financial leverage = \sigma_{ROE}^2 - \sigma_{ROA}^2
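A small sketch of this leverage-risk measure, using hypothetical ROA and ROE histories for a levered firm:

```python
import numpy as np

roa = np.array([0.08, 0.05, 0.09, 0.04, 0.07])  # hypothetical return on assets by year
roe = np.array([0.12, 0.03, 0.14, 0.01, 0.10])  # hypothetical return on equity (levered firm)

# Incremental risk from leverage: Var(ROE) - Var(ROA).
leverage_risk = np.var(roe, ddof=1) - np.var(roa, ddof=1)
print(round(leverage_risk, 5), leverage_risk > 0)  # positive: leverage magnifies stockholder risk
```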
6.5 Growth Rates, Accounting Betas, and Variance in EBIT
Besides leverage, other financial variables associated with the firm can affect the beta coefficient. These are the growth rate, the accounting beta, and the variance in EBIT.

6.5.1 Growth Rates
The growth rate can be measured in terms of the growth in total assets or the growth in sales. It is determined by the percentage change between two periods:

\frac{\text{sales}_t - \text{sales}_{t-1}}{\text{sales}_{t-1}} \times 100\%  or  \frac{\text{total assets}_t - \text{total assets}_{t-1}}{\text{total assets}_{t-1}} \times 100\%

Another method to measure growth rates is presented by Higgins (1984). Growth and its management present special problems in financial planning. From a financial perspective growth is not always a blessing. Rapid growth can put considerable strain on a company's resources, and unless management is aware of this effect and takes active steps to control it, rapid growth can lead to bankruptcy. It becomes necessary, therefore, to define a company's sustainable growth rate:

g^* = \frac{\Delta S}{S} = \frac{P(1 - D)(1 + L)}{T - P(1 - D)(1 + L)}    (6.17)

where: P = the profit margin on all sales; D = the target dividend payout ratio; L = the target debt-to-equity ratio; T = the ratio of total assets to sales; S = annual sales; and ΔS = the increase in sales during the year.
How is Equation (6.17) derived? Assuming a company is not raising new equity, the cash to finance growth must come from retained profits and new borrowings:

Retained profits = Profits − Dividends = Profit margin × Total sales − Dividends = P(S + \Delta S)(1 - D)

Further, because the company wants to maintain a target debt-to-equity ratio equal to L, each dollar added to the owners' equity enables it to increase its indebtedness by $L. Since the owners' equity will rise by an amount equal to retained profits:

New borrowings = Retained profits × Target debt-to-equity ratio = P(S + \Delta S)(1 - D)L

The use of cash represented by the increase in assets must equal the two sources of cash (retained profits and new borrowings):

Uses of cash = Sources of cash
Increase in assets = Retained profits + New borrowings
\Delta S\,T = P(S + \Delta S)(1 - D) + P(S + \Delta S)(1 - D)L
\Delta S\,T = P(1 - D)(1 + L)S + P(1 - D)(1 + L)\Delta S
\Delta S\,[T - P(1 - D)(1 + L)] = P(1 - D)(1 + L)S
\frac{\Delta S}{S} = \frac{P(1 - D)(1 + L)}{T - P(1 - D)(1 + L)}

In Equation (6.17), ΔS/S, or g*, is the firm's sustainable growth rate assuming no infusion of new equity. Therefore, a company's growth rate in sales must equal the indicated combination of the four ratios P, D, L, and T. In addition, if the company's growth rate differs from g*, one or more of the ratios must change. For example, if a company grows at a rate in excess of g*, then it must either use its assets more efficiently or alter its financial policies. Efficiency is represented by the profit margin and the asset-to-sales ratio; the company would need to increase its profit margin (P) or decrease its asset-to-sales ratio (T) to increase efficiency. Financial policies are represented by the payout and leverage ratios; a decrease in the payout ratio (D) or an increase in leverage (L) would be necessary to alter financial policies to accommodate a different growth rate. It should be noted that increasing efficiency is not always possible and altering financial policies is not always wise. (For other available methods to alter growth the reader is referred to Chap. 8 of Lee (1985).)
6.5.2 Accounting Beta
The accounting beta can be calculated from earnings-per-share (EPS) data. Using EPS as an example, beta can be computed as follows:

EPS_{i,t} = \alpha_i + \beta_i\,EPS_{m,t} + e_{i,t}

where: EPS_{i,t} = earnings per share of firm i at time t; EPS_{m,t} = earnings per share of the market average at time t; and e_{i,t} = the error term. The estimate of β_i is the EPS type of accounting beta.
6.5.3 Variance in EBIT
The variance in EBIT (X) can be defined as:

\mathrm{Var}(X) = \frac{\sum_{t=1}^{n} (X_t - \bar X)^2}{n - 1}

in which X_t = earnings before interest and taxes in period t, and X̄ = average EBIT. The total variance of EBIT can be used to measure the overall fluctuation of accounting earnings for a firm.
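In code, this is simply the sample variance with an n − 1 divisor; the EBIT figures are hypothetical.

```python
import numpy as np

ebit = np.array([120.0, 95.0, 140.0, 110.0, 80.0])  # hypothetical EBIT by period
print(np.var(ebit, ddof=1))  # ddof=1 gives the (n - 1) divisor used in the text
```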
6.5.4 Capital-Labor Ratio
The capital-labor ratio has an impact on the magnitude of the beta coefficient. In order to examine this impact it is necessary to examine the capital-labor ratio itself. A production function expresses output as a function of capital and labor:

Q = f(K, L)    (6.18)
where K = capital and L = labor. K/L (the capital-labor ratio) is generally used to measure a firm's degree of capital intensity. Corporations often choose between increasing their capital intensity, through the installation of computers or the use of robotics in place of labor, and increasing their labor inputs. Small industries that specialize in hand-crafted or hand-tooled goods will have to increase their capital-labor ratio in order to increase production. However, it has been the trend in recent years for many growth-oriented firms, whether manufacturers or other members of the business sector, to increase their efficiency through increased investment in capital. Auto manufacturers are finding that robots are able to assemble cars of high and consistent quality at only a fraction of the cost of workers. Capital intensity results in increased total risk and generally results in an increase in beta. Large investments are often needed to fully automate a plant or to computerize a bank completely. Taking on debt or issuing securities is normally how these capitalization increases are financed. If the capital-labor ratio is greater than one – that is, if K is greater than L – a firm is capital intensive. If the capital-labor ratio is less than one – that is, if K is less than L – then there is a reduction in capital intensity and a shift toward human-resource investment.
6.5.5 Fixed Costs and Variable Costs
Business risk is dependent upon the extent to which a firm builds fixed costs into its operations. If a large percentage of a firm's costs are fixed, costs cannot decline proportionally when demand falls off. If fixed costs are high, then a slight drop in sales can lead to large declines in EBIT. Therefore, the higher a firm's fixed costs, the greater its business risk and, generally, the higher its beta. A firm with a large amount of fixed costs is said to have a large degree of operating leverage. An example of these costs may be the highly skilled workers of an engineering
firm. The firm cannot hire and fire experienced and highly skilled workers easily; therefore, the workers must be retained and paid during a period of slack demand. Similarly, a firm that is highly leveraged will be characterized by the scenario that small changes in sales result in large changes in operating income. Variable costs have the opposite effect, because they are adjustable to the firm's needs. Should a drop in sales occur, variable costs can be lowered to meet the lowered output. The extent to which firms can control their operating leverage is dependent upon their technological needs. Companies that require large investments in fixed assets – such as steel mills, auto manufacturers, and airlines – will have large fixed costs and operating leverage. Therefore, how much operating leverage a company is willing to undertake must come into play during capital-budgeting decisions. If the company is risk averse, it may opt for alternatives with smaller investments and fixed costs. A company having a larger percentage of fixed costs generally implies that it uses more capital-intensive types of technology in production. For example, an auto manufacturer such as General Motors has a higher percentage of fixed costs than a food manufacturer such as General Foods. Therefore, GM's capital-labor ratio can be expected to be higher than that of General Foods. An implication of this phenomenon is that General Foods has a lower beta than General Motors.
6.5.6 Beta Forecasting
Analysts and investment managers who use the CAPM and beta are interested in whether the beta coefficient and the standard-deviation statistics for different securities and portfolios are stable through time or whether they change in a predictable fashion. If the beta coefficient and standard deviation are stable, then using a beta derived from current and historical price data is fine, because the beta today is the same as the beta in the future. However, if the beta coefficient and standard deviation are unstable or vary through time, the analyst or manager must forecast a beta's future value before employing it. The available evidence on the stability of beta indicates that the beta of an individual security is generally not stable, while portfolios have stable betas. Hence we are faced with the problem of forecasting future betas in order to use the CAPM for individual securities.
Beta forecasting refers to using the historical beta estimates or other historical financial information to forecast future betas. Beaver et al. (1970) argue that if it is possible to find the underlying determinants of beta, such knowledge can be used to help forecast future betas. Beaver et al. use accounting information from the financial statements of firms to help in the estimation of beta. They conclude that using both price and accounting information shows promise for better beta estimates. Rosenberg and Marathe (1975) use 54 factors in six categories to estimate betas. Their factors include historical price information, price/earnings ratios, financial ratios, and statistical measures associated with the market model. They support the notion that betas are determined by fundamental factors related to firms in addition to market pricing data. Given that the most useful beta is one that can be forecasted correctly, and that beta is related to the financial and business risk of the firm as well as to the degree to which a firm's businesses co-vary with the total economy, it becomes necessary to determine the best way to forecast beta. The next section addresses this issue.
6.5.7 Market-Based versus Accounting-Based Beta Forecasting
Market-based beta forecasts are based on market information alone. Historical betas of firms are used as a proxy for their future betas. This implies that the unadjusted sample beta, β̂_t, is equal to the population value of the future beta:

\beta_{t+1} = \hat\beta_t    (6.19a)
Alternatively, there may be a systematic relationship between the estimated betas for the first period and those of the second period, as shown by Blume (1971):

\hat\beta_{i,t+1} = a_0 + a_1 \hat\beta_{i,t}    (6.19b)
in which β̂_{i,t+1} and β̂_{i,t} are the estimated betas for the ith firm in periods t + 1 and t, respectively. Example 6.5 provides further illustration.
Example 6.5. If â_0 = 0.35, â_1 = 0.80, and β̂_{i,t} = 1.2, then the forecast of the future beta is either β_{t+1} = 1.2, using Equation (6.19a), or β̂_{t+1} = 0.35 + (0.80)(1.2) = 1.31, using Equation (6.19b). It is worthwhile to note that Value Line uses Equation (6.19b) to estimate the future beta.
Accounting-based beta forecasts rely on the relationships of accounting information such as the growth rate of the firm, EBIT, leverage, and the accounting beta as a basis for forecasting beta. To use accounting information in beta forecasts, the historical beta estimates are first cross-sectionally related to accounting information such as growth rate, variance of EBIT, leverage, accounting beta, and so on:
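A one-line sketch of the Blume-type forecast of Equation (6.19b), using the numbers of Example 6.5; in practice a_0 and a_1 would be estimated by regressing second-period betas on first-period betas.

```python
# Blume adjustment (Equation (6.19b)) with the Example 6.5 coefficients.
def blume_forecast(beta_t, a0=0.35, a1=0.80):
    return a0 + a1 * beta_t

print(round(blume_forecast(1.2), 2))  # 1.31
```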
\beta_i = a_0 + a_1 X_{1i} + a_2 X_{2i} + \dots + a_j X_{ji} + \dots + a_m X_{mi}    (6.20)
where X_{ji} is the jth accounting variable for the ith firm, and a_j is the regression coefficient. Some researchers have found that the historically based beta is the best forecast, while others have found that the accounting-based beta is best. Lee et al. (1986) use composite concepts to show that both accounting and market information are useful for beta forecasting. They find that neither beta forecasts based on market information nor those based on accounting information are conditionally more efficient with respect to each other. It can be inferred, then, that each set of forecasts contains useful information for the prediction of systematic risk.
The statistical procedure used by Lee et al. (1986) is the ordinary least-squares method. Ordinary least squares is a statistical procedure for finding the best-fitting straight line for a set of points; it is in many respects a formalization of the procedure employed when fitting a line by eye. For instance, when visually fitting a line to a set of data, the ruler is moved until it appears that the deviations of the points from the prospective line have been minimized. If the predicted value of y_i (the dependent variable) obtained from the fitted line is denoted ŷ_i, the prediction equation becomes:

\hat y_i = \hat\beta_0 + \hat\beta_1 x_i

where β̂_0 and β̂_1 represent estimates of the true β_0 and β_1, and x_i is an independent variable. Graphically, the vertical lines drawn from the prediction line to each point represent the deviations of the points from the predicted value of y. Thus the deviation of the ith point is y_i − ŷ_i, and the sum of squared errors is:

SSE = \sum_{i=1}^{n} (y_i - \hat y_i)^2

In order to find the best fit it is necessary to minimize the deviations of the points. A criterion of best fit that is often employed is known as the principle of least squares. It may be stated as follows: choose the best-fitting line as the one that minimizes the sum of squares of the errors (SSE) of the observed values of Y from those predicted. Further, the mean sum of squares of the errors (MSSE) of the observed values of Y can be minimized from the predicted mean. Expressed mathematically:

MSSE = \frac{\sum_{i=1}^{n} (y_i - \hat y_i)^2}{n}

Mincer and Zarnowitz (1969) suggest a decomposition of the mean squared error term into three components representing bias, inefficiency, and random error. Mathematically, this is represented:

\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat y_i)^2 = (\bar y - \bar{\hat y})^2 + (S_{\hat y} - \rho S_y)^2 + (1 - \rho^2) S_y^2

where ȳ and S_y represent the mean and standard deviation of y, and ρ is the correlation coefficient between y and ŷ.
Using the ordinary least-squares method, Lee et al. (1986) show that it is possible to achieve better forecasts by combining two types of betas, using an ordinary least-squares estimate β_OLS and the Bayesian adjustment procedure of Vasicek (1973) as defined in Equation (6.21):

\beta_V = \frac{\dfrac{\bar\beta}{V(\hat\beta)} + \dfrac{\hat\beta_{OLS}}{V(\hat\beta_{OLS})}}{\dfrac{1}{V(\hat\beta)} + \dfrac{1}{V(\hat\beta_{OLS})}}    (6.21)

where: β_OLS = the least-squares estimate of a first-period individual beta; V(β̂_OLS) = the variance estimate of β_OLS; β̄ = the cross-sectional mean value of the estimated β_OLS; V(β̂) = the cross-sectional variance of the estimated β_OLS; and β_V = the Bayesian adjusted beta. Equation (6.21) indicates that the Vasicek type of Bayesian adjustment beta is a weighted average of β̄ and β_OLS. The weights are:

w_1 = \frac{1/V(\hat\beta)}{\dfrac{1}{V(\hat\beta)} + \dfrac{1}{V(\hat\beta_{OLS})}}  and  w_2 = \frac{1/V(\hat\beta_{OLS})}{\dfrac{1}{V(\hat\beta)} + \dfrac{1}{V(\hat\beta_{OLS})}}
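A minimal sketch of the Vasicek adjustment of Equation (6.21); all inputs are hypothetical. A noisy individual estimate is shrunk toward the cross-sectional mean in proportion to the relative precisions.

```python
# Vasicek Bayesian adjustment (Equation (6.21)); inputs are hypothetical.
def vasicek_beta(beta_ols, var_beta_ols, beta_bar, var_cross):
    """Precision-weighted average of the cross-sectional mean beta (beta_bar)
    and the firm's own OLS beta (beta_ols)."""
    w1 = (1 / var_cross) / (1 / var_cross + 1 / var_beta_ols)  # weight on beta_bar
    return w1 * beta_bar + (1 - w1) * beta_ols

# A high but imprecise estimate (variance 0.09) is pulled toward the mean of 1.0:
print(round(vasicek_beta(1.6, var_beta_ols=0.09, beta_bar=1.0, var_cross=0.04), 3))  # ~1.185
```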
The authors first use first-period regressions projected forward to obtain forecasts of the second-period betas. The summary of the stepwise regression for the first-period data is given in Table 6.1. Table 6.2 summarizes the overall mean squared errors together with the mean squared error decomposition. Lee et al. (1986) note that for forecasts developed without the Bayesian adjustment, the accounting-based forecast is better, in terms of overall mean squared error. This advantage results entirely from inefficiency in the Mincer-Zarnowitz sense of the market-based forecast. By contrast, the Bayesian-adjusted market-based forecasts suffer far less from this problem and, as a result, the mean squared prediction error for Bayesian-adjusted market-based
Table 6.1 Summary of stepwise regression results for first-period data

Dependent variable:       β̂_OLS             β̂_V
Adjusted R²:              0.245             0.236
Independent variables     Coefficients (t-values)
Intercept                 0.911 (11.92)     1.037 (12.71)
Financial leverage        0.704 (5.31)      0.599 (5.81)
Dividend payout           -0.175 (-3.50)    -0.108 (-2.85)
Sales                     0.030 (3.02)      0.018 (2.28)
Operating income          0.011 (2.18)      0.026 (2.28)
Assets                    –                 -0.026 (-2.41)
Source: Lee et al. (1986)

Table 6.2 Mean squared error decompositions for forecasts of second-period beta

                           Market-based forecasts         Accounting-based forecasts
                           Without       With             Without       With
                           adjustment    adjustment       adjustment    adjustment
Bias                       0.0003        0.0000           0.0003        0.0000
Inefficiency               0.0309        0.0090           0.0015        0.0001
Random error               0.0581        0.0577           0.0659        0.0638
Total mean squared error   0.0893        0.0667           0.0677        0.0640
Source: Lee et al. (1986)
forecasts are only marginally larger than that of the accounting-based forecasts (0.0667 compared to 0.0640).
Lee et al. (1986) also tested a composite predictor for beta consisting of both a market beta and an accounting beta. Table 6.3 shows the results of using this composite method to forecast beta for both the Bayesian-adjusted and nonadjusted models. In Table 6.3, β̂_OLS(2) represents the OLS estimated beta in the second period; β̂^A_OLS and β̂^A_V represent the accounting-based beta without and with Bayesian adjustment, respectively. By comparing the market-based MSEs of 0.0893 and 0.0667 and the accounting-based MSEs of 0.0677 and 0.0640 with the composite MSEs of 0.0545 and 0.0530, respectively, it can be concluded that on the basis of mean squared error the composite predictor outperforms either the market- or the accounting-information predictor. Of the models used for the composite predictor, it appears that the Bayesian adjustment case is superior. From this research it appears, then, that accounting-based and market-based forecasts can be combined to produce a superior composite forecast of beta.
6.6 Some Applications and Implications of the Capital Asset Pricing Model
Since its development, the uses of the capital asset pricing model (CAPM) have extended into all areas of corporate finance and investments. The CAPM can be applied to two
aspects of the capital-budgeting problem: (1) determining the cost of capital, and (2) assessing the riskiness of a project under consideration. The CAPM can also be useful in a real-estate problem, such as deciding whether to lease or buy. The use of the CAPM can be extended into valuation of the entire firm. Because of its impact upon firm valuation, the CAPM has been of great use in the merger-analysis area of financial analysis. The CAPM has also been used to test various financial theories. By including a dividend term and considering its effects, the CAPM can be used to test the effects of the firm's dividend policy. An area that has received a great deal of attention is the use of the CAPM in testing the efficient-market hypothesis.
An application of the CAPM to the capital-budgeting process concerns the valuation of risky projects. If accurate estimates can be made of the systematic risk of a project, then the CAPM can be used to determine the return necessary to compensate the firm for the project's risk. If the sum of the estimated cash flows discounted by the CAPM-calculated required rate of return is positive, then the firm should undertake the project.
Rubinstein (1973) demonstrates how the CAPM can be used to value securities and to calculate their risk-adjusted equilibrium prices. First, the CAPM must be converted to use price variables instead of expected return. It may be rewritten:

E(R_i) = \frac{E(P_1) - P_0}{P_0}
Table 6.3 Results for the estimation of market- and accounting-based composite forecasts with and without Bayesian adjustment

Without Bayesian adjustment: β̂_OLS(2) = a + b_1 β̂^A_OLS + b_2 β̂_OLS + w

                   Intercept    β̂^A_OLS    β̂_OLS
Coefficients       0.251        0.399      0.370
Standard errors    0.104        0.102      0.052
MSE = 0.0545

With Bayesian adjustment: β̂_OLS(2) = a + b_1 β̂^A_V + b_2 β̂_V + w

                   Intercept    β̂^A_V      β̂_V
Coefficients       -0.096       0.606      0.475
Standard errors    0.134        0.133      0.067
MSE = 0.0530
Source: Lee et al. (1986)
where: R_i = the expected return for the ith firm; P_1 = the price of the stock in period 1; and P_0 = the price of the stock in the previous period. Thus, the CAPM is redefined:

\frac{E(P_1) - P_0}{P_0} = R_f + [E(R_m) - R_f]\,\frac{\sigma_{im}}{\sigma_m^2}    (6.22)

or, rearranging Equation (6.22):

P_0 = \frac{E(P_1)}{1 + R_f + [E(R_m) - R_f]\,\dfrac{\sigma_{im}}{\sigma_m^2}}
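As a numerical sketch of this rearranged valuation formula (all inputs hypothetical):

```python
# Risk-adjusted equilibrium price from the rearranged Equation (6.22).
def equilibrium_price(e_p1, rf, e_rm, cov_im, var_m):
    risk_adjusted_rate = rf + (e_rm - rf) * cov_im / var_m
    return e_p1 / (1 + risk_adjusted_rate)

# Expected year-end price 55, Rf = 6%, E(Rm) = 10%, beta = 0.03 / 0.04 = 0.75:
print(round(equilibrium_price(55.0, rf=0.06, e_rm=0.10, cov_im=0.03, var_m=0.04), 2))  # 50.46
```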
Thus the rate of return used to discount the expected end-of-period price contains a risk premium dependent upon the security's systematic risk.
The CAPM has also been applied in the analysis of mergers. It has been shown that the risks of a portfolio can be substantially reduced by the inclusion of securities that are not perfectly correlated in terms of returns. This principle also applies to mergers between firms. The merging of two firms with different product lines, called a conglomerate merger, creates diversification, which is considered of great benefit. Suppose that one division of the conglomerate sells a product that is recession resistant; a decrease in the earnings of another division will then be offset by the steady earnings of the recession-resistant division. The overall result will be a relatively stable income stream despite shifting trends in the economy.
6.7 Conclusion
This chapter has discussed the basic concepts of risk and diversification and how they pertain to the CAPM. The procedures for deriving the CAPM itself were presented, and the CAPM was shown to be an extension of the capital market
line theory. The possible uses of the CAPM in financial management were also indicated. The concept of beta and its importance to the financial manager were introduced. Beta represents the systematic risk of a firm and is a comparative measure between a particular firm's security or portfolio risk and the market average. Systematic risk was further discussed through an investigation of the beta coefficient and the impact of other important financial variables on the magnitude of the beta coefficient. The statistical method of least squares and its application were introduced, and least-squares beta forecasts based on market information and on accounting information were compared with a composite-predictor beta forecast composed of both accounting and market information. The composite predictor appears to yield a better forecast than either the market-information or accounting-information forecasts separately.
References
Banz, R. W. 1981. "The relationship between return and market value of common stocks." Journal of Financial Economics 9, 3–18.
Basu, S. 1977. "Investment performance of common stocks in relation to their price-earnings ratios: a test of the efficient markets hypothesis." Journal of Finance 32, 663–682.
Beaver, W., P. Kettler, and M. Scholes. 1970. "The association between market determined and accounting determined risk measures." Accounting Review 45, 654–682.
Black, F., M. C. Jensen, and M. Scholes. 1972. "The capital asset pricing model: some empirical tests," in Studies in the theory of capital markets, M. C. Jensen (Ed.). Praeger, New York, pp. 20–46.
Blume, M. E. 1971. "On the assessment of risk." Journal of Finance 26, 1–10.
Blume, M. E. and I. Friend. 1973. "A new look at the capital asset pricing model." Journal of Finance 28, 19–34.
Blume, M. E. and F. Husick. 1973. "Price, beta, and exchange listing." Journal of Finance 28, 283–299.
Bodie, Z., A. Kane, and A. Marcus. 2006. Investments, 7th Edition, McGraw-Hill, New York.
Brennan, M. J. 1971. "Capital market equilibrium with divergent borrowing and lending rates." Journal of Financial and Quantitative Analysis 6, 1197–1205.
Douglas, G. W. 1969. "Risk in the equity markets: an empirical appraisal of market efficiency." Yale Economic Essays 9, 3–45.
Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2006. Modern portfolio theory and investment analysis, 7th Edition, Wiley, New York.
Fabozzi, F. J. and J. C. Francis. 1978. "Beta as a random coefficient." Journal of Financial and Quantitative Analysis 13, 101–116.
Fama, E. F. 1968. "Risk, return and equilibrium: some clarifying comments." Journal of Finance 23, 29–40.
Fama, E. F. and J. MacBeth. 1973. "Risk, return and equilibrium: empirical tests." Journal of Political Economy 81, 607–636.
Francis, J. C. 1986. Investments: analysis and management, 4th Edition, McGraw-Hill, New York.
Higgins, R. C. 1974. "Growth, dividend policy and cost of capital in the electric utility industry." Journal of Finance 29, 1189–1201.
Higgins, R. C. 1984. Analysis for financial management, Richard D. Irwin, Homewood, IL.
Jensen, M. C. 1969. "Risk, the pricing of capital assets, and the evaluation of investment portfolios." Journal of Business 42, 167–247.
Jensen, M. C. 1972. "Capital markets: theory and evidence." The Bell Journal of Economics and Management Science 3, 357–398.
Keim, D. B. 1983. "Size-related anomalies and stock return seasonality: further empirical evidence." Journal of Financial Economics 11, 13–32.
Lee, C. F. 1985. Financial analysis and planning: theory and application, Addison-Wesley, Reading, MA.
Lee, C. F. and S. N. Chen. 1980. "A random coefficient model for reexamining risk decomposition method and risk-return relationship test." Quarterly Review of Economics and Business 20, 58–69.
Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York.
Lee, C. F., P. Newbold, J. E. Finnerty, and C. C. Chu. 1986. "On accounting-based, market-based and composite-based beta predictions: methods and implications." The Financial Review 21, 51–68.
Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics, World Scientific Publishing Co., Singapore.
Lintner, J. 1965. "The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets." Review of Economics and Statistics 47, 13–37.
Maginn, J. L., D. L. Tuttle, J. E. Pinto, and D. W. McLeavey. 2007. Managing investment portfolios: a dynamic process, CFA Institute Investment Series, 3rd Edition, Wiley, Hoboken, NJ.
Merton, R. C. 1980. "On estimating the expected return on the market: an exploratory investigation." Journal of Financial Economics 8, 323–361.
Miller, M. and M. Scholes. 1972. "Rates of return in relation to risk: a reexamination of some recent findings," in Studies in the theory of capital markets, M. C. Jensen (Ed.). Praeger, New York, pp. 47–78.
Mincer, J. and V. Zarnowitz. 1969. "The evaluation of economic forecasts," in Economic forecasts and expectations, J. Mincer (Ed.). National Bureau of Economic Research, New York.
Mossin, J. 1966. "Equilibrium in a capital asset market." Econometrica 34, 768–783.
Reinganum, M. R. 1981. "Misspecification of capital asset pricing: empirical anomalies based on earnings yields and market values." Journal of Financial Economics 9, 19–46.
Roll, R. 1977. "A critique of the asset pricing theory's tests – part I: on past and potential testability of the theory." Journal of Financial Economics 4, 129–176.
Roll, R. 1978. "Ambiguity when performance is measured by the securities market line." Journal of Finance 33, 1051–1069.
Rosenberg, B. and V. Marathe. 1975.
Tests of the capital asset pricing hypothesis, Working Paper No. 32 of the Research Program in Finance, Graduate School of Business and Public Administration, University of California, Berkeley.
Rosenberg, B. and W. McKibben. 1973. "The prediction of systematic and specific risk in common stocks." Journal of Financial and Quantitative Analysis 8, 317–333.
Rubinstein, M. E. 1973. "A mean-variance synthesis of corporate financial theory." Journal of Finance 28, 167–181.
Sharpe, W. 1964. "Capital asset prices: a theory of market equilibrium under conditions of risk." Journal of Finance 19, 425–442.
Sharpe, W. 1966. "Mutual fund performance." Journal of Business 39, 119–138.
Sunder, S. 1980. "Stationarity of market risk: random coefficient tests for individual stocks." Journal of Finance 35, 883–896.
Vasicek, O. A. 1973. "A note on using cross-sectional information in Bayesian estimation of security betas." Journal of Finance 28, 1233–1239.
Wackerly, D., W. Mendenhall, and R. L. Scheaffer. 2007. Mathematical statistics with applications, 7th Edition, Duxbury Press, California.
Appendix 6A Empirical Evidence for the Risk-Return Relationship
The validity of the CAPM can be borne out partly through observations of actual portfolios held in the marketplace. As discussed in previous chapters, there are two primary relationships between risk and return. First, the rates of return of efficient portfolios are linear functions of their riskiness as measured by their standard deviation. This is illustrated by the capital market line (CML). Second, the rate of return of an individual asset is determined by its contribution of risk to the portfolio, and this is measured by beta, where beta has a linear relationship with the security's expected rate of return. This is illustrated by the security market line (SML).
The performance of mutual funds can be employed to test the explanatory power of the linear relationship between risk and return of the CML. Mutual funds are professionally managed and therefore the most visible type of portfolio, easily used for comparison testing. One study was performed by Sharpe (1966) to test the performance of a fund and the relationship between its rate of return and risk over time. Sharpe computed average annual returns and the standard deviations of these returns for 34 mutual funds from 1954 to 1963. His model implies that portfolios with higher risks will receive higher returns. Sharpe found this to be true for all 34 funds. He calculated the correlation between average returns and their standard deviations to be 0.836, indicating that more than 80% of the difference in returns was due to differences in risk. Sharpe also found that there was a linear relationship between returns and risks, except in the region of very high risks. Sharpe's study provides basic support to the contention that the CML explains the relationship between risk and return, both in portfolio theory and in the marketplace.
Another study was performed by Jensen (1969). He studied the correlation of beta coefficients (market sensitivity) and the expected return of mutual funds. On the basis
of an analysis of 115 mutual funds over a 9-year period he was able to conclude that high returns were associated with high volatility or high systematic risk. He also found evidence that beta coefficients are a valid and accurate measure of risk. Both the Sharpe and Jensen studies on the risk and return of mutual funds show that an empirical risk-return relationship does exist among mutual funds. (However, Sharpe used the capital market line to perform his empirical tests, while Jensen used the security market line, derived by Sharpe 1964.)
The second implication of the risk-return relationship is that the risk premium on individual assets depends on the contribution each makes to the riskiness of the entire portfolio. The CAPM is a simple linear model expressed in terms of expected returns and expected risk. In its ex-ante form:

E(R_i) = R_f + [E(R_m) - R_f]\,\beta_i    (6A.1)
Although many of the aforementioned extensions of the model support this simple linear form, others suggest that it may not be linear, that factors other than beta are needed to explain E(R_i), or that R_f is not the appropriate riskless rate. The first step necessary to test the theoretical CAPM empirically is to transform it from expectations (ex-ante) form into a form that uses observed data. On average, the expected rate of return on an asset is equal to the realized rate of return. This can be written:

R_{it} = E(R_{it}) + \beta_i \delta_{mt} + e_{it}

where: δ_{mt} = R_{mt} − E(R_{mt}); E(δ_{mt}) = 0; e_{it} = a random error term; Cov(e_{it}, δ_{mt}) = 0; Cov(e_{it}, e_{i,t−1}) = 0; and β_i = Cov(R_{it}, R_{mt})/Var(R_{mt}). When the CAPM is empirically tested it is usually written in the following form:

R'_{pt} = \gamma_0 + \gamma_1 \beta_p + e_{pt}    (6A.2)

where: γ_1 = R_{mt} − R_{ft} and R'_{pt} = R_{pt} − R_{ft}.
These relationships can be stated as follows:
1. The intercept term γ_0 should not be significantly different from zero.
2. Beta should be the only factor that explains the rate of return on a risky asset. If other terms, such as residual variance, dividend yields, price/earnings ratios, firm size, or beta squared, are included in an attempt to explain return, they should have no explanatory power.
3. The relationship should be linear in beta.
4. The coefficient of beta, γ_1, should be equal to R_{mt} − R_{ft}.
5. When the equation is estimated over very long periods of time, the rate of return on the market portfolio should be greater than the risk-free rate.
The work of Jensen (1972) provides a comprehensive and unifying review of the theoretical developments and the empirical work done in the field until that year. In his paper he points out that the main result of the original papers in this area is the demonstration that one can derive the individual's demand function for assets and aggregate these demands to obtain equilibrium prices (or expected returns) solely as a function of potentially measurable market parameters. Thus, the model becomes testable.
Let us now summarize the empirical work of Douglas (1969), Black et al. (1972), and Fama and MacBeth (1973). The first published test of the CAPM was by Douglas (1969), who regressed the returns of a large cross-sectional sample of common stocks on their own variances and on their beta coefficients β_i obtained from market models. His results are at variance with the Sharpe-Lintner-Mossin model, for he found that return was positively related to the variance of the security but not to its covariance with the index of returns. Douglas also summarizes some of the work of Lintner (1965), who estimates beta from a typical market model. Douglas then adds a term for the standard deviation of the error (a proxy for unsystematic risk). He finds that the coefficient for the unsystematic risk is both positive and significant, the intercept term is higher than the appropriate risk-free rate, and the coefficient for the market risk premium is too low.
Black, Jensen, and Scholes observe that cross-sectional tests may not provide direct validation of the CAPM, and they proceed to construct a time-series test, which they consider more powerful. Their results lead them to assert that the usual form of the CAPM does not provide an accurate description of the structure of security returns. Their results indicate that the intercepts are non-zero and are related to the risk level: low-beta securities earn significantly more on average than predicted by the model, and high-risk securities earn significantly less on average than predicted by the model. They go on to argue for a two-factor model:

R_{it} = (1 - \beta_i) R_{zt} + \beta_i R_{mt} + e_{it}    (6A.3)
If $E(R_{zt})$ were equal to the risk-free rate, the Sharpe-Lintner-Mossin CAPM would be consistent with this model. However, the cross-sectional estimate of the intercept is a constant and not equal to zero. They then proceed to look for a rationale for this finding in Black's zero-beta model.

Fama and MacBeth (1973) test (1) whether there is a linear relationship between the return on a portfolio and the portfolio's beta, and (2) whether unsystematic risk affects portfolio return in addition to beta. Their basic estimation equation is:

$$\tilde{R}_{it} = \tilde{\gamma}_{0t} + \tilde{\gamma}_{1t}\bar{\beta}_i + \tilde{\gamma}_{2t}\bar{\beta}_i^2 + \tilde{\gamma}_{3t}\bar{\sigma}_i(\hat{e}) + \tilde{e}_{it} \tag{6A.4}$$
in which $\tilde{\gamma}_{0t}$ is the intercept term, $\bar{\beta}_i$ is the average of the $\beta_i$ for all individual securities in portfolio i, and $\bar{\sigma}_i(\hat{e})$ is the average of the residual standard deviations for all securities in portfolio i. Although they find that there are variables in addition to the portfolio beta that systematically affect period-by-period returns (apparently related to the average squared beta of the portfolio and to the risk factor other than beta), they dismiss the latter as "almost surely proxies," since "there is no rationale for their presence in our stochastic risk-return model." Their results seem to suggest that the Sharpe-Lintner-Mossin model does not hold: the intercept factor $\tilde{\gamma}_{0t}$ is generally greater than $R_f$, and $\tilde{\gamma}_{1t}$ is substantially less than $R_m - R_f$. This seems to indicate that the zero-beta model is more consistent with the data.

Blume and Husick (1973) find empirical evidence to indicate that historical rates of return may sometimes foreshadow changes in future betas, and that stocks with higher transaction costs should yield somewhat higher gross expected returns. Their data indicate that beta is not stationary over time, and that it changes over time as a function of price. Transaction-cost effects appear less important than the informational effects of price in explaining future returns or future betas. Therefore, the return-generating process may be more complex than what has been assumed.

Blume and Friend (1973) examine both theoretically and empirically the reasons why the CAPM does not adequately explain differential returns on financial assets. Empirically, the risk-return tradeoffs implied by stocks on the New York Stock Exchange for three different periods after World War II cast doubt on the validity of the CAPM in either its S-L-M form or its zero-beta form. However, they do confirm the linearity of the relationship for NYSE stocks.

Rosenberg and McKibben (1973) observe that predictions of the riskiness of returns on common stocks can be based on fundamental accounting data for the firm and also on the previous history of stock prices. Their paper tries to combine both sources of information to provide efficient predictions. A stochastic model of the parameters is built, and the mean squared error is used as a criterion for the evaluation of the
forecasting performance of estimators. They conclude that the results “strongly confirm the usefulness of the specific risk predictions based on the accounting descriptors.” Merton (1980) is concerned with the estimation of the expected return on the market. He notes that the current practice for estimating the expected market return adds the historical average realized excess market returns to the current observed interest rate. However, while this model explicitly reflects the dependence of the market return on the interest rate, it fails to account for the effects of changes in the level of market risk. Three models of equilibrium expected market returns are elaborated, and estimation procedures that incorporate the prior restriction that equilibrium expected excess returns on the market must be positive are derived and applied to return data for the period 1926–1978. The following are the principal conclusions of the study. 1. The nonnegativity restriction above should be explicitly included as part of the specifications. 2. Estimators that use realized returns should be adjusted for heteroskedasticity. Roll (1977) directs an attack on the empirical tests of the CAPM. While recognizing that the theory is testable in principle, he asserts that “no correct and unambiguous test to the theory (has) appeared in the literature and there is practically no possibility that such a test can be accomplished in the future.” This conclusion is derived from the mathematical equivalence between the individual return beta linearity and the market portfolio’s mean variance efficiency. Therefore, any valid test presupposes complete knowledge of the market portfolio’s composition. The major results reported by Roll from his theoretical inquiry include the following: 1. The only testable hypothesis is that the market portfolio is mean-variance efficient. 2. All other so-called implications of the CAPM are not independently testable. 3. In any sample there will always be an infinite number of ex-post mean-variance efficient portfolios; betas calculated will satisfy the linearity relation exactly, whether or not the true market portfolio is mean-variance efficient. 4. The theory is not testable unless the exact composition of the true market portfolio is known and used in the tests. 5. Using a proxy for the market portfolio does not solve the problem, for the proxy itself might be mean-variance efficient even when the true market portfolio is not, and conversely. 6. Empirical tests that reject the S-L-M model have results fully compatible with the S-L-M model and a specification error in the measured market portfolio. 7. If the selected index is mean-variance efficient, then the betas of all assets are related to their mean returns by the same linear function (all assets and portfolios fall exactly on the SML).
8. For every ranking of performances obtained with a mean-variance inefficient index, there exists another nonefficient index that reverses the ranking. Roll’s critique is a broad indictment of most of the accepted empirical evidence concerning the CAPM theory.
Appendix 6B Anomalies in the Semi-strong Efficient-Market Hypothesis

Three anomalies in the semi-strong efficient-market hypothesis are noteworthy. Four authors, Basu, Banz, Reinganum, and Keim, deal with these three anomalies: (1) P/E ratios; (2) size effects; and (3) the January effect. Basu (1977) empirically notes that a firm with a low P/E ratio, when adjusted for risk, has an excess return over firms that have a high P/E ratio. If this is true, then there are implications for the market's efficiency, the validity of the CAPM, or both. However, Basu found that the excess returns, when adjusted for transaction costs, taxes, and so forth, were small enough to be insignificant. Therefore, CAPM and market efficiency were supported. In conjecturing why a difference in P/E ratios could affect the returns of the firm, we believe that the relationship may involve the firm's ability to raise debt. A low P/E ratio may indicate more difficulty in raising capital than a high P/E ratio. This difficulty in raising capital could result in different lending and borrowing rates for different firms. Therefore, a Brennan version of the standard CAPM pricing model may be more applicable. Reinganum (1981) also empirically tests the P/E effect and finds the same results as Basu. In addition, Reinganum was concerned with the efficiency of the market. In order to see if the market was informationally efficient, Reinganum also looked at the returns of firms with neither high nor low P/E ratios. He found these firms to be correctly priced. From these results he conjectures that the market was informationally efficient, but that the CAPM did not allow for the P/E effect on returns. Therefore an APT model with the P/E effect as one of its factors would be preferable to the CAPM model. Banz (1981) empirically tests the effect of firm size, finding that small-company stock returns were higher than
large-company stock returns. Banz argues that the P/E ratio serves as a proxy for the size of a firm and not vice versa. His conjecture about why this anomaly exists centers on the distribution of information. A small firm's information distribution is somewhat limited, which causes investors to be wary of buying the stock and depresses the price. Banz suggests the APT valuation model may be more robust than the CAPM in that the APT would be able to capture the size effect by using a P/E ratio as a proxy for size as a factor in the model. Since the distribution of information affects a firm's ability to raise capital (less information on a firm may cause the firm to pay a premium for capital), the premium would indicate that the lending and borrowing rates of different firms are not the same. A valuation model capturing different lending and borrowing rates was provided by Brennan (1971). This model could be implemented where rates differ for large and small firms. If small firms do not borrow in the capital market, then Black's no-borrowing CAPM would be more suitable than the standard CAPM. Keim (1983) empirically tests one of the most baffling anomalies, the January effect. He found that for stocks with excess returns, over 50% of these excess returns were realized in January. In addition, 50% of the January excess return occurs in the first week of January. This phenomenon of excess returns occurring in the month following the tax-year-end has also been found empirically in Great Britain. Although Keim offers no rationale for this phenomenon, others have tried to find tax reasons for this anomaly. The selling of assets in January rather than December, to postpone capital gains taxes to the next taxable year, has been one suggested rationale. Unlike the P/E and size effects, this anomaly does not have a clear proxy that could be utilized as a factor in an APT model. The CAPM does not capture the effect well, and without any theoretical or economic rationale for this effect, any valuation model would be hard pressed to account for it. Many of the authors seem to conclude that the standard CAPM is not working well, and that alternative valuation models should be considered to capture these anomalies. This might imply that security analysis and portfolio management techniques can be used to beat the market.
Chapter 7
Index Models for Portfolio Selection Cheng-Few Lee, Joseph E. Finnerty, and Donald H. Wort
Abstract In this chapter, we discuss both the single-index model and the multiple-index portfolio selection model. We use a constrained maximization, instead of a minimization procedure, to calculate the portfolio weights. We find that both single-index and multi-index models can be used to simplify the Markowitz model for portfolio selection.

Keywords Single-index model · Multi-index model · Market model · Multiple indexes · Linear programming approach · Lagrange multipliers
7.1 Introduction

Previously, we have presented and discussed the Markowitz model for delineating the efficient frontier. Numerous examples were shown that indicated the potentially crushing number of computations resulting from the calculations for even a three-security portfolio. This chapter offers some simplifying assumptions that reduce the overall number of calculations through the use of the Sharpe single-index and multiple-index models. The essential difference between the single- and multiple-index models is the assumption that the single-index model explains the return of a security or a portfolio with only the market. The multiple-index model describes portfolio returns through the use of more than one index. The investor may quantify the return on a portfolio by seeking an index that is representative of the market together with indexes that are representative of the industries of which the component securities are members or exhibit some other
D.H. Wort (✉)
University of California, Hayward, CA, USA
e-mail: [email protected]

C.-F. Lee
Rutgers University, New Brunswick, NJ, USA

J.E. Finnerty
University of Illinois at Urbana-Champaign, Champaign, IL, USA
common factor. More is said about the multiple-index model later in the chapter; for now the single-index model is the focus of discussion.
7.2 The Single-Index Model

The major simplifying assumption that yields the index model from Markowitz's portfolio theory is that covariances between individual securities contained in the portfolio are zero. This assumption greatly reduces the number of calculations needed to find the set of efficient portfolios. The use of the index model necessitates additional statistical estimates for the parameters of the index; nevertheless, these additions are minor in comparison to the reduction in the calculation load that results from ignoring the covariance terms between securities. Suggested by Markowitz (1959), the single-index model was fully developed by Sharpe (1970), who assumed that the covariances could be overlooked. The return of an individual security is tied to two factors: a random effect and the performance of some underlying market index. Notationally:

$$R_{it} = a_i + b_i R_{It} + e_{it} \tag{7.1}$$

where:
$a_i$ and $b_i$ = regression parameters for the ith firm;
$R_{It}$ = the tth return on some underlying market index;
$R_{it}$ = the tth return on security i; and
$e_{it}$ = the tth random effect for the ith security.

Equation (7.1) is the market model as discussed in the last chapter. This regression makes several assumptions about the random-effect term.
1. The expected value of the tth random effect for security i is zero: $E(e_{it}) = 0$.
2. The variance of the error terms is constant; that is, the errors are homoscedastic.
3. There is no relationship between the errors and the return on the market: $\mathrm{Cov}(e_{it}, R_{It}) = 0$.
4. The random effects are not serially correlated: $E(e_{it}e_{i,t+n}) = 0$.
5. The ith security's random effect is unrelated to the random effect of any other security: $E(e_{it}e_{jt}) = 0$.

The fifth assumption guarantees that the regression coefficients $a_i$ and $b_i$ are the best unbiased linear estimators of the true parameters. An investigation of some of the results of the previous assumptions is in order. The expected value of the return on security i is equal to the sum of the intercept, the adjusted return on the index, and some random effect. This can be expressed:
$$E(R_i) = E(a_i + b_i R_{It} + e_{it}) = E(a_i) + E(b_i R_{It}) + E(e_{it}) \tag{7.2}$$

Because $a_i$ and $b_i$ are constants, and $E(e_{it})$ is equal to zero:

$$r_i = E(R_i) = a_i + b_i r_I \tag{7.3}$$

where $r_I$ is equal to the mean of the returns on the market index. If the mean of the returns on security i equals $r_i$, then the variance of security i is equal to the expected value of squared deviations from $r_i$. This translates to:

$$\sigma_i^2 = E(R_{it} - r_i)^2 = E[(a_i + b_i R_{It} + e_{it}) - (a_i + b_i r_I)]^2 = E[b_i(R_{It} - r_I) + e_{it}]^2 = b_i^2 E(R_{It} - r_I)^2 + 2b_i E[e_{it}(R_{It} - r_I)] + E(e_{it})^2 \tag{7.4}$$

By the third assumption the covariance of the random effect and the deviation of the index return from its mean is zero; also, the expected value of the squared errors is equal to the variance of the random effect. Thus:

$$\sigma_i^2 = b_i^2 E(R_{It} - r_I)^2 + E(e_{it})^2 = b_i^2\sigma_I^2 + \sigma_{e_i}^2 \tag{7.5}$$

This is equivalent to saying that the variance of the returns on a security is made up of some adjusted quantity of the variance of the market (usually referred to as systematic risk) plus the variance of the random effects exclusive to that particular security (unsystematic risk).

The last result to be investigated from these assumptions involves a minor step into abstraction, in which the possibility of interaction between two securities and the market index is considered. It is suggested that because variations in the returns of two different securities are not interrelated but only connected to the market index, the covariance between the two securities can be derived from the twice-adjusted variance of the market index (once by the b coefficient of the first security with the market and again by the b coefficient of the second security with the market). The investigation starts with a statement about the covariance of the two securities:

$$\begin{aligned}
\sigma_{ij} &= E[(R_{it} - r_i)(R_{jt} - r_j)]\\
&= E[((a_i + b_i R_{It} + e_{it}) - (a_i + b_i r_I))((a_j + b_j R_{It} + e_{jt}) - (a_j + b_j r_I))]\\
&= E[(b_i(R_{It} - r_I) + e_{it})(b_j(R_{It} - r_I) + e_{jt})]\\
&= b_i b_j E(R_{It} - r_I)^2 + b_i E[e_{jt}(R_{It} - r_I)] + b_j E[e_{it}(R_{It} - r_I)] + E(e_{it}e_{jt})
\end{aligned} \tag{7.6}$$

Since, according to the third and fifth assumptions, the last three terms of the last summation are equal to zero:

$$\sigma_{ij} = b_i b_j \sigma_I^2 \tag{7.7}$$
The number of calculations necessary to utilize the Markowitz model is $N + (N^2 - N)/2$, as shown in the last chapter. For a portfolio of a hundred securities, this translates to 5,050 calculations. With the Sharpe single-index model, only 100 estimates are needed for the various security-regression coefficients and only one variance calculation, the variance of the returns on the market. In addition, 100 estimates of the unsystematic risk $\sigma_{e_i}^2$ are also needed for the single-index model. Hence, the single-index model has dramatically reduced the input information needed.¹

¹ Discussion of the single-index model adapted in part from Sharpe (1970). Adapted by permission.
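To make the input-reduction argument concrete, here is a minimal sketch, under hypothetical betas and residual variances, of how Equations (7.5) and (7.7) rebuild a full covariance matrix from just the index variance, one b coefficient per security, and one residual variance per security.

```python
# A minimal sketch (not from the chapter) of how the single-index model
# rebuilds the full covariance matrix from far fewer inputs; the betas and
# residual variances below are hypothetical, chosen only for illustration.
import numpy as np

b = np.array([0.8, 1.1, 1.3])               # b_i: responsiveness to the index
var_resid = np.array([0.02, 0.03, 0.015])   # Var(e_i): unsystematic risk
var_index = 0.04                            # sigma_I^2: index variance

# Equation (7.7): sigma_ij = b_i * b_j * sigma_I^2 for i != j;
# Equation (7.5): sigma_ii = b_i^2 * sigma_I^2 + Var(e_i)
cov = np.outer(b, b) * var_index + np.diag(var_resid)
print(cov)
```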
7.2.1 Deriving the Single-Index Model

So far only the Sharpe single-index model has been utilized to study the returns of a single security i as determined by its relation to the returns on a market index.

Expected return of a portfolio. Now consider the return on a portfolio of n securities. The return of a portfolio of n securities is the weighted summation of the individual returns of the component securities. Notationally:

$$E(R_{pt}) = \sum_{i=1}^{n} x_i E(R_i)$$
where $R_{pt}$ is the rate of return on the portfolio in period t and $x_i$ is the weight associated with the ith security. Since $R_{it} = a_i + b_i R_{It} + e_{it}$,

$$R_{pt} = \sum_{i=1}^{n} x_i(a_i + b_i R_{It} + e_{it}) = \sum_{i=1}^{n} x_i(a_i + e_{it}) + \sum_{i=1}^{n} x_i(b_i R_{It}) = \sum_{i=1}^{n} x_i a_i + \sum_{i=1}^{n}(x_i b_i)R_{It} + \sum_{i=1}^{n} x_i e_{it} \tag{7.8a}$$
This equation indicates that the return of a portfolio may be decomposed into the summation of the weighted returns peculiar to the individual securities and the summation of the weighted adjusted return on the market index. Thus, the portfolio may be viewed as a combination of n basic securities and a weighted adjusted return from an investment in the market index. Thus,

$$E(R_{pt}) = \sum_{i=1}^{n} x_i a_i + \sum_{i=1}^{n}(x_i b_i)E(R_{It}) \tag{7.8b}$$

Variance of a portfolio. To derive the variance of the portfolio $\sigma_p^2$, consider first that the mean return of the portfolio is equal to the expected value of the return on the portfolio. Then, following the definition of $\sigma_p^2$ in previous chapters and Equations (7.5) and (7.7), we can derive $\sigma_P^2$ as follows. The variance of the portfolio is equal to

$$\sigma_P^2 = E\left[\left(R_{pt} - E(R_{pt})\right)^2\right] \tag{7.9a}$$

Substituting Equations (7.8a) and (7.8b) into Equation (7.9a), we obtain

$$\begin{aligned}
\sigma_P^2 &= E\left[\left(R_{It} - E(R_{It})\right)\sum_{i=1}^{n}(x_i b_i) + \sum_{i=1}^{n}x_i e_i\right]^2\\
&= \left[\sum_{i=1}^{n}x_i b_i\right]^2\sigma_I^2 + \sum_{i=1}^{n}x_i^2\sigma_{e_i}^2\\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}(x_i x_j b_i b_j)\sigma_I^2 + \sum_{i=1}^{n}x_i^2\sigma_{e_i}^2\\
&= \left[\sum_{i=1}^{n}x_i b_i\right]\left[\sum_{j=1}^{n}x_j b_j\right]\sigma_I^2 + \sum_{i=1}^{n}x_i^2\sigma_{e_i}^2
\end{aligned} \tag{7.9b}$$

Because the weighted sum of the $b_i$ coefficients is equal to the beta coefficient of the portfolio, $\sum_{i=1}^{n}x_i b_i = b_p$, and similarly $\sum_{j=1}^{n}x_j b_j = b_p$. Hence, the last equation reduces to:

$$\sigma_p^2 = b_p^2\sigma_I^2 + \sum_{i=1}^{n}x_i^2\sigma_{e_i}^2 \tag{7.10}$$

It was shown previously that when the number of component securities in a portfolio approaches 15, the unique risk of the component securities is reduced through diversification. In Equation (7.10) the last term, the weighted sum of the random-effect variances, approaches zero as n increases. So, again, as the number of securities increases, the unsystematic risk is reduced and the remaining risk of the portfolio is the adjusted variance of the market index. Example 7.1 further illustrates this concept.

Example 7.1. Given the following information, what should the β of the portfolio ($b_p$) be?

$$\sigma_p^2 = 0.082,\qquad \sigma_I^2 = 0.041,\qquad \sum_{i=1}^{n}x_i^2\sigma_{e_i}^2 = 0$$

Solution: Substituting the related information into Equation (7.10):

$$b_p^2 = \frac{\sigma_p^2}{\sigma_I^2} = \frac{0.082}{0.041} = 2.0$$

Therefore, $b_p = 1.414$.

Equation (7.8b) implies that the portfolio can be viewed as an investment in n basic securities and a weighted adjusted return in the market, or

$$\sum_{i=1}^{n}(x_i b_i)E(R_{It})$$

The return on the market can be decomposed as a combination of the expected return plus some random effect. When this random effect is positive, the atmosphere is bullish; when it is negative, the atmosphere is bearish. Notationally:

$$R_{It} = E(R_{It}) + e_{n+1,t} = a_{n+1} + e_{n+1,t} \tag{7.11}$$

This bit of algebraic maneuvering enables the weighted adjusted investment in the market to be viewed as an investment in an artificial security, the (n+1)th of an n-security portfolio. The weight for this (n+1)th security is the sum of the n weights multiplied by their respective related coefficients to the market index. Thus:

$$x_{n+1} = \sum_{i=1}^{n}x_i b_i \tag{7.12}$$
The reason for this divergence in notation, resulting in the definition of the (n+1)th security's return and weight, is that Equation (7.8) can be simplified to yield a working model for portfolio analysis. Substituting the last results for $x_{n+1}$ and $R_{It}$ into Equation (7.8) yields:

$$R_{pt} = \sum_{i=1}^{n}x_i(a_i + e_{it}) + x_{n+1}(a_{n+1} + e_{n+1,t}) = \sum_{i=1}^{n+1}x_i(a_i + e_{it}) \tag{7.13}$$

Because the expected value of the random-effect terms is zero, the summation that results after the application of the expectations operator can be expressed:

$$E(R_{pt}) = \sum_{i=1}^{n}x_i a_i + x_{n+1}E(R_{It}) = \sum_{i=1}^{n+1}x_i a_i \tag{7.14}$$

This yields a formula for the return of a portfolio that is easily applied to portfolio analysis. Before proceeding, however, it is necessary to simplify the variance formula so that it may be used as well. Remembering that the variance of the portfolio is the expected value of the squared deviations from the expected market return, the last results concerning $R_{pt}$ and $E(R_{pt})$ may be applied:

$$\begin{aligned}
\operatorname{Var}(R_{pt}) &= E\left\{\left[\sum_{i=1}^{n+1}x_i(a_i + e_{it})\right] - E\left[\sum_{i=1}^{n+1}x_i(a_i + e_{it})\right]\right\}^2\\
&= E\left\{\left[\sum_{i=1}^{n+1}x_i(a_i + e_{it})\right] - \sum_{i=1}^{n+1}x_i\left[E(a_i) + E(e_{it})\right]\right\}^2\\
&= E\left\{\left[\sum_{i=1}^{n+1}x_i(a_i + e_{it})\right] - \sum_{i=1}^{n+1}x_iE(a_i)\right\}^2\\
&= E\left(\sum_{i=1}^{n+1}x_ie_{it}\right)^2 = \operatorname{Var}\left(\sum_{i=1}^{n+1}x_ie_{it}\right)
\end{aligned} \tag{7.15}$$

This result follows from the assumption that the covariances are equal to zero. Additionally, each variance term is only a weighted sum of the errors around the market return. The direct usefulness of the last two conclusions will become apparent when solving for the security weights. Example 7.2 provides further illustration.

Example 7.2. Given $R_{It}$, $R_{it}$, and $\alpha_i$ as indicated in the table shown below, and the fact that $\sum_{t=1}^{n}\beta_iR_{It} = 56$, use the relationship

$$\sum_{t=1}^{n}R_{it} = n\alpha_i + \sum_{t=1}^{n}\beta_iR_{It} + \sum_{t=1}^{n}e_{it}$$

to find the values for $\alpha_i$, $\beta_i$, and $e_{it}$.

Solution:

$$\sum_{t=1}^{n}R_{It} = 28,\quad\text{so}\quad \beta_i = \frac{\sum_{t=1}^{n}\beta_iR_{It}}{\sum_{t=1}^{n}R_{It}} = \frac{56}{28} = 2$$

Substitute the values

$$\sum_{t=1}^{n}R_{it} = 64,\qquad \sum_{t=1}^{n}\beta_iR_{It} = 56,\qquad n = 4$$

into the regression relationship and solve for $\alpha_i$. From the last column of the table we know that $\sum_{t=1}^{n}e_{it} = 0$. Therefore

$$\alpha_i = \frac{64 - 56 + 0}{4} = 2$$

| $R_{It}$ | $R_{it}$ | $\alpha_i$ | $\beta_iR_{It}$ | $e_{it}$ |
|---|---|---|---|---|
| 4 | 12 | 2 | 8 | 12 − 2 − 8 = 2 |
| 6 | 14 | 2 | 12 | 14 − 2 − 12 = 0 |
| 10 | 20 | 2 | 20 | 20 − 2 − 20 = −2 |
| 8 | 18 | 2 | 16 | 18 − 2 − 16 = 0 |
| 28 | 64 | 8 | 56 | 0 |

The $\beta_iR_{It}$ column in the table is filled by simply multiplying β (= 2) by the $R_{It}$ column. The $e_{it}$'s are the amounts such that $R_{it} = \alpha_i + \beta_iR_{It} + e_{it}$ is satisfied.
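The solution to Example 7.2 can be reproduced in a few lines of code; the sketch below recovers β from the given sums, α from the averaged relationship, and the residuals from the fitted line.

```python
# Reproducing Example 7.2: recover beta from the given sums and alpha from
# the averaged regression relationship; residuals follow from the fitted line.
R_I = [4, 6, 10, 8]      # index returns R_It
R_i = [12, 14, 20, 18]   # security returns R_it
n = len(R_I)

sum_beta_RI = 56                      # given: sum of beta_i * R_It
beta = sum_beta_RI / sum(R_I)         # 56 / 28 = 2
alpha = (sum(R_i) - sum_beta_RI) / n  # (64 - 56 + 0) / 4 = 2, since sum(e) = 0
e = [r - alpha - beta * ri for r, ri in zip(R_i, R_I)]
print(beta, alpha, e)                 # 2.0 2.0 [2.0, 0.0, -2.0, 0.0]
```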
In this section, the expected return $E(R_{pt})$ and the variance $\operatorname{Var}(R_{pt})$ of a portfolio in terms of the single-index model have been derived. In the following section, the optimal portfolio selection procedures discussed in previous chapters are used to explore the single-index optimum-portfolio selection model.
7.2.2 Portfolio Analysis and the Single-Index Model

Before beginning the portfolio analysis using the single-index model, it is necessary to explain a derivation of security weights through the use of the Lagrangian calculus maximization discussed in previous chapters. The maximization procedure maximizes a linear combination of the following two equations:²

$$\operatorname{Max}\ E(R_p) = \sum_{i=1}^{n+1}x_iE(R_i) \quad\text{or}\quad \operatorname{Max}\ -\operatorname{Var}(R_p) = -\operatorname{Var}\left(\sum_{i=1}^{n+1}x_ie_{it}\right) \tag{7.16a}$$

subject to

$$\sum_{i=1}^{n}x_i = 1 \quad\text{and}\quad x_{n+1} = \sum_{i=1}^{n}x_ib_i \tag{7.16b}$$

The first constraint is equivalent to requiring that the sum of the weights of the component securities is equal to one. The second constraint requires that the weight of the market index within the portfolio returns is equal to the summation of the weighted adjustment factors of the component securities. This requirement is as described in Equation (7.16b). Combining the above two objective functions with the two constraints yields a Lagrangian function:

$$P = \Phi E(R_p) - \operatorname{Var}(R_p) + \lambda_1\left(\sum_{i=1}^{n}x_i - 1\right) + \lambda_2\left(\sum_{i=1}^{n}x_ib_i - x_{n+1}\right) = \Phi\sum_{i=1}^{n+1}x_ia_i - \sum_{i=1}^{n+1}x_i^2\operatorname{Var}(e_{it}) + \lambda_1\left(\sum_{i=1}^{n}x_i - 1\right) + \lambda_2\left(\sum_{i=1}^{n}x_ib_i - x_{n+1}\right) \tag{7.16c}$$
The only difference between this maximization function and the minimization shown in previous chapters, beyond intent, is that instead of fixing some arbitrary value of return needed ($E^*$), an attempt can be made to quantify the level of risk aversion that the investors of the portfolio require, thereby placing the portfolio on the efficient frontier not by desired return but by level of utility, as discussed in previous chapters. This indication of risk aversion is denoted by the Greek letter Φ (phi). When low values are exhibited, risk aversion is pronounced; when high values are in evidence, substantial risk taking is allowed. This notion of the level of risk aversion is best pictured in Fig. 7.1.

In Fig. 7.1, A represents an investor's objective function to minimize risk only; the aggressiveness toward return is therefore zero. C represents an investor's objective function to maximize returns only; the aggressiveness toward return is therefore infinite. At B an investor's attitude toward return and risk lies between A and C. Figure 7.2 provides further illustration of the approach being utilized. What is depicted is the variation of the objective function P as the parameter denoting risk is varied. Again, notice that the risk-return relation for a low Φ is lower than that for the more risk-taking, high value of Φ. Points A and C are as discussed for Fig. 7.1. Lines BD and B′D′ represent the objective function when the risk-aversion parameter is equal to one. Note that BD, instead of B′D′, represents the maximization of the objective function. In sum, different values of Φ generate different optimal objective functions.

Consider now the three-security portfolio. In this framework, the preceding objective function expands to:

$$P = \Phi x_1a_1 + \Phi x_2a_2 + \Phi x_3a_3 + \Phi x_4a_4 - x_1^2\operatorname{Var}(e_{1t}) - x_2^2\operatorname{Var}(e_{2t}) - x_3^2\operatorname{Var}(e_{3t}) - x_4^2\operatorname{Var}(e_{4t}) + \lambda_1(x_1 + x_2 + x_3 - 1) + \lambda_2(x_1b_1 + x_2b_2 + x_3b_3 - x_4) \tag{7.17}$$

Take note of all terms in the preceding equation that contain $a_4$ and $x_4$. By utilizing the relations of the individual securities to the market, it has been possible to delete most of the calculations necessitated by the full Markowitz variance-covariance model. The return that will be generated from an optimal portfolio derived through this maximization rests on the estimation of the future expected return and variance of the index. This will lead to the availability of ranging analysis for these estimations, which will produce robust estimates for the security weights highlighted at the end of this section. To proceed with the Lagrangian maximization it is necessary to notice that the above equation has six unknowns: the four weights and the two Lagrangian coefficients. All other values are expected to be known or estimated. To maximize, the partial derivative of the objective function is taken with respect to each of the six variables:

² The maximization of the negative of the variance is equivalent to the minimization of the variance itself. This can be viewed as originating from the negative spectrum of the number line and maximizing towards zero.
Fig. 7.1 Level of risk aversion and investors’ investment attitude
Fig. 7.2 Level of risk aversion and objective function
$$\begin{aligned}
\frac{\partial P}{\partial x_1} &= \Phi a_1 - 2x_1\sigma_1^2 + \lambda_1 + \lambda_2b_1 = 0\\
\frac{\partial P}{\partial x_2} &= \Phi a_2 - 2x_2\sigma_2^2 + \lambda_1 + \lambda_2b_2 = 0\\
\frac{\partial P}{\partial x_3} &= \Phi a_3 - 2x_3\sigma_3^2 + \lambda_1 + \lambda_2b_3 = 0\\
\frac{\partial P}{\partial x_4} &= \Phi a_4 - 2x_4\sigma_4^2 - \lambda_2 = 0\\
\frac{\partial P}{\partial \lambda_1} &= x_1 + x_2 + x_3 - 1 = 0\\
\frac{\partial P}{\partial \lambda_2} &= x_1b_1 + x_2b_2 + x_3b_3 - x_4 = 0
\end{aligned} \tag{7.18}$$

where $\sigma_1^2 = \operatorname{Var}(e_{1t})$, $\sigma_2^2 = \operatorname{Var}(e_{2t})$, $\sigma_3^2 = \operatorname{Var}(e_{3t})$, and $\sigma_4^2 = \operatorname{Var}(e_{4t})$.

Equation (7.18) is a set of six equations in six unknowns when set equal to zero. These six equations can be rewritten in matrix format:
$$\underset{A}{\begin{bmatrix}
-2\sigma_1^2 & 0 & 0 & 0 & 1 & b_1\\
0 & -2\sigma_2^2 & 0 & 0 & 1 & b_2\\
0 & 0 & -2\sigma_3^2 & 0 & 1 & b_3\\
0 & 0 & 0 & -2\sigma_4^2 & 0 & -1\\
1 & 1 & 1 & 0 & 0 & 0\\
b_1 & b_2 & b_3 & -1 & 0 & 0
\end{bmatrix}}
\underset{x}{\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\\lambda_1\\\lambda_2\end{bmatrix}} =
\underset{k}{\begin{bmatrix}-\Phi a_1\\-\Phi a_2\\-\Phi a_3\\-\Phi a_4\\1\\0\end{bmatrix}} \tag{7.19a}$$

Equation (7.19a) has related the individual securities to the index and has discarded the use of numerous covariance terms. The majority of the elements of the first matrix in Equation (7.19a) are zero.
7 Index Models for Portfolio Selection
117
As in previous chapters, to solve for the x column vector of variables, the A matrix is inverted and each side of the equation is premultiplied by $A^{-1}$. This yields the x column vector of variables on the left-hand side and $A^{-1}$ multiplied by k, the solution column vector, on the right-hand side:

$$Ax = k,\qquad A^{-1}Ax = A^{-1}k,\qquad Ix = A^{-1}k \tag{7.19b}$$

where I is the identity matrix. The procedure of solving Equations (7.19a) and (7.19b) is discussed in later chapters. A linear-programming approach to solve this kind of model is discussed in the first appendix of this chapter. In order to estimate a single-index type of optimal portfolio the estimates of a market model are needed; a discussion of the market model and beta estimates is therefore needed as well.
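Although formal solution procedures are deferred to later chapters, solving (7.19a)-(7.19b) numerically is a single matrix inversion in practice. The sketch below builds the six-equation system from hypothetical $a_i$, $b_i$, and residual variances (none of them from this chapter) and solves it.

```python
# A numerical sketch of (7.19a)-(7.19b): build the six-equation system from
# the first-order conditions (7.18) and solve x = A^{-1} k. All parameter
# values here are hypothetical, chosen only to make the system concrete.
import numpy as np

phi = 1.0                                    # risk-aversion parameter Phi
a = np.array([0.010, 0.008, 0.012, 0.009])   # a_1..a_3 and a_4 (the index)
b = np.array([0.9, 1.1, 1.3])                # b_1..b_3
s2 = np.array([0.020, 0.030, 0.015, 0.040])  # Var(e_1t)..Var(e_4t)

A = np.array([
    [-2 * s2[0], 0, 0, 0, 1, b[0]],
    [0, -2 * s2[1], 0, 0, 1, b[1]],
    [0, 0, -2 * s2[2], 0, 1, b[2]],
    [0, 0, 0, -2 * s2[3], 0, -1],
    [1, 1, 1, 0, 0, 0],
    [b[0], b[1], b[2], -1, 0, 0],
])
k = np.concatenate((-phi * a, [1.0, 0.0]))

x = np.linalg.solve(A, k)   # x_1..x_4 plus the two Lagrange multipliers
print(np.round(x, 4))
```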
7.2.3 The Market Model and Beta

Equation (7.1) defines the market model:

$$R_{it} = a_i + b_iR_{It} + e_{it} \tag{7.1}$$

From this market model, $a_i$, $b_i$, $\operatorname{Var}(e_{it})$, and $\operatorname{Var}(R_{It})$ can be estimated; all are required for the single-index type of optimal portfolio. So far the regression coefficient has been referred to as $b_i$ when, in fact, it is a risk relationship of a security with the market. Quantified as the covariance of security i with the market index divided by the variance of the return of the market, beta is a relational coefficient of the returns of security i as they vary with the returns of the market. Notationally:

$$b_i = \frac{\sigma_{i,I}}{\sigma_I^2} = \frac{\sum_{t=1}^{n}(R_{it} - \bar{r}_i)(R_{It} - \bar{r}_I)}{\sum_{t=1}^{n}(R_{It} - \bar{r}_I)^2} \tag{7.20}$$
Of course, due to the continuous nature of security returns, the result of the above calculation is an estimate and therefore subject to error. This error can be quantified by the use of standard error estimates, which provide the ability to make interval estimations for future predictions. The following discussion is related to the beta estimate and its forecasting as discussed in previous chapters. Because of the linkage between security returns and the firm’s underlying fundamental nature, beta estimates will vary over time. It is the job of the security analyst to decide how to modify not only the beta estimate but also the period
of time from which the sample of returns is to be drawn. Some popular security-evaluation techniques use a tiered growth model to correspond to the product life-cycle theory. Within the scope of this theory, beta estimates for the relation of that security to the market will vary with respect to the time period chosen for the sample of returns. It is obvious that the returns generated by a firm in its infancy have very little correlation with returns during the growth or maturity phase. It is up to the analyst to judge which phase a company may be in and to adjust the beta estimates accordingly.

Assume for expository purposes that all stocks move in perfect alignment with the market, and therefore the security returns all produce beta estimates of 1. It could then be said that any estimation of betas above 1 would indicate positive sampling error, and all estimates below 1 negative sampling error. Future periods would show returns that caused movement from either side of the estimate deviation back towards 1. Blume (1975) shows that the adjustment of a beta in one period could accurately predict the adjustment in subsequent periods. Utilizing regression analysis, Blume demonstrates that a historical beta could be adjusted successfully to predict future levels. Large brokerage houses have taken this theory a little further by applying weighted averages of historic and average betas in adjusting beta estimates. This has the advantage of being simple to calculate, yet it nevertheless adjusts beta back towards an average level. (This kind of beta-adjustment process was discussed in the last chapter; the Value Line beta estimate is essentially based on this process, as explored in previous chapters.)

Historic beta estimates give the analyst a feeling for how the firm is generating returns to an investor but give very little guidance about the firm's underlying fundamental nature. It is held that a firm's balance-sheet ratios over time can indicate how risky the firm is relative to the market. Using seven financial quantities of a firm, Beaver et al. (1970) show that betas can be estimated by a multiple regression relating the financial levels to the riskiness of the firm, using:
1. Dividend payout
2. Asset growth
3. Leverage
4. Liquidity
5. Asset size
6. Earnings volatility, and
7. Accounting beta (relation of earnings of the firm with the earnings of the economy as a whole)
Beaver, Kettler, and Scholes (BKS) show not only that each of the variables carries a logical relation to the market but also that the multiple regression is significant. Other studies, most notably Rosenberg and Guy (1976a, b), use many more fundamental factors in the regression for estimating betas. Historic betas are of consequence because of their relation with market returns; nevertheless, the sampling period
tends to crowd out the information contained in returns due to recent developments for the firm. Fundamental betas tend to recognize significant changes in the makeup of a firm – for example, a change in debt structure, or liquidity. The disadvantage related to fundamental betas is that they treat intercompany responsiveness as constant. It is patently obvious that a small firm taking on a large debt will be at much more risk than a Ford or GM doing the same. Because of the advantages and disadvantages apparent in both types of beta estimates, it has been shown that a combination of the estimates is more suitable to risk classification. Rosenberg and McKibben (1973) found that intercompany responsiveness is substantial, so an analysis was undertaken to introduce these variations into the regression. Utilizing a set of dummy variables to capture the advantages of both types of betas, Rosenberg and Marathe (1974, Berkeley Working Paper Series) have described a multiple regression for the beta estimate of a firm that include information such as variability, level of success, relative size, and growth potential. By using a very large and complex model they found that the forecasting ability of the regression analysis is substantial, but they were inconclusive as to whether the benefit of the analysis outweighs the computational cost. Forecasting future beta levels holds its roots in the difficulty of forecasting the fundamental nature of the firm itself. This section has briefly looked over some of the available adjustment processes for beta estimation. While reasonable in its scope, this coverage of the adjustment process is by no means complete. Additional research material is cited at the end of this chapter for use in further study. It should be noted during future readings that the more sophisticated an adjustment process is, the more likely it is that the computational costs will outweigh the improvement of the beta forecast.
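As one concrete instance of the adjustment processes just surveyed, a Blume-style correction can be sketched as a simple regression of one period's betas on the prior period's; the numbers below are invented for illustration only.

```python
# A hedged sketch of a Blume-type beta adjustment: regress second-period
# betas on first-period betas, then apply the fitted line to the latest
# historical betas. All numbers are illustrative, not from the text.
import numpy as np

beta_period1 = np.array([0.6, 0.9, 1.1, 1.4, 1.8])
beta_period2 = np.array([0.7, 0.9, 1.05, 1.3, 1.6])

# OLS fit: beta_{t+1} = c0 + c1 * beta_t
c1, c0 = np.polyfit(beta_period1, beta_period2, 1)
forecast = c0 + c1 * np.array([0.5, 1.0, 2.0])
print(round(c0, 3), round(c1, 3), forecast)  # betas regress toward 1
```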
7.3 Multiple Indexes and the Multiple-Index Model

The previous section reviewed the possibilities of adjusting the beta estimate in the single-index model to capture some of the information concerning a firm not contained in the historic returns. The multiple-index model (MIM) pursues the same problem as beta adjustment, but approaches the problem from a different angle. The multiple-index model tackles the problem of the relation of a security not only to the market, by including a market index, but also to other indexes that quantify other movements. For example, U.S. Steel has returns on its securities that are related to the market, but due to the declining nature of the American steel industry there is also some relation to the steel industry itself. If an index
could be developed that represented the movement of the U.S. steel industry, it could be utilized in the return analysis, and a more accurate estimate of possible security returns for U.S. Steel could be derived. The assumption underlying the single-index model is that the returns of a security vary only with the market index. The expansion provided by the multiple-index model includes factors that affect a security's return beyond the effects of the market as a whole. Realistically any index might be used, but well-known, published indexes are usually incorporated. These may include general business indicators, industry-specific indicators, or even self-constructed indexes concerning the structure of the firm itself. The model to be addressed contains L indexes; nevertheless, a few examples of possible indexes of interest are offered. The multi-index model is related to the arbitrage pricing model developed by Ross (1976, 1977), which will be explored in detail in the next chapter. The covariance of a security's return with other market influences can be added directly to the index model by quantifying the effects through the use of additional indexes. If the single-index model were expanded to take into account interest rates, factory orders, and several industry-related indexes, the model would change to:

$$R_i = a_i^* + b_{i1}^*I_1^* + b_{i2}^*I_2^* + \cdots + b_{iM}^*I_M^* + c_i \tag{7.21}$$
In this depiction of the multiple-index model, $I_j^*$ is the actual level of index j, while $b_{ij}^*$ is the actual responsiveness of security i to index j. If the component of the security return is not related to any of the indexes, then this index model can be divided into two parts: (1) $a_i$, the expected value of the unique return, and (2) $c_i$, which represents the random effect; $c_i$ has a mean of zero and a variance of $\sigma_{ci}^2$. This model can be utilized with multiple-regression techniques, but if the indexes are unrelated to each other the calculations become much simpler. This assumption reduces the number of calculations, as compared with the Markowitz full variance-covariance model, but the model is obviously more complex than the single-index model. To assure that the indexes are unrelated, the index variables can be orthogonalized (made uncorrelated) by performing inter-regressions on the indexes themselves. Assume there is a hypothetical model that deals with two indexes:

$$R_i = a_i^* + b_{i1}^*I_1^* + b_{i2}^*I_2^* + c_i \tag{7.22}$$
Suppose the indexes are the market index and an index of wholesale prices. If these two indexes are correlated, the correlation may be removed from either index.
To remove the relation between $I_1^*$ and $I_2^*$, the coefficients of the following equation can be derived by regression analysis:

$$I_2^* = e_0 + e_1I_1^* + d_i$$

where $e_0$ and $e_1$ are the regression coefficients and $d_i$ is the random error term. By the assumptions of regression analysis, $d_i$ is uncorrelated with $I_1^*$. Therefore:

$$\hat{d}_i = I_2^* - (\hat{e}_0 + \hat{e}_1I_1^*)$$

which is an index of the performance of the sector index with the effect of $I_1^*$ (the market) removed. Defining:

$$I_2 = \hat{d}_i = I_2^* - \hat{e}_0 - \hat{e}_1I_1^*$$

an index is obtained that is uncorrelated with the market. By solving for $I_2^*$ and substituting into Equation (7.22):

$$R_i = a_i^* + b_{i1}^*I_1^* + b_{i2}^*\left(I_2 + \hat{e}_0 + \hat{e}_1I_1^*\right) + c_i$$

Rearranging:

$$R_i = \left(a_i^* + b_{i2}^*\hat{e}_0\right) + \left(b_{i1}^* + b_{i2}^*\hat{e}_1\right)I_1^* + b_{i2}^*I_2 + c_i$$

If the first set of terms in brackets is defined as $a_i$, the second set as $b_{i1}$, and $b_{i2} = b_{i2}^*$, $I_1 = I_1^*$, and $e_i = c_i$, the equation can be expressed:

$$R_i = a_i + b_{i1}I_1 + b_{i2}I_2 + e_i \tag{7.23}$$

in which $I_1$ and $I_2$ are totally uncorrelated: the goal has been achieved. As will be seen later, these simplifying calculations make the job of determining variance and covariance much simpler.

The expected return can be expressed:

$$E(R_{it}) = E(a_i + b_{i1}I_{1t} + b_{i2}I_{2t} + e_{it}) = E(a_i) + E(b_{i1}I_{1t}) + E(b_{i2}I_{2t}) + E(e_{it}) = a_i + b_{i1}\bar{I}_1 + b_{i2}\bar{I}_2 \tag{7.24}$$

since $a_i$, $b_{i1}$, and $b_{i2}$ are constants and $E(e_i) = 0$ by assumption, where $\bar{I}_1 = E(I_{1t})$ and $\bar{I}_2 = E(I_{2t})$.

The variance can be expressed:

$$\begin{aligned}
\sigma_i^2 &= E(R_{it} - \bar{R}_i)^2,\quad\text{where } \bar{R}_i = E(R_{it})\\
&= E\left[(a_i + b_{i1}I_{1t} + b_{i2}I_{2t} + e_{it}) - (a_i + b_{i1}\bar{I}_1 + b_{i2}\bar{I}_2)\right]^2\\
&= E\left[b_{i1}(I_{1t} - \bar{I}_1) + b_{i2}(I_{2t} - \bar{I}_2) + e_{it}\right]^2\\
&= b_{i1}^2E(I_{1t} - \bar{I}_1)^2 + 2b_{i1}b_{i2}E\left[(I_{1t} - \bar{I}_1)(I_{2t} - \bar{I}_2)\right] + b_{i2}^2E(I_{2t} - \bar{I}_2)^2\\
&\quad + 2b_{i1}E\left[(I_{1t} - \bar{I}_1)e_{it}\right] + 2b_{i2}E\left[(I_{2t} - \bar{I}_2)e_{it}\right] + E(e_{it}^2)
\end{aligned}$$

But by assumption:

$$E\left[(I_{1t} - \bar{I}_1)(I_{2t} - \bar{I}_2)\right] = 0,\quad E\left[(I_{1t} - \bar{I}_1)e_{it}\right] = 0,\quad E\left[(I_{2t} - \bar{I}_2)e_{it}\right] = 0$$

and $E(I_{1t} - \bar{I}_1)^2 = \sigma_1^2$, $E(I_{2t} - \bar{I}_2)^2 = \sigma_2^2$, $E(e_i^2) = \sigma_{ei}^2$; therefore,

$$\sigma_i^2 = b_{i1}^2\sigma_1^2 + b_{i2}^2\sigma_2^2 + \sigma_{ei}^2 \tag{7.25}$$

The covariance between security i and security j can be expressed:

$$\begin{aligned}
\sigma_{ij} &= E\left[(R_{it} - \bar{R}_i)(R_{jt} - \bar{R}_j)\right]\quad [\text{where } \bar{R}_i = E(R_{it}) \text{ and } \bar{R}_j = E(R_{jt})]\\
&= E\{[(a_i + b_{i1}I_{1t} + b_{i2}I_{2t} + e_{it}) - (a_i + b_{i1}\bar{I}_1 + b_{i2}\bar{I}_2)]\\
&\qquad\times[(a_j + b_{j1}I_{1t} + b_{j2}I_{2t} + e_{jt}) - (a_j + b_{j1}\bar{I}_1 + b_{j2}\bar{I}_2)]\}\\
&= E\{[b_{i1}(I_{1t} - \bar{I}_1) + b_{i2}(I_{2t} - \bar{I}_2) + e_{it}][b_{j1}(I_{1t} - \bar{I}_1) + b_{j2}(I_{2t} - \bar{I}_2) + e_{jt}]\}\\
&= E\left[b_{i1}b_{j1}(I_{1t} - \bar{I}_1)^2 + b_{i2}b_{j2}(I_{2t} - \bar{I}_2)^2\right]\\
&= b_{i1}b_{j1}\sigma_1^2 + b_{i2}b_{j2}\sigma_2^2
\end{aligned} \tag{7.26}$$

since all remaining expected values of the cross-product terms equal zero. The extended results for expected return, variance, and covariance of a multi-index model with more than two indexes can be found in the second appendix of this chapter.

One simplifying way of applying the multiple-index model is to start with the basic market model and add indexes to reflect industry-related effects. If the firm has 100% of its operations in one industry, Equation (7.23) can be used to represent a two-index model with market index and industry index. In general this approach reduces the number of data inputs to 4N + 2I + 2. These data inputs are (1) the expected return and variance for each stock and the market index; (2) the covariance between the individual security and the market index and the industry index; and (3) the mean and variance of each industry index. Although this is a larger number of data inputs than for the simple market model, the accuracy of the estimation of security return increases. So the tradeoff is one of more information (higher cost to use) versus greater accuracy of the forecasted security return.

Care must be taken in applying the multiple-index model. It is often the case that the additional information resulting from the application of a higher-complexity model is outweighed by the computational cost increase. In an attempt at making the model as parsimonious as possible, it is necessary to judge the increased information gained by utilizing the more complex model. This can be accomplished by examining the mean square error for the forecast of the actual historic values. Although not within the scope of this text, the ability to judge the accuracy of a forecast is essential and can be acquired from any good statistical forecasting text.
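The orthogonalization step behind Equation (7.23) is easy to mimic numerically. The sketch below simulates a market index and a correlated second index, regresses one on the other, and keeps the residual as the new uncorrelated index; all data are simulated.

```python
# A sketch of the orthogonalization around Equation (7.23): regress the
# second index on the first and keep the residual as the new, uncorrelated
# index. Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
I1 = rng.normal(0.01, 0.05, 120)            # market index returns
I2 = 0.6 * I1 + rng.normal(0.0, 0.03, 120)  # correlated industry index

e1, e0 = np.polyfit(I1, I2, 1)              # I2 = e0 + e1*I1 + d_i
I2_star = I2 - (e0 + e1 * I1)               # d_hat: I2 with the market removed

print(np.corrcoef(I1, I2)[0, 1])        # noticeably nonzero
print(np.corrcoef(I1, I2_star)[0, 1])   # ~0 by construction
```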
Example 7.3 provides further illustration of the single-index model.

Example 7.3. During the discussion of the single-index model, a method was presented for determining optimal portfolio weights given different levels of risk aversion. In this section a three-security portfolio is examined in which the returns of the securities are related to a market index. The securities and the index have the following observed parameter estimates, taken from actual monthly returns during the period January 2001 to September 2008 (see table). The table includes the calculation of the beta estimate for each security. Check these figures, remembering that beta is equal to the covariance of security i with the market divided by the variance of the market returns.

|  | JNJ | AXP | XOM | S&P 500 | Risk-free | Artificial security |
|---|---|---|---|---|---|---|
| Mean return | 0.0053 | 0.0055 | 0.0126 | 0.0025 | 0.0023 | – |
| Alpha | 0.0046 | 0.0022 | 0.0109 | – | – | – |
| Beta | 0.29 | 1.21 | 0.65 | – | – | – |
| Var (residual) | 0.0019 | 0.0016 | 0.0022 | – | – | 0.0015 |

Variance-covariance matrix:

|  | JNJ | AXP | XOM | S&P 500 |
|---|---|---|---|---|
| JNJ | 0.0020 | 0.0009 | 0.0004 | 0.0004 |
| AXP | 0.0009 | 0.0038 | 0.0010 | 0.0018 |
| XOM | 0.0004 | 0.0010 | 0.0027 | 0.0010 |
| S&P 500 | 0.0004 | 0.0018 | 0.0010 | 0.0015 |

Going back to the calculus maximization derivation of optimal portfolio weights, recall that a linear combination of the following two factors is being maximized:

$$\operatorname{Max}\ E(R_p) = \sum_{i=1}^{n+1}x_iE(R_i) \quad\text{or}\quad \operatorname{Max}\ -\operatorname{Var}(R_p) = -\operatorname{Var}\left(\sum_{i=1}^{n+1}x_ie_{it}\right) \tag{7.27a}$$

subject to

$$\sum_{i=1}^{n}x_i = 1 \quad\text{and}\quad \sum_{i=1}^{n}x_ib_i = x_{n+1} \tag{7.16b}$$

Combining these with the constraints yields the Lagrangian:

$$P = \Phi E(R_p) - \operatorname{Var}(R_p) + \lambda_1\left(\sum_{i=1}^{n}x_i - 1\right) + \lambda_2\left(\sum_{i=1}^{n}x_ib_i - x_{n+1}\right) = \Phi\sum_{i=1}^{n+1}x_ia_i - \sum_{i=1}^{n+1}x_i^2\operatorname{Var}(R_i) + \lambda_1\left(\sum_{i=1}^{n}x_i - 1\right) + \lambda_2\left(\sum_{i=1}^{n}x_ib_i - x_{n+1}\right) \tag{7.16c}$$

The only difference between this maximization function and the minimization shown in previous chapters, beyond intent, is that instead of fixing some arbitrary value of return needed ($E^*$), an attempt is made to quantify the level of risk aversion that the holders of the portfolio require, thereby placing the portfolio on the efficient frontier, not by desired return, but by level of utility. Consider again a three-security portfolio. In this framework, the preceding objective function expands to:

$$P = \Phi x_1a_1 + \Phi x_2a_2 + \Phi x_3a_3 + \Phi x_4a_4 - x_1^2\operatorname{Var}(R_1) - x_2^2\operatorname{Var}(R_2) - x_3^2\operatorname{Var}(R_3) - x_4^2\operatorname{Var}(R_4) + \lambda_1(x_1 + x_2 + x_3 - 1) + \lambda_2(x_1b_1 + x_2b_2 + x_3b_3 - x_4) \tag{7.17}$$

Take note of all terms in the preceding equation that contain $a_4$ and $x_4$. By utilizing the individual securities' relations to the market, it has been possible to delete most of the calculations necessitated by the Markowitz full variance-covariance model. The return that will be generated from an optimal portfolio derived through this maximization rests on the estimation of the future expected return and variance of the index. By partially differentiating the above objective function with respect to each of the weights and the λ's, it is possible to develop the following six equations in six unknowns:
$$\begin{aligned}
\frac{\partial P}{\partial x_1} &= \Phi a_1 - 2x_1\sigma_1^2 + \lambda_1 + \lambda_2b_1 = 0\\
\frac{\partial P}{\partial x_2} &= \Phi a_2 - 2x_2\sigma_2^2 + \lambda_1 + \lambda_2b_2 = 0\\
\frac{\partial P}{\partial x_3} &= \Phi a_3 - 2x_3\sigma_3^2 + \lambda_1 + \lambda_2b_3 = 0\\
\frac{\partial P}{\partial x_4} &= \Phi a_4 - 2x_4\sigma_4^2 - \lambda_2 = 0\\
\frac{\partial P}{\partial \lambda_1} &= x_1 + x_2 + x_3 - 1 = 0\\
\frac{\partial P}{\partial \lambda_2} &= x_1b_1 + x_2b_2 + x_3b_3 - x_4 = 0
\end{aligned} \tag{7.18}$$

As shown in Equation (7.18), these last six equations can be transformed into matrix notation. A matrix can be developed by utilizing the set of data in the first table of this problem and a Φ of 1.0 (denoting moderate risk aversion):

$$\underset{A}{\begin{bmatrix}
-0.013 & 0 & 0 & 0 & 1 & 1.15\\
0 & -0.015 & 0 & 0 & 1 & 0.90\\
0 & 0 & -0.006 & 0 & 1 & 0.55\\
0 & 0 & 0 & -0.004 & 0 & -1\\
1 & 1 & 1 & 0 & 0 & 0\\
1.15 & 0.90 & 0.55 & -1 & 0 & 0
\end{bmatrix}}
\underset{x}{\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\\lambda_1\\\lambda_2\end{bmatrix}} =
\underset{k}{\begin{bmatrix}-0.0084\\-0.0127\\-0.0187\\-0.0125\\1.0\\0\end{bmatrix}}$$

When the A matrix is inverted and premultiplies each side of the equation:

$$x = A^{-1}k = \begin{bmatrix}x_1\\x_2\\x_3\\x_4\\\lambda_1\\\lambda_2\end{bmatrix} = \begin{bmatrix}-0.06523\\-0.43414\\1.49937\\0.42381\\0.00521\\0.00129\end{bmatrix}$$

This solution vector shows that the investment should short 6.52% in Johnson & Johnson, short 43.41% in American Express, and go long 149.94% in Exxon Mobil. Additionally, the weight $x_4$ is the sum of the weighted adjustments as indicated in Equation (7.16b). The return on this portfolio is the weighted sum of the individual returns:

$$E(R_p) = \sum_{i=1}^{n}x_iE(R_i) = (-0.0652)(0.0053) + (-0.4341)(0.0055) + (1.4994)(0.0126) = 0.0162$$

The variance of the portfolio is the sum of the weighted variance and covariance terms:

$$\operatorname{Var}(R_p) = \sum_{i=1}^{n}\sum_{j=1}^{n}x_ix_j\sigma_{ij}$$

Taking the covariance expressions:

$$\begin{aligned}
\operatorname{Var}(R_p) = {}&(-0.0652)^2(0.0020) + (-0.4341)^2(0.0038) + (1.4994)^2(0.0027)\\
&+ 2(-0.0652)(-0.4341)(0.0009) + 2(-0.0652)(1.4994)(0.0004)\\
&+ 2(-0.4341)(1.4994)(0.0010) = 0.0055
\end{aligned}$$

A portfolio has been developed that is efficient within the realm of this model. In the table below are the portfolios derived by varying the utility factor Φ from 0 (totally risk averse) to 2 (a more aggressive risk posture). This last table offers the ability to develop an efficient frontier under the Sharpe single-index model. The figure shows that the portfolios start to trace out an efficient boundary. For a fuller graph, the analyst would continue the calculations with ever higher values of Φ, stretching the efficient frontier.

| Portfolio | Φ | X1 | X2 | X3 | E(Rp) | Var(Rp) | σp |
|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 0.5233 | 0.1531 | 0.3236 | 0.0077 | 0.0013 | 0.0361 |
| 2 | 0.5 | 0.2290 | −0.1405 | 0.9115 | 0.0119 | 0.0022 | 0.0471 |
| 3 | 1.0 | −0.0652 | −0.4341 | 1.4994 | 0.0162 | 0.0055 | 0.0742 |
| 4 | 1.5 | −0.3595 | −0.7278 | 2.0873 | 0.0204 | 0.0109 | 0.1044 |
| 5 | 2.0 | −0.6538 | −1.0214 | 2.6751 | 0.0246 | 0.0185 | 0.1360 |

[Figure: expected return E(R) plotted against risk σp for the five portfolios, which begin to trace out an efficient frontier.]
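The portfolio statistics reported for Φ = 1.0 can be verified directly from the printed weights and the variance-covariance table, as in the sketch below (small differences reflect rounding in the published figures).

```python
# Verifying the Example 7.3 portfolio statistics from the printed weights
# and the variance-covariance table (values as given in the example).
import numpy as np

w = np.array([-0.0652, -0.4341, 1.4994])   # JNJ, AXP, XOM weights
mean = np.array([0.0053, 0.0055, 0.0126])  # mean monthly returns
cov = np.array([[0.0020, 0.0009, 0.0004],
                [0.0009, 0.0038, 0.0010],
                [0.0004, 0.0010, 0.0027]])

exp_ret = w @ mean      # ~0.0162
var_p = w @ cov @ w     # ~0.0055
print(round(exp_ret, 4), round(var_p, 4), round(float(np.sqrt(var_p)), 4))
```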
7.4 Conclusion

This chapter has discussed the essentials of single- and multiple-index models. The theoretical underpinnings of the
theories have been explored and numerical examples have been provided. An efficient boundary has been derived under the guidelines of the model, and the quantitative analysis of the related parameters has been studied. It has been shown that both single-index and multi-index models can be used to simplify the Markowitz model for portfolio selection; remember, however, that the multi-index model is much more complicated than the single-index model.
References

Beaver, W., P. Kettler, and M. Scholes. 1970. "The association between market determined and accounting determined risk measures." The Accounting Review 45, 654–682.
Blume, M. 1975. "Betas and their regression tendencies." Journal of Finance 20, 785–795.
Bodie, Z., A. Kane, and A. Marcus. 2006. Investments, 7th Edition, McGraw-Hill, New York.
Brenner, M. 1974. "On the stability of the distribution of the market component in stock price changes." Journal of Financial and Quantitative Analysis 9, 945–961.
Elton, E. J., M. J. Gruber, and T. Urich. 1978. "Are betas best?" Journal of Finance 23, 1375–1384.
Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2006. Modern portfolio theory and investment analysis, 7th Edition, Wiley, New York.
Fouse, W., W. Jahnke, and B. Rosenberg. 1974. "Is beta phlogiston?" Financial Analysts Journal 30, 70–80.
Frankfurter, G. and H. Phillips. 1977. "Alpha-beta theory: a word of caution." Journal of Financial Management 3, 35–40.
Frankfurter, G. and J. Seagle. 1976. "Performance of the Sharpe portfolio selection model: a comparison." Journal of Financial and Quantitative Analysis 11, 195–204.
Gibbons, M. R. 1982. "Multivariate tests of financial models, a new approach." Journal of Financial Economics 10, 3–27.
Haugen, R. and D. Wichern. 1975. "The intricate relationship between financial leverage and the stability of stock prices." Journal of Finance 20, 1283–1292.
Jacob, N. 1974. "A limited-diversification portfolio selection model for the small investor." Journal of Finance 19, 847–856.
King, B. 1966. "Market and industry factors in stock price behavior." Journal of Business 39, 139–140.
Latane, H., D. Tuttle, and A. Young. 1971. "How to choose a market index." Financial Analysts Journal 27, 75–85.
Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York.
Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics, World Scientific Publishing Co, Singapore.
Levy, R. 1974. "Beta coefficients as predictors of return." Financial Analysts Journal 30, 61–69.
Maginn, J. L., D. L. Tuttle, J. E. Pinto, and D. W. McLeavey. 2007. Managing investment portfolios: a dynamic process, CFA Institute Investment Series, 3rd Edition, Wiley, Hoboken, NJ.
Markowitz, H. 1959. Portfolio selection. Cowles Foundation Monograph 16, Wiley, New York.
Morgan, I. G. 1977. "Grouping procedures for portfolio formation." Journal of Finance 21, 1759–1765.
Roll, R. 1969. "Bias in fitting the Sharpe model to time series data." Journal of Financial and Quantitative Analysis 4, 271–289.
Rosenberg, B. and J. Guy. 1976a. "Prediction of beta from investment fundamentals." Financial Analysts Journal 32, 60–72.
Rosenberg, B. and J. Guy. 1976b. "Prediction of beta from investment fundamentals, part II." Financial Analysts Journal 32, 62–70.
Rosenberg, B. and V. Marathe. 1974. The prediction of investment risk: systematic and residual risk, Berkeley Working Paper Series.
Rosenberg, B. and W. McKibben. 1973. "The prediction of systematic and specific risk in common stocks." Journal of Financial and Quantitative Analysis 8, 317–333.
Ross, S. A. 1976. "Arbitrage theory of capital-asset pricing." Journal of Economic Theory 8, 341–360.
Ross, S. A. 1977. "Return, risk and arbitrage," in Risk and return in finance, Vol. 1, I. Friend and J. L. Bicksler (Eds.). Ballinger, Cambridge, pp. 187–208.
Sharpe, W. F. 1967. "A linear programming algorithm for mutual fund portfolio selection." Management Science 13, 499–510.
Sharpe, W. F. 1970. Portfolio theory and capital markets, McGraw-Hill, New York.
Smith, K. V. 1969. "Stock price and economic indexes for generating efficient portfolios." Journal of Business 42, 326–335.
Stone, B. 1973. "A linear programming formulation of the general portfolio selection problem." Journal of Financial and Quantitative Analysis 8, 621–636.
Wackerly, D., W. Mendenhall, and R. L. Scheaffer. 2007. Mathematical statistics with applications, 7th ed., Duxbury Press, California.
Appendix 7A A Linear-Programming Approach to Portfolio-Analysis Models

Sharpe (1967) developed a simplified portfolio-analysis model designed to be formulated as a linear-programming problem. Jacob (1974) developed a linear-programming model for small investors that delineates efficient portfolios composed of only a few securities. Both of these approaches have as their objectives: (1) a reduction in the amount of data required and (2) a reduction in the amount of computer capability required to solve the portfolio-selection problem. Sharpe approaches the problem of capturing the essence of mean-variance portfolio selection in a linear-programming formulation by the following:
1. Making a diagonal transformation of the variables that will convert the problem into a diagonal form, and
2. Using a piecewise linear approximation for each of the terms for variance.

The LP that results from the use of market responsiveness as the risk measure and the imposition of an upper limit on investment in each security is

$$\operatorname{Max}\ P = \lambda\left[\sum_{i=1}^{n}x_iE(R_i)\right] - (1-\lambda)\left[\sum_{i=1}^{n}x_i\beta_i\right] \tag{7A.1}$$

subject to:

$$\sum_{i=1}^{n}x_i = 1,\qquad 0 \le x_i \le U$$

where:
$x_i$ = the fraction of the portfolio invested in security i;
$E(R_i)$ = the expected return of security i;
$\beta_i$ = the beta coefficient of security i;
$U$ = the maximum fraction of the portfolio that may be held in any one security; and
$\lambda$ = a parameter reflecting the degree of risk aversion.

The λ is used in generating the efficient frontier. The value of λ corresponds to the rate of substitution of return for risk measured by the individual security β, while the maximum percentage U to be invested in any one security greatly reduces the number of securities that will be needed for diversification.

Building on this framework, Jacob derives an LP model that allows the small investor explicit control over the number of securities held. Jacob's LP model incorporates the effects of unsystematic risk as well as systematic risk (the beta in the Sharpe LP). Jacob's model is to minimize

$$\sigma^2(R_p) = \underbrace{\left(\frac{1}{K}\right)^2\left(\sum_{i=1}^{n}x_i\beta_i\right)^2\sigma^2(R_m)}_{\text{Systematic risk}} + \underbrace{\left(\frac{1}{K}\right)^2\sum_{i=1}^{n}x_i^2\sigma^2(e_i)}_{\text{Unsystematic risk}} \tag{7A.2}$$

subject to:

$$\frac{1}{K}\sum_{i=1}^{n}x_iE(R_i) \ge E^*(R_p),\qquad \sum_{i=1}^{n}x_i = K,\qquad x_i = 0 \text{ or } 1$$

where:
$x_i$ = the investment in security i (all or nothing);
$E(R_i)$ = the expected return on security i, equal to $\alpha_i + \beta_iR_m$;
$K$ = the desired upper bound on the number of securities the investor is willing to consider, usually in the range of 15–20;
$\beta_i$ = the measure of systematic risk;
$\sigma^2(e_i)$ = the measure of unsystematic risk; and
$E^*(R_p)$ = the lowest acceptable rate of return the investor is willing to earn on his or her portfolio.

An additional simplification is to turn the objective function of Equation (7A.2) into a linear relationship. Since the decision variables $x_i$ are binary valued, the unsystematic-risk term of the objective function is already linear. Given that the portfolio beta $\sum_{i=1}^{n}x_i\beta_i/K$ is very close to unity, a reasonable linear approximation to the first term in the objective function is provided by:

$$\left(\frac{1}{K}\right)^2\left(\sum_{i=1}^{n}x_i\beta_i\right)^2\sigma^2(R_m) \approx \left[\frac{1}{K}\sigma^2(R_m)\right]\sum_{i=1}^{n}x_i\beta_i$$

After division by K and rearrangement of terms, the objective function can be restated:

$$Z = \frac{1}{K}\sum_{i=1}^{n}x_i\left[\beta_i\sigma^2(R_m) + \frac{1}{K}\sigma^2(e_i)\right]$$

Letting $Z_i = \beta_i\sigma^2(R_m) + (1/K)\sigma^2(e_i)$, the problem can be cast as:

$$\operatorname{Minimize}\ Z = \lambda\sum_{i=1}^{n}X_iZ_i - (1-\lambda)\sum_{i=1}^{n}X_iE(R_i)$$

subject to:

$$\sum_{i=1}^{n}X_i = 1.0,\qquad 0 \le X_i \le 1/K,\qquad i = 1, 2, \ldots, N$$

in which λ is varied from zero to one. This problem can be solved by the linear-programming approach suggested by Sharpe (1967). An example of solving this problem can be found in Jacob (1974).
Appendix 7B Expected Return, Variance, and Covariance for a Multi-index Model Using the orthogonalization technique discussed in the text, the multi-index model (MIM) is transformed into Ri D ai C bi1 I1 C bi 2 I2 C C biL IL C ci
(7B.1)
in which all Ij are uncorrelated with each other. To interpret the transformed indexes, notice that I2 is now the difference between the actual level of the index and the level it would be, given the level of the other indexes. Also, bi 2 is now the sensitivity of security i ’s return to a change in I2 , given that all other indexes are held constant. It is also convenient, beyond making the indexes uncorrelated, to assume that the covariance of the residuals with the indexes is equal to zero. With this final assumption, the MIM can be recapped as follows. Generalized equation: Ri D ai C bi1 I1 C bi 2 I2 C C biL IL C ci
124
C.-F. Lee et al.
for all securities i D 1 to N , and:
Remembering that:
E.ci / D 0 Var .ci / D ci2 Var .Ij / D Ij2 Cov .Ij ; Ik / D EŒ.Ij I j /.Ik I k / D 0; I j D E.Ij /; I ci D E.Ici / 5. Cov.ci ; Ij / D EŒci .Ij I j / D 0 6. Cov.ci ; cj / D E.ci ; cj / D 0 for all i, j, and k.
1. 2. 3. 4.
The last statement is equivalent to the residuals being unrelated, which is to say that the only reason for common comovement of returns is due to the movement of the indexes. Although this is like ignoring some of a model’s shortcomings, it facilitates a usable model to obtain a quantified idea of future return patterns. The expected value of the model with the multiple indexes is
E
Ii I i
Ij I j D 0 and E Ii I i ci D 0
the only nonzero term involving index one is expressed: 2 2 2 E.I1 I 1 /2 D bi1 I1 bi1
Because all terms involving ci are zero and E.ci /2 D ci2 : 2 2 2 2 I1 C bi22 I22 C C biL IL C ci2 i2 D bi1
(7B.3)
The covariance of the returns between security i and j utilizing the multiple-index model can be expressed: ij D E .Ri ri / Rj rj Again, substituting for Ri ; ri ; Rj , and rj :
E.Ri / D E.ai C bi1 I1 C bi 2 I2 C C biL IL C ci / D E.ai / C E.bi1 I1 / C E.bi 2 I2 / C
ij DEŒŒ.ai C bi1 I1 C bi 2 I2 C C biL IL C ci / .ai C bi1 I 1 C bi 2 I 2 C C biL I L /
C E.biL IL / C E.ci /
Œ.aj C bj1 I1 C bj 2 I2 C C bjL IL C cj / because a and the b’s are constants, and the expected value of the residuals is equal to zero ri D ai C bi1 I 1 C bi 2 I 2 C C biL I L
(7B.2)
where I j is the expected value of index j . The variance of the returns using multiple indexes is:
.aj C bj1 I 1 C bj 2 I 2 C C bjL I L / Again, noting that the ai cancel and combining the terms involving the same b’s: ij D EfŒbi1 .I1 I 1 / C bi 2 .I2 I 2 / C C biL .IL I L / C ci
i2 D E.Ri ri /2 where ri is the expected value of the returns of security i . Substituting Ri and ri from above: i2
DEŒ.ai C bi1 I1 C bi 2 I2 C C biL IL C ci / ai C bi1 I 1 C bi 2 I 2 C C biL I L 2
Rearranging, and noticing that the ai cancel, yields: i2 D EŒbi1 I1 I 1 C bi2 I2 I 2 C C biL IL I L C ci 2
Next the terms in the brackets are squared. To proceed with this, concentrate on the first index, and the rest of the terms involving the other indexes follow directly. The first index times itself and all other terms yields 2 2 I1 I 1 C bi1 bi 2 I1 I 1 I2 I 2 C EŒbi1 Cbi1 biL I1 I 1 IL I L C bi1 I1 I 1 .ci /
Œbj1 .I1 I 1 / C bj 2 .I2 I 2 / C C bjL .IL I L / C cj g Next multiply out terms, again concentrating on the terms involving bi1 : EŒbi1 bj1 .I1 I 1 /2 C bi1 bi 2 .I1 I 1 /.I2 I 2 / C C bi1 bjL .I1 I 1 /.IL I L / C bi1 .I1 I 1 /cj Because the covariance between two indexes is zero, and the covariance between any residual and an index is zero: 2 bi1 bj1 E.I1 I 1 /2 D bi1 bj1 I1
To conclude, remember that the covariance of the residuals is equal to zero and therefore: 2 2 ij D bi1 bj1 I1 C bi 2 bj 2 I22 C C biL bjL IL
(7B.4)
Chapter 8
Performance-Measure Approaches for Selecting Optimum Portfolios Cheng-Few Lee, Hong-Yi Chen, and Jessica Shin-Ying Mai
Abstract In this chapter, following Elton et al. (Journal of Finance 31:1341–57, 1976; Modern portfolio theory and investment analysis, 7th edn. Wiley, New York, 2006), we introduce the performance-measure approaches to determine optimal portfolios. We find that the performance-measure approaches for optimal portfolio selection are complementary to the Markowitz full variance-covariance method and the Sharpe index-model method. The economic rationale of the Treynor method is also discussed in detail. Keywords Sharpe performance measure approach r Lintner’s method of short sales r Treynor measure r Short sales r Standard method of short sales
8.1 Introduction Previously, we have discussed Markowitz’s (1952, 1959) full variance-covariance approach to determine optimal weights of a portfolio. Moreover, we also utilized Sharpe’s indexmodel approach to simplify Markowitz’s optimal portfolioselection process. This chapter assumes the existence of a risk-free borrowing and lending rate and advances one step further to simplify the calculation of the optimal weights of a portfolio and the efficient frontier. First discussed are Lintner’s (1965) and Elton et al. (1976) Sharpe performancemeasure approaches for determining the efficient frontier with short sales allowed. This is followed by a discussion of the Treynor performance-measure approach for determining the efficient frontier with short sales allowed. The Treynor measure approach is then analyzed for determining the efficient frontier with short sales not allowed. And finally, Dow Jones 30 data from January 2003 through December 2007 are employed to demonstrate how the Treynor method can be applied in the real world. Overall, this chapter
C.-F. Lee, H.-Y. Chen, and J.S.-Y. Mai () Rutgers University, New Brunswick, NJ, USA e-mail:
[email protected]
relates the performance-measure concepts and methods to the portfolio-selection models, and make more accessible the insights of optimal portfolio selection.
8.2 Sharpe Performance-Measure Approach with Short Sales Allowed In deriving the capital asset pricing model (CAPM), Lintner (1965) suggests a performance-measure approach for determining the efficient frontier discussed in previous chapters. Lintner arrived at this approach through a sequence of logical steps. Following previous chapters, the objective function for portfolio selection can be expressed: Max L D
n X
W i R i C 1
i D1
82 9 31=2 ˆ > n n X < X = 4 Cov Ri ; Rj 5 p ˆ > : j D1 i D1 ; C2
n X
! Wi 1
(8.1)
i D1
where: Ri D average rates of return for security i ; Wi (or the optimal weight for i th (or j th) security; Wj / D Cov Ri ; Rj D the covariance between Ri and Rj ; p D the standard deviation of a portfolio; and 1 ; 2 D Lagrangian multipliers. Markowitz’s portfolio selection model minimizes the variance given the targeted expected rate of return. Equation (8.1) maximizes the expected rates of return given targeted standard deviation. If a constant risk-free borrowing and lending rate Rf is subtracted from Equation (8.1):
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_8,
125
126
C.-F. Lee et al.
Max L0 D
n X
Wi Ri Rf C 1
i D1
82 9 31=2 ˆ > n n X < X = 5 4 Wi Wj Cov Ri ; Rj p ˆ > : i D1 j D1 ; C2
n X
! Wi 1
(8.2)
i D1
Equations (8.1) and (8.2), both formulated as a constrained maximization problem, can be used to obtain optimum portfolio weights Wi .i D 1; 2; ; n/. Since Rf is a constant, the optimum weights obtained from Equation (8.1) will be equal to those obtained for Equation (8.2). Previous chapters used the methodology of Lagrangian multipliers; it can be shown that Equation (8.2) can be replaced by a nonconstrained maximization method as follows. Incorporating n P Wi 1 into the objective function by the constant i D1
substituting n X
Rf D .1/ Rf D
! Wi Rf D
i D1
n X
Wi Rf
i D1
into Equation (8.2): Max L0 D
n X
Wi Ri Rf C 1
i D1
20 11=2 3 n n 6@X X 7 Wi Wj Cov Ri ; Rj A p 5 (8.2a) 4 i D1 j D1
A two-Lagrangian multiplier problem has been reduced to a one-Lagrangian problem as indicated in Equation (8.2a). By using a special property of the relationship between !1=2 n n P n P P Ri Rf and Wi Wj Cov Ri ; Rj the i D1
i D1 j D1
constrained optimization of Equation (8.2a) can be reduced to an unconstrained optimization problem, as indicated in Equation (8.3).1 n P
Wi Ri Rf
i D1
Max L D n P i D1
Wi2 i2 C
n P n P
Fig. 8.1 Linear efficient frontier
where ij D Cov Ri ; Rj . Alternatively, the objective function of Equation (8.3) can be developed as follows. This ratio L is equal to excess average rates of return for the i th portfolio divided by the standard deviation of the i th portfolio. This is a Sharpe performance measure. Following Sharpe (1964) and Lintner if there is (1965), a risk-free lending and borrowing rate Rf and short sales are allowed, then the efficient frontier (efficient set) will be linear, as discussed in previous chapters. In terms of return Rp standard-deviation p space, this linear efficient frontier is indicated as line Rf E in Fig. 8.1. AEC represents a feasible investment opportunity in terms of existing securities to be included in the portfolio when there is no risk-free lending and borrowing rate. If there is a risk-free lending and borrowing rate, then the efficient frontier becomes Rf E. An infinite number of linear lines represent the combination of a riskless asset and risky portfolio, such as Rf A; Rf B and Rf E. It is obvious that line Rf E has the highest slope, as represented by Rp Rf (8.4) ‚D p in which Rp D
Wi Ri ; Rf , and p are defined as in
i D1
Equation (8.2). Thus the efficient set is obtained by maximizn P Wi D 1, Equation (8.4) ing ‚. By imposing the constraint i D1
is expressed:
! n X Rp Rf C Wi 1 ‚ D p i D1 0
!1=2
.i ¤ j /
Wi Wj ij
i D1 j D1
Since the ratio of Equation (8.3) is homogeneous of degree zero with respect to Wi . In other words, the ratio L is unchanged by any proportionate change in the weight of Wi .
(8.5)
By using the procedure of deriving Equations (8.2a), (8.5) becomes n P
(8.3) 1
n P
‚0 D
Wi Ri Rf
i D1
n P
i D1
Wi2 i2
C
n P n P i D1 i D1
.i ¤ j / ij
(8.5a)
8 Performance-Measure Approaches for Selecting Optimum Portfolios
This equation is equivalent to Equation (8.3). This approach is used by Elton and Gruber (1987) to derive their objective function for optimal portfolio selection.2 Following the maximization procedure discussed earlier in previous chapters, it is clear that there are n unknowns to be solved in either Equation (8.3) or Equation (8.5a). Therefore, calculus must be employed to compute n firstorder conditions to formulate a system of n simultaneous equations: 1
dL D0 dW 1
2
dL D0 dW 2
127
Example 8.1. Let R1 D 15% R2 D 12% R3 D 20% 3 D 9% 1 D 8% 2 D 7% r13 D 0:4 r23 D 0:2 r12 D 0:5 Rf D 8% Substituting this information into Equation (8.6a): 15 8 D 64H1 C .0:5/ .8/ .7/ H2 C .0:4/ .8/ .9/ H3 12 8 D .0:5/ .8/ .7/ H1 C 49H2 C .0:2/ .7/ .9/ H3 20 8 D .0:4/ .8/ .9/ H1 C .0:2/ .7/ .9/ H2 C 81H3 Simplifying:
:: :
:: :
7 D 64H1 C 28H2 C 28:8H3
n
dL D0 dW n
4 D 28H1 C 49H2 C 12:6H3 12 D 28:8H1 C 12:6H2 C 81H3
From Appendix 8A, the n simultaneous equations used to solve Hi are R1 Rf D C H2 12 C H3 13 C C Hn 1n R2 Rf D H1 12 C H2 22 C H3 23 C C Hn 2n :: :
Using Cramer’s rule, H1 ; H2 , and H3 can be solved for as follows:
H1 12
Rm Rf D H1 1n C H2 2n C H3 3n C C Hn n2 (8.6) where:
H1 D
Rp Rf p2
.
The HS are proportional to the optimum portfolio weight Wi .i D 1; 2; : : : ; n/ by a constant factor K. To determine the optimum weight Wi ; Hi is first solved from the set of equations indicated in Equation (8.6). Having done so the Hi must be called to calculate Wi , as indicated in Equation (8.7). Wi D
Hi n P
ˇ ˇ ˇ 7 ˇ ˇ ˇ 4 ˇ ˇ ˇ ˇ 12
R1 Rf D H1 12 C H2 12 C H3 13 R3 Rf D H1 13 C H2 23 C H3 32 Example 8.1 provides further illustration. 2
Elton et al. (2006).
ˇ
49 12:6 ˇˇ
12:6 81
ˇ ˇ ˇ
D
33648:1 27:1177 6350:4 D D 3:97% 160030 160030 ˇ ˇ ˇ 64 7 28:8 ˇ ˇ ˇ ˇ ˇ ˇ 28 4 12:6 ˇ ˇ ˇ ˇ 28:8 12 81 ˇ
H2 D ˇ ˇ ˇ 64 28 28:8 ˇ ˇ ˇ ˇ ˇ ˇ 28 49 12:6 ˇ ˇ ˇ ˇ 28:8 12:6 81 ˇ D
.2540:2 C 9676:6 C 20736/ .9676:8 C 15876 C 3317:8/ 160030
D
32952:8 28870:6 D 2:55% 160030
H3 D
R2 Rf D H1 12 C H2 22 C H3 23
ˇ ˇ
28 28:8 ˇˇ
.4233:6 C 1451:5 C 27783/ .1111:3 C 9072 C 16934:4/ .10160:6 C 10160:6 C 254016/ .10160:6 C 63504 C 40642:6/
i D1
If there are only three securities, then Equation (8.6) reduces to:
ˇ
49 12:6 ˇˇ ˇ 12:6 81 ˇˇ
ˇ ˇ ˇ 64 ˇ ˇ ˇ 28 ˇ ˇ ˇ ˇ 28:8
(8.7)
Hi
ˇ ˇ
28 28:8 ˇˇ
D
Hi D kW i .i D 1; 2; : : : ; n/; and kD
(8.6b)
(8.6a)
ˇ ˇ ˇ 64 ˇ ˇ ˇ 28 ˇ ˇ ˇ ˇ 28:8 ˇ ˇ ˇ ˇ 64 ˇ ˇ 28 ˇ ˇ ˇ ˇ 28:8
ˇ ˇ
28 28:8 ˇˇ ˇ 49 12:6 ˇˇ ˇ ˇ ˇ ˇ ˇ 28:8 ˇˇ ˇ 12:6 ˇˇ ˇ 81 ˇˇ
12:6 81 28 49 12:6
.3225:6 C 2469:6 C 37632/ .3225:6 C 9480 C 9878:4/ 160030 20815:2 D D 13:01% 160030
D
128
C.-F. Lee et al.
Using Equation (8.7), W1 ; W2 , and W3 are obtained: W1 D
H1 3 P
D
Hi
3:97 3:97 D 3:97 C 2:55 C 13:01 18:53
The efficient frontier for this example is shown in the figure. A represents an efficient portfolio with Rp D 17:94 percent and p2 D 50:90 percent.
i D1
D 20:33% H2 2:55 W2 D 3 D P 19:53 Hi
8.3 Treynor-Measure Approach with Short Sales Allowed
i D1
Using the single-index market model discussed in previous chapter, Elton et al. (1976) define:
D 13:06% H3 13:01 W3 D 3 D P 19:53 Hi
p D
i D1
n P iD1
Wi2 ˇi2 m2
C
D 66:61%
n P n P iD1 j D1
Wi Wj ˇi ˇj m2
C
n P iD1
!1=2 2 Wi2 2i
j ¤i
Rp and p2 can be calculated by employing these weights:
Substituting of this value of p into Equation (8.3):
Rp D .15/ .0:2033/ C .12/ .0:1306/ C .20/ .0:0061/ D 3:049 C 1:5672 C 13:322
n P iD1
LD
D 17:9382%
Wi Ri Rf
p D
n P iD1
Wi2 ˇi2 m2
C
n n P P iD1 j D1
Wi Wj ˇi ˇj m2
C
n P iD1
!1=2 2 Wi2 2i
j ¤i
p2
D
3 X
Wi2 i2
i D1
C
3 3 X X
(8.8)
Wi Wj rij i j
i ¤j
i D1 j D1
D .0:2033/2 .64/ C .0:1306/2 .49/ C .0:6661/2 .81/
In order to find the set of Wi s that maximizes L, take the derivative of the above equation with respect to each Wi . Let
C2 .0:2033/ .0:1306/ .0:5/ .8/ .7/
Hi D
C2 .0:2033/ .0:6661/ .0:4/ .8/ .9/ C2 .0:2/ .0:1306/ .0:6661/ .7/ .9/ D 2:645 C 0:836 C 35:939 C 1:487 C 7:8 C 2:192 D 50:899%
Rp Rf p2
! Wi
since Rp Rf =p2 is a constant factor for each security to be included in the portfolio. Hence, it can be cancelled by using a standard scaling method: Wi D
Efficient frontier for Example 8.1
Hi n P Hi
(8.9)
i D1
From Appendix 8B Hi can be obtained as follows: 1 n P .Rj Rf / 2 2 B m j D1 2i C ˇ 2
C B i B C i ˇ2 A 2 @ P j 2i 1 C m2 2 0
Hi D
Ri Rf 2 2i
j D1
(8.10)
ej
Equation (8.10) can be modified to: ˇi Hi D 2 2i
Ri Rf C ˇi
! (8.11)
8 Performance-Measure Approaches for Selecting Optimum Portfolios
ˇ in which Ri Rf ˇ ˇi is the Treynor performance measure. C can be defined as m2 C D
i P .Ri Rf /ˇi j D1
1C
m2
2 2j
i P
ˇj2
j D1
2 2j
(8.12)
The Hi s must be calculated for all of the stocks in the portfolio. If Hi is a positive value, this indicates the stock will be held long, whereas a negative value indicates that the stock should be sold short. This method is called the Treynor measure approach. The argument will be clearer when the case of portfolio selection with short sales not allowed is discussed. To determine the optimum portfolio from the Hi s (such that 100% of funds are invested) the weights must be scaled. One method follows the standard definition of short sales, which presumes that a short sale of stock is a source of funds to the investor; it is called the standard method of short sales. This standard scaling method is indicated in Equation (8.9). In Equation (8.9), Hi can be positive or negative. This scaling factor includes a definition of short sales and the constraint: n X
jWi j D 1
i D1
(1) Security number
(2) Ri
(3) Ri Rf
(4) “i
(5) 2 2i
1 2 3 4 5
15 13 10 9 7
10 8 5 4 2
1 2 1.43 1.33 1
30 50 20 10 30
From the information in the table, using Equations (8.10) and (8.11). Hi .i D 1; 2; : : : ; 5/ can be calculated: H1 D
1 30
3:067 D 0:2311
13 5 3:067 D 0:0373 2
10 5 1:43 H3 D 3:067 D 0:0307 20 1:43
95 1:33 H4 D 3:067 D 0:0079 10 1:33
75 1 H5 D 3:067 D 0:0356 30 1
H2 D
2 50
15 5 1
According to Lintner’s method:
A second method (Lintner’s (1965) method of short sales) assumes that the proceeds of short sales are not available to the investor and that the investor must put up an amount of funds equal to the proceeds of the short sale. The additional amount of funds serves as collateral to protect against adverse price movements. Under these assumptions, the constraints on the Wi s can be expressed: n X
129
5 X
Now to scale the Hi values into an optimum portfolio, apply Equation (8.13): W1 D
jWi j D 1
W2 D
i D1
And the scaling factor is expressed as: Wi D
Hi n P jHi j
W3 D (8.13)
W4 D
i D1
W5 D
Example 8.2 provides further illustration. Example 8.2. The following example shows the differences in security weights in the optimal portfolio due to the differing short-sale assumptions. Data associated with regressions of the single-index model are presented in the table. The mean return, R, the excess return Ri Rf , the beta coef2 are presented ficient ˇi , and the variance of the error term 2i in columns 2–5.
jHi j D 0:3426
i D1
0:2311 D 0:6745 0:3426 0:0373 D 0:1089 0:3426 0:0307 D 0:0896 0:3426 0:0079 D 0:0231 0:3426 0:0356 D 0:1039 0:3426
Thus, the Lintner model states that an investor should invest 67.45% in security 1, 10.89% in security 2, and 8.96% in security 3. The investor should then sell short 2.31 and 10.39% of securities 4 and 5, respectively. If this same example using the standard defini 5 is scaled
P tion of short sales Hi , which provides funds to the i D1
investor:
130
C.-F. Lee et al. 5 X
where:
Hi D 0:2556
m2
i D1
W1 D W2 D W3 D W4 D W5 D
Ci
0:2311 D 0:9041 0:2556 0:0373 D 0:1459 0:2556 0:0307 D 0:1201 0:2556 0:0079 D 0:0309 0:2556 0:0356 D 0:1393 0:2556
i P
ˇj2
j D1
2 2j
(8.15)
Example 8.3 provides further illustration.
8.4 Treynor-Measure Approach with Short Sales Not Allowed Elton et al. (1976) also derive a Treynor-measure approach with short sales not allowed. From Appendix 8C, Equation (8.11) should be modified to: !
Ri Rf Ci C i ˇi
ˇj
If all securities have positive ˇi s, the following three-step procedure from Elton et al. can be used to choose securities to be included in the optimum portfolio.3 ı 1. Use the Treynor performance measure Ri Rf ˇi to rank the securities in descending order. 2. Use Equation (8.15) to calculate Ci for all securities. 3. Include in ı the portfolio all securities for which Ri Rf ˇi is larger than Ci .
Example 8.3. The tape showing Center for Research in Security Prices was the source of 5 years of monthly return data, from January 2003 to December 2007, for the 30 stocks in the Dow Jones Industrial Averages. The value-weighted average of the NYSE index was used as the market while 3-month Treasury-bill rates were used as the risk-free rate. The single-index model was used with an ordinary leastsquares regression procedure to determine each stock’s beta. The following data were compiled for each stock. 1. 2. 3. 4. 5.
The mean monthly return Ri The mean excess return Ri Rf The beta coefficient The variance of the residual errors ı The Treynor performance measure Ri Rf ˇi
All data are listed in the worksheet on next page, which lists the companies in descending order of the Treynor performance measure.4 To calculate the Ci as defined in Equation (8.15), cali h . 2 . 2 i P , culate Rj Rf ˇj 2j ; Rj Rf ˇj 2j
(8.14)
j D1
.
and
2 ˇj2 2j :
presented
ˇi 2i
j D1
Rj Rf 2 2j
1 C m2
Using this definition of short sales provides that the investor should invest 90.41% of his or her money in security 1, and so on. If all Wi s are added together, they equal 100%. This is true because the definition says that the funds received from selling short a security should be used to purchase more of the other securities. The difference between Lintner’s method and the standard method are due to the different definitions of short selling discussed earlier. The standard method assumes that the investor has the proceeds of the short sale, while Lintner’s method assumes that the short seller does not receive the proceeds and must provide funds as collateral. The method discussed in this section does not require the inputs of covariance among individual securities. Hence, it is a simpler method than that of the Sharpe performancemeasure method discussed in the previous section. The relative advantage of the Sharpe performance method over the Treynor performance method is exactly identical to the relative advantage of the Markowitz model over the Sharpe single-index model. In sum, the Treynor performance method for portfolio selection requires the information of both the risk-free rate and the market rates of return.
Hi D
D
i P
0:00207;
i P
. i P 2 ˇj2 2j
is
calculated
and
j D1
in the worksheet. Substituting m2 D h .
i . 2 i P 2 Rj Rf ˇj 2j , and ˇj2 2j
j D1
j D1
into Equation (8.15) produces Ci for every firm as listed in the last column in the worksheet. Using company HPQ as an example: 3 If the beta coefficient ˇi for i th security is positive, then the size of Hi depends on the sign of the term ı in parentheses. Therefore, if a security with a particular Ri Rf ˇi is included in the optimum portfolio, all ı securities with a positive beta that have higher values of R i Rf ˇi must be included in the optimum portfolio. 4 This set of data has been analyzed in later chapters. The names of these 30 firms can be found in Table 19.1.
XOM MCD JNJ BA T CAT HPO CVX GE DIS KO JPM INTC AA UTX IBM AXP HD MRK VZ DD KFT GM MSFT C PG MMM WMT PFE BAC Rm Var.Rm / Rf
Ticker
Mean Ri 0:0185 0:0258 0:0043 0:0193 0:0115 0:0128 0:0202 0:0101 0:0088 0:0117 0:0080 0:0120 0:0121 0:0132 0:0077 0:0068 0:0076 0:0063 0:0038 0:0035 0:0038 0:0016 0:0004 0:0000 0:0010 0:0013 0:0017 0:0009 0:0035 0:0054 0:00887 0:002433333 0:000615571
0.0067 0.0073 0.0045 0.0080 0.0083 0.0113 0.0084 0.0102 0.0052 0.0067 0.0056 0.0070 0.0104 0.0098 0.0106 0.0066 0.0061 0.0083 0.0100 0.0066 0.0066 0.0064 0.0144 0.0114 0.0073 0.0099 0.0111 0.0059 0.0070 0.0095
Standard deviation i
Worksheet for Dow Jones industrial averages
R i Rf 0:0160 0:0234 0:0018 0:0168 0:0091 0:0104 0:0178 0:0077 0:0064 0:0093 0:0056 0:0096 0:0096 0:0107 0:0053 0:0043 0:0051 0:0038 0:0014 0:0011 0:0014 0:0008 0:0028 0:0025 0:0035 0:0011 0:0042 0:0015 0:0059 0:0079
(1)
0.7905 1.3600 0.1178 1.2084 0.7139 0.8210 1.4747 0.7931 0.7268 1.0829 0.7845 1.3579 1.6463 1.9542 1.1547 1.1726 1.4069 1.2612 0.6459 0.8905 1.1670 0.7377 1.6717 1.0800 1.2265 0.3752 0.9803 0.2021 0.7079 0.7140
ˇi
(2)
0.00228 0.00204 0.00118 0.00299 0.00370 0.00720 0.00288 0.00587 0.00130 0.00197 0.00152 0.00183 0.00482 0.00341 0.00591 0.00177 0.00103 0.00319 0.00572 0.00217 0.00176 0.00213 0.01076 0.00706 0.00226 0.00574 0.00682 0.00206 0.00264 0.00506
2 2i
(3)
(5) .Ri Rf /ˇi ei2
5:5603 15:5812 0:1837 6:8054 1:7527 1:1; 863 9:0985 1:0377 3:5539 5:0882 2:8695 7:1293 3:2879 6:1519 1:0314 2:8744 6:9941 1:5228 0:1537 0:4538 0:9278 0:2825 0:4414 0:3761 1:8779 0:0729 0:6010 0:1498 1:5809 1:1104
(4) .Ri Rf / ˇi
0:0203 0:0172 0:0157 0:0139 0:0127 0:0127 0:0121 0:0097 0:0087 0:0086 0:0071 0:0071 0:0058 0:0055 0:0046 0:0037 0:0037 0:0031 0:0021 0:0012 0:0012 0:0011 0:0017 0:0023 0:0028 0:0030 0:0043 0:0076 0:0083 0:0110
5:5603 21:1416 21:3253 28:1307 29:8834 31:0697 40:1682 41:2059 44:7598 49:8480 52:7175 59:8467 63:1346 69:2865 70:3180 73:1924 80:1865 81:7093 81:8631 82:3169 83:2447 82:9622 82:5208 82:1447 80:2668 80:1939 79:5929 79:4430 77:8622 76:7518
(6) Cumulative sum of column (5) 273:92 905:36 11:73 488:69 137:56 93:62 754:92 107:18 406:76 593:91 403:88 1; 008:07 562:52 1; 120:50 225:46 776:21 1; 913:36 499:04 72:94 366:18 775:66 255:97 259:71 165:15 664:60 24:53 140:89 19:79 189:74 100:70
ˇi2 ei2
(7)
0.002929 0.007540 0.007574 0.008514 0.008683 0.008788 0.009362 0.009370 0.009317 0.009234 0.009086 0.008788 0.008563 0.008158 0.008065 0.007709 0.007029 0.006862 0.006833 0.006667 0.006344 0.006201 0.006051 0.005951 0.005548 0.005534 0.005439 0.005422 0.005246 0.005136 0.009370
Max C1
(9) C1
273:92 1; 179:29 1; 191:02 1; 679:71 1; 817:28 1; 910:90 2; 665:82 2; 773:00 3; 179:75 3; 773:66 4; 177:53 5; 185:61 5; 748:12 6; 868:63 7; 094:09 7; 870:31 9; 783:66 10; 282:71 1; 0355:65 1; 0721:83 1; 1497:49 1; 1753:45 1; 2013:16 1; 2178:30 1; 2842:91 1; 2867:43 1; 3008:32 13; 028:81 13; 217:85 13; 318:55
(8) Cumulative sum of column (7)
8 Performance-Measure Approaches for Selecting Optimum Portfolios 131
132
C.-F. Lee et al. Positive optimum weight for eight securities
CHPQ D
Ticker
Hi
XOM MCD JNJ BA T CAT HPQ CVX Total C D
6.0189 6.4370 0.8050 2.1887 0.7820 0.4428 1.3769 0.0421 18.0935 0.009370
ˇi ="i2
.R i Rf / ˇi
(A) optimum percentage
(B) mean
.A/ .B/
346.5158 665.6941 99.5753 404.4066 192.6984 114.0371 511.8963 135.1436
0.0203 0.0172 0.0157 0.0139 0.0127 0.0127 0.0121 0.0097
0.3327 0.3558 0.0445 0.1210 0.0432 0.0245 0.0761 0.0023 1.0000
0.0185 0.0258 0.0043 0.0193 0.0115 0.0128 0.0202 0.0101 Rp
0.00615 0.00919 0.00019 0.00233 0.00050 0.00031 0.00154 0.00002 0.02023
.0:0006155/ .40:6817114/ D 0:009362497 1 C .0:0006155/ .265:818464/
From Ci of the worksheet it is clear that there are ten securities included in the portfolio. The estimated ı 2thatshould beı ; Ri Rf ˇi and Ci of these ten securities are ˇi 2i listed in the table at the top of this page. Substituting this information into Equation (8.14) produces Hi for all ten securities. Using security 5 as an example: HXOM D .346:5158/.0:0202988 0:009370285/ D 6:018948255 Using Equation (8.12) the optimum weights can be estimated for all ten securities, as indicated in the table. In other words, 33.27% of our fund should be invested in security XOM, 35.58% in security MCD, 4.45% in security JNJ, 12.10% in security BA, 4.32% in security T, 2.45% in security CAT, 7.61% in security HPQ, 0.23% in security CVX, based on the optimal weights, the average rate for the portfolio Rp is calculated as 2.02%, as presented in the last column of the table above.
8.5 Impact of Short Sales on Optimal-Weight Determination The Markowitz model of portfolio analysis. The Markowitz model requires a large number of inputs, as it is necessary to estimate the covariance between each pair of securities. Previously, the analysis was simplified by the assumption that security returns were related through a common response to some market index. This model, known as Sharpe’s
single-index model, greatly reduces the number of inputs necessary to analyze the risk and return characteristics of portfolios. In both the Markowitz and Sharpe models the analysis is facilitated by the presence of short selling. This chapter discusses a method proposed by Elton and Gruber for the selection of optimal portfolios. Their method involves ranking securities based on their excess return to beta ratio, and choosing all securities with a ratio greater than some particular cutoff level C . It is interesting to note that while the presence of short selling facilitated the selection of the optimum portfolio in both the Markowitz and Sharpe models, it complicates the analysis when we use the Elton and Gruber approach. From Example 8.3, using the Dow Jones Industrial Average (DJIA), the absence of the short selling allowed formation of the optimal portfolio using only eight stocks.
8.6 Economic Rationale of the Treynor Performance-Measure Method Cheung and Kwan (1988) have derived an alternative simple rate of optimal portfolio selection in terms of the singleindex model. First, Cheung and Kwan relate Ci as defined in Equation (8.12) to the correlation coefficient between the portfolio rates of return with i securities Ri and market rates of return Rm .i /: Ci i D (8.16) m ‚i where: i D i m =i m I i m D covariance between Ri and Rm I ı ‚i D Ri Rf i ; the Sharpe performance
8 Performance-Measure Approaches for Selecting Optimum Portfolios
measure associatedwith the i th portfolioI and i and m D standard deviation for ith portfolio and market portfolio, respectively. Cheung and Kwan show that Pi and Ci display the same functional behavior for optimal portfolio selection. In other words, if portfolios are formed by adding securities successively from the highest rank to the lowest rank, the optimal portfolio is reached when the correlation of the portfolio returns and the index is at its maximum. Since the expected return on the index must be positive if the investor is to invest in stocks, an objective of the investor using the single-index model to establish the risk-return tradeoff is to pick securities that benefit the most from a market upswing. Hence, the role of index in the selection of securities for an optimum portfolio is demonstrated explicitly. Based on the single-index model and the risk decomposition discussed in previous chapters, the following relationships can be defined: 1: 2: 3:
i D
i m i m
i m D ˇi m2 2 i2 D ˇi2 m2 C 2i
(8.17)
From Equation (8.17), Cheung and Kwan define i in terms of ˇi2 ; m2 , and i . i D
q
ı
2 ˇi2 m2 i2 C 2i D
s 1
2 2i i2
2 in which 2i is the nonsystematic risk for the i th portfolio. They use both i and ‚i to select securities for an optimum portfolio, and they conclude that i can be used to replace ‚i in selecting securities for an optimum portfolio. Nevertheless, ‚i information is still needed to calculate the weights for each security. Hence, Cheung and Kwan’s i criteria is good only for understanding Elton and Gruber’s performance-measure method for portfolio selection.
8.7 Conclusion Following Elton et al. (1976) and Elton and Gruber (1987) we have discussed the performance-measure approaches to selecting optimal portfolios. We have shown that the performance-measure approaches for optimal portfolio selection are complementary to the Markowitz full variancecovariance method and the Sharpe index-model method.
133
These performance-measure approaches are thus worthwhile for students of finance to study following an investigation of the Markowitz variance-covariance method and Sharpe’s index approach.
References Alexander, G. J. and B. J. Resnick. 1985. “More on estimation risk and single rule for optimal portfolio selection.” Journal of Finance 40, 125–134. Bodie, Z., A. Kane, and A. Marcus. 2006. Investments, 7th Edition, McGraw-Hill, New York. Chen, S. N. and S. J. Brown. 1983. “Estimation risk and simple rules for optimal portfolio selection.” Journal of Finance 38, 1087–1093. Cheung, C. S. and C. C. Y. Kwan. 1988. “A note on simple criteria for optimal portfolio selection.” Journal of Finance 43, 241–245. Elton, E. J., M. J. Gruber, and M. W. Padberg. 1976. “Simple criteria for optimal portfolio selection.” Journal of Finance 31, 1341–1357. Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2006. Modern portfolio theory and investment analysis, 7th Edition, Wiley, New York. Kwan, C. C. Y. 1984. “Portfolio analysis using single index, multiindex, and constant correlation models: a unified treatment.” Journal of Finance 39, 1469–1483. Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance. Springer, New York. Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics, World Scientific Publishing Co, Singapore. Lintner, J. 1965. “The valuation of risk asset on the selection of risky investments in stock portfolio and capital budgets.” The Review of Economics and Statistics 57, 13–37. Maginn, J. L., D. L. Tuttle, J. E. Pinto, and D. W. McLeavey. 2007. Managing investment portfolios: a dynamic process, CFA Institute Investment Series, 3rd Edition, Wiley, Hoboken, NJ. Markowitz, H. 1952. “Portfolio selection.” Journal of Finance 7, 77–91. Markowitz, H. 1959. Portfolio selection: efficient diversification of investments, Wiley, New York. Sharpe, W. F. 1963. “A simplified model for portfolio analysis.” Management Science 9, 277–293. Sharpe, W. F. 1964. “Capital asset prices: a theory of market equilibrium under conditions of risk.” Journal of Finance 19, 425–442. Wackerly, D., W. Mendenhall, and R. L. Scheaffer. 2007. Mathematical statistics with applications, 7th Edition, Duxbury Press, California.
Appendix 8A Derivation of Equation (8.6) The objective function L as defined in Equation (8.3) can be rewritten: # " n X Max L D Wi Ri Rf i D1
2 31=2 n n n X X X 4 Wi2 i2 C Wi Wj ij 5 i ¤j i D1
i D1 j D1
134
C.-F. Lee et al.
0
Then, following the product and chain rule, we have: d dL D dW i d Wi 2 4
n X
"
n X
Wi Ri Rf
Wi i2 C
n n X X
D4
n X
Wj ij 5
n n X X
i D1
n P
i ¤j kD
31=2
4
n X
i D1 j D1
Wi i2 C
n n X X
i D1
2 D4
n X
k
i D1
Wj ij 5 31=2
Wi2 i2 C
i D1
n n X X
42Wi i2 C 2
n X
i D1 j D1
Wi i2
C
n X
! Wi ij
C Ri Rf D 0
j ¤i
There is one equation like this for each value of i .
33=2
R1 Rf D H1 12 C H2 12 C C Hn 1n
Wi Wj ij 5 3
R2 Rf D H1 21 C H2 22 C C Hn 2n :: :
Wj ij 5
Rn Rf D H1 n1 C H2 n2 C C Hn n2
i D1 j D1
2
i ¤j Wi Wj ij
Ri Rf D Hi 1i C H2 2i C C Hi i2 C C Hn ni
i D1 n X
n P n P
Define Hi D kW i , where the Wi are the fractions to invest in each security and the Hi are proportional to this fraction. Substituting Hi for kW i :
Wi Wj ij 5
Ri Rf # " n X 1 Wi Ri Rf 2
4
Wi2 i2 C
dL D .kW1 1i C kW2 2i C C kWi i2 C dW i CkWn ni / C .Ri Rf / D0
2
i D1
Therefore:
i D1 j D1
Wi Ri Rf
i D1
i D1 j D1
Wi2 i2 C
(8A.2)
yields
31=2
n n X X
n P i D1
Wi Wj ij 5
" n # X d
Wi Ri Rf dW i i D1 " n # X d C Wi Ri Rf dW i i D1 2
Wj ij A
Defining
31=2
i D1 j D1
Wi2 i2 C
1
j D1
i D1
i D1
2
@Wi i2 C
#
n X
j D1
D0
.i D 1; 2; : : : ; n/
(8A.1) 2
Multiplying n P
Equation
by
4
n P
i D1
#1=2 Wi Wj ij
(8A.1)
Wi2 i2 C
n P
Appendix 8B Derivation of Equation (8.10)
i D1
Following the optimization procedure Equation (8A.2) in Appendix 8A:
; i ¤ j , and rearranging yields:
j D1
0 n B P B Wi Ri Rf B B i D1 Ri Rf D B n n P n P BP 2 2 B Wi i C Wi Wj ij @ i D1 j D1 j D1 i ¤j
1 C C C C C C C A
n P
dL D Ri Rf dW i 0
Wi Ri Rf p2
n X j D1
j ¤i
deriving
i D1
@Wi ˇi2 m2 C ˇi D0
for
1
2 A Wj ˇj m2 C Wi 2i
8 Performance-Measure Approaches for Selecting Optimum Portfolios
Let Hi D
h
Rp Rf
.
i p2 Wi and, solving for any Hi ,
Ri Rf Hi D 2 2i
ˇi m2
n P
Hj ˇj
j D1
(8B.1)
2 2i
135
Hi D
Hi ˇi D
Ri Rf ˇi 2 2i
ˇi m2
n P j D1 2 2i
d P d X
Hj ˇj2
j D1
Hj ˇj D
(8B.2)
Hj ˇj D
2 1 C 2i
2 2i
n P
ˇj2
j D1
2 2i
n X
Hj ˇj D
j D1
(8B.3) and let:
By substituting Equation (8B.3) into Equation (8B.1), Equation (8.10) is obtained.
Appendix 8C Derivation of Equation (8.15) This appendix discusses the use of performance measure to examine the optimal portfolio with short sales not allowed. Therefore, Equation (8B.1) from Appendix 8.B must be modified: n Ri Rf ˇi m2 X 2 ˇj Hj C i Hi D 2 2i 2i j D1
Rj Rf 2 2j
ˇj (8C.3)
d P j D1
ˇj 2 2j
Notice since the set d contains all stocks with positive Hi :
n P .Rj Rf /ˇj j D1
j D1
1 C m2
j D1
Adding together the n equation of this form yields: n X
(8C.2)
Multiplying both sides by ˇj , summing over all stocks in d , and rearranging yields:
Multiplying both sides of the equation by ˇi :
d Ri Rf ˇi m2 X Hj ˇj and i D 0 2 2 2i 2i j D1
(8C.1)
where Hi 0; i 0, and i Hi D 0 for all i . The justification of this equation can be found in Elton et al. (1976). Assuming all stocks that would be in an optimal portfolio (called d ) can be found, and then arranging these stocks as i D 1; 2; : : : ; d , for the subpopulation of stocks that make up the optimal portfolio:
Hj ˇj
j D1
d P
C D
d X
j D1 m2
Rj Rf 2 2j
ˇj
n P
ˇj2
j D1
2 2j
1 C m2
(8C.4)
Using Equation (8C.4), the following equation for Hi is obtained after substitution and rearranging from Equation (8C.1): ˇi Hi D 2 2i
Ri Rf C ˇi
! C i
(8C.5)
Since i 0, the inclusion of i can only increase the value of Hi . Therefore, if Hi is positive with i D 0 can never make it zero and the security should be included. If Hi < 0 when i D 0, positive values of i can increase Hi . However, because the product of i and Hi must equal zero, as indicated in Equation (8C.1), positive values of i imply Hi D 0. Therefore, any security Hi < 0 when i D 0 must be rejected. Therefore, Equation (8.15) in the text can be used to estimate the optimal weight of a portfolio.
Chapter 9
The Creation and Control of Speculative Bubbles in a Laboratory Setting James S. Ang, Dean Diavatopoulos, and Thomas V. Schwarz
Abstract Persistent divergence of an asset price from its fundamental value has been a subject of much theoretical and empirical discussion. This paper takes an alternative approach of inquiry – that of using laboratory experiments – to study the creation and control of speculative bubbles. The following three factors are chosen for analysis: the compensation scheme of portfolio managers, wealth and supply constraints, and the relative risk aversion of traders. Under a short investment horizon induced by a tournament compensation scheme, speculative bubbles are observed in markets of speculative traders and in mixed markets of conservative and speculative traders. These results maintain with super-experienced traders who are aware of the presence of a bubble. A binding wealth constraint dampens the bubbles as does an increased supply of securities. These results are unchanged when traders risk their own money in lieu of initial endowments provided by the experimenter. Keywords Speculative bubbles r Experimental asset markets r Fundamental asset values r Tournament r Market efficiency r Behavioral finance
Speculative bubbles are induced in this study under a pictoria laboratory setting, where a New York Stock Exchange type of double oral auction market (without a specialist) involving many traders is modeled. Speculative bubbles occur when buyers are willing to bid higher and higher prices for an asset, which, in retrospect, is far in excess of its worth based on fundamentals. The bubbles ultimately burst and prices drop to a much lower level.1 The stock market crash in the U.S. in 1987, in Japan in 1991–1992, the dot com bubble in 2000, and the recent housing bubble in the U.S. are examples.2 In recent years, academicians and practitioners are slowly but grudgingly coming to the realization that the extant theories of stock market behavior (e.g., efficient market hypothesis and capital asset pricing theory), fail to explain the magnitude of fluctuations in the stock market. Not only have stock prices been found to fluctuate too much relative to fundamentals, but also there have been occurrences of speculative bubbles that could not be explained by arrival of new information. Several plausible explanations for bubbles are offered such as rational bubbles (Shiller 1988; West 1988), irrational bubbles (Ackert et al. 2002; Lei et al. 2001), judgment error (Ackert et al. 2006) and herding behavior (Froot et al. 1992).3
9.1 Introduction The purpose of this study is to investigate the formation of speculative bubbles in asset prices under a laboratory setting. Specifically, we investigate how to create, control, and dismantle bubbles, as well as the conditions in which bubbles may or may not arise. J.S. Ang () Department of Finance, Florida State University, Tallahassee, FL 32306, USA e-mail:
[email protected] D. Diavatopoulos Villanova University, Villanova, PA 19085, USA e-mail:
[email protected] T.V. Schwarz Grand Valley State University, Grand Rapids, MI 49504, USA e-mail:
[email protected]
1 Stiglitz (1990), in his overview of a symposium on bubbles, defines the existence of bubbles to be: “if the reason that the price is high today is only because investors believe that the selling price will be high tomorrow – when ‘fundamental’ factors do not seem to justify such a price.” Similarly, he defines the breaking of a bubble as marked price declines that occur without any apparent new information. 2 Other notable example of bubbles include the Dutch tulip mania in the seventeenth century, the South Sea Islands Company bubbles (Voth and Temin 2003), John Low’s Mississippi Company Scheme Bubbles of the eighteenth century, the South Sea Islands Company bubbles, John Low’s Mississippi Company scheme bubbles of the eighteenth century, the U.S. stock market boom of the late 1920s, the Florida land price bubbles of the 1920s, the great bull market of the 1950s and 1960s, and the high-tech stock boom of the early 1980s and the boom and bust of the California and Massachusetts housing markets in recent years. However, due to the difficulties in specifying the fundamentals, there is still disagreements as to whether these cases could be explained by the fundamental example of Garber (1990) versus White (1990). 3 Outstanding surveys of this literature are provided by Porter and Smith (2003), Camerer (1989) and Sunder (1992).
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_9,
137
138
While some work has been done to show that bubbles can be abated with experience (Dufwenberg et al. 2005), an understanding of the formation of speculative bubbles is still important to researchers for several reasons. First, bubbles could cause significant disruptions in the asset market, not only by creating a large redistribution of wealth among investors, but also by adversely affecting the supply of funds to the market as well as resource allocation among and within firms. Second, the identification of factors affecting the formation of bubbles is crucial in aiding regulators in designing policies to reduce the occurrence or magnitude of bubbles. In particular, if bubbles can be replicated in a laboratory setting, then various proposals to dampen bubbles could also be tested and compared for their effectiveness. Roll (1989) summarizes the difficulty with examining recent empirical results of the 1987 Crash in this regard. Third, an understanding of the dynamic process of bubble formation would contribute to our knowledge of how to model the behavior of asset prices (DeLong et al. 1989; Cutler et al. 1991). In spite of some interesting recent theoretical developments, empirical research on the existence of bubbles tends to be inconclusive and with low power; for example, Gurkaynak (2005), West (1988), Flood and Hodrick (1990). A major problem is the difficulty of specifying the fundamental value of an asset, as bubbles are defined as the price in excess of the fundamental values (Bierman 1995; Robin and Ruffieux 2001). Without being able to calculate the time series of the asset’s fundamental value, price movement could simply be caused by factors affecting the fundamental valuation of the asset; for example, change in risk aversion, arrival of new information, and so forth. And if the fundamental value could only be measured imperfectly using proxies such as past dividends, the imprecise estimates would, of course, reduce the power of any test. The experimental approach reduces this problem (Cason and Noussair 2001). By design, the value of the fundamentals can be specified in advance; hence there is no measurement problem. Any gross and persistent divergence of the asset price from the prespecified fundamental value can now be attributed to bubbles (Siegel 2003). In addition to reducing the identification/measurement problem, performing laboratory experiments to study asset bubbles has two other advantages. First, it allows different characteristics of the market institutions and participants to be introduced in a controlled manner. That is, relevant factors may be manipulated to create or discourage the formation of bubbles. This is an important feature because some of these factors may not be isolated in the real world for detailed study while other factors are simply proposals in the design of market institutions of the future. Second, by controlling the information available to market participants, we can control the role played by unrelated, or exogenous events (e.g., sunspots). Thus, the laboratory experiment approach to study asset market behavior
J.S. Ang et al.
complements the theory/model building process. The three types of variables chosen for analysis in this study are the following: 1. The compensation scheme of a portfolio manager. Allen and Gorton (1988) have argued that compensation schemes for portfolio managers may induce bubbles even in a finite horizon. Also, recent literature on tournaments (see James and Issac 2000; Ehrenberg and Bognanno 1990 and others) has shown that the level and structure of relative compensation influences participant behavior, while Hirota and Sunder (2005) have found that short horizons are important factors in the emergence of bubbles. Three types of compensation structure are used in these experiments: a linear compensation scheme based on portfolio performance, and two versions of compensation based on relative performance in a short-term horizon. 2. Wealth constraint (tight/loose); supply of securities. An infinite number of trades (e.g., overlapping generations and the availability of credit) is often cited as a prerequisite for bubbles. Ricke (2004) discusses how credit made available from margin could generate bubbles. High liquidity leads to bubbles in the work of Caginalp et al. (2001). Scheinkman and Xiong, (2003), Hong et al. (2006) analyze the effect of a short sales constraint on the formation of bubbles. Therefore, experimenting with wealth constraints may provide valuable insights into the effectiveness of certain policies (such as margin rule change, credit availability, and so forth) to control bubbles. 3. The type of investors in the market (speculative/conservative). This variable tests the Keynes-Hicks theory of speculation where differences in traders’ willingness to take risks is the foundation of speculative markets. Traders in the experiments are pre-tested for attitudes toward risk taking. Bubbles are observed under the following conditions: 1. A market of speculators with short-term investment horizon. 2. A market of mixed conservative and speculative traders with short-term investment horizon. 3. A market of mixed trader types with short-term investment horizon using their own money. On the other hand, bubbles are dampened under the following investment environments: 1. A market of conservative traders with a short-term horizon. 2. A market of mixed trader types with a long-term investment horizon. 3. A market of mixed trader types when the wealth of the traders, especially the bulls, is constrained. 4. A single-period trading environment.
9 The Creation and Control of Speculative Bubbles in a Laboratory Setting
The remaining part of this paper is organized into four sections. Section 9.2 presents the hypotheses to be tested by incorporating them into the experimental design, which is discussed in greater detail in Sect. 9.3. The results are reported in Sect. 9.4 with Sect. 9.5 summarizing and concluding the paper.
9.2 Bubbles in the Asset Markets The possibility of asset bubbles has long been recognized; however, more formal theoretical development is of relatively recent vintage. Harrison and Kreps (1978), for instance, suggest that in general the right to resell the asset makes traders willing to pay more for it than they would if obliged to hold it forever. Thus, market price could exceed fundamental value. Literature on rational bubbles emphasizes that once a bubble is started, it would be rational to price the bubble component even if it is expected to burst with positive probability. Brunnermeier and Nagel (2004) examine stock holdings of hedge funds during the recent NASDAQ tech bubble and find that the portfolios of these sophisticated investors were heavily tilted towards (overpriced) technology stocks. However, this does not seem to be the result of unawareness of the bubble on the part of hedge funds.4 At an individual stock level, hedge funds reduced their exposure before prices collapsed suggesting awareness and implicit pricing of the bubble component. On the other hand, whether bubbles can even get started has been questioned 64 Diba and Grossman (1987). Essentially, if there is a finite number of periods, starting from the next to the last period, the expectation that the bubble might end may be sufficient to keep it from ever starting. By the process of backward induction or an unraveling argument, bubbles will not exist. Moreover, if the number of trades is finite, withdrawal of early trades at a profit means the remaining traders would be at a negative sum game; that is, with finite trades, will there be a “greater fool” who gets stuck when the bubble bursts. Still, perturbing the model by adding uncertainties on the length of the horizon among traders or market size may preserve the possibility of asset bubbles. Smith et al. (1988) are among the first to investigate the incidence of bubbles. Their design was to give traders common beliefs (according to one of Tirole’s requirements) and long horizons of up to 15 trading periods. Bubbles are observed in 4
Griffin et al. examine the extant theoretical literature about bubbles that includes models where naive individuals cause excessive price movements and smart money trades against (and potentially eliminates) a bubble versus models where sophisticated investors follow market prices and help drive a bubble. In considering these competing views over the tech bubble period on NASDAQ they find evidence that supports the view that institutions contributed more than individuals to the spectacular NASDAQ rise and fall.
139
several of their experimental markets. It is unclear, however, what institutional setting, other than long trading periods, induces bubbles in their study. In a speculative market where bubbles could be present, speculative traders are more likely to purchase shares (and even more so, if bubbles are rationally priced) than riskaverse traders. Not only are they more willing to put a higher value on risky assets, they are also more likely to take the chance that they might not be able to sell out their inventory before the bubble bursts. Therefore, our first hypothesis is that bubbles are more likely to be formed in a market of risk taking traders (speculators). The compensation scheme could also affect the behavior of traders. For instance, Allen and Gorton (1988), and Allen and Gale (2000) show that an option-type compensation scheme for portfolio managers could induce speculative bubbles in asset prices. Portfolio managers are encouraged via incentive rewards to generate short-term trading gains even in a finite horizon world. The current practice of publishing and ranking the short-term investment performance of portfolio managers and the very substantial incentives to hedge fund managers’ performance that may be based on unrealized gains on illiquid assets could give rise to adverse incentives. Portfolio managers who are concerned about these rankings will either take on a riskier strategy for the possibility of outshining their peers or they will simply play it safe and follow the crowd. Both portfolio strategies could lead to the formation of bubbles. The play safe by “following the herd” strategy will cause asset prices to have a strong positive correlation in the short term. Portfolio managers would be buying when others are buying, thus creating an upward price trend, and selling when others were selling, thus bursting the bubble it created. On the other hand, pursuit of a risky strategy may be sufficient to create price leadership that is followed by others in the market. This would be more likely in an uncertain valuation environment. Voth and Temin (2003) suggest that riding the bubble may actually be a profitable strategy. Portfolio managers are subject to an occupational hazard: unless they produce winning results, they stand a good chance of being fired. On the other hand, star performers receive seven or even eight figure incomes as new cash flows into the funds they manage. This compensation system is similar to tournament models where participants are paid according to their relative performance among a group of peers rather than on their absolute performance. Tournament systems are likely to produce increased performance when (1) there is difficulty in monitoring the activities of the agent, (2), when the agent possesses valuable information (Baker 1992), and (3) when good performance measures are available (Baker 1992). All three of these conditions exist in the realm of professional money management, and therefore it is probable that a relative performance compensation system
140
will be effective in increasing manager performance.5 Thus, it is hypothesized that a tournament incentive scheme that encourages a short-term horizon for portfolio managers is more likely to create bubbles.6 Finally, an important policy question has been whether restricting the availability of credit or the supply of securities in a market (by raising the margin requirement or allowing short sales) could reduce or even eliminate the formation of speculative bubbles. (Ackert et al. 2006 find that price run ups and crashes are moderated when traders are allowed to short sell). Most countries, including the U.S. and Japan, have adjusted these conditions in the recent past through adjustments in the use of stock index futures and by easing credit conditions. These changes have tended to occur subsequent to large declines in the country’s equity markets. With limited or asymmetric ability to go short versus long, speculators on the long side have an advantage in acquiring funds for investment. Additionally, if the life of a bubble is uncertain and relatively long lasting, costly short sell will not be profitable even if the bubbles eventually burst. The usual experimental design often endows traders with a relatively large initial wealth such that the budget constraint is not binding. This experiment will test the effect of a tighter budget constraint by both reducing the initial endowment and increasing the supply of securities. It is hypothesized that a wealth constraint and/or relative increase in the supply of securities will reduce the incidence of bubbles. To summarize, the effect of three factors: attitude toward risk, investment horizon, and wealth constraint are examined as to their contribution to the creation and control of asset bubbles. They are tested by incorporating them into the experimental design of a laboratory setting described below, the importance of these factors in the formation and control of asset price bubble, singly and jointly, can now be formally examined.
9.3 Experimental Design The evolutionary nature of laboratory experimental research is such that the results of any study acts as a catalyst for new questions and therefore new experiments. As with Smith
5
Becker and Huselid (1992) and Ehrenberg and Bognanno (1990) have documented in field studies that such tournament compensation systems are effective in raising performance in professional golf and auto racing competitions. 6 It is possible that, if there is sufficient number of short horizon portfolio managers herding in the manner described by Froot et al. (1992), a bubble can start on basis of any information. Shleifer and Vishny (1990) also propose that the portfolio managers have short horizon; however, it is the risk of uncertain return from investing in the longer horizon that prevented disequilibrium to be arbitraged away.
J.S. Ang et al.
et al. (1988), we note that many of our latter experiments were directly motivated by the results obtained from our earlier ones. This progression of thought and analysis will be apparent in the later section on results. Herein, however, we present the method of our investigation in comprehensive form. The creation of “bubbles” within asset markets is examined under the control of three primary factors: (1) the degree of trader risk aversion, (2) trader investment horizon, and (3) available investment capital/supply of securities. Table 9.1 summarizes the design of 14 experiments used to investigate these factors on the presence of asset bubbles. Each of these experiments uses a common market mechanism that builds on the earlier work of Forsythe et al. (1982), Plott and Sunder (1982), and Ang and Schwarz (1985). These common features are summarized below.
9.3.1 General Market Design 1. A double-oral auction, similar to that used on the floor of major U.S. exchanges, is replicated. The recruited traders are physically present within a single room during the course of trading. These traders are independent and trade solely for their own account. There are no specialists or other privileged traders. 2. Only those shares of a single generic security are traded. The sole attribute of these shares is the payment of dividends at the end of each period. 3. Each market (experiment) has ten trading periods. These periods are further categorized into five trading years each of which consists of two contiguous trading periods (A and B). Endowments (discussed below) are reinitialized at the end of the second period of each year. Thus, the initial market represents a two-period model with each security entitled to two payoffs (dividends), one at the end of period A and the other at the end of period B.7 4. Each trading period last for 6 min, with opening, warning (at 5 and 51=2 min), and closing bells. Consequently, each experiment has a total of 60 (6 min 10 periods) trading minutes. During the 6-min periods, traders can observe the continually updated bid/ask and past transacted prices. 7
7 In the experiment, a trader has at least the following choices available: (a) maintain the endowed position by not trading and receive the stochastic payoffs at the end of each period; (b) hold the securities through period A and sell in period B, in which case the investor receives the first period dividend and the selling price; (c) sell the initial holdings in period A to receive the sale price; (d) buy additional shares in period A, receive dividends at the end of the period, and then sell the securities in period B; (e) sell the securities in period A and then buy back securities in period B in order to receive the dividends; (f) purchase a net amount of shares in both periods; (g) purchase and sell shares within each period.
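For concreteness, the following is a minimal sketch of the bid/ask matching rule of such a double-oral auction. It is our simplified illustration, not the authors' trading software: the function and trader names are invented, only the best standing bid and ask are tracked, and a crossing order trades at the standing quote.

```python
# Minimal double-oral-auction matching sketch (illustrative assumption,
# not the software used in these experiments).

def run_auction(orders):
    """orders: iterable of (trader, side, price); returns executed trades."""
    best_bid = None  # (price, trader)
    best_ask = None
    trades = []
    for trader, side, price in orders:
        if side == "bid":
            if best_ask is not None and price >= best_ask[0]:
                # A crossing bid trades at the standing ask price.
                trades.append((best_ask[1], trader, best_ask[0]))
                best_bid, best_ask = None, None
            elif best_bid is None or price > best_bid[0]:
                best_bid = (price, trader)  # only improving bids stand
        else:  # side == "ask"
            if best_bid is not None and price <= best_bid[0]:
                # A crossing ask trades at the standing bid price.
                trades.append((trader, best_bid[1], best_bid[0]))
                best_bid, best_ask = None, None
            elif best_ask is None or price < best_ask[0]:
                best_ask = (price, trader)  # only improving asks stand
    return trades

# Example: an ask of 240 francs stands; a bid of 250 crosses and trades at 240.
print(run_auction([("T1", "ask", 240), ("T2", "bid", 250)]))
# [('T1', 'T2', 240)]  (seller, buyer, price)
```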
Table 9.1 Experimental Design
This table categorizes the experimental designs used to examine the impact of risk aversion, investment horizon, and credit/supply constraints (initial endowment) upon the formation and control of asset bubbles.

Participant Group(a) | Design | Initial Endowment(b) | Investment Horizon(c) | Risk Aversion(d) | Experiments(e)
Las Vegas | 1 | 2 securities, 10,000 francs | Two period | Mixed | 1, 2, 3, 5$
Las Vegas | 2 | 2 securities, 10,000 francs | Shortened | Mixed | 4m, 6m$
Las Vegas | 3 | 5 securities, 3,000 francs | Shortened | Mixed | 7x, 8x, 9t, 10t
FSU1 | 4 | 2 securities, 10,000 francs | Two period | Single type | 11s, 13c
FSU1 | 5 | 2 securities, 10,000 francs | Shortened | Single type | 12sm, 14cm
FSU2 | 2 | 2 securities, 10,000 francs | Shortened | Mixed | 15, 16
FSU2 | 6 | 10 securities, 1,000 francs | Shortened | Mixed | 17, 18
Albania | 1 | 2 securities, 10,000 francs | Two period | Mixed | 23
Albania | 2 | 2 securities, 10,000 francs | Shortened | Mixed | 24, 25
Albania | 6 | 10 securities, 1,000 francs | Shortened | Mixed | 26
Albania | 7 | 2 securities, 5,000 francs | Single period | Mixed | 19
Albania | 8 | 2 securities, 5,000 francs | Single period/Tournament | Mixed | 20, 21
Albania | 9 | 20 securities, 500 francs | Single period/Tournament | Mixed | 22

(a) The participant groups consist of the following: Las Vegas represents students from the University of Las Vegas at Nevada. FSU1 and FSU2 represent students from Florida State University at two different time periods. Albania represents students from the University of Tirana in Albania.
(b) The initial endowment refers to a trader's wealth position at the beginning of each trading year of an experiment. This endowment allows traders to sell (using the provided securities) or buy (using francs, the currency used in these experiments). The additional securities and reduced currency endowment provided in Design 3 serve to better equate relative purchasing and selling abilities.
(c) Investment horizon refers to the horizon within which traders effectively operate. A two-period horizon refers to a market where period A securities are valued on the dividends paid in both periods (A and B) of a trading year. In a shortened investment horizon, the trader is induced (via the tournament compensation schedule of Table 9.3) to operate with a horizon that is shorter than the two-period environment in which the securities pay dividends.
(d) Mixed risk aversion means that traders with various risk preferences were participants within the same market. Single type means that only speculative (s) or conservative (c) traders made up that market. The designations (s) and (c) appear next to experiments 11-14 in the last column.
(e) Notation is as follows: $ represents a market where traders provided $20 of their own money to trade, the sum of which became the pool of money dispersed according to relative profit performance. m, x, and t indicate the number of traders receiving the tournament prize as outlined in Table 9.3. This tournament compensation was used to induce a shortened-horizon market and was paid differentially to the top two (t) or the top six (x) traders. In experiments marked (m), the first three trading years paid a bonus to the top six traders, followed by years where only the top two traders received bonuses.
9.3.2 Dividend Design

1. At the beginning of each year, each trader is endowed with trading capital and shares of the generic security. Each share pays dividends at the end of the first (A) and second (B) periods. The second-period dividend is a liquidating dividend. Reinitializing positions at the beginning of each year allows for replication of decision making in experimental markets.8

8 See Smith et al. (1988) for an example of when reinitialization is not used.
Table 9.2 Dividend Design
This table presents the cash flow payoffs that a single asset will provide to its owner. The payoff differs across Trader Types I, II, and III and therefore provides for different fundamental valuations. Rational expectations equilibria are determined by the trader type with the highest valuation for that period.

Year(a) | Trader Type(b) | Period A Dividend, State(c) G / B | Expected Dividend(d) | Period B Dividend, State(c) G / B | Expected Dividend(d) | Yearly Expected Dividend(e)
1 | I   | 350 / 110 | 230* | 250 / 150 | 200  | 430
1 | II  | 250 / 150 | 200  | 200 / 140 | 170  | 370
1 | III | 200 / 140 | 170  | 350 / 110 | 230* | 400
2 | I   | 200 / 140 | 170  | 350 / 110 | 230* | 400
2 | II  | 350 / 110 | 230* | 250 / 150 | 200  | 430
2 | III | 250 / 150 | 200  | 200 / 140 | 170  | 370
3 | I   | 250 / 150 | 200  | 200 / 140 | 170  | 370
3 | II  | 200 / 140 | 170  | 350 / 110 | 230* | 400
3 | III | 350 / 110 | 230* | 250 / 150 | 200  | 430
4 | I   | 350 / 110 | 230* | 250 / 150 | 200  | 430
4 | II  | 250 / 150 | 200  | 200 / 140 | 170  | 370
4 | III | 200 / 140 | 170  | 350 / 110 | 230* | 400
5 | I   | 200 / 140 | 170  | 350 / 110 | 230* | 400
5 | II  | 350 / 110 | 230* | 250 / 150 | 200  | 430
5 | III | 250 / 150 | 200  | 200 / 140 | 170  | 370

(a) Each experiment is composed of five trading years, each of which contains two trading periods, A and B. Ownership of an asset in period A entitles the holder to both period A and period B dividends (two-period valuation), whereas period B ownership merits only that period's dividend (single-period valuation).
(b) There are three trader types in each trading year, with four traders in each category. These trader types differ only by the amount of dividend cash flows that the single traded asset will provide its holder. The four traders within each category are rotated among the other categories so as to maintain an uncertain valuation environment.
(c) Dividend states refer to the stochastic payoff that will be provided to specific trader types given the occurrence of the G (Good) or B (Bad) state. The realization of the state of nature is determined at the end of each trading period by flipping a fair coin.
(d) Given equal (50%) probabilities of occurrence of G and B, the expected dividend is the simple average of the G and B payoffs.
(e) The yearly expected dividend represents the summation of the expected dividends for both periods A and B.
* Signifies the trader type with the highest expected value in that period.
2. The dividends to be paid at the end of each period are stochastic. Two equally likely dividend outcomes are possible, the good (G) state and the bad (B) state. The realized state is announced at the end of each 6-min trading period, as determined by the flip of a coin by the experimenter.
3. The dollar amount of the dividends paid at the end of each period depends on the trader's type and the realized state. The 12 traders who make up each market are classified into three types to allow for differences in induced values. The dividend payouts for these three trader types are summarized in Table 9.2. As an example, at the beginning of year 1, the four traders of type I are privately informed that for period A they will receive either 350 or 110 for the good (G) and bad (B) states, respectively, and for period B either 250 or 150 for the good and bad states, respectively.
4. The trader type with the highest expected dividend (0.5 x Good Dividend + 0.5 x Bad Dividend; computed in the sketch after this list) is rotated each period so as to enhance trader uncertainty about equilibrium prices. Virtually all previous experimental studies have documented that, given sufficient learning (through repeated trading) in a stationary dividend payout environment, prices will rather quickly approach the rational equilibrium level. This learning has two sources: (a) observation that one's own payouts are not changing, and
(b) observation that market-generated bids, offers, and transacted prices are not changing. Our expectation is that the greater the trader's reliance on market-generated (as opposed to prior dividend) information, the more likely bubbles are to occur due to bandwagon and other crowd psychologies. If, instead of bubbles, we should observe that prices converge to rational equilibrium prices (as in the constant dividend studies), then this would strengthen our knowledge concerning efficiency in these laboratory markets. This result would also suggest that trading methods based on historical prices alone would not have value.
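As a worked illustration of this payoff design, the short sketch below (our illustration, not the authors' software; the dictionary layout is an assumption) reproduces the year-1 expected dividends of Table 9.2 from the rule 0.5 x Good + 0.5 x Bad.

```python
# Year-1 dividend payoffs from Table 9.2, in francs: {type: {period: (Good, Bad)}}.
payoffs = {
    "I":   {"A": (350, 110), "B": (250, 150)},
    "II":  {"A": (250, 150), "B": (200, 140)},
    "III": {"A": (200, 140), "B": (350, 110)},
}

# Expected dividend per period: equal 50% weights on the G and B states.
expected = {t: {p: 0.5 * g + 0.5 * b for p, (g, b) in periods.items()}
            for t, periods in payoffs.items()}

for t, e in expected.items():
    print(t, e["A"], e["B"], e["A"] + e["B"])
# I   230.0 200.0 430.0  (highest two-period value in year 1)
# II  200.0 170.0 370.0
# III 170.0 230.0 400.0  (highest period-B value)
```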
9.3.3 Investment Horizon

1. Three types of investment horizon are provided within these experiments: a single-period, a two-period, and a shortened horizon. Initially, at the beginning of each trading year, a trader is entitled to two stochastic dividends for each security held, one each at the end of periods A and B. Therefore, at the beginning of period A, a rational trader will value the security for both its period A and period B stochastic dividends. Hence, all A period
pricing should reflect a two-period investment horizon. Subsequent to the termination of period A trading and the announcement and payment of the period A dividend, period B trading proceeds. As the security is now entitled only to the period B dividend, a single-period investment horizon results for all B periods. Our hypothesis is that a shortened investment horizon increases the possibility of an asset pricing bubble. We test for this by creating a tournament compensation package in period A. The incentive for traders is to concentrate upon their single-period (A) performance over the concerns of a rational two-period price. This incentive results in a shortened investment horizon.9
2. In this study, we alter the compensation structure to induce a change in the length of a trader's investment horizon. Initially, a trader's dollar compensation is defined by the following profit function:

P_i = f [d_{i,j,A} X_{i,A} + d_{i,j,B} X_{i,B} + (R_i - C_i)]    (9.1)

where P_i is the dollar profit per trading year for trader i, consisting of dividend income and trading gains (losses) from both periods; f is the conversion rate of francs into dollars;10 d_{i,j,t} is the dividend paid in francs to trader i, given that state j occurs in period t (j = G or B; t = A or B); X_{i,t} is the number of shares held by trader i at the end of period t (t = A or B); R_i is the revenue in francs to trader i from all shares sold during periods A and B; and C_i is the cost in francs to trader i of all shares purchased during periods A and B. In order to induce pressure for a shortened investment horizon, an additional compensation package is introduced in period A of some experiments, as identified in Table 9.1. This tournament compensation system is based on the traders' relative performance as measured by the Tournament Performance Index (TPI) below:

TPI_i = R_i - C_i + M X_{i,A}    (9.2)
9 It is important to note that there is a difference between a one- or two-period horizon and a shortened horizon. In a one-period model, only a single dividend is valued. In a two-period model, two dividends are valued. In our shortened investment horizon, the trader is induced to operate within a horizon that is different from that of his operating environment. That is, within a two-period operating environment, the trader is given an incentive to operate with a shorter (possibly single-period) horizon. This is quite different from a single-period model. This shortened horizon is a stronger test of market efficiency, in that the pressures are away from, rather than toward, rational equilibrium prices (as defined in Equation (9.4), subsequently). The methodology is meant to emulate modern portfolio managers operating in an environment of perpetual-horizon stock securities yet receiving tournament incentives to outperform colleagues on a short-term basis.
10 Francs are the currency used within this study. They have been used successfully by Plott and Sunder (1982), Ang and Schwarz (1985), and others. Their primary benefit is to avoid the technical problem of dealing with small dollar amounts.
where R_i, C_i, and X_{i,A} are as previously defined, and M represents the closing market value of the shares. This closing market value is taken to be the second-to-last transacted price for that period. This procedure is introduced in order to reduce the possibility of manipulating market value by collaboration on a final transaction. It represents a simplified version of the price-averaging process that takes place on most organized exchanges for the setting of opening and closing prices.
3. The tournament compensation system provides traders with an incentive to outperform each other in period A only. This incentive system increases the importance of single-period performance (in A) over two-period concerns; that is, it induces a shorter investment horizon in period A. A trader's compensation depends upon his relative rank, as summarized in Table 9.3. In Schedule Six, the top six (of twelve) traders are rewarded with francs ranging from 1,500 down to 200. Schedule Two is an alternative schedule that is hypothesized to induce even greater competitive pressure, as only the top two traders are compensated, and greatly so.11 Ehrenberg and Bognanno (1990) and Becker and Huselid (1992) find that the reward spread does increase performance incentives. Therefore, we expect Schedule Two to increase incentives for short-term pricing behavior. Table 9.1 summarizes the experimental use of the performance reward schedule.

Table 9.3 Tournament Compensation Schedule(a)
This table presents the additional tournament compensation provided to traders based on their relative profitability in period A of certain experiments (see the shortened investment horizon entries in Table 9.1). Relative profitability is measured by TPI_i = R_i - C_i + M X_{i,A}, where TPI_i is the tournament performance index for trader i, R_i is the revenue received from the sale of assets in the period, C_i is the cost of assets purchased, M is the closing market value for the period, and X_{i,A} represents end-of-period asset holdings. Together, the index measures the total of realized and unrealized capital gains. The addition of the tournament compensation to period A provides an incentive for traders to prefer period A capital gains over equivalent period B dividends and thereby induces a shortened (from the two-period model) investment horizon.

Schedule Six: TPI rank (highest to lowest) 1 = 1,500 francs; 2 = 1,000; 3 = 700; 4 = 400; 5 = 200; 6 = 200; 7 to 12 = 0.
Schedule Two (t): TPI rank 1 = 3,000 francs; 2 = 1,000; 3 to 12 = 0.

(a) Two compensation schedules are introduced. The first provides that those traders who do better than the average (i.e., the top six) receive the additional compensation listed. In the second schedule, only the top two "superstars" are richly rewarded. The tournament literature (e.g., Baker 1991) suggests that tournament systems, and especially Schedule Two, provide effective incentive systems to increase performance. This design is meant to emulate the short-term performance pressures faced by professional money managers.

11 The compensation schemes depict the different ways portfolio managers are rewarded: those who beat the average or the market (Schedule Six), and those who are the superstars (Schedule Two).
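To make the compensation mechanics concrete, the sketch below implements Equations (9.1) and (9.2) along with the two payout schedules of Table 9.3. It is our illustration, not the authors' accounting code; the function names, the franc-to-dollar rate, and the example TPI values are invented.

```python
def yearly_profit(f, d_A, x_A, d_B, x_B, revenues, costs):
    """Equation (9.1): P_i = f[d_A*X_A + d_B*X_B + (R_i - C_i)], in dollars."""
    return f * (d_A * x_A + d_B * x_B + (revenues - costs))

def tpi(revenues, costs, closing_value, x_A):
    """Equation (9.2): TPI_i = R_i - C_i + M*X_{i,A};
    realized plus unrealized capital gains in period A."""
    return revenues - costs + closing_value * x_A

# Payout schedules of Table 9.3, in francs by TPI rank (highest first).
SCHEDULE_SIX = [1500, 1000, 700, 400, 200, 200]  # ranks 1-6; ranks 7-12 get 0
SCHEDULE_TWO = [3000, 1000]                      # ranks 1-2; ranks 3-12 get 0

def tournament_payouts(tpis, schedule):
    """Rank traders by TPI and pay according to the schedule."""
    ranked = sorted(tpis, key=tpis.get, reverse=True)
    return {t: (schedule[i] if i < len(schedule) else 0)
            for i, t in enumerate(ranked)}

# Example with invented TPI values for three traders.
print(tournament_payouts({"T1": 900, "T2": 1200, "T3": 400}, SCHEDULE_TWO))
# {'T2': 3000, 'T1': 1000, 'T3': 0}
```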
9.3.4 Risk Aversion

1. Prior to selection, each potential trader was given a lengthy questionnaire. Intermingled within this material were two psychological tests on risk taking: the Jackson Personality Inventory (1976) and the Jackson et al. (1972) tests.12 These two tests have been applied in laboratory (Ang and Schwarz 1985) and field studies (Durand et al. 2006) and are more practical to administer than the theoretical risk measures found in the economics literature.13 Those persons who score in the top 12, signifying the least risk averse, and the bottom 12, the most risk averse, are invited to participate in the second stage of the experiment.
2. Traders for experiments 1-10 were students from the University of Las Vegas at Nevada, recruited from a senior-level options class. These students had all taken two statistics courses and a corporate finance, a valuation, a portfolio analysis, and an options course. They were well trained in arbitrage, present value, and expected value. From this pool of students, 12 were chosen to participate based on their attribute ranking in risk aversion. Participants were chosen so that a mix of risk aversion types was represented in the same market. Included were those who ranked at all levels of the scale, from high to low risk aversion. This was done so that differences in individual risk behavior could be tracked within an identical market environment.
3. Experiments 11-14 were conducted at Florida State University (FSU1), and as summarized in Table 9.1, these experiments were designed so that an experimental market consisted entirely of traders who were either relatively more risk averse (conservatives) or less risk averse (speculators). This experimental form allowed for evaluation of whether risk aversion is a necessary or a sufficient condition for the presence of bubbles.
4. Experiments 15-26 were conducted at a later date at Florida State University (FSU2) and the University of Tirana in Albania. This was done to confirm the robustness of our results. We intentionally chose students from two different universities with different backgrounds to represent the two extremes in our test. Experiments 19-26 were administered to subjects in Albania, who have a low degree of familiarity with capital markets, while experiments 15-18 were administered to Florida State University students (FSU2) who had taken a financial engineering course and completed another, more involved laboratory asset market experiment. Hence, we consider these FSU2 students to be super-experienced relative to the students from Albania.

12 The authors are aware of the work of Holt and Laury (2002), which was not available at the time of this study. According to Holt and Laury, their experiment shows that increases in the payoff level increase relative risk aversion (RRA). However, when estimating RRA, Holt and Laury assume that subjects' utilities depend only on payments in the experiment. They fail to account for the wealth subjects have from other sources (see Heinemann 2003).
13 The Jackson Personality Inventory is a scientifically designed questionnaire for the purpose of measuring a variety of traits of interest in the study of personality. It was developed for use on populations of average or above-average ability. Jackson (1976, p. 9) states, "It is particularly appropriate for use in schools, colleges, and universities as an aid to counseling, for personality research in a variety of settings, and in business and industry." Of the 16 measurement scales of personality presented, one scale directly measures monetary risk taking using a set of 20 true-and-false questions. Means and standard deviations for 2,000 male and 2,000 female college students are provided. Jackson et al. (1972) identify four facets of risk taking: physical, monetary, social, and ethical. The authors' questionnaires are situational in that the respondent is asked to choose the probability that would be necessary to induce the respondent to choose a risky over a certain outcome. Jackson (1977) reports high internal consistency correlations between the risk measurement techniques.
9.3.5 Validation Procedures

The following procedures are incorporated into the experimental design to ensure the reliability and external validity of the results:
1. To guard against the possibility that subjects' experience with trading could change their attitudes toward risk taking, they were retested. Subsequent to the first four experiments, additional risk questionnaires were given to the participants. The Spearman rank correlation with the initial risk rankings was .902, with a t-statistic of 6.61, indicating that there had been no significant change in the relative risk attributes of the traders (the arithmetic is checked in the sketch following this list).
2. Videos were used to verify recorded information, identify possible irregularities, and train new subjects.
3. Subjects were given extensive training on the operation of the game; the main experiments were conducted on groups of experienced, if not super-experienced, traders.
4. Lengthy post-experiment questionnaires were also given to the subjects. Among other things, these were used to verify that the traders considered the trading strategies they adopted at the time of trade to be rational.
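The retest statistic can be verified directly. The sketch below is our check, under the assumption that the rank correlation was computed over the 12 traders of a market, using the standard transformation t = r * sqrt((n - 2) / (1 - r^2)).

```python
from math import sqrt

# Spearman rank correlation between initial and retest risk rankings,
# assuming n = 12 traders (the size of each market).
r, n = 0.902, 12
t = r * sqrt((n - 2) / (1 - r ** 2))
print(round(t, 2))  # 6.61, matching the reported t-statistic
```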
9.4 Results and Analysis

9.4.1 Control Experiments

For experiments 1-14, five experimental designs were used to test for the effects of risk aversion, investment horizon, and capital endowment on the presence of asset bubbles. These designs are summarized in Table 9.1. The first design, consisting of experiments 1, 2, 3, and 5, comprised control markets in which the two-period model was tested without extraneous influence from the three treatment variables mentioned above. Figures 9.1-9.5 plot the series of resulting prices. Bid and ask prices are represented by a "+" symbol and are connected by a vertical solid line. Transacted prices are identified by a solid horizontal line connecting each trade.

Fig. 9.1 Experiment #1
Fig. 9.2 Experiment #2
Fig. 9.3 Experiment #3
Fig. 9.4 Experiment #4

From earlier laboratory studies, we would expect prices to converge to rational expectations equilibrium levels after an initial period of learning. Two relevant concepts of equilibrium prices in these markets have been proposed (see Forsythe et al. 1982). The first is the Naive Equilibrium (NE) price: the highest price any trader in the market is willing to pay based on his individual valuation of the expected dividends for the two periods, or

NE = max_k [E(D_{A,k}) + E(D_{B,k})]    (9.3)
where k classifies the trader type (1, 2, or 3) based on prior expected dividend valuations (see Table 9.2), and E(D_{A,k}) and E(D_{B,k}) are the expected dividend values in periods A and B to the kth trader type.
The NE price is the market price that will prevail if traders use only their private information to determine value. It is naive in the sense that traders do not learn about the valuations of other traders from market trading information. These traders also ignore the option value of trading (e.g., holding a security for a period and then selling it to the trader who values it most in the remaining period). The second is the Perfect Foresight Equilibrium (PFE) price. It is equal to the highest total value that successive owners of the same share will pay; or, in the experiment, the sum of the highest expected payoffs for periods A and B over all traders, or

PFE = max_k E(D_{A,k}) + max_k E(D_{B,k})    (9.4)

Fig. 9.5 Experiment #5
These prices represent two extreme benchmarks in the continuum of the value of the capital market in discovering information through trading. The NE price gives no role to the capital market in price discovery, while the PFE price assumes full discovery (i.e., trading in the capital market can correctly identify the share's highest value in each future holding period). They define, respectively, the lower and upper bounds of the share's fundamental value. Thus, with payoffs to traders and across holding periods under the control of the experimenters, we can now identify with certainty whether a stock is undervalued (when price is below NE), overvalued (when price is above PFE), or in a bubble (when price is grossly below NE or above PFE, as in a negative or positive bubble). If the experimental market captures a well-functioning capital market, learning and repeated trials will cause prices to converge toward PFE. There are two properties of Equations (9.3) and (9.4) that are worth noting. First, NE and PFE prices are identical in a one-period world, when price determination is closer to a simple auction of a single-period payoff. Second, when the payoffs in the equations are dollars, as with cash dividends and capital gains or losses, NE and PFE give the risk-neutral prices. In the absence of risk neutrality, a negative difference between observed prices and these benchmark prices may be interpreted as a risk premium.
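As a concrete check on these benchmarks, the sketch below (our illustration; the data layout is an assumption) applies Equations (9.3) and (9.4) to the year-1 expected dividends of Table 9.2. The resulting PFE of 460 is the benchmark cited in footnote 15.

```python
# Year-1 expected dividends from Table 9.2, in francs.
E_A = {"I": 230, "II": 200, "III": 170}  # period A, by trader type
E_B = {"I": 200, "II": 170, "III": 230}  # period B, by trader type

NE = max(E_A[k] + E_B[k] for k in E_A)       # Equation (9.3): one owner holds both periods
PFE = max(E_A.values()) + max(E_B.values())  # Equation (9.4): best owner each period
print(NE, PFE)  # 430 460
```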
studies. In particular, we are able to reproduce the result that prices converge to PFE with learning and repeated trials. These prices are plotted as a solid horizontal line and are greater than the NE prices.14 The inexperience of traders in experiment 1 is greatly reduced in experiments 2 and 3 as traders learn to cope with the large uncertainty in valuations (introduced by design). This pricing uncertainty increases the traders’ reliance on “market generated information” in order to determine valuation. A micro-analysis of traders’ accounts in experiment 2 shows that some traders became actively involved in arbitrating between the A and B periods of a trading year. As a consequence these prices tended toward their PFE equilibrium levels. Traders’ learning contributed to further pricing efficiencies in experiment 3. Some earlier “irrational” trades by selected individuals had resulted in substantial losses creating a “once-bitten” effect and more rational decisions were 14 Note that all odd numbered experiments used the dividend design in Table 9.2. In order to differentiate between (1) learning about a stationary environment and (2) learning efficient valuation within laboratory markets, we created nonstationarity in equilibrium prices across experiments. In particular, for all even numbered experiments, the dividend payoffs of Table 9.2 were simply cut in half so that rational equilibrium prices were also one half that of the odd numbered experiments. When this equilibrium dividend rotation is viewed in conjunction with the previously mentioned rotation of trader types, it becomes apparent that each individual trader was likely to view the environment (at least initially) as nonstationary. Consequently, any results that we show regarding equilibrium pricing and convergence would suggest that learning about valuation methods rather than a stationary environment creates rational valuation. That is, we are concerned about learning that takes place within the trader (how he values) not about the environment (stationary value). We are able to pursue this expanded question due to our debt to earlier authors who have already well established the presence of the latter.
148
followed subsequently. By the end of this experiment, prices in both the A and B periods were close to the PFE price.15 A final examination of the validity of the experimental design was performed by requiring each trader from Las Vegas experiments 5 and 6 to “invest” his own money ($20) into the markets. As a result, it was possible for traders to lose as well as to win. The results, illustrated in Fig. 9.5, show continued price convergence toward equilibrium levels.16 Of interest is the pattern of the bid-ask spread within a period. The data suggests that the primary resolution of uncertainty is obtained during the first transaction of a period. Subsequent trading tends to vary little from earlier levels with subsequently smaller bid-ask spread levels. We conclude the control section noting that the experimental design creates price-revealing trades that foster PFE equilibrium pricing. While consistent with earlier research, these results extend our knowledge into a much more uncertain (nonstationary) valuation environment more typical of real world asset markets. In addition, the validity of these results are not affected by whether or not a dollar investment is required from traders; trading behavior is similar under both environments.
9.4.2 The Formation of Bubbles

With a well-functioning experimental design established, we now sequentially introduce our hypothesized treatment variables. In experiment 4, we introduce the shortened trading horizon with a tournament prize, as described in the experimental design. At this point, we have an advantage over previous studies in that we were able to recruit the identical 12 traders back. Based on prior studies, this level of experience should lead to converging equilibrium prices rather than bubble formation.17 The effect of the tournament compensation is to shorten the traders' investment horizon in period A from a PFE two-period model. By providing tournament payment based on period A relative ranking, there is an increased incentive to generate period A capital gains over equivalent period B dividends.

17 Our design eliminates the bubble effect of miscalculation caused by inexperienced traders, as suggested by White (1990) and King et al. (1990). It is more useful and realistic to study the formation and control of bubbles in markets of experienced traders.
The tournament compensation, while increasing the incentive to win, does not necessarily equate to higher equilibrium prices. The prize is paid for the largest (realized and unrealized) relative capital gains, which can be achieved in either a bull or a bear market. Extraordinary results are shown in Fig. 9.4, where five massive price bubbles are observed in each of the A periods. At this point, a new learning phase was initiated as traders competed strategically for the tournament prize. The dominant initial strategy centered on buying all available assets at increasing price levels, thereby creating artificial price support for capital gains. While this often resulted in achieving the prize, it also meant dealing with an inventory of overvalued assets in period B. Some traders actually lost money for the year even though they obtained the prize. It is important to note that the bubbles did not discourage the traders from participating, and at least for a while the number willing to participate actually increased. Examination of asset holdings reveals that there were four to five active prize seekers in later bubbles versus one to two initially. In addition, seven to nine traders continued to hold securities at the periods' end rather than sell out at extremely high bubble levels. The much richer tournament reward structure for "superstar" performers in years 4 and 5 (see Schedule Two of Table 9.3) resulted in the largest bubbles (consistent with our predictions) and the greatest variability in prices and bid-ask spreads. Again, the buying frenzy in period 5 was led by different traders than those in period 4. This continued rotation in trading leadership highlights that the results are not driven by a few misinformed traders. In fact, period B prices are very stable and efficiently priced. Furthermore, a trader questionnaire survey at the end of experiment 4 revealed that traders were fully cognizant of expected dividend value, yet they looked to both dividends and market-generated information to determine value. Traders stated that they were influenced by the behavior of their peers and were motivated to earn as much as possible. Several traders noted that the introduction of the tournament compensation stimulated them to take on more risk. The net result of these effects was to create a herd or bandwagon effect centered on market-generated information. Despite the earlier findings of experiment 5, we tested the validity of these bubbles in an environment where traders used their own money rather than the experimenters'.18 Would such wild speculation occur when a trader's own money was at risk? Figure 9.6 clearly shows the answer to be yes. In all five A periods, average prices are more than twice the equilibrium value.

18 To the authors' knowledge, this is the first time traders in an experimental market of this type have used their own money to trade and still produced bubbles.
Fig. 9.6 Experiment #6
As before, period B pricing is very efficient and stable. That is, although our traders engaged in bubble pricing, they arrived at it through rational means. The traders in these markets had now participated in six experiments, the most of any research to date. Yet, even in the presence of super-experienced traders, we continue to find bubble formation. In addition, these traders were aware of the situation and took every opportunity to profit from the bubble.19

9.4.3 The Control of Bubbles

It became readily apparent from the earlier experiments that restrictions on the supply side of the market were having an influence on market prices. Many traders found themselves bound in their actions by the institutional makeup of the experimental markets. Many of the traders suggested that they be allowed to short-sell in future experiments so as to implement sell strategies in overvalued markets.

19 For instance, new strategies were employed at various stages (which perpetuated continuing uncertainty in the markets). At one point, the market actually stood still for an extended period. Then traders began to liquidate at any price rather than replicate their earlier strategy of waiting until late in the period to sell out at bubble prices. Other traders began to try to scalp the market by driving prices both up and down, thereby generating capital gains in both price directions. Still others began to try to force losses on traders with large inventories and thereby improve their relative ranking. This was accomplished successfully in period 2A by selling at a loss (at a price below market prices) in order to create a low settle price, M (the second-to-last trade). Other attempts at this strategy followed in all remaining A periods. Nevertheless, bubbles persisted, and many traders were frustrated in their inability to arbitrage them away.
As previously mentioned, the tournament compensation system does not alter PFE prices, as the prize can be achieved in any type of market environment and with any type of price pattern. Given the results of our previous experiments as well as traders' comments,20 it appeared that buyers (longs) had an advantage over sellers (shorts). Is it possible that the bubbles we observe were due to differential market position in addition to a shortened investment horizon? In order to answer this question, we conducted four more experiments that provided traders with initial endowments that better equated the positions of buyers and sellers. Rather than being endowed with two securities and 10,000 francs of trading capital as before, each trader is initially endowed with five securities and 3,000 francs (see Table 9.1).21 The price patterns of experiments 7-10 are as startling as the dramatic bubbles earlier. We find that the market is immediately priced at a discount to PFE.22

20 Traders completed survey questionnaires at the completion of experiments 4, 6, and 10.
21 Given that experiment 6 period A prices averaged around 600, initial trading capital of 3,000 francs would provide buying power of roughly five securities. Consequently, the new buying power and selling power were a priori relatively equal. Even though period A prices turned out to be quite a bit lower in experiments 7-10, this did not create a great advantage to buyers, since the supply of securities (5 securities x 12 traders = 60) was relatively large for a 6-min trading period. As such, there was an ample supply of securities relative to buying power to drive prices down should traders turn bearish.
22 We were unable to recruit all 12 traders back for experiments 7-10 due to graduation, taking of jobs, etc. We were, however, able to retain 7 of the original 12 traders. These traders had now participated in six previous experiments. The five replacements were drawn from the original pool of subjects that had completed the risk attribute questionnaires. These new traders were chosen to replace the risk types that had vacated, so that, in general, we maintained a wide dispersion of risk types within the market. In addition, some of these new traders had sat in as observers to previous experiments. Others viewed videos of the earlier experiments. All were instructed in the past experimental results, and the various strategies previously used were explained. As such, we do not believe that this change is a critical factor in the continuation of our investigation.
This had never happened in any of the tournament periods before. If this had been simply the result of learning, we would have expected a gradual decline from the lofty levels of experiments 4 and 6. Rather, we see an immediate discount price that generally remains at a discount throughout all four experiments.23 Overall, we consider this to be strong evidence that a necessary condition for the creation of the large bubbles in these markets is an institutional environment biased toward more purchasing ability relative to selling ability. In summary, experiments 7-10 highlight the importance of the supply of securities (which may be augmented by short selling) and the supply of investable funds. Bubbles observed in experiments 4 and 6 are immediately eliminated when the relative purchasing advantage of long traders is removed. Rational pricing reflecting a modest risk premium results even when traders are faced with a shortened investment horizon.

23 An analysis of many of the last trades of period A for experiments 7-10 often shows either a sharp spike up or down. This illustrates that the traders had become very efficient (through learning) in their manipulation of closing prices. Given the large supply of securities available to squelch a price bubble, speculators were no longer able, as they had been with large initial endowments of trading capital, to single-handedly create capital gains by driving market prices up. With this constraint, they quickly learned that all they needed to accomplish was to purchase the most securities at current prices and then drive the market up on the final few trades. This was often easily accomplished in that (1) only the second-to-last trade needed to be higher, in line with the calculation rules of the TPI, and (2) not surprisingly, there were always many traders willing to sell their securities at a price above the current level. The art of this strategy became a matter of timing: do not try to buy the market too early lest you run out of capital, and do not be too late lest you be unable to make the second-to-last trade. There did not appear to be much of a problem for buyers in accomplishing this in experiments 7 and 8; however, starting in experiment 9, some short traders, annoyed at bullish traders getting the tournament prize, began jockeying in these last seconds with the long traders to drive prices down. The results of such feuds appear in periods 3A, 4A, and 5A of experiment 9 and each A period of experiment 10. The winner of these duels increasingly became the trader who was best able to execute his trade. Eventually, trading activity became so frantic in the last 15 s of trading that the open outcry system of the double oral auction began to break down.

9.4.4 The Impact of Risk Aversion

The results of previous experiments, especially experiment 6, showed that trader risk aversion was an important factor in determining trader strategy and therefore price patterns. In general, it was found that speculative traders were more likely to seize upon the opportunity created by the introduction of uncertainty (via the tournament period) in search of capital gains. In contrast, the more conservative traders were likely to allow the speculators to act first by creating a positive price trend and would simply sell at inflated prices, or they would allow speculators to first initiate the "burst" of the bubble and then follow in their footsteps. Consequently, the conservative traders were often those responsible for the perpetuation of a direction initially set by speculators. The purpose of experiments 11-14 was to further test these relationships.24

We chose at this time to create two separate trading groups according to risk aversion, each composed of 12 traders. These 24 traders were chosen from a pool of 70 students who completed the risk-ranking questionnaire described earlier. The 70 respondents were rank-ordered from highest to lowest in risk aversion, and the top 12 and bottom 12 students were chosen to participate in the experiments. This method allows us to obtain good separation according to risk aversion. In contrast to our previous experiments, these markets were each made up entirely of one risk aversion class. We label these two risk classes as speculators and conservatives. This is a relative nomenclature, as all of these traders are considered to be risk averse, and we only presume to provide an ordinal measure of risk aversion.

The design of these experiments follows that of experiments 1-6, as we wish to test for the presence of bubbles, and the initial endowments of experiments 7-10 have already been shown to eliminate bubbles. All of these traders had previously participated in two experimental markets and therefore can be considered experienced. Nevertheless, we test for rationality of pricing in experiments 11 and 13 before introducing the shortened horizons in experiments 12 and 14. Figures 9.11 and 9.13 reveal that both markets are quite rational in that they charge a discount from PFE as a risk premium. As expected, the conservative traders of experiment 13 charge a larger risk premium than the speculative traders of experiment 11. This result provides strong evidence in support of our measure and separation of risk aversion. We also note that the speculative group exhibits prices above the PFE levels of period B. This is consistent with our earlier results, where this was found in the single-period case.

Experiments 12 and 14 introduce the tournament compensation schedule to induce a shorter investment horizon (Figs. 9.12 and 9.14). As expected, the speculative group seizes upon the opportunity, and price bubbles are generated in the latter periods. Also to no surprise, the conservative group does not create the pressure necessary to cause bubbles to form. As a result, we conclude that a necessary condition for asset bubbles is the presence of speculators.25

Fig. 9.7 Experiment #7
Fig. 9.8 Experiment #8

24 Experiments 11-14 were conducted at a second university and, therefore, the results provide information about the external validity of our experiments outside the setting of a single university.
9.4.5 The Formation of Negative Bubbles

We have just learned that the effect of the reduced investment horizon is to increase the incentive for short-term speculative gains and that speculative traders are those most eager to earn these profits. We now extend the research design to investigate whether negative bubbles are also possible. We test this proposition by conducting four new experiments (labeled as experiments 15-18 in Table 9.1). We conduct experiments 15 and 16 as "controls" to replicate the positive bubble environment found in experiments 4, 6, and 12. Experiments 15 and 16 validate our previous results with a new set of experimental subjects; Figs. 9.15 and 9.16 plot the pattern of closing prices relative to the equilibrium level (horizontal line). In both experiments, large positive bubbles emerge in most trading years.

25 A detailed examination of individual trades reveals that the speculative group of traders was more innovative in designing new trading strategies, both in creating and in bursting bubbles. This finding is consistent with the observation made by Friedman (1992) in his review of a dozen NBER working papers on asset pricing. He finds that these research results demonstrate that rational speculative behaviors, such as an attempt by investors to learn from other investors, to affect others' opinions, or simply to engage in protective trading, could in some contexts, such as imperfect information, magnify price fluctuations.

Fig. 9.9 Experiment #9
Fig. 9.10 Experiment #10
Fig. 9.11 Experiment #11
Fig. 9.12 Experiment #12
We now pose the following question: would an environment opposite to that of Design 2 lead to negative bubbles? We keep the structure of Design 2, but since it was the unequal endowment effect (more purchasing power than selling pressure under 2 securities and 10,000 francs) that created the ability to pursue profits in a positive bubble environment, we reverse the endowment effect in experiments 17 and 18 by providing 10 securities and 1,000 francs to each trader. This one change provides traders in experiments 17 and 18 with a much greater ability to sell relative to buy. The results plotted in Figs. 9.17 and 9.18 show a preponderance of negative bubbles. While the initial four years of experiment 17 show some learning adjustment to this new and difficult trading scheme, large price discounts emerge, to the extent that period 5A's closing price is insignificantly different from that of period 5B, which is a single period receiving only a single dividend. By experiment 18, each year shows downward-trending markets in each A period. The reader may notice that the positive bubbles seem to burst while the negative bubbles do not. However, since our design did not allow more cash to be made available through borrowing or infusion, a correction may not be observable within the short trading period.
9.4.6 Statistical Analysis
Fig. 9.13 Experiment #13
Fig. 9.14 Experiment #14
Table 9.4 summarizes regression analyses of the impact of the variables just discussed upon the divergence of asset prices from their PFE levels in period A. In particular, we test the following relation:

P_L - PFE = f(I, E, I*E, T, I*E*T, S, I*S, A, I*A, $)    (9.5)

where P_L - PFE is the deviation from equilibrium for period A of each trading year, P_L being the last trade of the period and PFE the Perfect Foresight Equilibrium price; f is a linear additive model; I is a dummy variable representing the shortened Investment horizon according to Table 9.1 (I = 1 for a shortened horizon, and 0 otherwise; i.e., experiments 4, 6-10, 12, 14); E is a dummy variable representing the Endowment effect according to Table 9.1 (E = 1 when 2 securities are issued, and 0 otherwise; i.e., experiments 1-6 and 11-14); I*E is an interaction dummy variable representing both a shortened investment horizon and a two-security endowment (i.e., experiments 4, 6, 12, 14); and T is a dummy variable representing the Tournament effect according to Table 9.1 (T = 1 when there is a tournament prize for two traders only, and 0 otherwise; i.e., experiments 4, 6, 12, 14 (years 4 and 5) and 9, 10).
9 The Creation and Control of Speculative Bubbles in a Laboratory Setting Fig. 9.15 Experiment #15
Fig. 9.16 Experiment #16
Fig. 9.17 Experiment #17
Fig. 9.18 Experiment #18
Continuing the definitions for Equation (9.5): I*E*T is an interaction dummy variable representing a shortened investment horizon, a two-security endowment, and a tournament effect (i.e., experiments 4, 6 (years 4 and 5), and 9, 10); S is a dummy variable representing the extent to which speculators participated in the experiments according to Table 9.1 (S = 1 for experiments 11 and 12, and 0 otherwise); I*S is an interaction dummy variable representing the shortened investment horizon and a pure speculative-trader market (i.e., experiment 12); A is the ratio of end-of-period asset inventory for speculative traders to total asset holdings, where speculative traders are those who scored in the top one-half of the risk measurement questionnaires; I*A is an interaction variable for the shortened investment horizon and the asset-holding ratio for speculators (experiments 4, 6-10, 12, 14); and $ is a dummy variable representing experiments where traders risked their own money according to Table 9.1 ($ = 1 when own money is used, and 0 otherwise; i.e., experiments 5 and 6).

Due to their differential design, the results for experiments 1-10 appear in Panel A and those for experiments 11-14 in Panel B. Model 1 of Panel A tests the impact of (1) I = 1, a shortened horizon, (2) E = 1, a restricted endowment effect (wealth and supply effects), and (3) I = 1 and E = 1, the interaction of a shortened horizon with a restricted initial endowment. Given that the regression was run with no intercept, the coefficients represent estimates of each variable's independent impact. The results suggest that neither a shortened investment horizon nor a biased endowment effect (an advantage to "bulls" versus "bears") is by itself sufficient to induce bubble behavior. However, the interaction of these two variables is highly significant in explaining the bubble results of these experiments. That is, an environment that provides both the incentive and the ability to profit from a bubble will likely result in positive price divergence.

As hypothesized earlier, we test for the heightened effect of tournament incentives (i.e., T = 1) by examining the effect of "superstar" prizes paid to only the top two traders (as outlined in Tables 9.1 and 9.3). We also test for an interaction effect with a shortened horizon (I = 1) and restricted endowment (E = 1). Model 2 results are consistent with Model 1 in that a tournament effect is not sufficient in itself (t = 0.13 on the T variable); however, in conjunction with a reduced horizon and restricted endowment, the tournament interacts to explain a significant part (t = 4.57 on I*E*T) of the bubbles in these experiments.

In Model 3, we observe the impact of speculative traders vis-à-vis conservatives by introducing a measure of asset purchase activity. The end-of-period asset holdings of the speculative group (the top one-half of traders in risk ratings) are compared to the total asset endowment for all traders. In the absence of any effect, assets should be evenly divided and this ratio, A, should equal .5. The results of Model 3 indicate that speculators by themselves do not affect the presence of a bubble (t = 0.01 for A); however, when speculators operate within a shortened horizon (I*A), they significantly differentiate themselves from conservatives by buying more and contributing to positive price bubbles. Finally, the use of the trader's own money is shown not to significantly alter the price bubbles (t = -0.54 for $). The R² of .80 suggests that the vast majority of price deviation from PFE levels can be explained by investment horizon, endowment effects, and risk aversion.

Panel B reports the results for experiments 11-14, where markets were composed either entirely of speculators (11, 12) or entirely of conservatives (13, 14). Due to this makeup, variables A and I*A are not defined in these regressions.
Table 9.4 The Impact of Investment Horizon, Credit/Supply Constraints, Risk Aversion, and Other Variables
This table shows the extent to which certain variables cause a deviation from perfect foresight equilibrium values.(a) The following regression is estimated separately for experiments 1-10 and 11-14 according to the experimental design of Table 9.1:
P_L - PFE = f(I, E, I*E, T, I*E*T, S, I*S, A, I*A, $)

Panel A: Experiments 1-10 (n = 50)(b)
Model | Intercept | I | E | I*E | T | I*E*T | A | I*A | $ | R²
1 | NOINT(c) | -45.1 (-1.48) | -38.5 (-1.26) | 512.3 (8.40) | | | | | | .67
2 | NOINT(c) | -48.1 (-1.43) | 9.6 (0.23) | 311.8 (6.28) | 6.1 (0.13) | 382.3 (4.57) | | | | .79
3 | -178.1 (-2.73) | 145.1 (1.80) | 195.9 (2.23) | 17.3 (0.38) | | 282.5 (3.08) | 0.8 (0.01) | 292.4 (1.77) | -23.6 (-0.54) | .80

Panel B: Experiments 11-14 (n = 20)
Model | Intercept | I | I*E*T | S | I*S | R²
4 | -150.5 (-11.17) | 109.1 (6.98) | | 89.9 (4.92) | | .75
5 | -139.0 (-9.30) | 61.8 (2.71) | 61.1 (2.83) | 67.0 (3.17) | 45.8 (1.53) | .83

(a) t-values in parentheses. Variables are defined as follows: P_L - PFE is the deviation from equilibrium for period A of each trading year, where P_L is the last trade of the period and PFE is the Perfect Foresight Equilibrium price; I is a dummy variable for the shortened Investment horizon (I = 1 for a shortened horizon, and 0 otherwise; i.e., experiments 4, 6-10, 12, 14); E is a dummy variable for the Endowment effect (E = 1 when 2 securities are issued, and 0 otherwise; i.e., experiments 1-6, 11-14); I*E is an interaction dummy representing both a shortened investment horizon and a two-security endowment (i.e., experiments 4, 6, 12, 14); T is a dummy variable for the Tournament effect (T = 1 when there is a tournament prize for two traders only, and 0 otherwise; i.e., experiments 4, 6, 12, 14 (years 4 and 5) and 9, 10); I*E*T is an interaction dummy representing a shortened investment horizon, a two-security endowment, and a tournament effect (i.e., experiments 4, 6 (years 4 and 5), and 9, 10); S is a dummy variable for experiments composed entirely of Speculators (S = 1 for experiments 11 and 12, and 0 otherwise); I*S is an interaction dummy representing the shortened investment horizon and a pure speculative trader market (i.e., experiment 12); A is the ratio of end-of-period Asset inventory for speculative traders to total asset holdings, where speculative traders are those who scored in the top one-half of the risk measurement questionnaires; I*A is an interaction variable for the shortened investment horizon and the asset-holding ratio for speculators (experiments 4, 6-10, 12, 14); $ is a dummy variable for experiments where traders risked their own money ($ = 1 when own money is used, and 0 otherwise; i.e., experiments 5 and 6).
(b) As reported in Table 9.1, experiments 1-10 consisted of traders with a wide range (mixed) of risk aversion. Experiments 11 and 12 were composed only of speculative traders, with experiments 13 and 14 composed of conservatives.
(c) NOINT means the regression was run by suppressing the intercept. *, **, *** signify statistical significance at the .10, .05, and .01 levels, respectively.
Table 9.5 Negative Bubble and Single Period Results
This table shows the extent to which endowment, in conjunction with other variables, causes a deviation from perfect foresight equilibrium values.(a) The following regression is estimated separately for experiments 15-18, 19-22, and 23-26 according to the experimental design of Table 9.1:
P_L - PFE = f(NI, E, I*E, E_N, I*E_N)

Model | Intercept | I*E | NI | I*E_N | R²
Panel A: Experiments 15-18 (n = 20)
6 | NOINT(b) | 400.6 (4.67) | 45.0 (0.37) | -201.3 (-2.34) | .55
Panel B: Experiments 19-22 (n = 40)
7 | NOINT(b) | 22.5 (2.36) | -22.8 (-1.60) | -92.9 (-6.89) | .57
Panel C: Experiments 23-26 (n = 20)
8 | NOINT(b) | -130.0 (-3.61) | 174.5 (3.95) | -208.0 (-4.08) | .80

(a) t-values in parentheses. Variables are defined as follows: P_L - PFE is the deviation from equilibrium for period A of each trading year, where P_L is the last trade of the period and PFE is the Perfect Foresight Equilibrium price; I is a dummy variable for the shortened Investment horizon according to Table 9.1 (I = 1 for a shortened horizon, and 0 otherwise); E is a dummy variable for the Endowment effect according to Table 9.1 (E = 1 when 2 securities are issued, and 0 otherwise); I*E is an interaction dummy variable representing both a shortened investment horizon and a two-security endowment; E_N is a dummy variable representing the sell side of the Endowment effect hypothesized to lead to Negative bubbles (E_N = 1 when 10 securities are issued, and 0 otherwise); I*E_N is an interaction dummy variable representing both a shortened investment horizon and a ten-security endowment.
(b) NOINT means the regression was run by suppressing the intercept. *, **, *** signify statistical significance at the .10, .05, and .01 levels, respectively.
markets (11 and 12) and the interaction of shortened horizon with a speculative market (12). In addition, a restricted endowment effect (E = 1) is imposed for experiments 11–14, since experiments 7–10 clearly established its necessity in creating bubbles. Model 4 results highlight the significant positive effect of the combined shortened horizon/restricted endowment effect (t = 6.98 for I). More importantly, the speculative group statistically differs from conservatives, with an additional mean price difference of 89.9 (t = 4.92). Model 5 supports the results of experiments 1–10 in that: (1) a shortened investment horizon with restricted endowments leads to price bubbles (t = 2.71 for I), (2) a heightened tournament incentive will heighten short-term horizons and lead to positive price effects (t = 2.83 for I×T), and (3) speculators contribute to positive price bubbles in restricted endowment environments (t = 3.17 for S).²⁶

The visual analysis of experiments 15–18 (negative bubble experiments) is confirmed by the regression results reported in Table 9.5. The variables are as defined earlier, except that E^N represents a dummy variable for the negative endowment effect: E^N = 1 when the initial endowment equals 10 securities and 1,000 francs, and 0 otherwise. In addition, since the shortened horizon variable I occurs for all years except 1A of each experiment, I and E are highly correlated. The design is therefore set to measure only the interaction effects of a shortened horizon and endowment. The four periods (1A of each experiment) are the control periods where a shortened horizon is not present (dummy NI = 1 for "not I").

The parameter estimates of Model 6 show significant results for both positive and negative bubbles. The joint presence of a shortened horizon induced by a tournament payoff along with a buy side endowment (2 securities, 10,000 francs), that is, I×E = 1, leads to an average increase of 400.6 francs in price levels. The single alteration of the endowment to the sell side (10 securities, 1,000 francs) in the presence of a tournament leads to an average decrease in price of 201.3 francs. The estimate for NI reflects the insignificant impact of the control periods, where the endowment effect is present but without the tournament payoff inducing a shortened horizon. So, as in the earlier results, the combined effect of the incentive (i.e., the tournament) and the ability (i.e., the endowment) work to create both positive and negative price bubbles.

²⁶ Furthermore, although insignificant, the p-value for I×S is equal to .14, suggesting that the speculative difference may be even greater under a shortened investment horizon.
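As an illustration of the estimation behind Tables 9.4 and 9.5, the following minimal MATLAB sketch runs a dummy-variable regression with a suppressed intercept (NOINT) on invented deviation data; the variable names mirror Table 9.5, but none of the numbers come from the experiments.

    % Hypothetical example: regress price deviations (P_L - P_FE) on
    % experiment dummies with the intercept suppressed (NOINT), as in Table 9.5.
    dev = [410; 390; -180; -220; 30; 55; -60; 10];   % invented P_L - P_FE values
    IE  = [1; 1; 0; 0; 0; 0; 0; 0];                  % shortened horizon x buy-side endowment
    IEN = [0; 0; 1; 1; 0; 0; 0; 0];                  % shortened horizon x sell-side endowment
    NI  = [0; 0; 0; 0; 1; 1; 1; 1];                  % control periods (no shortened horizon)
    X   = [IE IEN NI];                               % no column of ones => intercept suppressed
    b   = X \ dev;                                   % OLS coefficient estimates
    res = dev - X*b;
    s2  = (res'*res) / (size(X,1) - size(X,2));      % residual variance
    se  = sqrt(diag(s2 * inv(X'*X)));                % coefficient standard errors
    tstat = b ./ se                                  % t-values, as reported in parentheses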
9.4.7 Further Tests

To check the robustness of our results, we conducted eight final experiments in a unique and different setting: the former communist country of Albania. Of its many unique
characteristics, one of the most important is its history of being the most isolated (politically and economically) country in Europe since World War II. After democratic reforms began in 1991, a new business school was opened in Shkodra, the second largest city of Albania, where third-year students served as traders. Would students whose country had neither a securities market nor a history of free market trade show the same results as we had found at U.S. universities? While our previous experiments had the most experienced traders ever used in a study, these Albanian students may represent the least experienced traders examined to date, so these sessions may be regarded as an extreme test of the validity of our results. Because the trading experience was new to these students, a single period design was used in the first four experiments. For each experiment's ten trading years (no period B), asset payoffs consisted of a single dividend payment. The amounts used were the same as those of Table 9.2, so that equilibrium levels remained at 230 for each year. As shown in Table 9.1, Design 7 (experiment 19) consists of a single period security without a tournament effect. Design 8 (experiments 20 and 21) introduces the tournament payoff of Table 9.3 (Schedule Two) within the single period environment. This allows us to test for the presence of bubbles in the simpler pricing environment while also easing the Albanian students into two-period tournament pricing. The pricing results for these three experiments can be seen in Figs. 9.19–9.21. Without the tournament in experiment 19, pricing is rational and typical, showing a discount
Fig. 9.19 Experiment #19
(risk premium) of about 30 francs from the equilibrium level of 230. Near the end of experiment 20, the tournament effect appears to have created some price movement above equilibrium. This pressure continues into experiment 21, where prices trade at an average premium of 30 francs. Although these premiums do not constitute a bubble, it is clear they had a significant positive impact on pricing levels. Would this effect be eliminated (reversed) by changing the buy/sell pressure, as was done earlier under Design 6 where negative bubbles were induced? Design 9 tests this proposition by changing the endowment from 2 securities and 5,000 francs to 20 securities and 500 francs. The results, reported in Fig. 9.22, show that even in these simple markets the endowment effect combined with the tournament payoff leads to pricing away from equilibrium. These observations are confirmed by the regression results of Model 7 in Table 9.5, where buy side preference (I×E = 1) leads to a significant increase in prices while sell side preference (I×E^N = 1) leads to lower prices. The absence of a tournament payoff (NI = 1) leads to insignificant price effects, as the investment horizon cannot be altered in a single period market. The Albanian students had now participated in four single period experiments and were ready to attempt two-period pricing. Experiment 23 was a simple two-period pricing environment without any tournament payoff, as in Design 1 (control). The plot of prices in Fig. 9.23 shows that the students initially struggled with two-period pricing, since period A prices (two payoffs) differed little from period B prices (single payoff), although by the end of the experiment enough learning had developed.
Fig. 9.20 Experiment #20
Fig. 9.21 Experiment #21
Experiments 24 and 25 introduce the shortened investment horizon (tournament effect) within the two-period framework, as in Design 2 earlier. The price patterns in Figs. 9.24 and 9.25 show the creation of positive price bubbles to levels approaching 650 francs. Despite the
historical background of this country and these students, they responded to market pressures in the same bubble-like manner. The last experiment, 26, alters the endowment to the sell side, as before, to see if negative bubbles can also be obtained. Price paths in Fig. 9.26 show a general downward trend in
Fig. 9.22 Experiment #22
Fig. 9.23 Experiment #23
prices. The prices in period A show significant and growing discounts from the equilibrium level of 460. These observations are confirmed by the regression results reported in Panel C of Table 9.5, with the buy side endowment contributing 174.5 francs and the sell side endowment reducing price levels by 208.0 francs.
9.5 Conclusions

The results of this study have a number of implications for real world markets. Experiments 1–6 seem to imply that within an environment that restricts selling pressures,
Fig. 9.24 Experiment #24
Fig. 9.25 Experiment #25
a shortened investment horizon is sufficient to create asset bubbles. Applied to the real world, short-term performance pressure on traders, portfolio managers, and so forth could create pressures leading to price bubbles. Experiments 7–10 qualify the previous conclusion in that a shortened investment horizon creates bubble pressure only
when the market environment favors buyers over sellers. Unfortunately, most of our real world securities markets do have such a bias via restricted short sales, asymmetric leverage for longs versus shorts, restricted options and futures, and the like. Experiments 11–14 add to the puzzle by demonstrating the role of speculators within bubble formation. As a whole,
Fig. 9.26 Experiment #26
the study suggests that the necessary and sufficient conditions for the formation of asset bubbles are a shortened investment horizon, restricted selling activity relative to buyers, and the presence of speculators. We have also shown that repeated replication of these experiments under different settings still produces robust results. The first and third conditions are a matter of fact within U.S. securities markets, while restricted selling activity relative to buyers can take many forms. Either enhancing the buyer's position or restricting the seller's position is sufficient. Examples include increasing purchasing (speculative) ability through reduced stock margin levels, the introduction of highly leveraged stock index futures, and, in macroeconomic terms, a growing money supply or savings level. This latter variable may help explain the previously high levels of the Japanese equity market. The high level of Japanese savings creates very large endowments available for investment purchases. Given a limited supply of securities, our experimental markets show that these conditions will lead to a bubble. They also suggest that the bubble will burst when supply and demand come into closer balance. Recent changes in the Japanese institutional framework may, as predicted by this study, have led to the bursting of that bubble. The primary prescription put forth for regulatory authorities seeking to eliminate unnecessary market volatility resulting from asset bubbles is to create an institutional environment that does not restrict the transfer of information to the market, and to structure the rules so that both bulls and bears face equal costs in executing their trades.
Chapter 10
Portfolio Optimization Models and Mean–Variance Spanning Tests Wei-Peng Chen, Huimin Chung, Keng-Yu Ho, and Tsui-Ling Hsu
Abstract In this chapter we introduce the theory of modern portfolio theory and its application by computer program. The notion of diversification is age-old: "don't put all your eggs in one basket" obviously predates economic theory. However, a formal model showing how to make the most of the power of diversification was not devised until 1952, a feat for which Harry Markowitz eventually won the Nobel Prize in economics. Markowitz's portfolio theory shows that as you add assets to an investment portfolio, the total risk of that portfolio – as measured by the variance (or standard deviation) of total return – declines continuously, while the expected return of the portfolio remains a weighted average of the expected returns of the individual assets. In other words, by investing in portfolios rather than in individual assets, investors can lower the total risk of investing without sacrificing return. In the second part we introduce the mean–variance spanning test, which follows directly from the portfolio optimization problem.

Keywords Minimum variance portfolio · Optimal risky portfolio · Capital allocation line · Mean–variance spanning tests
W.-P. Chen (corresponding author), Department of Finance, Shih Hsin University, Taipei, Taiwan, e-mail: [email protected]
H. Chung and T.-L. Hsu, Graduate Institute of Finance, National Chiao Tung University, Hsinchu, Taiwan, e-mail: [email protected]; [email protected]
K.-Y. Ho, Department of Finance, National Taiwan University, Taipei, Taiwan, e-mail: [email protected]

10.1 Introduction of Markowitz Portfolio-Selection Model

Harry Markowitz (1952, 1959) developed his portfolio-selection technique, which came to be called modern portfolio theory (MPT). Prior to Markowitz's work, security-selection models focused primarily on the returns generated
by investment opportunities. Standard investment advice was to identify those securities that offered the best opportunities for gain with the least risk and then construct a portfolio from these. Following this advice, an investor might conclude that railroad stocks all offered good risk-reward characteristics and compile a portfolio entirely from these. The Markowitz theory retained the emphasis on return, but it elevated risk to a coequal level of importance, and the concept of portfolio risk was born. Whereas risk had previously been considered an important factor, and variance an accepted way of measuring risk, Markowitz was the first to clearly and rigorously show how the variance of a portfolio can be reduced through the impact of diversification. He proposed that investors focus on selecting portfolios based on their overall risk-reward characteristics instead of merely compiling portfolios from securities that each individually have attractive risk-reward characteristics. A Markowitz portfolio is one where no added diversification can lower the portfolio's risk for a given return expectation (alternatively, no additional expected return can be gained without increasing the risk of the portfolio). The Markowitz efficient frontier is the set of all portfolios that offer the maximum expected return for a given level of risk. The Markowitz model is based on several assumptions concerning the behavior of investors and financial markets:
1. A probability distribution of possible returns over some holding period can be estimated by investors.
2. Investors have single-period utility functions in which they maximize utility within the framework of diminishing marginal utility of wealth.
3. Variability about the possible values of return is used by investors to measure risk.
4. Investors care only about the means and variances of the returns of their portfolios over a particular period.
5. Expected return and risk as used by investors are measured by the first two moments of the probability distribution of returns: expected value and variance.
6. Return is desirable; risk is to be avoided.¹
7. Financial markets are frictionless.

¹ The Markowitz model assumes that investors are risk averse. This means that given two assets that offer the same expected return, investors will prefer the less risky one. Thus, an investor will take on increased risk only if compensated by higher expected returns. Conversely, an investor who wants higher returns must accept more risk. The exact tradeoff will differ by investor based on individual risk aversion characteristics. The implication is that a rational investor will not invest in a portfolio if a second portfolio exists with a more favorable risk-return profile (i.e., if for that level of risk an alternative portfolio exists that has better expected returns). Using risk tolerance, we can simply classify investors into three types: risk-neutral, risk-averse, and risk-loving. Risk-neutral investors do not require a risk premium for risky investments; they judge risky prospects solely by their expected rates of return. Risk-averse investors are willing to consider only risk-free or speculative prospects with a positive premium; they make investments according to the risk-return tradeoff. A risk-lover is willing to engage in fair games and gambles; this investor adjusts the expected return upward to take into account the "fun" of confronting the prospect's risk.

10.2 Measurement of Return and Risk

Throughout this chapter, investors are assumed to measure the level of return by computing the expected value of the distribution, using the probability distribution of expected returns for a portfolio. Risk is assumed to be measurable by the variability around the expected value of the probability distribution of returns. The most accepted measures of this variability are the variance and standard deviation.

10.2.1 Return

Given any set of risky assets and a set of weights that describe how the portfolio investment is split, the general formula of expected return for n assets is:

E(r_P) = Σ_{i=1}^{n} w_i E(r_i)  (10.1)

where Σ_{i=1}^{n} w_i = 1.0; n is the number of securities; w_i is the proportion of the funds invested in security i; r_i and r_P are the returns on the ith security and portfolio p, respectively; and E(·) denotes the expectation of the variable in the parentheses. The return computation is nothing more than finding the weighted average return of the securities included in the portfolio.

10.2.2 Risk

The variance of a single security is the expected value of the sum of the squared deviations from the mean, and the standard deviation is the square root of the variance. The variance of a portfolio combination of securities is equal to the weighted average covariance² of the returns on its individual securities:

Var(r_p) = σ_p² = Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j Cov(r_i, r_j)  (10.2)

Covariance can also be expressed in terms of the correlation coefficient as follows:

Cov(r_i, r_j) = ρ_ij σ_i σ_j  (10.3)

where ρ_ij is the correlation coefficient between the rates of return on security i (r_i) and the rates of return on security j (r_j), and σ_i and σ_j represent the standard deviations of r_i and r_j, respectively. Therefore:

Var(r_p) = Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j ρ_ij σ_i σ_j  (10.4)
Overall, the estimate of the mean return for each security is its average value in the sample period; the estimate of variance is the average value of the squared deviations around the sample average; the estimate of the covariance is the average value of the cross-product of deviations.
10.3 Efficient Portfolio

Efficient portfolios may contain any number of asset combinations. We examine efficient asset allocation using two risky assets as an example. After we understand the properties of portfolios formed by mixing two risky assets, it will be easy to see how a portfolio of many risky assets might best be constructed.
² High covariance indicates that an increase in one stock's return is likely to correspond to an increase in the other. A low covariance means the return rates are relatively independent, and a negative covariance means that an increase in one stock's return is likely to correspond to a decrease in the other.
10.3.1 Two-Risky-Assets Portfolio

Because we now envision forming a portfolio from two risky assets, we need to understand how the uncertainties of asset returns interact. It turns out that the key determinant of portfolio risk is the extent to which the returns on the two assets tend to vary in tandem or in opposition. The degree to which a two-risky-assets portfolio reduces the variance of returns depends on the degree of correlation between the returns of the securities. Suppose a proportion denoted by w_A is invested in asset A, and the remainder 1 − w_A, denoted by w_B, is invested in asset B. The expected rate of return on the portfolio is a weighted average of the expected returns on the component assets, with the same portfolio proportions as weights:

E(r_P) = w_A E(r_A) + w_B E(r_B)  (10.5)

The variance of the rate of return on the two-asset portfolio is

σ_P² = w_A² σ_A² + w_B² σ_B² + 2 w_A w_B ρ_AB σ_A σ_B  (10.6)

where ρ_AB is the correlation coefficient between the returns on asset A and asset B. If the correlation between the component assets is small or negative, this will reduce portfolio risk.

First, assume that ρ_AB = 1.0, which would mean that assets A and B are perfectly positively correlated. The right-hand side of Equation (10.6) is then a perfect square and simplifies to

σ_P² = w_A² σ_A² + w_B² σ_B² + 2 w_A w_B σ_A σ_B = (w_A σ_A + w_B σ_B)²

or σ_P = w_A σ_A + w_B σ_B. Therefore, the portfolio standard deviation is a weighted average of the component security standard deviations only in the special case of perfect positive correlation. In this circumstance, there are no gains to be had from diversification. Whatever the proportions of asset A and asset B, both the portfolio mean and the standard deviation are simple weighted averages. Figure 10.1 shows the opportunity set with perfect positive correlation – a straight line through the component assets. No portfolio can be discarded as inefficient in this case, and the choice among portfolios depends only on risk preference. Diversification in the case of perfect positive correlation is not effective; perfect positive correlation is the only case in which there is no benefit from diversification.

Fig. 10.1 Investment opportunity sets for asset A and asset B with various correlation coefficients (ρ = 1, 0, −1)

With any correlation coefficient less than 1.0 (ρ < 1), there will be a diversification effect: the portfolio standard deviation is less than the weighted average of the standard deviations of the component securities. Therefore, there are benefits to diversification whenever asset returns are less than perfectly correlated. Our analysis has ranged from very attractive diversification benefits (ρ_AB < 0) to no benefits at all (ρ_AB = 1.0). Negative correlation between a pair of assets is also possible, and where negative correlation is present, there will be even greater diversification benefits. Again, let us start with an extreme. With perfect negative correlation, we substitute ρ_AB = −1.0 in Equation (10.6) and simplify it in the same way as with perfect positive correlation. Here, too, we can complete the square, this time, however, with different results:

σ_P² = w_A² σ_A² + w_B² σ_B² − 2 w_A w_B σ_A σ_B = (w_A σ_A − w_B σ_B)²

and, therefore,

σ_P = |w_A σ_A − w_B σ_B|  (10.7)

With perfect negative correlation, the benefits from diversification stretch to the limit. Equation (10.7) points to the proportions that will reduce the portfolio standard deviation all the way to zero. An investor can reduce portfolio risk simply by holding instruments that are not perfectly correlated. In other words, investors can reduce their exposure to individual asset risk by holding a diversified portfolio of assets. Diversification will allow for the same portfolio return with reduced risk.
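To make the role of ρ_AB concrete, the following MATLAB sketch traces the two-asset opportunity set of Equations (10.5) and (10.6) for several correlation values; the expected returns and standard deviations below are assumed purely for illustration.

    % Two-asset opportunity sets for several correlations; all inputs assumed.
    ErA = 0.10; ErB = 0.14;              % assumed expected returns
    sA  = 0.15; sB  = 0.25;              % assumed standard deviations
    wA  = (0:0.01:1)';  wB = 1 - wA;     % weight grid
    Erp = wA*ErA + wB*ErB;               % Equation (10.5)
    hold on
    for rho = [-1 0 0.5 1]
        sp = sqrt(wA.^2*sA^2 + wB.^2*sB^2 + 2*rho*wA.*wB*sA*sB);  % Equation (10.6)
        plot(sp, Erp)                    % one opportunity-set curve per correlation
    end
    xlabel('Standard Deviation'); ylabel('Expected Return');

For ρ = 1 the curve is the straight line between the two assets; for ρ = −1 it touches the vertical axis at the zero-risk combination implied by Equation (10.7).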
10.3.2 The Concept of Markowitz Efficient Frontier

Every possible asset combination can be plotted in risk-return space, and the collection of all such possible portfolios defines a region in this space. The line along the upper edge of this region is known as the efficient frontier. Combinations
Fig. 10.2 Investment opportunity set for asset A and asset B
Fig. 10.3 The efficient frontier of risky assets and individual assets
along this line represent portfolios (explicitly excluding the risk-free alternative) for which there is the lowest risk for a given level of return. Conversely, for a given amount of risk, the portfolio lying on the efficient frontier represents the combination offering the best possible return. Mathematically, the efficient frontier is the set of mean–variance choices from the portfolio set where, for a given variance, no other portfolio offers a higher mean return. Figure 10.2 shows investors the entire investment opportunity set, which is the set of all attainable combinations of risk and return offered by portfolios formed by asset A and asset B in differing proportions. The curve passing through A and B shows the risk-return combinations of all the portfolios that can be formed by combining those two assets. Investors desire portfolios that lie to the northwest in Fig. 10.2. These are portfolios with high expected returns (toward the north of the figure) and low volatility (to the west). The area within curve BVAZ is the feasible opportunity set representing all possible portfolio combinations. Portfolios that lie below the minimum-variance portfolio (point V) on the figure can therefore be rejected out of hand as inefficient. The portfolios that lie on the frontier VA in Fig. 10.2 would not be likely candidates for investors to hold because they do not meet the criteria of maximizing expected return for a given level of risk or minimizing risk for a given level of return. This is easily seen by comparing the portfolios represented by points B and B′. Since investors always prefer more expected return than less for a given level of risk, B′ is always better than B. Using similar reasoning, investors would always prefer B to V because it has both a higher return and a lower level of risk. In fact, the portfolio at point V is identified as the minimum-variance portfolio because no other portfolio exists that has a lower standard deviation. The curve VA represents all possible efficient portfolios and is the efficient frontier³, which represents the set of portfolios that
offers the highest possible expected rate of return for each level of portfolio standard deviation. Any portfolio on the downward sloping portion of the frontier curve is dominated by the portfolio that lies directly above it on the upward sloping portion of the frontier curve since that portfolio has a higher expected return and equal standard deviation. The best choice among the portfolios on the upward sloping portion of the frontier curve is not as obvious because in this region higher expected return is accompanied by higher risk. The best choice will depend on the investor’s willingness to trade off risk against expected return (Fig. 10.3).
³ The efficient frontier will be convex – this is because the risk-return characteristics of a portfolio change in a non-linear fashion as its component weightings are changed. (As described above, portfolio risk is a function of the correlation of the component assets, and thus changes in a non-linear fashion as the weighting of component assets changes.) The efficient frontier is a parabola (hyperbola) when expected return is plotted against variance (standard deviation).

10.3.3 Short Selling

Various constraints may preclude a particular investor from choosing portfolios on the efficient frontier, however. Short sale restrictions are only one possible constraint. Short selling is a regulated type of market transaction that involves selling assets that are borrowed in expectation of a fall in the assets' price. When and if the price declines, the investor buys an equivalent number of assets at the new lower price and returns the borrowed ones to the lender. Now, relaxing the assumption of no short selling, investors could sell short the lowest-return asset B (here, we assume that E(r_A) ≥ E(r_B) and σ_A ≥ σ_B). If the number of short sales is unrestricted, then by continuously short selling B and reinvesting in A the investor could generate an infinite expected return. The efficient frontier of the unconstrained portfolio is shown in Fig. 10.4. The upper bound of the
Z’ Expected Return
A
without short sales with short sales
V P
0
Z
169
Table 10.1 The minimum variance portfolio weights of a two-asset portfolio without short selling

The correlation of the two assets | Weight of asset A | Weight of asset B
ρ = 1: w_A = σ_B / (σ_B − σ_A), w_B = −σ_A / (σ_B − σ_A)
ρ = −1: w_A = σ_B / (σ_A + σ_B), w_B = σ_A / (σ_A + σ_B)
ρ = 0: w_A = σ_B² / (σ_A² + σ_B²), w_B = σ_A² / (σ_A² + σ_B²)
Fig. 10.4 The efficient frontier of unrestricted/restricted portfolio
highest-return portfolio would no longer be A but infinity (shown by the arrow at the top of the efficient frontier). Likewise, the investor could short sell the highest-return security A and reinvest the proceeds in the lowest-yield security B,⁴ thereby generating a return less than the return on the lowest-return asset. Given no restriction on the amount of short selling, an infinitely negative return can be achieved, thereby removing the lower bound of B on the efficient frontier. Hence, short selling generally will increase the range of alternative investments from the minimum-variance portfolio to plus or minus infinity.⁵ Relaxing the assumption of no short selling thus involves a modification of the analysis of the constrained (short sales not allowed) efficient frontier. In the next section, we introduce the mathematical analysis of the efficient frontier with and without short selling constraints.
⁴ Rational investors will not short sell a high-return asset and buy a low-return asset; this case is given only as an extreme assumption.
⁵ Whether an investor engages in any of this short-selling activity depends on the investor's own unique set of indifference curves.

10.3.4 Calculating the Minimum Variance Portfolio

In the Markowitz portfolio model, we assume investors choose portfolios based on both expected return, E(r_p), and the standard deviation of return as a measure of risk, σ_p. Therefore, the portfolio selection problem can be expressed as maximizing the return with respect to the risk of the investment (or, alternatively, minimizing the risk with respect to a given return, holding the return constant and solving for the weighting factors that minimize the variance).

Mathematically, the portfolio selection problem can be formulated as a quadratic program. For two risky assets A and B, the portfolio consists of w_A and w_B, and the return of the portfolio is then r_A w_A + r_B w_B. The weights should be chosen so that the risk is minimized, that is,

Min_{w_A} σ_P² = w_A² σ_A² + w_B² σ_B² + 2 w_A w_B ρ_AB σ_A σ_B

for each chosen return and subject to w_A + w_B = 1, w_A ≥ 0, w_B ≥ 0. The last two constraints simply imply that the assets cannot be held in short positions. The minimum variance portfolio weights are shown in Table 10.1.⁶

Above, we simply used a two-risky-assets portfolio to calculate the minimum variance portfolio weights. If we generalize to portfolios containing N assets, the minimum-variance portfolio weights can be obtained by minimizing the Lagrange function C for the portfolio variance:

Min σ_p² = Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j ρ_ij σ_i σ_j

subject to w_1 + w_2 + ... + w_N = 1, with

C = Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j Cov(r_i, r_j) + λ_1 (1 − Σ_{i=1}^{n} w_i)  (10.8)
in which λ_1 is the Lagrange multiplier, ρ_ij is the correlation coefficient between r_i and r_j, and the other variables are as previously defined. By using this approach, the minimum variance can be computed for any given level of expected portfolio return (subject to the other constraint that the weights sum to one). In practice, it is best to use a computer because of the explosive increase in the number of calculations as the number of securities considered grows.

⁶ The solution uses the minimization techniques of calculus. Write out the expression for portfolio variance from Equation (10.6), substitute 1 − w_A for w_B, differentiate the result with respect to w_A, set the derivative equal to zero, and solve for w_A to obtain

w_A = (σ_B² − ρ_AB σ_A σ_B) / (σ_A² + σ_B² − 2 ρ_AB σ_A σ_B)

and w_B = 1 − w_A.
The efficient set that is generated by the aforementioned approach (Equation (10.8)) is sometimes called the minimum-variance set because of the minimizing nature of the Lagrangian solution.
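As a concrete illustration, when short sales are permitted and only the full-investment constraint binds, the first-order conditions of Equation (10.8) give the familiar closed form w = V⁻¹1 / (1′V⁻¹1) for the global minimum-variance weights. A minimal MATLAB sketch follows; the covariance matrix is assumed for illustration.

    % Global minimum-variance (GMV) weights from the Lagrangian first-order
    % conditions, short sales allowed; the covariance matrix is assumed.
    V = [0.0225 0.0090 0.0060;
         0.0090 0.0400 0.0120;
         0.0060 0.0120 0.0625];
    one = ones(3,1);
    w_gmv   = (V\one) / (one'*(V\one));   % w = V^{-1}1 / (1'V^{-1}1)
    var_gmv = w_gmv' * V * w_gmv;         % variance of the GMV portfolio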
10.3.5 Calculating the Weights of Optimal Risky Portfolio

One of the goals of portfolio analysis is minimizing the risk or variance of the portfolio. The previous section introduced the calculation of the minimum variance portfolio, in which we minimized the variance of the portfolio subject to the portfolio weights summing to one. If we add to Equation (10.8) the condition that the portfolio attain some target expected rate of return, we obtain the optimal risky portfolio:

Min σ_p² = Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j ρ_ij σ_i σ_j

subject to Σ_{i=1}^{n} w_i E(r_i) = E*, where E* is the target expected return, and Σ_{i=1}^{n} w_i = 1.0. The first constraint simply says that the expected return on the portfolio should equal the target return determined by the portfolio manager. The second constraint says that the weights of the securities invested in the portfolio must sum to one. The Lagrangian objective function can be written:

C = Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j Cov(r_i, r_j) + λ_1 (E* − Σ_{i=1}^{n} w_i E(r_i)) + λ_2 (1 − Σ_{i=1}^{n} w_i)  (10.9)

Taking the partial derivatives of this equation with respect to each of the variables w_1, w_2, ..., w_N, λ_1, and λ_2 and setting the resulting equations equal to zero yields the minimization of risk subject to the Lagrangian constraints. Then, we can solve for the weights, which represent an optimal risky portfolio, by using matrix algebra.

If there is no short selling constraint in the portfolio analysis, the second constraint, Σ_{i=1}^{n} w_i = 1.0, should be replaced by Σ_{i=1}^{n} |w_i| = 1.0, where the absolute value of the weights |w_i| allows a given w_i to be negative (sold short) but maintains the requirement that all funds are invested, that is, their sum equals one. The Lagrangian function is then

C = Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j Cov(r_i, r_j) + λ_1 (E* − Σ_{i=1}^{n} w_i E(r_i)) + λ_2 (1 − Σ_{i=1}^{n} |w_i|)  (10.10)

If the restriction of no short selling is imposed on the variance minimization problem, a third constraint needs to be added:

w_i ≥ 0, i = 1, ..., N.

The addition of this non-negativity constraint precludes negative values for the weights (that is, no short selling). The problem is then a quadratic programming problem similar to the ones solved so far, except that the optimal portfolio may fall in an unfeasible region. In this circumstance, the next best optimal portfolio that meets all of the constraints is selected.

10.3.6 Finding the Efficient Frontier of Risky Assets

According to two-fund separation, the efficient frontier of risky assets can be formed by any two risky portfolios on the frontier: all portfolios on the mean–variance efficient frontier can be formed as a weighted average of any two portfolios or funds on the efficient frontier. So, if we have any two points of the portfolio combinations, we can draw the entire efficient frontier of the risky assets. In the previous sections, we introduced the minimum variance portfolio and the optimal risky portfolio for a given expected return; we can then generate the entire efficient frontier by the separation property. Deriving the efficient frontier may be quite difficult conceptually, but computing and graphing it with any number of assets and any set of constraints is quite straightforward. Later, we will use Excel and MATLAB to generate the efficient frontier.
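A minimal MATLAB sketch of this frontier tracing, using the closed-form solution of the two-constraint Lagrangian (short sales allowed); the three-asset expected returns and covariance matrix below are invented for illustration.

    % Tracing the mean-variance frontier from the two-constraint Lagrangian
    % solution (short sales allowed); all inputs are assumed for illustration.
    mu  = [0.08; 0.12; 0.15];                 % assumed expected returns
    V   = [0.0225 0.0090 0.0060;
           0.0090 0.0400 0.0120;
           0.0060 0.0120 0.0625];             % assumed covariance matrix
    one = ones(3,1);
    a = mu'*(V\mu);  b = mu'*(V\one);  c = one'*(V\one);
    d = a*c - b^2;                            % frontier constants
    targets = linspace(min(mu), max(mu), 25); % target expected returns E*
    sig = zeros(size(targets));
    for k = 1:numel(targets)
        lam = (c*targets(k) - b)/d;           % Lagrange multipliers of the
        gam = (a - b*targets(k))/d;           % two-constraint problem
        w   = lam*(V\mu) + gam*(V\one);       % frontier weights for E*
        sig(k) = sqrt(w'*V*w);
    end
    plot(sig, targets)                        % the mean-variance frontier
    xlabel('Standard Deviation'); ylabel('Expected Return');

Any two distinct frontier portfolios generated this way could equally be combined linearly to trace the same frontier, which is the two-fund separation property just described.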
10.3.7 Finding the Optimal Risky Portfolio

We already have an efficient frontier; however, how do we decide the best allocation of the portfolio? One of the factors to consider when selecting the optimal portfolio for a particular investor is the degree of risk aversion, that is, the investor's willingness to trade off risk against expected return. This
level of aversion to risk can be characterized by defining the investor's indifference curve, consisting of the family of risk/return pairs defining the tradeoff between expected return and risk. It establishes the increment in return that a particular investor will require in order to make an increment in risk worthwhile. The optimal portfolio along the efficient frontier is not unique with this model and depends on the risk/return tradeoff utility function of each investor. We use the utility function that is commonly employed by financial theorists and the AIMR (Association for Investment Management and Research), where a portfolio with a given expected return E(r_p) and standard deviation σ_p is assigned the following utility value:

U = E(r_p) − 0.005 A σ_p²  (10.11)
where U is the utility value and A is an index of the investor's risk aversion. The factor of 0.005 is a scaling convention that allows us to express the expected return and standard deviation in Equation (10.11) as percentages rather than decimals. We interpret this expression to say that the utility from a portfolio increases as the expected rate of return increases, and it decreases when the variance increases. The relative magnitude of these changes is governed by the coefficient of risk aversion, A. For risk-neutral investors, A = 0. Higher levels of risk aversion are reflected in larger values for A. Portfolio selection, then, is determined by plotting investors' utility functions together with the efficient-frontier set of available investment opportunities. In Fig. 10.5, two sets of indifference curves labeled U1 and U2 are shown together with the efficient frontier. The U1 curve has a higher slope, indicating a greater level of risk aversion. The investor is indifferent to any combination of r_p and σ_p along a given curve. The U2 curve would be appropriate for a less risk-averse investor – that is, one who would be willing to accept relatively higher risk to obtain higher levels of return. The optimal portfolio would be the one that provides the highest utility – a point in the northwest direction (higher return and
lower risk). This point will be at the tangent of a utility curve and the efficient frontier. Each investor logically selects the optimal portfolio given his or her risk-return preference, and neither is more correct than the other. In order to simplify the determination of the optimal risky portfolio, we use the capital allocation line (CAL), which depicts all feasible risk-return combinations available from different asset allocation choices, to determine the optimal risky portfolio. To start, however, we will demonstrate the solution of the portfolio construction problem with only two risky assets (in our example, asset A and asset B) and a risk-free asset. In this case, we can derive an explicit formula for the weights of each asset in the optimal portfolio. This will make it easy to illustrate some of the general issues pertaining to portfolio optimization. The objective is to find the weights w_A and w_B that result in the highest slope of the CAL (i.e., the weights that result in the risky portfolio with the highest reward-to-variability ratio). Therefore, the objective is to maximize the slope of the CAL for any possible portfolio P. The objective function is the slope, which we call CAL_S, where S denotes the entire portfolio including risky and risk-free assets:

CAL_S = (E(r_p) − r_f) / σ_p  (10.12)
For the portfolio with two risky assets, the expected return and standard deviation of portfolio S are

E(r_P) = w_A E(r_A) + w_B E(r_B)  (10.5)

σ_P = [w_A² σ_A² + w_B² σ_B² + 2 w_A w_B ρ_AB σ_A σ_B]^{1/2}  (10.13)
When we maximize the objective function CAL_S, we have to satisfy the constraint that the portfolio weights sum to 1. Therefore, we solve the mathematical problem formally written as

Max_{w_i} CAL_S = (E(r_p) − r_f) / σ_p

subject to Σ w_i = 1. In this case of two risky assets, the solution for the weights of the optimal risky portfolio, S, can be shown to be as follows:⁷
w_A = ([E(r_A) − r_f] σ_B² − [E(r_B) − r_f] ρ_AB σ_A σ_B) / ([E(r_A) − r_f] σ_B² + [E(r_B) − r_f] σ_A² − [E(r_A) − r_f + E(r_B) − r_f] ρ_AB σ_A σ_B)

w_B = 1 − w_A
Fig. 10.5 Indifference Curves and Efficient frontier
⁷ The solution procedure for two risky assets is as follows. Substitute for expected return from Equation (10.5) and for standard deviation from Equation (10.13). Substitute 1 − w_A for w_B. Differentiate the resulting expression for CAL_S with respect to w_A, set the derivative equal to zero, and solve for w_A.
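A minimal MATLAB sketch of this two-asset tangency solution, with every input assumed for illustration; it evaluates the w_A formula above and the resulting CAL slope of Equation (10.12).

    % Two-risky-asset tangency portfolio; all inputs assumed for illustration.
    ErA = 0.10; ErB = 0.14; rf = 0.04;     % assumed returns and risk-free rate
    sA  = 0.15; sB  = 0.25; rho = 0.30;    % assumed risks and correlation
    covAB = rho*sA*sB;
    exA = ErA - rf;  exB = ErB - rf;       % risk premiums
    wA = (exA*sB^2 - exB*covAB) / ...
         (exA*sB^2 + exB*sA^2 - (exA + exB)*covAB);
    wB = 1 - wA;
    Erp = wA*ErA + wB*ErB;                              % Equation (10.5)
    sp  = sqrt(wA^2*sA^2 + wB^2*sB^2 + 2*wA*wB*covAB);  % Equation (10.13)
    slope = (Erp - rf)/sp                               % maximized CAL slope, Eq. (10.12)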
Then we form an optimal complete portfolio,⁸ given the optimal risky portfolio and the CAL generated by the combination of portfolio S and the risk-free asset. Having constructed the optimal risky portfolio S, we can use the individual investor's degree of risk aversion, A, to calculate the optimal proportion of the complete portfolio to invest in the risky component. Assume a risk-free rate r_f and a risky portfolio with expected return E(r_p) and standard deviation σ_p. For any choice of y, the fraction allocated to the risky portfolio, the expected return of the complete portfolio is
Fig. 10.6 Determination of the optimal portfolio
E(r_C) = r_f + y[E(r_p) − r_f]

The variance of the overall portfolio is

σ_C² = y² σ_p²

The investor attempts to maximize utility, U, by choosing the best allocation to the risky asset, y. To solve the utility maximization problem more generally, we write the problem as follows:

Max_y U = E(r_C) − 0.005 A σ_C² = r_f + y[E(r_p) − r_f] − 0.005 A y² σ_p²

Setting the derivative of this expression to zero, we can solve for y to yield the optimal position for risk-averse investors in the risky asset, y*, as follows:

y* = (E(r_p) − r_f) / (0.01 A σ_p²)  (10.14)
The solution shows that the optimal position in the risky asset is, as one would expect, inversely proportional to the level of risk aversion and the level of risk (measured by the variance) and directly proportional to the risk premium offered by the risky asset. Once we have reached this point, generalizing to the case of many risky assets is straightforward. Before we move on, let us briefly summarize the steps we followed to arrive at the complete portfolio (Fig. 10.6).

1. Specify the return characteristics of all securities (i.e., expected return, variance, and covariance).
2. Establish the risky portfolio:
(a) Calculate the optimal risky portfolio S.
(b) Calculate the properties of portfolio S using the weights determined in step (a) and Equations (10.5) and (10.13).
3. Allocate funds between the risky portfolio and the risk-free asset:
(a) Calculate the fraction of the complete portfolio allocated to portfolio S (the risky portfolio) and to the risk-free asset (Equation (10.14)).
(b) Calculate the share of the complete portfolio invested in each asset and in the risk-free asset.

For further information about modern portfolio theory, please refer to Lee et al. (1990) and Bodie et al. (2007, 2008).

⁸ The complete portfolio means the entire portfolio, including risky and risk-free assets.
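A small numerical sketch of Equation (10.14) in MATLAB; the risk-aversion index and the portfolio moments below are assumed values, expressed in percent to match the 0.01 scaling.

    % Optimal fraction in the risky portfolio, Equation (10.14);
    % inputs are assumed and expressed in percent to match the 0.01 scaling.
    Erp = 12;  sp = 20;  rf = 4;           % assumed percent values
    A   = 4;                               % assumed risk-aversion index
    y_star = (Erp - rf) / (0.01*A*sp^2);   % here y* = 8/16 = 0.5
    ErC = rf + y_star*(Erp - rf);          % complete-portfolio expected return
    sC  = y_star*sp;                       % complete-portfolio standard deviation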
10.4 Mean–Variance Spanning Test

10.4.1 Mean–Variance Spanning and Intersection Tests

Huberman and Kandel (1987) first introduced a mean–variance spanning test. The method statistically tests whether adding a set of new assets can improve the investment opportunity set relative to a set of basis assets. In other words, mean–variance spanning tests enable us to analyze the effect on the mean–variance frontier of adding new assets to a set of benchmark assets. For ease of illustration, we define the union of the new assets and the benchmark assets as the augmented assets. If the mean–variance frontier of the benchmark portfolios and that of the augmented portfolios coincide, then there is spanning. In this case, investors do not benefit from adding new portfolios to their current portfolios. Whether an observed shift is statistically significant can be tested using regression-based mean–variance spanning tests. We briefly describe the approaches that we use to examine whether adding an IPO portfolio could significantly improve the investment opportunity set relative to a set of benchmark assets. For convenience, we follow the notations and treatment in Kan and Zhou (2008).
We denote by K the number of benchmark portfolios with returns R_{1t} and by N the number of test assets with returns R_{2t}. Letting R_t be the returns on the N + K assets, we denote the expected returns by μ = E[R_t] = (μ_1′, μ_2′)′ and the variance-covariance matrix by V = Var[R_t] = [V_{11}, V_{12}; V_{21}, V_{22}], where V is non-singular. We then estimate the following model using ordinary least squares:

R_{2t} = α + β R_{1t} + ε_t,  t = 1, 2, ..., T.  (10.15)

Equation (10.15) can be rewritten as R = XB + E in matrix notation, with the estimators of B and Σ being B̂ = (X′X)⁻¹(X′R) and Σ̂ = (1/T)(R − XB̂)′(R − XB̂). Under the normality assumption, we have ε_t ~ N(0, Σ) and vec(B̂′) ~ N(vec(B′), (X′X)⁻¹ ⊗ Σ). Following Huberman and Kandel (1987), the null hypothesis of "spanning" is:

H_0^S: α = 0_N, δ = 1_N − β1_K = 0_N,  (10.16)

where 0_N is a zero vector with N elements. If we fail to reject the null hypothesis, the benchmark assets span the mean–variance frontier of the benchmark assets plus an IPO portfolio. In other words, failing to reject the null hypothesis implies that investors are not able to enlarge their investment opportunity set by adding an IPO portfolio. On the other hand, if the null hypothesis is rejected, adding an IPO portfolio does improve the investment opportunity set.

We can rewrite this null hypothesis as Θ = [α, δ]′ = AB + C, where A = [1, 0_K′; 0, −1_K′] and C = [0_N′; 1_N′]. The distribution under the null hypothesis is vec(Θ̂′) ~ N(vec(Θ′), A(X′X)⁻¹A′ ⊗ Σ). By defining

Ĝ = T A(X′X)⁻¹A′ = [1 + μ̂_1′V̂_{11}⁻¹μ̂_1, μ̂_1′V̂_{11}⁻¹1_K; μ̂_1′V̂_{11}⁻¹1_K, 1_K′V̂_{11}⁻¹1_K]

and

Ĥ = Θ̂ Σ̂⁻¹ Θ̂′ = [α̂′Σ̂⁻¹α̂, α̂′Σ̂⁻¹δ̂; δ̂′Σ̂⁻¹α̂, δ̂′Σ̂⁻¹δ̂],

we then denote by λ_1 and λ_2 the two eigenvalues of the matrix ĤĜ⁻¹. The asymptotic Wald, likelihood ratio, and Lagrange multiplier test statistics for the null hypothesis are:

W = T(λ_1 + λ_2) ~ χ²_{2N},  (10.17)

LR = T Σ_{i=1}^{2} ln(1 + λ_i) ~ χ²_{2N},  (10.18)

LM = T Σ_{i=1}^{2} λ_i/(1 + λ_i) ~ χ²_{2N}.  (10.19)

In addition, the exact finite-sample distribution of the likelihood ratio test under the null, as Huberman and Kandel (1987) and Jobson and Korkie (1989) also show, is:

(1/U^{1/2} − 1)((T − K − N)/N) ~ F_{2N, 2(T−K−N)} for N ≥ 2,  (10.20)

(1/U − 1)((T − K − 1)/2) ~ F_{2, T−K−1} for N = 1,  (10.21)

where U = |Ĝ| / |Ĥ + Ĝ|.

In general, the test for mean–variance spanning can be divided into two parts: (1) the spanning of the global minimum-variance (GMV) portfolio and (2) the spanning of the tangency portfolio. Therefore, we can rewrite all three asymptotic test statistics based on this geometric feature. For example, the Wald test can be rewritten as

W = T(σ̂²_{R_1}/σ̂²_R − 1) + T[(1 + θ̂_R(μ̂_{1,GMV})²)/(1 + θ̂_{R_1}(μ̂_{1,GMV})²) − 1],  (10.22)

where σ̂²_{R_1} and σ̂²_R are the global minimum variances of the benchmark assets and of the benchmark assets plus the new assets, respectively; θ̂_{R_1}(μ̂_{1,GMV}) is the slope of the asymptote of the mean–variance frontier for the benchmark assets, and θ̂_R(μ̂_{1,GMV}) is the slope of the tangency line of the mean–variance frontier for the benchmark portfolios plus the new portfolios. The first term measures the change of the GMV portfolios due to the addition of a set of new assets. The second term measures whether there is an improvement of the squared tangency slope due to adding a set of new portfolios to the set of benchmark portfolios.

Another relevant idea about testing the movement or change in the tangency portfolio is the mean–variance intersection test proposed in Huberman and Kandel (1987). If the mean–variance frontier of the benchmark assets and the mean–variance frontier of the augmented assets have only one point in common, this is known as intersection. Using Equation (10.15), the null hypothesis for intersection is:

H_0^I: α − (1_N − β1_K)η = 0,  (10.23)

where η is the risk-free rate. Following DeRoon and Nijman (2001), the test statistic for testing the intersection hypothesis can be rewritten in terms of the maximal Sharpe ratios as

W_I = T[(1 + θ̂_R(η)²)/(1 + θ̂_{R_1}(η)²) − 1] = T[(θ̂_R(η)² − θ̂_{R_1}(η)²)/(1 + θ̂_{R_1}(η)²)],  (10.24)
where θ̂_{R_1}(η) is the maximal Sharpe ratio attainable for the benchmark assets, and θ̂_R(η) is the maximal Sharpe ratio attainable for the augmented assets. We note that the Wald test statistic for intersection is chi-square distributed with N degrees of freedom. The test statistic specified in Equation (10.24) indicates that the numerator is related to the difference in the squared maximal Sharpe ratios attainable for the benchmark assets and the augmented assets. Rejection of the null hypothesis of the intersection test implies that the mean–variance frontier of the augmented assets does not have any point in common with the mean–variance frontier of the benchmark assets at the reference point of the risk-free rate. Thus, the maximal Sharpe ratios differ between the augmented assets and the benchmark assets.
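A rough MATLAB sketch of the spanning test for N = 1 test asset, using simulated returns (all data-generating numbers are invented); it estimates Equation (10.15) by OLS and forms the asymptotic Wald statistic of Equation (10.17) from the eigenvalues of ĤĜ⁻¹.

    % Spanning test sketch for K benchmark assets and N = 1 test asset (simulated data).
    T = 240; K = 2;
    R1 = 0.01 + 0.05*randn(T, K);                      % simulated benchmark returns
    R2 = 0.5*R1(:,1) + 0.5*R1(:,2) + 0.02*randn(T,1);  % simulated test asset
    X  = [ones(T,1) R1];
    B  = X \ R2;                               % OLS of Equation (10.15): [alpha; beta]
    alpha = B(1); beta = B(2:end);
    delta = 1 - sum(beta);                     % delta = 1_N - beta*1_K
    Sig   = (R2 - X*B)'*(R2 - X*B)/T;          % residual variance (Sigma-hat)
    mu1 = mean(R1)'; V11 = cov(R1, 1);         % moments of benchmark returns
    oneK = ones(K,1);
    a1 = mu1'*(V11\mu1); b1 = mu1'*(V11\oneK); c1 = oneK'*(V11\oneK);
    G = [1 + a1, b1; b1, c1];                  % G-hat
    Theta = [alpha; delta];
    H = (Theta*Theta')/Sig;                    % H-hat for N = 1
    W = T*sum(eig(H/G))                        % asymptotic Wald ~ chi-square(2N)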
10.4.2 Step-Down Tests for Mean–Variance Spanning

Kan and Zhou (2008) report that the asymptotic tests have very good power in testing assets that can reduce the variance of the global minimum-variance (GMV) portfolio, but have little power against test assets that can only improve the tangency portfolio. They therefore suggest a step-down procedure that requires us to first test α = 0_N and then to test δ = 0_N conditional on α = 0_N. If the rejection is due to the first test, we know that it is because the two tangency portfolios are very different. If the rejection is due to the second test, it is because the two GMV portfolios are very different.

The first test (α = 0_N) is straightforward. We simply ignore the null hypothesis of δ = 0_N in Equation (10.16) and apply the same asymptotic Wald test procedure. Note that ĤĜ⁻¹ then degenerates to a scalar, and its eigenvalue, denoted λ_3, is the scalar itself. The second test (δ = 0_N conditional on α = 0_N) is a test of δ = 0_N based on estimating Equation (10.15) without an intercept. We follow the same asymptotic Wald test procedure detailed above. The matrix ĤĜ⁻¹ in the second test is also a scalar; thus its eigenvalue, denoted λ_4, is again the scalar itself. The step-down asymptotic Wald tests can be written as:

W_1 = Tλ_3 ~ χ²_N,  (10.25)

W_2 = Tλ_4 ~ χ²_N.  (10.26)

Equations (10.25) and (10.26) can also be rewritten using notation similar to that of the finite-sample step-down F tests shown in Kan and Zhou (2008):

W_1 = T (â − â_1) / (1 + â_1),  (10.27)

W_2 = T [(ĉ + d̂)(1 + â_1) / ((1 + â)(ĉ_1 + d̂_1)) − 1],  (10.28)

where â_1 = μ̂_1′V̂_{11}⁻¹μ̂_1, b̂_1 = μ̂_1′V̂_{11}⁻¹1_K, ĉ_1 = 1_K′V̂_{11}⁻¹1_K, and d̂_1 = â_1ĉ_1 − b̂_1². Here â, b̂, ĉ, and d̂ are the analogues of â_1, b̂_1, ĉ_1, and d̂_1, based on the benchmark assets plus the new assets.

10.4.3 Mean–Variance Spanning Tests Under Non-normality and Heteroskedasticity

The tests described so far assume that the returns are normally distributed and the error term in Equation (10.15) is homoskedastic. We can use a GMM Wald test to adjust for return non-normality and heteroskedasticity. The GMM Wald test is:

W_a = T vec(Θ̂′)′ [(A_T ⊗ I_N) S_T (A_T′ ⊗ I_N)]⁻¹ vec(Θ̂′) ~ χ²_{2N},  (10.29)

where the moment condition is E[g_t] = E(X_t ⊗ ε_t) = 0_{N(1+K)}, S_T is a consistent estimate of E[g_t g_t′], and

A_T = [1 + â_1, −μ̂_1′V̂_{11}⁻¹; b̂_1, −1_K′V̂_{11}⁻¹].

We can also conduct step-down GMM Wald tests to disentangle the two sources of the spanning test. The test for α = 0_N (denoted W_{a1}) is straightforward: we simply ignore the null hypothesis of δ = 0_N in Equation (10.16) and use Equation (10.29). However, we need to change A_T to [1 + â_1, −μ̂_1′V̂_{11}⁻¹] due to the change in the null hypothesis. Testing δ = 0_N conditional on α = 0_N (denoted W_{a2}) is actually a test of δ = 0_N using Equation (10.15) without an intercept. We can again use Equation (10.29) by changing A_T to 1_K′(V̂_{11} + μ̂_1μ̂_1′)⁻¹. Furthermore, the new S_T is different from the S_T of the joint test, since X for the second step-down test has no intercept term. Finally, we note that both step-down GMM Wald test statistics are distributed as chi-square with N degrees of freedom.

The details on how the change in the matrix A_T is made are as follows. Based on Kan and Zhou (2008), A_T is the consistent estimate of A E[X_t X_t′]⁻¹. For the first step-down test (testing α = 0_N), A has to be changed to [1, 0_K′]. Therefore,

A_T = A E[X_t X_t′]⁻¹ = [1 + â_1, −μ̂_1′V̂_{11}⁻¹].  (10.30)

For the second step-down test (testing δ = 0_N conditional on α = 0_N), A has to be changed to 1_K′. Thus,

A_T = A E[X_t X_t′]⁻¹ = 1_K′(V̂_{11} + μ̂_1μ̂_1′)⁻¹,  (10.31)

where the new X is different from the X in Equation (10.30), since there is no intercept term in the regression equation estimated.
10.5 Alternative Computer Program to Calculate Efficient Frontier Several software packages can be used to generate the efficient frontier. In this section, we will demonstrate the method using Microsoft Excel and MATLAB.
10.5.1 Application: Microsoft Excel Excel is far from the best program for generating the efficient frontier and is limited in the number of assets it can handle, but working through a simple portfolio optimizer in Excel can illustrate concretely the nature of the calculations used in more sophisticated “black-box” programs. You will find that in Excel, the computation of the efficient frontier is fairly easy. Assume an American investor who forms a six-stockindex portfolio. The portfolio consists of six stock indexes: United States (S&P 500), United Kingdom (FTSE 100), Switzerland (Swiss Market Index, SMI), Singapore (Straits Times Index, STI), HongKong (Hang Seng Index, HSI), and Korea (Korea Composite Stock Price Index, KOSPI), with monthly price data from Jan. 1990 to Dec. 2006. He/she wants to know his/her optimal portfolio allocation. The Markowitz portfolio selection problem can be divided into three parts. First, we need to calculate the efficient frontier. Second, we need to choose the optimal risky portfolio given one’s capital allocation line (find the point at the tangent of a CAL and the efficient frontier). Finally, using the optimal complete portfolio allocate funds between the risky portfolio and the risk-free asset.
175
age returns, standard deviations, and the correlation matrix10 for the rates of return on the stock index. After we input Table 10.2A into our spreadsheet as shown, we create the covariance matrix in Table 10.2B using the relationship Cov.ri ; rj / D ij i j . Before computing the efficient frontier, we need to prepare the data to establish a benchmark against which to evaluate our efficient portfolios, so we can form a bordermultiplied covariance matrix. We use the target mean of 15% for example. To compute the target portfolio’s mean and variance, these weights are entered in the border column A47:B52 and border row B46:G46. We calculate the variance of this portfolio in cell B54 in Table 10.2D. The entry in B54 equals the sum of all elements in the border-multiplied covariance matrix where each element is first multiplied by the portfolio weights given in both the row and column borders. We also include two cells to compute the standard deviation and expected return of the target portfolio (formulas in cells B55 and B56).11 To compute points along the efficient frontier we use the Excel Solver in Table 10.2D (which you can find in the Tools menu).12 Once you bring up Solver, you are asked to enter the cell of the target (objective) function. In our application, the target is the variance of the portfolio, given in cell B54. Solver will minimize this target. You must then input the cell range of the decision variables (in this case, the portfolio weights, contained in cells A47:A52). Finally, you enter all necessary constraints into the Solver. For an unrestricted efficient frontier that allows short sales, there are two constraints: first, that the sum of the weights is 1.0 (cell A53 D 1), and second, that the portfolio’s expected return equals the target return 15% (cell B56 D 15). Once you have entered the two constraints you ask the Solver to find the optimal portfolio weights. The Solver beeps when it has found a solution and automatically alters the portfolio weight cells in row 46 and column A to show the makeup of the efficient portfolio.
10
10.5.1.1 Step One: Finding Efficient Frontier First, we need to calculate expected return, standard deviation, and covariance matrix. The expected return and standard deviation can be calculated by applying the Excel STDEV and AVERAGE functions to the historic monthly percentage returns data.9 Table 10.2A and B show aver-
9
The expected return and standard deviation of each index needs to be annualized.
The correlation matrix is calculated in Excel using the data analysis function that is found under the Tool Menu. Note that if Data Analysis does not appear on the Tool Menu you will need to select Add-in and add to the Menu. 11 Because the expected returns and the portfolio weights are represented by column vectors (denoted e and w, respectively, with row vector transposes e T and wT ), and the variance-covariance terms by matrix V , then the expressions can be written as simple matrix formulas. So, the calculation of expected return and variance of portfolio can use the Excel array functions. Matrix notation Excel formula: D SUMPRODUCT.w; e/ Portfolio return: wT e Portfolio variance: wT Vw D MMULT.TRANSPOSE.w/; MMULT.V; w// 12 If Solver does not show up under the Tools menu, you should select Add-Ins and then select Analysis. This should add Solver to the list of options in the Tools menu.
176
W.-P. Chen et al.
Table 10.2 Performance of six stock indexes A
B
C
D
E
F
G
1 A. Annualized S tandard Deviation , Average Return of S &P 500, FTS E 100, S MI, S TI, HIS , KOS PI 1990-2006 2 3 4 5 6 7 8 9
Asset Data S&P 500 FTSE100 SMI STI HSI KOSPI
Avreage Ret. 9.57 6.75 11.48 8.76 15.98 7.91
Std. Dev. 13.90 14.01 17.07 24.09 26.25 32.20
(%)
FTSE100 0.7431 1 0.7434 0.5570 0.5559 0.4302
SMI 0.6643 0.7434 1 0.4832 0.4666 0.3301
STI 0.5692 0.5570 0.4832 1 0.7518 0.4236
HSI 0.5561 0.5559 0.4666 0.7518 1 0.3731
FTSE100 144.8102 196.4143 177.8828 188.0728 204.5563 194.1278
SMI 157.7058 177.8828 291.5349 198.7550 209.1882 181.5081
STI 190.6630 188.0728 198.7550 580.4614 475.5806 328.6319
HSI 203.0139 204.5563 209.1882 475.5806 689.3243 315.4615
10 B. Correlation Matrix 11 12 13 14 15 16 17 18
S&P 500 1 0.7431 0.6643 0.5692 0.5561 0.3747
S&P 500 FTSE100 SMI STI HSI KOSPI
KOSPI 0.3747 0.4302 0.3301 0.4236 0.3731 1
19 C. Covariance Matrix 20 21 22 23 24 25 26 27
S&P 500 193.3182 144.8102 157.7058 190.6630 203.0139 167.7584
S&P 500 FTSE100 SMI STI HSI KOSPI
KOSPI 167.7584 194.1278 181.5081 328.6319 315.4615 1036.9482
28 D. Boardered Covariance Matrix for Efficient Frontier with mean of 15% 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
(allow short selling up to 100% and limit the investment to 150%) Cell Formulas S&P 500 FTSE100 SMI STI Weights A32 A33 A34 A35 0.78 A33*B32*B21 A33*C32*C21 A33*D32*D21 A33*E32*E21 -0.54 A34*B32*B22 A34*C32*C22 A34*D32*D22 A34*E32*E22 0.60 A35*B32*B23 A35*C32*C23 A35*D32*D23 A35*E32*E23 -0.34 A36*B32*B24 A36*C32*C24 A36*D32*D24 A36*E32*E24 0.41 A37*B32*B25 A37*C32*C25 A37*D32*D25 A37*E32*E25 0.08 A38*B32*B26 A38*C32*C26 A38*D32*D26 A38*E32*E26 SUM(A33:A38) SUM(B33:B38) SUM(C33:C38) SUM(D33:D38) SUM(E31:E36) Portfolio variance SUM(B39:G39) Portfolio SD B40^0.5 Portfolio mean A33*B4+A34*B5+A35*B6+A36*B7+A37*B8+A38*B9
HSI A36 A33*F32*F21 A34*F32*F22 A35*F32*F23 A36*F32*F24 A37*F32*F25 A38*F32*F26 SUM(F33:F38)
KOSPI A37 A33*G32*G21 A34*G32*G22 A35*G32*G23 A36*G32*G24 A37*G32*G25 A38*G32*G26 SUM(G33:G38)
HSI 0.41 65.71 -45.19 51.39 -66.89 117.20 10.90 122.63
KOSPI 0.08 11.03 -8.71 9.06 -9.39 10.90 7.28 20.25
Results Weights 0.78 -0.54 0.60 -0.34 0.41 0.08 1.00 Portfolio variance Portfolio SD
S&P 500 0.78 119.12 -60.90 73.75 -51.05 65.71 11.03 147.42 292.92 17.11
FTSE100 -0.54 -60.90 56.37 -56.77 34.37 -45.19 -8.71 -72.65
It adjusts the entries in the border-multiplied covariance matrix to reflect the multiplication by these new weights, and it shows the mean and variance of this optimal portfolio-the minimum variance portfolio with mean return of 15%. These results are shown in Table 10.2D, cells B54:C56. You can find that they yield an expected return of 15% with a standard deviation of 17.11% (results in cells B56 and B55). To generate the entire efficient frontier, keep changing the required mean in the constraint (in cell B56), letting the Solver
SMI 0.60 73.75 -56.77 103.47 -40.39 51.39 9.06 132.04
STI -0.34 -51.05 34.37 -40.39 67.54 -66.89 -9.39 -56.77
work for you. If you record a sufficient number of points, you will be able to generate a graph with the quality of Fig. 10.7. If short selling is not allowed, you can compute the portfolio on efficient frontier with the short sales constraint applying Solver. As to the example described above, we need to impose the additional constraints that each weight (the elements in column A and row 46) must be nonnegative. Once they are entered, you repeat the variance-minimization exercise until you generate the entire restricted frontier. The outer
10 Portfolio Optimization Models and Mean–Variance Spanning Tests
177
Table 10.2 (continued) A
B
C
D
E
F
G
H
I
58 E. The Unrestricted Efficient Frontier 59
Mean
Std. Dev.
S&P 500
FTSE100
SMI
ST I
HSI
KOSPI
60
6.000
12.999
0.459
0.699
-0.111
0.061
-0.188
0.080
7.730
12.710
0.522
0.461
0.025
-0.016
-0.072
0.081
62
9.000
12.867
0.568
0.287
0.125
-0.073
0.012
0.081
63
11.000
13.716
0.640
0.013
0.282
-0.162
0.146
0.082
64
13.000
15.185
0.713
-0.261
0.439
-0.252
0.279
0.083
65
15.000
17.115
0.785
-0.536
0.596
-0.341
0.412
0.084
66
17.000
19.369
0.857
-0.810
0.753
-0.430
0.546
0.085
19.784
22.976
0.812
-1.000
0.940
-0.635
0.812
0.071
68
22.000
26.499
0.661
-1.000
1.063
-0.860
1.087
0.050
69
24.000
30.075
0.453
-1.000
1.191
-1.000
1.335
0.020
61
67
MVP
ORP
70 71 F. The Restricted Efficient Frontier 72
Mean
Std. Dev.
S&P 500
FTSE100
SMI
ST I
HSI
KOSPI
73
6.745
14.050
0.0000
1.0000
0.0000
0.0000
0.0000
0.0000
7.000
13.525
0.0586
0.8649
0.0000
0.0000
0.0000
0.0765
8.281
12.829
0.4797
0.4237
0.0190
0.0000
0.0000
0.0776
76
10.000
13.350
0.5922
0.0522
0.2402
0.0000
0.0384
0.0769
77
11.000
14.096
0.4743
0.0000
0.3262
0.0000
0.1411
0.0584
74 75
78
MVP
12.897
16.998
0.1292
0.0000
0.4839
0.0000
0.3775
0.0095
79
ORP
13.000
17.199
0.1104
0.0000
0.4924
0.0000
0.3904
0.0068
80
14.000
19.414
0.0000
0.0000
0.4396
0.0000
0.5604
0.0000
81
15.000
22.540
0.0000
0.0000
0.2176
0.0000
0.7824
0.0000
82
U=E (rp ) − 0.005As
83 Utility Function 84 Risk Aversion Index (A)
2 p
5
85
⇒ (1−y) rf + y* E ( rp )
86 G.Optimal Complete Risky Portfolio(Unrestricted Portfolio) Cell Formulas
87 88
Risk-free Asset
89 Return
SUMPRODUCT (B96:C96,B98:C98)
B69
90 Risk
0.00
C69
91 optimal position
1-C98
(C96-B96)/(0.01*B88*C97^2)
92
Optimal Overall Portfolio
Optimal Risky Portfolio
4.01
C98*C97
⇒ y *s p
(1-y,y)
93 95
Results
96
Risk-free Asset
Optimal Risky PortfolioOptimal Overall Portfolio
97 Return
4.01
19.8
13.44
98 Risk
0.00
23.0
13.73
99 optimal position
0.40
0.60
100
(1-y,y)
101 102 H. Optimal Complete Risky Portfolio (Restricted Portfolio) 103
Risk-free Asset
Optimal Risky Portfolio Optimal Overall Portfolio
104 Return
4.01
12.90
9.48
105 Risk
0.00
17.00
10.46
0.38
0.62
106 optimal position 107
(1-y,y)
108
frontier in Fig. 10.7 is drawn assuming that the investor may maintain negative portfolio weights, and the inside frontier obtained allowing short sales. Table 10.2E and F present a number of points on the two frontiers with and without short sales. You can see that the weights in restricted portfolios are never negative. The minimum variance portfolios in two frontiers are not the same. Before we move on, let us summarize the steps of using Solver to calculate the variance-minimization portfolio.
The steps with Solver are as follows: 1. Invoke Solver by choosing Tools then Options then Solver. 2. Specify in the Solver parameter Dialog Box: The Target cell to be optimized, specify max or min. 3. Choose Add to specify the constrains then OK. 4. Solve and get the results in the spreadsheet.
178
W.-P. Chen et al.
Fig. 10.7 Efficient frontier of unrestricted and restricted portfolio Portfolio Return (%)
Portfolio Efficient Frontier
30 25 20 15 10 5 0
0
5
10
15
20
25
30
35
Portfolio Risk (%) Restricted
10.5.1.2 Step Two: Finding Optimal Risky Portfolio Now that we have the efficient frontier, we proceed to step two. In order to get the optimal risky portfolio, we should find the portfolio on the tangency point of capital allocation and efficient frontier. To do so, we can use the Solver to help us. First, you enter the of the target function, “maximum” the E.r /r reward-to-variability ratio ( pp f , we assume risk-free rate is 4.01%13 ) the slope of the CAL, input the cell range (the portfolio weights, contained in cells A47:A52), and other necessary constraints (such like the sum of the weights equal to one and others). Then ask the Solver to find the optimal portfolio weights. The results are shown in Table 10.2E and F. The optimal risky portfolio with short selling allowance has an expected return of 19.784% (in cell B67) with a standard deviation of 22.976% (in cell C67). The expected return and standard deviation of the restricted optimal risky portfolio are 12.897% (in cell B78) and 16.998% (in cell C78).
10.5.1.3 Step Three: Capital Allocation Decision A person’s allocation decision will be influenced by his or her degree of risk aversion. Now we have the optimal risky portfolio, and then we can use the concept of complete portfolio allocation funds between the risky portfolio and the risk-free asset. We use Equation (10.11) as our utility function and set the risk aversion equal to 5; the risk-free rate is 4.1%. First, we construct a complete portfolio with the risk-free asset and the optimal risky portfolio.
13
We use 3-month Treasury Bill interest rate as the risk-free rate; the average interest rate of a 3-month T-Bill is 4.09% from 1990/01 to 2006/12.
Unrestricted
S&P 500
FTSE100
SMI
STI
HSI
KOSPI
According to Equation (10.14), the optimal weight in E.rp /rf the risky portfolio is 0:01A and the optimal position 2 p
E.r /r
p f of the risk-free asset is 1 0:01A 2 . Then we can use p Equations (10.5) and (10.6) to calculate the expected return and standard deviation of the overall optimal portfolio. The results are shown in Table 10.2G and H. The optimal unrestricted (restricted) portfolio has a 13.44% (9.48%) expected return with 13.73% (10.46%) standard deviation. And the investor will invest 60% (62%) of portfolio value in risky portfolio and 40% (38%) in risk-free asset. To sum up the three steps, all the results are shown in Table 10.3.
10.5.2 Application: MATLAB By using the Solver, you can repeat the varianceminimization exercise until you generate the entire efficient frontier, just with a push of a button. The computation of the efficient frontier is fairly easy in Microsoft Excel. However, Excel is limited in the number of assets it can handle. If our portfolio consists of hundreds of asset, using Excel to deal with the Markowitz optimal problem will become more complicated. MATLAB can handle this problem. The Financial Toolbox within MATLAB provides a complete integrated computing environment for financial analysis and engineering. The toolbox has everything you need to perform mathematical and statistical analysis of financial data and display the results with presentation-quality graphics. You do not need to switch tools, convert files, or rewrite applications. Just write expressions the way you think of problems, MATLAB will do all that for you. You can quickly ask, visualize, and answer complicated questions. Financial Toolbox includes a set of portfolio optimization functions designed to find the portfolio that best meets investor requirements. Recall the steps when using Excel in
10 Portfolio Optimization Models and Mean–Variance Spanning Tests
179
Table 10.3 The results of optimization problem Unrestricted portfolio
Restricted portfolio
Minimum variance portfolio
Optimal risky portfolio
Optimal Overall portfolio
Minimum variance portfolio
Optimal risky portfolio
Optimal Overall portfolio
Portfolio return
7.73%
19.78%
13.44%
8.28%
12.89%
9.48%
Portfolio risk
12.71%
22.98%
13.73%
12.83%
16.99%
10.46%
Portfolio
previous section. In the portfolio optimization problem, we need to prepare several of the data for the computation preparation. First, we need the expected return, standard deviation, and covariance matrix. Then, we can calculate the optimal risky portfolio weights and know how allocation will be more efficient. Before a taking real portfolio selection example by using MATLAB, we introduce the several portfolio optimization functions that will be use in our portfolio selection analysis.
10.5.2.1 Mean–Variance Efficient Frontier Calling function frontcon returns the mean–variance efficient frontier for a given group of assets with user-specified asset constraints, covariance, and returns. The computation is based on sets of constraints representing the maximum and minimum weights for each asset and the maximum and minimum total weight for specified groups of assets. The efficient frontier computation functions require information about each asset in the portfolio. This data is entered into the function via two matrices: an expected return vector and a covariance matrix. Calling frontcon while specifying the output arguments returns the corresponding vectors and arrays representing the risk, return, and weights for each of the 10 points computed along the efficient frontier. As there are no constraints, you can call frontcon directly with the data you already have. If you call frontcon without specifying any output arguments, you get a graph representing the efficient frontier curve.
is a vector of the expected return of each portfolio, and PortWts is a matrix of weights allocated to each asset.14 The output data is represented row-wise, where each portfolio’s risk, rate of return, and associated weight is identified as corresponding rows in the vectors and matrix. ExpReturn and ExpCovariance are frontcon required information. NumPorts, PortReturn, AssetBounds, and GroupBounds all are optional information. ExpReturn specifies the expected return of each asset; ExpCovariance specifies the covariance of the asset returns. In NumPorts, you can define the number of portfolios you want to be generated along the efficient frontier. If NumPorts is empty (entered as []), frontcon computes 10 equally spaced points.15 PortReturn is vector of length equal to the number of portfolios containing the target return values on the frontier. If PortReturn is not entered or [], NumPorts equally spaced returns between the minimum and maximum possible values are used. AssetBounds is a matrix containing the lower and upper bounds on the weight allocated to each asset in the portfolio.16 Groups is number of groups matrix specifying asset groups or classes.17 GroupBounds matrix allows you to specify an upper and lower bound for each group. Each row in this matrix represents a group. The first column represents the minimum allocation, and the second column represents the maximum allocation to each group. Notice, the arguments in the parentheses are the input data and the items in the square bracket are the output returns of function. If you do not want the results to show up in output window, you may use the semicolon to suppress output from the evaluation of function. However, if you want to get all
Function: frontcon Syntax [PortRisk, PortReturn, PortWts] D frontcon(ExpReturn, ExpCovariance, NumPorts, PortReturn, AssetBounds, Groups, GroupBounds) PortRisk, PortReturn, and PortWts are the returns of frontcon, where PortRisk is a vector of the standard deviation of each portfolio, PortReturn
14
The total of all weights in a portfolio is 1. When entering a target rate of return (PortReturn), enter NumPorts as an empty matrix []. 16 The Default of lower asset bound is all 0s (no short-selling); default upper asset bound is all 1s (any asset may constitute the entire portfolio). 17 Groups.i; j/ D 1 (jth asset belongs in the ith group). Groups.i; j/ D 0 (jth asset not a member of the ith group). 15
180
output results from the evaluation of function, you must add square brackets with output arguments when in expression feedback of functions in writing syntax. The names of input and output arguments can be user-specified. It is not necessary to be named as our example shows. You can name what you want, as long as the concept of data matches the function arguments required.
W.-P. Chen et al.
tions. The number of columns of the matrix A, and the length of the vector Wts correspond to the number of assets. The number of rows of the matrix A, and the length of vector b correspond to the number of constraints. This method allows you to specify any number of linear inequalities to the function portopt.18 Function: portcons Syntax ConSet D portcons(varargin19)
10.5.2.2 Portfolios on Constrained Efficient Frontier If you want set some linear constraints when computing efficient frontier, you can call portopt function. The portopt computes portfolios along the efficient frontier for a given group of assets, based on a set of user-specified linear constraints. Typically, these constraints are generated using the constraint specification functions that we will describe in next section. Function: portopt
ConSet D portcons(‘ConstType’, Data1, : : :, DataN) creates a matrix ConSet, based on the constraint type ConstType, and the constraint parameters Data1, : : :, DataN. ConSet D portcons(‘ConstType1’, Data11, : : :, Data1N, ‘ConstType2’, Data21, : : :, Data2N, : : :) creates a matrix ConSet, based on the constraint types ConstTypeN, and the corresponding constraint parameters DataN1, : : :, DataNN.
Syntax [PortRisk, PortReturn, PortWts] D portopt(ExpReturn, ExpCovariance, NumPorts, PortReturn, ConSet)
10.5.2.4 Optimal Capital Allocation to Efficient Frontier Portfolios
Since the portopt function is able to cope with more general constraints than the frontcon function, the portopt function can be viewed as an advanced version of the frontcon function. The main difference between these two is ConSet; others are the same as described in frontcon function. ConSet, optional information, is a constraint matrix for a portfolio of asset investments, created using portcons. If not specified, a default is created. The syntax expressed above will return the mean–variance efficient frontier with user-specified covariance, returns, and asset constraints (ConSet). If portopt is invoked without output arguments, it returns a plot of the efficient frontier (Table 10.4).
A last function we describe here may be used to find an optimal portfolio. So far, we have deal with efficient portfolios, leaving the risk/return trade-off unresolved. We may resolve this tradeoff by linking mean–variance portfolio theory to the more general utility theory. Function portalloc computes the optimal risky portfolio on the efficient frontier, based on the risk-free rate, the borrowing rate, and the investor’s degree of risk aversion. It also generates the capital allocation line, which provides the optimal allocation of funds between the risky portfolio and the risk-free asset. In the Financial toolbox the function portalloc is provided, which yields the optimal portfolio assuming the quadratic utility function: U D E.rp / 0:005Ap2 , where the parameter A is linked to risk aversion; its default value is 3.
10.5.2.3 Portfolio Constraints While frontcon allows you to enter a fixed set of constraints related to minimum and maximum values for groups and individual assets, you often need to specify a larger and more general set of constraints when finding the optimal risky portfolio. The function portopt addresses this need, by accepting an arbitrary set of constraints as an input matrix. The auxiliary function portcons can be used to create the matrix of constraints, with each row representing an inequality. These inequalities are of the type A Wts0 b, where A is a matrix, b is a vector, and Wts is a row vector of asset alloca-
18
In reality, portcons is an entry point to a set of functions that generate matrices for specific types of constraints; portcons allows you to specify all the constraints data at once, while the specific portfolio constraint functions allow you to build the constraints incrementally. These constraint functions are pcpval, pcalims, pcglims, and pcgcomp. 19 varargin is variable length input argument list. The varargin statement is used only inside a function M-file to contain optional input arguments passed to the function. The varargin argument must be declared as the last input argument to a function, collecting all the inputs from that point onwards. In the declaration, varargin must be lowercase.
10 Portfolio Optimization Models and Mean–Variance Spanning Tests
181
Table 10.4 The detail information of portcons function (Source: MATLAB Financial toolbox) Constraint type Description Value Default
All allocations are 0; no short selling allowed. Combined value of portfolio allocations normalized to 1
NumAssets (required). Scalar representing number of assets in portfolio.
PortValue
Fix total value of portfolio to PVal.
PVal (required). Scalar representing total value of portfolio. NumAssets (required). Scalar representing number of assets in portfolio.
AssetLims
Minimum and maximum allocation per asset.
AssetMin (required). Scalar or vector of length NASSETS, specifying minimum allocation per asset. AssetMax (required). Scalar or vector of length NASSETS, specifying maximum allocation per asset. NumAssets (optional).
GroupLims
Minimum and maximum allocations to asset group.
Groups (required). NGROUPS-by-NASSETS matrix specifying which assets belong to each group. GroupMin (required). Scalar or a vector of length NGROUPS, specifying minimum combined allocations in each group. GroupMax (required). Scalar or a vector of length NGROUPS, specifying maximum combined allocations in each group.
GroupComparison
Group-to-group comparison constraints.
GroupA (required). NGROUPS-by-NASSETS matrix specifying first group in the comparison. AtoBmin (required). Scalar or vector of length NGROUPS specifying minimum ratios of allocations in GroupA to allocations in GroupB. AtoBmax (required). Scalar or vector of length NGROUPS specifying maximum ratios of allocations in GroupA to allocations in GroupB. GroupB (required). NGROUPS-by-NASSETS matrix specifying second group in the comparison.
Custom
Custom linear inequality constraints A PortWts’b.
Function: portalloc Syntax [RiskyRisk, RiskyReturn, RiskyWts, RiskyFraction, OverallRisk, OverallReturn] D portalloc(PortRisk, PortReturn, PortWts, RisklessRate, BorrowRate, RiskAversion) RiskyRisk is the standard deviation of the optimal risky portfolio; RiskyReturn is the expected return of the optimal risky portfolio; RiskyWts is a vector of weights allocated to the optimal risky portfolio; RiskyFraction is the fraction of the complete portfolio allocated to the risky portfolio; OverallRisk is the standard deviation of the
A (required). NCONSTRAINTS-by-NASSETS matrix, specifying weights for each asset in each inequality equation b (required). Vector of length NCONSTRAINTS specifying the right hand sides of the inequalities.
optimal overall portfolio; OverallReturn is the expected rate of return of the optimal overall portfolio. PortRisk is standard deviation of each risky asset efficient frontier portfolio; PortReturn is expected return of each risky asset efficient frontier portfolio; PortWts is weights allocated to each asset. RisklessRate is risk-free lending rate (for investing), a decimal number; BorrowRate is borrowing rate, which is larger than the riskless rate – a decimal number.20 RiskAversion is coefficient of investor’s degree of risk aversion (Default D 3). BorrowRate and RiskAversion are optional. As with frontcon, calling portopt while specifying output arguments returns the corresponding vectors and ar20
If borrowing is not desired, or not an option, set to NaN (default).
182
W.-P. Chen et al.
rays representing the risk, return, and weights for each of the portfolios along the efficient frontier. Use them as the first three input arguments to the function portalloc. Calling portalloc while specifying the output arguments returns the variance (RiskyRisk), the expected return (RiskyReturn), and the weights (RiskyWts) allocated to the optimal risky portfolio. It also returns the fraction (RiskyFraction) of the complete portfolio allocated to the risky portfolio and the standard deviation (OverallRisk) and expected return (OverallReturn) of the optimal overall portfolio consisting of the risky portfolio and the risk-free asset. The overall portfolio combines investments in the risk-free asset and in the risky portfolio. The actual proportion assigned to each of these two investments is determined by the degree of risk aversion characterizing the investor. The value of RiskyFraction exceeds 1 (100%), implying that the risk tolerance specified allows borrowing money to invest in the risky portfolio, and that no money will be invested in the risk-free asset. This borrowed capital is added to the original capital available for investment. Calling portalloc without specifying any output arguments gives a graph displaying the critical points.
to create the matrix of constraints for short selling allowance and call portfopt compute constrained efficient frontier. Finally, use PortRisk, PortRet, PortWts input arguments to the function portalloc and find the optimal risky portfolio and the optimal allocation of funds between the risky portfolio and the risk-free asset. In this example the resulting optimal portfolio is formed with 88.92% in the risky portfolio and 11.08% in the risk-free asset. The optimal overall portfolio return is 17.03% with 20.76% standard deviation. The expected return of six-stockindex portfolio is 18.64%, and the portfolio risk is 23.35%. The best allocation proportion of each asset for the investor is to allocate 85.84% portfolio value in S&P 500 Index, 96.39% in Swiss Market Index (SMI), and 81.39% in Hang Seng Index (HSI), and short selling 100% portfolio value in FTSE 100 Index, 61.82% in Straits Times Index (STI), and 1.8% in Korea Composite Stock Price Index (KOPSI) (Figs. 10.8–10.10). For further information about computer program application of modern portfolio theory, please refer to Benninga (2008), Jackson and Staunton (2001), and Brandimarte (2006).
10.5.2.5 Six-Stock-Index Portfolio
10.6 Conclusion Following the previous example, assume that an investor forms a six-stock-index portfolio that is his/her optimal portfolio allocation. Assume that the risk free investment return is 4.1%, investor’s degree of risk aversion is 3, and the borrowing rate is 7%. We also allow short selling up to 100% of the portfolio value in each stock index, but limit the investment in any each stock index to 150% of the portfolio value. Now we use MATLAB to assist him/her. First, we calculate rate of return of the natural logarithm stock index price. Second, we use mean and cov functions to calculate the expected return of each stock index and covariance matrix of portfolio. Third, call portcons
In this chapter, we introduced the theory and the application of computer program of modern portfolio theory. In the first part, we illustrated the Markowitz portfolio model to show that investors could lower the total risk of investing without sacrificing return by investing in portfolios rather than in individual assets. The Markowitz portfolio selection problem involves three topics: (1) finding the efficient frontier, (2) constructing the capital asset line, and (3) determining the optimal portfolio. We also demonstrated the methods applied in these topics by using Microsoft Excel and MATLAB. In the second part, we introduced the mean–variance span-
clear all; clc; tic %portfolio.m load stock_index.txt; Portfolio = stock_index (:,1:6); [m n] = size(Portfolio); [AvrRet]= mean(Portfolio); [ExpRetVec]= AvrRet*12;
Fig. 10.8 MATLAB code for asset allocation
[CovMtx]= cov(Portfolio)*12;
10 Portfolio Optimization Models and Mean–Variance Spanning Tests
183
Fig. 10.8 (continued) % COMPUTE EFFICIENT FRONTIER (SHORT SELLING ALLOWED) NumPorts = n; AssetMin = [ -1 -1 -1 -1 -1 -1]; AssetMax = [ 1.5 1.5 1.5 1.5 1.5 1.5]; pval = 1; Constraint = portcons('PortValue',pval, NumPorts, 'AssetLims', AssetMin, AssetMax, NumPorts); [PortRisk,PortRet,PortWts] = portopt(ExpRetVec, CovMtx, 50, [], Constraint); [OptRisk,OptRet,OptWts]=portopt(ExpRetVec, CovMtx, NumPorts, [], Constraint); % OPTIMAL ASSET ALLOCATION RisklessRate = 0.041; BorrowRate = 0.07; RiskAversion = 3; [RiskyRisk, RiskyReturn, RiskyWts, RiskFraction, OverallRisk, OverallReturn] = portalloc (PortRisk, PortRet, PortWts, RisklessRate, BorrowRate, RiskAversion)
% PLOT THE RESULT
plot(PortRisk, PortRet, 'm-', ... 0, RisklessRate, 'k:square', RiskyRisk, RiskyReturn, 'k:diamond',... OverallRisk, OverallReturn, 'o',... [0; RiskyRisk], [RisklessRate; RiskyReturn],'r--') xlabel('Portfolio Risk'); ylabel('Portfolio Return '); title('Efficient Frontier'); legend('Efficient Frontier', 'Risk free asset', 'Risky Portfolio', 'Asset Allocation Point') grid on
RiskyRisk = 0.2335 RiskyReturn = 0.1864 RiskyWts = 0.8584 -1.0000 RiskFraction = 0.8892 OverallRisk = 0.2076 OverallReturn = 0.1703
0.9639
Fig. 10.9 MATLAB output of optimal allocation problem
-0.6182
0.8139
-0.0180
184
W.-P. Chen et al.
Fig. 10.10 Optimal asset allocation
ning test that follows directly from the portfolio optimization problem. The mean–variance spanning tests enable us to analyze the effect on the mean–variance frontier of adding new assets to a set of benchmark assets. In conclusion, this chapter provides us with a complete overview and some techniques on portfolio analyses.
References Benninga, S. 2008. Financial modeling, 3rd ed., MIT, MA. Bodie, Z., A. Kane, and A. J. Marcus. 2007. Essentials of investment, 6th ed., McGraw-Hill/Inwin, New York. Bodie, Z., A. Kane, and A. J. Marcus. 2008. Investment, 7th ed., McGraw-Hill/Inwin, New York. Brandimarte, P. 2006. Numerical methods in finance and economics: a MATLAB-based introduction, 2nd ed., Wiley, New York.
DeRoon, F. A. and T. E. Nijman. 2001. “Testing for mean-variance spanning: a survey.” Journal of Empirical Finance 8, 111–155. Financial Toolbox 3. 2009. User’s Guide. The MathWorks, Inc. Huberman, G. and S. Kandel. 1987. “Mean-variance spanning.” Journal of Finance 42, 873–888. Jackson, M. and M. Staunton. 2001. Advanced modeling in finance using excel and VBA, Wiley, New York. Jobson, J. D. and B. Korkie. 1989. “A performance interpretation of multivariate tests of asset set intersection, spanning and meanvariance efficiency.” Journal of Financial and Quantitative Analysis 24, 185–204. Kan, R., and G. Zhou. 2008. “Tests of mean-variance spanning.” Working Paper, University of Toronto and Washington University. Lee, C. F., J. E. Finnerty, and D. H. Wort. 1990. Security analysis and portfolio management. Scott, Foresman/Little Brown, Glenview, IL. Markowitz, H. M. 1952. “Portfolio selection.” Journal of Finance 7, 77–91. Markowitz, H. M. 1959. Portfolio selection: efficient diversification of investments, Wiley, New York.
Chapter 11
Combining Fundamental Measures for Stock Selection Kenton K. Yee
Abstract Securities selection is the attempt to distinguish prospective winners from losers – conditional on beliefs and available information. This article surveys some relevant academic research on the subject, including work about the combining of forecasts (Operational Research Quarterly 20, 451–468, 1969), the Black-Litterman model (Journal of Fixed Income 1(2), 7–18, 1991; Financial Analysts Journal (September/October) 28–43, 1992), the combining of Bayesian priors and regression estimates (Journal of Finance 55(1), 179–223, 2000), model uncertainty and Bayesian model averaging (Statistical Science 14(4), 382–417, 1999; Review of Financial Studies 15(4), 1223–1249, 2002), the theory of competitive storage (Review of Economic Studies 59, 1–23, 1992), and the combination of valuation estimates (Review of Accounting Studies 12(2–3), 227–256, 2007). Despite its wide-ranging applicability, the Bayesian approach is not a license for data snooping. The second half of this article describes common pitfalls in fundamental analysis and comments on the role of theoretical guidance in mitigating these pitfalls. Keywords Value investing r Fundamental analysis r Noisy markets r Data snooping r Theory of storage
11.1 Introduction Even if one believes the market is efficient in the intermediate or long run, there is no reason to believe that instant market quotes reflect long-run value in the short run (Black 1986). On the contrary, market microstructure, liquidity, and behavioral phenomena suggest that at any given time prices may be noisy and returns skewed. Accordingly, investors, bankers, insiders, analysts, tax authorities, judges, and other capital market participants are constantly pressed to assess
K.K. Yee () Columbia Business School, New York, NY, USA e-mail:
[email protected]
and reassess value not only to “beat the market” but also to set fair prices for contentious transactions. A large industry of institutional “tactical” asset managers exist to exploit temporary market mispricing and to push the limits of arbitrage. Obtaining value estimates or “scores” is the starting point for stock selection (Treynor and Black 1973; Klarman 1991; Penman 2001; Yee 2008b), value-based indexing in a noisy market (Chan and Lakonishok 2004; Arnott et al. 2005; Siegel 2006; Perold 2007), and beating benchmark indices through tactical asset allocation (Black and Litterman 1991, 1992; Lo 2007). Obtaining accurate value estimates is also useful at the firm level. Boards of directors and shareholders determine what prices to tender or accept for mergers and acquisitions (Yee 2005b). Tax authorities and other regulators must appraise estates, assess the fairness of transfer prices, and audit the value of employee stock option plans (ESOPs). When irreconcilable disputes arise, their assessments are adjudicated in court by a judge, who chooses a terminal appraisal after weighing conflicting testimony from adversarial experts. In principle, valuing a company is not rocket science. According to asset pricing theory, value is the sum of expected cash outflows suitably discounted. One way to estimate equity value is the discounted cash flow (DCF) method. DCF is implemented by forecasting expected cash inflows from operations, netting them against forecasted payments to creditors, and then discounting. The discount factor, which depends on interest rates, the equity risk premium, leverage, time, and possibly other factors, is estimated based on models such as CAPM or APT (Kaplan and Ruback 1995; Ang and Liu 2004). One may also transform the DCF formula into representations that take advantage of accounting information to predict cash flows and, accordingly, represent value (Feltham and Ohlson 1999; Yee 2005a; Ohlson and Gao 2006). In a certain world with perfect markets, DCF is cut and dry. But owing to uncertainty in prospective cash flows (which may be exacerbated by limited available information) and inherent shortcomings of CAPM and APT (which require estimating beta and forecasting values for the timevarying equity risk premium), DCF analysis inevitably leads
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_11,
185
186
K.K. Yee
to an imprecise answer. Indeed, DCF analysis – where subjectively shifting numbers around can lead to radically different answers – gives a new twist on the old adage that there are three types of lies: lies, damned lies, and equity valuation. In the face of such uncertainty and subjectivity, estimated value – like beauty – is in the eyes of the beholder even without further complications caused by heterogeneous beliefs or information asymmetry. A valuation result depends on the procedure one uses to obtain it. Along with DCF, the most popular procedures include asset-based valuation, the method of comparables, and ratio analysis. The method of comparables entails identifying a sample of peer companies and an accounting benchmark such as earnings, revenues, or forward earnings. Assuming the price-to-benchmark ratios of all peer companies are roughly equal, one infers the value of a company from the average or median price-tobenchmark ratio of its peers. Although the method of comparables is arguably the most popular valuation method, it is plagued with subjectivity in both the determination of pertinent comparable measures and the identification of a firm’s true peers (Bhojraj and Lee 2003). In practice, the outcome can be manipulated by including or excluding standouts or outliers from a candidate peer group. Another problem arises when modestly different procedures yield disparate answers when applied to the same company at the same time. No single procedure is conclusively most precise and accurate in all situations. Since no method universally dominates the others, it is not surprising that financial analysts frequently estimate DCF as well as method of comparable values for a more complete analysis. Accordingly, popular valuation textbooks – for example, Penman (2001) and Damodaran (2002) – promulgate multiple techniques and view them as being simultaneously and contextually informative, with no dominating method. When available, market value serves as yet another estimate of value. How should an analyst combine two or more noisy valuation estimates into a superior valuation estimate that makes optimal use of the given information? Although there is tremendous practitioner interest this question, the academic literature offers surprisingly little guidance on how to combine estimates (Yee 2004a, 2008a). When analysts do combine estimates, their techniques are usually ad hoc since there is no theoretical guiding framework. A common practice is to form a linear weighted average of the different valuations.1
Indeed, weighted average valuations have become industry practice in financial legal proceedings (Sharfman 2003) and IRS tax-related valuation disputes (Beatty et al. 1999). The Delaware Block Method is an important case in point. In the years following the Great Depression, the Delaware Court of Chancery, the leading corporate law jurisdiction in the United States, evolved a valuation procedure that, by the late 1970s, became known as the Delaware Block Method. Virtually all American courts employed the Delaware Block Method for appraising equity value in the 1970s and early 1980s. In Delaware, the Block Method was the legally prescribed valuation procedure for shareholder litigation up until 1983. In 1983, the Delaware Supreme Court updated the law to allow judges to consider “modern” methods like discounted cash flow. Subsequently, the Delaware courts currently use the Block Method, discounted cash flow, and hybrids of these methods (Yee 2002; Chen et al. 2007 and 2010). The Delaware Block Method is most easily characterized in algebraic terms, even though judges do not write down such explicit formulas in their legal options. The Block Method relies on three measures to estimate the fair value of equity:
1
2
The author’s equity-valuation students enrolled in an MBA elective at a leading business school choose to take weighted averages of their valuation estimates even though nothing in the class materials advises them to combine estimates. When asked why they do this, students report they learned to do this at work, other classes, or that weighing makes common sense. The students do not articulate how they determine their weights, and the weights vary greatly between students.
contemporaneous market price P net asset value2 of equity b 5-year trailing earnings average e capitalized by ®, a num-
ber estimated from price-to-earnings ratios of comparable firms. Then, the fair value V of equity at date t is estimated as the weighted average V D .1 b e / P C b b e ' e:
(11.1)
The weights .1 ›b ›e /; ›b , and ›e – numbers between 0 and 1 – are adjudicated on a case-by-case basis. Earnings capitalization factor ¥ is also determined on a case-by-case basis and typically ranges between 5 and 15. Even if one accepts combining estimates by averaging them, as in the Delaware Block Method, how should the relative weights on P, b, and e be determined? Should more weight be given to market value than net asset value? Should net asset value receive more attention than earnings value? The simplest choice might be to take an equally weighted average, but even this parsimonious rule is ad hoc and unjustified. Net asset value is the liquidation value of assets minus the fair or settlement value of liabilities. Net asset value is not the same as accounting “book” value. Book value, because it derives from historical cost accounting, arcane accounting depreciation formulas, and accounting rules that forbid the recognition of valuable intangible assets, is known to be an unreliable (typically conservatively biased) estimate of net asset value.
11 Combining Fundamental Measures for Stock Selection
On this issue, the IRS valuation rules, honed by decades of spirited high stakes litigation and expensive expert testimony, do not lend insight. Revenue Ruling3 59–60 advises that “where market quotations are either lacking or too scarce to be recognized,” set 1 b e D 0 and focus on earnings and dividend-paying capacity. For the comparables method, use the “market price of corporations engaged in the same or similar line of business.” Upon recognizing that “determination of the proper capitalization rate presents one of the most difficult problems,” the Ruling advises that “depending upon the circumstances in each case, certain factors may carry more weight than others: : :valuations cannot be made on the basis of a prescribed formula.” Intuition provides a bit more guidance. The intuitively sensible rule is to weigh the more credible estimates more and the less credible estimates less. For example, if a security is thinly traded, illiquidity may cause temporary price anomalies.4 Likewise, “animal spirits” may cause the price of hotly traded issues to swing unexplainably. In either extreme, price quotations are unreliable indicators of fundamental value and common sense says to weigh price quotations less in such cases. On the other hand, price should be given more weight when it is believed that market price is more efficient and has not been subject to animal spirits. Yee (2008a) proposes Bayesian “triangulation” as the appropriate conceptual framework for combining value estimates. Section 11.2 describes the Bayesian triangulation formula, which prescribes how to combine different valuation estimates and price into a superior valuation estimate. Section 11.3 illustrates how Bayesian triangulation can be used to back out how noisy Delaware shareholder litigation judges believe share prices are. Bayesian methods provide a coherent framework for integrating views of analysts and investors into empirical statistical models. Model conclusions blend the information content of observed data with subjective information input. A prominent example is the Black-Litterman (1991, 1992) model, which is reviewed in Sect. 11.4. Section 11.4 also describes other applications of the Bayesian framework in asset pricing research, including Pastor (2000), Cremers (2002), and the Bayesian combination of forecasts. Sections. 11.5–11.7 turn to the dangers of misusing the Bayesian framework as an excuse to justify data snooping. Section 11.5 recalls how data snooping misleads researchers 3 The Internal Revenue Code addresses the valuation of closely held securities in Section 2031(b). The standard of value is “fair market value,” the price at which willing buyers or sellers with reasonable knowledge of the facts are willing to transact. Revenue Ruling 59–60 (1959–1 C.B. 237) sets forth the IRS’s interpretation of IRC Section 2031(b). 4 For example, a large block trade of micro-cap shares may cause temporary price volatility. Non-synchronous trading may cause apparent excessive price stability, as seen in emerging equity markets.
187
into drawing false conclusions even within a seemingly rigorous Bayesian context. Using an example from commodities pricing, Sect. 11.6 illustrates how micro-economic theory can be used to lend credibility to empirical results. Section 11.7 describes data snooping pitfalls in financial statement analysis and deep-dive fundamental research. Section 11.8 concludes.
11.2 Bayesian Triangulation This section prescribes how to fold two different valuation estimates together with the market price to achieve a more precise valuation estimate. First, it introduces Theorem 11.1 and Equation (11.2), the formula prescribing how much to weigh the individual estimates for maximum accuracy of the triangulated valuation estimate. Then, it describes how to implement Equation (11.2) and give a recipe for estimating input parameters. To state Theorem 11.1, it is necessary to introduce notation. Assume the analyst has three distinct valuation estimates: market price P, an intrinsic valuation estimate VI , and a method of comparables estimate VC . In particular, VI might be a DCF or net asset valuation estimate, while VC might be an estimate based on peer-group forward price-to-earnings ratios or trailing price-to-book ratios. While raw valuation estimates are biased in practice (e.g., Lie and Lie 2002), assume that VI and VC are pre-processed to remove anticipated bias. (The theorem does not address the issue of how to remove systematic bias, whose removal is outside the scope of this article.) Let V denote what the analyst wants to estimate. While it is common to equate V with intrinsic value or the present value of expected dividends, one wants to avoid this trap since intrinsic value is not an observable. Rather, let us just say that V is the value one wants to estimate in a given contextual situation. From an active portfolio manager’s perspective, V represents the price target or “score” that determines a security’s “alpha” in the extended CAPM framework of Treynor and Black (1973; see also Grinold 1989). From a tax appraiser’s perspective, V represents the valuation that is most “fair.” From an empirical researcher’s perspective (see Sect. 11.4), V might identified as the future realization of market price at the investment horizon (say, next year’s price suitably discounted back to the current date). Then, from this perspective, the question is well defined: How does one use Bayesian triangulation to make the most accurate forecast of market price (or alpha) at the investment horizon? Theorem 11.1 states how to estimate V given inputs fP; VI ; VC ; ¢; ¢I ; ¢C ; ¡g where:
188
K.K. Yee
¢ is the standard error of market price P, which is a noisy
measure of fundamental value V. Specifically, V D P C e where5 e N 0; 2 . ¢I is the standard error of intrinsic valuation estimate VI , also a noisy of V. Assume VI D V C eI where measure eI N 0; I2 is uncorrelated with e. ¢C is the standard error of method of comparables estimate VC , also a noisy measure of V. In particular, VC D V C eC where eC N 0; C2 is uncorrelated with eI but is correlated with market noise so that corr .e; eC / > 0. The amount e by which market price P differs from V is noise. Since market noise is unobservable, suppose that e is an unobservable mean-zero random error. Error e reflects Fischer Black’s (1986) assertion that “because there is so much noise in the world, certain things are essentially unobservable.” VI may represent either a DCF or liquidation valuation estimate of V. Assume its estimation error eI , is uncorrelated6 with market noise e. In contrast, because method of comparable estimates relies directly on market price, they are affected by temporary market movements. Hence, assume that eC , the method of comparables estimation error, is positively correlated with market noise e. The key difference between VI and VC is that corr .e; eI / 0 while corr .e; eC / > 0. The central result is (Yee 2008a): Theorem 11.1. The most precise estimate of value V an analyst constructs from a linear combination of P, VI , and VC is _
V D .1 I C /P C I VI C C VC ;
(11.2)
where kI D
.12 / 2 C2 .12 / 2 C2 C . 2 C C2 C 2C /I2
kC D
.CC /I2 .12 / 2 C2 C . 2 C C2 C
.1 I C / D
2C /I2
.C C/I2 C : .12 / 2 C2 C . 2 C C2 C 2C /I2
Proof. See Appendix 11A. Y N m; 2 means Y is normally distributed with mean m and standard deviation £. Note that we leave the possibility that the standard deviations may be heteroskedatic – nothing in our model requires them to be independent of V. 6 Accounting treatments like mark-to-market accounting or cookiejar accounting may correlate accounting numbers to market noise. If so, then DCF estimates that rely on accounting numbers to make their cash flow projections may correlate to market noise, so that corr.eI ; e/ ¤ 0. For this reason, the Appendix provides the generalization of Theorem 11.1 to the case when corr.eI ; e/ D I ¤ 0. 5
Appendix 11A also offers a generalization of Theorem 11.1 to the case when corr.eI ; e/ D I ¤ 0, when DCF error eI correlates to market noise e. In the interest of expository parsimony, assume that corr.eI ; e/ D 0 in the main text. Because the choice of weights ›I and ›C in Theorem 11.1 _ _ minimizes the Bayesian uncertainty of V , call V the _ Bayesian triangulation of P, VI , and VC . In particular, V is a more precise estimate than P, VI , or VC are individually. Bayesian triangulation enables analysts to aggregate the information content of the individual estimates P, VI , and _ VC into a more robust estimate, V , of fundamental value V. Bayesian triangulation combines three different estimates _ into a more precise valuation estimate, V . Equation (11.2) has the simple appealing features finan_ cial analysts should find reassuring. First, V is the most efficient unbiased linear estimator ofV conditional on P, VI , _ and VC . In this sense, V is the Bayesian rational estimate of fundamental value given P, VI , and VC . If a valuation estimate is exact, it deserves full weight and should be reflected in the expressions of Theorem 11.1. It is straightforward to verify that if a valuation estimate (P, VI , or VC / is exact, then the expressions in Theorem 11.1 assign it 100% weight and its counterparts zero weight: If I ! 0, then I D 1 and C D .1 I C / D 0. If C ! 0, then C D 1 and I D .1 I C / D 0. If ! 0, then .1 I C / D 1 and I D C D 0.
On the other hand, if a valuation estimate is infinitely imprecise (e.g., worthless), it deserves no weight and the formulas in Theorem 11.1 assign it zero weight: If I ! 1, then I D 0. If C ! 1, then C D 0. If ! 1, then .1 I C / D 0.
When a valuation estimate is worthless, the two remaining estimates have non-zero weights that depend on their relative precisions. If a valuation estimate is noisier, an analyst expects it to receive less weight and the competing estimates to obtain commensurately more weight. As depicted in Fig. 11.1, this is what happens. As the ratio of ¢C to ¢ increases, the method of comparables and intrinsic valuation estimates (¢C D ¢I in the Figure) become more imprecise relative to market price. Accordingly, when this happens in Fig. 11.1, intrinsic value weight ›I and the method of comparables weight ›C fall while the market value weight .1 ›I ›C / increases. ›C exceeds ›I in Fig. 11.1 because the method of comparables estimate, by virtue of being correlated with market price, helps to cancel out market noise and so receives more weight than the uncorrelated intrinsic valuation estimate. Recall that ¡ is the correlation between market price noise and the method of comparables estimation error, VC .
Fig. 11.1 Triangulation weights κ_C, κ_I, and 1 − κ_C − κ_I as a function of z = σ_C/σ, assuming ρ = 0.5 and σ_C = σ_I. [Line chart "Maximum Precision Weights": as z rises from 0 to 3, the market-price weight 1 − κ_I − κ_C climbs toward roughly 0.8 while κ_C and κ_I decline, with κ_C above κ_I throughout.]
Assuming all firms have positive CAPM beta, the method of comparables error correlates with market noise because, by construction, the method relies on peers' market prices to determine V_C. However, the correlation is generally not exact, for many reasons; one is that the method of comparables is subject to sources of error unrelated to market price. Such errors stem from the use of imperfect peers, noisy benchmarks, or questionable benchmarks (e.g., manipulated earnings). Thus, financial analysts cannot expect that ρ = 1 in realistic situations. However, for the purpose of shedding light on Equation (11.2), it is instructive to imagine what would happen in the ρ = 1 limit. In this limit, algebra reveals that lim_{ρ→1} κ_I = 0 and Equation (11.2) simplifies to

$$\hat V = \frac{\sigma V_C - \sigma_C P}{\sigma - \sigma_C}.$$

In other words, if the only source of error in the method of comparables were market noise, then there would be no need to give weight to an independent DCF valuation estimate, V_I. In this idealized world, V is perfectly estimated by weighing market price P against the method of comparables estimate V_C. In this limit the method of comparables error exactly cancels out market noise (perfect diversification). In real-world applications, ρ ≠ 1, the intrinsic value estimate V_I should always have positive weight, and V̂ is an imperfect (though always maximum-precision) estimate of V.

In principle, one can apply Equation (11.2) by estimating the parameters σ, σ_C, σ_I, and ρ from historical data and then evaluating the weights κ_I and κ_C from those estimates. Alternatively, following Equation (11.2), it is possible to run regressions like 1 = (1 − κ_I − κ_C) + κ_I (V_I/P) + κ_C (V_C/P) on appropriate historical data samples to determine κ_I and κ_C directly. Of course, back testing presumes that the weights are constant over time, and whether this is true must be decided on a case-by-case basis.
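A minimal sketch of this regression approach on simulated data (all names and numbers are invented for illustration):

```python
import numpy as np

# Hypothetical panel of market prices and paired valuation estimates.
rng = np.random.default_rng(0)
n = 500
V = rng.lognormal(4.0, 0.3, n)      # unobserved fundamental values
P  = V + rng.normal(0, 5, n)        # noisy market prices
VI = V + rng.normal(0, 5, n)        # intrinsic (DCF) estimates
VC = V + rng.normal(0, 5, n)        # method of comparables estimates

# Regress 1 = (1 - kI - kC) + kI*(VI/P) + kC*(VC/P).
X = np.column_stack([np.ones(n), VI / P, VC / P])
y = np.ones(n)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, k_I, k_C = coef
print(k_I, k_C, intercept)          # intercept estimates the market-price weight
```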
The main point here is conceptual rather than empirical. Theorem 11.1 provides Bayesian justification for the practice of combining individual valuation estimates by taking a linear weighted average. In such weighted averages, the more reliable estimates command more weight. Market price does not receive 100% weight unless price is perfectly efficient. Theorem 11.1 – like valuation generally – should be applied contextually. Theorem 11.1 directly applies only to a continuing firm that is not expected to undergo reorganization or liquidation in the foreseeable future. If a firm is near bankruptcy and liquidation, then liquidation value is the most relevant estimate of value. In this context, capitalized earnings, which is a measure of continuing value, does not deserve any weight; even if it might be accurate, it is accurate only as an estimate of continuing value, not of liquidation value. Similarly, if a firm is expected to be taken over in the near future, then the expected takeover price is the most relevant estimate of value independent of capitalized earnings or net asset value, which are not relevant in that context. Context matters (Yee 2008c; Chen et al. 2010).
11.3 Triangulation in Forensic Valuation

Beating the market is one obvious motive for using triangulation. The triangulation formula may also be used to infer the implied imprecision of valuation estimates from the weights that practitioners assign those estimates. This section focuses on the second application in the context of shareholder appraisal litigation.

Compared to other institutions that appraise equity value, the judicial system is unique because judges promulgate official opinions describing how they arrive at value
assessments. Because judicial opinions are supposed to be neutral (unlike sell-side analyst reports), they offer unique first-person accounts of how experienced practitioners perform valuation. More importantly, judicial opinions indirectly reflect the views of the professional financial community because judges solicit testimony from competing appraisal experts before making an appraisal. Judges sometimes appoint a special master or neutral expert to provide an independent, unbiased assessment. Hence, judicial opinions typically benefit from input representing several diverse and sophisticated perspectives. In this light, judicial valuation is an indirect but articulate reflection of how valuation is practiced in the greater financial community.

The Delaware Block valuation method, Equation (11.1), is the special case of Equation (11.2) when V_I = b and V_C = e. As described in the Introduction, the courts have not articulated a guiding principle for picking the weights κ_b and κ_e. Indeed, the articulated rule is that "No rule of thumb is applicable to weighting of asset value in an appraisal …; rather, the rule becomes one of entire fairness and sound reasoning in the application of traditional standard and settled Delaware law to the particular facts of each case."⁷ As one would expect, judges, after hearing expert testimony, tend to assign more weight to net asset value in liquidation contexts, and more weight to capitalized earnings when valuing a continuing firm. Market price is assigned a weight based on availability and liquidity of the shares, which affect the reliability of price. See Seligman (1984) for vivid case-study descriptions of the facts underlying several sample cases.

If one focuses on valuation outcomes rather than on the details of the legal and valuation process underneath, one readily observes the following trend: net asset value is usually weighted almost twice as much as market value or earnings value. As depicted in Table 11.1, a survey of a frequently cited sample of judicial appraisals finds that, on average, the net asset weight κ_b = 46% – almost twice the average value of the market value and earnings value weights. The weights vary tremendously from case to case; the standard deviation of κ_b is 26%, more than half of the average value of κ_b. The same holds for the standard deviations of the market value and earnings value weights.⁸

One explanation for the heavier average weight on net asset value in the tiny Table 11.1 sample is self-selection. It could be that liquidation or reorganization cases, where break-up or net asset value is more relevant, tend to be adjudicated in court more than going-concern cases. It might also be that cases litigated in court tend to involve
companies that do not have liquid shares, or companies that do not have credible earnings numbers. Another explanation is simply that judges, or the experts who testify, trust net asset value more than price or capitalized earnings. For the sake of illustration, if one pretends that the latter explanation dominates, then it is possible to calculate the implied relative precisions that judges accord the individual value estimates by inverting the formulas in Theorem 11.1. If judges maximize precision, the average weights in Table 11.1 imply that, on average, judges view net asset value as a substantially less noisy estimator of value than even market price. Inverting the relationship between (κ_I, κ_C) and (σ, σ_I, σ_C) yields
$$\sigma_C = \sigma\,\frac{\sqrt{(1-\kappa_I-2\kappa_C)^2\rho^2 + 4(1-\kappa_I-\kappa_C)\kappa_C}\;-\;(1-\kappa_I-2\kappa_C)\rho}{2\kappa_C}$$

and

$$\sigma_I = \sigma_C\,\sqrt{\frac{(1-\rho^2)\,\sigma\,\kappa_C}{(\sigma-\rho\sigma_C)\,\kappa_I}}.$$

7 Bell vs. Kirby Lumber Corp., Del. Supr., 413 A.2d 137 (1980).
8 For comparison, a random number homogeneously distributed between 0 and 1 has average value 50% and standard deviation 28.9%. Since the market value and earnings value weights have significantly smaller spread and their means differ from 50%, it does not appear that these weights are homogeneously distributed random variables.
Assuming that κ_I ≈ κ_b ≈ 1/2 and κ_C ≈ κ_e ≈ 1/4 yields σ_C = σ and σ_I = σ√((1+ρ)/2). Hence, judges select weights as if they think that (1) market prices are as imprecise as value estimates based on capitalized trailing earnings; and (2) net asset valuation estimates are more precise than market price (e.g., σ_I = σ√((1+ρ)/2) < σ whenever ρ < 1). Again, the purpose of this back-of-the-envelope calculation is to illustrate how one might use historical data together with the framework offered by Theorem 11.1 to estimate the implied precisions of competing value estimates. A definitive calculation of implied precision requires a more comprehensive sample and controls for self-selection and other contextual biases. The next section turns to applications of Bayesian analysis in other areas of asset pricing, including how Pastor (2000), also reverse engineering a Bayesian weighted-average formula, finds evidence that U.S. investors may exhibit home bias because they face less parameter uncertainty for U.S. than for foreign stock issues.
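The back-of-the-envelope inversion is easy to script. A minimal sketch, assuming the reconstructed inversion formulas above (the function and variable names are mine):

```python
import math

def implied_precisions(k_I, k_C, rho, sigma=1.0):
    """Invert Theorem 11.1: recover sigma_C and sigma_I implied by
    the weights (k_I, k_C), given rho and the market-noise scale sigma."""
    a = 1 - k_I - 2 * k_C
    sigma_C = sigma * (math.sqrt(a**2 * rho**2 + 4 * (1 - k_I - k_C) * k_C)
                       - a * rho) / (2 * k_C)
    sigma_I = sigma_C * math.sqrt((1 - rho**2) * sigma * k_C
                                  / ((sigma - rho * sigma_C) * k_I))
    return sigma_C, sigma_I

# The judicial averages discussed in the text: k_I ~ 1/2, k_C ~ 1/4.
print(implied_precisions(0.5, 0.25, rho=0.5))
# -> (1.0, ~0.866): sigma_C = sigma and sigma_I = sigma * sqrt((1 + rho) / 2)
```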
11.4 Bayesian Triangulation in Asset Pricing Settings
Table 11.1 These cases are from Seligman (1984)

Case                                        Year   1 − κ_b − κ_e   κ_b    κ_e
Levin vs. Midland-Ross                      1963   0.25            0.50   0.25
In re Delaware Racing Ass'n                 1965   0.40            0.25   0.35
Swanton vs. State Guarantee Corp.           1965   0.10            0.60   0.30
In re Olivetti Underwood Corp.              1968   0.50            0.25   0.25
Poole vs. N.V. Deli Maatschappij            1968   0.25            0.50   0.25
Brown vs. Hedahl's-Q B & R                  1971   0.25            0.50   0.25
Tome Land & Improvement Co. v. Silva        1972   0.40            0.60   0.00
Gibbons vs. Schenley Industries             1975   0.55            0.00   0.45
Santee Oil Co. vs. Cox                      1975   0.15            0.70   0.15
In re Creole Petroleum Corp.                1978   0.00            1.00   0.00
In re Valuation of Common Stock of Libby    1979   0.40            0.20   0.40
Bell vs. Kirby Lumber Corp.                 1980   0.00            0.40   0.60
Average                                            0.27            0.46   0.27
Standard deviation                                 0.18            0.26   0.17

While Brown, Libby, Santee Oil, and Tome Land are not Delaware court cases, they all employ the Delaware Block Method.
The literature on the use of Bayesian methods in asset pricing is extensive, ranging from Bayesian learning of fundamental attributes (e.g., Lewellen and Shanken 2002; Yee 2004b) to Bayesian judicial valuation in shareholder litigation (Yee 2008c). I shall not attempt to review this extensive literature here. Instead, this section briefly describes and compares three examples that illustrate how the spirit of Theorem 11.1 might play out in other asset pricing settings.
11.4.1 Black and Litterman: Combining Private Views with CAPM

Black and Litterman (1991, 1992) and He and Litterman (1999) provide a way for users to integrate their personal "views" about securities into a portfolio selection framework. The idea is as follows. Suppose the actual excess returns of N assets are jointly normally distributed as r ~ MVN(μ, Σ). Investors never directly observe the true expected excess returns μ. Instead, an investor's prior is μ ~ MVN(Π, τΣ), where Π is a vector of estimates (based on, say, CAPM or APT) and τ is a positive number parameterizing the investor's uncertainty in these estimates. Suppose that by doing private research an investor develops additional private views about K ≤ N special portfolios specified by q, an N × K matrix. That is, the investor believes that the K special portfolios have jointly normally distributed returns q′μ ~ MVN(Q, Ω), where Ω parameterizes her confidence in these views. Then Black and Litterman show that these additional portfolio views modify her priors on μ. Given these views, her posterior is μ | q, Q, Ω, τ ~ MVN(μ̄, V̄), where
$$\bar\mu = \left[(\tau\Sigma)^{-1} + q\,\Omega^{-1}q'\right]^{-1}\left[(\tau\Sigma)^{-1}\Pi + q\,\Omega^{-1}Q\right],$$

$$\bar V = \left[(\tau\Sigma)^{-1} + q\,\Omega^{-1}q'\right]^{-1}.$$

The expression for μ̄, like the expression for the combined value estimate in Theorem 11.1, is a weighted average – the weights on Π and Q sum to unity (in a matrix sense). An investor who trusts CAPM (or APT) has τ → 0,
which implies that her posterior is the same as her prior, that is, μ ~ MVN(Π, τΣ). Likewise, an investor who does not hold private views about any special portfolios retains the posterior μ ~ MVN(Π, τΣ). At the other extreme, an investor who does not trust CAPM has τ → ∞, which implies she relies exclusively on her K portfolio views to determine μ̄ = (qΩ⁻¹q′)⁻¹qΩ⁻¹Q. Most investors fall in between these extremes and combine information from special portfolios with their priors in accordance with Black and Litterman's weighted average formula.

Practitioners find the Black-Litterman model appealing for three reasons. First, the investment manager specifies personal bullish or bearish views on the expected returns of as few or as many assets as desired. Second, by suitably choosing the matrix elements of the view portfolios matrix q, an investor can impose either absolute views about the performance of single assets or relative views about the relative performance of pairs of assets. Third, the Black-Litterman model avoids the extreme corner solutions common in mechanical portfolio optimization with traditional short-sales constraints.
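A minimal numerical sketch of the posterior formulas above (the two-asset inputs are invented for illustration and not calibrated to any data):

```python
import numpy as np

def black_litterman(Pi, Sigma, tau, q, Q, Omega):
    """Posterior mean and covariance of expected excess returns mu,
    given prior MVN(Pi, tau*Sigma) and views q' mu ~ MVN(Q, Omega)."""
    A = np.linalg.inv(tau * Sigma)           # (tau*Sigma)^{-1}
    B = q @ np.linalg.inv(Omega) @ q.T       # q Omega^{-1} q'
    V_bar = np.linalg.inv(A + B)
    mu_bar = V_bar @ (A @ Pi + q @ np.linalg.inv(Omega) @ Q)
    return mu_bar, V_bar

# Two assets, one relative view: asset 1 outperforms asset 2 by 2%.
Pi = np.array([0.05, 0.04])                  # equilibrium (CAPM) estimates
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
q = np.array([[1.0], [-1.0]])                # N x K view-portfolio matrix
Q = np.array([0.02])
Omega = np.array([[0.01]])
print(black_litterman(Pi, Sigma, tau=0.1, q=q, Q=Q, Omega=Omega))
```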
11.4.2 Incorporating Bayesian Priors in Regression Estimates of Model Parameters

The Black and Litterman model does not specify how to use historical information to improve the estimation of expected returns. Pastor and Stambaugh (1999) and Pastor (2000) describe how Bayesian priors may be used to improve the estimation of the Treynor and Black (1973) model parameters α and β.
In particular, suppose that an investor believes that expected excess returns (in excess of the risk-free rate) for a given firm are governed by the time series model

$$r_t = \alpha + \beta x_t + \varepsilon_t,$$

where x_t is the market (benchmark) excess return. An investor who believes in CAPM would believe that α = 0 identically and that β is the firm's exposure to the market excess return. An investor with doubts about CAPM or market efficiency, or an investor who is benchmarking returns to something other than the market portfolio, however, would face uncertainty about the values of α and β. Assume that, although the investor does not know the firm's α and β values precisely, her priors are that

$$\begin{pmatrix}\alpha\\ \beta\end{pmatrix} \sim \mathrm{MVN}\!\left(\begin{pmatrix}\bar\alpha\\ \bar\beta\end{pmatrix},\,\Psi\right)$$

and that ε_t ~ N(0, σ²) is i.i.d. Pastor (2000, p. 216) shows that, given T periods of time series data

$$R = \begin{pmatrix} r_1\\ \vdots\\ r_T\end{pmatrix} \quad\text{and}\quad X = \begin{pmatrix} 1 & x_1\\ \vdots & \vdots\\ 1 & x_T\end{pmatrix},$$

the investor's Bayesian estimates for α and β are

$$\begin{pmatrix}\hat\alpha_{\text{Bayes}}\\ \hat\beta_{\text{Bayes}}\end{pmatrix} = \left(\Psi^{-1} + \sigma^{-2}X'X\right)^{-1}\left[\Psi^{-1}\begin{pmatrix}\bar\alpha\\ \bar\beta\end{pmatrix} + \sigma^{-2}X'X\begin{pmatrix}\hat\alpha_{\text{OLS}}\\ \hat\beta_{\text{OLS}}\end{pmatrix}\right],$$

where (α̂_OLS, β̂_OLS)′ are the OLS estimates of α and β from the regression R = X(α, β)′ + E, with E = (ε_1, …, ε_T)′. Thus, the Bayesian estimates are a matrix-weighted average of the prior means and the regression estimates. Pastor's weighted average manifests properties analogous to the triangulation formula in Theorem 11.1. An investor who has more precise priors (e.g., smaller Ψ eigenvalues) will tend to place more weight on the prior means ᾱ and β̄. In contrast, an investor who has imprecise priors but less noisy time series data will put more weight on her regression estimates. Applying this model to data used for studying U.S. home bias in international asset allocation, Pastor was able to estimate an implied value for Ψ₁₁, U.S. investors' prior uncertainty about α. Pastor estimates that Ψ₁₁ ≈ 1%, which suggests that U.S. investors may exhibit home bias because they do not face an exorbitant amount of parameter uncertainty from domestic issues.
11.4.3 Bayesian Model Averaging

Model-based Bayesian analysis is sensitive to model specification. Among others, Hoeting et al. (1999) and Cremers (2002) propose schemes based on Bayes' theorem to take into account model selection uncertainty and to adjust for potential data snooping biases that may arise if a researcher focuses on one particular model from a range of ex ante candidates after observing the data. In particular, Cremers' approach is motivated by the disturbing fact that the evaluation of explanatory variables with regression t-statistics typically assumes one specific model is correct (an implicit joint hypothesis), which overlooks the tremendous uncertainty the researcher has about the true model's identity. Cremers (2002) simultaneously considers all linear (regression) models that can be constructed from a set of explanatory variables, assigns each a flat prior, and estimates their posterior probabilities from data. The grand model is the weighted average of the individual candidate models, and the posterior weight on each model is informed by that model's regression results. Cremers shows that specifying a prior for the inclusion of each candidate variable, along with the expected R-square of the model regression, is sufficient to compute the posterior weight for each model. Existence of a grand model also affects the interpretation of individual-model t-statistics. Studies such as Cremers (2002) culminate from a rich stream of research on model and parameter uncertainty, dating back to Zellner and Chetty (1965) and Klein and Bawa (1976), regarding how such uncertainty affects portfolios and the identification of efficient portfolios. A closely related recent study is Avramov (2002), which investigates the role of uncertainty about the returns-forecasting model in choosing optimal portfolios.
11.4.4 Bayesian Guidance for Improving Forecasting

Suppose one uses historical data to evaluate the regression model

$$P_{t+1} = a_P P_t + a_b b_t + a_E E_t + \varepsilon_{t+1},$$

where P_{t+1} is the next-period market price (cum-dividend and suitably discounted), P_t is the current price, b_t is the net asset value estimate, E_t is the capitalized earnings estimate, and ε_{t+1} is the error. Popular textbooks such as Damodaran (2002, Chap. 18) suggest regressions along similar lines. Then, this regression model is interpretable within the framework of Theorem 11.1 given that P_{t+1} represents the resolution
of fair value at date t. From the perspective of Equation (11.2), it is expected that a_P + a_b + a_E = 1. (A deviation from a_P + a_b + a_E = 1 is an indication that one or more of the value estimates {P_t, b_t, E_t} is systematically biased over the sample.) Moreover, using the measured values of the regression coefficients, one could invert the weight formulas given in Theorem 11.1 to deduce the implied relative precisions of {P_t, b_t, E_t} for predicting P_{t+1}.

The technique of using historical data to estimate parameter values in models like Equation (11.2) is, of course, not new. There is a long tradition of using historical and contemporaneous financial data to project whether a firm will become a winner or a loser (e.g., Altman 1968; Ohlson 1980; Fama and French 1992; Piotroski 2000; and Beneish et al. 2001). Using the Kalman filter, Yee (2004b) models how, if earnings reports provide noisy information about an unobservable fundamental attribute, earnings realizations cause valuation coefficients to revise dynamically. Chan and Lakonishok (2004) and Arnott et al. (2005) estimate cross-sectional and time-series models to predict future returns based on beginning-of-period weighted averages of book-to-price, earnings-to-price, cash-flow-to-price, and sales-to-price.

The literature on using regressions to estimate weights for combining forecasts is even more developed. Originating with the seminal paper of Bates and Granger (1969), this literature has sprouted branches that explore behavioral as well as purely statistical aspects of how to optimally combine forecasts. Armstrong (1989), Clemen (1989), and Diebold and Lopez (1996) provide surveys. Borrowing generously from Diebold and Lopez, this section summarizes aspects of this literature that are relevant to combining value estimates. Combining methods fall roughly into two groups: variance-covariance methods and regression-based methods.

Consider first the variance-covariance method, which is due to Bates and Granger (1969). Suppose one has two s-steps-ahead unbiased forecasts, ŷ^i_{τ,s} for i ∈ {1, 2}. The composite forecast formed as the weighted average

$$\hat y^c_{\tau,s} = \omega\,\hat y^1_{\tau,s} + (1-\omega)\,\hat y^2_{\tau,s}$$
is unbiased if, and only if, the weights sum to unity (as they do here). The combined forecast error satisfies the same relation as the combined forecast; that is, ε^c_{τ,s} = ω ε^1_{τ,s} + (1 − ω) ε^2_{τ,s}, with variance σ_c² = ω²σ₁₁² + (1 − ω)²σ₂₂² + 2ω(1 − ω)σ₁₂², where σ₁₁² and σ₂₂² are the unconditional forecast error variances and σ₁₂² is their covariance. The weight that minimizes the combined forecast error variance is
$$\omega = \frac{\sigma_{22}^2 - \sigma_{12}^2}{\sigma_{11}^2 + \sigma_{22}^2 - 2\sigma_{12}^2}.$$
The optimal weight is determined by both the variances and the covariance. Moreover, the forecast-error variance from the optimal composite is less than min(σ₁₁², σ₂₂²). While this suggests that combining forecasts can only improve accuracy, there is a caveat. In practice, one must numerically estimate the unknown variances and covariances underlying the optimal combining weights. If one is averaging over L in-sample years, one estimates ω by replacing each σ_ij² with

$$\hat\sigma_{ij}^2 = \frac{1}{L}\sum_{\lambda=1}^{L} \varepsilon^i_{\tau-\lambda,s}\,\varepsilon^j_{\tau-\lambda,s}, \qquad\text{so that}\qquad \hat\omega = \frac{\hat\sigma_{22}^2 - \hat\sigma_{12}^2}{\hat\sigma_{11}^2 + \hat\sigma_{22}^2 - 2\hat\sigma_{12}^2}.$$

In the small samples typical of valuation studies, sampling error contaminates the combining weight estimates. Sampling error is exacerbated by the co-linearity that normally exists among primary forecasts. Accordingly, estimated combining weights do not guarantee a reduction of out-of-sample forecast error. Fortunately, forecast combination seems to perform well in practice (Clemen 1989).

Next, consider the time-series regression method of forecast combination. In this method, one regresses realizations y_{τ,s} on previous forecasts of y_{τ,s} to determine the optimal weights for combining current and future forecasts. In other words, one assumes that

$$\hat y^c_{\tau,s} = \omega\,\hat y^1_{\tau,s} + (1-\omega)\,\hat y^2_{\tau,s}$$
and determines the optimal value of ω by back testing. Granger and Ramanathan (1984) show that the optimal weight has a regression interpretation as the coefficient vector of a linear projection of realizations y_{τ,s} onto the forecasts, subject to two constraints: the weights sum to unity, and no intercept is included. In practice, one simply runs the regression on available data without imposing these constraints. The regression method is widely used, and there are many variations and extensions. Diebold and Lopez (1996) describe four examples in an excellent review article: time-varying combining weights, dynamic combining regressions, Bayesian shrinkage of combining weights toward equality, and nonlinear combining regressions. The remainder of this section summarizes the examples given by Diebold and Lopez and then warns against the overuse of combining regressions to over-fit in-sample data. For more details and a list of citations to the original literature, refer to the Diebold and Lopez article.
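A minimal sketch of both combining methods on simulated forecasts (all inputs invented):

```python
import numpy as np

def vc_weight(e1, e2):
    """Variance-covariance combining weight from two error series."""
    s11, s22 = np.var(e1), np.var(e2)
    s12 = np.cov(e1, e2)[0, 1]
    return (s22 - s12) / (s11 + s22 - 2 * s12)

rng = np.random.default_rng(3)
y = rng.normal(size=200)                  # realizations
f1 = y + rng.normal(0, 0.5, 200)          # forecast 1 (more accurate)
f2 = y + rng.normal(0, 1.0, 200)          # forecast 2

# Variance-covariance method:
w = vc_weight(y - f1, y - f2)

# Regression method (unconstrained, as typically run in practice):
X = np.column_stack([np.ones(200), f1, f2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w, b)
```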
11.4.4.1 Time-Varying Combining Weights

Granger and Newbold (1973) and Diebold and Pauly (1987) proposed time-varying combining weights, respectively, in
the variance-covariance context and in the regression context. In the regression context, one may undertake weighted or rolling combining regressions, or estimate combining regressions with time-varying parameters. Time-varying weights are attractive for a number of reasons. First, different learning speeds may lead a particular forecast to improve over time relative to others; if so, one wants to weight the improving forecast more heavily. Second, the design of various forecasting models makes them better forecasting tools in some situations. In particular, a structural model with a well-developed wage-price sector may substantially outperform a simpler model when there is high inflation. In such times, the more sophisticated model should receive higher weight. Third, agents' utilities or other properties may drift over time.

11.4.4.2 Dynamic Combining Regressions

Serially correlated errors arise in combining regressions. Diebold (1988) considers the covariance stationary case and argues that serial correlation is likely to appear in unrestricted regression-based forecast combining regressions when the unconstrained coefficients do not automatically add up to unity. More generally, serial correlation may capture dynamical elements that are not captured by the explicit independent variables. Coulson and Robins (1993) point out that a combining regression with serially correlated disturbances is a special case of a combining regression that includes lagged dependent variables and lagged forecasts.

11.4.4.3 Bayesian Shrinkage of Combining Weights Toward Equality

Fixed arithmetic averages of forecasts are often found to perform well even relative to theoretically optimal composites (Clemen 1989). This is because the imposition of a fixed-weights constraint eliminates variation in the estimated weights at the cost of possibly introducing bias from incorrectly selected weights. However, the evidence indicates that the benefits of imposing equal weights often exceed this cost. With this motivation, Clemen and Winkler (1986) propose Bayesian shrinkage techniques to allow for the incorporation of varying degrees of prior information in the estimation of combining weights. Least-squares weights and the prior weights correspond to the polar cases of the posterior-mean combining weights; the actual posterior-mean combining weights are a matrix-weighted average of the two polar cases. They describe a procedure whereby the weights are encouraged, over time, to drift toward a central tendency (e.g., ω = 1/2). While the combining weights are coaxed toward the arithmetic mean, the data is still allowed some opportunity to reflect possibly contrary values.
11.4.4.4 Nonlinear Combining Regressions

Theory does not specify that combining regressions should be linear. In fact, valuation theory suggests that a degree of nonlinearity is to be expected (Yee 2005a). As a practical matter, empiricists favor linear models (only) because of their simplicity, which is always a virtue. In combining regressions, one way to introduce nonlinearity is to allow the weights to depend on one or more hidden state variables that change dynamically over time. The value of the state variable must be inferred from past forecast errors or the values of other observables. Needless to say, the structural equations can become quite elaborate, the computation quite messy, and the results sensitive to numerical details. For these reasons, I do not recommend introducing nonlinearity or many variables without specific motivation from economic insight. Absent sound theoretical guidance, it is too easy for the "exploration" of complex econometric models to turn into a fishing expedition.
11.5 The Data Snooping Trap

The Bayesian rubric does not offer a license to go on undisciplined data mining expeditions without guidance from economic insight and theory. Data snooping, even within a Bayesian rubric, is an unsatisfactory substitute for economic and factual insight if the goal is to achieve good out-of-sample performance.

Data snooping is the practice of sifting through in-sample data to look for patterns and then over-eagerly interpreting an observed pattern as real because it passes common (but nonetheless inappropriate) statistical tests within the sample. The problem is that phantom patterns will exhibit chance statistical significance at the 1% level about 1% of the time. If a researcher tests 100 phantom patterns, there is a 63.4% (= 1 − 0.99¹⁰⁰) chance that one or more of them exhibits significance at the 1% level. If a researcher tests 500 phantom patterns, the probability of finding false positives jumps to 99.3% (= 1 − 0.99⁵⁰⁰). Thus, a data snooper who (either explicitly or implicitly) considers hundreds of potential patterns has an almost certain chance of uncovering one or more patterns that exhibit chance statistical significance. A creative and diligent researcher who is paid to keep looking until she "finds something" is virtually certain to uncover phantom patterns. Although data snooping is dangerous in any field, it is particularly treacherous in macroeconomics, finance, and accounting research, where data is plentiful and the use of statistics is liberal. With the advent of modern software, huge numbers of hypotheses about a single data set can be "tested" in a short time by algorithmically searching for variable combinations that exhibit correlation.
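The false-positive arithmetic is a one-liner:

```python
# Probability of at least one false positive at the 1% level
# when testing n independent phantom patterns.
for n in (1, 100, 500):
    print(n, 1 - 0.99**n)   # -> 0.01, ~0.634, ~0.993
```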
A seemingly reasonable justification for the practice of data snooping is the argument that a researcher should let the data speak for itself. This rationale licenses the researcher to conjure up a broad set of hypotheses, one of which is bound to appear significant in sample. This is not to say that researchers should never listen to data. The point is that, upon hearing a pattern, a researcher cannot assess its significance based on the in-sample data set that spoke it, because passing an in-sample test is tantamount to verifying a tautology.

For example, in a room of 400 people, at least two are guaranteed to share a birthday. Let's say Mary Jones and John Smith not only share a birthday but are the only two of the 400 to have birthdays on April 1st. A data-snooping researcher would seek to find something special in common between Mary and John, such as that they both play piano or fly airplanes. By rifling through a mental list consisting of hundreds or thousands of possible characteristics, the researcher will inevitably uncover some commonality between John and Mary. For instance, the researcher might find that John and Mary are the only two of the 400 who hold Columbia MBAs and work on Wall Street, facts exposed by exhaustively comparing everyone's biographies. Then the snooper might propose the hypothesis that "being born on April 1st leads to a higher likelihood of getting a Columbia MBA and working on Wall Street." With the benefit of hindsight, she can even conjure up plausible economic theories, such as "Kids with April 1st birthdates, being subject to more teasing in school, are more motivated to excel and pursue lucrative Wall Street positions in finance." The in-sample data seemingly supports this position, since not one of the other 398 randomly selected people in the sample had an April 1st birthday, attended Columbia Business School, or worked on Wall Street. However, upon turning to out-of-sample tests, the researcher will fail to verify that Wall Streeters are more likely born on special days like April 1st than on any other day. Why? Because the original hypothesis was uncovered by data snooping and inadequately tested using in-sample data.

Numerous authors have pointed out the dangers of mining financial data. Lo and MacKinlay (1990) attempt to quantify the impact of data-snooping bias in cross-sectional tests of asset pricing models when the firm characteristics used to sort stocks are correlated with the error in performance measures. More colorfully, Crack (1999) identified the following astrological anomaly by sifting through 20 years of CRSP daily data. Crack found that stock market volatilities are higher and mean returns lower on trading days surrounding new and full moons than on other days. Why should we be skeptical of this anomaly? First, our current understanding of economics does not predict that the moon's status should affect trading behavior. That it does would be a truly revolutionary idea – and revolutionary ideas require careful verification before acceptance. Second, Crack found that removing the trading days surrounding the
October 1987 stock market crash, which happened right before a new moon, reduced the statistical significance of his market-volatility result (but not the mean-return result). "In the absence of any economic justification, and given my data snooping confessions," Crack writes, "these results cannot be interpreted as anything other than an elegant and off-beat example of data snooping. Unfortunately, most data snooping in financial economics is more insidious – because plausible economics examples are typically involved." (Crack 1999, p. 96)

There is nothing wrong with sifting through lots of data per se. Problems arise only if one subsequently uses inappropriate statistical analysis to draw inferences from the exercise. In particular, if a researcher or his predecessors have considered more than one candidate hypothesis to explain patterns in the data, then the researcher must take into account the universe of hypotheses considered when evaluating each hypothesis. To this end, Sullivan et al. (1999) use bootstrapping analysis to adjust confidence intervals in hypothesis testing to correct for biases caused by data snooping. They show that 26 technical daily trading rules exhibiting an apparent ability to beat the market under traditional hypothesis testing lose that ability once the authors adjust their confidence intervals to take into account that prior academic and practitioner research had selectively identified these rules after examining a universe of over 7,000 closely related but equally ineffectual trading rules. Then, applying White's (2000) "Reality Check" methodology to 100 years of daily DJIA pricing data, Sullivan et al. conclude that the snooping-adjusted p-value for even the best technical trading rule is over 30%; that is, they fail to reject the null hypothesis that even the best technical trading rule in a universe of thousands of rules is consistent with market efficiency.
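White's procedure uses a stationary bootstrap; the toy sketch below conveys the core idea with a plain i.i.d. bootstrap over the performance history of many candidate rules (a simplification of mine, not White's exact algorithm):

```python
import numpy as np

def reality_check_pvalue(perf, n_boot=2000, seed=0):
    """Toy Reality-Check-style p-value for the best of many rules.

    perf: T x M matrix of period-by-period performance differentials
    (rule minus benchmark). Tests the null that no rule beats the
    benchmark, accounting for selection across all M rules."""
    rng = np.random.default_rng(seed)
    T, M = perf.shape
    best = np.sqrt(T) * perf.mean(axis=0).max()
    centered = perf - perf.mean(axis=0)     # impose the null
    count = 0
    for _ in range(n_boot):
        idx = rng.integers(0, T, T)         # i.i.d. resampling of periods
        stat = np.sqrt(T) * centered[idx].mean(axis=0).max()
        count += stat >= best
    return count / n_boot

# Many useless rules: the best one looks good in sample, but the
# snooping-adjusted p-value is large.
rng = np.random.default_rng(4)
perf = rng.normal(0, 0.01, size=(1000, 200))
print(reality_check_pvalue(perf))
```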
11.6 Using Guidance from Theory to Mitigate Data Snooping

The first way to mitigate data snooping pitfalls is to adjust hypothesis testing (using something like White's Reality Check methodology) to take into account that multiple hypotheses have been implicitly or explicitly considered. The second way is to commit to a single hypothesis before looking for patterns in data. This is because, as Kahn et al. (1996) formalize in an insightful Bayesian model, new empirical evidence giving rise to a new hypothesis does not support that new hypothesis as strongly as empirical evidence that had been predicted by a pre-existing hypothesis supports that pre-existing hypothesis. In other words, pre-committing to a single hypothesis suppresses the ability to consider multiple hypotheses and select a false one that happens to fit the data.
For example, consider the following empirical exercise, which I carried out using data from the Commodity Research Bureau:

1. Take 15 monthly commodity cash price series spanning 1901–2007.⁹
2. Compute the trailing 3-month log return r_lag3 for each commodity-month.
3. Pretend that these cash prices are tradable¹⁰ and implement a trend-following strategy indicated by the following commodity-level trading rule (see the code sketch below): buy b_t r_lag3 dollars if r_lag3 > 0; sell s_t r_lag3 dollars if r_lag3 < 0, where b_t and s_t are the appropriate normalization weights so that the long and short commodity portfolios are of equal dollar value in each month t.
4. Compute the monthly returns of the resulting long-short commodity portfolio.
5. For comparison purposes, also track the monthly returns of a long-only equal-weighted portfolio of these 15 commodities. Call this portfolio the "index" portfolio.

After computing the historical return series of the long-short and index portfolios, I sorted the months into quintiles based on the performance of the index portfolio. Figure 11.2 depicts the average returns of these portfolios during the five monthly quintiles.
9 Most of my 15 series, including gold, began well after 1901. Only the cotton, corn, and wheat series started in January 1901.
10 While it is unrealistic to believe that investors can trade these commodities at the quoted historical cash prices, a parallel exercise (not reported here) using commodity futures prices would yield similar results.
As depicted, the long-short portfolio does well both when the index portfolio does well and when the index portfolio does poorly. While the long-short portfolio is positively correlated with the index in the months of strongest index returns, it is negatively correlated with the index in the months of worst index returns. In other words, the long-short return distribution in Fig. 11.2 is U-shaped.

Is this U shape the result of data snooping, or is it a real economic phenomenon? The answer to this question is important because one would not expect data snooping results to persist into the future. On the other hand, if it is a real phenomenon, then understanding its causes would not only give us confidence to possibly exploit it – it would help predict analogous phenomena in other asset classes.

There are many reasons that should raise red flags about the U shape. The accuracy and liquidity of historical price series dating back to the early twentieth century are always in doubt. Even if the prices are accurate, many of them might have been realized in thinly traded markets and, thus, not represent tradable market prices. In addition, survivorship bias is an issue given that the Commodity Research Bureau does not report time series of out-of-style commodities such as candle wax, which never became popularly traded.

I will now argue that the U shape is real and not the result of data snooping (even though I examined several variations of trend trading strategies, all of which happily yielded a U shape). My argument is based on the fact that previously known characteristics of commodities predict the U-shaped curve. Being implied by pre-existing theory, rather than requiring the invention of new theory, endows the U shape with more credibility in the sense of Kahn et al. (1996).

Pre-existing theory implies the U shape in the following ways. First, it is well known (and re-confirmed in the first column of Table 11.2) that commodity-level returns are serially correlated. As Fung and Hsieh (2001) explain, if returns are serially correlated, trend followers are positioned properly to enjoy positive returns when prices move up and when they move down.
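A minimal sketch of the trading rule in steps 2-4 above, using pandas on a hypothetical price matrix (the normalization details and the file name are my own assumptions):

```python
import numpy as np
import pandas as pd

def trend_following_returns(prices: pd.DataFrame) -> pd.Series:
    """Monthly long-short returns of the lag-return-weighted rule.

    prices: monthly cash prices, one column per commodity."""
    r_lag3 = np.log(prices / prices.shift(3))   # trailing 3-month log return
    fwd = prices.shift(-1) / prices - 1         # next month's simple return
    longs = r_lag3.where(r_lag3 > 0)
    shorts = -r_lag3.where(r_lag3 < 0)
    # Normalize so the long and short sides each have unit dollar value.
    w_long = longs.div(longs.sum(axis=1), axis=0)
    w_short = shorts.div(shorts.sum(axis=1), axis=0)
    return (w_long * fwd).sum(axis=1) - (w_short * fwd).sum(axis=1)

# Hypothetical usage with a CSV of monthly prices (path is illustrative):
# prices = pd.read_csv("crb_monthly_cash_prices.csv", index_col=0, parse_dates=True)
# ls = trend_following_returns(prices)
# index_ret = (prices.shift(-1) / prices - 1).mean(axis=1)  # equal-weighted index
```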
Fig. 11.2 [Bar chart, "Trend following performs well on good and bad months": annualized average returns of the equal-weighted and long-short portfolios across the five index-performance quintiles; the long-short bars are positive in both the worst and best quintiles.] The long-short trend following portfolio performs well on both the best and worst quintile months of the equal-weighted portfolio. As argued by Fung and Hsieh (2001), this is because trend following strategies yield return distributions similar to that of look-back straddles. Source: Author's calculation using Commodity Research Bureau commodity pricing data
Table 11.2 Monthly serial correlations, annualized monthly returns, standard deviation, skewness, and kurtosis of 15 commodities based on monthly cash prices during 1901–2007 (Source: Commodities Research Bureau and author calculation)

Commodity      Serial correlation (%)   Annualized mean (%)   Annualized standard dev (%)   Annualized skewness (%)   Annualized kurtosis (%)
Butter         22.83                    3.84                  22.35                         17.92                     86.01
Cotton         25.76                    3.91                  26.28                         17.09                     277.03
Copper         35.55                    4.62                  17.21                         20.91                     42.06
Cocoa          25.37                    5.91                  28.01                         104.28                    415.18
Corn           25.17                    5.29                  24.83                         40.78                     138.77
Eggs           8.22                     11.11                 43.04                         4.72                      9.34
Gold           28.40                    5.70                  14.29                         75.97                     191.51
Hogs           17.76                    6.25                  29.42                         42.51                     88.41
Iron           23.30                    6.80                  25.77                         36.52                     108.71
Live cattle    28.95                    4.35                  17.30                         20.83                     37.75
Platinum       23.33                    5.90                  20.10                         74.16                     172.27
Silver         22.55                    5.56                  21.47                         71.80                     226.50
Soybeans       37.82                    4.56                  23.67                         13.82                     49.79
Wheat          20.30                    4.85                  22.84                         14.47                     48.11
Zinc           42.39                    4.71                  18.81                         33.18                     88.56
Median         23.33                    5.56                  21.47                         36.52                     88.56
Mean           25.85                    5.56                  23.69                         36.99                     132.00
Pooled         Inapplicable             5.57                  24.73                         31.60                     137.38

The kurtosis reported here would be zero for a normal distribution. The "Median" and "Mean" rows report the median and mean of the commodity-level statistics. The "Pooled" row refers to the same statistic computed by pooling the returns of all commodity-months.
More importantly, by weighing constituent commodities by their lagged returns (w ∝ r_lag), trend followers enjoy their largest gains when the underlying assets make their most extreme repeated moves. As a result, trend following strategies yield portfolio return distributions that are empirically similar to the ex post payoff distribution of look-back straddles. Since trend following performs best when the investable universe has extreme performance in either direction, the long-short portfolio returns are U-shaped. Thus, while its robustness is perhaps too appealing, the U shape in Fig. 11.2 is not surprising.

The second reason I believe that the U shape is real is that micro-economic theory implies commodity prices should be serially correlated because inventories cannot fall below zero. According to the theory of competitive storage (e.g., Kaldor 1939; Deaton and Laroque 1992; Litzenberger and Rabinowitz 1995; and Gorton et al. 2007), commodity prices enjoy periodic spikes whenever unexpected demand shifts or production problems cause inventory shortages. Since it takes at least one successful production cycle to remedy a shortage, price hikes caused by shortages can persist for months until the market adjusts one way or another to the shortage. Deaton and Laroque (1992, Equation 10) showed that, if harvests are independent and identically distributed, then expected commodity-level returns depend nonlinearly on price:

$$E[R_{t+1}] \propto \min\!\left(\frac{p^*}{p_t},\, 1\right),$$

where p* is a constant that depends on many factors, such as the cost of storage. Hence, commodity prices deviate from the frictionless no-arbitrage relation obeyed by the prices of financial
assets. The returns of commodities in demand for direct consumption depend on current prices and are serially correlated.

The third reason that the U shape is credible is that other predictions of the theory of competitive storage have found (partial) empirical support.¹¹ For example, because inventory shortages periodically cause price hikes, the theory of competitive storage predicts that the distributions of commodity returns should be positively skewed and leptokurtic. As indicated in the last two columns of Table 11.2, most of our 15 commodity series exhibit positive skewness and positive kurtosis. Figure 11.3 depicts the skewed and leptokurtic return distributions of gold and cocoa.
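Statistics like those reported in Table 11.2 are straightforward to compute. A minimal sketch on a simulated monthly return series (the annualization conventions are assumptions of mine, not necessarily those used in the table):

```python
import numpy as np
from scipy import stats

def monthly_moments(returns):
    """Serial correlation plus annualized mean, volatility, skewness,
    and excess kurtosis of a monthly return series."""
    r = np.asarray(returns)
    serial = np.corrcoef(r[:-1], r[1:])[0, 1]
    return {
        "serial_corr": serial,
        "ann_mean": 12 * r.mean(),
        "ann_std": np.sqrt(12) * r.std(ddof=1),
        "skew": stats.skew(r),
        "excess_kurtosis": stats.kurtosis(r),  # zero for a normal, as in Table 11.2
    }

rng = np.random.default_rng(5)
print(monthly_moments(rng.normal(0.005, 0.06, 1200)))
```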
11.7 Avoiding Data-Snooping Pitfalls in Financial Statement Analysis

Like their colleagues who work with macroeconomic summary data sets, fundamental analysts – so called for their bottoms-up, one-firm (or one-commodity) at a time approach to collecting and analyzing data – are subject to data snooping. This is because pertaining to each large public company (or commodity) is a rich history of financial statements and
11 Unfortunately, the theory of storage is still incomplete. As Deaton and Laroque (2003, p. 1) explain, the model, "although capable of introducing some autocorrelation into an otherwise i.i.d. process, appears to be incapable of generating the high degree of serial correlation of most commodity prices. … We are therefore left without a coherent explanation for the high degree of autocorrelation in commodity prices…"
Fig. 11.3 [Two histograms on a log frequency scale: "Gold's Monthly Cash-Price Return Distribution 1941-2007" and "Cocoa's Monthly Cash-Price Return Distribution 1927-2007".] The monthly return distributions of gold and cocoa on a log scale. As one can see, both distributions are positively skewed and leptokurtic. Source: Author's calculation using Commodity Research Bureau commodity pricing data
anecdotes that are rich with apparent trends and regularities inviting naïve extrapolation and data snooping. Within financial statement analysis (FSA), data snooping leads to irrational behaviors like hindsight bias and the invention of faulty heuristics that cause the analyst to compound past mistakes by repeating them in the future.

FSA is the process of transforming accounting information from quarterly and annual statements into forecasts of future cash flow and profit.¹² In fundamental analysis, changing a few parameters generates a whole different inference, which leaves the door wide open to irrational behavior. Though fundamentalists' "deep dive" research often adds value, because damaging information may be buried in footnotes and managerial conference calls, too much data often obfuscates what is important by burying it in a haystack. Too much data coupled with an inadequately principled perspective on FSA renders an analyst vulnerable to drawing irrational conclusions and issuing poor investment recommendations. Staring at a large amount of information tricks the human mind into seeing patterns, real or imagined, and drawing improper conclusions from those patterns. Even worse, an analyst might conjure up a bad heuristic that will guide her into making a similar mistake next time, thus compounding her initial mistake.

In addition to improperly snooping accounting data, areas of FSA particularly prone to subjectivity include the scrubbing of accounting numbers to correct for untidy bookkeeping loopholes and earnings manipulation; the forecasting of cash flows; and the extrapolation of current growth into
12 Yee (2008b) deconstructs the round-trip process of fundamental analysis, stock selection, and profit taking. Klarman (1991) presents a hearty practitioner's account.
the future when making valuation estimates. Scientifically accepted prescriptions do not exist for how to perform these everyday tasks. Analysts do not even agree on how to diagnose basic attributes like a firm's earnings quality. While some analysts rate firms by assigning a numerical earnings quality score, many other analysts believe earnings quality is an ill-defined concept of dubious usefulness. Thus, FSA requires analysts to make a lot of subjective choices, each of which is another invitation to irrational behavior. Compounding the problem is the dizzying smorgasbord of competing valuation techniques spawned by the financial industry (as described in, for example, Damodaran 2002). It may never be determined which of these techniques are scientifically sound.

Kaplan and Ruback (1995), Gilson et al. (2000), Lie and Lie (2002), Liu et al. (2001), and Yoo (2006) ran horse races between different valuation methods. Kaplan and Ruback estimated price-to-EBITDA and DCF valuations for a sample of highly leveraged transactions (HLTs). For their sample of 51 HLTs between 1983 and 1989, they found that price-to-EBITDA and DCF gave comparable levels of accuracy and precision for predicting the HLT transaction price. In particular, about half of their valuation estimates fell within 15% of the actual HLT transaction price. However, many will argue that the glass is half empty: differences exceeding 15% in the other half of the Kaplan and Ruback sample demonstrate that price-to-EBITDA and DCF are inconsistent about half the time. Gilson et al. (2000) compared DCF to the method of comparables for companies emerging from bankruptcy. Using EBITDA multiples based on the industry median, they found that about 21% of the valuation estimates fell within 15% of market values. The authors conjectured that Kaplan and Ruback's HLT valuations were more precise than their bankruptcy ones due to greater information efficiency in the HLT setting. Using 1998 financial data from Compustat, Lie
and Lie found that the method of comparables based on asset value yields more precise and less biased estimates than estimates based on sales and earnings multiples, especially for financial firms. Liu et al. (2001) and Yoo (2006) found that forward price-to-earnings ratios – when consensus analysts' earnings forecasts are available – perform best in large sample studies. Consistent with Liu et al. and Yoo¹³ (whose studies are based on a much longer time period), Lie and Lie also conclude that forward price-earnings ratios are more accurate and precise estimators than trailing ones. However, Lie and Lie report that the accuracy and precision of their method of comparables estimates varied greatly according to company size, profitability, and the extent of intangible value.

Such unresolved, open-ended issues create uncertainty for fundamentalists and leave them subject to behavioral pitfalls, including overconfidence, or the belief that a favorable outcome is more likely than what will actually occur; overoptimism, or overestimating the likelihood of good luck; and overreaction to recent events, such as an exciting earnings announcement or press release (Daniel et al. 2001). Other judgment biases common to fundamentalists include hindsight bias, or an over-reliance on anecdotal evidence; herding, or blindly following mass movements irrespective of sound judgment; and myopia, or a narrow fixation on recent, and perhaps exceptional, changes. These behavioral pitfalls influence what an analyst emphasizes or ignores, the frequency of forecast revisions, and the subsequent inferences. They lurk even if the analyst is free of incentive-driven biases or other sinister intentions – behavioral pitfalls hinder even an honest analyst attempting to make fair evaluations.

A tenet of western civilization is Plato's "rule of law" concept – the principle that government authorities can exercise power only in accordance with publicly disclosed laws adopted in accordance with established procedure. The rule of law constrains government authorities from behaving in capricious or self-serving ways contrary to the written law. (Plato's Statesman, his Laws, and his student Aristotle's Politics were among the first to enunciate the benefits of rule of law over "rule of men" almost 2,500 years ago.) The rule-of-law principle suggests that commitment to a disciplined FSA process would add value by constraining analysts' discretion to interject in-the-moment psychology and emotions. Ideally, a conclusion from FSA should not depend on the personality or tastes of the analyst performing the exercise. Just as the rule of law disciplines police, a disciplined
13 Yoo (2006) empirically examined, in a large-scale study of Compustat firms, whether taking a linear combination of several univariate method of comparables estimates would achieve more precise valuation estimates than any comparables estimate alone. He concluded that the forward price-earnings ratio beats the trailing ratios, or combinations of trailing ratios, so decisively that combining it with other benchmark estimates does not help. Yoo did not consider DCF, liquidation value, or market price in his study.
decision-making process adds value by forcing analysts to remove their personalities and psychology and conform to pre-agreed stock-selection criteria. I believe that fundamentalists would be wise to regularly step back from accounting minutiae and supplement their bottoms-up research with frameworks that instill discipline and consistency and prevent errors caused by psychological enticements to misinterpret details.
11.8 Conclusion

Securities selection is the attempt to distinguish prospective winners from losers – conditional on beliefs and available information. This article surveys some relevant academic research on the subject, including work about the combining of forecasts (Bates and Granger 1969), the Black-Litterman model (1991, 1992), the combining of Bayesian priors and regression estimates (Pastor 2000), model uncertainty and Bayesian model averaging (Hoeting et al. 1999; Cremers 2002), the theory of competitive storage (Deaton and Laroque 1992), and the combination of valuation estimates (Yee 2007). Despite its wide-ranging applicability, the Bayesian approach is not a license for data snooping. The second half of this article describes common pitfalls in fundamental analysis and comments on the role of theoretical guidance in mitigating these pitfalls.

Bayesian triangulation provides a conceptual framework that justifies two intuitive heuristics:

1. Estimate value as the weighted average of all available bona fide valuation estimates, including market price if available.
2. Assign greater weight to the more reliable estimates when some estimates are more reliable than others.

The triangulation formula, Equation (11.2), agrees with the Bayesian expectation value conditioned on the information set comprised of the individual valuation estimates. Applying standard Bayesian analysis, Equation (11.2) can easily be extended to provide a conceptual framework for folding in as many valuation estimates as desired.

As Sect. 11.3 describes, given a particular choice of weights, one is able to reverse engineer Equation (11.2) to obtain the valuation noise implied by a given set of Bayesian weights. Reverse engineering reveals that shareholder-litigation judges believe market price is as noisy a valuation estimate as capitalized trailing earnings and significantly noisier than net asset value. This is interesting because, while conventional wisdom pre-dating Fischer Black has always held that market prices and valuation estimates are noisy, it is also generally accepted that the amount of noise is an elusive quantity that cannot be observed. Inverting the
Bayesian triangulation formula provides a way of estimating the implied noisiness of price and other valuation estimates from a credible sample of appropriate data. This exercise is suggested as an issue for future research.

The Bayesian framework allows for uncertainty in its many forms, provides modeling flexibility, and permits investors and managers to insert subjective priors and views into a rational decision-making process. Despite these virtues, the Bayesian framework and the desire to optimize combining weights are not a license for data snooping. For fundamental analysts, financial statements and anecdotal information comprise an extensive set of data about each large public company (or commodity) that invites data snooping and armchair quarterbacking, which leads to hindsight bias. Thus, fundamental analysts would be wise to regularly step back from accounting minutiae and discipline their FSA with a less precision-oriented, more skeptical and theoretical perspective. In the investment business (to paraphrase Keynes), "it is better to be roughly right than precise-but-wrong," and data mining greatly enhances the likelihood of the latter. Therefore, fundamentalists who complement their bottoms-up research with top-down, quantitative economic discipline will achieve the best long-term investment results. If nothing else, quantified checklists, scorecards, and portfolio constraints can add value simply by disciplining analysts to resist overconfidence, hindsight bias, and psychological pitfalls and to make emotionally detached assessments based on pre-agreed criteria.

The emphasis should be on the evenness of the analytical process over time. Investing is a repetitive activity. Fortuitous outcomes arising from ex ante unsound decisions must not be celebrated, since good luck is not reproducible. The key is commitment to a quantified process and being disciplined enough to stick with it over the long haul, irrespective of short-term market swings.

Acknowledgments The statements and opinions expressed in this article are those of the author as of the date of the article and do not necessarily represent the views of the Bank of New York Mellon, Mellon Capital Management, or any of their affiliates. This article does not offer investment advice.
References

Altman, E. I. 1968. "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy." Journal of Finance 23, 589–609.
Ang, A. and J. Liu. 2004. "How to discount cashflows with time-varying expected returns." Journal of Finance 59(6), 2745–2783.
Armstrong, J. 1989. "Combining forecasts: the end of the beginning or the beginning of the end?" International Journal of Forecasting 5(4), 585–588.
Arnott, R., J. Hsu, and P. Moore. 2005. "Fundamental indexation." Financial Analysts Journal 61(2), 83–99.
Avramov, D. 2002. "Stock return predictability and model uncertainty." Journal of Financial Economics 64, 423–458.
Bates, J. and C. Granger. 1969. "The combination of forecasts." Operational Research Quarterly 20, 451–468.
Beatty, R., S. Riffe, and R. Thompson. 1999. "The method of comparables and tax court valuations of private firms: an empirical investigation." Accounting Horizons 13(3), 177–199.
Beneish, M., C. Lee, and R. Tarpley. 2001. "Contextual fundamental analysis through the prediction of extreme returns." Review of Accounting Studies 6(2/3), 165–189.
Bhojraj, S. and C. Lee. 2003. "Who is my peer? A valuation-based approach to the selection of comparable firms." Journal of Accounting Research 41(5), 745–774.
Black, F. 1986. "Noise." Journal of Finance 41(3), 529–543.
Black, F. and R. Litterman. 1991. "Asset allocation: combining investor views with market equilibrium." Journal of Fixed Income 1(2), 7–18.
Black, F. and R. Litterman. 1992. "Global portfolio optimization." Financial Analysts Journal (September/October), 28–43.
Chan, L. and J. Lakonishok. 2004. "Value and growth investing: review and update." Financial Analysts Journal 60(1), 71–87.
Chen, F., K. Yee, and Y. Yoo. 2007. "Did adoption of forward-looking methods improve valuation accuracy in shareholder litigation?" Journal of Accounting, Auditing, and Finance 22(4), 573–598.
Chen, F., K. Yee, and Y. Yoo. 2010. "Robustness of judicial decisions to valuation-method innovation: an exploratory empirical study." Journal of Business, Accounting, and Finance, forthcoming.
Clemen, R. 1989. "Combining forecasts: a review and annotated bibliography." International Journal of Forecasting 5, 559–583.
Clemen, R. and R. Winkler. 1986. "Combining economic forecasts." Journal of Economic and Business Statistics 4, 39–46.
Coulson, N. and R. Robins. 1993. "Forecast combination in a dynamic setting." Journal of Forecasting 12, 63–67.
Crack, T. 1999. "A classic case of 'data snooping' for classroom discussion." Journal of Financial Education, 92–97.
Cremers, K. 2002. "Stock return predictability: a Bayesian model selection perspective." Review of Financial Studies 15(4), 1223–1249.
Damodaran, A. 2002. Investment valuation: tools and techniques for determining the value of any asset, 2nd ed., Wiley, NY.
Daniel, K., D. Hirshleifer, and A. Subrahmanyam. 2001. "Overconfidence, arbitrage, and equilibrium asset pricing." Journal of Finance 56(3), 921–965.
Deaton, A. and G. Laroque. 1992. "On the behavior of commodity prices." Review of Economic Studies 59, 1–23.
Deaton, A. and G. Laroque. 2003. "A model of commodity prices after Sir Arthur Lewis." Journal of Development Economics 71, 289–310.
Diebold, F. 1988. "Serial correlation and the combination of forecasts." Journal of Business and Economic Statistics 6, 105–111.
Diebold, F. and J. Lopez. 1996. "Forecast evaluation and combination," in Handbook of statistics, G. S. Maddala and C. R. Rao (Eds.). North Holland, Amsterdam.
Diebold, F. and P. Pauly. 1987. "Structural change and the combination of forecasts." Journal of Forecasting 6, 21–40.
Fama, E. and K. French. 1992. "The cross-section of expected stock returns." Journal of Finance 47, 427–465.
Feltham, G. and J. Ohlson. 1999. "Residual earnings valuation with risk and stochastic interest rates." Accounting Review 74(2), 165–183.
Fung, W. and D. Hsieh. 2001. "The risk in hedge fund strategies: theory and evidence from trend followers." Review of Financial Studies 14, 313–341.
Gilson, S., E. Hotchkiss, and R. Ruback. 2000. "Valuation of bankrupt firms." Review of Financial Studies 13(1), 43–74.
Gorton, G., F. Hayashi, and K. Rouwenhorst.
2007. The fundamentals of commodity futures, Yale working paper. Granger, C. and P. Newbold. 1973. “Some comments on the evaluation of economic forecasts.” Applied Economics 5, 35–47. Granger, C. and R. Ramanathan. 1984. “Improved methods of forecasting.” Journal of Forecasting 3, 197–204.
Grinold, R. 1989. "The fundamental law of active management." Journal of Portfolio Management 15(3), 30–37.
He, G. and R. Litterman. 1999. The intuition behind Black-Litterman model portfolios, Investment Management Division, Goldman Sachs.
Hoeting, J., D. Madigan, A. Raftery, and C. Volinsky. 1999. "Bayesian model averaging: a tutorial." Statistical Science 14(4), 382–417.
Kahn, J., S. Landsburg, and A. Stockman. 1996. "The positive economics of methodology." Journal of Economic Theory 68(1), 64–76.
Kaldor, N. 1939. "Speculation and economic stability." Review of Economic Studies 7, 1–27.
Kaplan, S. and R. Ruback. 1995. "The valuation of cash flow forecasts: an empirical analysis." Journal of Finance 50(4), 1059–1093.
Klarman, S. 1991. Margin of safety: risk-averse value investing strategies for the thoughtful investor, HarperBusiness, New York, NY.
Klein, R. and V. Bawa. 1976. "The effect of estimation risk on optimal portfolio choice." Journal of Financial Economics 3, 215–231.
Lewellen, J. and J. Shanken. 2002. "Learning, asset-pricing tests, and market efficiency." Journal of Finance 57(3), 1113–1145.
Lie, E. and H. Lie. 2002. "Multiples used to estimate corporate value." Financial Analysts Journal 58(2), 44–53.
Litzenberger, R. and N. Rabinowitz. 1995. "Backwardation in oil futures markets: theory and empirical evidence." Journal of Finance 50(5), 1517–1545.
Liu, J., D. Nissim, and J. Thomas. 2001. "Equity valuation using multiples." Journal of Accounting Research 40, 135–172.
Lo, A. 2007. Where do alphas come from? A new measure of the value of active investment management, Working paper, MIT Sloan School of Management.
Lo, A. and A. MacKinlay. 1990. "Data snooping biases in tests of financial asset pricing models." Review of Financial Studies 3, 431–468.
Ohlson, J. 1980. "Financial ratios and the probabilistic prediction of bankruptcy." Journal of Accounting Research 18(1), 109–131.
Ohlson, J. and Z. Gao. 2006. "Earnings, earnings growth and value." Foundations and Trends in Accounting 1(1), 1–70.
Pastor, L. 2000. "Portfolio selection and asset pricing models." Journal of Finance 55(1), 179–223.
Pastor, L. and R. Stambaugh. 1999. "Cost of equity capital and model mispricing." Journal of Finance 54(1), 67–121.
Penman, S. 2001. Financial statement analysis and security valuation, McGraw-Hill, Boston.
Perold, A. 2007. "Fundamentally flawed indexing." Financial Analysts Journal 63(6), 31–37.
Piotroski, J. 2000. "Value investing: the use of historical financial statement information to separate winners from losers." Journal of Accounting Research 38 (Supplement), 1–41.
Seligman, J. 1984. "Reappraising the appraisal remedy." George Washington Law Review 52, 829.
Sharfman, K. 2003. "Valuation averaging: a new procedure for resolving valuation disputes." Minnesota Law Review 88(2), 357–383.
Siegel, J. 2006. "The noisy market hypothesis." Wall Street Journal, June 14, 2006, A14.
Sullivan, R., A. Timmermann, and H. White. 1999. "Data-snooping, technical trading rule performance, and the bootstrap." Journal of Finance 54(5), 1647–1691.
Treynor, J. and F. Black. 1973. "How to use security analysis to improve portfolio selection." Journal of Business 46, 66–86.
White, H. 2000. "A reality check for data snooping." Econometrica 68(5), 1097–1126.
Yee, K. 2002. "Judicial valuation and the rise of DCF." Public Fund Digest 76, 76–84.
Yee, K. 2004a. "Perspectives: combining value estimates to increase accuracy." Financial Analysts Journal 60(4), 23–28.
Yee, K. 2004b. "Forward versus trailing earnings in equity valuation." Review of Accounting Studies 9(2–3), 301–329.
Yee, K. 2005a. "Aggregation, dividend irrelevancy, and earnings-value relations." Contemporary Accounting Research 22(2), 453–480.
Yee, K. 2005b. "Control premiums, minority discounts, and optimal judicial valuation." Journal of Law and Economics 48(2), 517–548.
Yee, K. 2007. "Using accounting information for consumption planning and equity valuation." Review of Accounting Studies 12(2–3), 227–256.
Yee, K. 2008a. "A Bayesian framework for combining value estimates." Review of Quantitative Finance and Accounting 30(3), 339–354.
Yee, K. 2008b. "Deep-value investing, fundamental risks, and the margin of safety." Journal of Investing 17(3), 35–46.
Yee, K. 2008c. "Dueling experts and imperfect verification." International Review of Law and Economics 28(4), 246–255.
Yoo, Y. 2006. "The valuation accuracy of equity valuation using a combination of multiples." Review of Accounting and Finance 5(2), 108–123.
Zellner, A. and V. Chetty. 1965. "Prediction and decision problems in regression models from the Bayesian point of view." Journal of the American Statistical Association 60, 608–616.
Appendix 11A Proof of Theorem 11.1

This section establishes that the specific choice of weights $\kappa_I$ and $\kappa_C$ in Theorem 11.1 maximizes the precision of $\hat{V}$ or, equivalently, minimizes $\mathrm{var}[\hat{V}]$. To this end, consider an arbitrary linear weighting scheme $\hat{V} = \kappa_P P + \kappa_I V_I + \kappa_C V_C$, where $\kappa_P$, $\kappa_I$, and $\kappa_C$ are yet-to-be-determined real numbers. Plugging in $V = P + e$, $V_I = V + e_I$, and $V_C = V + e_C$ and requiring $E[\hat{V}] = V$ immediately yields $\kappa_P = 1 - \kappa_I - \kappa_C$, which implies $\hat{V} = V - (1 - \kappa_I - \kappa_C)e + \kappa_I e_I + \kappa_C e_C$. Computing the variance of this expression yields

$$\mathrm{var}[\hat{V}] = (1-\kappa_I-\kappa_C)^2\sigma^2 + \kappa_I^2\sigma_I^2 + \kappa_C^2\sigma_C^2 - 2\rho(1-\kappa_I-\kappa_C)\kappa_C\,\sigma\sigma_C.$$

The first-order conditions for minimizing $\mathrm{var}[\hat{V}]$ with respect to $\kappa_I$ and $\kappa_C$ may be written in matrix form as

$$\begin{bmatrix} \sigma^2+\sigma_I^2 & \sigma(\sigma+\rho\sigma_C) \\ \sigma(\sigma+\rho\sigma_C) & \sigma^2+\sigma_C^2+2\rho\sigma\sigma_C \end{bmatrix} \begin{bmatrix} \kappa_I \\ \kappa_C \end{bmatrix} = \begin{bmatrix} \sigma^2 \\ \sigma(\sigma+\rho\sigma_C) \end{bmatrix}.$$

Inverting the matrix recovers the expressions for $\kappa_I$ and $\kappa_C$ given in the Theorem. These weights minimize $\mathrm{var}[\hat{V}]$, which means they maximize $\mathrm{var}[\hat{V}]^{-1}$, the precision of $\hat{V}$. □

11A.1 Generalization of Theorem 11.1

Theorem 11A.1. Theorem 11.1 assumes that the error $e_I$ from the intrinsic valuation estimate is uncorrelated with market noise $e$. If an analyst believes that $\mathrm{corr}(e_I, e) = \rho_I \neq 0$, he or she should replace the weights in Theorem 11.1 with the following formulas:

$$\kappa_I = \frac{\bigl[(1-\rho^2)\sigma\sigma_C + (\sigma_C+\rho\sigma)\rho_I\sigma_I\bigr]\sigma\sigma_C}{\Delta}, \qquad \kappa_C = \frac{\bigl[(1-\rho_I^2)\sigma\sigma_I + (\sigma_I+\rho_I\sigma)\rho\sigma_C\bigr]\sigma\sigma_I}{\Delta},$$

and

$$1 - \kappa_I - \kappa_C = \frac{\bigl[(\rho\sigma_I + \rho_I\sigma_C)\sigma + \sigma_I\sigma_C\bigr]\sigma_I\sigma_C}{\Delta},$$

where

$$\Delta = (1-\rho^2)\sigma^2\sigma_C^2 + 2(\sigma_C+\rho\sigma)\rho_I\sigma\sigma_I\sigma_C + \bigl[(1-\rho_I^2)\sigma^2 + 2\rho\sigma\sigma_C + \sigma_C^2\bigr]\sigma_I^2.$$

Proof. Same logic as the proof of Theorem 11.1. When $\mathrm{corr}(e_I, e) = \rho_I$, the new first-order optimization condition becomes

$$\begin{bmatrix} \sigma^2+\sigma_I^2+2\rho_I\sigma\sigma_I & \sigma(\sigma+\rho\sigma_C+\rho_I\sigma_I) \\ \sigma(\sigma+\rho\sigma_C+\rho_I\sigma_I) & \sigma^2+\sigma_C^2+2\rho\sigma\sigma_C \end{bmatrix} \begin{bmatrix} \kappa_I \\ \kappa_C \end{bmatrix} = \begin{bmatrix} \sigma(\sigma+\rho_I\sigma_I) \\ \sigma(\sigma+\rho\sigma_C) \end{bmatrix}.$$

Inverting the matrix recovers the expressions for $\kappa_I$ and $\kappa_C$. These weights maximize the precision of the triangulation estimate $\hat{V}$. □
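Since the closed-form weights and the first-order-condition system above must agree, the algebra admits a quick numerical cross-check. The sketch below is our illustration, not part of the original chapter; the parameter values and variable names are invented for the example. It solves the matrix system directly and compares the result with the reconstructed formulas for kappa_I, kappa_C, and 1 - kappa_I - kappa_C.

import numpy as np

# Illustrative (assumed) inputs: error standard deviations and correlations.
sigma, sigma_I, sigma_C = 0.20, 0.30, 0.25   # market noise, intrinsic, comparables
rho, rho_I = 0.40, 0.20                      # corr(e, e_C) and corr(e, e_I)

# First-order-condition system from the proof of Theorem 11A.1.
b_off = sigma * (sigma + rho * sigma_C + rho_I * sigma_I)
A = np.array([[sigma**2 + sigma_I**2 + 2 * rho_I * sigma * sigma_I, b_off],
              [b_off, sigma**2 + sigma_C**2 + 2 * rho * sigma * sigma_C]])
rhs = np.array([sigma * (sigma + rho_I * sigma_I),
                sigma * (sigma + rho * sigma_C)])
kappa_I, kappa_C = np.linalg.solve(A, rhs)

# Closed-form weights from Theorem 11A.1 as reconstructed above.
D = ((1 - rho**2) * sigma**2 * sigma_C**2
     + 2 * (sigma_C + rho * sigma) * rho_I * sigma * sigma_I * sigma_C
     + ((1 - rho_I**2) * sigma**2 + 2 * rho * sigma * sigma_C + sigma_C**2) * sigma_I**2)
kI = sigma * sigma_C * ((1 - rho**2) * sigma * sigma_C
                        + (sigma_C + rho * sigma) * rho_I * sigma_I) / D
kC = sigma * sigma_I * ((1 - rho_I**2) * sigma * sigma_I
                        + (sigma_I + rho_I * sigma) * rho * sigma_C) / D
kP = sigma_I * sigma_C * ((rho * sigma_I + rho_I * sigma_C) * sigma
                          + sigma_I * sigma_C) / D

print(np.allclose([kappa_I, kappa_C], [kI, kC]))  # True: formulas match the FOCs
print(np.isclose(kP, 1 - kappa_I - kappa_C))      # True: kappa_P is consistent

With the illustrative parameters above, the weights come out to roughly kappa_I = 0.184, kappa_C = 0.328, and kappa_P = 0.488, and both checks print True.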
Chapter 12
On Estimation Risk and Power Utility Portfolio Selection Robert R. Grauer and Frederick C. Shen
Abstract Previous studies show that combining a power utility portfolio selection model with the empirical probability assessment approach (EPAA) to estimate the joint return distribution frequently generates economically and statistically significant abnormal returns. In this paper, we examine additional ways of estimating joint return distributions that allow us to explore the stationary/nonstationary tradeoffs implicit in "expanding" versus "moving" window estimation methods; the benefits of alternative methods of allowing for the "memory loss" inherent in the moving-window EPAA; and the possibility that weighting more recent observations more heavily may improve investment performance.

Keywords Estimation risk · Portfolio choice · Power utility

"In a strict sense, there wasn't any risk—if the world had behaved as it did in the past."
Merton Miller, discussing the fall of Long-Term Capital Management; quoted in Lowenstein (2000), page 61
12.1 Introduction

Previous studies show that combining a power utility portfolio selection model with the empirical probability assessment approach (EPAA) to estimate the joint return distribution frequently generates economically and statistically significant abnormal returns. The results are perhaps surprising in light of the fact that the EPAA is based exclusively on historic return data in a 32-quarter "moving window". In this paper, we examine eight additional ways of

R.R. Grauer, Faculty of Business Administration, Simon Fraser University, 8888 University Drive, Burnaby, BC, Canada V5A 1S6; e-mail: [email protected]
F.C. Shen, Head, Strategic and International, Group Risk Management, OCBC Bank, Singapore; e-mail: [email protected]
estimating joint return distributions that allow us to explore the stationary/nonstationary tradeoffs implicit in “expanding” versus “moving” window estimation methods; the benefits of alternative methods of allowing for the “memory loss” inherent in the moving-window EPAA; and the possibility that weighting more recent observations more heavily may improve investment performance. Although the results show that the EPAA holds its own against the eight other estimation methods in industry-rotation and global asset-allocation settings, the rankings of the estimation methods differ (sometimes dramatically) across datasets, time periods, evaluation methods and investor risk aversion. The paper proceeds as follows. Section 12.2 reviews the literature. Section 12.3 outlines the basic multiperiod investment model and the base-case EPAA. Section 12.4 describes the data. Section 12.5 delineates eight additional ways of estimating the joint return distribution. Section 12.6 presents four methods used to evaluate the ex post performance of the power policies. Section 12.7 reports the results. Section 12.8 summarizes the paper and offers concluding comments. Section 12.9 is an addendum, which shows that combining a moving window approach (to estimate the means from regressions of returns on dividend yields and riskfree interest rates) with an expanding window (to estimate the joint return distribution) holds promise for generating higher portfolio returns while guarding against the possibility of losses in some long forgotten states of nature.
12.2 Literature Review

Grauer and Hakansson (1982, 1985, 1986, 1987, 1995a) and Grauer et al. (1990) combine a discrete-time power utility portfolio selection model1 with the empirical probability assessment approach (EPAA) to examine domestic, global and industry-rotation asset-allocation problems. The method often generates economically and statistically

1 See Mossin (1968), Hakansson (1971, 1974), Leland (1972), Ross (1974), and Huberman and Ross (1983).
significant abnormal returns, which is surprising because the EPAA is based exclusively on historic return data in a 32-quarter moving window. This raises the question of whether the results are primarily due to the power utility formulation that underlies the multiperiod portfolio selection model, to the inputs of the model, or to the constraints. Grauer and Hakansson (1993) address the first possibility by comparing two approximation schemes for calculating the optimal portfolios in the multiperiod investment model. Specifically, the mean-variance and the quadratic utility approximations are compared to the more cumbersome power utility method. The mean-variance approximation works well for quarterly decisions, but not quite so well for annual decisions. On the other hand, when short sales are permitted, Grauer (2009) shows that the mean-variance model risks bankruptcy ex ante. In cases where the mean-variance model risks bankruptcy, the power utility model adopts radically different investment policies to preclude any ex ante probability of losing 100 percent (or more) of wealth in some state of nature. The latter two possibilities may be considered under the general category of estimation risk. The primary thrust of the investigation of estimation error in the inputs has concentrated on alternative estimates of the means—and in the case of the mean-variance model on alternative estimates of the mean vector and covariance matrix. In a mean-variance framework, empirical evidence based on simulation analysis and out-of-sample performance suggests that correcting for estimation risk, particularly in the means, can improve investment performance substantially. See, for example, Jobson, Korkie and Ratti (1979), Jobson and Korkie (1980, 1981), and Jorion (1985, 1991). In a discrete-time multiperiod setting, the results are not as clear. Grauer and Hakansson (1995b, 2001) compare the investment policies and returns of three classes of estimators of the means—the historic or sample estimator; two forms of shrinkage estimators, the James-Stein and Bayes-Stein estimators; and a capital asset pricing model based estimator—with mixed results. Grauer (2008b) extends the analysis by adding a dividend yield–riskfree rate estimator of the means.2 The results show that this estimator performs by far the best. In addition, Grauer and Hakansson (1998) employ an inflation adapter to adjust the means, which is surprisingly successful in timing the market.
2 Solnik (1993) explores the use of conditioning variables to predict the means in a mean-variance portfolio selection framework. Merton (1971, 1973) points out that if returns are predictable, investors may wish to hedge against shifts in the opportunity set. Kandel and Stambaugh (1996), Brennan et al. (1997), and Barberis (2000), among others, employ a continuous-time framework to examine whether the optimal portfolio weights of an investor with a long horizon are different from those of an investor with a short horizon.
Alternatively, it has been suggested that one might adjust for estimation risk by constraining portfolio weights. See, for example, Frost and Savarino (1988) and Grauer (2008c, 2009) in a mean-variance framework, Grauer and Shen (2000) in the context of the discrete-time multiperiod model, and Black and Litterman (1992), who argue against constraining the weights. Again, the evidence is mixed. Frost and Savarino find that constraining the weights improves performance. Grauer (2009) finds that the imposition of nonnegativity constraints improves performance dramatically. Grauer and Shen find that adding constraints beyond nonnegativity constraints does not lead to better performance, while Grauer (2008c) finds that adding (other) constraints beyond nonnegativity constraints hinders performance in some periods and improves it in others. This paper investigates eight alternatives to the base-case EPAA. The paper has had an extremely long gestation period; its origins trace to the late 1980s, see Shen (1988). An appealing aspect of power utility functions is that they protect against any ex ante possibility of bankruptcy and, in the case of the more risk-averse power utility functions, against any large losses—but only in the states of nature captured in the moving-window-based EPAA. We worried about what might happen if we ignored the possibility of some catastrophic event not captured in the moving window. Subsequent real-world events—the rise and fall of Long-Term Capital Management—showed how fundamental our concern is and how truly devastating it can be to base an investment strategy on a model that ignores even an infinitesimal possibility of disaster.3 The ability of the moving-window-based EPAA to allow for nonstationary joint return distributions is a major strength. However, as noted, a potentially serious shortcoming is that the moving window is 32 quarters in length, which means for decisions made after, say, the mid- to late-1940s (or the mid- to late-1990s) there is no memory of the losses suffered in the Great Depression (or in 1987).4 One way to alleviate the problem is to employ an "expanding window" that encompasses all-of-history, but assumes joint return distributions are stationary.5 The eight alternatives to the EPAA either lengthen the estimation period; add "boom" and/or "disaster" states drawn from all-of-history
3 Lowenstein (2000) provides an extremely readable account of the rise and fall of Long-Term Capital. In addition to the quote by Merton Miller, he includes comments by Paul Samuelson, Eugene Fama, and William Sharpe, that question the idea that “real world” risk can ever be completely eliminated. 4 Results based on 40-quarter and from 8- to 15-year estimating periods have also been discussed. 5 Note that we say one way to alleviate the problem. It seems clear that no model can completely overcome the problem of a truly unforeseen event.
to the moving window; or change the way probabilities of a state are estimated giving more weight to more recent observations. The alternatives allow us to explore the stationary-nonstationary tradeoffs implicit in the expanding versus moving window estimation methods; the benefits of alternative methods of allowing for the “memory loss” inherent in the moving-window EPAA; and the possibility that weighting more recent observations more heavily may improve investment performance.
12.3 The Multiperiod Investment Model

The basic model is the same as the one employed in Grauer and Hakansson (1986) and the reader is therefore referred to that paper (specifically pages 288–291) for details. It is based on the pure reinvestment version of dynamic investment theory. At the beginning of each period t, the investor chooses a portfolio, x_t, on the basis of some member, γ, of the family of utility functions for returns r given by

$$\max_{x_t} E\Bigl[\tfrac{1}{\gamma}\bigl(1 + r_t(x_t)\bigr)^{\gamma}\Bigr] = \max_{x_t} \sum_s \pi_{ts}\,\tfrac{1}{\gamma}\bigl(1 + r_{ts}(x_t)\bigr)^{\gamma}, \qquad (12.1)$$

subject to

$$x_{it} \ge 0 \ \text{for all}\ i, \quad x_{Lt} \ge 0, \quad x_{Bt} \le 0, \qquad (12.2)$$
$$\sum_i x_{it} + x_{Lt} + x_{Bt} = 1, \qquad (12.3)$$
$$\sum_i m_{it}\, x_{it} \le 1, \qquad (12.4)$$
$$\Pr\bigl(1 + r_t(x_t) \ge 0\bigr) = 1, \qquad (12.5)$$

where $r_{ts}(x_t) = \sum_i x_{it} r_{its} + x_{Lt} r_{Lt} + x_{Bt} r^d_{Bt}$ is the ex ante return on the portfolio in period t if state s occurs, and

γ ≤ 1 is a parameter that remains fixed over time,
x_it = the amount invested in risky asset category i in period t as a fraction of own capital,
x_Lt = the amount lent in period t as a fraction of own capital,
x_Bt = the amount borrowed in period t as a fraction of own capital,
x_t = (x_1t, ..., x_nt, x_Lt, x_Bt),
r_it = the anticipated total return (dividend yield plus capital gains or losses) on asset category i in period t,
r_Lt = the return on the riskfree asset in period t,
r^d_Bt = the interest rate on borrowing at the time of the decision at the beginning of period t,
m_it = the initial margin requirement for asset category i in period t expressed as a fraction,
π_ts = the probability of state s at the end of period t, in which case the random return r_it will assume the value r_its.

Constraint (12.2) rules out short sales and ensures that lending (borrowing) is a positive (negative) fraction of capital. Constraint (12.3) is the budget constraint. Constraint (12.4) serves to limit borrowing (when desired) to the maximum permissible under the margin requirements that apply to the various asset categories. Constraint (12.5) rules out any ex ante probability of bankruptcy.6

Under the base-case scenario employed in most previous studies, the inputs to the model are derived from the empirical probability assessment approach (EPAA). Consider the EPAA and suppose quarterly revision is used. Then, at the beginning of quarter t, the portfolio problem (12.1)–(12.5) for that quarter uses the following inputs: the (observable) riskfree return for quarter t, the (observable) call money rate plus one percent at the beginning of quarter t, and the (observable) realized returns for the risky asset categories for the previous k quarters. Each joint realization in quarters t−k through t−1 is assigned a probability 1/k of occurring in quarter t. Thus, under the EPAA, estimates are obtained on a moving basis and used in raw form without adjustment of any kind. On the other hand, since the whole joint distribution is specified and used, there is no information loss—all moments and correlations are implicitly taken into account. It may be noted that the empirical distribution of the past k periods is optimal if the investor has no information about the form and parameters of the true distribution, but believes that this distribution went into effect k periods ago. With these inputs in place, the portfolio weights x_t for the various asset categories and the proportion of assets borrowed are calculated by solving system (12.1)–(12.5) via nonlinear programming methods.7 At the end of quarter t, the realized returns on the risky assets are observed, along with the realized borrowing rate r^r_Bt (which may differ from the decision borrowing rate r^d_Bt).8 Then, using the weights selected at the beginning of the quarter, the realized return on the portfolio chosen for quarter t is recorded. The cycle is then repeated in all subsequent quarters.9
6 The solvency constraint (12.5) is not binding for the power functions with γ < 1 and discrete probability distributions with a finite number of outcomes, because the marginal utility of zero wealth is infinite. Nonetheless, it is convenient to consider (12.5) explicitly so that the nonlinear programming algorithm used to solve the investment problems does not attempt to evaluate an infeasible policy as it searches for the optimum.
7 The nonlinear programming algorithm employed is described in Best (1975).
8 The realized borrowing rate is calculated as a monthly average.
9 Note that if k = 32 under quarterly revision, then the first quarter for which a portfolio can be selected is b + 32, where b is the first quarter for which data is available.
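To make the mechanics of (12.1)–(12.5) concrete, the following sketch solves a single decision quarter numerically. It is our illustration only: the authors solve the problem with the nonlinear programming algorithm of Best (1975), whereas this sketch uses SciPy's general-purpose SLSQP routine, and the margin requirements, rates, and return data are fabricated for the example.

import numpy as np
from scipy.optimize import minimize

def power_policy(R, gamma, r_L, r_B_d, margins):
    """Solve one decision quarter of (12.1)-(12.5) for a power-utility investor.

    R: (S, n) array of joint risky-asset returns, one row per state, each
    state assigned probability 1/S as under the EPAA. gamma: the utility
    parameter (gamma <= 1, gamma != 0 here). r_L: riskfree lending rate.
    r_B_d: decision borrowing rate. margins: initial margin requirements.
    """
    S, n = R.shape
    probs = np.full(S, 1.0 / S)

    def neg_expected_utility(z):
        x, x_L, x_B = z[:n], z[n], z[n + 1]   # x_B <= 0 denotes borrowing
        port = 1.0 + R @ x + r_L * x_L + r_B_d * x_B
        port = np.maximum(port, 1e-9)         # crude guard for solvency (12.5)
        return -np.sum(probs * port ** gamma) / gamma

    constraints = [
        {"type": "eq", "fun": lambda z: z.sum() - 1.0},            # budget (12.3)
        {"type": "ineq", "fun": lambda z: 1.0 - margins @ z[:n]},  # margin (12.4)
    ]
    bounds = [(0.0, None)] * n + [(0.0, None), (None, 0.0)]        # (12.2)
    z0 = np.full(n + 2, 1.0 / (n + 1))
    z0[-1] = 0.0                              # start with no borrowing
    result = minimize(neg_expected_utility, z0, method="SLSQP",
                      bounds=bounds, constraints=constraints)
    return result.x                           # (x_1, ..., x_n, x_L, x_B)

# Fabricated example: 32 quarters of returns on three asset categories.
rng = np.random.default_rng(0)
R = rng.normal(0.02, 0.08, size=(32, 3))
weights = power_policy(R, gamma=-5.0, r_L=0.01, r_B_d=0.02,
                       margins=np.full(3, 0.5))
print(weights.round(3))

At the next quarter the oldest return row would be dropped, the newest appended, and the optimization rerun, exactly as the moving-window revision cycle in the text describes.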
All reported returns are gross of transaction costs and taxes and assume that the investor in question had no influence on prices. There are several reasons for this approach. First, as in previous studies, we wish to keep the complications to a minimum. Second, the return series used as inputs and for comparisons also exclude transaction costs (for reinvestment of interest and dividends) and taxes. Third, many investors are tax-exempt and various techniques are available for keeping transaction costs low. Finally, since the proper treatment of these items is nontrivial, they are better left to a later study.
12.4 The Data

The data used to estimate the probabilities of next period's returns on risky assets, and to calculate each period's realized returns on risky assets, come from several sources. The quarterly returns series for the U.S. asset categories (common stock, long-term corporate bonds, and long-term government bonds) are provided by Ibbotson Associates. The quarterly database on non-U.S. equity returns for 1960–99 covering seven countries (Canada, France, Germany, Japan, the Netherlands, Switzerland, and the United Kingdom) comes from Brinson Associates (1960–70) and from Ibbotson Associates (1970–99). All returns are expressed in U.S. dollars and represent total returns, since both dividends (net of foreign taxes withheld) and capital appreciation (or depreciation) are taken into account. The returns from the U.S. value-weighted industry groups are constructed from the returns on individual NYSE firms contained in the Center for Research in Security Prices (CRSP) monthly returns database. The industry data in this paper update the 1934–86 data in Grauer et al. (1990) through 1999. The firms are combined into 12 industry groups on the basis of the first two digits of the firms' SIC codes, with value-weighted industry indices constructed from the same universe of firms. The riskfree asset is assumed to be 90-day U.S. Treasury bills maturing at the end of the quarter. We use the Survey of Current Business and the Wall Street Journal as sources. The borrowing rate is assumed to be the call money rate plus one percent for decision purposes (but not for rate of return calculations). The applicable beginning-of-period decision rate, r^d_Bt, is viewed as persisting throughout the period and thus as riskfree. For 1934–76, the call money rates are obtained from the Survey of Current Business. For later periods, the Wall Street Journal is the source. Finally, margin requirements for stocks are obtained from the Federal Reserve Bulletin.10
10 There is no practical way to take maintenance margins into account in our programs. In any case, it is evident from the results that they would come into play only for the more risk-tolerant strategies (and even for them only occasionally) and the net effect would be relatively neutral.
For comparative purposes, we calculate and include the returns on unlevered and levered equal- or value-weighted benchmark portfolios of the risky assets in the investment universe. Table 12.1 summarizes the compositions of these portfolios along with an enumeration of the asset categories included in the study.
12.5 Alternative Ways of Estimating the Joint Return Distribution

Nine approaches to estimating the joint return distribution are described in this section. The results, reported in Section 12.7, determine whether one approach dominates the others empirically.

The Empirical Probability Assessment Approach. The base-case EPAA is a moving-window method for estimating next period's joint return distribution where only the most recent k periods of jointly realized returns are used. As we move forward in time, the earliest observation is dropped while the most recent one is added. Thus, implicit in the approach is that the joint return distribution is stationary for the last k quarters. The interesting modeling question is: What is the optimal choice of the lag period k? Intuitively, the lag should be long enough to cover a business cycle and to ensure that the number of states is greater than the number of asset categories.11 At the same time, the lag should be short enough to reflect up-to-date information and to allow for possible nonstationarities in the data. In this paper, we focus on the base-case 32-quarter estimating period that is employed in most of the previous multiperiod asset allocation papers. Under the EPAA in quarter t, each of the 32 most recent joint return realizations (states) is assumed to occur with a probability of 1/32. However, a possible shortcoming of the EPAA is that it only considers data within the moving window. Therefore, it may underestimate the probability of either "boom" or "disaster."

The "All-of-History" Approach. One way to overcome the possibility of underestimating either very good or very bad outcomes is to employ an expanding window that uses all past returns to estimate the (assumed stationary) joint return distribution. At the beginning of quarter t, returns from quarter 1 (the first available data) to quarter t−1 are used. At the beginning of quarter t+1, returns from quarter 1 to quarter t are used to estimate the joint return distribution, and so on.
11 If the number of assets were equal to the number of states, there would be a complete market. In the absence of constraints other than the budget constraint, the optimizer would then think it could achieve any arbitrary return distribution.
Table 12.1 Asset category and fixed-weight portfolio symbols

RL    Riskfree Lending (90-day US Treasury Bills)
GB    Long-term US Government Bonds
CB    Long-term US Corporate Bonds
CS    US Common Stocks (S&P 500)
B     Borrowing
FR    French Equities
GE    German Equities
JA    Japanese Equities
NE    Dutch Equities
CA    Canadian Equities
SI    Swiss Equities
UK    British Equities
EW    Equal-weighted portfolio of risky assets in investment universe
E5    50% in EW, 50% in RL
E15   150% in EW, 50% in B
E20   200% in EW, 100% in B
PETR  Petroleum
FINA  Finance and Real Estate
CDUR  Consumer Durables
BASI  Basic Industries
FOOD  Food and Tobacco
CONS  Construction
CAPG  Capital Goods
TRAN  Transportation
UTIL  Utilities
TEXT  Textiles and Trade
SERV  Services
LEIS  Leisure
VW    Value-weighted portfolio of risky assets in investment universe
V5    50% in VW, 50% in RL
V15   150% in VW, 50% in B
V20   200% in VW, 100% in B
Each joint realization is given an equal probability 1/(t−1) of occurring in quarter t. Note that the probabilities become smaller as we move forward in time, unlike those in the moving-window EPAA. For example, with 12 industries in the fourth quarter of 1999, each of 295 possible states is assumed to occur with equal probability. Comparisons of the returns generated from applying the all-of-history approach and the EPAA allow us to explore the stationary-nonstationary tradeoffs implicit in the expanding versus moving window estimation methods.

The "Sum-of-the-Digits" Approach. The polar opposite to the all-of-history approach focuses on the most up-to-date data, weighting the more recent observations in the k-quarter window of the EPAA more heavily than earlier observations. Probabilities are assigned by first letting

$$K = \sum_{i=1}^{k} i. \qquad (12.6)$$

Then, at the beginning of quarter t, the joint realization of quarter t−1 is assigned probability k/K; the joint realization of quarter t−2 is assigned probability (k−1)/K, and so on, until the joint realization of quarter t−k is assigned probability 1/K of occurrence. Thus, more weight is assigned to the recent observations using a linear decay function. The idea is borrowed from the exponentially weighted moving average model of univariate time series modeling, as well as from a common sense argument that the recent past may be a better guide to what might happen next period than is the more distant past. If k = 32, then K = (33)(32)/2 = 528. Thus, in quarter t, the probability of the most recent joint return realization is 32/K and the probability of the joint realization from quarter t−k is 1/K.

The "Disaster States" Approach. This approach combines the most recent joint return realizations in the EPAA moving window with the "worst" joint realizations that occurred
through all-of-history. Let k be the number of quarters used in the EPAA and n be the number of risky assets. At the beginning of quarter t, the worst state for each asset i (defined as its smallest return) from quarter 1 (the first observation) to quarter t−1 (the most recent observation) is recorded. These n disaster states are then appended to the k most recent states of the EPAA to obtain n + k states as the estimate of the joint return distribution for quarter t. The probabilities of each state are determined by the following rule: the probability of each of the k most recent states is equal to 1/(k + j), and the probability of each of the n disaster states is equal to j/((k + j)n), where j is an integer such that 1 ≤ j ≤ n. Clearly, the disaster-states approach provides a more conservative estimate of the joint return distribution than the EPAA. Furthermore, letting j vary from 1 to n varies the degree of conservatism. (If j = n, equal probability is assigned to each state.) In this case, the interesting modeling question is: What is a reasonable value for the weighting factor j? The tradeoff is to keep the investor aware of disaster states, but not to assign so much probability to them as to completely drive the investor out of the equity markets. In the empirical work, we investigate two possibilities, letting j = 1, 2. In the case of 12 industries and a 32-quarter estimation period, each of the 32 normal states (representing the joint return realizations on the 12 industries) is assigned a probability of either 1/33, if j = 1, or 1/34, if j = 2. Each of the 12 disaster states is assigned a probability of either 1/((33)(12)), if j = 1, or 2/((34)(12)), if j = 2. Thus, if j = 1 (j = 2), 32/33 (32/34) of the total probability is assigned to the normal states and the remainder—either 1/33 or 2/34—to the 12 disaster states.

The "Boom-Disaster States" Approach. This approach differs from the disaster states approach in that the "best" and "worst" outcomes for each asset category are recorded. Thus, at the beginning of quarter t, the best and worst state for each
asset i from quarter 1 to quarter t−1 are recorded. These 2n boom or disaster states are then appended to the most recent k states of the EPAA to obtain 2n + k states as the estimate of the joint return distribution for quarter t. The amended rule for assigning the probabilities is: the probability of each of the k most recent states is equal to 1/(k + j), and the probability of each of the 2n boom or disaster states is equal to j/((k + j)2n), where j is an integer such that 1 ≤ j ≤ 2n. In the empirical work, we investigate three possibilities, letting j = 1, 2, and 4. In the case of 12 industries and a 32-quarter estimation period, each of the 32 normal states is assigned a probability of either 1/33, if j = 1, or 1/36, if j = 4; while each of the 24 boom or disaster states is assigned a probability of either 1/((33)(24)), if j = 1, or 4/((36)(24)), if j = 4. Thus, if j = 1 (j = 4), 32/33 (32/36) of the total probability is assigned to the normal states and the remainder—either 1/33 or 4/36—to the 24 boom or disaster states.

The "All-of-History Sum-of-the-Digits" Approach. The final approach assigns sum-of-digits probabilities to each joint realization through all-of-history. In assigning the probabilities in quarter t, we first let the sum in (12.6) run from 1 to t−1 (instead of from 1 to k). Then, at the beginning of quarter t, the joint realization of quarter t−1 is assigned probability (t−1)/K; the joint realization of quarter t−2 is assigned probability (t−2)/K, and so on, until the joint realization of quarter 1 is assigned probability 1/K. For example, with 12 industries in the fourth quarter of 1999, t = 296, the number of states equals 295, and K = (296)(295)/2. Thus, the probability of the most recent joint return realization is 295/K and the probability of the joint realization from quarter 1 is 1/K.
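The weighting rules of this section reduce to a few lines of arithmetic. The sketch below is our own illustration, not code from the paper; the function and parameter names are invented. It generates the state probabilities for four of the nine schemes and checks that each sums to one.

import numpy as np

def epaa_probs(k=32):
    # Moving window: each of the k most recent joint realizations gets 1/k.
    return np.full(k, 1.0 / k)

def all_of_history_probs(t):
    # Expanding window: quarters 1, ..., t-1 each weighted 1/(t-1).
    return np.full(t - 1, 1.0 / (t - 1))

def sum_of_digits_probs(k=32):
    # Linear decay per Equation (12.6): quarter t-1 gets k/K, quarter t-2
    # gets (k-1)/K, ..., quarter t-k gets 1/K, with K = k(k+1)/2.
    K = k * (k + 1) // 2
    return np.arange(k, 0, -1) / K            # ordered most recent first

def disaster_state_probs(k=32, n=12, j=1):
    # k normal states with probability 1/(k+j) each; the n appended
    # disaster states split the remaining j/(k+j) equally.
    return np.full(k, 1.0 / (k + j)), np.full(n, j / ((k + j) * n))

print(sum_of_digits_probs()[:3])              # [32/528, 31/528, 30/528]
normal, disaster = disaster_state_probs(j=2)
print(normal.sum() + disaster.sum())          # 1.0

The boom-disaster variant is identical to disaster_state_probs with n replaced by 2n, and the all-of-history sum-of-digits scheme is sum_of_digits_probs with k replaced by t − 1.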
12.6 Alternate Ways of Evaluating Investment Performance

The expected utility-based objective function (12.1) of multiperiod portfolio theory has no direct tie to any of the commonly used statistical measures of investment performance. Therefore, we employ an eclectic mix of four methods for evaluating ex post performance: compound return versus standard deviation of return, Jensen's (1968) alpha, Grinblatt and Titman's (1993) portfolio change measure, and ex post certainty equivalent returns. Each of the methods has a certain intuitive appeal. On the other hand, each method embodies economic or statistical assumptions that may be at variance with the investor's objective function, the assumptions employed in estimating the joint return distributions, or the assumptions used in the other methods of measuring performance. Consequently, it is interesting to see whether the rankings of the different performance measures agree.

Compound Return – Standard Deviation Pairs. Compound (or geometric mean) return is an intuitive, frequently
used measure employed by both practitioners and academics in performance comparisons. It provides a perfect ranking of what a given initial amount would have grown to under reinvestment over the period in question. When it comes to measuring variability, standard deviation has few contenders. We make two comparisons. First, we compare the compound return–standard deviation pairs of the power policies generated from different ways of estimating the joint return distribution with those of the asset categories, and with those of the levered equal- and value-weighted benchmarks. Then, for each of the powers, we compare the nine ways of estimating the return distributions, ranking the estimation methods from high to low based on compound returns.

Jensen's Alpha. In academic circles, Jensen's (1968) alpha is probably the most commonly employed performance measure. In this paper, for each portfolio j (power utility policy) we run the regression

$$r_{jt} - r_{Lt} = \alpha_j + \beta_j\,(r_{mt} - r_{Lt}) + e_{jt}, \qquad (12.7)$$
where r_jt is the quarterly rate of return on portfolio j, r_mt is the quarterly rate of return on the CRSP value-weighted index, and r_Lt is the rate of return on three-month Treasury bills. The intercept α_j is the measure of risk-adjusted investment performance. Positive (negative) values indicate superior (inferior) performance. The null hypothesis is that there is no superior investment performance and the alternative hypothesis is that there is. Thus, we report the results of one-tailed t-tests. However, the test embodies both economic and statistical assumptions, and is not without its critics. In terms of economics, alpha—the intercept from Equation (12.7)—reduces the tradeoff between reward (ex post arithmetic average return) and risk (beta) envisioned in the Sharpe (1964)–Lintner (1965) capital asset pricing model into a single risk-adjusted rate of return that can be interpreted as the distance from the security market line. In statistical terms, both the alpha and beta in Equation (12.7) are assumed to be constant in the performance evaluation period. At a deeper level, Roll (1978) shows that the alphas and betas are a function of the benchmark portfolio (i.e., the proxy for the market portfolio). If the benchmark portfolio is exactly mean-variance efficient, there is no abnormal performance. If two benchmark portfolios are not exactly mean-variance efficient, they can, in principle, lead to completely different sets of "winners and losers". See also Dybvig and Ross (1985), Elton et al. (1993), Grauer (1991), and Green (1986). The ambiguity this creates in a domestic market is bad enough. But with a universe that goes much beyond U.S. common stocks, the lack of a commonly accepted benchmark portfolio makes it difficult to justify employing the Jensen test. Thus, we limit our analysis of alphas to the industry dataset. As noted above, the assumptions embodied in the measures of performance may conflict with the assumptions
embodied in the investors' choices of optimal portfolios. For example, in this paper the investors are viewed as price-takers—no particular equilibrium model is assumed or relevant. Furthermore, the investors maximize expected utility employing power utility functions, whereas the Jensen measure is based on a model of market equilibrium where all investors are mean-variance decision makers.

Grinblatt and Titman's Portfolio Change Measure. In contrast to most other performance measures, Grinblatt and Titman's (1993) Portfolio Change Measure (PCM) employs portfolio holdings as well as rates of return and does not require an external benchmark portfolio (i.e., a proxy for the market portfolio). In order to motivate the portfolio change measure, assume that uninformed investors perceive that the vector of expected returns is constant, while informed investors can predict whether expected returns vary over time. Informed investors can profit from changing expected returns by increasing (decreasing) their holdings of assets whose expected returns have increased (decreased). The holding of an asset that increases with an increase in its conditional expected rate of return will exhibit a positive unconditional covariance with the asset's returns. The portfolio change measure is constructed from an aggregation of these covariances. For evaluation purposes, let

$$\mathrm{PCM}_t = \sum_i r_{it}\bigl(x_{it} - x_{i,t-j}\bigr),$$

where r_it is the quarterly rate of return on asset i at time t, and x_it and x_{i,t−j} are the holdings of asset i at time t and time t−j, respectively. This expression provides an estimate of the covariance between returns and weights at a point in time. Alternatively, it may be viewed as the return on a zero-weight portfolio. The portfolio change measure is an average of the PCM_t's,

$$\mathrm{PCM} = \sum_t \sum_i r_{it}\bigl(x_{it} - x_{i,t-j}\bigr)\big/T, \qquad (12.8)$$

where T is the number of time-series observations. The portfolio change measure test itself is a simple t-test based on the time series of zero-weight portfolio returns; that is, $t = \mathrm{PCM}\big/\bigl(\sigma(\mathrm{PCM})/\sqrt{T}\bigr)$, where σ(PCM) is the standard deviation of the time series of PCM_t's. In their empirical analysis of mutual fund performance, Grinblatt and Titman work with two values of j that represented one- and four-quarter lags. We compute the PCM for the same two lags, but to save space we report results for the four-quarter lag only. The PCM seems particularly apropos in the present study because the portfolio weights are chosen according to a prespecified set of rules over the same quarterly time interval as performance is measured. Thus, we do not have to worry about possible gaming or window dressing problems that face researchers trying to gauge the performance of mutual funds. However, it is worth noting that the PCM imposes stronger assumptions of stationarity in returns than other measures of portfolio performance. Moreover, with the exception of the all-of-history approach, the strong stationarity assumption is not made in the other eight ways of estimating the joint return distributions.

Ex Post Certainty Equivalent Returns. The final performance measure is based on knowledge of the investor's utility function. At a point in time each investor, employing an estimate of the joint return distribution, maximizes the ex ante expected utility of returns by solving (12.1)–(12.5). For example, employing industry data, each investor makes a time series of 264 portfolio choices, beginning in the first quarter of 1934 and ending in the fourth quarter of 1999, one set for each of the nine ways of estimating the joint return distribution. Now, suppose the investor is asked to evaluate the ex post utility of each of the nine time series of portfolio returns. Define the ex post utility of the time series of (optimal ex ante) portfolio returns as

$$EU = \frac{1}{T}\sum_{t=1}^{T}\frac{1}{\gamma}\bigl(1+r_t(x_t)\bigr)^{\gamma}, \quad \gamma \le 1,\ \gamma \ne 0; \qquad EU = \frac{1}{T}\sum_{t=1}^{T}\ln\bigl(1+r_t(x_t)\bigr), \quad \gamma = 0, \qquad (12.9)$$

where each portfolio return (1 + r_t(x_t)) is assigned equal probability of occurrence. While Equation (12.9) gives the desired ranking of the nine time series of returns for each of the powers, the utility numbers are non-intuitive, particularly for the low powers. Therefore, the expected utility numbers from Equation (12.9) are transformed into certainty equivalent returns as follows:

$$1 + r_{CE} = [\gamma\,EU]^{1/\gamma}, \quad \gamma \le 1,\ \gamma \ne 0; \qquad 1 + r_{CE} = \exp[EU], \quad \gamma = 0. \qquad (12.10)$$

At the risk of oversimplifying, we draw the following analogy between the certainty equivalent return and Jensen's alpha. Alpha reduces the reward (ex post average return) to risk (beta) tradeoff of the mean-variance capital asset pricing model into a single risk-adjusted rate of return. Similarly, we can think of the certainty equivalent return as reducing the reward (ex post average return) to risk (standard deviation) tradeoff of a specific power utility investor into a single risk-adjusted rate of return.12

12 The argument is simplified as power utility weights all moments of the distribution, not just the mean and variance. We also note that Kandel and Stambaugh (1996) and Grauer and Shen (2000) have employed certainty equivalent returns to measure investment performance.
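For readers who wish to try the evaluation methods on their own data, the following sketch implements three of the four measures (compound return needs no illustration). It is our own illustrative code, not the authors'; it assumes plain NumPy arrays of quarterly portfolio returns, market returns, a riskfree rate, asset returns, and portfolio weights, all fabricated in the example at the bottom.

import numpy as np

def jensen_alpha(r_p, r_m, r_L):
    # OLS regression of Equation (12.7): portfolio excess returns on
    # market excess returns; returns (alpha, beta).
    y, x = r_p - r_L, r_m - r_L
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

def portfolio_change_measure(asset_returns, weights, j=4):
    # Equation (12.8) with lag j, plus the simple t-test described above.
    # asset_returns and weights are (T, n) arrays.
    pcm_t = ((weights[j:] - weights[:-j]) * asset_returns[j:]).sum(axis=1)
    t_stat = pcm_t.mean() / (pcm_t.std(ddof=1) / np.sqrt(len(pcm_t)))
    return pcm_t.mean(), t_stat

def certainty_equivalent(r_p, gamma):
    # Ex post expected utility (12.9) mapped to a CE return via (12.10).
    if gamma == 0:
        return np.exp(np.mean(np.log1p(r_p))) - 1.0
    eu = np.mean((1.0 + r_p) ** gamma) / gamma
    return (gamma * eu) ** (1.0 / gamma) - 1.0

# Fabricated example: 40 quarters of data on two assets.
rng = np.random.default_rng(1)
asset_returns = rng.normal(0.02, 0.08, size=(40, 2))
weights = rng.dirichlet([1.0, 1.0], size=40)
r_p = (weights * asset_returns).sum(axis=1)
r_m = asset_returns.mean(axis=1)
print(jensen_alpha(r_p, r_m, 0.01))
print(portfolio_change_measure(asset_returns, weights, j=4))
print(certainty_equivalent(r_p, gamma=-30.0))

Note that for γ = 0 the certainty equivalent collapses to the geometric mean return, which is why, for the logarithmic investor, the certainty-equivalent rankings in Table 12.5 coincide with the compound-return rankings in Table 12.2.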
12.7 The Results

We evaluate the ex post performance of ten power utility investment strategies when the joint return distributions are estimated nine different ways in industry-rotation and global asset-allocation settings. Four methods are employed in the evaluation: compound return – standard deviation pairs, Jensen's alpha, Grinblatt and Titman's portfolio change measure, and ex post certainty equivalent returns. Naturally, space limitations dictate that we can only report a subset of the results. To keep the figures relatively uncluttered we report compound return – standard deviation plots for seven of the ten utility functions, focusing on four of the nine ways of estimating the joint return distribution—the base-case EPAA, the polar opposite all-of-history and sum-of-digits approaches, and the boom-disaster 1 approach. In the tables, we report on all nine estimation methods, focusing on two power utility functions—the extremely risk-averse power −30 and the well-known, much less risk-averse logarithmic utility function. The results in the tables are sorted from best to worst. In Table 12.2 the sort is by compound return, in Table 12.3 by alpha, in Table 12.4 by the portfolio change measure, and in Table 12.5 by certainty equivalent return. In addition, in each of the tables the results for the all-of-history, boom-disaster 1, EPAA, and sum-of-digits estimation approaches are presented in bold type so that the results may be more easily compared with the results in the figures.

Compound Return – Standard Deviation Pairs. Fig. 12.1 plots the annual compound returns and the standard deviations13 of the realized returns for four sets of ten power utility strategies under quarterly revision for the 32-year period 1968–99 when the risky portion of the investment universe consists of 12 U.S. value-weighted industries. The first set of strategies (see dark circles) shows the returns generated from the base-case EPAA. The second set of strategies (see open circles) shows the returns from the all-of-history approach. The third set (see open squares) shows the returns generated from the sum-of-digits approach. The fourth set (see open triangles) shows the returns from the boom-disaster 1 approach. The figure also shows the compound returns and standard deviations of the returns of the up- and down-levered value-weighted benchmark portfolios (see open diamonds). With the exception of power 1, the EPAA strategies easily outperform the polar opposite sum-of-digits and all-of-history approaches. Moreover, the sum-of-digits approach outperforms the all-of-history approach. The boom-disaster 1 approach performs very well for high powers and not as well

13 For consistency with the geometric mean, the standard deviation is based on the log of one plus the rate of return. This quantity is very similar to the standard deviation of the rate of return for levels less than 25 percent.
for the low powers. Allowing for the possibility of (boom and) disaster makes the more risk-averse investors even more conservative, penalizing their ex post performance—at least as measured in terms of ex post compound return. Finally, the active strategies perform well relative to the value-weighted benchmarks, except for the high power strategies based on the all-of-history approach.14

Figure 12.2 presents the compound return – standard deviation plot for the 32-year period 1968–99 when the risky portion of the investment universe is ten global asset categories. The difference between the sum-of-digits and all-of-history plots in Figs. 12.1 and 12.2 is striking. In the global universe, the all-of-history estimation approach clearly dominates the sum-of-digits method for the high (less risk-averse) powers—just the opposite of what occurs with the U.S. industry universe.15 As in the industry universe, the EPAA and boom-disaster 1 approaches closely shadow each other and, with the possible exception of the power 1 strategy, dominate the sum-of-digits and all-of-history approaches. Finally, the returns earned by the four estimation approaches do not so clearly dominate the levered equal-weighted benchmarks.

Table 12.2 contains the compound return – standard deviation results for all nine estimation methods for the power −30 and the logarithmic utility function in the value-weighted industries and global universes. The table shows that for the power −30 there is little difference between the returns earned by the nine estimation methods. For the logarithmic utility function the differences are much more pronounced.16 Finally, the results reinforce the observed differences (see Figs. 12.1 and 12.2) in the rankings of the sum-of-digits and all-of-history estimation methods in the global and industry settings.

Jensen's Alpha. Table 12.3 shows the results for Jensen's test for the power −30 and logarithmic utility functions in the value-weighted industries universe when the joint return distributions are estimated in nine ways in the 1934–99 and 1968–99 periods. We note three main points from the table. First, all but three of the policies exhibit positive abnormal returns and slightly more than one-third are statistically significant.17 Second, the EPAA and all-of-history estimation approaches consistently rank among the best and worst, respectively. Third, the rankings are highly correlated with the compound return rankings reported in Table 12.2.
Figure 12.1 also contains the compound return-standard deviation pairs for a combined approach discussed in Section 12.9. 15 We note, however, that the starting point for the all-of-history estimation period is 1926 in the industry dataset and 1960 in the global dataset. 16 The differences are even more pronounced in an equal-weighted industries universe, reported in the first version of the paper. 17 In the equal-weighted industries universe, all the power policies perform extremely well—especially in the 1968–90 period where the highly statistically significant abnormal returns range from approximately 45 to over 60 percent of excess returns.
Table 12.2 Compound return and standard deviation of annual returns for power −30 and power 0 portfolios generated from nine estimators of the joint return distribution (estimator, compound return, standard deviation; in percent per year)

Panel A: Power −30

Value-weighted industries 1934–1999:
  Sum of Digits (SOD)   6.94   7.60
  EPAA                  6.72   7.36
  Boom Disaster 1       5.25   3.76
  Boom Disaster 2       5.05   3.52
  Disaster 1            4.97   3.50
  All of History SOD    4.86   3.43
  Boom Disaster 4       4.86   3.29
  Disaster 2            4.66   3.30
  All of History        4.53   3.29

Value-weighted industries 1968–1999:
  EPAA                  8.87   3.80
  Sum of Digits (SOD)   8.58   4.33
  Boom Disaster 1       7.90   2.99
  Boom Disaster 2       7.66   2.81
  Disaster 1            7.60   2.76
  All of History SOD    7.49   2.70
  Boom Disaster 4       7.44   2.65
  Disaster 2            7.28   2.65
  All of History        7.22   2.55

Global asset categories 1968–1999:
  EPAA                  9.33   3.92
  Boom Disaster 1       8.81   3.50
  Sum of Digits (SOD)   8.71   4.33
  Boom Disaster 2       8.57   3.37
  Disaster 1            8.48   3.29
  Boom Disaster 4       8.30   3.22
  Disaster 2            8.12   3.09
  All of History SOD    8.10   3.16
  All of History        8.02   2.94

Panel B: Power 0

Value-weighted industries 1934–1999:
  Boom Disaster 1      16.57  31.22
  EPAA                 16.47  31.85
  Boom Disaster 2      16.37  30.14
  Disaster 1           15.12  26.36
  Boom Disaster 4      14.65  28.00
  Sum of Digits (SOD)  14.42  35.05
  Disaster 2           13.47  21.36
  All of History SOD   13.42  25.24
  All of History       10.77  19.54

Value-weighted industries 1968–1999:
  Boom Disaster 1      17.12  30.02
  Boom Disaster 2      16.51  29.72
  EPAA                 16.13  28.86
  Disaster 1           15.77  21.32
  Boom Disaster 4      14.17  27.76
  Sum of Digits (SOD)  14.09  32.51
  Disaster 2           13.59  13.35
  All of History SOD   11.44  23.62
  All of History       10.54  17.50

Global asset categories 1968–1999:
  Disaster 1           19.13  43.44
  All of History SOD   19.05  47.63
  Disaster 2           18.98  44.24
  Boom Disaster 1      18.88  42.34
  EPAA                 18.68  42.35
  Boom Disaster 2      18.55  42.75
  Boom Disaster 4      17.99  43.74
  All of History       17.59  45.06
  Sum of Digits (SOD)  12.27  42.31

Note: There are two investment universes: (1) 12 U.S. value-weighted industry groups and (2) eight country common stock indices, U.S. long-term government bonds, and U.S. long-term corporate bonds. The power portfolios are revised quarterly in the 1934–1999 and 1968–1999 periods. Borrowing is permitted. The compound return of each of the nine estimators is sorted from high to low. The benchmark empirical probability assessment approach, sum-of-digits, all-of-history, and boom-disaster 1 estimators are presented in bold type in the original.
Table 12.3 Average excess returns, betas, and alphas for power −30 and power 0 portfolios generated from nine estimators of the joint return distribution (estimator, excess return, beta, alpha, 1-tail p-value)

Panel A: Power −30

Value-weighted industries 1934–1999:
  EPAA                 0.66  0.17   0.28  0.03
  Sum of Digits (SOD)  0.72  0.20   0.27  0.04
  Boom Disaster 1      0.27  0.08   0.09  0.03
  Boom Disaster 2      0.22  0.07   0.07  0.03
  Disaster 1           0.20  0.06   0.07  0.03
  Boom Disaster 4      0.18  0.05   0.06  0.02
  All of History SOD   0.18  0.06   0.05  0.05
  Disaster 2           0.13  0.04   0.05  0.03
  All of History       0.10  0.04   0.02  0.14

Value-weighted industries 1968–1999:
  EPAA                 0.49  0.15   0.23  0.03
  Sum of Digits (SOD)  0.43  0.18   0.13  0.22
  Boom Disaster 1      0.24  0.09   0.10  0.04
  Disaster 1           0.17  0.06   0.08  0.04
  Boom Disaster 2      0.19  0.07   0.07  0.05
  Boom Disaster 4      0.13  0.05   0.05  0.06
  Disaster 2           0.10  0.03   0.05  0.04
  All of History SOD   0.15  0.06   0.04  0.12
  All of History       0.08  0.04   0.02  0.19

Panel B: Power 0

Value-weighted industries 1934–1999:
  Disaster 2           2.69  0.77   0.93  0.03
  Disaster 1           3.41  1.10   0.91  0.06
  EPAA                 4.01  1.37   0.88  0.07
  Boom Disaster 1      4.00  1.41   0.78  0.07
  Boom Disaster 2      3.86  1.39   0.70  0.08
  Sum of Digits (SOD)  3.61  1.39   0.44  0.24
  Boom Disaster 4      3.33  1.32   0.32  0.22
  All of History SOD   2.89  1.16   0.26  0.25
  All of History       1.98  0.94  −0.15  0.71

Value-weighted industries 1968–1999:
  Disaster 1           2.95  1.11   1.11  0.10
  Boom Disaster 1      3.71  1.57   1.10  0.10
  EPAA                 3.49  1.51   1.00  0.15
  Boom Disaster 2      3.44  1.51   0.94  0.12
  Disaster 2           1.92  0.64   0.86  0.08
  Sum of Digits (SOD)  2.98  1.36   0.72  0.24
  Boom Disaster 4      2.68  1.39   0.38  0.28
  All of History SOD   1.84  1.14  −0.05  0.53
  All of History       1.27  0.87  −0.17  0.65

Note: The investment universe consists of 12 U.S. value-weighted industry groups. The power portfolios are revised quarterly in the 1934–1999 and 1968–1999 periods. Borrowing is permitted. Average arithmetic excess returns (raw returns less the riskfree lending rate) and alphas are measured in percent per quarter. The heteroskedasticity-consistent p-values measure the significance of the alphas relative to zero. The alpha of each of the nine estimators is sorted from high to low. The benchmark empirical probability assessment approach, sum-of-digits, all-of-history, and boom-disaster 1 estimators are presented in bold type in the original.
Table 12.4 Portfolio change measures for power −30 and power 0 portfolios generated from nine estimators of the joint return distribution (estimator, PCM, 1-tail p-value)

Panel A: Power −30

Value-weighted industries 1934–1999:
  Sum of Digits (SOD)  0.15  0.02
  EPAA                 0.14  0.02
  Boom Disaster 1      0.09  0.00
  Boom Disaster 2      0.07  0.00
  Disaster 1           0.07  0.00
  Disaster 2           0.05  0.00
  All of History SOD   0.04  0.00
  Boom Disaster 4      0.04  0.00
  All of History       0.02  0.00

Value-weighted industries 1968–1999:
  EPAA                 0.18  0.02
  Boom Disaster 1      0.10  0.01
  Sum of Digits (SOD)  0.10  0.12
  Boom Disaster 2      0.08  0.02
  Disaster 1           0.08  0.01
  Disaster 2           0.06  0.01
  All of History SOD   0.04  0.02
  Boom Disaster 4      0.04  0.06
  All of History       0.02  0.02

Global asset categories 1968–1999:
  EPAA                 0.13  0.02
  Boom Disaster 1      0.11  0.02
  All of History SOD   0.09  0.01
  Boom Disaster 2      0.09  0.02
  Boom Disaster 4      0.09  0.02
  Disaster 1           0.09  0.02
  All of History       0.08  0.00
  Disaster 2           0.08  0.03
  Sum of Digits (SOD)  0.02  0.43

Panel B: Power 0

Value-weighted industries 1934–1999:
  EPAA                 1.53  0.00
  Boom Disaster 1      1.33  0.00
  Disaster 1           1.10  0.00
  Boom Disaster 2      1.03  0.01
  Disaster 2           0.97  0.00
  Sum of Digits (SOD)  0.96  0.03
  All of History SOD   0.66  0.01
  Boom Disaster 4      0.45  0.07
  All of History       0.33  0.03

Value-weighted industries 1968–1999:
  EPAA                 1.57  0.02
  Boom Disaster 1      1.49  0.03
  Disaster 1           1.24  0.04
  Disaster 2           1.14  0.01
  Boom Disaster 2      1.00  0.07
  Sum of Digits (SOD)  0.96  0.09
  All of History SOD   0.52  0.09
  All of History       0.38  0.07
  Boom Disaster 4      0.28  0.29

Global asset categories 1968–1999:
  All of History       1.05  0.01
  All of History SOD   0.98  0.01
  Disaster 2           0.90  0.05
  Disaster 1           0.61  0.16
  EPAA                 0.52  0.22
  Boom Disaster 1      0.51  0.21
  Boom Disaster 2      0.47  0.22
  Sum of Digits (SOD)  0.47  0.30
  Boom Disaster 4      0.43  0.23

Note: There are two investment universes: (1) 12 U.S. value-weighted industry groups and (2) eight country common stock indices, U.S. long-term government bonds, and U.S. long-term corporate bonds. The power portfolios are revised quarterly in the 1934–1999 and 1968–1999 periods. Borrowing is permitted. The portfolio change measure (PCM), which employs a four-quarter lag, is measured in percent per quarter. The p-values measure the significance of the portfolio change measures relative to zero. The portfolio change measure of each of the nine estimators is sorted from high to low. The benchmark empirical probability assessment approach, sum-of-digits, all-of-history, and boom-disaster 1 estimators are presented in bold type in the original.
Table 12.5 Certainty equivalent returns for power −30 and power 0 portfolios generated from nine estimators of the joint return distribution (estimator, arithmetic mean, standard deviation, certainty equivalent)

Panel A: Power −30

Value-weighted industries 1934–1999:
  Boom Disaster 2      1.24  1.13   1.05
  Boom Disaster 4      1.20  0.99   1.05
  All of History SOD   1.20  1.03   1.04
  Boom Disaster 1      1.29  1.28   1.04
  Disaster 1           1.22  1.10   1.04
  Disaster 2           1.15  0.94   1.02
  All of History       1.12  0.90   1.00
  Sum of Digits (SOD)  1.73  3.03   0.52
  EPAA                 1.68  2.84  −1.48

Value-weighted industries 1968–1999:
  Boom Disaster 1      1.92  1.16   1.72
  Boom Disaster 2      1.87  0.98   1.72
  Disaster 1           1.85  0.94   1.72
  Boom Disaster 4      1.81  0.84   1.71
  All of History SOD   1.83  0.91   1.70
  Disaster 2           1.78  0.77   1.69
  All of History       1.76  0.75   1.68
  EPAA                 2.17  2.01   1.54
  Sum of Digits (SOD)  2.11  2.44   0.81

Global asset categories 1968–1999:
  Boom Disaster 2      2.09  1.38   1.78
  Boom Disaster 4      2.02  1.24   1.78
  Disaster 1           2.06  1.33   1.78
  All of History       1.95  1.10   1.77
  Boom Disaster 1      2.14  1.53   1.77
  Disaster 2           1.98  1.15   1.77
  All of History SOD   1.97  1.22   1.74
  EPAA                 2.27  1.89   1.70
  Sum of Digits (SOD)  2.14  2.35   0.74

Panel B: Power 0

Value-weighted industries 1934–1999:
  Boom Disaster 1      5.02  15.31  3.91
  EPAA                 5.03  15.47  3.88
  Boom Disaster 2      4.88  14.61  3.86
  Disaster 1           4.43  13.30  3.58
  Boom Disaster 4      4.35  13.39  3.48
  Sum of Digits (SOD)  4.63  15.50  3.43
  Disaster 2           3.71  10.24  3.21
  All of History SOD   3.91  11.81  3.20
  All of History       3.00   8.96  2.59

Value-weighted industries 1968–1999:
  Boom Disaster 1      5.39  17.12  4.03
  Boom Disaster 2      5.12  16.24  3.89
  EPAA                 5.17  16.85  3.81
  Disaster 1           4.63  13.75  3.73
  Boom Disaster 4      4.36  14.48  3.37
  Sum of Digits (SOD)  4.66  16.32  3.35
  Disaster 2           3.60   8.73  3.24
  All of History SOD   3.52  12.22  2.75
  All of History       2.95   8.96  2.54

Global asset categories 1968–1999:
  Disaster 1           6.07  18.07  4.47
  All of History SOD   6.35  19.07  4.46
  Disaster 2           5.97  17.44  4.44
  Boom Disaster 1      6.03  18.29  4.42
  EPAA                 6.01  18.47  4.37
  Boom Disaster 2      5.96  18.19  4.35
  Boom Disaster 4      5.85  18.04  4.22
  All of History       5.97  18.93  4.14
  Sum of Digits (SOD)  4.55  17.90  2.94

Note: There are two investment universes: (1) 12 U.S. value-weighted industry groups and (2) eight country common stock indices, U.S. long-term government bonds, and U.S. long-term corporate bonds. The power portfolios are revised quarterly in the 1934–1999 and 1968–1999 periods. Borrowing is permitted. The arithmetic means, standard deviations, and certainty equivalent returns are measured in percent per quarter. The certainty equivalent return of each of the nine estimators is sorted from high to low. The benchmark empirical probability assessment approach, sum-of-digits, all-of-history, and boom-disaster 1 estimators are presented in bold type in the original.
Fig. 12.1 Compound return versus standard deviation of ln(1 + r) of annual portfolio returns for seven power portfolios in a U.S. industry universe. The investment universe consists of twelve U.S. value-weighted industry groups. The benchmark portfolios are the value-weighted CRSP index with differing degrees of leverage. The power portfolios are revised quarterly in the 1968–1999 period. Borrowing is permitted. The joint return distributions are estimated using the all-of-history, sum-of-digits, empirical probability assessment, boom-disaster 1, and combined approaches. [Figure not reproduced; axes: compound return versus standard deviation, with portfolio points labeled by power (1, 0.5, 0, −1, −5, −10, −30) and leveraged value-weighted benchmarks (VW, V5, V15, V20, RL).]
Grinblatt and Titman's Portfolio Change Measure. Table 12.4 contains the results for Grinblatt and Titman's portfolio change measure. Results are shown for the value-weighted industries universe in the 1934–99 and 1968–99 periods and for the global asset universe in the 1968–99 period. In this case, we note four main points. First, all the policies exhibit positive abnormal performance. Second, the PCMs are highly statistically significant: all but nine of the 54 are statistically significant at the 10% level. Third, in the value-weighted industries universe, the PCMs are more statistically significant than the corresponding alphas in Table 12.3. Fourth, some of the PCM rankings are at complete odds with the compound return and Jensen's alpha rankings. For example, compound returns and Jensen's alpha rank all-of-history ninth in seven of the eight cases reported in Tables 12.2 and 12.3. Yet, for the logarithmic investor in the global universe, compound return ranks the all-of-history estimation method eighth while the portfolio change measure ranks it first.

Ex Post Certainty Equivalent Returns. Table 12.5 shows quarterly arithmetic means, standard deviations, and certainty equivalent returns for power −30 and logarithmic utility in the value-weighted industries and global universes. For the logarithmic investor, the certainty equivalent return rankings are consistent with those of the other performance measures. For power −30, it is a completely different story. For this investor in the global setting, the EPAA ranks eighth by certainty equivalent returns as opposed to first by compound returns and the portfolio change measure. In the industry setting, the power −30 investor actually assigns negative certainty equivalent returns to the EPAA and sum-of-digits estimation methods in the 1934–99 period. Surprisingly, this can be traced to a single observation in the second quarter of 1962, in which the EPAA and sum-of-digits estimation methods lose 16.4% and 13.9%, respectively. By way of comparison, the third-worst loss is 3.8%, and the remaining six estimation methods lose less than 2.6%.

These results go to the core of the estimation risk problem when the estimation method is based on a finite number of states of nature drawn from historical data. Given a long enough time series, we will eventually find some state or quarter t where the ex ante joint return distribution drawn from the 32-quarter moving window is characterized by either no losses or small losses, yet in the next quarter there is a large loss. While all estimation methods are based on ex ante joint return distributions that contain estimation error, the estimation methods that assign some probability to loss states outside the moving 32-quarter window will yield more conservative investment policies. As a consequence, when the large loss occurs in quarter t, the realized portfolio return may be (and in some cases is) smaller than the smallest ex ante portfolio return in any state. That is exactly what happens to the EPAA and sum-of-digits methods in 1962. In other scenarios or states, less risk-averse investment policies will yield larger gains than any gains seen in the ex ante joint return distributions. As the disutility of a loss outweighs the utility of an equal-size gain for all risk-averse investors, either the conservative or the aggressive policies generated from different estimation methods could be preferred ex post, depending on the degree of investor risk aversion. However, the extreme aversion to (unexpected) losses exhibited by the more risk-averse members of the power utility family almost certainly means that the conservative investment policies of estimation methods that account for the possibility of a large loss will be preferred ex post.18

18 These results are consistent with Grauer and Shen (2000), who find that ex ante the policies of more risk-averse investors subject to fewer constraints are preferred to the policies subject to more constraints. Nonetheless, the policies subject to more constraints are preferred ex post.

Fig. 12.2 Compound return versus standard deviation of ln(1 + r) of annual portfolio returns for seven power portfolios in a global investment universe. The investment universe consists of eight country common stock indices, U.S. long-term government bonds, and U.S. long-term corporate bonds. The benchmark portfolios equal-weight the ten asset classes, employing differing degrees of leverage. The power portfolios are revised quarterly in the 1968–1999 period. Borrowing is permitted. The joint return distributions are estimated using the all-of-history, sum-of-digits, empirical probability assessment, and boom-disaster 1 approaches. [Figure not reproduced; axes: compound return versus standard deviation, with portfolio points labeled by power and equal-weighted benchmarks (EW, E5, E15, E20, RL).]
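To make the certainty equivalent comparisons concrete, the ex post certainty equivalent for a power-utility investor can be computed from a realized return series as follows. This is a minimal sketch assuming the standard isoelastic utility u(w) = w^γ/γ (with γ = 0 read as logarithmic utility); it is our own illustration, not the authors' code.

import numpy as np

def certainty_equivalent(returns, gamma):
    """Certainty equivalent CE solving u(1 + CE) = mean_t u(1 + r_t).

    returns: realized per-period returns (e.g., quarterly, in decimals).
    gamma: power-utility parameter; gamma = 0 denotes logarithmic utility.
    """
    w = 1.0 + np.asarray(returns, dtype=float)
    if gamma == 0.0:                       # logarithmic investor
        return np.exp(np.mean(np.log(w))) - 1.0
    return np.mean(w ** gamma) ** (1.0 / gamma) - 1.0

# A single -16.4% quarter drags the gamma = -30 certainty equivalent far
# below the arithmetic mean (and below zero), illustrating the 1962 effect
# described in the text:
# certainty_equivalent([0.03] * 100 + [-0.164], gamma=-30.0)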
12.8 Conclusion

Previous studies show that combining a power utility portfolio selection model with the empirical probability assessment approach to estimating the joint return distribution frequently generates economically and statistically significant abnormal returns. Given that the EPAA is based exclusively on historical return data in a 32-quarter moving window, a series of papers explores whether these somewhat surprising results (as well as those generated from mean-variance optimizers that employ historical data to generate the mean vector and covariance matrix) are attributable to the power utility formulation that underlies the multiperiod model, to the inputs to the model, or to the constraints. The latter two possibilities are usually discussed under the rubric of estimation risk. In turn, most of the analysis of the inputs focuses on alternative estimates of the means or, in the context of the mean-variance model, of the mean vector and covariance matrix. By way of contrast, this paper investigates eight alternatives to the base-case EPAA that either lengthen the estimation period; add "boom" and/or "disaster" states drawn from all-of-history to the moving window; or change the way the probabilities of the states are estimated, giving more weight to more recent observations. The alternatives allow us to explore the stationary versus nonstationary tradeoffs implicit in the expanding- versus moving-window estimation methods; the benefits of alternative ways of allowing for the "memory loss" inherent in the moving-window EPAA; and the possibility that weighting more recent observations more heavily may improve investment performance.

Compound return and standard deviation pairs, Jensen's alpha, Grinblatt and Titman's portfolio change measure, and ex post certainty equivalent returns are employed in evaluating the ex post performance of ten power utility investment strategies generated from nine estimation methods of the joint return distributions in two investment universes. The results show that the EPAA holds its own against other methods of estimating the joint return distribution; that the all-of-history method does not perform very well; and that the methods that add disaster or boom-disaster states to the EPAA moving window perform quite well. We find the latter result encouraging because, in our judgment, ignoring the possibility of disaster states in forming estimates of the joint return distribution may one day prove to be just as costly to the power utility strategies as the catastrophic events of August and September 1998 were to the arbitrage strategies of Long-Term Capital Management.

However, while there is considerable agreement in the rankings of the estimation methods, the rankings differ (sometimes dramatically) across datasets, time periods, performance evaluation methods, and degrees of investor risk aversion. For the most part the evaluation measures rank the all-of-history estimation method at or near the bottom (i.e., ninth out of nine estimation methods). Yet, for the logarithmic investor in the global dataset, the portfolio change measure ranks the all-of-history estimation method first, and for the power −30 investor the certainty equivalent measure ranks the all-of-history estimation method fourth. For the power −30 investor in the industry dataset, compound return, Jensen's alpha, and the portfolio change measure rank the EPAA and sum-of-digits estimation methods either first or second (or in one case first and third). Yet, the certainty equivalent measure ranks the EPAA and sum-of-digits methods eighth or ninth.

Although dramatic disagreement among the performance measures in certain instances is discouraging, it should not be surprising. The expected utility-based objective function of multiperiod portfolio theory has no direct tie to any of the commonly used measures of investment performance. And, while each of the performance measures has a certain intuitive appeal, each embodies economic or statistical assumptions that may be at variance with the investor's objective function, with the assumptions employed in the different methods of estimating the joint return distributions, or with the assumptions used in the other methods of measuring performance.

The results raise two fundamental issues. First, the performance measures may be particularly misleading if a catastrophic event does not show up in the sample. Second, even if a catastrophic event does occur, traditional performance measures may not be equipped to deal with it. For example, Grauer (2008a) finds that Jensen's alpha and Grinblatt and Titman's portfolio change measure rank a number of bankrupt mean-variance asset-allocation strategies ahead of perfect-foresight strategies that generate riches beyond one's wildest dreams.
12.9 Addendum

In light of the out-of-sample portfolio returns generated from models that forecast the means based on information variables in a moving window (see, for example, Grauer 2008b, c, 2009), we examine the effect of combining these moving-window forecasts of the means with an expanding-window approach that guards against the possibility of losses in some long-forgotten states of nature. Dividend yields and risk-free interest rates are used to forecast the means. The following regression is run at each time t:

r_{iτ} = a_{0i} + a_{1i} dy_{τ−1} + a_{2i} r_{Lτ} + e_{iτ},  for all i and τ,

in the t−k to t−1 estimation period, where the i subscript denotes an industry, dy_{τ−1} is the annual dividend yield on the CRSP value-weighted index lagged one month so that it is observable at the beginning of quarter τ, r_{Lτ} is the (observable) beginning-of-quarter Treasury bill rate, and in this case k = 32. The one-period-ahead forecast of the mean of industry i is

r̄_{DRit} = â_{0i} + â_{1i} dy_{t−1} + â_{2i} r_{Lt},

where â_{0i}, â_{1i}, and â_{2i} are the estimated coefficients, and dy_{t−1} and r_{Lt} are observable at the beginning of period t+1. That is, the quarterly variable dy_{t−1} is lagged one month, and there is no need to lag r_{Lt} because it is observable at the beginning of the quarter. For each asset i, we add the difference (r̄_{DRit} − r̄_{it}), where r̄_{it} is the historical mean from the expanding (all-of-history) window, to each actual return on asset i in the expanding-window estimation period. That is, in each estimation period, we replace the raw return series with the adjusted return series r^A_{iτ} = r_{iτ} + (r̄_{DRit} − r̄_{it}), for all i and τ. No adjustment is made to the EPAA variance-covariance structure or to the other moments.

We limit the reported results to the compound return and standard deviation pairs for the industry dataset in 1968–1999.
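The adjustment just described can be sketched in a few lines. This is our own illustration, not the authors' code; the array layout and the function name are assumptions.

import numpy as np

def adjusted_returns(r, dy_lag, rL, t, k=32):
    """Mean-adjust an expanding-window return history as in the Addendum.

    r: (T, N) quarterly asset returns; dy_lag: (T,) dividend yield lagged one
    month; rL: (T,) beginning-of-quarter T-bill rate; t: current quarter index
    (the forecast applies to quarter t+1).
    """
    # Fit r_i = a0 + a1*dy_{tau-1} + a2*rL_tau + e over the k-quarter window.
    X = np.column_stack([np.ones(k), dy_lag[t - k:t], rL[t - k:t]])
    r_adj = np.empty((t, r.shape[1]))
    for i in range(r.shape[1]):
        coef, *_ = np.linalg.lstsq(X, r[t - k:t, i], rcond=None)
        forecast = coef @ np.array([1.0, dy_lag[t], rL[t]])  # r_bar_DR,it
        hist_mean = r[:t, i].mean()                          # r_bar_it (all-of-history)
        r_adj[:, i] = r[:t, i] + (forecast - hist_mean)      # r^A = r + (r_DR - r_bar)
    return r_adj  # feed into the expanding-window EPAA as the adjusted states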
Figure 12.1 shows that the combination approach clearly dominates the other approaches.

Acknowledgments We thank the Social Sciences Research Council of Canada and the Centre for Accounting Research and Education for financial support. The paper was presented at the University of British Columbia, the Northern Finance Association meetings in London, Ontario, and the Pacific Northwest Finance Conference in Seattle. We thank C.F. Lee, William Ziemba, and seminar participants for valuable comments, and Reo Audette, Christopher Fong, Maciek Kon, Jaspreet Sahni, William Ting, and Jeffrey Yang for most capable assistance.
References

Barberis, N. 2000. "Investing for the long run when returns are predictable." Journal of Finance 55, 225–264.
Best, M. J. 1975. "A feasible conjugate direction method to solve linearly constrained optimization problems." Journal of Optimization Theory and Applications 16, 25–38.
Black, F. and R. Litterman. 1992. "Global portfolio optimization." Financial Analysts Journal 48, September/October, 28–43.
Brennan, M. J., E. S. Schwartz, and R. Lagnado. 1997. "Strategic asset allocation." Journal of Economic Dynamics & Control 21, 1377–1403.
Dybvig, P. H. and S. A. Ross. 1985. "The analytics of performance measurement using a security market line." Journal of Finance 40, 401–416.
Elton, E. J., M. J. Gruber, S. Das, and M. Hlavka. 1993. "Efficiency with costly information: a reinterpretation of evidence from managed portfolios." Review of Financial Studies 6, 1–22.
Frost, P. A. and J. E. Savarino. 1988. "For better performance: constrain portfolio weights." Journal of Portfolio Management, 29–34.
Grauer, R. R. 1991. "Further ambiguity when performance is measured by the security market line." Financial Review 26, 569–585.
Grauer, R. R. 2008a. "Benchmarking performance measures with perfect-foresight and bankrupt asset-allocation strategies." Journal of Portfolio Management 34, 43–57.
Grauer, R. R. 2008b. "On the predictability of stock market returns: evidence from industry rotation strategies." Journal of Business and Management 14, 47–71.
Grauer, R. R. 2008c. "Timing the market with stocks, bonds and bills." Working paper, Simon Fraser University, Canada.
Grauer, R. R. 2009. "Extreme mean-variance solutions: estimation error versus modeling error." In Applications of Management Science on Financial Optimization, 13: Financial Modeling and Data Envelopment Applications, K. D. Lawrence and G. Kleinman (Eds.). Emerald Group Publishing Limited, United Kingdom.
Grauer, R. R. and N. H. Hakansson. 1982. "Higher return, lower risk: historical returns on long-run, actively managed portfolios of stocks, bonds and bills, 1936–1978." Financial Analysts Journal 38, 39–53.
Grauer, R. R. and N. H. Hakansson. 1985. "Returns on levered, actively managed long-run portfolios of stocks, bonds, and bills, 1934–1984." Financial Analysts Journal 41, 24–43.
Grauer, R. R. and N. H. Hakansson. 1986. "A half-century of returns on levered and unlevered portfolios of stocks, bonds, and bills, with and without small stocks." Journal of Business 59, 287–318.
Grauer, R. R. and N. H. Hakansson. 1987. "Gains from international diversification: 1968–85 returns on portfolios of stocks and bonds." Journal of Finance 42, 721–739.
Grauer, R. R. and N. H. Hakansson. 1993. "On the use of mean-variance and quadratic approximations in implementing dynamic investment strategies: a comparison of the returns and investment policies." Management Science 39, 856–871.
Grauer, R. R. and N. H. Hakansson. 1995a. "Gains from diversifying into real estate: three decades of portfolio returns based on the dynamic investment model." Real Estate Economics 23, 117–159.
Grauer, R. R. and N. H. Hakansson. 1995b. "Stein and CAPM estimators of the means in asset allocation." International Review of Financial Analysis 4, 35–66.
Grauer, R. R. and N. H. Hakansson. 1998. "On timing the market: the empirical probability assessment approach with an inflation adapter." In Worldwide Asset and Liability Modeling, W. T. Ziemba and J. M. Mulvey (Eds.). Cambridge University Press, Cambridge, 149–181.
Grauer, R. R. and N. H. Hakansson. 2001. "Applying portfolio change and conditional performance measures: the case of industry rotation via the dynamic investment model." Review of Quantitative Finance and Accounting 17, 237–265.
Grauer, R. R., N. H. Hakansson, and F. C. Shen. 1990. "Industry rotation in the U.S. stock market: 1934–1986 returns on passive, semipassive, and active strategies." Journal of Banking and Finance 14, 513–535.
Grauer, R. R. and F. C. Shen. 2000. "Do constraints improve portfolio performance?" Journal of Banking and Finance 24, 1253–1274.
Green, R. C. 1986. "Benchmark portfolio inefficiency and deviations from the security market line." Journal of Finance 41, 295–312.
Grinblatt, M. and S. Titman. 1993. "Performance measurement without benchmarks: an examination of mutual fund returns." Journal of Business 66, 47–68.
Hakansson, N. H. 1971. "On optimal myopic portfolio policies, with and without serial correlation of yields." Journal of Business 44, 324–334.
Hakansson, N. H. 1974. "Convergence to isoelastic utility and policy in multiperiod portfolio choice." Journal of Financial Economics 1, 201–224.
Huberman, G. and S. Ross. 1983. "Portfolio turnpike theorems, risk aversion and regularly varying utility functions." Econometrica 51, 1104–1119.
Jensen, M. C. 1968. "The performance of mutual funds in the period 1945–1964." Journal of Finance 23, 389–416.
Jobson, J. D. and B. Korkie. 1980. "Estimation for Markowitz efficient portfolios." Journal of the American Statistical Association 75, 544–554.
Jobson, J. D. and B. Korkie. 1981. "Putting Markowitz theory to work." Journal of Portfolio Management 7, 70–74.
Jobson, J. D., B. Korkie, and V. Ratti. 1979. "Improved estimation for Markowitz portfolios using James-Stein type estimators." Proceedings of the American Statistical Association, Business and Economics Statistics Section 41, 279–284.
Jorion, P. 1985. "International portfolio diversification with estimation risk." Journal of Business 58, 259–278.
Jorion, P. 1991. "Bayesian and CAPM estimators of the means: implications for portfolio selection." Journal of Banking and Finance 15, 717–727.
Kandel, S. and R. Stambaugh. 1996. "On the predictability of stock returns: an asset allocation perspective." Journal of Finance 51, 385–424.
Leland, H. 1972. "On turnpike portfolios." In Mathematical Methods in Investment and Finance, K. Shell and G. P. Szego (Eds.). North-Holland, Amsterdam.
Lintner, J. 1965. "The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets." Review of Economics and Statistics 47, 13–47.
Lowenstein, R. 2000. When Genius Failed: The Rise and Fall of Long-Term Capital Management. Random House Trade Paperbacks, New York.
Merton, R. C. 1971. "Optimum consumption and portfolio rules in a continuous time model." Journal of Economic Theory 3, 373–413.
Merton, R. C. 1973. "An intertemporal capital asset pricing model." Econometrica 41, 867–887.
Mossin, J. 1968. "Optimal multiperiod portfolio policies." Journal of Business 41, 215–229.
Roll, R. 1978. "Ambiguity when performance is measured by the securities market line." Journal of Finance 33, 1051–1069.
Ross, S. 1974. "Portfolio turnpike theorems for constant policies." Journal of Financial Economics 1, 171–198.
Sharpe, W. F. 1964. "Capital asset prices: a theory of market equilibrium under conditions of risk." Journal of Finance 19, 425–442.
Shen, F. C. 1988. "Industry rotation in the U.S. stock market: an application of multiperiod portfolio theory." Unpublished Ph.D. dissertation, Simon Fraser University, Canada.
Solnik, B. 1993. "The performance of international asset allocation strategies using conditioning information." Journal of Empirical Finance 1, 33–55.
Chapter 13
International Portfolio Management: Theory and Method Wan-Jiun Paul Chiou and Cheng-Few Lee
Abstract This paper investigates the impact of various investment constraints on the benefits and asset allocation of the international optimal portfolio for domestic investors in various countries. The empirical results indicate that local investors in less-developed countries, particularly in East Asia and Latin America, benefit more from global diversification. Although the global financial market is becoming more integrated, adding constraints reduces but does not completely eliminate the diversification benefits of international investment. The addition of portfolio bounds yields the following characteristics of asset allocation: a reduction in the temporal deviation of diversification benefits, a decrease in the time-variation of the components of the optimal portfolio, and an expansion in the range of assets included. Our findings are useful for asset management professionals in determining target markets in which to promote the sales of national/international funds and in managing the risk of global portfolios.

Keywords International portfolio management · Investment constraints
W.-J.P. Chiou, Department of Finance, John L. Grove College of Business, Shippensburg University, Shippensburg, PA 17257, USA; e-mail: [email protected]
C.-F. Lee, Department of Finance and Economics, Rutgers Business School, Rutgers University, New Brunswick, NJ 08854, USA

13.1 Introduction

In the past three decades, financial markets throughout the world have experienced evolutions that present both challenges and opportunities to international portfolio management: more openness to domestic and foreign investors, increasing integration with other countries, gradual liberalization from governmental control, and innovation in financial products and services. Specifically, as investibility and the legal limits on foreign investment vary across countries, it is natural to question the asset allocation of the unrestricted efficient frontier suggested by Markowitz (1952).1 Furthermore, understanding changes in the benefits of international diversification strategies helps to structure dynamic portfolio rebalancing as the world financial market becomes increasingly integrated. To ensure that strategies are achievable, the obstacle of extreme allocations in the optimal international portfolio, such as negative or enormous weights, needs to be taken into account.

This chapter explores the theory and method of global asset management by investigating the attributes of the diversifying portfolio. We also provide evidence on the significance of including various investment constraints by observing the changes in weights and in mean-variance efficiency. Previous studies investigate the strategies of international portfolio management with short-selling constraints (for a more detailed discussion, see Cosset and Suret 1995; De Roon et al. 2001; De Santis and Gerard 1997; Driessen and Laeven 2007; Fletcher and Marshall 2005; French and Poterba 1991; Harvey 1995; Li et al. 2003; Novomestky 1997; and Obstfeld 1994). Yet, the impact of other investment restrictions on global diversification benefits remains unclear.

There are three reasons for portfolio managers to consider the unattainability of the "corner solutions" on the efficient frontier. First, both profitability and liquidity of the diversifying portfolio should be taken into account when asset allocation in international markets is determined. The heavily positive/negative weights on investments in minor markets recommended by less-constrained strategies may cause illiquidity of the portfolio. Second, excessive foreign capital in- and outflows in minor markets tend to trigger instability in security prices. This may generate dramatic variation in returns, risks, and correlations. Finally, in many countries, particularly developing nations, governmental regulations prohibit foreign investors from short selling and/or holding more than a certain proportion of company shares.2 From legal and institutional perspectives, a large allocation of foreign capital to securities in those small financial markets is impractical. Accordingly, the results of this chapter are more realistic for asset management, since illiquidity of the portfolio and variation in return and volatility, as well as legal limitations, are considered when the optimal portfolio strategies are formed.

To appropriately estimate the effectiveness of international portfolio management, two straightforward measures are used: the increase in the risk-adjusted premium from investing in the maximum Sharpe ratio (MSR) portfolio and the reduction in volatility from investing in the minimum variance portfolio (MVP) on the international efficient frontiers. Additionally, this chapter extends the study of diversification benefits by including over-time comparisons and cross-region analyses. An increase in correlations and a reduction of returns in the global market cast a shadow of doubt on the effectiveness of international investment. An intertemporal analysis of the benefits of diversification is helpful to determine the impact of the integration of the international financial market.

Our empirical findings confirm the benefits of international portfolio management under various investment constraints. The cross-country comparison suggests that domestic investors in emerging markets, particularly in East Asia and Latin America, benefit more from international diversification. It is also found that global diversification benefits vary as the world financial market has become more integrated. Although lower and upper bounds on weights worsen the mean-variance efficiency of the optimal portfolio, they generate some desirable attributes for asset management and therefore enhance the feasibility of asset allocation strategies. Specifically, the addition of over-weighting investment constraints substantially reduces the time-variation in the gains and weights of global diversification. The expansion of coverage in the optimal portfolio makes the asset allocations more realistic.

The chapter is organized as follows. Section 13.2 reviews the development of global portfolio management. Section 13.3 reviews the literature. The estimation of the benefits of international diversification is described in Sect. 13.4. Section 13.5 reports the empirical results on the benefits of international diversification. The over-time examination of the components of the optimal portfolios is reported in Sect. 13.6. Conclusions are presented and relevant issues are discussed in Sect. 13.7.

1 The investibility of a stock market is indicated by the degree to which foreign investors can trade as local investors do in the domestic market and by the liquidity of its assets. See Bae et al. (2004).
2 For instance, the ownership of listed companies by foreigners cannot exceed the limit of 10% in Chile, 25% in South Korea, and 10% in Taiwan. In Brazil, international institutional investors cannot own more than 49% of voting shares. In Switzerland, corporations are likely to issue international investors special shares that are traded at a premium over the same stocks available exclusively to Swiss nationals.
13.2 Overview of International Portfolio Management

International diversification has been a strong trend for investors in all countries since the mid-1990s. For European investors, such as those in the Netherlands and the U.K., overseas portfolios play an important role in asset management. However, this is a relatively new option in the portfolios of investors in the U.S. and most emerging markets. The enormous growth of financial markets outside North America after the mid-1980s is one of the major reasons behind global diversification. Table 13.1 presents the average growth rates of market value in 34 major countries from 1993 to 2005 and their weights in world market capitalization as of the end of 2005. These sample countries are also grouped into seven regions: East Asia, North America, Latin America, North Europe, West/Central Europe, South Europe, and Oceania. The global value-weighted average annual growth rate is 10.5% during the period. The countries with the largest growth in market value were Finland, Turkey, and Greece, whose growth rates are more than twice the world average. The variation in the growth of market value within each group of countries at a given developmental stage and in various areas is considerable. The fact that the equity markets in developing countries experienced a larger increase in value indicates the significance of international diversification for the U.S. investor.

The magnitude of foreign markets also validates the trend toward international diversification. In the early 1970s, the New York Stock Exchange represented more than 60% of a world market value of less than $1 trillion in total. As shown in Figure 13.1, the size of the world market multiplied significantly over the next three decades, and at the end of 2005 the stock market value in the above 34 countries was about $38 trillion. The shares among the various regions varied drastically. The percentage of the U.S. equity market fell from more than 50% to less than 40% in the early 1990s and was back to 43% by the end of 2005. The East Asia and Oceania region, which made up more than one-third of the market capitalization in the world in the early 1990s, fell to about one-fifth at the end of 2005. Europe represents about one-third of the global market value. The stock markets in developed countries consistently represented more than 90% of world capitalization. On the other hand, the capitalization of the emerging markets is small; only Brazil, South Korea, and Taiwan had a share greater than 1% of the world market as of the end of 2005.

International diversification is inspired by two desired characteristics of portfolio management: low global correlation, which decreases portfolio volatility, and an expanded set of investment targets, which allows investors to allocate assets toward markets with higher expected returns. These attributes eventually lead to superior mean-variance efficiency of the holding portfolio.
Table 13.1 Capitalization and performance of markets. Panel A: Developed countries; Panel B: Emerging markets. Columns: capitalization growth, capitalization weight, mean return, region, and the median, standard deviation, maximum (with month-year), and minimum (with month-year) of the monthly Sharpe ratio. [Table data not reproduced.]
The growth rate of capitalization during the period 1992:12–2005:12 and the weight in world capitalization as of the end of 2005 are reported for each country. The mean raw return for each market from 1988:01 to 2005:12 is annualized. The mean, median, standard deviation, maximum, and minimum of the time series of the monthly Sharpe ratio for each market are also presented. The averages for developed countries and emerging markets are calculated from their value-weighted portfolios.
However, several impediments to global diversification have been discussed in recent years. The first criticism of global diversification is that consistently higher returns in the domestic markets of certain countries challenge the need for overseas investment. It is questionable, however, to claim greater expected return in the future on the basis of past superior performance. Table 13.1 summarizes the value-weighted average performance and the dynamics of the monthly Sharpe ratio for each market. The raw return in developing countries, on average, is higher than in the stock markets of developed countries. However, the Sharpe ratios in developed countries are higher than the ones in the emerging markets. The countries with the most mean-variance efficient domestic portfolios are the U.S., Switzerland, the Netherlands, and Finland. This is consistent with previous findings that the mean-variance efficiency of equity prices in emerging markets is lower due to higher volatility (Bekaert and Harvey 2003). Furthermore, the cross-area disparity of equity returns among emerging markets is considerable. Specifically, stock returns in Latin America outperformed those in other regions, while equities in East Asian countries performed worse than the rest of the world.
Fig. 13.1 Stock market capitalization in the world. The equity market capitalization in different regions (Europe, East Asia, Latin America, North America, and Oceania) from 1992 to 2005 is demonstrated. [Figure not reproduced; vertical axis in millions of U.S. dollars, 0 to 40,000,000.]
Table 13.1 also indicates significant time-variation in the risk-adjusted performance of each stock market. Compared to the means of the Sharpe ratio, the time-variation of the Sharpe ratio for each market, measured by its standard deviation, is substantial. For the developed countries, the maximum domestic Sharpe ratio most likely occurred in 1998 and 2005, while for most emerging markets it occurred before 1998. The local investors in the majority of emerging markets experienced the worst risk-adjusted performance in 1998 due to financial crises. On the other hand, investors in developed countries experienced the largest losses in 2002 and 2003 because of the evaporation of the high-tech bubble after 2000 and the economic recession worsened by the terrorist attacks in 2001. Furthermore, there are nonsynchronous movements of mean-variance efficiency across countries, which imply potential gains to domestic investors if they diversify their portfolios globally. For instance, ten markets had their best mean-variance efficiency and six markets their worst in 1998. Similarly, four countries had their highest Sharpe ratio, and three countries their lowest, in 2000. This suggests that cross-market performances can differ drastically in the same year and that local investors may avoid losses in their home markets by allocating their funds optimally in other countries.

Second, it has been observed that international correlations have trended upward over the past decades. As shown in Table 13.2, the intertemporal comparison provides evidence of an increase in correlation in the international financial market. The mean correlation of each market with all other countries in the second period is persistently greater than in the first period. In the first period, some markets are negatively correlated or almost uncorrelated with certain areas. The increase in correlation is particularly considerable for the emerging markets paired with countries in different regions. All mean correlation coefficients in the second period are increasing and positive. This finding is consistent with the increased integration of the world financial market over the past two decades.

Table 13.2 also shows the means of the correlation coefficients of each country with other countries grouped by region in two sub-periods. The developed countries, in general, demonstrate higher correlations with other markets than the emerging markets do. Most countries also have their highest coefficients of correlation with the other countries in their home region and with the ones in North America. Although the magnitude of correlations increases over time, the phenomenon that the emerging markets are less correlated with other countries can be observed consistently in the two sample periods. The fact that most markets are less correlated with countries from other regions indicates possible diversification benefits from inter-continent investments. In addition, the stock markets in the rest of the world tend to have considerable co-movements with the U.S. markets.

Third, the existence of home bias in international asset allocation suggests investors consider the barriers to overseas investment costly. Factors such as familiarity with the financial markets of other countries, political risk, market liquidity and efficiency, regulation of foreign ownership of domestic corporations, transaction costs, levies and taxes, and exchange rate risk rationalize why an investor would want to overweight his or her local portfolio relative to the weights of assets in the rest of the world. However, previous research suggests that the costs related to these variables are smaller than the improvement in mean-variance efficiency brought by international diversification. This implies that overseas asset allocation should not be avoided entirely.
Table 13.2 Coefficients of correlation among markets. Panel A: Developed countries; Panel B: Emerging markets. Columns: world average, Latin America, East Asia, North America, Central/West Europe, North Europe, South Europe, and Oceania, each for the sub-periods 1988–96 and 1997–05. [Table data not reproduced.]
The averages of the unconditional coefficients of correlation of each country with the countries of other regions during two periods, 1988:01–1996:12 and 1997:01–2005:12, are reported.
13.3 Literature Review

Previous empirical evidence suggests that investment constraints may not completely eliminate the benefits brought about by international diversification. Bekaert and Urias (1996), Chiou (2008), Chiou et al. (2009), De Roon et al. (2001), Driessen and Laeven (2007), Harvey (1995), Li et al. (2003), Pástor and Stambaugh (2000), and Wang (1998) confirm the benefits of international diversification even when short-sales are not allowed. Cosset and Suret (1995) find that including securities from high political-risk countries in the portfolio can increase mean-variance efficiency. Considering geographical restrictions, De Roon et al. (2001) find that investors in East Asia can still generate benefits from diversifying their portfolios by investing in other countries in Latin America. Green and Hollifield (1992) and Jagannathan and Ma (2003), on the other hand, investigate the impacts of imposing short-sales and upper-bound investment constraints on mean-variance efficiency and portfolio risk. Errunza et al. (1999) find that U.S. investors can utilize domestically traded American Depositary Receipts (ADRs) to duplicate the benefits of international diversification.

The degree of international market integration is critical to the effectiveness of international diversification. Bekaert and Harvey (1995), Bekaert et al. (2005), De Jong and De Roon (2005), and Errunza et al. (1992) suggest that the world market is mildly segmented and that the degree of global market integration varies through time. The integration of international financial markets, in general, has gradually increased, but the emerging markets are still more segmented than those in developed countries. Bekaert et al. (2005) find that the correlations among individual stock prices do not necessarily increase when the international financial market is more integrated, while idiosyncratic risk does not show a particular time trend. Given the changes in the world financial market, an intertemporal study of international diversification benefits would be useful.

The institutional and cultural heterogeneities among countries are key factors determining the nonsynchronous movement of security prices among markets. Beck et al. (2003) and Demirgüç-Kunt and Maksimovic (1998) suggest that the international differences among financial markets can be explained by natural endowments and legal tradition. La Porta et al. (1998, 2000) and Djankov et al. (2003, 2008) show how law affects the protection of investors. Stulz and Williamson (2003) document that the liberalization and development of financial markets relate to cultural background, such as the major religion and language. Bekaert and Harvey (2003) report the major characteristics of emerging markets and their chronological innovations. The differences in cultural background, natural endowments, institutional systems, and legal tradition deter the integration of international financial markets, so that investors may generate gains from overseas diversification.
In sum, previous studies document the benefits of global diversification under constraints such as no short-sales. The increase in the correlations of international financial markets does not completely eliminate the benefits, because of the dispersion of returns among countries. The differences in culture, natural endowments, and institutional systems among markets cause this dispersion of returns. The gap in understanding regarding the time series of the benefits of constrained international diversification remains to be filled.
13.4 Forming the Optimal Global Portfolio

Suppose a representative investor desires to minimize the volatility of her portfolio, given the same return, by allocating funds among international markets. The expected excess returns over the risk-free interest rate and the variance-covariance matrix of the N international asset returns can be expressed as a vector μ^T = [μ_1, μ_2, ..., μ_N] and a positive definite matrix V, respectively. Let Ω be the set of all real weight vectors w such that w^T 1 = w_1 + w_2 + ... + w_N = 1, where 1 is an N-vector of ones. We further assume that the best predictors of the means, variances, and covariances are their historical averages. The investor thus follows the method of Markowitz (1952) to form the global efficient frontier. The optimal portfolio selection problem is then expressed as a Lagrangian:

min_{w,λ,γ} Ξ = (1/2) w^T V w + λ (μ_p − w^T μ) + γ (1 − w^T 1),   (13.1)

where μ_p denotes the expected return on the portfolio, and the shadow prices λ and γ are two positive constants. The quadratic programming solution for asset spanning without investment constraints is:

w_p = λ V^{−1} μ + γ V^{−1} 1.   (13.2)

Various weighting constraints are considered. It is well known that short-selling is not allowed for foreign investors in many countries, particularly in less-developed nations (see De Roon et al. 2001; Harvey 1995; Li et al. 2003; Pástor and Stambaugh 2000). Furthermore, the investor considers the liquidity of the portfolio when global asset allocation is determined. To reflect the argument of Green and Hollifield (1992) and Jagannathan and Ma (2003), the relative magnitude of market value in each country is considered when the upper bounds on the weights are set. The subset of portfolio weights P_U with short-sale and over-weighting investment constraints (SS + OW(U)) can be described as:

P_U = {w_p ∈ Ω : 0 ≤ w_i ≤ U w_{(Cap)i} ≤ 1, i = 1, 2, ..., N}, U > 1,   (13.3)

where w_{(Cap)i} is the proportion of the market value of country i in the world and U is any real number. The weight restrictions are then incorporated into the Lagrangian (13.1). Since the incentives for diversification are not only to seek higher yields but also to reduce volatility, the maximum Sharpe ratio (MSR) represents the highest mean-variance efficiency that can be achieved on the international efficient frontier. Specifically,

MSR_J = max_{w_p} { w_p^T μ / (w_p^T V w_p)^{1/2} : w_p ∈ P_J }.   (13.4)

For the domestic investor in country i, the greatest increase in unit-risk return brought by global diversification is

δ_{i,J} = MSR_J − SR_i,   (13.5)

where SR_i = μ_i / V_i^{1/2} is the Sharpe ratio and V_i is the variance of the domestic portfolio in country i. The other measure of the benefits of diversification is the maximum reduction in volatility as a result of international diversification. Elton et al. (2007) suggest that investors may seek to minimize the variance of a portfolio because of the lack of predictability of expected returns. In this case, an investor may want to construct a minimum-variance portfolio (MVP). The weights of the MVP can be characterized as:

w_{MVP} = { w_p : min_{w_p} w_p^T V w_p, w_p ∈ P_J }.   (13.6)

The maximum decline in volatility from diversifying internationally under various investment constraints is

ε_{i,J} = 1 − (w_{MVP,J}^T V w_{MVP,J} / V_i)^{1/2}.   (13.7)

In this chapter, the global efficient frontiers and the Sharpe ratios are estimated using monthly returns over the previous 5 years. The time series of δ_{i,J} and ε_{i,J} under the various investment constraints are generated from time-rolling efficient frontiers. The weights of the MSR portfolio and the MVP under the various investment constraints in each month are also calculated.
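To make the construction concrete, the constrained MSR and MVP problems in (13.3)–(13.6) can be solved with a general-purpose optimizer. The following is a minimal sketch, not the authors' implementation; the function names and the use of SciPy's constrained minimize are our own assumptions.

import numpy as np
from scipy.optimize import minimize

def constrained_frontier_points(mu, V, w_cap, U=10.0):
    """MSR and MVP weights under the SS + OW(U) constraints of Eq. (13.3).

    mu: (N,) mean excess returns; V: (N, N) covariance matrix;
    w_cap: (N,) world capitalization weights. Feasible set:
    0 <= w_i <= min(1, U * w_cap_i) and sum(w) = 1.
    """
    n = len(mu)
    bounds = [(0.0, min(1.0, U * c)) for c in w_cap]
    budget = {"type": "eq", "fun": lambda w: w.sum() - 1.0}
    w0 = np.full(n, 1.0 / n)

    neg_sharpe = lambda w: -(w @ mu) / np.sqrt(w @ V @ w)   # Eq. (13.4)
    msr = minimize(neg_sharpe, w0, bounds=bounds, constraints=[budget]).x

    variance = lambda w: w @ V @ w                          # Eq. (13.6)
    mvp = minimize(variance, w0, bounds=bounds, constraints=[budget]).x
    return msr, mvp

The benefit measures δ_{i,J} and ε_{i,J} in (13.5) and (13.7) then follow by comparing the optimized portfolios with each domestic market's Sharpe ratio and variance; repeating the computation on a rolling 60-month window yields the time series used in this chapter.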
13.5 The Benefits of International Diversification Around the World

Figure 13.2 exhibits the efficient frontiers of the global portfolio at the end of each year from 1993 to 2005. Because of the time-variation in mean-variance efficiency and correlations among markets, the shape and size of the efficient frontiers vary noticeably from 1993 to 2005. The movement of the efficient frontiers also does not follow a consistent direction or pattern. The efficient frontier first shifts to the northwest from 1993 to 1994 but then moves to the southeast (a reduction in mean-variance efficiency) from 1994 to 1997. The positions of the optimal portfolios hover around the same general area from 1998 to 1999. In 2000, the frontier moves northwest, consistent with the U.S. market dot-com boom. Then, from 2001 to 2003, the frontier moves noticeably to the southeast, seemingly reflective of the U.S. market crash. The period from 2002 to 2003 is by far the least mean-variance efficient of our sample period. The frontier then moves northwest again in 2004 and 2005. The movement of the efficient frontiers seems to reflect world markets and business cycles.

Figure 13.3 graphically demonstrates the time series of potential diversification benefits to domestic investors. Graph A shows that the average improvement in mean-variance efficiency for the local investor was at its peak from 1993 to 1994 when only the short-sales constraint is considered. On the other hand, when the limit on over-weighting investment is taken into account, δ is at its maximum during 1996 and 2000. In Graph B, ε under the various restrictions was slightly lower from 1996 to 1998 and from 2003 to 2005. The intertemporal change in the benefit of risk reduction is not as considerable as that in the improvement of mean-variance efficiency. The cross-strategy comparison shows that the benefits under the various constraints are not proportional. In Graph A, for instance, the divergence of the Sharpe ratio benefits among the various trading limits is significant when the Sharpe ratio benefit of the no-short-selling portfolio is high. A similar pattern can be found in Graph B. This suggests that the effectiveness of the less restrictive diversifying strategies is more sensitive to the variation of asset returns in the menu. Consideration of these constraints reduces uncertainty in portfolio performance, since some portion of investment shifts to the second-best alternatives, which generally are the national indices representing larger markets with high mean-variance efficiency.

Table 13.3 reports the means of δ and ε for each country over the sample period. The gains from international diversification for local investors in emerging markets, both the improvement in mean-variance efficiency and the decrease in volatility, are greater than the ones in developed countries. This result holds for the various scenarios of weighting constraints. The diversification benefits to investors in emerging markets are eroded considerably by restrictive upper bounds, especially δ. Domestic investors in developed countries can still increase the mean-variance efficiency of their portfolios by approximately 0.136 and reduce volatility by 2.5% via global diversification with constraints of SS + OW(3). Table 13.3 also shows that the comparative advantages vary from region to region. The local investors in East Asia, on average, enjoy superior increments in the risk-adjusted premium and the greatest shrinkages in volatility from global diversification. For home investors in Latin America, the benefits of incorporating overseas stocks in their portfolios are primarily to reduce risk. The benefits for investors in Europe and North America are relatively trivial compared to the ones

Fig. 13.2 Mean-variance efficient frontiers: 1993–2005. This figure shows the short-sale-constrained international efficient frontiers at the beginning of each year from 1993 to 2005. The efficient frontiers are constructed from the monthly returns of the previous 5 years. [Figure not reproduced; axes: monthly expected return versus monthly standard deviation, with frontiers labeled by year 93–05.]

Fig. 13.3 The benefits of international diversification. The time series of the means of the benefits of international diversification for all sample countries are presented; Panel A plots δ and Panel B plots ε. The increase in the risk-adjusted premium and the decrease in volatility of the domestic portfolio from international diversification are δ_{i,J} = MSR_J − SR_i and ε_{i,J} = 1 − (w_{MVP,J}^T V w_{MVP,J} / V_i)^{1/2}, respectively. [Figure not reproduced.]
Table 13.3 The benefits of international diversification to domestic investors
229
Panel A: Developed countries SS C OW(10)
SS C OW(5)
SS C OW(3)
ıi;J
"i;J
ıi;J
"i;J
ıi;J
"i;J
ıi;J
"i;J
0:3472 0:4358 0:2563 0:2954 0:1908 0:2588 0:2221 0:2403 0:1875 0:2776 0:2938 0:2726 0:2639 0:3044 0:4308 0:1968 0:3253 0:3248 0:3175 0:2028 0:1617 0:2765
2:03% 2:27% 1:49% 2:28% 1:90% 2:77% 2:09% 3:21% 6:04% 2:19% 0:85% 5:21% 2:38% 3:75% 3:05% 1:51% 3:55% 3:52% 3:69% 4:47% 1:01% 2:82%
0:2475 0:2693 0:1937 0:2614 0:0953 0:1803 0:1609 0:1575 0:1217 0:1645 0:1813 0:1450 0:1950 0:2413 0:3196 0:1307 0:2500 0:2549 0:1434 0:1520 0:0995 0:1888
1:94% 2:18% 1:42% 2:13% 1:82% 2:68% 1:91% 3:08% 5:98% 2:15% 0:60% 5:16% 2:22% 3:63% 2:97% 1:42% 3:49% 3:45% 3:47% 4:29% 0:81% 2:70%
0:2196 0:2506 0:1584 0:2444 0:0603 0:1375 0:1187 0:1467 0:0965 0:1156 0:1487 0:1094 0:1553 0:2063 0:2835 0:0975 0:1966 0:2167 0:1073 0:1188 0:0707 0:1552
1:78% 2:07% 1:31% 2:08% 1:76% 2:52% 1:80% 3:04% 5:93% 2:06% 0:51% 5:14% 2:12% 3:54% 2:87% 1:33% 3:39% 3:34% 3:42% 4:18% 0:76% 2:62%
0:1993 0:2351 0:1318 0:2293 0:0395 0:1206 0:0956 0:1387 0:0862 0:0837 0:1311 0:0876 0:1316 0:1806 0:2669 0:0783 0:1810 0:1974 0:0859 0:1007 0:0474 0:1356
1:64% 1:94% 1:18% 1:99% 1:67% 2:33% 1:67% 2:89% 5:84% 1:95% 0:39% 5:11% 1:95% 3:39% 2:75% 1:25% 3:24% 3:21% 3:36% 4:09% 0:63% 2:50%
SS C OW(10)
SS C OW(5)
SS C OW(3)
"i;J
ıi;J
ıi;J
ıi;J
SS Australia Austria Belgium Canada Switzerland Germany Denmark Spain Finland France U.K. Hong Kong Ireland Italy Japan Netherlands Norway New Zealand Singapore Sweden U.S.A. Average
Panel B: Emerging markets SS ıi;J Argentina Brazil Chile Greece Indonesia Korea, S. Malaysia Mexico Philippines Portugal Thailand Turkey Taiwan Average
0:3163 0:2756 0:2344 0:3178 0:4113 0:3924 0:3113 0:2054 0:3538 0:2951 0:3116 0:3426 0:3754 0:3187
9:32% 9:64% 4:40% 5:63% 11:18% 7:14% 4:93% 6:82% 6:59% 2:96% 8:46% 14:80% 6:89% 7:60%
0:1914 0:1778 0:1511 0:1884 0:2806 0:2927 0:1750 0:0725 0:2136 0:2185 0:1881 0:2240 0:2739 0:2037
"i;J 9:11% 9:51% 4:21% 5:54% 11:00% 6:89% 4:86% 6:75% 6:52% 2:86% 8:26% 14:61% 6:71% 7:45%
0:1814 0:1508 0:1195 0:1532 0:2534 0:2469 0:1452 0:0687 0:1796 0:1926 0:1688 0:1676 0:2386 0:1743
"i;J 8:99% 9:42% 4:10% 5:44% 10:92% 6:78% 4:80% 6:58% 6:46% 2:74% 8:13% 14:54% 6:66% 7:35%
0:1699 0:1307 0:1025 0:1223 0:2396 0:2286 0:1285 0:0617 0:1570 0:1799 0:1517 0:1515 0:2175 0:1570
"i;J 8:88% 9:28% 3:97% 5:31% 10:83% 6:65% 4:73% 6:39% 6:39% 2:61% 8:02% 14:43% 6:49% 7:23%
The medians of benefits of international diversification with short-sales (SS) and various sets of short-sale plus over-weighting (SS C OW(U )) investment constraints to domestic investor in different countries are reported. The increase in the risk-adjusted premium and the decrease in volatility of domestic portfolio by 1 international diversification are •i;J D MSRJ SRi and ©i;J D 1 wTMVP;J VwMVP;J =Vi 2 , respectively.
in the rest of world. The relations of the more constrained diversification benefits among regions are similar to the results of the less constrained portfolios. In the long term, the international diversification benefits are time-varying and do not fall significantly. The more restrictive upper bounds of portfolio weights shrink the benefits of international diversification and their temporal variation. The optimal global diversifying strategies indeed generate benefits to the local investors, even though the short-sale and over-weighting constraints are increasingly restrictive. In general, local investors in emerging markets, particularly in East Asia and Latin America, benefit more from international diversification. This result holds for global portfolios with various investment restrictions.
13.6 The Optimal Portfolio Components Table 13.4 shows the average weight of each country for the maximum Sharpe Ratio (MSR) portfolio and the MVP over the period. The comparison of components in the optimal portfolios under various investment constraints indicates the infeasibility of less restrictive diversifying strategies. For the portfolio with short-sale restrictions, weighting in emerging markets is disproportionate to its distribution in the world capital market value. On average, the investors who wish to maximize portfolio mean-variance efficiency should place 31.41% of wealth in emerging markets, which represent merely 7.4% of total market value of all countries at the end of 2005. Among them, although the Chilean equity
230
W.-J.P. Chiou and C.-F. Lee
Table 13.4 Means of portfolio weights Panel A: Developed countries MSR Portfolio Australia Austria Belgium Canada Switzerland Germany Denmark Spain Finland France U.K. Hong Kong Ireland Italy Japan Netherlands Norway New Zealand Singapore Sweden U.S.A. Sum
SS 0.0010 0.1196 0.0256 0.0040 0.1120 0.0002 0.0278 0.0011 0.1640 0.0017 0.0043 0.0009 0.0139 0.0007 0.0000 0.0282 0.0000 0.0014 0.0000 0.0132 0.1663 0.6859
SS C OW(10) 0.0404 0.0025 0.0026 0.0493 0.1282 0.0034 0.0137 0.0221 0.0353 0.0667 0.0073 0.0590 0.0049 0.0125 0.0014 0.0191 0.0059 0.0021 0.0104 0.0095 0.3830 0.8792
MVP SS C OW(5) 0.0211 0.0018 0.0017 0.0324 0.0742 0.0128 0.0098 0.0244 0.0188 0.0814 0.0254 0.0426 0.0031 0.0193 0.0086 0.0119 0.0033 0.0015 0.0084 0.0107 0.5085 0.9218
SS C OW(3) 0.0131 0.0013 0.0012 0.0194 0.0476 0.0144 0.0059 0.0182 0.0118 0.0794 0.0421 0.0308 0.0021 0.0162 0.0217 0.0077 0.0020 0.0010 0.0061 0.0090 0.5963 0.9475
SS 0.0180 0.0530 0.1197 0.0496 0.0441 0.0072 0.0264 0.0000 0.0000 0.0000 0.2426 0.0000 0.0067 0.0059 0.0585 0.0353 0.0000 0.0062 0.0000 0.0000 0.1845 0.8577
SS C OW(10) 0.0351 0.0105 0.0093 0.0578 0.0690 0.0273 0.0265 0.0000 0.0000 0.0075 0.3317 0.0001 0.0051 0.0128 0.0661 0.0119 0.0013 0.0020 0.0022 0.0000 0.2477 0.9239
SS C OW(5) 0.0503 0.0061 0.0049 0.0361 0.0888 0.0258 0.0145 0.0000 0.0010 0.0153 0.2435 0.0016 0.0055 0.0283 0.0855 0.0059 0.0007 0.0013 0.0016 0.0000 0.3273 0.9438
SS C OW(3) 0.0356 0.0043 0.0033 0.0235 0.0658 0.0233 0.0094 0.0000 0.0015 0.0297 0.1553 0.0049 0.0045 0.0427 0.1078 0.0066 0.0005 0.0011 0.0011 0.0000 0.4374 0.9584
Panel B: Emerging markets MSR Portfolio Argentina Brazil Chile Greece Indonesia Korea, S. Malaysia Mexico Philippines Portugal Thailand Turkey Taiwan Sum
MVP
SS
SS C OW(10)
SS C OW(5)
SS C OW(3)
SS
SS C OW(10)
SS C OW(5)
SS C OW(3)
0.0011 0.0055 0.1049 0.0065 0.0044 0.0600 0.0372 0.0640 0.0241 0.0000 0.0010 0.0049 0.0006 0.3141
0.0019 0.0158 0.0112 0.0060 0.0034 0.0241 0.0232 0.0216 0.0030 0.0003 0.0071 0.0026 0.0006 0.1208
0.0010 0.0138 0.0058 0.0039 0.0020 0.0148 0.0145 0.0122 0.0015 0.0008 0.0045 0.0018 0.0016 0.0782
0.0006 0.0093 0.0035 0.0029 0.0013 0.0095 0.0099 0.0074 0.0009 0.0008 0.0029 0.0013 0.0023 0.0525
0.0026 0.0004 0.0290 0.0008 0.0017 0.0249 0.0347 0.0031 0.0071 0.0216 0.0000 0.0067 0.0096 0.1423
0.0015 0.0008 0.0094 0.0061 0.0016 0.0256 0.0120 0.0015 0.0014 0.0044 0.0001 0.0042 0.0077 0.0761
0.0010 0.0006 0.0054 0.0062 0.0015 0.0190 0.0072 0.0007 0.0008 0.0029 0.0003 0.0023 0.0084 0.0562
0.0006 0.0005 0.0040 0.0062 0.0010 0.0120 0.0054 0.0005 0.0006 0.0020 0.0005 0.0015 0.0069 0.0416
The average weights of the MSR portfolio and the MVP with various investment constraints for each country over the sample period are reported.
market represented only about 0.35% of world capitalization, the average weight is 10.49%. The extreme weights also occur in Austria, Finland, Ireland, South Korea, Mexico, and the Philippines. As the overweighting investment constraints are included and become more restrictive, the percentage of the portfolios on the assets in developing countries decreases. An overwhelming amount of investments also can be found in small-cap areas, such as Latin America, Northern Europe, Southern Europe, and Oceania when only shortselling is considered. On the other hand, the weights for the areas of large market value, such as North America
and Central/Western Europe are zero for a number of periods. The more restrictive OW constraints “enforce” portfolio weighting to distribute to other second-best mean-varianceefficient markets of large capitalization. The weightings of those more constrained portfolios are more balanced than the ones of the short-sale-forbidden portfolios. The percentage of the non-zero-weight month in the testing period is reported in Table 13.5. On the whole, assets in the developed countries were more likely to be selected in the optimal portfolios with restrictive upper bounds. When only short-sale was not allowed, three national
13 International Portfolio Management: Theory and Method
231
Table 13.5 Percentage of non-zero portfolio weight Panel A: Developed countries MSR portfolio
MVP SS C OW(5)
Australia Austria Belgium Canada Switzerland Germany Denmark Spain Finland France U.K. Hong Kong Ireland Italy Japan Netherlands Norway New Zealand Singapore Sweden U.S.A. Developed Countries
SS 2:56 14:74 7:05 4:49 42:95 0:64 17:31 1:92 51:28 1:92 5:77 5:13 10:26 0:64 0:00 25:00 0:64 2:56 0:00 8:97 43:59 95:51
SS C OW(10) 24:36 16:67 23:08 21:15 60:26 7:05 43:59 18:59 58:97 19:87 10:26 35:26 18:59 12:82 2:56 35:90 19:23 21:79 25:64 14:74 72:44 100:00
(%) 25:00 24:36 29:49 25:00 64:10 17:95 59:62 26:92 60:90 31:41 23:72 42:31 23:08 24:36 4:49 44:23 21:15 31:41 37:18 28:85 82:05 100:00
SS C OW(5) SS C OW(3) 25:64 29:49 33:33 25:00 67:95 23:08 57:69 32:05 63:46 44:87 33:97 51:28 26:28 28:21 8:33 46:15 21:79 35:26 44:87 39:74 85:26 100:00
SS SS C OW(10) 43:59 58:33 46:79 71:15 73:72 78:21 29:49 30:77 36:54 71:15 11:54 25:00 58:33 81:41 0:00 0:00 0:00 0:64 0:00 13:46 56:41 64:10 0:00 2:56 17:31 21:15 23:08 50:00 70:51 78:21 21:15 21:15 0:00 4:49 19:23 20:51 0:00 6:41 0:00 0:64 67:95 75:00 100:00 100:00
(%) 64:74 82:05 82:05 30:77 89:74 29:49 83:33 0:00 8:33 22:44 65:38 8:97 43:59 80:77 84:62 21:15 5:77 26:92 7:05 0:00 100:00 100:00
SS C OW(3) 73:08 96:15 94:87 30:77 92:31 35:90 91:67 0:64 8:33 39:10 67:31 20:51 56:41 91:03 92:95 41:67 6:41 36:54 9:62 0:64 100:00 100:00
Panel B: Emerging markets MSR portfolio
MVP SS C OW(5)
SS Argentina Brazil Chile Greece Indonesia Korea, S. Malaysia Mexico Philippines Portugal Thailand Turkey Taiwan Emerging Markets
5:77 18:59 30:13 13:46 7:69 13:46 14:10 25:64 30:77 0:00 1:28 12:18 1:92 67:31
SS C OW(10) 25:64 43:59 51:28 24:36 25:64 28:21 45:51 49:36 36:54 3:85 35:90 19:23 2:56 86:54
SS C OW(5)
(%)
SS C OW(3)
26:28 54:49 51:92 26:92 30:13 35:26 54:49 52:56 36:54 16:03 43:59 24:36 7:05 91:03
28:85 55:77 52:56 32:69 31:41 34:62 59:62 53:21 37:82 26:28 47:44 28:85 12:18 92:95
SS 22:44 5:13 44:87 7:69 14:10 44:87 23:08 8:97 13:46 32:05 0:00 28:21 33:97 98:08
SS C OW(10) 22:44 8:33 47:44 36:54 18:59 43:59 25:64 7:69 16:67 46:15 1:92 31:41 39:74 93:59
(%) 28:85 7:05 50:00 51:28 22:44 43:59 30:77 9:62 19:23 58:33 4:49 31:41 51:92 94:23
SS C OW(3) 32:69 5:77 60:26 73:72 25:00 44:23 39:10 8:33 23:08 69:23 8:97 33:97 51:92 99:36
This table reports the percentages of the months of non-zero portfolio weight for the MSR portfolio and the MVP with various investment constraints for each country during the sample period.
indices (Japan, Singapore, and Portugal) were never selected in the MSR portfolio, and eight national indices (Spain, Finland, France, Hong Kong, Norway, Singapore, Sweden, and Thailand) were never selected in the MVP. Furthermore, only five markets (Switzerland, Finland, U.S.A., Chile, and Portugal) were included in more than 30% of the sample period. As the overweighting investments are increasingly constrained, the securities in any group of countries were more frequently included in the MSR portfolio as well as the
MVP. In the most restrictive case, securities in 22 markets were included in the MSR portfolio and 23 markets in the MVP for more than 30% of the sample period, respectively. Figure 13.4 shows the numbers of national indices selected in the optimal portfolios over the sample period. This cross-strategy comparison supports the essentialness of the upper bounds of portfolio weighting. The timeseries average of market indices for the no-short-selling MSR portfolio is 4.2 and implies, overall, that more than
232
W.-J.P. Chiou and C.-F. Lee
Panel A: MSR Portfolio
Panel B: MVP
Fig. 13.4 Number of selected national indices in the optimal portfolios
80% of the international portfolios are redundant in the same month. The inclusion of overweighting constraints expands the coverage of the optimal portfolios to 9:6 ŒSS C OW.10/ ; 11:8 ŒSS C OW.5/ , and 13:3 ŒSS C OW.3/ . Although a certain portion of Sharpe ratio benefits are lost due to compulsory diversifications, those strategies also increase the invariance of weighting and benefits, while at the same time expanding the assets chosen in the optimal portfolios. As the upper bounds are increasingly constrained, the variations in the elements of the optimal portfolio diminish. A similar conclusion can also be found in the weighting of the MVP.
13.7 Conclusion This chapter adds to the current literature by investigating the impact of weighting bounds on the benefits and asset allocation of the globally diversified portfolio. The empirical results suggest that boundaries of weighting, such as shortsale and overweighting, decrease but not completely eliminate the benefits of global investment. Domestic investors in emerging markets, particularly in East Asia and Latin America, benefit more from international diversification. This finding holds even though the global financial market has become increasingly integrated. In addition, more restrictive
13 International Portfolio Management: Theory and Method
investment constraints enhance the feasibility of the optimal strategies. Specifically, some appealing traits for asset management transpire: a reduction in the temporal deviation of diversification benefits, a decrease in time-variation of components in optimal portfolio, and an expansion in the range of comprising assets. The findings to the issues discussed in this chapter provide useful insights for international asset management professionals. Future research of the benefits of global diversification strategies may apply dynamic asset pricing theory and take into account the demands of hedging market exposure and exchange rate. Chang et al. (2005) examine the demands of market risk hedge and currency exposure hedge in international asset pricing. Chan et al. (1999) and Fleming et al. (2001) confirm the economic value in modeling a variancecovariance matrix when the efficient frontier is formed. Bekaert and Harvey (1995) and Bekaert et al. (2005) document the time-variation of the integration of international financial markets. The purpose of this chapter, however, is to investigate the time-varying international diversification benefits and their changes caused by various investment constraints over the long term. Due to the time-variation in expected return, volatility, and correlation among the international assets, application of conditional or dynamic asset pricing models to parameterize the optimal international diversification is helpful to professionals of asset management.
References Bae, K., K. Chan, and A. Ng. 2004. “Investibility and return volatility.” Journal of Financial Economics 71, 239–263. Beck, T., A. Demirgüç-Kunt, and R. Levine. 2003. “Law, endowments, and finance.” Journal of Financial Economics 70, 137–181. Bekaert, G. and C. R. Harvey. 1995. “Time-varying world market integration.” Journal of Finance 50, 403–444. Bekaert, G. and C. R. Harvey. 2003. “Emerging markets finance.” Journal of Empirical Finance 10, 3–56. Bekaert, G., C. Harvey, and A. Ng. 2005. “Market integration and contagion.” Journal of Business 78, 39–70. Bekaert, G. and M. Urias. 1996. “Diversification, integration and emerging market closed-end funds.” Journal of Finance 51, 835–869. Bekaert, G., R. Hodrick, and X. Zhang. 2005. International stock return comovements, NBER working paper, Columbia Business School, Columbia. Chan, L. K., J. Karceski, and J. Lakonishok. 1999. “On portfolio optimization: forecasting covariances and choosing the risk model.” Review of Financial Studies 12, 937–974. Chang, J., V. Errunza, K. Hogan, and M. Hung. 2005. “An intertemporal international asset pricing model: theory and empirical evidence.” European Financial Management 11, 218–233. Chiou, W. 2008. “Who benefits more from international diversification?” Journal of International Financial Markets, Institutions, and Money 18, 466–482.
233 Chiou, W., A. C. Lee, and C. Chang. 2009. “Do investors still benefit from international diversification with investment constraints?” Quarterly Review of Economics and Finance 49(2), 448–483. Cosset, J. and J. Suret. 1995. “Political risk and the benefit of international portfolio diversification.” Journal of International Business Studies 26, 3013–18. De Jong, F. and F. A. De Roon. 2005. “Time-varying market integration and expected returns in emerging markets.” Journal of Financial Economics 78, 583–613. Demirgüç-Kunt, A. and V. Maksimovic. 1998. “Law, finance, and firm growth.” Journal of Finance 53, 2107–2137. De Roon, F. A., T. E. Nijman, and B. J. Werker. 2001. “Testing for mean-variance spanning with short sales constraints and transaction costs: the case of emerging markets.” Journal of Finance 56, 721–742. De Santis, G. and B. Gerard. 1997. “International asset pricing and portfolio diversification with time-varying risk.” Journal of Finance 52, 1881–1912. Djankov, S., R. La Porta, F. Lopez-de-Silanes, and A. Shleifer. 2003. “Courts.” Quarterly Journal of Economics 118, 453–518. Djankov, S., R. La Porta, F. Lopez-de-Silanes, and A. Shleifer. 2008. “The law and economics of self-dealing.” Journal of Financial Economics 88, 430–465. Driessen, J. and L. Laeven. 2007. “International portfolio diversification benefits: cross-country evidence from a local perspective.” Journal of Banking and Finance 31, 1693–1712. Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2007. Modern portfolio theory and investment analysis. Wiley, NJ. Errunza, V., K. Hogan, and M. Hung. 1999. “Can the gains from international diversification be achieved without trading abroad?” Journal of Finance 54, 2075–2107. Errunza, V., E. Losq, and P. Padmanabhan. 1992. “Tests of integration, mild segmentation and segmentation hypotheses.” Journal of Banking and Finance 16, 949–972. Fleming, J., C. Kirby, and B. Ostdiek. 2001. “The economic value of volatility timing.” Journal of Finance 56, 329–352. Fletcher, J. and A. Marshall. 2005. “An empirical examination of the benefits of international diversification.” Journal of International Financial Markets, Institutions and Money 15, 455–468. French, K. R. and J. M. Poterba. 1991. “Investor diversification and international equity markets.” American Economic Review 81, 222–226. Green, R. C. and B. Hollifield. 1992. “When will mean-variance efficient portfolios be well diversified?” Journal of Finance 47, 1785–1809. Harvey, C. 1995. “Predictable risk and returns in emerging markets.” Review of Financial Studies 8, 773–816. Jagannathan, R. and T. S. Ma. 2003. “Risk reduction in large portfolios: why imposing the wrong constraints helps.” Journal of Finance 58, 1651–1683. La Porta, R., F. Lopez-de-Silanes, A. Shleifer, and R. Vishny. 1998. “Law and finance.” Journal of Political Economy 106, 1113–1155. La Porta, R., F. Lopez-de-Silanes, A. Shleifer, and R. Vishny. 2000. “Investor protection and corporate governance.” Journal of Financial Economics 58, 3–27. Li, K., A. Sarkar, and Z. Wang. 2003. “Diversification benefits of emerging markets subject to portfolio constraints.” Journal of Empirical Finance 10, 57–80. Markowitz, H. 1952. “Portfolio selection.” Journal of Finance 7, 77–91. Novomestky, F. 1997. “A dynamic, globally diversified, index neutral synthetic asset allocation strategy.” Management Science 43, 998–1016.
234 Obstfeld, M. 1994. “Risk-taking, global diversification, and growth.” American Economic Review 84, 1310–1329. Pástor, L. and R. Stambaugh. 2000. “Comparing asset pricing models: an investment perspective.” Journal of Financial Economics 56, 335–381.
W.-J.P. Chiou and C.-F. Lee Stulz, R. M. and R. Williamson. 2003. “Culture, openness, and finance.” Journal of Financial Economics 70, 313–349. Wang, Z. 1998. “Efficiency loss and constraints on portfolio holdings.” Journal of Financial Economics 48, 359–375.
Chapter 14
The Le Chatelier Principle in the Markowitz Quadratic Programming Investment Model: A Case of World Equity Fund Market Chin W. Yang, Ken Hung, and Jing Cui
Abstract Due to limited numbers of reliable international equity funds, the Markowitz investment model is ideal in constructing an international portfolio. Overinvestment in one or several fast-growing markets can be disastrous as political instability and exchange rate fluctuations reign supreme. We apply the Le Châtelier principle to the international equity fund market with a set of upper limits. Tracing out a set of efficient frontiers, we inspect the shifting phenomenon in the mean–variance space. The optimum investment policy can be easily implemented and risk minimized. Keywords Markowitz quadratic programming model r Le Châtelier Principle r International equity fund and added constraints
14.1 Introduction As the world is bubbling in the cauldron of globalization, investment in foreign countries ought to be considered as part of an optimum portfolio. Needless to say, returns from such investment can be astronomically high as was evidenced by several international equity funds in the period of the Asian flu. On the flop side, however, it may become disastrous as political system and exchange rate markets undergo structural changes. The last decade has witnessed a substantial increase in international investments partly due to “hot money” from OPEC, the People’s Republic of China, and various types of quantum funds, many of which are from the U.S. From the report of Morgan Stanley Capital International Perspectives, North America accounted for 51.6% of the
K. Hung () Texas A&M International University, Laredo, TX, USA e-mail:
[email protected] C.W. Yang and J. Cui Clarion University of Pennsylvania, Clarion, PA, USA
world equity market. Most recently, however, the European Union has begun to catch up with the U.S. Along with booming Asian economies, Russia, India and Brazil are making headway into the world economic stage. In addition, the Japanese economy has finally escaped from its decade-long depression. Obviously, the opportunities in terms of gaining security values are much greater in the presence of international equity markets. Unlike the high correlation between domestic stock markets (e.g., 0.95 between the New York Stock Exchange and the S&P index of 425 large stocks), it is rather low between international markets. For example, correlations between the stock indexes of the U.S. and Australia, Belgium, Germany, Hong Kong, Italy, Japan and Switzerland were found to be 0.505, 0.504, 0.489, 0.491, 0.301, 0.348 and 0.523, respectively (Elton and Gruber 2007) In effect, the average correlation between national stock indexes is 0.54. The same can be said of bonds and Treasury bills. The relatively low correlation across nations offers an excellent playground for international diversification, which will no doubt reduce the portfolio risk appreciably. An example was given by Elton and Gruber (2007) that 26% of the world portfolio, excluding the U.S. market, in combination with 74% of the U.S. portfolio reduced total minimum risk by 3.7% compared with the risk if investment is made exclusively in the U.S. market. Furthermore, it was found that a modest amount of international diversification can lower risk even in the presence of the large standard deviation in returns. Solnik found that 15 of 17 countries had higher returns on stock indexes than those on the US equity index for the period of 1971– 1985. In the light of the growing trend of globalization and expanding GDP, a full-fledged Markowitz rather than a world CAPM is appropriate to arrive at an optimum international portfolio. It is of greater importance to study this topic for it involves significantly more dollar amounts (sometime in billions of dollars). The next section presents data and the Markowitz quadratic programming model. Sect. 14.3 introduces the Le Châtelier principle in terms of maximum investment percentage on each fund. Sect. 14.4 discusses the empirical results. A conclusion is given in Sect. 14.5.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_14,
235
236
C.W. Yang et al.
14.2 Data and Methodology Monthly price data are used from March 2006 to March 2007 for ten international equity markets or 130 index prices. Ust 1 / , we obtain 120 index returns from ing return rt D .PtPP t 1 which 10 expected return ri .I D 1; 2 : : : 10/ and 45 return covariances ¢ij and 10 return variances are obtained. In addition, we assume future index return will not deviate from expected returns. In his pioneering work, Markowitz (1952, 1956, 1959, and 1991) proposed the following convex quadratic programming investment model. Min xi ; xj v D
X
xi2 ii C
i 2I
XX
xi xj ij
(14.1)
i 2I j 2J
subject to X
ri xi k
(14.2)
xi D 1
(14.3)
xi 0 8i 2 I
(14.4)
i 2I
X i 2I
where xi D proportion of investment in national equity fund i ii D variance of rate of return of equity fund i ij D covariance of rate of return of equities i and j ri D expected rate of return of equity fund i k D target rate of return of the portfolio i and j are sets of positive integers Subscripts i D 1; 2; 3 : : : : : : 10 denote ten equity funds: Fidelity European Growth Fund, ED America Fund, Australia Fund, Asean Fund, Japan Fund, Latin America fund, Nordic Fund, Iberia Fund, Korea Fund, United Kingdom Fund, respectively. Note that exchange rate is not considered here, as some studies suggest the exchange rate market is fairly independent of the domestic equity market (Elton and Gruber 2007, pp. 266–267). Moreover, rates of returns on equities are percentages, which are measurement free regardless of which local currencies are used. Markowitz model is the very foundation of modern investment with quite a few software available to solve Equations (14.1) through (14.4). A standard quadratic programming does not take advantage of the special feature in the Markowitz model and as such the dimensionality problem sets limits to solving a large-scale portfolio. One way to circumvent this limitation is to reduce the Markowitz model to linear programming framework via the Kuhn-Tucker conditions. The famed duality theorem that the objective function of the primal (maximization) problem is bounded by that of the dual (minimization) problem lends support to solving
quadratic programming as linear programming with ease. We employ LINDO (Linear, Interactive, and Discrete Optimizer) developed by LINDO systems Inc. (2000). The solution in test runs is identical to that solved by other nonlinear programming software, such as LINGO. We present the results in Section 14.4.
14.3 The Le Châtelier Principle in the Markowitz Investment Model When additional constraints are added to a closed system, the response of variables to such perturbation become more “rigid” and may very well increase the total investment risk, since decision variables have fewer choices. If the original solution set remains unchanged as additional constraints are added, it is a case of envelope theorem in differential calculus (Samuelson 1960 and 1972; Silberberg 1971 and 1974; Labys and Yang 1996; Silberberg and Suen 2000). In economics terms, it is often said that long-run demand is more elastic than that of a short run. In reality, however, the decision variables are expected to change as constraints are added. For instance, optimum investment proportions are bound to vary as the Investment Company Act is enacted. In particular, the result by Loviscek and Yang (1997) indicates the loss in efficiency is 1–2 percentage points due to the Investment Company Act, which translates into millions of dollars loss in daily return. The guidelines of the Central Securities Administrators Council declare that no more than 5% of a mutual fund’s assets may be invested in the securities of that issuers. This 5% rule extends to investment in companies in business less than 3 years, warrants options, and futures. It is to be pointed out that the Investment Company Act applies to domestic mutual funds. In the case of world equity markets, the magnitudes are greater and one percentage point may translate into billions of dollars. To evaluate the theoretical impact in terms of the Le Châtelier principle, we formulate the Lagrange Equation from (14.1) to (14.4).
X X xi (14.5) LDvC k ri xij C 1 Where and are Lagrange multiplies indicating change in the risk in response to change in right hand side constraints. In particular assumes important economic meaning: change in total portfolio risk in response to an infinitesimally small change in k while all other decision variables adjust to their new equilibrium levels, that is, D dv=dk. Hence, the Lagrange multiplier is of pivotal importance in determining the shape of the efficient frontier in the model. In the unconstrained Markowitz model the optimum investment proportion in stock (or equity fund) i is a free variable between 0 and 1. In the international equity
14 The Le Chatelier Principle in the Markowitz Quadratic Programming Investment Model
market, however, diversification is more important owing to the fact that (1) correlations are low and (2) political turmoil can jeopardize the portfolio greatly if one put most of his/her eggs in one basket. As we impose the maximum investment proportion on each security from 99 to 1%, the solution to the portfolio selection model becomes more restricted (i.e., the values of optimum investment proportion are bounded within a narrower range as the constraint is tightened). Such as impact on the objective function v is equivalent to the following: as the system is gradually constrained, the limited freedom of optimum X gives rise to a higher and higher risk level as k is increased. This is to say, if parameter k is increased gradually, the Le Châtelier principle implies that in the original Markowitz minimization system, isorisk contour has the smallest curvature to accommodate the most efficient adjustment mechanism as shown below: ˇ 2 ˇ ˇ 2 ˇ ˇ 2 ˇ ˇ @ v ˇ ˇ@ v ˇ ˇ@ v ˇ ˇ ˇ ˇ ˇ ˇ ˇ (14.6) ˇ @k 2 ˇ ˇ @k 2 ˇ ˇ @k 2 ˇ where v and v are the objective function (total portfolio risk) corresponding to the additional constrains of xi s and xi s for all i and s > s represent different investment proportions allowed under V and V and abs denotes absolute value. Using the envelope theorem (Dixit 1990), it follows immediately that @ fL.xi ; k/ D v.xi .k//g d fL.xi .k/; k/ D v.xi .k//g D dk @k D jxi D xi .k/ (14.7) as such, Equation (14.6) can be mode to ensure the following inequalities: ˇ ˇ ˇ ˇ ˇ ˇ ˇ @ ˇ ˇ @ ˇ ˇ @ ˇ ˇ ˇˇ ˇ ˇ ˇ (14.8) ˇ @k ˇ ˇ @k ˇ ˇ @k ˇ Table 14.1 The Markowitz investment model without upper limits Expected rate 11% 12% 13% 14% of return Objective function value
237
Well-known in the convex analysis, the Lagrange multiplier œ is the reciprocal of the slope of the efficient frontier curve frequently drawn in investment textbooks. Hence, in the mean–covariance space the original Markowitz efficient frontier has the steepest slope for a given set of xi ’s. Care must be exercised that the efficient frontier curve of the Markowitz minimization system has a vertical segment corresponding to a range of low ks and a constant v. Only within this range do the values of optimum xs remain unchanged under various degrees of maximum limit imposed. Within this range constraint Equation (14.2) is not active. That is the Lagrange multiplier is zero. As a result, equality relation holds for Equation (14.8). Outside this range, the slopes of the efficient frontier curve are different given to the result of Equation (14.8).
14.4 An Application of the Le Châtelier Principle in the World Equity Market Equations (14.1) through (14.4) comprise the Markowitz model without upper limits imposed on stocks. We calibrate the target rate (annualized) k to 34% from 11% with an increment of 1%. At 11%, X2 (ED America Fund) and X3 (Australia Fund) split the portfolio. As k increase X3 dominates while X2 is given less and less weight, which leads to the emergence of X8 (Iberia Fund). At 30%, only X4 (Asean Fund) and X8 (Iberia Fund) remain in the portfolio (Table 14.1). The efficient frontier leftmost curve in Fig 14.1 takes the usual shape in the mean–variance space. As we mandate a 35% maximum limit on each equity fund – the least binding constrain – X1(Fidelity European Growth Fund) X2, X3, X8 and X10 (United Kingdom Fund) take over the portfolio from k D 10% to 15% before X4 drops out 15%
16%
17%
18%
19%
6:78 102 6:82 102 6:95 102 7:16 102 7:39 102 7:64 102 7:91 102 8:20 102 8:58 102
X1 (Fidelity 0 European growth fund)
0
0
0
0
0
0
0
0
X2 (ED America fund)
0.5
0.400806
0.300101
0.219762
0.162278
0.104793
0.047309
0
0
X3 (Australia 0.5 fund)
0.599194
0.699899
0.75939
0.772629
0.785867
0.799106
0.791753
0.699589
X4 (Asean fund)
0
0
0
0
0
0
0
0
0.01296
X5 (Japan fund)
0
0
0
0
0
0
0
0
0 (continued)
238
Table 14.1 (continued) X6 (Latin 0 America fund) X7 (Nordic 0 fund) X8 (Iberia 0 fund) X9 (Korea 0 fund) X10 (United 0 Kingdom fund) Expected rate 20% of return Objective 9:05 102 function value X1 (Fidelity 0 European growth fund) 0 X2 (ED America fund) X3 (Australia 0.634087 fund) X4 (Asean 0.057536 fund) X5 (Japan 0 fund) X6 (Latin 0 America fund) X7 (Nordic 0 fund) X8 (Iberia 0.308377 fund) X9 (Korea 0 fund) X10 (United 0 Kingdom fund)
C.W. Yang et al.
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.020848
0.065094
0.109339
0.153585
0.208247
0.287451
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
21%
22%
23%
24%
25%
26%
27%
28%
9:59 102 0.101866
0.108515
0.115818
0.123775
0.132387
0.141654
0.151575
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.568585
0.503083
0.437581
0.372078
0.306576
0.241074
0.175572
0.110069
0.102111
0.146687
0.191262
0.235838
0.280414
0.324989
0.369565
0.41414
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.329304
0.35023
0.371157
0.392084
0.41301
0.433937
0.454864
0.47579
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
50% 45% 40% 35% Expected return
15% 20% 25% 30% 35% 100% SML
30% 25% 20% 15% 10% 5% 0% 0
0.05
0.1
0.15
0.2 Risk
Fig. 14.1 Efficient frontiers with various upper bounds
0.25
0.3
0.35
Table 14.2 Solutions of the Markowitz model with 35% upper limit Expected rate 9% 10% 11% 12% of return 7:43 102 7:58 102 7:73 102 Objective 7:33 102 function value 0.085217 0.133884 0.110655 0.087426 X1 (Fidelity European growth fund) X2 (ED 0.35 0.35 0.35 0.35 America fund) X3 (Australia 0.35 0.35 0.35 0.35 fund) 0 0 0 0 X4 (Asean fund) X5 (Japan 0 0 0 0 fund) X6 (Latin 0 0 0 0 America fund) X7 (Nordic 0 0 0 0 fund) X8 (Iberia 0 0.029984 0.079497 0.12901 fund) X9 (Korea 0 0 0 0 fund) 0.214783 0.136133 0.109848 0.083564 X10 (United Kingdom fund) Expected rate 20% 21% 22% 23% of return 14% 8:07 102
0.040788
0.35
0.35 0.000716 0 0
0 0.227105 0 0.031391
25%
13% 7:90 102
0.064197
0.35
0.35 0 0 0
0 0.178523 0 0.05728
24%
26%
0.008995
0
0.267461
0
0
0
0.007761
0.35
0.35
0.015784
8:26 102
15%
27%
0
0
0.247432
0
0
0
0.058159
0.35
0.344409
0
8:48 102
16%
28%
0
0
0.289142
0
0
0
0.064676
0.35
0.296182
0
8:80 102
17%
29%
0
0
0.330852
0
0
0
0.071193
0.35
0.247955
0
9:13 102
18%
(continued)
0
0
0.35
0
0
0
0.093635
0.35
0.206365
0
9:48 102
19%
14 The Le Chatelier Principle in the Markowitz Quadratic Programming Investment Model 239
Table 14.2 (continued) 9:87 102 Objective function value 0 X1 (Fidelity European growth fund) X2 (ED 0.170407 America fund) X3 (Australia 0.35 fund) X4 (Asean 0.129594 fund) X5 (Japan 0 fund) X6 (Latin 0 America fund) X7 (Nordic 0 fund) X8 (Iberia 0.35 fund) X9 (Korea 0 fund) X10 (United 0 Kingdom fund) 0.107476
0
0.09849
0.35 0.20151 0 0
0 0.35 0 0
0.102903
0
0.134448
0.35 0.165552 0 0
0 0.35 0 0
0
0
0.35
0
0
0
0.237468
0.35
0.062532
0
0.112404
0
0
0.35
0
0
0
0.273427
0.35
0.026573
0
0.117688
0
0
0.35
0
0
0
0.314597
0.335403
0
0
0.123998
0
0
0.35
0
0.043638
0
0.35
0.256362
0
0
0.139536
0
0
0.35
0
0.162544
0
0.35
0.137456
0
0
0.170337
0
0
0.35
0
0.28145
0
0.35
0.01855
0
0
0.206101
0
0
0.35
0
0.3
0
0.35
0
0
0
0.228593
240 C.W. Yang et al.
Table 14.3 Solutions of the Markowitz model with 30% upper limit Expected rate of return 8% 9% 10% 7:64 102 7:78 102 Objective function value 7:60 102 X1 (Fidelity European 0.133913 0.192955 0.169726 growth fund) X2 (ED America fund) 0.3 0.3 0.3 X3 (Australia fund) 0.3 0.3 0.3 X4 (Asean fund) 0 0 0 X5 (Japan fund) 0 0 0 X6 (Latin America 0 0 0 fund) X7 (Nordic fund) 0 0 0 X8 (Iberia fund) 0 0.007437 0.05695 X9 (Korea fund) 0 0 0 X10 (United Kingdom 0.266087 0.199608 0.173324 fund) Expected rate of return 18% 19% 20% 9:71 102 0.101255 Objective function value 9:33 102 X1 (Fidelity European 0 0 0 growth fund) X2 (ED America fund) 0.289177 0.253218 0.21726 X3 (Australia fund) 0.3 0.3 0.3 X4 (Asean fund) 0.110823 0.146782 0.18274 X5 (Japan fund) 0 0 0 X6 (Latin America 0 0 0 fund) X7 (Nordic fund) 0 0 0 X8 (Iberia fund) 0.3 0.3 0.3 X9 (Korea fund) 0 0 0 X10 (United Kingdom 0 0 0 fund) 12% 8:09 102 0.123268 0.3 0.3 0 0 0 0 0.155977 0 0.120755 22% 0.110586 0 0.145344 0.3 0.254656 0 0 0 0.3 0 0
11% 7:93 102 0.146497 0.3 0.3 0 0 0 0 0.106463 0 0.14704 21% 0.105743 0 0.181302 0.3 0.218698 0 0 0 0.3 0 0
0 0.3 0 0
0.109385 0.3 0.290615 0 0
23% 0.115785 0
0 0.200288 0 0.09668
0.3 0.3 0.004001 0 0
13% 8:26 102 0.099031
0 0.3 0 0
0.059706 0.3 0.3 0 0.040294
24% 0.126251 0
0 0.240643 0 0.074284
0.3 0.3 0.011046 0 0
14% 8:44 102 0.074027
0 0.3 0 0
0.00518 0.3 0.3 0 0.09482
25% 0.139566 0
0 0.280998 0 0.051888
0.3 0.3 0.018091 0 0
15% 8:63 102 0.049023
0 0.3 0 0
0 0.19239 0.3 0 0.20761
26% 0.169119 0
0 0.3 0 0.033198
0.3 0.3 0.040426 0 0
16% 8:83 102 0.026376
0 0.3 0.1 0
0 0 0.3 0 0.3
27% 0.232735
0 0.3 0 0.017805
0.3 0.3 0.076368 0 0
17% 9:05 102 0.005827
14 The Le Chatelier Principle in the Markowitz Quadratic Programming Investment Model 241
Table 14.4 Solutions of the Markowitz model with 25% upper limit Expected rate of return 8% 9% 10% 7:94 102 8:18 102 Objective function value 7:94 102 X1 (Fidelity European 0.25 0.25 0.205568 growth fund) X2 (ED America fund) 0.25 0.25 0.25 X3 (Australia fund) 0.25 0.25 0.25 X4 (Asean fund) 0 0 0 X5 (Japan fund) 0 0 0 X6 (Latin America 0 0 0 fund) X7 (Nordic fund) 0 0 0 X8 (Iberia fund) 0 0 0.083917 X9 (Korea fund) 0 0 0 X10 (United Kingdom 0.25 0.25 0.210515 fund) Expected rate of return 18% 19% 20% 0.10134 0.104717 Objective function value 9:82 102 X1 (Fidelity European 0.02553 0.004981 0 growth fund) X2 (ED America fund) 0.25 0.25 0.25 X3 (Australia fund) 0.25 0.25 0.25 X4 (Asean fund) 0.168138 0.20408 0.237302 X5 (Japan fund) 0 0 0 X6 (Latin America 0 0 0 fund) X7 (Nordic fund) 0 0 0 X8 (Iberia fund) 0.25 0.25 0.25 X9 (Korea fund) 0 0 0 X10 (United Kingdom 0.056332 0.040939 0.012698 fund) 12% 8:50 102 0.157274 0.25 0.25 0.007287 0 0 0 0.173471 0 0.161969 22% 0.124825 0 0.16235 0.25 0.25 0 0.08765 0 0.25 0 0
11% 8:33 102 0.182278 0.25 0.25 0.000242 0 0 0 0.133115 0 0.184365 21% 0.112819 0 0.216876 0.25 0.25 0 0.033124 0 0.25 0 0
0 0.25 0 0
0.107825 0.25 0.25 0 0.142175
23% 0.138059 0
0 0.213826 0 0.139573
0.25 0.25 0.014331 0 0
13% 8:67 102 0.13227
0 0.25 0 0
0.053299 0.25 0.25 0 0.196701
24% 0.152521 0
0 0.25 0 0.117903
0.25 0.25 0.02437 0 0
14% 8:85 102 0.107728
0 0.25 0.01243 0
0 0.23757 0.25 0 0.25
25% 0.170996 0
0 0.25 0 0.10251
0.25 0.25 0.060312 0 0
15% 9:06 102 0.087178
0 0.25 0.25 0
0 0 0.25 0 0.25
26% 0.318692 0
0 0.25 0 0.087117
0.25 0.25 0.096254 0 0
16% 9:28 102 0.066629
0 0.25 0 0.071725
0.25 0.25 0.132196 0 0
17% 9:54 102 0.04608
242 C.W. Yang et al.
Table 14.5 Solutions of the Markowitz model with 20% upper limit Expected rate of return 11% Objective function value 8:91 102 X1 (Fidelity European growth fund) 0.2 X2 (ED America fund) 0.2 X3 (Australia fund) 0.2 X4 (Asean fund) 0 X5 (Japan fund) 0 X6 (Latin America fund) 0 X7 (Nordic fund) 0 X8 (Iberia fund) 0.2 X9 (Korea fund) 0 X10 (United Kingdom fund) 0.2 Expected rate of return 18% Objective function value 0.105156 X1 (Fidelity European growth fund) 0.199003 X2 (ED America fund) 0.2 X3 (Australia fund) 0.2 X4 (Asean fund) 0.2 X5 (Japan fund) 0 X6 (Latin America fund) 0.000997 X7 (Nordic fund) 0 X8 (Iberia fund) 0.2 X9 (Korea fund) 0 X10 (United Kingdom fund) 0 12% 8:94 102 0.196097 0.2 0.2 0.015936 0 0 0 0.187968 0 0.2 19% 0.115271 0.136659 0.2 0.2 0.2 0 0.063341 0 0.2 0 0
13% 9:12 102 0.16853 0.2 0.2 0.044255 0 0 0 0.2 0 0.187215 20% 0.126117 0.074314 0.2 0.2 0.2 0 0.125686 0 0.2 0 0
14% 9:33 102 0.14798 0.2 0.2 0.080197 0 0 0 0.2 0 0.171822 21% 0.137694 0.01197 0.2 0.2 0.2 0 0.18803 0 0.2 0 0
15% 9:56 102 0.127431 0.2 0.2 0.116139 0 0 0 0.2 0 0.156429 22% 0.157136 0 0.131176 0.2 0.2 0 0.2 0 0.2 0.068824 0
16% 9:83 102 0.106882 0.2 0.2 0.152081 0 0 0 0.2 0 0.141037 23% 0.183105 0 0.044055 0.2 0.2 0 0.2 0.003234 0.2 0.152711 0
17% 0.101129 0.086332 0.2 0.2 0.188023 0 0 0 0.2 0 0.125644 24% 0.244151 0 0 0.2 0.2 0 0.2 0 0.2 0.2 0
14 The Le Chatelier Principle in the Markowitz Quadratic Programming Investment Model 243
Table 14.6 Solutions of the Markowitz model with 15% upper limit Expected rate of return 14% 15% Objective function value 0.106891 0.107619 X1 (Fidelity European growth fund) 0.15 0.15 X2 (ED America fund) 0.15 0.15 X3 (Australia fund) 0.15 0.15 X4 (Asean fund) 0.15 0.15 X5 (Japan fund) 0.1 0.000542 X6 (Latin America fund) 0 0 X7 (Nordic fund) 0 0.099458 X8 (Iberia fund) 0.15 0.15 X9 (Korea fund) 0 0 X10 (United Kingdom fund) 0.15 0.15 16% 0.114122 0.15 0.15 0.15 0.15 0 0.07293 0.02707 0.15 0 0.15
17% 0.12277 0.15 0.15 0.15 0.15 0 0.129408 0 0.15 0 0.120592
18% 0.134269 0.15 0.15 0.15 0.15 0 0.15 0.035453 0.15 0.019024 0.045523
19% 0.150141 0.142108 0.15 0.15 0.15 0 0.15 0 0.15 0.107892 0
20% 0.175912 0.056087 0.043913 0.15 0.15 0 0.15 0.15 0.15 0.15 0
21% 0.195432 0.1 0 0.15 0.15 0 0.15 0.15 0.15 0.15 0
244 C.W. Yang et al.
14 The Le Chatelier Principle in the Markowitz Quadratic Programming Investment Model
(Table 14.2). Beyond it, X2, X3, X4 and X8 stay positive till k D 24%. At k D 26% and on X3, X4, X6 (Latin America Fund) and X8 prevail the solution set. The efficient frontier curve is the second leftmost as shown in Fig 14.1, indicating the second most efficient risk – return lotus in the market. As we lower the maximum limit on each equity fund to 30%, again X1, X2, X3, X8 and X10 enter the Markowitz portfolio base with X1, X2, X3, and X8 dominating other funds till k D 22% (Table 14.3). At k D 27%; X4 D 0:3, X6 (Latin America Fund) D 0:3; X8 D 0:3, and X9 (Korea Fund) D 0:1 constitute the solution set. The efficient frontier is next right to that with maximum investment limit set at 35%. When the maximum investment limit is lowered to 25%, the diversification across world equity market is obvious (Table 14.4); X1, X2, X3, X4, X8 and X10 become positive at k D 11% to k D 20%. At k D 21%, X10 is replaced by X6 till k D 24%. At the highest k D 26%, the international equity fund portfolio is equally shared by X4, X6, X8 and X9. Again, the efficient frontier has shifted a bit to the right, a less efficient position. Table 14.5 reports the results when the maximum investment amount is set at 20%. This constraint is clearly so overbearing that six of ten equity funds are positive at k D 12%, through k D 17%. The capital market manifests itself in buying majority of equity funds. As a consequence, the efficient frontier is second to the right most curve. Table 14.6 presents the last simulation in this paper: an imposition of a 15% maximum investment limit on each equity fund. This is the most binding constraint and as such we have eight positive investments at k D 15% and 16%. At its extreme, an equity fund manager is expected to buy every equity fund except X5 at k D 18%. The solution at k D 21%, the optimum portfolio consists of X1 D 0:1; X3 D X4 D X6 D X7 D X8 D X9 D 0:15. The over diversification is so pronounced that its efficient frontiers appears to the right most signaling the least efficient locus in the mean–variance space.
14.5 Conclusion The Le Châtelier principle developed in thermodynamics can be easily applied to international equity fund market in which billions of dollars can change hand at the drop of a hat. We need to point out that considering international capital investment is tantamount to enlarging opportunity set greatly especially in the wake of growing globalization coupled with prospering world economy. In applying the Le Châtelier principle, there is no compelling reason to fix the optimum portfolio solution at a fixed set. As such, optimum portfolio investment amounts (proportions) are most likely to change and lead one to a suboptimality. The essence of the Le Châtelier principle is that over diversification has its
245
own price, an exorbitant one, due to its astronomically high value of volume transacted. The forced equilibrium is far from being an efficient equilibrium as the efficient frontiers keep shifting to the right as the maximum investment proportion is tightened gradually. Is cross-border capital investment worthwhile? On the positive side, the enlarged opportunity will definitely moves solution set to the northwest corner of the mean–variance space, which corresponds to higher indifference curve. On the negative side, over diversification in the international equity market has its expensive cost: shifting efficient frontier to the southeast corner of the mean–variance space. If foreign governments view these capital investments negatively, a stiff tax or transaction cost would slow down such capital movement. Notwithstanding these negative considerations, investment in international equity market is on the rise and will become even more popular in the future.
References Dixit, A. K. 1990. Optimization in economic theory, 2nd ed., Oxford University Press, Oxford. Elton, E. J. and M. J. Gruber. 2007. Modern portfolio theory and investment analysis 7th ed., Wiley, New York. Labys, W. C. and C.-W. Yang. 1996. Le Chatelier principle and the flow sensitivity of spatial commodity models. in Recent advances in spatial equilibrium modeling: methodology and applications, van de Bergh, J. C. J. M., P. Nijkamp and P. Rietveld (Eds.). Springer, Berlin, pp. 96–110. Loviscek, A. L. and C.-W. Yang. 1997. “Assessing the impact of the investment company act’s diversification rule: a portfolio approach.” Midwest Review of Finance and Insurance 11(1), 57–66. Markowitz, H. M. 1952. “Portfolio selection.” The Journal of Finance VII(1), 77–91. Markowitz, H. M. 1956 “The optimization of a quadratic function subject to linear constraints.” Naval Research Logistics Quarterly 3, 111–133. Markowitz, H. 1959. Portfolio selection: Efficient Diversification of Investment, John Wiley and Sons, New York. Markowitz, H. M. 1991. “Foundation of portfolio theory.” Journal of Finance XLVI(2), 469–477. Samuelson, P. A. 1948. Foundations of Economic Analysis, Cambridge, MA: Harvard University Press. Samuelson, P. A. 1960. An extension of the Le Chatelier principle. Econometrica pp. 368–379. Samuelson, P. A. 1972. “Maximum principles in analytical economics.” American Economic Review 62(3), 249–262. Schrage, L. 2000. Linear, integer and quadratic programming with LINDO, Palo Alto, CA: The Scientific Press. Silberberg, E. 1971. “The Le Chatelier principle as a corollary to a generalized envelope theorem.” Journal of Economic Theory 3, 146– 155. Silberberg, E. 1974. “A revision of comparative statics methodology in economics, or how to do economics on the back of an envelope.” Journal of Economic Theory 7, 159–172. Silberberg, E. and W. Suen. 2000. The structure of economics: a mathematical analysis, McGraw-Hill, New York. Value Line, Mutual fund survey New York: Value Line, 19. Yang, C.W., K. Hung, and J. A. Fox. 2006. “The Le Châtelier principle of the capital market equilibrium.” in Encyclopedia of Finance, C. F. Lee and A. C. Lee (Eds.). Springer, New York, pp. 724–728.
Chapter 15
Risk-Averse Portfolio Optimization via Stochastic Dominance Constraints ´ Darinka Dentcheva and Andrzej Ruszczynski
Abstract We consider the problem of constructing a portfolio of finitely many assets whose return rates are described by a discrete joint distribution. We present a new approach to portfolio selection based on stochastic dominance. The portfolio return rate in the new model is required to stochastically dominate a random benchmark. We formulate optimality conditions and duality relations for these models and construct equivalent optimization models with utility functions. Two different formulations of the stochastic dominance constraint: primal and inverse, lead to two dual problems that involve von Neuman–Morgenstern utility functions for the primal formulation and rank dependent (or dual) utility functions for the inverse formulation. The utility functions play the roles of Lagrange multipliers associated with the dominance constraints. In this way our model provides a link between the expected utility theory and the rank dependent utility theory. We also compare our approach to models using value at risk and conditional value at risk constraints. A numerical example illustrates the new approach. Keywords Portfolio optimization r Stochastic dominance r Stochastic order r Risk r Expected utility r Duality r Rank dependent utility r Yaari’s dual utility r Value at risk r Conditional value at risk
15.1 Introduction The problem of optimal portfolio selection is subject of major theoretical and computational studies in finance. A fundamental issue while dealing with uncertain outcomes is a theoretically sound approach to their comparison. The theory of stochastic orders plays a fundamental role in economics (see Mosler and Scarsini 1991; Whitmore and
Findlay 1978). These are relations that induce partial order in the space of real random variables in the following way. A random variable R dominates the random variable Y if EŒu.R/ EŒu.Y / for all functions u./ from certain set of functions, called the generator of the order. The concept of stochastic dominance is very popular and widely used in economics and finance because of its relation to models of risk-averse preferences (Fishburn 1964). It originated from the theory of majorization (Hardy et al. 1934) for the discrete case, and was later extended to general distributions (Quirk and Saposnik 1962; Hadar and Russell 1969; Rotschild and Stiglitz 1969). Stochastic dominance of second order is defined by the set of nondecreasing concave functions: a random variable R dominates another random variable Y in the second order if EŒu.R/ EŒu.Y / for all nondecreasing concave functions u./ for which these expected values are finite. Thus, no risk-averse decision maker will prefer a portfolio with return rate Y over a portfolio with return rate R. A popular approach is the utility optimization approach. Von Neumann and Morgenstern (1944) developed the expected utility theory: for every rational decision maker there exists a utility function u./ such that the decision maker prefers outcome R over outcome Y if and only if EŒu.R/ > EŒu.Y / . This approach can be implemented also very efficiently; however, it is almost impossible to elicit the utility function of a decision maker explicitly. More difficulties arise when a group of decision makers with different utility functions have to reach a consensus. Recently, the dual utility theory (or rank dependent expected utility theory) has attracted much attention in economics. This approach was first presented by Quiggin (1982) and later rediscovered in a special case by Yaari (1987). From a different system of axioms than those of von Neumann and Morgenstern, one derives that every decision maker has a certain rank dependent utility function w W Œ0; 1 ! R. Then a nonnegative outcome R is preferred over a nonnegative outcome Y , if and only if
D. Dentcheva Stevens Institute of Technology, Hoboken, NJ, USA A. Ruszczy´nski () Rutgers University, New Brunswick, NJ, USA e-mail:
[email protected]
Z1
Z1 w.p/dF .1/ .RI p/
0
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_15,
w.p/dF.1/ .Y I p/; 0
(15.1) 247
´ D. Dentcheva and A. Ruszczynski
248
where, F.1/ .RI / is the inverse distribution function of R. For a comprehensive treatment of the rank dependent utility theory, we refer to Quiggin (1993), and for its application in actuarial mathematics, see Wang et al. 1997; Wang and Yong 1998. Another classical approach, pioneered by Markowitz (1952, 1959, 1987), is the mean-risk approach, which compares the portfolios with respect to two characteristics. One is the expected return rate (the mean) and another one is the risk, which is given by some scalar measure of the uncertainty of the portfolio return rate. The mean-risk approach recommends the selection of Pareto-efficient portfolios with respect to these two criteria. In a mean-risk portfolio model we combine these criteria by specifying some parameter as a tradeoff between them. As a parametric optimization problem the mean-risk model can be solved numerically very efficiently, which makes this approach very attractive (Konno and Yamazaki 1991; Ruszczy´nski and Vanderbei 2003). In this paper we formulate a model for risk-averse portfolio optimization and demonstrate its relation to the expected utility approach and to rank dependent utility approach. We optimize the portfolio performance under an additional constraint that the portfolio return rate stochastically dominates a benchmark return rate, for example, the return rate of an index. The model is based on the publications of Dentcheva and Ruszczy´nski (2003a, b, c; 2004a, b) where a new model of risk-averse optimization has been introduced. This approach has a fundamental advantage over mean-risk models and utility function models. All data for our model are readily available. In mean-risk models the choice of the risk measure has an arbitrary character, and it is difficult to argue for one measure against another. Similarly, optimization of expected utility requires the form of the utility function to be specified. Our analysis, departing from the benchmark outcome, generates implied utility function of the decision maker. It is implicitly defined by the benchmark used, and by the problem under consideration. We provide two problem formulations in which the stochastic dominance has a primal or inverse form: a Lorenz curve. The primal form has a dual problem in terms of expected utility functions, and the inverse form has a dual problem in terms of rank dependent utility functions. In this way our model provides a link between this two competing economic approaches. Duality relations with coherent measures of risk are explored in Dentcheva and Ruszczy´nski (2008).
15.2 The Portfolio Problem Let R1 ; R2 ; : : : ; Rn be random return rates of assets 1; 2; : : : ; n. We assume that EŒjRj j < 1 for all j D 1; : : : ; n.
Our aim is to invest our capital in these assets in order to obtain some desirable characteristics of the total return rate on the investment. Denoting by x1 ; x2 ; : : : ; xn the fractions of the initial capital invested in assets 1; 2; : : : ; n we can easily derive the formula for the total return rate: R.x/ D R1 x1 C R2 x2 C : : : C Rn xn :
(15.2)
Clearly, the set of possible asset allocations can be defined as follows:
$$X = \{x \in \mathbb{R}^n : x_1 + x_2 + \cdots + x_n = 1,\; x_j \ge 0,\; j = 1, 2, \ldots, n\}.$$
In some applications one may introduce the possibility of short positions (i.e., allow some $x_j$'s to become negative). Other restrictions may limit the exposure to particular assets or their groups by imposing upper bounds on the $x_j$'s or on their partial sums. One can also limit the absolute differences between the $x_j$'s and some reference investments $\bar{x}_j$, which may represent the existing portfolio, and so on. Our analysis does not depend on the detailed way this set is defined; we only use the fact that it is a convex polyhedron. All modifications discussed above define convex polyhedral feasible sets and are, therefore, covered by our approach.

The main difficulty in formulating a meaningful portfolio optimization problem is the definition of the preference structure among feasible portfolios. If we use only the mean return rate $E[R(x)]$, the resulting optimization problem has a trivial and meaningless solution: invest everything in assets that have the maximum expected return rate. For these reasons, the practice of portfolio optimization usually resorts to two approaches.

In the first approach we associate with portfolio $x$ some dispersion measure $\varrho(R(x))$ representing the variability of the return rate $R(x)$. In the classical Markowitz model the function $\varrho(R(x))$ is the variance of the return rate, $\varrho(R(x)) = \mathbb{V}[R(x)]$, but many other measures are possible here as well. The mean-risk portfolio optimization problem is formulated as follows:
$$\max_{x \in X}\; E[R(x)] - \lambda\,\varrho(R(x)). \tag{15.3}$$
Here $\lambda$ is a nonnegative parameter representing our desirable exchange rate of mean for risk. If $\lambda = 0$, the risk has no value and the problem reduces to maximizing the mean. If $\lambda > 0$, we look for a compromise between the mean and the risk. Alternatively, one can minimize the risk function $\varrho(R(x))$, while fixing the expected return rate $E[R(x)]$ at some value $m$, and consider a family of problems parametrized by $m$. The reader is referred to the book by Elton et al. (2006) for a modern perspective on mean-risk analysis in portfolio theory. The general question of constructing mean-risk models that are in harmony with the stochastic dominance relations has been the subject of analysis in the recent papers by Ogryczak and Ruszczyński (1999, 2001, 2002). We have identified there several primal risk measures, most notably central semideviations, and dual risk measures, based on the Lorenz curve, which are consistent with the stochastic dominance relations.

The second approach is to select a certain utility function $u : \mathbb{R} \to \mathbb{R}$ and to formulate the following optimization problem:
$$\max_{x \in X}\; E[u(R(x))]. \tag{15.4}$$
It is usually required that the function $u(\cdot)$ is concave and nondecreasing, thus representing the preferences of a risk-averse decision maker (Fishburn 1964, 1970). Recently, the dual (rank dependent) utility model has attracted much attention. It is based on distorting the cumulative probability distribution of the random variable $R(x)$, rather than applying a nonlinear function $u(\cdot)$ to the realizations of $R(x)$. The corresponding problem has the following form:
$$\max_{x \in X}\; \int_0^1 F^{(-1)}(R(x);p)\,dw(p). \tag{15.5}$$
Here $F^{(-1)}(R(x);p)$ is the $p$-quantile of the random variable $R(x)$, and $w(\cdot)$ is the rank dependent utility function, which distorts the probability distribution. We discuss this in Sect. 15.3.2. The challenge in both utility approaches is to select the appropriate utility function or rank dependent utility function that represents our preferences and whose application leads to nontrivial and meaningful solutions of Equation (15.4) or (15.5). In this paper we propose an alternative approach: we introduce a comparison to a benchmark return rate into our optimization problem. The comparison is based on the stochastic dominance relation. More specifically, we consider only portfolios whose return rates stochastically dominate a certain benchmark return rate.

15.3 Stochastic Dominance

15.3.1 Direct Forms

In the stochastic dominance approach, random return rates are compared by a point-wise comparison of some performance functions constructed from their distribution functions. For a real random variable $V$, its first performance function is defined as the right-continuous cumulative distribution function of $V$:
$$F_1(V;\eta) = P\{V \le \eta\} \quad \text{for } \eta \in \mathbb{R}.$$
A random return $V$ is said (Lehmann 1955; Quirk and Saposnik 1962) to stochastically dominate another random return $S$ in the first order, denoted $V \succeq_{(1)} S$, if
$$F_1(V;\eta) \le F_1(S;\eta) \quad \text{for all } \eta \in \mathbb{R}.$$
We can say that $V$ is "stochastically larger" than $S$, because it takes values lower than $\eta$ with smaller (or equal) probabilities than $S$, no matter what the target $\eta$ is.

The second performance function $F_2$ is given by the areas below the distribution function $F_1$,
$$F_2(V;\eta) = \int_{-\infty}^{\eta} F_1(V;\alpha)\,d\alpha \quad \text{for } \eta \in \mathbb{R},$$
and defines the weak relation of second order stochastic dominance (SSD). That is, random return $V$ stochastically dominates $S$ in the second order, denoted $V \succeq_{(2)} S$, if
$$F_2(V;\eta) \le F_2(S;\eta) \quad \text{for all } \eta \in \mathbb{R}$$
(see Hadar and Russell 1969; Rothschild and Stiglitz 1969). We can express the function $F_2(V;\eta)$ as the expected shortfall (see, for example, Levy 2006; Ogryczak and Ruszczyński 1999): for each target value $\eta$ we have
$$F_2(V;\eta) = E[(\eta - V)_+], \tag{15.6}$$
where $(\eta - V)_+ = \max(\eta - V, 0)$. The function $F_2(V;\cdot)$ is continuous, convex, nonnegative and nondecreasing. It is well defined for all random variables $V$ with a finite expected value. Due to this representation, the second order stochastic dominance relation $V \succeq_{(2)} S$ can be equivalently characterized by a system of inequalities on the expected shortfall below any target $\eta$:
$$E[(\eta - V)_+] \le E[(\eta - S)_+] \quad \text{for all } \eta \in \mathbb{R}. \tag{15.7}$$
We also obtain an equivalent characterization in terms of the expected utility theory of von Neumann and Morgenstern (see, for example, Hanoch and Levy 1969; Levy 2006; Müller and Stoyan 2002): for any two random variables $V, S$ the relation $V \succeq_{(1)} S$ holds true if and only if for all nondecreasing functions $u(\cdot)$ defined on $\mathbb{R}$ we have
$$E[u(V)] \ge E[u(S)]. \tag{15.8}$$
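For discrete distributions, relation (15.7) is straightforward to verify numerically. The following sketch is our illustration, not part of the original study; the sample data are hypothetical. It tests second order dominance by comparing the expected shortfalls (15.6) at the realizations of both variables, which suffices because both shortfall functions are piecewise linear with kinks only at those points.

    import numpy as np

    def expected_shortfall(sample, probs, eta):
        """E[(eta - X)_+] for a discrete random variable, Equation (15.6)."""
        return np.sum(probs * np.maximum(eta - sample, 0.0))

    def dominates_second_order(v, pv, s, ps):
        """Check V >=_(2) S via (15.7), testing all kink points."""
        targets = np.union1d(v, s)
        return all(expected_shortfall(v, pv, t) <= expected_shortfall(s, ps, t)
                   for t in targets)

    # Hypothetical return samples with equally weighted scenarios.
    v = np.array([-0.02, 0.01, 0.03, 0.05]); pv = np.full(4, 0.25)
    s = np.array([-0.05, 0.00, 0.03, 0.06]); ps = np.full(4, 0.25)
    print(dominates_second_order(v, pv, s, ps))   # True for these data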
For any two random variables $V, S$ with finite expectations, the relation $V \succeq_{(2)} S$ holds true if and only if Equation (15.8) is satisfied for all nondecreasing concave functions $u(\cdot)$.

In the context of portfolio optimization, we consider stochastic dominance relations between random return rates defined by Equation (15.2). Thus, we say that portfolio $x$ dominates portfolio $y$ in the first order, if
$$F_1(R(x);\eta) \le F_1(R(y);\eta) \quad \text{for all } \eta \in \mathbb{R}.$$
This is illustrated in Fig. 15.1.

Fig. 15.1 First order stochastic dominance $R(x) \succeq_{(1)} R(y)$: the distribution function $F_1(R(x);\eta)$ lies below $F_1(R(y);\eta)$

Similarly, we say that $x$ dominates $y$ in the second order ($R(x) \succeq_{(2)} R(y)$), if
$$F_2(R(x);\eta) \le F_2(R(y);\eta) \quad \text{for all } \eta \in \mathbb{R}.$$
The second order relation is illustrated in Fig. 15.2. Recall that the individual return rates $R_j$ have finite expected values, and thus the function $F_2(R(x);\cdot)$ is well defined.

Fig. 15.2 Second order dominance $R(x) \succeq_{(2)} R(y)$: the shortfall function $F_2(R(x);\eta)$ lies below $F_2(R(y);\eta)$

15.3.2 Inverse Forms

Let us consider the inverse model of stochastic dominance, frequently referred to as Lorenz dominance. For a real random variable $V$ (for example, a random return rate) we define the left-continuous inverse of the cumulative distribution function $F_1(V;\cdot)$ as follows:
$$F^{(-1)}(V;p) = \inf\{\eta : F_1(V;\eta) \ge p\} \quad \text{for } 0 < p < 1.$$
Given $p \in (0,1)$, the number $q = q(V;p)$ is called a $p$-quantile of the random variable $V$ if
$$P\{V < q\} \le p \le P\{V \le q\}.$$
For $p \in (0,1)$ the set of $p$-quantiles is a closed interval and $F^{(-1)}(V;p)$ represents its left end. Directly from the definition of the first order dominance we see that
$$V \succeq_{(1)} S \iff F^{(-1)}(V;p) \ge F^{(-1)}(S;p) \quad \text{for all } 0 < p < 1. \tag{15.9}$$
The first order dominance constraint can be interpreted as a continuum of probabilistic (chance) constraints, studied in stochastic optimization (see Dentcheva 2005; Prékopa 2003).

Our analysis uses the absolute Lorenz function $F^{(-2)}(V;\cdot) : [0,1] \to \mathbb{R}$, introduced in Lorenz (1905). It is defined as the cumulative quantile:
$$F^{(-2)}(V;p) = \int_0^p F^{(-1)}(V;t)\,dt \quad \text{for } 0 < p \le 1, \tag{15.10}$$
with $F^{(-2)}(V;0) = 0$. Similarly to $F_2(V;\cdot)$, the function $F^{(-2)}(V;\cdot)$ is well defined for any random variable $V$ that has a finite expected value. We notice that
$$F^{(-2)}(V;1) = \int_0^1 F^{(-1)}(V;t)\,dt = E[V].$$
By construction, the Lorenz function is convex. Lorenz functions are commonly used for inequality ordering of positive random variables, relative to their (positive) expectations (see Gastwirth 1971; Muliere and Scarsini 1989). Such a Lorenz function, $p \mapsto F^{(-2)}(V;p)/E[V]$, is convex and nondecreasing. The absolute Lorenz function, however, is not monotone when negative outcomes occur.
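The absolute Lorenz function of an equally weighted sample is easy to compute exactly, since it is piecewise linear between the break points $t/T$. The sketch below is ours, with hypothetical data; it implements Equation (15.10) this way.

    import numpy as np

    def absolute_lorenz(sample, p):
        """Absolute Lorenz function F^(-2)(V;p) of Equation (15.10) for an
        equally weighted sample: the cumulative integral of the quantile
        function, computed exactly from the sorted realizations."""
        v = np.sort(np.asarray(sample, dtype=float))
        T = len(v)
        cum = np.concatenate(([0.0], np.cumsum(v) / T))  # values at p = t/T
        # F^(-2) is piecewise linear between the break points t/T.
        return np.interp(p, np.arange(T + 1) / T, cum)

    v = [-0.03, 0.01, 0.02, 0.04]        # hypothetical weekly returns
    print(absolute_lorenz(v, 1.0))       # equals the sample mean E[V] = 0.01
    print(absolute_lorenz(v, 0.5))       # cumulative quantile up to p = 0.5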
It is well known (see, for example, Ogryczak and Ruszczyński 2002) that we may fully characterize the second order dominance relation by using the function $F^{(-2)}(V;\cdot)$:
$$V \succeq_{(2)} S \iff F^{(-2)}(V;p) \ge F^{(-2)}(S;p) \quad \text{for all } 0 \le p \le 1. \tag{15.11}$$
This characterization of stochastic dominance by Lorenz functions is widely used in economics and statistics. We now provide an equivalent characterization by rank dependent utility functions. It is analogous to the characterization by expected utility functions. Dentcheva and Ruszczyński (2006b) provide the following characterization: for any two random variables $V, S$ the relation $V \succeq_{(1)} S$ holds true if and only if for all nondecreasing functions $w(\cdot)$ defined on $[0,1]$ we have
$$\int_0^1 F^{(-1)}(V;p)\,dw(p) \ge \int_0^1 F^{(-1)}(S;p)\,dw(p). \tag{15.12}$$
For any two random variables $V, S$ with finite expectations, the relation $V \succeq_{(2)} S$ holds true if and only if Equation (15.12) is satisfied for all nondecreasing concave functions $w(\cdot)$. The functions $w(\cdot)$ appearing in this characterization are rank dependent (dual) utility functions.

In the context of portfolio optimization, we consider stochastic dominance relations between random return rates defined by Equation (15.2). Thus, we say that portfolio $x$ dominates portfolio $y$ in the first order, if
$$F^{(-1)}(R(x);p) \ge F^{(-1)}(R(y);p) \quad \text{for all } p \in (0,1).$$
Similarly, we say that $x$ dominates $y$ in the second order ($R(x) \succeq_{(2)} R(y)$), if
$$F^{(-2)}(R(x);p) \ge F^{(-2)}(R(y);p) \quad \text{for all } p \in [0,1]. \tag{15.13}$$
Recall that the individual return rates $R_j$ have finite expected values, and thus the function $F^{(-2)}(R(x);\cdot)$ is well defined. The second order relation is illustrated in Fig. 15.4.

Fig. 15.4 Second order dominance $R(x) \succeq_{(2)} R(y)$ in the inverse form: the absolute Lorenz function $F^{(-2)}(R(x);p)$ lies above $F^{(-2)}(R(y);p)$
15.3.3 Relations to Value at Risk and Conditional Value at Risk

There are fundamental relations between the concepts of Value at Risk (VaR) and Conditional Value at Risk (CVaR) and the stochastic dominance constraints. The VaR constraint in the portfolio context is formulated as follows. We define the loss rate $L(x) = -R(x)$. We specify the maximum fraction $\omega_p$ of the initial capital allowed for risk exposure at risk level $p \in (0,1)$, and we require that
$$P[L(x) \le \omega_p] \ge 1 - p.$$
This is illustrated in Fig. 15.3. Denoting by $\mathrm{VaR}_p(L(x))$ the left $(1-p)$-quantile of the random variable $L(x)$, we can equivalently formulate the VaR constraint as
$$\mathrm{VaR}_p(L(x)) \le \omega_p.$$
The first order stochastic dominance relation between two portfolios is equivalent to a continuum of VaR constraints. Portfolio $x$ dominates portfolio $y$ in the first order, if
$$\mathrm{VaR}_p(L(x)) \le \mathrm{VaR}_p(L(y)) \quad \text{for all } p \in (0,1).$$

Fig. 15.3 First order stochastic dominance $R(x) \succeq_{(1)} R(y)$ in the inverse form: the quantile function $F^{(-1)}(R(x);p)$ lies above $F^{(-1)}(R(y);p)$

The CVaR at level $p$, roughly speaking, has the following form:
$$\mathrm{CVaR}_p(L(x)) = E[L(x) \mid L(x) \ge \mathrm{VaR}_p(L(x))].$$
This formula is precise if $\mathrm{VaR}_p(L(x))$ is not an atom of the distribution of $L(x)$. More precisely, we express it as follows:
$$\mathrm{CVaR}_p(L(x)) = \frac{1}{p}\int_0^p \mathrm{VaR}_t(L(x))\,dt.$$
We note that
$$\mathrm{CVaR}_p(L(x)) = -\frac{1}{p}\,F^{(-2)}(R(x);p). \tag{15.14}$$
Another description uses extremal properties of quantiles and equivalently represents CVaR as follows (Rockafellar and Uryasev 2000):
$$\mathrm{CVaR}_p(L(x)) = \inf_{\eta \in \mathbb{R}}\Big\{-\eta + \frac{1}{p}\,E[(\eta - R(x))_+]\Big\}. \tag{15.15}$$
A CVaR constraint on the portfolio $x$ can be formulated as follows:
$$\mathrm{CVaR}_p(L(x)) \le \omega_p. \tag{15.16}$$
Using Equations (15.14) and (15.13) we conclude that the second order stochastic dominance relation for two portfolios $x$ and $y$ is equivalent to a continuum of CVaR constraints:
$$R(x) \succeq_{(2)} R(y) \iff \mathrm{CVaR}_p(L(x)) \le \mathrm{CVaR}_p(L(y)) \quad \text{for all } p \in (0,1]. \tag{15.17}$$
Assume that we compare the performance of a portfolio $x$ with a random benchmark $Y$ (for example, an index return rate or another portfolio return rate), requiring $R(x) \succeq_{(2)} Y$. Then the fraction $\omega_p$ of the initial capital allowed for risk exposure at level $p$ is given by the benchmark $Y$:
$$\omega_p = \mathrm{CVaR}_p(-Y), \quad p \in (0,1].$$
Assume that $Y$ has a discrete distribution with realizations $y_i$, $i = 1, \ldots, m$. Then relation (15.7) is equivalent to
$$E[(y_i - R(x))_+] \le E[(y_i - Y)_+], \quad i = 1, \ldots, m. \tag{15.18}$$
This result does not imply that the continuum of CVaR constraints (15.17) can be replaced by finitely many constraints of the form
$$\mathrm{CVaR}_{p_i}(L(x)) \le \mathrm{CVaR}_{p_i}(-Y), \quad i = 1, \ldots, m,$$
with some fixed probabilities $p_i$, $i = 1, \ldots, m$. The reason is that we do not know at which probability levels the CVaR constraints have to be imposed.
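Identity (15.14) gives a direct recipe for computing the CVaR of the loss rate from an equally weighted return sample by integrating the empirical quantile function. The following sketch is our illustration; the sample data are hypothetical.

    import numpy as np

    def cvar_of_loss(returns, p):
        """CVaR_p(L(x)) with L(x) = -R(x) for an equally weighted return
        sample, via identity (15.14): CVaR_p(L) = -(1/p) F^(-2)(R;p)."""
        r = np.sort(np.asarray(returns, dtype=float))
        T = len(r)
        k = int(np.floor(p * T))          # full scenarios contained in [0, p]
        tail = r[:k].sum() / T + (p - k / T) * (r[k] if k < T else 0.0)
        return -tail / p                  # -(1/p) * integral of F^(-1)(R;t)

    rets = np.array([-0.04, -0.01, 0.02, 0.03])   # hypothetical returns
    print(cvar_of_loss(rets, 0.25))   # 0.04: the single worst loss
    print(cvar_of_loss(rets, 0.50))   # 0.025: average of the two worst losses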
15.4 The Dominance-Constrained Portfolio Problem

15.4.1 Direct Formulation

The starting point for our model is the assumption that a benchmark random return rate $Y$ having a finite expected value is available. It may have the form $Y = R(z)$ for some benchmark portfolio $z$. It may be an index or our current portfolio. Our intention is to have the return rate of the new portfolio, $R(x)$, preferable over $Y$. Therefore, we introduce the following extension of the optimization problem (15.3):
$$\max\; E[R(x)] - \lambda\,\varrho(R(x)) \tag{15.19}$$
$$\text{subject to}\quad R(x) \succeq_{(2)} Y, \tag{15.20}$$
$$x \in X. \tag{15.21}$$
Similarly to Equation (15.3), we optimize a mean-risk objective function, but we introduce a constraint that the portfolio return rate dominates a benchmark. Even when $\lambda = 0$ and we maximize just the expected value of the return rate, our model still leads to nontrivial solutions, due to the presence of the dominance constraint (15.20). To increase the flexibility of model (15.19)–(15.21), we may also allow a uniform shift of $R(x)$ by a constant $c$, as in the following model:
$$\max\; E[R(x)] - \lambda\,\varrho(R(x)) - \delta c$$
$$\text{subject to}\quad R(x) + c \succeq_{(2)} Y,\quad x \in X.$$
Here $\delta > 0$ can be interpreted as the cost of the shift $c$. Observe that the shift $c$ may also become negative, in which case we are rewarded for dominating $Y$ uniformly. The shift $c$ may be interpreted as additional cash added to the return, and $\delta$ is the interest to be paid when the loan is paid back. To simplify the derivations, from now on we focus on the simplest formulation of the dominance-constrained problem:
$$\max\; E[R(x)] \tag{15.22}$$
$$\text{subject to}\quad R(x) \succeq_{(2)} Y, \tag{15.23}$$
$$x \in X. \tag{15.24}$$
We can observe the first advantage of our problem formulation: all data in it are readily available. Moreover, the set defined by Equation (15.23) is convex (Dentcheva and Ruszczyński 2003c; 2004a, c).

Let us assume now that $Y$ has a discrete distribution with realizations $y_i$ attained with probabilities $\pi_i$, $i = 1, \ldots, m$. We also assume that the return rates have a discrete joint distribution with realizations $r_{jt}$, $t = 1, \ldots, T$, $j = 1, \ldots, n$, attained with probabilities $p_t$, $t = 1, 2, \ldots, T$. Then the formulation of the stochastic dominance relation (15.23), equivalently (15.18), simplifies even further. Introducing variables $s_{it}$ representing the shortfall of $R(x)$ below $y_i$ in realization $t$, $i = 1, \ldots, m$, $t = 1, \ldots, T$, we can formulate problem (15.22)–(15.24) as follows:
$$\max\; \sum_{t=1}^{T} p_t \sum_{j=1}^{n} x_j r_{jt} \tag{15.25}$$
subject to
$$\sum_{j=1}^{n} x_j r_{jt} + s_{it} \ge y_i, \quad i = 1, \ldots, m,\; t = 1, \ldots, T, \tag{15.26}$$
$$\sum_{t=1}^{T} p_t s_{it} \le \sum_{k=1}^{m} \pi_k (y_i - y_k)_+, \quad i = 1, \ldots, m, \tag{15.27}$$
$$s_{it} \ge 0, \quad i = 1, \ldots, m,\; t = 1, \ldots, T, \tag{15.28}$$
$$x \in X. \tag{15.29}$$
Indeed, for every feasible point $x$ of (15.22)–(15.24), setting
$$s_{it} = \max\Big(0,\; y_i - \sum_{j=1}^{n} x_j r_{jt}\Big), \quad i = 1, \ldots, m,\; t = 1, \ldots, T,$$
we obtain a feasible pair $(x,s)$ for (15.26)–(15.29). Conversely, for any feasible pair $(x,s)$ for (15.26)–(15.29), inequalities (15.26) and (15.28) imply that
$$s_{it} \ge \max\Big(0,\; y_i - \sum_{j=1}^{n} x_j r_{jt}\Big), \quad i = 1, \ldots, m,\; t = 1, \ldots, T.$$
Taking the expected value of both sides and using (15.27) we obtain
$$F_2(R(x); y_i) \le F_2(Y; y_i), \quad i = 1, \ldots, m.$$
Therefore, problem (15.22)–(15.24) is equivalent to problem (15.25)–(15.29).

If the set $X$ is a convex polyhedron, problem (15.25)–(15.29) becomes a large scale linear programming problem. It may be solved by general-purpose linear programming solvers. However, the size of the problem increases dramatically with the number of assets $n$, their return realizations $T$, and the benchmark realizations $m$, which makes this impractical for even moderate dimensions (in the thousands). For the purpose of solving these problems, we developed a specialized decomposition method, presented in Dentcheva and Ruszczyński (2006a).
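For moderate instance sizes, the linear program (15.25)–(15.29) can be assembled and solved directly. The sketch below is ours: it uses scipy.optimize.linprog as a generic solver and assumes that $X$ is the simplex of nonnegative weights summing to one. It is not the specialized decomposition method of Dentcheva and Ruszczyński (2006a).

    import numpy as np
    from scipy.optimize import linprog

    def dominance_portfolio(r, p, y, pi):
        """Sketch of the LP (15.25)-(15.29). r: T x n return realizations,
        p: scenario probabilities, y: benchmark realizations, pi: their
        probabilities. X is taken as the simplex {x >= 0, sum x = 1}."""
        T, n = r.shape
        m = len(y)
        nvar = n + m * T                                  # x, then s_it
        c = np.concatenate([-(p @ r), np.zeros(m * T)])   # maximize E[R(x)]

        # (15.26): -sum_j x_j r_jt - s_it <= -y_i for all i, t
        A1 = np.zeros((m * T, nvar)); b1 = np.zeros(m * T)
        for i in range(m):
            for t in range(T):
                row = i * T + t
                A1[row, :n] = -r[t]
                A1[row, n + row] = -1.0
                b1[row] = -y[i]

        # (15.27): sum_t p_t s_it <= sum_k pi_k (y_i - y_k)_+ for all i
        A2 = np.zeros((m, nvar))
        for i in range(m):
            A2[i, n + i * T : n + (i + 1) * T] = p
        b2 = np.array([np.sum(pi * np.maximum(y[i] - y, 0.0)) for i in range(m)])

        A_eq = np.zeros((1, nvar)); A_eq[0, :n] = 1.0     # budget constraint
        res = linprog(c, A_ub=np.vstack([A1, A2]), b_ub=np.concatenate([b1, b2]),
                      A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)] * nvar)
        return res.x[:n], -res.fun                        # weights, E[R(x)]

    # Tiny hypothetical instance: 3 assets, 4 equally likely scenarios,
    # with the equally weighted portfolio as the benchmark (so x = z is feasible).
    T, n = 4, 3
    rng = np.random.default_rng(0)
    r = rng.normal(0.01, 0.05, size=(T, n))
    p = np.full(T, 1.0 / T)
    y = r @ np.full(n, 1.0 / n)
    x_opt, val = dominance_portfolio(r, p, y, p)
    print(x_opt, val)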
15.4.2 Inverse Formulation

Assume that the return rates have a joint discrete distribution with realizations $r_{jt}$, $t = 1, \ldots, T$, $j = 1, \ldots, n$, attained with probabilities $p_t$, $t = 1, 2, \ldots, T$. Moreover, we assume that all probabilities $p_t$ are equal, that is, $p_t = 1/T$, $t = 1, \ldots, T$. This is the case of empirical distributions. Correspondingly, we assume that $Y$ has a discrete distribution with $m = T$ equally probable realizations $y_t$, $t = 1, \ldots, T$. We use the symbol $R_{[t]}(x)$ to denote the ordered realizations of $R(x)$, that is,
$$R_{[1]}(x) \le R_{[2]}(x) \le \cdots \le R_{[T]}(x).$$
Since $R(x)$ has a discrete distribution, the functions $F_2(R(x);\cdot)$ and $F^{(-2)}(R(x);\cdot)$ are piecewise linear. Owing to the fact that all probabilities $p_t$ are equal, the break points of $F^{(-2)}(R(x);\cdot)$ occur at $t/T$, for $t = 0, 1, \ldots, T$. The same applies to $F^{(-2)}(Y;\cdot)$. It follows from Equation (15.13) that the stochastic dominance constraint (15.23) can be equivalently expressed as
$$F^{(-2)}\Big(R(x); \frac{t}{T}\Big) \ge F^{(-2)}\Big(Y; \frac{t}{T}\Big), \quad t = 1, \ldots, T.$$
Note that $F^{(-2)}(R(x);0) = F^{(-2)}(Y;0) = 0$. We have
$$F^{(-2)}\Big(R(x); \frac{t}{T}\Big) = \frac{1}{T}\sum_{k=1}^{t} R_{[k]}(x), \quad t = 1, \ldots, T.$$
Therefore, problem (15.22)–(15.24) can be written with an equivalent inverse form of the dominance constraint:
$$\max\; E[R(x)] \tag{15.30}$$
$$\text{subject to}\quad \sum_{k=1}^{t} R_{[k]}(x) \ge \sum_{k=1}^{t} y_{[k]}, \quad t = 1, \ldots, T, \tag{15.31}$$
$$x \in X. \tag{15.32}$$
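Under the equal-probability assumption, constraint (15.31) reduces to comparing cumulative sums of ordered realizations. The following sketch, ours with hypothetical data, checks the constraint for a fixed portfolio return sample.

    import numpy as np

    def inverse_dominance_ok(rx, y):
        """Check constraint (15.31) for equally probable scenarios: the
        cumulative sums of the ordered portfolio returns must majorize
        those of the ordered benchmark realizations."""
        cum_rx = np.cumsum(np.sort(rx))
        cum_y = np.cumsum(np.sort(y))
        return bool(np.all(cum_rx >= cum_y - 1e-12))

    # Hypothetical equally likely scenarios (T = 4).
    print(inverse_dominance_ok([-0.02, 0.01, 0.03, 0.05],
                               [-0.05, 0.00, 0.03, 0.06]))   # True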
It was shown in Ogryczak and Ruszczyński (2002) that the function $x \mapsto \sum_{k=1}^{t} R_{[k]}(x)$ is concave and positively homogeneous. It is also polyhedral. Therefore, the constraints (15.31) are convex polyhedral constraints. If the set $X$ is a convex polyhedron, problem (15.30)–(15.32) has an equivalent linear programming formulation. All these transformations are possible due to the crucial assumption that the probabilities of all elementary events are equal. If they are not equal, the break points of the function $F^{(-2)}(R(x);\cdot)$ depend on $x$, and therefore inequality (15.13) cannot be reduced to finitely many convex inequalities. This is in contrast to the primal formulation, where the discreteness of $Y$ alone was sufficient to reduce the stochastic dominance constraint to finitely many convex inequalities. We have to observe that the quantile formulation (15.31) of stochastic dominance constraints is more involved than the primal formulation and requires more sophisticated computational methods. Using (15.31) directly would require employing nonsmooth optimization methods to solve problem (15.30)–(15.32). An equivalent formulation with linear constraints has very many constraints, because of the large number of pieces of the function $x \mapsto \sum_{k=1}^{t} R_{[k]}(x)$. Still, Dentcheva and Ruszczyński (2010) developed a highly efficient cutting plane method, which significantly outperforms direct approaches.
15.5 Optimality and Duality

15.5.1 Primal Form

From now on we assume that the probability distributions of the return rates are discrete with finitely many realizations $r_{jt}$, $t = 1, \ldots, T$, $j = 1, \ldots, n$, attained with probabilities $p_t$, $t = 1, 2, \ldots, T$. We also assume that there are finitely many ordered realizations of the benchmark outcome $Y$: $y_1 < y_2 < \cdots < y_m$. The probabilities of these realizations are denoted by $\pi_i$, $i = 1, \ldots, m$. We also assume that the set $X$ is compact. We define the set $U$ of functions $u : \mathbb{R} \to \mathbb{R}$ satisfying the following conditions:

- $u(\cdot)$ is concave and nondecreasing;
- $u(\cdot)$ is piecewise linear with break points $y_i$, $i = 1, \ldots, m$;
- $u(t) = 0$ for all $t \ge y_m$.

It is evident that $U$ is a convex cone. Let us define the function $L : \mathbb{R}^n \times U \to \mathbb{R}$ as follows:
$$L(x,u) = E\big[R(x) + u(R(x)) - u(Y)\big]. \tag{15.33}$$
It will play for problem (15.22)–(15.24) a role similar to that of a Lagrangian. It is well defined because, for every $u \in U$ and every $x \in \mathbb{R}^n$, the expected value $E[u(R(x))]$ exists and is finite. The following theorem has been proved in a more general version in Dentcheva and Ruszczyński (2003c).

Theorem 15.1. If $\hat x$ is an optimal solution of (15.22)–(15.24), then there exists a function $\hat u \in U$ such that
$$L(\hat x, \hat u) = \max_{x \in X} L(x, \hat u) \tag{15.34}$$
and
$$E[\hat u(R(\hat x))] = E[\hat u(Y)]. \tag{15.35}$$
Conversely, if for some function $\hat u \in U$ an optimal solution $\hat x$ of (15.34) satisfies (15.23) and (15.35), then $\hat x$ is an optimal solution of (15.22)–(15.24).

We can also develop duality relations for our problem. With the function (15.33) we can associate the dual function
$$D(u) = \max_{x \in X} L(x,u).$$
We are allowed to write the maximization operation here because the set $X$ is compact and $L(\cdot,u)$ is continuous. The dual problem has the following form:
$$\min_{u \in U}\; D(u). \tag{15.36}$$
The set $U$ is a closed convex cone and $D(\cdot)$ is a convex function, so (15.36) is a convex optimization problem.
Theorem 15.2. Assume that problem (15.22)–(15.24) has an optimal solution. Then problem (15.36) has an optimal solution, and the optimal values of both problems coincide. Furthermore, the set of optimal solutions of (15.36) is the set of functions $\hat u \in U$ satisfying (15.34)–(15.35) for an optimal solution $\hat x$ of (15.22)–(15.24).

Note that all constraints of our problem are linear or convex polyhedral, and therefore we do not need any constraint qualification conditions here. The "Lagrange multiplier" $u$ is directly related to the expected utility theory of von Neumann and Morgenstern. We have established earlier that the second order stochastic dominance relation is equivalent to (15.8) for all utility functions in $U$. Our result shows that one of them, $\hat u(\cdot)$, assumes the role of a Lagrange multiplier associated with (15.23). A point $\hat x$ is a solution to (15.22)–(15.24) if there exists a utility function
$\hat u(\cdot)$ such that $\hat x$ maximizes over $X$ the objective function $E[R(x)]$ augmented with this utility. We see that the optimization problem in (15.34) is equivalent to
$$\max_{x \in X}\; E[v(R(x))], \tag{15.37}$$
where $v(\eta) = \eta + u(\eta)$. At the optimal solution the function $\hat v(\eta) = \eta + \hat u(\eta)$ is the implied utility function. It attaches a higher penalty to smaller realizations of $R(x)$ (bigger realizations of $L(x)$). By maximizing $L(x,u)$ we look for $x$ such that the left tail of the distribution of $R(x)$ is thin. It is important to stress that the optimal function $\hat u(\cdot)$ is piecewise linear, with break points at the realizations $y_1, \ldots, y_m$ of the benchmark $Y$. Therefore, the dual problem also has an equivalent linear programming formulation.

15.5.2 Inverse Form

In addition to the assumption that all involved distributions are discrete, we also assume that all probabilities $p_t$ are equal and that $m = T$. We introduce the set $W$ of concave nondecreasing functions $w : [0,1] \to \mathbb{R}$. It is evident that $W$ is a convex cone. Recall the identity
$$E[R(x)] = \int_0^1 F^{(-1)}(R(x);p)\,dp.$$
Let us define the function $\Phi : X \times W \to \mathbb{R}$ as follows:
$$\Phi(x,w) = \int_0^1 F^{(-1)}(R(x);p)\,dp + \int_0^1 F^{(-1)}(R(x);p)\,dw(p) - \int_0^1 F^{(-1)}(Y;p)\,dw(p). \tag{15.38}$$
It plays a role similar to that of a Lagrangian of (15.30)–(15.32).

Theorem 15.3. If $\hat x$ is an optimal solution of (15.30)–(15.32), then there exists a function $\hat w \in W$ such that
$$\Phi(\hat x, \hat w) = \max_{x \in X} \Phi(x, \hat w) \tag{15.39}$$
and
$$\int_0^1 F^{(-1)}(R(\hat x);p)\,d\hat w(p) = \int_0^1 F^{(-1)}(Y;p)\,d\hat w(p). \tag{15.40}$$
Conversely, if for some function $\hat w \in W$ we find an optimal solution $\hat x$ of (15.39) that satisfies (15.31) and (15.40), then $\hat x$ is an optimal solution of (15.30)–(15.32).

We can also develop a duality theory based on the Lagrangian (15.38). For every function $w \in W$ the problem
$$\max_{x \in X}\; \Phi(x,w)$$
is a Lagrangian relaxation of problem (15.30)–(15.32). Its optimal value, $\Psi(w)$, is always greater than or equal to the optimal value of (15.30)–(15.32). We define the dual problem as
$$\min_{w \in W}\; \Psi(w). \tag{15.42}$$
The set $W$ is a closed convex cone and $\Psi(\cdot)$ is a convex function, so problem (15.42) is a convex optimization problem. Duality relations in convex programming yield the following result.

Theorem 15.4. Assume that problem (15.30)–(15.32) has an optimal solution. Then problem (15.42) has an optimal solution and the optimal values of both problems coincide. Furthermore, the set of optimal solutions of (15.42) is the set of functions $\hat w \in W$ satisfying (15.39)–(15.40) for an optimal solution $\hat x$ of (15.30)–(15.32).

The "Lagrange multiplier" $w$ in this case is related to rank dependent expected utility theory. We have established earlier that the second order stochastic dominance relation is equivalent to (15.12) for all dual utility functions in $W$. Our result shows that one of them, $\hat w(\cdot)$, assumes the role of a Lagrange multiplier associated with (15.31). A point $\hat x$ is a solution to (15.30)–(15.32) if there exists a dual utility function $\hat w(\cdot)$ such that $\hat x$ maximizes over $X$ the objective function $E[R(x)]$ augmented with this dual utility. We can transform the Lagrangian (15.38) in the following way:
$$\Phi(x,w) = \int_0^1 F^{(-1)}(R(x);p)\,dp + \int_0^1 F^{(-1)}(R(x);p)\,dw(p) - \int_0^1 F^{(-1)}(Y;p)\,dw(p) = \int_0^1 F^{(-1)}(R(x);p)\,dv(p) - \int_0^1 F^{(-1)}(Y;p)\,dw(p), \tag{15.41}$$
where $v(p) = p + w(p)$. At the optimal solution the function $\hat v(p) = p + \hat w(p)$ is the quantile utility function implied by the benchmark $Y$. Since $\int_0^1 F^{(-1)}(Y;p)\,dw(p)$ is fixed, the problem on the right hand side of (15.39) becomes a problem of maximizing the implied rank dependent expected utility over $X$. It attaches higher weights to quantiles corresponding to smaller probabilities $p$. By maximizing $\Phi(x,w)$ we look for $x$ such that the left tail of the distribution of $R(x)$ is thin. As with the von Neumann–Morgenstern utility function, it is very difficult to elicit the dual utility function in advance. Our model derives it from a random benchmark. The optimal function $\hat w(\cdot)$ is piecewise linear, with break points at $t/T$, $t = 1, \ldots, T$. Therefore, the dual problem also has an equivalent linear programming formulation. This property, however, is conditioned on the assumption of equal probabilities.
15.6 Numerical Illustration

We have tested our approach on a basket of 719 real-world assets, using 616 possible realizations of their joint return rates (Ruszczyński and Vanderbei 2003). Historical data on weekly return rates in the 12 years from spring 1990 to spring 2002 were used as equally likely realizations.
We have used four benchmark return rates $Y$. Each of them was constructed as the return rate of a certain index composed of our assets. As we actually know the past return rates, for the purpose of comparison we selected equally weighted indexes composed of the $N$ assets having the highest average return rates in this period. Benchmark 1 corresponds to $N = 26$, Benchmark 2 to $N = 54$, Benchmark 3 to $N = 82$, and Benchmark 4 to $N = 200$. Our problem was to maximize the expected return rate, under the condition that the return rate of the benchmark portfolio is dominated. Since the benchmark was the return rate of a portfolio composed from the same basket, we have $m = T = 616$ in this case. We solved the problem by our method of minimizing the dual problem, presented in Dentcheva and Ruszczyński (2006a).

The implied utility functions from (15.37), obtained by solving the optimization problem (15.34) in the optimality conditions, are illustrated in Fig. 15.5.

Fig. 15.5 Implied utility functions corresponding to dominance constraints for the four benchmark portfolios (utility plotted against return rate)

We see that for Benchmark Portfolio 1, which contains only a small number of fast-growing assets, the utility function is linear on almost the entire range of return rates. Only very negative return rates are penalized. A different situation occurs when the benchmark portfolio contains more assets and is therefore more diversified and less risky. In order to dominate such a benchmark, we have to use a utility function which introduces a penalty for a broader range of return rates and is steeper. For the broadly based index in Benchmark Portfolio 4, the optimal utility function is smoother and is nonlinear even for positive return rates. It is worth mentioning that all these utility functions, although nondecreasing and concave, have rather complicated shapes. It would be a very hard task to determine in advance the utility function that should be used to obtain a solution dominating our benchmark portfolio. Obviously, the shape of the utility function is determined by the benchmark within the context of the optimization problem considered. If we change the optimization problem, the utility function will change. Finally, we may remark that our model (15.22)–(15.24) can be used for testing the statistical hypothesis that the return rate $Y$ of the benchmark portfolio is nondominated.
15.7 Conclusions

We presented a new approach to portfolio selection based on stochastic dominance. The portfolio return rate in the new model is required to stochastically dominate a random benchmark, for example, the return rate of an index. We formulated optimality conditions and duality relations for these models and constructed equivalent optimization models with utility functions. The two different formulations of the stochastic dominance constraint, primal and inverse, lead to two dual problems: one involving von Neumann–Morgenstern utility functions for the primal formulation, and one involving rank dependent (or dual) utility functions for the inverse formulation. The utility functions play the roles of Lagrange multipliers associated with the dominance constraints. In this way our model provides a link between the expected utility theory and the rank dependent utility theory. A numerical example illustrates the new approach and demonstrates the efficacy of the method. Future challenges include extensions of the approach to multivariate and multistage outcomes and benchmarks.
References

Dentcheva, D. 2005. "Optimization models with probabilistic constraints," in Probabilistic and randomized methods for design under uncertainty, G. Calafiore and F. Dabbene (Eds.). Springer, London, pp. 49–97.
Dentcheva, D. and A. Ruszczyński. 2003a. "Optimization under linear stochastic dominance." Comptes Rendus de l'Academie Bulgare des Sciences 56(6), 6–11.
Dentcheva, D. and A. Ruszczyński. 2003b. "Optimization under nonlinear stochastic dominance." Comptes Rendus de l'Academie Bulgare des Sciences 56(7), 19–25.
Dentcheva, D. and A. Ruszczyński. 2003c. "Optimization with stochastic dominance constraints." SIAM Journal on Optimization 14, 548–566.
Dentcheva, D. and A. Ruszczyński. 2004a. "Optimality and duality theory for stochastic optimization problems with nonlinear dominance constraints." Mathematical Programming 99, 329–350.
Dentcheva, D. and A. Ruszczyński. 2004b. "Convexification of stochastic ordering constraints." Comptes Rendus de l'Academie Bulgare des Sciences 57(3), 5–10.
Dentcheva, D. and A. Ruszczyński. 2004c. "Semi-infinite probabilistic optimization: first order stochastic dominance constraints." Optimization 53, 583–601.
Dentcheva, D. and A. Ruszczyński. 2006a. "Portfolio optimization with stochastic dominance constraints." Journal of Banking and Finance 30(2), 433–451.
Dentcheva, D. and A. Ruszczyński. 2006b. "Inverse stochastic dominance constraints and rank dependent expected utility theory." Mathematical Programming 108, 297–311.
Dentcheva, D. and A. Ruszczyński. 2008. "Duality between coherent risk measures and stochastic dominance constraints in risk-averse optimization." Pacific Journal of Optimization 4, 433–446.
Dentcheva, D. and A. Ruszczyński. 2010. "Inverse cutting plane methods for optimization problems with second order stochastic dominance constraints." Optimization, in press.
Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2006. Modern portfolio theory and investment analysis, Wiley, New York.
Fishburn, P. C. 1964. Decision and value theory, Wiley, New York.
Fishburn, P. C. 1970. Utility theory for decision making, Wiley, New York.
Gastwirth, J. L. 1971. "A general definition of the Lorenz curve." Econometrica 39, 1037–1039.
Hadar, J. and W. Russell. 1969. "Rules for ordering uncertain prospects." The American Economic Review 59, 25–34.
Hanoch, G. and H. Levy. 1969. "The efficiency analysis of choices involving risk." Review of Economic Studies 36, 335–346.
Hardy, G. H., J. E. Littlewood, and G. Polya. 1934. Inequalities, Cambridge University Press, Cambridge.
Konno, H. and H. Yamazaki. 1991. "Mean-absolute deviation portfolio optimization model and its application to Tokyo stock market." Management Science 37, 519–531.
Lehmann, E. 1955. "Ordered families of distributions." Annals of Mathematical Statistics 26, 399–419.
Levy, H. 2006. Stochastic dominance: investment decision making under uncertainty, 2nd ed., Springer, New York.
Lorenz, M. O. 1905. "Methods of measuring concentration of wealth." Journal of the American Statistical Association 9, 209–219.
Markowitz, H. M. 1952. "Portfolio selection." Journal of Finance 7, 77–91.
Markowitz, H. M. 1959. Portfolio selection, Wiley, New York.
Markowitz, H. M. 1987. Mean-variance analysis in portfolio choice and capital markets, Blackwell, Oxford.
Mosler, K. and M. Scarsini (Eds.). 1991. Stochastic orders and decision under risk, Institute of Mathematical Statistics, Hayward, CA.
Muliere, P. and M. Scarsini. 1989. "A note on stochastic dominance and inequality measures." Journal of Economic Theory 49, 314–323.
Müller, A. and D. Stoyan. 2002. Comparison methods for stochastic models and risks, Wiley, Chichester.
Ogryczak, W. and A. Ruszczyński. 1999. "From stochastic dominance to mean-risk models: semideviations as risk measures." European Journal of Operational Research 116, 33–50.
Ogryczak, W. and A. Ruszczyński. 2001. "On consistency of stochastic dominance and mean-semideviation models." Mathematical Programming 89, 217–232.
Ogryczak, W. and A. Ruszczyński. 2002. "Dual stochastic dominance and related mean-risk models." SIAM Journal on Optimization 13, 60–78.
Prékopa, A. 2003. "Probabilistic programming," in Stochastic programming, A. Ruszczyński and A. Shapiro (Eds.). Elsevier, Amsterdam, pp. 257–352.
Quiggin, J. 1982. "A theory of anticipated utility." Journal of Economic Behavior and Organization 3, 225–243.
Quiggin, J. 1993. Generalized expected utility theory – the rank-dependent expected utility model, Kluwer, Dordrecht.
Quirk, J. P. and R. Saposnik. 1962. "Admissibility and measurable utility functions." Review of Economic Studies 29, 140–146.
Rockafellar, R. T. and S. Uryasev. 2000. "Optimization of conditional value-at-risk." Journal of Risk 2, 21–41.
Rothschild, M. and J. E. Stiglitz. 1969. "Increasing risk: I. A definition." Journal of Economic Theory 2, 225–243.
Ruszczyński, A. and R. J. Vanderbei. 2003. "Frontiers of stochastically nondominated portfolios." Econometrica 71, 1287–1297.
von Neumann, J. and O. Morgenstern. 1944. Theory of games and economic behavior, Princeton University Press, Princeton.
Wang, S. S., V. R. Yong, and H. H. Panjer. 1997. "Axiomatic characterization of insurance prices." Insurance Mathematics and Economics 21, 173–183.
Wang, S. S. and V. R. Yong. 1998. "Ordering risks: expected utility versus Yaari's dual theory of risk." Insurance Mathematics and Economics 22, 145–161.
Whitmore, G. A. and M. C. Findlay (Eds.). 1978. Stochastic dominance: an approach to decision-making under risk, D. C. Heath, Lexington, MA.
Yaari, M. E. 1987. "The dual theory of choice under risk." Econometrica 55, 95–115.
Chapter 16
Portfolio Analysis

Jack Clark Francis
Abstract In 1952, Harry M. Markowitz published a seminal paper about analyzing portfolios. In 1990, he was awarded the Nobel Prize for his portfolio theory. Markowitz portfolio analysis delineates a set of highly desirable investment portfolios. These optimal portfolios have the maximum return at each plausible level of risk, computed iteratively over a range of different risk levels. Conversely, Markowitz portfolio analysis can find the same set of optimal investments by delineating the portfolios that have the minimum risk over a range of different rates of return. The set of all Markowitz optimal portfolios is called the efficient frontier. Portfolio analysis analyzes rate of return statistics, risk statistics (standard deviations), and correlations from a list of candidate investments (stocks, bonds, and so on) to determine which investments, and in what proportions (weights), enter into every efficient portfolio. Further analysis of Markowitz's portfolio theory reveals interesting asset pricing implications.

Keywords Portfolio · Rate of return · Portfolio analysis · Standard deviation · Correlation · Weight · Variance-covariance matrix · Risk minimization · Return maximization · Efficient portfolios · Efficient frontier
16.1 Introduction

Portfolio analysis is a mathematical procedure that determines a selection of optimum portfolios for investors to consider. The procedure was first made public in 1952 by Harry Markowitz.¹ The theory caused a scientific revolution in finance. Before Markowitz's scientific procedure, investment counselors passed off "common sense guidelines" to their clients as valuable expert advice.

The objective of portfolio analysis is to determine a set of "efficient portfolios." Figure 16.1 is a two-dimensional graph in risk-return space that depicts the set of efficient portfolios as a curve between points E and F. This curve is called the efficient frontier; it is the set of optimum investment portfolios generated by analyzing an underlying group of individual assets. The assets being analyzed can be thought of as individual issues of stocks and bonds. These individual assets are represented by the dots lying below the efficient frontier in Fig. 16.1.

16.2 Inputs for Portfolio Analysis

Portfolio analysis requires certain data as inputs. If $n$ individual assets (stocks and bonds) are the potential investments, the inputs to portfolio analysis are as follows:

1. $n$ expected returns.
2. $n$ variances of returns.
3. $(n^2 - n)/2$ covariances, or correlations.

For instance, to generate a three-security portfolio, the analysis requires the statistics listed in Table 16.1.

16.3 The Security Analyst's Job

The security analyst needs to analyze "one period" rates of return, defined in Equation (16.1). For example, consider James's purchase of a stock for $54 per share. He sold the stock one year later for $64, realizing a capital gain of $10, and received a cash dividend of 80 cents per share. Jim's total income from the stock is $10.80 and his rate of return is 20%.
J.C. Francis, Baruch College, New York, NY, USA
e-mail: [email protected]

¹ The analysis was originally presented in an article, Markowitz (1952). Markowitz expanded his presentation in a book, Markowitz (1959).
$$r = \frac{(\text{Price change, } \$10) + (\text{Cash dividend, } \$0.80)}{(\text{Purchase price, } \$54)} = \frac{\$10.80 \text{ income}}{\$54 \text{ invested}} = 20\%. \tag{16.1}$$
The length of the one-time period can vary within wide limits. It should not be a very short-run period or a very long-run period; but between, say, one month and 10 years, the portfolio manager can select any planning horizon that fits the investor's needs.

The security analyst should formulate estimates of the probability distribution of returns for each security that is an investment candidate. These probability distributions may be estimated using historical data. If so, the historical distributions may need to be adjusted to reflect anticipated factors that were not present historically. Figure 16.2 and Table 16.2 provide two examples of probability distributions of rates of return for the same share of stock. The security analyst should also estimate covariances (or correlation coefficients) between all securities under consideration. Given these pieces of information, the input data, portfolio analysis can determine the optimal portfolio for any investor described by the assumptions listed in the next section.

Fig. 16.1 Investment opportunities have an efficient frontier that is convex

Table 16.1 Input statistics needed for a 3-asset portfolio
Asset   Expected returns   Variances of returns   Covariances
1       E(r₁)              σ₁² = Var(r₁)          σ₁,₂ = Cov(r₁, r₂)
2       E(r₂)              σ₂² = Var(r₂)          σ₂,₃ = Cov(r₂, r₃)
3       E(r₃)              σ₃² = Var(r₃)          σ₁,₃ = Cov(r₁, r₃)

16.4 Four Assumptions Underlying Portfolio Analysis

Portfolio analysis is based on four behavioral assumptions:

1. All investors endeavor to maximize a one-period expected utility function that exhibits diminishing marginal utility of returns (or wealth). This is equivalent to saying investors have positive but diminishing marginal utility of wealth (or returns). It means investors prefer more wealth to less wealth, and they are risk-averse.
2. Investors' risk estimates are proportional to the variability of the one-period rates of return, as measured by the variance or standard deviation.²
3. Investors are willing to base their decisions solely on expected return and risk. That is, their utility $U$ is a function of the variability of return $\sigma$ and the expected return $E(r)$. Symbolically, $U = f[\sigma, E(r)]$.³
4. For any given level of risk, investors prefer higher returns to lower returns; symbolically, $\partial U/\partial E(r) > 0$. Conversely, for any given level of return, investors prefer less risk over more risk; symbolically, $\partial U/\partial \sigma < 0$.

These assumptions will be maintained throughout this discussion of portfolio analysis. Essentially, portfolio analysis is based on the premise that the most desirable assets are those that have either of the following two characteristics:

1. The investment has the minimum expected risk at any given expected rate of return; or, conversely,
2. The investment has the maximum expected rate of return at any given level of expected risk.

Assets with either of these two characteristics are usually portfolios rather than individual assets. These assets are called efficient portfolios, whether they contain one or many assets. Efficient portfolios can be derived by a process we will call "Markowitz diversification."

² Problems can arise from using the variance to measure risk. See Szego (2002).
³ The mean and variance may not always provide the optimal selection. See Malevergne and Sornette (2005).
16.5 Different Approaches to Diversification

Many people define diversification as "not putting all your eggs in one basket" or "spreading your risks." These popular definitions of diversification seem naive once you consider the covariances (or correlations) between securities. "Markowitz diversification" is the particular form of diversification activity implied by modern portfolio analysis; it is superior to naive diversification. Markowitz diversification involves combining investments with less-than-perfect positive correlation in order to reduce risk in the portfolio without unnecessarily sacrificing any of the portfolio's potential return. In general, the lower the correlation of the assets in a portfolio, the less risky the portfolio will be. This is true regardless of how risky the assets in the portfolio are when analyzed in isolation. Markowitz explained that "in trying to make variance [of returns] small it is not enough to invest in many securities. It is necessary to avoid investing in securities with high covariances among themselves."⁴

⁴ Markowitz (1952).

Fig. 16.2 Finite probability distribution of returns for a stock with three possible outcomes

Table 16.2 Probability distribution of returns for a $100 stock
State of nature    Event's probability (A)   Stock's ending price (B)   Cash dividend (C)   Single-period rate of return (D)   Product of A and D (E)
Booming economy    0.3 = 30%                 $130.00                    0                   0.3 = 30%                          0.3 × 0.3 = 0.09
Slow growth        0.4 = 40%                 $110.00                    0                   0.1 = 10%                          0.4 × 0.1 = 0.04
Recession          0.3 = 30%                 $90.00                     0                   −0.1 = −10%                        0.3 × (−0.1) = −0.03
Total              1.0 = 100%                                                                                                  E(r) = 0.10 = 10%

16.6 A Portfolio's Expected Return Formula

Consider the two individual securities in Table 16.3. Suppose that, say, 71.1% of the portfolio's funds are invested in asset A and the other 28.9% in asset B. Symbolically, $x_A = 71.1\%$ and $x_B = 28.9\%$, where the weight assigned to each asset $i$ is denoted $x_i$. Throughout portfolio analysis the weights of the candidate assets must sum to one (or 100%), as shown in Equation (16.2):
$$x_A + x_B = 1.0 = 100\%. \tag{16.2}$$
Table 16.3 Statistics describing two assets
Investment   Expected return (%)   Standard deviation (%)
A            7                     14.86
B            20                    36.6
The formula for the weighted average return from this portfolio is below:
$$E(r_p) = x_A E(r_A) + x_B E(r_B). \tag{16.3}$$
Equation (16.3) can be equivalently rewritten in terms of one weight, since $x_B = 1.0 - x_A$:
$$\begin{aligned}
E(r_p) &= x_A E(r_A) + (1 - x_A)E(r_B)\\
&= x_A(7\%) + x_B(20\%)\\
&= (0.711)(0.07) + (0.289)(0.20) = 10.75\%.
\end{aligned} \tag{16.3a}$$
Equation (16.3a) shows that the portfolio with 71.1% of its funds invested in asset A and the other 28.9% in asset B has an expected rate of return of 10.75%.
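A one-line computation confirms Equations (16.2)–(16.3a); the sketch below, ours, simply re-does the arithmetic with the Table 16.3 statistics.

    # A quick check of Equations (16.2)-(16.3a) using Table 16.3.
    x_a, x_b = 0.711, 0.289          # weights; they sum to one (16.2)
    er_a, er_b = 0.07, 0.20          # expected returns of assets A and B
    er_p = x_a * er_a + x_b * er_b   # Equation (16.3)
    print(f"E(r_p) = {er_p:.4f}")    # about 0.1075, i.e. 10.75%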
16.7 The Quadratic Risk Formula for a Portfolio

A portfolio's total risk is measured by either the variance, or its square root the standard deviation, of its rates of return.
The variance of returns from a portfolio made up of $n$ assets is defined in Equation (16.4):
$$\mathrm{Var}(r_p) = \sum_{i=1}^{n}\sum_{j=1}^{n} x_i x_j \sigma_{ij}. \tag{16.4}$$
Equation (16.4) defines the variance-covariance matrix for a portfolio of $n$ assets. The variance-covariance formula is expanded below to show how the variances and covariances of every asset determine the portfolio's risk:
$$\mathrm{VAR}(r_p) = \begin{array}{ccccc}
x_1x_1\sigma_{11} & +\,x_1x_2\sigma_{12} & +\,x_1x_3\sigma_{13} & +\cdots & +\,x_1x_n\sigma_{1n}\\
+\,x_2x_1\sigma_{21} & +\,x_2x_2\sigma_{22} & +\,x_2x_3\sigma_{23} & +\cdots & +\,x_2x_n\sigma_{2n}\\
+\,x_3x_1\sigma_{31} & +\,x_3x_2\sigma_{32} & +\,x_3x_3\sigma_{33} & +\cdots & +\,x_3x_n\sigma_{3n}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
+\,x_nx_1\sigma_{n1} & +\,x_nx_2\sigma_{n2} & +\,x_nx_3\sigma_{n3} & +\cdots & +\,x_nx_n\sigma_{nn}
\end{array}$$
where $\sigma_p$ = the standard deviation of the portfolio's rates of return; $\sigma_i^2$ and $\sigma_{ii}$ = two equivalent conventions for the variance of returns from asset $i$; and $\sigma_{ij}$ and $\mathrm{cov}(r_i, r_j)$ = equivalent representations for the covariance of returns between assets $i$ and $j$.

Since $x_1x_2\sigma_{12} + x_2x_1\sigma_{21} = 2x_2x_1\sigma_{21}$ and $x_2x_2\sigma_{22} = x_2^2\sigma_2^2$, we can abbreviate the matrix above by rewriting it as the equivalent matrix below:
$$\mathrm{VAR}(r_p) = \begin{array}{ccccc}
x_1x_1\sigma_{11} & & & &\\
+\,2x_2x_1\sigma_{21} & +\,x_2x_2\sigma_{22} & & &\\
+\,2x_3x_1\sigma_{31} & +\,2x_3x_2\sigma_{32} & +\,x_3x_3\sigma_{33} & &\\
\vdots & \vdots & \vdots & \ddots &\\
+\,2x_nx_1\sigma_{n1} & +\,2x_nx_2\sigma_{n2} & +\,2x_nx_3\sigma_{n3} & \cdots & +\,x_nx_n\sigma_{nn}
\end{array}$$
In general, a variance-covariance matrix can be described by the following characteristics. (1) Since it is an $n$ by $n$ matrix, it is a square (not rectangular) matrix. (2) All the diagonal boxes (on the diagonal from the upper left corner to the lower right corner) contain variances, since $\mathrm{Cov}(r_i, r_i) = \sigma_{ii} = \sigma_i^2 = \mathrm{VAR}(r_i)$. (3) Since $\sigma_{ij} = \sigma_{ji}$, or equivalently, $\mathrm{Cov}(r_j, r_i) = \mathrm{Cov}(r_i, r_j)$, it is a symmetric matrix in which the boxes above the diagonal are the mirror image of the boxes below the diagonal. The variance-covariance matrix can be equivalently rewritten as follows:
$$\mathrm{VAR}(r_p) = \sum_{i=1}^{n} x_i^2\sigma_i^2 + \sum_{i=1}^{n}\sum_{\substack{j=1\\ i\ne j}}^{n} x_i x_j \sigma_{ij},$$
$$\begin{pmatrix}\text{The portfolio's}\\ \text{total risk}\end{pmatrix} = \begin{pmatrix}\text{Sum of the } n \text{ variances}\\ \text{on the diagonal}\end{pmatrix} + \begin{pmatrix}\text{Sum of the } (n^2-n)/2\\ \text{covariances}\end{pmatrix}. \tag{16.4a}$$

16.8 The Covariance Between Returns from Two Assets

The covariance, denoted $\sigma_{ij}$, is related to the correlation coefficient as shown in Equation (16.5):
$$\sigma_{ij} = \sigma_i \sigma_j \rho_{ij}, \tag{16.5}$$
where $\rho_{ij}$ is the correlation coefficient between variables $i$ and $j$. Note that the covariance of the random variable $i$ with itself equals the variance, $\sigma_{ii} = \sigma_i\sigma_i\rho_{ii} = \mathrm{Var}(r_i)$. Also observe that Equation (16.5) can be solved for $\rho_{ij}$ to obtain a definition of the correlation coefficient. The covariance measures how two variables covary. If two assets are positively correlated, their covariance will also be positive. For example, almost all NYSE-listed common stocks have positive covariances with each other. If two variables are independent, their covariance is zero. For example, the covariance of the stock market and sunspot activity is zero. If two variables move inversely, their covariance is negative.

16.9 Portfolio Analysis of a Two-Asset Portfolio

Reconsider the two individual assets denoted A and B in Table 16.3. Equation (16.3a) shows how to compute the expected return of a portfolio that has 71.1% of its funds invested in asset A and the other 28.9% in asset B. This section shows how to reduce the $n \times n$ variance-covariance matrix, Equation (16.4), to a $2 \times 2$ matrix that is appropriate for analyzing the risk of a portfolio that contains assets A and B. Equation (16.6) is a simple case of Equation (16.4); it defines the standard deviation of returns for a two-asset portfolio of common stocks, denoted A and B:⁵
$$\sigma_p = \sqrt{x_A^2\sigma_A^2 + x_B^2\sigma_B^2 + 2x_Ax_B\sigma_{AB}}. \tag{16.6}$$
Substituting Equation (16.5) into Equation (16.6) yields Equation (16.7), which shows how the correlation between assets A and B affects the portfolio's risk:
$$\sigma_p = \sqrt{x_A^2\sigma_A^2 + x_B^2\sigma_B^2 + 2x_Ax_B\sigma_A\sigma_B\rho_{AB}}. \tag{16.7}$$

⁵ The risk formula for a two-asset portfolio is derived below:
$$r_p = x_1r_1 + x_2r_2 \quad \text{by definition},\qquad E(r_p) = x_1E(r_1) + x_2E(r_2).$$
These two equations are used in the following derivation:
$$\begin{aligned}
\mathrm{Var}(r_p) &= E\big[r_p - E(r_p)\big]^2 && \text{the definition of } \mathrm{Var}(r_p)\\
&= E\{x_1r_1 + x_2r_2 - [x_1E(r_1) + x_2E(r_2)]\}^2 && \text{by substitution for } r_p\\
&= E\{x_1[r_1 - E(r_1)] + x_2[r_2 - E(r_2)]\}^2 && \text{collecting like terms}\\
&= E\big\{x_1^2[r_1 - E(r_1)]^2 + x_2^2[r_2 - E(r_2)]^2 + 2x_1x_2[r_1 - E(r_1)][r_2 - E(r_2)]\big\}\\
&= x_1^2E[r_1 - E(r_1)]^2 + x_2^2E[r_2 - E(r_2)]^2 + 2x_1x_2E\{[r_1 - E(r_1)][r_2 - E(r_2)]\}\\
&= x_1^2\mathrm{Var}(r_1) + x_2^2\mathrm{Var}(r_2) + 2x_1x_2\mathrm{Cov}(r_1, r_2) && \text{restating definitions}
\end{aligned}$$
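In matrix form, Equation (16.4) is the quadratic form $x'Vx$, where $V$ is the variance-covariance matrix. The following sketch is ours; the covariance numbers are hypothetical, chosen only for illustration.

    import numpy as np

    # Portfolio variance as the quadratic form of Equation (16.4), x' V x,
    # illustrated with a hypothetical 3-asset variance-covariance matrix.
    x = np.array([0.5, 0.3, 0.2])              # weights summing to one
    V = np.array([[0.040, 0.010, 0.004],       # sigma_ij, symmetric
                  [0.010, 0.090, 0.012],
                  [0.004, 0.012, 0.060]])
    var_p = x @ V @ x                          # double sum over i and j
    print(var_p, np.sqrt(var_p))               # variance and standard deviation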
Substituting the risk statistics from Table 16.3 into Equation (16.7) yields:
$$\sigma_p = \sqrt{x_A^2(0.1486)^2 + x_B^2(0.366)^2 + 2x_Ax_B(0.1486)(0.366)\rho_{AB}} = \sqrt{0.0221\,x_A^2 + 0.1340\,x_B^2 + 0.1088\,x_Ax_B\,\rho_{AB}}. \tag{16.7a}$$
Figures 16.3a–d are graphs in risk-return space of assets A and B and the portfolios that can be formed from them at three different values of their correlation coefficient, $\rho_{AB} = -1, 0,$ and $+1$. These four figures were prepared by plotting the risk and return of the various portfolios composed of A and B for all the positive weights that sum to unity (that is, $x_A + x_B = 1$) and for the three correlation coefficients $\rho_{AB} = -1, 0,$ and $+1$.

To understand Fig. 16.3, it is informative to verify a few points in the figures by substituting appropriate numbers into Equations (16.3a) and (16.7a) and calculating the portfolio's expected return and risk. For example, Equation (16.3a) shows that the portfolio with $x_A = 0.711$ and $x_B = 0.289$ has an expected return of 10.75%, regardless of what value the correlation coefficient assumes. However, the risk of the portfolio with $x_A = 0.711$ and $x_B = 0.289$ varies directly with the correlation coefficient $\rho_{AB}$, as shown in Equation (16.7b):
$$\begin{aligned}
\sigma_p &= \sqrt{x_A^2\sigma_A^2 + x_B^2\sigma_B^2 + 2x_Ax_B\sigma_A\sigma_B\rho_{AB}}\\
&= \sqrt{(0.711)^2(0.1486)^2 + (0.289)^2(0.366)^2 + 2(0.711)(0.289)(0.1486)(0.366)\rho_{AB}}\\
&= \sqrt{0.0111 + 0.0113 + 0.0224\,\rho_{AB}}.
\end{aligned} \tag{16.7b}$$
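Equation (16.7b) is easy to evaluate numerically; the sketch below, ours, reproduces the three risk levels used in Figs. 16.3a–c.

    import numpy as np

    # Equation (16.7b) at x_A = 0.711, x_B = 0.289 for several correlations.
    xa, xb, sa, sb = 0.711, 0.289, 0.1486, 0.366
    for rho in (-1.0, 0.0, 1.0):
        var_p = xa**2 * sa**2 + xb**2 * sb**2 + 2 * xa * xb * sa * sb * rho
        print(rho, np.sqrt(max(var_p, 0.0)))   # about 0.0, 0.15, and 0.21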
16.9.1 Perfectly Positively Correlated Returns, Fig. 16.3a

Portfolio analysis of the two-asset portfolio illustrated in Fig. 16.3a shows that when the correlation coefficient between the rates of return from assets A and B is at its maximum value of positive unity, a linear risk-return relationship results. This straight line between assets A and B in risk-return space is derived by first setting $\rho_{AB}$ to positive unity in Equation (16.7b). Next, the values of the two assets' weights are varied from zero to one (that is, $0 < x < 1$) in an inverse manner so that they always sum to positive unity (namely, $x_A + x_B = 1$). Finally, the infinite number of pairs of positive values for $x_A$ and $x_B$ that sum to one are substituted, one pair at a time, into the portfolio risk formula, Equation (16.7b), and the portfolio return formula, Equation (16.3). The risk and return statistics for the two-asset portfolio thus derived trace out the straight line in Fig. 16.3a when $\rho_{AB} = +1.0$.

16.9.2 Uncorrelated Assets, Fig. 16.3b

If the rates of return from assets A and B are zero-correlated, substantial risk reduction benefits can be obtained by diversifying between them. This beneficial risk reduction can be seen by noting what happens to the portfolio risk Equation (16.7b) when $\rho_{AB}$ equals zero: the last quantity on the right-hand side of Equation (16.7b) becomes zero. Clearly, this reduces the portfolio's risk level below what it was when this correlation was a larger value (for instance, when $\rho_{AB} = +1$). The results of the uncorrelated returns are illustrated in Fig. 16.3b. The portfolio's expected return is unaffected by changing the correlation between assets, because $\rho_{AB}$ is not a variable in the portfolio return Equation (16.3). All differences between the portfolios generated when $\rho_{AB} = +1$ and the portfolios generated when $\rho_{AB} = 0$ are risk differences stemming from Equation (16.7b). Figure 16.3b shows that the portfolios with $\rho_{AB} = 0$ have less risk at every level of expected return than the same portfolios in Fig. 16.3a with $\rho_{AB} = +1$. The substantial risk reductions available by diversifying across uncorrelated assets are readily available to all investors. Empirical research has shown that common stock price indices, bond price indices, and commodity price indices all tend to be uncorrelated. Thus, an investor who diversifies across different market assets can expect to benefit from diversifying between uncorrelated assets, as indicated in Fig. 16.3b.

Fig. 16.3 Portfolio analysis for two assets in risk-return space. (a) The portfolios that can be constructed from A and B when they are perfectly positively correlated lie along a straight line. (b) When A and B are uncorrelated, the portfolio possibilities lie along a curve. (c) If A and B are perfectly negatively correlated, the portfolios that can be created include the riskless portfolio Z. (d) The correlation between the assets determines which efficient frontier is possible

16.9.3 Perfectly Negatively Correlated Returns, Fig. 16.3c

The lowest possible value for a correlation coefficient is negative unity. When the correlation coefficient in the portfolio risk
Equation (16.7b) reaches negative unity, the last term on the right-hand side of the equation assumes its maximum negative value for any given pair of values for $x_A$ and $x_B$. In fact, when $\rho_{AB} = -1$, the portfolio's risk can be reduced to zero for one particular set of portfolio weights. For example, for the portfolio made of A and B, Fig. 16.3c shows that when $x_A = 0.711$, $x_B = 0.289$, and $\rho_{AB} = -1.0$, the portfolio's risk diminishes to zero. For all other weights the portfolio's risk is above zero. But, relative to all similar portfolios, portfolio risk is always at its lowest possible level when $\rho_{AB} = -1$. If it seems dubious that two perfectly negatively correlated risky assets like A and B can be combined to form a riskless portfolio like the one at $x_A = 0.711$ and $x_B = 0.289$ in Fig. 16.3c, consider an example. In this example we assume two assets are perfectly negatively correlated, so their prices and returns move inversely. As a result of these inverse movements, whatever losses one asset experiences are offset by gains from the other asset. This inverse interaction can leave the portfolio with zero variability of return for the correct values of $x_A$ and $x_B$. Thus, a riskless portfolio can be created from risky components.
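The riskless weights in Fig. 16.3c can be derived in closed form: with $\rho_{AB} = -1$ the radicand in Equation (16.7) is a perfect square, so $\sigma_p = |x_A\sigma_A - x_B\sigma_B|$, and setting this to zero gives the hedge ratio. The snippet below, ours, checks this.

    # With rho_AB = -1, sigma_p = |x_A*sigma_A - x_B*sigma_B|, so the riskless
    # portfolio solves x_A*sigma_A = (1 - x_A)*sigma_B:
    sa, sb = 0.1486, 0.366
    x_a = sb / (sa + sb)
    print(x_a, 1 - x_a)   # about 0.711 and 0.289, the weights of portfolio Z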
16.9.4 Portfolio Analysis Using Markowitz Diversification, Fig. 16.3d

Figure 16.3d summarizes the three illustrated examples from Figs. 16.3a–c. Summarizing, Fig. 16.3a shows that at point P the weights are $x_A = 0.711$ and $x_B = 0.289$, and for a correlation of $\rho_{AB} = +1$ the portfolio's total risk is $\sigma_p = 21.2$ percent. Figure 16.3b shows that if $\rho_{AB} = 0$, then $\sigma_p = 15$ percent at point W. And Fig. 16.3c illustrates the case where $\rho_{AB} = -1$: then $\sigma_p = \sqrt{0} = 0$ at point Z. Figure 16.3d contains the three points (P, W, and Z, respectively) with $E(r) = 10.75\%$ from Figs. 16.3a–c. Markowitz diversification can lower risk to zero if the securities analyst can find securities whose rates of return have low enough correlations. Unfortunately, there are not many securities with very low correlations. The typical pair of NYSE-listed stocks has returns that are correlated at about $\rho = 0.5$. Using Markowitz diversification requires a data bank of financial statistics for many securities, a computer, and some econometric work.
16.10 Mathematical Portfolio Analysis

A solution technique for portfolio analysis that uses differential calculus and linear algebra is explained in this section. Calculus can be used to find the minimum risk portfolio for any given expected return E*. Mathematically, the problem involves finding the minimum portfolio variance.⁶ That is,

$$\text{Minimize } \operatorname{Var}(r_p) = \sum_{i=1}^{n}\sum_{j=1}^{n} x_i x_j \sigma_{ij} \qquad (16.8)$$
subject to two Lagrangian constraints. The first constraint requires that the desired expected return E* be achieved. This is equivalent to requiring that the following equation cannot be violated.

$$\sum_{i=1}^{n} x_i E(r_i) - E^* = 0 \qquad (16.9)$$
The second constraint requires that the weights sum to unity. This constraint is equivalent to requiring that Equation (16.10) is true.

$$\sum_{i=1}^{n} x_i - 1 = 0 \qquad (16.10)$$
Combining these three quantities yields the Lagrangian objective function of the risk minimization problem with a desired return constraint:

$$z = \sum_{i=1}^{n}\sum_{j=1}^{n} x_i x_j \sigma_{ij} + \lambda_1\left(\sum_{i=1}^{n} x_i E(r_i) - E^*\right) + \lambda_2\left(\sum_{i=1}^{n} x_i - 1\right) \qquad (16.11)$$
The minimum risk portfolio is found by setting ∂z/∂x_i = 0 for i = 1, ..., n and ∂z/∂λ_j = 0 for j = 1, 2 and solving the resulting system of equations for the x_i's. The number of assets analyzed, n, can be any positive integer.
⁶ For more information about mathematical portfolio analysis, see Francis and Alexander (1986).

16.11 Calculus Minimization of Risk: A Three-Security Portfolio

For a three-security portfolio, the objective function to be minimized is shown below.

$$
\begin{aligned}
z = {}& x_1^2\sigma_{11} + x_2^2\sigma_{22} + x_3^2\sigma_{33} + 2x_1x_2\sigma_{12} + 2x_1x_3\sigma_{13} + 2x_2x_3\sigma_{23}\\
&+ \lambda_1(x_1E_1 + x_2E_2 + x_3E_3 - E^*) + \lambda_2(x_1 + x_2 + x_3 - 1) \qquad (16.12)
\end{aligned}
$$

Setting the partial derivatives of z with respect to all variables equal to zero yields equation system (16.13) below.

$$
\begin{aligned}
\partial z/\partial x_1 &= 2x_1\sigma_{11} + 2x_2\sigma_{12} + 2x_3\sigma_{13} + \lambda_1 E_1 + \lambda_2 = 0\\
\partial z/\partial x_2 &= 2x_2\sigma_{22} + 2x_1\sigma_{12} + 2x_3\sigma_{23} + \lambda_1 E_2 + \lambda_2 = 0\\
\partial z/\partial x_3 &= 2x_3\sigma_{33} + 2x_1\sigma_{13} + 2x_2\sigma_{23} + \lambda_1 E_3 + \lambda_2 = 0\\
\partial z/\partial \lambda_1 &= x_1E_1 + x_2E_2 + x_3E_3 - E^* = 0\\
\partial z/\partial \lambda_2 &= x_1 + x_2 + x_3 - 1 = 0
\end{aligned}
\qquad (16.13)
$$
All the equations in system (16.13) are linear, since the weights (the x_i's) are the variables and they are all of degree one. As a result, the system of equations may be solved as a system of simultaneous linear equations. The matrix representation of this system of linear equations is shown below as matrix Equation (16.14).

$$
\begin{bmatrix}
2\sigma_{11} & 2\sigma_{12} & 2\sigma_{13} & E_1 & 1\\
2\sigma_{21} & 2\sigma_{22} & 2\sigma_{23} & E_2 & 1\\
2\sigma_{31} & 2\sigma_{32} & 2\sigma_{33} & E_3 & 1\\
1 & 1 & 1 & 0 & 0\\
E_1 & E_2 & E_3 & 0 & 0
\end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3\\ \lambda_1\\ \lambda_2 \end{bmatrix}
=
\begin{bmatrix} 0\\ 0\\ 0\\ 1\\ E^* \end{bmatrix}
\qquad (16.14)
$$

This system may be solved several different ways. With matrix notation, the inverse of the coefficient matrix, denoted C⁻¹, may be used to find the solution (weight) vector x as follows:

$$
\begin{aligned}
Cx &= k\\
C^{-1}Cx &= C^{-1}k\\
Ix &= C^{-1}k\\
x &= C^{-1}k
\end{aligned}
\qquad (16.15)
$$
The solution will give the n weights in terms of E*. If n = 3 we have:

$$
\begin{aligned}
x_1 &= a_1 + d_1E^*\\
x_2 &= a_2 + d_2E^*\\
x_3 &= a_3 + d_3E^*
\end{aligned}
\qquad (16.16)
$$

where the a_i and d_i are constants. For any desired E* the equations give the weights of the minimum risk portfolio. These are the weights of a portfolio on the efficient frontier. By varying E*, the weights may be generated for the entire efficient frontier. The risk, Var(r_p), of the efficient portfolios may be calculated, and the efficient frontier may be graphed. As a numerical example, the data from the three-security portfolio problem in Table 16.4 are solved to obtain the following coefficients matrix.
Table 16.4 Statistical inputs for portfolio analysis of three common stocks

Asset   E(r_i)                 Var(r_i) = σ_ii    Cov(r_i, r_j) = σ_ij
1       E(r_1) = 5% = 0.05     σ_11 = 0.1         σ_12 = -0.1
2       E(r_2) = 7% = 0.07     σ_22 = 0.4         σ_13 = 0.0
3       E(r_3) = 30% = 0.3     σ_33 = 0.7         σ_23 = 0.3

$$
\begin{bmatrix}
2\sigma_{11} & 2\sigma_{12} & 2\sigma_{13} & E_1 & 1\\
2\sigma_{21} & 2\sigma_{22} & 2\sigma_{23} & E_2 & 1\\
2\sigma_{31} & 2\sigma_{32} & 2\sigma_{33} & E_3 & 1\\
1 & 1 & 1 & 0 & 0\\
E_1 & E_2 & E_3 & 0 & 0
\end{bmatrix}
=
\begin{bmatrix}
2(0.1) & 2(-0.1) & 2(0) & 0.05 & 1\\
2(-0.1) & 2(0.4) & 2(0.3) & 0.07 & 1\\
2(0) & 2(0.3) & 2(0.7) & 0.3 & 1\\
1 & 1 & 1 & 0 & 0\\
0.05 & 0.07 & 0.3 & 0 & 0
\end{bmatrix}
\qquad (16.17)
$$
Multiplying the inverse of this coefficients matrix by the constants vector k yields the weights vector (C⁻¹k = x) as shown below.⁷

$$
C^{-1}k =
\begin{bmatrix}
0.677 & 0.736 & 0.059 & 0.789 & -1.433\\
0.736 & 0.800 & 0.064 & 0.447 & -2.790\\
0.059 & 0.064 & 0.005 & -0.236 & 4.223\\
-1.433 & -2.790 & 4.223 & 0.522 & -15.869\\
0.789 & 0.447 & -0.236 & -0.095 & 0.522
\end{bmatrix}
\begin{bmatrix} 0\\ 0\\ 0\\ 1\\ E^* \end{bmatrix}
=
\begin{bmatrix} x_1\\ x_2\\ x_3\\ \lambda_1\\ \lambda_2 \end{bmatrix}
\qquad (16.18)
$$
Evaluating the weights vector yields the system of equations below.

$$
\begin{aligned}
x_1 &= 0.789 - 1.433E^*\\
x_2 &= 0.447 - 2.790E^*\\
x_3 &= -0.236 + 4.223E^*\\
\lambda_1 &= 0.522 - 15.869E^*\\
\lambda_2 &= -0.095 + 0.522E^*
\end{aligned}
\qquad (16.19)
$$
The weights in the three equations sum to unity (that is, x1 + x2 + x3 = 1 for any given value of E*). They are all linear functions of E*, and they represent the weights of the three securities in the efficient portfolio at the point where E(r_p) = E*. Varying E* generates the weights of all the efficient portfolios.⁸
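The matrix solution in Equations (16.14)-(16.19) can be verified numerically. A minimal sketch follows, using the Table 16.4 inputs (with σ12 = -0.1, consistent with the coefficient matrix in Equation (16.17)):

```python
import numpy as np

# Build the coefficient matrix C of Equation (16.14) from Table 16.4 and
# solve C x = k for several target returns E*, recovering Equation (16.19).
E = np.array([0.05, 0.07, 0.30])          # expected returns
V = np.array([[ 0.1, -0.1, 0.0],          # covariance matrix
              [-0.1,  0.4, 0.3],
              [ 0.0,  0.3, 0.7]])

C = np.zeros((5, 5))
C[:3, :3] = 2.0 * V
C[:3, 3], C[:3, 4] = E, 1.0               # lambda_1 and lambda_2 columns
C[3, :3], C[4, :3] = 1.0, E               # budget row and return row

for E_star in (0.05, 0.10, 0.20):
    k = np.array([0.0, 0.0, 0.0, 1.0, E_star])
    x = np.linalg.solve(C, k)             # (x1, x2, x3, lambda1, lambda2)
    print(f"E* = {E_star:.2f}: weights = {np.round(x[:3], 3)}")
```

For E* = 0.10, for instance, the solver returns roughly (0.646, 0.168, 0.186), which matches the linear functions in Equation (16.19).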
16.12 Conclusion

In conclusion, it is helpful to note that using a computer to apply Markowitz diversification to a collection of candidate investments provides a practical approach to portfolio analysis. Markowitz portfolio analysis can consider both the risk and return of dozens or hundreds of different securities simultaneously; the number is limited only by the size of the computer and the number of securities for which the risk and return input statistics are available. No human mind can simultaneously evaluate numerous different investment opportunities and balance the risks and returns of them all against one another to find efficient portfolios that dominate all other investment opportunities. Markowitz portfolio analysis is essentially a mathematics problem requiring that many different equations be solved simultaneously. The computations can be done on a large scale by using a computer program that does "quadratic programming." Quadratic programming (QP) minimizes the portfolio's risk (a quadratic equation) at each level of average return for the portfolio.⁹ Markowitz portfolio theory has remained undiminished for over half a century. Commercially available raw data and new computers make it easier and cheaper to employ Markowitz's QP every year. Now is an excellent time to utilize Markowitz's Nobel Prize winning theory to manage your portfolio.

⁷ Portfolio solutions depend on the covariance matrix. See Stevens, G. V. G. "On the inverse of the covariance matrix in portfolio analysis." Journal of Finance, October 1998, 53(5), 1821-1827.
⁸ Additional details and further implications have been published by Francis and Ibbotson (2002).
⁹ Computer programs to perform portfolio analysis and other forms of investment analysis are available. For an Excel template that you can construct yourself, see Chapter 9 of Craig W. Holden, Excel Modeling in Investments, 2nd ed., Prentice-Hall, Upper Saddle Ridge, NJ, 2005, ISBN: 0-13-142426-2. Or see Chapters 5 and 6 of Holden's third edition, Excel Modeling and Estimation, 2009.

References

Francis, J. C. and G. Alexander. 1986. Portfolio analysis, 3rd ed. Prentice-Hall, Englewood Cliffs, NJ.
Francis, J. C. and R. Ibbotson. 2002. Investments: a global perspective. Prentice-Hall, Upper Saddle River, NJ. ISBN: 0-13-890740-4.
Holden, C. W. 2005. Excel modeling in investments, 2nd ed. Prentice-Hall, Upper Saddle Ridge, NJ. ISBN: 0-13-142426-2.
Holden, C. W. 2009. Excel modeling and estimation, 3rd ed. Prentice-Hall, Upper Saddle Ridge, NJ.
Markowitz, H. 1952. "Portfolio selection." Journal of Finance 7(1), 77-91.
Markowitz, H. 1959. Portfolio selection. Cowles Foundation Monograph 16, Wiley, New York.
Malevergne, Y. and D. Sornette. 2005. "Higher-moment portfolio theory: capitalizing on behavioral anomalies of stock markets." Journal of Portfolio Management 31, 49-55.
Stevens, G. V. G. 1998. "On the inverse of the covariance matrix in portfolio analysis." Journal of Finance 53(5), 1821-1827.
Szego, G. 2002. "Measures of risk." Journal of Banking and Finance 26, 1253-1272.
Chapter 17
Portfolio Theory, CAPM and Performance Measures

Luis Ferruz, Fernando Gómez-Bezares, and María Vargas
Abstract This chapter is focused on the "Portfolio Theory" created by Markowitz. This theory has the objective of finding the optimum portfolio for investors; that is, the one at which an indifference curve is tangent to the efficient frontier. In this chapter, the mathematics of this model is developed. The CAPM, based on this theory, gives the expected return on an asset depending on the systematic risk of the asset. This model detects underpriced and overpriced assets. The criticisms expressed against the model and its application possibilities are also analyzed. Finally, the chapter centers on performance measures related to portfolio theory (classic indices, derivative indices and new approaches) and on the performance persistence phenomenon employing the aforementioned indices, including an empirical example.

Keywords CAPM · Performance measures · Penalized internal rate of return index (PIRR) · Penalized internal rate of return for beta index (PIRR for Beta) · Portfolio theory
17.1 Portfolio Theory and CAPM: Foundations and Current Application

17.1.1 Introduction

Portfolio theory has become highly developed and has strong theoretical support, making it essential in the toolbox of any financial expert. In 1990, three American economists shared the Nobel Prize for Economics: Harry Markowitz, Merton Miller, and
William Sharpe. Markowitz is considered the creator of portfolio theory. He began this field of research with his work of 1952, making major additions with his works of 1959 and 1987. Sharpe attempted to simplify the model of his mentor, Markowitz, with his work of 1963, based on which he created the capital asset pricing model (CAPM) in 1964. Miller has also worked on portfolio topics. Another Nobel Prize winner for Economics, James Tobin, contributed to the advance of portfolio theory with his work in 1958. Merton,1 Black and Fama are others who have made great advances in the field. A portfolio is a combination of assets, such as a group of stocks that are diversified to some extent. A portfolio is characterized by the return it generates over a given holding period. Portfolio theory describes the process by which investors seek the best possible portfolio in terms of the tradeoff of risk for return. In this context, risk refers to the situation where the investor is uncertain about which outcomes may eventuate, but can assess the probability with which each outcome may arise. In a risky environment, the choice of an optimal portfolio will depend on the distribution of returns, as well as the preferences of the investor – his or her appetite for risk. It is clear that all decision makers have their own personal preferences, but it is possible to posit some generalizations. First, let us suppose that all decision makers prefer more wealth to less and, second, that decision makers are risk averse. We define risk as the variability of the results of an activity; this variability can be measured by standard deviation or by variance. If we assume, as is customary, the marginal utility of an individual as a decreasing function of wealth, then the least risky activity is preferred, given the same average results. Although the risk aversion behavior of individuals has been questioned, the literature is based on the assumption
L. Ferruz and M. Vargas
University of Zaragoza, Zaragoza, Spain
e-mail: [email protected]
F. Gómez-Bezares
University of Deusto, Bilbao, Spain
¹ Merton also won a Nobel Prize in 1997 along with Scholes.
that risk must be rewarded to be accepted by the individual. This reward is made by awarding “prizes” for risky activity with a higher expected payoff. Therefore, the return premium represents the reward for risk.
17.1.2 The Mean-Variance Model

Decision makers will be interested in knowing the probability distribution for the return on their investment. If we assume that this probability distribution is normal, it will be defined by two parameters: the mathematical expectation (mean) and the standard deviation (or its square, the variance). Therefore, investors can be said to make decisions based on mean and variance, hence the "mean-variance" model. Asset i will produce, in period t, a return r_i, which is a random variable with mean μ_i and standard deviation σ_i. Consequently, each asset will be a point on the (σ, μ) map. The map, shown in Fig. 17.1, represents the group of possible investments (portfolios) from which the investor will be able to choose. Investors must then choose among the various points on this map of possible opportunities. Each individual has a particular preference, which constitutes the individual's system of indifference curves. An indifference curve is the locus of points on the map that yield equal utility for the individual. In Fig. 17.2, a system of indifference curves typical for an individual who is averse to risk is shown. Characteristically, the indifference curves are positive-sloped and convex, since the higher the deviation in results, the higher is the mean demanded by individuals to maintain themselves at the same level of indifference. On the other hand, the further the curve is from the horizontal axis, the greater the level of utility that it represents.
17.1.3 The Efficient Frontier

A portfolio is efficient when it yields a higher average return for a given risk, and a lower risk for a given average return. Such portfolios make up the efficient frontier, which is a subgroup of the minimum-variance frontier, within which are located those portfolios that have a lower risk for a given average return. We now show the process for obtaining the minimum-variance frontier, following the work of Merton (1972).² We start with a portfolio made up of n securities (all risky and such that none can be obtained by a linear combination of the others) with returns r_i and weightings w_i:

R′ = (r_1, r_2, ..., r_n)   W′ = (w_1, w_2, ..., w_n)

We also adopt the following nomenclature: U is a vector of ones, so W′U = 1 (the proportions must add up to 1); P = R′W is the return on the portfolio and E(P) = E(R)′W its average return; Σ is the variance-covariance matrix of the returns on the n securities; VAR(P) = W′ΣW is the variance of the portfolio; and COV(R, P) = ΣW is the vector of covariances between the returns on the securities and the portfolio. Determination of the minimum-variance frontier results in minimization of the variance for each value of the average return; that is, variance is minimized assuming a return expectation E*:

MIN: W′ΣW
Subject to: E(P) = E(R)′W = E*,  W′U = 1.
Fig. 17.1 The group of possible investments (portfolios) in (σ, μ) space from which the investor will be able to choose

Fig. 17.2 A system of indifference curves typical for an individual who is averse to risk
² Adapted by Gómez-Bezares (2006).
Fig. 17.3 The frontier is a hyperbola on the mean-deviation map, with branches μ = A/C ± √(D(Cσ² - 1))/C and asymptotes μ = A/C ± √(D/C)·σ

Fig. 17.4 An efficient frontier with a risk-free asset: a straight line from r0 tangent to the risky-asset frontier at point Z
Finding the minimum conditioned by Lagrange, we obtain the W vector, which provides the proportions, as a function of the portfolio expectation, of the securities in the portfolios that compose the frontier:

$$W = \frac{(C\,E(P) - A)\,\Sigma^{-1}E(R) + (B - A\,E(P))\,\Sigma^{-1}U}{D}, \qquad (17.1)$$
where A = U′Σ⁻¹E(R); B = E(R)′Σ⁻¹E(R); C = U′Σ⁻¹U; and D = BC - A². It can be demonstrated that the frontier is a hyperbola on the mean-deviation map (see Fig. 17.3), and a parabola on the mean-variance map (Merton 1972³ and Gómez-Bezares 2006). Furthermore, we can rewrite W = E(P)G + H, where G and H are vectors, in such a way that, given that each w_i is linearly related with E(P) and knowing the W values of two portfolios that are on the frontier, we obtain G and H and, therefore, we obtain the Ws of all portfolios that make up the frontier. If we designate λ1 the Lagrange multiplier corresponding to the first condition, whose formula is λ1 = 2[C·E(P) - A]/D, it can be demonstrated that λ1 represents the slope, measured by its inverse, of the minimum-variance frontier on the mean-variance map. Depending on how λ1 varies, we obtain a new portfolio of minimum variance. For positive values of λ1 we obtain the sub-group called the efficient frontier, which, as well as having minimum variance for a given expectation, has maximum expectation for a given variance.
³ See also Roll (1977).
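A short numerical sketch of Equation (17.1) follows; the covariance matrix and expected return vector below are illustrative assumptions, not data from the chapter.

```python
import numpy as np

# Minimum-variance frontier weights from Equation (17.1), built from the
# scalars A, B, C, D defined in the text. Sigma and ER are assumed inputs.
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
ER = np.array([0.06, 0.09, 0.12])
U = np.ones(3)

Sinv = np.linalg.inv(Sigma)
A = U @ Sinv @ ER
B = ER @ Sinv @ ER
C = U @ Sinv @ U
D = B * C - A**2

def frontier_weights(EP):
    """W = [(C*E(P) - A) Sinv E(R) + (B - A*E(P)) Sinv U] / D."""
    return ((C * EP - A) * (Sinv @ ER) + (B - A * EP) * (Sinv @ U)) / D

W = frontier_weights(0.09)
print(W, W.sum(), W @ ER)   # the weights sum to 1 and hit the target return
```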
17.1.4 Efficient Frontier with a Risk-Free Asset

It is possible to construct portfolios as a combination of risky and risk-free assets. On the mean-deviation map, a straight line represents the new portfolios. The highest straight line will be the tangent to the previous efficient frontier at point Z; thus, a new efficient frontier will appear (Fig. 17.4). On the left of Z, we have portfolios composed of a proportion of risk-free assets and another of portfolio Z; on the right, we invest more than 100% in portfolio Z, accepting debt at the risk-free rate. We designate W as the vector of the weightings of investments in risky assets, and (1 - W′U) as the proportion invested in the risk-free asset, with return r0. In this case, we also minimize variance for a value E* of the mean:

MIN: W′ΣW
Subject to: E(P) = E(R)′W + (1 - W′U)·r0 = E*

Resolving the minimum conditioned by Lagrange, we obtain the vector W:

$$W = \frac{(E(P) - r_0)\,\Sigma^{-1}(E(R) - U r_0)}{B - 2Ar_0 + Cr_0^2} = (E(P) - r_0)\,M, \qquad (17.2)$$

designating M the corresponding vector. λ continues to be a linear function of E(P):

$$\lambda = \frac{2\,[E(P) - r_0]}{B - 2Ar_0 + Cr_0^2} \qquad (17.3)$$
in such a way that the various W vectors are proportional to the M vector. This leads to Tobin’s Separation Theorem (1958): When there is a risk-free asset (that can be sold short) on the frontier, all W vectors (that contain risky assets) will have the same proportions of assets, with only the proportion between these and the risk-free asset varying.
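The separation property can be illustrated directly from Equation (17.2): every frontier vector of risky weights is proportional to M = Σ⁻¹(E(R) - U·r0). The sketch below uses assumed inputs purely for illustration.

```python
import numpy as np

# Tobin separation: the fully risky portfolio Z is M rescaled to sum to 1;
# only the split between Z and the risk-free asset varies with the target.
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])   # assumed covariance matrix
ER = np.array([0.07, 0.11])                      # assumed expected returns
r0 = 0.03
U = np.ones(2)

M = np.linalg.inv(Sigma) @ (ER - U * r0)
Z = M / M.sum()                                  # tangency portfolio
print("Z =", np.round(Z, 3))

for EP in (0.05, 0.08, 0.12):                    # targets on the new frontier
    w_risky = (EP - r0) / (Z @ ER - r0) * Z
    print(EP, np.round(w_risky, 3), "risk-free share:",
          round(1 - w_risky.sum(), 3))
```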
17.1.5 Efficient Frontier with Inequality Restrictions

Until now, the only restriction considered has been that the proportions invested in securities must sum to one. Other equality restrictions could be added and it would still hold that, on the frontier, each portfolio is a linear combination of any two portfolios, and that the W values are a linear function of E(P), and therefore of λ1. However, the problem becomes more complex when we add inequality restrictions (the standard problem) to the basic problem, which contains only equality restrictions. In the standard problem, there are restrictions such as a_i ≤ w_i ≤ b_i, so that the relationships stop being linear for every value of λ1, but remain linear within each interval. For example, it could be that up to a value λ1 = x no inequality restriction binds; that is, there is a basic sub-problem. If, at this point, some restriction becomes binding, it holds as an equality up to λ1 = y, meaning that between x and y there is a new basic sub-problem (here the evolution will be linear). Therefore, a standard problem is a group of basic sub-problems. Points x and y are called corner points, meaning that each portfolio on the frontier is a linear combination of two adjacent corner portfolios. In this way, if there is no risk-free asset, between each two adjacent corner portfolios we will have a piece of hyperbola, but the complete frontier will no longer be a single hyperbola (see Fig. 17.1). If securities cannot be sold short, except for the risk-free asset, whose short sale means getting into debt, but securities can be bought, the restrictions, which will only affect the risky assets, take the form 0 ≤ w_i ≤ 1. In this case, since we are able to put ourselves into debt in the risk-free asset, the upper line that is the efficient frontier will be unlimited towards the right. Therefore, the efficient part of the frontier will be a straight line that responds to a basic sub-problem within a standard problem.
Fig. 17.5 An optimum portfolio when the risk-free asset can be emitted without restriction

17.1.6 Application to the Market

Continuing with the case of inequality restrictions (supposing that there are limitations on short-selling, as in Fig. 17.1),
individuals will search for their optimum portfolio, which will be that portfolio that reaches the highest possible indifference curve, that is, that in which tangency is produced between one of the indifference curves and the efficient frontier. The problem here is that we do not know where the market portfolio will be located (the sum of all individuals’ portfolios), nor whether it will be efficient. However, if we assume a risk-free asset that can be emitted without restriction, then all individuals will invest a part of their wealth in the risk-free asset – positively or negatively – and another part in the R portfolio, which is the market portfolio. This is because all individuals will invest the risk-based portion of their wealth in the proportions of R (see Fig. 17.5). Of course, we assume uniform expectations (agreement) among investors. It is also assumed that there is equilibrium between supply and demand and a uniform interest rate for everyone, in terms of lending and borrowing. In this case, the efficient frontier is the so-called Capital Market Line, CML.
17.1.7 The Capital Asset Pricing Model The capital asset pricing model (CAPM), or Sharpe-Lintner model, stands out among asset pricing models. This model reflects how the expected return on an asset is a function of the expected returns on the market and the risk-free asset and of the relevant (systematic) risk of that asset. If this model is complied with, at equilibrium, all assets adjust their values so as to offer the return that corresponds to them, thereby allowing investors to determine whether an asset is undervalued or overvalued. This model was created in the pioneering work of Sharpe (1961, 1964), Lintner (1965), among others, and has been adapted over time to reflect different aspects of the real financial world.
Although the starting hypotheses of the CAPM place us in markets with perfect competition, which creates the potential for lack of realism, the conclusions of the model adapt well to reality.⁴ In addition, recent technological advances, new products, and greater knowledge of the markets make markets in general seem ever more like the ideal markets supported by the CAPM. Portfolio theory is the fundamental theoretical base of the CAPM. This model accepts that there is a risk-free asset with return r0 (that can be emitted), and claims that the securities that make up a portfolio on the efficient frontier, with return P, must comply with the following equation:

$$E(R) = r_0 U + [E(P) - r_0]\,\frac{\operatorname{COV}(R, P)}{\operatorname{VAR}(P)} \qquad (17.4)$$
According to this model, the expected return on a security will depend on the portfolio P and the return on the risk-free asset; P represents any portfolio that is located on the efficient frontier. Furthermore, if we suppose there is agreement, all individuals will make the same forecast of the portfolio map, and if the market has reached equilibrium, the market portfolio R will be efficient. We return to this concept later.
17.1.8 The Market Model

Sharpe (1963) attempts to reach a simplification of the Markowitz model by proposing the market model. The simplifying hypothesis of this model posits that the relations between the returns on the various assets stem solely from the relation between the returns on all assets and that of a market index. It also assumes the existence of a linear relation between the return on each asset and that of the index. Using the market portfolio as a proxy for the market index, we have the model

$$R_{it} = \alpha_i + \beta_i R^*_t + \varepsilon_{it}, \qquad (17.5)$$
where R_it is the return on asset i in period t, and it is usually considered that ε_it is an error term such that:

E(ε_it) = 0
VAR(ε_it) = σ²(ε_i) for all t (homoscedasticity)
COV(ε_it, ε_it′) = 0 for all t, t′ (no autocorrelation)
COV(ε_it, R*_t) = 0 for all t.

Hence, we obtain

$$\sigma^2(R_i) = \beta_i^2\,\sigma^2(R^*) + \sigma^2(\varepsilon_i) \qquad (17.6)$$
in such a way that the first member of this equation represents the risk of the asset measured by its variance, and the second has two components: the first measures the systematic risk and the second the diversifiable risk. We commented above that in this model the only relationship between the returns on the assets is the shared relationship with the return on the index; therefore, COV(ε_it, ε_jt) = 0 if i ≠ j. But this condition is difficult to accept, since the application of factorial analysis demonstrates that there are sectorial factors, which corroborates the idea that, aside from the correlation through the market, there are other sources of correlation. It is precisely based on said condition of zero covariance that it is demonstrated that diversifiable risk really does disappear in the portfolio; however, fortunately, even though there are sector effects, careful diversification enables us to maintain the distinction between systematic and diversifiable risk. Systematic risk derives from the general economy and affects all assets to some extent; diversifiable, or specific, risk affects one asset in particular. If an asset has a high diversifiable risk, it will be easy to find other assets that do not correlate with it, forming a diversified portfolio with much less risk. Systematic risk is measured through the first component of the second member of formula (17.6), but given that market variance is equal for all the assets, we can measure this risk using the beta parameter. This parameter is the regression slope between the asset return and that of the market (Security Characteristic Line, SCL). We will call a security "aggressive" if it amplifies market variations (β > 1) and "defensive" if it dampens them (β < 1), as shown in Fig. 17.6. The market portfolio has a slope equal to unity, while the risk-free asset has a null slope. There are also assets with negative slopes that allow the total risk of a well-diversified portfolio to be reduced.
⁴ Although we will discuss this matter later.
Fig. 17.6 An aggressive and a defensive security (Security Characteristic Lines)
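Beta and the risk decomposition of Equation (17.6) can be estimated from return series. The sketch below runs on simulated data; none of the numbers come from the chapter.

```python
import numpy as np

# Estimate the SCL by regressing asset returns on market returns, then
# split total variance into systematic and diversifiable parts (17.6).
rng = np.random.default_rng(0)
r_mkt = rng.normal(0.01, 0.05, 260)                # simulated market returns
r_asset = 0.002 + 1.3 * r_mkt + rng.normal(0.0, 0.03, 260)  # "aggressive"

beta, alpha = np.polyfit(r_mkt, r_asset, 1)        # SCL slope and intercept
var_mkt = np.var(r_mkt, ddof=1)
systematic = beta**2 * var_mkt
diversifiable = np.var(r_asset, ddof=1) - systematic
print(f"alpha={alpha:.4f}  beta={beta:.2f}  "
      f"systematic={systematic:.5f}  diversifiable={diversifiable:.5f}")
```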
17.1.9 Relation Between the Market Model and CAPM

The relevant risk from a security is its systematic risk, since it is this risk that each security introduces to the market portfolio, which by definition does not have diversifiable risk. On the other hand, the systematic risk of a portfolio is the linear combination of the systematic risks of the securities that comprise it. Securities have differing levels of systematic risk, depending on how defensive or aggressive they are, and must yield returns accordingly. The CAPM confirms that securities must produce returns that are dependent entirely on their systematic risk. If we consider formula (17.4) of the CAPM and substitute P for R* (the market portfolio, since this is the only efficient portfolio made up solely of risky assets), we obtain

$$E(R) = r_0 U + [E(R^*) - r_0]\,\frac{\operatorname{COV}(R, R^*)}{\operatorname{VAR}(R^*)} \qquad (17.7)$$

and, given that the regression betas between the asset returns and that of the market index are β = COV(R, R*)/VAR(R*), we arrive at the usual equation of the CAPM

$$E(R) = r_0 U + [E(R^*) - r_0]\,\beta, \qquad (17.8)$$
where β is the vector that contains the betas. Considering CAPM formula (17.8), we note that assets must yield returns depending on their systematic risk, or beta, as follows: their expected return is equal to that on the risk-free asset plus a premium for risk, which is the product of the premium that the market receives and the quantity of systematic risk accepted. Expected returns on assets, at equilibrium, must be adapted to the previous equation, leading to the Security Market Line, SML (Fig. 17.7). Securities located above the SML will be undervalued, as they produce greater returns than expected and, as they are much in demand, their prices will increase until they reach the SML; those securities located under the SML will be overvalued and will reach the SML by a process inverse to that of those located above the SML.
17.1.10 The Pricing Model

The CAPM is an asset pricing model because it considers that assets are undervalued (overvalued) when their expected return is above (below) that demanded by the model. However, the constraint that a security must produce returns that depend on its systematic risk goes against the idea of business diversification; that is, the investment is not valued by the risk that it brings to the company, but rather by the risk brought to the shareholder. However, and contrary to what is proposed by the CAPM, which does not consider diversifiable risk, there are investments that are difficult to diversify, as is the case for owners of family companies who invest the majority of their savings in their businesses. Also, the fact that few securities are listed on the stock market hinders diversification. These are only two reasons among many why risk diversification is complex, and why diversifiable risk can be important. There are other occasions when it is necessary to pay attention to total risk and not just systematic risk; for example, bankruptcy risk, which is dependent on total risk. All in all, the CAPM produces several interesting results that cannot be ignored. From the CAPM, we deduce an important maxim: a firm must do only what the shareholder cannot do individually; that is, the firm should diversify to seek synergies; however, to diversify diversifiable risk, the shareholder alone is sufficient (this is true, above all, in large companies).
17.1.11 Contrasts and Controversy Regarding the CAPM
SML
E(R* )
r0 VAR(R*) β=1
Therefore, intelligent investors will eliminate diversifiable risk from their portfolios because it is not rewarded; they will retain only the systematic risk and will be rewarded as a result. The extent of their aversion to risk will lead them to accept a certain quantity of risk along the CML. On the CML are located the efficient portfolios, which do not have diversifiable risk, only systematic – given that the market portfolio has systematic risk only, and that all portfolios on the CML have one portion of the market portfolio and a portion of the risk-free asset.
COV(Ri,R*) BETAi
Fig. 17.7 Figure 17.7 shows the Security Market Line
There are numerous studies that criticize the starting hypotheses of the CAPM and that analyze the effects of their relaxation. In general, we point out two major levels of criticism: for professionals of the financial world, the CAPM is too theoretical, and for the academic world, the model does not adequately reflect a reality that is, in fact, much more
complex. However, the most important thing is to determine whether the CAPM can adapt to reality; that is, whether it is sufficiently robust. At the beginning of the 1970s, the CAPM was viewed optimistically. By the late 1970s, the first criticisms arose, which grew in the 1980s, and by the 1990s there was a great deal of controversy associated with the CAPM. To determine whether the CAPM actually works, it seems logical to study whether, using past data, the predicted relationship between an asset's return and its systematic risk is borne out; there are diverse methodologies that enable this task. Equation (17.8) represents an ex ante model that has the drawback that expectations cannot be observed; but if we employ rational expectations, we may test it based on past data (ex post model). To do this, a regression is applied between average returns on assets and their betas (the empirical market line):

$$\bar{R}_j = y_0 + y_1\,\beta_j + u_j, \qquad (17.9)$$

where y0 represents the risk-free interest rate, and y1 the market premium for risk. The first contrasts are favorable to the CAPM. In this way, for example, Black et al. (1972) achieve a good linear fit between average returns and betas with a positive y1, although y0 does not represent the risk-free interest rate. Fama and MacBeth (1973) also obtain results consistent with a linear model, in which beta is the only risk measure and a positive risk premium is observed. In general, at this time, a consensus held that the market correctly prices securities and that the CAPM is a good pricing model. However, the first critics were not long in arriving. Roll (1977) holds that the only way to test the CAPM is to check whether the market portfolio is efficient ex post; but a problem with the CAPM is that the real market portfolio cannot be observed, as there are no portfolios that include all the assets of the economy. Another criticism of the CAPM is from Van Horne (1989), who makes clear that betas vary considerably according to the index employed as a proxy for the real market portfolio. However, the work that generated the most controversy about the CAPM was that of Fama and French (1992), which shows a small power of betas to explain average returns, and which emphasizes the existence of other variables that do explain them, such as size and the book-to-market ratio. Nevertheless, Kothari et al. (1992) examine Fama and French (1992) and find that the tests used are not powerful enough. They also observe, using a different source, that the relationship between average returns and the book-to-market ratio is weak, although they do not make the same observation regarding size. They also find that annual betas make the relationship with returns clearer. That is, in short, they consider that beta continues to be a useful instrument to explain average returns.
Roll and Ross (1994) also analyze the work of Fama and French (1992) and point out that a positive and precise relationship between expected returns and betas will be found if the market index used to find the betas is in the efficient frontier. The problem is that the market portfolio cannot be observed, meaning that proxies must be found, and if the proxy used is not efficient, it should not surprise us that the average returns/betas relationship is not found. Therefore, these authors conclude that although the CAPM cannot be rejected, it is overly dependent on the proxy chosen for the market portfolio, which detracts from the CAPM’s robustness. Later, Fama and French (1993) proposed a multifactor model with five factors to explain return on stocks and bonds (market, size, and book-to-market for stocks, along with two specific factors for bonds). In this way, they reach an approach strongly related to the APT. Fama and French (1996a) defend their model, insisting that beta, in itself, cannot explain expected returns. Fama and French (1996b) maintain that average returns are related to several variables, which, given that they are not explained by the CAPM, are called anomalies. However, these anomalies disappear when a model with three factors is constructed. However, the controversy continued. MacKinlay (1995) contends that the multi-factor models do not resolve all deviations of the CAPM. Kim (1995) believes that the problems with using betas to explain expected returns stem from error problems in the variables. Jagannathan and Wang (1996) think that the problem lies in the fact that empirical studies assume that betas remain constant, and they propose a conditional CAPM that is criticized by authors such as Ghysels (1998), and Lewellen and Nagel (2006), who believe that the conditional CAPM does not explain the pricing errors of the traditional model, since the variation in betas and the risk premia would have to be very large to explain such major anomalies in pricing as momentum or value premium. More recently, Daniel et al. (2001), using the Daniel and Titman test (1997), rejected the model of Fama and French (1993) when evaluating whether the value premium found in the average returns on the largest Japanese stocks represents a compensation for the risk taken. This work was probably undertaken as a way to counter criticism from Davis et al. (2000), who state that a three-factor model better explains the asset premium than does the Daniel and Titman characteristics model (1997). Many other authors have joined the controversy about the CAPM, for example, Chan and Lakonishok (1993), Grundy and Malkiel (1996), Malkiel and Xu (1997), and Bruner et al. (1998), among others. Among the most recent studies on portfolio theory, that of Dimson and Mussavian (1999) stands out. These authors conducted a review of financial asset pricing theory and point out that derivative pricing is the latest trend. Another notable
study is Jegadeesh and Titman (2001) finding support for the hypothesis of behavioral models, which states that the returns on momentum strategies stem from subsequent reactions to the formation of these strategies. Lastly, Liu (2006) demonstrates that liquidity is an important source of risk, and that liquidity is not considered by the CAPM or by the model of Fama and French. This author believes that a two-factor model (liquidity and market factors) better explains stock returns and justifies the effect of the book-to-market ratio, which cannot be explained by the three-factor model of Fama and French. In general, there is consensus that investors seek more return when a greater risk is taken, but it is not so clear how they determine either the risk or the reward demanded. Several asset pricing models have attempted to answer these questions; among them, the CAPM is the most often employed model by both financial experts and academics, despite the difficulties inherent in its empirical contrast. The CAPM is not a complete model; rather, it is a simplification of reality. But its competitors do not achieve superior results, and they are more complex. Furthermore, the models that are created from empirical results and have little theoretical basis, such as that of Fama and French, are susceptible to data mining bias, as Black (1993) shows, making models with rigorous theoretical foundation, such as the CAPM, preferable. Additionally, and despite all the criticism that it has been subjected to since its appearance, we are able to confirm that the CAPM is very efficacious within companies when it is used to determine investment and finance policies. It is also useful in the markets, as it helps to determine the return that should be demanded from an asset, as well as the level of risk that is warranted. CAPM also enables us to evaluate fund managers, comparing the returns they achieve with the risk they bear.
17.2 Performance Measures Related to Portfolio Theory and the CAPM: Classic Indices, Derivative Indices, and New Approaches

17.2.1 Introduction

When evaluating the behavior of a security, portfolio, or investment fund, there are three characteristics that should be considered: return, risk, and liquidity. This last item can be assumed in markets that are sufficiently liquid. It is the return-risk relationship that must be analyzed, which leads to the issue of performance measures.

According to the classic financial literature, we have two alternative methods to measure performance in portfolio management. On the one hand, we may estimate average past returns on portfolios during a set time horizon and, later, adjust said returns by the behavior of the reference market in which such portfolios invest, and by a representative measure of the risk borne by such portfolios (if the total risk is considered, this is usually measured through variability in the portfolio return, that is, through the standard deviation⁵ of the series of returns; however, if only the systematic risk is considered, it will be measured through the beta coefficient, as explained in the previous section of this chapter). The premise here is that we, as individuals, are averse to risk, and consequently demand a higher expected return for bearing a higher risk; it is precisely the performance measures that, when relating average return to its risk, inform us as to how the analyzed portfolio is rewarding us in terms of return received for the accepted risk. On the other hand, portfolio management performance can also be measured using a history-based composition of portfolios, real or estimated, and by developing evaluation methodologies for them. The contributions made by Sharpe (1992), Grinblatt and Titman (1993), and Daniel et al. (1997) are important. However, this work focuses on the first alternative for the measurement of performance. In this section we first examine the classic performance measures usually employed to study stock market portfolios. We will see that the differences between the various measures are attributable to the relevant risk, as well as to the way in which beating the market is measured. Later, we examine other measures that attempt to solve some problems present in the classic measures and, finally, we touch on the latest performance measures that enable better adjustment to the present reality.
17.2.2 Classic Performance Measures

17.2.2.1 The Sharpe Ratio (1966)

$$S = \frac{\mu - r_0}{\sigma}, \qquad (17.10)$$

where μ is the average return on a portfolio or asset, r0 is the risk-free asset return, and σ is the standard deviation of the return on the portfolio or asset.
⁵ However, when there is a problem of asymmetry in the distribution of returns, the standard deviation is not a suitable risk measure, as Sortino and Price (1994), Eftekhari et al. (2000), and Pedersen and Satchell (2002) indicate; alternative measures must be considered, such as the standard semi-deviation or the Gini index, among others.
The Sharpe ratio calculates the return premium obtained by the asset or portfolio per unit of total risk accepted. Given that this index responds to total risk, it should be used to evaluate portfolios with a tendency to eliminate diversifiable risk. However, the indices of Treynor and Jensen, as we will see below, are based on the systematic risk, meaning that they should be used to evaluate portfolios that are not very diversified, or individual securities.
17.2.2.2 The Treynor Ratio (1965)

$$T = \frac{\mu - r_0}{\beta}, \qquad (17.11)$$

where β is the systematic risk of the portfolio or asset. This index reflects the return premium obtained by the asset or portfolio per unit of systematic risk borne.
17.2.2.3 The Jensen Index (1968, 1969)

$$J = (\mu - r_0) - (\mu^* - r_0)\,\beta, \qquad (17.12)$$

where μ* is the average return obtained by the market portfolio. Usually, the market portfolio is approximated using a benchmark or a stock market index. This index determines the difference between the excess return on the asset or portfolio over the risk-free asset, and the excess that it should have obtained according to the CAPM. Therefore, an asset or portfolio will be said to have obtained a good performance result when it obtains a positive Jensen index value. If the Sharpe or Treynor ratios are used as the performance measures to determine whether a portfolio or asset has obtained a good result, it will be necessary to compare this ratio with that obtained by the market portfolio. Also, this comparison will allow us to discover whether active management is superior to passive management. Furthermore, regardless of which performance index we use, the portfolio or asset that obtains the highest value in such index will always be considered superior.
where is the average return obtained by the market portfolio. Usually, the market portfolio is approached using a benchmark or a stock market index. This index determines the difference between the excess return on the asset or portfolio over the risk-free asset, and the excess that it should have obtained according to the CAPM. Therefore, an asset or portfolio will be said to have obtained a good performance result when it obtains a positive Jensen index value. If the Sharpe or Treynor ratios are used as the performance measures to determine whether a portfolio or asset has obtained a good result, it will be necessary to compare this ratio with that obtained by the market portfolio. Also, this comparison will allow us to discover whether active management is superior to passive management. Furthermore, regardless of which performance index we use, the portfolio or asset that obtains the highest value in such index will always be considered superior.
17.2.3 M2 and M2 for Beta Another indicator of performance based on total risk is Modigliani and Modigliani (1997) M2 measure. This measure compares portfolio returns, adjusted for risk, with those in the appointed reference portfolio. It is calculated as that portfolio return that has been leveraged up or down with
1 r0 ;
(17.13)
where ¢ is the standard deviation of the return on the benchmark, and r0 is the interest rate representing the associated cost of the quantity borrowed, or the interest rate paid for the money loaned. Therefore, M2 is a measure directly comparable with market average return, , meaning that the evaluation of management will be positive when M2 is higher than . Also, it can be proven that M2 generates the same rankings as the Sharpe ratio. Later, Modigliani (1997) proposed a new version of M2 that allows the use of systematic risk, rather than total risk: M2 for beta, which generates the same ranking as the Treynor ratio. The idea behind this measure is that we borrow or lend such that the portfolio has the same beta as the market, later comparing the resulting portfolios returns: M2 for beta D r0 C
. r0 / ˇ
(17.14)
The interest in the use of the M² and M² for beta measures, which technically coincide with the Sharpe and Treynor ratios, respectively, lies in the fact that they are measured as returns, which common investors find much easier to understand than the Sharpe and Treynor risk premiums. In addition, Modigliani (1997) criticizes the Jensen index, claiming that differences in return cannot be compared when the risks are very different.
17.2.4 Information Ratio and Tracking Error

$$IR = \frac{\overline{\text{TRACKING ERROR}}}{\sigma_{\text{TRACKING-ERROR}}} \qquad (17.15)$$
Tracking error is defined as the difference between the return obtained within a period by a portfolio and that obtained by the benchmark. The variability of the tracking error is the deviation of said differences .¢TRACKING-ERROR /. The information ratio was developed by Sharpe (1994) and, assuming a normal distribution of probability, it reveals the probability that the tracking error will be negative; that is, the probability that the portfolio will yield lower returns than the benchmark. This means that the ratio is very useful for evaluating the manager’s ability to beat the market. However, Modigliani and Modigliani (1997) do not consider this a good performance measure as it does not consider total risk.
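A minimal sketch of Equation (17.15) on simulated portfolio and benchmark series:

```python
import numpy as np

# Per-period tracking error and the information ratio; simulated inputs.
rng = np.random.default_rng(2)
r_bench = rng.normal(0.002, 0.02, 156)
r_port = r_bench + rng.normal(0.0004, 0.008, 156)   # active bets around it

te = r_port - r_bench                                # tracking error series
ir = te.mean() / te.std(ddof=1)
print(f"mean TE = {te.mean():.5f}  sigma TE = {te.std(ddof=1):.5f}  IR = {ir:.3f}")
```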
17.2.5 PIRR Index
The PIRR index came into being as an attempt to resolve the problem of the Sharpe ratio's expression as a quotient. Indeed, if we assume that the distribution of returns on an asset is normal (μ, σ), which is the norm, the numerical value of the Sharpe ratio results from standardizing r0, meaning that applying this ratio to the ranking of funds by their performance amounts to classifying them according to the probability that, within any subperiod of time, their returns will be lower than r0. This is not a good criterion for classification, because it is not logical that fund X be preferable to fund Y merely because the return on X has a lower probability of falling below the risk-free rate. In fact, Y could have an average return much higher than that of X. All of this reasoning led Gómez-Bezares (1993) to question the risk quotient penalty that the Sharpe ratio imposes, and to study a possible linear penalty; this led him to opt for the penalty of the NPV,⁶ ultimately leading to the PPV:⁷

$$PPV = \overline{NPV} - t\,\sigma_{NPV}, \qquad (17.16)$$
where t is the penalty parameter that will vary according to the risk aversion of the decision maker, and the NPV has been calculated at the risk-free interest rate, as we will later penalize it by subtracting t standard deviations. The PPV could be interpreted as the certainty equivalent of a risky investment with a given average and standard deviation of the NPV.⁸ In accordance with this idea, Gómez-Bezares et al. (2004) penalize the IRR,⁹ giving rise to the PIRR:¹⁰

$$PIRR = \overline{IRR} - t\,\sigma_{IRR} \qquad (17.17)$$
We may interpret the PIRR as the certainty equivalent return on a risky asset with a given mean and standard deviation of IRR; that is, a performance measure adjusted by risk. Returning to the previous nomenclature, we have

$$PIRR = \mu - t\,\sigma. \qquad (17.18)$$
Fig. 17.8 A comparison between the S and PIRR indices

The next step consists of assigning a value to t. To do this, we assume an efficient market in which r0 and μ* (affected by σ*) are considered to be equivalent (in the sense that they
are similarly desirable for the group of investors): r0 = μ* - t·σ*. Calculating t, we have t = (μ* - r0)/σ*, which we observe is equal to the Sharpe ratio of the market portfolio (S*). Therefore, we reformulate the PIRR:

$$PIRR = \mu - S^*\,\sigma \qquad (17.19)$$
In Fig. 17.8, we compare the S and PIRR indices. We represent the C and D portfolios and the market portfolio, and observe that the Sharpe ratio coincides with the tangent of the angle formed between the straight line that links r0 with one of the portfolios and the horizontal line corresponding to r0; therefore, it is deduced that the Sharpe ratio classifies assets based on a pencil of indifference straight lines through r0, in such a way that the asset located on the highest line is the best one. However, we can see that the PIRR is equivalent to drawing straight lines parallel to the line that joins r0 and the market portfolio (the PIRR is the point where each line crosses the vertical axis). Thus, we have, in this case, a system of parallel indifference lines. Finally, it is easy to observe that S and PIRR can lead to different classifications of portfolios: S_C > S_D > S*; PIRR_D > PIRR_C > r0 (observe that PIRR* = r0).
⁶ Net present value.
⁷ Penalized present value.
⁸ Two other characteristics of this system are: if we assume that NPV follows a normal distribution, it is easy to understand t as the standardized value of the PPV; and the system is equivalent to using parallel indifference lines.
⁹ Internal rate of return.
¹⁰ Penalized internal rate of return.
17.2.6 PIRR Index for Beta

The Treynor ratio (1965) poses problems similar to those of the Sharpe ratio; therefore, we will also impose a linear penalty on the risk, in this case

$$PIRR \text{ for Beta} = \mu - t'\,\beta, \qquad (17.20)$$

where t′ is the penalty parameter, and we again assume that the certainty equivalent of the market portfolio is the risk-free rate: r0 = μ* - t′β* = μ* - t′ (given that β* = 1); calculating the value, we have t′ = μ* - r0, therefore

$$PIRR \text{ for Beta} = \mu - (\mu^* - r_0)\,\beta. \qquad (17.21)$$

This system can be justified, as can the previous one, by parallel indifference straight lines. It is also apparent that this is equivalent to the Jensen index (1968, 1969). There are other performance proxies: conditional performance analysis, which considers return and risk expectations to be time-varying; style analysis, which attributes portfolio performance to a group of characteristics that define them; and market timing analysis, which breaks down performance into two parts, (1) that which can be attributed to the manager's ability to select securities (stock picking) and (2) that which can be attributed to the manager's ability to anticipate the movements of the market (market timing); among others. However, this chapter does not focus on these analyses.
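A short sketch applying the PIRR penalties of Equations (17.19) and (17.21) to two hypothetical funds; every statistic below is an assumed figure, chosen only to show how the linear penalty can reorder a Sharpe-based ranking.

```python
# PIRR (17.19) and PIRR for beta (17.21) for two hypothetical funds.
r0, mu_mkt, sigma_mkt = 0.03, 0.10, 0.15
s_star = (mu_mkt - r0) / sigma_mkt              # Sharpe ratio of the market

funds = {"X": (0.08, 0.10, 0.8),                # name: (mu, sigma, beta)
         "Y": (0.13, 0.25, 1.3)}

for name, (mu, sigma, beta) in funds.items():
    sharpe = (mu - r0) / sigma
    pirr = mu - s_star * sigma                  # Equation (17.19)
    pirr_beta = mu - (mu_mkt - r0) * beta       # Equation (17.21)
    print(f"{name}: Sharpe={sharpe:.2f}  PIRR={pirr:.4f}  "
          f"PIRR for beta={pirr_beta:.4f}")
```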
17.3 Empirical Analysis: Performance Rankings and Performance Persistence

17.3.1 Performance Rankings

In this section, we put into practice what was discussed in the previous section. To do this, we use an example: we take weekly data on the returns of a sample of 198 equity investment funds over 6 years. We also take a stock market index to represent the market portfolio, and the risk-free asset is approximated by the one-month Treasury bill.¹¹ With these data, we calculate the Sharpe and Treynor ratios for each year, as well as the Jensen and PIRR¹² indices. In Table 17.1, we show the percentage of funds that outperform the market portfolio based on each of the performance measures considered, and for each of the 6 years analyzed.¹³ From Table 17.1 it is evident that, based on the S and PIRR indices, exactly the same percentage of investment funds outperform the stock market index. This result is not
¹¹ The data have been taken from Vargas' PhD Dissertation (2006).
¹² The remainder of the performance measures shown in the previous section are not applied in this example because, as has been stated previously, they give classifications of assets equivalent to those given by one of the measures considered in this example (except IR).
¹³ We do not show an individual analysis of the results obtained for each fund, for each performance measure, and for each period because of our purpose of abbreviating.
Table 17.1 Percentage of funds that outperform the market portfolio based on different performance measures

        Year 1   Year 2   Year 3   Year 4   Year 5   Year 6
S       45%      44%      67%      40%      27%      46%
PIRR    45%      44%      67%      40%      27%      46%
J       39%      56%      79%      51%      30%      49%
T       40%      55%      79%      51%      30%      49%
surprising because, as shown in Fig. 17.8, when an asset beats the market based on the Sharpe ratio, it also beats it based on the PIRR index. We observe very similar results between the J and T indices.¹⁴ In general, the results obtained are similar based on the four performance measures applied, which leads us to analyze the level of correlation that exists between the classifications of the funds, depending on their performance, derived from the application of the four measures. To do this, we first create a ranking of funds based on each measure; then we apply Spearman's rho and Kendall's tau-b correlation coefficients to evaluate the level of similarity between these classifications.
17.3.1.1 Spearman's Rho Coefficient

$$r_s = 1 - \frac{6\sum d_i^2}{N(N^2 - 1)}, \qquad (17.22)$$
where N is the number of funds (198, in this case), and di is the difference between the place occupied by fund i in two classifications. The Spearman’s rho correlation coefficient agrees with Pearson’s coefficient, but the former is an adaptation of the latter for ordinal data.
17.3.1.2 Kendall's Tau-b Coefficient

$$\tau = \frac{2(P - Q)}{N(N - 1)}, \qquad (17.23)$$
where P is the number of agreements between two classifications, and Q is the number of disagreements. This coefficient takes values between -1 and 1 (-1 ≤ τ ≤ 1), and has the disadvantage that, in the case of a draw, the coefficient cannot reach the values -1 and 1, meaning that in such a situation it would be necessary

¹⁴ These should also coincide exactly, but they do not. However, in our example, the first and second years do not coincide and this is due to the special characteristics of one of the analyzed funds that give a negative beta during these 2 years.
Table 17.2 Results for the two correlation coefficients applied to the classifications of the funds considering different performance measures¹⁵

                      J-PIRR   J-S     S-PIRR   S-T     PIRR-T   J-T
1st year  Kendall     0.946    0.835   0.862    0.883   0.790    0.796
          Spearman    0.994    0.960   0.969    0.926   0.900    0.898
2nd year  Kendall     0.830    0.754   0.849    0.858   0.728    0.795
          Spearman    0.926    0.890   0.958    0.952   0.892    0.890
3rd year  Kendall     0.793    0.702   0.832    0.754   0.648    0.801
          Spearman    0.874    0.826   0.946    0.815   0.771    0.918
4th year  Kendall     0.873    0.789   0.879    0.833   0.809    0.869
          Spearman    0.941    0.912   0.972    0.907   0.917    0.971
5th year  Kendall     0.986    0.816   0.821    0.977   0.814    0.813
          Spearman    0.999    0.945   0.947    0.997   0.945    0.943
6th year  Kendall     0.946    0.860   0.882    0.957   0.860    0.868
          Spearman    0.991    0.970   0.978    0.995   0.972    0.971

¹⁵ All of the coefficients in the table are significant at the 1% level.
to correct the expression of the coefficient. However, in the example that we have given, there are no draws. In Table 17.2, we show the results of the application of the two correlation coefficients to the classifications of the funds derived from the application of the four performance measures analyzed in the example. Table 17.2 shows a strong similarity between the orders of preference determined from the different performance measures, as the coefficients obtained are very close to unity and are significant at the 1% level. We can therefore see that the linear risk penalty achieves a methodological improvement over the Sharpe ratio that, in any case, generally maintains the rankings of the funds.
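The two rank-correlation coefficients of Equations (17.22) and (17.23) are available in scipy. A sketch on simulated measure values for 198 funds (the values themselves are not from the study):

```python
import numpy as np
from scipy import stats

# Rank two performance measures across funds and compare the rankings.
rng = np.random.default_rng(3)
sharpe = rng.normal(0.2, 0.1, 198)                 # 198 funds, as in the text
jensen = 0.8 * sharpe + rng.normal(0, 0.03, 198)   # a correlated measure

rho, p_rho = stats.spearmanr(sharpe, jensen)       # Equation (17.22)
tau, p_tau = stats.kendalltau(sharpe, jensen)      # tau-b, Equation (17.23)
print(f"Spearman rho = {rho:.3f} (p={p_rho:.1e}), "
      f"Kendall tau-b = {tau:.3f} (p={p_tau:.1e})")
```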
17.3.2 Performance Persistence

In this section, we analyze the possible existence of persistence in the performance results achieved by our sample. To do this, we compare the performance achieved by the funds in two consecutive periods. We apply two methodologies. First is the non-parametric methodology, based on contingency tables and several statistics. It consists of comparing the performance rankings in two consecutive periods, distinguishing, in both periods, two portfolio subgroups ("winners" and "losers") based on the criterion of the median: a fund is a "winner" if its performance is above the median, and a "loser" if it is below. Therefore, we classify the funds as WW if they are winners over two consecutive periods, WL if they are winners in one period and then losers in the later period, and so on. In addition, the statistical tests of Malkiel (1995), Brown and Goetzmann (1995), and Kahn and Rudd (1995) are applied to determine the significance of the persistence.

17.3.2.1 Malkiel's z-statistic (1995)

$$Z = \frac{Y - np}{\sqrt{np(1 - p)}}, \qquad (17.24)$$
where Z represents a statistic that approximately follows the normal distribution (0,1); Y represents the number of winning portfolios in two consecutive periods; and n equals WW + WL. This test examines the proportion of WW relative to WW + WL, in such a way that p represents the probability that a winning portfolio over one period will continue to be a winner in the following period. We assign the value 0.5 to p. If Z > 1.96, we reject the null hypothesis of non-persistence at a significance level of 5%.
17.3.2.2 Brown and Goetzmann's Statistic (1995)

$$Z = \frac{\ln(OR)}{\sigma_{\ln(OR)}}, \qquad (17.25)$$

where OR is Brown and Goetzmann's odds ratio, OR = (WW × LL)/(WL × LW), and where σ_ln(OR) = √(1/WW + 1/LL + 1/WL + 1/LW). Again, a value of Z > 1.96 would confirm a trend toward persistence in performance at the 5% level.
17.3.2.3 Kahn and Rudd's Statistic (1995)

χ² = Σ_{i=1}^{n} Σ_{j=1}^{n} (O_{ij} − E_{ij})² / E_{ij},   (17.26)

where O_{ij} is the observed frequency of the i-th row and the j-th column, and E_{ij} is the expected frequency of the i-th row and the j-th column.
In the case of a 2×2 contingency table, the distribution has one degree of freedom. A priori, the four expected frequencies would present the same figure (the total number of funds divided by four), meaning that we can reformulate the χ² statistic as

χ² = (WW − N/4)²/(N/4) + (LL − N/4)²/(N/4) + (LW − N/4)²/(N/4) + (WL − N/4)²/(N/4),   (17.27)
where N is the total number of portfolios evaluated. If the chi-squared statistic exceeds the critical value of 3.84, this is indicative of performance persistence at a significance level of 5%. Following the above example, Tables 17.3 and 17.4 show the performance persistence results – performance being measured by the Sharpe ratio¹⁶ – obtained from application of the non-parametric methodology; in Table 17.3 we show the contingency table, and in Table 17.4 the results of the statistics.
¹⁶ We do not show persistence results obtained from applying other performance measures, due to space considerations and given that the aim of this example is not to analyze a sample, but rather to explain how the different methodologies of persistence analysis work.
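As an illustration of how the three persistence statistics are computed from a winner/loser contingency table, the sketch below evaluates Equations (17.24), (17.25), and (17.27) for the Years 1,2 column of Table 17.3; it reproduces the corresponding entries of Table 17.4 (Z-Malkiel = 2.3591, OR = 4.2711, Z-B&G = 3.2641, χ² = 11.1304).

```python
# Sketch: Malkiel's Z, Brown-Goetzmann's Z, and Kahn-Rudd's chi-square
# computed from one contingency-table column (counts from Table 17.3, Years 1,2).
import math

WW, LL, LW, WL = 31, 31, 15, 15
N = WW + LL + LW + WL

# Malkiel (17.24): Y winners repeating out of n = WW + WL trials, p = 0.5
p, n, Y = 0.5, WW + WL, WW
z_malkiel = (Y - n * p) / math.sqrt(n * p * (1 - p))

# Brown-Goetzmann (17.25): log odds ratio over its standard error
OR = (WW * LL) / (WL * LW)
sigma_ln_or = math.sqrt(1/WW + 1/LL + 1/WL + 1/LW)
z_bg = math.log(OR) / sigma_ln_or

# Kahn-Rudd (17.27): chi-square against equal expected frequencies N/4
chi2 = sum((obs - N/4) ** 2 / (N/4) for obs in (WW, LL, LW, WL))

print(f"Z-Malkiel = {z_malkiel:.4f}, OR = {OR:.4f}, "
      f"Z-B&G = {z_bg:.4f}, chi2 = {chi2:.4f}")
```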
Table 17.3 Contingency table for the Sharpe ratio

             Years 1,2  Years 2,3  Years 3,4  Years 4,5  Years 5,6  Total
WW           31         34         35         47         51         198
LL           31         36         36         47         51         201
LW           15         18         20         33         48         134
WL           15         20         21         33         48         137
No. funds    92         108        112        160        198        670

Table 17.4 Results of the statistics that check the persistence in performance measured by the Sharpe ratio

             Years 1,2  Years 2,3  Years 3,4  Years 4,5  Years 5,6  Total
Z-Malkiel    2.3591     1.9052     1.8708     1.5652     0.3015     3.3328
p            0.0183     0.0568     0.0614     0.1175     0.7630     0.0009
OR           4.2711     3.4000     3.0000     2.0285     1.1289     2.1679
Z-B&G        3.2641     3.0335     2.7998     2.2021     0.4263     4.9146
p            0.0011     0.0024     0.0051     0.0277     0.6699     0.0000
χ²-K&R       11.1304    9.6296     8.0714     4.9000     0.1818     24.5075
p            0.0008     0.0019     0.0045     0.0269     0.6698     0.0000

* (**) indicates statistical significance at the 5% (1%) level.

We can observe from Tables 17.3 and 17.4 the presence of the phenomenon of performance persistence. In fact, based on Table 17.3, we can confirm that the number of funds that repeat as W or L in two consecutive years is greater than the number of funds that change their category (399 vs. 271). Furthermore, Table 17.4 shows the presence of the phenomenon of persistence, given that statistics significant at the 1% level are obtained for the whole sample.

The other methodology that allows us to analyze the presence of the phenomenon of performance persistence is the parametric methodology, based on regression analysis: it determines, by means of ex post values, whether the relationship between performance in a certain period and the corresponding performance of an earlier period is statistically significant. To do this, the following regression is applied:

P_p(t+1) = α + β·P_p(t) + ε_p,   (17.28)

where P_p(t+1) and P_p(t) represent the performance of portfolio p in the periods t+1 and t, respectively. Positive beta values, together with a significant t-statistic, would confirm the existence of performance persistence, while a negative estimate of this coefficient would indicate the existence of the opposite behavior. Continuing with the above example, in Table 17.5 we show the results of performance persistence – performance being measured by the PIRR index – obtained through the application of the parametric methodology. This table confirms the presence of the phenomenon of performance persistence, except between the second and third years (non-significant beta).

Table 17.5 Results of performance persistence with the parametric methodology when performance is measured with the PIRR index

             Years 1,2  Years 2,3  Years 3,4  Years 4,5  Years 5,6
No. funds    92         108        112        160        198
α            0.000117   0.001      0.000377   0.000096   0.000382
T(α)         0.654      11.912     −2.743     1.329      4.171
β            0.840      0.063      0.602      0.246      0.449
T(β)         5.450      1.036      6.098      3.725      4.385
R²           0.248      0.010      0.253      0.081      0.089

** indicates statistical significance at the 1% level.
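A minimal sketch of the parametric test of Equation (17.28): regress each fund's period-(t+1) performance on its period-t performance with plain OLS and inspect the slope's t-statistic. The performance vectors below are synthetic placeholders, not the PIRR data of Table 17.5.

```python
# Sketch: parametric persistence test, regressing P_p(t+1) on P_p(t) (Eq. 17.28).
import numpy as np

rng = np.random.default_rng(0)
perf_t  = rng.normal(0.0, 0.01, size=100)                  # hypothetical period-t performance
perf_t1 = 0.5 * perf_t + rng.normal(0.0, 0.008, size=100)  # hypothetical period t+1

X = np.column_stack([np.ones_like(perf_t), perf_t])        # regressors [1, P_p(t)]
coef, *_ = np.linalg.lstsq(X, perf_t1, rcond=None)
resid = perf_t1 - X @ coef
s2 = resid @ resid / (len(perf_t) - 2)                     # residual variance
cov = s2 * np.linalg.inv(X.T @ X)                          # OLS coefficient covariance
t_beta = coef[1] / np.sqrt(cov[1, 1])
print(f"alpha = {coef[0]:.5f}, beta = {coef[1]:.3f}, t(beta) = {t_beta:.2f}")
```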
17.4 Summary and Conclusions

"Portfolio theory," created by Markowitz, has reached a high level of development thanks to the improvement work carried out on it by several authors. This theory operates in an environment of risk (the future return on an asset is known in terms of probability) and has the objective of determining the optimum portfolio for investors. To do this, it places all assets on the μ–σ map (mean and standard deviation of the random return variable, respectively), and to choose among them it considers the investor's preferences, represented by a system of indifference curves. The optimum portfolio is therefore the one at the point of tangency between an indifference curve and the efficient frontier (the set of portfolios with maximum return and minimum risk).

Furthermore, "portfolio theory" is used as a theoretical foundation for an asset pricing model: the CAPM. This model, created by Sharpe, holds that the expected return on an asset depends on the market expected return and on the risk-free asset return, as well as on the systematic (non-diversifiable) risk of the asset. At equilibrium, according to the CAPM, all assets adjust their values to offer the return that corresponds to them, meaning that it is easy to detect whether an asset is underpriced or overpriced. A method of simplification of the relationships between securities also comes from Sharpe, who created the "market model," according to which the relationships between the returns on assets are solely due to their common relationship with the market index return. This model allows us to distinguish between systematic and diversifiable risk (the latter risk is not remunerated, as it can be eliminated by the investor).

The CAPM has been the object of various critiques, especially regarding its starting hypotheses, which are alleged to distance the model from reality. However, it has been shown that its conclusions are fulfilled to a sufficiently significant extent. Moreover, the CAPM has not been superseded by other, more complex pricing models, and its usefulness for firms and in markets has been proven.

The second part of this chapter refers to performance measures related to portfolio theory; therefore, classic indices are shown along with other, newer indices that attempt to overcome certain limitations present in the classic indices.
Finally, we show an example in which we analyze the performance of a sample of equity investment funds using several performance measures. We contrast the similarity between the rankings of the funds (depending on their performance) that are obtained from the several measures and, finally, we analyze the possible presence of the phenomenon of performance persistence throughout the period analyzed.

Acknowledgement This chapter follows Gómez-Bezares (2006). The authors express their gratitude to Professor Stephen Brown for his helpful comments, which have greatly contributed to the improvement of this chapter.
References

Black, F. 1993. "Beta and return." Journal of Portfolio Management 20, 8–18.
Black, F., M. C. Jensen and M. Scholes. 1972. "The capital asset pricing model: some empirical tests." in Studies in the theory of capital markets, Jensen (Ed.). Praeger, New York.
Brown, S. J. and W. N. Goetzmann. 1995. "Performance persistence." Journal of Finance 50, 679–698.
Bruner, R. F., K. M. Eades, R. S. Harris and R. C. Higgins. 1998. "Best practices in estimating the cost of capital: survey and synthesis." Financial Practice and Education 8(1), 13–28.
Chan, L. K. C. and J. Lakonishok. 1993. "Are the reports of beta's death premature?" Journal of Portfolio Management 19 (Summer), 51–62.
Daniel, K., M. Grinblatt, S. Titman and R. Wermers. 1997. "Measuring mutual fund performance with characteristic-based benchmarks." Journal of Finance 52, 1035–1058.
Daniel, K. and S. Titman. 1997. "Evidence on the characteristics of cross-sectional variation in stock returns." Journal of Finance 52, 1–33.
Daniel, K., S. Titman and K. C. J. Wei. 2001. "Explaining the cross-section of stock returns in Japan: factors or characteristics?" Journal of Finance 56, 743–766.
Davis, J. L., E. G. Fama and K. R. French. 2000. "Characteristics, covariances and average returns: 1929 to 1997." Journal of Finance 55(1), 389–406.
Dimson, E. and M. Mussavian. 1999. "Three centuries of asset pricing." Journal of Banking and Finance 23(12), 1745–1769.
Eftekhari, B., C. S. Pedersen and S. E. Satchell. 2000. "On the volatility of measures of financial risk: an investigation using returns from European markets." The European Journal of Finance 6, 18–38.
Fama, E. F. and K. R. French. 1992. "The cross-section of expected stock returns." Journal of Finance 47(2), 427–465.
Fama, E. F. and K. R. French. 1993. "Common risk factors in the returns on stocks and bonds." Journal of Financial Economics 33, 3–56.
Fama, E. F. and K. R. French. 1996a. "The CAPM is wanted, dead or alive." Journal of Finance 51(5), 1947–1958.
Fama, E. F. and K. R. French. 1996b. "Multifactor explanations of asset pricing anomalies." Journal of Finance 51(1), 55–84.
Fama, E. F. and J. D. MacBeth. 1973. "Risk, return and equilibrium: empirical tests." Journal of Political Economy 81(3), 607–636.
Ghysels, E. 1998. "On stable factor structures in the pricing of risk: do time-varying betas help or hurt?" Journal of Finance 53, 549–573.
Gómez-Bezares, F. 1993. "Penalized present value: net present value penalization with normal and beta distributions." in Capital budgeting under uncertainty, Aggarwal (Ed.). Prentice-Hall, Englewood Cliffs, New Jersey, pp. 91–102.
Gómez-Bezares, F. 2006. Gestión de carteras, Desclée de Brouwer, Bilbao.
Gómez-Bezares, F., J. A. Madariaga and J. Santibánez. 2004. "Performance ajustada al riesgo." Análisis Financiero 93, 6–17.
Grinblatt, M. and S. Titman. 1993. "Performance measurement without benchmarks: an examination of mutual fund returns." Journal of Business 66(1), 47–68.
Grundy, K. and B. G. Malkiel. 1996. "Reports of beta's death have been greatly exaggerated." Journal of Portfolio Management 22(3), 36–44.
Jagannathan, R. and Z. Wang. 1996. "The conditional CAPM and the cross-section of expected returns." Journal of Finance 51(1), 3–53.
Jegadeesh, N. and S. Titman. 2001. "Profitability of momentum strategies: an evaluation of alternative explanations." Journal of Finance 56, 699–720.
Jensen, M. C. 1968. "The performance of mutual funds in the period 1945–1964." Journal of Finance 23, 389–416.
Jensen, M. C. 1969. "Risk, the pricing of capital assets, and the evaluation of investment portfolios." Journal of Business 42(2), 167–247.
Kahn, R. N. and A. Rudd. 1995. "Does historical performance predict future performance?" Financial Analysts Journal 51, 43–52.
Kim, D. 1995. "The errors in the variables problem in the cross-section of expected stock returns." Journal of Finance 50(5), 1605–1634.
Kothari, S. P., J. Shanken and R. G. Sloan. 1992. Another look at the cross-section of expected stock returns, Working Paper, Bradley Policy Research Center, University of Rochester, New York.
Lewellen, J. and S. Nagel. 2006. "The conditional CAPM does not explain asset-pricing anomalies." Journal of Financial Economics 82, 289–314.
Lintner, J. 1965. "The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets." Review of Economics and Statistics 47(1), 13–37.
Liu, W. 2006. "A liquidity-augmented capital asset pricing model." Journal of Financial Economics 82, 631–671.
MacKinlay, A. C. 1995. "Multifactor models do not explain deviations from the CAPM." Journal of Financial Economics 38, 3–28.
Malkiel, B. G. 1995. "Returns from investing in equity mutual funds 1971 to 1991." Journal of Finance 50, 549–572.
Malkiel, B. G. and Y. Xu. 1997. "Risk and return revisited." Journal of Portfolio Management 23, 9–14.
Markowitz, H. 1952. "Portfolio selection." Journal of Finance 7(1), 77–91.
Markowitz, H. 1959. Portfolio selection: efficient diversification of investments, Wiley, New York.
Markowitz, H. 1987. Mean-variance analysis in portfolio choice and capital markets, Basil Blackwell, Oxford.
Merton, R. C. 1972. "An analytic derivation of the efficient portfolio frontier." Journal of Financial and Quantitative Analysis 7, 1851–1872.
Modigliani, L. 1997. "Don't pick your managers by their alphas." U.S. Investment Research, U.S. Strategy, Morgan Stanley Dean Witter, June.
Modigliani, F. and L. Modigliani. 1997. "Risk-adjusted performance." Journal of Portfolio Management 23(2), 45–54.
Pedersen, C. S. and S. E. Satchell. 2002. "On the foundations of performance measures under asymmetric returns." Quantitative Finance 3, 217–223.
Roll, R. 1977. "A critique of the asset pricing theory's tests." Journal of Financial Economics 4(2), 129–176.
Roll, R. and S. A. Ross. 1994. "On the cross-sectional regression between expected returns and betas." Journal of Finance 49, 101–121.
Sharpe, W. F. 1961. "Portfolio analysis based on a simplified model of the relationships among securities." Ph.D. Dissertation, University of California, Los Angeles.
Sharpe, W. F. 1963. "A simplified model for portfolio analysis." Management Science 9(2), 277–293.
Sharpe, W. F. 1964. "Capital asset prices: a theory of market equilibrium under conditions of risk." Journal of Finance 19(3), 425–442.
Sharpe, W. F. 1966. "Mutual fund performance." Journal of Business 39, 119–138.
Sharpe, W. F. 1992. "Asset allocation: management style and performance measurement." Journal of Portfolio Management 18, 7–19.
Sharpe, W. F. 1994. "The Sharpe ratio." Journal of Portfolio Management 21, 49–58.
Sortino, F. and L. Price. 1994. "Performance measurement in a downside risk framework." Journal of Investing 3, 59–65.
Tobin, J. 1958. "Liquidity preference as behavior towards risk." The Review of Economic Studies 25(2), 65–86.
Treynor, J. L. 1965. "How to rate management of investment funds." Harvard Business Review 43, 63–75.
Van Horne, J. C. 1989. Financial management and policy, Prentice-Hall, Englewood Cliffs, New Jersey.
Vargas, M. 2006. "Fondos de inversión españoles: análisis empírico de eficiencia y persistencia en la gestión." Ph.D. Dissertation, University of Zaragoza, Spain.
Chapter 18
Intertemporal Equilibrium Models, Portfolio Theory and the Capital Asset Pricing Model Stephen J. Brown
Abstract Intertemporal equilibrium models of the kind discussed in Cochrane (Asset Pricing, Princeton University Press, Princeton, 2001) have become the standard paradigm in most advanced asset pricing courses. The purpose of this chapter is to explain the relationship between this paradigm and the portfolio theory paradigm common in most of the prior asset pricing literature. We show that these paradigms are merely different ways of looking at the same economic phenomena, and that insights can be gained from each approach.

Keywords CAPM · Hansen-Jagannathan bounds · Intertemporal equilibrium · Portfolio theory · Sharpe ratio · Stochastic discount factor
18.1 Introduction John Cochrane’s recent textbook (Cochrane 2001) offers a compelling paradigm for asset pricing, starting with a consumption-based intertemporal equilibrium model, bypassing the normal development via single period portfolio theory that leads to the traditional Capital Asset Pricing Model (CAPM). This is a very elegant and persuasive treatment that generalizes CAPM to a multiperiod economy, and has direct implications for derivative pricing. Finally, and perhaps most importantly, well trained economists find this approach provides an intuitive and persuasive access to the central results of the financial economics literature. As a result, this text is required reading in most advanced level asset pricing courses. While this approach to the asset pricing literature is the very model of clarity and simplicity, the text does not discuss the connection to the earlier portfolio-theory based literature. As Cochrane notes “I skip very lightly over many
parts of asset pricing theory that have faded from current applications. Some examples are portfolio separation theorems, properties of various distributions, or asymptotic APT. While portfolio theory is still interesting and useful, it is no longer a cornerstone of pricing. Rather than use portfolio theory to find a demand curve for assets, which intersected with a supply curve gives prices, we now go to prices directly. One can then find optimal portfolios, but it is a side issue for the asset pricing question” (p. xvi). While this statement is correct, experience has shown that it gives the student the false but nevertheless comforting perception that the text subsumes all relevant developments in financial economics, and the necessary corollary that the entire literature of financial economics can be safely ignored. This is just as well, because the language and terminology of the text are so much at variance with the established portfolio-theory based literature that most beginning students find this earlier literature very difficult to understand. Our purpose is to show how the intertemporal equilibrium asset pricing model relates to the established portfolio-theory literature. An interesting byproduct of this reconciliation is the important role of the maximized Sharpe ratio. It was this insight that led in the first instance to the development of the CAPM in Sharpe (1964).
18.2 Intertemporal Equilibrium Models Hansen and Jagannathan (1991) have developed a very simple and elegant approach to developing an intertemporal equilibrium model that can relate to observed asset returns. This approach builds on earlier work by Breeden (1979) and Rubinstein (1976). The investor’s problem is to allocate wealth in order to maximize the utility of consuming both now and in the future. In other words, the investor is faced with an intertemporal choice problem of the form:
Max E_t[ Σ_{j=0}^{∞} δ^j U(c_{t+j}) ]
where c_{t+j} represents future consumption, and δ is a subjective discount factor applied to future consumption. For a given budget constraint, the first-order conditions for this problem imply that

U′(c_t) = δ^j E_t[(1 + R_{i,t+j}) U′(c_{t+j})]

for all assets i and periods j into the future.¹ Dividing through by the marginal utility of consumption today, we have the important result that

1 = E_t[(1 + R_{i,t+j}) m_{t,j}],

where m_{t,j} = δ^j U′(c_{t+j})/U′(c_t) is the intertemporal marginal rate of substitution. It also has the interesting interpretation of being a stochastic discount factor (sometimes also referred to as a pricing kernel) because it takes an asset with uncertain per-dollar future payoff $(1 + R_{i,t+j}) back to the present to be valued at one dollar. If there is a riskless asset in this economy with return R_{F,t+j}, then 1 = E_t[(1 + R_{F,t+j}) m_{t,j}] = (1 + R_{F,t+j}) E_t[m_{t,j}], or E_t[m_{t,j}] = 1/(1 + R_{F,t+j}), so that the expected value of the stochastic discount factor is equal to the discount factor used when the future payment is in fact without any risk. For the subsequent discussion we will drop the time subscripts.

¹ Cochrane (2001), along with many economists, defines return as the growth in value of one dollar invested, or the gross return 1 + R_{i,t+j}, where R_{i,t+j} is the "net return" or total return (income plus capital gain) in period t+j for security i. This leads to some confusion, as Cochrane uses the symbol R to denote gross return, whereas much of the finance literature uses the same symbol to signify net return. We use the net return convention most commonly employed in the financial economics literature.

18.3 Relationship to Observed Security Returns

Cochrane (2001) argues that this intertemporal equilibrium model provides a straightforward way to value all financial claims. Security prices are determined so that the expected value of the growth of a dollar invested, discounted by the stochastic discount factor, equals the value of a dollar today. This is a direct implication of the equation 1 = E_t[(1 + R_{i,t+j}) m_{t,j}]. The difficulty, however, is that the stochastic discount factor m is not observable. There are three general approaches to this problem. The first is to specify m directly through assumptions made about utility and using measures of consumption. The chief difficulty associated with this approach is obtaining accurate and timely measures of aggregate consumption c. An alternative approach is to use a vector of factors, some combination of which can proxy for consumption growth. A third idea, originally due to Hansen and Jagannathan (1991), is that since the stochastic discount factor prices all financial claims, we might be just as well off inferring the stochastic discount factor from the observed set of asset returns. This important insight allows us to interpret the stochastic discount factor in terms of the mean-variance efficient portfolio, an interpretation that yields the standard CAPM as a direct implication.

Intertemporal equilibrium imposes an important constraint on the cross-section of expected returns on different securities. Starting from the basic formula that holds for all securities i:

1 = E[(1 + R_i)m] = E[(1 + μ_i + (R_i − μ_i))m] = (1 + μ_i)E[m] + E[(R_i − μ_i)m].

If there is a risk-free rate, E[m] = 1/(1 + R_F), this implies the asset pricing relation

μ_i − R_F = −(1 + R_F) E[(R_i − μ_i)m],

which says that the risk premium is negatively proportional to the covariance between the asset return and the stochastic discount factor. In difficult economic times, consumption is depressed and the intertemporal rate of substitution m is high. A negative covariance between asset returns and m is therefore a source of risk to investors. As we mention above, we cannot directly observe m. However, since the same m prices all assets in the economy, we should be able to infer it from asset prices and returns. Suppose there are N assets in the economy and we infer (the unobserved) m by a hypothetical regression on the set of returns in the economy, so that

m = μ_m + Σ_{j=1}^{N} (R_j − μ_j) λ_j,

where the λ_j are the hypothetical regression coefficients. The asset pricing relationship can then be written:

μ_i − R_F = −(1 + R_F) E[(R_i − μ_i)m]
          = −(1 + R_F) μ_m E[(R_i − μ_i)] − (1 + R_F) E[(R_i − μ_i) Σ_{j=1}^{N} (R_j − μ_j) λ_j]
          = Σ_{j=1}^{N} E[(R_i − μ_i)(R_j − μ_j)] Z_j,

where the coefficients Z_j = −(1 + R_F)λ_j. This equation must hold for all assets, so we have the system of equations

μ_1 − R_F = Z_1σ_1² + Z_2σ_{12} + Z_3σ_{13} + … + Z_Nσ_{1N}
μ_2 − R_F = Z_1σ_{21} + Z_2σ_2² + Z_3σ_{23} + … + Z_Nσ_{2N}
μ_3 − R_F = Z_1σ_{31} + Z_2σ_{32} + Z_3σ_3² + … + Z_Nσ_{3N}
⋮
μ_N − R_F = Z_1σ_{N1} + Z_2σ_{N2} + Z_3σ_{N3} + … + Z_Nσ_N²

which the reader will recognize as Equation (9.1) used in Elton et al. (2007) to identify the mean-variance efficient portfolio with riskless lending and borrowing.² From this fact we conclude immediately that the hypothetical regression coefficients λ_j are proportional to mean-variance efficient portfolio weights, and hence that the best estimate of the stochastic discount factor m can be written as a linear function of the return on a mean-variance efficient portfolio, m* = a + b·R_MV.
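Because the system above is linear in the Z_j, the mean-variance efficient weights — and from them the returns-based discount factor m* = a + b·R_MV — can be recovered by solving, in vector form, Σz = μ − R_F. The sketch below does this for hypothetical inputs; all the numbers are invented for illustration.

```python
# Sketch: solve Sigma z = mu - R_F for z; normalized z gives the
# mean-variance efficient (tangency) portfolio weights.
import numpy as np

R_F = 0.03
mu = np.array([0.08, 0.10, 0.06])                  # hypothetical expected returns
Sigma = np.array([[0.040, 0.012, 0.006],           # hypothetical covariance matrix
                  [0.012, 0.090, 0.010],
                  [0.006, 0.010, 0.020]])

z = np.linalg.solve(Sigma, mu - R_F)
x = z / z.sum()                                    # tangency portfolio weights
mu_mv = x @ mu
var_mv = x @ Sigma @ x
sharpe = (mu_mv - R_F) / np.sqrt(var_mv)           # maximized Sharpe ratio

# Returns-based discount factor m* = a + b R_MV with E[m*] = 1/(1+R_F)
b = -(mu_mv - R_F) / ((1 + R_F) * var_mv)
a = 1 / (1 + R_F) - b * mu_mv
print(f"weights = {x.round(3)}, Sharpe = {sharpe:.3f}, a = {a:.4f}, b = {b:.4f}")
```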
18.4 Intertemporal Equilibrium and the Capital Asset Pricing Model

If we further identify the mean-variance efficient portfolio MV as the market portfolio, then the asset pricing relation above immediately implies the standard CAPM. To see this, note that using the returns-based proxy m* for the discount factor m,

μ_i − R_F = −(1 + R_F) E[(R_i − μ_i)m*],

and substituting in for m* we find that expected excess return is given by

μ_i − R_F = −(1 + R_F) a E[(R_i − μ_i)] − b(1 + R_F) E[(R_i − μ_i)R_MV].

The first term is zero by definition. The expectation in the second term is the covariance between the return on security i and the return on the mean-variance efficient portfolio, which is beta times the variance of this portfolio return:

μ_i − R_F = −b(1 + R_F) σ_MV² β_i,

which must hold for all assets, including the mean-variance efficient portfolio with beta equal to one, so that b = −(μ_MV − R_F)/((1 + R_F)σ_MV²) and the CAPM follows immediately:

μ_i − R_F = β_i (μ_MV − R_F).

There is an important intuition here that establishes that the particular mean-variance efficient portfolio is the market portfolio. From the interpretation of the portfolio as resulting from regressing the (unknown) stochastic discount factor m on the set of observed security returns, the variability explained by the observed return portfolio, σ_{m*}² = b²σ_MV² = (μ_MV − R_F)²/((1 + R_F)²σ_MV²), is maximized. This implies that the Sharpe ratio, given as (μ_MV − R_F)/σ_MV = (1 + R_F)σ_{m*} = σ_{m*}/μ_{m*}, is also maximized, which further establishes that R_MV is the return on the market portfolio.

This last equation is very important because it establishes a nexus between the financial markets on the one hand and consumer preferences on the other. The fact that we can represent the maximized Sharpe ratio as the ratio of the standard deviation of the stochastic discount factor to its mean presents many financial economists with a serious problem. The average risk premium of the market measured in units of risk is far too high to be explained by any consumption-based representation of the stochastic discount factor. This is referred to as the "equity premium puzzle." As Cochrane (2001, p. 24) notes, the Sharpe ratio measured in real (not nominal) terms has been about 0.5 on the basis of the past 50 years of data for the United States. Assuming the time-separable power utility function, the ratio σ_m/μ_m is approximately the risk-aversion coefficient times the standard deviation of the log of consumption. Since the volatility of consumption growth is considerably less than that of market returns, this implies that investors are very risk averse, with a coefficient of risk aversion of at least 50. A degree of risk aversion this large is difficult to motivate.

² Note that this system of equations corresponds to the first-order conditions μ − R_f = Σz for a maximum value of the Sharpe ratio θ_p = (x′μ − R_f)/√(x′Σx), where x is the vector of portfolio weights, Σ is the covariance matrix of returns, and z is proportional to the portfolio weight vector x that maximizes the Sharpe ratio.
18.5 Hansen-Jagannathan Bounds

The relationship between the Sharpe ratio and the moments of the stochastic discount factor gives rise to another fascinating insight due to Hansen and Jagannathan (1991). Suppose we represent the stochastic discount factor m* = a + b·R_MV. We have shown that this choice of m prices all assets, E[(1 + R)m*] = 1. Consider another discount factor m = a + b·R_MV + ε, where ε is uncorrelated with the returns R on every single asset in the economy, and has zero expectation. Then this discount factor will also price all assets, E[(1 + R)m] = 1. This means that there are many possible discount factors, with μ_m = μ_{m*} and σ_m² = σ_{m*}² + σ_ε² > σ_{m*}². However, the choice m = m* minimizes the volatility of the stochastic discount factor and is most preferred by investors. Hence, we have the interesting bounds:

σ_m/μ_m ≥ σ_{m*}/μ_{m*} = (μ_MV − R_F)/σ_MV ≥ (μ_R − R_F)/σ_R,

which are referred to as the "Hansen-Jagannathan bounds." The inequality on the left-hand side refers to the fact that consumption risk will increase if there is a source of risk ε that cannot be hedged
by the set of assets represented by returns R. Imperfections in the capital markets make the world a riskier place than it needs to be. The inequality on the right-hand side reflects the reality that limits to diversification through short-sale restrictions or other factors limit the opportunities available to investors. Another useful interpretation of these contrasting inequalities is that the problem of choosing a returns-based discount function by minimizing the volatility of the discount factor m is equivalent to determining a portfolio that maximizes the portfolio Sharpe ratio.
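A small simulation sketch of the left-hand inequality: adding zero-mean noise that is orthogonal to returns leaves pricing unchanged but raises the volatility-to-mean ratio of the discount factor above the maximized Sharpe ratio. All moments used here are hypothetical.

```python
# Sketch: Hansen-Jagannathan bound check. Any SDF m = m* + eps with eps
# orthogonal to returns prices assets identically but is more volatile,
# so sigma_m / E[m] >= (mu_MV - R_F) / sigma_MV.
import numpy as np

rng = np.random.default_rng(1)
T, R_F = 100_000, 0.03
R_mv = rng.normal(0.09, 0.14, T)               # hypothetical MV portfolio returns

b = -(0.09 - R_F) / ((1 + R_F) * 0.14**2)      # coefficients of m* = a + b R_MV
a = 1 / (1 + R_F) - b * 0.09
m_star = a + b * R_mv
eps = rng.normal(0.0, 0.1, T)                  # zero-mean noise, independent of returns
m_noisy = m_star + eps

sharpe = (0.09 - R_F) / 0.14
print(f"sigma/mean of m*       = {m_star.std() / m_star.mean():.3f} (Sharpe = {sharpe:.3f})")
print(f"sigma/mean of m* + eps = {m_noisy.std() / m_noisy.mean():.3f} (>= Sharpe)")
```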
18.6 Are Stochastic Discount Factors Positive?

Unfortunately, this interpretation of the stochastic discount factor leads to a distinctly unattractive implication. Since E[m*] = 1/(1 + R_F) = a + b·μ_MV,

a = 1/(1 + R_F) + (μ_MV − R_F)μ_MV/((1 + R_F)σ_MV²),

and therefore

m* = 1/(1 + R_F) − [(μ_MV − R_F)/((1 + R_F)σ_MV²)](R_MV − μ_MV),

so that the implied discount factor is negative whenever the market return exceeds its mean by an amount equal to σ_MV²/(μ_MV − R_F), the standard deviation of the market return divided by the Sharpe ratio. Using the example given above, if the Sharpe ratio is about 0.5, the discount factor will be negative whenever the market return is two standard deviations above its mean. It is intuitive that the discount factor should fall as the market return increases; after all, we are assuming that consumers have diminishing marginal utility. It is not intuitive that the discount factor should be negative, and in fact it is easy to show that there are arbitrage opportunities that arise when the representative investor is willing to throw his or her money away in these circumstances. As a practical matter, the empirical representation of the discount factor in terms of the return on the market portfolio is rarely negative, and so a constraint on the discount factor to ensure that it is nonnegative should not lead to any major implications for asset prices. Hansen and Jagannathan suggest such a restriction but find it makes little difference in the case of equity markets. This assertion, however, depends strongly on the assumption that market returns are distributed normally. If returns are positively skewed, then the positive discount factor restriction may have a greater impact.

It is worth pursuing a little the implications of a positive discount factor restriction. We can represent this positive factor as m⁺ = a + b(R_MV − c)⁻, where (R_MV − c)⁻ is the payoff of a put on the market index with exercise price 1 + c. As before, we have

μ_i − R_F = −(1 + R_F) E[(R_i − μ_i)m⁺]
          = −(1 + R_F) a E[(R_i − μ_i)] − b(1 + R_F) E[(R_i − μ_i)(R_MV − c)⁻]
          = −b(1 + R_F) LPM_{MV,c} β*_{MV,c,i},

where LPM_{MV,c} = E[(R_MV − c)² | R_MV < c] is referred to as the lower partial moment of the return on the mean-variance efficient portfolio, a measure of downside risk, and where β*_{MV,c,i} = E[(R_i − μ_i)(R_MV − c)⁻]/LPM_{MV,c} is referred to as the lower partial moment beta, the contribution of security i to the downside risk of the market. As before, we have immediately a linear asset pricing model similar to the standard CAPM, except that the lower partial moment beta, β*_{MV,c,i}, replaces the more familiar beta, β_MV. This generalized asset pricing model was first derived by Bawa and Lindenberg (1977), who showed that it corresponds to an equilibrium model where agents have utility functions for wealth displaying diminishing absolute risk aversion. They further show that if returns are multivariate Normal or Student-t, the β*_{MV,c,i} risk measure collapses to β_MV and the standard CAPM result follows.
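The lower partial moment quantities are straightforward to estimate by simulation; the sketch below computes LPM_{MV,c} and the lower partial moment beta for a hypothetical bivariate normal return distribution (under normality, as Bawa and Lindenberg show, the resulting ranking collapses back to ordinary beta). All parameters are illustrative.

```python
# Sketch: estimating LPM_{MV,c} and the lower-partial-moment beta of a
# security i by simulation, for a hypothetical joint return distribution.
import numpy as np

rng = np.random.default_rng(2)
T, c = 500_000, 0.03                            # threshold c (e.g. the riskless rate)
cov = [[0.14**2, 0.6 * 0.14 * 0.20],            # hypothetical covariance matrix
       [0.6 * 0.14 * 0.20, 0.20**2]]
R_mv, R_i = rng.multivariate_normal([0.09, 0.10], cov, T).T

put_payoff = np.minimum(R_mv - c, 0.0)          # (R_MV - c)^-, put payoff struck at 1 + c
down = R_mv < c
lpm = np.mean((R_mv[down] - c) ** 2)            # E[(R_MV - c)^2 | R_MV < c]
beta_lpm = np.mean((R_i - R_i.mean()) * put_payoff) / lpm
print(f"LPM = {lpm:.5f}, LPM beta = {beta_lpm:.3f}")
```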
18.7 Conclusion

A straightforward maximization of expected utility subject to a budget constraint leads to the CAPM via the maximized Sharpe ratio. This standard portfolio theory optimization problem has a dual representation involving shadow prices. This essential duality is illustrated by the Hansen-Jagannathan bounds, which show that the maximized Sharpe ratio is a lower bound on the ability of investors to hedge consumption risk. Many economists find this representation to be more intuitive, although the pricing implications are the same. However, it is difficult and in many cases confusing to switch from one way of considering the problem to another. The purpose of this chapter is to clarify the relationship between the two approaches and to show that they are just two ways of looking at the same problem. Careful consideration suggests that the intertemporal equilibrium model cannot really avoid problems associated with the properties of various distributions, or issues that gave rise to consideration of an asymptotic APT. The analysis assumes that we can observe the mean-variance efficient portfolio or proxy for it on the basis of a mean-variance analysis based on the sample covariance matrix of returns. Indeed, whether a given portfolio is mean-variance efficient is a central question of modern portfolio theory.³ The intertemporal model additionally assumes that security returns are expressed in real terms, when observed returns are generally expressed in nominal terms, necessitating a separate adjustment for price level changes. As we see, the need to generalize the CAPM to account for the properties
³ See, for example, Gibbons et al. (1989).
of various distributions has an interesting parallel in the need to constrain the stochastic discount factor to be positive. In short, there is much to be gained from considering the asset pricing problem both from a stochastic discount factor and from a portfolio theory framework.
References

Bawa, V. and E. Lindenberg. 1977. "Capital market equilibrium in a mean-lower partial moment framework." Journal of Financial Economics 5, 189–200.
Breeden, D. 1979. "An intertemporal asset pricing model with stochastic consumption and investment opportunities." Journal of Financial Economics 7, 265–296.
Cochrane, J. 2001. Asset pricing, Princeton University Press, Princeton.
Elton, E., M. Gruber, S. Brown and W. Goetzmann. 2007. Modern portfolio theory and investment analysis, Wiley, Hoboken.
Gibbons, M., S. Ross and J. Shanken. 1989. "A test of the efficiency of a given portfolio." Econometrica 57, 1121–1152.
Hansen, L. P. and R. Jagannathan. 1991. "Implications of security market data for models of dynamic economies." Journal of Political Economy 99, 225–262.
Rubinstein, M. 1976. "The valuation of uncertain income streams and the pricing of options." Bell Journal of Economics 7, 407–425.
Sharpe, W. 1964. "Capital asset prices: a theory of market equilibrium under conditions of risk." Journal of Finance 19(3), 425–442.
Chapter 19
Persistence, Predictability, and Portfolio Planning Michael J. Brennan and Yihong Xia
Abstract We use a model of stock price behavior in which the expected rate of return on stocks follows an Ornstein-Uhlenbeck process to show that levels of return predictability that cause large variation in valuation ratios and offer significant benefits to dynamic portfolio strategies are hard to detect or measure by standard regression techniques, and that the R² from standard short run predictive regressions carries little information about either long run predictability or the value of dynamic portfolio strategies. We propose a new approach to portfolio planning that uses forward-looking estimates of long run expected rates of return from dividend discount models. We show how such long run expected rates of return can be used to estimate the instantaneous expected rate of return under the assumption that the latter follows an Ornstein-Uhlenbeck process. Simulation results using four different estimates of long run rates of return on U.S. common stocks suggest that this approach may be valuable for long horizon investors.

Keywords Portfolio planning · Return predictability · Dividend discount model
19.1 Introduction Hakansson (1970) and Merton (1971) extended the principles of dynamic portfolio optimization to situations of stochastic (time-varying) investment opportunities over 30 years ago, but interest in this topic languished until the 1990s under the soothing influence of the efficient markets paradigm and the belief that the investment opportunity set, if not exactly constant, was sufficiently close to constant that myopic portfolio strategies that ignored time variation in investment opportunities could be safely employed. M.J. Brennan () University of California, Los Angeles, CA, USA e-mail:
[email protected] Y. Xia Wharton School, Philadelphia, PA, USA
However, the debate about excess stock price volatility,¹ the evidence of mean reversion in stock prices,² and of the predictive power of instruments such as the dividend yield, the book-to-market ratio, the term spread and the short term interest rate,³ have revived interest in dynamic portfolio theory in recent years. This interest has been further stimulated by the behavior of stock prices in the late 1990s, which drew attention to the implications of valuation ratios such as the market dividend yield and the book-to-market ratio for expected future returns of equity securities,⁴ and therefore for portfolio strategies. The dynamic portfolio models that have been developed and calibrated to U.S. stock returns by Brennan et al. (1997), Barberis (2000), Campbell and Viceira (2001), and Xia (2001), among others, suggest that time variation in investment opportunities may have significant effects on optimal portfolio strategies for long-lived investors, and that failure to take account of time variation in expected returns may carry significant costs. The analysis of Kandel and Stambaugh (1996) suggests that short horizon investors also can benefit from predictability, even if it is highly uncertain. These models all rely on statistical relations between (excess) returns on stocks and (usually one of) a set of predictor variables or instruments such as those mentioned above. However, doubt has been cast on the usefulness of dynamic portfolio models by the weakness and instability of the estimated relations between stock returns and the instruments. It has also been suggested that in-sample "predictability" may be an illusion, the result of uncertainty and learning about the parameters of the underlying stochastic process for dividends.⁵ Consistent with this suggestion, Goetzmann and Jorion (1993) argue that long run predictive regressions lack
¹ See, for example, Shiller (1981), LeRoy and Porter (1981), Kleidon (1986), Marsh and Merton (1986).
² See, for example, Fama and French (1988b).
³ See, for example, Keim and Stambaugh (1986), Fama and French (1988a) and so forth.
⁴ See, for example, Lewellen (2004).
⁵ See Brav and Heaton (2002) and Lewellen and Shanken (2002).
power in small samples,⁶ while Bossaerts and Hillion (1999) find no evidence of out-of-sample excess return predictability at the monthly frequency in 14 countries using the standard instruments such as lagged returns, interest rates, and dividend yields. Goyal and Welch (2003) demonstrate that the dividend yield has no out-of-sample predictive power for annual stock returns during the period 1926–2002. They report that the dividend yield "is primarily forecasting future market returns over horizons greater than 10 years. It does not forecast market returns (or dividend growth) over horizons less than 5 years." In a more recent paper, Goyal and Welch (2004) show similar results for a wide range of variables⁷ and conclude that "for all practical purposes, the equity premium has not been predictable, and any belief about whether the stock market is now too high or too low has to be based on a theoretical prior." These results of Goyal and Welch seem to be consistent with the finding of Philips et al. (1996) that investment managers who followed Tactical Asset Allocation strategies during the period 1988–1994 failed to outperform static strategies.

On the other hand, there is continuing evidence of predictability in equity returns, especially over long horizons. Fama and French (1988b), Poterba and Summers (1988), Lo and MacKinlay (1988), Cochrane (1999), and Campbell and Yogo (2003) all find evidence of temporary components in stock prices that becomes stronger at long horizons. Informal evidence of long run predictability is further apparent from the long swings and upward trend in valuation ratios during the twentieth century – swings that do not appear to have been mirrored in changing dividend growth rates.⁸

The apparent paradox of strong evidence of long run predictability and only weak, if any, evidence of out-of-sample short run predictability is resolved by recognizing the importance of the time series behavior of the predictor variable that determines the time variation in the conditional mean return. This time series behavior has implications both for the estimation of the relation and for the economic importance of a given short run relation. In this paper we demonstrate, first, that levels of return predictability that are of first order economic significance for long-term investors are likely to be hard to detect from short run returns using the standard econometric methods that have been employed, even though they imply major deviations of valuation ratios from their long run "equilibrium" levels. In
⁶ Campbell (2001) shows that long-horizon regression tests have serious size distortions when asymptotic critical values are used, but some versions of such tests have power advantages remaining after size is corrected.
⁷ Campbell and Thompson (2004) show that the findings of Goyal and Welch (2004) are no longer true if sensible restrictions are imposed on the signs of predictive coefficients and return forecasts.
⁸ Campbell and Shiller (1998, 2001) and Fama and French (2002), among others, find that dividend yields have little or no forecast power for subsequent dividend growth in the years since 1870.
showing that (in-sample) predictability may be hard to find when it exists, our paper complements the analyses of Goyal and Welch (2003) and Bossaerts and Hillion (1999), which show that predictability may appear to exist for certain instruments even when no true (out-of-sample) predictability exists. Second, we show how long run measures of expected return that are anchored in fundamentals and are derived by comparing current stock prices with expected future dividends may be used to construct portfolio policies for long run investors. We provide time series estimates of instantaneous expected returns derived from these Dividend Discount Model measures of long run expected returns, and provide evidence that these measures would have been useful to long horizon investors.
19.2 Detecting and Exploiting Predictability

In this section we use the simplest model of stock return predictability, in which the proportional drift of the stock price follows a standard Ornstein-Uhlenbeck process, to analyze two issues. The first is the likelihood of being able to detect return predictability using standard approaches if the stock price drift is perfectly observable. The second issue that we address is the value of following an optimal strategy that takes account of the time variation in the drift parameter, relative to the value of an unconditional strategy that treats the drift as constant. We conclude that levels of time variation in expected returns that are very important for investors with long horizons (20 years or more) may well be undetectable using the standard statistical approaches and, even if detectable, are likely to be estimated with considerable imprecision.
19.2.1 Implications of a Model of Stock Price Behavior

Our basic framework relies on the following continuous-time model of the dynamics of the stock price with dividends reinvested, which we denote by P:

dP/P = (α + βμ) dt + σ_P dz_P,   (19.1)
dμ = κ(μ̄ − μ) dt + σ_μ dz_μ,   (19.2)

where dz_P and dz_μ are correlated standard Brownian motions with correlation ρ_μP. In this formulation, μ_t is to be thought of as a perfect signal of the drift of the asset price process, α + βμ_t; however, for the most part we shall assume that α = 0 and β = 1; then μ can be interpreted as the drift of the stock price.
The discrete-time equivalent of the Ornstein-Uhlenbeck process (19.2) is the AR(1) process:

μ_{t+Δ} − μ̄ = e^{−κΔ}(μ_t − μ̄) + ε_{t+Δ},   (19.3)
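A discretized simulation sketch of Equations (19.1)-(19.3): the drift follows the exact AR(1) transition implied by the Ornstein-Uhlenbeck process, and monthly log returns are drawn with correlated shocks. The parameter values echo scenario (v) of Table 19.1 below; the initial condition and random seed are arbitrary.

```python
# Sketch: simulate the AR(1) drift (19.3) and monthly log returns implied
# by (19.1), with correlated shocks. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(3)
dt, n = 1/12, 840                  # monthly steps, 70 years
kappa, mu_bar = 0.10, 0.09
sigma_mu, sigma_p, rho = 0.018, 0.144, -0.5

phi = np.exp(-kappa * dt)
sd_eps = sigma_mu * np.sqrt((1 - phi**2) / (2 * kappa))   # exact AR(1) shock s.d.

mu = np.empty(n + 1); mu[0] = mu_bar
ret = np.empty(n)
for t in range(n):
    z_p, z_ind = rng.standard_normal(2)
    z_mu = rho * z_p + np.sqrt(1 - rho**2) * z_ind        # correlated innovations
    ret[t] = (mu[t] - 0.5 * sigma_p**2) * dt + sigma_p * np.sqrt(dt) * z_p
    mu[t + 1] = mu_bar + phi * (mu[t] - mu_bar) + sd_eps * z_mu
print(f"mean annualized return = {ret.mean() * 12:.3f}")
```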
where Δ is the time interval between observations of μ. Let R(t, t+τ) denote the cumulative return on the stock over the interval [t, t+τ]; then

R(t, t+τ) ≡ ln P(t+τ) − ln P(t)
          = ∫_t^{t+τ} (α + βμ(s) − ½σ_P²) ds + ∫_t^{t+τ} σ_P dz_P
          = (α + βμ̄ − ½σ_P²)τ + β(μ_t − μ̄)(1 − e^{−κτ})/κ
            + (β/κ) ∫_t^{t+τ} (1 − e^{−κ(t+τ−s)}) σ_μ dz_μ(s) + ∫_t^{t+τ} σ_P dz_P(s).   (19.4)

The conditional variance of the τ-period return, R(t, t+τ), which we denote by V(τ) ≡ Var(R(t, t+τ)|μ_t), depends on the volatility of the return process σ_P, the volatility of the signal σ_μ, the horizon τ, and the mean reversion parameter κ, and can be written as:

V(τ) = σ_P²τ + (β²σ_μ²/κ²)[τ − 2(1 − e^{−κτ})/κ + (1 − e^{−2κτ})/(2κ)]
       + (2βρ_μP σ_μσ_P/κ)[τ − (1 − e^{−κτ})/κ].   (19.5)

Since the variance of μ_s conditional on μ_t (s ≥ t), σ²(μ_s|μ_t) = (σ_μ²/2κ)(1 − e^{−2κ(s−t)}), is decreasing in κ, one might conjecture that the conditional variance of the cumulative return V(τ) would also be decreasing in the mean reversion intensity κ. The following expression for the derivative of the variance with respect to κ shows that this conjecture is true only if the correlation between innovations in P and innovations in μ is positive. When the correlation is negative (as we shall typically assume it is), the sign of the derivative is indeterminate:

∂V/∂κ = (β²σ_μ²/κ³)[−2τ − 2τe^{−κτ} + τe^{−2κτ} + 6(1 − e^{−κτ})/κ − 3(1 − e^{−2κτ})/(2κ)]
        + (2βρ_μP σ_μσ_P/κ²)[−τ(1 + e^{−κτ}) + 2(1 − e^{−κτ})/κ]
        { < 0 if ρ_μP ≥ 0;  indeterminate if ρ_μP < 0 }.

To analyze the role of μ_t as a predictor of the cumulative return, R(t, t+τ), rewrite Equation (19.4) as

R(t, t+τ) = a_τ + b_τ μ_t + ε̃,   (19.6)

where

a_τ ≡ (α + βμ̄ − ½σ_P²)τ − βμ̄(1 − e^{−κτ})/κ,   (19.7)

b_τ ≡ β(1 − e^{−κτ})/κ,   (19.8)

ε̃ ≡ (β/κ) ∫_t^{t+τ} (1 − e^{−κ(t+τ−s)}) σ_μ dz_μ(s) + ∫_t^{t+τ} σ_P dz_P(s),

and ε̃ is independent of μ_t. Note first that b_τ increases with τ as long as β > 0 and that b_τ < βτ for all τ > 0. In addition, b_τ → β/κ as the horizon τ goes to infinity, so that an innovation in μ has a permanent effect on the stock price. As τ → 0, b_τ → 0 and b_τ/τ → β. Second, b_τ is decreasing in the intensity of the mean reversion parameter κ. As κ → ∞ the process for μ tends to a constant; deviations of μ from μ̄ are transient, and as a result, b_τ → 0. On the other hand, as κ → 0, the process for μ approaches a random walk, and b_τ → βτ, or b_τ/τ → β, so that an increase in μ increases the expected rate of return in all future periods by the same amount. Thus, as has been pointed out by Cochrane (1999) and others, the degree of persistence in μ has major implications for the predictive role of μ in long horizon regressions, and two signals that differ in their persistence parameter may have the same predictive importance for returns at one horizon, but quite different importance at a different horizon. This suggests, as we shall see, that it may be dangerous to attempt to assess the relevance of predictive variables by examining their power at a single horizon.

A simple measure of the predictive power of a signal μ_t is the R² from the regression (19.6) at different return intervals τ. The unconditional distribution of the predictive variable, μ_t, is normal with mean μ̄ and variance σ_μ²/(2κ), and the unconditional variance of R(t, t+τ) is simply b_τ² Var(μ_t) + Var(ε̃). The theoretical R² can be expressed explicitly in terms of the model parameters as:

R² = b_τ² Var(μ_t) / [b_τ² Var(μ_t) + Var(ε̃)]
   = [β²σ_μ²(1 − e^{−κτ})²/(2κ³)] / { β²σ_μ²(1 − e^{−κτ})²/(2κ³) + V(τ) },   (19.9)

with V(τ) given by Equation (19.5). The expression for R² depends on four parameters: the cumulative return interval τ, the mean reversion intensity of the signal κ, the noise-to-signal ratio σ_P/(βσ_μ), and the correlation between the innovations to the return and to the signal ρ_μP. While all four parameters are important in determining the magnitude of the theoretical regression R², the dependence of R² on τ has received special attention in the literature. For example, Fama and Bliss (1987), Cochrane (1997, 1999), and Campbell (2001) have all explored the relation between R² and τ empirically, finding that R² normally increases with the return interval τ. While Campbell et al. (1997) derive an approximate relation between a two-period and a one-period R² in a discrete model, our simple continuous-time setup enables us to examine the relation between R² and all relevant parameters, including τ, in closed form. To analyze formally the relation between R² and τ, differentiate R² with respect to τ:

∂(R²)/∂τ = [β²σ_μ²(1 − e^{−κτ})/(2κ³)] / [b_τ² Var(μ_t) + Var(ε̃)]²
           × { (β²σ_μ²/κ²)(2κτe^{−κτ} + e^{−2κτ} − 1)
             + σ_P²(2κτe^{−κτ} + e^{−κτ} − 1)
             + (2ρ_μP βσ_μσ_P/κ)(2κτe^{−κτ} + e^{−2κτ} − 1) }.

The expression outside the bracket is always positive, but the term inside the bracket may be positive or negative depending on the parameter values. In general, R² first increases and then decreases with the return interval τ, and a sufficient condition for ∂R²/∂τ > 0 is that τ < τ_c, where τ_c is the solution to the nonlinear equation 2κτe^{−κτ} + e^{−κτ} − 1 = 0. The turning point at which R² starts to decrease with τ depends on all four parameter values. It may also be verified that R² → 0 as τ → 0. Therefore, a variable that perfectly predicts the drift of the stock price will appear to have no predictive power if the predictive regressions are run using sufficiently short horizon returns. How short is "sufficiently" short depends on parameter values. In addition to the hump-shaped relation between R² and τ, the signs of ∂R²/∂κ and of the derivative with respect to the noise-to-signal ratio are also indeterminate. On the other hand, ∂R²/∂ρ_μP < 0: ceteris paribus, the theoretical R² is always decreasing in the correlation coefficient ρ_μP and attains its highest value when ρ_μP = −1. Since Equation (19.9) is complicated, we shall explore the parameter sensitivities of the R² at different horizons numerically by considering several reasonable scenarios.
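Equation (19.9) is easy to evaluate numerically once b_τ and V(τ) are coded from Equations (19.8) and (19.5); the sketch below does so and, for the parameters of scenario (v) in Table 19.1, reproduces the asymptotic 12-month R² of about 6.9% reported in Panel C.

```python
# Sketch: theoretical R-squared of the predictive regression (19.6),
# built from b_tau (Eq. 19.8) and the conditional variance V(tau) (Eq. 19.5).
import numpy as np

def theoretical_r2(tau, kappa, sigma_mu, sigma_p, rho, beta=1.0):
    e1 = 1.0 - np.exp(-kappa * tau)
    e2 = 1.0 - np.exp(-2.0 * kappa * tau)
    b_tau = beta * e1 / kappa                                     # Eq. (19.8)
    v_tau = (sigma_p**2 * tau                                     # Eq. (19.5)
             + (beta * sigma_mu / kappa)**2 * (tau - 2*e1/kappa + e2/(2*kappa))
             + 2*beta*rho*sigma_mu*sigma_p/kappa * (tau - e1/kappa))
    var_mu = sigma_mu**2 / (2.0 * kappa)                          # stationary Var(mu_t)
    num = b_tau**2 * var_mu
    return num / (num + v_tau)

# Scenario (v): kappa = 0.10, sigma_mu = 0.018, sigma_p = 0.144, rho = -0.5
for tau in (1/12, 1.0, 10.0, 20.0):
    print(f"tau = {tau:5.2f}y  R2 = {theoretical_r2(tau, 0.10, 0.018, 0.144, -0.5):.3f}")
```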
19.2.2 Detecting Predictive Power in Finite Samples

For our numerical analysis we choose the following "global" parameters. We set the volatility of the annual cumulative return √V(1) = 0.14, which is approximately the annual volatility of the real return on the S&P 500 index for the period 1950–2003. We set α = 0 and β = 1 so that μ is the drift of the return process. With this in mind, we set the interest rate r = 3% and the long run mean μ̄ = 9%, which implies a mean equity premium of 6%. Finally, the volatility of the stationary distribution of μ is defined by ν ≡ σ_μ/√(2κ). We set ν at 4%:⁹ this means that there is a 68% chance that the equity premium at any moment will lie between 2% and 10%, and a 6.6% probability that it will be negative.¹⁰ This seems to us to represent a substantial amount of variation in the equity premium that we should have a reasonable chance of detecting.

Within these global parameters we consider nine scenarios, the product of three values of the mean reversion parameter, κ, and three values of the correlation between innovations in μ and in returns, ρ_μP. The three values of κ are 0.02, 0.10, and 0.50, which correspond to half-lives for innovations in μ of 34.7, 6.9, and 1.4 years, and therefore would seem to span the economically interesting range. The values of ρ_μP are −0.90, −0.50, and 0.0. As mentioned above, we restrict our attention to nonpositive correlations because it is to be expected that, as innovations in expected returns are equivalent to innovations in discount rates, they should be negatively related to contemporaneous stock returns;¹¹ Xia (2001) reports a correlation of −0.93 between monthly innovations in the dividend yield and stock returns for the period January
⁹ We shall find that this corresponds to the empirical estimates of this parameter reported below.
¹⁰ Boudoukh et al. (1993) report evidence that the ex ante risk premium is negative in some states of the world.
¹¹ It is possible for innovations in discount rates to be positively correlated with current returns if the innovations in discount rates are highly correlated with innovations in cash flow news.
1950 to December 1997, which is similar to Barberis (2000) who also uses the dividend yield as a predictor of stock returns.
Table 19.1 reports various statistics for our nine different scenarios. The first four lines of the table report the parameters describing each scenario. The volatility of the drift, σ_μ,
Table 19.1 Stock return model parameters and parameter estimates

Scenarios    (i)     (ii)    (iii)   (iv)    (v)     (vi)    (vii)   (viii)  (ix)
κ            0.02    0.02    0.02    0.10    0.10    0.10    0.50    0.50    0.50
σ_μ          0.008   0.008   0.008   0.018   0.018   0.018   0.040   0.040   0.040
ρ_μP         0.00    −0.50   −0.90   0.00    −0.50   −0.90   0.00    −0.50   −0.90
σ_P          0.140   0.142   0.144   0.140   0.144   0.148   0.139   0.147   0.155

Panel A: Variance ratios                                                        Historical
1 month      1.00    1.03    1.05    1.00    1.05    1.10    0.98    1.10    1.20    —
6 months     1.00    1.01    1.03    1.00    1.03    1.05    0.99    1.05    1.10    —
1 year       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
2 years      1.00    0.98    0.95    1.01    0.95    0.90    1.04    0.94    0.86    1.04
3 years      1.01    0.95    0.91    1.03    0.92    0.82    1.07    0.91    0.77    0.88
4 years      1.02    0.93    0.86    1.06    0.89    0.75    1.11    0.89    0.70    0.88
5 years      1.02    0.91    0.82    1.09    0.87    0.69    1.13    0.88    0.66    0.86

Panel B: 1-month R²
Asymptotic   0.7%    0.7%    0.6%    0.7%    0.6%    0.6%    0.7%    0.6%    0.5%
25%          0.1%    0.2%    0.5%    0.2%    0.4%    0.6%    0.3%    0.3%    0.4%
Median       0.2%    0.4%    0.7%    0.5%    0.6%    0.8%    0.6%    0.6%    0.7%
Mean         0.3%    0.5%    0.7%    0.6%    0.7%    0.9%    0.7%    0.7%    0.7%
75%          0.5%    0.7%    1.0%    0.9%    1.0%    1.1%    1.0%    1.1%    1.0%

Panel C: 12-month R²
Asymptotic   7.4%    7.4%    7.4%    6.9%    6.9%    6.9%    4.8%    4.8%    4.8%
25%          0.6%    2.3%    5.4%    2.0%    3.7%    6.4%    1.7%    2.4%    3.3%
Median       2.3%    5.0%    7.9%    5.1%    6.8%    9.0%    4.6%    5.2%    6.0%
Mean         3.0%    6.9%    8.4%    6.6%    7.7%    9.5%    5.7%    6.4%    6.8%
75%          5.8%    8.5%    11.0%   9.6%    10.8%   12.1%   8.4%    9.2%    9.5%

Panel D: 1-month bias-corrected regression coefficient b_τ (t-ratios in parentheses)
Theoretical b_τ  0.083 (2.39)  0.083 (2.36)  0.083 (2.33)  0.083 (2.38)  0.083 (2.32)  0.083 (2.27)  0.082 (2.36)  0.082 (2.23)  0.082 (2.14)
25%              0.038 (0.58)  0.041 (0.65)  0.037 (0.65)  0.058 (1.32)  0.053 (1.28)  0.051 (1.29)  0.057 (1.56)  0.056 (1.49)  0.054 (1.44)
Median           0.082 (1.27)  0.082 (1.26)  0.081 (1.22)  0.085 (2.05)  0.081 (1.88)  0.080 (1.81)  0.081 (2.29)  0.080 (2.10)  0.078 (1.94)
Mean             0.082 (1.31)  0.094 (1.26)  0.101 (1.21)  0.085 (2.05)  0.086 (1.88)  0.089 (1.81)  0.081 (2.29)  0.082 (2.11)  0.082 (1.97)
75%              0.127 (2.08)  0.133 (1.87)  0.146 (1.79)  0.112 (2.80)  0.113 (2.48)  0.117 (2.30)  0.104 (2.96)  0.107 (2.76)  0.107 (2.52)

Panel E: 12-month bias-corrected regression coefficient b_τ (t-ratios in parentheses)
25%              0.453 (0.56)  0.491 (0.65)  0.419 (0.62)  0.654 (1.22)  0.595 (1.20)  0.575 (1.22)  0.495 (1.11)  0.481 (1.09)  0.464 (1.08)
Median           0.987 (1.28)  0.980 (1.22)  0.933 (1.18)  0.974 (1.93)  0.924 (1.78)  0.910 (1.74)  0.778 (1.83)  0.761 (1.74)  0.741 (1.66)
Mean             0.987 (1.30)  1.113 (1.25)  1.170 (1.19)  0.971 (1.96)  0.963 (1.81)  1.005 (1.77)  0.787 (1.83)  0.791 (1.78)  0.784 (1.69)
75%              1.549 (2.04)  1.587 (1.87)  1.717 (1.78)  1.314 (2.73)  1.283 (2.42)  1.337 (2.29)  1.059 (2.51)  1.080 (2.41)  1.070 (2.26)

Note: Panel A reports theoretical variance ratios based on Equation (19.5) for the returns generated under each of the scenarios. The historical data in this panel are from Poterba and Summers (1988): real returns from 1871 to 1985, scaled so that the 1-year ratio is 1.00. Panels B–C report the distributions of statistics from ordinary least squares predictive regressions with non-overlapping observations of the form R(t, t+τ) = a + b_τ μ_t + ε for τ = 1, 12 months. The distributions are estimated from 2000 simulations of the return process given by Equations (19.3) and (19.4) for 840 months using the parameter values for each of the nine scenarios. Panels D–E report the bias-corrected regression coefficients and their t-ratios using the Amihud and Hurvich (2004) approach, where both the point estimate and the standard errors are adjusted. The Stambaugh (1999) correction yields almost the same point estimates and is omitted from the table. The exogenous parameters are ν = 4%, μ̄ = 9%, r = 3%, √V(1) = 14%, α = 0, and β = 1.
is set to be consistent with the exogenously chosen values of ν and κ using the formula σ_μ = ν√(2κ). The volatility of the instantaneous return σ_P is chosen to be consistent with √V(1) = 0.14 and the values of the other parameters using the quadratic Equation (19.5). Panel A reports the variance ratios implied by the nine different scenarios. The variance ratio for m months is defined by the expression:

VR_m = (12/m) · Var(R(t, t + m/12)) / Var(R(t, t + 1)),

where R(t, t + m/12) is the cumulative m-month return, and where the variances of the returns are calculated using expression (19.5). Under a pure random walk the variance ratio will be equal to unity for all values of m. Mean reversion or negative autocorrelation in stock prices, such as would be introduced by changing discount rates, will cause the variance ratios to decline with the return calculation interval. The right-hand column of the table reports the variance ratios calculated by Poterba and Summers (1988) for real U.S. equity returns in the period 1871–1985. An interesting feature of the Poterba-Summers ratios is that they increase for the 2-year calculation period, which implies small positive autocorrelation at this horizon. Comparing the variance ratios under the nine scenarios with those reported by Poterba and Summers, we see that scenarios (v) and (viii) (ρ_μP = −0.50 and κ = 0.10, 0.50) are the most consistent with the empirical data.
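The model-implied variance ratios of Panel A can be reproduced directly from the closed form for V(τ); the sketch below evaluates VR_m for scenario (v) (for example, it yields VR ≈ 0.87 at 5 years, matching the table).

```python
# Sketch: model-implied variance ratio
# VR_m = (12/m) * Var(R(t, t+m/12)) / Var(R(t, t+1)),
# using the conditional variance V(tau) of Eq. (19.5). Inputs: scenario (v).
import numpy as np

def v_tau(tau, kappa, sigma_mu, sigma_p, rho, beta=1.0):
    e1 = 1.0 - np.exp(-kappa * tau)
    e2 = 1.0 - np.exp(-2.0 * kappa * tau)
    return (sigma_p**2 * tau
            + (beta * sigma_mu / kappa)**2 * (tau - 2*e1/kappa + e2/(2*kappa))
            + 2*beta*rho*sigma_mu*sigma_p/kappa * (tau - e1/kappa))

def variance_ratio(months, kappa=0.10, sigma_mu=0.018, sigma_p=0.144, rho=-0.5):
    tau = months / 12.0
    v1 = v_tau(1.0, kappa, sigma_mu, sigma_p, rho)
    return (12.0 / months) * v_tau(tau, kappa, sigma_mu, sigma_p, rho) / v1

for m in (1, 6, 12, 24, 36, 48, 60):
    print(f"{m:3d} months: VR = {variance_ratio(m):.2f}")
```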
the median of the simulations yield estimates between 3.0% and 9.5%, which compares with the R2 of 17% reported by Cochrane (1999) where overlapping observations of pricedividend ratio are used to predict annual excess stock returns. The 25th percentile of the R2 distribution is around 3% points below the mean, so the Goyal and Welch (2003) estimate of 5.83% for the period 1926–1990 is well within the range of estimates that are consistent with our scenarios. Figure 19.1 plots the theoretical R2 as a function of the return interval, , for the nine different scenarios. Quite different patterns arise from the different sets of parameter values. Three features stand out: First, the generally humped shape of the graphs that is consistent with the theoretical analysis in the previous subsection; second, the small differences between the R2 at the 1-year return interval for the different scenarios as compared with the large differences in R2 for longer intervals; and third, the importance of P , as well as the persistence parameter , in determining the long horizon R2 . When D 0:10, for example, the R2 at the 20-year interval ranges from 15.9% when P D 0 to 46.1% when P D 0:9. Figures 19.2a and b plot the 10- and 20-year theoretical R2 against the 1-year theoretical R2 . The 1-year R2 is independent of the correlation P by construction, since we set p V .1/ exogenously at 14%. The same R2 at the 1-year horizon can be associated with quite different R2 at long horizons: when D 0:1, for example, the values of the 1-year R2 are all about 6.9%, but the R2 at the 20-year horizon for P D 0:0 is 15.9% vs. 46.1% for P D 0:9. Thus, modest variation in the 1-year R2 between 6.5% and 7.5% is associated with enormous variation in values of R2 at the 10- and 20-year intervals – from under 20% to over 70% in the case of the 20-year R2 . Figures 19.3a and b show the separate influences of and P on the long horizon R2 across the nine scenarios. It is clear from these figures that a high value of (0.5) greatly attenuates the long-horizon predictability of the return. On the other hand, the predictability of long run stock returns when expected returns are highly persistent depends strongly on the degree of negative correlation between the drift innovations and the stock return. For example, Scenario (i), which has a highly persistent expected return . D 0:02/ but zero correlation .P D 0:0/, yields less return predictability at the 10- and 20-year horizons than does Scenario (vi), which has less persistent expected returns . D 0:10/ but a highly negative correlation .P D 0:9/. These examples show that, for a given unconditional volatility of the instantaneous expected return, the long run predictability of stock returns depends crucially on the correlation between innovations in expected returns and the realized stock return. This is not surprising since, in the absence of underlying cash flow uncertainty, the correlation would be perfect and long run returns would be perfectly predictable. Thus, it is dangerous
19 Persistence, Predictability, and Portfolio Planning
295
Fig. 19.1 Theoretical R2 as a function of the horizon. The figure plots the theoretical regression R2 as a function of the investment horizon , where varies from 0.08 (1 month) to 20 years for the nine scenarios reported in Table 19.1. The exogenous parameters are v D 4%; D 9%; r D p 3%; V .1/ D 14%; ˛ D 0, and ˇD1
to infer anything about the long run predictability of stock returns from the R2 of 1-year regressions, particularly when the sampling variability of the R2 is taken into account. For example, Table 19.1 Panel B shows that under Scenario (ii) there is a 50% chance of estimating an R2 below 5% in an annual regression using 70 years data, although, as shown in Table 19.3a below, the theoretical R2 for 20-year returns is almost 57%. Panels D and E report statistics for the bias-adjusted coefficients of t and their associated t-ratios from predictive regressions for 1-month and 1-year returns using the simulated data. For most of the scenarios, the median t-statistic is less than 2, and the predictive relation between the signal and the stock return would not be identified as significant in roughly two-thirds of the cases.12 The shorter data samples that are typically used in empirical studies to identify potential signals of the stock price drift make it even less likely that a significant relation will be found, and if the drift is not a perfect linear function of the signal as we have assumed, then the likelihood of identifying a signifi-
cant relation is further reduced. Similar results are obtained for the annual forecast period. Now, however, it is noticeable that both the theoretical and the estimated value of b decline for large that reduces the persistence of the signal. The evidence in Table 19.1 shows that for the stock price processes that we have considered it is likely to be difficult to identify a significant relation between even a perfect signal of the expected return on stocks and the realized returns; even if a statistically significant relation is found, the standard error of the coefficient will be large, which makes the use of the estimate for portfolio planning purposes problematic.13 Thus, the failure of Goyal and Welch (2004) to find any ex-post predictive power in the dividend yield for excess returns on common stocks is not surprising even if the equity premium has a standard deviation as high as 4% and the dividend yield is a perfect signal of the expected return. Throughout this section we have assumed that the standard deviation in the equity premium is 4%. While this seems to us to be quite large, we have found that it is broadly consistent with the levels of stock return predictability and negative autocorrelation
12
13 See Xia (2001) for an analysis of the effects of parameter uncertainty on optimal portfolio planning.
Of course, if the coefficient is not corrected for bias, the t -statistic is somewhat higher.
296
M.J. Brennan and Y. Xia
Fig. 19.2 Theoretical 10- and 20-year R2 vs. 1-year R2 . The figure plots the theoretical regression R2 at long horizon (10 and 20 years) as a function of 1-year R2 for the nine scenarios reported in Table 19.1. The exogenous parameters are v D 4%; D 9%; r D p 3%; V .1/ D 14%; ˛ D 0, and ˇD1
in stock returns that previous researchers have found. We consider next whether this level of return predictability is of economic significance for variation in the level of stock prices.
19.3 Stock Price Variation and Variation in the Expected Returns To assess the implications for the level of stock prices of time variation in expected returns of the magnitude discussed above, we employ a simple valuation model in which the instantaneous expected rate of return on the stock, ,14 follows the O-U process described by Equation (19.2). Since stock prices vary with dividend growth expectations as well as expected returns, we assume that the expected growth rate in
dividends is constant in order to isolate the effects of expected returns on stock prices. Thus, the dividend on the stock is assumed to follow a geometric Brownian Motion with constant expected growth rate, g: dD D gdt C D dzD : D
Let V .D; / Dv ./ denote the value of the stock, which is homogeneous of degree one in the level of the dividend.15 Ito’s Lemma implies that the process for capital gains plus dividends may be written as: 00 Ddt C dV 1 v 2 v0 1 D C Œ. / C D C g C dt V 2 v v v CD dzD C 15
14
Note that we are assuming here that ˛ D 0 and ˇ D 1.
(19.10)
v0 dz ; v
(19.11)
Note that P is used to denote the stock price with dividends reinvested and V to denote the ex-dividend value.
19 Persistence, Predictability, and Portfolio Planning
297
Fig. 19.3 Theoretical 10- and 20-Year R2 as a function of and P . The figure plots the theoretical regression R2 at long horizon (10 and 20 years) as a function of the mean reverting parameter , and the correlation between the innovations of the process of the instantaneous return and its drift P . The exogenous parameters are v D 4%; D 9%; r D p 3%; V .1/ D 14%; ˛ D 0, and ˇD1
Table 19.3a Long run return predictability and the value of dynamic vs. unconditional strategies Horizon Scenarios (i) (ii) (iii) (iv) (v) (vi) › 0:02 0:02 0:02 0:1 0:1 0:1 0:008 0:008 0:008 0:018 0:018 0:018 P 0 0:5 0:9 0 0:5 0:9 P 0:14 0:142 0:144 0:14 0:144 0:148
(vii) 0:5 0:04 0 0:139
(viii) 0:5 0:04 0:5 0:147
(ix) 0:5 0:04 0:9 0:155
1 year
R2 t D 0:05 t D 0:09 t D 0:13
7:40% 1:01 1 1:01
7:40% 1:01 1 1:01
7:40% 1:01 1 1:01
6:90% 1:01 1 1:01
6:90% 1:01 1 1:01
6:90% 1:01 1 1:01
4:80% 1:01 1 1:01
4:80% 1:01 1 1:01
4:80% 1:01 1 1:01
5 years
R2 t D 0:05 t D 0:09 t D 0:13
26:50% 1:04 1 1:04
28:80% 1:03 1 1:06
31:00% 1:03 1:01 1:08
18:80% 1:05 1:02 1:03
22:40% 1:03 1:02 1:06
26:80% 1:02 1:03 1:11
4:60% 1:05 1:03 1:04
5:90% 1:04 1:04 1:06
7:80% 1:04 1:05 1:09
10 years
R2 t D 0:05 t D 0:09 t D 0:13
38:00% 1:1 1:02 1:06
44:10% 1:06 1:02 1:14
50:70% 1:04 1:04 1:25
20:40% 1:11 1:05 1:06
27:80% 1:07 1:06 1:14
39:80% 1:07 1:13 1:35
2:60% 1:09 1:08 1:08
3:60% 1:09 1:09 1:11
5:60% 1:13 1:15 1:2
20 years
R2 t D 0:05 t D 0:09 t D 0:13
45:60% 1:27 1:07 1:09
56:80% 1:14 1:07 1:32
71:10% 1:09 1:25 2:16
15:90% 1:3 1:19 1:16
24:60% 1:17 1:16 1:27
46:10% 1:29 1:57 2:25
1:30% 1:2 1:18 1:18
1:90% 1:19 1:2 1:22
3:30% 1:36 1:39 1:46
40 years
R2 t D 0:05 t D 0:09 t D 0:13
43:70% 2:17 1:5 1:29
58:00% 1:46 1:27 1:58
79:30% 1:42 2:43 9:02
8:80% 1:9 1:7 1:62
15:20% 1:43 1:42 1:56
38:30% 2:32 3:25 5:47
0:60% 1:43 1:41 1:41
1:00% 1:45 1:45 1:48
1:80% 2 2:05 2:14
This table reports the theoretical values under different scenarios of R2 from regressions of long at the beginning of the period. It also reports the ratios of the certainty equivalent wealth for an certainty equivalent wealth under an unconditional strategy for different horizons and initial values p ters are v D 4%; D 9%; r D 3%; V .1/ D 14%; ˛ D 0, and ˇ D 1. The risk
where D D D is the covariance between the dividend growth and the expected return. Equating the drift term in (19.11) to the expected rate of return yields the following ordinary differential equation for the price-dividend ratio, v: 1 00 2 v Cv0 Œ./CD C.g /vC1 D 0: 2 2
(19.12)
run returns on the value of optimal dynamic strategy to the of t . The exogenous parameaversion parameter is D 5
and we impose a lower reflecting boundary on at such that v0 . / D 0.16 Equation (19.11) is a quasi-structural model of return predictability that corresponds to the purely statistical model (19.1–19.2). In both models the expected return is an exogenously specified O-U process, but (19.11) shows how the realized stock return depends on the innovations in the expected return and in the dividend. The instantaneous
< . The boundary with the growth condition: g C 22 D condition for v./ for high values of is lim!1 v./ D 0;
See Feller (1951). In our simulations below, we set D 2:5%, which is about three standard deviations away from the long run D 9%. 16
298
M.J. Brennan and Y. Xia
volatility of the stock price, V , is: P2
D
V2
D2
C
v0 v
2
v0 C 2 D ; v
(19.13)
so that the stochastic expected return drives a wedge between V and D . If v0 is zero (i.e., is a constant), then V D D . Note that the assumption that V (and therefore P ) is constant is made for simplicity, which is obviously 0 inconsistent with the above relation (19.13) in which vv is stochastic since it is a function of . In order to explore the sensitivity of the price-dividend ratio, v, to the expected rate of return , when the expected rate of return follows the stochastic process (19.2), the partial differential equation was solved numerically for the values of and P that describe the scenarios used in Table 19.1. The real expected dividend growth rate, g, and the volatility of the real dividend growth rate, D , are set to 1.85% and 12.02% respectively, which are the sample mean and volatility of the historical annual real dividend growth rate in Shiller’s data for the period from 1872 to 2001.17 This leaves one parameter, D , of the system (19.10 and 19.2) to be determined endogenously. The covariance between the total return pro0 cess and d is V D D C vv 2 , and there is an approximate equality between P and V : P P V D D C
v0 2 : v
(19.14)
19.4 Economic Significance of Predictability
This is combined with (19.13) to determine D as: 2 D 1
P2 2 .1 P /: D2
10.8%, across the six available scenarios. The price-dividend ratios in the case of negative D tend to be greater, but show broadly similar patterns to those in Panel A, except for scenarios (ii) and (iii) where the growth condition failed. The price-dividend ratio is very sensitive to the variation in , the sensitivity being determined almost entirely by the persistence parameter . When D 0:10, an decrease in from one standard deviation above its mean to one standard deviation below its mean increases the price-dividend ratio by 66–80%. The variation is much greater when D 0:02, and much less when D 0:50. However, the value of P has very little effect on the sensitivity of the price-dividend ratio to , although, as we have seen, this parameter is a major determinant of the predictability of long run returns. The results in Tables 19.1 and 19.2 show that it is possible for changes in the expected rate of return, , to have very large effects on stock prices without those changes being easily detected by standard statistical approaches. For example, Table 19.1 shows that in Scenario (v), in which D 0:10, the median bias-adjusted t-ratio on the predictor variable in regressions of 1-year (1-month) returns on is only 4.78 (4.88) when 70 years of data are available. Yet Table 19.2 shows that in this same scenario a change in from one standard deviation below the mean to one standard deviation above the mean can increase the dividend yield by 74% without any change in dividend growth rate expectations.
(19.15)
Conditional on P > D , there exists no solution for D when P D 0 (Scenarios (i), (iv), and (ix)), but there are two equal solutions of opposite sign for D in the other scenarios. In this case, Equation (19.15) together with (19.14) implies that v0 < 0, which is intuitive because a higher discount rate leads to a smaller price-dividend ratio. Panels A and B of Table 19.2 report, for the scenarios considered in Table 19.1 for which a feasible solution exists, the price-dividend ratios and the corresponding dividend yields for D > 0, while Panels C and D report these variables for D < 0. The price-dividend ratio in the case of positive D ranges from a low of 9.25 and a high of 30.12, with the corresponding dividend yield ranging from 3.3% to
Under the scenarios that we have described above, an investor faces a situation in which there is little short run predictability in stock returns, but there may be very significant long run predictability, depending on the persistence of the equity premium and the correlation of innovations in the premium with stock returns. Under these circumstances it is natural to ask whether there are likely to be significant costs for an investor who ignores time variation in expected returns. We answer this question by comparing the certainty equivalent wealth of an investor under an “unconditional” strategy with the certainty equivalent wealth under the “optimal” strategy. The investor is assumed to have an iso-elastic utility function defined over end-of-period wealth at the investment horizon T :18 max E0 Œu.WT / ; x
17 The P/D ratio and dividend data are from Robert Shiller’s web page http://www.econ.yale.edu/shiller/data.htm. Fama and French (2002) report that the mean dividend growth rate for the S&P 500 was 2.74% for the period 1872 to 1950 and was 1.05% (1.3% from Shiller’s data) for the period 1951 to 2000, which contrasts with the real earnings growth rate of 2.82% over the last half century.
18 Brennan and Xia (2002) and Wachter (2002) show that the determination of an optimal consumption and investment program for an investor with time-additive utility can be represented as a time integral of optimal utility of terminal wealth problems.
19 Persistence, Predictability, and Portfolio Planning
299
Table 19.2 Stock return models and valuation ratios Scenarios (i) (ii) › 0.02 0:02 0.008 0:008 P 0.00 0:50 P 0.140 0:142
(iii) 0:02 0:008 0:90 0:144
(iv) 0.10 0.018 0.00 0.14
(v) 0:10 0:018 0:50 0:144
(vi) 0:10 0:018 0:90 0:148
(vii) 0.50 0.040 0.00 0.139
(viii) 0:50 0:040 0:50 0:147
(ix) 0:50 0:040 0:90 0:155
Panel A: Price-dividend ratios .D > 0/ D n.a. t D . v / D 5% n.a. t D D 9% n.a. t D C v D 13% n.a.
0:421 30:12 14:55 9:66
0:887 25:47 13:42 9:25
n.a. n.a. n.a. n.a.
0:387 19:47 14:37 11:43
0:879 17:67 13:20 10:60
n.a. n.a. n.a. n.a.
0:324 15:17 13:92 12:98
0:864 14:28 13:11 12:23
Panel B: Dividend yields .D > 0/ t D . v / D 5% n.a. t D D 9% n.a. t D C v D 13% n.a.
3:3% 6:9% 10:4%
3:9% 7:5% 10:8%
n.a. n.a. n.a.
5:1% 7:0% 8:8%
5:7% 7:6% 9:4%
n.a. n.a. n.a.
6:6% 7:2% 7:7%
7:0% 7:6% 8:2%
Panel C: Price-dividend ratios .D < 0/ D n.a. 0:421 t D . v / D 5% n.a. 57:80 t D D 9% n.a. 20:26 t D C v D 13% n.a. 11:27
0:887 191:45 46:56 17:52
n.a. n.a. n.a. n.a.
0:387 23:50 17:03 13:30
0:879 27:38 19:59 15:10
n.a. n.a. n.a. n.a.
0:324 16:41 15:05 14:03
0:864 17:62 16:15 15:05
0:5% 2:1% 5:7%
n.a. n.a. n.a.
4:3% 5:9% 7:5%
3:7% 5:1% 6:6%
n.a. n.a. n.a.
6:1% 6:6% 7:1%
5:7% 6:2% 6:6%
Panel D: Dividend yields .D < 0/ t D . v / D 5% n.a. t D D 9% n.a. t D C v D 13% n.a.
1:7% 4:9% 8:9%
This table reports valuation ratios for a security whose dividend follows: dD D gdt C D dzD D and the expected rate of return on the security follows an Ornstein-Uhlenbeck process: d D . /dt C dz The real dividend growth rate g D 1:85% and the volatility D D 12:02% is from the sample mean and volatility of historical data of real annual dividend growth from 1872 to 2001, provided by Robert Shiller. The correlation, D is derived endogenously across the nine scenarios considered in Table 19.1, and “n.a.” means that no solution exists for D in that scenario. The growth condition failed in scenario (ii) and (iii) p under D < 0 The other exogenous parameters are v D 4%; D 9%; r D 3%; V .1/ D 14%; ˛ D 0, and ˇ D 1
where u.WT / D
8 < :
1
e
T
e T
WT 1 ln WT
> 1; D 1:
The dynamic portfolio choice problem is subject to the dynamic budget constraint: dW D Œx.˛ C ˇ r/ C r dt C xP dzP ; W with x defined as the proportion of wealth invested in the single risky asset whose stochastic process was given in Equations (19.1–19.2). The risk free interest rate, r, is assumed to be constant for simplicity.19 The “unconditional” strategy, which is based on the assumption that the equity premium is constant, allocates a constant fraction of wealth 19 For models of dynamic investment strategies with stochastic interest rates see Campbell and Viceira (2001) and Brennan and Xia (2002).
to the risky asset, but the allocation to the risky asset under the “optimal” strategy depends on t as shown below. Denoting the equity premium by yt ˛ C ˇt r, the investor’s optimal dynamic policy20 is to invest a fraction of his wealth, xt , in the risky asset, where: xt D
P yt C ŒB./ C C./yt ; P P2
(19.16)
and T t is the time remaining to the investment horizon. The investor’s indirect utility, J.W; ; / maxxt E0 Œu.WT / , is given by: J.W; ; / De
t
1 W 1 2 exp A./ C B./yt C C./yt 1 2
20 This was first derived by Kim and Omberg (1996); it is discussed in more detail in Xia (2001).
300
M.J. Brennan and Y. Xia
e t
W 1 o ' .; t/; 1
(19.17)
where A./; B./ and C./ are defined in Appendix 19A. The optimal equity allocation, xt , is a time-dependent function of the equity premium y. Xia (2001) shows that the allocation in general is a non-monotonic function of the investment horizon, and that if the return process is negatively correlated with the process for , i.e., P < 0, which we have argued is likely to be case, then the optimal equity allocation x most likely increases with the investment horizon . Let x u denote the constant equity allocation under the unconditional strategy that treats the equity premium, y ˛ C ˇ r, as a constant. Then xu
y : P2
(19.18)
Although the optimal unconditional strategy is based on an equity premium that is assumed to be constant, y, the evolution of the investor’s wealth depends on the true dynamics of the equity premium r. As shown in Appendix 19B, the indirect utility function for the unconditional strategy under the true dynamics of the equity premium is given by:
Fig. 19.4 The certainty equivalent wealth ratio CEWR as a function of and P . The figure plots the certainty equivalent wealth ratio .CEWRou ; CEWRom ; CEWRob / between the optimal dynamic strategy and the unconditional, the myopic, and the buy-and-hold strategies at the 20-year-horizon for t D 5%; 9%; 13% as a function of the mean reverting parameter , and the correlation between the innovations of the process of the instantaneous return and its drift P . The exogenous parameters are v D 4%; D 9%; r D p 3%; V .1/ D 14%; ’ D 0, and ˇD1
J u .W; ; / D e t e t
W 1 exp fD./ C E./T g 1
W 1 u ' .; t/; 1
(19.19)
with D./ and E./ defined in Appendix 19B. The Certainty Equivalent Wealth under the optimal strategy, CEW o .; /, is defined as the amount of wealth to be received at the horizon with certainty that would give the investor the same expected utility as he receives under the optimal strategy with an initial $1 of wealth to invest: 1
CEW o .; / Œ' o .; t/ 1 :
(19.20)
The Certainty Equivalent Wealth for the unconditional strategy, CEW u .; /, is defined analogously. We shall analyze the ratio of the certainty equivalents under the two strategies, CEW Rou , which is defined by CEW Rou .; /
CEW o .; / : CEW u .; /
The ratio, CEW R, is a measure of the value of taking account of variation in the equity premium. Table 19.3a and Fig. 19.4a–c reports values of CEW Rou for different horizons across the nine different scenarios for
19 Persistence, Predictability, and Portfolio Planning
the stochastic stock price processes, and for different initial values of t , when the coefficient of relative risk aversion, , is equal to 5. We include scenarios (i), (iv), and (vii) in the table for completeness, although the pattern of declining variance ratios generated by these scenarios is inconsistent with the empirical evidence, and the zero value of D contrasts with the negative value we expect to be associated with discount rate effects. Therefore, we restrict our discussion to the other six scenarios. When t D , the optimal strategy offers no measurable advantage over a 1-year horizon in any of the scenarios; a gain of up to 5% over a 5-year horizon; a gain of between 2% and 15% over 10 years, depending on the scenario, and a gain of 7–57% over 20 years, again depending on the scenario. Under Scenario (v) the gain is 2% for a 5-year horizon, 6% for a 10-year horizon, and 16% for a 20-year horizon. Focusing on the 20-year horizon, the gain from the optimal strategy when t is one standard deviation above its mean may be substantial, particularly when P D 0:9 (scenarios (iii) and (vi), and (ix)) and, for any given scenario, the increase in the CEW R is greater for a one standard deviation increase in t than is the reduction for a half standard deviation decrease in t . Note that, while the R2 of the predictive regressions was determined primarily by the value of , the influence of P on CEW R is much greater; for example, comparing scenarios (iv) and (vi) at the 20-year horizon when t D , the advantage of the optimal strategy increases from 19% to 57% as P decreases from zero to 0:90. This is because a large negative correlation between stock returns and the expected rate of return reduces the volatility of long horizon returns; this effect is taken into account by the optimal strategy that invests more than the unconditional strategy in stocks for > when the horizon is long. The benefit of the optimal strategy is a non-monotone function of the persistence parameter , tending to be greatest for D 0:10 at the longer horizons. The short run predictability of returns as measured by the 1-year R2 is not, however, necessarily associated with higher certainty equivalent for the optimal strategy relative to the unconditional strategy. This is illustrated in Fig. 19.3, where the certainty equivalent ratio under the optimal and the unconditional strategies for a 20-year horizon, CEW Rou , is plotted against the 1-year R2 for the nine scenarios. Interestingly, Table 19.3a shows that there is also no clear relation between the advantage of the optimal strategy and the long run predictability of returns (20-year regression R2 ) when D : the correlation between the 20-year CEW R across scenarios and the 20-year R2 is 0:05. Less surprisingly, since carries more information about future investment opportunities when the R2 is high, the 20-year CEW R has a correlation of 0.48 with the 20-year R2 when is one standard deviation below its mean and a correlation of 0.57 when is one standard deviation above its mean.
301
Finally, it is interesting to note that there is also no relation between CEW Rou and the monthly or annual out-of-sample predictability as measured by the root mean squared error (RMSE).21 In summary, the gains to the optimal strategy can be very large for long horizon investors. For scenarios (v) and (vi), which correspond closely to the behavior of the expected rate of return that we shall extract from forecasts of long run rates of return, the gains run from 16% to 125% over a 20-year horizon, depending on the initial value of . The gains of the optimal strategy come from both market timing and hedging. Market timing is simply the variation in the equity allocation with the equity premium. Hedging is the additional allocation to equities that results from the negative correlation between innovations to rates of return and returns on the equity security, which implicitly recognizes that equities are not so risky in the long run.22 The benefits of market timing, but not of hedging, are captured by the myopic rule, x m : xm
˛ C ˇt r yt D : 2 P P2
(19.21)
The certainty equivalent wealth associated with the myopic strategy is calculated by evaluating the expected utility associated with the myopic strategy, and the details are given in Appendix 19C. Table 19.3b reports the certainty equivalent wealth ratios between the optimal and the myopic strategies CEW Rom . The results show that the hedging gains offered by the optimal strategy but not by the myopic strategy, are zero when P D 0 and of the order of 6% at the 20-year horizon when P D 0:50, but are as high as 38–66% when P D 0:90. For comparison, the certainty equivalent wealth associated with the optimal buy-and-hold strategy, x b , which is described in Appendix 19D, was calculated numerically. The initial equity allocation of the buy-and-hold strategy depends on the value of t , but subsequent changes in the allocation are determined entirely by the realized asset returns. The certainty equivalent wealth ratios between the optimal and the buy-and-hold strategies CEW Rob , reported in Table 19.3c and Fig. 19.4g–i, show that the buy-and-hold strategy is
21 The out of sample RMSE is calculated from the simulated data used for Table 19.1; the predictive relation is estimated from the first 65 years of data, and the out-of-sample RMSE is calculated from the differences between the predicted and the realized returns over the following five years. The results, which are not reported here, are available on request. Goyal and Welch (2003, 2004) use the out-of-sample RMSE as a measure of the value of a predictive instrument. 22 Stambaugh (1999) and Barberis (2000) compare myopic and buyand-hold strategies.
302
M.J. Brennan and Y. Xia
Table 19.3b Long run return predictability and the value of dynamic vs. myopic strategies Horizon
(v)
(vi)
(vii)
(viii)
(ix)
› P P
Scenarios
(i) 0:02 0:008 0:00 0:140
(ii) 0:02 0:008 0:50 0:142
(iii) 0:02 0:008 0:90 0:144
(iv) 0:10 0:018 0:00 0:140
0:10 0:018 0:50 0:144
0:10 0:018 0:90 0:148
0:50 0:040 0:00 0:139
0:50 0:040 0:50 0:147
0:50 0:040 0:90 0:155
1 year
R2 t D 0:05 t D 0:09 t D 0:13
7:4% 1:00 1:00 1:00
7:4% 1:00 1:00 1:00
7:4% 1:00 1:00 1:00
6:9% 1:00 1:00 1:00
6:9% 1:00 1:00 1:00
6:9% 1:00 1:00 1:00
4:8% 1:00 1:00 1:00
4:8% 1:00 1:00 1:00
4:8% 1:00 1:00 1:00
5 years
R2 t D 0:05 t D 0:09 t D 0:13
26:5% 1:00 1:00 1:00
28:8% 1:00 1:00 1:00
31:0% 1:00 1:00 1:00
18:8% 1:00 1:00 1:00
22:4% 1:00 1:00 1:00
26:8% 1:00 1:01 1:02
4:6% 1:00 1:00 1:00
5:9% 1:00 1:00 1:00
7:8% 1:01 1:01 1:02
10 years
R2 t D 0:05 t D 0:09 t D 0:13
38:0% 1:00 1:00 1:00
44:1% 1:00 1:00 1:01
50:7% 1:00 1:02 1:04
20:4% 1:00 1:00 1:00
27:8% 1:00 1:01 1:02
39:8% 1:02 1:06 1:12
2:6% 1:00 1:00 1:00
3:6% 1:01 1:01 1:01
5:6% 1:04 1:05 1:06
20 years
R2 t D 0:05 t D 0:09 t D 0:13
45:6% 1:00 1:00 1:00
56:8% 1:01 1:02 1:05
71:1% 1:04 1:16 1:41
15:9% 1:00 1:00 1:00
24:6% 1:02 1:04 1:06
46:1% 1:20 1:38 1:66
1:3% 1:00 1:00 1:00
1:9% 1:02 1:02 1:02
3:3% 1:12 1:13 1:14
40 years
R2 t D 0:05 t D 0:09 t D 0:13
43:7% 1:00 1:00 1:00
58:0% 1:04 1:09 1:20
79:3% 1:35 2:32 5:56
8:8% 1:00 1:00 1:00
15:2% 1:08 1:11 1:14
38:3% 2:10 2:76 3:86
0:6% 1:00 1:00 1:00
1:0% 1:05 1:05 1:05
1:8% 1:31 1:33 1:34
This table reports the theoretical values under different scenarios from regressions of R2 long run returns on the value of at the beginning of the period. It also reports the ratios of the certainty equivalent wealth for an optimal dynamic strategy to the certainty equivalent p wealth under a myopic strategy for different horizons and initial values of t . The exogenous parameters are v D 4%; D 9%; r D 3%; V .1/ D 14%; ˛ D 0, and ˇ D 1. The risk aversion parameter is D 5
Table 19.3c Long run return predictability and the value of dynamic vs. buy-and-hold strategies Horizon
(v)
(vi)
(vii)
(viii)
(ix)
› P P
Scenarios
(i) 0:02 0:008 0:00 0:140
(ii) 0:02 0:008 0:50 0:142
(iii) 0:02 0:008 0:90 0:144
(iv) 0:10 0:018 0:00 0:140
0:10 0:018 0:50 0:144
0:10 0:018 0:90 0:148
0:50 0:040 0:00 0:139
0:50 0:040 0:50 0:147
0:50 0:040 0:90 0:155
1 year
R2 t D 0:05 t D 0:09 t D 0:13
7:4% 1:00 1:01 1:02
7:4% 1:00 1:01 1:02
7:4% 1:00 1:01 1:02
6:9% 1:00 1:01 1:02
6:9% 1:00 1:01 1:02
6:9% 1:00 1:02 1:03
4:8% 1:01 1:01 1:03
4:8% 1:01 1:01 1:03
4:8% 1:01 1:02 1:03
5 years
R2 t D 0:05 t D 0:09 t D 0:13
26:5% 1:02 1:07 1:14
28:8% 1:02 1:08 1:15
31:0% 1:01 1:09 1:16
18:8% 1:04 1:08 1:13
22:4% 1:04 1:10 1:16
26:8% 1:05 1:12 1:18
4:6% 1:08 1:10 1:12
5:9% 1:09 1:12 1:15
7:8% 1:10 1:14 1:17
10 years
R2 t D 0:05 t D 0:09 t D 0:13
38:0% 1:04 1:16 1:29
44:1% 1:05 1:19 1:35
50:7% 1:05 1:24 1:43
20:4% 1:10 1:17 1:26
27:8% 1:13 1:23 1:36
39:8% 1:18 1:36 1:53
2:6% 1:19 1:21 1:24
3:6% 1:24 1:27 1:30
5:6% 1:32 1:36 1:42
20 years
R2 t D 0:05 t D 0:09 t D 0:13
45:6% 1:11 1:31 1:56
56:8% 1:15 1:47 1:88
71:1% 1:21 1:83 2:57
15:9% 1:24 1:34 1:48
24:6% 1:38 1:56 1:81
46:1% 1:76 2:27 2:92
1:3% 1:44 1:46 1:51
1:9% 1:59 1:63 1:68
3:3% 1:90 1:96 2:05
40 years
R2 t D 0:05 t D 0:09 t D 0:13
43:7% 1:27 1:55 2:00
58:0% 1:47 2:12 3:28
79:3% 2:11 5:24 15:66
8:8% 1:58 1:70 1:92
15:2% 2:09 2:40 2:91
38:3% 4:81 6:73 10:28
0:6% 2:11 2:15 2:22
1:0% 2:65 2:71 2:81
1:8% 3:97 4:07 4:25
This table reports the theoretical values under different scenarios of R2 from regressions of long run returns on the value of at the beginning of the period. It also reports the ratios of the certainty equivalent wealth for an optimal dynamic strategy to the certainty equivalent wealth under an optimal buy-and-hold strategy for different horizons and initial values of t . The exogenous parameters are v D 4%; D 9%; p r D 3%; V .1/ D 14%; ˛ D 0, and ˇ D 1. The risk aversion parameter is D 5
19 Persistence, Predictability, and Portfolio Planning
303
Fig. 19.5 The certainty equivalent wealth ratio under the optimal and the unconditional strategies for a 20-year horizon vs. the 1-year predictive regression R2 . The figure plots the certainty equivalent wealth ratio (CEWRou ; CEWRom ; CEWRob ) between the optimal and the unconditional for a 20-year horizon investor CEWRou , as a function of 1 year for the nine scenarios reported in Table 19.1. The initial value of is set at respectively, 5%, 9% and 13%. The exogenous parameters are v D 4%; D 9%; r D p 3%; V .1/ D 14%; ˛ D 0, and ˇD1
extremely inefficient.23 At the 20-year horizon the gains ofthe optimal strategy when D 0:1 are of the order of 38– 81% when D D 0:50 and 76–192% when D D 0:90 Fig. 19.5. To this point, we have shown that time-variation in expected stock returns, which may be very difficult to detect by standard regression methods, may nevertheless imply both significant variation in stock prices that is unrelated to changes in cash flow expectations, and substantial potential gains to the use of dynamic portfolio strategies for long horizon investors. However, the gains that we have calculated assume that it is possible to observe the instantaneous expected return on the stock and, as we have seen, regression estimates of the relation between expected returns and even perfect instruments of it are likely to be very imprecise. Therefore, 23
This is in contrast to the findings of Brennan and Torous (1999) who show that a buy-and-hold strategy performs well relative to a rebalancing strategy when the investment opportunity set is treated as constant.
in the next section, we explore the use of estimates of long run expected returns derived from a dividend discount model (DDM) as inputs to dynamic portfolio models.
19.5 Forecasts of Equity Returns Regressions that attempt to predict stock returns from instruments such as the dividend yield or the interest rate lack strong theoretical restrictions on the regression coefficients and, as we have seen in Sect. 19.2, the data are likely to yield very imprecise estimates of the coefficients even when the instruments are perfect. It is not surprising therefore that such regressions have essentially no out-of-sample predictive power. An alternative to the simple regression approach is to estimate the expected return on stocks by comparing the current level of the stock market with forecasts of future
304
M.J. Brennan and Y. Xia
dividends on the market portfolio – the Dividend Discount Model (DDM) approach. The advantage of this approach, which has long been employed in “Tactical Asset Allocation” models, is that the expected return is estimated directly and that there is no need to estimate a regression coefficient relating the stock (excess) return to the predictor instruments. The offsetting disadvantage is that the rate of return estimated from the DDM is a long run internal rate of return and there is no reason to believe that this will be equal to the instantaneous expected rate of return even if the dividend forecasts are unbiased. Therefore, it is necessary to develop a model to derive the expected instantaneous rate of return from the long run expected rate of return.
19.5.1 Models and Estimation Procedure We employ two models to convert the estimated DDM long run expected rate of return into an estimate of t , the instantaneous expected rate of return. The first model assumes that the dividend growth rate g is a known constant. The second model assumes that the growth rate follows an OrnsteinUhlenbeck process.24 Both methods assume that the instantaneous expected rate of return follows an O-U process as in Equation (19.2). Our basic input data are direct estimates of the DDM long run “expected rate of return,” t ,25 as defined by the discounted cash flow model: Pt D
1 X Et ŒDt C D1
1 C t
;
(19.22)
where Pt is the level of the stock price index at time t, and Et ŒDt C is the dividend on the index expected at time t to be paid at time t C .
Note that Equation (19.23) rests on the assumption that the dividend expectations in (19.22) form a geometric series. It was shown in Sect. 19.3 that if the expected rate of return, , follows an O-U process and the dividend growth rate is constant, then the price-dividend ratio v./ also satisfies the ordinary differential equation (ODE) (pde). Since v is a monotonic function of , at each point in time there exists a t whose implied v from the ODE is equal to the pricedividend ratio associated with t from Equation (19.23) for a given value of g. This implies a (nonlinear) one-to-one mapping between t and t for a given set of parameters for the stochastic process for ; .; v ; ; D /. As a result, these parameters and the time series of t can be estimated by an iterative process: starting with an initial value of D 0 ; t is calculated from the time series of t ; then is estimated from the time series of t , and a new t series is re-calculated from t ; the process continues until convergence is achieved. We denote the resulting Model 1 estimates by 1 and 1 . This iterative procedure is essentially a non-linear Kalman filter in which the latent variable t is a nonlinear function of the observable variable t . The transition equation for t is the discrete-time equivalent of the O-U process: t C t D .1 e t / C e t t C "
(19.24)
and the observation equation (which contains no observation error) is: t D f .t / D g C
D D g C .v.t ; //1 : P
The mapping of v implied by the DDM to the v implied by the ODE is equivalent to solving the nonlinear function f numerically.
19.5.1.2 Model 2: Stochastic g 19.5.1.1 Model 1: Constant g When the dividend growth rate, g, is constant, there is a oneto-one correspondence between t and the price-dividend ratio, P =D, which is given by the (discrete time version of the) Gordon growth model: vt
24
Pt 1Cg ; D Dt t g
(19.23)
Brennan and Xia (2001) assume a similar model for the dividend growth rate. 25 There is no assurance that t , the solution to Equation (19.22), will be equal to the expected rate of return except when all future dividends are known and discount rates are constant.
To allow for the possibility that dividend growth rate expectations are stochastic, the instantaneous dividend growth rate is assumed to follow an Ornstein-Uhlenbeck process: dg D g .g g/dt C g dzg :
(19.25)
Two steps are then required to derive from : first the instantaneous dividend growth rate, g, and the parameters of its stochastic process, g .g ; g ; g/, are estimated and then, given the time series of gt and g , the time series of t and the parameters of its stochastic process, are estimated.
19 Persistence, Predictability, and Portfolio Planning
305
With stochastic dividend growth, the DCF valuation formula t can be written as: X …s EŒ.1 C gt Ci /jgt Pt i D1 D ; Dt .1 C kt /s sD1 1
(19.26)
For a given set of parameters of the dividend growth process g ; EŒ.1Cgt Ci /jgt is a known linear function of gt . That is, Equation (19.26) is used to solve for the time series gt , given g , the observed series of price-dividend ratios P =D, and the DDM expected rates of return t . The estimated time series gt is then used to estimate g , and theprocedure iterates until there is a consistent set of gt and g ; gt ; t . This procedure can again be interpreted as a Kalman filter in which g is the latent variable with the transition equation gt C t D .1 e g t /g C e g t gt C "; while the observation equation relating the observable variables and P =D to the unobservable variable g is given implicitly by (19.26). Conditional on gt ; t from the first step, t is determined step. The price-dividend ratio
in the second v v ; gI ; g now satisfies the following two-statevariable partial differential equation: 0D
1 2 1 vgg C g g vg C 2 v 2 g 2 CŒg .g g/ C g D gD vg C Œ . / C D D v C Œg v C 1:
(19.27)
Equation (19.27) depends on the set of parameters g , which was estimated in the first step, and the unknown , which is to be estimated together with t in the second step. For a given value of , the PDE (19.27) is solved numerically
for v ; gI ; g . Given gt from the first step, a mapping between t and t ; t D f .t ; gt /, is defined by setting v ; gt I ; g equal to the observed price dividend ratio P/D. The time series of t estimates are then used to estimate a new and the process is iterated until convergence. This can be interpreted as another Kalman filter in which the transition equation of the latent variable t is given by (19.24) while the nonlinear observation equation is: P =D D v.; gt I ; g /: The resulting estimates from Model 2 are denoted by 2 and 2 .
19.5.2 Estimates Quarterly data on real dividends and the price-dividend ratio for the S&P 500 index are obtained for the sample period 1950.1 to 2002.2. Four different DDM discount rate series, t , are used for illustrative purpose. The first two series are estimates of the real long run expected rate of return on equities that would have been assessed by investors at each date. The Arnott and Bernstein (2002) (A&B) series, 1 , is constructed by adding to the current dividend yield an estimate of the expected long run real growth rate in dividends that in turn is equal to a forecast of real GNP per capita growth less a “dilution factor” – these are estimated using an average of the experience over the previous 40 years and the experience since the series began in year 1810. The Ilmanen (2003) (IL) series, 2 , is constructed by adding together estimates of the “dividend yield” and the long run growth rate of dividends. The former is calculated from a smoothed earnings yield multiplied by 59%. The latter is an average of 2% and the past 10, 20, 30, 40 and 50 years’ geometric average real growth rate in corporate earnings. Both the A&B and the IL series were constructed around 2002 for research purposes and use data or parameter values that may not have been available to market participants at the date of the forecast. However, the next two series, 3 and 4 , were constructed in real time for use by investment professionals in asset allocation strategies. The series 3 is provided by Wilshire Associates (WA), an investment management consulting firm, and the series 4 is from Barclays Global Investors (BGI), a global investment management firm. They are both in nominal terms and are constructed by first aggregating I/B/E/S consensus estimates of growth rates in earnings per share out for 5 years for individual firms in the S&P 500 index and then letting the growth rates converge linearly to the economy-wide average growth rate over the next 10 years. The resulting growth rates are then used to project current dividends and to calculate an implied DDM rate of return from the current level of the S&P 500 index. These two nominal series are obtained from real-time forecasts of fundamentals that do not use any future information. Table 19.4 reports the estimated parameters of the process for each series under the two assumptions about dividend growth. Several features stand out in the real series. First, the estimates of ; , and are largely unaffected by whether the expected dividend growth rate is assumed to be constant or not. Second, the estimates of both and are somewhat higher for the Ilmanen series than for the A&B series; the effect of these differences on v , the standard deviation of the stationary distribution of , are largely offsetting
306
M.J. Brennan and Y. Xia
Table 19.4 Parameter estimates of the series Parameters Scenario (vi) Table 19.2 0.100 0:0180 0:090 Real models A&B 1;1 0.085 0:0173 0:047 A&B 1;2 0.083 0:0172 0:045 IL 2;1 0.122 0:0196 0:066 IL 2;2 0.115 0:0224 0:066 BGI 3;1 BGI 3;2 WA 4;1 WA 4;2
Nominal models 0.091 0:0239 0.085 0:0214 0.122 0:0336 0.095 0:0266
0:133 0:113 0:137 0:111
v
P
D
D
0:0400
0:900
0:1200
0:879
0:0419 0:0423 0:0397 0:0467
0:977 0:981 0:884 0:885
0:0852 0:0776 0:0852 0:0822
0:126 0:106 0:117 0:066
0:0560 0:0519 0:0680 0:0611
0:812 0:657 0:682 0:714
0:0859 0:2076 0:0859 0:2202
0:234 0:249 0:095 0:209
g
g
g
Dg
0:103
0:0090
0:328
0:413
0:025
0:0034
0:379
0:055
0:209
0:0294
0:378
0:326
0:220
0:0339
0:402
0:417
This table reports the parameter estimates associated with the instantaneous expected return, , series, which are derived from the long run discount rate ›. The two real › series are calculated in Arnott and Bernstein (2002) (A&B) and Ilmanen (2002) (IL), while the two nominal › series are provided by Barclays Global Investors (BGI) and Wilshire Associates (WA). In each › series, we derive under two cases. In case I, the dividend growth rate g is assumed to be a constant. In case II, the dividend growth rate g is assumed to follow a mean-reverting process. When the real › from Ilmanen or A&B is used as the long run discount rate, g is set to 0.86% in the first case, while g is set to 0.86% in the second case. When the nominal › is used, g is set to 4.82% in the first case, while g is set to 4.82% in the second case
and the estimates are very close to the value of 4% that we have assumed in our simulations in Sects. 19.2 and 19.3. Third, the estimates of the correlation P are all between 0:884 and 0:981; this is what we should expect since is a proxy for the discount rate. Finally, most of the parameters are quite close to the values that were assumed or derived for Scenario (vi) in Table 19.2 (which assumes a constant expected dividend growth rate). The exception is D , which is much higher in Scenario (vi). We suspect that this is because D was set equal to 12.02% in Table 19.2 while its estimates is only 8.52% in Table 19.4.26 The estimates of the parameters of the nominal process are broadly similar to those estimated for the real models except that in both cases the Model 2D is over 20%, but this is offset by the much higher level of mean reversion of the dividend growth rate g ; in addition, D is positive in both nominal models, while it is negative in the two real models. Figure 19.6 shows that when the A&B series 1 is used as the input, the estimates of from the two models, 1;1 and 1;2 , are virtually coincident, suggesting that there is little advantage in allowing for a stochastic dividend growth rate. Note that both estimates of the instantaneous expected rate of return, , are much more volatile than the underlying DDM series from which they are derived: the maximum value of 1;2 is over 16% in June 1982 and the minimum is around 3% in March 2000; in contrast, the maximum and minimum values of the DDM , which occur in the same 2 months, were respectively 7.9% and 2.1%. 26
As noted above, 12.02% is the sample volatility of annual real dividend growth over the period 1872–2001, while 8.52% is the estimate of D in Equation (19.25) derived from quarterly data for the period 1950.1 to 2002.2 using the algorithm described under Model 2 above.
Figure 19.7 plots the corresponding estimates from the Ilmanen series 2 . As in the previous case, the 2;1 and 2;2 series track each other closely except around the end of 1999 when the decline in the Model 2 estimate, 2;2 , is much more dramatic than that in 2;1 : 2;2 reaches a minimum of 3:2% just before the end of the bull market in December 2000. For comparison, Figure 19.8 plots the i and the i;2 .i D 1; 2/ for both the A&B and the Ilmanen series. While the two series of i;2 .i D 1; 2/ generally track each other quite well, there are periods of significant difference. In the 1950s the Ilmanen estimate exceeds the A&B estimate by up to 3.8%. Significant differences of the opposite sign occur during the period 1983–1996; in September 1987, the Ilmanen estimate was less than 1% while the A&B estimate was over 6.5%. Figures 19.9 and 19.10 plot the quarterly nominal estimates of i ; i;1 and i;2 .i D 3; 4/ from BGI and WA for the shorter periods for which these series are available. These last two series differ from the previous two in three significant ways. First, they are truly ex-ante; second, they are nominal expected rates of return; third, they are based on analysts’ forecasts of earnings growth rates that are known to be upward biased. Not surprisingly, these nominal series lie everywhere above the two real series. Nevertheless, they show a similar pattern of increase during the 1970s, the BGI series 3 (the WA series 4 ) reaching a peak of 19.1% (18.8%) in the third quarter of 1981 (second quarter 1982). After the peak, there is a prolonged decline in both series. Although for both series, i;1 and i;2 .i D 3; 4/ move largely in parallel, i;1 is everywhere above i;2 .i D 3; 4/ and the difference between them is much larger than that observed for the A&B and the IL real series. Figure 19.11 shows i and the corresponding instantaneous expected return i;2 .i D 3; 4/ together. The two i;2
19 Persistence, Predictability, and Portfolio Planning Fig. 19.6 Estimates of the real 1;1 and 1;2 series (1950.1 to 2002.2). The figure plots the long run expected real return ›1 from Arnott and Bernstein (2002) (A&B) together with the estimated series of 1;1 and 1;2 , corresponding to the two cases of constant g and mean-reverting g
Fig. 19.7 Estimates of the real 2;1 and 2;2 series (1950.1 to 2002.2). The figure plots the long run expected real return ›2 from Ilmanen (2003) (IL) together with the estimated series of 2;1 and 2;2 , corresponding to the two cases of constant g and mean-reverting g
307
308 Fig. 19.8 Estimates of the real 1;2 and 2;2 Series (1950.1 to 2002.2). The figure plots the long run expected real return ›1 and ›2 together with their corresponding 1;2 and 2;2 series under the assumption of mean-reverting g
Fig. 19.9 Estimates of the nominal 3;1 and 3;2 Series (1972.4 to 2002.1). The figure plots the long run expected real return ›3 from Barclays Global Investors (NGI) together with the estimated series of 3;1 and 3;2 , corresponding to the two cases of constant g and mean-reverting g
M.J. Brennan and Y. Xia
19 Persistence, Predictability, and Portfolio Planning Fig. 19.10 Estimates of the nominal 4;1 and 4;2 series (1973.1 to 2004.1). The figure plots the long run expected real return ›4 from Wilshire Associate (WA) together with the estimated series of 4;1 and 4;2 , corresponding to the two cases of constant g and mean-reverting g
Fig. 19.11 Estimates of the nominal 3;2 and 4;2 series (1973.1 to 2002.2). The figure plots the long run expected real return ›3 and ›4 together with their corresponding 3;2 and 4;2 series under the assumption of mean-reverting g
309
310
series track each other closely except for the period 1980– 1982 as well as 1999–2000, when the WA estimate 4;2 exceeds the BGI estimate 3;2 by 1–3%, and the period 2001–2002 when the WA estimate 4;2 is below the BGI estimate 3;2 by 2–5%.
19.5.3 Return Prediction We have already shown that even when return predictability is of great economic importance, the statistical evidence of predictability may be weak and hard to detect using standard statistical methods. Therefore, even if the series contain valuable information for portfolio planning, the statistical evidence of their predictive power may be weak. Panel A of Table 19.5 reports the results of regressions of quarterly real returns on the S&P 500 index on values of i;2 .i D 1; 2/ derived from the A&B and IL real DDM series, while Panel B reports the results of regressing quarterly nominal returns on the estimates of the nominal i;2 .i D 3; 4/ derived from the BGI and WA nominal DDM series. We report results for the whole sample period from 1950.2 to 2002.2, and also the two approximate halves of the period, omitting the influential 1974.3 quarter when the real return on the S&P 500 was approximately 28%. For the two real series, the effect of omitting this quarter is to raise both the estimated coefficient towards its theoretical value of unity and to raise the regression R2 . While the regression coefficients are not significantly different from either zero or their theoretical value of unity, the point estimates are close to unity. The explanatory power of the predictive variable in both models is considerably greater in the first half of the sample period where the regression R2 is around 4–5% and the predictive coefficient is statistically significant at the 5% level. In all other periods, however, regression R2 ’s are all below 2% and none of the predictive coefficients is significant at the 5% level. Panel B reports the corresponding results for nominal returns using the predictors derived from the BGI and WA nominal DDM models. These models have greater predictive power than the two real models for the relevant sample periods, with R2 at around 2–3%, despite the fact that they were made in real time and contain no “look-ahead” bias. The estimates of the coefficients of the predicted return are little affected by the omission of 1974.3 and, as in the two previous cases, are close to the theoretical value of unity but are not significantly different from zero either.
M.J. Brennan and Y. Xia
The evidence from Table 19.5 is thus broadly consistent with the earlier observation that the predictive coefficient estimates, while close to their theoretical values, are associated with large estimation error and that the weak statistical evidence of predictive power of series at a quarterly horizon does not provide much information on their economic importance to investors with horizons of 20 years or longer.
19.5.4 Historical Simulations In this section we report the results of simulating the optimal and unconditional policies using each of four series, A&B 1;1 , IL 2;1 , BGI 3;2 , and WA 4;2 , for a long horizon investor with a relative risk aversion coefficient, , of 5 when the equity allocation is constrained to lie between 0 and 100%. Allocations to stocks were revised quarterly, and borrowing and short sale constraints were imposed. Under the unconditional constrained strategy, the fraction of the portfolio that is allocated to the risky asset is determined from Equation (19.18) subject to the constraint that 0 x u 1, and is constant over time. Under the constrained optimal strategy, the fraction of wealth allocated to the risky asset is determined by solving an optimal problem whose value function depends on ; it is described in Appendix 19A. The return on the market portfolio is taken as the historical real (nominal) return on the S&P 500 index for each quarter and the riskless interest rate is taken as the realized real (nominal) return on a 30-day Treasury from CRSP for the real 1;1 and 2;1 (the nominal 3;2 and 4;2 ). The instantaneous return volatility, P D 15:7% (17.1%), is determined from equation (var) by setting V .0:25/ equal to the sample volatility, 7.74% (8.40%), of quarterly S&P 500 index real (nominal) return, when either 1;1 or 2;1 (3;2 or 4;2 ) is used. The same value of P is used in both the optimal and the unconditional policies. The other parameters for the optimal policies are taken from the appropriate line of Table 19.4, and the constant unconditional equity premium for the unconditional policy is calculated from the corresponding from Table 19.4.27 For both the optimal and the 27
The unconditional equity premium for the unconditional strategy was calculated in three different ways: (1) using the same as that used in calculating the optimal strategy; (2) setting to the sample mean of the S&P 500 Index excess return during the whole sample period of 1929 to 2002; and (3) setting to the gradually updated sample mean excess return with the initial value calculated from 1929 to 1949 (or 1972 for the BGI or WA series). The first approach ensures that differences between the wealth realized under the optimal and the unconditional strategies are not caused by different assumptions about the
19 Persistence, Predictability, and Portfolio Planning
311
Table 19.5 Quarterly return prediction A. Real return predictive regressions
1
Sample period 1950.2–2002.2
Obs. 209
2
1950.2–1974.2, 1974.4–2000.2
208
3
1950.2–1974.2
97
4
1974.4–2002.2
111
1;2 as Predictor a0 a1 0.005 0.874 (0.46) (1.74) [0.43] [1.60]
R2 (%) 1.43
2;2 as Predictor a0 a1 0.009 0.701 (0.86) (1.53) [0.79] [1.27]
R2 (%) 1.12
0.005 (0.43) [0.38]
0.981 (2.02) [1.83]
1.94
0.008 (0.85) [0.75]
0.800 (1.81) [1.50]
1.57
0:019 (0.92) [1.04]
2.157 (2.08) [2.45]
4.37
0:020 (1.09) [1.06]
1.961 (2.42) [2.21]
5.79
0.011 (0.82) [1.07]
0.708 (1.22) [1.86]
1.35
0.017 (1.40) [2.36]
0.494 (0.88) [1.80]
0.70
B. Nominal return predictive regressions
1
Sample period 1973.2–2002.2
Obs. 117
2
1973.2–1974.2, 1974.4–2000.2
116
3;2 as Predictor a0 a1 0.003 1.026 (0.20) (1.86) [0.18] [1.69] 0.008 0.953 (0.48) (1.80) [0.41] [1.60]
2
R (%) 2.89
2.75
4;2 as Predictor a0 a1 0.006 0.924 (0.37) (1.80) [0.34] [1.66] 0.010 0.852 (0.67) (1.73) [0.54] [1.42]
R2 (%) 2.74
2.57
This table reports the results of regressing real and nominal quarterly returns of the S&P 500 stock index on forecasts of the return at the beginning of the quarter calculated from the estimated value of i;2 .i D 1; 2; 3; 4/ and the estimated parameters of the joint stochastic process using equation 1:0 e =4 R.t; t C 0:25/ D a0 C a1 i;2 t C "t ; i D 1; 2; 3; 4 where R.t; t C 0:25/ is the one quarter real return on the S&P 500 stock index in Panel A and is the corresponding nominal return in Panel B. In Panel A, the real return on the S&P 500 index is regressed on the estimated real A&B and IL i;2 .i D 1; 2/ series. In Panel B, nominal returns on the index is regressed on the estimated nominal BGI and WA i;2 .i D 3; 4/ series. The OLS t -ratios are reported in parenthesis and Newey-West adjusted t -ratios are in brackets
For both the optimal and the unconditional strategies under the real (nominal) π, the risk-free rate is set at a constant 1.1% (4.1%), which is the sample mean of the realized real (nominal) return on the 30-day Treasury bill from 1950 to 2002.

Figures 19.12 and 19.13 summarize the results of the simulations under the real A&B π_{1,1} and IL π_{2,1}. For each figure the investor is assumed to be concerned with maximizing the expected utility of wealth on the last date included in the figure. For example, Fig. 19.12a describes the evolution of wealth of an investor who starts investing at the end of the first quarter (or equivalently the beginning of the second quarter) of 1950 with a horizon of the end of the first quarter of 1970; thus his initial horizon is 20 years and decreases each period. Investment decisions are assumed to be made at the beginning of each quarter based on the current value of π. The figure shows that an investor who had followed the optimal strategy based on π_{1,1} over this period would have vastly outperformed his unconditional counterpart. Much the same pattern is visible in Fig. 19.13a, which is based on π_{2,1}. Figures 19.12b and 19.13b show that the optimal strategies continue to outperform the unconditional strategies, but by a smaller margin, over the subsequent 20-year period 1970–1990. Both the π_{1,1}- and π_{2,1}-based strategies lose more in the oil-price related bear market of 1974 on account of their more aggressive stock positions, but more than make up for this by the end of the period. A 12-year horizon investor over the period 1990–2002 does better (about as well) under the optimal strategy than under the unconditional strategy using the π_{1,1} (π_{2,1}) series. Finally, the simulations for an investor with a 52-year horizon starting in 1950 show that the "optimal" investor ends up well ahead. Since the π_{i,1} and π_{i,2} (i = 1, 2) series are trivially different, we do not report the results for the π_{i,2} series.
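The simulation procedure just described reduces to a simple quarterly loop once the policy weights are in hand. The following sketch is a minimal illustration under stated assumptions: the constrained optimal weights are stood in for by the myopic rule of Appendix 19C, since the true optimal policy must be solved numerically from the Bellman equation, and the return series here are randomly generated placeholders rather than the CRSP data.

```python
import numpy as np

def simulate_wealth(stock_ret, tbill_ret, weights):
    """Compound quarterly wealth for a sequence of equity weights; the
    remainder of the portfolio earns the Treasury bill return."""
    wealth = 1.0
    path = [wealth]
    for x, rs, rf in zip(weights, stock_ret, tbill_ret):
        x = min(max(x, 0.0), 1.0)              # borrowing/short-sale constraints
        wealth *= 1.0 + x * rs + (1.0 - x) * rf
        path.append(wealth)
    return np.array(path)

# Placeholder data: 80 quarters of returns and estimated premia pi_hat.
rng = np.random.default_rng(1)
stock_ret = 0.02 + 0.0774 * rng.standard_normal(80)   # 7.74% quarterly volatility
tbill_ret = np.full(80, 0.011 / 4)                    # 1.1% annual real rate
gamma, sigma_p = 5.0, 0.157
pi_hat = 0.04 + 0.02 * rng.standard_normal(80)        # stand-in premium series
w_uncond = np.full(80, pi_hat.mean() / (gamma * sigma_p**2))  # constant weight
w_opt = pi_hat / (gamma * sigma_p**2)   # myopic stand-in for the optimal policy
print(simulate_wealth(stock_ret, tbill_ret, w_opt)[-1],
      simulate_wealth(stock_ret, tbill_ret, w_uncond)[-1])
```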
Fig. 19.12 Cumulative real wealth under the optimal and unconditional strategies for a long-horizon investor (π_{1,1} as the predictor). The figure plots the cumulative real wealth under the optimal and unconditional strategies for long-horizon investors with a risk aversion parameter γ = 5. The optimal strategy is based on the estimated real A&B π_{1,1} series and its associated parameter estimates. Both the optimal and the unconditional strategies are constrained to have allocations between 0 and 1. In panel a, the investment horizon is 20 years and the investor starts investing in 1950.06 with a terminal date in 1970.03. In panel b, the investment horizon is also 20 years and the investor starts investing in 1970.06 with a terminal date in 1990.03. In panel c, the investment horizon is 13 years and the investor starts investing in 1990.06 with a terminal date in 2002.06. In panel d, the investment horizon is 53 years and the investor starts investing in 1950.06 with a terminal date in 2002.06

Fig. 19.13 Cumulative real wealth under the optimal and unconditional strategies for a long-horizon investor (π_{2,1} as the predictor). The figure plots the cumulative real wealth under the optimal and unconditional strategies for long-horizon investors with a risk aversion parameter γ = 5. The optimal strategy is based on the estimated real IL π_{2,1} series and its associated parameter estimates. Both the optimal and the unconditional strategies are constrained to have allocations between 0 and 1. In panel a, the investment horizon is 20 years and the investor starts investing in 1950.06 with a terminal date in 1970.03. In panel b, the investment horizon is also 20 years and the investor starts investing in 1970.06 with a terminal date in 1990.03. In panel c, the investment horizon is 13 years and the investor starts investing in 1990.06 with a terminal date in 2002.06. In panel d, the investment horizon is 53 years and the investor starts investing in 1950.06 with a terminal date in 2002.06
Fig. 19.14 Cumulative nominal wealth under the optimal and unconditional strategies for a long-horizon investor (π_{3,2} as the predictor). The figure plots the cumulative nominal wealth under the optimal and unconditional strategies for long-horizon investors with a risk aversion parameter γ = 5. The optimal strategy is based on the estimated nominal BGI π_{3,2} series and its associated parameter estimates. Both the optimal and the unconditional strategies are constrained to have allocations between 0 and 1. In panel a, the investment horizon is 20 years and the investor starts investing in 1973.03 with a terminal date in 1992.12. In panel b, the investment horizon is around 10 years and the investor starts investing in 1993.03 with a terminal date in 2002.03. In panel c, the investment horizon is about 30 years and the investor starts investing in 1973.03 with a terminal date in 2002.03

Fig. 19.15 Cumulative nominal wealth under the optimal and unconditional strategies for a long-horizon investor (π_{4,2} as the predictor). The figure plots the cumulative nominal wealth under the optimal and unconditional strategies for long-horizon investors with a risk aversion parameter γ = 5. The optimal strategy is based on the estimated nominal WA π_{4,2} series and its associated parameter estimates. Both the optimal and the unconditional strategies are constrained to have allocations between 0 and 1. In panel a, the investment horizon is 20 years and the investor starts investing in 1973.06 with a terminal date in 1993.02. In panel b, the investment horizon is around 10 years and the investor starts investing in 1993.06 with a terminal date in 2002.06. In panel c, the investment horizon is about 30 years and the investor starts investing in 1973.06 with a terminal date in 2002.06
Both the A&B and Ilmanen DDM series are constructed by projecting historical growth rates of aggregate series such as profits or GDP. In contrast, the nominal BGI and WA DDM series constructed by investment professionals rely on bottom-up forecasts of individual firm growth rates. Figures 19.14 and 19.15 compare the (nominal) wealth outcomes from the unconditional versus the optimal strategy derived from the BGI π_{3,2} and WA π_{4,2} series for the 20-year period 1973–1993 and the 11-year period 1993–2003. In the first period, the wealth outcome under the optimal strategy is 53–67% higher than that under the unconditional strategy; it is noteworthy that these strategies do not suffer the large losses of the A&B and Ilmanen-based strategies during 1974. For the 11-year investment period 1993–2003, the optimal strategies based on both the π_{3,2} and π_{4,2} series underperform the unconditional ones by 10–20%. It is interesting to note that the unconditional strategy outperforms the optimal strategies during the bubble period at the end of the 1990s, but that it substantially underperforms as the bubble collapses during the period 2000–2003: it is precisely in such circumstances that we would expect a dynamic strategy that takes account of the long-run expected rate of return implicit in asset prices to do well. Figures 19.14c and 19.15c compare the outcomes of the unconditional strategies with those of the optimal ones that start with a 30-year horizon in 1973. Over this longer horizon, the optimal strategies show a consistent advantage. While we should be careful not to infer too much from these historical simulations, which represent only a single sample path of stock prices for a single level of risk aversion, it is encouraging that the optimal dynamic strategies tend to outperform naive unconditional strategies, even when they are based on real-time data.^28
19.6 Conclusion

There is considerable disagreement about whether or not stock returns are time-varying and predictable. On the one hand, there is ample evidence of in-sample return predictability from regressions of stock returns on instruments such as the dividend yield and the short-term interest rate, and from variance ratio tests. On the other hand, it has been argued
^28 The instantaneous expected returns estimated from the DDM long-run expected returns depend on parameters of the expected return and the dividend growth g processes (κ, σ_π, and σ_g, etc.). These parameters were estimated using data from the whole sample period, so that our estimates of instantaneous expected returns, even when they are based on real-time DDM estimates, rely on future data. For the A&B and IL series, we also estimated π for the period 1950–2002 by first estimating the parameters using data only from 1900 to 1949, and the superior performance of the optimal strategy remains unchanged. We do not have a long enough sample for the BGI and WA series, which are only available starting from 1973, to carry out this robustness check.
that in-sample predictive regressions perform poorly out of sample, so that the evidence of time variation should be discounted. In this paper, we have shown that time variation in expected returns that implies both large variation in stock market valuation ratios and substantial gains to long-term dynamic investment strategies is likely to be hard to detect by standard statistical methods. As a result, weak statistical evidence for return predictability does not in itself imply that return predictability is economically insignificant. In view of the difficulty of estimating expected returns from regressions of realized returns on instruments such as the interest rate or dividend yield, we suggest that it is likely to be more productive to estimate the expected long-run rate of return by comparing the current level of stock prices with forecasts of expected future dividends in the dividend discount model (DDM) paradigm. This forward-looking approach has the advantage that it does not rely on hard-to-estimate regression coefficients from past data. The disadvantage is that the rate of return that emerges from the dividend discount model is a long-run expected rate of return. In order to use the DDM expected rate of return estimate in dynamic portfolio planning, we show how the instantaneous expected rate of return can be estimated from the DDM long-run expected rate of return under the assumption that the instantaneous expected rate of return follows an Ornstein-Uhlenbeck process; the technique is also extended to a setting in which the expected growth rate of dividends, instead of being constant, also follows an Ornstein-Uhlenbeck process. Time series of instantaneous expected rates of return are estimated for four time series of DDM expected rates of return. Two of them are historical "back-casts," and two are real-time estimates provided by investment professionals. Simulations using realized S&P 500 index returns and 30-day Treasury bill rates suggest that there may be significant benefits from the use of expected instantaneous rates of return derived from dividend discount model expected returns, even when the expected returns are estimated in real time. In this paper, however, we have examined the benefit of the optimal market timing strategy without taking account of misspecification of, or errors in estimating, the stochastic process for the expected instantaneous rate of return. Determining the sensitivity of DDM-based dynamic portfolio strategies to errors in specifying and estimating the stochastic process for the instantaneous expected return is a task for future work.
References

Amihud, Y. and C. M. Hurvich. 2004. "Predictive regressions: a reduced bias estimation method." Journal of Financial and Quantitative Analysis 39, 813–841.
Arnott, R. D. and P. L. Bernstein. 2002. "What risk premium is normal?" Financial Analysts Journal 58, 64–85.
Barberis, N. 2000. "Investing for the long run when returns are predictable." Journal of Finance 55, 225–264.
Bossaerts, P. and P. Hillion. 1999. "Implementing statistical criteria to select return forecasting models: what do we learn?" Review of Financial Studies 12, 405–428.
Boudoukh, J., M. Richardson and T. Smith. 1993. "Is the ex ante risk premium always positive? – A new approach to testing conditional asset pricing models." Journal of Financial Economics 34, 387–408.
Brav, A. and J. B. Heaton. 2002. "Competing theories of financial anomalies." Review of Financial Studies 15, 575–606.
Brennan, M. J., E. S. Schwartz and R. Lagnado. 1997. "Strategic asset allocation." Journal of Economic Dynamics and Control 21, 1377–1403.
Brennan, M. J. and W. N. Torous. 1999. "Individual decision making and investor welfare." Economic Notes 28, 119–143.
Brennan, M. J. and Y. Xia. 2001. "Stock price volatility and the equity premium." Journal of Monetary Economics 47, 249–283.
Brennan, M. J. and Y. Xia. 2001. "Assessing asset pricing anomalies." Review of Financial Studies 14, 905–942.
Brennan, M. J. and Y. Xia. 2002. "Dynamic asset allocation under inflation." The Journal of Finance 57, 1201–1238.
Campbell, J. Y. 2001. "Why long horizons? A study of power against persistent alternatives." Journal of Empirical Finance 8, 459–491.
Campbell, J. Y., A. W. Lo and A. C. MacKinlay. 1997. The econometrics of financial markets, Princeton University Press, Princeton, New Jersey.
Campbell, J. Y. and R. J. Shiller. 1998. "Valuation ratios and the long-run stock market outlook." Journal of Portfolio Management 24, 11–26.
Campbell, J. Y. and R. J. Shiller. 2001. "Valuation ratios and the long-run stock market outlook: an update." in Advances in behavioral finance, vol. II, N. Barberis and R. Thaler (Eds.), Russell Sage Foundation, 2004.
Campbell, J. Y. and S. B. Thompson. 2004. "Predicting the equity premium out of sample: can anything beat the historical average?" Working paper, Harvard University.
Campbell, J. Y. and L. Viceira. 2001. "Who should buy long-term bonds?" American Economic Review 91, 99–127.
Campbell, J. Y. and M. Yogo. 2003. "Efficient tests of stock return predictability." Working paper, Harvard University.
Cochrane, J. H. 1997. "Where is the market going? Uncertain facts and novel theories." Economic Perspectives XXI(6) (Federal Reserve Bank of Chicago), 3–37.
Cochrane, J. H. 1999. "New facts in finance." Economic Perspectives XXIII(3) (Federal Reserve Bank of Chicago), 36–58.
Fama, E. F. and R. R. Bliss. 1987. "The information in long-maturity forward rates." American Economic Review 77, 680–692.
Fama, E. F. and K. R. French. 1988a. "Dividend yields and expected stock returns." Journal of Financial Economics 23, 3–25.
Fama, E. F. and K. R. French. 1988b. "Permanent and temporary components of stock prices." Journal of Political Economy 96, 246–273.
Fama, E. F. and K. R. French. 2002. "The equity premium." Journal of Finance 57, 637–659.
Feller, W. 1951. An Introduction to Probability Theory and its Applications, John Wiley, New York.
Goetzmann, W. N. and P. Jorion. 1993. "Testing the predictive power of dividend yields." The Journal of Finance 48, 663–679.
Goyal, A. and I. Welch. 2003. "Predicting the equity premium with dividend ratios." Management Science 49, 639–654.
Goyal, A. and I. Welch. 2004. "A comprehensive look at the empirical performance of equity premium prediction." Working paper, Yale University.
Hakansson, N. H. 1970. "Optimal investment and consumption strategies under risk for a class of utility functions." Econometrica 38, 587–607.
Ilmanen, A. 2003. "Expected returns on stocks and bonds." Journal of Portfolio Management, Winter 2003, 7–27.
Kandel, S. and R. F. Stambaugh. 1996. "On the predictability of stock returns: an asset-allocation perspective." The Journal of Finance 51, 385–424.
Keim, D. B. and R. F. Stambaugh. 1986. "Predicting returns in the stock and bond markets." Journal of Financial Economics 17, 357–390.
Kim, T. S. and E. Omberg. 1996. "Dynamic nonmyopic portfolio behavior." Review of Financial Studies 9, 141–161.
Kleidon, A. W. 1986. "Variance bounds tests and stock price valuation models." Journal of Political Economy 94, 953–1001.
LeRoy, S. and R. Porter. 1981. "The present value relation: tests based on variance bounds." Econometrica 49, 555–574.
Lewellen, J. W. 2004. "Predicting returns using financial ratios." Journal of Financial Economics 74, 209–235.
Lewellen, J. W. and J. A. Shanken. 2002. "Learning, asset-pricing tests, and market efficiency." The Journal of Finance 57, 1113–1145.
Lo, A. W. and A. C. MacKinlay. 1988. "Stock market prices do not follow random walks: evidence from a simple specification test." Review of Financial Studies 1, 41–66.
Marsh, T. A. and R. C. Merton. 1986. "Dividend variability and variance bounds tests for the rationality of stock market prices." American Economic Review 76, 483–498.
Merton, R. C. 1971. "Optimum consumption and portfolio rules in a continuous-time model." Journal of Economic Theory 3, 373–413.
Philips, T. K., G. T. Rogers and R. E. Capaldi. 1996. "Tactical asset allocation: 1977–1994." Journal of Portfolio Management 23, 57–64.
Poterba, J. M. and L. H. Summers. 1988. "Mean reversion in stock prices: evidence and implications." Journal of Financial Economics 22, 27–59.
Shiller, R. 1981. "Do stock prices move too much to be justified by subsequent changes in dividends?" American Economic Review 71, 421–436.
Stambaugh, R. F. 1999. "Predictive regressions." Journal of Financial Economics 54, 375–421.
Wachter, J. 2002. "Portfolio and consumption decisions under mean-reverting returns: an exact solution for complete markets." Journal of Financial and Quantitative Analysis 37, 63–91.
Xia, Y. 2001. "Learning about predictability: the effect of parameter uncertainty on dynamic asset allocation." The Journal of Finance 56, 205–246.
Appendix 19A The Optimal Strategy

The investor is assumed to have an iso-elastic utility function defined over end-of-period wealth at the investment horizon T:

max_x E_0[u(W_T)], where u(W_T) = e^{−φT} W_T^{1−γ}/(1−γ) for γ ≠ 1, and u(W_T) = e^{−φT} ln W_T for γ = 1,

subject to the following dynamic budget constraint:

dW/W = [x(α + βπ − r) + r] dt + xσ_P dz_P,
where x is defined as the proportion of wealth invested in the single risky asset whose stochastic process was given in Equations (19.1–19.2). The risk-free interest rate, r, is assumed to be constant for simplicity. Under the iso-elastic utility function, the indirect utility function, J(W, π, t) ≡ max_{x_t} E_t[u(W_T)], is homogeneous in W:

J(W, π, t) = e^{−φT} (W^{1−γ}/(1−γ)) Φ(π, t),

where Φ satisfies the following Bellman equation:

0 = max_x { ½σ_π²Φ_ππ + κ(π̄ − π)Φ_π + (1 − γ)xσ_Pσ_πρ_{πP}Φ_π + (1 − γ)[x(α + βπ − r) + r]Φ − ½γ(1 − γ)x²σ_P²Φ + Φ_t },   (19A.1)

where Φ_ππ = ∂²Φ/∂π², Φ_π = ∂Φ/∂π, and Φ_t = ∂Φ/∂t are the partial derivatives of Φ with respect to π or t. Denote the equity premium by y_t ≡ α + βπ_t − r and the remaining investment horizon by τ ≡ T − t; then the investor's unconstrained optimal dynamic policy x_t is given by:

x_t = (1/γ){ y_t/σ_P² + (βσ_πρ_{πP}/σ_P)[B(τ) + C(τ)y_t] },

and the function Φ(π, t) under the unconstrained optimal strategy reduces to:

Φ(π, t) = exp[A(τ) + B(τ)y_t + ½C(τ)y_t²],

where A(τ), B(τ), and C(τ) are solutions to a system of three ordinary differential equations with boundary conditions A(τ) = 0, B(τ) = 0, and C(τ) = 0 at τ = 0. The details of the equations are contained in Kim and Omberg (1996) for the general HARA utility and in Xia (2001) for the CRRA utility. In particular, let

a_1 = β²σ_π²[1 + ((1 − γ)/γ)ρ_{πP}²],   (19A.2)

a_2 = 2[((1 − γ)/γ)(βσ_πρ_{πP}/σ_P) − κ],   (19A.3)

a_3 = (1 − γ)/(γσ_P²);   (19A.4)

then q ≡ a_2² − 4a_1a_3 > 0 for all γ ≥ 1, which is the condition for the well-behaved normal case. In this paper we focus on dynamic strategies under γ > 1, so A(τ), B(τ), and C(τ) are given by the normal solution:

C(τ) = 2a_3(1 − e^{−θτ}) / [(θ − a_2) + (θ + a_2)e^{−θτ}],   (19A.5)

B(τ) = 4a_3(α + βπ̄ − r)(1 − e^{−θτ/2})² / {θ[(θ − a_2) + (θ + a_2)e^{−θτ}]},   (19A.6)

A(τ) = a_3[(βσ_π)²/θ² + 2(α + βπ̄ − r)²/(θ² − a_2²)]τ + r(1 − γ)τ
 + 4a_3(α + βπ̄ − r)²[(2a_2 + θ)e^{−θτ} − 4a_2e^{−θτ/2} + 2a_2 − θ] / {θ(θ² − a_2²)[(θ − a_2) + (θ + a_2)e^{−θτ}]}
 + [2a_3(βσ_π)²/(θ² − a_2²)] ln{[(θ − a_2) + (θ + a_2)e^{−θτ}]/(2θ)},   (19A.7)

where θ = √q. If the optimal allocation x is constrained to lie between zero and one (i.e., no borrowing or short sale is allowed), then no closed-form solution is available. Both the constrained optimal policy x and the associated function Φ are solved numerically from the Bellman Equation (19A.1) subject to the constraint 0 ≤ x ≤ 1.
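For γ > 1 the normal solution is easy to evaluate numerically. The sketch below, written against the parameterization assumed in the reconstruction above (function and variable names are illustrative), computes C(τ), B(τ), and the unconstrained optimal weight; it omits A(τ), which affects utility but not the policy.

```python
import numpy as np

def normal_solution_BC(tau, a1, a2, a3, ybar):
    """C(tau) and B(tau) from Eqs. (19A.5)-(19A.6), normal case,
    with theta = sqrt(a2**2 - 4*a1*a3)."""
    theta = np.sqrt(a2**2 - 4.0 * a1 * a3)
    den = (theta - a2) + (theta + a2) * np.exp(-theta * tau)
    C = 2.0 * a3 * (1.0 - np.exp(-theta * tau)) / den
    B = 4.0 * a3 * ybar * (1.0 - np.exp(-theta * tau / 2.0))**2 / (theta * den)
    return B, C

def optimal_weight(y, tau, gamma, sigma_p, beta, sigma_pi, rho,
                   a1, a2, a3, ybar):
    """Unconstrained optimal weight: the myopic demand y/(gamma*sigma_p^2)
    plus the intertemporal hedging demand carried by B(tau) and C(tau)."""
    B, C = normal_solution_BC(tau, a1, a2, a3, ybar)
    return (y / sigma_p**2
            + (beta * sigma_pi * rho / sigma_p) * (B + C * y)) / gamma
```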
Appendix 19B The Unconditional Strategy

If the investor believes that the equity premium is constant, then the investment opportunity set is constant from the investor's perspective. The unconditional strategy,

x^u = ȳ/(γσ_P²) = (α + βπ̄ − r)/(γσ_P²),

is based on the long-run mean of the equity premium. However, the wealth process of the investor evolves according to the true dynamics of the equity premium given by Equations (19.1–19.2):

dW/W = [x^u(α + βπ_t − r) + r]dt + x^uσ_P dz_P.   (19B.1)

The indirect utility function J^u(W, π, t) is then given by:

J^u(W, π, t) = E_t[e^{−φT} W_T^{1−γ}/(1−γ)]
 = e^{−φT} (W_t^{1−γ}/(1−γ)) E_t[exp{(1 − γ) ln(W_T/W_t)}]
 = e^{−φT} (W_t^{1−γ}/(1−γ)) exp{ E_t[(1 − γ) ln(W_T/W_t)] + ½Var_t[(1 − γ) ln(W_T/W_t)] }.

Solving for ln(W_T/W_t) and its first two conditional moments from Equation (19B.1) gives the following result:

J^u(W, π, t) = e^{−φT} (W_t^{1−γ}/(1−γ)) e^{D(τ)+E(τ)π_t},   (19B.2)

where

D(τ) = (1 − γ)[r + x^u(α + βπ̄ − r) − ½(x^u)²σ_P²]τ − (1 − γ)x^uβπ̄(1 − e^{−κτ})/κ
 + ½(1 − γ)²(x^u)²{ σ_P²τ + (2βσ_Pσ_πρ_{πP}/κ)[τ − (1 − e^{−κτ})/κ] + (β²σ_π²/κ²)[τ − 2(1 − e^{−κτ})/κ + (1 − e^{−2κτ})/(2κ)] },   (19B.3)

E(τ) = (1 − γ)x^uβ(1 − e^{−κτ})/κ.   (19B.4)

Appendix 19C The Myopic Strategy

If the investor has a short investment horizon, then it is optimal to adopt the myopic market-timing strategy,

x_{m,t} = (α + βπ_t − r)/(γσ_P²).

The wealth process of the investor evolves according to the following dynamics:

dW/W = [x^m(α + βπ_t − r) + r]dt + x^mσ_P dz_P,   (19C.1)

and the investor's indirect expected utility function is given by

J^m(W, π, t) = E_t[e^{−φT} W_T^{1−γ}/(1−γ)] = e^{−φT} (W_t^{1−γ}/(1−γ)) Φ^m(π, t),   (19C.2)

where W_T is determined via the wealth dynamics (19C.1). Similar to the indirect utility function under the optimal strategy, we conjecture that the function Φ^m is of the form

Φ^m = e^{A^m(τ) + B^m(τ)y + ½C^m(τ)y²},   (19C.3)

where A^m(τ), B^m(τ), and C^m(τ) are solutions to a system of ordinary differential equations (ODEs) similar to that under the optimal strategy:

dC^m/dτ = a_1^m(C^m)² + a_2^mC^m + a_3^m,   (19C.4)

dB^m/dτ = a_1^mB^mC^m + ½a_2^mB^m + (α + βπ̄ − r)C^m,   (19C.5)

dA^m/dτ = a_1^m[C^m + ½(B^m)²] + (α + βπ̄ − r)B^m + (1 − γ)r,   (19C.6)

with the boundary conditions A^m(τ) = 0, B^m(τ) = 0, and C^m(τ) = 0 at τ = 0. This system of ODEs has exactly the same form of solutions as that under the optimal strategy, but with slightly different coefficients, (a_1^m, a_2^m, a_3^m). In particular,

a_1^m = β²σ_π² ≠ a_1,  a_2^m = 2[((1 − γ)/γ)(βσ_πρ_{πP}/σ_P) − κ] = a_2,  a_3^m = (1 − γ)/(γσ_P²) = a_3,

where a_2 and a_3 are given in (19A.3–19A.4) in Appendix 19A. Therefore, the solution for A^m(τ), B^m(τ), and C^m(τ) has exactly the same expression as that given by Equations (19A.5–19A.7) in Appendix 19A, except that a_1 is replaced by a_1^m and θ is replaced by

θ^m ≡ √(a_2² − 4a_1^ma_3).
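A short sketch of the myopic policy and the modified discriminant root, with illustrative names and the coefficient definitions assumed in the reconstruction above:

```python
import numpy as np

def myopic_weight(alpha, beta, pi_t, r, gamma, sigma_p):
    """Myopic market-timing weight of Appendix 19C:
    x_m = (alpha + beta*pi_t - r) / (gamma * sigma_p**2)."""
    return (alpha + beta * pi_t - r) / (gamma * sigma_p**2)

def theta_m(a1m, a2, a3):
    """theta^m = sqrt(a2**2 - 4*a1m*a3): the root used when the normal
    solution of Appendix 19A is reused with the myopic coefficient a1^m."""
    return np.sqrt(a2**2 - 4.0 * a1m * a3)
```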
Appendix 19D The Optimal Buy-and-Hold Strategy

The dynamics of the stock price (with dividends reinvested) given in Equations (19.1–19.2) imply that the stock price at time T, conditional on information at t, is:

P_T = P_t exp{ (α + βπ̄ − ½σ_P²)(T − t) + β(π_t − π̄)(1 − e^{−κ(T−t)})/κ
 + (βσ_π/κ)∫_t^T (1 − e^{−κ(T−s)}) dz_π(s) + σ_P∫_t^T dz_P(s) }.   (19D.1)

The optimal buy-and-hold strategy, x^b, which is the proportion of current wealth W_t invested in the stock, then solves the following optimization problem (normalizing W_t to one dollar):

max_{x^b} E_t[ (x^b(P_T/P_t) + (1 − x^b)e^{r(T−t)})^{1−γ}/(1 − γ) ].   (19D.2)

The strategy x^b, the indirect utility J^b, and the certainty equivalent wealth CEW^b are all solved numerically.
Chapter 20
Portfolio Insurance Strategies: Review of Theory and Empirical Studies Lan-chih Ho, John Cadle, and Michael Theobald
Abstract A portfolio insurance strategy is a dynamic hedging process that provides the investor with the potential to limit downside risk while allowing participation on the upside so as to maximize the terminal value of a portfolio over a given investment horizon. First, this paper introduces the basic concepts and payoffs of a portfolio insurance strategy. Second, it describes the theory of alternative portfolio insurance strategies. Third, it empirically compares the performances of various portfolio insurance strategies during different markets and time periods. Fourth, it summarizes the recent market developments of portfolio insurance strategies, especially in terms of the variations of features in CPPI investments. Finally, it addresses the impacts of these strategies on financial market stability. Keywords CPPI · OBPI · Risk-based portfolio insurance
20.1 Introduction

Portfolio insurance refers to any strategy that protects the value of a portfolio of risky assets. The risky assets can be stocks, bonds, currencies, or even alternative assets, such as commodities, real assets, hedge funds, credits, and so forth. If the value of the risky asset declines, the insurance or hedge will increase in value to help offset the decline in price of the hedged risky assets. If the price of the risky asset increases, the insured portfolio will increase in value by less than the risky asset itself, but it will nevertheless still increase.
L.-c. Ho () Department of Foreign Exchange, Central Bank of the Republic of China (Taiwan), 2, Roosevelt Road, Sec. 1, 10066 Taipei, Taiwan, ROC, e-mail: [email protected]
J. Cadle and M. Theobald Accounting and Finance Subject Group, University of Birmingham, Birmingham, UK
Table 20.1 illustrates how portfolio insurance works. In this example, the underlying risky asset is purchased for $95 and $5 is spent on portfolio insurance. The minimum amount that the insured investor can realize is $95, but the uninsured portfolio can fall in value to a low of $75 if the market falls. If the value of the risky asset increases, the value of the insured portfolio will increase, but at a smaller rate. Figure 20.1 illustrates the gains and losses of the insured and uninsured portfolios. Portfolio insurance allows market participants to alter the return distribution to fit investors' needs and preferences for risk. Figure 20.2 shows the effect of insurance on the expected returns of a portfolio. Notice that the uninsured portfolio has greater upside potential as well as greater downside risk, whereas the insured portfolio limits the downside loss to the cost of the hedge. The upside potential of the insured portfolio is always below that of the uninsured portfolio. The cost of the insurance is the lower return on the insured portfolio should prices increase. While some investors would prefer the greater upside potential that the uninsured portfolio offers, risk-averse investors would prefer the limited-risk characteristics that the hedged portfolio offers.
20.2 Theory of Alternative Portfolio Insurance Strategies

This section introduces the mechanisms behind various portfolio insurance strategies. These strategies share a common characteristic, a convex payoff function, which implies a "buy high and sell low" rule for the risky asset. If the market level declines, an investor sells a fraction of the risky asset and buys the riskless asset; if the market level rises, an investor switches from the riskless to the risky asset and rallies with the market. For illustrative purposes, we construct a portfolio that comprises a risky currency, the euro (EUR), and a risk-free currency, the US dollar (USD) overnight deposit account. For comparison purposes, we assume that the
Table 20.1 Mechanics of portfolio insurance: an example

Initial investment: $100. Cost of portfolio insurance: $5. Amount of investment going toward securities: $95. Amount invested = 100.

Value of portfolio at year end ($) | Return on uninsured portfolio (%) | Value of insured portfolio ($) | Net return on insured portfolio (%)
 75 | −25 |  95 | −5
 80 | −20 |  95 | −5
 85 | −15 |  95 | −5
 90 | −10 |  95 | −5
 95 |  −5 |  95 | −5
100 |   0 |  95 | −5
105 |   5 | 100 |  0
110 |  10 | 105 |  5
115 |  15 | 110 | 10
120 |  20 | 115 | 15
125 |  25 | 120 | 20
130 |  30 | 125 | 25
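The arithmetic of Table 20.1 can be expressed in a few lines. In this sketch (illustrative names; the $95 floor and $5 premium are taken from the example above), the insured value is the larger of the $95 floor and the year-end portfolio value net of the premium:

```python
def insured_value(market_value, premium=5.0, invested=95.0):
    """Insured portfolio value: the floor (the $95 put into securities) or the
    market value less the $5 premium, whichever is larger, as in Table 20.1."""
    return max(invested, market_value - premium)

for v in range(75, 135, 5):
    ins = insured_value(float(v))
    # columns: year-end value, uninsured return %, insured value $, net return %
    print(v, v - 100, ins, ins - 100)
```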
Fig. 20.1 Gains and losses of insured and uninsured portfolios: an example
initial value of EUR in the portfolio is 25%, that short positions in EUR are not allowed, but that an investor can borrow USD to buy EUR.
20.2.1 Portfolio Insurance with Synthetic Put (Option-Based Portfolio Insurance)

The most popular portfolio insurance strategy is the synthetic put approach of Rubinstein and Leland (1981), also referred to as an option-based portfolio insurance (OBPI) strategy. If an investor holds a risky asset and buys one at-the-money put option on that asset, the value of the resultant portfolio will not be less than the exercise price net of the premium at expiration. Thus, the investor effectively hedges the portfolio against downside risk. However, the implied volatility is usually higher than the historical volatility in option markets, which indicates that an actual option is more expensive than a synthetic one. Therefore, a synthetic option is often used as an alternative hedging vehicle in practice.
Fig. 20.2 Expected returns on insured and uninsured portfolios
In a currency portfolio, a synthetic put can be created by dynamically rebalancing the portfolio between the risky (foreign) and riskless (domestic) currencies according to the delta of a put option,

D = e^{−r_f T}[N(d_1) − 1].   (20.1)

Specifically, the investment proportion of the risky currency in the portfolio value can be expressed as:

W_SynPut = (S + D·S)/V = S e^{−r_f T} N(d_1)/(S + P),   (20.2)

where P is the premium of a European put option on foreign currency from Garman-Kohlhagen (1983),

P = X e^{−rT} N(−d_2) − S e^{−r_f T} N(−d_1),   (20.3)

with d_1 = [ln(S/X) + (r − r_f + σ²/2)T]/(σ√T) and d_2 = d_1 − σ√T; S is the spot exchange rate, r is the domestic risk-free interest rate, r_f is the foreign risk-free interest rate, σ is the volatility of the spot exchange rate return, X is the exercise price, T is the time to expiration, and N(·) is the cumulative standard normal distribution function.

Assume that the initial portfolio contains 1,000,000 units of EUR. In order to make the initial value of the EUR position in the portfolio equal to 25%, an investor needs to create a synthetic put with a delta of −0.75 and sell 750,000 units of the EUR. The dollar income from the EUR sold is kept in a USD cash account and earns the overnight deposit rate. As the EUR depreciates, the synthetic put moves further in-the-money and an investor sells more EUR; conversely, an investor withdraws money from the USD cash account to buy EUR as the EUR appreciates. The daily adjustment is determined by the changes of delta. Table 20.2 illustrates the adjustment process.

One important parameter in the synthetic put approach is the volatility. Rendleman and O'Brien (1990), Do (2002), Do and Faff (2004), and Bertrand and Prigent (2005) use different measures of volatility, such as implied, historical, and Leland's (1985) adjusted historical volatility, and discuss the effects of volatility estimation on the synthetic put strategy. Additionally, in order to reduce transaction costs, some revisions of the rebalancing discipline are adopted, such as a time discipline with weekly/monthly rebalancing, a price discipline with adjustments according to changes in the value of the risky asset itself, or a volume discipline corresponding to changes in the number of shares that should be held.
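A small numerical sketch of Equations (20.1)–(20.3), using only the Python standard library; the inputs are hypothetical and are not the calibration behind Table 20.2.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf

def gk_put(S, X, r, rf, sigma, T):
    """Garman-Kohlhagen FX put premium (Eq. 20.3) and delta (Eq. 20.1)."""
    d1 = (log(S / X) + (r - rf + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    premium = X * exp(-r * T) * N(-d2) - S * exp(-rf * T) * N(-d1)
    delta = exp(-rf * T) * (N(d1) - 1.0)
    return premium, delta

def synthetic_put_weight(S, X, r, rf, sigma, T):
    """Risky-currency weight of Eq. (20.2): S*exp(-rf*T)*N(d1) / (S + P)."""
    d1 = (log(S / X) + (r - rf + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    premium, _ = gk_put(S, X, r, rf, sigma, T)
    return S * exp(-rf * T) * N(d1) / (S + premium)

# Hypothetical inputs: EURUSD spot 1.3201 with made-up rates and volatility.
print(gk_put(S=1.3201, X=1.35, r=0.0525, rf=0.036, sigma=0.07, T=1.0))
print(synthetic_put_weight(S=1.3201, X=1.35, r=0.0525, rf=0.036,
                           sigma=0.07, T=1.0))
```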
20.2.2 Constant Proportion Portfolio Insurance

Another simplified approach, not involving complex "Greeks," is the constant proportion portfolio insurance (CPPI) strategy developed by Black and Jones (1987) and Perold (1986), and extended to fixed-income instruments by Hakanoglu et al. (1989). In CPPI, the exposure to the risky asset, E, is always kept at the cushion times a multiplier, m. The cushion, C, is the difference between the portfolio value, V, and a protected floor value, FL, and the multiplier is constant throughout the investment period:

C = V − FL,   (20.4)

E = m·C.   (20.5)
Table 20.2 Daily adjustment process of portfolio insurance with synthetic put

Date | EURUSD | (1) Delta | (2) EUR amt. sold (−: bought) | (3) EUR asset USD value | (4) = (3)/(9) EUR asset (%) | (5) USD income of EUR sold | (6) = (5) + (6) + (7) Cumulated USD income | (7) Interest income | (8) = (6)/(9) USD asset (%) | (9) = (3) + (6) Portfolio USD value | (10) = d(9)/(9) Portfolio return (%)
2007/1/1 | 1.3201 | −0.750000 | 750,000 | $330,025 | 25.0 | $990,075 | $990,075 | $145 | 75.0 | $1,320,100 | –
2007/1/2 | 1.3272 | −0.729593 | −20,407 | $358,884 | 27.1 | −$27,084 | $963,137 | $142 | 72.9 | $1,322,020 | 0.15
2007/1/3 | 1.3170 | −0.764881 | 35,288 | $309,651 | 23.5 | $46,474 | $1,009,753 | $148 | 76.5 | $1,319,404 | −0.20
The investment proportion of the risky currency in the portfolio value can be expressed as:

W_CPPI = E/V.   (20.6)

In a portfolio of securities, the floor is usually set as the discount price of a zero-coupon bond, which approaches the par value at the end of the investment horizon. In a currency portfolio, the floor is set as the forward rate, with a maturity equal to the investment horizon, that appreciates with time; that is, F_t = F_0 exp(r_0 t), where F_0 = S_0 exp((r_0 − r_{f,0})T). Note that when the domestic risk-free rate is higher than the foreign risk-free rate, the forward rate is at a premium, which would result in short-selling of the risky asset at the beginning of the investment horizon; that is, E = m·C = m·(V − FL) = m·Units·(S − F) < 0. In order to make the initial value of the EUR position in the portfolio equal to 25%, a discount factor (DF) is needed, and the multiplier^1 also needs to be trimmed at the beginning of the investment horizon. Thus, the synthetic floor is set as:

F_t = DF·F_0 exp(r_0 t).   (20.7)
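One CPPI rebalancing step under Equations (20.4)–(20.6) can be sketched as follows; the cap at the portfolio value (no borrowing) and the example numbers, taken loosely from Table 20.3, are assumptions for illustration.

```python
def cppi_step(V, floor, m):
    """One CPPI rebalancing: exposure E = m * max(V - floor, 0), per
    Eqs. (20.4)-(20.5), capped at V when borrowing is not allowed."""
    cushion = max(V - floor, 0.0)
    E = min(m * cushion, V)
    return E, E / V        # dollar exposure and the weight of Eq. (20.6)

# Example in the spirit of Table 20.3 (DF = 0.94, m = 5.24592).
E, w = cppi_step(V=1_320_100.0, floor=1_257_200.0, m=5.24592)
print(E, w)   # roughly $330,000 of EUR exposure, i.e., a 25% weight
```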
Table 20.3 illustrates the adjustment process. As the EUR appreciates, the portfolio value increases by more than the floor level does, the cushion increases, and investors buy EUR; and vice versa. There are various market developments of the CPPI mechanism with additional features, such as constraints on the investment level, constraints on leverage, variable floors, and variable multipliers, which will be introduced in a later section.

20.2.3 Portfolio Insurance with Downside Risk Control (Risk-Based Portfolio Insurance)
The idea of risk management has become widespread and has been applied to the asset allocation problem. By dynamically controlling downside risk while maximizing the expected return of a portfolio, the concept of the modern risk-based asset allocation process is analogous to that of the classic portfolio insurance strategy. Zhao and Ziemba (2000) are the first to explore the portfolio payoffs between the two approaches. There are basically two popular measures of downside risk: one is the Value-at-Risk (VaR), and the other, more coherent, one is the Expected Shortfall (ES). Accordingly, there are two risk-based portfolio insurance (RBPI) strategies.

VaR-based portfolio insurance is a strategy that permanently controls the shortfall risk of the portfolio. The allocation to the risky asset is adjusted each day so that the shortfall probability – the probability of realizing a portfolio return (R_P) that falls below a pre-specified threshold (R̄) at the end of the investment period – does not exceed a target value, α (say 5%):

Prob(R_P < R̄) ≤ α.   (20.8)

To operationalize the shortfall risk, Herold et al. (2005) use the lower partial moment of order zero, which is equivalent to VaR, and assume normally distributed portfolio returns:

LPM_0(R_P) = N((R̄ − u)/σ),   (20.9)

R_P = W·R_A + (1 − W)·R_F,   (20.10)
where u and σ are the mean and the volatility of portfolio returns, respectively, and R̄ is the required portfolio return. R_A and R_F are the returns of the risky and the riskless asset, respectively. Controlling for a fixed shortfall probability, the dynamic allocation process is solved for the weight of the risky asset so that the portfolio return is always higher than the required return (i.e., the VaR):
^1 The magnitude of the multiplier depends on the discount factor. The lower the discount factor, the lower the multiplier, in order to keep the initial exposure fixed at 25%; that is, E = m·(V − FL).
NVaR_{α,t}(R_P) = u_t(R_P) − N^{−1}(α)·σ_t(R_P).   (20.11)
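Under the normality assumption of Equations (20.9)–(20.11), the largest admissible risky weight can be backed out in closed form. The sketch below is one way to operationalize the constraint; it is not the chapter's own code, and the parameter values in the example are placeholders.

```python
from statistics import NormalDist

def var_constrained_weight(u_a, sigma_a, r_f, floor_ret, alpha=0.05):
    """Largest weight W with Prob(R_P < floor_ret) <= alpha under normal
    returns, where R_P = W*R_A + (1-W)*R_F as in Eqs. (20.8)-(20.10)."""
    z = NormalDist().inv_cdf(alpha)        # negative quantile, e.g. -1.645
    w = (floor_ret - r_f) / (u_a - r_f + z * sigma_a)
    return min(max(w, 0.0), 1.0)           # long-only, no leverage

# Placeholder daily figures: risky mean/vol, cash return, -0.5% daily floor.
print(var_constrained_weight(u_a=0.0002, sigma_a=0.006,
                             r_f=0.0001, floor_ret=-0.005))
```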
Table 20.3 Daily adjustment process of constant proportion portfolio insurance. Floor discount factor (DF) = 0.94; multiplier (m) = 5.24592

Date | EURUSD | (1) Portfolio USD value | (2) Floor level | (3) = ((1) − (2))·m EUR asset USD value | (4) EUR amount | (5) = (3)/(1) EUR asset (%) | (6) = (1) − (3) USD asset USD value | (7) = (6)/(1) USD asset (%) | (8) = d(1)/(1) Portfolio return (%)
2007/1/1 | 1.3201 | $1,320,100 | 1.2572 | $330,025 | 250,000 | 25.0 | $990,075 | 75.0 | –
2007/1/2 | 1.3272 | $1,322,020 | 1.2574 | $338,752 | 255,238 | 25.6 | $983,268 | 74.4 | 0.15
2007/1/3 | 1.3170 | $1,319,562 | 1.2577 | $324,510 | 246,401 | 24.6 | $995,052 | 75.4 | −0.19
Table 20.4 illustrates the daily adjustment process. Since the VaR of a portfolio return does not change dramatically from day to day during a certain period of time, the frequency of adjustment is much lower than that in traditional portfolio insurance strategies.

In comparison with the traditional portfolio insurance strategy, Herold et al. (2005) show that the VaR-based model can be regarded as a generalized version of CPPI with a dynamic, time-varying implied multiplier. The risky currency investment proportion is then given by:

W_VaR = C̃/(−Z_α·σ),   (20.12)

where Z_α is the α-quantile of the standard normal distribution and the process drift is assumed to be equal to zero at the daily differencing interval. The implied multiplier of the VaR model is obtained by:

m_impl = E/C = (W·V)/C = −1/(Z_α·σ) = 1/VaR_α(R_P),   (20.13)

where C̃ = C/V is the cushion expressed in percentage terms. Thus, the inverse of the CPPI multiplier can be interpreted as the maximum loss or worst-case return that is allowed to occur over the next period. Bertrand and Prigent (2001) propose that the upper bound on the multiplier can be determined by extreme value theory (EVT).

ES-based portfolio insurance is a strategy that permanently controls the expected shortfall of the portfolio. The expected shortfall is defined as the expected value of the loss of a portfolio in a certain percentage of worst cases within a given holding period. The allocation to the risky asset is adjusted each day so that the expected shortfall does not exceed a target value. Following the idea of Herold et al. (2005), Hamidi et al. (2007) propose a conditional CPPI multiplier that links the traditional CPPI strategy with the RBPI strategy. The conditional CPPI multiplier is determined by keeping the risk exposure of a portfolio constant, and the risk exposure can be defined either by the VaR or by the ES:

m_cond = 1/VaR_α(R_P) or m_cond = 1/ES_α(R_P).   (20.14)

By assuming a normally distributed portfolio return, the expected shortfall and the VaR are scalar multiples of the standard deviation, and there will be no differences across portfolio strategies employing variance, VaR, and ES risk measures. Therefore, Ho et al. (2008) use the historical distribution of portfolio returns to operate the ES-based portfolio insurance strategy:

HES_{α,t}(R_P) = E[R_P | R_P < HVaR_{α,t}(R_P)] − R̄,   (20.15)

where HVaR_{α,t}(R_P) = Percentile(R_{P,t−260}, …, R_{P,t}; 5%) − R̄, in which HVaR_{α,t}(R_P) is the historical VaR of portfolio returns at date t at the (1 − α)% confidence level, R_{P,t−260}, …, R_{P,t} is the daily portfolio return series given a certain weight of the risky asset during the past year, and E[·] is the expectation operator. Table 20.5 illustrates the daily adjustment process. Since the ES of a portfolio return does not change dramatically from day to day during a certain period of time, the frequency of adjustment is also much lower than that in the traditional portfolio insurance strategies.
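A minimal sketch of the historical VaR- and ES-based conditional multiplier of Equations (20.14)–(20.15), assuming a trailing window of daily portfolio returns; the names and the 260-day window length are illustrative.

```python
import numpy as np

def conditional_multiplier(past_returns, floor_ret, alpha=0.05, use_es=True):
    """Conditional CPPI multiplier (Eq. 20.14): the reciprocal of the
    historical VaR or ES of past portfolio returns (cf. Eq. 20.15)."""
    q = np.percentile(past_returns, 100.0 * alpha)   # empirical alpha-quantile
    if use_es:
        risk = past_returns[past_returns < q].mean() - floor_ret  # hist. ES
    else:
        risk = q - floor_ret                                      # hist. VaR
    return 1.0 / abs(risk)

# Placeholder: one trading year of daily portfolio returns.
rng = np.random.default_rng(3)
past = 0.0002 + 0.006 * rng.standard_normal(260)
print(conditional_multiplier(past, floor_ret=0.0, use_es=True),
      conditional_multiplier(past, floor_ret=0.0, use_es=False))
```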
20.3 Empirical Comparison of Alternative Portfolio Insurance Strategies

In the late 1980s and early 1990s, much of the research in this area focused on simulated comparisons of the performances of alternative portfolio insurance strategies. In the 2000s, most studies use empirical data to analyze the circumstances in which one strategy would outperform the others, based on modifications of certain portfolio insurance mechanisms, such as the volatility input, the rebalancing interval, and so forth. The performances are typically evaluated by floor protection – whether the strategies achieve the desired floor at the end of the investment horizon; by the cost of insurance – the opportunity cost of forgoing gains in rising markets; and by the Sharpe ratio. Listed below are some of the studies that have appeared in academic journals.
20.3.1 Zhu and Kavee (1988)

Zhu and Kavee evaluate and compare the performances of the two traditional portfolio insurance strategies, the synthetic put approach of Rubinstein and Leland (1981) and the constant proportion approach of Black and Jones (1987). They employ a Monte Carlo simulation methodology, assuming lognormally distributed daily returns with an annual mean return of 15% paired with different values of market volatility, in order to discover whether these strategies can really guarantee the floor return and how much investors have to pay for the protection. Both strategies are able to reshape the return distribution so as to reduce downside risk and retain a certain part of the upside gains. However, they demonstrate that a certain degree of protection can be achieved only at a considerable cost. There are two types of costs in implementing a portfolio insurance strategy. The first is the explicit cost, that is, the transaction costs. The other is the implicit cost, which is the average return forgone in exchange for protection against
Table 20.4 Daily adjustment process of VaR-based portfolio insurance. α = 5%; control variable: VaR

Date | EURUSD | (1) Portfolio USD value | (2) EUR adj. rule (%) | (3) = (2)·IniAmt EUR amount | (4) = (3)·S EUR asset USD value | (5) = (4)/(1) EUR asset (%) | (6) = −d(3)·S USD deposit adjust (+/−) | (7) = (6) + (7) + (8) Cumulated USD amount | (8) Interest income | (9) = (7)/(1) USD asset (%) | (10) = d(1)/(1) Portfolio return (%)
2007/1/1 | 1.3201 | $1,320,100 | 25.0 | 250,000 | $330,025 | 25.0 | – | $990,075 | $145.2 | 75.0 | –
2007/1/2 | 1.3272 | $1,322,020 | 25.0 | 250,000 | $331,800 | 25.1 | $0 | $990,220 | $146.3 | 74.9 | 0.15
2007/1/3 | 1.3170 | $1,319,617 | 25.0 | 250,000 | $329,250 | 25.0 | $0 | $990,367 | $145.3 | 75.0 | −0.18
Table 20.5 Daily adjustment process of ES-based portfolio insurance. α = 5%; control variable: ES

Date | EURUSD | (1) Portfolio USD value | (2) EUR adj. rule (%) | (3) = (2)·IniAmt EUR amount | (4) = (3)·S EUR asset USD value | (5) = (4)/(1) EUR asset (%) | (6) = −d(3)·S USD deposit adjust (+/−) | (7) = (6) + (7) + (8) Cumulated USD amount | (8) Interest income | (9) = (7)/(1) USD asset (%) | (10) = d(1)/(1) Portfolio return (%)
2007/1/1 | 1.3201 | $1,320,100 | 25.0 | 250,000 | $330,025 | 25.0 | – | $990,075 | $145.2 | 75.0 | –
2007/1/2 | 1.3272 | $1,322,020 | 25.0 | 250,000 | $331,800 | 25.1 | $0 | $990,220 | $146.3 | 74.9 | 0.15
2007/1/3 | 1.3170 | $1,319,617 | 25.2 | 251,613 | $331,374 | 25.1 | −$2,124 | $988,242 | $145.0 | 74.9 | −0.18
the downside risk. When the market becomes volatile, the protection error of the synthetic put approach increases, and the transaction costs may be unbearable. On the other hand, while the constant proportion approach may have lower transaction costs, its implicit cost may still be substantial.

20.3.2 Perold and Sharpe (1988)

Using simulated stock and bill prices, Perold and Sharpe examine and compare how four dynamic asset allocation strategies – namely, the buy-and-hold, constant mix, CPPI, and OBPI – perform in bull, bear, and flat markets and in volatile and not-so-volatile markets. CPPI and OBPI strategies sell stocks as the market falls and buy stocks as the market rises. This dynamic allocation rule represents the purchase of portfolio insurance and has a convex payoff function, which results in better downside protection and better upside potential than a buy-and-hold strategy. However, these strategies do worse in relatively trendless, volatile markets. Conversely, a constant mix strategy – holding a constant fraction of wealth in stocks – buys stocks as the market falls and sells them as it rises. This rebalancing rule effectively represents the sale of portfolio insurance and has a concave payoff function, which leads to less downside protection than, and not as much upside as, a buy-and-hold strategy. However, it does best in relatively trendless and volatile markets. Perold and Sharpe suggest that no one particular type of dynamic strategy is best in all situations. Financial analysts can help investors understand the implications of various strategies, but they cannot, and should not, choose a strategy without a substantial understanding of an investor's circumstances and desires.

20.3.3 Rendleman and O'Brien (1990)

Rendleman and O'Brien address the issue that misestimation of the volatility input can have a significant impact on the final payoff of a portfolio using a synthetic put strategy. In an OBPI strategy, the daily portfolio adjustments depend on the delta of the put option on the risky asset. In the original Black and Scholes (1973) valuation equation, the delta at each day is a function of the following variables:

D_t = f(S_t, k̄_0, r̄_0, σ̄_0, T),   (20.16)

where S_t is the price of the risky asset at date t, k̄_0 the strike price, r̄_0 the annual continuously compounded riskless rate of interest, σ̄_0 the ex ante volatility parameter at the beginning of the insurance period, and T the maturity of the option. Assume that at the beginning of the insurance period one manager predicts a high-volatility market and a 20% annualized volatility, while a second manager believes in a low-volatility market and a 10% volatility. The delta for the first manager is higher, which means he would buy more insurance and allocate less to a position in the risky asset. Assume that the ex post volatility turns out to be 10%. The second manager will have made the proper allocation between risky and riskless assets. In contrast, the high-estimate manager will have invested too little in the risky asset. This misallocation would forfeit the opportunity of participating in the price appreciation in a strong market. Thus, a manager who underestimates volatility will typically end up buying less insurance than is necessary to ensure a given return, while a manager who overestimates volatility will buy more insurance than is necessary and forgo gains. As for the issue of portfolio rebalancing, Rendleman and O'Brien examine adjustment frequencies by time intervals; namely, daily, weekly, monthly, and bi-monthly. The effect (error) is measured as the difference between the horizon insurance values under noncontinuous and continuous trading. They suggest that weekly rebalancing produces an amount of error that appears to be tolerable by most portfolio managers. More importantly, they address the biggest potential risk of implementing portfolio insurance strategies – the gap risk – by simulating the performance of the OBPI strategy over the period of the October 1987 market crash. They indicate that most insured portfolios would have fallen short of their promised values because the managers would not have been able to adjust the portfolio in time before a big drop in the market. The gap risk will be discussed further in the financial stability section.
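The volatility sensitivity that Rendleman and O'Brien highlight is easy to see numerically. The sketch below evaluates the put delta of Equation (20.16) under the two volatility inputs in the example; the remaining parameter values are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

def put_delta(S, X, r, sigma, T):
    """Black-Scholes put delta, the hedge ratio behind Eq. (20.16)."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return NormalDist().cdf(d1) - 1.0

# Two managers, identical except for the volatility input (20% vs. 10%).
for sig in (0.20, 0.10):
    print(sig, put_delta(S=100.0, X=100.0, r=0.05, sigma=sig, T=0.25))
```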
20.3.4 Loria et al. (1991)

Loria, Pham, and Sim simulate the performance of a synthetic put strategy using futures contracts based on the Australian All Ordinaries Index for the period April 1984–March 1989. Their study contains 20 consecutive nonoverlapping 3-month insurance periods whose expiration dates coincide with the expiration dates of each SPI futures contract traded on the Sydney Futures Exchange. Four implementation scenarios are examined: a zero floor versus a 5% floor, corresponding to a portfolio return of 0% and 5% annually,
respectively; and a realized volatility versus Leland's (1985) modified volatility. They report that there is no perfect guarantee of loss prevention under any scenario. Even in the scenario with a 5% floor and modified volatility, 2 out of 20 contracts do not meet the desired floor. In addition, the OBPI strategy is most effective under severe market conditions. In other periods, characterized by insignificant market declines, the value of the insured portfolio is below that of the market portfolio. Loria, Pham, and Sim suggest futures mispricing may be one potential culprit for this outcome.
20.3.5 Do and Faff (2004)

This empirical paper is an extension of Loria et al. (1991). Do and Faff examine two approaches (the OBPI and CPPI) and conduct simulations across two implementation strategies (via the Australian stock index and bills, and via SPI futures and the stock index). Furthermore, they consider the use of implied volatility as an input into the model, and the use of a zero floor versus a 5% floor. The dataset consists of 59 nonoverlapping 3-month insurance periods, which span from October 1987 to December 2002. Thus, their key contributions relative to Loria et al. (1991) are the examination of a futures-based CPPI, a fine-tuned algorithm that allows for dividend payments, the consideration of ex ante volatility information, and more up-to-date data. In terms of floor protection, the futures-based portfolio insurance implementation generally dominates its index-and-bill rival in both floor specifications, which reflects the low transaction costs in the futures market. Furthermore, perfect floor protection is possible when implied volatility is used rather than ex post volatilities. From the cost-of-insurance perspective, the futures-based strategy generally induces a lower cost of insurance than its index-and-bill rival in both floor specifications, for the same reason as above. However, the cost of insurance is higher when implied volatility is used compared to ex post volatilities. The possible explanation is that implied volatility is often higher than the same-period ex post volatility, which results in overhedging. As for the performances of the OBPI and CPPI approaches, there is no strong evidence to distinguish between them. Within the futures-based implementation, the synthetic put appears to dominate the CPPI with respect to floor protection, while the latter appears to slightly outperform in terms of upside participation. They also examine whether portfolio insurance strategies work under stress conditions by assessing the strategies' effectiveness during tranquil and turbulent periods. Tranquil periods are defined as ones that return more than 4% and have a realized volatility of less than 15%; violation of either one or both of these conditions is regarded as indicating a turbulent period. All portfolio insurance strategies achieve 100% floor protection during tranquil periods, whereas the futures-based OBPI approach records the highest portfolio return. During turbulent times, futures-based portfolio insurance continues to perform quite well. The 1987 stock market crash makes the assessment difficult because of trading halts. However, assuming the futures had continued to trade during that crisis, from the algorithm's perspective the futures-based CPPI maintains a positive return, while the OBPI results in a negative return.

20.3.6 Cesari and Cremonini (2003)
This study is an extensive comparison of a wide variety of traditional portfolio insurance strategies. There are basically five dynamic asset allocation strategies: (1) buy-and-hold (BH); (2) constant mix (CM); (3) constant proportion (without and with the lock-in of profits, CP and CPL); (4) the option-based approach (with three variations, BCDT, NL, PS); and (5) technical strategy (with two kinds of stop-loss mechanism, MA and MA2). Therefore, nine strategies in total are considered. For each strategy, eight measures of risk, return, and risk-adjusted performance are calculated; namely, mean return, standard deviation, asymmetry, kurtosis, downside deviation, Sharpe ratio, Sortino ratio, and return at risk. The strategies are then compared in different market situations (bear, no-trend, and bull markets) and in different market volatility periods (low, medium, and high), taking into account transaction costs and discrete rebalancing of portfolios. The three market situations are defined according to whether market average returns fall into the three ranges (−30%, −5%), (−5%, +5%), and (+5%, +30%), and the three ranges for the volatility periods are (10%, 15%), (15%, 25%), and (25%, 30%). Transaction costs are treated in two ways: a proportional cost to the value traded and a correction to Leland's (1985) option volatility. Two main rebalancing disciplines are used: a time discipline with weekly adjustment, and a price discipline with adjustment only when prices have increased/decreased by 2.5% with respect to the previous rebalance time. Monte Carlo simulations of MSCI World, North America, Europe, and Pacific stock market returns show that no strategy is dominant in all market situations. However, in bear and no-trend markets, CP and CPL strategies appear to be the best choice. In a bull market, or in a no-trend but high-volatility market, the CM strategy is preferable. If the market phase is unknown, CP, CPL, and BCDT strategies are recommended.
In addition, these results are independent of the volatility level and the risk-adjusted performance measure adopted.
20.3.7 Herold et al. (2005)

By constructing a fixed-income portfolio, in which the risky asset is the JPMorgan government bond index and the riskless asset is cash (1-month yield), Herold, Maurer, and Purschaker compare the hedging performances of the traditional CPPI strategy and the risk-based (specifically, VaR-based) strategy over 1-year investment horizons beginning in each year from 1987 to 2003. CPPI avoids losses in the bear years of 1994 and 1999, but its mean return is inferior to (about 40 basis points below) that of the risk-based strategy. CPPI also produces a higher turnover.
20.3.8 Hamidi et al. (2007)

Although Hamidi, Jurczenko, and Maillet propose a conditional CPPI multiplier determined either by the VaR or by the ES, only the VaR-based measure is studied in this empirical work, and the ES measure is left for future research. The data set contains daily returns of the Dow Jones Index from January 2, 1987 to May 20, 2005, 4,641 observations in total. They use a rolling window of 3,033 returns to estimate the VaR, and there are 1,608 VaRs in the out-of-sample period. They resort to eight methods of VaR calculation: one non-parametric method using the historical simulation approach; three parametric methods based on distributional assumptions, namely the normal VaR, the RiskMetrics VaR based on the normal distribution, and the GARCH VaR based on the Student-t distribution; and four semi-parametric methods using quantile regression to estimate the conditional autoregressive VaR (CAViaR), namely the Symmetric Absolute Value CAViaR, the Asymmetric Slope CAViaR, the IGARCH(1,1) CAViaR, and the Adaptive CAViaR. According to the 1,608 back-testing results, the Asymmetric Slope CAViaR is the best model to fit the data. After the VaR values have been calculated, the conditional multipliers can be determined. The estimates spread between 1.5 and 6, which is compatible with the multiplier values used by practitioners in the market (between 3 and 8). Using the time-varying CPPI multipliers estimated by the different methods, and a "multi-start" analysis – a fixed 1-year investment horizon beginning at every day of the out-of-sample period – they find that the final returns of these insured portfolios are not significantly different.
20.3.9 Ho et al. (2008)

This empirical study presents a complete structure for comparing traditional portfolio insurance strategies (OBPI, CPPI) with modern risk-based portfolio insurance strategies (VaR- and ES-based RBPI). By constructing a currency portfolio, in which the risky asset is the Australian dollar and the riskless asset is a US dollar overnight deposit, Ho, Cadle, and Theobald compare the dynamic hedging performances of the traditional and the modern strategies over 1-year investment horizons beginning in each year from 2001 to 2007. When implementing the OBPI strategy, the delta is calculated based on the modified Black-Scholes formula, D_t = f(S_t, k̄_0, r_t, σ_t, T). That is, daily annualized historical volatilities are used as inputs in the put-replication process, instead of the constant ex ante or the implied volatility of the original model, to mitigate the volatility misestimation problem; besides, the latter two parameters would make the portfolio daily returns more volatile. The interest rates are also updated daily for the same reason. When CPPI is implemented, the possible upper bounds of the multiplier are examined via EVT, ranging from 4 to 6 corresponding to different confidence levels. When the risk-based approaches are employed, both a historical distribution and a normal distribution with exponentially weighted volatility of risky asset returns are assumed. A daily rebalancing principle is adopted in their research, without any modification, to show the original results of hedging. The performances are evaluated from six differing perspectives. In terms of the Sharpe ratio and the volatility of portfolio returns, the CPPI is the best performer, while the VaR-based strategy under the normal distribution is the worst. From the perspective of shifting the return distribution of the hedged portfolio to the right, and in terms of both the average and the cumulative portfolio returns across years, the ES-based strategy using the historical distribution ranks first. Moreover, the ES-based strategy results in a lower turnover within the investment horizon, thereby saving transaction costs.
20.4 Recent Market Developments

20.4.1 Type of Risky Asset

Over the past few years, portfolio insurance has reemerged due to low structuring and trading costs. According to Pain and Rand (2008), OBPI investments have not been popular, which reflects the difficulty of explaining options to investors. The CPPI strategy, however, is much more prevalent, stemming from a broadening in the asset classes
for which investors need principal protection. Many principal-protected structured notes are designed not only to be linked to the performance of traditional assets, such as stocks, currencies, interest rates, and bonds, but also to invest in alternative risky assets, such as hedge funds, funds of hedge funds, credit derivatives, commodities, real estate, private equity, and any combination of the above assets.
20.4.2 Market Size

Fig. 20.3 Structured note issuance by type of risky asset

Figure 20.3 shows the evolution of shares of structured note issuance by type of risky asset. The share of notes linked to equity and alternative assets increased significantly over 2002–2007. In 2007, the shares of currencies and equities are both about 40%, alternative assets occupy more than 10%, and bonds have the smallest share.

Fig. 20.4 Structured note issuance on alternative assets

Figure 20.4 illustrates the evolution of issuance of structured notes linked to alternative assets. Total issuance has grown roughly ninefold, from about US$5 billion in 2002 to more than US$45 billion in 2007.
20.4.3 Market Participants

According to Pain and Rand (2008), portfolio insurance products are more prevalent in Europe than in the United States. The key issuers (sellers) are typically large investment banks that can provide the necessary structuring and marketing expertise. The main investors include both institutional and retail clients, such as pension funds, insurance companies, private banks who purchase products for onward sale to their clients, and high-net-worth individuals. Among these, institutional investors are especially important in continental Europe.
20.4.4 Modified CPPI Mechanisms

Since the inception of portfolio insurance strategies, CPPI investments have evolved to incorporate various different features. Of particular importance are the key components in the design of CPPI investments (i.e., the floor, the multiplier, and the exposure to risky assets).
20.4.4.1 Variations in Floor

Ratchet floor: The performance of CPPI investments is path dependent; any gains at a particular point in time may still be lost if the underlying asset price subsequently falls. To address this, products with a "ratchet" mechanism have been introduced to lock in a proportion of upside performance. More specifically, the floor can be set as a percentage of the historical maximum net asset value of the portfolio. That is, the floor jumps up with the portfolio value in order to reduce the risky asset allocation when the market peaks. The time-invariant portfolio protection (TIPP) strategy proposed by Estep and Kritzman (1988) reflects this variation.
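A minimal sketch of the ratchet mechanism, contrasting a conventional CPPI floor with a TIPP-style floor that locks in a share of the running maximum portfolio value (the multiplier, lock-in ratio, and simulated price path are all illustrative choices):

```python
import numpy as np

def cppi_path(prices, m=4.0, floor0=0.9, rf_daily=0.0001, ratchet=False, lock=0.9):
    """Track a CPPI portfolio; with ratchet=True the floor is reset to `lock`
    times the running maximum portfolio value (TIPP-style ratchet floor)."""
    V, F = 1.0, floor0
    values = [V]
    for t in range(1, len(prices)):
        cushion = max(V - F, 0.0)
        E = min(m * cushion, V)            # exposure to the risky asset (no leverage)
        r = prices[t] / prices[t - 1] - 1.0
        V = E * (1 + r) + (V - E) * (1 + rf_daily)
        F *= (1 + rf_daily)                # zero-coupon floor accretes at the risk-free rate
        if ratchet:
            F = max(F, lock * V)           # lock in a share of the running maximum
        values.append(V)
    return np.array(values)

rng = np.random.default_rng(2)
prices = np.exp(np.cumsum(rng.normal(0.0003, 0.012, 252)))
print("CPPI terminal value:", round(cppi_path(prices)[-1], 4))
print("TIPP terminal value:", round(cppi_path(prices, ratchet=True)[-1], 4))
```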
Margin floor: The converse of the above case occurs when the market falls so substantially that the cushion approaches zero; the exposure to the risky asset then also approaches zero. Under a conventional CPPI, this leads to a "cash-out," where the portfolio is fully invested in the risk-free asset, with no possibility of regaining any exposure. One method of countering this is to adjust the starting floor value by augmenting it with a margin, which can later be used to help regain some exposure after a near cash-out situation.

Straight-line floor: The floor in a conventional CPPI is a zero-coupon bond, which is sensitive to the level of interest rates. When interest rates fall, the floor rises and the allocation switches away from the risky asset. This in turn limits the subsequent upside potential of the CPPI. This sensitivity can be removed by allowing the floor to vary linearly with time.
20.4.4.2 Variations in Multiplier

Rather than having a constant multiplier as in a conventional CPPI, dynamic portfolio insurance (DPI) allows the multiplier to vary over time in relation to the volatility of the risky asset, reflecting investors' appetite for risk. In fact, DPI is a cushion management technique that quantifies the level of risk borne by the investment in the risky asset. There is often a maximum level for the multiplier, which can be determined by stress testing the returns of the risky asset using EVT. The risk-based portfolio insurance strategies in the earlier section reflect this feature.
20.4.4.3 Variations in the Exposure to Risky Assets

Constraints on the investment level: Another method of avoiding the "cash-out" situation when a market falls is to incorporate a minimum level of investment in the risky asset. Conversely, to avoid unbounded investment in the risky asset as a market rallies, a maximum investment level can also be imposed.

Constraints on leverage: Exposure to the risky asset of more than the initially available funds can be achieved by allowing borrowing. There may be constraints on how much can be borrowed.

Volatility caps: Some CPPI products include mechanisms that allow the percentage exposure to the risky asset to be reduced if its realized volatility exceeds a certain level. A stylized sketch of how such rules can be layered onto the basic CPPI allocation follows.
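In the sketch below, the function name, thresholds, and volatility-scaling rule are all hypothetical choices, not a market standard:

```python
def constrained_exposure(V, F, m, min_frac=0.05, max_frac=1.5,
                         realized_vol=None, vol_cap=0.25, vol_scale=0.5):
    """CPPI exposure with a minimum investment level, a leverage ceiling,
    and a volatility cap layered on top of E = m * cushion."""
    E = m * max(V - F, 0.0)
    E = max(E, min_frac * V)          # minimum risky investment (avoids cash-out)
    E = min(E, max_frac * V)          # leverage ceiling (max_frac > 1 allows borrowing)
    if realized_vol is not None and realized_vol > vol_cap:
        E *= vol_scale                # cut exposure when realized vol breaches the cap
    return E

# Example: portfolio of 100, floor of 92, multiplier 5, realized vol of 30%
print(constrained_exposure(100.0, 92.0, 5.0, realized_vol=0.30))  # 40.0 scaled to 20.0
```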
20.4.5 Structured Products

Structured notes: Many of the issuances of principal-protected products are designed as medium-term structured notes, with maturities ranging from 5 to 15 years. The model offers the flexibility to build in additional features, such as zero-coupon, fixed or contingent coupons, and any variations mentioned above.

Swaps: An alternative idea in the structured products area is a collateralized swap. The client has a leveraged exposure to the risky asset and receives a leveraged return on the performance of the asset with no upfront payment required. In return, the client accrues a financing fee on the leveraged amount, payable periodically or at maturity. The swap normally requires the client to have a pre-agreed credit line with the issuer, or to pledge shares or cash to the issuer as collateral.
20.5 Implications for Financial Market Stability

20.5.1 Amplification of Market Price Movements
The portfolio insurance strategies mentioned in this chapter share a common "buy high and sell low" rule for the risky asset. When the underlying asset markets move in one particular direction, either trending up or down, these actions could conceivably have feedback effects in markets that amplify price developments. If the underlying asset markets are deep and liquid, the feedback effects may be limited. However, if the underlying asset markets are less liquid (for example, CPPI products linked to hedge funds or credit derivatives), even small trades may reinforce market price moves. Moreover, some underlying assets that appear weakly correlated in normal conditions may become more correlated in stress situations, when CPPI products all look to reduce risky positions simultaneously. Taken together, stress situations and the inherently less liquid nature of certain underlying asset markets mean that the feedback effects could have a large impact. Fortunately, according to Pain and Rand (2008), market contacts report that the amount of portfolio insurance-related dynamic hedging flows remains modest relative to the size of the corresponding underlying asset markets, even in the newer asset classes. Moreover, market contacts did not perceive portfolio insurance as a significant factor during the 2007 financial market turmoil, whereas it was perceived as the driving force during the 1987 stock market crash.
20.5.2 Gap Risk
The biggest potential risk of implementing a portfolio insurance strategy is that the payoffs are particularly
vulnerable to sudden huge drops in the risky asset prices before the portfolio can be rebalanced. In such cases, the value of the insured portfolio falls below the floor; however, a hard guarantee, such as 100% principal protection, must still be fulfilled for the investors at the end of the investment horizon. The issuers of CPPI products therefore bear a gap risk. In particular, if the issuers issue several CPPI products written on different underlying assets, and those seemingly uncorrelated assets suddenly become much more correlated in stressed conditions (such as the credit crunch led by the subprime mortgage crisis since September 2007), then the scale of the gap risk may be severely underestimated. The unexpected, huge losses may jeopardize the issuers and the stability of financial markets. In theory, the issuer of CPPI products can hedge such exposure to gap risk with options. The issuer needs to model the likely worst-case move in the risky asset price before the next rebalancing opportunity and build the cost of this implicit option into the premiums and fees charged to the investors, or cover it with capital. But, in practice, the pricing of such options can be quite complex, because the issuer does not know exactly the underlying asset price processes and their correlations in stressed markets. Furthermore, it is difficult for the issuers to find available options through which to hedge their exposures. Pain and Rand (2008) report that some issuers of CPPI products create securities that package up the gap risk and sell these to investors, including private banks and funds. But these structured derivatives are not popular, owing to the complicated nature of the risk. Other issuers seek to limit their exposure to gap risk with other, imprecise hedging vehicles, so-called proxy hedging. This, in turn, exposes the issuers to basis risk.
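The mechanics of gap risk can be made concrete: with multiplier m, a hard guarantee is breached whenever the risky asset falls by more than 1/m before the portfolio can be rebalanced. A minimal Monte Carlo sketch under an assumed jump process (all distributional parameters are purely illustrative):

```python
import numpy as np

def gap_shortfall_prob(m=5.0, n_days=252, n_paths=100_000,
                       mu=0.0002, sigma=0.012, jump_prob=0.002, jump_mean=-0.25):
    """Estimate the probability that a daily-rebalanced CPPI breaches its floor,
    i.e., that some one-day return falls below -1/m (toy jump-diffusion returns)."""
    rng = np.random.default_rng(3)
    rets = rng.normal(mu, sigma, (n_paths, n_days))
    jumps = rng.random((n_paths, n_days)) < jump_prob
    rets += jumps * rng.normal(jump_mean, 0.05, (n_paths, n_days))
    breach = (rets < -1.0 / m).any(axis=1)   # a crash worse than 1/m before rebalancing
    return breach.mean()

print(f"P(floor breached within a year) ~ {gap_shortfall_prob():.2%}")
```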
20.6 Conclusion

A portfolio insurance strategy is a dynamic hedging process that provides the investor with the potential to limit downside risk while allowing participation on the upside, so as to maximize the terminal value of a portfolio over a given investment horizon. This chapter has examined the basic concept of portfolio insurance, the alternative dynamic hedging strategies available to the portfolio manager, the empirical comparison of the performances of various portfolio insurance strategies, the recent market development of portfolio insurance techniques, and, finally, the impact of portfolio insurance on financial market stability.
References

Bertrand, P. and J. L. Prigent. 2001. Portfolio insurance: the extreme value approach to the CPPI method, Working paper, THEMA, University of Cergy.
Bertrand, P. and J. L. Prigent. 2005. "Portfolio insurance strategies: OBPI versus CPPI." Finance 26(1), 5–32.
Black, F. and R. Jones. 1987. "Simplifying portfolio insurance." Journal of Portfolio Management 14, 48–51.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–659.
Cesari, R. and D. Cremonini. 2003. "Benchmarking, portfolio insurance and technical analysis: a Monte Carlo comparison of dynamic strategies of asset allocation." Journal of Economic Dynamics and Control 27, 987–1011.
Do, B. H. 2002. "Relative performance of dynamic portfolio insurance strategies: Australian evidence." Accounting and Finance 42, 279–296.
Do, B. H. and R. W. Faff. 2004. "Do futures-based strategies enhance dynamic portfolio insurance?" The Journal of Futures Markets 24(6), 591–608.
Estep, T. and M. Kritzman. 1988. "TIPP: Insurance without complexity." Journal of Portfolio Management 14(4), 38–42.
Garman, M. B. and S. W. Kohlhagen. 1983. "Foreign currency option values." Journal of International Money and Finance 2, 231–237.
Hakanoglu, E., R. Kopprasch, and E. Roman. 1989. "Constant proportion portfolio insurance for fixed-income investment: a useful variation on CPPI." Journal of Portfolio Management 15(4), 58–66.
Hamidi, B., E. Jurczenko, and B. Maillet. 2007. An extended expected CAViaR approach for CPPI, Working paper, Variances and University of Paris-1.
Herold, U., R. Maurer, and N. Purschaker. 2005. "Total return fixed-income portfolio management: a risk-based dynamic strategy." Journal of Portfolio Management 31, 32–43.
Ho, L. C., J. Cadle, and M. Theobald. 2008. An analysis of risk-based asset allocation and portfolio insurance strategies, Working paper, Central Bank of the Republic of China (Taiwan).
Leland, H. 1985. "Option pricing and replication with transaction costs." Journal of Finance 40, 1283–1301.
Loria, S., T. M. Pham, and A. B. Sim. 1991. "The performance of a stock index futures-based portfolio insurance scheme: Australian evidence." Review of Futures Markets 10(3), 438–457.
Pain, D. and J. Rand. 2008. "Recent developments in portfolio insurance." Bank of England Quarterly Bulletin Q1, 37–46.
Perold, A. R. 1986. Constant proportion portfolio insurance, Working paper, Harvard Business School.
Perold, A. R. and W. F. Sharpe. 1988. "Dynamic strategies for asset allocation." Financial Analysts Journal 44(1), 16–27.
Rendleman, R. J. Jr. and T. J. O'Brien. 1990. "The effects of volatility misestimation on option-replication portfolio insurance." Financial Analysts Journal 46(3), 61–70.
Rubinstein, M. and H. E. Leland. 1981. "Replicating options with positions in stock and cash." Financial Analysts Journal 37, 63–72.
Zhao, Y. and W. T. Ziemba. 2000. "A dynamic asset allocation model with downside risk control." Journal of Risk 3(1), 91–113.
Zhu, Y. and R. C. Kavee. 1988. "Performance of portfolio insurance strategies." Journal of Portfolio Management 14(3), 48–54.
Chapter 21
Security Market Microstructure: The Analysis of a Non-Frictionless Market

Reto Francioni, Sonali Hazarika, Martin Reck, and Robert A. Schwartz
Abstract The Capital Asset Pricing Model describes a frictionless world characterized by infinite liquidity. In contrast, trading in an actual marketplace is replete with costs, blockages, and other impediments. Equity market microstructure focuses on how orders are handled and turned into trades in the non-frictionless environment. For over three decades, the literature has grown while, concurrently, trading systems around the world have been reengineered. After depicting the frictionless CAPM, we consider the development of microstructure analysis, concentrating on issues germane to market architecture. We then consider the design of one facility, Deutsche Börse's electronic platform, Xetra. Important insights were gained from the microstructure literature during Xetra's planning period (1994–1997), and Xetra's implementation marked a huge step forward for Germany's equity markets. Nevertheless, academic research and the design of a real world marketplace remain works in progress.

Keywords Market microstructure · CAPM · Market makers · Liquidity · Price volatility · Price discovery · Trading costs · Electronic markets
21.1 Introduction

Security market microstructure addresses issues that involve the implementation of portfolio (investment) decisions in a marketplace. Implementation entails the placement and handling of orders in a securities market, and their translation into trades and transaction prices. The process
This chapter includes material from Francioni et al. (2008) and from Schwartz (1988), which was reprinted in Schwartz and Francioni (2004).
R.A. Schwartz and S. Hazarika, Zicklin School of Business, Baruch College, CUNY, New York, NY, USA (e-mail: [email protected])

R. Francioni and M. Reck, Deutsche Börse, Frankfurt, Germany
links fundamental information concerning equity valuation (which is of primary concern to portfolio managers) to prices and transaction volumes that are realized in the marketplace. The quality of the link depends on the rules, procedures, and facilities of a securities market, and on the broader regulatory and competitive environment within which the market operates. Widespread interest on the part of the securities industry, government, and academia is testimony to the importance of market microstructure analysis. The subject addresses issues that concern investors, broker/dealer intermediaries, market regulators, exchanges, and other trading venues, as well as the broad economy. Interest in microstructure has increased sharply over the past three and a half decades, spurred in particular by three events: the U.S. Securities and Exchange Commission's (SEC) Institutional Investor Report (1971), the passage by the U.S. Congress of the Securities Acts Amendments of 1975, and the sharp stock market drop on October 19, 1987. Further, the advent of computer-driven trading in recent years has enabled researchers to capture electronically the full record of all trades and quotes, and this has provided empirical researchers with far richer data (referred to as "high frequency data") for analyzing trading and price setting. Over the years, microstructure analysis has expanded and, concomitantly, exchange structure has strengthened. We consider both of these developments in this chapter. First, we set forth the major challenges that the microstructure literature addresses. Second, we consider the properties of a frictionless trading environment. Third, we present a broad view of the direction in which microstructure analysis has been and is evolving. Fourth, we turn to one application – the design of an actual marketplace: Deutsche Börse's electronic trading system, Xetra. The German market was the last of the major European bourses to introduce an electronic trading platform, and it is state of the art, which makes Deutsche Börse a particularly interesting case in point. Fifth, in the concluding section, we consider the bumpy and hazardous road that takes us from theory to the development of an actual marketplace.
21.2 Microstructure's Challenge

Microstructure analysis has four broad applications. First (and this is a key focus of the chapter), it gives guidance to market structure development. The link with market structure is straightforward: the critical factor that drives microstructure analysis is friction in the marketplace (i.e., the explicit and implicit costs of implementing portfolio decisions), and trading costs depend on the architecture of the marketplace, which determines how orders are handled and turned into trades. The flipside of friction is illiquidity, and a primary function of a market center is to amass liquidity. Microstructure's second application is to facilitate the development of trading strategies and algorithms for asset managers and broker/dealer intermediaries. The importance of this application is evident in the current development of computer-driven algorithmic trading. Algorithms can be fine-tuned to take account of, for example, the probability of a limit order executing, time-of-day effects such as market openings and closings, the search for liquidity in a fragmented environment, and the choice of a trading modality (e.g., a continuous limit order book market, a quote-driven dealer market, a periodic call auction, a block trading facility, or hybrid combinations of the above). The third application of microstructure analysis concerns tests of market efficiency. In the 1970s, at a time when the subject was first emerging, the Efficient Markets Hypothesis (EMH) was widely accepted by financial economists as a cornerstone of modern portfolio theory, and it continues to receive broad academic support today. The hypothesis addresses informational as distinct from operational efficiency (the latter refers to the containment of transaction costs by superior market design). According to the EMH, a market is informationally efficient if no participant is able to achieve excess risk-adjusted returns by trading on currently available information. Many of the EMH tests have considered one major part of the information set – market information (e.g., recent quotes, trading volume, and transaction prices). If prices properly reflect all known information, then (in a frictionless market at least) they must change randomly over time; hence the term "random walk." Earlier studies, based on daily data, generally supported the random walk hypothesis. However, with the advent of high frequency data, the footprints of complex correlation patterns have been detected. This observation, along with superior knowledge of the impact of trading costs on returns behavior, is casting a new light on market efficiency. Whether inefficiency is thought of in operational or informational terms, the EMH is not as stellar as it once was. In its fourth application, microstructure analysis sheds light on how new information is incorporated into security prices. In a zero cost, frictionless environment, share values would be continuously and instantaneously updated with the release of new information. In actual markets, however, information must be received and assessed, traders' orders
must be placed and processed, and executions must be delivered and accounts cleared and settled. Costs, both explicit (e.g., commissions) and implicit (e.g., market impact), are incurred throughout this chain of events. Highlighted in much microstructure literature are the costs that some participants incur when, in an asymmetric information environment, other participants receive information first and trade on it to the disadvantage of the uninformed. Asymmetric information is not the only reality, however. In light of the size, complexity, and imprecision of much publicly available information, one might expect that investors in possession of the same (large) information set will form different expectations about future risk and return configurations. This situation is referred to as "divergent expectations."1 Asymmetric information and divergent expectations together reflect a rich set of forces that impact the dynamic behavior of security prices.

1 For a recent discussion, see Davis et al. (2007).

This overview of microstructure's four broad applications underscores that trading frictions are the subject's raison d'être. Participant orders cannot be translated into trades at zero cost (markets are not perfectly liquid), and trades typically are not made at market clearing (i.e., equilibrium) prices. Trading decision rules (algorithms) are needed because the costs of implementing portfolio decisions can sharply lower portfolio performance. In fact, much algorithmic trading is designed to control trading costs, rather than to exploit profitable trading opportunities. Today, trading is recognized as an activity that is both distinct from investing and equivalently professional. Market structure is of concern to the buy-side desks precisely because markets are not perfectly liquid, and neither are they perfectly efficient, either informationally or operationally. Consequently, better market structure can deliver superior portfolio performance for participants. What is the economic service, one might ask, that an equities market provides? The fuzzy link that connects information and prices in the non-frictionless environment underscores two major market functions – price discovery and quantity discovery. Price discovery refers to participants collectively searching for equilibrium prices. Quantity discovery refers to the difficulty that participants who would be willing to trade with each other actually have finding each other and trading when markets are fragmented. This difficulty is accentuated because some participants (primarily institutional investors) do not immediately reveal the total size of their orders (doing so would unduly drive up their market impact costs). Market structure affects both the accuracy of price discovery and the completeness of quantity discovery. The link between market structure and price discovery depends on the environment within which participants are operating. At one end of the spectrum, investors can be equally
informed and form homogeneous expectations based on the information they all possess. At the other end, they can be differentially informed and form divergent expectations with regard to commonly shared information. When investors who share common information all agree on share values (i.e., have homogeneous expectations), prices can be “discovered” in the upstairs offices of research analysts. When investors are not equally informed, and when they form different expectations based on common information, prices must be discovered in the marketplace. In this second environment, the economic service provided by an exchange is clear – it “produces the price.” Regarding quantity discovery, handling the orders of large institutional customers is a challenge. It is not at all uncommon for an institution to want to buy or to sell, for instance, 500,000 shares of a company that has an average daily trading volume of 300,000 shares. Executing an order of this size can easily drive prices away from the trader before the job has been completed. The adverse price move is a market impact cost. Institutions attempt to control their market impact costs by trading patiently and, as much as possible, invisibly. Good market structure can help. To this end, a number of alternative trading systems (ATSs) have been formed in recent years, and dark (i.e., non-transparent) liquidity pools have emerged. With prices discovered in the marketplace, participants employ trading strategies when they come to the market to implement their portfolio decisions. Participants with differential information that will soon become public determine how best to meter their orders into the market so as to move prices to new levels with minimal speed. Additional questions that any trader might ask include: “If I trade now, at the current moment, how will the price that I will receive compare with the average price that shares are trading at today?” “Is price currently at a sustainable, validated level, or is it likely to move higher or lower in the coming hours, minutes, or even seconds?” “Would I do better to be patient and place a limit order, or submit a market order and get the job done right away?” “Should I attempt to trade now in the continuous market, or wait for a closing call?” The orders that a set of participants reveal to the market depend on how questions such as these are answered, and prices that are set and trading volumes that are realized depend on the orders that are revealed. The categories of trading costs that receive the most attention on the part of exchanges, regulators, and academicians are generally those that are the most straightforward to measure: commissions and bid-ask spreads. Increasingly precise measures of market impact are also becoming available, and this cost too is being widely taken into account. On the other hand, the opportunity cost of a missed trade, being far more difficult to quantify, is often overlooked. Also more challenging is quantifying a cost that has received little formal attention: realizing executions at poorly discovered
prices. The problem, of course, is that equilibrium values are not observable and appropriate benchmark values are not easily defined.
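Because benchmark values are elusive, indirect estimators are often used for the implicit costs. A well-known example is Roll's (1984) implicit spread estimator, which infers the effective bid-ask spread from the negative serial covariance that bid-ask bounce induces in transaction-price changes; the estimator is not developed in this chapter, and the sketch below uses simulated trades with purely illustrative parameters:

```python
import numpy as np

def roll_spread(prices):
    """Roll-type estimator: spread = 2 * sqrt(-cov(dp_t, dp_{t-1})),
    defined when the serial covariance of price changes is negative."""
    dp = np.diff(prices)
    cov = np.cov(dp[1:], dp[:-1])[0, 1]
    return 2 * np.sqrt(-cov) if cov < 0 else float("nan")

# Simulated trades: efficient price (random walk) plus bid-ask bounce
rng = np.random.default_rng(5)
m = 100 + np.cumsum(rng.normal(0, 0.02, 10_000))   # efficient price path
q = rng.choice([-1, 1], size=10_000)               # trade direction (+1 buy, -1 sell)
trades = m + q * 0.05                              # half-spread of 5 cents
print(f"estimated spread: {roll_spread(trades):.3f} (true spread 0.100)")
```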
21.3 The Perfectly Liquid Environment of CAPM

Peter Bernstein's (2007) piece in the Journal of Portfolio Management has the intriguing title, "The Surprising Bond Between CAPM and the Meaning of Liquidity." In it he wrote, "The more liquid an asset, the greater the dominance of systematic risk over stock specific risk." We build on this insight in this section. In so doing, we formalize the fact that the Capital Asset Pricing Model (CAPM) describes an extreme case, a totally frictionless world where liquidity is infinite and systematic risk has complete dominance over stock specific risk. The analysis provides a good platform from which to launch a discussion of market microstructure, the study of a non-frictionless environment.

CAPM models the prices of the individual equity shares that, in aggregate, comprise the market portfolio. Following standard methodology, we start our analysis of the frictionless environment by taking the market portfolio to be one single asset (e.g., an all-encompassing exchange traded fund). We consider the demand of an agent to hold shares of this one risky asset when the only alternative is the riskless asset. We show that an individual agent's demand curve to hold shares of the risky asset is downward sloping, and then use this curve to re-derive certain key CAPM equations to show that the associated demand to hold shares of each individual equity issue in that portfolio is infinitely elastic, and that, therefore, the market for the individual shares is infinitely liquid. In the CAPM world, each individual equity issue in the market portfolio has an intrinsic value that is given by the parameter that locates the height (on the price vector) of that infinitely elastic demand. In the section that follows, we turn to the non-frictionless environment of microstructure analysis, where individual stock demand curves are downward sloping, the liquidity of individual shares is therefore finite, and individual shares do not have intrinsic values.

To obtain the representative investor's demand curve to hold shares of the risky market portfolio, first we state the agent's utility (of wealth) function. The demand curve to hold shares of the market portfolio may then be obtained directly from the utility function. The derivation follows Ho et al. (1985). We make the following assumptions:

- The investor's portfolio comprises a risk-free asset and one risky asset (shares of the market portfolio).
- Share price and share holdings are continuous variables.
- Short selling is unrestricted.
- There is a brief trading period, $T_0$ to $T_1$, which is followed by a single investment period, $T_1$ to $T_2$.
- All transactions made during the trading period are settled at point in time $T_1$.
- The investor seeks a portfolio at the beginning of the investment period (at time $T_1$) that will maximize the expected utility of wealth to be realized at the end of the investment period (at time $T_2$).
- Investor expectations with respect to the share price at the end of the investment period (at time $T_2$) are exogenously determined (expectations are independent of the current price of shares).
- Investors are risk averse.

The following variables are used:

$C_0$ = holdings of the risk-free asset at the beginning of the trading period $(T_0)$
$C_1$ = holdings of the risk-free asset at the beginning of the investment period $(T_1)$
$N_0$ = number of shares of the market portfolio held at the beginning of the trading period $(T_0)$
$N_1$ = number of shares of the market portfolio held at the beginning of the investment period $(T_1)$
$R_0 - 1$ = risk-free rate of interest over the trading period
$R_1 - 1$ = risk-free rate of interest over the investment period
$P_1$ = price at which shares of the market portfolio are purchased or sold during the trading period
$P_2$ = price at which shares of the market portfolio can be sold at the end of the investment period $(T_2)$
$r_m = P_2/P_1 - 1$ = return on the market portfolio
$Q$ = number of shares traded by the investor at the beginning of the investment period $(T_1)$; $Q > 0$ indicates a purchase; $Q < 0$ indicates a sale
21.3.1 The Expected Utility of End of Period Wealth

The participant starts the investment period with $C_1$ dollars of the risk-free asset and $N_1$ shares of the market portfolio (the risky asset). Therefore, wealth at $T_2$ is given by $C_1 R_1 + N_1 P_2$. As of $T_1$, this wealth is uncertain because $P_2$ is uncertain. As of $T_1$, the expected utility of end of period wealth can be written as

$$EU(C_1 R_1 + N_1 P_2) \qquad (21.1)$$

The investor starts the trading period with $C_0$ dollars of the risk-free asset and $N_0$ shares of the risky asset. If during the trading period the decision maker were to exchange holdings of the risk-free asset for $Q$ shares of the risky asset at a price of $P_1$, the expected utility of end of period wealth, written as a function of $P$ and $Q$, given $N_0$ and $C_0$, would be

$$h(P_1, Q \mid N_0, C_0) = EU[(C_0 R_0 - Q P_1) R_1 + (N_0 + Q) P_2] \qquad (21.2)$$

where $C_0 R_0 - Q P_1 = C_1$ and $N_0 + Q = N_1$. Equation (21.2) can be rewritten as

$$h(P_1, Q \mid N_0, C_0) = c + g Q (a - bQ - P_1) \qquad (21.3)$$

where

$c = U(\bar W) - \phi N_0^2 U'(\bar W)/R_1$
$g = U'(\bar W) R_1$
$a = [E(P_2) - 2\phi N_0]/R_1$
$b = \phi/R_1$
$\phi = -\tfrac{1}{2}\,[U''(\bar W)/U'(\bar W)]\operatorname{Var}(P_2)$

The step from Equation (21.2) to Equation (21.3) involves a Taylor expansion of the investor's utility around the expected value of wealth, $\bar W$, if the investor does not trade.2 The procedure is a convenient way of introducing the variance term into the utility function.3

2 For a discussion of the Taylor procedure see, for example, R. G. D. Allen, Mathematical Analysis for Economists, London, England: Macmillan, 1960.
3 Two further assumptions are required to obtain Equation (21.3): (1) the third derivative of utility with respect to wealth is small enough to ignore; and (2) the squared deviation of the expected rate of return on the risky asset from the risk-free rate is small enough to ignore.

21.3.2 The Reservation Demand Curve

Equation (21.3) can be further assessed with the use of risk aversion and risk premium measures that are defined in Appendix 21A. Specifically, using Equation (21.3), we now obtain both a reservation price demand curve and an ordinary demand curve to hold shares of the risky asset. We consider the reservation demand curve first. The reservation price for a purchase or a sale is the maximum price the decision maker would be willing to pay to buy a given number of shares $(Q > 0)$, or the minimum price the decision maker would be willing to receive to sell a given number of shares $(Q < 0)$, when the only alternative is not to trade at all. Equation (21.3) shows that, if no trade is made (that is, if $Q = 0$), the decision maker's expected utility is equal to $c$. The reservation price for any value of $Q$ is the price that equates the expected utility $h(P^R, Q \mid N_0, C_0)$ if the trade is made with the expected utility $(c)$ if no trade is made. Thus, the reservation price for any value of $Q$ is given by

$$h(P^R, Q \mid N_0, C_0) = c \qquad (21.4)$$

where $P^R$ is the reservation price associated with the trade of $Q$ shares. Given Equation (21.3), for Equation (21.4) to be satisfied, we must have $a - bQ - P_1 = 0$. Hence the reservation price demand curve is

$$P^R = a - bQ \qquad (21.5)$$
21.3.3 The Ordinary Demand Curve

Using Equation (21.3), we can also obtain the ordinary demand curve. At any value of $P_1$, the decision maker selects the value of $Q$ that maximizes expected utility. Hence, the ordinary price demand curve is given by

$$\frac{\partial h}{\partial Q}(P^0, Q \mid N_0, C_0) = 0 \qquad (21.6)$$

where $P^0$ is the "ordinary" price associated with the trade of $Q$ shares. Differentiating $h$ in Equation (21.3) with respect to $Q$, setting the derivative equal to zero, and rearranging gives

$$P^0 = a - 2bQ \qquad (21.7)$$

21.3.4 The Risk Premium and the Market Price of Risk

The derivation in this section through Sect. 21.3.6 follows Schwartz (1988). When the investor has traded the optimal number of shares of the market portfolio at the market determined price per share, his or her risk premium can be related to the market price of risk. Assessing the ordinary demand curve at $P^0 = P_1$ gives

$$P_1 = \frac{E(P_2) - 2\phi N_1}{R_1} \qquad (21.8)$$

Multiplying by $R_1/P_1$, rearranging, and recognizing that $[E(P_2)/P_1] - 1 = E(r_m)$ and $R_1 - 1 = r_f$, we get

$$\frac{2\phi N_1}{P_1} = E(r_m) - r_f \qquad (21.9)$$

Therefore, we have

$$M\% = E(r_m) - r_f \qquad (21.10)$$

where $M\%$ is the marginal risk premium (see Appendix 21A). Note that the right-hand side is the price of risk. We thus see that the investor achieves an optimal holding of the risky asset by obtaining the number of shares that equates the marginal risk premium with the market price of risk.
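A small numerical sketch of the two demand curves and the optimality condition (all parameter values are hypothetical):

```python
import numpy as np

# Hypothetical inputs for the quadratic approximation in Equation (21.3):
E_P2, var_P2 = 105.0, 25.0     # mean and variance of the end-of-period price
R1 = 1.02                       # one plus the risk-free rate
N0 = 10.0                       # initial share holdings
RA = 0.02                       # absolute risk aversion, -U''(W)/U'(W)

phi = 0.5 * RA * var_P2         # phi = -(1/2)(U''/U') Var(P2)
a = (E_P2 - 2 * phi * N0) / R1
b = phi / R1

print("Q, reservation price a - bQ, ordinary price a - 2bQ")
for q in np.linspace(-5, 5, 5):
    print(f"{q:5.2f}  {a - b * q:8.3f}  {a - 2 * b * q:8.3f}")

# At the optimum (P0 = P1), say after buying Q = 2 shares:
Q = 2.0
P1 = a - 2 * b * Q
N1 = N0 + Q
# In market equilibrium, Equations (21.9)-(21.10) say this equals E(rm) - rf.
print("marginal risk premium M% =", 2 * phi * N1 / P1)
```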
21.3.5 The Investor's Optimal Point on the Capital Market Line

The demand model can be used to assess the investor's optimal point on the capital market line. Let $r_p$ be the return on the combined portfolio ($N_1$ shares of the market portfolio and $C_1$ dollars of the risk-free asset), so that $\operatorname{Var}(r_p) = (N_1/W)^2 \operatorname{Var}(P_2)$. From Appendix Equation (21.7), the marginal risk premium, using $R_A = -U''(\bar W)/U'(\bar W)$, the measure of absolute risk aversion, can be written as

$$\frac{2\phi N_1}{P_1} = \frac{W^2 R_A \operatorname{Var}(r_p)}{N_1 P_1} \qquad (21.11)$$

Because $\sigma_p = (N_1 P_1/W)\,\sigma_m$, we have $\operatorname{Var}(r_p) = \sigma_p (N_1 P_1/W)\,\sigma_m$ and can write Equation (21.11) as

$$\frac{2\phi N_1}{P_1} = R_A\, \sigma_p\, \sigma_m\, W \qquad (21.12)$$

Substituting (21.12) into (21.9) and simplifying gives

$$R_R\, \sigma_p = \frac{E(r_m) - r_f}{\sigma_m} \qquad (21.13)$$
where $R_R\ (= W R_A)$ is the measure of relative risk aversion. Equation (21.13) shows that, for the investor to hold an optimal combined portfolio, the market price of risk per standard deviation of the market portfolio must be equal to the investor's coefficient of relative risk aversion times the standard deviation of the combined portfolio's return. Letting $w = N_1 P_1/W$, substituting $w\sigma_m = \sigma_p$ into Equation (21.13), and rearranging gives

$$w = \frac{E(r_m) - r_f}{\operatorname{Var}(r_m)\, R_R} \qquad (21.14)$$
Equation (21.14) shows that the percentage of wealth that the risk-averse participant invests in the market portfolio is positively related to the expected return $E(r_m)$, and negatively related to $r_f$, $\operatorname{Var}(r_m)$, and $R_R$. Investors all face the same values of $E(r_m)$, $r_f$, and $\operatorname{Var}(r_m)$, but differ according to their degree of risk aversion. More risk-averse investors (larger $R_R$) have smaller optimal values of $w$ and hence are more
apt to lend at the risk-free rate (which implies $w < 1$); less risk-averse investors (smaller $R_R$) have larger optimal values of $w$ and hence are more likely to borrow at the risk-free rate (which implies $w > 1$). The right-hand side of Equation (21.13) is the market price of risk per standard deviation of the market portfolio. The total compensation for risk taking is the price of risk times the standard deviation that the investor accepts (here, the standard deviation of the combined portfolio). Multiplying both sides of Equation (21.13) by $\sigma_p$, we obtain

$$R_R \operatorname{Var}(r_p) = \left[\frac{E(r_m) - r_f}{\sigma_m}\right]\sigma_p \qquad (21.15)$$

Adding $r_f$ to both sides of Equation (21.15) gives the investor's total compensation for waiting and for risk taking:

$$E(r_p) = r_f + R_R \operatorname{Var}(r_p) = r_f + \left[\frac{E(r_m) - r_f}{\sigma_m}\right]\sigma_p \qquad (21.16)$$
Equation (21.16) shows that the location of the investor's optimal point on CAPM's capital market line depends on his or her measure of relative risk aversion $(R_R)$.
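A quick numerical illustration of Equations (21.14) and (21.16), with all input values hypothetical:

```python
# Hypothetical market parameters and a given investor's relative risk aversion
E_rm, rf, sigma_m = 0.08, 0.03, 0.20   # market mean return, risk-free rate, market vol
RR = 2.5                                # relative risk aversion

w = (E_rm - rf) / (sigma_m**2 * RR)     # (21.14): weight in the market portfolio
sigma_p = w * sigma_m                   # combined-portfolio volatility
E_rp = rf + (E_rm - rf) / sigma_m * sigma_p  # (21.16): point on the capital market line

print(f"w = {w:.2f} (w < 1: lender; w > 1: borrower)")
print(f"sigma_p = {sigma_p:.2%}, E(rp) = {E_rp:.2%}")
```

With these inputs the investor holds half of wealth in the market portfolio (w = 0.50) and lends the rest, earning E(rp) = 5.5% at sigma_p = 10%.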
21.3.6 The ith Risky Asset's Point on the Security Market Line

We now assess the demand model to show the location of an ith risky asset on the security market line. In so doing, we establish that the demand for the ith risky asset is infinitely elastic. Equation (21.10) shows that the marginal risk premium for each investor, as a percentage of $P_1$, will equal $E(r_m) - r_f$. Therefore, for each investor,

$$\frac{R_A \operatorname{Var}(P_2)\, N_1}{P_1} = E(r_m) - r_f \qquad (21.17)$$

It follows from Equation (21.17) that investors with lower values of $R_A$ hold a larger number of shares, such that the product $R_A N_1$ is the same for all investors. Because $r_m = (P_2/P_1) - 1$, $\operatorname{Var}(r_m) = \operatorname{Var}(P_2)/P_1^2$. Substituting $\operatorname{Var}(r_m) P_1^2 = \operatorname{Var}(P_2)$ into Equation (21.17) and simplifying gives

$$R_A \operatorname{Var}(r_m)\, P_1 N_1 = E(r_m) - r_f \qquad (21.18)$$

Using $P_1 N_1 = wW$ we obtain

$$w R_R \operatorname{Var}(r_m) = E(r_m) - r_f \qquad (21.19)$$

Equation (21.19) can be interpreted as an equilibrium condition for each investor. Because $w R_R = R_A N_1 P_1$, and given that the product $R_A N_1$ is constant across investors, $w R_R$ is constant across all investors. [It is also clear from Equation (21.19) that the product $w R_R$ must be constant across all investors, because $E(r_m)$, $r_f$, and $\operatorname{Var}(r_m)$ are the same for all.] The equilibrium condition for each investor with respect to the market portfolio implies an equilibrium condition for each investor with respect to any ith risky asset in the market portfolio. The CAPM shows that the relevant measure of risk for the ith risky asset is $\beta_i = \sigma_{im}/\operatorname{Var}(r_m)$. Therefore, writing $\operatorname{Var}(r_m) = \sigma_{im}/\beta_i$, substituting into Equation (21.19), and multiplying both sides by $\beta_i$, we get

$$w R_R \sigma_{im} = \beta_i [E(r_m) - r_f] \qquad (21.20)$$

Adding $r_f$ to both sides of Equation (21.20) gives CAPM's security market line,

$$r_f + w R_R \sigma_{im} = r_f + \beta_i [E(r_m) - r_f] = E(r_i) \qquad (21.21)$$
where $E(r_i)$ is the expected return on the ith stock in the market portfolio. Equation (21.21), assessed at $w = 1$, shows that the expected return for the ith risky asset depends on its covariance with the market return, and on the measure of relative risk aversion for an investor whose optimal combined portfolio contains the market portfolio only. The equation also shows that the ith risky asset's specific location on the security market line depends on the covariance of the asset's return with the return on the market portfolio, and hence that its expected return depends only on $\beta_i$, its systematic risk. It follows from the above discussion that the demand to hold shares of the market portfolio is downward sloping, while the demand for each individual stock in the market portfolio is infinitely elastic. The reason is that perfect substitutes do not exist for the aggregate portfolio, but they do exist for the individual stocks. Only one factor characterizes any ith stock – $\beta_i$, its covariance with the market. But the covariance for any stock can be duplicated exactly by an appropriate combination of two or more other stocks, and all holdings that have the same covariance must yield the same expected return. If they were to yield different expected returns, an unlimited number of shares of the higher yielding position would be bought, and an unlimited number of shares of the lower yielding position would be sold short until, with costless trading, the buying and selling pressures bring the two prices into exact equality. Unlimited buying (selling) at any price lower (higher) than the beta-appropriate CAPM price manifests an infinitely elastic demand to hold shares. That is, at an infinitesimally higher price no shares will be held, and at an infinitesimally lower price demand will be unlimited.
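The duplication argument can be made concrete with a small numerical example: the beta of any target stock can be matched exactly by a two-stock combination (all betas and return values are hypothetical):

```python
# Replicate a target beta with two other stocks: solve
#   x * beta1 + (1 - x) * beta2 = beta_target
beta_target, beta1, beta2 = 1.10, 0.80, 1.60

x = (beta_target - beta2) / (beta1 - beta2)   # weight on stock 1
print(f"weights: {x:.3f} in stock 1, {1 - x:.3f} in stock 2")
print("replicated beta:", x * beta1 + (1 - x) * beta2)

# Under CAPM both positions must carry the same expected return; any gap
# would invite unlimited arbitrage, which is what makes demand infinitely elastic.
E_rm, rf = 0.08, 0.03
print("required E(ri):", rf + beta_target * (E_rm - rf))
```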
Bernstein's (2007) two insights immediately follow: a stock's systematic risk totally dominates its specific risk, and the market for each ith stock is infinitely liquid at the price that translates into $E(r_i)$, its systematic-risk-appropriate return. As we turn to the non-frictionless market, the infinitely liquid, infinitely elastic property of CAPM is a good point of departure from the frictionless world. A common denominator in many microstructure analyses is that the demand to hold shares of individual stocks is downward sloping (which means that shares do not have intrinsic values). Market makers post bid and ask quotes that, when raised, result in more public sales to the market maker and, when lowered, result in more public purchases from the market maker. Bid and ask quotes can be distributed over multiple price points in competitive dealer markets as well as on public limit order books. Trading is not costless. Both explicit costs (e.g., commissions and taxes) and implicit costs (e.g., market impact costs) are incurred. Information is complex and imprecise, and thus investors commonly disagree about its interpretation. Arbitrage is not costless, and perfect substitutes for individual issues do not exist. Share values depend not only on the calculations of systematic risk in the upstairs markets, but also on how orders interact in the marketplace. As a consequence of all of this, the trades that are made, and the transaction prices at which they are made, also depend on the structure of the marketplace. Microstructure analyses address these realities that CAPM does not comprehend.
21.4 What Microstructure Analysis Has to Offer: Personal Reflections

In this section we review the development of microstructure analysis. Our objective is not to provide a comprehensive survey of the literature, but to highlight some of the important themes that have given guidance to market structure development. More detailed information can be obtained from Cohen et al. (1979), who provide an early survey of the field; from O'Hara (1997), who discusses important theoretical microstructure models; from Madhavan (2000), Biais et al. (2005), and Parlour and Seppi (2008), who provide more recent surveys; and from Hasbrouck (2007), who deals with empirical microstructure research and research methodology. We first focus on the early literature, next turn to more recent developments, and finally present our thoughts concerning an important direction in which future microstructure research ought to head.
21.4.1 The Early Focus

The first contributions to the new field in financial economics that came to be called "microstructure" were made by a couple of people who participated in the SEC's Institutional Investor Report (1971). A handful of others independently started to focus on microstructure topics in the early 1970s. Eventually a few of the early researchers came to recognize the commonality of their interests and, applying the title of Garman's (1976) well-known paper, "Market Microstructure," they gave the field its name. Much of the early literature focused on dealers and exchange specialists. These market makers were viewed as the suppliers of immediacy to investors, and the spread was considered the price they charge for providing this service in an environment where order arrival is non-synchronous. Of key importance was the relationship between spreads and the costs of market making. The earlier market maker studies were in large part motivated by a desire to determine whether or not these intermediaries were realizing monopoly profits and, if so, whether or not their profits were attributable to market making being a natural monopoly. Spreads that are greater than the costs of market making would be taken as an indication of monopoly power on the part of the dealers, and spreads that were negatively related to trading volumes would indicate economies of scale in market making, which could imply a natural monopoly (Stigler 1964). Spreads were indeed found to decrease with transactions volume, but reasons other than market making being a natural monopoly were advanced (Smidt 1971; Tinic 1972). The general picture that emerged was that the trading costs incurred by investors could be lowered by strengthening competition between market maker intermediaries. In particular, competition in the NYSE market was deemed inadequate, as specialists and the Exchange itself were viewed as having monopoly positions: each stock was assigned to just one specialist; the NYSE's order consolidation rule (Rule 390) precluded in-house executions by requiring that exchange members send their orders for NYSE-listed securities to an exchange; and commissions were fixed and unjustifiably high (Tinic and West 1980).4

4 Another major issue addressed by the microstructure literature at that time was the impact of information on trading volume and price (Copeland 1976; Beja and Hakansson 1977; Beja and Goldman 1980).

Not surprisingly, the focus on the market maker firms led several researchers to model market maker pricing decisions (i.e., the setting of their bid and ask quotes). These
included Bagehot (1971); Stoll (1978); Amihud and Mendelson (1980); Ho and Stoll (1980, 1981, 1983); and Mildenstein and Schleef (1983). With one exception (Bagehot 1971), the early formulations dealt with inventory considerations. A market maker firm holding an undesirably long position would lower the quotes (i.e., lower the offer so as to sell more shares, and reduce the bid so as to discourage others from selling shares to it). Reciprocally, a market maker who was short would raise the quotes. This response on the part of the public (buy more shares when the market maker's offer is lower, and sell more shares when the market maker's bid is higher) is evidence that the public's demand to hold shares of any specific stock was taken to be downward sloping. A variety of mathematical tools were used to solve for optimal market maker quotes. These models also gave further insight into the cost components of the market maker's spread (Stoll 1989).

While insightful, the early inventory-based pricing models suffered from some shortcomings. First, the early formulations for the most part assumed monopoly market makers, although some of these models were applied to markets such as the New York Stock Exchange where exchange specialists were in fact competing with other floor brokers and customer limit orders (Demsetz 1968). The application of theory further suffered from the reality that the price of immediacy for an investor is not the spread of an individual market maker, or even an average market maker spread, but the inside spread (i.e., the lowest ask across all market makers minus the highest bid).5 It is important to note that dealer spreads could individually remain relatively invariant with respect to transaction volume while the inside spread fell appreciably.

5 For further discussion, see Cohen et al. (1979).

A further shortcoming of most of these earlier models is that they did not take account of a major cost incurred by market makers: the losses generated by trading with better informed investors. Recognition of this reality (which is also outside the scope of the frictionless world of CAPM) led to a development that did much to establish microstructure as an important new field in financial economics – the introduction of market maker models that were based, not on inventory management, but on controlling the cost incurred when some investors are in possession of information that the market maker and other investors have not yet received. Bagehot (1971) was the first to embark on this line of thought. He was later followed by, among others, Glosten and Milgrom (1985) and Kyle (1985). With information asymmetries, the market maker always loses when trading with a better informed participant. For microstructure theorists at the time, this meant that, for the dealer market not to fail, some investors must trade for
reasons that are not related to information.6 Liquidity considerations (i.e., an investor's personal cash flow needs) were one such motive for public buying and selling. A third participant type was also introduced along with the liquidity traders – noise traders (participants who trade on price moves as if they contain information when in fact they do not). This trio of informed traders, liquidity traders, and noise traders was used to show how markets could function and, in so doing, enable new information to be incorporated into security prices (Grossman and Stiglitz 1980; Milgrom and Stokey 1982; Kyle 1985; Glosten and Milgrom 1985; Copeland and Galai 1983; Easley and O'Hara 1987, 1991, 1992). At this stage in its early development, the microstructure pricing models were predominantly market maker models. One exception should be noted, however: a National Book System proposed by Mendelson et al. (1979) contained a comprehensive description of an order driven automated trading system that provided guidance for designing the first exchange-based electronic trading systems. For a more recent discussion of automated trading systems, see Domowitz and Steil (1999). Most equity markets around the globe are now order driven, limit order book markets that might include market makers in a hybrid structure (as does the NYSE), but are not basically quote driven (i.e., dealer) markets (as were the old Nasdaq and the London Stock Exchange). The limit order book markets are driven by the orders placed by the investors themselves, not by market maker intermediaries.
21.4.2 The Current Focus

Over the years, microstructure analysis has grown extensively on both the theoretical and empirical fronts. Concomitantly, the securities markets themselves have evolved, becoming ever more technologically developed, more global in outreach, but also more fragmented between different trading facilities. One important new direction microstructure research has taken is to further model the order driven market, an environment where natural buyers and sellers provide immediacy to each other because some, who are patient, are willing to post limit orders while others, who demand immediacy, choose to submit market orders that execute against the posted limit orders. Understanding the costs of, and motives for, placing limit orders as distinct from market orders was called for.
6 A market supported by informational trading only can indeed function if agents trade with each other because their expectations are divergent. When the information that triggers trading is common knowledge, the condition may be thought of as one where agents are agreeing to disagree.
With limit orders, the very existence of the bid-ask spread has to be explained. That is, with a sufficiently large number of participants placing priced orders, one might expect that orders would be posted at virtually every available price point in the neighborhood of equilibrium, and that the spread would disappear. Cohen et al. made this point in their review paper (Cohen et al. 1979), and they analyzed the existence of the spread in Cohen et al. (1981).7 They further wrote, "With regard to modeling the market spread, we suggest that a straightforward aggregation from individual spreads is not possible in a system where there is no clear distinction between demanders and suppliers of immediacy, and where traders meet in a dynamic, interactive environment that incorporates the impact of investor order placement strategies." Strategic order placement clearly required further analysis. The task, however, was not simple. Some of the first papers in this area assumed, as is true for a dealer market, that limit order and market order participants are two separate, exogenously fixed groups that are separated by a firewall (Glosten 1994). This assumption, while simplifying mathematical modeling, unfortunately distills out much of the richness of an order driven market. More recent models have eliminated the firewall (Handa and Schwartz 1996a; Foucault 1999; Parlour 1998; Handa et al. 2003; Foucault et al. 2005; Goettler et al. 2005). With the choice between limit order and market order endogenous, for any market to function, participants must divide naturally into four groups that reflect two dichotomies (one between buyers and sellers, and the other between limit order and market order placers), not the standard two (buyers and sellers). With order type selection endogenous in the order driven market, the balance between immediacy demanders and immediacy suppliers becomes a second equilibrium that must be understood. That is, one needs to recognize the conditions under which some participants will choose to be liquidity demanders (place market orders) while others choose to be liquidity suppliers (place limit orders). If a reasonable balance is not achieved between these two groups, the order driven market will fail (as indeed it does for thinner, small-cap stocks). These issues have increasingly been addressed, and some sophisticated limit order models have been developed.8

7 Cohen et al. (1981) describe the tradeoff between execution probability and price improvement in the optimal choice between limit and market orders.
8 See Back and Baruch (2007) for a recent discussion and further references.
9 See Economides and Schwartz (1995) for a description of alternative call market structures.

Microstructure analysis of trading systems has expanded to include periodic call auctions.9 The economics of a call auction are quite different from those of continuous trading and, consequently, so too are the order placement strategies
that participants should employ when they approach a call market. Call auctions do not, by their very nature, supply immediacy. Rather, orders that are entered during a call’s bookbuilding phase are held for a periodic crossing at a single clearing price at the (generally predetermined) time of the market call. Consequently, buy and sell orders submitted to a call do not execute when they arrive even if they match or cross in price (matching and crossing orders execute immediately in a continuous trading environment). This being the case, limit and market orders have a different meaning in a call: limit orders do not supply immediacy to market orders, and market orders are simply extremely aggressively priced limit orders (i.e., a market order to sell in a call effectively has a limit price of zero, and a market order to buy effectively has a limit price of infinity). Today, virtually all modern, electronic exchanges open and close their continuous markets with call auctions. Consequently, participants face further decisions when operating in a call plus continuous, hybrid market: how to submit an order to a call auction that is followed by continuous trading (e.g., an opening call), and how to submit an order to a continuous trading environment that is followed by a call auction (e.g., a closing call). Taking these tactical decisions into account is part of the complexity of microstructure analysis. Technological developments have simultaneously enabled new trading venues to emerge (which can fragment markets) while providing connectivity between them (which can consolidate markets). Concurrently, regulatory initiatives have been motivated by the desire to intensify inter-market competition. Questions can be raised, however, concerning fragmentation of the order flow. The conventional wisdom has been that the consolidation of order flow improves liquidity, and exposing each order to all other displayed orders gives investors the best prices for their trades. Consolidating trading in a single market provides incentives to liquidity suppliers to compete aggressively for market orders by revealing their trading interest, and by being the first to establish a more favorable price (if time is used as a secondary priority rule). On the other hand, arguments in favor of trading on multiple markets include the benefits of inter-market competition, and the fact that traders with disparate motives for trading may want different marketplaces to trade in (i.e., the “one size does not fit all” argument). And so, different markets develop to serve diverse investor needs (such as achieving a faster execution versus obtaining a better price). One growing need among large institutional investors, the ability to trade large orders with minimal market impact, has led to the advent of dark pool, block trading facilities such as Liquidnet, Pipeline, and ITG’s Posit that aid in quantity discovery. This development in the industry has spawned a related line of research on off-exchange and upstairs trading (Seppi 1990; Grossman 1992; Keim and Madhavan 1996; Madhavan and Cheng 1997).
A spectrum of market quality issues has been of long and continuing importance to microstructure researchers. These include market transparency,10 both pre- and post-trade (Porter and Weaver 1998), the accentuation of intraday price volatility, and correlation patterns that have been observed in high frequency data (Engle and Granger 1987). Other important issues include price clustering and tick sizes (Harris 1991, 1994; Angel 1997). Applications such as transaction cost analysis (TCA) and algorithmic trading have received increasing attention (Domowitz et al. 2001). The relative performance of floor-based versus electronic trading is another important issue (Domowitz and Steil 1999).

A major line of empirical research was pioneered by Hasbrouck (1993), who decomposes transaction prices into two components: a random walk component and a stationary component. The random-walk component is identified with an efficient price that the market is trying to discover. The stationary component is viewed as microstructure noise. Microstructure noise is commonly explained by features such as the bid-ask spread, market impact, and the discreteness of the pricing grid. The noise component has also been attributed to price discovery itself being a dynamic process (Menkveld et al. 2007; Paroush et al. 2008).11

Numerous empirical studies have focused on two of the world’s premier markets, the New York Stock Exchange and NASDAQ (Hasbrouck 1991, 1995; Hasbrouck and Sofianos 1993; Christie and Schultz 1994; Christie et al. 1994; Bessembinder and Kaufman 1997a, b; Bessembinder 1999, 2003; Barclay et al. 2003, among others). Many other studies have considered European markets, Asian markets, and other markets around the world (e.g., Biais et al. 1995; Sandas 2001; Ozenbas et al. 2002).12 Across all of these markets, structural and performance differences have been noted, but major similarities have also been observed. It is apparent that, despite the influence of historic and cultural considerations, trader behavior and market performance around the globe depend largely on microstructure realities. Alternatively stated, trading rooms and markets around the world bear striking resemblances to one another.

10 Trading systems differ in their degree of transparency. Pagano and Röell (1996) investigate whether greater transparency enhances market liquidity by reducing the opportunities for taking advantage of uninformed participants.
11 Also see Hasbrouck (1995), Harvey and Huang (1991) and Jones et al. (1994). Further references are provided by Menkveld et al. (2007).
12 Also see Bessler (2006) for discussion and further references.

Another recent line of research has considered how search costs affect bid-ask spreads in financial markets. To this end, Duffie et al. (2005) present a dynamic model of market makers under the assumption of no inventory risk and symmetrically distributed information. They show that sophisticated investors who have better search and bargaining abilities face tighter bid-ask spreads. This is in contrast to traditional information-based models, which imply that spreads are wider for more sophisticated (i.e., better informed) investors.

As we have noted, unlike in the frictionless market arena of CAPM, amassing liquidity is a primary function of a marketplace, and market structure features are generally designed with liquidity implications in mind. Asset managers also take liquidity into account, along with the two other standard variables of modern portfolio theory, risk and return. Difficulties in defining, measuring, and modeling liquidity are formidable, however, and the literature that deals with it directly is relatively sparse (Bernstein 1987). Nevertheless, liquidity considerations have permeated the microstructure literature, both explicitly and implicitly.13

Looking back over the development of microstructure analysis, two observations stand out. First, microstructure studies have in multiple ways given direction to market structure development. Second, to a remarkable extent, the various theoretical microstructure models that are center stage today, and many empirical analyses that are based upon them, share a common structural framework – the asymmetric information paradigm. This consistency is desirable in that it implies that the field has grown by accretion rather than by replacement. Consequently, new insights are more apt to refine than to contradict old conclusions. Consistency, however, is not desirable if the common structural framework becomes overly rigid and restrictive, and if it yields incomplete and/or misleading answers to questions involving trader behavior, market structure, and regulatory policy. At times, a literature starts to advance along new fronts. We consider this possibility next for the microstructure literature.
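Before turning to future directions, the Hasbrouck-style decomposition discussed above can be illustrated with a small simulation (ours, for illustration only). Observed trade prices are modeled as a random-walk efficient price plus bid-ask bounce; the bounce induces negative first-order autocovariance in price changes, from which the classic Roll-type spread estimate, spread = 2·sqrt(−autocovariance), can be recovered:

```python
import random

random.seed(42)
n, spread = 100_000, 0.10
m, changes, prev = 100.0, [], None
for _ in range(n):
    m += random.gauss(0.0, 0.05)                  # efficient price: a random walk
    p = m + random.choice([1, -1]) * spread / 2   # observed price = walk + bounce
    if prev is not None:
        changes.append(p - prev)
    prev = p

mean = sum(changes) / len(changes)
autocov = sum((changes[i] - mean) * (changes[i + 1] - mean)
              for i in range(len(changes) - 1)) / (len(changes) - 1)
roll = 2 * (-autocov) ** 0.5 if autocov < 0 else float("nan")
print(f"autocovariance of price changes: {autocov:.6f}")  # negative: the noise signature
print(f"Roll-type spread estimate: {roll:.3f}  (true spread: {spread})")
```

The stationary bounce component never accumulates, so it leaves the efficient price untouched while contaminating every observed transaction price – the essence of microstructure noise.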
21.4.3 Future Directions

As we have noted, the current focus in the literature is on asymmetric information-based models, which are characterized as follows. Trading is driven by informational change, liquidity needs, and noise trading. The information motive for trading is the first mover of the three (liquidity and noise trading are required so that a market will not fail). Further, order arrival in the continuous environment is generally taken to be asynchronous. For a continuous trading regime to function with asynchronous order arrival, the presence of a limit order book and/or a market maker intermediary is required.
13 For further discussion and references regarding liquidity see Amihud and Mendelson (1986), Chordia et al. (2000, 2008), Hasbrouck and Seppi (2001), Amihud (2002) and Pástor and Stambaugh (2003).
Information trading is of keen interest because it represents the process by which new information is reflected in share values. In the standard asymmetric information models, it is assumed that all participants in possession of the same information form equivalent expectations concerning future risk and return configurations. When information changes, however, participants may not all receive the news at the same time; some receive it before others, a reality that, at any point in time, can divide traders into two groups – the informed and the uninformed. Informed participants will never trade with each other; consequently, liquidity and noise traders must be present for a market to function. As noted, asymmetry of information, for the most part, lies at the heart of the standard microstructure models of today.

The homogeneous expectations assumption has been tempered of late. As a further departure from the infinitely liquid, zero cost environment of CAPM, it is being recognized that some participants produce “private information” (namely, that they further process information so as to gain insights that are not immediately available to others). Whether participant expectations differ because of the actual production of private information, or simply because different people interpret the same information or news announcement differently, the expectations of a group of investors can be divergent.

Also at the heart of the asymmetric information models is the presumption that a stock has a fundamental value that bears a unique relationship, not to trader activity in the marketplace, but to the fundamental information that informed traders possess. The process of information being fully reflected in prices under asymmetric information is the act of informed and uninformed agents trading with each other until any discrepancy between a market price and a fundamental value is eliminated. The process can be viewed as arbitrage. In the earlier dealer models, the market maker was assumed to know a stock’s fundamental value. In later models, informed traders, but not the market maker, know the fundamental value (Kyle 1985). Especially in the later models, price discovery is not instantaneous; rather, it is a protracted process that depends on the individual strategies employed by the informed and uninformed agents.

In recent years, an alternative paradigm has been emerging: a divergent expectations environment (Miller 1977). While institutionally realistic, this paradigm has met with considerable academic resistance. For one thing, homogeneous expectations environments are far easier to deal with mathematically, and homogeneity has, in many applications, proven to be a useful modeling assumption. The assumption has also been retained for another reason. As an attribute of individual rationality, it is presumed that intelligent agents facing the same information and applying the same
(correct) analytic techniques will reach the same conclusions and, therefore, will have homogeneous expectations. Fundamental information, however, is enormous in scope. It is complex and imprecise, and our tools for analyzing it are relatively crude. In the presence of fuzzy information, expectations can be divergent.

Allowing for divergent expectations opens another path for microstructure analysis, and it introduces new questions concerning agent behavior, market structure, and regulatory policy. Moreover, a further element can enter the analysis in a divergent expectations environment: along with forming their own opinions, agents may also respond to the opinions of others; that is, they may exhibit adaptive valuation behavior (Paroush et al. 2008; Davis et al. 2007).14 Just how agents communicate with each other and respond to each other’s opinions is a subject for ongoing research. The topic also opens another interface with behavioral finance.

14 Adaptive valuation behavior refers to individual agents becoming more bullish (bearish) when learning of the relatively bullish (bearish) attitudes of others.

Price discovery acquires a different meaning in a divergent expectations environment, and this has important implications for market structure. When asymmetric information characterizes a community of investors, the strategic behavior of informed agents can affect the path that price takes when news moves a share value from one equilibrium to another, but the new equilibrium is path independent. With divergent expectations, the new equilibrium is path dependent – it depends on how the opinions of a diverse set of agents are integrated (Paroush et al. 2008). Alternatively stated, with divergent expectations, price discovery is a coordination process and, as such, is directly affected by market structure.

In the standard asymmetric information environment, the key dichotomy is between informed and uninformed participants. But a second dichotomy also exists – one that separates large institutional customers from small retail customers. One might expect that the informed investor set would largely comprise the institutional customers. After all, the institutions are professional; they can afford to continuously monitor information and respond to news, and their very size (all else constant) reduces their per share cost of doing so. With divergent expectations, however, there is no presumption that institutional customers can, because of their size, consistently evaluate shares more accurately. On the contrary, institutions commonly disagree with each other and, as a consequence, commonly trade with each other. In the divergent expectations environment, institutional investors do not necessarily have an advantage over retail customers as fundamental analysts. In fact, their size makes trading more difficult, and they incur higher transaction costs. So what accounts for their popularity? The value added by the mutual funds, pension funds, and so forth comes largely from their ability to facilitate diversification. Further, they can bring a systematic, professional, and disciplined approach to portfolio management (Davis et al. 2007).
21.5 From Theory to Application

Microstructure analysis is inherently involved with analyzing the detailed functioning of a marketplace. The literature has a strong theoretical component and, to a large extent, is structured to yield insights into the effect of market design (structure and regulation) on market performance. Hopefully, theory can provide a broad roadmap for real world market architects to follow. In this section we provide a broad overview of major technology and regulatory changes that have taken place in the United States and Europe.15

15 Further discussion of market structure development is provided by Harris (2003).

21.5.1 Technological Developments

Two exogenous forces have driven market structure change: technology and regulation. Regarding technology, the first big step was taken in 1971 in the U.S. when the National Association of Securities Dealers (NASD) introduced an electronic automated quotation (AQ) display system called NASDAQ. The Toronto Stock Exchange was the first exchange to introduce an electronic order driven platform – its Computer Assisted Trading System (CATS) in 1977. Following in Toronto’s footsteps, London instituted SEAQ in 1986, Paris rolled out its Cotation Assistée en Continu (CAC) in 1986, and Deutsche Börse’s Xetra came to life in 1997. Also in 1997, the London Stock Exchange introduced its Stock Exchange Trading System (SETS) limit order platform. By the end of the twentieth century most of the exchanges in Europe had converted to electronic limit order book platforms.

Change came more slowly in the U.S. Instinet introduced an electronic platform in 1969. Nearly 30 years later, Instinet became known as an Electronic Communications Network (ECN). In short order, a slew of other ECNs emerged, led most prominently by Archipelago and Island. In 2002, NASDAQ implemented its own electronic platform, which, at the time, was called “SuperMontage.” Most recently, in the spring of 2006, the newly privatized NYSE Group initiated its Hybrid Market, a facility that has transformed the Big Board from a floor-based “slow” market into a hybrid that includes a “fast market” electronic venue. As of this writing, the floor-based component of the NYSE’s hybrid has been markedly reduced in importance. Several specialist firms have ceased operations, other floor brokers have departed, and the trading room areas have collapsed from five to two.

21.5.2 Regulatory Initiatives

Major regulatory initiatives have played an important role in jump-starting these market structure changes. The 1975 Congressional Securities Acts Amendments were the first sizable regulatory foray into market structure development. The Amendments precluded the fixing of commission rates and mandated the development of a National Market System (NMS). In 1997, the U.S. Securities and Exchange Commission instituted its new Order Handling Rules (OHRs), which require that market makers holding customer limit orders display those orders in their quotes, and that dealers at least match any quotes that they themselves display on an ECN (either by bettering the quotes that they offer customers or by posting their superior quotes in NASDAQ’s SuperMontage).

Following the OHRs, four other regulatory initiatives were introduced in the U.S. in relatively fast succession. In 1998, the SEC’s release, Regulation of Exchanges and Alternative Trading Systems (Reg ATS), set forth rules that allow ATSs the alternative of registering as national exchanges or as broker-dealers. In 2000, the NYSE, under pressure from the SEC, rescinded its order consolidation rule (Rule 390). In 2001, the U.S. markets completed the transition from fractional to decimal pricing, which resulted in the minimum tick size decreasing from 1/16 of a dollar, or 6.25 cents (it had earlier been 1/8 of a dollar, or 12.5 cents), to one cent. In 2005, the SEC adopted Regulation NMS, the key provision of which is that better-priced limit orders cannot be traded through (the trade-through rule was fully implemented in 2007).

On the eastern side of the Atlantic, the first major regulatory initiative was taken in 1993 when the Investment Services Directive opened the door for cross-border trading by introducing the single European passport. As discussed in Schwartz and Francioni (2004), “Passporting defines a system of mutual acceptance of other EU countries’ rules without truly harmonizing all of the details of the various rules.” Major regulatory change has come again to the European arena in the form of the Markets in Financial Instruments Directive (MiFID). Key provisions in MiFID include a best execution requirement (echoes of the 1975 U.S. Securities Acts Amendments), a quote disclosure requirement for upstairs broker/dealers (echoes of the U.S. Order Handling Rules),
and the disallowance of order focusing rules (echoes of the U.S. SEC pressuring the withdrawal of NYSE Rule 390). A major regulatory difference is that no trade-through rule has been imposed on the European markets (unlike under the U.S. SEC’s Reg NMS).
21.6 Deutsche Börse: The Emergence of a Modern, Electronic Market

We turn in this section to the designing of an actual marketplace. Our focus is on Deutsche Börse: it is the dominant stock exchange in Germany, the last of the major European bourses to go electronic, and its technology is state of the art. Important insights were gained from the microstructure literature during Xetra’s planning period, and the system’s implementation has marked a huge step forward for Germany’s equity markets. But our roadmap, which is undoubtedly incomplete today, was even more limited in the 1994–1997 years when Xetra was being designed. And, there is always the danger that the cartographer whose map is being used has some misconceptions (e.g., believes in the existence of the Northwest Passage).
21.6.1 The German Equities Market in the Mid-1990s

As recently as the mid-1990s, the German market had major structural defects that would undermine its competitiveness in the European arena. In recognition of this, Deutsche Börse, the newly founded exchange operator of the Frankfurter Wertpapierbörse (FWB), became the leading force for change.16

16 FWB also owned the futures and options exchange Deutsche Terminbörse (DTB). After the 1997 merger with SOFFEX, DTB became Eurex.

In the mid-1990s, Frankfurt’s trading floor was the major marketplace for German stocks, but the German market was badly fragmented. Kursmaklers, the equivalent of specialists, concentrated much of the liquidity in their order books. A primitive (by today’s standards) electronic trading system, IBIS (which was owned by FWB), operated in parallel with the floor trading. IBIS’s central component was an open limit order book that had hit-and-take functionality, but did not match orders automatically. The electronic system captured about 40% of the trading volume in the 30 large-cap DAX stocks, but no link existed between IBIS and the floor. Seven other floor-based regional exchanges were also operating in Germany with technical infrastructures that were similar to
those in Frankfurt. In total, the regionals at that time were attracting roughly 10% of German exchange-based trading volume. Moreover, off-board trading has been (and still is) prevalent in Germany (Davis et al. 2006).

Transparency for floor trading (pre-trade transparency in particular) was low. Quotes were not distributed publicly (they were available on the floor only). Price priority between different trading venues was not enforced, and orders executed in one market commonly traded through orders waiting to be executed in another market. Market manipulation and other abuses of power and position were believed to be rife on the old Frankfurt floor. Given the appreciable market fragmentation, poor transparency, imperfect inter-market linkages, and dubious floor behavior, transaction costs were high. Changes, both structural and regulatory, were called for. The result was the development of Xetra, an electronic order-driven trading system that comprises two principal modalities – a continuous order book platform and periodic single-price call auctions.17
21.6.2 Designing a New Trading System

Xetra’s development started in 1994, and the system was launched in 1997.18 Strong external forces also motivated this reengineering of Deutsche Börse’s market structure: regulatory reform, soaring trading volumes, pan-European harmonization of the exchange industry, vibrant cross-border competition for order flow, and the rising concerns of market participants about the future performance of Germany’s financial markets.

Throughout Xetra’s design stage, microstructure theory, even as it existed at the time, was an indispensable guide. This new field in financial economics, with its origin in issues concerning the competitive and architectural structure of an equity market, should have been able to give guidance to the development of an actual marketplace such as Xetra. To an extent, it has fulfilled its promise. The literature gave Deutsche Börse a broad roadmap, and it has highlighted underlying relationships and other important considerations that a market architect should be aware of.

Building the Xetra model involved specifying principles that the new market should implement, and the system’s functionality also had to be defined. Most importantly, the new market system was to provide equal and decentralized access to all of its participants. Further, the system’s functionality and the market information delivered to users (both pre- and post-trade) were to be the same for all traders.
17 For further discussion and descriptions, see Francioni et al. (2008).
18 Appendix 21B provides details of Xetra’s design.
A trader’s location should not matter. With this in mind, Deutsche Börse’s fundamental architectural decision was to structure a hybrid market that included two major modalities – a continuous electronic order driven platform, and periodic call auctions that were used primarily for market openings and closings.19

19 Interestingly, the microstructure literature on call auctions was relatively sparse at that time. For an early discussion, see Handa and Schwartz (1996b).

An absolutely critical attribute of an order driven trading system is its ability, vis-à-vis its competitors, to win the battle for liquidity. Regarding this matter, the earlier microstructure literature has given some guidance, but liquidity is a complex attribute to deal with. As it is not easy to define and measure, liquidity has been very difficult to model and assess. However, as noted above, the measurement and analysis of liquidity are currently attracting considerably more attention in the microstructure literature.

Price discovery and transparency are two other issues for which the microstructure literature has provided valuable guidance. The architects at Deutsche Börse recognized that price discovery is a primary function of a market center, and their major reason for introducing the call auctions was to sharpen its accuracy, particularly at market openings and closings. Understanding that transparency is important while recognizing that it should not be excessive, the decision was made to disclose only the indicative clearing price (not the full book of orders) in the pre-call, bookbuilding period.

Microstructure literature has given insights into the operations of the public limit order book for continuous trading. At the time, recognition was also emerging of periodic call auctions, a modality that was clearly differentiated from, but could effectively be used with, the continuous market. With regard to continuous trading, microstructure analyses of the use of limit and market orders and of the interaction between these two order types proved to be most valuable. However, a deeper understanding of the economics of an order driven market now exists than was the case in the 1994–1997 period when Xetra was being designed.

Another important contribution of microstructure theory has been the classification of traders according to their needs for immediacy and their propensities to be either givers or takers of liquidity. The differentiation between informed and uninformed traders also proved to be valuable, particularly with respect to the market maker role that has been incorporated into Xetra. Specifically, market makers, referred to as “designated sponsors,” were included to bolster liquidity provision for smaller cap stocks. A balance had to be achieved between the obligations imposed on the designated sponsors and the privileges granted to them. To accomplish this, information had to be assessed concerning the role of dealers in general (e.g., NASDAQ-type market makers) and specialists in particular (e.g., NYSE-type specialists). That balance defined the designated sponsors’ role in Xetra, and secured their willingness to accept it. Market microstructure insights also yielded the understanding needed to transform the specialist role into the newly designed designated sponsor role. Interestingly, the specialist role at the NYSE has now been changed to better harmonize with the Exchange’s electronic platform, and these intermediaries are now referred to as “designated market makers.”

But designing an automated trading system is indeed a complex task, and the gap between theory and implementation is both large and intricate. Trading decisions can be made in a large variety of ways that run the gamut from humans interacting directly with humans without computers, to humans trading via electronic order handling and execution systems, to computers making trading decisions that are sent electronically to a computerized market (e.g., computer-driven algorithmic trading). Since the mid-1990s, market structure development has involved mainly the design of an electronic trading facility. Deutsche Börse took account of the fact that automation impacts both the way in which trading decisions are made, and the process by which prices are determined and trades executed in a market center.

An electronic market requires the specification of an array of critical features (e.g., the trading modalities employed, rules of price and quantity determination, and basic features such as order types and trading parameters). With an electronic market, the software that implements a desired market structure must be specified on a level of detail that far exceeds what is required for human-intermediated trading. For instance, a human agent (specialist) has historically handled price determination at NYSE openings. This function is performed with reference to various rules, but the specialist is also free to exercise reasonable judgment. Further, human-to-human interactions can evolve naturally as problems, opportunities, and new competitive pressures arise. In contrast, with a fully electronic opening, every possible condition that can occur must be recognized and a rule for dealing with it specified, and electronic interaction can be changed only by rewriting the code that specifies with step-by-step precision just how orders are handled and turned into trades and transaction prices.

How does one achieve the precise specifications that a computerized trading system must have? In 1994, the market architects at Deutsche Börse could study the operations of other electronic platforms (e.g., CATS in Toronto and CAC in Paris). Doing so was helpful but of limited value given that Deutsche Börse was looking to develop a distinctive system. When moving into new territory, market structure development is a venture. How does one know in advance whether or not it will work? How can one determine whether or not the new system will be viable from a business perspective?
Nevertheless, design decisions have to be made, technical requirements must be specified, and the system must be built. The decisions involved represent huge financial bets on whether or not a new market structure will attract sufficient liquidity. Prototyping a new market in the design phase helps the assessment process, but doing so was considerably more difficult in 1994 than it is today with the advent of superior information technology and testing capabilities. In 1994, the architects were forced to rely more on their own educated judgment and on any insights they might gain from microstructure research.

Those who are involved in the design of an actual market realize that the devil is in the details. Consider, for instance, the specification of a call auction. A call has excellent theoretical properties, but how should an actual auction be designed? It is straightforward to say that the market clearing price in a call auction should be the value that maximizes the number of shares that trade. But what should the specific rule be for selecting the clearing price if two prices both result in the same maximum trade size? Additionally, how transparent should the book be in the pre-call, order entry period? Are further design features needed to counter the possibility of gaming? And so on.

Other considerations that for the most part are outside the scope of the microstructure literature also came into play during the design of Xetra. Information technology issues such as scalability, open architecture, and system reliability are of critical importance. So too are procedures for post-trade clearing and settlement. One of the final steps in the structural design of the new German market was the introduction in 2003 of a central counterparty (with a CCP, counterparty risk management was centralized and trading became fully anonymous, both pre- and post-trade). Electronic trading is also a prerequisite for highly efficient straight-through processing (STP involves all stages of a trade’s life cycle). Information technology has further facilitated the timely capture of market data (all trades, quotes, market index values, etc.) and has expedited its delivery to users. With regard to these diverse applications, Deutsche Börse has achieved a closer integration between trading on Xetra and the broader market infrastructure.
21.7 Conclusion: The Roadmap and the Road

A market architect must have a roadmap that, broadly speaking, says where one ought to head and roughly how to get there. To this end, the microstructure literature has added clarity, articulation, and intellectual support. Briefly stated, the objective is to reduce trading frictions (costs), sharpen price discovery, and facilitate quantity discovery. The means of achieving this broad objective involve the amassing of
liquidity. This is done through the appropriate use of limit order books for both continuous and call auction trading and, where appropriate, the inclusion of broker/dealer intermediaries. Further insights are gained from microstructure’s in-depth analyses of trading motives (new information, liquidity needs, and technical trading signals). The literature has also provided guidance with regard to issues such as transparency and the consolidation (fragmentation) of order flow.

But theory, even if it does provide a good roadmap, can take one only so far. The closer one gets to the design of an actual system, the more apparent the complexities of trading and trading systems become. The road actually traveled is indeed bumpy and hazardous. System designers know that “the devil is in the details.” They have to grapple with issues ranging from scalability, reliability, and other IT requirements, to business considerations concerning the ultimate profitability of a trading venue. The market architects at Deutsche Börse recognized these issues, and their new system, Xetra, has marked a huge step forward for the German equity market.

Today, important problems persist with regard to market design in Germany (and in all other markets around the world). Two fundamental questions concerning market architecture have yet to be adequately answered: (1) What is the best way to deal with large, institutional orders? (2) How is liquidity creation best handled for mid-cap and small-cap stocks? At the same time, important microstructure topics continue to emerge at the academic research desks. Are there limits beyond which microstructure theory cannot provide guidance? Are there limits to the level of efficiency that a real world market can ever achieve? Undoubtedly, both answers are “yes” but, without question, neither of these limits has as of yet been reached. Quite clearly, microstructure research and the design of an actual marketplace remain works in progress.
References

Amihud, Y. 2002. “Illiquidity and stock returns: cross-section and time-series effects.” Journal of Financial Markets 5, 31–56.
Amihud, Y. and H. Mendelson. 1980. “Dealership market: market-making with inventory.” Journal of Financial Economics 8, 31–53.
Amihud, Y. and H. Mendelson. 1986. “Asset pricing and the bid-ask spread.” Journal of Financial Economics 17, 223–249.
Angel, J. 1997. “Tick size, share prices, and stock splits.” Journal of Finance 52, 655–681.
Back, K. and S. Baruch. 2007. “Working orders in limit-order markets and floor exchanges.” Journal of Finance 62, 1589–1621.
Bagehot, W. (pseudonym). 1971. “The only game in town.” Financial Analysts Journal 27, 12–14, 22.
Barclay, M., T. Hendershott, and T. McCormick. 2003. “Competition among trading venues: information and trading on electronic communications networks.” Journal of Finance 58, 2637–2666.
Beja, A. and M. B. Goldman. 1980. “On the dynamic behavior of prices in disequilibrium.” Journal of Finance 35, 235–248.
Beja, A. and N. H. Hakansson. 1977. “Dynamic market processes and the rewards to up-to-date information.” Journal of Finance 32, 291–304.
Bernstein, P. 1987. “Liquidity, stock markets and market makers.” Financial Management 16, 54–62.
Bernstein, P. 2007. “The surprising bond between CAPM and the meaning of liquidity.” Journal of Portfolio Management 34, 11–11.
Bessembinder, H. 1999. “Trade execution costs on Nasdaq and the NYSE: a post-reform comparison.” The Journal of Financial and Quantitative Analysis 34, 387–407.
Bessembinder, H. 2003. “Quote-based competition and trade execution costs in NYSE-listed stocks.” Journal of Financial Economics 70, 385–422.
Bessembinder, H. and H. Kaufman. 1997a. “A comparison of trade execution costs for NYSE and Nasdaq-listed stocks.” The Journal of Financial and Quantitative Analysis 32, 287–310.
Bessembinder, H. and H. Kaufman. 1997b. “A cross-exchange comparison of execution costs and information flow for NYSE-listed stocks.” Journal of Financial Economics 46, 293–319.
Bessler, W. (Ed.). 2006. Börsen, Banken und Kapitalmärkte, Duncker & Humblot, Berlin.
Biais, B., P. Hillion, and C. Spatt. 1995. “An empirical analysis of the limit order book and the order flow in the Paris Bourse.” Journal of Finance 50, 1655–1689.
Biais, B., L. Glosten, and C. Spatt. 2005. “Market microstructure: a survey of microfoundations, empirical results, and policy implications.” Journal of Financial Markets 8, 217–264.
Chordia, T., R. Roll, and A. Subrahmanyam. 2000. “Commonality in liquidity.” Journal of Financial Economics 56, 3–28.
Chordia, T., R. Roll, and A. Subrahmanyam. 2008. “Liquidity and market efficiency.” Journal of Financial Economics 87, 249–268.
Christie, W. and P. Schultz. 1994. “Why do NASDAQ market makers avoid odd-eighth quotes?” Journal of Finance 49, 1813–1840.
Christie, W., J. Harris, and P. Schultz. 1994. “Why did NASDAQ market makers stop avoiding odd-eighth quotes?” Journal of Finance 49, 1841–1860.
Cohen, K., S. Maier, R. Schwartz, and D. Whitcomb. 1979. “Market makers and the market spread: a review of recent literature.” The Journal of Financial and Quantitative Analysis 14, 813–835.
Cohen, K., S. Maier, R. Schwartz, and D. Whitcomb. 1981. “Transaction costs, order placement strategy, and existence of the bid-ask spread.” The Journal of Political Economy 89, 287–305.
Copeland, T. 1976. “A model of asset trading under the assumption of sequential information arrival.” Journal of Finance 31, 1149–1168.
Copeland, T. and D. Galai. 1983. “Information effects on the bid-ask spread.” Journal of Finance 38, 1457–1469.
Davis, P., M. Pagano, and R. Schwartz. 2006. “Life after the big board goes electronic.” Financial Analysts Journal 62(5), 14–20.
Davis, P., M. Pagano, and R. Schwartz. 2007. “Divergent expectations.” Journal of Portfolio Management 34(1), 84–95.
Demsetz, H. 1968. “The cost of transacting.” The Quarterly Journal of Economics 82(1), 33–53.
Domowitz, I. and B. Steil. 1999. “Automation, trading costs, and the structure of the securities trading industry.” Brookings-Wharton Papers on Financial Services, 33–92.
Domowitz, I., J. Glen, and A. Madhavan. 2001. “Liquidity, volatility, and equity trading costs across countries and over time.” International Finance 4, 221–256.
Duffie, D., N. Garleanu, and L. Pedersen. 2007. “Valuation in over-the-counter markets.” Review of Financial Studies 20, 1865–1900.
Easley, D. and M. O’Hara. 1987. “Price, trade size, and information in securities markets.” Journal of Financial Economics 19, 69–90.
Easley, D. and M. O’Hara. 1991. “Order form and information in securities markets.” Journal of Finance 46, 905–928.
Easley, D. and M. O’Hara. 1992. “Time and the process of security price adjustment.” Journal of Finance 47, 577–606.
Economides, N. and R. Schwartz. 1995. “Electronic call market trading.” Journal of Portfolio Management 21, 10–18.
Engle, R. and C. Granger. 1987. “Co-integration and error correction: representation, estimation and testing.” Econometrica 55, 251–276.
Foucault, T. 1999. “Order flow composition and trading costs in a dynamic order driven market.” Journal of Financial Markets 2, 99–134.
Foucault, T., O. Kadan, and E. Kandel. 2005. “The limit order book as a market for liquidity.” Review of Financial Studies 18, 1171–1217.
Francioni, R., S. Hazarika, M. Reck, and R. A. Schwartz. 2008. “Equity market microstructure: taking stock of what we know.” Journal of Portfolio Management 35, 57–71.
Garman, M. 1976. “Market microstructure.” Journal of Financial Economics 3, 33–53.
Glosten, L. 1994. “Is the electronic open limit order book inevitable?” Journal of Finance 49, 1127–1161.
Glosten, L. and P. Milgrom. 1985. “Bid, ask, and transaction prices in a specialist market with heterogeneously informed agents.” Journal of Financial Economics 14, 71–100.
Goettler, R., C. Parlour, and U. Rajan. 2005. “Equilibrium in a dynamic limit order market.” Journal of Finance 60, 2149–2192.
Grossman, S. 1992. “The information role of upstairs and downstairs markets.” Journal of Business 65, 509–529.
Grossman, S. and J. Stiglitz. 1980. “On the impossibility of informationally efficient markets.” American Economic Review 70(3), 393–408.
Handa, P. and R. Schwartz. 1996a. “Limit order trading.” Journal of Finance 51, 1835–1861.
Handa, P. and R. Schwartz. 1996b. “How best to supply liquidity to a securities market.” Journal of Portfolio Management 22, 44–51.
Handa, P., R. Schwartz, and A. Tiwari. 2003. “Quote setting and price formation in an order driven market.” Journal of Financial Markets 6, 461–489.
Harris, L. 1991. “Stock price clustering and discreteness.” Review of Financial Studies 4, 389–415.
Harris, L. 1994. “Minimum price variations, discrete bid-ask spreads, and quotation sizes.” Review of Financial Studies 7, 149–178.
Harris, L. 2003. Trading and exchanges: market microstructure for practitioners, Oxford University Press, New York.
Harvey, C. and R. Huang. 1991. “Volatility in the foreign currency futures market.” Review of Financial Studies 4, 543–569.
Hasbrouck, J. 1991. “Measuring the information content of stock trades.” Journal of Finance 46, 179–207.
Hasbrouck, J. 1993. “Assessing the quality of a security market: a new approach to transaction-cost measurement.” Review of Financial Studies 6(1), 191–212.
Hasbrouck, J. 1995. “One security, many markets: determining the contribution to price discovery.” Journal of Finance 50, 1175–1199.
Hasbrouck, J. 2007. Empirical market microstructure, Oxford University Press, New York.
Hasbrouck, J. and D. Seppi. 2001. “Common factors in prices, order flows and liquidity.” Journal of Financial Economics 59, 383–411.
Hasbrouck, J. and G. Sofianos. 1993. “The trades of market makers: an empirical analysis of NYSE specialists.” Journal of Finance 48, 1565–1593.
Ho, T. and H. Stoll. 1980. “On dealer markets under competition.” Journal of Finance 35, 259–267.
Ho, T. and H. Stoll. 1981. “Optimal dealer pricing under transactions and return uncertainty.” Journal of Financial Economics 9, 47–73.
Ho, T. and H. Stoll. 1983. “The dynamics of dealer markets under competition.” Journal of Finance 38, 1053–1074.
Ho, T., R. Schwartz, and D. Whitcomb. 1985. “The trading decision and market clearing under transaction price uncertainty.” Journal of Finance 40, 21–42.
Jones, C., G. Kaul, and M. Lipson. 1994. “Information, trading and volatility.” Journal of Financial Economics 36, 127–154.
Keim, D. and A. Madhavan. 1996. “The upstairs market for large-block transactions: analysis and measurement of price effects.” Review of Financial Studies 9, 1–36.
Kyle, A. 1985. “Continuous auctions and insider trading.” Econometrica 53, 1315–1335.
Madhavan, A. 2000. “Market microstructure.” Journal of Financial Markets 3, 205–258.
Madhavan, A. and M. Cheng. 1997. “In search of liquidity: an analysis of upstairs and downstairs trades.” Review of Financial Studies 10, 175–204.
Mendelson, M., J. Peake, and T. Williams. 1979. “Toward a modern exchange: the Peake-Mendelson-Williams proposal for an electronically assisted auction market,” in Impending changes for securities markets: what role for the exchange?, E. Bloch and R. Schwartz (Eds.). JAI Press, Greenwich, CT.
Menkveld, A., S. Koopman, and A. Lucas. 2007. “Modelling around-the-clock price discovery for cross-listed stocks using state space methods.” Journal of Business and Economic Statistics 25(2), 213–225.
Mildenstein, E. and H. Schleef. 1983. “The optimal pricing policy of a monopolistic marketmaker in the equity market.” Journal of Finance 38, 218–231.
Milgrom, P. and N. Stokey. 1982. “Information, trade and common knowledge.” Journal of Economic Theory 26(1), 17–27.
Miller, E. 1977. “Risk, uncertainty and divergence of opinion.” Journal of Finance 32, 1151–1168.
O’Hara, M. 1997. Market microstructure theory, Basil Blackwell, Cambridge, MA.
Ozenbas, D., R. Schwartz, and R. Wood. 2002. “Volatility in U.S. and European equity markets: an assessment of market quality.” International Finance 5(3), 437–461.
Pagano, M. and A. Röell. 1996. “Transparency and liquidity: a comparison of auction and dealer markets with informed trading.” Journal of Finance 51, 579–611.
Parlour, C. 1998. “Price dynamics in limit order markets.” Review of Financial Studies 11, 789–816.
Parlour, C. and D. Seppi. 2008. “Limit order markets: a survey,” in Handbook of financial intermediation & banking, A. W. A. Boot and A. V. Thakor (Eds.), Elsevier, Amsterdam.
Paroush, J., R. Schwartz, and A. Wolf. 2008. The dynamic process of price discovery in an equity market, Working paper, Baruch College, CUNY.
Pástor, L. and R. Stambaugh. 2003. “Liquidity risk and expected stock returns.” Journal of Political Economy 111, 642–685.
Porter, D. and D. Weaver. 1998. “Post-trade transparency on Nasdaq’s national market system.” Journal of Financial Economics 50(2), 231–252.
Sandas, P. 2001. “Adverse selection and competitive market making: empirical evidence from a limit order market.” Review of Financial Studies 14, 705–734.
Schwartz, R. 1991. Reshaping the equity markets: a guide for the 1990s, Harper Business, New York.
Schwartz, R. and R. Francioni. 2004. Equity markets in action, Wiley, New York.
Seppi, D. 1990. “Equilibrium block trading and asymmetric information.” Journal of Finance 45, 73–94.
Smidt, S. 1971. “Which road to an efficient stock market: free competition or regulated monopoly?” Financial Analysts Journal 27, 18–20, 64–69.
Stigler, G. 1964. “Public regulation of the securities markets.” Journal of Business 37, 117–142.
Stoll, H. 1978. “The supply of dealer services in securities markets.” Journal of Finance 33, 1133–1151.
Stoll, H. 1989. “Inferring the components of the bid-ask spread: theory and empirical tests.” Journal of Finance 44, 115–134.
Tinic, S. 1972. “The economics of liquidity services.” The Quarterly Journal of Economics 86(1), 79–93.
Tinic, S. and R. West. 1980. “The securities industry under negotiated brokerage commissions: changes in the structure and performance of New York Stock Exchange member firms.” Bell Journal of Economics 11, 29–41.
Appendix 21A Risk Aversion and Risk Premium Measures

Our analysis of the perfectly liquid CAPM environment makes reference to two measures of risk aversion and to several dimensions of a risk premium. We provide details concerning both of these in this appendix.

21A.1 Risk Aversion

We use two risk aversion measures: (1) $R_A = -U''(W)/U'(W)$ is a measure of absolute risk aversion, and (2) $R_R = W R_A$ is a measure of relative risk aversion. Because $U'' < 0$ for a risk averse decision maker, $R_A, R_R > 0$ for risk aversion. Larger values of $R_A$ and $R_R$ indicate higher degrees of risk aversion. $R_A$ is a measure of absolute risk aversion because it reflects the decision maker's reaction to uncertainty in relation to the absolute (dollar) gains/losses in an uncertain situation. $R_R$ is a measure of relative risk aversion because it reflects the decision maker's reaction to uncertainty in relation to the percentage gains/losses in an uncertain situation.20

20 For further discussion, see J. Pratt, “Risk Aversion in the Small and the Large,” Econometrica, January 1964.

21A.2 Risk Premiums

A risk premium is the minimum dollar compensation a decision maker requires to hold a risky asset in place of an alternative that involves no risk. Specifically, a decision maker would be indifferent between a riskless investment with a certain return of $D$ dollars, and a risky investment with an expected dollar return of $E(Z)$ equal to $D$ plus the investor's risk premium. In general, the investor's risk premium depends upon his or her utility function and initial wealth, and upon the distribution of $Z$.

The term $\pi$ in Equation (21.3) is a risk premium: $\pi$ equals one-half of $R_A$ (the measure of the investor's absolute risk aversion) times $\mathrm{Var}(P_2)$, which measures the absolute (dollar) risk attributable to holding one share of the market portfolio. The uncertainty associated with holding $N$ shares of the risky asset is $\mathrm{Var}(N P_2) = N^2\,\mathrm{Var}(P_2)$; thus the total risk premium for holding $N_1$ shares is

$$\pi_T = \pi N_1^2 \tag{21A.1}$$

Dividing Equation (21A.1) by $N_1\,(= N_0 + Q)$ gives the risk premium per share (the average risk premium):

$$\pi_A = \pi N_1 \tag{21A.2}$$

Differentiating Equation (21A.1) with respect to $N_1$ gives the risk premium for a marginal share (the marginal risk premium):

$$\pi_M = 2 \pi N_1 \tag{21A.3}$$

Dividing Equation (21A.3) by $P_1$ expresses the marginal risk premium as a percentage of the current price:

$$\pi_{M\%} = \frac{\pi_M}{P_1} = \frac{2 \pi N_1}{P_1} \tag{21A.4}$$

The return on the combined portfolio of $N_1$ shares of the market portfolio and $C_1$ dollars of the risk-free asset is

$$r_P = \left(\frac{P_2}{P_1} - 1\right)\frac{P_1 N_1}{W} + \left(1 - \frac{P_1 N_1}{W}\right) r_f \tag{21A.5}$$

and the variance of the return on the combined portfolio is

$$\mathrm{Var}\left(\frac{P_2}{P_1}\,\frac{P_1 N_1}{W}\right) = \left(\frac{N_1}{W}\right)^2 \mathrm{Var}(P_2) \tag{21A.6}$$

Thus the investor's risk premium associated with the uncertain return realized from the combined portfolio is

$$\pi_{r_P} = \pi \left(\frac{N_1}{W}\right)^2 \tag{21A.7}$$
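As a quick check of these definitions (our addition, using the SymPy symbolic algebra library), negative exponential utility exhibits constant absolute risk aversion while log utility exhibits constant relative risk aversion:

```python
import sympy as sp

W, a = sp.symbols("W a", positive=True)

def risk_aversion(U):
    """Return (R_A, R_R) = (-U''/U', W * R_A) for a utility function of W."""
    RA = sp.simplify(-sp.diff(U, W, 2) / sp.diff(U, W))
    return RA, sp.simplify(W * RA)

print(risk_aversion(-sp.exp(-a * W)))  # CARA utility: (a, a*W)
print(risk_aversion(sp.log(W)))        # log utility:  (1/W, 1)
```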
Appendix 21B Designing Xetra

This appendix provides further detail on the development and design of Deutsche Börse’s electronic trading platform, Xetra. The first steps in designing Xetra involved specifying principles that the new market should implement, and defining the system’s functionality. This was done by Deutsche Börse working together with key market participants. Most importantly, the new market system was to provide equal and decentralized access to all its participants. Further, the system’s functionality and the market information delivered to users (whether pre- or post-trade) were to be the same for all traders. A trader’s location should not matter.

Equity trading in the German market has been and continues to be order driven. This was true both for IBIS
and for floor trading that was managed by a Kursmakler specialist acting in the capacity of auctioneer, broker, and dealer. It was clear from the beginning that Xetra should run an open limit order book (open in the sense that aggregated order volume is displayed at all price points in the order book). Additionally, order matching was automated and trader anonymity ensured.

Core features of an electronic trading system are determined by the market structure that it implements. The structure defines how orders are handled and translated into trades and transaction prices. Xetra’s market model comprises diverse sub-models, each with a single trading modality, or a combination of multiple modalities (i.e., it is a hybrid). Most importantly, Xetra implements both continuous trading and periodic call auction trading. This differentiation is required to cope with liquidity differences among stocks, and different liquidity needs among users depending on the size of their orders and motives for trading. The market for all stocks opens and closes with a call auction, while less liquid stocks trade in multiple call auctions per day.

Once the building-blocks were defined (i.e., continuous trading and call auctions), and their combinations specified, the next design step was to detail the specific features of each of the modalities. Those features are either static (i.e., represent basic structures such as the order book), or dynamic (i.e., define processes and behavior such as order matching). The next two sections of this appendix consider the systems design in more detail for continuous trading and periodic call auction trading, respectively.
21B.1 Continuous Trading

By the mid-1990s, order books for continuous trading with price and time priorities had been implemented around the globe. In designing Xetra, Deutsche Börse’s market architects could refer to a wide range of existing examples, and to a broad microstructure literature. Once the eligible order types were identified, the centerpiece of the development was the definition of the detailed rules of price-time matching. The complexity of this definition was broken down into a finite set of individual cases that involved various order book situations combined with various incoming orders, for which the trading outcome was to be defined by a rule. All rules collectively described the dynamics of order matching.
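The flavor of such price-time matching rules can be conveyed in a short sketch (ours, not Xetra’s actual rule set; a production engine must also handle the many special order types and book states noted above):

```python
import heapq

class Book:
    """Price-time priority book: best price first, earlier arrival breaks ties."""
    def __init__(self):
        self.bids = []   # max-heap via negated price: (-price, seq, qty)
        self.asks = []   # min-heap: (price, seq, qty)
        self.seq = 0

    def submit(self, side, price, qty):
        """Match an incoming limit order against the book; rest any remainder."""
        self.seq += 1
        trades = []
        if side == "buy":
            while qty > 0 and self.asks and self.asks[0][0] <= price:
                ask_px, ask_seq, ask_qty = heapq.heappop(self.asks)
                fill = min(qty, ask_qty)
                trades.append((ask_px, fill))   # executes at the resting order's price
                qty -= fill
                if ask_qty > fill:              # partially filled order keeps its place
                    heapq.heappush(self.asks, (ask_px, ask_seq, ask_qty - fill))
            if qty > 0:
                heapq.heappush(self.bids, (-price, self.seq, qty))
        else:
            while qty > 0 and self.bids and -self.bids[0][0] >= price:
                neg_px, bid_seq, bid_qty = heapq.heappop(self.bids)
                fill = min(qty, bid_qty)
                trades.append((-neg_px, fill))
                qty -= fill
                if bid_qty > fill:
                    heapq.heappush(self.bids, (neg_px, bid_seq, bid_qty - fill))
            if qty > 0:
                heapq.heappush(self.asks, (price, self.seq, qty))
        return trades

book = Book()
book.submit("sell", 100.5, 300)
book.submit("sell", 100.0, 200)
print(book.submit("buy", 100.5, 400))  # [(100.0, 200), (100.5, 200)]: price priority
```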
A major challenge in designing continuous trading involves the measures that should be taken to provide an orderly market in periods of sharply elevated price volatility. To deal with this, the concept of a “price corridor” was formulated. Diverse corridors around historical prices were defined that set the benchmark for an “orderly” price for the next trade. If a price occurred that lay outside its corridor, trading was to be halted (briefly) with the entire order book transported into a call auction. The purpose of the call was to allow the market to consolidate in both space and time. Trading in the continuous market was resumed upon completion of the call.

Lastly, all trading parameters for the continuous platform had to be determined. This included specifying tick sizes, breadth of the price corridors, and durations and timings. Together, this provided a comprehensive overview of the “steering wheels” for the newly designed market.
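The corridor logic itself is simple to state in code; what matters economically is the calibration of the corridor width and the reference price, which are exchange parameters. A hypothetical sketch (the 2% width below is an arbitrary assumption, not a Xetra value):

```python
def next_trade_action(potential_price, reference_price, corridor_pct=0.02):
    """Execute if the potential price lies inside the corridor around the
    reference price; otherwise trigger a volatility interruption, i.e.,
    move the entire order book into a call auction."""
    lower = reference_price * (1 - corridor_pct)
    upper = reference_price * (1 + corridor_pct)
    return "execute" if lower <= potential_price <= upper else "volatility_interruption"

print(next_trade_action(101.0, 100.0))  # execute
print(next_trade_action(103.5, 100.0))  # volatility_interruption
```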
21B.2 Call Auction Trading

The purpose of Xetra’s call auctions is threefold: (1) to open and close continuous trading, (2) to trade less liquid stocks in multiple calls per day with no continuous trading offered, and (3) to stabilize the market in times of large price moves. Despite those multiple purposes, a single design was defined for the auctions.

Additionally, certain key consistencies between continuous trading and the call had to be achieved. For example, both limit and market orders that could be submitted to continuous trading were allowed entry into the call order book. This seemingly simple requirement was difficult to implement because it expanded the universe of possible order book configurations (and therefore necessitated more complex matching rules). Additional procedures for setting the clearing price were also required to guard against erroneous pricing that could be caused by market orders overpowering an insufficient number of limit orders. As with continuous trading, price and time priority execution rules were stipulated.

Most crucial was the degree of transparency that the calls would offer. Sufficient information about the order book had to be delivered for market participants to have relevant price and quantity information concerning actual market situations, but detailed information was suppressed to inhibit excessive information leakage and gaming. The pre-call information now available in Xetra is the highest bid and the lowest offer posted in the call when these orders do not cross, or the indicative call auction price that is calculated when the order book is crossed. In other words, the full order book content is not visible – pre-call, the Xetra screen displays only the potential outcome of the call at each point in time.

When Xetra was under development, call auction trading at pre-specified times was managed on the floor by specialists who were responsible for price determination, timing,
and the provision of dealer liquidity. The challenge was to reengineer the call so that it could be run by a computer, not by a human intermediary. The issue that Deutsche Börse was facing was also grappled with by market microstructure academicians and other market architects. Substantial external guidance was received in the planning process. In particular, important inputs were obtained concerning the optimal degree of transparency for the call’s anti-gaming measures. The availability at the time of a variety of different call auction designs (both used and proposed) enabled Xetra’s calls to be designed relatively quickly.
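The pre-call transparency rule described in this section – show the best bid and offer when the book is uncrossed, and only an indicative clearing price when it is crossed – amounts to a simple projection of the order book. The following sketch is our simplification; Xetra’s actual indicative-price algorithm includes additional tie-breaking and reference-price rules:

```python
def precall_display(buys, sells):
    """buys, sells: lists of (limit_price, qty); market orders use
    float('inf') / 0.0 as limits. Returns only the information a trader
    would see pre-call, never the full book."""
    best_bid = max((p for p, _ in buys), default=None)
    best_ask = min((p for p, _ in sells), default=None)
    if best_bid is None or best_ask is None or best_bid < best_ask:
        return {"best_bid": best_bid, "best_ask": best_ask}  # uncrossed book
    # crossed book: display only an indicative (max executable volume) price;
    # ties resolve arbitrarily to the lowest such price in this sketch
    prices = sorted({p for p, _ in buys + sells if 0.0 < p < float("inf")})
    indicative = max(prices, key=lambda p: min(
        sum(q for limit, q in buys if limit >= p),
        sum(q for limit, q in sells if limit <= p)))
    return {"indicative_price": indicative}

print(precall_display([(101.0, 200)], [(102.0, 300)]))  # uncrossed: quotes shown
print(precall_display([(103.0, 200)], [(102.0, 300)]))  # crossed: indicative only
```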
21B.3 Electronic Trading for Less Liquid Stocks

Kursmaklers (specialists) on the Frankfurt floor (both today and in the past) provide immediate liquidity at times when external liquidity is insufficient. The desire was strongly expressed, with two provisos, for a market maker to be incorporated into Xetra’s order driven model for less liquid stocks. The two provisos were (1) market participants must all have equal access to information, and (2) equal access to functionality must be maintained at a maximum level. Consequently, any changes that would favor the dealers were kept to a minimum. The dealers were referred to as “designated sponsors.”

Like market makers in general, the designated sponsors were given both privileges and obligations. The primary obligation is that, on request of other market participants, the designated sponsor must provide quotes for a minimum volume and maximum spread in a stock during continuous trading. Additionally, multiple designated sponsors were included, so that they might compete with each other. Concurrently, the fulfillment of each sponsor’s obligation is measured, and the results published.

The designated sponsors’ primary privilege is that they can see the identity of the quote requesters in an environment that otherwise ensures complete anonymity. Further, a sponsor balances the order book in all call auctions for the stocks that it is registered in. This gives the designated sponsors a last mover advantage (the freedom to trade against any imbalance that might exist at the market clearing price). With this privilege, a designated sponsor can influence the clearing price so as to execute orders that otherwise would not have transacted in that call. Lastly, the designated sponsors, depending on their measured performance, receive fee reductions.
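Obligations of the kind just described – a maximum quoted spread and a minimum quoted volume – are mechanically verifiable, which is what makes the published fulfillment measurements feasible. A sketch with hypothetical parameter values (actual Xetra requirements vary by stock group):

```python
def quote_meets_obligation(bid, ask, bid_qty, ask_qty,
                           max_spread_pct=1.0, min_qty=500):
    """Check a designated sponsor's quote against its obligations: spread no
    wider than max_spread_pct of the quote midpoint, and at least min_qty
    shares quoted on each side."""
    mid = (bid + ask) / 2
    spread_pct = 100 * (ask - bid) / mid
    return spread_pct <= max_spread_pct and min(bid_qty, ask_qty) >= min_qty

print(quote_meets_obligation(99.6, 100.4, 800, 800))  # True: 0.8% spread, size ok
print(quote_meets_obligation(99.0, 101.0, 800, 800))  # False: 2% spread too wide
```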
21B.4 Xetra’s Implementation and the Migration of Liquidity to Xetra Since 1997

Xetra went operational in fall 1997. At the beginning, the new system attracted roughly 60% of trading in the most liquid segment of the market, the 30 DAX stocks. Trading on Xetra for mid-cap stocks was not as successful – market share for this segment was about 20%, as the less liquid stocks largely continued at that time to trade on the floor. But the 1997 launch was just the start of a sequence of releases that have continued through the current time. One more recent innovation was the “continuous call auction.” With this facility, calls are not held at pre-specified
times but are triggered by the occurrence of a "critical" liquidity situation. The continuous call comprises a dealer-auctioneer who is responsible for providing a base level of liquidity in each call, as well as for controlling its timing. Additionally, Xetra allows internalization of trading by member firms. Consequently, Xetra, which originally started as an exchange trading system, now also serves as the technical platform for OTC trading. Major innovations have benefited a broad range of cap sizes and, across the board, floor trading has continued to decline. Xetra has now been rolled out to 260 member firms in Europe, and its market share currently stands at 95% of all on-exchange trading in Germany.
Chapter 22
Options Strategies and Their Applications Cheng Few Lee, John Lee, and Wei-Kang Shih
Abstract In this chapter we introduce different types of options and their characteristics. Then, we develop put-call parity theorems for European, American, and futures options. Finally, we discuss option strategies and their investment applications.

Keywords Call option · Put option · American option · European option · Exercise price · Time to maturity · Futures option · Long straddle · Short straddle · Long vertical spread · Short vertical spread · Calendar spread · Protective put · Covered call · Collar
22.1 Introduction

The use of stock options for risk reduction and return enhancement has expanded at an astounding pace over the last 20 years. Among the causes of this growth, two are most significant. First, the establishment of the Chicago Board Options Exchange (CBOE) in 1973 brought about the liquidity necessary for successful option trading, through public listing and standardization of option contracts. The second stimulus emanated from academia. In the same year that the CBOE was established, Professors Fischer Black and Myron Scholes published a paper in which they derived a revolutionary option-pricing model. The power of their model to predict an option's fair price has since made it the industry standard. The development of option-valuation theory shed new light on the valuation process. Previous pricing models such as the CAPM were based on very stringent assumptions, such as there being an identifiable and measurable market portfolio, as well as various imputed investor attributes, such
C.F. Lee and W.-K. Shih, Rutgers University, Newark, NJ, USA
e-mail: [email protected]
J. Lee, Center for PBBEF Research, New York, NY, USA
as quadratic utility functions. Furthermore, previous theory priced only market risk, since investors were assumed to hold well-diversified portfolios. The strength of the Black-Scholes and subsequent option-pricing models is that they rely on far fewer assumptions. In addition, the option-valuation models price total risk and do not require any assumptions concerning the direction of the underlying security's price. The growing popularity of the option concept is evidenced by its application to the valuation of a wide array of other financial instruments (such as common stock and bonds) as well as more abstract assets, including leases and real estate agreements. This chapter aims to establish a basic knowledge of options and the markets in which they are traded. It begins with the most common types of options, calls and puts, explaining their general characteristics and discussing the institutions where they are traded. In addition, the concepts relevant to the newer types of options on indexes and futures are introduced. The next focus is the basic pricing relationship between puts and calls, known as put-call parity. The final study concerns how options can be used as investment tools. The chapter on option valuation that follows utilizes all these essential concepts to afford a deeper conceptual understanding of valuation theory.
22.2 The Option Market and Related Definitions

This section discusses the option market and the related definitions that are needed to understand option valuation and option strategies.
22.2.1 What Is an Option?

An option is a contract conveying the right to buy or sell a designated security at a stipulated price. The contract
normally expires at a predetermined time. The most important element of an option contract is that there is no obligation placed upon the purchaser: it is an "option." This attribute of an option contract distinguishes it from other financial contracts. For instance, while the holder of an option has the opportunity to let his or her claim expire unused if so desired, futures and forward contracts obligate their parties to fulfill certain conditions.
22.2.2 Types of Options and Their Characteristics

A call option gives its owner the right to buy the underlying asset, while a put option conveys to its holder the right to sell the underlying asset. An option is specified by five essential parts:
1. The type (call or put)
2. The underlying asset
3. The exercise price
4. The expiration date
5. The option price
While the most common type of underlying asset for an option is an individual stock, other underlying assets for options exist as well. These include futures contracts, foreign currencies, stock indexes, and U.S. debt instruments. In the case of common stock options (on which this discussion is exclusively centered), the specified quantity the option buyer is entitled to buy or sell is one hundred shares of the stock per option. The exercise price (also called the strike price) is the price stated in the option contract at which the call (put) owner can buy (sell) the underlying asset up to the expiration date, the final calendar date on which the option can be traded. Options on common stocks have expiration dates 3 months apart in one of three fixed cycles:
1. January/April/July/October
2. February/May/August/November
3. March/June/September/December
The normal expiration date is the third Saturday of the month. (The third Friday is the last trading date for the option.) As an example, an option referred to as an "ABC June 25 call" is an option to buy one hundred shares of the underlying ABC stock at $25 per share, up to its expiration date in June. Option prices are quoted on a per-share basis. Thus, a stock option that is quoted at $5 would cost $500 ($5 × 100 shares), plus commission and a nominal SEC fee. A common distinction among options pertains to when they can be exercised. Exercising an option is the process of
carrying out the right to buy or sell the underlying asset at the stated price. American options allow the exercise of this right at any time from when the option is purchased up to the expiration date. On the other hand, European options allow their holders the right of exercise only on the expiration date itself. The distinction between an American and a European option has nothing to do with the location at which they are traded; both types are currently bought and sold in the United States. There are, however, distinctions in their pricing and in the possibility of exercising them prior to expiration. Finally, when discussing options, the two parties to the contract are characterized by whether they have bought or sold the contract. The party buying the option contract (call or put) is the option buyer (or holder), while the party selling the option is the option seller (or writer). If the writer of an option does not own the underlying asset, he or she is said to write a naked option. Table 22.1 shows a listing of publicly traded options for Johnson & Johnson as of September 21, 2007.
22.2.3 Relationships Between the Option Price and the Underlying Asset Price

A call (put) option is said to be in the money if the underlying asset is selling above (below) the exercise price of the option. An at-the-money call (put) is one whose exercise price is equal to the current price of the underlying asset. A call (put) option is out of the money if the underlying asset is selling below (above) the exercise price of the option. Suppose ABC stock is selling at $30 per share. An ABC June 25 call option is in the money ($30 − $25 > 0), while an ABC June 35 call option is out of the money ($30 − $35 < 0). Of course, the expiration dates could be any month without changing the option's standing as in, at, or out of the money. The relationship between the price of an option and the price of the underlying asset indicates both the amount of intrinsic value and time value inherent in the option's price, as shown in Equation (22.1):

Intrinsic Value = Underlying Asset Price − Option Exercise Price   (22.1)

For a call (put) option that is in the money (underlying asset price above (below) the exercise price), the intrinsic value is positive, while for at-the-money and out-of-the-money options the intrinsic value is zero. An option's time value is the amount by which the option's premium (or market price) exceeds its intrinsic value. For a call or put option:

Time Value = Option Premium − Intrinsic Value   (22.2)
Table 22.1 Options quotes for Johnson & Johnson at 09/21/2007

Stock price at 9/21/2007 = $65.12

Call options expiring Fri Jan 18, 2008
Strike   Symbol    Last    Bid    Ask    Vol    Open int
40       JNJAH.X   24      25.5   25.6   10     5,427
45       JNJAI.X   20      20.5   20.7   3      2,788
50       JNJAJ.X   15.5    15.7   16     11     8,700
55       JNJAK.X   11.1    10.9   11.1   33     10,327
60       JNJAL.X   6.4     6.4    6.5    275    32,782
65       JNJAM.X   2.65    2.65   2.7    1,544  70,426
70       JNJAN.X   0.55    0.55   0.6    845    48,582
75       JNJAO.X   0.1     0.05   0.1    2      13,629
80       JNJAP.X   0.05    N/A    0.05   10     4,497
85       JNJAQ.X   0.05    N/A    0.05   0      3,275
90       JNJAR.X   0.05    N/A    0.05   0      3,626

Put options expiring Fri Jan 18, 2008
Strike   Symbol    Last    Bid    Ask    Vol    Open int
40       JNJMH.X   0.05    N/A    0.05   0      1,370
45       JNJMI.X   0.1     0.05   0.1    3      5,002
50       JNJMJ.X   0.12    0.1    0.15   1      14,004
55       JNJMK.X   0.25    0.25   0.3    99     31,122
60       JNJML.X   0.7     0.65   0.7    227    69,168
65       JNJMM.X   1.8     1.85   1.95   30     46,774
70       JNJMN.X   5       4.9    5      20     1,582
75       JNJMO.X   13.3    9.7    9.9    0      20
In Equation (22.1), intrinsic value is taken as the maximum of zero or stock price minus exercise price. Thus an option premium, or market price, is composed of two components: intrinsic value and time value. In-the-money options are usually the most expensive because of their large intrinsic-value component. An option with an at-the-money exercise price will have only time value inherent in its market price. Deep out-of-the-money options have zero intrinsic value and little time value, and consequently are the least expensive. Deep in-the-money options also have little time value; time value is greatest for at-the-money options. In addition, time value (as its name implies) is positively related to the amount of time the option has to expiration. The theoretical valuation of options focuses on determining the relevant variables that affect the time-value portion of an option premium and on deriving their relationship in option pricing. In general, the call price should be equal to or exceed the intrinsic value:

C ≥ Max(S − E, 0)

where C = the value of the call option, S = the current stock price, and E = the exercise price. Figure 22.1 illustrates the relationship between an option's time value and its exercise price.
Fig. 22.1 The relationship between an option’s exercise price and its time value
When the exercise price is zero, the time value of an option is zero. Although this relationship is described quite well in general by Fig. 22.1, the exact relationship is somewhat ambiguous. Moreover, the identification of options with a mispriced time-value portion in their total premium motivates interest in a theoretical pricing model. One more aspect of time value that is very important is the change in the amount of time value an option has as its duration shortens. As previously mentioned, options with a
longer time to maturity and those near to the money have the largest time-value components. Assuming that a particular option remains near to the money as its time to maturity diminishes, the rate of decrease in its time value, or what is termed the effect of time decay, is of interest. How time decay affects an option's premium is an important question for the valuation of options and the application of option strategies. To best see an answer to this question, refer to Fig. 22.2. In general, the value of call options with the same exercise price increases as time to expiration increases:

C(S_1, E_1, T_1) ≥ C(S_1, E_1, T_2)

where T_1 ≥ T_2 and T_1 and T_2 are the times to expiration. Note that for the simple case in Fig. 22.2 the effect of time decay is smooth up until the last month before expiration, when the time value of an option begins to decay very rapidly. This effect is made clearer in Chap. 23, in which the components of the option-pricing model are examined. Example 22.1 shows the effect of time decay.

Fig. 22.2 The relationship between time value and time to maturity for a near-to-the-money option (assuming a constant price for the underlying asset)

Example 22.1. It is January 1, the price of the underlying ABC stock is $20 per share, and the time premiums are shown in the following table. What is the value for the various call options?

Exercise price X   January   April   July
15                 $0.50     $1.25   $3.50
20                 1.00      2.00    5.00
25                 0.50      1.25    3.50

Solution 22.1. Call premium = Intrinsic value + Time premium

C_{15,Jan} = Max(20 − 15, 0) + 0.50 = $5.50 per share, or $550 for one contract

Exercise price X   January                        April                          July
15                 Max(20−15,0) + 0.50 = $5.50    Max(20−15,0) + 1.25 = $6.25    Max(20−15,0) + 3.50 = $8.50
20                 Max(20−20,0) + 1.00 = $1.00    Max(20−20,0) + 2.00 = $2.00    Max(20−20,0) + 5.00 = $5.00
25                 Max(20−25,0) + 0.50 = $0.50    Max(20−25,0) + 1.25 = $1.25    Max(20−25,0) + 3.50 = $3.50

Other values are shown in the table above.
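The decomposition in Equations (22.1) and (22.2) is mechanical and easy to script. The following minimal Python sketch (our illustration; the function names are not from the original text) reproduces the January column of the solution table:

# Decompose a call premium into intrinsic value and time value,
# following Equations (22.1) and (22.2).
def intrinsic_value_call(stock_price, exercise_price):
    """Intrinsic value of a call: Max(S - E, 0)."""
    return max(stock_price - exercise_price, 0.0)

def call_premium(stock_price, exercise_price, time_premium):
    """Call premium = intrinsic value + time value (Equation 22.2 rearranged)."""
    return intrinsic_value_call(stock_price, exercise_price) + time_premium

stock = 20.0                                     # ABC stock price on January 1
january_time_premiums = {15: 0.50, 20: 1.00, 25: 0.50}
for strike, tp in january_time_premiums.items():
    print(f"January {strike} call: premium = ${call_premium(stock, strike, tp):.2f}")
# Prints $5.50, $1.00, and $0.50, matching the January column above.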
22.2.3.1 Additional Definitions and Distinguishing Features

Options may be specified in terms of their classes and series. A class of options refers to all call and put options on the same underlying asset. For example, all AT&T call and put options at various exercise prices and expiration months form one class. A series is a subset of a class and consists of all contracts of the same class (such as AT&T) having the same expiration date and exercise price. When an investor either buys or sells an option (that is, goes long or short) as the initial transaction, the option exchange adds this opening transaction to what is termed the open interest for an option series. Essentially, open interest represents the number of contracts outstanding at a particular point in time. If the investor reverses the initial position with a closing transaction (that is, sells the option if he or she originally bought it, or vice versa), then the open interest for the particular option series is reduced by one. While open interest is more of a static variable, indicating the number of outstanding contracts at one point in time, volume represents a dynamic characteristic. More specifically, volume indicates the number of times a particular option is bought and sold during a particular trading day. Volume and open interest are measures of an option's liquidity, the ease with which the option can be bought and sold in large quantities: the larger the volume and/or open interest, the more liquid the option. Again, an option holder who invokes the right to buy or sell is exercising the option. Whenever a holder exercises an option, a writer is assigned, by the exchange on which the option is traded, the obligation to fulfill the terms of the option contract. If a call holder exercises the right to buy, a call writer is assigned the obligation to sell. Similarly, when a put holder exercises the right to sell, a put writer is assigned the obligation to buy. The seller or writer of a call option must deliver 100 shares of the underlying stock at the specified exercise price when the option is exercised. The writer of a put option must purchase 100 shares of the underlying stock when the put option is exercised. The writer of either option receives the premium, or price of the option, for this legal obligation. The
maximum loss an option buyer can experience is limited to the price of the option. However, the maximum loss from writing a naked call is unlimited; the maximum loss possible from writing a naked put is the exercise price less the original price of that put. To guarantee that the option writer can meet these obligations, the exchange clearinghouse requires margin deposits. The payment of cash dividends affects both the price of the underlying stock and the value of an option on the stock. Normally, no adjustment is made in the terms of the option when a cash dividend is paid. However, the strike price or number of shares may be adjusted if the underlying stock realizes a stock dividend or stock split. For example, an option on XYZ Corporation with an exercise price of $100 would be adjusted if XYZ Corporation stock split two for one. The adjustment in this case would be a change in the exercise price from $100 to $50, and the number of contracts would be doubled. In the case of a noninteger split (such as three for two), the adjustment is made to the exercise price and the number of shares covered by the option contracts. For example, if XYZ Corporation had an exercise price of $100 per share and had a three-for-two split, the option would have the exercise price adjusted to $66 2/3, and the number of shares would be increased to 150. Notice that the old exercise value of the option, $10,000 ($100 × 100 shares), is maintained by the adjustment ($66 2/3 × 150 shares).
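Because the adjustment is a fixed-ratio rescaling, a couple of lines of Python make the invariance of the exercise value explicit (an illustration of our own; the helper name is hypothetical):

def adjust_for_split(exercise_price, shares, new, old):
    """Adjust an option's strike and share coverage for a new-for-old split;
    the total exercise value (strike x shares) is left unchanged."""
    return exercise_price * old / new, shares * new / old

# Three-for-two split on a $100-strike, 100-share contract (the text's example):
strike, shares = adjust_for_split(100.0, 100, new=3, old=2)
print(round(strike, 4), shares)        # 66.6667 150.0
print(round(strike * shares, 2))       # 10000.0, the old exercise value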
22.2.4 Types of Underlying Assets

Although most people would identify common stocks as the underlying asset for an option, a variety of other assets and financial instruments can assume the same function. In fact, options on agricultural commodities were introduced by traders in the U.S. as early as the mid-1800s. After a number of scandals, agricultural commodity options were banned by the government. They were later reintroduced under tighter regulations and in a more standardized tradable form. Today, futures options on such agricultural commodities as corn, soybeans, wheat, cotton, sugar, live cattle, and live hogs are actively traded on a number of exchanges. The biggest success for options has been realized on financial futures. Options on the S&P 500 index futures contracts, NYSE index futures, foreign-currency futures, 30-year U.S. Treasury bond futures, and gold futures have all realized extraordinary growth since their initial offerings back in 1982. Options on futures are very similar to options on the actual asset, except that futures options give their holders the right (not the obligation) to buy or sell predetermined quantities of specified futures contracts at a fixed price within a predetermined period. Options on the actual asset have arisen in another form as well. While a number of options have existed for various
stock-index futures contracts, options now also exist on the stock index itself. Because of the complexity of having to provide all the stocks in an index at the spot price should a call holder exercise his or her buy right, options on stock indexes are always settled in cash. That is, should a call holder exercise his or her right to buy because of a large increase in the underlying index, that holder would be accommodated by a cash amount equal to the profit on his contract, or the current value of the option’s premium. Although the options on the S&P 100 stock index at the Chicago Board Options Exchange (CBOE) are the most popular among traders, numerous index options are now traded as well. These include options on the S&P 500 index, the S&P OTC 250 index, the NYSE composite and AMEX indexes (computer technology, oil and gas, and airline), the Philadelphia Exchange indexes (gold/silver), the Value Line index, and the NASDAQ 100 index.
22.2.5 Institutional Characteristics

Probably two of the most important factors underlying the success of options have been the standardization of contracts through the establishment of option exchanges and the trading anonymity brought about by the Options Clearing Corporation and the clearinghouses of the major futures exchanges. An important element for option trading is the interchangeability of contracts. Exchange contracts are not matched between individuals. Instead, when an investor or trader enters into an option contract, the Options Clearing Corporation (or the clearinghouse for the particular futures exchange) takes the opposite side of the transaction. So rather than having to contact a particular option writer to terminate an option position, a buyer can simply sell it back to the exchange at the current market clearing price. This type of anonymity among option-market participants is what permits an active secondary market to operate. The prices of the futures options traded on the various futures exchanges mentioned earlier are determined by open-auction bidding, probably the purest form of laissez-faire price determination that can be seen today. With the open-auction-bidding price mechanism there are no market makers, only a large octagonal pit filled with traders bidding among themselves to buy and sell contracts. While some traders buy and sell only for themselves, many of the participants are brokers representing large investment firms. Different sides of the pit usually represent traders who are dealing in particular expiration months. As brokers and other pit participants make trades, they mark down what they bought or sold, how much, at what price, and from whom. These cards are then collected by members of the exchange
who record the trades and post the new prices. The prices are displayed on “scoreboards” surrounding the pit. While stock options and options on commodities and indexes are traded in a similar fashion, one major difference prevails – the presence of market makers. Market makers are individuals who typically trade one type of option for their own account and are responsible for ensuring that a market always exists for their particular contract. In addition, some option exchanges utilize board brokers as well. These individuals are charged with the maintenance of the book of limit orders (orders from outside investors that are to be executed at particular prices or when the market goes up or down by a prespecified amount). Essentially, market makers and board brokers on the options exchanges share the duties performed by the specialists on the major stock exchanges. Although stocks can be bought with as little as 50% margin, no margin is allowed for buying options – the cost of the contract must be fully paid. Because options offer a high degree of leverage on the underlying asset, additional leveraging through margins is considered by regulators to be excessive. However, if more than one option contract is entered into at the same time – for instance, selling and buying two different calls – then, of course, a lower cost is incurred, since the cost of one is partially (or wholly) offset by the sale of the other.
22.3 Put-Call Parity

This section addresses a most important concept for option valuation: put-call parity. The discussion covers European options, American options, and futures options.
22.3.1 European Options

As an initial step to examining the pricing formulas for options, it is essential to discuss the relationships between the prices of put and call options on the same underlying asset. Such relationships among put and call prices are referred to as the put-call parity theorems. Stoll (1969) was the first to introduce the concept of put-call parity. Dealing strictly with European options, he showed that the value of a call option would equal the value of a portfolio composed of a long put option, its underlying stock, and a short discounted exercise price. Before stating the basic put-call parity theorem as originally devised by Stoll, it must be assumed that the markets for options, bonds, and stocks (or any other underlying asset we choose) are frictionless.

Theorem 22.1. Put-Call Parity for European Options with No Dividends.

C_{t,T} = P_{t,T} + S_t − E·B_{t,T}   (22.3)

where:
C_{t,T} = value of a European call option at time t that matures at time T (T > t);
P_{t,T} = value of a European put option at time t that matures at time T;
S_t = value of the underlying stock (asset) to both the call and put options at time t;
E = exercise price for both the call and put options;
B_{t,T} = price at time t of a default-free bond that pays $1 with certainty at time T. If the risk-free rate of interest is assumed to be the same for all maturities and equal to r (in essence, a flat term structure), then B_{t,T} = e^{−r(T−t)} under continuous compounding, or B_{t,T} = 1/(1 + r)^{T−t} under discrete compounding.

Equation (22.3) uses the following principle. If the options are neither dominant nor dominated securities, and if the borrowing and lending rates are equal, then the return patterns of a European call and a portfolio composed of a European put, a pure discount bond with a face value equal to the options' exercise price E, and the underlying stock (or asset) are the same.1 In understanding why the put-call parity theorem holds, and to support the theorem, two additional properties of option pricing must be provided:

Property 22.1. At maturity (time T) the call option is worth the greater of S_T − E dollars or zero dollars:

C_T = Max(0, S_T − E)   (22.4)
As an example, suppose that the call option has an exercise price of $30. At maturity, if the stock's (asset's) price is $25, then the value of the call is the maximum of (0, 25 − 30) = (0, −5), which of course is zero. If an option sells for less than its intrinsic value (S_T − E), an arbitrage opportunity will exist: investors would buy the option and short sell the stock, forcing the mispricing to correct itself. Consequently, this first property implies that a call option's value is never less than zero. An equivalent property and argument exist for the value of a put option as well.
1 Any security x is dominant over any security y if the rate of return on x is equal to or greater than that of y for all states of nature and is strictly greater for at least one state. For an expanded discussion of this subject, see Merton (1973) and Smith (1976).
Property 22.2. At maturity, the value of a put option is the greater of E − S_T dollars or zero dollars:

P_T = Max(0, E − S_T)   (22.5)

Using the same line of reasoning and argument as for the call option, the second property also implies that the value of a put option is never less than zero. Table 22.2 provides proof of this first put-call parity theorem. Suppose at time t two portfolios are formed. Portfolio B is just a long call option on a stock with price S_t, an exercise price of E, and a maturity date at T. Portfolio A consists of purchasing one hundred shares of the underlying stock (since stock options represent one hundred shares), purchasing (going long) one put option on the same stock with exercise price E and maturity date T, and borrowing at the risk-free rate an amount equal to the present value of the exercise price, E·B_{t,T}, with face value of E. (This borrowing partially finances the put and stock position.) At maturity date T, the call option (portfolio B) has value only if S_T > E, in accordance with Property 22.1. For portfolio A, under all conditions the stock price and maturing loan values are the same, whereas the put option has value only if E > S_T. Under all three possible outcomes for the stock price S_T, the values of portfolios A and B are equal. Proof has been established for the first put-call parity theorem. Example 22.2 provides further illustration.

Example 22.2. A call option with 1 year to maturity and exercise price of $110 is selling for $5. Assuming discrete compounding, a risk-free rate of 10%, and a current stock price of $100, what is the value of a European put option with a strike price of $110 and 1-year maturity?

Solution 22.2.

P_{t,T} = C_{t,T} + E·B_{t,T} − S_t
P_{0,1yr} = $5 + $110/(1.1)^1 − $100
P_{0,1yr} = $5

Table 22.2 Put-call parity for a European option with no dividends

                                               Value at time T (maturity)
Time t strategy                                S_T > E      S_T = E      S_T < E
Portfolio A
  1. Buy 100 shares of the stock (S_t)         S_T          S_T          S_T
  2. Buy a put (P_t, maturity T, exercise E)   0            0            E − S_T
  3. Borrow E·B_{t,T} dollars                  −E           −E           −E
Portfolio A value at time T                    S_T − E      0            0
Portfolio B
  1. Buy a call (C_t, maturity T, exercise E)  S_T − E      0            0
Time T (maturity) ST > E
ST D E
ST < E
1. Buy 100 shares of the stock .St /
ST
ST
ST
2. Buy a put (Pt , maturity at T with exercise price E)
0
0
E ST
3. Borrow EBt;T dollars
E .ST E/
E 0
E 0
.ST E/
0
0
Time t strategy Portfolio A
Portfolio A value at time T Portfolio B 1. Buy a call (Ct maturing at T with exercise price E)
call options. The beauty of these arguments stems from the fact that they require only that investors prefer more wealth to less. If more stringent assumptions are made, then the bounds can be made tighter. For an extensive derivation and explanation of these theorems see Jarrow and Rudd (1983). Example 22.3 provides further illustration.

Example 22.3. A put option with 1 year to maturity and an exercise price of $90 is selling for $15; the stock price is $100. Assuming discrete compounding and a risk-free rate of 10%, what are the boundaries for the price of an American call option?

Solution 22.3.

P_{t,T} + S_t − E·B_{t,T} > C_{t,T} > P_{t,T} + S_t − E
$15 + $100 − $90/(1.1)^1 > C_{t,T} > $15 + $100 − $90

$33.18 > C_{t,1yr} > $25
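The same bounds can be computed directly; the short sketch below is our own illustration rather than anything from the original text:

def american_call_bounds(put, stock, exercise, r, T):
    """Theorem 22.2 (no dividends): P + S - E*B > C > P + S - E,
    with B the discrete-compounding discount factor."""
    discount = 1.0 / (1.0 + r) ** T
    return put + stock - exercise, put + stock - exercise * discount

# Example 22.3: P = $15, S = $100, E = $90, r = 10%, T = 1 year.
lower, upper = american_call_bounds(put=15.0, stock=100.0, exercise=90.0, r=0.10, T=1)
print(f"{upper:.2f} > C > {lower:.2f}")   # 33.18 > C > 25.00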
Futures Options

As a final demonstration of put-call parity, the analysis is extended to the case where the underlying asset is a futures contract. The topic of futures contracts and their valuation is examined more fully elsewhere. Nevertheless, this chapter takes time to apply put-call parity when the options are on a futures contract because of the growing popularity and importance of such futures options. A futures contract is a contract in which the party entering into it is obligated to buy or sell the underlying asset at the maturity date for some stipulated price. While the difference between European and American options still remains, the complexity of dividends can be ignored, since futures contracts do not
pay dividends. Put-call parity for a European futures option (when interest rates are constant) is as follows:

Theorem 22.3. Put-Call Parity for a European Futures Option.

C_{t,T} = P_{t,T} + B_{t,T}(F_{t,T} − E)   (22.7)
where F_{t,T} is the price at time t for a futures contract maturing at time T (which is the underlying asset to both the call and put options). Option-pricing Properties 22.1 and 22.2 for call and put options apply in an equivalent sense to futures options as well. However, to understand this relationship as stated in Equation (22.7), it must be assumed that the cost of a futures contract is zero. While a margin deposit is required, the majority of this assurance deposit can be in the form of interest-bearing securities; hence, as an approximation, a zero cost for the futures contract is not unrealistic. Again, the easiest way to prove this relationship is to follow the same path of analysis used in proving Theorem 22.1. Table 22.3 indicates that the argument for this theorem's proof is similar, with only a few notable exceptions. The value of the futures contract at time T (maturity) is equal to the difference between the price of the contract at time T and the price at which it was bought, or F_{T,T} − F_{t,T}. This is an outcome of the fixed duration of a futures contract as opposed to the perpetual duration of common stock. Second, because no money is required to enter into the futures contract, the exercise price is reduced by the current futures price and the total is lent at the risk-free rate. (Actually, this amount is either lent or borrowed depending on the relationship between F_{t,T} and E at time t. If F_{t,T} − E < 0, then this amount will actually be borrowed at the risk-free rate.)
Table 22.3 Put-call parity for a European futures option

                                                   Value at time T (maturity)
Time t strategy                                    F_{T,T} > E         F_{T,T} = E         F_{T,T} < E
Portfolio A
  1. Buy a futures contract (F_{t,T})              F_{T,T} − F_{t,T}   F_{T,T} − F_{t,T}   F_{T,T} − F_{t,T}
  2. Buy a put (P_{t,T} on F_{t,T},
     exercise price E, maturity T)                 0                   0                   E − F_{T,T}
  3. Lend B_{t,T}(F_{t,T} − E) dollars             F_{t,T} − E         F_{t,T} − E         F_{t,T} − E
Portfolio A's value at time T                      F_{T,T} − E         0                   0
Portfolio B
  1. Buy a call (C_{t,T} on F_{t,T},
     exercise price E, maturity T)                 F_{T,T} − E         0                   0
Why are there options on spot assets as well as options on futures contracts for the spot assets? After all, at expiration the basis of a futures contract goes to zero and futures prices equal spot prices; thus, options on the spot asset and options on the futures relate to the same terminal value, and their current values should be identical. Yet a look at the markets shows that options on spot assets and options on futures for the same assets sell at different prices. One explanation for this is that investors who purchase options on spot must pay a large sum of money when they exercise their options, whereas investors who exercise an option on a future need only pay enough to meet the initial margin for the futures contract. Therefore, if the exercise of the option is important to an investor, that investor would prefer options on futures rather than options on spot and would be willing to pay a premium for the option on the future, whereas the investor who has no desire to exercise the option (remember, the investor can always sell it to somebody else to realize a profit) is not willing to pay for this advantage and so finds the option on spot more attractive.
22.3.3 Market Application

Put options were not listed on the CBOE until June 1977. Before that time, brokers satisfied their clients' demands for put-option risk-return characteristics by a direct application of put-call parity: by combining call options and the underlying security, brokers could construct a synthetic put. To illustrate, the put-call parity theorem is used here with a futures contract as the underlying asset. Furthermore, to simulate the option broker's circumstances on July 1, 1984, the equation is merely rearranged to yield the put's "synthetic" value:

P_{t,T} = C_{t,T} − B_{t,T}·F_{t,T} + B_{t,T}·E   (22.8)
So instead of a futures contract being purchased, it is sold. Assume the following values, with the S&P 500 index futures as the underlying asset: C_{t,T} = $3.35; F_{t,T} = 154.85 (September contract); E = 155.00; and B_{t,T} = 0.9770 (the current price of a risk-free bond that pays $1 when the option and futures contract expire, taken as the average of bid and ask prices for T-bills from The Wall Street Journal). According to Equation (22.8), the put's price should equal the theorem price P_{t,T} = $3.497. The actual put price on this day (July 1, 1984) with the same exercise price and expiration month was P_{t,T} = $3.50. With repeated comparisons of the theorem using actual prices, it becomes clear that put-call parity is a powerful equilibrium mechanism in the market.
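The computation is a one-liner; the sketch below (ours, using the values quoted in the text) reproduces the theorem price:

def synthetic_put(call, bond, futures, exercise):
    """Equation (22.8): P = C - B*F + B*E = C + B*(E - F)."""
    return call + bond * (exercise - futures)

# July 1, 1984 inputs: C = $3.35, B = 0.9770, F = 154.85, E = 155.00.
print(round(synthetic_put(3.35, 0.9770, 154.85, 155.00), 3))
# 3.497, against an observed market price of $3.50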
22.4 Risk-Return Characteristics of Options

One of the most attractive features of options is the myriad ways in which they can be employed to achieve a particular combination of risk and return. Whether through a straight option position, in combination with the underlying asset, or within some portfolio of securities, options offer an innovative and relatively low-cost mechanism for altering and enhancing the risk-return tradeoff. In order to better grasp these potential applications, this section analyzes call and put options individually and in combination, relative to their potential profit and loss and the effects of time and market sentiment.
22.4.1 Long Call

The purchase of a call option is the simplest and most familiar type of option position. The allure of calls is that they provide the investor a great deal of leverage: potentially large percentage profits can be realized from only a modest price rise in the underlying asset. In fact, the potential profit from buying a call is unlimited. Moreover, the option purchaser has the right but no obligation to exercise the contract. Therefore, should the price of the underlying asset decline over the life of the call, the purchaser need only let the contract expire worthless. Consequently, the risk of a long call position is limited. Figure 22.3 illustrates the profit profile of a long call position. The following summarizes the basic risk-return features of a long-call position.

Profit potential: unlimited
Loss potential: limited (to cost of option)
Effect of time decay: negative (decreases option's value)
Market expectation: bullish

As the profit profile indicates, the time value of a long call declines over time; consequently, an option is a wasting asset. If the underlying asset's price does not move above the exercise price of the option, E, by its expiration date T, the buyer of the call will lose the value of the initial investment (the option premium). Consequently, the longer an investor holds a call, the more time value the option loses, thereby reducing the price of the option. This leads to another important point about taking on an option position. As with any other investment vehicle, the purchaser of a call expresses an opinion about the market for the underlying asset. Whereas an investor can essentially express one of three different sentiments (bullish, neutral, or bearish) about future market conditions, the long call is strictly a bullish position. That is, the call buyer only wins if the underlying asset rises in price.
Fig. 22.3 Profit profile for a long call
Fig. 22.4 Profit profile for a short call
However, depending on the exercise price of the call, the buyer can express differing degrees of bullishness. For instance, since out-of-the-money calls are the cheapest, a large price increase in the underlying asset will make these calls the biggest percentage gainers in value. So an investor who is extremely bullish would probably go with an out-of-the-money call, since its intrinsic value is small and its value will increase along with a large increase in the market.
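The long call's profit profile in Fig. 22.3 is easy to reproduce numerically. A minimal sketch of our own (the strike and premium are illustrative, not from the text):

def long_call_profit(terminal_price, exercise, premium):
    """Profit at expiration of a long call: Max(S_T - E, 0) minus the premium paid."""
    return max(terminal_price - exercise, 0.0) - premium

# A hypothetical $2 call struck at $50, across several terminal prices:
for s_t in (40, 50, 52, 60):
    print(s_t, long_call_profit(s_t, exercise=50.0, premium=2.0))
# Output: -2.0, -2.0, 0.0, 8.0 -- loss capped at the premium, unlimited upside.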
22.4.2 Short Call

Selling a call (writing it) has risk-return characteristics that are the inverse of the long call. However, one major distinction arises when writing calls (or puts) rather than buying them: the writer can either own the underlying asset upon which he or she is selling the option (a covered write) or simply sell the option without owning the asset (a naked write). The difference between the two is of
considerable consequence to the amount of risk and return taken on by the seller. Let us first examine the profit profile and related attributes of the naked short call, displayed in Fig. 22.4. When the writer of a call does not own the underlying asset, his or her potential loss is unlimited. Why? Because if the price of the underlying asset increases, the value of the call also increases for the buyer. The seller of a call is obliged to provide a designated quantity of the underlying asset at some prespecified price (the exercise price) at any time up to the maturity date of the option. So if the asset starts rising dramatically in price and the call buyer exercises his or her right, the naked-call writer must go into the market to buy the underlying asset at whatever the market price is. The naked-call writer suffers the loss of buying the asset at a price S and selling it at a price E when S > E (less the original premium collected). When common stock is the underlying asset, there is no limit to how high its price could go, so the naked-call writer's risk is unlimited as well. Of course, the naked-call writer could have reversed the position by buying
back the original option he sold – that is, zeroing out the position – however, this, too, is done at a loss. The following summarizes the basic risk-return features of a naked short-call position.

Profit potential: limited (to option premium)
Loss potential: unlimited
Effect of time decay: positive (makes buyer's position less valuable)
Market expectation: bearish to neutral

The naked short-call position is obviously a bearish position. If the underlying asset's price moves down, the call writer keeps the entire premium received for selling the call, since the call buyer's position becomes worthless. Once again, the naked-call writer can express the degree of bearishness by the exercise price at which he or she sells the call. By selling an in-the-money call, the writer stands to collect a higher option premium. Conversely, selling an out-of-the-money call conveys only a mildly bearish to neutral expectation. If the underlying asset's price stays where it is, the value of the buyer's position, which is solely time value, will decay to zero, and the call writer will collect the entire premium (though a substantially smaller premium than for an in-the-money call). While the passing of time has a negative effect on the value of a call option for the buyer, it has a positive effect for the seller. One aspect of an option's time value is that in the last month before the option expires, its time value decays most rapidly. Why? Time value is related to the probability that the underlying asset's price will move up or down enough to make an option position increase in value. This probability declines at an accelerating (exponential) rate as the option approaches its maturity date. The consideration of time value, then, is a major element when investing in or hedging with options.
Fig. 22.5 Profit profile for a covered short call
Unless an investor is extremely bullish, it would probably be unwise to take a long position in a call in its last month before maturity. Conversely, the last month of an option's life is a preferred time to sell, since its time value can more easily and quickly be collected. Now consider the other type of short-call position, covered-call writing. Because the seller of the call owns the underlying asset in this case, the risk is truncated. The purpose of writing a call on the underlying asset when it is owned is twofold. First, by writing a call option, one always decreases the risk of owning the asset. Second, writing a call can increase the overall realized return on the asset. The profit profile for a covered short call (or covered write) in Fig. 22.5 provides further illustration. The following summarizes the basic risk-return features of the covered short-call position.

Profit potential: limited (exercise price − asset price + call premium)
Loss potential: limited (asset price − call premium)
Effect of time decay: positive
Market expectation: neutral to mildly bullish

By owning the underlying asset, the covered-call writer's loss on the asset for a price decline is decreased by the original amount of the premium collected for selling the option. The total loss on the position is limited to the extent that the asset is one of limited liability, such as a stock, and cannot fall below zero. The maximum profit on the combined asset and option position is higher than if the option were written alone, but lower than that of simply owning the asset with no short call written on it: once the asset increases in price by a significant amount, the call buyer will very likely exercise the right to purchase the asset at the prespecified exercise price. Thus, covered-call writing is a tool or strategy for enhancing an asset's realized return while lowering its risk in a sideways market.
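The truncation of both risk and return that covered writing produces can be seen in a few lines; this is an illustrative sketch of ours, with hypothetical prices:

def covered_call_profit(terminal_price, purchase_price, exercise, premium):
    """Profit at expiration from owning the asset and writing a call against it:
    asset P&L, minus the written call's payoff, plus the premium collected."""
    asset_pnl = terminal_price - purchase_price
    short_call_payoff = -max(terminal_price - exercise, 0.0)
    return asset_pnl + short_call_payoff + premium

# Stock bought at $50, a $55 call written against it for $2:
for s_t in (40, 50, 55, 70):
    print(s_t, covered_call_profit(s_t, 50.0, 55.0, 2.0))
# Output: -8.0, 2.0, 7.0, 7.0 -- losses cushioned by the $2 premium,
# profit capped at 55 - 50 + 2 = $7.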
22.4.3 Long Put

Again, the put option conveys to its purchaser the right to sell a given quantity of some asset at a prespecified price on or before its expiration date. Similar to a long call, a long put is also a highly leveraged position, but the purchaser of the put makes money on the investment only when the price of the underlying asset declines. While a call buyer has unlimited profit potential, a put buyer has limited profit potential, since the price of the underlying asset can never drop below zero. Yet like the long-call position, the put buyer can never lose more than the initial investment (the option's premium). The profit profile for a long put is shown in Fig. 22.6. The following summarizes the basic risk-return features of a long-put position.

Profit potential: limited (asset price must be greater than zero)
Loss potential: limited (to cost of put)
Effect of time decay: negative or positive
Market expectation: bearish

An interesting pricing ambiguity for this bearish investment is how the put's price is affected by time decay. With the long call there is a clear-cut relation: the effect of time decay is to diminish the value of the call. The relationship is not so clear with the long put. Although at certain prices for the underlying asset the value of the long-put position decreases with time, there exist lower asset prices at which its value will increase with time. It is the put's ambiguous relationship with time that makes its correct price difficult to ascertain. (This topic has been explored in Chap. 5.) One uniquely attractive attribute of the long put is its negative relationship with the underlying asset. In terms of the capital asset pricing model, it has a negative beta (though usually one numerically larger than that of the underlying asset, due to the leverage effect). Therefore, the long put is an ideal hedging instrument for the holder of the underlying asset who wants to protect against a price decline. If the investor is
Fig. 22.6 Profit profile for a long put
wrong and the price of the asset moves up instead, the profit from the asset’s price increase is only moderately diminished by the cost of the put. (See Chap. 23 for more on hedging and related concepts.)
22.4.4 Short Put

As was true for the short-call position, put writing can be covered or uncovered (naked). The risk-return features of the uncovered (naked) short put are discussed first. For taking on the obligation to buy the underlying asset at the exercise price, the put writer receives a premium. The maximum profit for the uncovered put writer is this premium, which is received up front. Figure 22.7 provides further illustration. While the loss potential of the uncovered put writer is limited, it is nonetheless still very large. Thus, someone neutral on the direction of the market would sell out-of-the-money (lower-exercise-price) puts. A more bullish sentiment would suggest that at-the-money options be sold. The investor who is convinced the market will go up should maximize return by selling a put with a larger premium. As with the long put, the time-decay effect is ambiguous and depends on the price of the underlying asset. The following summarizes the basic risk-return features of an uncovered short-put position.

Profit potential: limited (to put premium)
Loss potential: limited (asset price must be greater than zero)
Effect of time decay: positive or negative
Market expectation: neutral to bullish

Referring again to Fig. 22.5 for the combined short-call and long-asset position, notice the striking resemblance of its profit profile at expiration to that of the uncovered short put. This relationship can be seen mathematically by using
Fig. 22.7 Profit profile for an uncovered short put
Fig. 22.8 Profit profile for a long straddle
put-call parity. That is, the synthetic put price is P_T = E + C_T − S_T; at expiration the value of the put equals the exercise price of the call option, plus the call option's value, minus the value at time T of the underlying asset. Buying (writing) a call and selling (buying) the underlying asset (or vice versa) allows an investor to achieve essentially the same risk-return combination as would be received from a long put (short put). This combination of two assets to equal the risk and return of a third is referred to as a synthetic asset (or, in this case, a synthetic option). Synthesizing two financial instruments to resemble a third is an arbitrage process and is a central concept of finance theory. Now a look at covered short puts is in order to round out the basics of option strategies. For margin purposes, and in a theoretical sense, selling a put against a short-asset position would be the sale of a covered put. However, this sort of position has a limited profit potential if the underlying asset is anywhere below the exercise price of the put at expiration. This position also has unlimited upside risk, because the short position in the asset will accrue losses while the profit from the put sale is limited. Essentially, this position is equivalent to the uncovered or naked short call, except that the latter has lower transaction costs. Moreover, because the time value for put options is generally less than that of calls, it will be advantageous to short the call.
Strictly speaking, a short put is covered only if the investor also owns a corresponding put with exercise price equal to or greater than that of the written put. Such a position, called a spread, is discussed later in this chapter.
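The synthetic-put identity at expiration can be verified directly. The check below is our own illustration of the relationship P_T = E + C_T − S_T stated above:

def put_payoff(s_t, exercise):
    """Value of a put at expiration: Max(E - S_T, 0)."""
    return max(exercise - s_t, 0.0)

def synthetic_put_payoff(s_t, exercise):
    """E + C_T - S_T, with C_T = Max(S_T - E, 0), as in the text."""
    return exercise + max(s_t - exercise, 0.0) - s_t

for s_t in (30.0, 40.0, 50.0, 60.0):
    assert put_payoff(s_t, 50.0) == synthetic_put_payoff(s_t, 50.0)
print("synthetic and actual put payoffs agree at expiration")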
22.4.5 Long Straddle

A straddle is a simultaneous position in both a call and a put on the same underlying asset. A long straddle involves purchasing both the call and the put. By combining these two seemingly opposing options, an investor can get the best risk-return combination that each offers. The profit profile for a long straddle in Fig. 22.8 illustrates the nature of this synthetic asset. The following summarizes the basic risk-return features of a long-straddle position.

Profit potential: unlimited on upside, limited on downside
Loss potential: limited (to cost of call and put premiums)
Effect of time decay: negative
Market sentiment: bullish or bearish

The long straddle's profit profile makes clear that its risk-reward picture is simply that of the long call overlapped by the long put, with each horizontal segment truncated
(represented by the horizontal dashed lines on the bottom). An investor will profit on this type of position as long as the price of the underlying asset moves sufficiently up or down to more than cover the original cost of the option premiums. Thus, a long straddle is an effective strategy for someone expecting the volatility of the underlying asset to increase in the future. In the same light, the investor who buys a straddle expects the underlying asset's price volatility to be greater than that imputed in the option price. Since time decay works against the value of this position, it might be unwise to purchase a straddle composed of a call and put in their last month to maturity, when their time decay is greatest. It would be possible to reduce the cost of the straddle by purchasing a high-exercise-price call and a low-exercise-price put (out-of-the-money options); however, the up or down movement in the asset's price necessary to profit is then larger. Example 22.4 provides further illustration.

Example 22.4. Situation: An investor thinks the stock market is going to break sharply up or down but is not sure which way. However, the investor is confident that market volatility will increase in the near future. To express this position the investor puts on a long straddle using options on the S&P 500 index, buying both at-the-money call and put options on the September contract. The current September S&P 500 futures contract price is 155.00. Assume the position is held to expiration.

Transaction:
1. Buy 1 September 155 call at $2.00 (× $500 per contract)    ($1,000)
2. Buy 1 September 155 put at $2.00                           ($1,000)
Net initial investment (position value)                       ($2,000)

Results:
1. If futures price = 150.00:
   (a) 1 September call expires at $0                         ($1,000)
   (b) 1 September put expires at $5.00                        $2,500
   (c) Less initial cost of put                               ($1,000)
   Ending position value (net profit)                          $500
2. If futures price = 155.00:
   (a) 1 September call expires at $0                         ($1,000)
   (b) 1 September put expires at $0                          ($1,000)
   Ending position value (net loss)                           ($2,000)
3. If futures price = 160.00:
   (a) 1 September call expires at $5.00                       $2,500
   (b) 1 September put expires at $0                          ($1,000)
   (c) Less initial cost of call                              ($1,000)
   Ending position value (net profit)                          $500

Summary:
Maximum profit potential: unlimited. If the market had continued to move below 150.00 or above 160.00, the position would have continued to increase in value.
Maximum loss potential: $2,000, the initial investment.
Breakeven points: 151.00 and 159.00 for the September S&P 500 futures contract.2
Effect of time decay: negative, as evidenced by the loss incurred with no change in futures price (result 2).

2 Breakeven points for the straddle are calculated as follows:
Upside BEP = Exercise price + Initial net investment (in points): 159.00 = 155.00 + 4.00
Downside BEP = Exercise price − Initial net investment (in points): 151.00 = 155.00 − 4.00
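Example 22.4's arithmetic generalizes to any terminal futures price; the sketch below (our own) reproduces the results and the breakeven points:

def long_straddle_profit(f_t, exercise, call_premium, put_premium, multiplier=500):
    """Profit at expiration of a long straddle on an index futures contract;
    premiums are in points, with a $ multiplier per point (here $500)."""
    call = max(f_t - exercise, 0.0)
    put = max(exercise - f_t, 0.0)
    return (call + put - call_premium - put_premium) * multiplier

# Example 22.4: September 155 call and put, each bought at 2.00 points.
for f in (150.0, 151.0, 155.0, 159.0, 160.0):
    print(f, long_straddle_profit(f, 155.0, 2.0, 2.0))
# Output: 500.0, 0.0, -2000.0, 0.0, 500.0 -- matching the results above
# and the 151.00/159.00 breakevens.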
22.4.6 Short Straddle

For the most part, the short straddle implies the opposite risk-return characteristics of the long straddle. A short straddle is a simultaneous position in both a short call and a short put on the same underlying asset. Contrary to the long-straddle position, selling a straddle can be an effective strategy when an investor expects little or no movement in the price of the underlying asset. A similar interpretation of its use would be that the investor expects the future volatility of the underlying asset's price that is currently impounded in the option premiums to decline. Moreover, since time decay is a positive effect for the value of this position, one appropriate time to put on a short straddle might be in the last month to expiration for the combined call and put. Figure 22.9 shows the short straddle's profit profile, and Example 22.5 provides further illustration. The following summarizes the basic risk-return features of a short-straddle position.

Profit potential: limited (to call and put premiums)
Loss potential: unlimited on upside, limited on downside
Effect of time decay: positive
Market expectation: neutral
Example 22.5. Situation: An investor thinks the market is overestimating price volatility at the moment and that prices are going to remain stable for some time. To express this opinion, the investor sells a straddle consisting of at-the-money call and put options on the September S&P 500 futures contract, for which the current price is 155.00. Assume the position is held to expiration.

Transaction:
1. Sell 1 September 155 call at $2.00 (× $500 per contract)    $1,000
2. Sell 1 September 155 put at $2.00                           $1,000
Net initial inflow (position value)                            $2,000
Fig. 22.9 Profit profile for a short straddle
Results:
1. If futures price = 150.00:
   (a) 1 September 155 call expires at $0                      $1,000
   (b) 1 September 155 put expires at $5.00                   ($2,500)
   (c) Plus initial inflow from sale of put                    $1,000
   Ending position value (net loss)                           ($500)
2. If futures price = 155.00:
   (a) 1 September 155 call expires at $0                      $1,000
   (b) 1 September 155 put expires at $0                       $1,000
   Ending position value (net profit)                          $2,000
3. If futures price = 160.00:
   (a) 1 September 155 call expires at $5.00                  ($2,500)
   (b) 1 September 155 put expires at $0                       $1,000
   (c) Plus initial inflow from sale of call                   $1,000
   Ending position value (net loss)                           ($500)
Summary:
Maximum profit potential: $2,000, result 2, where the futures price does not move.
Maximum loss potential: unlimited. If the futures price had continued up over 160.00 or down below 150.00, this position would have kept losing money.
Breakeven points: 151.00 and 159.00, an eight-point range for profitability of the position.³
Effect of time decay: positive, as evidenced by result 2.

³ Breakeven points for the short straddle are calculated in the same manner as for the long straddle: exercise price plus or minus the initial prices of the options.
22.4.7 Long Vertical (Bull) Spread

When dealing strictly in options, a spread is a combination of any two or more of the same type of options (two calls or two puts, for instance) on the same underlying asset. A vertical spread specifies that the options have the same
maturity month. Finally, a long vertical spread designates a position for which one has bought a low-exercise-price call (or a low-exercise-price put) and sold a high-exercise-price call (or a high-exercise-price put) that both mature in the same month. A long vertical spread is also known as a bull spread because of the bullish market expectation of the investor who enters into it. Actually, the long vertical spread (or bull spread) is not a strongly bullish position, because the investor limits the profit potential in selling the high-exercise-price call (or high-exercise-price put). Rather, this is a popular position when it is expected that the market will more likely go up than down. Therefore, the bull spread conveys a bit of uncertainty about future market conditions. Of course, the higher the exercise price at which the call is sold, the more bullish the position. An examination of the profit profile for the long vertical spread (see Fig. 22.10) can tell more about its risk-return attributes. The following summarizes the basic risk-return features for the profit profile of a long-vertical-spread position.
Profit potential: limited (up to the higher exercise price)
Loss potential: limited (down to the lower exercise price)
Effect of time decay: mixed
Market expectation: cautiously bullish
Although profit is limited by the shorted call on the upside, the loss potential is also truncated at the lower exercise price by the same short call. There are other reasons for this being a mildly bullish strategy. The effect of time decay is ambiguous up to the expiration or liquidation of the position. That is, if the asset price St is near the exercise price of the higher-price option EH, the position acts more like a short call and the time-decay effect is positive. Conversely, if St is near the exercise price of the lower-price option EL, the bull spread acts more like a long call and the time-decay effect is negative. Consequently, unless an investor is more than mildly bullish, it would probably be unwise to put on a bull spread with the low-exercise-price call near the current price of the asset while both options are in their last month to expiration. Example 22.6 provides further illustration.
Fig. 22.10 Profit profile for a long vertical spread
Example 22.6. Situation: An investor is moderately bullish on the West German mark. He would like to be long but wants to reduce the cost and risk of this position in case he is wrong. To express his opinion, the investor puts on a long vertical spread by buying a lower-exercise-price call and selling a higher-exercise-price call with the same month to expiration. Assume the position is held to expiration.

Transaction:
1. Buy 1 September 0.37 call at $0.0047 ($125,000 per contract)   ($587.50)
2. Sell 1 September 0.38 call at $0.0013   $162.50
Net initial investment (position value)   ($425.00)

Results:
1. If futures price = 0.37:
(a) 1 September 0.37 call expires at $0   ($587.50)
(b) 1 September 0.38 call expires at $0   $162.50
Ending position value (net loss)   ($425.00)
2. If futures price = 0.38:
(a) 1 September 0.37 call expires at $0.01   $1,250.00
(b) 1 September 0.38 call expires at $0   $162.50
(c) Less initial cost of 0.37 call   ($587.50)
Ending position value (net profit)   $825.00
3. If futures price = 0.39:
(a) 1 September 0.37 call expires at $0.02   $2,500.00
(b) 1 September 0.38 call expires at $0.01   ($1,250.00)
(c) Less initial premium of 0.37 call   ($587.50)
(d) Plus initial premium of 0.38 call   $162.50
Ending position value (net profit)   $825.00
Summary:
Maximum profit potential: $825.00, result 2.
Maximum loss potential: $425.00, result 1.
Breakeven point: 0.3734.⁴
Effect of time decay: mixed; positive if the price is at the high end of the range and negative if at the low end.

⁴ The breakeven point for the long vertical spread is computed as the lower exercise price plus the price of the long call minus the price of the short call (0.3734 = 0.3700 + 0.0047 − 0.0013).
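The same mechanical check works for the bull spread. The sketch below (ours, not from the original text) reproduces Example 22.6 from its own inputs: long the 0.37 call at $0.0047, short the 0.38 call at $0.0013, with a 125,000-unit contract.

```python
def bull_spread_profit(price, low_strike=0.37, high_strike=0.38,
                       long_premium=0.0047, short_premium=0.0013,
                       contract_size=125_000):
    """Expiration profit of a long vertical (bull) spread built from calls."""
    long_call = max(price - low_strike, 0.0) - long_premium
    short_call = short_premium - max(price - high_strike, 0.0)
    return (long_call + short_call) * contract_size

# The three scenarios of Example 22.6: -$425.00, $825.00, $825.00.
for p in (0.37, 0.38, 0.39):
    print(p, round(bull_spread_profit(p), 2))

# Breakeven: lower strike plus the net premium paid (footnote 4).
print(round(0.37 + 0.0047 - 0.0013, 4))  # 0.3734
```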
22.4.8 Short Vertical (Bear) Spread

The short vertical spread is simply the reverse of the corresponding long position. That is, an investor buys a high-exercise-price call (or put) and sells a low-exercise-price call (or put), both having the same time to expiration left. As the more common name for this type of option position is bear spread, it is easy to infer the type of market sentiment consistent with this position. The profit profile for the short vertical spread is shown in Fig. 22.11. As the profit profile indicates, this strategy is profitable as long as the underlying asset moves down in price. Profit is limited to a price decline in the asset down to the lower exercise price, while risk is limited on the upside by the long-call position. From the time-decay effects shown, a mildly bearish investor might consider using options in the last month to expiration with the EL option near the money. The following summarizes the basic risk-return features for the profit profile of a short-vertical-spread position.
Profit potential: limited (down to EL)
Loss potential: limited (up to EH)
Effect of time decay: mixed (opposite to that of the long vertical spread)
Market sentiment: mildly bearish
Fig. 22.11 Profit profile for a short vertical spread
Fig. 22.12 Profit profile for a neutral calendar spread
22.4.9 Calendar (Time) Spreads

A calendar spread (also called a time or horizontal spread) consists of the sale of one option and the simultaneous purchase of another option of the same type with the same exercise price but a longer term to maturity. The objective of the calendar spread is to capture the faster erosion in the time-premium portion of the shorted nearer-term-to-maturity option. By taking positions in two of the same type of options (two calls or two puts), both with the same exercise price, the investor utilizing this strategy expresses a neutral opinion on the market. In other words, the investor is interested in selling time rather than predicting the price direction of the underlying asset. Thus, a calendar spread might be considered appropriate for a sideways-moving or quiet market. However, if the underlying asset's price moves significantly up or down, the calendar spread will lose part of its original value. Figure 22.12 displays the calendar spread's profit profile and related risk-return attributes.
The profit profile shows that this strategy will make money for a rather narrow range of price movement in the underlying asset. While similar in nature to the short straddle (both are neutral strategies), the calendar spread is more conservative. The reason? It has both a lower profit potential and lower (limited) risk than the short straddle. The lower potential profit is the result of only benefiting from the time decay in one option premium instead of two (the call and the put) for the short straddle. Moreover, taking opposite positions in the same type of option at the same exercise price adds a loss limit on each side against adverse price moves. The following summarizes the basic risk-return features for the profit profile of a neutral calendar-spread position. Profit potential: limited Loss potential: limited (to original cost of position) Effect of time decay: positive. (Option sold loses value faster than option bought.) Market sentiment: neutral
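To make the time-decay argument concrete, the sketch below prices the two legs of a neutral calendar spread with the Black-Scholes model developed in Chap. 23 and holds the stock price fixed for one month. All numerical inputs (stock and strike at 100, a 5% rate, 25% volatility, 3- and 6-month maturities) are illustrative assumptions of ours, not figures from the text.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(P, E, r, sigma, t):
    """Black-Scholes value of a European call (see Chap. 23)."""
    d1 = (log(P / E) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return P * norm_cdf(d1) - exp(-r * t) * E * norm_cdf(d2)

P, E, r, sigma = 100.0, 100.0, 0.05, 0.25
# Spread value = long 6-month call minus short 3-month call.
value_today = bs_call(P, E, r, sigma, 6 / 12) - bs_call(P, E, r, sigma, 3 / 12)
# One month later, with the stock price unchanged.
value_later = bs_call(P, E, r, sigma, 5 / 12) - bs_call(P, E, r, sigma, 2 / 12)
print(round(value_today, 2), round(value_later, 2))
# The spread widens: the shorted near-term option decays faster than the
# longer-term option bought, which is the source of the strategy's profit.
```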
The calendar spread does not have to be neutral in sentiment. By diagonalizing this spread it is possible to express an opinion on the market. For instance, by selling a near-term higher-exercise-price option and purchasing a longer-term lower-exercise-price option, the investor is taking a bullish position. Such a position is thus referred to as a bullish calendar spread. Why is it bullish? Remember that with the neutral calendar spread we are concerned solely with benefiting from the faster time decay in the premium of the shorted near-term option. Any significant movement in price upwards, for instance, would not have been profitable because it would have slowed the time decay and increased the intrinsic value of the shorted near-term option. In fact, we would eventually lose money because the difference in premiums between the near-term and the longer-term options (the spread) would narrow as the underlying asset's price increased. However, the bullish calendar spread is much like a long vertical (or bull) spread in that a modest increase in price for the asset up to the higher exercise price will be profitable. At the same time, though, the bullish calendar spread also reaps some of the benefits from the greater time decay in the nearer-term option's premium. While this strategy might sound superior to the straight bull spread, its performance really depends on market conditions. With a bullish calendar spread, the gain from time decay will probably not be as great as that from a neutral calendar spread, nor will its bullish nature be as profitable as a straight bull spread in the event of a modest price increase for the underlying asset. Real-world applications are discussed in the next section.
22.5 Examples of Alternative Option Strategies

In this section we consider several other option strategies, using options on Constellation Energy Group, Inc. (CEG) in the following examples. Table 22.4 shows the quotes published on July 13, 2007 for all options expiring in October 2007. CEG stock closed at $94.21 on July 13, 2007.

Table 22.4 Call and put option quotes for CEG at 07/13/2007

Call options expiring close Fri Oct 19, 2007
Strike   Symbol      Bid     Ask
70       CEGJN.X     23.5    25.5
75       CEGJO.X     19      20.9
80       CEGJP.X     14.6    16.4
85       CEGJQ.X     11.4    12.2
90       CEGJR.X     8       8.5
95       CEGJS.X     5       5.4
100      CEGJT.X     2.85    3.2
105      CEGJA.X     1.5     1.75
110      CEGJB.X     0.65    0.9
115      CEGJC.X     0.2     0.45

Put options expiring close Fri Oct 19, 2007
Strike   Symbol      Bid     Ask
70       CEGVN.X     0.15    0.35
75       CEGVO.X     0.4     0.65
80       CEGVP.X     0.9     1.15
85       CEGVQ.X     1.6     1.85
90       CEGVR.X     2.95    3.4
95       CEGVS.X     4.9     5.5
100      CEGVT.X     7.7     8.6
22.5.1 Protective Put

Assume that an investor wants to invest in CEG stock on July 13, 2007, but does not wish to bear any potential loss for prices below $95. The investor can purchase CEG stock and at the same time buy the put option CEGVS.X with a strike price of $95.
Let S0, ST, and X denote the stock purchase price, the stock price at the expiration time T, and the strike price, respectively. Given S0 = $94.21, X = $95, and a put premium of $5.50, Table 22.5 shows the values of the protective put at different stock prices at time T. The profit profile of the protective put position is constructed in Fig. 22.13.

Table 22.5 Value of protective put position at option expiration
Long a put at strike price $95.00 (premium $5.50); buy one share of stock (price $94.21).

Stock     One share of stock     Long put (X = $95)     Protective put value
price     Payoff     Profit      Payoff     Profit      Payoff     Profit
$70.00    $70.00     -$24.21     $25.00     $19.50      $95.00     -$4.71
$75.00    $75.00     -$19.21     $20.00     $14.50      $95.00     -$4.71
$80.00    $80.00     -$14.21     $15.00     $9.50       $95.00     -$4.71
$85.00    $85.00     -$9.21      $10.00     $4.50       $95.00     -$4.71
$90.00    $90.00     -$4.21      $5.00      -$0.50      $95.00     -$4.71
$95.00    $95.00     $0.79       $0.00      -$5.50      $95.00     -$4.71
$100.00   $100.00    $5.79       $0.00      -$5.50      $100.00    $0.29
$105.00   $105.00    $10.79      $0.00      -$5.50      $105.00    $5.29
$110.00   $110.00    $15.79      $0.00      -$5.50      $110.00    $10.29
$115.00   $115.00    $20.79      $0.00      -$5.50      $115.00    $15.29
$120.00   $120.00    $25.79      $0.00      -$5.50      $120.00    $20.29

Fig. 22.13 Profit profile for protective put (profit at expiration versus stock price for one share of stock, the long put with X = $95, and the combined protective put position)
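Table 22.5 can be reproduced line by line with a few lines of Python. The sketch below is ours, not part of the original text, and uses only the inputs stated above: S0 = $94.21, X = $95, and the $5.50 put premium.

```python
S0, X, put_premium = 94.21, 95.0, 5.50

print("S_T    stock profit   put profit   protective put profit")
for ST in range(70, 125, 5):
    stock_profit = ST - S0
    put_profit = max(X - ST, 0.0) - put_premium
    total = stock_profit + put_profit  # profit of stock plus put
    print(f"{ST:<6} {stock_profit:>12.2f} {put_profit:>12.2f} {total:>12.2f}")

# Below the $95 strike the combined profit is floored at
# 95 - 94.21 - 5.50 = -4.71, as in Table 22.5.
```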
22.5.2 Covered Call

This strategy involves investing in a stock and selling a call option on the stock at the same time. The value at the expiration of the call will be the stock value minus the value of the call. The call is "covered" because the potential obligation of delivering the stock is covered by the stock held in the portfolio. In essence, the sale of the call sells off the claim to any stock value above the strike price in return for the initial premium. Suppose a manager of a stock fund holds 1,000 shares of CEG stock on July 13, 2007 and plans to sell the CEG stock if its price hits $100. She can then write call options on 1,000 shares using CEGJT.X, with a strike price of $100, to establish the position. She shorts the call and collects the premium. Given that S0 = $94.21, X = $100, and the call premium is $2.85, Table 22.6 shows the values of the covered call at different stock prices at time T. The profit profile of the covered call position is constructed in Fig. 22.14.

Table 22.6 Value of covered call position at option expiration
Write a call at strike price $100.00 (premium $2.85); buy one share of stock (price $94.21).

Stock     One share of stock     Written call (X = $100)     Covered call
price     Payoff     Profit      Payoff      Profit          Payoff     Profit
$70.00    $70.00     -$24.21     $0.00       $2.85           $70.00     -$21.36
$75.00    $75.00     -$19.21     $0.00       $2.85           $75.00     -$16.36
$80.00    $80.00     -$14.21     $0.00       $2.85           $80.00     -$11.36
$85.00    $85.00     -$9.21      $0.00       $2.85           $85.00     -$6.36
$90.00    $90.00     -$4.21      $0.00       $2.85           $90.00     -$1.36
$95.00    $95.00     $0.79       $0.00       $2.85           $95.00     $3.64
$100.00   $100.00    $5.79       $0.00       $2.85           $100.00    $8.64
$105.00   $105.00    $10.79      -$5.00      -$2.15          $100.00    $8.64
$110.00   $110.00    $15.79      -$10.00     -$7.15          $100.00    $8.64
$115.00   $115.00    $20.79      -$15.00     -$12.15         $100.00    $8.64
$120.00   $120.00    $25.79      -$20.00     -$17.15         $100.00    $8.64

Fig. 22.14 Profit profile for covered call (profit at expiration versus stock price for one share of stock, the written call with X = $100, and the combined covered call position)
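The covered call table is equally mechanical. A minimal sketch (ours, not from the original text), using S0 = $94.21, X = $100, and the $2.85 premium:

```python
S0, X, call_premium = 94.21, 100.0, 2.85

for ST in range(70, 125, 5):
    stock_profit = ST - S0
    written_call_profit = call_premium - max(ST - X, 0.0)
    print(ST, round(stock_profit + written_call_profit, 2))

# Above the $100 strike the profit is capped at
# 100 - 94.21 + 2.85 = 8.64, as in Table 22.6.
```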
22.5.3 Collar

A collar combines a protective put and a short call option to bracket the value of a portfolio between two bounds. For example, an investor holds CEG stock selling at $94.21. Buying a protective put using the put option CEGVQ.X with an exercise price of $85 places a lower bound of $85 on the value of the portfolio. At the same time, the investor can write a call option CEGJA.X with an exercise price of $105. The call and the put sell at $1.50 and $1.85, respectively, making the net outlay for the two options only $0.35. Table 22.7 shows the values of the collar position at different stock prices at time T. The profit profile of the collar position is shown in Fig. 22.15.

Table 22.7 Value of collar position at option expiration
Long a put at strike price $85.00 (premium $1.85); write a call at strike price $105.00 (premium $1.50); buy one share of stock (price $94.21).

Stock     One share of stock     Long put (X = $85)     Write call (X = $105)     Collar value
price     Payoff     Profit      Payoff     Profit      Payoff      Profit        Payoff     Profit
$70.00    $70.00     -$24.21     $15.00     $13.15      $0.00       $1.50         $85.00     -$9.56
$75.00    $75.00     -$19.21     $10.00     $8.15       $0.00       $1.50         $85.00     -$9.56
$80.00    $80.00     -$14.21     $5.00      $3.15       $0.00       $1.50         $85.00     -$9.56
$85.00    $85.00     -$9.21      $0.00      -$1.85      $0.00       $1.50         $85.00     -$9.56
$90.00    $90.00     -$4.21      $0.00      -$1.85      $0.00       $1.50         $90.00     -$4.56
$95.00    $95.00     $0.79       $0.00      -$1.85      $0.00       $1.50         $95.00     $0.44
$100.00   $100.00    $5.79       $0.00      -$1.85      $0.00       $1.50         $100.00    $5.44
$105.00   $105.00    $10.79      $0.00      -$1.85      $0.00       $1.50         $105.00    $10.44
$110.00   $110.00    $15.79      $0.00      -$1.85      -$5.00      -$3.50        $105.00    $10.44
$115.00   $115.00    $20.79      $0.00      -$1.85      -$10.00     -$8.50        $105.00    $10.44
$120.00   $120.00    $25.79      $0.00      -$1.85      -$15.00     -$13.50       $105.00    $10.44

Fig. 22.15 Profit profile for collar (profit at expiration versus stock price for one share of stock, the long put with X = $85, the written call with X = $105, and the combined collar position)
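A minimal sketch of the collar arithmetic (ours, not from the original text), using the $85 put at $1.85 and the $105 call at $1.50 quoted above:

```python
S0 = 94.21
put_strike, put_premium = 85.0, 1.85
call_strike, call_premium = 105.0, 1.50

for ST in range(70, 125, 5):
    profit = ((ST - S0)                                     # one share of stock
              + max(put_strike - ST, 0.0) - put_premium     # long put
              + call_premium - max(ST - call_strike, 0.0))  # written call
    print(ST, round(profit, 2))

# The profit is bracketed between 85 - 94.21 - 0.35 = -9.56
# and 105 - 94.21 - 0.35 = 10.44, as in Table 22.7.
```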
22.6 Conclusion

This chapter has introduced some of the essential differences between the two most basic kinds of options: calls and puts. A delineation was made of the relationship between the option's price, or premium, and that of the underlying asset. The option's value was shown to be composed of intrinsic value, or the underlying asset price less the exercise price, and time value. Moreover, it was demonstrated that an option's time value decays over time, particularly in the last month to maturity. Index and futures options were studied to introduce these important financial instruments. Put-call parity theorems were developed for European, American, and futures options in order to show the basic valuation relationship between the underlying asset and its call and put options. Finally, investment applications of options and related combinations were discussed, along with their relevant risk-return characteristics.
References

Amram, M. and N. Kulatilaka. 2001. Real options, Oxford University Press, USA.
Ball, C. and W. Torous. 1983. "Bond price dynamics and options." Journal of Financial and Quantitative Analysis 18, 517–532.
Bhattacharya, M. 1980. "Empirical properties of the Black-Scholes formula under ideal conditions." Journal of Financial and Quantitative Analysis 15, 1081–1106.
Black, F. 1972. "Capital market equilibrium with restricted borrowing." Journal of Business 45, 444–455.
Black, F. 1975. "Fact and fantasy in the use of options." Financial Analysts Journal 31, 36–72.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–654.
Bodurtha, J. and G. Courtadon. 1986. "Efficiency tests of the foreign currency options market." Journal of Finance 41, 151–162.
Bookstaber, R. M. 1981. Option pricing and strategies in investing, Addison-Wesley Publishing Company, Reading, MA.
Bookstaber, R. M. and R. Clarke. 1983. Option strategies for institutional investment management, Addison-Wesley Publishing Company, Reading, MA.
Brennan, M. and E. Schwartz. 1977. "The valuation of American put options." Journal of Finance 32, 449–462.
Cox, J. C., S. A. Ross, and M. Rubinstein. 1979. "Option pricing: a simplified approach." Journal of Financial Economics 7, 229–263.
Cox, J. C. and M. Rubinstein. 1985. Options markets, Prentice-Hall, Englewood Cliffs, NJ.
Eckardt, W. and S. Williams. 1984. "The complete options indexes." Financial Analysts Journal 40, 48–57.
Evnine, J. and A. Rudd. 1985. "Index options: the early evidence." Journal of Finance 40, 743–756.
Finnerty, J. 1978. "The Chicago Board Options Exchange and market efficiency." Journal of Financial and Quantitative Analysis 13, 28–38.
Galai, D. and R. W. Masulis. 1976. "The option pricing model and the risk factor of stock." Journal of Financial Economics 3, 53–81.
Galai, D., R. W. Masulis, R. Geske, and S. Givots. 1988. Option markets, Addison-Wesley Publishing Company, Reading, MA.
Gastineau, G. 1979. The stock options manual, McGraw-Hill, New York.
Geske, R. and K. Shastri. 1985. "Valuation by approximation: a comparison of alternative option valuation techniques." Journal of Financial and Quantitative Analysis 20, 45–72.
Hull, J. 2005. Options, futures, and other derivatives, 6th Edition, Prentice Hall, Upper Saddle River, NJ.
Jarrow, R. A. and A. Rudd. 1983. Option pricing, Richard D. Irwin, Homewood, IL.
Jarrow, R. A. and S. Turnbull. 1999. Derivative securities, 2nd Edition, South-Western College Publishing, Cincinnati, OH.
Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York.
Liaw, K. T. and R. L. Moy. 2000. The Irwin guide to stocks, bonds, futures, and options, McGraw-Hill, New York.
Macbeth, J. and L. Merville. 1979. "An empirical examination of the Black-Scholes call option pricing model." Journal of Finance 34, 1173–1186.
McDonald, R. L. 2005. Derivatives markets, 2nd Edition, Addison-Wesley, Boston, MA.
Merton, R. 1973. "Theory of rational option pricing." Bell Journal of Economics and Management Science 4, 141–183.
Rendleman, R. J. Jr. and B. J. Bartter. 1979. "Two-state option pricing." Journal of Finance 34, 1093–1110.
Ritchken, P. 1987. Options: theory, strategy and applications, Scott, Foresman.
Rubinstein, M. and H. Leland. 1981. "Replicating options with positions in stock and cash." Financial Analysts Journal 37, 63–75.
Rubinstein, M., H. Leland, and J. Cox. 1985. Option markets, Prentice-Hall, Englewood Cliffs, NJ.
Sears, S. and G. Trennepohl. 1982. "Measuring portfolio risk in options." Journal of Financial and Quantitative Analysis 17, 391–410.
Smith, C. 1976. "Option pricing: a review." Journal of Financial Economics 3, 3–51.
Stoll, H. 1969. "The relationship between put and call option prices." Journal of Finance 24, 801–824.
Summa, J. F. and J. W. Lubow. 2001. Options on futures, John Wiley & Sons, New York.
Trennepohl, G. 1981. "A comparison of listed option premium and Black-Scholes model prices: 1973–1979." Journal of Financial Research 4, 11–20.
Weinstein, M. 1983. "Bond systematic risk and the options pricing model." Journal of Finance 38, 1415–1430.
Welch, W. 1982. Strategies for put and call option trading, Winthrop, Cambridge, MA.
Whaley, R. 1982. "Valuation of American call options on dividend paying stocks: empirical tests." Journal of Financial Economics 10, 29–58.
Zhang, P. G. 1998. Exotic options: a guide to second generation options, 2nd Edition, World Scientific, Singapore.
Chapter 23
Option Pricing Theory and Firm Valuation Cheng Few Lee, Joseph E. Finnerty, and Wei-Kang Shih
Abstract In this chapter, we introduce the basic concepts of call and put options. Second, we discuss the Black-Scholes option pricing model and its application. Third, we discuss how to apply option pricing theory to capital structure issues. Finally, the warrant, one type of equity option, is discussed in detail.

Keywords Warrants · Executive stock option · Publicly traded option · Call option · Put option · American option · European option · Exercise price · Time to maturity · Black-Scholes option pricing model · Capital structure
23.1 Introduction

We discuss the emergence of options and option pricing, including several types of options and how their value is determined. We begin by looking at the basic concepts of options in Sect. 23.2, then go on to discuss factors that affect the value of options in Sect. 23.3. Hedging, the hedge ratio, and option valuation are discussed in Sect. 23.4. Section 23.5 discusses how option pricing theory is used to investigate the capital structure question. We close the chapter with a look at the type of option called the warrant in Sect. 23.6; Sect. 23.7 summarizes the chapter.
23.2 Basic Concepts of Options

In general, there are three types of equity options: (1) warrants, (2) executive stock options, and (3) publicly traded options. A warrant is a financial instrument issued by a corporation that gives the purchaser the right to buy a fixed number of shares at a set price for a specific period.
There are two major differences between a warrant and a publicly traded option. The first is that the maturity of the warrant is normally much longer than the 9 months or less typical of a publicly traded option. The second difference is that the warrant is an agreement between the corporation and the warrant's buyer. This means that if the warrant's owner decides to exercise his right and purchase stock, the corporation issues new shares and receives the cash from the sale of those shares. The publicly traded option is an agreement between two individuals who have no relationship with the corporation whose shares are being optioned. When the publicly traded option is exercised, money is exchanged for shares between individuals, and the corporation receives no funds, only a new owner. Executive stock options are a means of compensation for corporate employees. For services rendered, the manager or the employee has the right to buy a specific number of shares for a set price during a given period. Unlike warrants and publicly traded options, executive stock options cannot be traded. The option's owner has only two choices: exercise the option or let it expire. Like a warrant, should the owner decide to exercise the option, the corporation receives money and issues new shares. The use of executive stock options for management compensation raises an interesting agency question. The firm's managers may make investment and financing decisions that increase the firm's risk in order to increase the value of their stock options. Such action could have a detrimental effect on the bondholders and other creditors of the firm. Thus, we will see that the value of an option is directly related to the variability or riskiness of the underlying asset, which in this case is the firm. Publicly traded options are probably the most widely known of the three types of equity option instruments. An important date in the history of these options is 1973, when the Chicago Board Options Exchange was founded. Although it was possible to trade options over the counter before that time, trading volume was relatively low. This date marks the beginning of a phenomenal growth in the popularity of options as a financial instrument. Indeed, in terms of the value of securities traded, the Chicago Board Options Exchange is running neck and neck with the New York and Tokyo stock
exchanges as the world’s largest securities market. There are now five options-trading centers in the United States – the Chicago Board Option Exchange, the American Stock Exchange, the Philadelphia Stock Exchange, the Pacific Stock Exchange, and the New York Stock Exchange – and there is a steady stream of proposals for new listings on these exchanges. The share volume of trading and the general acceptance of these financial instruments renders options valuation an important subject for study. However, even more significant for our purposes, the theory of options valuation has important applications in financial management and in the valuation of other financial instruments. Black and Scholes (1973) point out those corporate securities can be viewed as options or contingent claims on firm value. In viewing a firm’s securities as options, we need to evaluate the interdependencies of bonds and stocks. As we have seen, a bond’s price is the present value of future interest and principal payments, and a stock’s price is determined by future dividends. The contingent-claims approach to security valuation differs in that it considers the valuation of all of the firm’s classes of securities simultaneously. Bonds and stocks are valued in terms on the value placed on the firm’s assets; that is, their value is contingent on the firm’s assets or investments. Banz and Miller (1978) developed an approach to making capital budgeting decisions based on a contingent claims framework. They created a method for calculating the NPV of an investment based on estimates derived from using an option valuation model. In this chapter, we consider the question of option valuation and how options are used to make financial management decisions. However, before doing so, we must become acquainted with some of the terminology of options trading. There are two basic types of options: puts and calls. A call gives the holder the right to buy a particular number of shares of a designated common stock at a specified price, called the exercise price (or strike price), on or before a given date, known as the expiration date. Hence, the owner of shares of common stock can create an option and sell it in the options market, thereby increasing the return or income on his or her stock investment. Such an option specifies both an exercise price and an expiration date. On the Chicago Board of Options Exchange, options are typically created for 3, 6, or 9 months. The actual expiration date is the third Saturday of the month of expiration. A more venturesome investor may create an option in this fashion without owning any of underlying stock. This is called naked option writing and can be very risky, especially if the value of the underlying asset has a high degree of variability. A put gives the holder the right to sell a certain number of shares of common stock at a set price on or before the expiration date of the option. In purchasing a put, the owner of shares has bought the right to sell these shares by the expira-
expiration date at the exercise price. As with calls, the creator can own the underlying shares (covered writing) or not (naked writing). The owner of a put or call is not obligated to carry out the specified transaction, but has the option of doing so. If the transaction is carried out, it is said to have been exercised. For example, if you hold a call option on a stock that is currently trading at a price higher than the exercise price, you would want to exercise the option, because the stock could be immediately resold at a profit. (You also could sell the option or hold it in the hope of further gains.) This call option is said to be "in the money." On the other hand, if the call option is "out of the money" – that is, the stock is trading at a lower price – you certainly would not want to exercise the option, as it would be cheaper to purchase the stock directly. An American option can be exercised at any time up to the expiration date. A simpler instrument to analyze is the European option, which can only be exercised on the expiration date. In this case, the term to maturity of the option is known. Because of this simplifying factor, we will concentrate on the valuation of the European option. The factors determining the values of the two types of options are the same, although, all other things equal, an American option is worth more than a European option because of the extra flexibility permitted the option holder. Although our discussion is limited to equity options, many kinds of publicly traded options are available. These include options on stock indexes, options on Treasury bonds, options on futures contracts, options on foreign currencies, and options on agricultural commodities. The preceding discussion presented quite a lot of new terminology. For convenience, we list and define those terms below.

Call: An option to purchase a fixed number of shares of common stock
Put: An option to sell a fixed number of shares of common stock
Exercise price: Trading price set for the transaction as specified in an option contract
Expiration date: Time by which the option transaction must be carried out
Exercise option: Carrying out the transaction specified in an option
American option: An option in which the transaction can be carried out at any time up to the expiration date
European option: An option in which the transaction can only be carried out at the expiration date
Call option "in the money": If the stock price is above the exercise price
Put option "in the money": If the stock price is below the exercise price
Call option "out of the money": If the stock price is below the exercise price
Put option "out of the money": If the stock price is above the exercise price
23.2.1 Option Price Information

Each day the previous day's options trading is reported in the press. Exhibit 23.1 is a partial listing of equity options traded on the Chicago Board Options Exchange. In this exhibit, we have highlighted the options for Johnson & Johnson (JNJ). The first three columns of prices (September, October, and January) are for call option contracts on Johnson & Johnson, each for 100 shares of stock at the exercise price shown; 65.12 represents the closing price of Johnson & Johnson shares on the New York Stock Exchange. The numbers in the September column represent the price of one call option with a September expiration date. Since each option is the right to buy 100 shares, the cost of the call option with the strike price $55 is $1,030, or $10.30 × 100. The values of the October and January call options are somewhat higher, reflecting the fact that if you owned those options you would have a longer time to exercise them.
In general, this time premium is reflected in higher prices for options with longer lives. The last three columns (September, October, and January) represent the value of put options on Johnson & Johnson stock. Again, each put option contract is for 100 shares and is worth 100 times the price, or $2 for the October 55 Johnson & Johnson puts. The specific factors that determine option value are discussed next. How much should you pay for a put or call option? The answer is not easy to determine. However, it is possible to list the various factors that determine option value. We begin with a simple question: how much is a call option worth on its expiration date? The question is simple because we are in a deterministic world. The uncertain future movement of the price of the stock in question is irrelevant. The call must be exercised immediately or not at all. If a call option is out of the money on its expiration date, it will not be exercised, and the call becomes a worthless piece of paper. On the other hand, if the call option is in the money on its expiration date, it will be exercised. Stock can be purchased at the exercise price and immediately resold at the market price, if so desired. The option value is the difference between these two prices. On the call option's expiration date, its value will be either 0 or some positive amount equal to the difference between the market price of the stock and the exercise price of the option.
Exhibit 23.1 Listed options quotations

JNJ (close price 65.12)
Strike    Calls                     Puts
price     Sep     Oct     Jan       Sep     Oct     Jan
45.00     20.40   N/A     N/A       N/A     N/A     N/A
50.00     N/A     N/A     N/A       N/A     N/A     N/A
55.00     10.30   10.40   11.10     N/A     0.02    0.30
60.00     5.30    N/A     6.40      N/A     N/A     0.70
65.00     0.10    1.15    2.60      0.05    0.85    1.95
70.00     N/A     0.05    0.55      4.80    4.50    5.00

MRK (close price 51.82)
45.00     6.70    7.08    N/A       N/A     0.15    0.85
47.50     4.34    4.90    6.04      N/A     N/A     1.30
50.00     1.85    2.70    4.40      0.03    0.65    2.10
52.50     0.03    1.10    2.85      0.55    1.50    3.10
55.00     0.01    0.35    1.70      3.20    3.30    4.60
57.50     N/A     N/A     0.95      5.70    N/A     N/A
60.00     N/A     N/A     0.50      8.20    N/A     8.40

PG (close price 69.39)
40.00     29.70   29.70   30.10     N/A     N/A     N/A
45.00     24.60   24.70   N/A       N/A     N/A     N/A
55.00     14.50   N/A     15.20     N/A     N/A     0.15
60.00     9.70    9.80    10.70     N/A     0.08    0.42
65.00     4.50    4.90    6.19      N/A     0.21    1.00
70.00     0.05    0.90    2.75      0.60    1.50    2.65
75.00     N/A     0.05    0.70      5.50    5.70    5.70
80.00     N/A     N/A     0.15      10.40   10.50   N/A
Symbolically, let P = price of the stock, Vc = value of the call option, and E = exercise price. Then Vc = P − E if E ≤ P, and Vc = 0 if P < E. This relationship can be written as

Vc = MAX(0, P − E)

where MAX denotes the larger of the two bracketed terms. For a put option (Vp), Vp = E − P if E > P, and Vp = 0 if E ≤ P. This can be written as

Vp = MAX(0, E − P).

The call position is illustrated in Fig. 23.1a, where we consider a call option with an exercise price of $50. The figure shows the option value as a function of the stock price to the option holder. For any price of the stock up to $50, the call option is worthless. The option's value increases as the stock price rises above $50. Hence, if on the expiration date the stock is trading at $60, the call option is worth $10. Figure 23.1b, which is the mirror image of (a), shows the option position from the viewpoint of the writer of the call option. If the stock is trading below the exercise price on the expiration date, the call option will not be exercised and the seller incurs no loss.
Fig. 23.1 Value of $50 exercise price call option (a) to holder, (b) to seller
However, if the stock is trading at $60, the seller of the call option will be required to sell the stock for $50, which is $10 below the price that could be obtained on the market. We have seen that once a call option has been purchased, the holder of the option has the possibility of obtaining gains but cannot incur losses. Correspondingly, the writer of the option may incur losses but cannot achieve any gains after receiving the premium. Further, there is no net gain in the sense that the profit earned by one party balances the loss of the other. Thus, options are zero-sum securities – they merely transfer wealth rather than create it. To acquire this instrument, a price must be paid to the writer, and it is this price that corresponds to the call value. This value is the price paid to acquire the chance of future profit and will therefore reflect uncertainty about future market prices of the common stock. We will now discuss how to determine the value of an option before its expiration date.
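The two expiration-date formulas translate directly into code. A minimal sketch (the function names are ours, not the text's):

```python
def call_value_at_expiration(P, E):
    """Vc = MAX(0, P - E): stock price less exercise price, floored at zero."""
    return max(0.0, P - E)

def put_value_at_expiration(P, E):
    """Vp = MAX(0, E - P): exercise price less stock price, floored at zero."""
    return max(0.0, E - P)

# The $50-exercise-price call of Fig. 23.1 is worth $10 with the stock at $60.
print(call_value_at_expiration(60, 50))  # 10.0
print(put_value_at_expiration(60, 50))   # 0.0
```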
23.3 Factors Affecting Option Value

23.3.1 Determining the Value of a Call Option Before the Expiration Date

Suppose that you are considering the purchase of a call option, with an exercise price of $50, on a share of common stock. Let us try to determine what factors should be taken into account in trying to assess the value of such an option prior to the expiration date. As we have seen, the problem is trivial at the expiration date. However, determining the value of an option prior to expiration is more complex. An important factor in this determination is the current price per share of the stock. Indeed, it is most straightforward to think of option value as a function of the market price. Given this framework, we can see fairly quickly how to set bounds on the value of a call option. Lower bound: The value of a call cannot be less than the payoff that would accrue if it were exercised immediately. We can think of Fig. 23.1 as showing the payoff from immediate exercise of a call as a function of the current price of the stock. In our example, this payoff is zero for any market price below $50 and equal to the market price less the $50 exercise price when the market price is above $50. Suppose that the current stock price is $60 per share, and you are offered a call option for $8. This is certain to be profitable because you could immediately exercise the call and then resell the stock at a $10 gain. Subtracting the cost of the call option would yield a profit of $2. Of course, market participants are well aware of this, so the resulting excess of demand over supply would bid up the option's price; such immediate profits are unattainable.
Upper bound: The value of a call cannot be more than the market price of the stock. Suppose that, at the same cost, you are offered two alternatives: (a) purchase a share of stock, or (b) purchase an option on the same share of stock. Looking into the future, option (b) will either be exercised or discarded as worthless. If the option is exercised, its value will be the future stock price less the exercise price. This will be less than the future stock price, which is the value derived from alternative (a). Similarly, if the option turns out to have no value, this cannot be preferable to holding the stock, which, while it may have fallen in price, will have retained some value. Therefore, alternative (b) cannot be preferable to (a), so that the option value cannot exceed the market price of the stock. These conclusions are illustrated in Fig. 23.2, which relates the call-option value to the market price of the stock. The ACD line shows the payoff that would be obtained if the option were exercised immediately. The AB line is the set of points at which the call value equals the market price of the stock. The call value must lie between these boundaries; that is, in the shaded area depicted in Fig. 23.2. So far, we have been able to determine a fairly wide range in which the option value must lie for any given value of the market price of the stock. Let us now see if we can be more precise in formulating the shape of the relationship between call-option value and the market price of the stock. The following considerations should help in forming the appropriate picture of Fig. 23.3.
Fig. 23.2 Value of call option
Fig. 23.3 Call option value as a function of stock price
1. If the market price of the stock is 0, the call value will also be 0, as indicated by point A. This extreme case arises only when there is no hope that the stock will ever have any value; otherwise, investors would be prepared to pay some price, however small, for the stock. It follows, in such a dire circumstance, that the call option will never be exercised, and so it too has no value.
2. All other things equal, as the stock price increases, so does the call value (see lines AB and CD). A call with an exercise price of $50 would be worth more if the current market price were $40 than if it were $35, all else equal. The probability that the market price will eventually exceed the exercise price by any given amount is higher in the former case than in the latter, and so, consequently, is the payoff expected from holding the call.
3. If the price of the stock is high in relation to the exercise price, each dollar increase in price induces an increase of very nearly the same amount in the call value (shown by the slope of the line at point F). Suppose that the market price of the stock exceeds the exercise price of a call by such a large amount that it is virtually impossible for the stock price to fall below the exercise price before the expiration date. Then, any change in the stock price will induce a change of the same amount in the payoff expected from the call, since it is certain that the option will be exercised. The point is that if it is known that an option will eventually be exchanged for stock, this is tantamount to already owning the stock. The call holder has effectively purchased the stock without paying the full amount for it right away. The balance owed is the exercise price, to be paid at the expiration date. It follows that the value of the call option is the market value of the stock less the present value of the exercise price. Notice that in our previous notation, this will exceed P − E if the exercise price does not have to be paid until some time in the future, because of the time value of money.
4. The call value rises at an increasing rate as the stock price rises, as shown by the curvature AFG. The segment AF shows that as the stock price increases, the call value increases by a lesser amount. The segment FE shows the call price increasing by a larger amount than the increase in the stock price. Theoretically, this is impossible, because if it occurred, the call price would eventually exceed the price of the underlying stock.
Putting these four considerations together, we show in Fig. 23.3 a typical curve relating call-option value to the market price of a stock. Beginning at the origin, when both quantities are 0, the curve shows call value increasing as market price increases. Eventually, the curve becomes virtually parallel to the two 45-degree lines forming the boundaries of the possible-values region. This is a result of the fourth consideration listed above. The curve graphed in Fig. 23.3 shows the relationship between the call-option value and the market price of the stock when all other relevant factors influencing call-option value are held constant. Next, we must try to see what these other factors might be and assess their impact on call value. We consider, in turn, five factors that influence the value of a call option:
1. Market price of the stock. In Fig. 23.3, we have seen the curvilinear relationship between call-option value and the market price of the stock. All other things equal, the higher the market price, the higher the call-option value on the stock. As already noted, the slope of this relationship increases as the market price becomes higher, to the point where, eventually, each dollar increase in the stock price translates into an increase of about the same amount in the call-option value on that stock.
2. The exercise price. If offered two otherwise identical call options on the same stock, you would prefer the call with the lower exercise price. It would involve larger gains from any favorable movement in the price of the stock than would an option with a higher exercise price. Therefore, we conclude that the lower the exercise price, the higher the call-option value, all other things equal.
3. The risk-free interest rate. If a call option is eventually exercised, the holder of the option will reap some of the benefits of an increase in the market value of the stock. However, the holder will do so without having to immediately pay the exercise price. This payment will only be made at some future time when the call option is actually exercised. In the meantime, this money can be invested in government securities to earn a risk-free return. This opportunity confers an increment of value on the call option; the higher the risk-free rate of interest, the greater this incremental value (or, equivalently, the lower the present value of the exercise price). Therefore, we would expect to find, all else equal,
that the higher the risk-free rate of interest, the greater the call-option value. Moreover, the longer the exercise of the option is postponed, the greater the risk-free interest earnings. Accordingly, we would expect the risk-free interest rate to determine call-option value in conjunction with the time remaining before the expiration date.
4. The volatility of the stock price. Suppose that you are offered the opportunity to purchase one of two call options, each with an exercise price of $50. Table 23.1 lists probabilities for different market prices on the expiration date for each of the two stocks.

Table 23.1 Probabilities for future prices of two stocks
Less volatile stock              More volatile stock
Future price ($)   Probability   Future price ($)   Probability
42                 .10           32                 .15
47                 .20           42                 .20
52                 .40           52                 .30
57                 .20           62                 .20
62                 .10           72                 .15

In each case, the mean (expected) future price is the same. However, for the second stock, prices that differ substantially from the mean are far more likely than for the first stock. The price of such a stock is said to be relatively volatile. We now show that the expected payoff on the expiration date is higher for a call option on the more volatile stock than for a call option on the less volatile stock. For the less volatile stock, the option will not be exercised for prices below $50, but for the three higher prices its exercise will result in payoffs of $2, $7, and $12. Therefore, we find

expected payoff from call option on less volatile stock = (0)(.10) + (0)(.20) + (2)(.40) + (7)(.20) + (12)(.10) = $3.40.

Similarly, for a call option on the more volatile stock,

expected payoff from call option on more volatile stock = (0)(.15) + (0)(.20) + (2)(.30) + (12)(.20) + (22)(.15) = $6.30.

We find, then, that although the expected future price is the same for the two stocks, the expected payoff is higher from a call option on the more volatile stock. This conclusion is quite general. For example, it does not depend on our having set the exercise price below the expected future stock price. The reader is invited to verify that the same qualitative finding would emerge if the exercise price were $55. We can conclude that, all other things remaining the same, the greater the volatility in the price of the stock, the higher the call-option value on that stock. One useful way to measure volatility is through the variance of day-to-day changes in the stock price. Figure 23.4 illustrates our assertion about the influence of stock-price volatility on call-option value.
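The two expected payoffs are quick to verify. A minimal sketch (ours, not from the original text) using the Table 23.1 distributions:

```python
def expected_call_payoff(prices, probs, strike=50.0):
    """Expected expiration payoff of a call under a discrete price distribution."""
    return sum(prob * max(price - strike, 0.0) for price, prob in zip(prices, probs))

less_volatile = expected_call_payoff([42, 47, 52, 57, 62], [.10, .20, .40, .20, .10])
more_volatile = expected_call_payoff([32, 42, 52, 62, 72], [.15, .20, .30, .20, .15])
print(round(less_volatile, 2), round(more_volatile, 2))  # 3.4 6.3
```

Rerunning the same function with strike=55.0 confirms the text's claim that the qualitative ranking is unchanged.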
Fig. 23.4 Call-option value as function of stock price for high-, moderate-, and low-volatility stocks
The figure shows the relationship between call value and current market price for three stocks. The three curves have the same general form depicted in Fig. 23.3. However, notice that for any given current market price of the stock, the higher the variance of day-to-day price changes, the greater the call-option value. The notion of volatility in future stock market prices is related to the length of the time horizon being considered. Specifically, if the variance of day-to-day changes is σ², the variance of the change from the present to t days ahead is σ²t, if the price changes are serially independent. Therefore, the further ahead the expiration date, the greater the volatility in price movements. This suggests that for a European option, which can only be exercised on the expiration date, the relevant measure of volatility is σ²t, where t is the number of days remaining to the expiration date. The larger this quantity, the greater the call-option value.
5. Time remaining to expiration date. We have seen in factors 3 and 4 above two reasons to expect that the longer the time remaining before the expiration date, the higher the call-option value, all else equal. The reason is that the extra time allows larger gains to be derived from postponing payment of the exercise price and permits greater volatility in price movements of the stock. These two considerations both operate in the same direction – toward increasing the call-option value. We have seen that five factors influence the value of a call option. The interaction of these various factors in determining option value is rather complex.¹
¹ Compared to the CAPM, the option pricing model (OPM) is a relative pricing approach. In the CAPM, the major determinants are risk and return. In the OPM, the return of the asset or the market does not affect the option value.
We present a formula that, under certain assumptions, can be shown to determine a call-option value as a function of these five parameters. However, before proceeding to this somewhat complicated relationship, we discuss a simple situation in which option valuation is more straightforward. This is useful in explaining further the dependence of option value on the risk-free rate of interest. Our aim here is to show, for a special set of assumptions, that an investor can guarantee a particular return from a combined strategy of holding shares in a stock and writing call options on that stock, even though there is some uncertainty about the future price of the stock. The essential simplifying assumption required to generate this result is that there are only two possible values for the future price per share of the stock. In addition, for convenience we will use European options with an expiration date in 1 year. Further, we assume that the stock pays no dividend and that there are no transactions costs. Within this framework, consider the factors in Table 23.2. An investor can purchase shares for $100 each and write European call options with an exercise price of $100. There are two possible prices of the stock on the expiration date, with uncertainty as to which will materialize. In Table 23.3, we list the consequences to the investor for the two expiration-date stock prices. If the higher price prevails, each share of stock will be worth $125. However, the call options will be exercised, at a cost per share to the writer of the difference between the stock price and the exercise price. In Table 23.3, this is a negative value of $25 from having written the call option. If the lower of the two possible stock prices materializes on the expiration date, each share owned will be worth $85, and it will have cost nothing to write the option because it will not be exercised.

Table 23.2 Data for a hedging example
Current price per share: $100
Future price per share: $125 with probability .6; $85 with probability .4
Exercise price of call option: $100

Table 23.3 Possible expiration-date outcomes for hedging example
Expiration-date stock price   Value per share of stock holdings   Value per share of options written
$125                          $125                                -$25
$85                           $85                                 $0

Suppose that the investor wants to form a hedged portfolio by both purchasing stock and writing call options, so as to guarantee as large a total value of holdings per dollar invested
as possible, whatever the stock price on the expiration date. The hedge is constructed to be riskless, since any profit (or loss) from the stock is exactly offset by a loss (or profit) from the call option. This is called a perfect or riskless hedge, and it can be accomplished by purchasing H shares for each option written. We now determine H. If H shares are purchased and one option is written, the total value on the expiration date will be $125H − $25 at the higher market price and $85H − $0 at the lower price. Suppose we choose H so that these two amounts are equal; that is,

125H − 25 = 85H, or H = 25/40 = 5/8.
Then the same total value results whatever the stock's expiration-date market price. This ratio is known as the hedge ratio of stocks to options, the implication being that a hedged portfolio is achieved by writing eight options for each five shares purchased. More generally, it follows from the above argument that the hedge ratio is given by

H = (PU − E) / (PU − PL)
where PU = upper share price, PL = lower share price, and E = exercise price of the option; E is assumed to lie between PL and PU. Returning to our example, suppose that five shares are purchased and an option on eight shares is written. If the expiration-date price is $125, then the total value of the investor's portfolio is

(5)(125) − (8)(25) = $425.
If the expiration-date stock price is $85, the total value is

(5)(85) − (8)(0) = $425.

As predicted, the two are identical, so this value is assured. Next, we must consider the investor's income from the writing of call options. Let Vc denote the price per share of a call option. Then the purchase of five shares costs $500, but 8Vc dollars are received from writing call options on eight shares, so that the net outlay will be $500 − 8Vc. For this outlay, a value 1 year hence of $425 is assured. However, there is another simple mechanism for guaranteeing such a return: an investment could be made in government securities at the risk-free interest rate. Suppose that this rate is 8% annually. On the expiration date, an initial investment of $500 − 8Vc will be worth 1.08(500 − 8Vc). If this is to be equal to the value of the hedged portfolio, then

1.08(500 − 8Vc) = 425

so that

Vc = [(1.08)(500) − 425] / [(1.08)(8)] = $13.31.
Therefore, we conclude that if the price per share of the call option is $13.31, the hedging strategy will yield an assured rate of return equal to the risk-free interest rate. In a competitive market, this is the price that will prevail, and it is therefore the value of the call option. Suppose the price of the call option were above $13.31. By forming a hedged portfolio in the manner just described, investors could ensure a return in excess of the risk-free rate. Such an opportunity would attract many to sell options, thus driving down the price. Conversely, if the price of the option were below $13.31, it would be possible to achieve a return guaranteed to be in excess of the risk-free rate by both purchasing call options and selling short the stock. The volume of demand thus created for the call option would drive up its price. Hence, in a competitive market, $13.31 is the only sustainable price for this option. Our example shows that a hedged portfolio can achieve an assured rate of return of 8%. Notice that our analysis depends on the level of the risk-free rate. If that rate is 10% annually, the call-option value is $14.20. This illustrates the dependence of call-option value on the risk-free interest rate.
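The riskless-hedge valuation above is easy to replicate. A minimal sketch (ours, not from the original text) using the example's two future prices, $100 strike, and 8% rate:

```python
P_U, P_L, E, r = 125.0, 85.0, 100.0, 0.08

H = (P_U - E) / (P_U - P_L)  # hedge ratio: 5/8 share held per option written
hedged_value = 5 * P_U - 8 * (P_U - E)  # $425 at the higher price...
assert hedged_value == 5 * P_L - 8 * 0.0  # ...and the same at the lower price

# Price the call so the hedge earns exactly the risk-free rate:
# (1 + r)(500 - 8 Vc) = 425.
Vc = ((1 + r) * 500 - hedged_value) / ((1 + r) * 8)
print(H, round(Vc, 2))  # 0.625 13.31

# At a 10% risk-free rate the same calculation gives $14.20.
print(round((1.10 * 500 - hedged_value) / (1.10 * 8), 2))
```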
23.4 Determining the Value of Options

23.4.1 Expected Value Estimation
A higher expected rate of return, as compared with the hedged portfolio, can be achieved by the exclusive purchase either of shares or of call options. Suppose that a single share is purchased for $100. One year hence, the expected value of that share is

expected value of share = (.6)(125) + (.4)(85) = $109.

Hence, the expected rate of return from a portfolio consisting entirely of holdings of this stock is 9%. Similarly, suppose that a call option is purchased for $13.31. The expected value of this option on the expiration date is

expected value of call = (.6)(25) + (.4)(0) = $15.
Fig. 23.5 Put-option value
Hence, the expected rate of return is

expected rate of return on call = [(15 − 13.31)/13.31] × 100% = 12.7%.
The increased expected rates of return from exclusively holding these two types of securities should not be surprising. They simply represent increases required by investors for assuming additional risk. We have shown how call-option valuation can be determined within a simple framework in which only two values for the future market price of a stock are possible. We must move on to more realistic situations that involve a range of future market prices.

The value of a put option is determined by the same factors that determine the value of a call option, except that the factors have different relationships to the value of a put than they have to the value of a call. Figure 23.5 shows the relationship between put-option value and stock price. The value of a put option is

VP = Max(E − P, 0)

where VP = value of the put; E = exercise price; and P = value of the underlying stock. The maximum value of a put is equal to the exercise price of the option when the stock price is 0. This is shown by the line "Maximum value of the put" in Fig. 23.5. The minimum value of the put is 0, which occurs when the stock price exceeds the exercise price.
23.4.2 The Black-Scholes Option Pricing Model

The notion that the price of a call option should be such that the rate of return on a fully hedged portfolio is equal to the risk-free rate of interest has been used by Black and Scholes
(1973) to derive a more generally applicable procedure for valuing an option. The assumption that only two future prices are possible is dropped for a more realistic view of future price movements. The complete set of assumptions on which the Black-Scholes formula is based is given below:

1. Only European options are considered.
2. Options and stocks can be traded in any quantities in a perfectly competitive market; there are no transactions costs and all relevant information is freely available to market participants.
3. Short-selling of stocks and options in a perfectly competitive market is possible.
4. The risk-free interest rate is known and is constant up to the expiration date. Market participants are able to borrow or lend at this rate.
5. No dividends are paid on the stock.
6. The stock price follows a random path in continuous time such that the variance of the rate of return is constant over time and known to market participants. The logarithm of future stock prices follows a normal distribution.

Under these assumptions, Black and Scholes show that the call-option value on a share of stock is given by

Vc = P·N(d1) − e^(−rt)·E·N(d2)   (23.1)
where Vc = value of the option; P = current price of the stock; r = continuously compounded annual risk-free interest rate; t = time in years to expiration date; E = exercise price; e = 2.71828... is a constant; and N(di) = probability that a standard normal random variable is less than or equal to di (i = 1, 2), with
d1 = [ln(P/E) + (r + σ²/2)t] / (σ√t)   (23.2a)

and

d2 = [ln(P/E) + (r − σ²/2)t] / (σ√t) = d1 − σ√t   (23.2b)
where σ² = variance of the annual rate of return on the stock, continuously compounded, and logarithms are to base e. A binomial distribution approach to deriving the OPM can be found in Appendix 2A. Although the specific form of the Black-Scholes option pricing formula is complicated, its interpretation is straightforward. The first term of Equation (23.1) is the value of an investor's right to that portion of the probability distribution of the stock's price that lies above the exercise price. This term equals the expected value of this portion of the stock's price distribution, as shown in Fig. 23.6.
Fig. 23.6 Probability distribution of stock prices
The second term in Equation (23.1) is the present value of the exercise price times the probability that the exercise price will be paid (that is, that the option will be exercised at maturity). Overall, the Black-Scholes model involves the present value of a future cash flow, a common concern in finance. The Black-Scholes model equates option value to the present expected value of the stock price minus the present value of the cost of exercising the option.

More important for our purposes than the particular formula is the manner in which the model relates option value to the five factors discussed in the previous section. These relationships are as follows:

1. All other things equal, the higher the current market price of the stock, the higher the call-option value.
2. All other things equal, the higher the exercise price, the lower the option value.
3. All other things equal, the higher the risk-free interest rate, the higher the option value.
4. All other things equal, the greater the time to the expiration date, the higher the option value.
5. All other things equal, the greater the volatility of the stock price (as measured by σ²), the higher the option value.

According to the Black-Scholes model, although the call-option value depends on the variance of the rate of return, it does not depend on the stock's expected rate of return. This is because a change in expected rate of return affects stock price but not the relative value of the option and the stock. Recall from our earlier discussion that formation of the hedged portfolio was independent of expected return. For example, changing the two probabilities in Table 23.2 has no effect on call-option value. The following example illustrates the computation of option value using the Black-Scholes formula.

Example 23.1. Suppose that the current market price of a share of stock is $90 and an option is written to purchase the stock for $100 in 6 months. The current risk-free rate of
interest is 8% annually. Since the time to the expiration date is half a year, we have, in the notation of the Black-Scholes model, P = 90, E = 100, r = 0.08, and t = 0.50. All of this information is readily available to market participants. More problematic is an assessment of the likely volatility of returns on the stock over the next 6 months. One possible approach is to estimate this volatility using data on past price changes. For instance, we could compute the variance of daily changes in the logarithm of price (daily rate of return) over the last 100 trading days. Multiplying the result by the number of trading days in the year would then yield an estimate of the required variance on an annual basis.² For this stock, suppose that such a procedure yields the standard deviation σ = 0.6. This value or some alternative (perhaps developed subjectively) can then be substituted, together with the values of the other four factors, in Equation (23.1) to obtain an estimate of the market value of the call option. We now illustrate these calculations. First, we need the natural logarithm of P/E. This is
ln(P/E) = ln(0.9) = −0.1054.

It follows that

d1 = [ln(P/E) + rt + σ²t/2] / (σ√t) = [−0.1054 + (0.08)(0.50) + ½(0.36)(0.50)] / (0.6√0.50) = 0.06

d2 = [ln(P/E) + rt − σ²t/2] / (σ√t) = [−0.1054 + (0.08)(0.50) − ½(0.36)(0.50)] / (0.6√0.50) = −0.37

Using the table of the cumulative distribution function of the standard normal distribution in the appendix at the back of this book, we find the probability that a standard normal random variable is less than 0.06 is

N(d1) = N(0.06) = 0.5239

and the probability that a standard normal random variable is less than −0.37 is

N(d2) = N(−0.37) = 0.3557.
² If we use the monthly rate of return to calculate the variance, then the annualized variance will be 12 times the monthly variance.
These numbers represent the probability that the stock price will be at least equal to the exercise price over the life of the option (about 52%) and the probability of exercise (about 36%). We also require

e^(−rt) = e^(−(0.08)(0.5)) = 0.9608.

Finally, on substitution into Equation (23.1), we find

Vc = P·N(d1) − e^(−rt)·E·N(d2) = (90)(0.5239) − (0.9608)(100)(0.3557) = $12.98

Therefore, according to the Black-Scholes model, the call option should be priced at $12.98 per share. A further implication of the Black-Scholes model is that the quantity N(d1) provides the hedge ratio for a hedged portfolio of stocks and written options. We see, then, that to achieve such a portfolio, 0.5239 shares of stock should be purchased for each option written.
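Equation (23.1) is straightforward to implement. The following Python sketch (our own naming, using the standard library's normal CDF) evaluates Example 23.1; note that the unrounded value is about $12.77, while the text's $12.98 reflects rounding d1 and d2 to two decimals before looking up N(d).

```python
from math import log, sqrt, exp
from statistics import NormalDist

def bs_call(P, E, r, t, sigma):
    """Equation (23.1); N() is the standard normal CDF."""
    N = NormalDist().cdf
    d1 = (log(P / E) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return P * N(d1) - exp(-r * t) * E * N(d2)

print(bs_call(90, 100, 0.08, 0.5, 0.6))
# ~12.77; the text's $12.98 uses d1, d2 rounded to 0.06 and -0.37
```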
23.4.3 Taxation of Options

The taxation of option gains, losses, and income is a fairly complex and constantly changing part of the tax law. However, tax treatment can have a large impact on the usefulness of options for the individual investor. Income (premiums) received by the option writer is taxed as ordinary income, just as if the option writer were providing a service. Options can be used to defer gains into the future, which reduces the investor's current tax liability. For example, an investor with a short-term gain on a stock can purchase a put to protect against a drop in the stock price. This allows the investor to hold the stock and not realize the gain until sometime in the future, perhaps the next year, thereby deferring the payment of tax from the present until some future period. Hence, the tax position of the investor is also a factor in determining the value or usefulness of options. However, because we do not have a homogeneous tax structure, it is difficult to incorporate the tax effect into a formulation such as the Black-Scholes model.
23.4.4 American Options

In our analysis of options, we have assumed the options are European and thus may only be exercised on the expiration date, and that no dividends are paid on the stock before the expiration date. Merton (1973) has shown that if the stock does not pay dividends, it is suboptimal to exercise an American call option before the expiration date. Therefore, if American and European call options on a nondividend-paying stock are otherwise identical, they should have the same value.

The effect of dividends on a call option depends on whether the firm is expected to pay dividends before the option's expiration date. Such action should reduce the option's value – the greater the expected dividend, the larger the reduction in option value. The reason is that a dividend payment amounts to the transfer to stockholders of part of the firm's value, a distribution in which option holders do not share. The larger this reduction in firm value, all other things equal, the lower the expected future price of the stock, because that price reflects firm value at any point in time. Therefore, the higher the dividend, the smaller the probability of any gains from exercising a call option on common stock and, hence, the lower the value of the option. The extreme case occurs if the firm is liquidated and the entire amount of funds realized from the liquidation is distributed as dividends to common stockholders. Once such a distribution is made, the stock and the call option have zero value.

If the firm pays dividends, the value of an American call option on its stock will exceed that of an otherwise identical European option. The reason is that the holder of the American call can exercise the option just before the ex-dividend date and thus receive the dividend payment. If the benefits from such an exercise outweigh the interest income that would have been earned on the exercise price had the option been held to the expiration date, it benefits the holder of an American call to exercise the option. Since this opportunity is not available to the holder of a European option, the American option should have a higher value.

The solution to valuing an option on a stock that pays dividends can be approximated by replacing P, the stock price in the Black-Scholes formula, with the stock price minus the present value of the known future dividend. This is justified because if the option is not exercised before maturity, the option holder will not receive the dividends, in which case the holder should subtract the present value of the dividend from the stock price.

23.5 Option Pricing Theory and Capital Structure

In this section we show how option pricing theory can be used to analyze the value of the components of a firm's capital structure. Suppose that management of an unlevered corporation decides to issue bonds. For simplicity, we assume these to be zero coupon bonds with face value B to be paid
at the maturity date.³ When the bonds mature, bondholders must be paid an amount B, if possible. If this money cannot be raised, bondholders have a first claim on the firm's assets and will be paid the firm's entire value, Vf, which in this case will be less than B.

Let us look at this arrangement from the point of view of the firm's stockholders. By using debt, stockholders can be regarded as having sold the firm to bondholders, with an option to purchase it back by paying an amount B at the maturity date of the bonds. Thus, the stockholders hold a call option on the firm, with expiration date at the date of maturity of the bonds. If the firm's value exceeds the value of the debt on the expiration date, the debt will be retired, so that stockholders will have exercised their call option. On the other hand, if the firm's value is less than the face value of the debt, stockholders' limited liability allows them to leave the firm in the hands of the bondholders. The call option will not be exercised, so that the stock will be worth nothing. However, stockholders will have no further obligations to bondholders. Therefore, on the date of expiration, the value of this call option, or, equivalently, the value of stockholders' equity, will be V = Vf − B if Vf > B and 0 if Vf < B. Thus, we can write

V = Max(0, Vf − B)   (23.3)

where V = value of stockholders' call option on the firm; Vf = value of the firm; and B = face value of debt. This relationship is shown in Fig. 23.7. As we can see, this is exactly the relationship shown in Fig. 23.1a for the value of a call option. We see, then, that the theory of options can provide us with insights into the valuation of debt and stockholders' equity.

Fig. 23.7 Option approach to capital structure

This analysis indicates that the stockholders own a call option on the firm's assets. At any time, they could exercise this option by delivering the face value of the bonds to pay off the bondholders and realize the value of their option. This point is illustrated in the following example, where we employ the Black-Scholes model.

Example 23.2. An unlevered corporation is valued at $14 million. The corporation issues debt, payable in 6 years, with a face value of $10 million. The standard deviation of the continuously compounded rate of return on the total value of this corporation is 0.2. Assume that the risk-free rate of interest is 8% annually. In our previous notation, the current market price of the asset (in this case, the firm) is $14 million, and the exercise price of the option is the face value of the debt. Therefore, we have P = 14, E = 10, r = 0.08, t = 6, and σ = 0.2, where we are measuring value in units of a million dollars. We will use the Black-Scholes formula, Equation (23.1), to compute the value of stockholders' equity; that is, the value of the stockholders' call option on the firm. First, we require the natural logarithm of P/E, so that

ln(P/E) = ln(1.4) = 0.3365.

Next, we find

d1 = [ln(P/E) + (r + σ²/2)t] / (σ√t) = [0.3365 + (0.08)(6) + ½(0.2)²(6)] / (0.2√6) = 1.91

and

d2 = [ln(P/E) + (r − σ²/2)t] / (σ√t) = [0.3365 + (0.08)(6) − ½(0.2)²(6)] / (0.2√6) = 1.42

From tabulated values of the cumulative distribution function of the standard normal random variable, we find

N(d1) = N(1.91) = 0.9719 and N(d2) = N(1.42) = 0.9222.

We also require

e^(−rt) = e^(−(0.08)(6)) = 0.6188.

³ These bonds will sell for less than the face value, the difference reflecting interest to be paid on the loan.
On substitution in Equation (23.1), we find the value of stockholders' equity

V = P·N(d1) − e^(−rt)·E·N(d2) = (14)(0.9719) − (0.6188)(10)(0.9222) = $7.90 million

Since the total value of the firm is $14 million, the value of the debt is

value of debt = $14 − $7.9 = $6.10 million.

Assuming the market is efficient, this firm could sell debt with a face value of $10 million for $6.1 million. These receipts could be distributed to stockholders, leaving them with an equity worth $7.9 million. We now turn to examine the effects of two factors on our calculations.
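Because stockholders' equity here is just a call on the firm, the same Black-Scholes function prices it directly. Below is a minimal, self-contained Python sketch of Example 23.2 (values in $ millions; the naming is ours).

```python
from math import log, sqrt, exp
from statistics import NormalDist

def bs_call(P, E, r, t, sigma):
    """Equation (23.1); N() is the standard normal CDF."""
    N = NormalDist().cdf
    d1 = (log(P / E) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return P * N(d1) - exp(-r * t) * E * N(d2)

firm_value, face_value = 14.0, 10.0                  # $ millions
equity = bs_call(firm_value, face_value, 0.08, 6, 0.2)
print(round(equity, 2), round(firm_value - equity, 2))   # ~7.90 and ~6.10
```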
23.5.1 Proportion of Debt in Capital Structure

Let us compare our results of Example 23.2 with an otherwise identical situation, but where now only $5 million face value of debt is to be issued. With E = 5, so that P/E = 2.8, we find ln(P/E) to be 1.0296. Using the same procedure as in Example 23.2, d1 = 3.33 and d2 = 2.84, so that N(d1) = 0.9996 and N(d2) = 0.9977. Using Equation (23.1), the value of stockholders' equity is

V = P·N(d1) − e^(−rt)·E·N(d2) = (14)(0.9996) − (0.6188)(5)(0.9977) = $10.91 million

Therefore, the value of the debt is

value of debt = 14 − 10.91 = $3.09 million.

In Table 23.4, we compare, from the point of view of bondholders, the cases where the face values of issued debt are $5 million and $10 million. We see from the table that increasing the proportion of debt in the capital structure decreases
Table 23.4 Effect of different levels of debt on debt value

Face value of debt   Actual value of debt   Actual value per dollar
($ millions)         ($ millions)           of debt face value
5                    3.09                   $0.618
10                   6.10                   $0.610
the value of each dollar of face value of debt. If all the debt is issued at one time, this phenomenon, which simply reflects the increase in bondholders' risk as debt increases, will result in the demand for correspondingly higher interest rates. As our example illustrates, if the corporation sells only $5 million of face-value bonds, a price of $618,000 per million dollars in face value will be paid for these bonds. However, if bonds with face value of $10 million are to be issued, this price will fall to $610,000. In such a case, the market for bonds operates such that the risk to bondholders of future default is considered in establishing the price of the zero coupon bonds or, more generally, the interest rate attached to any bonds.

Suppose, however, that our company issues bonds with face value of $5 million, at a cost to bond purchasers of $3.09 million. One year later, this corporation decides to issue more debt. In valuing this new debt, potential purchasers of bonds will assess the risk of default. One factor taken into consideration will be the total level of debt of the corporation; that is, the amount of existing debt as well as the size of the new issue. Hence, the market price for the new bonds will reflect this riskiness. Consider, however, the position of existing bondholders. The interest rates to be paid to these holders of old debt have already been established. The issue of new debt is going to increase the chance of default on all loans. Consequently, the value of existing bonds must fall when new bonds are issued. Therefore, the issue of new bonds entails a decrease in the wealth of existing bondholders. The beneficiaries are the stockholders of the company, whose wealth increases by a corresponding amount. We see, then, that there is a conflict of interest between existing bondholders and holders of common stock. All other things equal, existing bondholders will prefer that no further debt be issued, while stockholders will prefer to see more debt issued. This provides an illustration of the agency problem discussed in Chap. 23. Purchasers of bonds will require covenants restricting the freedom of action of corporate management in order to protect the value of their investment.
23.5.2 Riskiness of Business Operations

Let us return to the corporation of Example 23.2, which is about to issue debt with face value of $10 million. Leaving the other variables unchanged, suppose that the standard deviation of the continuously compounded rate of return on the corporation's total value is 0.4, rather than 0.2. This implies that the corporation is operating in an environment of greater business risk. Setting σ = 0.4, with all other relevant variables as specified in Example 23.2, we find d1 = 1.32 and d2 = 0.34, so that N(d1) = 0.9066 and N(d2) = 0.6331. From
Table 23.5 Effect of different levels of business risk on the value of $10 million face value of debt

Standard deviation of   Value of equity   Value of debt
rate of return          ($ millions)      ($ millions)
0.2                     7.90              6.10
0.4                     8.77              5.23
Equation (23.1), we find the value of stockholders' equity to be

V = P·N(d1) − e^(−rt)·E·N(d2) = (14)(0.9066) − (0.6188)(10)(0.6331) = $8.77 million

Thus, the value of the debt is

value of debt = 14 − 8.77 = $5.23 million.

Table 23.5 summarizes the comparison between these results and those of Example 23.2. We see that this increase in business risk, with its associated increase in the probability of default on the bonds, leads to a reduction from $6.10 million to $5.23 million in the market value of the $10 million face value of debt. To the extent that potential purchasers of bonds are able to anticipate the degree of business risk, the higher interest rate on the bonds reflects the risk involved. The greater the degree of risk, the higher the interest rate that must be offered to sell a particular amount of debt. However, suppose that having issued debt, corporate management embarks on new projects with a higher level of risk than could have been foreseen by the purchasers of the bonds. As we have just seen, this will lower the value of existing bonds. This decrease in bondholders' wealth accrues as a gain in wealth to holders of common stock. Once again, we find a conflict of interest between stockholders and bondholders. Once debt has been sold, it will be in the interests of stockholders for the firm to operate with a high degree of business risk, while bondholders will prefer lower levels of risk. Thus, the degree of business risk represents another example of an agency problem. Bondholders will want protection against the possibility that management will take on riskier than anticipated projects, and will demand protective covenants against such actions. Because of this factor, the issue of a large volume of debt is likely to be accompanied by constraints on the freedom of management action in the firm's operation.
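The comparative statics of Sects. 23.5.1 and 23.5.2 amount to re-running the same valuation with a different face value or volatility. Below is a self-contained Python sketch (our own naming) that reproduces the figures behind Tables 23.4 and 23.5.

```python
from math import log, sqrt, exp
from statistics import NormalDist

def bs_call(P, E, r, t, sigma):
    N = NormalDist().cdf
    d1 = (log(P / E) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return P * N(d1) - exp(-r * t) * E * N(d1 - sigma * sqrt(t))

firm_value = 14.0   # $ millions
for face, sigma in [(10, 0.2), (5, 0.2), (10, 0.4)]:
    equity = bs_call(firm_value, face, 0.08, 6, sigma)
    debt = firm_value - equity
    print(face, sigma, round(equity, 2), round(debt, 2), round(debt / face, 3))
# face 5:  debt ~3.09, i.e. ~$0.618 per dollar of face value (Table 23.4)
# face 10, sigma 0.4: equity ~8.77, debt ~5.23 (Table 23.5)
```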
23.6 Warrants

A warrant is an option issued by a corporation to individual investors to purchase, at a stated price, a specified number of
the shares of common stock of that corporation. Warrants are issued in two principal sets of circumstances:

1. In raising venture capital, either to start a new company or to substantially increase the scope of operations of an existing company, warrants are often issued to lenders as an additional inducement to the promised interest payments, or to purchasers of new stock issues.
2. Often, when issuing bonds, a company will increase the bonds' attractiveness by attaching warrants. Thus, as well as receiving interest payments on the bonds, their purchasers obtain an option to buy stock in the corporation at a specified price. As we have seen, such options will have some value, and so their attachment to bonds should lead to a lowering of the interest rate paid on a fixed quantity of bonds or an increase in the number of bonds that can be sold at a given interest rate.

Since a warrant is essentially a call option on a specified number of shares of common stock, the principles underlying the valuation of call options are also applicable to the valuation of warrants. Suppose that a warrant entitles its holder to purchase N shares of common stock at a total cost E. If, at the expiration date, the price per share of common stock is P, then shares of total value NP can be purchased at cost E. If NP exceeds E, the option to purchase will be exercised, and the difference between these quantities is the value of the warrant. On the other hand, if NP is less than E, it will not pay to exercise the option, so that the warrant will be worthless. The warrant's value on the expiration date can then be expressed as

Vw = Max(0, NP − E)

where Vw = value of the warrant; N = number of shares that can be purchased; P = market price per share of stock; and E = exercise price for the purchase of N shares of stock.

Prior to the expiration date, for reasons discussed earlier in this chapter, the warrant's value will exceed this theoretical value. The same factors affecting the value of an ordinary call option are also relevant here. Thus, all other things equal, warrant value will increase with increases in the volatility of the stock price, in the risk-free interest rate, and in the time to the expiration date. However, the basic version of the Black-Scholes model generally will not be directly applicable to the valuation of warrants. The reason is that, generally, warrants differ in important respects from ordinary call options. These factors are as follows:

1. The life of a call option typically is just a few months. However, the life of a warrant is several years. While it may not be unreasonable to expect volatility of rate of return to remain constant for a few months, it is less likely that it will do so for several years.
2. Over a period of several years, it is likely that dividends will be paid. As we have seen, dividend payments, which do not accrue to warrantholders, reduce the value of options to purchase stock.
3. For many warrants, the exercise price is not fixed to the expiration date, but changes at designated points in time. It may well pay to exercise the option to purchase shares immediately before such a change.
4. It may be that the number of shares that all warrantholders are entitled to purchase represents a considerable fraction of the total number of shares of the corporation. Thus, if these options are all exercised, and total earnings are unaffected, then earnings per share will be diluted.

Let us look at this last point in more detail. Since by using warrants a corporation can extract more favorable terms from bondholders, it follows that, in return, the company must have transferred something of value to these bondholders. This transfer can be visualized as giving the bondholders a stake in the corporation's equity. Thus, we should regard equity as comprising both stockholdings and warrant value. We will refer to this total equity, prior to the exercise of the options, as old equity, so that

Old Equity = Stockholders' Equity + Warrants.

Suppose that the warrants are exercised. The corporation then receives additional money from the purchase of new shares, so that total equity is

New Equity = Old Equity + Exercise Money.

We denote by N the number of shares outstanding and by Nw the number of shares that warrantholders can purchase. If the options are exercised, there will be a total of N + Nw shares, a fraction Nw/(Nw + N) of which is owned by former warrantholders. These holders then own this fraction of the new equity; that is,

H·(New Equity) = H·(Old Equity) + H·(Exercise Money)

where H = Nw/(Nw + N). Thus, a fraction Nw/(Nw + N) of the exercise money is effectively returned to the former warrantholders. In fact, they have really spent [1 − Nw/(Nw + N)] = N/(Nw + N) of the exercise money to acquire a fraction Nw/(Nw + N) of the old equity. Therefore, in valuing the warrants, the Black-Scholes formula must be modified. We need to make the appropriate substitutions in Equation (23.1) for the current stock price, P, and the exercise price of the option, E:
P = [Nw/(Nw + N)]·(Value of Old Equity)

E = [N/(Nw + N)]·(Exercise Money)
It also follows that the appropriate measure of volatility, σ², is the variance of the rate of return on the total old equity (including the value of warrants), not simply on stockholders' equity.

Suppose that a firm has one million shares outstanding, currently selling at $100 per share. There are also 500,000 warrants with an exercise price of $80 per share. The warrants are worth $20 each, or the current stock price, $100, less the exercise price of $80. The value of the old equity is

Old Equity = $100 × (1m) + $20 × (0.5m) = $110 million.

If the warrants are exercised, the firm will receive $40 million ($80 × 0.5m) of new equity, so that the new equity is

New Equity = $110m + $40m = $150 million.

When they exercise their warrants, the warrantholders will own one-third of the shares outstanding; that is,

H = Nw/(Nw + N) = 500,000/(500,000 + 1,000,000) = 1/3
and the old shareholders will own the remaining two-thirds of the shares outstanding. The warrantholders now have an investment worth $50 million, or

(1/3)($150m) = (1/3)($110m) + (1/3)($40m) = $50 million.

It makes sense for the warrantholders to exercise their warrants; they spend $40 million for shares that are worth $50 million. In terms of the warrant value, the market should be willing to pay $20 per warrant for 500,000 warrants, or $10 million.

A convertible bond is a security that gives its owner the right to exchange it for a given number of shares of common stock any time before the maturity date of the bond. Hence, a convertible bond is actually a portfolio of two securities: a bond and a warrant. The value of a convertible bond is the value of the bond portion of the portfolio plus the value of the warrant.
23.7 Conclusion

In Chap. 23, we have discussed the basic concepts of call and put options and have examined the factors that determine the value of an option. One procedure used in option valuation is
the Black-Scholes model, which allows us to estimate option value as a function of stock price, option exercise price, time to expiration date, risk-free interest rate, and the volatility of the underlying stock. The option pricing approach to investigating capital structure is also discussed, as is the value of warrants.
References

Amram, M. and N. Kulatilaka. 2001. Real options, Oxford University Press, USA.
Banz, R. and M. Miller. 1978. "Prices for state contingent claims: some estimates and applications." Journal of Business 51, 653–672.
Bhattacharya, M. 1980. "Empirical properties of the Black-Scholes formula under ideal conditions." Journal of Financial and Quantitative Analysis 15, 1081–1105.
Black, F. 1972. "Capital market equilibrium with restricted borrowing." Journal of Business 45, 444–455.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–654.
Bookstaber, R. M. 1981. Option pricing and strategies in investing, Addison-Wesley, Reading, MA.
Cox, J. C. and M. Rubinstein. 1985. Options markets, Prentice-Hall, Englewood Cliffs, NJ.
Finnerty, J. 1978. "The Chicago board options exchange and market efficiency." Journal of Financial and Quantitative Analysis 13, 29–38.
Galai, D. and R. W. Masulis. 1976. "The option pricing model and the risk factor of stock." Journal of Financial Economics 3, 53–81.
Hull, J. 2005. Options, futures, and other derivatives, 6th Edition, Prentice Hall, Upper Saddle River, NJ.
Jarrow, R. A. and S. Turnbull. 1999. Derivative securities, 2nd Edition, South-Western College Pub, Cincinnati, OH.
Lee, C. F. 2009. Handbook of quantitative finance and risk management, Springer, New York.
Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York.
Liaw, K. T. and R. L. Moy. 2000. The Irwin guide to stocks, bonds, futures, and options, McGraw-Hill Companies, New York.
MacBeth, J. and L. Merville. 1979. "An empirical examination of the Black-Scholes call option pricing model." The Journal of Finance 34, 1173–1186.
McDonald, R. L. 2005. Derivatives markets, 2nd Edition, Addison Wesley, Boston, MA.
Merton, R. C. 1973. "Theory of rational option pricing." Bell Journal of Economics and Management Science 4, 141–183.
Rendleman, R. J., Jr. and B. J. Bartter. 1979. "Two-state option pricing." Journal of Finance 34, 1093–1110.
Ritchken, P. 1987. Options: theory, strategy, and applications, Scott, Foresman, Glenview, IL.
Summa, J. F. and J. W. Lubow. 2001. Options on futures, John Wiley & Sons, New York.
Trennepohl, G. 1981. "A comparison of listed option premia and Black-Scholes model prices: 1973–1979." Journal of Financial Research, 11–20.
Zhang, P. G. 1998. Exotic options: a guide to second generation options, 2nd Edition, World Scientific Pub Co Inc, Singapore.
Chapter 24
Applications of the Binomial Distribution to Evaluate Call Options

Alice C. Lee, John Lee, and Jessica Shin-Ying Mai
Abstract In this chapter, we first introduce the basic concepts of call and put options. Then we show how the simple one-period binomial call option pricing model can be derived. Finally, we show how a generalized binomial option pricing model can be derived.

Keywords Binomial distribution · Option · Simple binomial option pricing model · Generalized binomial option pricing model · Hedge ratio · Cumulative binomial function · Exercise price · Decision tree
24.1 Introduction
In this chapter, we will show how the binomial distribution can be used to derive the call option pricing model. In the second section of this chapter, we will give the basic definition of an option. In the third section, we will define and examine the simple binomial option pricing model. In the fourth section, we will derive the generalized n-period binomial option pricing model. Finally, in the fifth section, we will summarize our findings and make concluding remarks.
24.2 What Is an Option?

In the most basic sense, an option is a contract conveying the right to buy or sell a designated security at a stipulated price. The contract normally expires at a predetermined date. The most important aspect of an option contract is that the purchaser is under no obligation to buy; it is, indeed, an "option." This attribute of an option contract distinguishes it from other financial contracts. For instance, whereas the holder of an option may let his or her claim expire unused if he or she so desires, other financial contracts (such as futures and forward contracts) obligate their parties to fulfill certain conditions. A call option gives its owner the right to buy the underlying security, a put option the right to sell. The price at which the stock can be bought (for a call option) or sold (for a put option) is known as the exercise price.

24.3 The Simple Binomial Option Pricing Model
Before discussing the binomial option model, we must recognize its two major underlying assumptions. First, the binomial approach assumes that trading takes place in discrete time; that is, on a period-by-period basis. Second, it is assumed that the stock price (the price of the underlying asset) can take on only two possible values each period; it can go up or go down.

Say we have a stock whose current price per share S can advance or decline during the next period by a factor of either u (up) or d (down). This price either will increase by the proportion u − 1 ≥ 0 or will decrease by the proportion 1 − d, 0 < d < 1. Therefore, the value S in the next period will be either uS or dS. Next, suppose that a call option exists on this stock with a current price per share of C and an exercise price per share of X, and that the option has one period left to maturity. This option's value at expiration is determined by the price of its underlying stock and the exercise price X; the value is either
Cu = Max(0, uS − X)   (24.1)

or

Cd = Max(0, dS − X)   (24.2)
Why is the call worth Max(0, uS − X) if the stock price is uS? The option holder is not obliged to purchase the stock at the exercise price of X, so she or he will exercise the option only when it is beneficial to do so. This means the option can never have a negative value. When is it beneficial for the option holder to exercise the option? When the price per share of the stock is greater than the price per share at which he or she can purchase the stock by using the option, which is the exercise price, X. Thus, if the stock price uS exceeds the exercise price X, the investor can exercise the option and buy the stock. Then he or she can immediately sell it for uS, making a profit of uS − X (excluding commission). Likewise, if the stock price declines to dS, the call is worth Max(0, dS − X).

Also, for the moment we will assume that the risk-free interest rate for both borrowing and lending is equal to r percent over the one time period, and that the exercise price of the option is equal to X.

To intuitively grasp the underlying concept of option pricing, we must set up a risk-free portfolio – a combination of assets that produces the same return in every state of the world over our chosen investment horizon. The investment horizon is assumed to be one period (the duration of this period can be any length of time, such as an hour, a day, a week, etc.). To do this, we buy h shares of the stock and sell the call option at its current price of C. Moreover, we choose the value of h such that our portfolio will yield the same payoff whether the stock goes up or down:

h(uS) − Cu = h(dS) − Cd   (24.3)
By solving for h, we can obtain the number of shares of stock we should buy for each call option we sell:

h = (Cu − Cd) / [(u − d)S]   (24.4)
Here h is called the hedge ratio. Because our portfolio yields the same return under either of the two possible states for the stock, it is without risk and therefore should yield the risk-free rate of return, r percent, which is equal to the risk-free borrowing and lending rate. In essence this condition must be true; otherwise, it would be possible to earn a risk-free profit without using any money. Therefore, the ending portfolio value must be equal to (1 + r) times the beginning portfolio value, hS − C:

(1 + r)(hS − C) = h(uS) − Cu = h(dS) − Cd   (24.5)

Note that S and C represent the beginning values of the stock price and the option price, respectively. Setting R = 1 + r, rearranging to solve for C, and using the value of h from Equation (24.4), we get

C = {[(R − d)/(u − d)]Cu + [(u − R)/(u − d)]Cd} / R   (24.6)

where d < r < u. To simplify this equation, we set

p = (R − d)/(u − d), so 1 − p = (u − R)/(u − d)   (24.7)

Thus we get the option's value with one period to expiration:

C = [pCu + (1 − p)Cd] / R   (24.8)

This is the binomial call option valuation formula in its most basic form. In other words, this is the binomial valuation formula with one period to expiration of the option. To illustrate the model's qualities, let's plug in the following values, while assuming the option has one period to expiration. Let

X = $100, S = $100, u = 1.10 (so uS = $110), d = 0.90 (so dS = $90), R = 1 + r = 1 + 0.07 = 1.07.

First we need to determine the two possible option values at maturity, as indicated in Table 24.1.

Table 24.1 Possible option values at maturity

            Current   Next period (maturity)
Stock (S)   $100      uS = $110
                      dS = $90
Option (C)  C         Cu = Max(0, uS − X) = Max(0, 110 − 100) = $10
                      Cd = Max(0, dS − X) = Max(0, 90 − 100) = $0

Next, we calculate the value of p as indicated in Equation (24.7):

p = (1.07 − 0.90)/(1.10 − 0.90) = 0.85, so 1 − p = (1.10 − 1.07)/(1.10 − 0.90) = 0.15

Solving the binomial valuation equation as indicated in Equation (24.8), we get

C = [0.85(10) + 0.15(0)] / 1.07 = $7.94
The correct value for this particular call option today, under the specified conditions, is $7.94. If the call option does not sell for $7.94, it will be possible to earn arbitrage profits. That is, it will be possible for the investor to earn a risk-free profit while using none of his or her own money. Clearly, this type of opportunity cannot continue to exist indefinitely.
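As a check on this one-period example, here is a minimal Python sketch of Equations (24.7) and (24.8); the function name is ours.

```python
def one_period_call(S, X, u, d, R):
    p = (R - d) / (u - d)                 # Eq. (24.7): risk-neutral probability
    Cu = max(0.0, u * S - X)              # Eq. (24.1)
    Cd = max(0.0, d * S - X)              # Eq. (24.2)
    return (p * Cu + (1 - p) * Cd) / R    # Eq. (24.8)

print(one_period_call(100, 100, 1.10, 0.90, 1.07))   # ~7.94
```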
24.4 The Generalized Binomial Option Pricing Model
Suppose we are interested in the case where there is more than one period until the option expires. We can extend the one-period binomial model to consideration of two or more periods. Because we are assuming that the stock follows a binomial process, from one period to the next it can only go up by a factor of u or go down by a factor of d. After one period the stock's price is either uS or dS. Between the first and second periods, the stock's price can once again go up by u or down by d, so the possible prices for the stock two periods from now are uuS, udS, and ddS. This process is demonstrated in tree-diagram form (Fig. 24.1) in Example 24.1 later in this chapter. Note that the option's price at expiration, two periods from now, is a function of the same relationship that determined its expiration price in the one-period model; more specifically, the call option's maturity value is always

CT = Max[0, ST − X]   (24.9)

where T designates the maturity date of the option. To derive the option's price with two periods to go (T = 2), it is helpful as an intermediate step to derive the value of Cu and Cd with one period to expiration when the stock price is either uS or dS, respectively:

Cu = [pCuu + (1 − p)Cud] / R   (24.10)

Cd = [pCdu + (1 − p)Cdd] / R   (24.11)

Equation (24.10) tells us that if the value of the option after one period is Cu, the option will be worth either Cuu (if the stock price goes up) or Cud (if the stock price goes down) after one more period (at its expiration date). Similarly, Equation (24.11) shows that if the value of the option is Cd after one period, the option will be worth either Cdu or Cdd at the end of the second period. Replacing Cu and Cd in Equation (24.8) with their expressions in Equations (24.10) and (24.11), respectively, we can simplify the resulting equation to yield the two-period equivalent of the one-period binomial pricing formula, which is

C = [p²Cuu + 2p(1 − p)Cud + (1 − p)²Cdd] / R²   (24.12)

In Equation (24.12), we used the fact that Cud = Cdu because the price will be the same in either case. Following Equations (24.10) and (24.11), we can obtain

Cuu = [pCuuu + (1 − p)Cuud] / R   (24.13)

Cdd = [pCddu + (1 − p)Cddd] / R   (24.14)

Cud = [pCudu + (1 − p)Cudd] / R   (24.15)

Substituting Cuu, Cud, and Cdd in Equation (24.12), we can obtain the three-period option pricing model

C = [p³Cuuu + 3p²(1 − p)Cuud + 3p(1 − p)²Cudd + (1 − p)³Cddd] / R³   (24.16)

Equation (24.16) can also be written in terms of a generalized form as follows:

C = (1/R³) Σ_{k=0}^{3} [3!/(k!(3 − k)!)] p^k (1 − p)^(3−k) Max[0, u^k d^(3−k) S − X]   (24.17)
Following Equations (24.13), (24.14), and (24.15), we can obtain

Cuuu = [pCuuuu + (1 − p)Cuuud] / R   (24.18)

Cddd = [pCdddu + (1 − p)Cdddd] / R   (24.19)

Cuud = [pCuudu + (1 − p)Cuudd] / R   (24.20)

Cudd = [pCuddu + (1 − p)Cuddd] / R   (24.21)

Substituting Cuuu, Cuud, Cudd, and Cddd in Equation (24.16), we can obtain the four-period option pricing model

C = [p⁴Cuuuu + 4p³(1 − p)Cuuud + 6p²(1 − p)²Cuudd + 4p(1 − p)³Cuddd + (1 − p)⁴Cdddd] / R⁴   (24.22)

Equation (24.22) can also be written in terms of a generalized form as follows:

C = (1/R⁴) Σ_{k=0}^{4} [4!/(k!(4 − k)!)] p^k (1 − p)^(4−k) Max[0, u^k d^(4−k) S − X]   (24.23)

We know the values of the parameters S and X. If we assume that R, u, and d will remain constant over time, the possible maturity values for the option can be determined exactly. Thus, deriving the option's fair value with two periods to maturity is a relatively simple process of working backwards from the possible maturity values.

Finally, using this same procedure of going from a one-period model to a two-period model, a three-period model, and then to a four-period model, we can extend the binomial approach to its more generalized form, with n-period maturity:

C = (1/Rⁿ) Σ_{k=0}^{n} [n!/(k!(n − k)!)] p^k (1 − p)^(n−k) Max[0, u^k d^(n−k) S − X]   (24.24)

To actually get this form of the binomial model, we could extend the two-period model to three periods, then from three periods to four periods, and so on. Equation (24.24) would be the result of these efforts. To show how Equation (24.24) can be used to assess a call option's value, we modify the example as follows: S = $100, X = $100, R = 1.07, n = 3, u = 1.1, and d = 0.90. First we calculate the value of p from Equation (24.7) as 0.85, so 1 − p is 0.15. Next, we calculate the four possible ending values for the call option after three periods in terms of Max[0, u^k d^(n−k) S − X]:

C1 = Max[0, (1.1)³(0.90)⁰(100) − 100] = 33.10

C2 = Max[0, (1.1)²(0.90)(100) − 100] = 8.90

C3 = Max[0, (1.1)(0.90)²(100) − 100] = 0

C4 = Max[0, (1.1)⁰(0.90)³(100) − 100] = 0

Now we insert these numbers (C1, C2, C3, and C4) into the model and sum the terms:

C = (1/1.07³){[3!/(0!3!)](0.85)⁰(0.15)³ × 0 + [3!/(1!2!)](0.85)(0.15)² × 0 + [3!/(2!1!)](0.85)²(0.15) × 8.90 + [3!/(3!0!)](0.85)³(0.15)⁰ × 33.10}
= (1/1.225)[(0.32513 × 8.90) + (0.61413 × 33.10)]
= $18.96

As this example suggests, working out a multiple-period problem by hand with this formula can become laborious as the number of periods increases. Fortunately, programming this model into a computer is not too difficult.
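Indeed, a direct transcription of Equation (24.24) is only a few lines. The following Python sketch (our own naming) reproduces the $18.96 value.

```python
from math import comb

def binomial_call(S, X, u, d, R, n):
    """Equation (24.24): n-period binomial value of a European call."""
    p = (R - d) / (u - d)
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               * max(0.0, u ** k * d ** (n - k) * S - X)
               for k in range(n + 1)) / R ** n

print(binomial_call(100, 100, 1.1, 0.90, 1.07, 3))   # ~18.96
```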
As this example suggests, working out a multiple-period problem by hand with this formula can become laborious as the number of periods increases. Fortunately, programming this model into a computer is not too difficult. Now let’s derive a binomial option pricing model in terms of the cumulative binomial density function. As a first step, we can rewrite Equation (24.24) as # k nk nŠ K nk u d p .1 p/ C DS kŠ.n K/Š Rn kDm " n # X X nŠ (24.25) p k .1 p/nk n R kŠ.n k/Š "
n X
kDm
This formula is identical to Equation (24.24) except that we have removed the Max operator. In order to remove the Max operator, we need to make uk d nk S X positive, which we can do by changing the counter in the summation from k D 0 to k D m. What is m? It is the minimum number of upward stock movements necessary for the option to terminate “in the money” (that is, uk d nk S X > 0). How can we interpret Equation (24.25)? Consider the second term in brackets; it is just a cumulative binomial distribution with parameters of n and p. Likewise, via a small algebraic manipulation we can show that the first term in the brackets is also a cumulative binomial distribution. This can be done by defining P 0 .u=R/p and 1 P 0 .d =R/.1 p/. Thus P k .1 p/nk
uk d nk D p rk .1 p 0 /nk Rn
24 Applications of the Binomial Distribution to Evaluate Call Options
Therefore, the first term in brackets of Equation (24.25) is also a cumulative binomial distribution with parameters n and p′. Using the definition of the cumulative binomial function, we can write the binomial call option model as

C = S·B1(n, p′, m) − (X/Rⁿ)·B2(n, p, m)   (24.26)

where

B1(n, p′, m) = Σ_{k=m}^{n} C(n,k) (p′)^k (1 − p′)^(n−k)

B2(n, p, m) = Σ_{k=m}^{n} C(n,k) p^k (1 − p)^(n−k)

and m is the minimum number of times the stock has to go up for the investor to finish in the money (that is, for the stock price to become larger than the exercise price).

In this chapter, we showed that by employing the definition of a call option and by making some simplifying assumptions, we could use the binomial distribution to find the value of a call option. In Chap. 26, we will show how the binomial distribution is related to the normal distribution and how this relationship can be used to derive one of the most famous valuation equations in finance, the Black-Scholes option pricing model.

Example 24.1. A Decision Tree Approach to Analyzing Future Stock Price

By making some simplifying assumptions about how a stock's price can change from one period to the next, it is possible to forecast the future price of the stock by means of a decision tree. To illustrate this point, let's consider the following example. Suppose the price of Company A's stock is currently $100. Now let's assume that from one period to the next, the stock can go up by 17.5% or go down by 15%. In addition, let us assume that there is a 50% chance that the stock will go up and a 50% chance that the stock will go down. It is also assumed that the price movement of a stock (or of the stock market) today is completely independent of its movement in the past; in other words, the price will rise or fall today by a random amount. A sequence of these random increases and decreases is known as a random walk. Given this information, we can lay out the paths that the stock's price may take. Figure 24.1 shows the possible stock prices for Company A for four periods.

Fig. 24.1 Price path of underlying stock (Source: Rendleman, R. J., Jr. and B. J. Bartter. 1979. "Two-state option pricing." Journal of Finance 34(5), 1093–1110.)

Note that in period 1 there are two possible outcomes: the stock can go up in value by 17.5% to $117.50 or down by 15% to $85.00. In period 2 there are four possible outcomes. If the stock went up in the first period, it can go up again to $138.06 or down in the second period to $99.88. Likewise, if the stock went down in the first period, it can go down again to $72.25 or up in the second period to $99.88. Using the same argument, we can trace the path of the stock's price for all four periods. If we are interested in forecasting the stock's price at the end of period 4, we can find the average price of the stock for the 16 possible outcomes that can occur in period 4:

P̄ = (Σ_{i=1}^{16} Pi)/16 = (190.61 + 137.89 + ... + 52.20)/16 = $105.09

We can also find the standard deviation for the stock's return:

σP = {[(190.61 − 105.09)² + ... + (52.20 − 105.09)²]/16}^(1/2) = $34.39

P̄ and σP can be used to predict the future price of stock A.
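The cumulative-binomial form of Equation (24.26) can be checked numerically against the direct sum. Below is a minimal Python sketch, with our own naming, for the same three-period example.

```python
from math import comb

def tail(n, p, m):
    """Cumulative binomial tail: sum over k = m..n of C(n,k) p^k (1-p)^(n-k)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(m, n + 1))

def binomial_call_cumulative(S, X, u, d, R, n):
    p = (R - d) / (u - d)
    p_prime = (u / R) * p                  # p' = (u/R)p, so 1 - p' = (d/R)(1 - p)
    # m: smallest number of up moves that leaves the option in the money
    m = next(k for k in range(n + 1) if u ** k * d ** (n - k) * S > X)
    return S * tail(n, p_prime, m) - (X / R ** n) * tail(n, p, m)

print(binomial_call_cumulative(100, 100, 1.1, 0.90, 1.07, 3))   # ~18.96, as before
```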
24.5 Conclusion

In this chapter, we demonstrated how an option pricing model can be derived in a less mathematical fashion using the binomial option pricing model. We first discussed the basic concepts of call options; then we showed how decision trees can be used to derive the binomial call option pricing model. The binomial call option pricing model can be used to derive the Black-Scholes option pricing model (1973), as shown by Rendleman and Bartter (RB, 1979) and Cox, Ross, and Rubinstein (CRR, 1979), which we will discuss in detail in the next chapter.
References

Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–654.
Cox, J. C. and M. Rubinstein. 1985. Options markets, Prentice-Hall, Englewood Cliffs, NJ.
Cox, J., S. A. Ross, and M. Rubinstein. 1979. "Option pricing: a simplified approach." Journal of Financial Economics 7, 229–263.
Hull, J. 2005. Options, futures, and other derivatives, 6th Edition, Prentice Hall, Upper Saddle River, NJ.
Jarrow, R. A. and S. Turnbull. 1999. Derivative securities, 2nd Edition, South-Western College Pub, Cincinnati, OH.
Lee, C. F. 2009. Handbook of quantitative finance and risk management, Springer, New York.
Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York.
Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics, World Scientific Publishing Co., Singapore.
Lee, Jack C., C. F. Lee, R. S. Wang, and T. I. Lin. 2004. "On the limit properties of binomial and multinomial option pricing models: review and integration." Advances in Quantitative Analysis of Finance and Accounting, New Series, Vol. 1, World Scientific, Singapore.
Lee, John C. 2001. "Using Microsoft Excel and decision trees to demonstrate the binomial option pricing model." Advances in Investment Analysis and Portfolio Management 8, 303–329.
MacBeth, J. and L. Merville. 1979. "An empirical examination of the Black-Scholes call option pricing model." The Journal of Finance 34, 1173–1186.
Rendleman, R. J., Jr. and B. J. Bartter. 1979. "Two-state option pricing." Journal of Finance 34(5), 1093–1110.
Chapter 25
Multinomial Option Pricing Model

Cheng-Few Lee and Jack C. Lee
Abstract In this chapter, we extend the binomial option pricing model to a multinomial option pricing model. We then derive the multinomial option pricing model and obtain the Black and Scholes model as its limiting case. Finally, we introduce a lattice framework for option pricing and its application to option valuation.
Keywords Multinomial process · Black and Scholes model · Cox, Ross and Rubinstein (CRR) lattice binomial approach · Lattice model

25.1 Introduction

Instead of two possible movements for the stock price, as considered by Cox et al. (1979) and Rendleman and Bartter (1979), it is natural to extend the model to the situation in which there are k + 1 possible price movements. In this section we will present the extension proposed by Madan et al. (1989). More details are provided and should be helpful to the readers. We will derive the multinomial option pricing model in Sect. 25.2 and the Black and Scholes model as a limiting case in Sect. 25.3.

25.2 Multinomial Option Pricing Model

25.2.1 Derivation of the Option Pricing Model

Suppose that the stock price follows a multiplicative multinomial process over discrete periods. The rate of return on the stock over each period can have (k + 1) possible values: f_i − 1 with probability q_i, i = 1, ..., k + 1. Thus, if the current stock price is S, the stock price at the end of the period will be one of the f_i·S's. We can represent this movement with the following diagram:

S → f_1·S        w.p. q_1
    f_2·S        w.p. q_2
    ⋮
    f_{k+1}·S    w.p. q_{k+1}

where w.p. denotes "with probability." We need the following definitions and notations in our presentation.

Let X = (x_1, x_2, ..., x_k, x_{k+1})^T and q = (q_1, q_2, ..., q_k, q_{k+1})^T, 0 < q_j < 1, j = 1, 2, ..., k + 1. X is said to have the multinomial distribution, or X ~ Mult(n, q), if and only if the joint probability density is

f(X) = f(x_1, x_2, ..., x_{k+1}) = [n!/(x_1! x_2! ⋯ x_{k+1}!)] q_1^{x_1} q_2^{x_2} ⋯ q_{k+1}^{x_{k+1}} = C(n; x_1, x_2, ..., x_{k+1}) Π_{j=1}^{k+1} q_j^{x_j}   (25.1)

for all 0 ≤ x_j ≤ n, where C(n; x_1, ..., x_{k+1}) denotes the multinomial coefficient, Σ_{j=1}^{k+1} x_j = 1^T X = n, and Σ_{j=1}^{k+1} q_j = 1^T q = 1, with 1 the (k + 1)-dimensional vector of unit entries.

Let C = the current value of the n-period option price; K = the option exercise price; S* = the stock price at the end of the n periods; n = the number of periods to maturity; and r̂ = one plus the riskless rate per period. The following theorem gives the option price for the multinomial case. This theorem extends the option price formula from the binomial case to the multinomial case.
Theorem 1. The current value of the n-period multinomial option price C is given by

C = Σ_{X∈A} C(n; x_1, x_2, ..., x_{k+1}) {S Π_{j=1}^{k+1} (f_j q_j/r̂)^{x_j} − K r̂^{−n} Π_{j=1}^{k+1} q_j^{x_j}}   (25.2)

where A = {X | X = (x_1, x_2, ..., x_{k+1})^T, x_j ∈ N ∪ {0}, 1^T X = n, S* > K}, and S and S* are the current stock price and the stock price at the end of the n periods, respectively.

Proof. Since

S* = f_1^{x_1} f_2^{x_2} ⋯ f_{k+1}^{x_{k+1}} S = (Π_{j=1}^{k+1} f_j^{x_j}) S,

C = r̂^{−n} E[max(S* − K, 0)], and A = {X | 1^T X = n, S* > K}, we have

C = Σ_{X∈A} C(n; x_1, ..., x_{k+1}) {(S* − K) Π_{j=1}^{k+1} q_j^{x_j}} / r̂ⁿ
  = Σ_{X∈A} C(n; x_1, ..., x_{k+1}) {(S Π_{j=1}^{k+1} f_j^{x_j} − K) Π_{j=1}^{k+1} q_j^{x_j}} / r̂ⁿ
  = Σ_{X∈A} C(n; x_1, ..., x_{k+1}) {S r̂^{−n} Π_{j=1}^{k+1} (f_j q_j)^{x_j} − K r̂^{−n} Π_{j=1}^{k+1} q_j^{x_j}}
  = Σ_{X∈A} C(n; x_1, ..., x_{k+1}) {S Π_{j=1}^{k+1} (f_j q_j/r̂)^{x_j} − K r̂^{−n} Π_{j=1}^{k+1} q_j^{x_j}},

where the last step uses 1^T X = n, so that r̂^{−n} = Π_{j=1}^{k+1} (1/r̂)^{x_j}.

Now, let π = (π_1, π_2, ..., π_k, π_{k+1})^T, π_j = f_j q_j/r̂. If investors were risk-neutral, that is,

Σ_{j=1}^{k+1} (f_j S) q_j = r̂ S, or Σ_{j=1}^{k+1} f_j q_j = r̂,

implying Σ_{j=1}^{k+1} f_j q_j/r̂ = Σ_{j=1}^{k+1} π_j = 1^T π = 1, then π and q are both probability vectors in the multinomial distribution. So we can rewrite Equation (25.2) as

C = S P_π(A_π) − K r̂^{−n} P_q(A_q),   (25.3)

where

P_π(A_π) = Σ_{X∈A_π} C(n; x_1, ..., x_{k+1}) Π_{j=1}^{k+1} π_j^{x_j},   (25.4)

P_q(A_q) = Σ_{X∈A_q} C(n; x_1, ..., x_{k+1}) Π_{j=1}^{k+1} q_j^{x_j}.   (25.5)

We next consider the limiting result. If t is the fixed length of time to expiration, and n is the number of periods of length l prior to expiration, then l = t/n. As trading becomes more and more frequent, l gets closer and closer to zero. We must then adjust the interval-dependent variables r̂ and the f_j's in such a way that we obtain empirically realistic results as l becomes smaller, that is, as n → ∞. Let r denote one plus the interest rate over a fixed length of time, while r̂ means one plus the interest rate over a period of length l. Suppose that r̂ⁿ = r^t for all n. Then r̂ = r^{t/n}, which shows how r̂ must depend on n for the total return over time t to be independent of n. Denote λ^T = (λ_1, λ_2, ..., λ_{k+1}) = (ln f_1, ln f_2, ..., ln f_{k+1}).
25.2.2 The Black and Scholes Model as a Limiting Case
j D1
If j D h j , where h D then ln
S S
D
kC1 P j D1
q
t n
Co
xj log fj D T x .
p1 n
kC1 P , qj j2 D 1, j D1
25 Multinomial Option Pricing Model
401
" D
T .n q /
Let n D nh
2
kC1 P j D1
tC
qj j2
t, and
n T q
2
kC1 P j D1
2 t 2 n
, then
T q
D
# qj 2j
Hence XO 2 Aq
t n.
ln
Since t T q D h T q D ; h D O
n
qDO T
1 ; we have p n
ln D ln
.y1 ; y2 ; : : : yp1 ; yp /, then yO D .y1 ; y2 ; : : : yp1 ; yp /. T
Bhattacharya and Rao (1976) showed that if X Mult
P D .n; q /, then for large n; XO Nk n qO ; n q , where
P T D
. q O / q O q O , and
. q O / is a diagonal matrix; that q
q ;i D j is, .qO / D .bij /; bij D i . 0 ;i ¤ j
We next observe that XO 2 Aq iff X D f1 x1 f2 x2 : : : fkC1 xkC1 S > K,
or XO 2 Aq iff ln .f1 x1 f2 x2 : : : fkC1 xkC1 / D
kC1 P
ln.fj /xj
j D1
D O XO CkC1 xkC1 > ln T
or Z D
P 1 Œ n O T O 2 q
>
K O S X
ln
D
S OT qO K CkC1 xkC1 Cn P 1 Œ n O T O 2 q
X
,
O kC1 1 / .
T
D . kC1 1 /T
Proof. See the Appendix. If S C n T q ln K
X
O kC1 1 /: .
Zq D
P
1
Œn. kC1 1 /T q . kC1 1 / 2
;
D O XO n O T qO kC1 1 T XO CnkC1 1 T qO
D
We need the following lemma. P T O kC1 Lemma 1. (a) If q D . q / q q , then .
P P O kC1 1 / D . kC1 1 /T 1 /T q . kC1 1 / q .
P P T T (b) if D . / ; O / O O , then D . . O kC1 1 /T
T
D
S n O T qO : K
Consequently, . O kC1 1 /T .XO n qO / ! N
P 0; n. O kC1 1 /T O kC1 1 / . q .
. O kC1 1 /T .XO n qO /
O T XO
O T XO
S kC1 xkC1 n O T qO nkC1 qkC1 CkC1 xkC1 K
Since,
S C Thus, P q .Aq / D P q . O kC1 1 /T .XO n qO / > ln K
n O T qO .
2 Aq iff,
where Z N.0; 1/ is a random variable and the right handside is also a random variable because it is a function of xkC1 . Madan et al. (1989) used the random variable . O kC1 1 /T .XO n qO / to solve the problem:
obtained from y by deleting its last entry. That is, if y T D
O T XO n OT qO
S kC1 xkC1 n O T qO K
For any vector y , let yO denote the truncated vector
iff . O kC1 1 /T .XO n qO / >
1 : p n
O T XO n O T qO >
iff
then
0
n O qO kC1 .n xkC1 / C nkC1 .1 qkC1 /
T
B P q .Aq / D Pq @
n O qO CkC1 xkC1 nkC1 qkC1
Œn. O kC1 1
we have
1
. O kC1 1 /T .XO n qO /
/T
P
O kC1 1 / q .
1 2
C >Zq A
1 N.Zq / D N.Zq /:
O T XO n O T qO D . O kC1 1 /T .XO n qO /
CnkC1 qkC1 kC1 xkC1 :
Similarly, if Z D P .A / N.Z /.
ln
S T K Cn
Œn. kC1 1 /T
P
1 . kC1 1 / 2
, then
Theorem 2. Suppose r̂^{−1} f'q = r^{−t/n} f'q = 1 and μ_j = ln f_j = hλ_j, with h = √(t/n) + o(1/√n) for all j and ∑_{j=1}^{k+1} q_j λ_j² = σ². Then

(1) μ'q = (ln r − σ²/2)(t/n) + o(1/n),
(2) μ'π = (ln r + σ²/2)(t/n) + o(1/n),
(3) lim_{n→∞} n(μ̂ − μ_{k+1}1̂)' Σ_q̂ (μ̂ − μ_{k+1}1̂) = lim_{n→∞} n(μ − μ_{k+1}1)' Σ_q (μ − μ_{k+1}1) = σ²t,
(4) lim_{n→∞} n(μ̂ − μ_{k+1}1̂)' Σ_π̂ (μ̂ − μ_{k+1}1̂) = lim_{n→∞} n(μ − μ_{k+1}1)' Σ_π (μ − μ_{k+1}1) = σ²t.

Proof. See the Appendix.

With the assumptions of Theorem 2, let

  d_1 = lim_{n→∞} Z_π = [ ln(S/K) + (ln r + σ²/2)t ] / (σ²t)^{1/2} = ln(S/K)/(σ√t) + (√t ln r)/σ + σ√t/2,

  d_2 = lim_{n→∞} Z_q = [ ln(S/K) + (ln r − σ²/2)t ] / (σ²t)^{1/2} = d_1 − σ√t.

Then C → S N(d_1) − K r^{−t} N(d_2), which is the Black-Scholes formula. Here we use the facts that P_π(A_π) → N(d_1) and P_q(A_q) → N(d_2).
25.3 A Lattice Framework for Option Pricing

25.3.1 Modification of the Two-State Approach for a Single State Variable

In this subsection, we present a lattice framework for option pricing proposed by Boyle (1988). We first discuss a modification of the CRR lattice binomial approach for option pricing in which there is only a single state variable; the case with two state variables is discussed later. Instead of the two-jump process used in CRR, we use a three-jump process. Consider an asset S with a lognormal distribution of returns. Over a small time interval h, this distribution is approximated by a three-point jump process such that the expected return on the asset is the riskless rate, and the variance of the approximating distribution is equal to the variance of the corresponding lognormal distribution.
Let T = time to option maturity (in years), X = the exercise price of the option, r = the riskless annual interest rate, σ² = the variance of the annual rate of return on the underlying asset, h = T/n = the length of one time step, Su = the value of the asset after an up jump, S = the value of the asset after a horizontal jump, and Sd = the value of the asset after a down jump. Let S_n be a random variable whose distribution is as follows:

  Nature of jump   Probability   Asset price (value of S_n)
  Up               p1            Su
  Horizontal       p2            S
  Down             p3            Sd

We further assume that ud = 1. It is straightforward to see that

  E(S_n) = (p1 u + p2 + p3/u) S = MS,
  Var(S_n) = p1(S²u² − S²M²) + p2(S² − S²M²) + p3(S²/u² − S²M²) = S²V,

where

  M = p1 u + p2 + p3/u,
  V = p1(u² − M²) + p2(1 − M²) + p3(1/u² − M²).
For the approximation to the lognormal distribution, we impose three conditions: (i) the probabilities are positive and sum to one; (ii) the mean of the discrete distribution, MS, equals the mean of the lognormal distribution in a risk-neutral world, that is, SM = S exp(rh); and (iii) the variance of the discrete distribution, S²V, equals the variance of the lognormal distribution, S²V = S²M²[exp(σ²h) − 1].

Before proceeding further, we note that if log(S_n/S) is normally distributed with mean μh and variance hσ², then

  E(S_n/S) = e^{μh + hσ²/2},

which is equal to e^{rh} in a risk-neutral world. Hence h(μ + σ²/2) = hr. Furthermore,
  Var(S_n/S) = e^{2μh + hσ²}(e^{hσ²} − 1) = e^{2hr}(e^{hσ²} − 1).

By setting E(S_n/S) = M and Var(S_n/S) = V, we obtain conditions (ii) and (iii). It is noted that although M and V have been specified, they will be forced to take on the values exp(rh) and M²[exp(σ²h) − 1], respectively, in the approximation. The above three conditions are:

  p1 + p2 + p3 = 1,   (25.6)
  p1 Su + p2 S + p3 S/u = SM,   (25.7)
  p1(S²u² − S²M²) + p2(S² − S²M²) + p3(S²/u² − S²M²) = S²V.   (25.8)
Even though there are three equations in three unknowns (assuming the stretch parameter u is specified), we need only two equations, as p1, p2, and p3 are linearly dependent. From Equation (25.6) we have p2 = 1 − p1 − p3. Substituting this into Equations (25.7) and (25.8), we have

  p1(u − 1) + p3(1/u − 1) = M − 1,   (25.9)
  p1(u² − 1) + p3(1/u² − 1) = V + M² − 1.   (25.10)

From Equations (25.9) and (25.10) we obtain

  p1(u, M, V) = [(V + M² − M)u − (M − 1)] / [(u − 1)(u² − 1)],   (25.11)
  p3(u, M, V) = [u²(V + M² − M) − u³(M − 1)] / [(u − 1)(u² − 1)],   (25.12)
  p2 = 1 − p1 − p3.   (25.13)

From Equations (25.11)–(25.13) we see that u ≠ 1; in fact, u > 1. The notation p_i(u, M, V) indicates that the probability p_i is a function of u, M, and V. Recall that in CRR, u is set to be

  u = exp(σ√h).   (25.14)

However, it is noted by Boyle that this choice results in negative values of p2 for many realistic parameter values. Thus, it is proposed to consider

  u = exp(λσ√h),   (25.15)

where λ ≥ 1 is the stretch parameter. Table 25.1 shows a list of jump amplitudes and jump probabilities; note that the λ = 1.00 row yields a negative p2, which is what motivates taking λ > 1.

Table 25.1 Jump amplitudes and jump probabilities (σ = 0.2, r = 0.1, T = 1.0, n = 20)

  λ      u       p1      p2       p3
  1.00   1.0457  0.5539  -0.0184  0.4646
  1.10   1.0504  0.4610   0.1592  0.3798
  1.20   1.0551  0.3900   0.2943  0.3156
  1.30   1.0599  0.3346   0.3995  0.2659
  1.40   1.0646  0.2904   0.4829  0.2266
  1.50   1.0694  0.2547   0.5502  0.1951
  1.60   1.0742  0.2253   0.6053  0.1694
  1.70   1.0790  0.2008   0.6510  0.1482
  1.80   1.0838  0.1802   0.6892  0.1305
  1.90   1.0887  0.1627   0.7216  0.1156
  2.00   1.0936  0.1477   0.7493  0.1030

Before moving on to the two-asset extension, it is noted that for a range of parameter values, the accuracy of the three-jump method with 5 time intervals was comparable to that of the CRR method with 20 time intervals, as claimed by Boyle.
25.3.2 A Lattice Model for Valuation of Options on Two Underlying Assets

We next consider the extension to the two-asset case. Assume that the two assets are bivariate lognormally distributed. Let S_{1,n} and S_{2,n} be the two discrete random variables at the end of the interval, corresponding to the two assets with current values S1 and S2, respectively. As before, we assume that the option matures after time T with exercise price X, and T = nh. The means, variances, and covariance of the two assets at the end of this interval are obtained from the lognormal distribution as follows. Let Y_i = log(S_{i,n}/S_i), with

  (Y_1, Y_2)' ~ N( (μ_{y1}, μ_{y2})', [σ_{y11}, σ_{y12}; σ_{y21}, σ_{y22}] ),

where μ_{yi} = μ_i h and σ_{yij} = σ_{ij} h. Since the marginal distribution of Y_i is normal, we have, as in the one-asset case,

  E(S_{i,n}) = S_i M_i, with M_i = e^{rh},
  Var(S_{i,n}) = E(S_{i,n}²) − [E(S_{i,n})]² = S_i² M_i²[exp(σ_i²h) − 1] = S_i² V_i, with V_i = M_i²[exp(σ_i²h) − 1], i = 1, 2.

By the property of the moment generating function for the bivariate normal distribution, we have

  E(S_{1,n}S_{2,n}) = S1 S2 E(e^{Y_1 + Y_2}) = S1 S2 e^{μ_1 h + μ_2 h + (h/2)(σ_1² + σ_2² + 2ρσ_1σ_2)} = S1 S2 M1 M2 exp(ρσ_1σ_2 h),   (25.16)

where ρ is the correlation coefficient between the two assets. Thus, the covariance between the two assets is

  Cov(S_{1,n}, S_{2,n}) = E(S_{1,n}S_{2,n}) − E(S_{1,n})E(S_{2,n}) = S1 S2 M1 M2 [exp(ρσ_1σ_2 h) − 1].   (25.17)

In order to obtain an efficient algorithm, a five-point jump process is natural and is given below.

  Event   Probability   Asset 1 value   Asset 2 value
  E1      p1            S1 u1           S2 u2
  E2      p2            S1 u1           S2 d2
  E3      p3            S1 d1           S2 d2
  E4      p4            S1 d1           S2 u2
  E5      p5            S1              S2

where u1 d1 = u2 d2 = 1 and ∑_{i=1}^5 p_i = 1.

In order to obtain the probabilities and the stretch parameters, we first find the probabilities in terms of u1, u2, and the other variables. By equating the means and variances of the discrete and lognormal distributions, we have

  (p1 + p2) S1 u1 + p5 S1 + (p3 + p4) S1 d1 = S1 M1,   (25.18)
  (p1 + p4) S2 u2 + p5 S2 + (p2 + p3) S2 d2 = S2 M2,   (25.19)
  (p1 + p2)(S1²u1² − S1²M1²) + p5(S1² − S1²M1²) + (p3 + p4)(S1²d1² − S1²M1²) = S1²V1,   (25.20)
  (p1 + p4)(S2²u2² − S2²M2²) + p5(S2² − S2²M2²) + (p2 + p3)(S2²d2² − S2²M2²) = S2²V2.   (25.21)

Furthermore,

  ∑_{i=1}^5 p_i = 1.   (25.22)

Note that there is a strong correspondence between the three Equations (25.6)–(25.8) and the triplet Equations (25.22), (25.18), and (25.20). If we regard (p1 + p2), (p3 + p4), and p5 as the new probabilities, we can solve for them as before; they are given by Equations (25.11)–(25.13) with appropriate modifications. Likewise, using the triplet (25.22), (25.19), and (25.21), we can solve for (p1 + p4), (p2 + p3), and p5. Thus, we can write

  p1 + p2 = f(u1, M1, V1) = f1,   (25.23)
  p3 + p4 = g(u1, M1, V1) = g1,   (25.24)
  p1 + p4 = f(u2, M2, V2) = f2,   (25.25)
  p2 + p3 = g(u2, M2, V2) = g2,   (25.26)

where f and g denote the right-hand sides of Equations (25.11) and (25.12). Obviously f1 + g1 = f2 + g2, which gives the relationship to be satisfied by u1 and u2. From Equation (25.16) we obtain the following equation by equating the expected value of the product of the assets under the approximating discrete and lognormal distributions:

  (p1 u1 u2 + p2 u1 d2 + p3 d1 d2 + p4 d1 u2 + p5) S1 S2 = R S1 S2,   (25.27)

where R = M1 M2 exp(ρσ_1σ_2 h) = E(S_{1,n}S_{2,n})/(S1 S2).
From Equations (25.23)–(25.27), we obtain the following:

  p1 = [ u1 u2 (R − 1) − f1(u1² − 1) − f2(u2² − 1) + (f2 + g2)(u1 u2 − 1) ] / [ (u1² − 1)(u2² − 1) ],
  p2 = [ f1(u1² − 1)u2² + f2(u2² − 1) − (f2 + g2)(u1 u2 − 1) − u1 u2 (R − 1) ] / [ (u1² − 1)(u2² − 1) ],
  p3 = [ u1 u2 (R − 1) − f1(u1² − 1)u2² + g2(u2² − 1)u1² + (f2 + g2)(u1 u2 − u2²) ] / [ (u1² − 1)(u2² − 1) ],
  p4 = [ f1(u1² − 1) + f2(u2² − 1)u1² − (f2 + g2)(u1 u2 − 1) − u1 u2 (R − 1) ] / [ (u1² − 1)(u2² − 1) ].

As in the one-asset case, set

  u_i = exp(λ_i σ_i √h).   (25.28)

Table 25.2 shows a list of jump amplitudes and jump probabilities.

Table 25.2 Jump amplitudes and jump probabilities for the 5-jump process (σ1 = 0.2, σ2 = 0.25, r = 0.1, n = 20, T = 1, ρ = 0.5)

  λ      u1      u2      p1      p2      p3      p4      p5
  1.00   1.0457  1.0574  0.4201  0.1337  0.3448  0.1198  -0.0184
  1.10   1.0504  1.0633  0.3499  0.1111  0.2814  0.0984   0.1592
  1.20   1.0551  1.0692  0.2962  0.0938  0.2334  0.0822   0.2943
  1.30   1.0599  1.0752  0.2543  0.0803  0.1963  0.0696   0.3995
  1.40   1.0646  1.0812  0.2208  0.0696  0.1670  0.0597   0.4829
  1.50   1.0694  1.0873  0.1937  0.0609  0.1434  0.0517   0.5502
  1.60   1.0742  1.0934  0.1715  0.0538  0.1243  0.0451   0.6053
  1.70   1.0790  1.0995  0.1529  0.0479  0.1085  0.0397   0.6510
  1.80   1.0838  1.1056  0.1373  0.0429  0.0953  0.0352   0.6892
  1.90   1.0887  1.1118  0.1240  0.0387  0.0842  0.0314   0.7216
  2.00   1.0936  1.1180  0.1126  0.0351  0.0748  0.0282   0.7493

We next present some numerical examples, as given by Boyle (1988), that illustrate the operation of the methods presented in this section. Both Johnson (1981) and Stulz (1982) have derived a number of exact results for European options when there are two underlying assets. These results can be used to check the accuracy of the approach. In particular, expressions exist for a European call option on the maximum of two assets and for a European put option on the minimum of two assets. Johnson (1987) obtained a generalization of these results for the case involving a European call option on the maximum or minimum of several assets; his results could be used as benchmarks to check extensions of this method to situations involving three or more assets.

The advantage of the present approach is that it permits early exercise and thus can be used to value American options and, in particular, American put options. The approach presented in this section can also be modified to handle dividends and other payouts. For the numerical solutions, we assume S1 = 40, S2 = 40, σ1 = 0.20, σ2 = 0.30, ρ = 0.5, r = 5% per year effective (0.048790 continuously compounded), T = 7 months = 0.5833333 years, and exercise prices of 35, 40, and 45. We display in Table 25.3 the results for three types of options: European call options on the maximum of the two assets, European put options on the minimum of the two assets, and American put options on the minimum of the two assets. Since we assume no dividend payments, the value of the European call on the maximum of the two assets will be identical to the value of an American call option with the same specifications.

Table 25.3 Different types of options involving two assets: comparison of the lattice approach of this section with accurate results

  Exercise price   10 steps   20 steps   50 steps   Accurate value

  European call on the maximum of the two assets
  35               9.404      9.414      9.419      9.420
  40               5.466      5.477      5.483      5.488
  45               2.817      2.790      2.792      2.795

  European put on the minimum of the two assets
  35               1.425      1.394      1.392      1.387
  40               3.778      3.790      3.795      3.798
  45               7.475      7.493      7.499      7.500

  American put on the minimum of the two assets
  35               1.450      1.423      1.423
  40               3.870      3.885      3.892
  45               7.645      7.674      7.689

The agreement between the numbers obtained using the lattice approximation and the accurate values is quite reasonable, especially when 50 time steps are used. In this case, the maximum difference is 0.005, and the accuracy would be adequate for most applications. For this set of parameter values, the American put is not much more valuable than its European counterpart.
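A sketch of the five-jump probability computation is given below. It follows the displayed solution for p1 through p4 literally, with R = M1 M2 exp(ρσ1σ2h) and p5 obtained from the normalization. Because the constraint f1 + g1 = f2 + g2 holds only approximately when both stretch parameters are set by Equation (25.28), the resulting probabilities can differ from the printed Table 25.2 in the third decimal place.

```python
import math

def two_asset_probs(sigma1=0.2, sigma2=0.25, rho=0.5, r=0.1,
                    T=1.0, n=20, lam1=1.5, lam2=1.5):
    """Five-jump probabilities p1..p5 from the solution displayed above."""
    h = T / n
    M1 = M2 = math.exp(r * h)
    V1 = M1**2 * (math.exp(sigma1**2 * h) - 1)
    V2 = M2**2 * (math.exp(sigma2**2 * h) - 1)
    u1 = math.exp(lam1 * sigma1 * math.sqrt(h))  # Equation (25.28)
    u2 = math.exp(lam2 * sigma2 * math.sqrt(h))
    R = M1 * M2 * math.exp(rho * sigma1 * sigma2 * h)

    def f(u, M, V):  # Equation (25.11): up probability of the one-asset model
        return ((V + M*M - M) * u - (M - 1)) / ((u - 1) * (u*u - 1))

    def g(u, M, V):  # Equation (25.12): down probability of the one-asset model
        return (u*u * (V + M*M - M) - u**3 * (M - 1)) / ((u - 1) * (u*u - 1))

    f1, f2, g2 = f(u1, M1, V1), f(u2, M2, V2), g(u2, M2, V2)
    D = (u1*u1 - 1) * (u2*u2 - 1)
    p1 = (u1*u2*(R-1) - f1*(u1*u1-1) - f2*(u2*u2-1) + (f2+g2)*(u1*u2-1)) / D
    p2 = (f1*(u1*u1-1)*u2*u2 + f2*(u2*u2-1) - (f2+g2)*(u1*u2-1) - u1*u2*(R-1)) / D
    p3 = (u1*u2*(R-1) - f1*(u1*u1-1)*u2*u2 + g2*(u2*u2-1)*u1*u1
          + (f2+g2)*(u1*u2 - u2*u2)) / D
    p4 = (f1*(u1*u1-1) + f2*(u2*u2-1)*u1*u1 - (f2+g2)*(u1*u2-1) - u1*u2*(R-1)) / D
    p5 = 1 - p1 - p2 - p3 - p4
    return p1, p2, p3, p4, p5

print(two_asset_probs())
```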
25.4 Conclusion

In this chapter we extend the binomial option pricing model to a multinomial option pricing model. We derive the multinomial option pricing model and obtain the Black-Scholes model as a limiting case. We also introduce a lattice framework for option pricing. To apply the lattice model, we present the modified models of the two-state approach for a single state variable and for two state variables.

References

Bhattacharya, R. N. and R. R. Rao. 1976. Normal approximation and asymptotic expansions, John Wiley, New York.
Boyle, P. P. 1988. "A lattice framework for option pricing with two state variables." Journal of Financial and Quantitative Analysis 23, 1–12.
Boyle, P. P. 1989. "The quality option and the timing option in futures contracts." Journal of Finance 44, 101–103.
Cox, J. C., S. A. Ross, and M. Rubinstein. 1979. "Option pricing: a simplified approach." Journal of Financial Economics 7, 229–263.
Johnson, H. 1981. "The pricing of complex options." Unpublished manuscript.
Johnson, H. 1987. "Options on the maximum or the minimum of several assets." Journal of Financial and Quantitative Analysis 22, 277–283.
Madan, D. B., F. Milne, and H. Shefrin. 1989. "The multinomial option pricing model and its Brownian and Poisson limits." The Review of Financial Studies 2(2), 251–265.
Rendleman, R., Jr. and B. Bartter. 1979. "Two-state option pricing." Journal of Finance 34, 1093–1110.
Stulz, R. M. 1982. "Options on the minimum or maximum of two risky assets: analysis and applications." Journal of Financial Economics 10, 161–185.

Appendix 25A Proof of Lemma 1

(a) Write q' = (q̂', q_{k+1}). Since

  Σ_q = Δ(q) − qq' = [ Δ(q̂), 0; 0', q_{k+1} ] − [ q̂q̂', q_{k+1}q̂; q_{k+1}q̂', q_{k+1}² ]
      = [ Σ_q̂, −q_{k+1}q̂; −q_{k+1}q̂', q_{k+1}(1 − q_{k+1}) ],

where Σ_q̂ = Δ(q̂) − q̂q̂', and since the last entry of μ − μ_{k+1}1 is μ_{k+1} − μ_{k+1} = 0, so that

  μ − μ_{k+1}1 = ( (μ̂ − μ_{k+1}1̂)', 0 )',

we have

  (μ − μ_{k+1}1)' Σ_q (μ − μ_{k+1}1)
    = ( (μ̂ − μ_{k+1}1̂)', 0 ) [ Σ_q̂, −q_{k+1}q̂; −q_{k+1}q̂', q_{k+1}(1 − q_{k+1}) ] ( (μ̂ − μ_{k+1}1̂)', 0 )'
    = (μ̂ − μ_{k+1}1̂)' Σ_q̂ (μ̂ − μ_{k+1}1̂).

(b) The proof is similar to that of (a).
Proof of Theorem 2

(1) Since r^{−t/n} f'q = 1 and 1'q = 1, we have (r^{−t/n}f − 1)'q = 0. Now

  r^{−t/n}f_j − 1 = e^{−(t/n)ln r + hλ_j} − 1 = [ −(t/n)ln r + hλ_j ] + (1/2)h²λ_j² + o(1/n).

Here we use the facts that μ_j = ln f_j = hλ_j with h = √(t/n) + o(1/√n), and that

  e^{A+B} − 1 = (A + B) + (1/2)B² + o(1/n) if A = O(1/n) and B = O(1/√n).

Hence

  0 = (r^{−t/n}f − 1)'q = ∑_{j=1}^{k+1} (r^{−t/n}f_j − 1)q_j
    = −(t/n)ln r + hλ'q + (1/2)h² ∑_{j=1}^{k+1} λ_j² q_j + o(1/n)
    = −(t/n)ln r + hλ'q + (σ²/2)(t/n) + o(1/n),

so that

  μ'q = hλ'q = (ln r − σ²/2)(t/n) + o(1/n),

which proves (1). Multiplying hλ'q = O(1/n) through by √n/h shows, in addition, that

  λ'q = O(1/√n) → 0 as n → ∞,

a fact used in the proof of (3) below.

(2) Since π_j = r^{−t/n}f_j q_j = q_j[1 + (r^{−t/n}f_j − 1)], we have

  μ'π = ∑_{j=1}^{k+1} μ_j π_j = ∑_{j=1}^{k+1} hλ_j q_j [ 1 − (t/n)ln r + hλ_j + o(1/√n) ]
      = hλ'q + h² ∑_{j=1}^{k+1} λ_j² q_j + o(1/n)
      = (ln r − σ²/2)(t/n) + σ²(t/n) + o(1/n)
      = (ln r + σ²/2)(t/n) + o(1/n).

(3) Let ξ_j = λ_j − λ_{k+1}, j = 1, ..., k+1, and ξ = (ξ_1, ..., ξ_{k+1})', so that μ − μ_{k+1}1 = hξ and ξ_{k+1} = 0. Then

  ξ'q = λ'q − λ_{k+1} → −λ_{k+1},

since λ'q → 0, and

  ξ'Δ(q)ξ = ∑_{j=1}^{k+1} q_j(λ_j − λ_{k+1})² = ∑_{j=1}^{k+1} q_jλ_j² − 2λ_{k+1}λ'q + λ_{k+1}² → σ² + λ_{k+1}².

Hence

  ξ'Σ_qξ = ξ'Δ(q)ξ − (ξ'q)² → σ² + λ_{k+1}² − λ_{k+1}² = σ²,

and, because nh² → t,

  lim_{n→∞} n(μ − μ_{k+1}1)' Σ_q (μ − μ_{k+1}1) = lim_{n→∞} nh² ξ'Σ_qξ = σ²t.

By Lemma 1(a), this also equals lim_{n→∞} n(μ̂ − μ_{k+1}1̂)' Σ_q̂ (μ̂ − μ_{k+1}1̂).

(4) The proof of (4) is analogous to that of (3), using (2) and Lemma 1(b).
Chapter 26
Two Alternative Binomial Option Pricing Model Approaches to Derive Black-Scholes Option Pricing Model

Cheng-Few Lee and Carl Shu-Ming Lin
Abstract In this chapter, we review two famous models of binomial option pricing, Rendleman and Bartter (RB 1979) and Cox et al. (CRR 1979). We show that the limiting results of the two models both lead to the celebrated Black-Scholes formula. From our detailed derivations, CRR is easy to follow if one has advanced-level knowledge of probability theory, but the assumptions on the model parameters make its applications limited. On the other hand, the RB model is intuitive and does not require higher-level knowledge of probability theory; nevertheless, the derivations of the RB model are more complicated and tedious. Readers who are interested in the binomial option pricing model can compare the two different approaches and find the one that best fits their interests and is easier to follow.
Keywords Binomial option pricing model r Black-Scholes formula r Two-state option pricing model

26.1 Introduction

The main purpose of this chapter is to review two famous binomial option pricing models: Rendleman and Bartter (RB 1979) and Cox et al. (CRR 1979). First, we give an alternative detailed derivation of the two models and show that the limiting results of both lead to the celebrated Black-Scholes formula. Then we compare the two approaches and analyze the advantages of each. Section 26.3 of this chapter draws essentially from the paper by Lee et al. (2004).

26.2 The Two-State Option Pricing Model of Rendleman and Bartter

In Rendleman and Bartter (1979), a stock price can either advance or decline during the next period. Let H_T^+ and H_T^− represent the returns per dollar invested in the stock if the price rises (the + state) or falls (the − state), respectively, from time T−1 to time T, and let V_T^+ and V_T^− be the corresponding end-of-period values of the option. Let R be the riskless interest rate. Rendleman and Bartter (1979) show that the price of the option can be represented in the recursive form

  P_{T−1} = [ V_T^+ (1 + R − H_T^−) + V_T^− (H_T^+ − 1 − R) ] / [ (H_T^+ − H_T^−)(1 + R) ],

which can be applied at any time T−1 to determine the price of the option as a function of its value at time T.

26.2.1 The Discrete Time Model

From the above equation, the value of a call option one period before maturity, at date T−1, is given by

  W_{T−1} = [ W_T^+ (1 + R − H^−) + W_T^− (H^+ − (1 + R)) ] / [ (H^+ − H^−)(1 + R) ].   (26.1)

Similarly,

  W_{T−2} = [ W_{T−1}^+ (1 + R − H^−) + W_{T−1}^− (H^+ − (1 + R)) ] / [ (H^+ − H^−)(1 + R) ].   (26.2)

C.-F. Lee and C.S.-M. Lin, Rutgers University, Newark, NJ, USA; e-mail: [email protected]
Substituting Equation (26.1) into Equation (26.2), we get

  W_{T−2} = { [ W_T^{++}(1 + R − H^−) + W_T^{+−}(H^+ − (1 + R)) ](1 + R − H^−)
            + [ W_T^{−+}(1 + R − H^−) + W_T^{−−}(H^+ − (1 + R)) ](H^+ − (1 + R)) } / [ (H^+ − H^−)²(1 + R)² ].   (26.3)

Noting that W_T^{+−} = W_T^{−+}, Equation (26.3) can be simplified as

  W_{T−2} = [ W_T^{++}(1 + R − H^−)² + 2W_T^{+−}(H^+ − (1 + R))(1 + R − H^−) + W_T^{−−}(H^+ − (1 + R))² ] / [ (H^+ − H^−)²(1 + R)² ].   (26.4)

We can use this recursive form to get W_0. Since after T periods there are C(T, 0) ways that a sequence of T pluses can occur, C(T, 1) ways that (T−1) pluses can occur, C(T, 2) ways that (T−2) pluses can occur, and so on, where C(T, i) = T!/(i!(T−i)!) denotes the binomial coefficient, W_0 can be represented by the binomial theorem as

  W_0 = [ C(T,0) W_T^{+...+} (1 + R − H^−)^T (H^+ − (1 + R))^0
        + C(T,1) W_T^{+...+−} (1 + R − H^−)^{T−1} (H^+ − (1 + R))^1
        + C(T,2) W_T^{+...+−−} (1 + R − H^−)^{T−2} (H^+ − (1 + R))^2
        + ...
        + C(T,T−1) W_T^{+−...−} (1 + R − H^−)^1 (H^+ − (1 + R))^{T−1}
        + C(T,T) W_T^{−...−} (1 + R − H^−)^0 (H^+ − (1 + R))^T ] / [ (H^+ − H^−)^T (1 + R)^T ].   (26.5)
To determine the value of the option at maturity, assume that the stock increases i times and declines (T−i) times; then the price of the stock will be S_0 (H^+)^i (H^−)^{T−i} on the expiration date. The option will be exercised if

  S_0 (H^+)^i (H^−)^{T−i} > X.   (26.6)

Let a denote the minimum integer value of i in Equation (26.6) for which the inequality is satisfied:

  a = 1 + INT[ (ln(X/S_0) − T ln H^−) / (ln H^+ − ln H^−) ],   (26.7)

where INT[·] is the integer operator. That is, taking natural logarithms of both sides of the inequality in Equation (26.6), the boundary case satisfies

  ln S_0 + i ln H^+ + (T − i) ln H^− = ln X
  ⇒ i ln H^+ − i ln H^− = ln(X/S_0) − T ln H^−
  ⇒ i = [ ln(X/S_0) − T ln H^− ] / (ln H^+ − ln H^−).
Hence, the maturity value of the option is given by

  W_T = S_0 (H^+)^i (H^−)^{T−i} − X if i ≥ a, and W_T = 0 if i < a.   (26.8)

Substituting Equation (26.8) into Equation (26.5), the generalized option pricing equation for discrete time is

  W_0 = ∑_{i=a}^{T} C(T, i) [ S_0 (H^+)^i (H^−)^{T−i} − X ] (1 + R − H^−)^i (H^+ − (1 + R))^{T−i} / [ (H^+ − H^−)^T (1 + R)^T ].   (26.9)
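A direct transcription of Equations (26.7)–(26.9) into Python may make the discrete model easier to experiment with. This is only a sketch; the parameter values in the example call are hypothetical, chosen so that the pseudo-probability weights below are positive.

```python
from math import comb, log, floor

def rb_call(S0, X, T, R, Hp, Hm):
    """Discrete-time two-state call value, Equation (26.9); Hp and Hm are
    the gross per-period returns in the up and down states."""
    # Equation (26.7): minimum number of up moves to finish in the money
    a = max(0, 1 + floor((log(X / S0) - T * log(Hm)) / (log(Hp) - log(Hm))))
    value = 0.0
    for i in range(a, T + 1):
        payoff = S0 * Hp**i * Hm**(T - i) - X                     # Equation (26.8)
        weight = comb(T, i) * (1 + R - Hm)**i * (Hp - (1 + R))**(T - i)
        value += payoff * weight
    return value / ((Hp - Hm)**T * (1 + R)**T)

# hypothetical parameter values, for illustration only
print(rb_call(S0=100.0, X=100.0, T=12, R=0.01, Hp=1.06, Hm=0.95))
```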
26.2.2 The Continuous Time Model

Equation (26.9) can be rewritten as

  W_0 = S_0 ∑_{i=a}^{T} C(T, i) [ (1 + R − H^−)H^+ / ((1 + R)(H^+ − H^−)) ]^i [ (H^+ − (1 + R))H^− / ((H^+ − H^−)(1 + R)) ]^{T−i}
      − X(1 + R)^{−T} ∑_{i=a}^{T} C(T, i) [ (1 + R − H^−)/(H^+ − H^−) ]^i [ (H^+ − (1 + R))/(H^+ − H^−) ]^{T−i}.   (26.10)

Since

  (1 + R − H^−)H^+ / ((1 + R)(H^+ − H^−)) + (H^+ − (1 + R))H^− / ((H^+ − H^−)(1 + R)) = 1 and
  (1 + R − H^−)/(H^+ − H^−) + (H^+ − (1 + R))/(H^+ − H^−) = 1,

we can interpret these weights as "pseudo probabilities." Let

  φ = (1 + R − H^−)H^+ / ((1 + R)(H^+ − H^−)) and θ = (1 + R − H^−)/(H^+ − H^−);

then we can restate Equation (26.10) as

  W_0 = S_0 B(a, T, φ) − X(1 + R)^{−T} B(a, T, θ),   (26.11)

where B(a, T, ·) is the cumulative binomial probability that the number of successes falls between a and T in T trials. As T → ∞,

  W_0 → S_0 N(Z_1, Z_1') − X(1 + R)^{−T} N(Z_2, Z_2'),   (26.12)

where N(Z, Z') is the probability that a normally distributed random variable with mean zero and variance one takes values between a lower limit Z and an upper limit Z'.
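The pseudo-probability form (26.11) is a compact way to compute the same price. The sketch below, run on the same hypothetical inputs as the earlier one, should agree with the direct sum of Equation (26.9) up to rounding error, since the two expressions are algebraically identical.

```python
from math import comb, log, floor

def B(a, T, q):
    """Cumulative binomial probability of at least a successes in T trials."""
    return sum(comb(T, k) * q**k * (1 - q)**(T - k) for k in range(max(a, 0), T + 1))

def rb_call_pseudo(S0, X, T, R, Hp, Hm):
    """Equation (26.11): W0 = S0*B(a,T,phi) - X*(1+R)**(-T)*B(a,T,theta)."""
    phi = (1 + R - Hm) * Hp / ((1 + R) * (Hp - Hm))
    theta = (1 + R - Hm) / (Hp - Hm)
    a = max(0, 1 + floor((log(X / S0) - T * log(Hm)) / (log(Hp) - log(Hm))))
    return S0 * B(a, T, phi) - X * (1 + R)**(-T) * B(a, T, theta)

print(rb_call_pseudo(S0=100.0, X=100.0, T=12, R=0.01, Hp=1.06, Hm=0.95))
```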
412
C.-F. Lee and C.S.-M. Lin
between a lower limit Z and a upper limit Z 0 . And by the property of binomial pdf, aT'
Z1 D p T '.1 '/
; Z10
T T'
Dp T '.1 '/ :
a T
T T
; Z20 D p Z2 D p T .1 / T .1 / Thus W0 D S 0
lim Z1 ; lim Z10
T !1
T !1
X N lim .1 C R/T
T !1
lim Z2 ; lim Z20
T !1
(26.13)
T !1
Let 1 C R D e r=T , then lim .1 C R/T D e r . And since T !1
lim Z10 D lim Z20 D 1, the remaining things to be deter-
T !1
T !1
mined are lim Z1 and lim Z2 . T !1
T !1
From Equations (26.10) and (26.11) of the text in the Rendleman and Barter (1979), p q .1 / T/
H C D e =T C.= H D e
p q =T .= T / .1 /
Z1
X S0
ln
p T .1 r / p T 1 C p T .1 / r p T 1 C p T .1 /
X ln S s 0 '.1 '/ .1 /
X ln S s 0 '.1 '/ .1 /
1 p T '.1 '/
X ln 1 S0 s p T '.1 '/ '.1 '/ .1 / T' 1 p p T '.1 '/ T '.1 '/ p
r
T 1 p T' C p T '.1 '/ T '.1 '/ p T .1 / r p p T T .1 / 1 p C T '.1 '/
X p ln T . '/ T' S0 C p s p T '.1 '/ '.1 '/ '.1 '/ .1 / (26.14)
substituting H C and H into Equation (26.7), so Substituting H C ; H and 1 C R D e r=T into ',
a T'
Z1 D p T '.1 '/ r
p X C T 6 ln S0 1 6 1 C INT 6 4 p T .1 / D p T '.1 '/ 2
.1 C R H /H C .1 C R/.H C H / h r p p i p p 1 e T e T .= T / 1 e T C.= T / D r h p p 1 p p i e T e T C.= T / e T .= T / 1
'D
3 7 7 7T' 5
In the limit, the term 1 C INTŒ will be simplify to Œ . So, h D
p
e .=
T/
h
p p i p p 1 r e T e T .= T / 1 e .= T / p p D p p 1 r eT e = T e .= T / 1
p p p p 1 i r e T T .= T / 1 C.= T / h p p 1 p p i e .= T / e .= T / 1
p 1
p p 1 p r i p p T T C.= T / 1 = T / 1 . e e D h p p 1 p p i h p p 1 p p i e .= T / e .= T / 1 e .= T / e .= T / 1 h
Now expanding in a Taylor series in 1/√T, and writing o(1/T) for terms tending to zero more rapidly than 1/T,

  H^+ = 1 + μ/T + (σ/√T)√((1 − θ)/θ) + (σ²/2T)((1 − θ)/θ) + o(1/T),
  H^− = 1 + μ/T − (σ/√T)√(θ/(1 − θ)) + (σ²/2T)(θ/(1 − θ)) + o(1/T),
  e^{r/T} = 1 + r/T + o(1/T).

Hence

  e^{r/T} − H^− = (σ/√T)√(θ/(1 − θ)) + [ r − μ − (σ²/2)(θ/(1 − θ)) ]/T + o(1/T),
  H^+ − H^− = (σ/√T)/√(θ(1 − θ)) + (σ²/2T)[ (1 − θ)/θ − θ/(1 − θ) ] + o(1/T).

Substituting these expansions into φ and collecting terms, the leading order gives

  lim_{T→∞} φ = √(θ/(1 − θ)) · √(θ(1 − θ)) = θ,

and, after canceling terms at the next order,

  lim_{T→∞} √T(θ − φ) = −√(θ(1 − θ)) (r − μ + σ²/2)/σ.

Now substituting lim φ for φ and lim √T(θ − φ) for √T(θ − φ) in Equation (26.14), the subjective drift μ cancels:

  lim_{T→∞} Z_1 = [ ln(X/S_0) − μ ]/σ − (r − μ + σ²/2)/σ = [ ln(X/S_0) − (r + σ²/2) ]/σ.

Similarly,

  lim_{T→∞} Z_2 = [ ln(X/S_0) − (r − σ²/2) ]/σ.

Since N(Z, ∞) = N(−∞, −Z), let D_1 = −lim Z_1 and D_2 = −lim Z_2; the continuous-time version of the two-state model is then obtained:

  W_0 = S_0 N(−∞, D_1) − Xe^{−r} N(−∞, D_2),

where

  D_1 = [ ln(S_0/X) + r + σ²/2 ]/σ and D_2 = D_1 − σ.

The above equation is identical to the Black-Scholes model, with the time to expiration normalized to one.

26.3 The Binomial Option Pricing Model of Cox, Ross, and Rubinstein

In this section we concentrate on the limiting behavior of the binomial option pricing model proposed by Cox et al. (CRR 1979).

26.3.1 The Binomial Option Pricing Formula of CRR

Let S be the current stock price, K the option exercise price, and R − 1 the riskless rate. It is assumed that the stock follows a binomial process: from one period to the next it can only go up by a factor of u with probability p or go down by a factor of d with probability (1 − p). After n periods to maturity, CRR showed that the option price C is

  C = (1/R^n) ∑_{k=0}^{n} [ n!/(k!(n − k)!) ] p^k (1 − p)^{n−k} max[0, u^k d^{n−k} S − K].   (26.15)

An alternative expression for C, which is easier to evaluate, is

  C = S [ ∑_{k=m}^{n} ( n!/(k!(n − k)!) ) p^k (1 − p)^{n−k} u^k d^{n−k}/R^n ] − (K/R^n) [ ∑_{k=m}^{n} ( n!/(k!(n − k)!) ) p^k (1 − p)^{n−k} ]
    = S B(m; n, p') − (K/R^n) B(m; n, p),   (26.16)

where p' = (u/R)p (so that 1 − p' = (d/R)(1 − p)), B(m; n, p) = ∑_{k=m}^{n} C(n, k) p^k (1 − p)^{n−k}, and m is the minimum number of upward stock movements necessary for the option to terminate in the money; that is, m is the minimum value of k in Equation (26.15) such that u^m d^{n−m} S − K > 0.
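The CRR formula is easy to evaluate directly. The sketch below uses a continuously compounded per-period rate (R̂ = e^{rh}), which is one common convention; any choice with R̂^n = R^t fits the limiting argument of the next subsection.

```python
from math import comb, exp, sqrt, log, floor

def crr_call(S, K, r, sigma, t, n):
    """CRR binomial call, Equations (26.15)-(26.16), with u, d as in (26.23)."""
    h = t / n
    u, d = exp(sigma * sqrt(h)), exp(-sigma * sqrt(h))
    R_hat = exp(r * h)                 # one plus the per-period riskless rate
    p = (R_hat - d) / (u - d)          # risk-neutral up probability
    p_prime = u * p / R_hat            # p' of Equation (26.16)
    # m: smallest k with u**k * d**(n-k) * S - K > 0
    m = max(0, floor(log(K / (S * d**n)) / log(u / d)) + 1)
    B = lambda q: sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(m, n + 1))
    return S * B(p_prime) - K * R_hat**(-n) * B(p)

# converges toward the Black-Scholes value (about 6.89) as n grows
print(crr_call(S=100.0, K=100.0, r=0.05, sigma=0.2, t=0.5, n=200))
```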
26.3.2 Limiting Case

We now show that the binomial option pricing formula as given in Equation (26.16) converges to the celebrated Black-Scholes option pricing model. The Black-Scholes formula is

  C = S N(d_1) − e^{−rt} X N(d_1 − σ√t),   (26.17)

where

  d_1 = log( S/(XR^{−t}) ) / (σ√t) + σ√t/2,   (26.18)

σ² is the variance of the stock rate of return, and t is the fixed length of calendar time to expiration, such that h = t/n. We wish to show that Equation (26.16) coincides with Equation (26.17) when n → ∞.

In order to show this limiting result, suppose that h represents the elapsed time between successive stock price changes. Thus, if t is the fixed length of calendar time to expiration and n is the total number of periods, each with length h, then h = t/n. As the trading frequency increases, h gets closer to zero; h → 0 is equivalent to n → ∞.

Let R̂ be one plus the interest rate over a trading period of length h. Then we require

  R̂^n = R^t   (26.19)

for any choice of n. Thus R̂ = R^{t/n}, which shows that R̂ must depend on n for the total return over elapsed time t to be independent of n. Note also that R^t = e^{rt} when R = e^r, the continuously compounded case.

Let S* be the stock price at the end of the nth period, with initial price S. If there are j up periods, then

  log(S*/S) = j log u + (n − j) log d = j log(u/d) + n log d,   (26.20)

where j is the number of upward moves during the n periods. Since j is the realization of a binomial random variable with success probability q, the expectation of log(S*/S) is

  E[log(S*/S)] = [ q log(u/d) + log d ] n ≡ μ̂n,   (26.21)

and its variance is

  Var[log(S*/S)] = [ log(u/d) ]² q(1 − q) n ≡ σ̂²n.   (26.22)

Since we divide the original time period t into many shorter subperiods of length h so that t = hn, our procedure calls for making n larger while keeping t fixed. In the limiting process we want the mean and variance of the continuously compounded log rate of return of the assumed stock price movement to coincide with those of the actual stock price as n → ∞. Let the actual values be μt and σ²t, respectively. Then we want to choose u, d, and q so that μ̂n → μt and σ̂²n → σ²t as n → ∞. It can be shown that if we set

  u = e^{σ√(t/n)}, d = e^{−σ√(t/n)}, q = 1/2 + (1/2)(μ/σ)√(t/n),   (26.23)

then μ̂n → μt and σ̂²n → σ²t as n → ∞.

In order to proceed further, we need the following version of the central limit theorem.

Lyapounov's Condition. Suppose X_1, X_2, ... are independent and uniformly bounded with E(X_i) = 0. Let Y_n = X_1 + ... + X_n and s_n² = E(Y_n²) = Var(Y_n). If

  lim_{n→∞} (1/s_n^{2+δ}) ∑_{k=1}^{n} E|X_k|^{2+δ} = 0

for some δ > 0, then the distribution of Y_n/s_n converges to the standard normal as n → ∞.
Theorem 1. If

  [ p|log u − μ̂|³ + (1 − p)|log d − μ̂|³ ] / (σ̂³√n) → 0 as n → ∞,   (26.24)

then

  Pr[ (log(S*/S) − μ̂n)/(σ̂√n) ≤ z ] → N(z),   (26.25)

where N(z) is the cumulative standard normal distribution function.

Proof. See Appendix 26A.

It is noted that the condition (26.24) is the special case δ = 1 of Lyapounov's condition stated above. This theorem says that when the fixed length t is divided into many subperiods, the log rate of return approaches the normal distribution as the number of subperiods approaches infinity. For the theorem to hold, the condition stated in Equation (26.24) has to be satisfied; we verify below that it is.

We next show that the binomial option pricing model as given in Equation (26.16) will indeed coincide with the Black-Scholes option pricing formula as given in Equation (26.17). Observe that R̂^n is always equal to R^t, as evidenced by Equation (26.19). Thus, comparing the two option pricing formulae given in Equations (26.16) and (26.17), we see that there are apparent similarities. In order to show the limiting result, we need to show that, as n → ∞,

  B(m; n, p') → N(x) and B(m; n, p) → N(x − σ√t).

We only show the second convergence result, as the same argument holds for the first. From the definition of B(m; n, p), it is clear that

  1 − B(m; n, p) = Pr(j ≤ m − 1) = Pr[ (j − np)/√(np(1 − p)) ≤ (m − 1 − np)/√(np(1 − p)) ].   (26.26)

Recall that we consider a stock that moves from S to uS with probability p and to dS with probability (1 − p). During the fixed calendar period t = nh with n subperiods of length h, if there are j up moves, then

  log(S*/S) = j log(u/d) + n log d.   (26.27)

The mean and variance of the continuously compounded rate of return of this stock under p are μ̂_p and σ̂_p², where

  μ̂_p = p log(u/d) + log d and σ̂_p² = [ log(u/d) ]² p(1 − p).

From Equation (26.27) and these definitions, we have

  ( log(S*/S) − μ̂_p n ) / (σ̂_p√n) = (j − np)/√(np(1 − p)).   (26.28)

Also, from the binomial option pricing formula,

  m − 1 = log( X/(S d^n) ) / log(u/d) − ε,

where ε is a real number between 0 and 1. It is then easy to show that

  (m − 1 − np)/√(np(1 − p)) = [ log(X/S) − μ̂_p n − ε log(u/d) ] / (σ̂_p√n).   (26.29)

We now check the condition given by Equation (26.24) in order to apply the central limit theorem. Recall that

  p = (r̂ − d)/(u − d),

with r̂ = r^{t/n}, and d and u given in Equation (26.23). We have

  p = ( e^{(t/n)log r} − e^{−σ√(t/n)} ) / ( e^{σ√(t/n)} − e^{−σ√(t/n)} )
    = 1/2 + (1/2)[ (log r − σ²/2)/σ ]√(t/n) + O(n^{−1}).   (26.30)

Hence, the condition given by Equation (26.24) is satisfied, because

  [ p|log u − μ̂_p|³ + (1 − p)|log d − μ̂_p|³ ] / (σ̂_p³√n) = [ (1 − p)² + p² ] / √(np(1 − p)) → 0 as n → ∞.

Finally, in order to apply the central limit theorem, we evaluate μ̂_p n, σ̂_p²n, and log(u/d) as n → ∞. It is clear that

  μ̂_p n → (log r − (1/2)σ²)t, σ̂_p²n → σ²t, and log(u/d) → 0.

Hence, evaluating the asymptotic probability in Equation (26.26),

  1 − B(m; n, p) = Pr[ (log(S*/S) − μ̂_p n)/(σ̂_p√n) ≤ (log(X/S) − μ̂_p n − ε log(u/d))/(σ̂_p√n) ],

and the right-hand bound converges to

  z = [ log(X/S) − (log r − σ²/2)t ] / (σ√t).

Using the fact that 1 − N(z) = N(−z), we have, as n → ∞,

  B(m; n, p) → N(−z) = N(x − σ√t),

where x = [ log(S/X) + (log r + σ²/2)t ] / (σ√t). A similar argument holds for B(m; n, p'), and hence we have completed the proof that the binomial option pricing formula as given in Equation (26.16) includes the Black-Scholes option pricing formula as a limiting case.
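For completeness, a closed-form evaluation of the limit (26.17) is sketched below, writing N(·) via the error function and using the continuously compounded convention log R → r; its output can be compared directly with the binomial sketch of the previous subsection.

```python
from math import erf, exp, log, sqrt

def N(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def bs_call(S, X, r, sigma, t):
    """Equations (26.17)-(26.18), with R**(-t) written as exp(-r*t)."""
    d1 = log(S / (X * exp(-r * t))) / (sigma * sqrt(t)) + 0.5 * sigma * sqrt(t)
    return S * N(d1) - exp(-r * t) * X * N(d1 - sigma * sqrt(t))

print(bs_call(S=100.0, X=100.0, r=0.05, sigma=0.2, t=0.5))  # about 6.89
```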
26.4 Comparison of the Two Approaches

From the results of the last two sections, we see that both the RB and CRR models lead to the celebrated Black-Scholes formula. The following comparison summarizes the mathematical and statistical knowledge and the assumptions required by the two models.

Mathematical and probability theory knowledge.
Rendleman and Bartter (1979): basic algebra, Taylor expansion, the binomial theorem, the central limit theorem, and properties of the binomial distribution.
Cox et al. (1979): basic algebra, Taylor expansion, the binomial theorem, the central limit theorem, properties of the binomial distribution, and Lyapounov's condition.

Assumptions.
Rendleman and Bartter (1979): (1) the distribution of returns of the stock is stationary over time and the stock pays no dividends (discrete time model); (2) the mean and variance of the logarithmic returns of the stock are held constant over the life of the option (continuous time model).
Cox et al. (1979): the stock follows a binomial process, so that from one period to the next it can only go up by a factor u with probability p or down by a factor d with probability 1 − p; in order to apply the central limit theorem, u, d, and p must be chosen appropriately.

Advantages and disadvantages.
Rendleman and Bartter (1979): readers with undergraduate-level training in mathematics and probability theory can follow this approach; the approach is intuitive, but the derivation is more complicated and tedious than that of CRR.
Cox et al. (1979): readers with advanced-level knowledge of probability theory can follow this approach, but for those without it, the CRR approach may be difficult to follow; the assumptions on the parameters u, d, and p make the CRR approach more restrictive than the RB approach.

Hence, as indicated above, CRR is easy to follow if one has advanced-level knowledge of probability theory, but the assumptions on the model parameters limit its applications. On the other hand, the RB model is intuitive and does not require higher-level knowledge of probability theory, although its derivation is more complicated and tedious.

26.5 Conclusion

This chapter can help one understand the statistical aspects of option pricing models, especially for those in the economics and finance professions. It also gives important financial and economic intuition to readers in the statistics profession. Therefore, by showing two alternative binomial option pricing approaches to deriving the Black-Scholes model, this chapter is useful for understanding the relationship between the two important option pricing models and the Black-Scholes formula. Readers who are interested in the binomial option pricing model can compare the two different approaches and find the one that best fits their interests and is easier to follow.
References Amram, M. and N. Kulatilaka. 2001. Real options, Oxford University Press, USA. Banz, R. and M. Miller. 1978. “Prices for state contingent claims: some estimates and applications.” Journal of Business 51, 653–672. Bartter, B. J. and R. J. Rendleman Jr. 1979. “Fee-based pricing of fixed rate bank loan commitments.” Financial Management 8. Bhattachayra, M. 1980.“Empirical properties of the Black-Scholes formula under ideal conditions.” Journal of Financial and Quantitative Analysis 15, 1081–1105. Bhattacharya, R. N. and R. R. Rao. 1976. Normal approximations and asymptotic expansions, Wiley, New York. Black, F. 1972.“Capital market equilibrium with restricted borrowing.” Journal of Business 45, 444–445. Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 31, 637–659. Bookstaber, R. M. 1981. Option pricing and strategies in investing, Addison-Wesley, Reading, MA. Brennan, M. J. and E.S. Schwartz 1977.“The valuation of American put options.” Journal of Financial Economics 32, 449–462. Cox, J. C. and M. Rubinstein. 1985. Option markets, Prentice-Hall, Englewood Cliffs, NJ. Cox, J. C. and S. A. Ross. 1976. “The valuation of options for alternative stochastic processes.” Journal of Financial Economics 3, 145–166. Cox, J. C., S. A. Ross, and M. Rubinstein. 1979. “Option pricing: a simplified approach.” Journal of Financial Economics 7, 229–263. Duffie, D. and C. Huang. 1985. “Implementing Arrow Debreu equilibria by continuous trading of few long lived securities.” Econometrica 6, 1337–1356. Finnerty, J. 1978. “The Chicago board options exchange and market efficiency.” Journal of Financial and Quantitative Analysis 13, 29–38. Galai, D. and R. W. Masulis. 1976. “The option pricing model and the risk factor of stock.” Journal of Financial Economics 3, 53–81. Geske, R. 1979. “The valuation of compound options.” Journal of Financial Economics 7, 63–81. Harrison, J. M. and D. M. Kreps. 1979. “Martingales and arbitrage in multiperiod securities markets.” Journal of Economic Theory 20, 381–408.
26 Two Alternative Binomial Option Pricing Model Approaches to Derive Black-Scholes Option Pricing Model Harrison, J. M. and S. R. Pliska. 1981. “Martingales and stochastic integrals in the theory of continuous trading.” Stochastic Process and Their Applications 11, 215–260. Hull, J. 2005. Options, futures, and other derivatives, 6th Edition, Prentice Hall, Upper Saddle River, NJ. Jarrow, R. and Turnbull, S. 1999. Derivatives securities, 2nd Edition, South-Western College Pub, Cincinnati, OH. Jones, E. P. 1984. “Option arbitrage and strategy with large price changes.” Journal of Financial Economics 13, 91–113. Lee, C. F. 2009. Handbook of quantitative finance, and risk management, Springer, New York. Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York. Lee, J. C., C. F. Lee, R. S. Wang, and T. I. Lin. 2004. “On the limit properties of binomial and multinomial option pricing models: review and integration.” Advances in Quantitative Analysis of Finance and Accounting New Series, Volume 1. World Scientific: Singapore. Liaw, K. T. and R. L. Moy. 2000. The Irwin guide to stocks, bonds, futures, and options, McGraw-Hill Companies, New York. MacBeth, J. and L. Merville. 1979. “An empirical examination of the Black-Scholes call option pricing model.” The Journal of Finance 34, 1173–1186. Madan, D. B. and F. Milne. 1987. The multinomial option pricing model and its limits, Department of Econometrics working papers, University of Sydney, Sydney, NSW 2066. Madan, D. B. and F. Milne. 1988a. Arrow Debreu option pricing for long-tailed return distributions, Department of Econometrics working papers, University of Sydney, Sydney, NSW 2066. Madan, D. B. and F. Milne. 1988b. Option pricing when share prices are subject to predictable jumps, Department of Econometrics working papers, University of Sydney, Sydney, NSW 2066. McDonald, R. L. 2005. Derivatives markets, 2nd Edition, Addison Wesley, Boston, MA. Merton, R. C. 1973. “The theory of rational option pricing.” Bell Journal of Economics and Management Science 4, 141–183. Merton, R. C. 1976. “Option pricing when underlying stock returns are discontinuous.” Journal of Financial Economics 3, 125–144. Merton, R. C. 1977. “On the pricing of contingent claims and the Modigliani-Miller theorem.” Journal of Financial Economics 5, 241–250. Milne, F. and H. Shefrin. 1987. “A critical examination of limiting forms of the binomial option pricing theory.” Department of Economics, Australian National University, Canberra, ACT 2601. Øksendal, B. 2007. Stochastic differential equations: an introduction with applications, 6th Edition, Springer-Verlag, Berlin Heidelberg. Parkinson, M. 1977. “Option pricing: the American put.” Journal of Business 50, 21–36. Rendleman, R. J., Jr. and B. J. Barter. 1979. “Two-state option pricing.” Journal of Finance 24, 1093–1110. Rendleman, R. J., Jr. and B. J. Barter. 1980. “The pricing of options on debt securities.” Journal of Financial and Quantitative Analysis 15, 11–24. Ritchken, P. 1987. Option: theory, strategy, and applications, Scott, Foresman: Glenview, IL. Rubinstein, M. 1976. “The valuation of uncertain income streams and the pricing of options.” Bell Journal of Economics and Management Science 7, 407–425. Shreve, S. E. 2004a. Stochastic calculus for finance I: The binomial asset pricing model, Springer Finance, New York. Shreve, S. E. 2004b. Stochastic calculus for finance II: continuous-time model, Springer Finance, New York. Summa, J. F. and J. W. Lubow. 2001. 
Options on futures, John Wiley & Sons, New York.
Trennepohl, G. 1981. “A comparison of listed option premia and Black-Scholes model prices: 1973–1979.” Journal of Financial Research 11–20. Zhang, P. G. 1998. Exotic options: a guide to second generation options, 2nd Edition, World Scientific Pub Co Inc, Singapore.
Appendix 26A

The Binomial Theorem

  (x + y)^n = ∑_{k=0}^{n} C(n, k) x^k y^{n−k}.

Lindeberg-Levy Central Limit Theorem

If x_1, ..., x_n are a random sample from a probability distribution with finite mean μ and finite variance σ², and x̄_n = (1/n)∑_{i=1}^{n} x_i, then

  √n (x̄_n − μ) →_d N(0, σ²).

Proof of Theorem 1

Since μ̂ = p log u + (1 − p) log d, we have log u − μ̂ = (1 − p) log(u/d) and log d − μ̂ = −p log(u/d). Hence

  p|log u − μ̂|³ = p(1 − p)³ |log(u/d)|³ and (1 − p)|log d − μ̂|³ = p³(1 − p) |log(u/d)|³,

so that

  p|log u − μ̂|³ + (1 − p)|log d − μ̂|³ = p(1 − p)[ (1 − p)² + p² ] |log(u/d)|³.

Thus

  [ p|log u − μ̂|³ + (1 − p)|log d − μ̂|³ ] / (σ̂³√n)
    = p(1 − p)[ (1 − p)² + p² ] |log(u/d)|³ / ( [ √(p(1 − p)) |log(u/d)| ]³ √n )
    = [ (1 − p)² + p² ] / √(np(1 − p)) → 0 as n → ∞.

Hence the condition for the theorem to hold, as stated in Equation (26.24), is satisfied.
Chapter 27

Normal, Lognormal Distribution and Option Pricing Model

Cheng Few Lee, Jack C. Lee, and Alice C. Lee

Abstract In this chapter, we first introduce the normal distribution, the lognormal distribution, and their relationship. Then we discuss the multivariate normal and lognormal distributions. Finally, we apply both normal and lognormal distributions to derive the Black-Scholes formula under the assumption that the rate of stock price follows a lognormal distribution.

Keywords Normal distribution r Lognormal distribution r Multivariate normal distribution r Multivariate lognormal distribution r Binomial distribution r Poisson distribution r Black-Scholes formula

27.1 Introduction

The normal (or Gaussian) distribution is the most important distribution in probability and statistics. One of the justifications for using the normal distribution is the central limit theorem. Also, most statistical theory is based on the normality assumption. However, in finance research, the lognormal distribution plays a more important role. Part of the reason is that in finance we deal with random quantities that are positive in nature; hence, taking the natural logarithm is quite reasonable. Also, empirical data quite often support the assumption of lognormality for random quantities such as stock price movements.

27.2 The Normal Distribution

A random variable X is said to be normally distributed with mean μ and variance σ² if it has the probability density function (pdf)

  f(x) = (1/(√(2π) σ)) e^{−(1/2)((x − μ)/σ)²}, σ > 0.   (27.1)

The normal distribution is symmetric around μ, which is the mean and the mode of the distribution. It is easy to see that the pdf of Z = (X − μ)/σ is

  g(z) = (1/√(2π)) e^{−z²/2},   (27.2)

which is the pdf of the standard normal and is independent of the parameters μ and σ². The cdf of Z is

  P(Z ≤ z) = N(z),   (27.3)

which has been well tabulated. Also, software packages or systems such as S-plus provide the value N(z) instantly. For a discussion of some approximations for N(z), the reader is referred to Johnson and Kotz (1970). For the cdf of X we have

  P(X ≤ x) = P( (X − μ)/σ ≤ (x − μ)/σ ) = N( (x − μ)/σ ).   (27.4)

The normal distribution as given in Equation (27.1) is very important in practice. It is very useful in describing phenomena such as test scores and the heights of students in a certain school, and it is useful as an approximation for distributions such as the binomial and Poisson. It is also quite useful in studying option pricing. We next discuss some properties of the normal distribution.

If X is normally distributed with mean μ and variance σ², then the mgf of X is

  M_X(t) = e^{μt + t²σ²/2},   (27.5)

which is useful in deriving the moments of X. Equation (27.5) is also useful in deriving the moments of the lognormal distribution. From Equation (27.5), it is easy to verify that

  E(X) = μ and Var(X) = σ².

C.F. Lee, Rutgers University, Newark, NJ, USA, e-mail: [email protected]; J.C. Lee, National Chiao Tung University, Hsinchu, Taiwan; A.C. Lee, State Street Corp., Boston, MA, USA.
If X_1, ..., X_n are independent, normally distributed random variables, then any linear function of these variables is also normally distributed. In fact, if X_i is normally distributed with mean μ_i and variance σ_i², then ∑_{i=1}^{n} a_i X_i is normally distributed with mean ∑_{i=1}^{n} a_i μ_i and variance ∑_{i=1}^{n} a_i² σ_i², where the a_i's are constants. If X_1 and X_2 are independent, and each is normally distributed with mean 0 and variance σ², then X_1 X_2 / √(X_1² + X_2²) is also normally distributed.

27.3 The Lognormal Distribution

A random variable X is said to be lognormally distributed with parameters μ and σ² if

  Y = log X   (27.6)

is normally distributed with mean μ and variance σ². It is clear that X has to be a positive random variable. This distribution is quite useful in studying the behavior of stock prices. For the lognormal distribution as described above, the pdf is

  g(x) = (1/(√(2π) σ x)) e^{−(1/(2σ²))(log x − μ)²}, x > 0.   (27.7)

The lognormal distribution is sometimes called the antilognormal distribution, because it is the distribution of the random variable X that is the antilog of the normal random variable Y. However, "lognormal" is most commonly used in the literature. When applied to economic data, especially production functions, it is often called the Cobb-Douglas distribution.

We next discuss some properties of the lognormal distribution as defined in Equation (27.6). The rth moment of X is

  μ_r' = E(X^r) = E(e^{rY}) = e^{rμ + r²σ²/2}.   (27.8)

It is noted that we have utilized the fact that the mgf of the normal random variable Y with mean μ and variance σ² is M_Y(t) = e^{tμ + t²σ²/2}; thus E(e^{rY}) is simply M_Y(r), which is the right-hand side of Equation (27.8). From Equation (27.8) we have

  E(X) = e^{μ + σ²/2},   (27.9)
  Var(X) = e^{2μ} e^{σ²} [ e^{σ²} − 1 ].   (27.10)

It is noted that the moment sequence {μ_r'} does not belong only to the lognormal distribution; thus, the lognormal distribution cannot be defined by its moments. The cdf of X is

  P(X ≤ x) = P(log X ≤ log x) = N( (log x − μ)/σ ),   (27.11)

because log X is normally distributed with mean μ and variance σ². The distribution of X is unimodal with the mode at

  mode(X) = e^{μ − σ²}.   (27.12)

Let x_α be the (100α)th percentile of the lognormal distribution and z_α the corresponding percentile of the standard normal; then

  P(X ≤ x_α) = P( (log X − μ)/σ ≤ (log x_α − μ)/σ ) = N( (log x_α − μ)/σ ).   (27.13)

Thus z_α = (log x_α − μ)/σ, implying that

  x_α = e^{μ + σ z_α}.   (27.14)

Thus, the percentiles of the lognormal distribution can be obtained from the percentiles of the standard normal. From Equation (27.14), we also see that

  median(X) = e^{μ},   (27.15)

as z_{0.5} = 0. Thus median(X) > mode(X); hence the lognormal distribution is not symmetric.
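The location measures just derived are easy to tabulate. The minimal sketch below simply evaluates Equations (27.9), (27.10), (27.12), and (27.15) for illustrative parameter values; the ordering mode < median < mean it prints is the right skew described above.

```python
from math import exp

def lognormal_summary(mu, sigma):
    """Moments and location measures from Equations (27.9)-(27.15)."""
    mean = exp(mu + sigma**2 / 2)                               # (27.9)
    var = exp(2 * mu) * exp(sigma**2) * (exp(sigma**2) - 1)     # (27.10)
    median = exp(mu)                                            # (27.15)
    mode = exp(mu - sigma**2)                                   # (27.12)
    return mean, var, median, mode

print(lognormal_summary(mu=0.0, sigma=0.5))
```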
27.4 The Lognormal Distribution and Its Relationship to the Normal Distribution

By comparing the pdf of the normal distribution given in Equation (27.1) and the pdf of the lognormal distribution given in Equation (27.7), we see that

  f(x) = f(y)/x.   (27.16)

In addition, from Equation (27.6), it is easy to see that

  dx = x dy.   (27.17)
The cdf of the lognormal distribution can be expressed as

  F(a) = Pr(X ≤ a) = Pr(log X ≤ log a) = Pr( (log X − μ)/σ ≤ (log a − μ)/σ ) = N(d),   (27.18)

where

  d = (log a − μ)/σ,   (27.19)

and N(d) is the cdf of the standard normal distribution, which can be obtained from the normal table; N(d) can also be obtained from S-plus or other software packages. Alternatively, the upper tail 1 − N(d) can be approximated, for d ≥ 0, by the formula

  1 − N(d) ≈ a_0 e^{−d²/2} (a_1 t + a_2 t² + a_3 t³),   (27.20)

where

  t = 1/(1 + 0.33267 d),
  a_0 = 0.3989423, a_1 = 0.4361936, a_2 = −0.1201676, a_3 = 0.9372980.
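A sketch of these computations follows; it evaluates the tail approximation of Equation (27.20) with the coefficients quoted there, together with the lognormal cdf (27.18) and the partial-mean formula (27.26) developed below, checking the approximation against an exact normal cdf computed from the error function.

```python
from math import exp, log, sqrt, erf

def N(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def N_tail_poly(d):
    """Approximation of the upper tail 1 - N(d), d >= 0, Equation (27.20)."""
    a0, a1, a2, a3 = 0.3989423, 0.4361936, -0.1201676, 0.9372980
    t = 1.0 / (1.0 + 0.33267 * d)
    return a0 * exp(-d * d / 2.0) * (a1 * t + a2 * t**2 + a3 * t**3)

def lognormal_cdf(a, mu, sigma):
    return N((log(a) - mu) / sigma)                 # Equation (27.18)

def lognormal_partial_mean(a, mu, sigma):
    """integral_a^inf x f(x) dx, Equation (27.26)."""
    return exp(mu + sigma**2 / 2.0) * N((mu + sigma**2 - log(a)) / sigma)

# the tail approximation agrees with 1 - N(d) to about 1e-5
print(1 - N(1.0), N_tail_poly(1.0))
```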
In case we need Pr(X > a), we have

  Pr(X > a) = 1 − Pr(X ≤ a) = 1 − N(d) = N(−d).   (27.21)

Since for any h, E(X^h) = E(e^{hY}), the hth moment of X follows from the moment generating function of Y, which is normally distributed with mean μ and variance σ²:

  M_Y(t) = e^{tμ + t²σ²/2}.   (27.22)

For example,

  μ_X = E(X) = E(e^Y) = M_Y(1) = e^{μ + σ²/2}, and
  E(X^h) = E(e^{hY}) = M_Y(h) = e^{hμ + h²σ²/2}.   (27.23)

Hence

  σ_X² = E(X²) − (EX)² = e^{2μ + 2σ²} − e^{2μ + σ²} = e^{2μ + σ²}(e^{σ²} − 1).   (27.24)

Thus, fractional and negative moments of a lognormal distribution can be obtained from Equation (27.23). The mean of a lognormal random variable is

  ∫_0^∞ x f(x) dx = e^{μ + σ²/2}.   (27.25)

If the lower bound a is larger than 0, then the partial mean of X can be shown to be

  ∫_a^∞ x f(x) dx = ∫_{log a}^∞ e^y f(y) dy = e^{μ + σ²/2} N(d*),   (27.26)

where d* = (μ − log a + σ²)/σ. This implies that the partial mean of a lognormal variable is the mean of X times an adjustment term, N(d*).

27.5 Multivariate Normal and Lognormal Distributions

The normal distribution with the pdf given in Equation (27.1) can be extended to the p-dimensional case. Let X = (X_1, ..., X_p)' be a p × 1 random vector. We say that X ~ N_p(μ, Σ) if it has the pdf

  f(x) = (2π)^{−p/2} |Σ|^{−1/2} exp[ −(1/2)(x − μ)'Σ^{−1}(x − μ) ].   (27.27)

In Equation (27.27), μ is the mean vector and Σ is the covariance matrix, which is symmetric and positive definite. The moment generating function of X is

  M_X(t) = E(e^{t'X}) = e^{t'μ + (1/2)t'Σt},   (27.28)

where t = (t_1, ..., t_p)' is a p × 1 vector of real values. From Equation (27.28) it can be shown that E(X) = μ and Cov(X) = Σ.

If C is a q × p matrix of rank q ≤ p, then CX ~ N_q(Cμ, CΣC'). Thus, a linear transformation of a multivariate normal random vector is also a multivariate normal random vector.

Let X = (X^(1)', X^(2)')', μ = (μ^(1)', μ^(2)')', and Σ = [Σ_11, Σ_12; Σ_21, Σ_22], where X^(i) and μ^(i) are p_i × 1, p_1 + p_2 = p, and Σ_ij is p_i × p_j. Then the marginal distribution of X^(i) is also multivariate normal, with mean vector μ^(i) and covariance matrix Σ_ii; that is, X^(i) ~ N_{p_i}(μ^(i), Σ_ii). Furthermore, the conditional distribution of X^(1) given X^(2) = x^(2), where x^(2) is a known vector, is normal with mean vector μ_{1·2} and covariance matrix Σ_{11·2}, where

  μ_{1·2} = μ^(1) + Σ_12 Σ_22^{−1}(x^(2) − μ^(2)),   (27.29)
and

  Σ_{11·2} = Σ_11 − Σ_12 Σ_22^{−1} Σ_21,   (27.30)

i.e., X^(1) | X^(2) = x^(2) ~ N_{p_1}(μ_{1·2}, Σ_{11·2}).

We next consider a bivariate version of the correlated lognormal distribution. Let

  (Y_1, Y_2)' = (log X_1, log X_2)' ~ N( (μ_1, μ_2)', [σ_11, σ_12; σ_21, σ_22] ).

The joint pdf of X_1 and X_2 can be obtained from the joint pdf of Y_1 and Y_2 by observing that

  dx_1 dx_2 = x_1 x_2 dy_1 dy_2,   (27.31)

which is an extension of Equation (27.17) to the bivariate case. Hence, the joint pdf of X_1 and X_2 is

  g(x_1, x_2) = (1/(2π |Σ|^{1/2} x_1 x_2)) exp{ −(1/2) [ (log x_1, log x_2) − μ' ] Σ^{−1} [ (log x_1, log x_2)' − μ ] }.   (27.32)

From the property of the multivariate normal distribution, we have Y_i ~ N(μ_i, σ_ii). Hence X_i is lognormal with

  E(X_i) = e^{μ_i + σ_ii/2},   (27.33)
  Var(X_i) = e^{2μ_i} e^{σ_ii} [ e^{σ_ii} − 1 ].   (27.34)

Furthermore, by the property of the moment generating function for the bivariate normal distribution, we have

  E(X_1 X_2) = E(e^{Y_1 + Y_2}) = e^{μ_1 + μ_2 + (1/2)(σ_11 + σ_22 + 2σ_12)} = E(X_1)E(X_2) exp( ρ√(σ_11 σ_22) ).   (27.35)

Thus, the covariance between X_1 and X_2 is

  Cov(X_1, X_2) = E(X_1 X_2) − E(X_1)E(X_2) = E(X_1)E(X_2) [ exp( ρ√(σ_11 σ_22) ) − 1 ].   (27.36)

From the property of conditional normality of Y_1 given Y_2 = y_2, we also see that the conditional distribution of X_1 given X_2 = x_2 is lognormal.

The extension to the p-variate lognormal distribution follows. Let Y = (Y_1, ..., Y_p)', where Y_i = log X_i. If Y ~ N_p(μ, Σ), where μ = (μ_1, ..., μ_p)' and Σ = (σ_ij), the joint pdf of X_1, ..., X_p can be obtained from the following theorem.

Theorem 27.1. (Anderson 1984, pp. 13–14). Let the pdf of Y_1, ..., Y_p be f(y_1, ..., y_p), and consider the p-valued functions

  x_i = x_i(y_1, ..., y_p), i = 1, ..., p.   (27.37)

We assume that the transformation from the y-space to the x-space is one-to-one, with inverse transformation

  y_i = y_i(x_1, ..., x_p), i = 1, ..., p.   (27.38)

Let the random variables X_1, ..., X_p be defined by

  X_i = x_i(Y_1, ..., Y_p), i = 1, ..., p.   (27.39)

Then the pdf of X_1, ..., X_p is

  g(x_1, ..., x_p) = f( y_1(x_1, ..., x_p), ..., y_p(x_1, ..., x_p) ) J(x_1, ..., x_p),   (27.40)

where J(x_1, ..., x_p) is the Jacobian of the transformation,

  J(x_1, ..., x_p) = mod | ∂y_i/∂x_j |,   (27.41)

and "mod" means modulus, or absolute value.

Applying the above theorem with f(y_1, ..., y_p) the p-variate normal pdf and y_i = log x_i, so that the Jacobian is diagonal with

  J(x_1, ..., x_p) = ∏_{i=1}^{p} (1/x_i),   (27.42)

we have the following joint pdf of X_1, ..., X_p:

  g(x_1, ..., x_p) = (2π)^{−p/2} |Σ|^{−1/2} ( ∏_{i=1}^{p} 1/x_i ) exp{ −(1/2) [ (log x_1, ..., log x_p) − μ' ] Σ^{−1} [ (log x_1, ..., log x_p)' − μ ] }.   (27.43)

It is noted that when p = 2, Equation (27.43) reduces to the bivariate case given in Equation (27.32).
27 Normal, Lognormal Distribution and Option Pricing Model
425
Then the first two moments are E.Xi / D e i C
i i 2
C DS
kDm
;
(27.44)
Var.Xi / D e 2i e i i Œe i i 1 : (27.45)
1 i i C jj Cor Xi ; Xj D exp i C j C 2 p exp ij i i jj 1 (27.46)
X .1Cr/ T
T P
kDm
TŠ p k .1 kŠ.T k/Š X .1Cr/T
p/
T k
(27.47)
B.T; p; m/;
S D current price of the stock, T D term to maturity in years, m D minimum number of upward movements in stock price that is necessary for the option to terminate “in the money,” uR p D Rd ud and 1 p D ud , X D option exercise price (or strike price) R D 1 C r D 1C risk-free rate of return u D 1C percentage of price increase d D 1C percentage of price decrease
p0 D
u R
p
B.n; p; m/ D
n X
n Ck p
k
.1 p/nk
kDm
By a form of the central limit theorem, we have shown in Chap. 26 as T ! 1, the option price C converges to C below1 C D SN.d1 / XRT N.d2 /
(27.48)
C D Price of the call option,
Current price of the firm’s common stock .S / Exercise price (or strike price) of the option .X / Term to maturity in years .T / Variance of the stock’s price . 2 / Risk-free rate of interest .r/
The binomial option pricing model defined in Equation (27.22) can be written as
p 0 /T k
and C D 0 if m > T where
27.6 The Normal Distribution as an Application to the Binomial and Poisson Distributions The cumulative normal density function tells us the probability that a random variable Z will be less than some value x. Note in Fig. 27.1 that P .Z < x/ is simply the area under the normal curve from 1 up to point x. One of the many applications of the cumulative normal distribution function is in valuing stock options. A call option gives the option holder the right to purchase, at a specified price known as the exercise price, a specified number of shares of stock during a given time period. A call option is a function of the following five variables:
TŠ 0k kŠ.T k/Š p .1
D SB.T; p 0 ; m/
where ij is the correlation between Yi and Yj . For more details concerning properties of the multivariate lognormal distribution, the reader is referred to Johnson and Kotz (1972).
1. 2. 3. 4. 5.
T P
d1 D
log
S X r t
p
t p d2 D d1 t
C
1p t; 2
N.d / is the value of the cumulative standard normal distribution, t is the fixed length of calendar time to expiration and h is the elapsed time between successive stock price changes and T D ht. If future stock price is constant over time, then 2 D 0. It can be shown that both N.d1 / and N.d2 / are equal to 1 and that Equation (27.48) becomes C D S Xe rT :
(27.49)
Alternatively, Equations (27.48) and (27.49) can be understood in terms of the following steps: 1
Fig. 27.1 Cumulative normal density function
See Cox et al. (1979), “Option pricing: a simplified approach” for details.
426
C.F. Lee et al.
Step 1: The future price of the stock is constant over time. Because a call option gives the option holder the right to purchase the stock at the exercise price X , the value of the option, C , is just the current price of the stock less the present value of the stock’s purchase price. Mathematically, the value of the call option is C DS
X : .1 C r/T
(27.50)
Note that Equation (27.50) assumes discrete compounding of interest, whereas Equation (27.49) assumes continuous compounding of interest. To adjust Equation (27.50) for continu1 ous compounding, we substitute e rT for .1Cr/ T to get C D S Xe rT :
(27.51)
Step 2: Assume the price of the stock fluctuates over time .St /. In this case, we need to adjust Equation (27.49) for the uncertainty associated with that fluctuation. We do this by using the cumulative normal distribution function. In deriving Equation (27.48), we assume that St follows a lognormal distribution, as discussed in Sect. 27.3. The adjustment factors N.d1 / and N.d2 / in the BlackScholes option valuation model are simply adjustments made to Equation (27.49) to account for the uncertainty associated with the fluctuation of the price of the stock. Equation (27.48) is a continuous option pricing model. Compare this with the binomial option price model given in Equation (27.47), which is a discrete option pricing model. The adjustment factors N.d1 / and N.d2 / are cumulative normal density functions. The adjustment factors B.T; p 0 ; m/ and B.T; p; m/ are complementary binomial distribution functions. We can use Equation (27.48) to determine the theoretical value, as of November 29, 1991, of one of IBM’s options with maturity on April 1992. In this case we have X D $90; S D $92:5; D 0:2194; r D 0:0435, and 5 D :42 (in years).2 Armed with this information we T D 12 can calculate the estimated d1 and d2 . xD
˚ 92:5 ln 90 C .:0435/ C 12 .:2194/2 .:42/ 1
.:2194/.:42/ 2
Fig. 27.2 Probability of standard normal distribution
tion takes on a value less than d1 and a value less than d2 , respectively. The values for N.d1 / and N.d2 / can be found by using the tables in the back of the book for the standard normal distribution, which provide the probability that a variable Z is between 0 and x (see Fig. 27.2). To find the cumulative normal density function, we need to add the probability that Z is less than zero to the value given in the standard normal distribution table. Because the standard normal distribution is symmetric around zero, we know that the probability that Z is less than zero is .5, so P .Z < x/ D P .Z < 0/ C P .0 < Z < x/ D :5C value from table We can now compute the values of N.d1 / and N.d2 /. N.d1 / D P .Z < d1 / D P .Z < 0/ C P .0 < Z < d1 / D P .Z < :392/ D :5 C :1517 D :6517 N.d2 / D P .Z < d2 / D P .Z < 0/ C P .0 < Z < d2 / D P .Z < :25/ D :5 C :0987 D :5987 Then the theoretical value of the option is C D .92:5/.:6517/ Œ.90/.:5987/ =e .:0435/.:42/ D 60:282 53:883=1:0184 D 7:373 and the actual price of the option on November 29, 1991, was $7.75.
D :392;
p 1 x t D x .0:2194/.0:42/ 2 D :25: In Equation (27.45), N.d1 / and N.d2 / are the probabilities that a random variable with a standard normal distribu-
27.7 Applications of the Lognormal Distribution in Option Pricing3 To derive the Black-Scholes formula it is assumed that there are no transaction costs, no margin requirements, and no
2
Values of X D $90; S D $92:5; r D 0:0435 were obtained from Section C of the Wall Street Journal on December 2, 1991. And D 0:2194 is estimated in terms of monthly rate of return during the period January 1989 to November 1991.
3
See Cheng F. Lee et al. (1990), Security Analysis and Portfolio Management (Glenview, I11: Scott, Foresman/Little, Brown), 754–760.
27 Normal, Lognormal Distribution and Option Pricing Model
taxes; that all shares are infinitely divisible; and that continuous trading can be accomplished. It is also assumed that the economy is risk neutral and the stock price follows a lognormal distribution. Denote the current stock price by S and the stock price at the end of j th period by Sj . Sj Then Sj 1 D expŒKj is a random variable with a lognormal distribution where Kj is the rate of return in j th period and is a random variable with normal distribution. Let Kt have the expected value k and variance k2 for each j . Then K1 C K2 C C Kt is a normal random variable with expected value tk and variance tk2 . Thus, we can define the expected value (mean) of SSt D expŒK1 C K2 C C Kt as E
St S
t 2 D exp tk C k : 2
(27.52)
427
is less than the exercise price .X /, the option expires out of the money. This occurs because in purchasing shares of the stock the investor will find it cheaper to purchase the stock in the market than to exercise the option. Let X D SST be lognormally distributed with parameters D tr
where g.x/ is the probability density function of Xt D SSt . Substituting D tr tk2 =2; 2 D tk2 and a D XS into Equations (27.18) and (27.26) in Sect. 27.4, we obtain Z
C D expŒrt EŒMax.ST X; 0/
(27.54)
where T is the time of expiration and X is the striking price. Note that
X ST X ST ; for > Max.ST X; 0/ D S S S S S D0
for
Z
(27.53)
In the risk-neutral assumptions of Cox and Ross (1976) and Rubinstein (1976), the call option price C can be determined by discounting the expected value of the terminal option price by the riskless rate of interest:
X ST < S S
(27.55)
Equations (27.54) and (27.55) say that the value of the call option today will be either St X or 0, whichever is greater. If the price of stock at time t is greater than the exercise price, the call option will expire in the money. This simply means that an investor who owns the call option will exercise it. The option will be exercised regardless of whether the option holder would like to take physical possession of the stock. If the investor would like to own the stock, the cheapest way to obtain the stock is by exercising the option. If the investor would not like to own the stock, he or she will still exercise the option and immediately sell the stock in the market. Since the price the investor paid .X / is lower that the price he or she can sell the stock for .St /, the investor realizes an immediate the profit of St X . If the price of the stock .St /
and 2 D tk2 . Then
C D expŒrt EŒM ax.St X / R1 D expŒrt X S x XS g.x/dx S R1 R1 D expŒrt S X xg.x/dx expŒrt S XS X g.x/dx S S (27.56)
Under the assumption of a risk-neutral investor, the expected return E SSt is assumed to be exp.rt/ (where r is the riskless rate of interest). In other words, 2 k D r k : 2
t k2 2
1 X S
xg.x/dx D e rt N.d1 /
(27.57)
g.x/dx D N.d2 /
(27.58)
1 X S
where t r 2t k2 log p d1 D t k
X S
p log C t k D
S X
C r C 12 k2 t p t k (27.59)
and d2 D
log
S X
p C r 12 k2 t p D d1 t k t k
(27.60)
Substituting Equation (27.58) into Equation (27.56), we obtain C D SN.d1/ X expŒrt N.d2 /;
(27.61)
This is Equation (27.48) defined in Sect. 27.6. We have defined a put option as a contract conveying the right to sell a designated security at a stipulated price. It can be shown that the relationship between a call option .C / and a put option .P / can be defined as C C Xe rt D P C S
(27.62)
Substituting Equation (27.33) into Equation (27.34), we obtain the put option formula as P D Xe rt N.d2 / SN.d1 /
(27.63)
428
where S; C; r; t; d1 and d2 are identical to those defined in the call option model.
27.8 Conclusion In this chapter, we first introduce univariate and multivariate normal distribution and lognormal distribution. Then we show how normal distribution can be used to approximate binomial distribution and Poisson distribution. Finally, we use the concepts normal and lognormal distributions to derive Black-Scholes formula under the assumption that investors are risk neutral.
C.F. Lee et al.
References Anderson, T. W. 1984. An introduction to multivariate statistical analysis, 2nd Edition, Wiley, New York. Cox, J. C. and S. A. Ross. 1976. “A Survey of Some New Results in Financial Option Pricing Theory”. The Journal of Finance, 31(2), 383–402. Cox, J. C., S. A. Ross, and M. Rubinstein. 1979. “Option pricing: a simplified approach.” Journal of Financial Economics 7, 229–263. Johnson, N. L. and S. Kotz. 1970. Continuous univariate distributions – I Distributions in statistics, Wiley, New York. Johnson, N. L. and S. Kotz. 1972. Distributions in statistics: continuous multivariate distribution, Wiley, New York. Lee, C. F. et al. 1990. Security analysis and portfolio management, Scott, Foresman/Little, Brown, Glenview, I11. Rubinstein, M. 1976. “The Valuation of Uncertain Income Streams and the Pricing of Options”. The Bell Journal of Economics, 7(2), 407–425.
Chapter 28
Bivariate Option Pricing Models Cheng Few Lee, Alice C. Lee, and John Lee
Abstract The main purpose of this chapter is to present the American option pricing model on stock with dividend payment and without dividend payment. A Microsoft Excel program for evaluating this American option pricing model is also presented. Keywords Bivariate normal distribution r Option pricing model r American option r Ex-dividend r Early exercise
28.1 Introduction In this chapter we first review the basic concept of the bivariate normal density function in Sect. 28.2. In Sect. 28.3 we present the bivariate normal CDF for American call option valuation model. A computer program for evaluating American option will be presented in Sect. 28.4. The evaluation of stock option models without dividend payment and with dividend payment are discussed in Sects. 28.5 and 28.6, respectively. Finally, Sect. 28.7 presents the conclusion of this chapter.
We can define the probability density function probability density function (pdf) of the normally distributed random variables X and Y as 1 .X X / f .X / D ; 1 < X < 1 p exp 2X2 X 2 (28.1) 1 .Y Y / ; 1 < Y < 1 f .Y / D p exp 2Y2 Y 2 (28.2) where X and Y are population means for X and Y, respectively; X and Y are population standard deviations of X and Y, respectively; D 3:1416; and exp represents the exponential function. If represents the population correlation between X and Y, then the pdf of the bivariate normal distribution can be defined as f .X; Y / D
1 p exp.q=2/; 2 X Y 1 2 1 < X < 1;
28.2 The Bivariate Normal Density Function In correlation analysis, we assume a population where both X and Y vary jointly. It is called a joint distribution of two variables. If both X and Y are normally distributed, then we call this known distribution a bivariate normal distribution.
C.F. Lee () Rutgers University, Piscataway, NJ, USA e-mail:
[email protected] A.C. Lee State Street Corp., Boston, MA, USA e-mail:
[email protected] J. Lee Center for PBBEF Research, NJ, USA e-mail:
[email protected]
1 < Y < 1;
(28.3)
where X > 0; Y > 0, and 1 < < 1, 1 qD 1 2
"
X X X
C
2
X Y Y
X X 2 X
!#
Y Y Y
2
:
It can be shown that the conditional mean of Y, given X, is linear in X and given by
Y E.Y jX / D Y C X
.X X /:
(28.4)
It is also clear that given X, we can define the conditional variance of Y as
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_28,
.Y jX / D Y2 .1 2 /:
(28.5) 429
430
C.F. Lee et al.
Equation (28.4) can be regarded as describing the population linear regression line. For example, if we have a bivariate normal distribution of heights of brothers and sisters, we can see that they vary together and there is no cause-and-effect relationship. Accordingly, a linear regression in terms of the bivariate normal distribution variable is treated as though there were a two-way relationship instead of an existing causal relationship. It should be noted that regression implies a causal relationship only under a “prediction” case. Equation (28.3) represents a joint pdf for X and Y. If D 0, then Equation (28.3) becomes f .X; Y / D f .X /f .Y /
(28.6)
This implies that the joint pdf of X and Y is equal to the pdf of X times the pdf of Y. We also know that both X and Y are normally distributed. Therefore, X is independent of Y. Example 28.1. Using a Mathematics Aptitude Test to Predict Grade in Statistics Let X and Y represent scores in a mathematics aptitude test and numerical grades in elementary statistics, respectively. In addition, we assume that the parameters in Equation (28.3) are X D 550 X D 40 Y D 80 Y D 4 D :7 Substituting this information into Equations (28.4) and (28.5), respectively, we obtain E.Y jX / D 80 C :7.4=40/.X 550/ D 41:5 C :07X (28.7) 2 .Y jX / D .16/.1 :49/ D 8:16:
(28.8)
If we know nothing about the aptitude test score of a particular student (say, John), we have to use the distribution of Y to predict his elementary statistics grade. 95% interval D 80 ˙ .1:96/.4/ D 80 ˙ 7:84: That is, we predict with 95% probability that John’s grade will fall between 87.84 and 72.16. Alternatively, suppose we know that John’s mathematics aptitude score is 650. In this case, we can use Equations (28.7) and (28.8) to predict John’s grade in elementary statistics. E.Y jX D 650/ D 41:5 C .:07/.650/ D 87
We can now base our interval on a normal probability distribution with a mean of 87 and a standard deviation of 2.86. 95% interval D 87 ˙ .1:96/.2:86/ D 87 ˙ 5:61: That is, we predict with 95% probability that John’s grade will fall between 92.61 and 81.39. Two things have happened to this interval. First, the center has shifted upward to take into account the fact that John’s mathematics aptitude score is above average. Second, the width of the interval has been narrowed from 87:8472:16 D 15:68 grade points to 92:61 81:39 D 11:22 grade points. In this sense, the information about John’s mathematics aptitude score has made us less uncertain about his grade in statistics.
28.3 American Call Option and the Bivariate Normal CDF An option contract that can be exercised only on the expiration date is called a European call. If the contract of a call option can be exercised at any time of the option’s contract period, then this kind of call option is called an American call. When a stock pays a dividend, the American call is more complex. Following Whaley (1981), the valuation equation for American call option with one known dividend payment can be defined as p C.S; T; X / D S x ŒN1 .b1 / C N2 .a1 ; b1 I t=T / r.T t / Xe rt ŒN1 .b2 /ep CN2 .a2 ; b2 I t=T / C De rt N1 .b2 / (28.9a) where x p ln SX C r C 12 2 T a1 D p ; a2 D a1 T (28.9b) T x ln SS C r C 12 2 t p t b1 D ; b2 D b1 t (28.9c) p t S x D S De rt
S x represents the correct stock net price of the present value of the promised dividend per share (D); t represents the time dividend to be paid. St is the ex-dividend stock price for which C.St ; T t/ D St C D X
and 2 .Y jX / D .16/.1 :49/ D 8:16:
(28.10)
S; X; r; 2 ; T have been defined in Chap. 27.
(28.11)
28 Bivariate Option Pricing Models
431
Both N1 .b1 / and N2 .b2 / are cumulative univariate normal density function. N2 .a; bI / is the cumulative bivariate normal density function with upper p integral limits, a and b, and correlation coefficient, D t=T . American call option on a non-dividend-paying stock will never optimally be exercised prior to expiration. Therefore, if there exist no dividend payments, Equation (28.9) will reduce to European Option pricing model with no dividend payment as defined in Chap. 27. In Chap. 27, we have shown how the cumulative univariate normal density function can be used to evaluate the European call option. If a common stock pays a discrete dividend during the option’s life, the American call option valuation equation requires the evaluation of a cumulative bivariate normal density function. While there are many available approximations for the cumulative bivariate normal distribution, the approximation provided here relies on Gaussian quadratures. The approach is straightforward and efficient, and its maximum absolute error is 0.00000055. Following Equation (28.3), the probability that x 0 is less than a and that y 0 is less than b for the standardized cumulative bivariate normal distribution
(This portion is based upon Appendix 16.1 of Hans R. Stoll and Robert E, Whaley (1993), Futures and Options (South Western Publishing, Cincinnati)) and the coefficients a1 and b1 are computed using
Z a Z b 1 0 0 p exp P .X < a; Y < b/ D 2 1 2 1 1 02 2x 2x 0 y 0 C y 02
dx 0 dy 0 2.1 2 /
If ab > 0, compute the bivariate normal probability, N2 .a; bI /, as:
y
x where x 0 D x ; y 0 D y y and p is the correlation between x the random variables x 0 and y 0 . The first step in the approximation of the bivariate normal probability N2 .a; bI / is as follows:
a a1 D p 2.1 2 /
The second step in the approximation involves computing the product ab. if ab 0, compute the bivariate normal probability, N2 .a; bI /, using the following rules: 1: If a 0; b 0; and 0; then N2 .a; bI /D '.a; bI / 2: If a 0; b 0; and > 0; then N2 .a; bI /D N1 .a/ '.a; bI / 3: If a 0; b 0; and > 0; then N2 .a; bI /D N1 .b/ '.a; bI / 4: If a 0; b 0; and 0; then N2 .a; bI /D N1 .a/ C N1 .b/ 1 C '.a; bI / (28.13)
N2 .a; bI / D N2 .a; 0I ab / C N2 .b; 0I ab / ı
(28.12) where
h
f xi0 ; xj0 D exp a1 2xi0 a1 C b1 2xj0 b1
i C2 xi0 a1 xj0 b1 ;
(28.14)
where the values of N2 ./ on the right-hand side are computed from the rules, for ab 0 .a b/Sgn.a/ ; ab D p a2 2ab C b 2 ıD
5 X 5
X p '.a; bI / :31830989 1 2 wi wj f xi0 ; xj0 ; i D1 j D1
b and b1 D p 2.1 2 /
.b a/Sgn.b/ ba D p a2 2ab C b 2
1 Sgn.a/ Sgn.b/ ; 4
and Sgn.x/ D
1 x 0 1 x < 0
N1 .d / is the cumulative univariate normal probability.
28.4 Valuating American Options The pairs of weights, (w) and corresponding abscissa values .x 0 / are as follows: i; j 1 2 3 4 5
w 0.24840615 0.39233107 0.21141819 0.033246660 0.00082485334
x0 0.10024215 0.48281397 1.0609498 1.7797294 2.6697604
An American call option whose exercise price is $48 has an expiration time of 90 days. Assume the risk-free rate of interest is 8% annually, the underlying price is $50, the standard deviation of the rate of return of the stock is 20%, and the stock pays a dividend of $2 exactly 50 days; (a) What is the European call value? (b) Can the early exercise price predicted? (c) What is the value of the American call?
432
C.F. Lee et al.
(a) The current stock net price of the present value of the promised dividend is
Since d D 2 > 0:432, therefore, the early exercise is not precluded. (c) The value of the American call is now calculated as
S x D 50 2e 0:08.50=365/ D 48:0218:
i h p C D 48:208 N1 .b1 / C N2 a1 ; b1 I 50=90 48e 0:08.90=365/ N1 .b2 /e 0:08.40=365/ i p CN2 .a2 ; b2 I 50=90
The European call value can be calculated as C D .48:0218/N.d1/ 48e 0:08.90=365/ N.d2 /
C2e 0:08.50=365/ N1 .b2 /
where
(28.15)
Œln.48:208=48/ C .:08 C :5.:20/2 /.90=365/ p d1 D :20 90=365 D 0:25285 d2 D 0:292 0:0993 D :15354
Since both b1 and b2 depend on the critical ex-dividend stock price St , which can be determined by C.St ; 40=365I 48/ D St C 2 48; by using trial and error, we find that St D 46:9641. An Excel program used to calculate this value is presented in Fig. 28.1. Substituting Sx D 48:208; X D $48 and St into Equations (28.9b) and (28.9c) we can calculate a1 a2 b1 and b2 :
From standard normal table, we obtain N.0:25285/ D :5 C :3438 D :599809 N.:15354/ D :5 C :3186 D :561014
a1 D d1 D 0:25285
So the European call value is
a2 D d2 D 0:15354
48:208 2 50 ln 46:9641 C 0:08 C 0:22 365 b1 D p D 0:4859 .:20/ 50=365
C D .48:516/.0:599809/ 48.0:980/.0:561014/ D 2:40123: (b) The present value of the interest income that would be earned by deferring exercise until expiration is
p In addition, we also know D 50=90 D 0:7454
X.1 e r.T t / / D 48.1 e 0:08.9050/=365 / D 48.1 :991/ D :432
Calculation of St (critical ex-dividend stock price) 46.962 46.963 46.9641 46.9 47
S (critical ex-dividend stock price) X (exercise price of option) r (risk-free interest rate) volatility of stock T -t (expiration date-exercise date) d1
48 0.08 0.2 0.10959
48 0.08 0.2 0.10959
48 0.08 0.2 0.10959
48 0.08 0.2 0.10959
48 0.08 0.2 0.10959
48 0.08 0.2 0.10959
48 0.08 0.2 D .90 50/=365
0.4773
0.1647
0.1644
0.164
0.1846
0.1525
d2
0.5435
0.2309
0.2306
0.2302
0.2508
0.2187
D (dividend) c (value of European call option to buy one share) p (value of European put option to sell one share) C.S t ; T tI X/ St D C X
2 0.60263
2 0.96319
2 0.96362
2 0.9641
2 0.93649
2 0.9798
2.18365
1.58221
1.58164
1.58102
1.61751
1.56081
0.60263
0.00119
0.00062
2.3E-06
0.03649
0.0202
D .LN.C3=C4/ C .C5 C C6^ 2=2/ .C7//=.C6 SQRT.C7// D .LN.C3=C4/ C .C5 C6^ 2=2/ .C7//=.C6 SQRT.C7// 2 D C3 NORMSDIST.C8/ C4 EXP.C5 C7/ NORMSDIST.C9/ D C4 EXP.C5 C7/ NORMSDIST .C9/ C3 NORMSDIST.C8/ D C12 C3 C10 C C4
Fig. 28.1
46
b2 D 0:485931 0:074023 D 0:4119
46
28 Bivariate Option Pricing Models
433
From the above information, we now calculate related normal probability as follows:
and p.S; T I X / maxŒ0; Xe rT S
(28.16b)
N1 .b1 / D N1 .0:4859/ D 0:6865 respectively, and the lower price bounds for the American call and put options are
N1 .b2 / D N1 .0:7454/ D 0:6598 Following Equation (28.14),We now calculate the value of N2 .0:25285; 0:4859I 0:7454) and N2 .0:15354; 0:4119I 0:7454/ as follows: Since ab¡ > 0 for both cumulative bivariate normal density function, therefore, we can use Equation N2 .a; bI ¡/ D N2 .a; 0I ¡ab / C N2 .b; 0I ¡ba / • to calculate the value of both N2 .a; bI ¡/ as follows: Œ.0:7454/.0:25285/ C 0:4859 .1/ ab D p 2 .0:25285/ 2.0:7454/.0:25285/.0:4859/ C .0:4859/2 D 0:87002 Œ.0:7454/.0:4859/ 0:25285 .1/ ba D p .0:25285/2 2.0:7454/.0:25285/.0:4859/ C .0:4859/2 D 0:31979
(28.17a)
P .S; T I X / maxŒ0; Xe rT S
(28.17b)
and
respectively. The put-call parity relation for non-dividendpaying European stock options is c.S; T I X / p.S; T I X / D S Xe rT ;
(28.18a)
and the put-call parity relation for American options on nondividend-paying stocks is S X C.S; T I X / P .S; T I X / S Xe rT : (28.18b)
• D .1 .1/.1//=4 D 1=2
N2 .0:292; 0:4859I 0:7454/ D N2 .0:292; 0:0844/ C N2 .0:5377; 0:0656/ 0:5 D N1 .0/ C N1 .0:5377/ ˆ.0:292; 0I 0:0844/ ˆ.0:5377; 0I 0:0656/ 0:5 D 0:07525 Using a Microsoft Excel program in the following Figures 28.2 and 28.3, we obtain N2 .0:1927; 0:4119I 0:7454/ D 0:06862 Then substituting the related information into the Equation (28.15), we obtain: C D $3:08238 All related results are shown in the second column of Table 28.1.
For non-dividend-paying stock options, the American call option will not rationally be exercised early, while the American put option may be.
28.6 Dividend-Paying Stocks If dividends are paid during the option’s life, the above relations must reflect the stock’s drop in value when the dividends are paid. To manage this modification, we assume that the underlying stock pays a single dividend during the option’s life at a time that is known with certainty. The dividend amount is D and the time to ex-dividend is t. If the amount and the timing of the dividend payment is known, the lower price bound for the European call option on a stock is c.S; T I X / maxŒ0; S De rt Xe rT :
28.5 Non-Dividend-Paying Stocks To derive the lower price bounds and the put-call parity relations for options on non-dividend-paying stocks, simply set the cost-of-carry rate, b, equal to the risk-less rate of interest, r. Note that the only cost of carrying the stock is interest. The lower price bounds for the European call and put options are c.S; T I X / maxŒ0; S Xe rT
C.S; T I X / maxŒ0; S Xe rT
(28.16a)
(28.19a)
In this relation, the current stock price is reduced by the present value of the promised dividend. Because a Europeanstyle option cannot be exercised before maturity, the call option holder has no opportunity to exercise the option while the stock is selling cum dividend. In other words, to the call option holder, the current value of the underlying stock is its observed market price less the amount that the promised dividend contributes to the current stock value; that is, S Dert . To prove this pricing relation, we use the same arbitrage
434
C.F. Lee et al.
Option Explicit Public Function Bivarncdf (a As Double, b As Double, rho As Double) As Double Dim rho_ab As Double, rho_ba As Double Dim delta As Double If (a b rho)
D 0 And rho > 0) Then Bivarncdf D Application.WorksheetFunction.NormSDist(a) - Phi(a, -b, -rho) End If If (a >D 0 And b 0) Then Bivarncdf D Application.WorksheetFunction.NormSDist(b) - Phi(-a, b, -rho) End If If (a >D 0 And b >D 0 And rho D 0, 1, 1)) / Sqr(a ^ 2 2 rho a b C b ^ 2) rho_ba D ((rho b a) IIf(b >D 0, 1, 1)) / Sqr(a ^ 2 2 rho a b C b ^ 2) delta D (1 IIf(a >D 0, 1, 1) IIf(b >D 0, 1, 1)) / 4 Bivarncdf D Bivarncdf(a, 0, rho_ab) C Bivarncdf(b, 0, rho_ba) delta End If End Function Public Function Phi(a As Double, b As Double, rho As Double) As Double Dim a1 As Double, b1 As Double Dim w(5) As Double, x(5) As Double Dim i As Integer, j As Integer Dim doublesum As Double a1 D a / Sqr(2 (1 rho ^ 2)) b1 D b / Sqr(2 (1 rho ^ 2)) w(1) D 0.24840615 w(2) D 0.39233107 w(3) D 0.21141819 w(4) D 0.03324666 w(5) D 0.00082485334 x(1) D 0.10024215 x(2) D 0.48281397 x(3) D 1.0609498 x(4) D 1.7797294 x(5) D 2.6697604 doublesum D 0 For i D 1 To 5 For j D 1 To 5 doublesum D doublesum C w(i) w(j) Exp(a1 (2 x(i) a1) C b1 (2 x(j) b1) C 2 rho (x(i) a1) (x(j) b1)) Next j Next i Phi D 0.31830989 Sqr(1 rho ^ 2) doublesum End Function Fig. 28.2 Microsoft excel program for calculating function phi
28 Bivariate Option Pricing Models
435
Option Pricing Calculation S(current stock price)D St (critical ex-dividend stock price)D S(current stock price NPV of promised dividend)D X(exercise price of option)D r(risk-free interest rate)D ¢(volatility of stock)D T(expiration date)D t(exercise date)D D(Dividend)D d1(non-dividend-paying)D
50 46.9641 48.0218 48 0.08 0.2 0.24658 0.13699 2 0.65933
d2(non-dividend-paying)D d1 (critical ex-dividend stock price)D
0.56001 0:16401
d2 (critical ex-dividend stock price)D d1(dividend-paying)D
0:23022 0.25285
d2(dividend-paying)D a1 D
0.15354 0.25285
a2 D b1 D
0.15354 0.48593
b2 D
0.41191
D B3–B11 EXP.B7 B10/
D .LN.B3=B6/ C .B7 C 0:5 B8^ 2/ B9/=.B8 SQRT.B9// D B12 B8 SQRT.B9/ D .LN.B4=B6/ C .B7 C 0:5 B8^ 2/ .B9–B10//=.B8 SQRT.B9–B10// D B14 B8 SQRT.B9–B10/ D .LN.B5=B6/ C .B7 C 0:5 B8^ 2/ .B9//=.B8 SQRT.B9// D B16 B8 SQRT.B9/ D .LN..B3–B11 EXP.B7 B10//=B6/ C .B7 C 0:5 B8^ 2/ .B9//=.B8 SQRT.B9// D B18 B8 SQRT.B9/ D .LN..B3–B11 EXP.B7 B10//=B4/ C .B7 C 0:5 B8^ 2/ .B10//=.B8 SQRT.B10// D B20 B8 SQRT.B10/
C.St ; T tI X/ D
0.9641
C.St ; T tI X/St D C X D
2.3E-06
D B4 NORMSDIST.B14/ B6 EXP.B7 .B9–B10// NORMSDIST.B15/ D B23 B4 B11 C B6
N1.a1/ D N1.a2/ D N1.b1/ D N1.b2/ D N1.b1/ D N1.b2/ D ¡D a D a1I b D b1 ˆ.a; bI ¡/ D ˆ.a; bI ¡/ D ¡ab D
0.59981 0.56101 0.68649 0.6598 0.31351 0.3402 0:74536
DNORMSDIST(B18) DNORMSDIST(B19) DNORMSDIST(B20) DNORMSDIST(B21) D NORMSDIST.B20/ D NORMSDIST.B21/ D SQRT.B10=B9/
0.20259 0.04084 0.87002
¡ba D
0:31979
N2.a; 0I ¡ab/ D N2.b; 0I ¡ba/ D •D
0.45916 0.11092 0.5
D phi.B20; 0; B37/ D phi.B18; 0; B36/ D ..B32 B18 .B20// IF.B18 >D 0; 1; 1//=SQRT.B 18^ 2 2 B32 B18 B20 C .B20/^ 2/ D ..B32 B20 .B18// IF.B20 >D 0; 1; 1//=SQRT.B18^ 2 2 B32 B18 B20 C .B20/^ 2/ Dbivarncdf(B18,0,B36) D bivarncdf.B20; 0; B37/ D .1 IF.B18 >D 0; 1; 1/ IF.B20 >D 0; 1; 1//=4
a D a2I b D b2 ˆ.a; bI ¡/ D ˆ.a; bI ¡/ D
0.24401 0.02757
Fig. 28.3 Microsoft excel program for calculating two alternative American call options
D phi.B21; 0; B45/ D phi.B19; 0; B44/
436
C.F. Lee et al.
Option Pricing Calculation ¡ab D
0.94558
¡ba D
0:48787
N2.a; 0I ¡ab/ D N2.b; 0I ¡ba/ D •D N2.a1; b1I ¡/ D N2.a2; b2I ¡/ D c(value of European call option to buy one share) p(value of European put option to sell one share) c(value of American call option to buy one share)
0.47243 0.09619 0.5 0.07007 0.06862 2.40123 1.44186 3.08238
D ..B32 B19 .B21// IF.B19 >D 0; 1; 1//=SQRT.B19^ 2 2 B32 B19 B21 C .B21/^ 2/ D ..B32 B21 .B19// IF.B21 >D 0; 1; 1//=SQRT.B19^ 2 2 B32 B19 B21 C .B21/^ 2/ Dbivarncdf(B19,0,B44) D bivarncdf.B21; 0; B45/ D .1 IF.B19 >D 0; 1; 1/ IF.B21 >D 0; 1; 1//=4 D bivarncdf.B18; B20; B32/ D bivarncdf.B19; B21; B32/ D B5 NORMSDIST.B16/ B6 EXP.B7 B9/ NO RMSDIST.B17/ D B5 NORMSDIST.B16/ C B6 EXP.B7 B9/ NO RMSDIST.B17/ D .B3–B11 EXP.B7 B10// .NORMSDIST.B20/ C bivarncdf.B18; B20; SQRT.B10=B9/// B6 EXP.B7 B9/ .NORMSDIST.B21/ EXP.B7 .B9 B10// C bivarncdf.B19; B21; SQRT.B10=B9/// C B11 EXP.B7 B10/ NORMSDIST.B21/
Fig. 28.3 (continued)
transactions, except we use the reduced stock price S Dert in place of S. The lower price bound for the European put option on a stock is: p.S; T I X / maxŒ0; Xe rT S De rt :
(28.19b)
Again, the stock price is reduced by the present value of the promised dividend. Unlike the call option case, however, this serves to increase the lower price bound of the European put option. Because the put option is the right to sell the underlying stock at a fixed price, a discrete drop in the stock price such as that induced by the payment of a dividend serves to increase the value of the option. An arbitrage proof of this relation is straightforward when the stock price, net of the present value of the dividend, is used in place of the commodity price. The lower price bounds for American stock options are slightly more complex. In the case of the American call option, for example, it may be optimal to exercise just prior to the dividend payment because the stock price falls by an amount D when the dividend is paid. The lower price bound of an American call option expiring at the ex-dividend instant would be 0 or S Xert , whichever is greater. On the other hand, it may be optimal to wait until the call option’s expiration to exercise. The lower price bound for a call option expiring normally is shown in Equation (28.19a). Combining the two results, we get C.S; T I X / maxŒ0; S Xe rt ; S De rt Xe rT (28.20a) The last two terms on the right-hand side of Equation (28.20a) provide important guidance in deciding
whether to exercise the American call option early, just prior to the ex-dividend instant. The second term in the squared brackets is the present value of the early exercise proceeds of the call. If the amount is less than the lower price bound of the call that expires normally; that is, if S Xe rt S De rT Xe rt ;
(28.21)
the American call option will not be exercised just prior to the ex-dividend instant. To see why, simply rewrite Equation (28.21) so it reads D < X Œ1 e r.T t /
(28.22)
In other words, the American call will not be exercised early if the dividend captured by exercising prior to the exdividend date is less than the interest implicitly earned by deferring exercise until expiration. Figure 28.4 depicts a case in which early exercise could occur at the ex-dividend instant, t. Just prior to ex-dividend, the call option may be exercised yielding proceeds St C D X , where St , is the ex-dividend stock price. An instant later, the option is left unexercised with value c.St ; TtI X/, where c./ is the European call option formula. Thus, if the ex-dividend stock price, St is above the critical ex-dividend stock price where the two functions intersect, St , the option holder will choose to exercise her option early just prior to the ex-dividend instant. On the other hand, if St St , the option holder will choose to leave her position openuntil the option’s expiration. Figure 28.5 depicts a case in which early exercise will not occur at the ex-dividend instant, t. Early exercise will not
28 Bivariate Option Pricing Models
437
Fig. 28.4 American call option price as a function of the ex-dividend stock price immediately prior to the ex-dividend instant. Early exercise may be optimal
Fig. 28.5 American call option price as a function of the ex-dividend stock price immediately prior to the ex-dividend instant. Early exercise will not be optimal
occur if the functions, St C D X and c.St ; T tI X / do not intersect, as is depicted in Fig. 28.5. In this case, the lower boundary condition of the European call, St Xer.T t / , lies above the early exercise proceeds, St C D X , and hence the call option will not be exercised early. Stated explicitly, early exercise is not rational if
time t, early exercise is suboptimal, where .X S /e r.t tn / is less than .X S C D/. Rearranging, early exercise will not occur between tn and t if1
St C D X < St Xe r.T t /
Early exercise will become a possibility again immediately after the dividend is paid. Overall, the lower price bound of the American put option is
This condition for no early exercise is the same as Equation (28.21), where St is the ex-dividend stock price and where the investor is standing at the ex-dividend instant, t. The condition can also be written as D < X Œ1 e r.T t /
(28.22)
In words, if the ex-dividend stock price decline – the dividend – is less than the present value of the interest income that would be earned by deferring exercise until expiration, early exercise will not occur. When condition Equation (28.22) is met, the value of the American call is simply the value of the corresponding European call. The lower price bound of an American put option is somewhat different. In the absence of a dividend, an American put may be exercised early. In the presence of a dividend payment, however, there is a period just prior to the ex-dividend date when early exercise is suboptimal. In that period, the interest earned on the exercise proceeds of the option is less than the drop in the stock price from the payment of the dividend. If tn represents a time prior to the dividend payment at
tn > t
ln 1 C r
D X S
:
P .S; T I X / maxŒo; X .S De rt / :
(28.23)
(28.20b)
Put-call parity for European options on dividend-paying stocks also reflects the fact that the current stock price is deflated by the present value of the promised dividend; that is, c.S; T I X /p.S; T I X / D S De rt Xe rT :
(28.24)
That the presence of the dividend reduces the value of the call and increases the value of the put is again reflected here by the fact that the term on the right-hand side of 1
It is possible that the dividend payment is so large that early exercise prior to the dividend payment is completely precluded. For example, consider the case where X D 50; S D 40; D D 1; t D 0:25 and r D 0:10. Early exercise is precluded if r, D 0:25 lnŒ1 l=.50 40/ =0:10 D 0:7031. Because the value is negative, the implication is that there is no time during the current dividend period (i.e., from 0 to t) where it will not pay the American put option holder to wait until the dividend is paid to exercise his option.
438
C.F. Lee et al.
Table 28.1 Arbitrage transactions for establishing put-call parity for American stock options S Dert X C.S; T I X/ P .S; T I X/
Position
Initial Value
Ex-Dividend Day(t)
Buy American Call Sell American Put Sell Stock Lend D e rt Lend X Net Portfolio Value
C P S Dert X C C P C S Dert X
D D
Put Exercised normally(T) Terminal Value SQT X SQT > X 0 SQT X Q .X ST / 0 SQT SQT XerT
XerT
X.e rT 1/
X.e rT 1/
r
Xe CQ C X.e r 1/
0
Equation (28.24) is smaller than it would be if the stock paid no dividend. Put-call parity for American options on dividend-paying stocks is represented by a pair of inequalities; that is, S De rt X C.S; T I X / P .S; T I X / S De rt Xe rT
Put Exercised Early.”/ Intermediate Value CQ .X SQ / SQ
(28.25)
To prove the put-call parity relation Equation (28.25), we consider each inequality in turn. The left-hand side condition of Equation (28.25) can be derived by considering the values of a portfolio that consists of buying a call, selling a put, selling the stock, and lending X C Dert risklessly. Table 28.1 contains these portfolio values. In Table 28.1, it can be seen that, if all of the security positions stay open until expiration, the terminal value of the portfolio will be positive, independent of whether the terminal stock price is above or below the exercise price of the options. If the terminal stock price is above the exercise price, the call option is exercised, and the stock acquired at exercise price X is used to deliver, in part, against the short stock position. If the terminal stock price is below the exercise price, the put is exercised. The stock received in the exercise of the put is used to cover the short stock position established at the outset. In the event the put is exercised early at time T, the investment in the riskless bonds is more than sufficient to cover the payment of the exercise price to the put option holder, and the stock received from the exercise of the put is used to cover the stock sold when the portfolio was formed. In addition, an open call option position that may still have value remains. In other words, by forming the portfolio of securities in the proportions noted above, we have formed a portfolio that will never have a negative future value. If the future value is certain to be nonnegative, the initial value must be nonpositive, or the left-hand inequality of Equation (28.25) holds.
The right-hand side of Equation (28.25) may be derived by considering the portfolio used to prove European put-call parity. Table 28.1 contains the arbitrage portfolio transactions. In this case, the terminal value of the portfolio is certain to equal zero, should the option positions stay open until that time. In the event the American call option holder decides to exercise the call option early, the portfolio holder uses his long stock position to cover his stock obligation on the exercised call and uses the exercise proceeds to retire his outstanding debt. After these actions are taken, the portfolio holder still has an open long put position and cash in the amount of X Œ1 e r.T t / . Since the portfolio is certain to have nonnegative outcomes, the initial value must be nonpositive or the right-hand inequality of Equation (28.25) must hold.
28.7 Conclusion In this chapter reviewed the basic concept of the bivariate normal density function and presented the bivariate normal CDF. The theory of American call stock option pricing model for one dividend payment was also presented. The evaluations of stock option models without dividend payment and with dividend payment were also discussed. Finally, we provided an Excel program for evaluating American option pricing model with one dividend payment.
References Stoll, H. R. 1969. “s” Journal of Finance 24, 801–824. Whaley, Robert E. 1981. “On the valuation of American call options on stocks with known dividends.” Journal of Financial Economics 9, 207–211.
Chapter 29
Displaced Log Normal and Lognormal American Option Pricing: A Comparison Ren-Raw Chen and Cheng-Few Lee
Abstract This paper compares the American option prices with one known dividend under two alternative specifications of the underlying stock price: displaced log normal and log normal processes. Many option pricing models follow the standard assumption of the Black–Scholes model (Journal of Political Economy 81:637–659, 1973) in which the stock price, follows a log normal process. However, in order to reach a closed form solution for the American option price with one known dividend, Roll (Journal of Financial Economics 5:251–258, 1977), Geske (Journal of Financial Economics 7: 63–81, 1979), and Whaley (Journal of Financial Economics 9:207–211, 1981) assume a displaced lognormal process for the cum-dividend stock price which results in a lognormal process for the ex-dividend stock price. We compare the two alternative pricing results in this paper. Keywords Displaced log normal r Log normal r Option pricing r Black–Scholes model
29.1 Introduction The American option pricing model with one known dividend was first derived by Roll (1977), improved by Geske (1979a), and corrected by Whaley (1981). The Geske-RollWhaley model (GRW model hereafter) is a closed form model with bivariate normal probability functions that are easy to implement. Roll (1977) adopts a portfolio that contains European and compound options to replicate the probabilistic cash flows of the American option with one known dividend. Geske (1979a) and Whaley (1981) assume the exdividend stock price process follows a lognormal diffusion
R.-R. Chen () Graduate School of Business Administration, Fordham University, New York, NY 10019, USA e-mail: [email protected]
and arrive at a closed form solution. If we assume the cum-dividend stock price process to follow a lognormal process, the price will be higher.
29.2 The American Option Pricing Model Under the Lognormal Process As Geske pointed out (1979a), the American option formula can be directly derived by taking the risk neutral expectation of the option payoff. Follow the GRW notation and let P be the cum-dividend price at an arbitrary time t, and S be the ex-dividend price. No arbitrage pricing states that P should yield the risk free rate under the equivalent martingale measure: dP D rdt C dW (29.1) P Note that Pt is log normally distributed with log mean and log variance:
2 .s u/ Eu Œln Ps D ln Pu C r 2 Vu Œln Ps D 2 .s u/
(29.2)
where u < s are two arbitrary times. The underlying stock pays a known dividend of D at time t and the option expires at T . At time t; St D Pt ˛D where ˛ is a constant. Note that the option will be exercised if the exercise value is greater than the option value after ex-dividend; that is the American option will be exercised if Pt K > C.St / where C.St / is the European value of the option at time t (which is a function of the ex-dividend stock price St ). In this case, the payoff is Pt K. Otherwise, that is, Pt K < C.St /, the call continues to be held. The trigger of early exercise is when the stock price satisfies the following equation:
C.-F. Lee Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_29,
8 < Pt K D C.St / or : St C ˛D K D C.St / 439
440
R.-R. Chen and C.-F. Lee
Like CRW, we follow the second expression of the above equation. Since St appears on both sides and is the only unknown in this equation, we can solve for St for the equality: St C ˛D K D C.St /.1 In other words, St > St (which implies St C ˛D K > C.St /) and the option will be early exercised, otherwise, it does not. Hence, the pricing formula can be derived by implementing the following integrals: 2 6 C0 D e rt 4
ZPt
Z1 .Pt K/f .Pt /dPt C
Pt
3
Note that if Pt ˛D is and f .Pt / were both log normal, then the product of two log normal variables,
ZPt Z1 f .ST j Pt ˛D /f .Pt /dST dPt 0
K Pt Z˛DZ1
D
f .ST jPt ˛D/f .Pt ˛D/dST dPt 0
7 C.St /f .Pt /dPt 5
K
Pt ˛D
Z
Z1
D
0
(29.3) where Pt is the critical value over which triggers early exercise. Note that the call price (under no early exercise) in the above equation, C.St /, is a function of the ex-dividend stock price and hence can be written as the following integration: 21 3 Z C.St / D e r.T t / 4 .ST K/f .ST jSt /dST 5
f .ST ; Pt ˛D/dST dPt 0
(29.6)
K
would result in a bivariate normal probability. Unfortunately, since Pt ˛D is a displaced lognormal, its distribution is unknown and certainly cannot be log normal. Consequently, Equation (29.6) (and hence Equation (29.5)) will not have a closed form solution.
(29.4)
K
29.3 The Geske-Roll-Whaley Model Plugging (29.4) into (29.3), we have: The GRW model, in order to arrive at a closed form solution, assumes the following:
2
Z1 rt 6 C0 D e 4 .Pt K/f .Pt /dPt
dS D rdt C dW: S
Pt
(29.7)
Pt
Z Z1 C 0
K ZPt Z1
K 2 De
rt
6 4
In this case, the ex-dividend price follows a log normal distribution with log mean and variance:
ST f .ST jSt /f .Pt /dST dPt
0
3
2 Eu Œln Ss D ln Su C r .s u/ 2
7 f .ST jSt /f .Pt /dST dPt 5
Vu Œln Ss D 2 .s u/:
K Pt
Z1 Z .Pt K/f .Pt /dPt C
Pt
Under the circumstance, Equation (29.5) becomes: 2
0
C0 D e
Z1 ST f .ST jPt ˛D/f .Pt /dST dPt K Pt
Z Z1 K 0
rt
6 4
Z1 .Pt K/f .Pt /dPt
Pt
3 7 f .ST jPt ˛D/f .Pt /dST dPt 5
ZPt Z1 C
ST f .ST jSt /f .Pt /dST dPt 0 K
K
ZPt Z1 (29.5)
K 0
Note that the call option has a lower bound of S e r.T t/ K. As a result, ˛D > K.1 e r.T t/ /.
1
(29.8)
3 7 f .ST jSt /f .Pt /dST dPt 5 (29.9)
K
A simple change of variable, St D Pt ˛D, simplifies the above equation into:
29 Displaced Log Normal and Lognormal American Option Pricing: A Comparison
2 Z1 6 C0 D e rt 4 .St C ˛D K/f .St /dSt
441
where ln.S0 =St / C .r C 2 =2/t p t p a2 D a1 t
a1 D
St
ZSt Z1 C
ST f .ST jSt /f .St /dST dSt 0 K
3
ZSt Z1 K
7 f .ST jSt /f .St /dST dSt 5 (29.10)
ln.S0 =K/ C .r C 2 =2/T p T p b2 D b1 T b1 D
0 K
This would lead to a nice bivariate solution (shown below) if St and ST are jointly log normal. However, it is not possible that both Pt and St be log normal at the same time since a displaced log normal variable (i.e. St ) can no longer be log normally distributed. Assuming that St can be log normally distributed, GRW continue with the following derivation and arrive at their option pricing formula2: 2 3 Z1 ZSt 6 7 e rt 4 .St C ˛D K/f .St /dSt C C.St /f .St /dSt 5 St
0
D S0 N.a1 / C e rt .˛D K/N.a2 / 9 8 ZSt < Z1 = e r.T t / .ST K/f .ST jSt /dST f .St /dSt C e rt : ; 0 K „ ƒ‚ … X
(29.11) We can then simplify the second term of the right hand side of (29.11), X , as: X D e rT
211 Z Z 4 .ST K/f .ST jSt /f .St /dSt dST 0 K
3 Z1 Z1 7 .ST K/f .ST jSt /f .St /dSt dST 5 St K
p D S0 N.b1 / e r.T t / KN.b2 / S0 M a1 ; b1 I p t T p Ce r.T t / KM a2 ; b2 I p t T p p t D S0 M a1 ; b1 I pT e r.T t / KM a2 ; b2 I pTt
The derivation of (29.11) is provided in the Appendix. Putting (29.11) and (29.12) together, we arrive at the GeskeRoll-Whaley formula: h h p i S0 N.a1 /CM a1 ; b1 I p t e rT K e r.T t / N.a2 / T p i C e rt ˛DN.a2 / (29.13) CM a1 ; b1 I pTt where M.a; bI / is a bivariate normal probability function with correlation coefficient , and N.a/ is a univariate formal probability function. Note that this result is consistent with the GRW model only if S0 is used in computing the option price, and not today’s stock price, P0 . This implies that it is S (the ex-dividend price) but not P (the cum-dividend price) that follows the log normal diffusion. Whaley (1981, p. 209, last two lines) and Geske (1979b, p. 375, definition of S / already point out that the result is consistent with the exdividend process being log normal. Note that while both exdividend and cum-dividend processes are martingales (after deflated by the risk free rate), the actual observed price series are not. The actual observed prices are cum-dividend prior to the dividend and ex-dividend after the dividend. As a result, the observed price series, Pt t < (29.14) Vt D St t > but each ex or cum dividend series is a martingale: St D Pt D
Pt e r. t / ˛D t < t St
and
t < Pt St C e r.t / ˛D t
The terminal stock price at the maturity T , should be ST D (29.12) PT e r.T t / ˛D, a result that can be easily obtained by noarbitrage. Since PT is log normally distributed, ST cannot be. 2 Equivalently, according to GRW the ex-dividend price process follows As a result, there is no closed form solution to the American option price. We use the binomial model to compute the a log normal diffusion: American price and compare that with the GRW price. We dS D rdt C dW set the parameter values same as Whaley (1981): S that readily impliesp that f .ln St ; ln ST / is a bivariate normal with correlation coefficient t =T .
K D 100 ˛D D 5
442
R.-R. Chen and C.-F. Lee
r t T
D 0:2 D 4% D1 D2
The percentage differences are larger. In particular, the percentage difference for P0 D 80 is 25.51% under $10 dividend drop, as opposed to 11.64% under $5 dividend drop.
The results are given as follows3 :
29.4 Conclusion P0 80 85 90 95 100 105 110 115 120
S0 75:196 80:196 85:196 90:196 95:196 100:196 105:196 110:196 115:196
GRW 3:212 4:818 6:839 9:276 12:111 15:316 18:851 22:676 26:748
binomial 3:635 5:297 7:36 9:814 12:645 15:827 19:33 23:106 27:131
% diff 11:64% 9:04% 7:08% 5:48% 4:22% 3:23% 2:48% 1:86% 1:41%
The above table demonstrates that the difference is larger when the escrowed stock price is lower; that is, the option is more out of the money. Note that this is a situation where the dividend yield .D=S0 / is larger. In other words, the difference is larger when the dividend impact is greater. Also, the difference is larger when the dividend amount gets larger. The following table contains prices with ˛D D 10. P0 80 85 90 95 100 105 110 115 120
3
S0 70:392 75:392 80:392 85:392 90:392 95:392 100:392 105:392 110:392
GRW 2:154 3:499 5:312 7:619 10:414 13:663 17:313 21:301 25:560
binomial 2:892 4:378 6:286 8:635 11:417 14:605 18:158 22:033 26:170
% diff 25:51% 20:09% 15:50% 11:76% 8:79% 6:45% 4:65% 3:33% 2:33%
The binomial model is run on 1,000 steps. We verified that the binomial price for the compound option matches up to the third decimal place with the Geske formula (for strikes 10 and 100, both models give 7.996).
In this paper, we clarify the ambiguity in pricing American options with known dividends. We show that the famous pricing model by Geske, Roll, and Whaley differs from the usual assumption that the cum-stock price follows a lognormal process. The failure of obtaining the closed form solution is due to the fact that displaced log normal distribution is no longer lognormal. By assuming the ex-dividend distribution log normal, GRW were able to derive a closed form solution under the ex-dividend stock price. As the number of known dividends grows, the difference should also grow.
References Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–659. Geske, R. 1979a. “The valuation of compound options.” Journal of Financial Economics 7, 63–81. Geske, R. 1979b. “A note on an analytical valuation formula for unprotected American call options on stocks with known dividends.” Journal of Financial Economics 7, 375–380. Roll, R. 1977. “An analytical valuation formula for unprotected American call options on stocks with known dividends.” Journal of Financial Economics 5, 251–258. Whaley, R. 1981. “On the valuation of American call options on stocks with known dividends.” Journal of Financial Economics 9, 207–211.
29 Displaced Log Normal and Lognormal American Option Pricing: A Comparison
443
Appendix 29A In this appendix, we derive Equation (29.11). 2 X D e rT
3 Z1 Z1 Z1 Z1 6 7 .ST K/f .ST jSt /f .St /dSt dST .ST K/f .ST jSt /f .St /dSt dST 5 4 St K
0 K
2 D e rT
3
611 7 6Z Z 7 Z1 Z1 6 7 6 .ST K/f .ST ; St /dSt dST .ST K/f .ST ; St /dSt dST 7 6 7 6 7 0 K 4„ 5 St K ƒ‚ … „ ƒ‚ … Y
Z
The first term, Y , can be simplified by changing the integrals: 0 1 Z1 Z1 Y D @ .ST K/f .ST ; St /dSt A dST K
0
1 01 Z .ST K/ @ f .ST ; St /dSt A dST
Z1 D K
0
Z1 D
.ST K/f .ST /dST K
D e rT S0 N.b1 / KN.b2 /
The second term, Z, leads to a bivariate probability: Z1 Z1 ZD .ST K/f .ST ; St /dSt dST St K
Z1 Z1 D
Z1 Z1 ST f .ST ; St /dSt dST K
St K
p D W KM a2 ; b2 I pTt
f .ST ; St /dSt dST St K
444
R.-R. Chen and C.-F. Lee
The second term of Z is a bivariate normal probability. The first term: Z1 Z1 W D
ST f .St ; ST /dST dSt
S1
K
Z1 Z1 D
e XT f .Xt ; XT /dX T dX t
ln S1 ln K
Z1 Z1 D
e
ln S1 ln K
XT
1 1 exp 2 t T 2.1 2 /
Xt t 2 t
XT T T
C
0 Z1
Z1
D ln S1 ln K
"
Xt t t
XT T T 2
2
2 #! dX T dX t
6 X 2 B 1 1 t t 6 B p exp B 6 @ 2.1 2 / 4 t 2 t T 1 2 „ ƒ‚ … W1
31
7C XT T Xt t XT T 2 7C 2 C 2.1 2 /XT 7C dX T dX t „ ƒ‚ …5A t T T „ ƒ‚ … „ ƒ‚ … W4
W2
W3
Look at each individual term in square brackets: W1 D
D
D
X1 C t t
2 C t D t C t and
t D t t2 D 2 t
2
Xt C t t
where
t2 D 2 t D
Xt t t
t 2 T D 2 T2 T
D !2
!2
!2 Xt t C t2 C t t
C t2 C 2 Xt C t C 2 T2 C 2 Xt C t
29 Displaced Log Normal and Lognormal American Option Pricing: A Comparison
445
Similarly,
2 XT .T C T2 / XT T 2 W3 D D C T T T !2 XT C T 2 D C 2.XT C T / C T T
X2 2 Xt t W2 D 2 t !2 ! C XT C Xt t T C t C T D 2 t T ! ! ! ! Xt C XT C Xt C XT C t t T T D 2 2T 2t 2t T t T t T ! ! X2 C Xt C t 2 C 2 2 2 2.Xt C D 2 t / 2 .XT T / 2 T t 2
The bracketed term, as a result, becomes:
W1 C W2 C W3 C W4 D
Xt C t t
!2
Xt C t 2 t
!
XT C T T
! C
XT C T T
and then, 2 !2 ! Xt C Xt C 1 t t 4 @ W D p exp 2 2.1 2 / t t 2 t T 1 2 ln St ln K 3 1 ! !2 XT C XT C T T 5A dXT dXt C T T Z1 Z1
C
D e T
C
e T
T2 =2
0
T2 =2
S0 M.a1 ; b1 I / p ! t D e rT S0 M a1 ; b1 I p T
Note that X D e rT .Y " Z/ De
rT
e S0 N.b1 / KN.b2 / W C KM rT
p !# t a2 ; b2 I p T !
p p ! t t D S0 N.b1 / e rT KN.b2 / S0 M a1 ; b1 I p C e rT KM a2 ; b2 I p T T p ! p ! t t D S0 M a1 ; b1 I p e rT KM a2 ; b2 I p T T
!2
446
R.-R. Chen and C.-F. Lee
Plugging X back to (29.12) leads (29.11). rt C0 D S0 N.a /X " 1 / C e .˛D K/N.a " p 2 !# p !# t t rt C e ˛DN.a2 / K N.a2 / C M a2 ; b2 I p D S0 N.a1 / C M a1 ; b1 I p T T
Chapter 30
Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model George Chalamandaris and A.G. Malliaris
G. Chalamandaris
Athens University of Economics and Business, Athens, Greece
e-mail: [email protected]

A.G. Malliaris
School of Business Administration, Loyola University Chicago, Chicago, IL, USA
e-mail: [email protected]

Abstract The purpose of this paper is to develop certain relatively recent mathematical discoveries known generally as stochastic calculus, or more specifically as Itô's Calculus, and to illustrate their application in the pricing of options. The mathematical methods of stochastic calculus are illustrated in alternative derivations of the celebrated Black–Scholes–Merton model. The topic is motivated by a desire to provide an intuitive understanding of certain probabilistic methods that have found significant use in financial economics.

Keywords Stochastic calculus · Itô's Lemma · Options pricing

30.1 Introduction

The purpose of this chapter is to develop certain relatively recent mathematical discoveries known generally as stochastic calculus (or more specifically as Itô's Calculus), describe some more modern methods, and illustrate their application in the pricing of options. The topic is motivated by a desire to provide an intuitive understanding of certain probabilistic methods that have found significant use in financial economics. A rigorous presentation of the same ideas is presented briefly in Malliaris (1983), and more extensively in Malliaris and Brock (1982), Duffie (1996) or Karatzas and Shreve (2005). Lamberton and Lapeyre (2007), Oksendal (2007), and Neftci (2000) are also useful books on stochastic calculus.

Itô's Calculus was prompted by purely mathematical questions originating in N. Wiener's work in 1923 on stochastic integrals and was developed by the Japanese probabilist Kiyosi Itô during 1944–1951. Two decades later economists such as Merton (1973) and Black and Scholes (1973) started using Itô's stochastic differential equation to describe the behavior of asset prices. The results of Black–Scholes–Merton were soon generalized in the larger framework of the fundamental theorem of asset pricing following the work of Harrison and Kreps (1979) and Duffie and Huang (1985–1986). According to this theorem the absence of an arbitrage opportunity is linked to the notion of a Martingale measure and process. Because stochastic calculus is now used regularly by financial economists, some attention must be given to its mathematical meaning, its appropriateness and applications in economic modeling, and its applications to finance.

30.2 The Itô Process and Financial Modeling

Stochastic calculus is the mathematical treatment of random change in continuous time, unlike ordinary calculus, which deals with deterministic change. This section defines intuitively the Brownian motion (or Wiener process), the Itô integral and equation, and discusses their appropriateness to financial modeling.

The elementary building block and "engine" of continuous-time stochastic models is the Brownian motion, named after the British botanist Robert Brown, who observed the random movement of pollen particles in water. Formally, a Wiener process is a random process Z(t,w): [0,∞) × Ω → R with increments that are statistically independent and normally distributed with mean zero and variance equal to the increment in time. In other words, for every pair of disjoint time intervals [t1, t2], [t3, t4] with, say, t1 < t2 ≤ t3 < t4, the increments Z(t4,w) − Z(t3,w) and Z(t2,w) − Z(t1,w) (also called white noise) are independent and normally distributed random variables with means:

E[Z(t4,w) − Z(t3,w)] = E[Z(t2,w) − Z(t1,w)] = 0  (30.1)
and respective variances:

Var[Z(t4,w) − Z(t3,w)] = t4 − t3
Var[Z(t2,w) − Z(t1,w)] = t2 − t1  (30.2)
By convention it is assumed that at time t = 0 the Wiener process is zero; that is, Z(0,w) = 0. The Wiener process is widely applied in mathematical models involving various noisy systems, like the motion of diffusing particles, stochastic control disturbances, or the behavior of asset prices in a financial market. The main reason for its popularity is that if the noise in the system is caused by a multitude of independent random changes, then the Central Limit Theorem predicts that the net result will follow a normal distribution. Very important properties of the Brownian motion are the following:
– Its paths are continuous. In fact, one can think of the Brownian motion as the continuous-time limit of a discrete-time random walk, which of course has discontinuities. If we assume for this random walk that the successive random increments have length proportional to the square root of the time interval between them, then by driving the length of this time interval to zero we eradicate the discontinuities of the original discrete-time random walk, thus getting its continuous limit – the Wiener process. However, when we decrease the length of the interval on which we examine the Wiener process, its observed path always remains "jagged" and never smooth.
– It is finite; that is, it does not explode nor degenerate to a fixed point as the time interval changes. Intuitively, this also means that it would take infinite time for it to reach infinity, and that only zero time would ensure no motion at all. However, it is almost certain that at some point it will reach any specific bounded target.
– It is a Markov process, meaning that the conditional distribution of Z(t), given information only until τ < t, depends only on its value Z(τ) at τ, and not on past values. In other words, the process keeps no memory of its path, and the distribution of its future value depends only on its current one.
– It is a Martingale process: this means that, given information only until τ < t, the expectation of Z(t) conditional on that information is Z(τ). In other words, the best estimate of its future value is the current value of the process.

Later we shall see that the last two features of the Brownian motion are associated with the weak form of "efficient markets" introduced in Fama (1970, 1991).
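These defining properties are easy to examine numerically. The short Python fragment below is our illustration rather than part of the chapter (it assumes NumPy is installed); it simulates Wiener increments over subintervals of length dt and checks that they have mean zero, variance dt, and are uncorrelated across disjoint intervals:

    import numpy as np

    rng = np.random.default_rng(0)
    n_paths, n_steps, dt = 100_000, 2, 0.01

    # Independent increments: each Delta Z ~ N(0, dt)
    dZ = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))

    print(dZ[:, 0].mean())                        # ~ 0, cf. Equation (30.1)
    print(dZ[:, 0].var())                         # ~ dt, cf. Equation (30.2)
    print(np.corrcoef(dZ[:, 0], dZ[:, 1])[0, 1])  # ~ 0: disjoint increments independent

    # A path is the cumulative sum of its increments, with Z(0, w) = 0 by convention
    Z = np.concatenate([np.zeros((n_paths, 1)), dZ.cumsum(axis=1)], axis=1)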
The next logical step in the construction of the stochastic calculus framework is the definition of the Itô integral. This takes the form I(w) = ∫_a^b σ(s)dZ(s,w), with σ(s) an integrable function of time, or a process, that satisfies suitable regularity and integrability conditions that are beyond the scope of this chapter; s denotes the time dimension of the process and the second argument w denotes a random element taking values from a random set Ω, as for a random variable. We shall present an intuitive approach to stochastic calculus rather than give here the technical conditions of its rigorous mathematical construction. The interested reader could turn to Oksendal (2007), Karatzas and Shreve (2005) or Bjork (1998) for a more detailed and complete derivation.

The idea behind the construction of a stochastic integral is to build a partition of the time interval [a,b] and approximate the function (or process) σ(t) with a function (or process) σn(t) that is piecewise constant over each elementary time interval [tk, tk+1]. The approximating Itô integral is then defined, similarly to the Riemann integral, as

In(w) = ∫_a^b σn(s)dZ(s,w) = Σ_{k=0}^{n−1} σn(tk)[Z(tk+1,w) − Z(tk,w)]  (30.3)
What is important is to evaluate σn(t) at the left side of each and every infinitesimal interval [tk, tk+1]. This feature represents the flow of information that underlies the modeling of random events: although σn(t) may be a random variable itself, its value must be known at the beginning of each interval, leaving as the only source of uncertainty the realization of the forward increments of the Wiener process. This feature is essential not only from a mathematical but, as we shall see later on, from an economic point of view.

It now becomes obvious that the finer the partition of the overall time interval, the better the approximation of the original function σ(t) that can be achieved by the piecewise constant ones. The Itô integral is defined by taking the detail of the partition to infinity. When this happens, and assuming that the right technical conditions are satisfied, the integral is defined as the limit

I(w) = ∫_a^b σ(s)dZ(s,w) = lim_{n→∞} ∫_a^b σn(s)dZ(s,w)  (30.4)
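The limiting procedure in Equations (30.3) and (30.4) can be mimicked directly on a computer. The following sketch is ours (NumPy assumed); it forms the left-endpoint sums In(w) for the integrand σ(s) = Z(s,w), for which the Itô integral over [0,T] has the known closed form (Z(T)² − T)/2, and shows that the approximation error shrinks as the partition is refined:

    import numpy as np

    rng = np.random.default_rng(1)
    T, n_paths = 1.0, 50_000

    for n in (10, 100, 1000):                 # finer and finer partitions
        dt = T / n
        dZ = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
        Z = np.concatenate([np.zeros((n_paths, 1)), dZ.cumsum(axis=1)], axis=1)
        # Left-endpoint evaluation, as Equation (30.3) requires: sigma_n(t_k) = Z(t_k)
        I_n = (Z[:, :-1] * dZ).sum(axis=1)
        exact = 0.5 * (Z[:, -1] ** 2 - T)     # closed form of the Ito integral of Z dZ
        print(n, np.abs(I_n - exact).mean())  # average error falls as n grows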
Once the integral has been defined, a wealth of important results follows. For our purposes, however, we need to stress only three points:
1. Each elementary Itô integral ∫_a^b σn(s)dZ(s,w) is a well-defined stochastic variable In(w) whose value depends on both the function σ(t) and the path of the integrating Brownian motion. Of course, the value of σ(t) may well depend on the realized path of the Brownian motion from time t = 0 to time t.
2. The sequence of the elementary Itô integrals converges to a well-defined random variable I(w); therefore the Itô integral itself can be regarded as a random variable.
3. If σ(t) is a deterministic function of time, then one can prove that the Itô integral is normally distributed with mean zero and variance equal to

var[∫_0^t σ(s)dZ(s,w)] = ∫_0^t σ(s)² ds  (30.5)
It is important to notice here that the second integral is a nonrandom, simple integral over time. The above results are essential for the understanding of the numerical method of integration through Monte Carlo simulation: instead of evaluating an Itô integral analytically, one can sample from its distribution and calculate the average of the sample draws.

Having a concise model for randomness and uncertainty, we next move to the key notion of stochastic calculus – the Itô process. This can be described by the integral equation

S(t,w) = S(0) + ∫_0^t μ[u, S(u,w)]du + ∫_0^t σ[u, S(u,w)]dZ(u,w).  (30.6)
Although mathematically more meaningful as written above, it is usually presented in its shorthand form as a stochastic differential equation,

dS(t,w) = μ[t, S(t,w)]dt + σ[t, S(t,w)]dZ(t,w)  (30.7)

which is analogous to the ordinary differential equation dS(t)/dt = μ[t, S(t)]. A stochastic process is an Itô process if the random variable dS(t,w) can be represented by Equation (30.7). The first term, μ[t, S(t,w)]dt, is the expected change in S(t,w) at time t. The second term, σ[t, S(t,w)]dZ(t,w), reflects the uncertain term. The Itô equation is a random equation. The domain of the equation is [0,∞) × Ω, with the first argument t denoting time and taking values continuously in the interval [0,∞), and the second argument w denoting a random element taking values from a random set Ω. The range of the equation is the real numbers or real vectors. For simplicity only the real numbers, denoted by R, are considered as the range of Equation (30.7). Because time takes values continuously in [0,∞), the Itô equation is a continuous-time random equation. Although at first a real random variable S(t,w): [0,∞) × Ω → R is used – that is, a function having as domain [0,∞) × Ω and as range the real numbers – Equation (30.7) expresses not the values of S(t,w) but its infinitesimal differences dS(t,w) as a function of two terms. For example, in finance, S(t,w) denotes the price of a stock at time t affected by the state of the economy described by the random element w; and Equation (30.7) expresses the small changes in the stock price, dS(t,w), at time t affected by the random element w.

The mathematical meaning of this small difference can be explained as follows: dS(t,w) may be viewed as the limit of finite differences ΔS(t,w) as Δt approaches zero. Note that ΔS(t,w) = S(t+Δt,w) − S(t,w), where Δt denotes the change in time. Thus, the Itô equation expresses random changes in the values of a variable taking place continuously in time. Moreover, these random changes are given as the sum of two terms. The first term, μ[t, S(t,w)], is called the drift component of the Itô equation, and in finance it is used to compute the instantaneous expected value of the change in the random variable S(t,w). Observe that μ[t, S(t,w)], as a function used in the computation of a statistical mean, is affected by both time and randomness. If at a given time t the expected change in dS(t,w), expressed as E[dS(t,w)], is desired, this can be answered by computing E{μ[t, S(t,w)]dt}.

The second term, σ[t, S(t,w)]dZ(t,w), is the product of two factors. Each factor is important and needs special attention. The first factor, σ[t, S(t,w)], is used in the calculation of the instantaneous standard deviation of the change in the random variable S(t,w); it is a function of both time t and the range of values taken by S(t,w). When σ[t, S(t,w)] is squared to compute the instantaneous variance, it is usually called the diffusion coefficient; it measures the magnitude of the variability of dS(t,w) at a given instance in time. The two factors have been described separately; an intuitive explanation of the product of σ[t, S(t,w)] and dZ(t,w) is now presented. Because σ[t, S(t,w)] measures the instantaneous standard deviation or volatility of dS(t,w) and because, as already described, the infinitesimal Brownian motion increment dZ(t,w) is by definition purely random with mean zero and variance dt, the expression σ[t, S(t,w)]dZ(t,w) is the product of two independent random variables. However, at time t, the instantaneous volatility σ(t, S(t)) is known exactly given t and S(t). This attribute characterizes the σ(t, S(t)) process as adapted. In that case we can treat this random variable as deterministic at the specific time t, since the only real unknown then is the forward increment of the Wiener process dZ(t). Hence we take:

E[σ(t, S(t,w))dZ(t,w)] = σ(t, S(t,w)) E[dZ(t,w)] = 0  (30.8)
Var[σ(t, S(t,w))dZ(t,w)] = σ²(t, S(t,w)) Var[dZ(t,w)] = σ²(t, S(t,w))dt.  (30.9)

Therefore, the product σ[t, S(t,w)]dZ(t,w) is a random variable with mean and variance given by Equations (30.8) and (30.9), which, for a given time t and state of nature w, yields a real number. This number may be either positive or negative depending on the value of dZ(t,w), since σ[t, S(t,w)] represents a measure of standard deviation and is always positive. Furthermore, the magnitude of the product σ[t, S(t,w)]dZ(t,w) depends on the magnitude of each of the two terms, even though the first factor plays the role of scale and the second the role of the realization of randomness. Indeed, the methodological foundation of the Itô model is that the standardized uncertainty generator dZ(t,w), which provides the realization of a random draw, is multiplied/scaled by the uncertainty magnitude σ[t, S(t,w)] to produce the total contribution of uncertainty. Therefore, σ[t, S(t,w)]dZ(t,w) describes the total fluctuation produced by volatility as this volatility is aggrandized or reduced by pure randomness. This analysis explains why the uncertainty given by dZ(t,w) is modeled as a multiplicative factor in the product σ[t, S(t,w)]dZ(t,w): to repeat once again, the multiplicative modeling of uncertainty allows the generation of values for dS(t,w) that are above or below the instantaneous expected value, depending on whether the realization of uncertainty is positive or negative, respectively. In other words, the product σ[t, S(t,w)]dZ(t,w) can be viewed as the total contribution of uncertainty to dS(t,w), with such uncertainty being the product of two factors: the instantaneous standard deviation of changes and the purely random white noise.

The conceptual components of the Itô process can now be collected to offer a more general interpretation of its meaning in finance. If at a given time t the possible future change in the price of an asset during the next trading interval is being evaluated, this change can be decomposed into two nonoverlapping components: the expected change and the unexpected change. The expected change, E[dS(t,w)], is described by E{μ[t, S(t,w)]dt} and the unexpected change is given by E{σ[t, S(t,w)]dZ(t,w)}, both evaluated by Equations (30.8) and (30.9). As already noted, even though the instantaneous drift μ(t, S(t,w)) and volatility σ(t, S(t,w)) may both be random variables (since they depend on the random variable S(t,w) themselves), at time t, where we evaluate the next infinitesimal change of the stock process, both are known. Hence, instantaneous uncertainty, or new information that cannot be anticipated, enters the model only through dZ(t,w). Thus, Equation (30.7), which was developed by mathematicians, captures the spirit of financial modeling admirably because it is an equation that involves three important concepts in finance: mean, standard deviation, and randomness.

However, Equation (30.7) captures the spirit of finance not only because it involves means, standard deviations, and randomness but, more important, because it also expresses key methodological elements of modern financial theory. In an elegant paper, Merton (1982) identifies several foundational notions of the appropriateness of the Itô equation in modern finance:
1. The inclusion of a Wiener process as the generator of uncertainty in the Itô equation allows uncertainty not to disappear even as trading intervals become extremely short. In the real world of financial markets uncertainty evolves continuously, and dZ(t,w) captures this uncertainty because the value of dZ(t,w) is not zero even as Δt becomes increasingly small. On the other hand, as trading intervals increase, uncertainty increases, but always in a time-scaled manner, because by definition Var[Z(t+Δt,w) − Z(t,w)] = Δt.
2. Itô's equation incorporates uncertainty for all times. This means that uncertainty is present at all trading periods.
3. The rate of change described by μ[t, S(t,w)]dt is finite, and uncertainty does not cause the term σ[t, S(t,w)]dZ(t,w) to become unbounded. These notions of the Itô equation are consistent with real-world observations of finite means, finite variances, and uncertainty (which evolves by obtaining continuously finite values).
4. Finally, all that counts in an Itô equation is the present time t, which is due to the Markov property of the Wiener motion. In other words, the past and future are independent. This notion expresses mathematically the concept of economic market efficiency. That is, knowledge of past price behavior does not allow for above-average returns in the future.

Sample Problem 30.1 provides further illustration.

Sample Problem 30.1. Consider the following example of an Itô process describing the price of a given stock:

dS(t,w) = 0.001 S(t,w)dt + 0.025 S(t,w)dZ(t,w).

If both sides are divided by S(t,w):

dS(t,w)/S(t,w) = 0.001 dt + 0.025 dZ(t,w)  (30.10)

which gives the proportional change in the price of the stock. Assume that dt equals one trading period, such as 1 day.
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
Using appropriately scaled coefficients in Equation (30.10), dt could equally well denote a much shorter interval, such as 2 s. In Equation (30.10), the expected daily proportional change is given by:

E[dS(t,w)/S(t,w)] = E[0.001 dt + 0.025 dZ(t,w)] = 0.001 dt  (30.11)

which means that on any given trading day the price of the stock is expected to change by 0.1%. The standard deviation of the proportional change is σ = 0.025, or 2.5 percent. What actually occurs at a given trading period depends on the evolution of the uncertainty as modeled by the normally distributed random variable dZ(t,w), with zero mean and variance dt (equal to one for the daily interval). From the properties of the normal distribution it can be deduced that although Equation (30.11) says that the expected daily proportional change is 0.001, there is a 68.26% probability that the daily proportional change will be between 0.001 ± (1)(0.025), a 95.46% probability that it will be between 0.001 ± (2)(0.025), and a 99.74% probability that it will be between 0.001 ± (3)(0.025). This example shows how the Itô equation combines the notion of a trading interval dt, the expected change in the price of the stock μ, the volatility of the stock σ, and pure uncertainty dZ(t,w) to describe changes in the price of an asset. The same example will be used later to describe the behavior of the price of the stock S(t,w). Equation (30.7) illustrates only approximately the infinitesimal proportional change in the price of the stock, and its solution in Equation (30.6) will give the exact evolution of the stock price.
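A quick simulation confirms these probability bands. The fragment below is our sketch, not part of the chapter (NumPy assumed); it uses the coefficients of Sample Problem 30.1:

    import numpy as np

    rng = np.random.default_rng(2)
    mu, sigma, dt = 0.001, 0.025, 1.0   # per trading day, as in Equation (30.10)

    # Daily proportional changes dS/S = mu*dt + sigma*dZ, with dZ ~ N(0, dt)
    ret = mu * dt + sigma * rng.normal(0.0, np.sqrt(dt), size=1_000_000)

    for k in (1, 2, 3):
        inside = np.mean(np.abs(ret - mu * dt) <= k * sigma * np.sqrt(dt))
        print(k, inside)                # ~ 0.6826, 0.9546, 0.9974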
30.3 ITÔ’S Lemma The preceding two sections dealt with the Itô process, both its intuitive mathematical meaning and its financial interpretation. This section briefly presents Itô’s Lemma of stochastic differentiation. By returning to the Itô’s Equation (30.6) Z
t
Z
t
.u; w/ d u C
S.t; w/ D S .0; w/ C 0
.u; w/ dZ .u; w/ 0
(30.12)
we get a description of the Itô’s process in terms of the original random variable S .t; w/ rather than infinitesimal differences (differential form), as in Equation (30.6). Suppose that a stochastic process is given by Equation (30.6) and that a new process Y .t; w/ is formed by letting Y .t; w/ D u Œt; S .t; w/ . Because stochastic calculus studies random changes in continuous time, the question arises: what is dY .t; w/? This question is important for both mathematical analysis and finance. The answer is given in Itô’s Lemma.
Itô’s Lemma. Consider the nonrandom continuous function u .t; S / W Œ0; 1/ R ! R and suppose that it has continuous partial derivatives ut ; uS , and uSS . Let: Y .t; w/ D u Œt; S .t; w/ (30.13) with: dS .t; w/ D Œt; S .t; w/ dt C Œt; S .t; w/ dZ .t; w/ (30.14) Then the process Y .t; w/ has a differential given by: d Y .t; w/ D fut Œt; S.t; w/ C uS Œt; S.t; w/ Œt; S.t; w/ 1 C uS S Œt; S.t; w/ 2 Œt; S.t; w/ g dt 2 CuS Œt; S.t; w/ Œt; S.t; w/ dZ.t; w/ (30.15) The proof is presented in Gikhman and Skorokhod (1969, pp. 387–389), and extensions of this lemma may be found in Arnold (1974, pp. 90–99). Also a heuristic derivation of the lemma can be found in Baxter and Rennie (1996) and Wilmott (2000). Here the analysis is limited to three remarks. 1. Itô’s Lemma is a useful result because it allows the computation of stochastic differentials of arbitrary functions having as an argument a stochastic process that itself is assumed to possess a stochastic differential. In this respect, Itô’s formula is as useful as the chain rule of ordinary calculus. 2. Given a stochastic process S.t; w/ with respect to a given Wiener process Z.t; w/, and letting Y .t; w/ D u Œt; S .t; w/ be a new process, Itô’s formula gives the stochastic differential of Y .t; w/, where dY .t; w/ is given with respect to the same Wiener process – that is, both processes have the same source of uncertainty. 3. An inspection of the proof of Itô’s Lemma reveals that it consists of an application of Taylor’s theorem of advanced calculus and several probabilistic arguments to establish the convergence of certain quantities to appropriate integrals. Therefore, Itô’s formula may be obtained by applying Taylor’s theorem instead of remembering the specific result in Equation (30.15). More specifically, the differential of Y .t; w/ D u Œt; S .t; w/ , where S.t; w/ is a stochastic process with differential given by equation below, may be computed by using Taylor’s theorem and the following multiplication rules: dt dt D 0
dZ dZ D dt
dt dZ D 0
(30.16)
as: 1 dY D ut dt C uS dS C .dS/2 D ut dt C uS .dtCdZ/ 2 1 C uS S .dt C dZ/2 2
By carrying out these multiplications and using the rules in Equation (30.16), Equation (30.15) is obtained.

The multiplication rules of Equation (30.16) are also a very good guide for easily applying Itô's Lemma to functions of more than one variable. Indeed, in a more general case, we have a number of Itô processes Xi with i = 1, ..., N that are driven by m independent Wiener processes,

dXi(t) = μi(Xi, t)dt + Σ_{k=1}^{m} σik(Xi, t)dZk(t,w)  (30.17)

and a function that depends on their values, Y = f(t, X1, X2, ..., XN). Then the multivariable form of Itô's Lemma takes the form

dY(t) = (∂f/∂t)dt + Σ_{i=1}^{N} (∂f/∂Xi)dXi + ½ Σ_{i,j=1}^{N} (∂²f/∂Xi∂Xj)dXi dXj  (30.18)

Now we can use a set of extended multiplication rules,

dt·dt = 0
dt·dZi = 0
dZi(t,w)·dZi(t,w) = dt
dZi(t,w)·dZj(t,w) = 0,  i ≠ j  (30.19)

with the help of which we can evaluate multivariate functions of these processes. If the Wiener processes that drive the Xi are not independent but correlated, then the corresponding multiplication rule simply changes to

dZi(t,w)·dZj(t,w) = ρij dt,  i ≠ j  (30.20)

Sample Problem 30.2. In this example we derive the product rule of stochastic calculus for two processes that are driven by the same Wiener process; that is,

dXt = μx(X, t)dt + σx(X, t)dZ(t,w)
dYt = μy(Y, t)dt + σy(Y, t)dZ(t,w)

For expositional purposes we look for a bivariate function on which we can apply Itô's Lemma. A good candidate is

f(Xt, Yt) = ½[(Xt + Yt)² − Xt² − Yt²] = Xt Yt  (30.21)

The partial derivatives are

∂f/∂t = 0
∂f/∂X = ½[2(X + Y) − 2X] = Y
∂f/∂Y = ½[2(X + Y) − 2Y] = X
∂²f/∂X∂Y = ∂²f/∂Y∂X = 1
∂²f/∂X² = ∂²f/∂Y² = 0  (30.22)

And by substituting in Equation (30.18) we get

d(Xt Yt) = Xt dYt + Yt dXt + ½·1·dXt dYt + ½·1·dYt dXt
= Xt dYt + Yt dXt + (μx dt + σx dZ)(μy dt + σy dZ)
= Xt dYt + Yt dXt + σx σy dZ·dZ
= Xt dYt + Yt dXt + σx σy dt  (30.23)

We must note here that for reasons of clarity we have suppressed the notation μx(X, t) and σx(X, t).

30.4 Stochastic Differential-Equation Approach to Stock-Price Behavior

This section demonstrates how a stochastic equation can be used to describe stock-price behavior. As a special case of Equation (30.7), consider the so-called "Geometric Brownian motion,"

dS(t,w) = μS(t,w)dt + σS(t,w)dZ(t,w)  (30.24)

in which μ and σ are constants as in Sample Problem 30.1. Assume that Equation (30.24) describes the price behavior of a certain stock with S(0,w) given. The solution S(t,w) of Equation (30.24) is given by:

S(t,w) = S(0,w) exp[(μ − σ²/2)t + σZ(t,w)]  (30.25)

To show that Equation (30.25) is the solution of Equation (30.24), use Itô's Lemma as follows. First, start with Equation (30.25), which corresponds to the function Y(t,w). In this case Equation (30.25) is a function of t and Z(t,w), and instead of Equation (30.24):

dZ(t,w) = 0·dt + 1·dZ(t,w).

Next, compute the first- and second-order partials, denoted by St, SZ, and SZZ:
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
St(t,w) = (μ − σ²/2) S(t,w)
SZ(t,w) = σ S(t,w)
SZZ(t,w) = σ² S(t,w)

Collect these results and use Equation (30.15) to conclude that Equation (30.24) holds:

dS(t,w) = [(μ − σ²/2)S(t,w) + σS(t,w)·0 + ½σ²S(t,w)·1²]dt + σS(t,w)·1·dZ(t,w)
= μS(t,w)dt + σS(t,w)dZ(t,w)

This result is not only mathematically interesting; in finance it means that, assuming stock prices are given by an Itô process as in Equation (30.24), then Equation (30.25) holds as well. To see if Equation (30.25) accurately describes stock prices in the real world, rewrite it in the form of log-changes (or log-returns) as:

ln[S(t,w)/S(0,w)] = (μ − σ²/2)t + σZ(t,w),  (30.26)

which is a random variable normally distributed with mean (μ − σ²/2)t and variance σ²t. This means that if the stock price follows an Itô process, it has a lognormal probability distribution. Such a distribution for the stock price is reasonable because it is consistent with reality, where negative stock prices are not possible; the worst that can happen is that the stock price reaches zero, but then again it would take infinite time. Equation (30.25) is the conceptual cornerstone of Monte Carlo simulation as it is used in most cases for the pricing of more exotic1 or path-dependent2 options: replicating the distribution of a stock price process at any given future time requires us only to generate a large number of normally distributed random numbers with zero mean and variance Δt and apply Equation (30.25) repeatedly by substituting them for dZ(t,w). Of course, in practice the correct application of the Monte Carlo simulation involves certain complications regarding the drift or the volatility term, which we'll discuss later.

1 An exotic option is an option with a more complicated payoff, or condition of payoff, than the simple call or put ("vanilla") options.
2 A path-dependent option is an option whose payoff depends on more than one observation of the underlying during the life of the option.

Another way of looking at Equation (30.26) is motivated by applying the usual convention of continuously compounded returns zi from S(t + Δt) = S(t)e^{zi}. Successive continuously compounded returns are simply added together to produce the return for the longer time interval, as in S(T) = S(0)e^{z1+z2+...+zn}. Empirical evidence prompts us to impose the assumption that these returns are independently and identically distributed (without making any additional assumption on their distribution), ensuring thus that the stock price follows a random walk, a property that is tightly associated with the concept of efficient markets in Fama (1970, 1991). Furthermore, as described above, the empirically observed finite means and volatilities of stock returns force us to impose two more assumptions; namely, that the continuously compounded stock return means and variances are scaled by time as the size of the time interval becomes smaller. Cox and Miller (1990) show with the use of the Central Limit Theorem that if these four assumptions hold, then the continuously compounded stock returns are normally distributed and therefore the stock price is lognormally distributed. For a detailed analysis of the properties of a lognormal random distribution see Cox and Rubinstein (1985, pp. 201–204); for a list of bibliographical references on the empirical distribution of stock-price changes see Cox and Rubinstein (1985, pp. 485–488). Sample Problem 30.3 provides further illustration.
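To illustrate the Monte Carlo use of Equation (30.25), the sketch below (ours, not from the chapter; NumPy assumed, with μ, σ, and the horizon chosen arbitrarily for illustration) generates the terminal stock price directly from the closed-form solution and verifies the lognormal moments implied by Equation (30.26):

    import numpy as np

    rng = np.random.default_rng(3)
    S0, mu, sigma, t = 100.0, 0.08, 0.20, 1.0

    # One draw of Z(t) ~ N(0, t) per path, plugged into Equation (30.25)
    Z_t = rng.normal(0.0, np.sqrt(t), size=1_000_000)
    S_t = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * Z_t)

    log_ret = np.log(S_t / S0)
    print(log_ret.mean(), (mu - 0.5 * sigma**2) * t)  # ~ equal, cf. Equation (30.26)
    print(log_ret.var(), sigma**2 * t)                # ~ equal
    print((S_t <= 0).any())                           # False: simulated prices stay positive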
Sample Problem 30.3. The analysis of this problem applies the results of the last two sections to Sample Problem 30.1. Suppose that instead of Equation (30.10), Equation (30.27) is given; following Equation (30.25), it describes the solution of Equation (30.10):

S(t,w) = S(0,w) exp[(0.001 − 0.025²/2)t + 0.025 Z(t,w)]  (30.27)

From Equation (30.27) the exact evolution of the price of the stock, as influenced by a mean, a variance, time, and uncertainty, is obtained. Using Equation (30.27) for t = 1 we compute:

E{ln[S(t,w)/S(0,w)]} = (μ − σ²/2)t = [0.001 − (0.025)²/2](1) = 0.0006875  (30.28)

Var{ln[S(t,w)/S(0,w)]} = σ²t = (0.025)²(1) = 0.000625,  (30.29)

which means that for a given trading day the log-price of the stock is expected to experience a continuous change of about 0.069% with a standard deviation of 2.5%. Computing Equations (30.28) and (30.29) for any t can be done easily; the same computations cannot be performed readily with the differential form in Equation (30.10).
30.5 The Pricing of an Option

An option is a contract giving the right to buy or sell an asset within a specified period of time subject to certain conditions. The simplest kind of option is the European call option, which is a contract to buy a share of a certain stock at a given date for a specified price. The date the option expires is called the expiration date (maturity date) and the price that is paid for the stock when the option is exercised is called the exercise price (strike or striking price).

In terms of economic analysis, several propositions about call-option pricing seem clear. The value of an option increases as the price of the stock increases. If the stock price is much greater than the exercise price, it is almost certain that the option will be exercised; and, analogously, if the price of the stock is much less than the exercise price, the value of the option will be near zero and the option will expire without being exercised. If the expiration date is very far in the future, the value of the option will be approximately equal to the price of the stock. If the expiration date is very near, the value of the option will be approximately equal to the stock price minus the exercise price, or zero if the stock price is less than the exercise price. In general, the value of the option is more volatile than the price of the stock, and the relative volatility of the option depends on both the stock price and maturity.

The first rigorous formulation and solution of the problem of option pricing was achieved by Black and Scholes (1973) and Merton (1973). Consider a stock option denoted by C whose price at time t can be written:

C(t,w) = C[t, S(t,w)]  (30.30)

in which C is a twice continuously differentiable function. Here S(t,w) is the price of some stock upon which the option is written. The price of this stock is assumed to follow Itô's stochastic differential equation:

dS(t,w) = μ[t, S(t,w)]dt + σ[t, S(t,w)]dZ(t,w)  (30.31)

Assume, as a simplifying case, that μ[t, S(t,w)] = μS(t,w) and σ[t, S(t,w)] = σS(t,w). For notational convenience w is suppressed from the various expressions, and sometimes t as well. Therefore, Equation (30.31) becomes

dS(t) = μS(t)dt + σS(t)dZ(t)  (30.32)

Consider an investor who builds up a portfolio of stocks, options on the stocks, and a riskless asset (for example, government bonds) yielding a riskless rate r. The nominal value of the portfolio, denoted by P(t), is

P(t) = N1(t)S(t) + N2(t)C(t) + Q(t)  (30.33)

where:
N1 = the number of shares of the stock;
N2 = the number of call options; and
Q = the value of dollars invested in riskless bonds.

Assume that the stock pays no dividends or other distributions. By Itô's Lemma the differential of the call price, using Equations (30.30) and (30.32), is

dC = Ct dt + CS dS + ½CSS (dS)²
= (Ct + CS μS + ½CSS σ²S²)dt + CS σS dZ
= μC C dt + σC C dZ  (30.34)

Observe that in Equation (30.34):

μC C = Ct + CS μS + ½CSS σ²S²  (30.35)
σC C = CS σS  (30.36)

In other words, μC C is the expected change in the call-option price and σC² C² is the variance of such a change per unit of time. Itô's Lemma simply indicates that if the call-option price is a function of a spot stock price that follows an Itô process, then the call-option price also follows an Itô process with mean and standard deviation parameters that are more complex than those of the stock price. The Itô process for a call is given by Equations (30.35) and (30.36).

The infinitesimal change in the value of the portfolio dP is caused by the change in the prices of the assets. This follows from the fact that their respective quantities are rebalanced only after first taking into account the change in the asset prices; in other words, the rebalancing strategy is previsible. Mathematically, this is expressed as dN1 = dN2 = 0, which yields

dP = N1(dS) + N2(dC) + dQ = (μdt + σdZ)N1S + (μC dt + σC dZ)N2C + rQ dt  (30.37)

Let w1 be the fraction of the portfolio invested in stock, w2 the fraction invested in options, and w3 the fraction invested in the riskless asset. As before, w1 + w2 + w3 = 1; that is,
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
all of the funds available are invested in some type of asset, and furthermore the dynamic strategy remains at all times self-financing. Set w1 = N1S/P, w2 = N2C/P, w3 = Q/P = 1 − w1 − w2. Then Equation (30.37) becomes

dP/P = (μdt + σdZ)w1 + (μC dt + σC dZ)w2 + (r dt)w3  (30.38)

At this point the notion of economic equilibrium (also called risk-neutral or preference-free pricing) is introduced in the analysis. This notion plays an important role in modeling financial behavior, and its appropriate formulation is considered to be a major breakthrough in financial analysis. More specifically, design the proportions w1, w2 so that the position is riskless for all t ≥ 0 – that is, let w1 and w2 be such that:

Var_t[dP/P] = Var(w1 σ dZ + w2 σC dZ) = 0  (30.39)

In the last equation, Var_t denotes variance conditioned on S(t), C(t), and Q(t). In other words, choose (w1, w2) = (w̄1, w̄2) so that:

w1 σ + w2 σC = 0.  (30.40)

Then from Equation (30.38), because the portfolio is riskless, it follows that the portfolio must be expected to earn the riskless rate of return, or:

E_t[dP/P] = [w1 μ + w2 μC + r(1 − w1 − w2)]dt = r(t)dt  (30.41)

Equations (30.40) and (30.41) yield the Black–Scholes–Merton equations:

w1/w2 = −σC/σ  (30.42)

and:

r = w1 μ + w2 μC − rw1 − rw2 + r,  (30.43)

which simplify to

(μ − r)/σ = (μC − r)/σC  (30.44)

Because of the law of one price, Equation (30.44) says that the net rate of return per unit of risk (the so-called market price of risk) must be equal for the two assets, and as such it describes an appropriate concept of economic equilibrium in this problem. If this were not the case, there would exist an arbitrage opportunity until this equality held. Using Equation (30.44) and making the necessary substitutions from Equations (30.35) and (30.36), the partial differential equation of the pricing of an option is obtained:

½σ²S²CSS(t,S) + rSCS(t,S) − rC(t,S) + Ct(t,S) = 0.  (30.45)

This equation, along with the boundary conditions for call options, fully characterizes the call price: C(S,T) = max(0, S − X), S ≥ 0, 0 ≤ t ≤ T. The solution to the differential Equation (30.45), given these boundary conditions, is the Black–Scholes formula.
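Equation (30.45) can also be attacked numerically. The fragment below is our sketch, not part of the chapter (NumPy and SciPy assumed; all parameter values are illustrative); it solves the PDE backward from the payoff with an explicit finite-difference scheme and compares the result with the closed-form Black–Scholes price given later in Equation (30.53):

    import numpy as np
    from scipy.stats import norm

    S0, X, r, sigma, T = 100.0, 100.0, 0.05, 0.20, 0.25

    n_S, S_max = 200, 300.0
    S = np.linspace(0.0, S_max, n_S + 1)
    dS = S[1] - S[0]
    dt = 0.9 * dS**2 / (sigma**2 * S_max**2)        # explicit-scheme stability bound
    n_t = int(np.ceil(T / dt)); dt = T / n_t

    C = np.maximum(S - X, 0.0)                      # terminal condition, Equation (30.52)
    for j in range(n_t):                            # march backward from expiration
        tau = (j + 1) * dt                          # time remaining to maturity
        C_SS = (C[2:] - 2*C[1:-1] + C[:-2]) / dS**2
        C_S = (C[2:] - C[:-2]) / (2*dS)
        C[1:-1] += dt * (0.5*sigma**2*S[1:-1]**2*C_SS + r*S[1:-1]*C_S - r*C[1:-1])
        C[0] = 0.0                                  # boundary condition, Equation (30.51)
        C[-1] = S_max - X*np.exp(-r*tau)            # deep in-the-money asymptote

    d1 = (np.log(S0/X) + (r + 0.5*sigma**2)*T) / (sigma*np.sqrt(T))
    d2 = d1 - sigma*np.sqrt(T)
    bs = S0*norm.cdf(d1) - X*np.exp(-r*T)*norm.cdf(d2)
    print(np.interp(S0, S, C), bs)                  # the two prices agree closely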
30.6 A Reexamination of Option Pricing

To illustrate the notion of economic equilibrium once again, consider the nominal value of a portfolio consisting of a stock and a call option on this stock and write:

P(t) = N1(t)S(t) + N2(t)C(t)  (30.46)

using the same notation as in the previous section. Equation (30.46) differs from Equation (30.33) because the term Q(t) has been deleted. Now concentrating on the two assets of the portfolio – that is, the stock and the call option – and using Equations (30.46) and (30.34), the change in the value of the portfolio is given by:

dP = N1 dS + N2 dC = N1 dS + N2[(Ct + ½CSS σ²S²)dt + CS dS].  (30.47)

Note that dN1 = dN2 = 0, since at any given point in time the quantities of stock and option are already determined from the last rebalancing. For arbitrary quantities of stock and option, Equation (30.47) shows that the change in the nominal value of the portfolio dP is stochastic, because dS is a random variable. Suppose the quantities of stock and call option are chosen such that:

N1/N2 = −CS.  (30.48)

Note that CS in Equation (30.48) denotes a hedge ratio and is called delta. Then N1 dS + N2 CS dS = 0, and inserting Equation (30.48) into Equation (30.47) yields:

dP = −N2 CS dS + N2[(Ct + ½CSS σ²S²)dt + CS dS]
= N2(Ct + ½CSS σ²S²)dt  (30.49)
Let N2 = 1 in Equation (30.49) and observe that in equilibrium the rate of return of the riskless portfolio must be the same as the riskless rate r(t); otherwise we would have constructed synthetically a riskless asset that earns more (or less) than the tradable riskless asset. Therefore:

dP/P = r dt  (30.50)

Equation (30.50) can be used to derive the partial differential equation for the value of the option. Making the necessary substitutions in Equation (30.49):

(Ct + ½CSS σ²S²)dt / (C − CS S) = r dt,

which upon rearrangement gives Equation (30.45). Note that the option-pricing equation is a second-order linear partial differential equation of the parabolic type. The boundary conditions of Equation (30.45) are determined by the specification of the asset. For the case of an option that can be exercised only at the expiration date t* with an exercise price X, the boundary conditions are

C(t, S = 0) = 0,  (30.51)
C(t = t*, S) = max(0, S − X).  (30.52)

Observe that Equation (30.51) says that the call-option price is zero if the stock price is zero at any date t; Equation (30.52) says that the call-option price at the expiration date t = t* must equal the maximum of either zero or the difference between the stock price and the exercise price.

The solution of the option-pricing equation subject to the boundary conditions is given, for T = t* − t, as:

C(T, S, σ², X, r) = SN(d1) − Xe^{−rT}N(d2)  (30.53)

where N denotes the cumulative normal distribution; namely:

N(y) = (1/√(2π)) ∫_{−∞}^{y} e^{−x²/2} dx

In Equation (30.53), T is time to expiration (measured in years) and d1 and d2 are given by:

d1 = [ln(S/X) + (r + ½σ²)T] / (σ√T)  (30.54)
d2 = [ln(S/X) + (r − ½σ²)T] / (σ√T) = d1 − σ√T.  (30.55)
It can be shown that:

∂C/∂T > 0,  ∂C/∂S > 0,  ∂C/∂σ² > 0,  ∂C/∂X < 0,  ∂C/∂r > 0  (30.56)

These partial derivatives justify the intuitive behavior of the price of an option, as was indicated at the beginning of the previous section. More specifically, these partials show the following:
1. As the stock price rises, so does the option price. Furthermore, if the stock price rises much higher than the strike of the option, then the option starts behaving like the stock itself and its sensitivity relative to the stock price converges to 1; that is, ∂C/∂S → 1.
2. As the variance rate of the underlying stock rises, so does the option price.
3. With a higher exercise price, the expected payoff decreases.
4. The value of the option increases as the interest rate rises.
5. With a longer time to maturity, the price of the option is greater.

Before giving an example, it is appropriate to sketch the solution of Equation (30.45) subject to the boundary conditions of Equations (30.51) and (30.52). Let t denote the current trading period, prior to the expiration date t*. At time t, two outcomes can be expected to occur at t*: (1) S(t*) > X – that is, the price of the stock at the time of the expiration of the call option is greater than the exercise price – or (2) S(t*) ≤ X. Note that the first outcome occurs with probability p ≡ P[S(t*) > X] and the second occurs with probability 1 − p. Obviously, the only interesting possibility is the first case, when S(t*) > X, because this is when the price of the call has positive value. If S(t*) ≤ X, then C(t*) = 0 from Equation (30.52). Again from Equation (30.52), if S(t*) > X, then the price of the call option at expiration, C(t*), can be computed from Equation (30.52):

C(t*) = E{max[0, S(t*) − X]} = E[S(t*) − X]  (30.57)

What is the price of the call option if we calculate the value of the first outcome at t instead of t*? This can be answered immediately by appropriate continuous discounting. Using Equation (30.57):

C(t) = e^{−r(t*−t)} E[S(t*) − X].  (30.58)

Recall, however, that C(t) in Equation (30.58) holds only with probability p. Combine both possibilities to write:

C(t) = p e^{−r(t*−t)} E[S(t*) − X] + (1 − p)·0
= p e^{−r(t*−t)} E[S(t*) | S(t*) > X] − p e^{−r(t*−t)} X  (30.59)
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
457
Detailed calculations in Jarrow and Rudd (1983, pp. 92–94) show that, because the price of the underlying stock is distributed lognormally, it follows that:

p = P[S(t*) > X] = N{ [ln(S(t)/X) + (r + ½σ²)(t* − t)] / (σ√(t* − t)) }  (30.60)

E[S(t*) | S(t*) > X] = S(t) e^{r(t*−t)} N{ [ln(S(t)/X) + (r + ½σ²)(t* − t)] / (σ√(t* − t)) } / p  (30.61)

Combining Equations (30.60) and (30.61), with T = t* − t, into Equation (30.59) yields Equation (30.53).

It is worth observing that the two terms of Equation (30.53) have economic meaning. The first term, SN(d1), denotes the present value of receiving the stock provided that S(t*) > X. The second term gives the present value of paying the striking price provided that S(t*) > X. In the special case when there is no uncertainty and σ = 0, observe that:

N(d1) = N(d2) = N(∞) = 1; and
C = S(t) − e^{−r(t*−t)} X;  (30.62)

that is, a call is worth the difference between the current value of the stock and the discounted value of the striking price, provided S(t*) > X; otherwise the call price would be zero. When σ ≠ 0 – that is, when uncertainty exists and the stock price is volatile – the two terms in Equation (30.62) are multiplied by N(d1) and N(d2), respectively, to adjust the call price for the prevailing uncertainties. These two probabilities can also be given an economic interpretation. As mentioned earlier, N(d1) is called delta; it is the partial derivative of the call price with respect to the stock price. N(d2) gives the probability that the call option will be in the money, as Equation (30.59) shows.

Assuming that investors in the economy have risk-neutral preferences, it is possible to derive the Black–Scholes formula without using stochastic differential equations. Garven (1986) has shown that to derive Equation (30.53) only knowledge of normal and lognormal distributions and basic calculus is required, as presented in Appendix 30A. Sample Problem 30.4 provides further illustration.

Sample Problem 30.4. Equation (30.53) indicates that the Black–Scholes option-pricing model is a function of only five variables: T, the time to expiration; S, the stock price; σ², the instantaneous variance rate on the stock price; X, the exercise price; and r, the riskless interest rate. Of these five variables, only the variance rate must be estimated; the other four variables are directly observable. A simple example is presented to illustrate the use of Equation (30.53). The values of the observable variables were taken from Bloomberg on September 20, 2007.

On Thursday, September 19, 2007, IBM closed at $116.67. To estimate the price of a call option expiring on Friday, October 20, 2007 with an exercise price of $115, the riskless rate and the instantaneous variance need to be estimated. The riskless rate is estimated either by using the average of the bid and ask quotes on U.S. Treasury bills or from the money market quotes for deposits of approximately the same maturity as the option. The interest rate for a deposit of 1 month was 5.14875% on that particular day. The only missing piece of information is the instantaneous variance of the stock price.

Several different techniques have been suggested for estimating the instantaneous variance. In this regard the work of Latané and Rendleman (1976) must be mentioned; they derive standard deviations of continuous price-relative returns that are implied in actual call-option prices, on the assumption that investors behave as if they price options according to the Black–Scholes model. In the example, the implicit variance is calculated by using the actual October 20, 2007 call price of an IBM option with an exercise price of $120 to solve for an estimate of the instantaneous variance. More specifically, a numerical search is used to approximate the standard deviation implied by the Black–Scholes formula with these parameters: stock price S = 116.67, exercise price X = 120, time to expiration T = 30/365 = 0.0822, riskless rate r = 0.0514875, and call-option price C = 2.00. The approximated implicit standard deviation is found to be σ = 23.82%. A simple spreadsheet program like Excel can be used to calculate this "implied volatility," using the "solver" or the "goal seek" tool. That way, one can simply write the Black–Scholes formula and set the volatility as the "changing variable," so that the "target variable" – the calculated option price – becomes equal to its market price.

After these clarifications, the example is this: given S = 116.67, X = 115, T = 0.0822, r = 0.0514875, and σ = 23.82%, we use Equation (30.53) to compute C. Using Equations (30.54) and (30.55) we calculate:

d1 = [ln(116.67/115) + (0.0514875 + 0.2382²/2)(0.0822)] / (0.2382 √0.0822) = 0.30722
d2 = [ln(116.67/115) + (0.0514875 − 0.2382²/2)(0.0822)] / (0.2382 √0.0822) = 0.238922.
Using the Excel function Normsdist() we can calculate the cumulative standardized normal distribution. Thus, we calculate for the call-option premium:

C = 116.67 N(0.30722) − 115 exp(−0.0514875 × 0.0822) N(0.238922) = 4.34.

The calculated call-option price of $4.34 is between the actual bid-offer quotes for the call of $4.30–$4.50 reported in Bloomberg.

This simple example shows how to use the Black–Scholes model to price a call option under the assumptions of the model. The example is presented for illustrative purposes only, and it relies heavily on the implicit estimate of the variance, its constancy over time, and all the remaining assumptions of the model. The appropriateness of estimating the implicit instantaneous variance is ultimately an empirical question, as is the entire Black–Scholes pricing formula. Boyle and Ananthanarayanan (1977) studied the implications of using an estimate of the variance in option-valuation models and showed that this procedure produces biased option values; however, the magnitude of this bias is not large. Chiras and Manaster (1978) compared implied volatilities from option prices with volatilities estimated from historical data and found the former to be much better forecasts.

One additional remark must be made. The closeness in this example of the calculated call-option price to the actual call price is not necessarily evidence of the validity of the Black–Scholes model. Earlier extensive empirical work that investigated how market prices of call options compare with the prices predicted by Black–Scholes showed positive evidence of the model's robustness in the market (see MacBeth and Merville (1979) and Bhattacharya (1980)). For example, Chiras and Manaster (1978) also studied whether a strategy of buying low-volatility options and selling high-volatility options produced excess returns. Their evidence was supportive of the Black–Scholes model.

However, the market crash of 1987 revealed its limitations in predicting abnormal market conditions. Indeed, the appearance of phenomena like the pronounced "volatility smile,"3 or daily price jumps that defied the assumption of a lognormal distribution for the asset returns, called for alternative models that explain in a more parsimonious and general way all market prices. The search for these new models moved in the direction of local volatility models (see for example Derman et al. (1996)), asset jump models (see Merton (1992)), or stochastic volatility models (Dupire 1994; Heston 1991). Unfortunately, the larger scope of these models, required to encompass for example the more general and more realistic case of incomplete markets (see Pliska 1997; Britten-Jones and Neuberger 1996), adds significantly to their complexity and thus obscures their intuitive content: instead of a single non-observable parameter – the volatility – on which the Black–Scholes model depends, these models require the estimation ("calibration") of more parameters that relate to the price in usually intractable ways.

This fact alone, together with the conceptual transparency and robustness of the Black–Scholes model, has kept it to this day very popular among practitioners in many option markets, especially the more liquid ones that trade the simpler options in very large volumes. Traders who are aware of its limitations can very often be much more effective in dealing with their "out of model" unpredictable risks than those who may be lulled into false security behind a more sophisticated but much more obscure model. Taleb (1997) relates many cases that provide evidence of this fact. Several textbooks, such as Hull (2006), Chance (2006), and McDonald (2005), extend the discussion above of the Black–Scholes model and give numerous applications and examples.

3 Volatility smile is a pattern of consistently higher implied volatilities for strikes set far from the at-the-money strike relative to the ATM implied volatility. That is, using the Black–Scholes equation in reverse, so as to recover the implied volatility from the market prices of options with different strikes, one would discover that each strike yields a different number – see Das and Sundaram (1997) for an empirical study.
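The numerical search described in Sample Problem 30.4 can equally be carried out with a root-finder instead of a spreadsheet. The fragment below is our illustration rather than part of the chapter; it assumes SciPy is available, re-implements the Black–Scholes call of Equations (30.53)–(30.55), backs the implied volatility out of the $2.00 quote on the $120 strike, and then reprices the $115 call:

    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import brentq

    def bs_call(S, X, T, r, sigma):                # as in the sketch after Eq. (30.55)
        d1 = (np.log(S/X) + (r + 0.5*sigma**2)*T) / (sigma*np.sqrt(T))
        return S*norm.cdf(d1) - X*np.exp(-r*T)*norm.cdf(d1 - sigma*np.sqrt(T))

    S, r, T = 116.67, 0.0514875, 30/365            # market data from the example

    # Implied volatility: the sigma at which the model matches the $2.00 quote
    iv = brentq(lambda sig: bs_call(S, 120.0, T, r, sig) - 2.00, 1e-4, 2.0)
    print(iv)                                      # ~ 0.2382, the 23.82% found by the search
    print(bs_call(S, 115.0, T, r, iv))             # ~ 4.34, inside the $4.30-$4.50 quotes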
approach in pricing derivatives was devised than setting up However, the market crash of 1987 revealed its limitaeach time a riskless replicating portfolio that we described tions in predicting abnormal market conditions. Indeed, the above. Although the mathematics for the development of the appearance of phenomena like the pronounced “volatility Martingale method demand some basic knowledge of ab3 smile,” or daily price jumps that defied the assumption of stract measure theory that is outside the scope of this chapter, a lognormal distribution for the asset returns, called for alterwe will attempt to give a simple and intuitive explanation of native models that explain in a more parsimonious and genthis approach in this section. eral way all market prices. We have already mentioned the concept of a Martingale process. Let’s assume that at every time t, we can collect the entire “flow of information” about the asset in question; that is everything that it is known up to time t. This flow of 3 Volatility smile is a pattern of consistently higher implied volatilities information is formally symbolized by the so-called filtration for strikes set far from the at-the-money strike relative to the ATM im- fFt gt 0 . A stochastic process X .t/ is called a Martingale if plied volatility. That is, using the Black-Scholes equation in reverse, so it satisfies the relation as to recover the implied volatility from the market prices of options with different strikes, one would discover that each strike will yield a different number– see Das and Sundaram (1997) for an empirical study.
E ŒX .t/ jFs D X .s/ :
(30.63)
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
In other words, a process is a Martingale, if after collecting all the available information up to time s, the best forecast of the value of the process at any future time is no other than its current value. Of course, this fact is reminiscent of the main feature of the (weak form) of the efficient market hypothesis: past and present information is all included at the current price of an asset. Now let’s also assume that we have a stock whose price S .t/ is the underlying of the derivative we wish to price. For simplicity, we let S .t/ follow a Geometric Brownian motion described by the Itô stochastic differential equation dS .t/ D S .t; w/ dt C S .t; w/ dZ .t; w/
(30.64)
Dividing by S .t; w/ we get the elementary proportional change of its price equal to
459
In a risk-neutral world, the investors that populate it care only about expected returns and ignore the different risks that accompany the different assets, and so all the expected returns of the assets that span that economy should be equal. This means that in that world, the investors will expect no future growth (nor diminishment) of the stock value relative to the numeraire asset (assuming that both assets provide no income), since they bear no risk to justify any expected growth relative to this and therefore the expected value of the stocknumeraire ratio is equal to its current value. More formally, in a risk neutral world ˇ S .0/ S .T / ˇˇ F (30.68) D E 0 ˇ P .T; T / P .0; T / and because in our example P .T; T / D 1,
dS .t; w/ d .ln S .t; w// D dt C dZ .t; w/ : S .t; w/ (30.65) Assuming that we calculate its price in U.S. dollars, its expected proportional change (log-change) in price will be
dS .t/ E D dt S .t; w/
(30.66)
E Œ S .T / j F0 D
S .0/ : P .0; T /
(30.69)
Hence, we gather that in this risk-neutral world the normalized stock price follows a Martingale process, which translates into the following Itô process d
S .t/ P .t; T /
D 0 dt C
S .t/ P .t; T /
d ZQ .t; w/ (30.70)
and its variance equal to
dS .t/ Var D 2 dt: S .t; w/
(30.67)
Also let’s assume that we don’t measure the stock price in current dollars, but we measure it relative to a tradable asset that grows deterministically in time. For example, that could be a zero coupon bond that is traded today, it pays 1 U.S. dollar at time T , has constant yield R, and trades today at P .0; t /. Hence, instead of looking at the dollar stock price S .t/ at time t we look at the normalized “stock per zero coupon .t / bond price” PS.t;T / , and the zero coupon bond is called the numeraire asset. The difference between measuring the stock price in zero coupon bonds instead of in dollars is the following: A dollar today is (almost certainly) more valuable than a dollar in the future time T , simply because they represent different cashflows in the time/cashflow 2-dimensional space, in fact they are different assets. On the other hand, our zero coupon bond represents at all times until and including time T , the same claim at the same cashflow; that is, at all times remains one single asset. Hence, in the first case we measure the same instrument relative to two different assets, whereas in the second case we measure it consistently relative to a single asset.
where dZQ .t; w/ is a Wiener process. Of course, we do not assume that all investors are risk neutral or that all the assets of the economy grow at the same rate or that the stock price grows according to Equation (30.70). This is more like a “trick” to simplify valuation. Indeed, we can write Equation (30.68) like E
Q
ˇ Z C1 S .T / ˇˇ S .T; w/ F0 D ˇ P .T; T / 1 P .T; T /
fw;T jF0 .T; wjF0 / dw D
S .0/ P .0; T /
(30.71)
where f_{w,T|F_0}(T, w|F_0) is the conditional density function of the stock price S(T,w) at time T, and the integration runs across all possible random events w ∈ Ω. It can be proven that if there is a unique density function that satisfies Equation (30.71), then the market is arbitrage-free and complete – that is, all the contingent claims on the stock with a payoff at time T can be replicated by unique dynamic trading strategies between the two assets. What is more, if such a unique density function exists, then the replicating dynamic trading strategy is "previsible" – it is decided at each step t → t + dt conditioned on variables whose values are known at that time (for example, S(t) and σ(S,t)) – and "self-financing" – it requires no insertion of new wealth as time evolves from time t = 0 towards the maturity of the claim. The interested reader can look at Musiela and Rutkowski (1997), Karatzas and Shreve (2005) or Lamberton and Lapeyre (2007) for a rigorous treatment of the Martingale methods for derivative pricing. Also, Rebonato (1998, 2002) produces a very extensive exposition of the Martingale approach for the construction of relatively more advanced interest rate models.

Equation (30.68) is the cornerstone of the Martingale approach and the modern methods of pricing. In general, this method consists of the following steps:
1. First, we search for a numeraire asset with regard to which one can express the future payoff of the claim. For example, if the option has a cash payoff at time T, a natural choice for the numeraire asset is a money market account that is continuously rolled over, or a zero-coupon bond maturing at time T. On the other hand, if the deliverable of the option is a swap starting at time T, then the natural choice should be a forward-starting annuity, the payments of which coincide with the payments of the swap.
2. Next, we apply Equation (30.68), looking for the density function that satisfies it. That is, we look for the "measure" that makes the discounted price process of the underlying stock of the option a Martingale process. This density function/measure is also called the risk-neutral measure and is denoted by a capital letter, say Q. Then we write all expectations with regard to this measure as
E^Q[S(T)/P(T,T) | F_0] = S(0)/P(0,T)  (30.72)

3. If this measure Q exists and is unique, then the market is complete, as in Pliska (1997). Once it is discovered, one can price all the contingent claims on the discounted asset by calculating a simple expectation. Furthermore, the price process of each contingent claim can be replicated in a unique way by dynamically trading the two assets. Indeed, all the contingent claims X[S(T)] that have a payoff at time T are just a deterministic function of the underlying instrument. This leads us to the conclusion that the discounted payoff of the claim is also a Martingale process. Hence, it can be priced by

V(0, S(0)) = P(0,T) E^Q[ X(S(T))/P(T,T) | F_0 ].  (30.73)

It is obvious that for different assets chosen as the numeraire asset, the induced measure will be different. However, in a complete market all the measures will produce the same price for the contingent claim.

The way to find this risk-neutral measure is derived from the following argument: we want to build the stochastic differential equation for the normalized stock price S(t)/P(t,T); that is, we want to calculate d[S(t)/P(t,T)]. In our case the zero-coupon price is deterministic; that is, it can be regarded as a stochastic process with zero volatility, and by applying the product rule we get

d[S(t)/P(t,T)] = S(t) d[1/P(t,T)] + [1/P(t,T)] dS(t).  (30.74)

Of course, in the general case the numeraire asset need not be deterministic. The application of Itô's Lemma with the multiplication rules of Equations (30.8) and (30.9) helps us handle the general case; for reasons of simplicity, however, we choose in our exposition to employ a deterministic numeraire asset. We write Equation (30.74) using the convention of continuous compounding for the price of the zero coupon (P(t,T) = e^{−R(T−t)}),

d[S(t)/P(t,T)] = −R[S(t)/P(t,T)]dt + [1/P(t,T)]dS(t)

and by substituting for dS(t) according to (30.64) we finally get

d[S(t)/P(t,T)] = −R[S(t)/P(t,T)]dt + [1/P(t,T)](μS(t)dt + σS(t)dZ(t)).

Or

d[S(t)/P(t,T)] = [S(t)/P(t,T)](μ − R)dt + [S(t)/P(t,T)]σdZ(t)
= [S(t)/P(t,T)][(μ − R)dt + σdZ(t)]  (30.75)

It then follows that the so-called "change of measure" is nothing other than requiring that [(μ − R)/σ]dt + dZ(t) = dZ̃(t) is a Wiener process, so that Equation (30.75) is a Martingale process according to the above. From Equation (30.75) one can see that the difference between Equations (30.64) and (30.75) is only the drift of the process, since the volatilities of the two processes are identical.

Also, because the normalized stock price is a Martingale, Equation (30.64) becomes

S(t)/P(t,T) = [S(0)/P(0,T)] exp(−½σ²t + σZ̃(t)) ⟹ S(t) = [P(t,T)S(0)/P(0,T)] exp(−½σ²t + σZ̃(t))  (30.76)

It is worth noting here that the generality of the Martingale approach to pricing lies in the fact that the numeraire asset – a tradable instrument linked to deterministic rates in the Black–Scholes economy – can in general be stochastic. In truth, any asset can be used as the numeraire asset, and provided that the market is complete and offers no opportunity for arbitrage, Equation (30.73) will yield a unique probability density function that satisfies it. One can observe the similarity between fixed-income pricing and the Martingale method: in a complete and arbitrage-free fixed-income market, all bonds, swaps, FRAs, and other linear instruments imply a unique term structure that prices exactly all of them. The extension of this view to the options market is that, if similar conditions apply to this market too (complete and arbitrage-free), then there exists a unique density function that prices all the market options exactly.

Sample Problem 30.5. Deriving the Black–Scholes Formula for a Call Option with the Martingale Approach. Let us define the numeraire asset to be a zero-coupon bond P(t,T) = e^{−R(T−t)} whose maturity coincides with the maturity T of the option. The payoff of the option is max(S(T) − K, 0). According to the above, let Q be the risk-neutral measure such that the discounted stock process S(t)/P(t,T) is a Martingale, so that

E^Q[S(T)/P(T,T) | F_0] = S(0)/P(0,T).  (30.77)

Under this measure, all the discounted contingent claims on the stock price are also Martingales, which, by also applying (30.76), gives us

C/P(0,T) = E^Q[ max(S(T) − K, 0)/P(T,T) | F_0 ]
⟹ C = P(0,T) E^Q[ max( [S(0)/P(0,T)] exp(−½σ²T + σZ̃(T)) − K, 0 ) | F_0 ]

Now Z̃(T) = Z̃(T) − Z̃(0) is the increment of a Wiener process; that is, it follows the normal distribution with zero mean and variance equal to T. Therefore,

C = P(0,T) ∫_{−∞}^{∞} max( S(0) exp[(R − ½σ²)T + σ√T x] − K, 0 ) φ(x) dx

where φ(x) denotes the standard normal density.
T C ZQ .T / K; 0 jF0 2 D P .0; T / E Q max .S .0/ exp
2 T C ZQ .T / K; 0 jF0
R 2 (30.78)
C y
p
T
9
exp y 2 = 2 K; 0 p dy: 2 ; (30.79)
The option is exercised only when the final stock price is greater than the strike; that is
p 2 S .0/ exp R T C y T K 2
p K 2 T C y T ) exp R 2 S .0/
2
p S .0/ T y T ln ) R 2 K
2
S .0/ ln C R T K 2 )y p D d2 T (30.80) We can then write Equation (30.79) as 8 ˆ ˆ
< 2 S .0/ exp R C D P .0; T / T ˆ 2 ˆ : d 2 2 9 y > > exp = p 2 dy C y T K p > 2 > ; Z1
Z1 1 y2 D P .0; T / p S .0/ exp 2 2
d 2 2 p T dyC C y T C R 2 2
Zd 2 1 y P .0; T / p dy K exp 2 2 D I1 I2
1
(30.81) The first integral is calculated by “completing the square.” Indeed,
462
G. Chalamandaris and A.G. Malliaris
p y2 C y T 2
p 1 h 2 T D y2 2 y T C R 2 2 i 2 R T C 2 T p 2 1 D y T 2RT 2 p
(30.82)
We employ the change of variable x D y T ) y D p x C T and I1 becomes 1 I1 D P .0; T / S .0/ exp .R T / p 2
Z1 1 exp x 2 dx 2 p .d 2C T / p d 2CZ T Dd1
1 1 2 D S .0/ p exp x dx 2 2 D S .0/ N .d1 /
Y .0/ Y .T / jF0 D : X .T / X .0/
(30.87)
It is also very easy to check that the sum of the two Wiener processes is a Wiener process itself dZ .t/ D dZ 1 .t/ C N dZ 2 .t/
(30.88)
(30.84)
are independent with the previous one, they follow the normal distribution, and p they have a mean 0, and standard deviation t .
thus reaching the Black–Scholes result. Sample Problem 30.6. Pricing an Option to Exchange One Asset for Another Let us consider two tradable assets with respective prices X .t/ and Y .t/ that provide no income. In this section we are pricing the option that permits us, if we hold the asset X .t/, to exchange it for Y .t/; that is, the payoff of the option is max.Y .t/ X .t/ ; 0/. Let’s also assume that the maturity of the option is T . The natural choice for the numeraire asset is X .t/ – the asset we hold. Since both assets are tradable without any dividends, Equation (30.72) tells us that there exists a unique density function/measure Q that satisfies
X Y D : X Y
(30.83) The second integral is simply equal to
EQ
D
p with N D 1 2 , since the successive increments of the new process
1
I2 D P .0; T / K N .d2 / ;
In Equation (30.86) the driving Wiener processes Z1 .t/ and Z2 .t/ are independent. However, we observe that although X .t/ is driven only by Z1 .t/ ; Y .t/ is driven by both. The result is that the returns of Y .t/ become correlated with the returns of X .t/. Indeed, it very easy to check from the definition of correlation that dX .t/ dY .t/ E X .t/ Y .t/ v " " u
2 #
2 # u dX .t/ dY .t/ tE X dt Y dt E X .t/ Y .t/
According to the previous, we do know that the solutions to these stochastic differential equations are
2 x x t C x Z1 .t/ 2
2 Y .t/ D Y .0/ exp Y Y t C Y Z1 .t/ 2 (30.89) CY N Z2 .t/
X .t/ D X .0/ exp
By dividing the Equations in 30.89, we get the ratio
2 Y 2 Y x x t 2 2 C .Y x / Z1 .t/ C Y N Z2 .t/ :
Y .t/ Y .0/ D exp X .t/ X .0/
(30.85)
(30.90) This is the density function that ensures that the current price of this ratio is fair. For reasons of simplicity we let the dynamics of the two assets be described by the following Itô processes
dX .t/ D X X .t/ dt C CX X .t/ dZ 1 .t/
d
dY .t/ D Y Y .t/ dt C CY Y .t/ dZ 1 .t/ CY N Y .t/ dZ 2 .t/
If we consider Equation (30.90) as a bivariate function u D f .Z1 ; Z2 / and we apply the Itô’s Lemma 30.18 with the multiplication rules of Equation (30.19) we have
(30.86)
Y .t/ X .t/
D
@f @f @f dt C dZ 1 C dZ 2 @t @Z1 @Z2 C
1 @2 f 1 @2 f dt C dt 2 2 @Z1 2 @Z2 2
(30.91)
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
The partial derivatives are Equations (30.92a) to (30.92b):
presented
below
from
2 Y .t/ @f Y x2 D Y x (30.92a) @t 2 2 X .t/ Y .t/ @f D .Y x / @Z1 X .t/
(30.92b)
463
According to the Martingale approach, we need a density probability function f Q .t; w//measure makes Equation (30.95) a Martingale process, or lently, makes the following normalized process, a process h
Y x
(30.92c)
Y .t/ @2 f D .Y x /2 2 X .t/ @Z1
(30.92d)
@2 f Y .t/ D Y2 N2 2 X .t/ @Z2
(30.92e)
Y .t/ X .t/ Y .t/ X .t/
2 x 2 Y D Y x 2 2 1 1 2 2 2 C .Y x / C Y dt 2 2 C .Y x / dZ 1 C Y dZ 2 (30.93)
N 2 2
i dt C dZ .t/ D d ZQ .t/ (30.96)
V .0/ max .Y .T / X .T / ; 0/ D EQ X .0/ X .T /
Y .T / 1; 0 D E Q max X .T /
(30.97)
This last equation denotes a simple call option on the asset Y .t / with strike equal to one. X .t / Furthermore, the dynamics of this asset satisfies the assumptions of the Black–Scholes model, which means that we can apply the formula if we make the following substitutions: Y .t/ ! S .t/ 1 ! K 1 ! P .t; T / X .t/ !
(30.98)
0!R
That way we reach the result
D .Y x / E ŒdZ 1 C Y N E ŒdZ 2 D 0 h i E ..Y x / dZ 1 C Y N dZ 2 /2 i h D .Y x /2 E dZ 1 2 C Y 2 N2 E dZ2 2 C 0 D Y 2 2 C x 2 2 Y x CY 2 1 2 dt D Y 2 C x 2 2 Y x dt D N 2 dt
V .0/ V .0/ D N .d1 / N .d2 / X .0/ X .0/
(30.99)
with ln d1 D
2 Y .0/ C T X .0/ 2 p T
ln d2 D
2 Y .0/ T X .0/ 2 p : T (30.100)
(30.94)
We can then turn Equation (30.93) into a simpler equation of only one stochastic factor Y .t / X .t / Y .t / X .t /
C
V .0/ Q V .X .T / ; Y .T // DE X .0/ X .T /
E Œ.Y x / dZ 1 C Y N dZ 2
d
Under this measure Q, the prices of all the contingent claims on this ratio of the assets are Martingales and can each be replicated with a unique, self-financing and previsible trading strategy between the two assets. In other words,
)
Since we have only one asset under consideration – the ratio Y .t / of X .t / – it is perhaps redundant to describe its dynamics with two driving Wiener processes. The sum of the increments of the two processes follows a normal distribution with mean and variance
Y 2 x 2 2
N
@f Y .t/ D Y N @Z2 X .t/
We substitute in Equation (30.91) to get the stochastic differential equation of the ratio of the assets; that is, d
to find Q that equivaWiener
D Y x
Y 2 x 2 2 2
N 2 C dt CN dZ: 2 (30.95)
30.8 Remarks on Option Pricing For a review on the early literature on option pricing, see the two papers by Smith (1976, 1979). It is appropriate here to make a few remarks on the Black–Scholes option-pricing model to clarify its significance and its limitation.
464
First, the Black–Scholes model for a European call as originally derived, and as reported here, is based on several simplifying assumptions. 1. The stock price follows an Itô equation. 2. The market operates continuously. 3. There are no transaction costs in buying or selling the option or the underlying stock. 4. There are no taxes. 5. The riskless rate is known and constant. 6. There are no restrictions on short sales. Several researchers have extended the original Black– Scholes model by modifying these assumptions. Merton (1973) generalizes the model to include dividend payments, exercise-price changes, and the case of a stochastic interest rate. Roll (1977) has solved the problem of valuing a call option that can be exercised prior to its expiration date when the underlying stock is assumed to have made known dividend payments before the option matures. Ingersoll (1976) studies the effect of differential taxes on capital gains and income while Scholes (1976) determines the effects of the tax treatment of options on the pricing model. Furthermore, Merton (1976) and Cox and Ross (1976) show that if the stock-price movements are discontinuous, under certain assumptions the valuation model still holds. Derman et al. (1996) replaced the assumption of constant volatility with the concept of the local volatility – volatility that is stochastic in the sense that depends on the asset price – but also not fully stochastic since this dependence is deterministic. This assumption helped explain the existence of volatility smiles but with results that lacked economic intuition. Following a different direction Hull and White (1987) examined the effects of stochastic volatility and the capacity of the extended model to fit parsimoniously market prices. Also, Leland (1985) studied the consequences of introducing transaction costs in the replication of options. These and other modifications of the original Black–Scholes model are quite robust with respect to the relaxation of its fundamental assumptions. Second, it is worth repeating that the use of Itô’s calculus and the important insight concerning the appropriate concept of an equilibrium by creating a risk-free hedge portfolio have let Black and Scholes obtain a closed-form solution for option pricing. In this closed-form solution several variables do not appear, such as (1) the expected rate of return of the stock, (2) the expected rate of return of the option, (3) a measure of investor’s risk preference, (4) investor expectations, and (5) equilibrium conditions for the entire capital market. Third, the Black–Scholes pricing model has found numerous applications. Among these are the following: (1) pricing the debt and equity of a firm; (2) the effects of corporate policy and, specially, the effects of mergers, acquisitions, and scale expansions on the relative values of the debt and equity
G. Chalamandaris and A.G. Malliaris
of the firm; (3) the pricing of convertible bonds; (4) the pricing of underwriting contracts; (5) the pricing of leases; and (6) the pricing of insurance. Smith (1976, 1979) summarizes most applications and indicates the original reference. See also Brealey and Myers (1996). Fourth, Black (1976) shows that the original call-option formula for stocks can be easily modified to be used in pricing call options on futures. The formula is C T; F; 2 ; X; r D erT ŒFN .d1 / XN .d2 / (30.101) d1 D
ln .F =X / C 12 2 T p T
(30.102)
d2 D
ln .F =X / 12 2 T p T
(30.103)
In Equation (30.101) F now denotes the current futures price. The other four variables are as before – time to maturity, volatility of the underlying futures price, exercise price, and risk-free rate. Note that Equation (30.101) differs from Equation (30.53) only in one respect: by substituting erT F for S in the original Equation (30.53) Equation (30.101) is obtained. This holds because the investment in a futures contract is zero, which causes the interest rate in Equations (30.102) and (30.103) to drop out. Another easy way to reach the same conclusion is to define as the numeraire asset the zero coupon bond that matures at the expiry of the option. With this numeraire asset, the process of the discounted futures price shows no drift under the risk-neutral measure. Fifth, three important papers by Harrison and Kreps (1979) and Kreps (1981, 1982) consider some foundational issues that arise in conjunction with the arbitrage theory of option pricing. The important point to consider is this: the ability to trade securities frequently can enable a few multiperiod securities to span many states of nature. In the Black–Scholes theory there are two securities and many uncountable states of nature, but because there are infinitely many trading opportunities and because uncertainty resolves nicely, markets are effectively complete. Thus, even though there are far fewer securities than states of nature, markets are complete and risk is allocated efficiently. An interesting result of Harrison and Kreps (1979) is that certain self-trading strategies can create something out of nothing when there are infinitely many trading opportunities. The doubling strategies are the well-known illustrations of this phenomenon. Harrison and Kreps introduce the concept of a simple strategy to eliminate free lunches and conjecture that a nonnegative wealth constraint could rule out the doubling strategies. Duffie and Huang (1985) give an interpretation of admissible strategy as a limit of a sequence of simple strategies and use an integrability condition on the trading strategies. These latter papers helped develop the Martingale approach and established the equivalence of the absence of
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
arbitrage in complete markets with the existence of a unique Martingale measure for a given numeraire asset. See Pliska (1997) for a very clear discussion on this issue and the effects of market incompleteness in pricing. The first results in this direction, in a continuous-time setup, were obtained by Stricker (1990) and Ansel and Stricker (1992). These results were extended in various directions by Delbaen (1992), Schweizer (1992), Lackner (1993), Delbaen and Schachermayer (1994) and others. A very extensive and rigorous exposition of the Martingale approach is given by Musiela and Rutkowski (1997), whereas a more heuristic application of the method can be found in Rebonato (1998, 2004). For a detailed survey of numerous empirical tests, see Galai (1983) for stock options, Shastri and Tandon (1986) and Bodurtha and Courtadon (1987) for currency options and Chance (1986) for the market prices of index options. Finally for analytical results regarding the pricing of numerous exotic options, see Haug (1998).
30.9 Conclusion This chapter has discussed the basic concepts and equations of stochastic calculus (Itô’s calculus), which has become a very useful tool in understanding finance theory and practice. By using these concepts and equations, the manner in which Black and Scholes derived their famous optionpricing model, together with the more modern approach of the Martingale measure, was also illustrated. Although this chapter is not required to understand the basic ingredients of security analysis and portfolio management, it is useful for those with training in advanced mathematics to realize how advanced mathematics can be used in finance.
References Ansel, J.P. and C. Stricker. 1992. “Lois de martingale, densities et decompositions de Foellmer Schweizer.” Annales de l’Institut Henri Poincare (B) Probability and Statistics 28, 375–392. Arnold, L. 1974. Stochastic differential equation: theory and applications, Wiley, New York. Baxter, M. and A. Rennie. 1996. Financial calculus, Cambridge University Press, Cambridge. Bhattacharya, M. 1980. “Empirical properties of the Black–Scholes formula under ideal conditions.” Journal of Financial Quantitative Analysis 15, 1081–1105. Bjork, T. 1998. Arbitrage theory in continuous time, Oxford University Press, Oxford. Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–654. Black, F. 1976. “The pricing of commodity contracts.” Journal of Financial Economics 3, 167–178.
465
Bodurtha, J.N. and G.R. Courtadon. 1987. “Tests of an American option pricing model on the foreign currency options market.” Journal of Financial and Quantitative Analysis 22, 153–168. Boyler, P.P. and A. L. Ananthanarayanan. 1977. “The impact of variance estimation in option valuation models.” Journal of Financial Economics 5, 375–387. Brealey, R., and S. Myers. 1996. Principles of corporate finance, 5th ed., McGraw-Hill, New York. Britten-Jones, M. and A. Neuberger. 1996. “Arbitrage pricing with incomplete markets.” Applied Mathematical Finance 3, 347–363. Chance, D. M. 1986. “Empirical tests of the pricing of index call options.” Advances in Futures and Options Research 1(A), 141–166. Chance, D. 2006. An introduction to derivatives and risk management, Thompson South Western, Mason. Chiras, D. and S. Manaster. 1978. “The information content of option prices and a test of market efficiency.” Journal of Financial Economics 6, 213–234. Cox, D.R. and H.D. Miller. 1990. The theory of stochastic processes, Chapman & Hall, New York. Cox, J. C. and S. A. Ross. 1976. “The valuation of options for alternative stochastic processes.” Journal of Financial Economics 3, 145–226. Cox, J. C. and M. Rubinstein. 1985. Options markets, Prentice-Hall, New Jersey. Das, S.R. and R.K. Sundaram. 1997. Of smiles and smirks: a term structure perspective, working paper, Harvard Business School and Stern School of Business, NYU, New York. Delbaen, F. 1992. “Representing martingale measures when asset prices are continuous and bounded.” Mathematical Finance 2, 107–130. Delbaen, F. and W. Schachermayer. 1994. “A general version of the fundamental theorem of asset pricing.” Mathematical Annals 300, 463–520. Derman, E., I. Kani, and J. Zhou. 1996. “The local volatility surface: unlocking the information in index option prices.” Financial Analyst’s Journal 52, 25–36. Duffie, D. 1996. Dynamic asset pricing theory, 2nd ed., Princeton University Press, New Jersey. Duffie, D. and C. Huang. 1985. “Implementing arrow-debreu equilibria by continuous trading of few long-lived securities.” Econometrica 53, 1337–1356. Dupire, B. 1994. “Pricing with a smile.” Risk 7, 32–39. Fama, E. 1970. “Efficient capital markets: a review of theory and empirical work.” Journal of Finance 25, 383–417. Fama, E. 1991. “Efficient capital markets: II.” Journal of Finance 46, 1575–1617. Galai, D. 1983. “A survey of empirical tests of option-pricing models.” in Option pricing, M. Brenner (Ed.). Lexington Books, Lexington, pp. 45–80. Garven, J. R. 1986. “A pedagogic note on the derivation of the Black– Scholes option pricing formula.” The Financial Review 21, 337–344. Gikhman, I. and A. V. Skorokhod. 1969. Investment to the theory of random processes, Saunders, New York. Harrison, J. M. and D. M. Kreps. 1979. “Martingales and arbitrage in multiperiod securities markets.” Journal of Economic Theory 20, 381–408. Harrison, J. and S. Pliska. 1981. “Martingales and stochastic integrals in the theory of continuous trading.” Stochastic Processes and Their Application 2, 261–271. Haug, E.G. 1998. Option pricing formulas, McGraw-Hill, New York. Heston, S.L. 1991. “A closed form solution for options with stochastic volatility with application to bond and currency options.” Review of Financial Studies 6, 327–343. Hull, J. and A. White. 1987. “The pricing of options on assets with stochastic volatilities.” Journal of Finance 42, 281–300. Hull, J. 2006. Options, futures and other derivatives, 6th ed., Pearson Prentice Hall, New Jersey.
466 Ingersoll, J. E. 1976. “A theoretical and empirical investigation of the dual purpose funds: an application of contingent claims and analysis.” Journal of Financial Economics 3, 83–123. Jarrow, R. and A. Rudd. 1983. Option pricing, R. D. Irwin, IL. Karatzas, I. and S. Shreve. 2005. Brownian motion and stochastic calculus, Corrected Eighth Printing, Springer, New York. Kreps, D. M. 1981. “Arbitrage and equilibrium in economics with infinitely many commodities.” Journal of Mathematical Economics 8, 15–35. Kreps, D. M. 1982. “Multiperiod securities and the efficient allocation of risk: a comment on the Black–Scholes Option pricing model.” in The economics of information and uncertainty, J. J. McCall (Ed.). University of Chicago Press, Chicago. Lackner, P. 1993. “Martingale measure for a class of right continuous processes.” Mathematical Finance 3, 43–53. Lamberton, D. and B. Lapeyre. 2007. Introduction to stochastic calculus applied to finance, Chapman & Hall, New York. Latané, H. A. and R. J. Rendleman, Jr. 1976. “Standard derivations of stock price ratios implied in option prices.” Journal of Finance 31, 369–381. Leland, H.E. 1985. “Option pricing and replication with transaction costs.” Journal of Finance 40, 1283–1301. MacBeth, J. D. and L. J. Merville. 1979. “An empirical examination of the black–scholes call option pricing model.” Journal of Finance 34, 1173–1186. Malliaris, A. G. 1983 “Itô’s calculus in financial decision making.” Society of Industrial and Applied Mathematical Review 25, 482–496. Malliaris, A. and W. A. Brock. 1982. Stochastic methods in economics and finance, North-Holland, Amsterdam. McDonald, R. 2005. Derivatives markets, Pearson Addison Wesley, Boston. Merton, R. C. 1973. “The theory of rational option pricing.” Bell Journal of Economics and Management Science 4, 141–183. Merton, R. C. 1976. “Option pricing when underlying stock returns are discontinuous.” Journal of Financial Economics 3, 125–144. Merton, R. C. 1982. “On the mathematics and economics assumption of continuous-time model.” in Financial Economics: Essay in Honor of Paul Cootner, W. F. Sharpe and C. M. Cootner (Ed.). Prentice-Hall, New Jersey, 19–51. Merton, R. C. 1992. Continuous-time finance, Wiley-Blackwell, New York. Musiela, M. and M. Rutkowski 1997. Martingale methods in financial modeling, Springer, New York. Neftci, S. 2000. Introduction to the mathematics of financial derivatives, Academic Press, New York. Oksendal, B. 2007. Stochastic differential equations, 5th ed., Springer, New York. Pliska, S.R. 1997. Introduction to mathematical finance, Blackwell, Oxford. Rebonato, R. 1998. Interest rate option models, 2nd ed., Wiley, New York. Rebonato, R. 2002. Modern pricing of interest rate derivatives, Princeton University Press, New Jersey. Rebonato, R. 2004. Volatility and correlation: the perfect hedger and the fox, 2nd ed., Wiley, New York. Roll, R. 1977. “An analytic valuation formula for unprotected American call options on stocks with known dividends.” Journal of Financial Economics 5, 251–281. Rubinstein, M. 1976. “The valuation of uncertain income streams and the pricing of options.” Bell Journal of Economics 7, 407–425. Scholes, M. 1976. “Taxes and the pricing of options.” Journal of Finance 31, 319–332. Schweizer, M. 1992. “Martingale densities for general asset prices.” Journal of Mathematical Economics 21, 363–378.
G. Chalamandaris and A.G. Malliaris Shastri, K. and K. Tandon. 1986. “Valuation of foreign currency options: some empirical results.” Journal of Financial and Quantitative Analysis 21, 377–392. Smith, C. W., Jr. 1976. “Option pricing: a review.” Journal of Financial Economics 3, 3–51. Smith, C. W., Jr. 1979. “Applications of option pricing analysis.” in Handbook of financial economics, J. L. Bicksler (Ed.). North-Holland, Amsterdam, pp. 79–121. Stricker, C. 1990. “Arbitrage et Lois de Martingale.” Annales de l’Institut Henri Poincare (B) Probability and Statistics 26, 451–460. Taleb, N. 1997. Dynamic hedging, Wiley, New York. Wilmott, P. 2000. Paul Wilmott on quantitative finance, Wiley, New York.
Appendix 30A An Alternative Method To Derive the Black–Scholes Option-Pricing Model Perhaps it is unclear why it is assumed that investors have risk-neutral preferences when the usual assumption in finance courses is that investors are risk averse. It is feasible to make this simplistic assumption because investors are able to create riskless portfolios by combining call options with their underlying securities. Since the creation of a riskless hedge places no restrictions on investor preferences other than nonsatiation, the valuation of the option and its underlying asset will be independent of investor risk preferences. Therefore, a call option will trade at the same price in risk-neutral economy as it will in a risk-averse or risk-preferent economy.
30A.1 Assumptions and the Present Value of the Expected Terminal Option Price To derive the Black–Scholes formula it is assumed that there are no transaction costs, no margin requirements, and no taxes; that all shares are infinitely divisible; and that continuous trading can be accomplished. It is also assumed that the economy is risk neutral. In the risk-neutral assumptions of Cox and Ross (1976) and Rubinstein (1976), today’s option price can be determined by discounting the expected value of the terminal option price by the riskless rate of interest. As was seen earlier, the terminal call-option price can take on only two values: St X if the call option expires in the money, or 0 if the call expires out of the money. So today’s call option price is C D exp .rt/ Max .St X; 0/ where: C D the market value of the call option; r D riskless rate of interest; t D time to expiration;
(30A.1)
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
St D the market value of the underlying stock at time t; and X D exercise or striking price. Equation (30A.1) says that the value of the call option today will be either St X or 0, whichever is greater. If the price of stock at time t is greater than the exercise price, the call option will expire in the money. This simply means that an investor who owns the call option will exercise it. The option will be exercised regardless of whether the option holder would like to take physical possession of the stock. If the investor would like to own the stock, the cheapest way to obtain the stock is by exercising the option. If the investor would not like to own the stock, he or she will still exercise the option and immediately sell the stock in the market. Since the price the investor paid .X / is lower that the price he or she can sell the stock for .St /, the investor realizes an immediate the profit of St X . If the price of the stock .St / is less than the exercise price .X /, the option expires out of the money. This occurs because in purchasing shares of the stock the investor will find it cheaper to purchase the stock in the market than to exercise the option. Assuming that the call option expires in the money, then the present value of the expected terminal option is equal to the present value of the difference between the expected terminal stock price and the exercise price, as indicated in Equation (30A.2): C D exp .rt/ E ŒM ax .St X; 0/ Z 1 .St X/ h .St / dSt (30A.2) D exp .rt/ x
where h .St / is the log normal density function of St . To evaluate the integral in (30A.2) rewrite it as the difference between two integrals: Z
Z
1 x
1
St h .St / dSt X
C D exp .rt /
h .St / dSt x
D Ex .St / exp .rt / X exp .rt/ Œ1 H .X / (30A.3) where: Ex .St / D the partial expectation of St ; truncated from below at xI and H .X / D the probability that St X: Equation (30A.3) says that the value of the call option is present value of the partial expected stock price (assuming the call expires in the money) minus the present value of the exercise price (adjusted by the probability that the stock’s price will be less than the exercise price at the expiration of the option). The terminal stock price St , can be rewritten
467
as the product of the current price .S / and the t-period log normally distributed price ration St =S , so St D S .St =S /. Equation (30A.3) can also be rewritten:
St dSt St g S S x=s S
Z 1 St dSt g X S S x=s
St D S exp .rt / Ex=S S X X exp .rt / 1 G (30A.4) S
Z C D exp .rt/ S
1
where:
St g D log normal density function of St =S I S
St D the partial expextation of St =S; Ex=S S truncated from below at x=S I X D the probability that St =S X =S: G S
30A.2 Present Value of the Partial Expectation of the Terminal Stock Price The right-hand side of Equation (30A.4) is evaluated by considering the two integrals separately. The first integral, S exp .rt/ Ex=S .St =S /, can be solved by assuming the return on the underlying stock follows a stationary random walk. That is, St D exp .Kt/ (30A.5) S where K is the rate of return on the underlying stock per unit of time. Taking the natural logarithm of both sides of Equation (30A.5) yields:
St D .Kt/ ln S Since the ratio St =S is log normally distributed, it follows that Kt is log normally distributed with density f .Kt/, mean K t, and variance K2 t. Because St =S D exp .Kt/, the differential can be rewritten, d St =S D exp .Kt/ tdK. g .St =S / is a density function of a log normally distributed variable St =S ; so it can be transformed into a density function of a normally distributed variable Kt according to the relationship St =S D exp .Kt/ as: g
St S
D f .Kt/
St S
(30A.6)
468
G. Chalamandaris and A.G. Malliaris
These transformations allow the first integral in Equation (30A.4) to be rewritten: S exp .rt / Ex=S Z
St S
f .Kt/ exp .Kt/ t dK:
ln.x=S /
Because Kt is normally distributed, the density f .Kt/ with mean K t and variance K2 t is 1=2 1 f .Kt/ D 2 K2 t exp .Kt K t/2 K2 2 Substitution yields: S exp .rt / Ex=S Z
St S
1=2 D S exp .rt / 2 K2 t
1
exp ŒKt exp ln.x=S /
1 2 2
.Kt K t/ K t t dK 2
St S
S exp .rt / Ex=S
1=2 St D SE exp .rt/ 2 K2 t Z 1S 2. 2 1 2
exp .Kt/exp Kt K C K t K t 2 ln.x=S / (30A.9) Since the equilibrium rate of return in a risk-neutral economy is the riskless rate, E .St =S / may be rewritten as exp .rt/:
St SE S
exp .rt/ D S exp .rt / exp .rt / D S
So Equation (30A.9) becomes (30A.7)
Equation (30A.7)’s integrand can be simplified by adding the terms in the two exponents, multiplying and dividing the re sult by exp 12 K2 t . First, expand the term .Kt K t/2 and factor out t so that: 1 2 2 exp ŒKt exp .Kt K t / K t 2 Next, factor out t so: ı 1 exp .Kt/ exp t K 2 2K K C 2K K2 2
Now combine the two exponents:
(30A.8)
In Equation (30A.8), exp K C 12 K2 t D E .St =S /, the mean of the t-period log normally distributed price ration St =S . So, Equation (30A.7) becomes:
1
D S exp .rt /
2 1 K2 t
exp Kt K C K2 t 2
ı 1 exp t .K 2 2K K C 2K 2K2 K/ K2 2
1=2 St D S 2 K2 t S exp .rt / Ex=S S Z 1 2 . 2 1 2 K t t dK
exp Kt K C K t 2 ln.x=S/
(30A.10) To complete the simplification of this part of the Black– Scholes formula, define a standard normal random variable y: . y D Kt K C K2 t K2 t 1=2 : Solving for Kt yields: Kt D K C K2 t C K t 1=2 y and therefore: t dK D K t 1=2 dy:
Now, multiply and divide this result by exp 12 K2 t to get:
By making the transformation from Kt to y the lower limit of integration becomes
. ln .x=S / K C K2 t K t 1=2 :
ı 2 1 2 2 2 4 4 exp t .K 2K K C K 2K K C K K / K Further simplify the integrand by noting that the assumption 2 of a risk neural economy implies: Next, rearrange and combine terms to get:
1 2
h exp K C K t D exp .rt / i. 2 1 2 exp t K2 K K K2 K4 2K K2 2 Taking the natural logarithm of both sides yields:
1 2 D exp K C K t
2 1 2 K C K t D .rt/ 2
30 Itô’s Calculus and the Derivation of the Black–Scholes Option-Pricing Model
Hence, K C 12 K2 t D r C 12 K2 t. The lower limit of integration is now:
1 K t 1=2 D d1 : ln .S =x/ C r C K2 t 2 Substituting this into Equation (30A.10) and making the transformation to y yields:
,
Z 1 St 1 2 .2 /1=2 dy DS S exp .rt/ Ex=S exp y S 2 d1 Since y is a standard normal random variable (distribution is symmetric around zero) the limits of integration can be exchanged: S exp .rt / Ex=S
St S
DS
R d1 1
. exp 12 y 2 .2 /1=2 dy
D SN .d1 / (30A.11)
469
The integrand is now simplified by following the same procedure used in simplifying the previous integral. Define a standard normal random variable Z:
Kt K t ZD K t 1=2
Solving for Kt yields: Kt D K t C K t 1=2 Z and t dK D K t 1=2 dZ. Making the transformation from Kt to Z means the lower limit of integration becomes ln .X =S / K t : K t 1=2 Again, note that the assumption of a risk-neutral economy implies:
1 2 exp K C K t D exp .rt / 2
where N .d1 / is the standard normal cumulative distribution function evaluated at y D d1 .
Taking the natural logarithm of both sides yields:
1 K C K2 t D rt 2
30A.3 Present Value of the Exercise Price under Uncertainty To complete the derivation, the integrals that corresponds to the term X exp .rt/ Œ1 G .X =S / must be evaluated. Start by making the logarithmic transformation: ln
St S
or:
1 2 K t D r K t: 2
Therefore, the lower limit of integration becomes:
D Kt
This transformation allows the rewriting of g .St =S / to .S =St / f .Kt/ as mentioned previously. The differential can be written: St D exp .Kt/ t dK d S Therefore,
ln .S =x/ C r 12 K2 t D d1 K t 1=2 1=2 K t D d2 Substitution yields: Z
1
x exp .rt/ Œ1 G .X =S/ D x exp .rt /
exp d2
Z d2 1
Z 2 .2 /1=2 dZ D x exp .rt/ exp 2 1 1 2 1=2
Z .2 / dZ D x exp .rt/ N .d2 / 2
X exp .rt/ Œ1 G .X =S / Z
1
D X exp .rt /
f .Kt/ t dK ln.X =S/
D X exp .rt /
1=2 2 K2 t
1 .Kt K t/2 2
Z
1
exp
(30A.13)
ln.X =S /
K2 t t dK
(30A.12)
where N .d2 / is the standard normal cumulative distribution function evaluated at Z D d2 .
470
G. Chalamandaris and A.G. Malliaris
Substituting, Equation (30A.11) and (30A.13) into Equation (30A.4) completes the derivation of the Black– Scholes formula: C D SN .d1 / X exp .rt / N .d2 /
(30A.14)
This appendix provides a simple derivation of the Black– Scholes call-option pricing formula. Under an assumption of risk neutrality the Black–Scholes formula was derived using only differential and integral calculus and a basic knowledge of normal and log normal distributions.
Chapter 31
Constant Elasticity of Variance Option Pricing Model: Integration and Detailed Derivation Y.L. Hsu, T.I. Lin, and C.F. Lee
Abstract In this paper we review the renowned Constant Elasticity of Variance (CEV) option pricing model and give the detailed derivations. There are two purposes of this article. First, we show the details of the formulae needed in deriving the option pricing and bridge the gaps in deriving the necessary formulae for the model. Second, we use a result by Feller to obtain the transition probability density function of the stock price at time T given its price at time t with t < T . In addition, some computational considerations are given which will facilitate the computation of the CEV option pricing formula. Keywords Constant elasticity of variance model r Noncentral Chi-square distribution r Option pricing
t with t < T . We also showed the details of the formulae needed in deriving the option pricing. A proof of Feller’s result is given in the Appendix.
31.2 The CEV Diffusion and Its Transition Probability Density Function The CEV option pricing model assumes that the stock price is governed by the diffusion process dS D Sdt C S ˇ=2 dZ;
Y.L. Hsu () and T.I. Lin Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taichung, Taiwan e-mail: [email protected] C.F. Lee Graduate Institute of Finance, National Chaio Tung University, Hsinchu, Taiwan and Department of Finance, Rutgers University, New Brunswick, NJ, USA
(31.1)
where dZ is a Wiener process and is a positive constant. The elasticity is ˇ 2 since the return variance .S; t/ D 2 S ˇ2 with respect to price S has the following relationship
31.1 Introduction Cox (1975) has derived the renowned Constant Elasticity of Variance (CEV) option pricing model and Schroder (1989) has subsequently extended the model by expressing the CEV option pricing formula in terms of the noncentral Chi-square distribution. However, neither of them has given details of their derivations as well as the mathematical and statistical tools in deriving the formulae. There are two purposes of this article. First, we integrated the results obtained by Cox (1975) and Schroder (1989) and bridged the gaps in deriving the necessary formulae for the model. Second, we use a result by Feller (1951) to obtain the transition probability density function of the stock price at time T given its price at time
ˇ < 2;
d .S; t/=dS D ˇ 2; .S; t/=S which implies that d .S; t/=.S; t/ D .ˇ 2/dS=S . Upon integration on both sides, we have log .S; t/ D .ˇ 2/ log S C log 2 , or .S; t/ D .S; t/ D 2 S ˇ2 . If ˇ D 2, then the elasticity is zero and the stock prices are lognormally distributed as in the Black and Scholes model. If ˇ D 1, then Equation (31.1) is the model proposed by Cox and Ross (1976). In this article, we will focus on the case of ˇ < 2 since many empirical evidences (see Campbell 1987, Glosten et al. (1993), Brandt and Kang 2004) have shown that the relationship between the stock price and its return volatility is negative. The transition density for ˇ > 2 is given by Emanuel and Macbeth (1982) and the corresponding CEV option pricing formula can be derived through a similar strategy. For more details, see Chen and Lee (1993).
This paper is dedicated to honor and memory my Ph.D. advisor, Prof. Jack C. Lee, who died from cardiovascular disease on 2 March 2007. This paper is reprinted from Mathematics and Computers in Simulation, 79 (2008), pp. 60–71.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_31,
471
472
Y.L. Hsu et al.
In order to derive the CEV option pricing model, we need the transition probability density function f .ST jSt ; T > t/ of the stock price at time T given the current stock price St . For the transition probability density function f .ST jSt /, we will start with the Kolmogorov forward and backward equations. Assume Xt follows the diffusion process dX D .X; t/dt C .X; t/dZ;
(31.2)
and P D P .Xt ; t/ is the function of Xt and t, then P satisfies the partial differential equations of motion. From Equation (31.2), we have the Kolmogorov backward equation, @2 P @P @P 1 2 .X0 ; t0 / 2 C .X0 ; t0 / C D 0; 2 @X0 @t0 @X0
(31.3)
and the Kolmogorov forward (or Fokker–Planck) equation i @P h i @P 1 @2 h 2 .Xt ; t/P .Xt ; t/P D 0: 2 2 @Xt @Xt @t (31.4) Consider the following parabolic equation
.P /t D .axP /xx .bxCh/P ; x
0 < x < 1;
(31.5)
where P D P .x; t/, and a; b; h are constants with a > 0, .P /t is the partial derivative of P with respect to t, ./x and ./xx are the first and second partial derivatives of ./ with respect to x. This can be interpreted as the Fokker–Planck equation of a diffusion problem in which bx C h represents the drift, and ax represents the diffusion coefficient. Lemma 1. (Feller (1951)): Let f .x; t j x0 / be the probability density function for x and t conditional on x0 . The explicit form of the fundamental solution to the above parabolic equation is given by f .t; x j x0 / D
b
e bt x x0
Before pursuing further, we will first consider the special case in which ˇ D 1 which is the model considered by Cox and Ross (1976). In this situation we have dS D .S; t/dt C .S; t/dZ;
@P @P 1 2 @2 P S 2 C Œ.r a/S h C D rP; 2 @S @S @t
(31.9)
and the corresponding Kolmogorov forward equation for the diffusion process (Equation 31.8) is i @P @ h 1 @2 2 D 0; . S P / C .r a/S h P T 2 @St2 @ST @t (31.10) which is obtained by using (31.4) with .xt ; t/ D .r a/S h. Comparing with Equation (31.6), we set a D 2 =2, x D ST , x0 D St , b D r 2 =2, h D h and t D D .T t/. Thus, we have the following transition probability density function for the Cox–Ross model: f .ST jSt ; T > t/ D
exp a.e bt 1/ b.x C x0 e bt /
a.e bt 1/
2b bt 1=2
I1h=a .e ; xx / 0 a.1 e bt /
where Ik .x/ is the modified Bessel function of the first kind of order k and is defined as
(31.8)
p where .S; t/ D S. Now suppose also that each unit of the stock pays out in dividends in the continuous stream b.S; t/ so that the required mean becomes .S; t/ D rS b.S; t/ D rS .aS C h/, where b.S; t/ D aS C h and r isp the risk-free interest rate. Then dS D Œ.r a/S h dt C S dZ and the differential option price equation becomes
.ha/=2a
(31.6)
Proof. See the Appendix.
2.r 2 =2/ 2 Œe .r 2 =2/ 1 !.1C2h= 2 /=2 2 St e .r =2/
exp ST ( ) 2 2.r 2 =2/ŒST C St e .r =2/
2 Œe .r 2 =2/ 1
I1C2h= 2 2
4.r 2 =2/.St ST e .r =2/ /1=2
2 Œe .r 2 =2/ 1 2
! :
(31.11) We next consider the constant elasticity of variance diffusion, dS D .S; t/ C .S; t/dZ; (31.12) where
Ik .x/ D
X1
.x=2/2rCk : rD0 rŠ.r C 1 C k/
(31.7)
.S; t/ D rS aS;
(31.13)
31 Constant Elasticity of Variance Option Pricing Model
473
and .S; t/ D S ˇ=2 ;
0 ˇ < 2:
(31.14)
Then dS D .r a/Sdt C S ˇ=2 dZ: Let Y D Y .S; t/ D S
2ˇ
(31.15)
. By Ito’s Lemma with
For a proof of the above formula, see Chen and Lee (1993). We next present the detailed derivations of the option pricing formula as presented by Schroder (1989). Since the option pricing formula is expressed in terms of the noncentral Chi-square complementary distribution function, a brief review of the noncentral Chi-square distribution is presented in the next section.
@2 Y @Y @Y D .2ˇ/S 1ˇ ; D 0; D .2ˇ/.1ˇ/S ˇ ; @S @t @S 2 we have h i 1 d Y D .r a/.2 ˇ/Y C 2 .ˇ 1/.ˇ 2/ dt 2 C 2 .2 ˇ/2 Y dZ:
(31.16)
31.3 Review of Noncentral Chi-Square Distribution If Z1 ; : : : ; Z are standard normal random variables, and ı1 ; : : : ; ı are constants, then
The Kolmogorov forward equation for Y becomes 1 @2 2 @P D .2 ˇ/YP 2 @t 2 @Y 1 @ .r a/.2 ˇ/Y C 2 .ˇ 1/.ˇ 2/ P : @Y 2 (31.17) Then f .ST j St ; T > t/ D f .YT j yt ; T > t/ j J j where J D .2ˇ/S 1ˇ . By Feller’s Lemma with a D 12 2 .2ˇ/2 , b D .r a/.2 ˇ/, h D 12 2 .ˇ 2/.1 ˇ/, x D 1=T , x0 D 1=t and t D D .T t/, we have 1=.2.2ˇ// f .ST j St ; T > t/ D .2 ˇ/k 1=.2ˇ/ xz1ˇ
e xz I1=.2ˇ/ 2.xz/1=2 ; (31.18)
Y D
X .Zi C ıi /2
(31.20)
i D1
is the noncentral Chi-square distribution with degrees of P freedom and noncentrality parameter D j D1 ıj2 , and is 2
denoted as 0 ./. When ıj D 0 for all j , then Y is distributed as the central Chi-square distribution with degrees of freedom, and is denoted as 2 . The cumulative distribution 2 function of 0 ./ is 2
F .xI ; / D P .0 ./ x/ D e =2 Z
x
where
1 X
.=2/j j Š2=2 .=2 C j / j D0 y
y =2Cj 1 e 2 dy;
x > 0:
0
k D
2.r a/ ; 2 .2 ˇ/Œe .ra/.2ˇ/ 1 2ˇ .ra/.2ˇ/
x D k St
e
;
(31.21) 2ˇ
z D k ST
An alternative expression for F .xI ; / is
:
Cox (1975) obtained the following option pricing formula: 1 X e x x n G n C 1 C 1=.2 ˇ/; k K 2ˇ r C D St e .n C 1/ nD0 1 X e x x nC1=.2ˇ/ G n C 1; k K 2ˇ r Ke ; n C 1 C 1=.2 ˇ/ nD0 (31.19) R1 where G.m; / D Œ.m/ 1 e u um1 du is the standard complementary gamma distribution function.
F .xI ; / D
1 X .=2/j e =2 j D0
jŠ
P .2C2j x/: (31.22) 2
The complementary distribution function of 0 ./ is Q.xI ; / D 1 F .xI ; /;
(31.23)
where F .xI ; / is given in either Equation (31.21) or Equation (31.22). 2 The probability density function of 0 ./ can be expressed as a mixture of central Chi-square probability density functions.
474
Y.L. Hsu et al.
p02 ./ .x/ D e =2
1 X . 1 /j 2
jŠ
j D0
D
e
p2
1 .xC/=2 X
C2j
31.4.1 Detailed Derivations of C 1 and C 2
.x/
2ˇ
Making the change of variable w D k ST
x : 2j j Š .=2 C j /2 j D0
2=2
(31.24) An alternative expression for the probability density function 2 of 0 ./ is
dST D .2 ˇ/1 k
Z C1 D e
I
Ik .z/ D
1
p .2 xw/.w=x/1=.2ˇ/ .x=k /1=.2ˇ/ d w Z 1 D e r .x=k /1=.2ˇ/ e xw .x=w/1=.42ˇ/
k X
2
i D1
1 2ˇ
p
I 1 .2 xw/d w 2ˇ Z 1 D e r St e .ra/ e xw .w=x/1=.42ˇ/
(31.26)
y
e z cos.k / cos.k/d D Ik .z/:
i D1
e xw .x=w/1=.42ˇ/ y
p
I 1 .2 xw/d w 2ˇ Z 1 a D e St e xw .w=x/1=.42ˇ/ I
(31.27)
0
Xi 0Pk
1
y
The noncentral Chi-square distribution satisfies the reproductivity property with respect to n and . If X1 ; : : : ; Xk are 2 independent random variables with Xt distributed as 0ni .i /, then Y D
p .2 xw/.w=k /1=.2ˇ/ d w
Z
k X
1 2ˇ
D e r
It is noted that for integer k, Z
e xw .x=w/1=.42ˇ/ y
where Ik is the modified Bessel function of the first kind of order k and is defined as Ik .z/ D
w.ˇ1/=.2ˇ/ d w:
1
r
I
2 j
1 z =4 1 kX z : 2 j Š.k C j C 1/ j D0
1=.2ˇ/
Thus, with y D k K 2ˇ , we have,
1 x .2/=4 1 p02 ./ .x/ D exp . C x/ 2 2 p
I.2/=2 . x/; x > 0; (31.25)
, we have
=2Cj 1 j
ni
i :
y
p .2 xw/d w; (31.30)
and Z
(31.28)
1 2ˇ
C2 D Ke
1
r
1=.2ˇ/
.2 ˇ/k
1
.xw12ˇ / 42ˇ e xw
y
i D1
1=.2ˇ/
ˇ1 p k w 2ˇ d w .2 xw/ 2ˇ Z 1 1 D Ke r x 42ˇ w.12ˇC2ˇ2/=.42ˇ/ e xw
I
31.4 The Noncentral Chi-square Approach to Option Pricing Model
1 2ˇ
y
Following Schroder (1989), with the transition probability density function given in (31.18), the option pricing formula under the CEV model is
DT t C D E max.0; ST K/ ; Z 1 D e r f .ST j St ; T > t/.ST K/dST K Z 1
D e r
ST f .ST j St ; T > t/dST K
e
r
Z
1
y
(31.29)
1 2ˇ
p .2 xw/d w: (31.31)
Recall that the probability density function of the noncentral Chi-square distribution with noncentrality and degree of freedom is
K
D C1 C2 :
1 2ˇ
p 1 .x=/.2/=4 I.2/=2 . x/e .Cx/=2 2 D P .xI ; /:
p02 ./ .x/ D
f .ST j St ; T > t/dST
K
p .2 xw/d w Z 1 D Ke r e xw .x=w/1=.4ˇ/ I
I
31 Constant Elasticity of Variance Option Pricing Model
475
R1 Let Q.xI ; / D x p02 ./ .y/dy. Then letting w0 D 2w and x 0 D 2x, we have Z C1 D St e
1
a
e .xCw/=2
w 1=.42ˇ/ x
y
Z
1
D St e a
e .x
0
0
Cw /=2
1 2ˇ
C2 D Ke
x 1=.42ˇ/
e
w
y
Z
1
D Ke r
e .x
0
I
p 2 xw d w
1 2ˇ
!1=.42ˇ/ 0
0
Cw /=2
2y
(31.35)
(31.32)
obtained by noting that . 2/=2 D 1=.2 ˇ/, implying 0 0 D 2 C 2=.2 ˇ/. Analogously, with w D 2w, x D 2x and In .z/ D In .z/, we have xw
D
C D St e
Z
1
1
P .2yI 2; 2k/d k D y
Z
nD0 1
D y
D
1 X nD0
z 1 z
g.i; y/:
i D1
(31.36)
p 1 y .2/=4 I 2 ky e .kCy/=2 dy 2 2 k
.2/=2 1 y .2/=4 1 p ky 2 k 2
Z
1
D
1 X
e y=2 y
2
e k=2
Z
1
z
D
C2n dy 2
n
nD0
1 X
C2n 2 1
1 .C2n/=2
z
2 ; 2x : (31.34) 2ˇ
.k=2/ .n C 1/
.1=2/.C2n/=2 y=2 C2n 1 e y 2 dy C2n 2
e k=2
nD0
.k=2/n Q.zI C 2n; 0/; (31.37) .n C 1/
n
.zk/ dk nŠ.n C 1 C 1/
e z znC1 .n C /
D
1 X .ky=4/n .kCy/=2 e dy nŠ C2n 2 nD0 1 n 1 n X k=2 k 2 e D .n C 1/ nD0
(31.33)
y
1
D
p 1 e zk .z=k/1 kz
1 X
n X
Next, applying the monotone convergence theorem, we have
It is noted that 2 2=.2 ˇ/ can be negative for ˇ < 2. Thus further work is needed. Using the monotone convergence theorem and the integration by parts, we have Z
g.i; y/, which
G.m C 1; t/ D g.m C 1; t/ C G.m; t/:
Z
r Ke Q 2yI 2
i D1
The above result can also be expressed as
Q.zI ; k/ D
2 ; 2x 2ˇ
.i /
i D1
obtained by noting that . 2/=2 D 1=.2 ˇ/, implying D 2 2=.2 ˇ/. Thus, Q 2yI 2 C
P1
1
n X y i 1 e y
Z
x w0
p
1 0 dw
I 1 2 x 0 w0 2ˇ 2
2 ; 2x ; D Q 2yI 2 2ˇ
a
g.i; y/:
Z 1 n1 e k k n1 k G.n; y/ D dk D de k .n/ .n/ y y Z 1 n2 k k e y n1 e y C dk D .n/ .n 1/ y
0
1
1 X i D1
nD0
Z
D St e a Q.2yI ; x /;
2 a ; 2x ; D St e Q 2yI 2 C 2ˇ
Z
g.n C 1; z/
Now we also have the result G.n; y/ D can be shown by observing that
p
1 0 dw 2 x 0 w0 2
r
1 X
p 2 xw d w
1 2ˇ
!1=.42ˇ/ 0 w x0
2y
I
I
D
Z
1 y
e k k n dk .n C 1/
g.n C ; z/G.n C 1; y/
where Z
1
Q.zI C 2n; 0/ D Z
z
.1=2/.C2n/=2 y=2 C2n 1 e y 2 dy C2n 2
1
D z=2
C2n 1 C2n e y y 2 1 dy
2
D G .n C =2; z=2/ :
476
Y.L. Hsu et al.
Furthermore, from the property of G.; / as shown in Equation (31.36), we have Q.zI ; k/ D
1 X
We conclude that Q.2zI 2; 2k/ D 1
g .n C 1; k=2/ G .n C =2; z=2/
1 X
g .n C ; z/
n X
g .i; k/ :
i D1
nD1
(31.39)
nD0
D
2 ; z=2 : g .n; k=2/ G n C 2 nD0
1 X
(31.38) Hence
From (31.35) and (31.39) we observe that Z
1
P .2zI 2; 2k/d k D 1 Q .2zI 2. 1/; 2y/ : y
(31.40) Thus, we can write C2 as
Q.2zI 2; 2k/ D
1 X
g .n; k/ G .n C 1; z/ :
Z C2 D Ke r
P 2xI 2 C
y
nD0
D Ke r Q 2yI 2
Again from the property of G.; / as given by (31.36), we have Q.2zI 2; 2k/ D g .1; k/ G .; z/ C g .2; k/ G . C 1; z/
D Ke
Cg .3; k/ G . C 2; z/ C D g .1; k/ ŒG . 1; z/ C g .; z/ Cg .2; k/ ŒG . 1; z/ C g . C 1; z/ Cg .3; k/ ŒG . 1; z/ C g .; z/
D ŒG . 1; z/ C g .; z/
1 X nD1
1 X
g .n; k/
r
1 Q.2xI
2 ; 2w d w 2ˇ
2 ; 2x 2ˇ
2 ; 2y/ : 2ˇ
(31.41)
From (31.41) we immediately obtain Q 2yI 2
Cg . C 1; z/ C g . C 2; z/ C
Cg . C 1; z/
1
2 2 ; 2x C Q 2xI ; 2y D 1 2ˇ 2ˇ (31.42)
implying g .n; k/ Q.zI 2n; k/ C Q.kI 2 2n; z/ D 1;
(31.43)
with degrees of freedom 2 2n of Q.kI 2 2n; z/ can be a non-integer. 1 X From Equation (31.42), we can obtain that the noncentral g .n; k/ C Cg . C 2; z/ Chi-square Q.2yI 2 2=.2 ˇ/; 2x/ with 2 2=.2 ˇ/ nD3 degrees of freedom and the noncentrality parameter 2x can D G . 1; z/ C g .; z/ be represented by another noncentral Chi-square distribution Cg . C 1; z/ Œ1 g .1; k/ 1 Q.2xI 2=.2 ˇ/; 2y/ with degrees of freedom 2=.2 ˇ/ Cg . C 2; z/ Œ1 g .1; k/ g.2; k/ C and the noncentrality parameter 2y. The standard definition of noncentral Chi-square distribution in Sect. 31.3 has inte1 X ger degrees of freedom. If the degree of freedom is not an D G . 1; z/ C g . C n; z/ integer, we can use Equation (31.43) to transfer the original nD0 noncentral Chi-square distribution into another noncentral g . C 1; z/ Œg .1; k/ Chi-square distribution. Thus, we obtain an option pricing formula for the CEV model in terms of the complementary g . C 2; z/ Œg .1; k/ C g .2; k/ C noncentral Chi-square distribution function Q.zI ; k/ which D 1 g . C 1; z/ Œg .1; k/ is valid for any value of ˇ less than 2, as required by the model. g . C 2; z/ Œg .1; k/ C g .2; k/ : nD2
31 Constant Elasticity of Variance Option Pricing Model
477
Substituting Equation (31.41) into Equation (31.34), we obtain
2 a C D St e Q 2yI 2 C ; 2x 2ˇ
2 r ; 2y/ ; (31.44) 1 Q.2xI Ke 2ˇ 2ˇ
2ˇ where K , x D k St e .ra/.2ˇ/, k D 2.r a/= 2 y D k .ra/.2ˇ/ .2 ˇ/.e 1/ and a is the continuous proportional dividend rate. The corresponding CEV option pricing formula for ˇ > 2 can be derived through a similar manner. When ˇ > 2 (see, Emanuel and Macheth (1982), Chen and Lee (1993)), the call option formula is as follows:
C D St e
a
Ke
Q 2xI
r
2 ; 2y ˇ2
1 Q.2yI 2 C
2 ; 2x/ :(31.45) ˇ2
where D 2
2 ; 2ˇ 2ˇ .ra/.2ˇ/
D 2k St
e
:
Thus, the option pricing formula for the CEV model as given in (31.44) that can be obtained directly from the payoff function ST K; if ST > K max.ST K; 0/ D (31.47) 0; otherwise by taking the expectation of (31.47), with ST having the distribution given by (31.46). Before concluding this subsection we consider that the noncentral Chi-square distribution will approach log-normal as ˇ tends to 2. Since when either or approaches to infinity, the standardized variable 2
0 ./ . C / p 2. C 2/
We note that from the evaluation of the option pricing formula C , especially C2 , as given in (31.34), we have 2ˇ
2k ST
2
0 ./;
(31.46)
2ˇ
2
lim
ˇ!2
tends to N(0,1) as either ! 1 or ! 1. Using the fact that .x a 1/=a will approach to ln x as a ! 0, it can be verified that
0 ./ . C / 2k ST . C / D lim p p ˇ!2 2. C 2/ 2. C 2/ 2ˇ
D lim ˇ!2
s
D
2r ? ST
?
2ˇ r ? .2ˇ/
.1 ˇ/ 2 .e r .2ˇ/ 1/ 2r ? St 2 .e r ? .2ˇ/ 1/
e
2 .e r ? .2ˇ/ 1/=.2 ˇ/ 2ˇ r ? .2ˇ/ e
.1 ˇ/ 2 .e r ? .2ˇ/ 1/ C 4r ? St
ln ST Œln St C .r ? 2 =2/ p ;
where r ? D r a. Thus, ln ST j ln St N.ln St C .r a 2 =2/; 2 /
31.4.2 Some Computational Considerations (31.48)
as ˇ ! 2 . Similarly, (Equation 31.48) also holds when 2ˇ ˇ ! 2C . From (Equation 31.45), we have 2k ST 02 ./, where D 2 C 2=.ˇ 2/ if ˇ > 2. Thus, we clarify the result of (Equation 31.48).
As noted by Schroder (1989), (31.39) allows the following iterative algorithm to be used in computing the infinite sum when z and k are not large. First initialize the following four variables (with n D 1) gA D
e z z D g.1 C ; z/; .1 C /
478
Y.L. Hsu et al.
gB D e k D g.1; k/;
Acknowledgements We gratefully acknowledge the editor and an anonymous referee for his insightful comments and suggestions of the paper. Research Supported in Part by NSC grant 95-2118-M-005-003.
Sg D gB; R D 1 .gA/.Sg/: Then repeat the following loop beginning with n D 2 and increase increment n by one after each iteration. The loop is terminated when the contribution to the sum, R, is declining and is very small.
z D g.n C ; z/; nC1
k gB D gB D g.n; k/; n1 gA D gA
Sg D Sg C gB D g.1; k/ C g.n; k/ R D R .gA/.Sg/ D the nth partial sum. As each iteration, gA equals g.n C ; z/, gB equals g.n; k/ and Sg equals g.1; k/ C C g.n; k/. The computation is easily done. As for an approximation, Sankaran (1963) showed that the 2 h distribution of 0 =. C k/ is approximately normal with the expected value D 1 C h.h 1/P h.2 h/mP 2 =2 and variance 2 D h2 P .1CmP /, where h D 1 23 .Ck/.C3k/ . C2k/2 , P D . C2k/=. Ck/2 and m D .h1/.13h/. Using the approximation, we have approximately
References Brandt, M. W. and Q. Kang. 2004. “On the relationship between the conditional mean and volatility of stock returns: a latent VAR approach.” Journal of Financial Economics 72, 217–257. Campbell J. 1987. “Stock returns and the term structure.” Journal of Financial Economics 18, 373–399. Chen, R. R. and C. F. Lee. 1993. “A constant elasticity of variance (CEV) family of stock price distributions in option pricing: review and integration.” Journal of Financial Studies 1, 29–51. Cox, J. 1975. Notes on option pricing I: constant elasticity of variance diffusion, Unpublished Note, Standford University, Graduate School of Business. Also, Journal of Portfolio Management (1996) 23, 5–17. Cox, J. and S. A. Ross. 1976. “The valuation of options for alternative stochastic processes.” Journal of Financial Economics 3, 145–166. Emanuel, D. and J. MacBeth. 1982. “Further results on the constant elasticity of variance call option pricing formula.” Journal of Financial and Quantitative Analysis 17, 533–554. Feller, W. 1951. Two singular diffusion problems. Annals of Mathematics 54, 173–182. Glostern, L., R. Jagannathan, and D. Runkle. 1993. “On the relation between the expected value and the volatility of the nominal excess returns on stocks.” Journal of Finance 48, 1779–1802. Sankaran, M. 1963. “Approximations to the non-central Chi-square distribution.” Biometrika 50, 199–204. Schroder, M. 1989. “Computing the constant elasticity of variance option pricing formula.” Journal of Finance 44 211–219.
2
Q.zI ; k/ D P r.0 > z/ 2
0 z > Ck Ck
D Pr 0 D Pr @
2
0 Ck
!h
>
!
z Ck
h
Appendix 31A Proof of Feller’s Lemma
1 A
1hP Œ1hC0:5.2h/mP Dˆ p h 2P .1 C mP /
We need some preliminary results in order to prove Equation (31.6).
h! z Ck
:
Proposition 1. f .z/ D e A=zz1 is the Laplace transformation of I0 .2.Ax/1=2 /, where Ik .x/ is the Bessel function Ik .x/ D
rD0
31.5 Conclusion The option pricing formula under the CEV model is quite complex because it involves the cumulative distribution function of the noncentral Chi-square distribution Q.zI ; k/. Some computational considerations are given in the article which will facilitate the computation of the CEV option pricing formula. Hence, the computation will not be a difficult problem in practice.
1 X
.x=2/2rCk : rŠ.r C 1 C k/
Proof. By the definition of Laplace transformation and the monotone convergence theorem, we have Z
1
f .z/ D
e zx I0 .2.Ax/1=2 /dx Z
0 1
D
e 0
zx
1 X .Ax/1=2 rŠ.r C 1/ rD0
31 Constant Elasticity of Variance Option Pricing Model
Z D
1
e
zx
0
479
A1h=a e A D .1 h=a/
.Ax/2 .Ax/ C C 1C .2/ 2Š.3/ .Ax/n C C nŠ.n C 1/
A 1 .A/2 .A/n C 2 C C C C z z 2Šz3 nŠznC1 A .A/2 1 .A/n 1C C D CC C z z 2Šz2 nŠzn
w.t; sI x0 / D
D e A=z z1 :
Z
1
e sx f .t; xI x0 /dx 0
h=a
sbx0 e bt D exp sa.e bt 1/ C b ! h b 2 x0 e bt ;
1 I a a.e bt 1/ sa.e bt C 1/ C b b sa.e bt 1/ C b
(31.50) where .nI z/ D 1 .n/
Rz 0
e x x n1 dx.
Proof. The proof of the lemma is too tedious and hence is omitted. For more details, please see Lemma 7 of Feller (1951). We now turn to prove Equation (31.6). From Equation (31.50), let AD
1 bx0 and z D sa.e bt 1/ C b : bt a.1 e / b
The w.t; sI x0 / in Equation (31.50) can be rewritten as zh=a e .11=z/A w.t; sI x0 / D .1 h=a/ D
A=z
e
x h=a
x
dx
0
zh=a e .11=z/A .1 h=a/ Z 1 0 e Ax =z .Ax 0 =z/h=a .A=z/1h=a dx 0
0
D
Z
1h=a A
e A .1 h=a/
Z
1
0
e A.1x /=z .x 0 /h=a z1 dx 0 0
D
(31.49)
where a; b; h are constants, 0 < h < a, then the Laplace transformation of f .t; x; x0 / with respect to x takes the form w.t; sI x0 / D
e A=z .1 /h=a z1 d:
0
Z A1h=a e A 1 .1 /h=a .1 h=a/ 0 Z 1 zx 1=2 dx d
e I0 2.Ax/ 0
Proposition 2. Consider the parabolic differential equation 0 < x < 1;
1
By Proposition 31.1, we know .z/ D e A=z z1 is the that f1=2 and by the Fubini Laplace transformation of I0 2.Ax/ theorem, we have
D
Pt D .axP /xx .bx C h/P x ;
Z
1h=a A
e A .1 h=a/ Z 1Z 1 .1 /h=a e zx I0 2.Ax/1=2 dxd
0
D
0 2h=a A
A e .1 h=a/x0 e bt Z 1Z 1 0 0 bt
.1 /h=a e sx e Ax =.x0 e / 0
0
I0 2A.e bt x 0 =x0 /1=2 ddx 0 : Hence, upon comparing the two formulae for w.t; sI x0 / and by the monotone convergence theorem, we have 1h=a b bx0 .1 h=a/a.e bt 1/ a.e bt 1/ Z 1 b.x C x0 e bt /
exp .1 /h=a I0 a.e bt 1/ 0
2b.e bt xx0 /1=2 d
a.1 e bt / 1h=a bx0 b D .1 h=a/a.e bt 1/ a.e bt 1/ Z 1 b.x C x0 e bt /
exp .1 /h=a a.e bt 1/ 0 bt 1 1=2 X b.e xx0 / = a.1 e bt / 2r d
rŠ.r C 1/ rD0
f .t; xI x0 / D
D
1h=a bx0 b .1 h=a/a.e bt 1/ a.e bt 1/ b.x C x0 e bt /
exp a.e bt 1/ 2r 1 X b.e bt xx0 /1=2 = a.1 e bt /
rŠ.r C 1/ rD0 Z
1
.1 /h=a r d 0
480
Y.L. Hsu et al.
1h=a bx0 b D .1 h=a/a.e bt 1/ a.e bt 1/ b.x C x0 e bt /
exp a.e bt 1/ 2r 1 X b.e bt xx0 /1=2 = a.1 e bt /
rŠ.r C 1/ rD0
.r C 1/.1 h=a/ .r C 1 C 1 h=a/
D
b
e bt x x0
.ha/=2a
a.e bt 1/ b.x C x0 e bt /
exp a.e bt 1/
2b bt 1=2
I1h=a .e xx0 / : a.1 e bt /
This completes the proof.
Chapter 32
Stochastic Volatility Option Pricing Models Cheng Few Lee and Jack C. Lee
Abstract In this chapter, we assume that the volatility of option price model is stochastic instead of deterministic. We apply such assumption to the nonclosed-form solution developed by Scott (Journal of Finance and Quantitative Analysis 22(4):419–438, 1987) and the closed-form solution of Heston (The Review of Financial Studies 6(2):327–343, 1993). In both cases, we consider a model in which the variance of stock price returns varies according to an independent diffusion process. For the closed form option pricing model, the results are expressed in terms of the characteristic function. Keywords Nonclosed-form option pricing model r Closed– forem option pricing model r Heston model r Itô’s Lemma r Characteristic function r Moment generating function
32.1 Introduction The variance of stock returns plays a vital role in option pricing, as evidenced in the Black-Scholes pricing formula. Also, it has long been observed that stock price changes over time. Thus, the assumption of a constant variance of stock price returns does not seem to be reasonable in option price. In this chapter, we will consider the situation in which the variance of stock price returns is not a constant. Instead, we will assume the variance of stock price returns is random and follows some distributions. Emphasis will be placed on the nonclosed-form solution developed by Scott (1987) and the closed-form solution of Heston (1993). In both cases, we consider a model in which the variance of stock price returns varies according to an independent diffusion process.
32.2 Nonclosed-Form Type of Option Pricing Model In the Black-Scholes model, the variance of stock returns was assumed to be constant. However, empirical studies show that volatility seems to change day by day. Its behavior looks like it is performing a random walk. Thus, it is intuitional to assume the following stochastic processes for stock prices with random variance: dS D ˛S dt C S dz1 d D ˇ.N /dt C dz2 ;
(32.1)
where dz1 and dz2 are Wiener processes. Since both stock price and volatility are assumed random variables, we need the following generalization of Ito’s Lemma. Lemma 32.1. Generalization of Ito’s Lemma Let f be a function of variables x1 ; x2 ; : : : ; xn and time t. Suppose that xi ’s follow Ito’s process: dxi D ai dt C bi d zi ;
i D 1; 2; ; n;
where dzi ’s are Wiener processes, instantaneous correlation coefficients between dzi and dzj is ¡ij , and ai ’s and bi ’s are functions of x1 ; x2 ; : : : ; xn and t. Then 1 X @f X @2 f 1 @f C ai C bi bj ij A dt df D @ @x @t 2 @x @x i i j i i;j 0
C
X @f bi d zi @xi i
Proof. A Taylor series expansion of f gives
C.F. Lee () Rutgers University, New Brunswick, NJ, USA e-mail: [email protected] J.C. Lee National Chiao Tung University, Taiwan, ROC
f D
X @f 1 X @2 f @f
t C
xi C
xi xj @xi @t 2 i;j @xi @xj i C
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_32,
1 X @2 f 1 @2 f
xi t C . t/2 C R2 2 i @xi @t 2 @t 2 481
482
C.F. Lee and J.C. Lee
1 D C1 ˛S C C2 ˇ.N / C3 C C11 2 S 2 2 1 1 C C22 2 C C12 S dt 2 2
A discrete form for dx can be written as p
xi D ai t C bi t"i ; where ©i ’s are standard normal variables. Then
CC1 Sd z1 C C2 d z2
3
xi xj D ai aj . t/2 C ai bj . t/ 2 "j 3
Caj bi . t/ 2 "i C bi bj t "i "j : The first term on the right is of order .dt/2 and the second and third terms are distributed with mean 0 and a standard deviation of order .dt/3=2 . As t tends to 0, lim xi xj D ij bi bj ;
where ¡ is the instantaneous correlation between dz1 and dz2 , and the subscripts on C indicate partial derivatives and D T t so that d D dt. To eliminate the risk from the portfolio, we observe that the coefficients for dz1 and dz2 in the return of the portfolio, dC.; ; 1 / C w2 dC.; ; 2 / C w3 dS, are respectively
t !0
where ¡ij is the correlation coefficient of ©j and ©j . Similarly the terms, xi t, are all distributed with mean 0 and a standard deviation of order .dt/3=2 , so they are comparably small with t as t ! 0. It can be seen that all the terms in R2 are at most as large as . t/3=2 , thus we can collect all terms of order t and write down the differential df as: df D
X @f @f dt .ai dt C bi d zi / C @x @t i i
1 X @2 f C bi bj ij dt 2 i;j @xi @xj 3 2 X @2 f X @f 1 @f C D4 ai C bi bj ij 5 dt @x @t 2 @x @x i i j i i;j C
X @f bi d zi @xi i
This completes the proof. A call option price on the stock shall have the form C.S; ¢; £/, a function of the stock price, volatility, and time to expiration. With the introduction of a random variance and thus two sources of uncertainty, a portfolio with only one option and one stock is not sufficient for creating a riskless investment strategy. A portfolio consisting of one stock and two options having different expiration dates is required, or it should be C.; ; 1 / C w2 C.; ; 2 / C w3 S: Now we may use Ito’s Lemma above to derive the stochastic differential: dC D C1 dS C C2 d C C3 d 1 1 1 C C11 .dS /2 C H22 .d/2 C C33 .d /2 2 2 2 CC12 dSd C C13 dSd C C23 dd
(32.2)
C1 .; ; 1 /S C w2 C1 .; ; 2 /S C w3 S and C2 .; ; 1 / C w2 C2 .; ; 2 /: Set both the two quantities to be 0, we get wO 2 D C2 .; ; 1 /=C2 .; ; 2 /: and wO 3 D C1 .; ; 1 / C C2 .; ; 1 /C1 .; ; 2 /=C2 .; ; 2 /: Then the return of the portfolio becomes: dC.; ; 1 / C wO 2 dC.; ; 2 / C wO 3 dS 1 1 D C3 .1 / C C11 .1 / 2 S 2 C C12 .1 /S ı 2 2 1 C2 .1 / 1 C3 .2 / C C11 .2 / 2 S 2 C C22 .1 / 2 2 C2 .2 / 2 1 1 C C12 .2 /S C C22 .2 / 2 dt 2 2 Since it is a riskless return, it should be equal to the riskfree rate; that is, dC.; ; 1/ C wO 2 dC.; ; 2 / C wO 3 dS D r ŒC.; ; 1 / C wO 2 C.; ; 2 / C wO 3 S dt: Substituting w O 2 and wO 3 into the above two differential equations we get 1 1 1 C3 .1 / C C11 .1 / 2 S 2 C C12 .1 /S ı C C22 .1 / 2 2 2 2 C2 .1 / 1 1 C3 .2 / C C11 .2 / 2 S 2 C C12 .2 /S C2 .2 / 2 2 1 C C22 .2 / 2 2
32 Stochastic Volatility Option Pricing Models
483
1 C1 ˛S C C2 ˇ.N / C3 C C11 2 S 2 2
1 1 C C22 2 C C12 S ı C; 2 2 C2 C1 S .˛ r/ C 2 ; D rC C C
D rŒC.; ; 1 / C2 .; ; 1 /C.; ; 2 /=C2 .; ; 2 / C .C1 .; ; 1 / C C2 .; ; 1 /C1 .; ; 2 /=C2 .; ; 2 // S ; implying 1 1 C3 .1 / C11 .1 / 2 S 2 C12 .1 /S ı 2 2 C2 .1 / 1 C22 .1 / 2 C C.1 /r C1 .1 /S r 2 C2 .2 / 1 1 C3 .2 / C11 .2 / 2 S 2 C12 .2 /S 2 2 1 C22 .2 / 2 C C.2 /r C1 .2 /S r D 0 (32.3) 2
or 1 1 1 C3 C11 2 S 2 C12 S ı C22 2 2 2 2 CC r C1 S r C2 Œˇ.N / D 0: (32.4)
This equation with the boundary conditions has a unique solution and it is easy to show that this solution also satisfies It is easily observed that the solutions of the following Equation (32.3). It is obvious that the expected return on the equation are also solutions of Equation (32.3): stock does not influence the option price, but the expected change and the risk premium associated with the volatilC3 12 C11 2 S 2 12 C12 S 12 C22 2 C C r C1 S r D 0: ity do. The solution for the option price function follows from But further investigations show that a more general solution lemma 32.1 of Cox et al. (1985): can be obtained from 1 1 1 C3 C11 2 S 2 C12 S C22 2 2 2 2 CC r C1 S r C2 b D 0 where b is an arbitrary function or constant. This means that a unique solution for the option price function in the random variance model cannot be obtained only through arbitrage. An alternative view on this problem is that we may form a duplicating portfolio for an option in this model that which contains the stock, the riskless bond, and another option. We cannot determine the price of a call option without knowing the price of any other call on the same stock. However, this is just what we are trying to do with the equation. In other words, only with a predetermined market price of the risk we can obtain a unique solution for the option price function. Some aspects to the market price of the risk are shown in Appendix 32A. To derive a unique option pricing function, the following equation based on the arbitrage pricing theory by S. Ross in 1976 is introduced, E
dC C
C1 C2 D r C 1 S C 2 dt C C C1 S C2 rC .˛ r/ C dt C C 2
where .’ r/ is the risk premium on the stock and œ2 is the risk premium associated with d¢. Equating these expressions we may get:
O rt maxf0; St KgjS0; 0 /; C.S; ; tI r; K/ D E.e (32.5) where EO represents a risk-adjusted expectation. For the riskadjustment, we reduce the mean parameter of dP and d† by the corresponding risk premiums. For the stock return, ’ is replaced by the risk-free rate, r, and for the standard deviation, Œˇ.N / is used in place of ˇ.N /. Following Karlin and Taylor (1981), the backward equation for the function can be derived and it can be shown that it solves the Equation (32.4) with the adjustments on the dS and d¢ processes. The option pricing function in Equation (32.5) is a general solution to this random variance model. For operational issues, parameters of the ¢ process, the risk premium œ , and the instantaneous correlation coefficients between the stock return and d¢ are all required. With these parameters and the current value of ¢ given, we do simulations to compute the option prices. The model can be simplified with œ and • set as 0. The risk premium could be zero if, for example, the volatility risk of the stock can be diversifiable or if d¢ is uncorrelated with the marginal utility of wealth. Now we develop the distribution of the stock price function at expiration with œ D 0 and • D 0. Consider the following solution conditional on the process f¢s W 0 < s < tg: St D S0 exp
8 t
1 2 r .s/ ds C 2
Zt
9 = .s/ d z1 .s/ : ;
0
484
C.F. Lee and J.C. Lee
We have 0 d exp @
Zt 0
Furthermore, we note that
1
2 ds A r 2
.s/ dz1 .s/ can be character-
0
ized as the Riemann sum:
1 0 t 1 Zt Z 2
2
ds A d @ ds A r r D exp @ 2 2
n X
0
0
Rt
0
0
1
Zt 2 2 D exp @ ds A r dt r 2 2 0
.ti / .zti C1 zti /
i D0
with a proper partition ft0 D 0; : : : ; tn D tg for which max.ti1 ti / ! 0 as n ! 1. Since Z1 is a Brownian motion, each piece of the path is independently normally distributed. So we can see that the Riemann sum is distributed as a normal distribution with mean 0 and varin P .t2i / .zti C1 zti /. As n is large enough, the sum apance i D0
and 0 d exp @
Zt
proaches the integral and thus
1
Zt
d z1 A
.s/ d z1 .s/ N.0; V/:
0
0
0 t 1 0 t 1 Z Z 1 D exp @ d z1 A d 2 @ d z1 A 2 0 C exp @
0
Zt
1
d z1 A d @
0
0
1 D 2 exp @ 2
0
Zt
1
0
Zt 0
with V D
Rt 0
ditional on S0 and f¢s g is lognormal,
1
V S t ln =S0 jS0 ; fs g N rt ; V : 2
d z1 A 0
d z1 A dt C exp @
0
Zt
1 d z1 A d z1 :
The formula for option price becomes Ct jS0 ; fs g D e rt E.maxf0; St KgjS0; fs g/
0
D S0 N.d1 / Ke rt N.d2 /;
So we see that 8 t 9 9 8 t
=
CS0 exp
8 t
0
9 =
8 t 9
=
0
0
8 t 92 0 t 1
= Z
0
8 t 9 0 t 1 3 Z
exp @
2 .s/ ds. Then the distribution of stock price con-
0
Zt
1
0
2 2 r ds A r dt 2 2
0
D rSt dt C St d:
That is, St is a solution of the stochastic equation for dS in Equation (32.1).
p ln .S0 =K/ C rt C V=2 p and d2 D d1 V . V This result is essentially the Black-Scholes formula with ¢ 2 t replaced by V. To find the unconditional expectation of the option price, we should integrate this formula over the distribution of V, where d1 D
C.S0 ; 0 ; tI r; K; ˇ; N ; / Z1 ŒS0 N.d1 / Ke rt N.d2 / dF .V I t;0 ; ˇ; N ; /:
D 0
(32.6) This integral converges because F is a distribution function and the function inside the bracket is bounded given the values of S0 , K, r and t. This finishes the problem. What are left is the estimation of the parameters in the d¢ processes and the current values of ¢. A common approach in the empirical literature is to use actual option prices to infer the values of ¢0 . Statistical techniques about point estimation can be applied here. One method is to determine the
32 Stochastic Volatility Option Pricing Models
485
unconditional distribution of stock returns as a function of ’; “; N and ”. Then apply the maximum likelihood method to get the estimates. Another approach is to use the method of moments and this is illustrated in Scott (1987).
Furthermore, if X D x and
R1
j'.t/j dt < 1, the pdf f .x/ exists at
1
1 f .x/ D 2
Z1 e itx 'x .t/dt:
(32.11)
1
32.3 Review of Characteristic Function One of the most important problems in statistics is the determination of cumulative distribution functions (cdf) of measurable functions of random variables. The characteristic function is one possible way in deriving the cdf of interest. It is also useful in generating moments and cumulants of the distribution. The characteristic function '.t/ of a random variable X having cdf F .x/ is defined as Z
1
'x .t/ D E.e itx / D
e itx dF .x/;
(32.7)
1
p where i D 1 and t is a real number. The function 'x .t/ is sometimes referred as the characteristic function corresponding to F .x/ or more briefly, the characteristic function of X. Since e itx D cos tx C i sin tx; cos tx and sin tx are both integrable over .1; 1/ for any real t, then 'x .t/ is a complex number whose real and imaginary parts are finite for every value of t. It is noted that the moment generating function Mx .t/ and 'x .t/ are related by the following, Mx .t/ D E.e tx / D
t i
(32.8)
If the rth moment 0r exists, we can differentiate Equation (32.7) h times, 0 < h r, with respect to t and obtain 0h D
.h/ .0/ ih
0
(32.9)
Once the characteristic function of a measurable function of random variables is obtained, it is usually of interest to obtain the cdf and the pdf. The answer is provided in the following theorem of Levy (1925).
Theorem 1 (Levy). Let X be a random variable having characteristic function 'x .t/ and cdf F .x/. Then if F .x/ is continuous at X D x ˙ ı; ı > 0, we have 1 A!1
ZA
F .x C ı/ F .x ı/ D lim
sin ı t itx e 'x .t/dt t
A
(32.10)
Proof. For the proof of the theorem, see Wilks (1963, 116–117). A very useful result for obtaining tail probability is given in the following theorem.
Theorem 2 (Gil-Pelaez 1951). Let '.t/ be the characteristic function of the random variable X. Then 1 F .x/ D P .X > x/ 1 1 D C 2
Z1
Œe itx .t/ e i tx .t/ dt 2i t
0
1 1 D C 2
Z1
e itx .t/ dt Re it
(32.12)
0
Proof. see Gil-Pelaez (1951). The above theorems will be useful in deriving the desired probabilities in Sect. 32.4.
32.4 Closed-Form Type of Option Pricing Model Instead of the non-closed solution of option pricing model with stochastic volatility as proposed by Scott (1987) and discussed in Sect. 32.2, we now discuss the closed-form solution of option pricing model when the volatility of stock returns is assumed stochastic, with a special emphasis on the model proposed by Heston (1993). The results are expressed in terms of the characteristic function, which is a close-form mathematically. Of course, numerical solution is still needed in practice. Since there are two stochastic partial differential equation as given in Equation (32.1), we will need Lemma 32.1 with n D 2. We next assume that the stock price at time t follows the diffusion dS .t/ D .t/ S .t/ dt C
p .t/S.t/dZ1 .t/
(32.13)
Where .t/ is the instantaneous drift of stock return and V .t/ is the variance, and Z1 .t/ is a Wiener process.
486
C.F. Lee and J.C. Lee
Furthermore, we assume that V .t/ follows dV .t/ D ˛.S; ; t/dt C ˇ.S; ; t/dZ2 .t/
(32.14)
where Z2 is Wiener process with dZ 1 dZ 2 D dt; 1 1. It is noted that is the volatility of volatility and is the correlation between dZ 1 and dZ 2 . In Black-Scholes case, there is only one source of randomness – the stock price, which can be hedged with stock.
In the present case, random changes in volatility also need to be hedged in order to form a riskless portfolio. So we set up a portfolio … containing the option being priced whose value we denote by U.S; v; t/, a quantity of the sock and a quantity 1 of another asset with value U1 .S; v; t/, we have … D U.s; v; t/ S 1 U1 .s; v; t/ By Lemma 32.1, we have
@U @U 1 1 @2 U @2 U @U @2 U dS C dv C C vS 2 2 C vˇS C 2 vˇ 2 2 dt; @S @v @t 2 @S @v@S 2 @v @U1 @U1 1 1 2 2 @2 U1 @2 U1 @U1 @2 U1 dS C dv C C vS 2 C dt; d U1 D C vˇS vˇ @S @v @t 2 @S 2 @v@S 2 @v2 dU D
d… D d U dS 1 d U1 v @2 U 1 @2 U @U @2 U C S 2 2 C vˇS C 2 vˇ 2 2 dt D @t 2 @S @S @v 2 @v v @2 U1 1 2 2 @2 U1 @U1 @2 U1 C S2 C dt 1 C vˇS vˇ @t 2 @S2 @S @v 2 @v2 @U @U1 C 1 dS @S @S @U @U1 C d v: 1 @v @v
To eliminate all randomness from the portfolio, we must choose @U1 @U 1 D0 @S @S to eliminate dS terms, and
1 1 @2 U @2 U @U @2 U C vS 2 2 C vˇS C 2 vˇ 2 2 dt @t 2 @S @v@S 2 @v 1 1 2 2 @2 U1 @U1 @2 U1 @2 U1 C vS 2 C dt 1 C vˇS vˇ @t 2 @S2 @v@S 2 @v2
d… D
@U1 @U 1 D0 @S @S to eliminate dv terms. This leaves us with
D r…dt D r.U S 1 U1 /dt;
32 Stochastic Volatility Option Pricing Models
487
where we have used the fact that the return on a risk-free portfolio must equal the risk-free rate r. Collecting all U
terms on the left-hand side and all U1 terms on the right-hand side, we find that
@U @2 U @2 U 1 1 @2 U @U C vS 2 2 C vˇS C 2 vˇ 2 2 C rs rU @t 2 @S @v@S 2 @v @S @U @v 1 1 @U1 @2 U1 @2 U1 @2 U1 U1 C vS 2 2 C vˇS C 2 vˇ 2 2 C rS rU1 @t 2 @S @v@S 2 @v @S D : @U1 @v
The left-hand side is a function of U and the right-hand side is a function of U1 only. The only way that this can be true is for both sides to be equal to some function f of the variables S,v,t. 1 1 @2 U @2 U @2 U @U C vS 2 2 C vˇS C 2 vˇ 2 2 @t 2 @S @v@S 2 @v @U U rU D .˛ ˇ/ Crs @S @v
(32.15)
where, without loss of generality, we have written the arbitrary function f of S,v,t as .˛ ˇ/. Conventionally, .S; v; t/ is called the market price of volatility risk because it tells us how much of the expected return of U is explained by the risk of v in the capital asset pricing model framework.
32.4.1 The Heston Model The Heston model (Heston 1993) corresponds to choosing ˛.S; v; t/ D k. v.t//; ˇ.S; v; t/ D 1; D in Equation (32.14). These equations become p v.t/ S.t/ dZ1 .t/ p d v.t/ D Œ v.t/ dt C v.t/ dZ2 .t/
dS.t/ D Sdt C
1 @U @2 U @2 U 1 2 @2 U vS C 2v C rS C vS 2 2 @S @S @v 2 @v @S @U @U rU C D0 CfŒ v.t/ .S; v; t/g @v @t (32.17) A European call option with strike price K and maturing at time T satisfies the PDE Equation (32.19) subject to the following boundary conditions: U.S; v; T / D max.0; S k/; U.0; v; T / D 0; @U .1; v; T / D 1; @S @U @U rS .S; 0; t/ C k .S; 0; t/ rU.S; 0; t/ @S @v @U .S; 0; t/ D 0; C @t U.S; 1; t/ D S: By analogy with the Black-Scholes formula, we guess a solution of the form U .S; v; t/ D SP1 Ke r.T t / P2
(32.16)
with dZ1 dZ2 D dt where is the speed of reversion of v.t/ and represents the long-term mean of volatility process. We substitute the above values for ˛.S; v; t/ and ˇ.S; v; t/ into the general valuation equation to obtain
(32.18)
where the first term is the present value of the spot asset upon optimal exercise, and the second term is the present value of the strike price payment. Let x D ln S , then U .x; v; t / D e x P1 Ker.T t / P2 From Equation (32.20), we have @U @P1 @P2 D ex rKe r.T t / P2 Ke r.T t / ; @t @t @t (32:180 a)
488
C.F. Lee and J.C. Lee
@U @x 1 @U @U D D @S @x @S S @x @P2 1 x @P1 e C e x P1 ke r.T t / D S @x @x
(32.180b)
where @P1 @P2 @U D ex C e x P1 Ke r.T t / ; @x @x @x @U @P1 @P2 D ex ke r.T t / ; @v @v @v
(32.180c)
For the option price to satisfy the boundary condition in Equation (32.10), these PDEs Equation (32.12) are subject to the terminal condition. 1 if x > lnŒk Pj .x; v; T I lnŒk / D 1.x>lnŒk / D 0 o:w: (32.20) Let fj . ; v; t/ denote the characteristic function of Pj .x; v; t/. The inversion of fj . ; v; t/ is given by 1 Pj .x; v; t / D 2
2 2 @2 U x @ P1 r.T t / @ P2 D e ke ; (32.180d) @v2 @v2 @v2
1 1 @U @2 U @ @U D D @S @v @v @S S @v @x 1 x @2 P1 @P1 @2 P2 e C ex ke r.T t / 2 ; D S @x@v @v @x (32.180e) 2
2
.a/
1 @S
@C 1 @x S
@Pj @ Pj @ Pj 1 1 @ Pj v 2 C v C 2 v 2 C r C uj v 2 @x @x@v 2 @v @x @Pj @Pj C D0 for j D 1; 2 (32.19) C a bj v @v @t 2
i e i x fj . ; v; t/ d
2
where u1 D 1=2; u1 D 1=2; a D ; b1 D C; b2 D C:
Z1
2 e i x fj . ; v; t / d 1
Z1
@2 Pj 1 D .c/ @x@v 2
i e i x
@fj . ; v; t / d @v
1
D
Substituting Equation (32:180 a)–(32.180g) into Equation (32.16), we obtain 2
Z1
@Pj 1 D @x 2
@2 Pj 1 .b/ D 2 @x 2
1 @C 1 @2 C C 2 2 2 S @x S @x 1 @P1 @P2 D 2 ex C e x P1 Ke r.T t / S @x @x 2 1 @ P1 @P1 @P1 C ex C 2 ex C ex 2 S @x @x @x 2 x r.T t / @ P2 Ce P1 Ke @x 2 1 @2 P2 @P1 @P2 C Ke r.T t / D 2 ex C ex S @x 2 @x @x @2 P2 (32.180g) Ke r.T t / 2 : @x D
(32.21)
1
2
@C @S
e i x fj . ; v; t / d 1
@ U @ P1 @P1 @ P2 C e x P1 ke r.T t / 2 ; D ex C 2e x @x 2 @x 2 @x @x (32.180f) @2 C @ D 2 @S @S
Z1
Z1
@2 Pj 1 .d/ D @v2 2
e i x
@2 fj . ; v; t/ d @v2
1
1 @Pj D .e/ @t 2
Z1 e i x
@fj . ; v; t / d @t
1
Substituting Equation (32.21) into Equation (32.19) gives @fj 1 2 2 @2 fj v fj iv C v 2 r C uj i fj 2 @v 2 @v @fj @fj C D 0: C a bj v @v @t The above equation can be rewritten as
@2 fj @fj C 2 v ˛fj ˇ @v @v
Ca
@fj @fj C i r fj D 0 @v @t (32.22)
where ˛D
'2 2 C uj i'; ˇ D bj i'; D ; a D # 2 2
To solve for the characteristic function explicitly, we guess the function form
32 Stochastic Volatility Option Pricing Models
fj D exp fC .'; t / C D .'; t/ vg fj .'; v; T / D
1 i' lnŒk exp fC .'; t/ C D .'; t/ vg ; e i'
where the characteristic function of Pj .x; v; T / through Equation (32.20) Z1 fj .'; v; T / D
e i'x Pj .x; v; T / dx
489
D .'; t/ D m
1 e dt ; 1 ge dt
bj C 'i d m D ; mC bj C 'i C d 2 1 ge dt C i'r: C .'; t / D a m t 2 ln 1g where g D
Using the result of Gil-Pelaez (1951), one can invert the characteristic function to get the desired probabilities. Therefore,
1
Z1 D
1 i' lnŒk e e i x dx D : i'
1 1 Pj .x; v; t / D C 2
Z1 Re
e i'x fj .'; v; t/ d': i'
0
lnŒk
It follows that @fj D @t
@C @D Cv @t @t
32.5 Conclusion
fj ;
@fj D Dfj ; @v @2 fj D D 2 fj : @v2 Then . / can be repressed as
@C C aD i' fj @t
@D 2 C ˛ ˇD C D fj D 0 Cv @t
Then Equation (32.22) is satisfied if @C D aD C i'; @t @D D ˛ˇD C D2 D .DmC / .Dm / ; @t p ˇ˙d ˇ ˙ ˇ 2 4˛ m˙ D D ; (32.23) 2 2 where p ˇ 2 4˛ s
2
2
2 D bj i 4 uj i' 2 2 q 2 D i bj 2 2 2 uj i :
In this chapter, we assume that the volatility of underlying asset is a stochastic process. Emphasis will be placed on the nonclosed-form solution developed by Scott (1987) and the closed-form solution of Heston (1993). In both cases, we consider a model in which the variance of stock price returns varies according to an independent diffusion process. For the closed form option pricing model, the results are expressed in terms of the characteristic function.
References Cox, J. C., J. E. Ingersoll and S. A. Ross. 1985. “A Theory of the Term Structure of Interest Rates,” Econometrica 53, 385–407. Gil-Pelaez, J. 1951. “Note on the inversion theorem.” Biometrika 38, 481–482. Heston, S. 1993. “A closed-form solution for options with stochastic volatility with application to bond and currency options.” The Review of Financial Studies 6(2), 327–343. Karlin, S. and H. Taylor. 1981. A Second Course in Stochastic Processes. New York: Academic Press. Levy, P. 1925. Calcul des probabilities, Gauthier-Villars, Paris. Ross, S. 1976. “The arbitrage theory of capital asset pricing.” Journal of Economic Theory, 13, 341–360. Scott, L. 1987. “Option pricing when the variance changes randomly: theory, estimation, and an application.” Journal of Finance and Quantitative Analysis 22(4), 419–438. Wilks, S. S. 1963. Mathematical statistics, 2nd Printing, Wiley, NY.
d D
Integrating Equation (32.23) with the terminal conditions C . ; 0/ D 0 and D . ; 0/ D 0 gives
Appendix 32A The Market Price of the Risk The equation we used to derive a unique solution for the stock price function is closely related to arbitrage pricing theory developed by S. Ross in 1976. Consider some derivative depending on n state variables and time t. Suppose that there
490
C.F. Lee and J.C. Lee
are a total of at least n C 1 traded securities. Some assumptions are made as:
where kj ’s satisfy X
The short selling of securities with full use of proceeds is
permitted. There are no transaction costs and taxes. All securities are perfectly divisible. There are no riskless arbitrage opportunities. Security trading is continuous.
Denote the state variables as ™i and suppose that they follow ItoO 0 s processes: di D m i dt C si i d zi and the correlation between dzi and dzj is ¡ij . Let fj be the price of ith traded security. With ItoO 0 s Lemma, X ij fj d zi dfj D j fj dt C
j
The cost of forming such a portfolio is
i
ij fj D
@i
m i i C
P
kj fj and if there
j
are no arbitrage opportunities, the return rate of portfolio must be equal to the riskfree interest rate, so that X
kj j fj D r
X
j
or
X
kj fj
j
kj fj .j r/ D 0:
(32A.2)
j
where X @fj
(32A.1)
The return of the portfolio is then X kj j fj dt: d… D
i
j fj D
kj ij fj D 0:
j
1 X @2 fj @f C i k si sk mi mk @t 2 @i @k i;k
Equations (32A.1) and (32A.2) can be regarded as n C 1 homogeneous linear equations in kj ’s. The kj ’s are not all zero if and only if the equations are consistent. In other words, fj .j r/ is a linear combination of ¢ij fj ’s, so that
@fj si i : @i
fj .j r/ D
X
i ij fj
i
In these equations, j is the instantaneous mean rate of return provided by fj . Since there are n C 1 securities and n sources of variation, an instantaneous riskless portfolio … can be obtained as …D
X j
kj fj ; i D 1; ; n
or fj .j r/ D
X
i ij fj ;
i
in which œi is the market price of the risk for ™i .
(32A.3)
Chapter 33
Derivations and Applications of Greek Letters: Review and Integration Hong-Yi Chen, Cheng-Few Lee, and Weikang Shih
Abstract In this chapter, we introduce the definitions of Greek letters. We also provide the derivations of Greek letters for call and put options on both dividends-paying stock and non-dividends stock. Then we discuss some applications of Greek letters. Finally, we show the relationship between Greek letters, with one of the examples from the Black– Scholes partial differential equation.
@… @S where … is the option price and S is underlying asset price. We next show the derivation of delta for various kinds of stock option.
Keywords Greek letters r Delta r Theta r Gamma r Vega r Rho r Black–Scholes option pricing model r Black–Scholes partial differential equation
33.2.1 Derivation of Delta for Different Kinds of Stock Options
D
From Black–Scholes option pricing model, we know the price of call option on a non-dividend stock can be written as:
33.1 Introduction “Greek letters” are defined as the sensitivities of the option price to a single-unit change in the value of either a state variable or a parameter. Such sensitivities can represent the different dimensions to the risk in an option. Financial institutions who sell options to their clients can manage their risk by Greek letters analysis. In this chapter, we will discuss the definitions and derivations of Greek letters. We will also specifically derive Greek letters for call (put) options on non-dividend stock and dividends-paying stock. Some examples are provided to explain the application of Greek letters. Finally, we will describe the relationship between Greek letters and the implication in a delta neutral portfolio.
Ct D St N.d1 / Xe r N.d2 /
(33.1)
and the price of put option on a non-dividend stock can be written as: Pt D Xer£ N .d2 / St N .d1 /
(33.2)
where
s2 C r C X 2 d1 D p s
St ¢2 ln X C r 2s £ p d2 D p D d 1 ¢s £ ¢s £ ln
St
DT t
33.2 Delta () The delta of an option, , is defined as the rate of change of the option price with respect to the rate of change of underlying asset price:
N ./ is the cumulative density function of normal distribution. Z N .d1 / D
d1
1
Z f .u/ du D
d1
1
1 u2 p e 2 du: 2
First, we calculate H.-Y. Chen (), C.-F. Lee, and W. Shih Rutgers University, Newark, NJ, USA e-mail: [email protected]
N 0 .d1 / D
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_33,
d12 @N .d1 / 1 D p e 2 @d1 2
(33.3)
491
492
H.-Y. Chen et al.
N 0 .d2 / D
d2 1 St 1 1 D Xer£ p e 2 er£ p X St ¢s £ 2
@N .d2 / @d2
d22 1 D p e 2 2
.d1 s 1 2 D p e 2
d2 1 1 1 .1 N .d1 // C St p e 2 p St ¢s £ 2
p 2 /
D St
p d12 s2 1 D p e 2 e d1 s e 2 2 d12 ln. SXt /C 1 D p e 2 e 2
1 Dp e 2
d2 21
2 rC 2s
C St e
s2 2
St r e X
(33.4)
and
The derivation of Equation 33.5 is in the following:
where
@Ct @N .d1 / @N .d2 / D N .d1 / C St Xer£ @St @St @St D N .d1 / C St
1 D N .d1 / C St p e 2 Xe
r£
1 p St ¢s £
d21 St 1 1 p e 2 er£ p X St ¢s £ 2
d21 d2 1 1 1 p p D N .d1 / C St e 2 St e 2 St ¢s 2 £ St ¢s 2 £
D N .d1 / For a European put option on a non-dividend stock, delta can be shown as (33.6)
D N.d1 / 1 The derivation of Equation 33.6 is
D
@N .d2 / @N .d1 / @Pt D Xer£ N .d1 / St @St @St @St D Xe
r£ @.1
St
Ct D St eq£ N .d1 / Xer£ N .d2 /
(33.7)
Pt D Xer£ N .d2 / St eq£ N .d1 /
(33.8)
d1 D
@N .d1 / @d1 @N .d2 / @d2 Xer£ @d1 @St @d2 @St d2 21
N .d2 // @d2 .1 N .d1 // @d2 @St
@.1 N .d1 // @d1 @d1 @St
d2 1 1 e 2 D N .d1 / 1 p St ¢s 2 £
If the underlying asset is a dividend-paying stock providing a dividend yield at rate q, Black–Scholes formulas for the prices of a European call option on a dividend-paying stock and a European put option on a dividend-paying stock are
Equations (33.2) and (33.3) will be used repetitively in determining following Greek letters when the underlying asset is a non-dividend paying stock. For a European call option on a non-dividend stock, delta can be shown as (33.5)
D N.d1 /
D
d2 1 1 p e 2 N .d1 / 1 St ¢s 2 £
d2 D
ln
St
ln
St
X
X
C rqC p ¢s £ C rq p ¢s £
¢s2 2
¢s2 2
£
£
p D d 1 ¢s £
To make the following derivations more easily, we calculate Equations (33.9) and (33.10) in advance. N0 .d1 / D
d12 @N .d1 / 1 D p e 2 @d1 2
N0 .d2 / D
@N .d2 / @d2
(33.9)
d2 1 2 D p e 2 2
.d1 ¢s 1 2 D p e 2
p 2 £/
p d2 ¢s2 £ 1 1 D p e 2 ed1 ¢s £ e 2 2 d2 ln. SXt /C 1 1 D p e 2 e 2
¢2 rqC 2s £
d2 1 St 1 D p e 2 e.rq/£ X 2
e
¢s2 £ 2
(33.10)
33 Derivations and Applications of Greek Letters: Review and Integration
For a European call option on a dividend-paying stock, delta can be shown as (33.11)
D eq£ N.d1 / The derivation of Equation (33.11) is
D
@N .d1 / @N .d2 / @Ct D eq£ N .d1 / C St eq£ Xer£ @St @St @St D eq£ N .d1 / C St eq£ Xer£
@N .d1 / @d1 @d1 @St
@N .d2 / @d2 @d2 @St
d21 1 1 D eq£ N .d1 / C St eq£ p e 2 p S ¢ 2 t s £ d21 1 St 1 Xer£ p e 2 e.rq/£ p X S ¢ 2 t s £
D eq£ N .d1 / C St eq£ St eq£
1 p
St ¢s 2 £
d21
e 2
d21 1 p e 2 St ¢s 2 £
D eq£ N .d1 / For a European call option on a dividend-paying stock, delta can be shown as
D eq£ ŒN.d1 / 1 :
(33.12)
The derivation of Equation (33.12) is
D
@N .d2 / @Pt D Xer£ eq£ N .d1 / @St @St St eq£ D Xer£
@N .d1 / @St
@.1 N .d2 // @d2 eq£ .1 N .d1 // @d2 @St
St eq£
33.2.2 Application of Delta Figure 33.1 shows the relationship between the price of a call option and the price of its underlying asset. The delta of this call option is the slope of the line at the point of A corresponding to current price of the underlying asset. By calculating delta ratio, a financial institution that sells options to a client can make a delta neutral position to hedge the risk of changes of the underlying asset price. Suppose that the current stock price is $100, the call option price on stock is $10, and the current delta of the call option is 0.4. A financial institution sold 10 call option to its client, so that the client has right to buy 1,000 shares at time to maturity. To construct a delta hedge position, the financial institution should buy 0:4 1;000 D 400 shares of stock. If the stock price goes up to $1, the option price will go up by $0.4. In this situation, the financial institution has a $400 ($1 400 shares) gain in its stock position, and a $400 ($0:4 1;000 shares) loss in its option position. The total payoff of the financial institution is zero. On the other hand, if the stock price goes down by $1, the option price will go down by $0.4. The total payoff of the financial institution is also zero. However, the relationship between option price and stock price is not linear, so the delta changes over different stock prices. If an investor wants his portfolio to remain delta neutral, he should adjust his hedged ratio periodically. The more frequent adjustments he makes, the better delta-hedging he gets. Figure 33.2 exhibits the change in delta affects the deltahedges. If the underlying stock has a price equal to $20, then the investor who uses only delta as a risk measure will consider that his portfolio has no risk. However, as the underlying stock prices changes, either up or down, the delta changes as well, and thus the investors will have to use different delta hedging. Delta measure can be combined with other risk measures to yield better risk measurement. We will discuss it further in the following sections.
@.1 N .d1 // @d1 @d1 @St
d21 1 St 1 D Xer£ p e 2 e.rq/£ p X St ¢s £ 2 d21 1 1 p eq£ .1N .d1 //CSt eq£ p e 2 St ¢s £ 2
D St eq£
d21 1 e 2 C eq£ .N .d1 / 1/ p St ¢s 2 £
C St eq£ De
q£
1 p
St ¢s 2 £
.N .d1 / 1/
493
d21
e 2
Fig. 33.1
494
H.-Y. Chen et al.
@N.d1 / @d1 @N.d2 / @d2 rX er£ N.d2 / C Xer£ @d1 @£ @d2 @£ ! 2 ¢ ¢2 r C 2s r C 2s ln Xst 1 d21 p p D St p e 2 2¢s £3=2 2¢s £ ¢s £ 2
D St
rX er£ N.d2 / C Xer£
d2 1 1 D St p e 2 2
! ¢2 ¢2 r C 2s ln Xst r C 2s p p 2¢s £3=2 2¢s £ ¢s £
rX er£ N.d2 /
33.3 Theta (‚) The theta of an option, ‚, is defined as the rate of change of the option price respected to the passage of time:
where … is the option price and t is the passage of time. If £ D T t, theta .‚/ can also be defined as minus one timing the rate of change of the option price respected to time to maturity. The derivation of such transformation is easy and straightforward: @… @£ @… @… D D .1/ ‚D @t @£ @t @£ where £ D T t is time to maturity. For the derivation of theta for various kinds of stock option, we use the definition of negative differential on time to maturity.
33.3.1 Derivation of Theta for Different Kinds of Stock Options For a European call option on a non-dividend stock, theta can be written as: St ¢s ‚ D p N0 .d1 / rX er£ N.d2 /: 2 £ The derivation of Equation (33.13) is @N.d1 / @Ct D St @£ @£
C .r/ X er£ N.d2 / C Xer£
d2 1 1 C St p e 2 2
d2 1 1 D St p e 2 2
@… ‚D @t
‚D
ln Xst r C 2s r p p ¢s £ 2¢s £3=2 2¢s £
Fig. 33.2
d2 1 St 1 p e 2 er£ X 2 ! ¢2
@N.d2 / @£
(33.13)
! ¢2 r C 2s ln Xst r p p ¢s £ 2¢s £3=2 2¢s £ ¢s2 2
p ¢s £
! rX er£ N.d2 /
St ¢s D p N0 .d1 / rX er£ N.d2 / 2 £ For a European put option on a non-dividend stock, theta can be shown as St ¢s ‚ D p N0 .d1 / C rX er£ N.d2 /: 2 £
(33.14)
The derivation of Equation (33.14) is @Pt D .r/ X er£ N.d2 / @£ @N.d2 / @N.d1 / C St Xer£ @£ @£
‚D
D .r/X er£ .1 N.d2 // Xer£
@.1 N.d2 // @d2 @d2 @£
@.1 N.d1 // @d1 D .r/X er£ .1 N.d2 // @d1 @£
1 d21 St r£ r£ 2 e C Xe p e X 2 ! ¢2 r C 2s ln Xst r p p ¢s £ 2¢s £3=2 2¢s £ 0 1 st ¢s2 ¢s2 2 d rC rC 2 ln X 1 1 St p e 2 @ p 2 p A 3 ¢ £ 2¢ 2 s =2 s £ 2¢s £ C St
D rX er£ .1 N.d2 //
33 Derivations and Applications of Greek Letters: Review and Integration
! ¢2 r C 2s ln Xst r p p ¢s £ 2¢s £3=2 2¢s £ ! ¢2 ¢2 r C 2s r C 2s ln Xst 1 d21 p p St p e 2 2¢s £3=2 2¢s £ ¢s £ 2 ! ¢s2 d21 1 2 p D rX er£ .1 N.d2 // St p e 2 ¢s £ 2 d21 1 C St p e 2 2
St ¢s D rX er£ .1 N.d2 // p N0 .d1 / 2 £ St ¢s D rX er£ N.d2 / p N0 .d1 / 2 £
495
D q St e
q£
N.d1 / St e
q£
d2 1 1 p e 2 2
¢s2 2
!
p ¢s £
rX er£ N.d2 / D q St eq£ N.d1 /
St eq£ ¢s N0 .d1 / rX er£ N.d2 / p 2 £
For a European call option on a dividend-paying stock, theta can be shown as ‚ D rX er£ N.d2 / qSt eq£ N.d1 /
St eq£ ¢s p N0 .d1 /: 2 £ (33.16)
The derivation of Equation (33.16) is For a European call option on a dividend-paying stock, theta can be shown as ‚ D q St eq£ N.d1 /
St eq£ ¢s p N0 .d1 / rX er£ N.d2 /: 2 £ (33.15)
The derivation of Equation (33.15) is @Ct @N.d1 / D q St eq£ N.d1 / St eq£ @£ @£ @N.d2 / C .r/ X er£ N.d2 / C Xer£ @£ @N.d1 / @d1 rX er£ N.d2 / D q St eq£ N.d1 / St eq£ @d1 @£
‚D
@N.d2 / @d2 D q St eq£ N.d1 / St eq£ @d2 @£ ! ¢2 ¢2 r q C 2s r q C 2s ln Xst 1 d21
p e 2 p p 2¢s £3=2 ¢s £ 2¢s £ 2
d21 1 St rX er£ N.d2 / C Xer£ p e 2 e.rq/£ X 2 ! st 2 ¢ r q C 2s ln X rq p p ¢s £ 2¢s £3=2 2¢s £ C Xer£
D q St eq£ N.d1 / St eq£ d2 1 1
p e 2 2
! 2 ¢2 r C 2s r C 2s ln Xst p p 2¢s £3=2 2¢s £ s £
rX er£ N.d2 /
0 1 st ¢s2 d21 r C ln 1 r X p2 A C St eq£ p e 2 @ p 3= ¢s £ 2¢ £ 2 s 2¢s £ 2
@N.d2 / @Pt D .r/ X er£ N.d2 / Xer£ @£ @£ @N.d1 / C .q/St eq£ N.d1 / C St eq£ @£
‚D
D rX er£ .1 N.d2 // Xer£ qSt eq£ N.d1 / C St eq£
@.1 N.d2 // @d2 @d2 @£
@.1 N.d1 // @d1 @d1 @£
D rX er£ .1 N.d2 // C Xer£
d2 1 St 1 p e 2 e.rq/£ X 2 ! st ¢2 ln X r q C 2s rq p p ¢s £ 2¢s £3=2 2¢s £ d2 1 1 qSt eq£ N.d1 / St eq£ p e 2 2 ! 2 ¢ ¢2 r q C 2s r q C 2s ln Xst p p 2¢s £3=2 ¢s £ 2¢s £ d2 1 1 D rX er£ .1 N.d2 // C St eq£ p e 2 2 ! st ¢2 r q C 2s ln X rq p p ¢s £ 2¢s £3=2 2¢s £ d2 1 1 qSt eq£ N.d1 / St eq£ p e 2 2 ! 2 ¢ ¢2 r q C 2s r q C 2s ln Xst p p 2¢s £3=2 ¢s £ 2¢s £
496
H.-Y. Chen et al.
D rX er£ .1 N.d2 // qSt eq£ N.d1 / ! ¢s2 d2 q£ 1 21 2 p St e p e ¢s £ 2
The derivation of Equation (33.17) is t @ @C @N .d1 / @d1 @St @2 Ct D D D 2 @St @d1 @St @St
D rX er£ .1 N.d2 // qSt eq£ N.d1 /
1 St
1 p D p N0 .d1 / ¢s £ St ¢s £
D N0 .d1 /
q£
St e ¢s p N0 .d1 / 2 £ D rX er£ N.d2 / qSt eq£ N.d1 /
St eq£ ¢s p N0 .d1 / 2 £
For a European put option on a non-dividend stock, gamma can be shown as 1 p N0 .d1 / : St ¢s £
D
33.3.2 Application of Theta (‚)
(33.18)
The derivation of Equation (33.18) is The value of option is the combination of time value and stock value. When time passes, the time value of the option decreases. Thus, the rate of change of the option price with respective to the passage of time, theta, is usually negative. Because the passage of time on an option is not uncertain, we do not need to make a theta hedge portfolio against the effect of the passage of time. However, we still regard theta as a useful parameter, because it is a proxy of gamma in the delta neutral portfolio. For the specific detail, we will discuss in the following sections.
2
@ Pt D @St 2
D
D N0 .d1 /
@
@Pt @St
D
@St
@.N .d1 / 1/ @d1 @d1 @St
1 St
1 p D p N0 .d1 / ¢s £ St ¢s £
For a European call option on a dividend-paying stock, gamma can be shown as eq£ p N0 .d1 / : St ¢s £
D
33.4 Gamma ()
(33.19)
The derivation of Equation (33.19) is The gamma of an option, , is defined as the rate of change of delta respected to the rate of change of underlying asset price: D
D
@2 … @
D @S @S2
@
2
@ Ct D @St 2
@Ct @St
@St
@ .eq£ N.d1 // @St
D
@N .d1 / @d1 St D eq£ N0 .d1 / p @d1 @St ¢s £
eq£ p N0 .d1 / St ¢s £
For a European call option on a dividend-paying stock, gamma can be shown as D
33.4.1 Derivation of Gamma for Different Kinds of Stock Options
eq£ p N0 .d1 / : St ¢s £
The derivation of Equation (33.20) is
For a European call option on a non-dividend stock, gamma can be shown as 1 p N0 .d1 / : St ¢s £
D
1
D eq£
where … is the option price and S is the underlying asset price. Because the option is not linearly dependent on its underlying asset, delta-neutral hedge strategy is useful only when the movement of underlying asset price is small. Once the underlying asset price moves wider, gamma-neutral hedge is necessary. We next show the derivation of gamma for various kinds of stock options.
D
(33.17)
D
2
@ Pt D @St 2
D eq£
@
@Pt @St
@St
D
@ .eq£ .N .d1 / 1// @St
@ Œ.N .d1 / 1/ @d1 @d1 @St
(33.20)
33 Derivations and Applications of Greek Letters: Review and Integration
D eq£ N0 .d1 / D
1 St
p ¢s £
eq£ 0 p N .d1 / St ¢s £
33.4.2 Application of Gamma () One can use delta and gamma together to calculate the changes of the option due to changes in the underlying stock price. This change can be approximated by the following relations.
497
The above analysis can also be applied to measure the price sensitivity of interest rate- related assets or portfolio to interest rate changes. Here we introduce Modified Duration and Convexity as risk measure corresponding to the above delta and gamma. Modified duration measures the percentage change in asset or portfolio value resulting from a percentage change in interest rate. Modified Duration D
Change in price Change in interest rate
Price
D =P Using the modified duration,
changeinoptionvalue changeinstockprice 1 C .changeinstockprice/2 2 From the above relation, one can observe that the gamma makes the correction for the fact that the option value is not a linear function of underlying stock price. This approximation comes from the Taylor series expansion near the initial stock price. If we let V be option value, S be stock price, and S0 be initial stock price, then the Taylor series expansion around S0 yields the following. @V .S0 / 1 @2 V .S0 / .S S0 / C .S S0 /2 @S 2Š @S 2 1 @n V .S0 / CC .S S0 /n 2Š @S n @V .S0 / .S S0 / V .S0 / C @S
V .S / V .S0 / C
C
1 @2 V .S0 / .S S0 /2 C o.S / 2Š @S 2
If we only consider the first three terms, the approximation is then, @V .S0 / 1 @2 V .S0 / .S S0 / C .S S0 /2 @S 2Š @S 2 1 .S S0 / C .S S0 /2 2
V .S / V .S0 /
For example, if a portfolio of options has a delta equal to $10,000 and a gamma equal to $5,000, the change in the portfolio value if the stock price drops to $34 from $35 is approximately change in portfolio value .$10000/ .$34 $35/ C
1
.$5000/ .$34 $35/2 2
$7500
Change in Portfolio Value D Change in interest rate D .Duration P/
Change in interest rate we can calculate the value changes of the portfolio. The above relation corresponds to the previous discussion of delta measure. We want to know how the price of the portfolio changes given a change in interest rate. Similar to delta, modified duration only shows the first order approximation of the changes in value. In order to account for the nonlinear relation between the interest rate and portfolio value, we need a second order approximation similar to the previous gamma measure; this can become the convexity measure. Convexity is the interest rate gamma divided by price, Convexity D =P and this measure captures the nonlinear part of the price changes due to interest rate changes. Using the modified duration and convexity together allow us to develop first as well as second order approximation of the price changes similar to previous discussion. Change in Portfolio Value Duration P
.change in rate/ C
1
Convexity P 2
.change in rate/2 : As a result, .duration P/ and (convexity P) act like the delta and gamma measure, respectively, in the previous discussion. This shows that these Greeks can also be applied in measuring risk in interest rate-related assets or portfolio. Next we discuss how to make a portfolio gamma neutral. Suppose the gamma of a delta-neutral portfolio is , the gamma of the option in this portfolio is o , and ¨o is the
498
H.-Y. Chen et al.
number of options added to the delta-neutral portfolio. Then, the gamma of this new portfolio is
h 0 St 2 3=2 1 d21 @ ¢s £ ln X C r C D St p e 2 ¢s2 £ 2
¨o o C :
0
To make a gamma-neutral portfolio, we should trade ¨o D = o options. Because the position of option changes, the new portfolio is not in the delta-neutral. We should change the position of the underlying asset to maintain delta-neutral. For example, the delta and gamma of a particular call option are 0.7 and 1.2, respectively. A delta-neutral portfolio has a gamma of 2;400. To make a delta-neutral and gamma-neutral portfolio, we should add a long position of 2;400=1:2 D 2;000 shares and a short position of 2;000
0:7 D 1;400 shares in the original portfolio.
1 St p e 2
d2 21
d2 1 1 D St p e 2 2
@
h ln SXt C r C
¢s2 2
¢s2 £
¢s2 £3=2 ¢s2 £
¢s2 2
i 11 £ £2 A
i 11 £ £2 A
p D St £ N0 .d1 /
For a European put option on a non-dividend stock, vega can be shown as p D St £ N0 .d1 / : (33.22) The derivation of Equation (33.22) is
33.5 Vega ( ) D The vega of an option, , is defined as the rate of change of the option price respected to the volatility of the underlying asset: @… D @¢ where … is the option price and ¢ is volatility of the stock price. We next show the derivation of vega for various kinds of stock option.
33.5.1 Derivation of Vega for Different Kinds of Stock Options For a European call option on a non-dividend stock, vega can be shown as p (33.21) D St £ N0 .d1 / : The derivation of Equation (33.21) is D
d21 1 St p e 2 er£ X 2
i 11 0 h ¢2 ln SXt C r C 2s £ £ 2 A @ ¢s2 £
D Xer£
@.1 N.d2 // @d2 @.1 N.d1 // @d1 St @d2 @¢s @d1 @¢s
1 d21 St r£ 2 D Xe p e e X 2
i 11 0 h ¢2 ln SXt C r C 2s £ £ 2 A @ ¢s2 £ r£
0 1 C St p e 2
d2 21
@
1 C St p e 2
¢s2 2
i 11 £ £2 A
h ¢s2 £3=2 ln SXt C r C
d2 21
d2 1 1 D St p e 2 2
@
¢s2 2
¢s2 £
0 h St 1 d21 @ ln X C r C D St p e 2 ¢s2 £ 2 0
@N.d1 / @N.d2 / @Ct D St Xer£ @¢s @¢s @¢s
@N.d1 / @d1 @N.d2 / @d2 D St Xer£ @d1 @¢s @d2 @¢s h 0 St 2 3=2 1 d21 @ ¢s £ ln X C r C D St p e 2 ¢s2 £ 2
@N.d2 / @N.d1 / @Pt D Xer£ St @¢s @¢s @¢s
¢s2 2
i 11 £ £2 A
h ¢s2 £3=2 ln SXt C r C
¢s2 £3=2 ¢s2 £
i 11 £ £2 A
¢s2 £
¢s2 2
i 11 £ £2 A
p D St £ N0 .d1 /
Xer£
For a European call option on a dividend-paying stock, vega can be shown as p D St eq£ £ N0 .d1 / :
(33.23)
33 Derivations and Applications of Greek Letters: Review and Integration
0
The derivation of Equation (33.23) is
@
@N.d1 / @N.d2 / @Ct D St eq£ Xer£ D @¢s @¢s @¢s
d2 21
¢s2 2
i 11 £ £2 A
d21 St 1 Xer£ p e 2 e.rq/£ X 2
i 11 0 h ¢2 ln SXt C r q C 2s £ £ 2 A @ ¢s2 £
¢s2 2
¢s2 2
i 11 £ £2 A
p D St eq£ £ N0 .d1 / :
(33.24)
The derivation of Equation (33.24) is @N.d2 / @N.d1 / @Pt D Xer£ St eq£ @¢s @¢s @¢s @.1 N.d2 // @d2 @d2 @¢s
@.1 N.d1 // @d1 @d1 @¢s
2 d 1 1 St .rq/£ r£ 2 D Xe p e e X 2 St eq£
i 11 £ £2 A
i 11 £ £2 A
¢s2 2
i 11 £ £2 A
2 3=2
1 d21 ¢ £ 2 s2 D St e p e ¢s £ 2 p D St eq£ £ N0 .d1 /
33.5.2 Application of Vega ( )
For a European call option on a dividend-paying stock, vega can be shown as
D Xer£
¢s2 2
¢s2 2
q£
2 3=2
d2 ¢ £ 1 1 D St p e 2 s 2 ¢s £ 2 p D St eq£ £ N0 .d1 /
D
i 11 £ £2 A
d2 1 1 C St eq£ p e 2 2 h 0 ¢s2 £3=2 ln SXt C r q C @ ¢s2 £
i 11 £ £2 A
0 h ln SXt C r q C d21 1 St eq£ p e 2 @ ¢s2 £ 2
¢s2 2
¢s2 £
d2 1 1 D St eq£ p e 2 2 0 h St ln X C r q C @ ¢s2 £
d21 1 D St eq£ p e 2 2 h 0 ¢s2 £3=2 ln SXt C r q C @ ¢s2 £
h ln SXt C r q C
d2 1 1 C St eq£ p e 2 2 h 0 ¢s2 £3=2 ln SXt C r q C @ ¢s2 £
@N.d1 / @d1 @N.d2 / @d2 D St eq£ Xer£ @d1 @¢s @d2 @¢s 1 D St eq£ p e 2 h 0 ¢s2 £3=2 ln SXt C r q C @ ¢s2 £
499
Suppose a delta-neutral and gamma-neutral portfolio has a vega equal to and the vega of a particular option is o . Similar to gamma, we can add a position of =o in option to make a vega-neutral portfolio. To maintain delta-neutral, we should change the underlying asset position. However, when we change the option position, the new portfolio is not gamma-neutral. Generally, a portfolio with one option cannot maintain its gamma-neutral and vega-neutral at the same time. If we want a portfolio to be both gamma-neutral and vega-neutral, we should include at least two kind of options on the same underlying asset in our portfolio. For example, a delta-neutral and gamma-neutral portfolio contains option A, option B, and underlying asset. The gamma and vega of this portfolio are 3; 200 and 2; 500, respectively. Option A has a delta of 0.3, gamma of 1.2, and vega of 1.5. Option B has a delta of 0.4, gamma of 1.6 and vega of 0.8. The new portfolio will be both gamma-neutral and vega-neutral when adding ¨A of option A and ¨B of option B into the original portfolio.
500
H.-Y. Chen et al.
p
d2 £ 1 St 1 p e 2 er£ X ¢ 2 s p
2 d 1 £ 1 C X£ er£ N.d2 / D St p e 2 ¢s 2 p
d2 1 £ 1 St p e 2 ¢s 2
Gamma Neutral W 3200 C 1:2¨A C 1:6¨B D 0
Vega Neutral W 2500 C 1:5¨A C 0:8¨B D 0 From two equations shown above, we can get the solution that ¨A D 1; 000 and ¨B D 1; 250. The delta of new portfolio is 1; 000 0:3 C 1; 250 0:4 D 800. To maintain delta-neutral, we need to short 800 shares of the underlying asset.
D X£ er£ N.d2 / For a European put option on a non-dividend stock, rho can be shown as
33.6 Rho (ρ) The rho of an option is defined as the rate of change of the option price respected to the interest rate: rho D
@… @r
33.6.1 Derivation of Rho for Different Kinds of Stock Options For a European call option on a non-dividend stock, rho can be shown as (33.25)
The derivation of Equation (33.25) is @N.d1 / @Ct D St .£/ X er£ N.d2 / rho D @r @r @N.d2 / Xer£ @r @N.d1 / @d1 C X£ er£ N.d2 / D St @d1 @r @N.d2 / @d2 @d2 @r p
d21 1 £ D St p e 2 ¢ 2 s
(33.26)
The derivation of Equation (33.26) is @Pt D .£/ X er£ N.d2 / @r @N.d2 / @N.d1 / St C Xer£ @r @r @.1 N.d2 // @d2 D X£ er£ .1 N.d2 // C Xer£ @d2 @r
rho D
where … is the option price and r is interest rate. The rho for an ordinary stock call option should be positive because higher interest rate reduces the present value of the strike price, which in turn increases the value of the call option. Similarly, the rho of an ordinary put option should be negative by the same reasoning. We next show the derivation of rho for various kinds of stock option.
rho D X£ er£ N.d2 /:
rho D X£ er£ N.d2 /:
St
@.1 N.d1 // @d1 @d1 @r
D X£ er£ .1 N.d2 // Xer£
p
£ 1 d21 St r£ p e 2 e X ¢s 2 p
2 d1 1 £ C St p e 2 ¢s 2 d2 1 1 D X£ e .1 N.d2 // St p e 2 2 p
2 d 1 £ 1 C St p e 2 ¢ 2 s
r£
p
£ ¢s
D X£ er£ N.d2 / For a European call option on a dividend-paying stock, rho can be shown as rho D X£ er£ N.d2 /:
(33.27)
The derivation of Equation (33.27) is
Xer£
rho D
@N.d1 / @Ct D St eq£ @r @r .£/ X er£ N.d2 / Xer£
C X£ er£ N.d2 / Xer£ D St eq£
@N.d2 / @r
@N.d1 / @d1 C X£ er£ N.d2 / @d1 @r
33 Derivations and Applications of Greek Letters: Review and Integration
@N.d2 / @d2 @d2 @r p
d2 £ q£ 1 21 C X£ er£ N.d2 / D St e p e ¢s 2
p
1 d21 St .rq/£ £ r£ 2 Xe p e e X ¢s 2 p
2 d 1 1 D St eq£ p e 2 C X£ er£ N.d2 / ¢ 2 s p
d2 £ q£ 1 21 St e p e ¢ 2 s Xer£
D X£ er£ N.d2 / For a European put option on a dividend-paying stock, rho can be shown as rho D X£ er£ N.d2 /:
(33.28)
501
Rhoput D X£er£ N.d2 / D .$58/.0:25/e.0:05/.0:25/N ! ln.65=58/ C 0:05 12 .0:3/2 .0:25/ p Š 3:168: .0:3/ 0:25 This calculation indicates that given 1% change increase in interest rate, say from 5% to 6%, the value of this European call option will decrease 0.03168 .0:01 3:168/. This simple example can be further applied to stocks that pay dividends using the derivation results shown previously.
33.7 Derivation of Sensitivity for Stock Options Respective with Exercise Price For a European call option on a non-dividend stock, the sensitivity can be shown as
The derivation of Equation (33.28) is @Pt D .£/ X er£ N.d2 / rho D @r @N.d2 / @N.d1 / St eq£ C Xer£ @r @r @.1 N.d2 // @d2 D X£ er£ .1 N.d2 // C Xer£ @d2 @r St eq£
@.1 N.d1 // @d1 @d1 @r
D X£ er£ .1 N.d2 //
p
d21 St £ 1 Xer£ p e 2 e.rq/£ X ¢s 2 p
d21 1 £ C St eq£ p e 2 ¢s 2 p
d21 1 £ D X£ er£ .1 N.d2 // St eq£ p e 2 ¢s 2 p
2 d1 1 £ C St eq£ p e 2 ¢s 2
@Ct D er£ N.d2 /: @X The derivation of Equation (33.29) is
@N.d1 / @N.d2 / @Ct D St e r N.d2 / Xe r @X @X @X @N.d1 / @d1 @N.d2 / @d2 e r N.d2 / Xe r D St @d1 @X @d2 @X
2 d1 1 1 1 e r N.d2 / Xe r D St p e 2 p X s 2
d12 1 1 St r 1 e
p e 2 p X X s 2
2 d1 St 1 e r N.d2 / D p e 2 X s 2
d2 St 1 21 e D e r N.d2 / p X s 2 For a European put option on a non-dividend stock, the sensitivity can be shown as
D X£ er£ N.d2 /
33.6.2 Application of Rho (ρ) Assume that an investor would like to see how interest rate changes affect the value of a 3-month European put option she holds with the following information. The current stock price is $65 and the strike price is $58. The interest rate and the volatility of the stock is 5% and 30% per year, respectively. The rho of this European put can be calculated as follows:
(33.29)
@Pt D er£ N.d2 /: @X
(33.30)
The derivation of Equation (33.30) is @Pt @N.d2 / @N.d1 / D er£ N.d2 / C Xer£ St @X @X @X @.1 N.d2 // @d2 D er£ .1 N.d2 // C Xer£ @d2 @X St
@.1 N.d1 // @d1 @d1 @X
502
H.-Y. Chen et al.
D er£ .1 N.d2 //
d21 1 1 St 1 p Xer£ p e 2 er£ X X ¢s £ 2
2 d1 1 1 1 C St p e 2 p X ¢s £ 2
d21 St 1 D er£ .1 N.d2 // C p e 2 X ¢s 2 £
2 d1 St 1 e 2 p X ¢s 2 £ D er£ N.d2 / For a European call option on a dividend-paying stock, the sensitivity can be shown as @Ct D er£ N.d2 /: @X
(33.31)
The derivation of Equation (33.31) is @N.d1 / @N.d2 / @Ct D St eq£ er£ N.d2 / Xer£ @X @X @X @N.d1 / @d1 @N.d2 / @d2 er£ N.d2 / Xer£ D St eq£ @d1 @X @d2 @X
2 d1 1 1 1 D St eq£ p e 2 p X ¢s £ 2 er£ N.d2 / Xer£
1 1 1 d21 St .rq/£ 2
p e e p X X ¢s £ 2
2 q£ d1 1 St e e 2 D p er£ N.d2 / X ¢s 2 £
d2 1 St eq£ 21 p e X ¢s 2 £
St eq£
@.1 N.d1 // @d1 @d1 @X
D er£ .1 N.d2 // Xer£
1 d21 St .rq/£ 1 1 2 p e e p X X ¢s £ 2
d2 1 1 1 1 C St eq£ p e 2 p X ¢ £ 2 s q£
d2 St e 1 1 D er£ .1 N.d2 // C p e 2 X ¢s 2 £
d2 St eq£ 1 1 p e 2 X ¢s 2 £ D er£ N.d2 /
33.8 Relationship Between Delta, Theta, and Gamma So far, the discussion has introduced the derivation and application of each individual Greeks and how they can be applied in portfolio management. In practice, the interaction or tradeoff between these parameters is of concern as well. For example, recall the partial differential equation for the Black–Scholes formula with non-dividend paying stock can be written as @… 1 @2 … @… C rS C 2 S 2 2 D r…: @t @S 2 @S Where … is the value of the derivative security contingent on stock price, S is the price of stock, r is the risk free rate, is the volatility of the stock price, and t is the time to expiration of the derivative. Given the earlier derivation, we can rewrite the Black–Scholes PDE as 1 ‚ C rS C 2 S 2 D r…: 2
D er£ N.d2 / For a European put option on a dividend-paying stock, the sensitivity can be shown as @Pt D er£ N.d2 /: @X
(33.32)
The derivation of Equation (33.32) is @Pt @N.d2 / @N.d1 / D er£ N.d2 / C Xer£ St eq£ @X @X @X @.1 N.d // @d 2 2 D er£ .1 N.d2 // C Xer£ @d2 @X
This relation gives us the tradeoff between delta, gamma, and theta. For example, suppose there are two delta neutral . D 0/ portfolios, one with positive gamma . > 0/ and the other one with negative gamma . < 0/ and they both have value of $1.… D 1/. The tradeoff can be written as 1 ‚ C 2 S 2 D r: 2 For the first portfolio, if gamma is positive and large, then theta is negative and large. When gamma is positive, change in stock prices results in higher value of the option. This
33 Derivations and Applications of Greek Letters: Review and Integration
means that when there is no change in stock prices, the value of the option declines as we approach the expiration date. As a result, the theta is negative. On the other hand, when gamma is negative and large, change in stock prices result in lower option value. This means that when there are no stock price changes, the value of the option increases as we approach the expiration and theta is positive. This gives us a tradeoff between gamma and theta and they can be used as proxy for each other in a delta neutral portfolio.
33.9 Conclusion In this chapter we have shown the derivation of the sensitivities of the option price to the change in the value of state variables or parameters. The first Greek is delta . /, which is the rate of change of option price to change in price of underlying asset. Once the delta is calculated, the next step is the rate of change of delta with respect to underlying asset price, which gives us gamma ./. Another two risk measures are theta .‚/ and rho ./, which measure the change in option value with respect to passing time and interest rate, respectively. Finally, one can also measure the change in option value with respect to the volatility of the underlying asset and this gives us the vega .v/. The relationship between these risk measures are shown, including an example from the Black–Scholes partial differential equation. Furthermore, the applications of these Greeks letters in portfolio management have also been discussed. Risk management is one of the important topics in
503
finance today, both for academics and practitioners. Given the recent credit crisis, one can observe that it is crucial to properly measure the risk related to the ever-more complicated financial assets.
References Bjork, T. 1998. Arbitrage theory in continuous time, Oxford University Press, Oxford. Boyle, P. P. and D. Emanuel. 1980. “Discretely adjusted option hedges.” Journal of Financial Economics 8(3), 259–282. Duffie, D. 2001. Dynamic asset pricing theory, Princeton University Press, Princeton, NJ. Fabozzi, F. J. 2007. Fixed income analysis, 2nd ed., Wiley, Hoboken, NJ. Figlewski, S. 1989. “Options arbitrage in imperfect markets.” Journal of Finance 44(5), 1289–1311. Galai, D. 1983. “The components of the return from hedging options against stocks.” Journal of Business 56(1), 45–54. Hull, J. 2006. Options, futures, and other derivatives, 6th ed., Pearson, Upper Saddle River, NJ. Hull, J., and A. White. 1987. “Hedging the risks from writing foreign currency options.” Journal of International Money and Finance 6(2), 131–152. Karatzas, I. and S. E. Shreve. 2000. Brownian motion and stochastic calculus, Springer, New York. Klebaner, F. C. 2005. Introduction to stochastic calculus with applications, Imperial College Press, London. McDonlad, R. L. 2005. Derivatives markets, 2nd ed., Addison-Wesley, Boston, MA. Shreve, S. E. 2004. Stochastic calculus for finance II: continuous time model, Springer, New York. Tuckman, B. 2002. Fixed income securities: tools for today’s markets, 2nd ed., Wiley, Hoboken, NJ.
Chapter 34
A Further Analysis of the Convergence Rates and Patterns of the Binomial Models San-Lin Chung and Pai-Ta Shih
Abstract This paper extends the generalized Cox-RossRubinstein (hereafter GCRR) model of Chung and Shih (2007). We provide a further analysis of the convergence rates and patterns based on various GCRR models. The numerical results indicate that the GCRR-XPC model and the GCRR-JR .p D 1=2/ model (defined in Table 34.1) outperform the other GCRR models for pricing European calls and American puts. Our results confirm that the node positioning and the selection of the tree structures (mainly the asymptotic behavior of the risk-neutral probability) are important factors for determining the convergence rates and patterns of the binomial models. Keywords Binomial model r Rate of convergence r Monotonic convergence
34.1 Brief Review of the Binomial Models Ever since the seminal works of Cox et al. (1979, hereafter CRR) and Rendleman and Bartter (1979), many articles have extended the binomial option pricing models in many aspects. One stream of the literature modifies the lattice or tree type to improve the accuracy and computational efficiency of binomial option prices. The pricing errors in the discretetime models are mainly due to “distribution error” and “nonlinearity error” (see Figlewski and Gao (1999) for thorough discussions). Within the literature, there are many proposed solutions that reduce the distribution error and/or nonlinearity error. It is generally difficult to reduce the distribution error embedded in the binomial model when the number of time steps .n/ is small, probably due to the nature of the problem. Nevertheless there are two important works that might shed some S.-L. Chung () and P.-T. Shih Department of Finance, National Taiwan University, No. 85, Section 34.4, Roosevelt Road, Taipei 106, Taiwan, R.O.C. e-mail: [email protected]
light on understanding or dealing with distribution error. First, Omberg (1988) developed a family of efficient multinomial models by applying the Gauss-Hermite quadrature1 to the integration problem (e.g., N.d1 / in the Black-Scholes formula) presented in the option pricing formulas. Second, Leisen and Reimer (1996) modified the sizes of up- and down-movements by applying various normal approximations (e.g., the Camp-Paulson inversion formula) to the binomial distribution derived in the mathematical literature. Concerning the techniques to reduce nonlinearity error, there are least three important works (methods) worth noting in the literature. First, Ritchken (1995) and Tian (1999) showed that one can improve the numerical accuracy of the binomial option prices by allocating one of the final nodes on the strike price or one layer of the nodes on the barrier price. Second, Figlewski and Gao (1999) proposed the so-called adaptive mesh method that sharply reduces nonlinearity error by adding one or more small sections of fine high-resolution lattice onto a tree with coarser time and price steps. Third, Broadie and Detemple (1996) and Heston and Zhou (2000) suggested modified binomial models by replacing the binomial prices prior to the end of the tree by the Black-Scholes values, or by smoothing payoff functions at maturity, and computing the rest of binomial prices as usual. Another stream of research focuses on the convergence rates and patterns of the binomial models. For example, Leisen and Reimer (1996) proved that the convergence is of order O.1=n/ for the CRR model, the Jarrow and Rudd (1983) model, and the Tian (1993) model. Heston and Zhou (2000) also showed that the rate of convergence for the lattice approach can be enhanced by smoothing the option payoff p functions and cannot exceed 1= n at some nodes of the tree. Concerning the convergence patterns of the binomial models, it is widely documented that the CRR binomial prices converge to the Black-Scholes price in a wavy, erratic way (for example, see Broadie and Detemple 1996; Tian 1999).2 1
The Gauss-Hermite quadrature is a well developed and highly efficient method in the numerical integration literature. 2 Also see Diener and Diener (2004) for a complete discussion of the price oscillations in a tree model.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_34,
505
506
S.-L. Chung and P.-T. Shih
Several remedies have been proposed in the literature, e.g., see Broadie and Detemple (1996), Leisen and Reimer (1996), Tian (1999), Heston and Zhou (2000), Widdicks et al. (2002), and Chung and Shih (2007). In a recent article, Chung and Shih (2007) proposed a generalized CRR (hereafter GCRR) model by adding a stretch parameter to the CRR model. With the flexibility from the stretch parameter, the GCRR model provides a numerically efficient method for pricing a range of options. Moreover, Chung and Shih (2007) also showed that (1) the convergence pattern of the binomial prices to the accurate price depends on the node positioning (see their Theorem 2), and (2) the rate of convergence depends on the selection of the tree structures (i.e., the risk-neutral probability p and jump sizes u and d ). In this paper, we provide a further analysis of the convergence rates and patterns based on the GCRR models. Our results confirm that the node positioning and the selection of the tree structures are important factors for determining the convergence rates and patterns of the binomial models. The rest of the paper is organized as follows. Section 34.2 discusses the importance of node positioning for monotonic convergence of binomial prices. We also summarize a complete list of binomial models with monotonic convergence in Sect. 34.2. Section 34.3 introduces the GCRR model and discusses its flexibility for node positioning. Various GCRR related models are also defined in Sect. 34.3. Section 34.4 presents the numerical results of various GCRR models (such as the GCRR-XPC and GCRR-JR models) against other binomial models for pricing a range of options. The numerical analysis is mainly focused the convergence rates and patterns of various binomial models. Section 34.5 concludes the paper.
period price jumps either upward to uS with probability q or downward to dS with probability 1 q, where 0 < d < e r t < u and 0 < q < 1. Again the standard dynamic hedge approach ensures that the risk-neutral probability p, instead of q, is relevant for pricing options. The binomial option pricing model is completely determined by the jump sizes u and d and the risk-neutral probability p. In the CRR model, three conditions; that is, the risk-neutral mean and variance of the stock price in the next period and ud D 1, are utilized to determine u; d , and p as follows:
34.2 The Importance of Node Positioning for Monotonic Convergence
Therefore, Theorem 1 shows that the node positioning (i.e., the relative distance of the strike price between the most adjacent nodes) is the main reason why the CRR binomial prices converge to the Black-Scholes price in a wavy, erratic way. To get a monotonic convergent pattern, we can (1) use the CRR model corrected with the quadratic function (see Equation (34.2)) as suggested by Chung and Shih (2007); (2) fine tune the binomial model such that ".n/ is a constant (see Tian 1999); (3) choose the appropriate number of time steps n such that ".n/ is a constant (see WAND 2002); (4) smooth the option payoff function to be a continuous and differentiable function (see Heston and Zhou 2000); (5) replace the binomial prices prior to the end of the tree by the BlackScholes values and computing the rest of binomial prices as usual (see Broadie and Detemple 1996); or (6) place the strike price in the center of the final nodes (see Leisen and Reimer 1996; Chung and Shih 2007). Once the monotonic convergent pattern is derived, various extrapolation techniques are applicable to the above binomial
Following the standard option pricing model, we assume that the underlying asset price follows a geometric Brownian motion; that is, dS=S D dt C dW; (34.1) where is the drift rate, is the instantaneous volatility, and W is a Wiener process. Standard dynamic hedge and complete market argument assures that options can be valued by assuming risk neutrality; that is, D r. For pricing plain vanilla European options, Black-Scholes had derived closedform solutions. For pricing other types of options such as American options, we have to rely on numerical approximation methods, such as a binomial model. In an n-period binomial model, the time to maturity of the option .T / is divided into n equal time steps; that is,
t D T =n. If the current stock price is S , then the next
u D e
p
t
; d D e
p
t
; and p D .e r t d /=.u d /:
It is widely documented that the CRR binomial prices converge to the Black-Scholes price in a wavy, erratic way (for example, see Broadie and Detemple 1996; Tian 1999). This kind of convergent pattern can be explained by the following theorem. Theorem 1. Let Cn and CBS be the n-period CRR binomial and continuous-time prices of a standard European call option with the terminal payoff max.ST X; 0/. Therefore, Cn D CBS C
ˇ.0:5 ".n//2 C O.1=n/ n
(34.2)
p 2 =2/T p where ˇ D 2XerT .d2 / T ; d2 D ln.S=X /C.r ; T m nm x =X / , and m is the smallest ".n/ D ln.S ln.u=d / ; Sx D Su d m nm > X. integer that satisfies Su d
Proof. Please see Chung and Shih (2007).
34 A Further Analysis of the Convergence Rates and Patterns of the Binomial Models
models and the option price can be more efficiently calculated. The extrapolation formula used in this article is briefly summarized in the Appendix 38A.
34.3 The Flexibility of GCRR Model for Node Positioning In the GCRR model of Chung and Shih (2007), a stretch parameter ./ is incorporated into the CRR model so that the shape (or spanning) of the lattice can be flexibly adjusted. In other words, the condition that ud D 1, which is used in the traditional CRR model, is relaxed in the GCRR model. As a result, the GCRR model has one more degree of freedom to adjust the shape of the binomial lattice. We restate the GCRR model of Chung and Shih (2007) in Theorem 2. Theorem 2. In the GCRR model the jump sizes and probability of going up are as follows: u D e
p
t
1
; d D e
p
t
; and p D
e r t d ; ud
(34.3)
507
the GCRR-XPC binomial prices converges to the accurate price smoothly and monotonically. A straightforward extension of Chung and Shih (2007) is to place the strike price on a certain place of binomial tree; for example, on the 1/3-th or the 2/3-th nodes. Moreover, the idea of the GCRR model potentially can be applied to other binomial models. For example, we can generalize the Jarrow and Rudd (1983, hereafter JR) model by setting pD
2
p 1 1 2 ; u D e .r 2 / t C t ; C1
1 2 1 and d D e .r 2 / t
p
t
:
(34.4)
By changing the stretch parameter , the shape (or spanning) of the lattice can be flexibly adjusted so that one of the final nodes can be allocated on the strike price. In the following section, we will investigate the numerical performances of various GCRR models where one of the final nodes can be allocated on the strike price and therefore the convergent patterns are monotonic. In addition to CRR and JR models, six GCRR related models are investigated in the next section and their definitions are given in Table 34.1.
where 2 RC is a stretch parameter that determines the shape (or spanning) of the binomial tree. Moreover, when
t ! 0; that is, the number of time steps n grows to infinity, the GCRR binomial option prices converge to the BlackScholes formulae for plain vanilla European options.
34.4 Numerical Results of Various GCRR Models
The stretch parameter ./ in the GCRR model gives us the flexibility for node positioning. For instance, Chung and Shih (2007) placed the strike price in the center of the final nodes (termed GCRR-XPC model in their paper) and showed that
In this section, we will investigate the convergence pattern, the rate of convergence, and the accuracy of six GCRR models defined in Table 34.1 against the CRR model and the
Table 34.1 The definitions of various GCRR models
Model
Definition
GCRR-XPC
A GCRR model where the stretch parameter is chosen such that the strike price is placed in the center of final nodes
GCRR-X on the 1/3-th node
A GCRR model where the stretch parameter is chosen such that the number of nodes above the strike price is one third of the total final nodes and one of the final nodes can be allocated on the strike price
GCRR-X on the 2/3-th node
A GCRR model where the stretch parameter is chosen such that the number of nodes above the strike price is two thirds of the total final nodes and one of the final nodes can be allocated on the strike price
GCRR-JR .p D 1=2/
A GCRR-JR model where the stretch parameter is chosen such that the up probability is closest to one half and one of the final nodes can be allocated on the strike price
GCRR-JR .p D 1=3/
A GCRR-JR model where the stretch parameter is chosen such that the up probability is closest to one third and one of the final nodes can be allocated on the strike price
GCRR-JR .p D 2=3/
A GCRR-JR model where the stretch parameter is chosen such that the up probability is closest to two thirds and one of the final nodes can be allocated on the strike price
508
S.-L. Chung and P.-T. Shih
JR model. Figure 34.1 shows the convergence pattern of the CRR model, the GCRR-XPC model, GCRR-X on the 1/3-th node model, and GCRR-X on the 2/3-th node model for pricing an in-the-money call option. The parameters in Fig. 34.1 are adopted from Tian (1999): the asset price is 100, the strike price is 95, the volatility is 0.2, the maturity of the option is 6 months, and the risk-free rate is 0.06. The pricing error is defined as the difference between the binomial price and the Black-Scholes price. In Fig. 34.1, the pricing errors are plotted against an even number of time steps ranging from 10 to 100. It is apparent from Fig. 34.1 that the convergence of the CRR model is not monotonic and smooth. Moreover, the pricing errors oscillate between positive and negative values. In contrast, the GCRR-XPC model, GCRR-X on the 1/3-th nodes model, and GCRR-X on the 2/3-th nodes model converge monotonically to the Black-Scholes price.
Therefore, various extrapolation techniques are applicable to the GCRR-XPC model, GCRR-X on the 1/3-th node model, and GCRR-X on the 2/3-th node model. A similar convergence pattern is also embedded in GCRR-JR .p D 1=2/ model, GCRR-JR .p D 1=3/ model and GCRR-JR .p D 2=3/ model as shown in Fig. 34.2. Table 34.2 reports the prices of European call and American put options calculated using the CRR model, the GCRR-XPC model, GCRR-X on the 1/3-th nodes model, and GCRR-X on the 2/3-th nodes model. The parameters are the following: the asset price is 40, the strike price is 45, the asset price volatility is 0.4, the maturity of the option is 6 months, and the risk-free rate is 0.6. The errors of the CRR model do not necessarily go down as n increases. For example, the error for the CRR model is 0.0001215 when n D 6400, worse than an error of 0.0000966 when n D 3200. In contrast, the error decreases
0.15 CRR GCRR-XPC GCRR- X on the 1/3-th nodes GCRR- X on the 2/3-th node
Pricing Error
0.1 0.05 0 -0.05 -0.1 -0.15 -0.2 10
28
46
64
82
100
Number of Time Steps
Fig. 34.1 The convergence patterns of the CRR model and three related GCRR models. This figure shows the convergence patterns of the CRR model, GCRR-XPC model, GCRR-X on the 1/3-th node model,
and GCRR-X on the 2/3-th node model. The asset price is 100, the strike price is 95, the volatility is 0.2, the maturity of the option is 6 months, and the risk-free rate is 0.06
0.05 JR GCRR-JR (p=1/3) GCRR-JR (p=1/2) GCRR-JR (p=2/3)
Pricing Error
0
-0.05
-0.1
-0.15
-0.2 10
28
46 64 Number of Time Steps
Fig. 34.2 The convergence patterns of the JR model and three related GCRR models. This figure shows the convergence patterns of the JR model, GCRR-JR .p D 1=2/ model, GCRR-JR .p D 1=3/ model and
82
100
GCRR-JR .p D 2=3/ model. The asset price is 100, the strike price is 95, the volatility is 0.2, the maturity of the option is 6 months, and the risk-free rate is 0.06
0:0197709 0:0057017 0:0037874 0:0000405 0:0004341 0:0004564 0:0000966 0:0001215
Error
0:020305 0:0009942 0:0025996 0:0008327 0:0005569 0:0003715 0:0000189
3:1127553 3:0872827 3:089197 3:0929439 3:0925503 3:0934409 3:0930811 3:0931059 3:0929844
Price
7:0271842 7:0068793 7:0058851 7:0084847 7:007652 7:008209 7:0078375 7:0078186
50 100 200 400 800 1600 3200 6400 B-S price
N
50 100 200 400 800 1600 3200 6400 20:4232027 0:3824407 3:1221175 1:4950395 1:4993231 19:6141612
Error ratio
3:467556 1:5054227 93:5404085 0:0932659 0:9511233 4:722664 0:7954713
Error ratio
6:9997733 7:0036989 7:0057155 7:0067137 7:0071955 7:0074457 7:0075681 7:0076292
Price
3:0761791 3:0845771 3:0887797 3:0908818 3:091933 3:0924587 3:0927216 3:092853 3:0929844
Price
GCRR-XPC
0:0039255 0:0020166 0:0009982 0:0004819 0:0002501 0:0001224 0:0000611
Error
1:9988942 1:9994844 1:9997516 1:9998781 1:9999397 1:9999699 1:9999849
0:0168054 0:0084073 0:0042047 0:0021026 0:0010514 0:0005257 0:0002629 0:0001314
1:9465921 2:0202401 2:0714921 1:9264892 2:04292 2:0050691
Error ratio
American put
Error ratio
Error
7:0165994 7:0174403 7:0155716 7:0139161 7:012333 7:0111153 7:0101693 7:0094751
Price
3:1172274 3:1164068 3:1109797 3:1070312 3:1033497 3:1006093 3:0984943 3:0969498 3:0929844
Price
Error ratio
1:0350322 1:3015833 1:2810991 1:3551761 1:3594105 1:3838448 1:3895148
Error ratio
0:0008409 0:0018687 0:4499987 0:0016555 1:1287822 0:0015831 1:0457441 0:0012178 1:2999743 0:000946 1:2873403 0:0006942 1:3626916
Error
0:024243 0:0234224 0:0179953 0:0140468 0:0103653 0:0076248 0:0055099 0:0039653
Error
GCRR-X on the 1/3-th nodes
6:9792354 6:9880395 6:9950865 6:9990891 7:0018798 7:0036826 7:0049213 7:00576
Price
3:0294775 3:0495034 3:065062 3:0739472 3:0801303 3:0841135 3:0868519 3:0887076 3:0929844
Price
0:0088041 0:007047 0:0040026 0:0027906 0:0018028 0:0012387 0:0008387
Error
0:063507 0:0434811 0:0279225 0:0190372 0:0128541 0:008871 0:0061326 0:0042768
Error
GCRR-X on the 2/3-th node
1:2493368 1:7606073 1:4343098 1:5479088 1:4553711 1:477035
Error ratio
1:4605662 1:557207 1:4667279 1:48102 1:4490152 1:4465349 1:4339211
Error ratio
Notes: The Parameters are: the asset price is 40, the strike price is 45, the asset price volatility is 0.4, the maturity of the option is 6 months, the risk-free rate is 0.6, and the number of time steps starts at 50 and doubles each time subsequently. This table reports price, pricing errors, and error ratios from the CRR model, the GCRR-XPC model, the GCRR-X on the 1/3-th nodes model, the GCRR-X on the 2/3-th nodes model
Error
Price
N
CRR
Table 34.2 Price and error ratios for European calls and American puts using the CRR model and three related GCRR models. European call
34 A Further Analysis of the Convergence Rates and Patterns of the Binomial Models 509
510
S.-L. Chung and P.-T. Shih
as n increases for the the GCRR-XPC model, GCRR-X on the 1/3-th nodes model, and GCRR-X on the 2/3-th nodes model. Thus, one can apply the Richardson extrapolation method to increase the accuracy for these models. Let e.n/ be the pricing error of the n-step binomial model; that is, (34.5) e.n/ D Cn CBS ; where Cn is the n-step binomial price and CBS is the BlackScholes price.3 Define the error ratio as: .n/ D e.n/=e.2n/:
(34.6)
The error ratio is a measure of the improvements in accuracy as the number of time steps doubles. It is clear from Table 34.2 that the pricing error of the GCRR-XPC model is almost exactly halved when the number of time steps doubles; that is, the error ratio almost equals two. In contrast, Table 34.2 indicates that the GCRR-X on the 1/3-th nodes model and the GCRR-X on the 2/3-th nodes model have p a slower convergence rate of order O.1= n/. Similarly, in Table 34.3, the pricing error of the GCRR-JR .p D 1=2/ model is almost exactly halved when the number of time steps doubles; that is, the error ratio almost equals two. In contrast, it seems that the GCRR-JR .p D 1=3/ model and the GCRR-JR .p D 2=3/ model have a slower convergence p rate of order O.1= n/. The extrapolation formulas when the error ratio is known are suggested in the Appendix. We also compare the numerical efficiency of the various GCRR models including GCRR-XPC model, GCRR-X on the 1/3-th nodes model, GCRR-X on the 2/3-th nodes model, GCRR-JR .p D 1=2/ model, GCRR-JR .p D 1=3/ model, and GCRR-JR .p D 2=3/ model. The above binomial methods are appropriate for applying the Richardson extrapolation to increase the accuracy. All the above methods are applied to price a range of European and American options. The accuracy of each method is measured using the root mean squared errors (RMSE) with respect to the closed-form solution, or the GCRR-XPC model with a two-point extrapolation .n D 50;000/ in the case of American puts. The parameters of these options are adopted from WAND (2002) as follows: the stock price is 40, the risk free rate is 0.06, the strike prices are 35 and 45, the volatilities are 0.2 and 0.4, and the maturities are 0.25 and 0.5.
3 Since there is no exact solution for the American put, following Tian (1999) and Heston and Zhou (2000), the pricing errors of the n-step binomial model is defined as: e.n/ D Cn Cn=2 .
To clarify the role played by Richardson extrapolation, we first compare the numerical efficiency of each method on a raw basis without any extrapolation procedure. The results are presented in Table 34.4. It is clear from Table 34.4 that GCRR-XPC outperforms the other methods for pricing European calls and American puts and GCRR-JR .p D 1=2/ model is the second best model. Moreover, the results of Table 34.4 also show that for European call options, GCRR-XPC and GCRR-JR .p D 1=2/ model monotonically converge to the Black-Scholes value at the 1=n rate. On the other hand, GCRR-X on the 1/3-th nodes model, GCRR-X on the 2/3-th nodes model, GCRR-JR .p D 1=3/ model and GCRR-JR .p D 2=3/ modelımonotonically converge to the p n rate. Black-Scholes value at the 1 We compare the numerical efficiency of the various GCRR models when the Richardson extrapolation technique is applied to all methods. The number of time steps used for all models with a two-point extrapolation is .n; 2Œn=4 / where Œx denotes the Gaussian symbol. In Table 34.5, it is very clear that GCRR-XPC also outperforms the other methods for pricing European calls and American puts, and GCRR-JR .p D 1=2/ model is the second best model when the Richardson extrapolation formula is applied to all binomial models.
34.5 Conclusion This paper provides a further analysis of the convergence rates and patterns based on the GCRR model of Chung and Shih (2007). We discuss how to correct the oscillatory behavior of the CRR model by various GCRR models so that the binomial prices converge to its continuous time price monotonically and smoothly, and thus one can apply the Richardson extrapolation method to increase the accuracy for these models. It is shown that when the risk-neutral probability p equals or converges to 0.5, then the convergence rate is order of O.1=n/. For example, GCRR-XPC and GCRR-JR .p D 1=2/ model monotonically converge to the Black-Scholes value at the 1=n rate. However, when p does note equal or p converge to 0.5, the convergence rate is order of O.1= n/. For example, GCRR-X on the 1/3-th nodes model, GCRR-X on the 2/3-th nodes model, GCRR-JR .p D 1=3/ model, and GCRR-JR .p D 2=3/ model monotonically converge to the p Black-Scholes value at the 1= n rate. Finally, the numerical results indicate that the GCRR models with various modifications are efficient for pricing a range of options. GCRR-XPC outperforms the other GCRR models for pricing European calls and American puts and GCRR-JR .p D 1=2/ model is the second best model.
0:0176285 0:0041689 0:0005397 0:0023722 0:0010473 0:0006063 0:0000826 0:0001073
Error
0:0106746 0:0043763 0:0002667 0:0012808 0:0013773 0:0003649 0:0001685
3:110613 3:0971533 3:0935241 3:0953566 3:0940317 3:0923781 3:093067 3:0928771 3:0929844
Price
7:0249121 7:0142375 7:0098612 7:0101279 7:008847 7:0074697 7:0078346 7:0076661
50 100 200 400 800 1600 3200 6400 B-S price
N
50 100 200 400 800 1600 3200 6400 2:4391967 16:4106629 0:2082054 0:9299303 3:7744614 2:1656175
Error ratio
4:2285686 7:7248534 0:2274975 2:2650576 1:7273667 7:3408261 0:7696826
Error ratio
6:9948508 7:0020738 7:0048184 7:0061896 7:0069831 7:0073317 7:0075096 7:0076004
Price
3:064688 3:0808786 3:0867817 3:0896514 3:0914335 3:0921893 3:0925832 3:0927856 3:0929844
Price
0:007223 0:0027446 0:0013712 0:0007935 0:0003486 0:000178 0:0000908
Error
2:3374246 1:9516768 1:8609907 2:1490994 1:9505708 1:9816991 2:0184492
0:0282964 0:0121058 0:0062028 0:003333 0:0015509 0:0007951 0:0004012 0:0001988
2:6316614 2:0016892 1:7280424 2:276293 1:9588599 1:9605003
Error ratio
American put
Error ratio
Error
GCRR-JR .p D 1=2/
7:0162674 7:0171254 7:0154509 7:0139527 7:0125108 7:011281 7:0103178 7:0095929
Price
3:1266496 3:1229406 3:1159382 3:1107778 3:1063754 3:1028496 3:1001488 3:0981502 3:0929844
Price
0:000858 0:0016745 0:0014982 0:0014419 0:0012298 0:0009631 0:0007249
Error
0:0336651 0:0299562 0:0229538 0:0177934 0:0133909 0:0098652 0:0071644 0:0051658
Error
GCRR-JR .p D 1=3/
0:5123652 1:117721 1:0390038 1:1724385 1:2768974 1:328679
Error ratio
1:123812 1:305067 1:2900141 1:3287666 1:3573885 1:3769746 1:3869016
Error ratio
6:970908 6:9846444 6:9931022 6:9979603 7:0011871 7:0032567 7:0046428 7:0055771
Price
3:000282 3:0342243 3:0554242 3:0676809 3:0759322 3:081286 3:0849041 3:0873602 3:0929844
Price
0:01373646 0:00845776 0:00485807 0:0032268 0:00206967 0:00138609 0:0009343
Error
0:09270245 0:05876007 0:03756026 0:02530355 0:01705221 0:0116984 0:00808034 0:00562422
Error
GCRR-JR .p D 2=3/
1:62412507 1:74097083 1:50553798 1:55909214 1:4931684 1:48356412
Error ratio
1:5776437 1:56442129 1:48438717 1:48388688 1:45765261 1:44776122 1:43670451
Error ratio
Notes: The Parameters are: the asset price is 40, the strike price is 45, the asset price volatility is 0.4, the maturity of the option is 6 months, the risk-free rate is 0.6, and the number of time steps starts at 50 and doubles each time subsequently. This table reports price, pricing errors, and error ratios from the JR model, the GCRR-JR .p D 1=3/ model, the GCRR-JR .p D 1=2/ model, the GCRR-JR .p D 2=3/ model
Error
Price
N
JR
Table 34.3 Price and error ratios for European calls and American puts using the JR model and three related GCRR models European call
34 A Further Analysis of the Convergence Rates and Patterns of the Binomial Models 511
512
S.-L. Chung and P.-T. Shih
Table 34.4 A Comparison of the performance of various models over a range of option types and parameters (without extrapolation) GCRR-X GCRR-X GCRR-JR GCRR-JR GCRR-JR N GCRR-XPC on the 1/3-th nodes on the 2/3-th nodes .p D 1=2/ .p D 1=3/ .p D 2=3/ Approximation RMS error for European calls 50 100 200 400 800 1600 3200 6400
9.17E-03 4.59E-03 2.29E-03 1.15E-03 5.74E-04 2.87E-04 1.43E-04 7.17E-05
2.55E-02 1.94E-02 1.34E-02 9.77E-03 6.91E-03 4.94E-03 3.51E-03 2.49E-03
3.17E-02 2.26E-02 1.50E-02 1.06E-02 7.32E-03 5.15E-03 3.61E-03 2.54E-03
1.50E-02 7.05E-03 3.61E-03 1.85E-03 9.06E-04 4.56E-04 2.28E-04 1.14E-04
2.18E-02 1.67E-02 1.22E-02 9.13E-03 6.72E-03 4.90E-03 3.53E-03 2.54E-03
4.40E-02 2.82E-02 1.80E-02 1.21E-02 8.23E-03 5.64E-03 3.91E-03 2.72E-03
2.12E-02 1.41E-02 9.36E-03 6.43E-03 4.46E-03 3.13E-03 2.20E-03 1.55E-03
1.89E-02 1.30E-02 8.72E-03 6.15E-03 4.34E-03 3.06E-03 2.17E-03 1.53E-03
Approximation RMS error for American puts 50 100 200 400 800 1600 3200 6400
4.46E-03 2.24E-03 1.11E-03 5.50E-04 2.75E-04 1.37E-04 6.81E-05 3.39E-05
1.70E-02 1.26E-02 8.51E-03 6.08E-03 4.25E-03 3.01E-03 2.12E-03 1.50E-03
1.67E-02 1.25E-02 8.47E-03 6.06E-03 4.24E-03 3.00E-03 2.12E-03 1.50E-03
8.29E-03 3.84E-03 1.98E-03 9.76E-04 4.86E-04 2.45E-04 1.22E-04 6.07E-05
Note: The options test set for both European and American options is from WAND (2002) and consists of combinations of strike prices of 35 and 45; maturities of 0.25 and 0.5, and volatilities of 0.2 and 0.4. RMS errors averaged over the eight different sets of parameters are reported. GCRR-XPC, GCRR-X on the 1/3-th nodes, GCRR-X on the 2/3-th nodes, GCRR-JR .p D 1=2/, GCRR-JR .p D 1=3/, and GCRR-JR .p D 2=3/ models are as described in the text Table 34.5 A comparison of the performance of various models over a range of option types and parameters (with extrapolation) GCRR-X GCRR-X GCRR-JR GCRR-JR GCRR-JR N GCRR-XPC on the 1/3-th nodes on the 2/3-th nodes .p D 1=2/ .p D 1=3/ .p D 2=3/ Approximation RMS error for European calls 50 100 200 400 800 1600 3200 6400
7.61E-04 2.95E-05 7.47E-06 1.88E-06 4.72E-07 1.18E-07 2.96E-08 7.39E-09
1.65E-02 9.91E-03 3.72E-03 2.16E-03 9.25E-04 5.08E-04 2.35E-04 1.24E-04
2.05E-02 7.73E-03 4.68E-03 1.85E-03 1.05E-03 4.67E-04 2.51E-04 1.18E-04
5.24E-03 1.54E-03 3.56E-04 1.80E-04 8.82E-05 2.15E-05 4.45E-06 2.83E-06
2.52E-02 1.23E-02 5.14E-03 3.12E-03 1.52E-03 7.66E-04 3.75E-04 1.90E-04
2.68E-02 1.20E-02 7.33E-03 2.78E-03 1.49E-03 7.63E-04 3.70E-04 1.84E-04
1.71E-02 7.64E-03 3.55E-03 1.85E-03 9.01E-04 4.23E-04 2.13E-04 1.02E-04
1.36E-02 6.59E-03 3.28E-03 1.62E-03 7.81E-04 3.90E-04 1.99E-04 9.96E-05
Approximation RMS error for American puts 50 100 200 400 800 1600 3200 6400
1.05E-03 1.03E-04 1.58E-04 2.09E-05 2.26E-05 5.94E-06 2.68E-06 7.21E-07
1.14E-02 5.43E-03 2.48E-03 1.09E-03 5.44E-04 2.48E-04 1.25E-04 5.93E-05
9.06E-03 4.40E-03 2.04E-03 9.43E-04 4.61E-04 2.27E-04 1.15E-04 5.71E-05
3.83E-03 1.09E-03 4.38E-04 1.28E-04 4.56E-05 1.48E-05 3.67E-06 2.29E-06
Note: The options test set for both European and American options is from WAND (2002) and consists of combinations of strike prices of 35 and 45; maturities of 0.25 and 0.5, and volatilities of 0.2 and 0.4. RMS errors averaged over the eight different sets of parameters are reported. GCRR-XPC, GCRR-X on the 1/3-th nodes, GCRR-X on the 2/3-th nodes, GCRR-JR .p D 1=2/, GCRR-JR .p D 1=3/, and GCRR-JR .p D 2=3/ models are as described in the text
34 A Further Analysis of the Convergence Rates and Patterns of the Binomial Models
References Broadie, M. and J. Detemple. 1996. “American option valuation: new bounds, approximations, and a comparison of existing methods.” Review of Financial Studies 9, 1211–1250. Chung, S. L. and P. T. Shih. 2007. “Generalized Cox-Ross-Rubinstein binomial models.” Management Science 53, 508–520. Cox, J. C., S. A. Ross, and M. Rubinstein. 1979. “Option pricing: a simplified approach.” Journal of Financial Economics 7, 229–263. Diener, F. and M. Diener. 2004. “Asymptotics of the price oscillations of a European call option in a tree model.” Mathematical Finance 14, 271–293. Figlewski, S. and B. Gao. 1999. “The adaptive mesh model: a new approach to efficient option pricing.” Journal of Financial Economics 53, 313–351. Heston, S. and G. Zhou. 2000. “On the rate of convergence of discretetime contingent claims.” Mathematical Finance 10, 53–75. Jarrow, R. and A. Rudd. 1983. Option pricing, Richard D. Irwin, Homewood, IL. Leisen, D. P. J. and M. Reimer. 1996. “Binomial models for option valuation – examining and improving convergence.” Applied Mathematical Finance 3, 319–346. Omberg, E. 1988. “Efficient discrete time jump process models in option pricing.” Journal of Financial Quantitative Analysis 23, 161–174. Rendleman, R. J. and B. J. Bartter. 1979. “Two-state option pricing.” Journal of Finance 34, 1093–1110. Ritchken, P. 1995. “On pricing barrier options.” The Journal of Derivatives 3, 19–28. Tian, Y. 1993 “A modified lattice approach to option pricing” Journal of Futures Markets 13, 563–577. Tian, Y. 1999. “A flexible binomial option pricing model.” Journal of Futures Markets 19, 817–843. Widdicks, M., A. D. Andricopoulos, D. P. Newton, and P. W. Duck. 2002. “On the enhanced convergence of standard lattice methods for option pricing.” Journal of Futures Markets 22, 315–338.
513
Appendix 34A Extrapolation Formulas for Various GCRR Models Since the convergence pattern of the GCRR model is monotonic and smooth, we apply the extrapolation technique to enhance its accuracy. We first establish a two-point extrapolation formula using the GCRR binomial prices with geometrically increasing time steps n and 2n. The Black-Scholes price can be exactly calculated as follows: CBS D
.n/C2n Cn ; .n/ 1
(34.7)
where C2n and Cn are the binomial option prices according to the number of time steps equal to 2n and n, respectively. Although the error ratio .n/ is generally unknown, we can still apply the following extrapolation formula to improve accuracy as long as the error ratio converges to a constant quickly enough: CO 2n D .C2n Cn /=. 1/, where is the limit of the error ratio series. It is straightforward to prove that the pricing error remaining in the extrapolated price follows: .n/ e.2n/: CO 2n CBS D 1
(34.8)
In other words, the pricing error reduces from e.2n/ to .n/ 1 e.2n/ after a two-point extrapolation technique is applied.
Chapter 35
Estimating Implied Probabilities from Option Prices and the Underlying Bruce Mizrach
Abstract This paper examines a variety of methods for extracting implied probability distributions from option prices and the underlying. The paper first explores nonparametric procedures for reconstructing densities directly from options market data. I then consider local volatility functions, both through implied volatility trees and volatility interpolation. I then turn to alternative specifications of the stochastic process for the underlying. I estimate a mixture of log normals model, apply it to exchange rate data, and illustrate how to conduct forecast comparisons. I finally turn to the estimation of jump risk by extracting bipower variation. Keywords Options r Implied probability densities r Volatility smile r Jump risk r Bipower variation
35.1 Introduction Traders, economists and central bankers all have interest in assessing market beliefs. The problem is that markets speak to us in the language of prices, and extracting their views can often be a difficult inference problem. This paper surveys and occasionally extends a variety of methods for obtaining not just market expectations, but the full range of predictive densities. I examine and implement procedures for both options and the underlying. A mastery of these approaches can help banks, trading firms and their regulators avoid excessive risks. The paper begins by explaining the theoretical connection between option prices and probability measures. The relationship linking the option price and the risk neutral density is established, under very general conditions, using the Feynman–Kac lemma. I then establish empirically that the log normality restrictions implied by Black–Scholes are too severe. Our evidence,
B. Mizrach () Department of Economics, Rutgers University, New Jersey Hall, New Brunswick, NJ, USA e-mail: [email protected]
drawn from an exchange rate example, illustrates the volatility smile and skew. The paper then moves beyond the log-normal and extracts densities using histogram estimators. These approaches perform fairly poorly, but I introduce a well-behaved method that can be solved in a spreadsheet. A closely related approach using Rubinstein’s (1994) implied binomial trees follows. We use the modifications of Barle and Cakici (1998) and Cakici and Foster (2002) in a numerical example. The most popular approach in the practitioner literature, local volatility modeling, is discussed next. We follow Shimko (1993) and Dumas, Fleming and Whaley (1998) and interpolate the volatility surface. These techniques are computationally straightforward, but are limited by potential arbitrage violations in the tails, precisely the region of greatest interest to policy authorities. I next turn to alternative stochastic processes for the underlying. I parameterize the spot price process as a mixture of log normals, as in Ritchey (1990) and Melick and Thomas (1997), and fit the model to options prices. This model has proven useful in analyzing exchange rate crises in Haas, Mittnik and Mizrach (2006) and, in Mizrach (2006), predicting Enron’s bankruptcy risk prior to its collapse. Econometric issues are considered in the stochastic process section. These include the choice of metric. The paper endorses the view that matching option volatility is the preferred distance measure. Hypothesis testing against simpler alternatives is not straightforward due to nuisance parameters, and a bootstrap approach is discussed. Comparing value-at-risk over a specific interval is done using the approach of Christoffersen (1998), and comparison of the entire predictive density is also done using Berkowitz (2001). I then examine jump processes and introduce the crucial recent results of Barndorff-Neilsen and Shephard (2006). These enable a straightforward extraction of the jump risk component in the underlying. I find that, in exchange rates, jump risk does not move in lockstep with stochastic volatility. I now begin in the obvious place, with the Black–Scholes model as a baseline.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_35,
515
516
B. Mizrach
35.2 Black Scholes Baseline
This can be expressed as a partial differential equation (PDE),
I specify the price process and the hedge portfolio in the first part. The link between the risk neutral measure and the solution to the portfolio differential equation is in the second part.
@Y 1 @2 Y 2 2 @Y C rSt C St D rYt : @t @S 2 @Y 2
This equation was initially proposed and solved by Black and Scholes (1973) in their seminal paper. A number of derivative securities can be priced using the solution to (35.9) along with the appropriate boundary conditions,
35.2.1 The B–S Differential Equation We assume that the spot price St follows a geometric Brownian motion, St D S0 exp.˛t C Wt /
(35.1)
where S0 is an initial condition, ˛ and are constants and Wt is a standard Brownian motion. This process is log normal because log.St / D log.S0 / C ˛t C Wt is normally distributed. Ito’s lemma implies that dSt D St dt C St dW t ;
(35.2)
with drift D .˛ C 2 =2/ and deterministic volatility . Suppose that Yt is the price of a derivative security contingent on S and t. From Ito’s lemma, we obtain, d Yt D
1 @2 Y 2 2 @Y @Y @Y St C C St dW t : S dt C t @S @t 2 @S 2 @S (35.3)
Consider next a portfolio, …, short one derivative security and long an amount @Y =@S of the underlying, …t D Yt C
@Y St : @S
(35.4)
The change in value of this portfolio in a short interval is d…t D d Yt C
@Y dSt : @S
(35.5)
Substituting (35.2) and (35.3) into (35.5) yields,
YT D exp.r/ .ST /
.St / D exp.r/ maxŒST K; 0 :
d…t D r…t dt
(35.7)
BSC .St ; K; T; r; / D St ˆ.d1 / K exp.r/ˆ.d2 /; (35.12) with ln.St =K/ C .r C 2 =2/ p ; p d2 D d1 ; d1 D
and ˆ./ denotes the distribution.
1 @2 Y 2 2 @Y @Y C S dt D r Y S t t dt: t @t 2 @S 2 @S
cumulative standard
normal
35.2.2 A Probability Density Approach The link between the typically numerical PDE approach and stochastic processes is provided by the Feynman–Kac analysis. Consider again a spot price process that solves (35.1), and a derivative security that follows the PDE, 1 @2 Y 2 2 @Y @Y St C C St D 0; @S @t 2 @S 2
(35.13)
and satisfies the boundary condition (35.11). Applying Ito’s lemma to (35.13), we again obtain (35.3). The first term in (35.3) drops out from (35.13), and then integrating on both sides, Z
Z
T
T
d Yt D YT Yt D t
Substituting from (35.4) and (35.6), this becomes
(35.11)
We then obtain the well-known Black–Scholes formula,
(35.6)
Since this equation does not involve d Wt , the portfolio is riskless during the interval dt. Under a no arbitrage assumption, this portfolio can only earn the riskless rate of interest r,
(35.10)
where T is the date of expiry, and D T t. For a European call with strike price K, the boundary condition would be
@Y 1 @2 Y 2 2 d…t D S C dt: t @t 2 @S 2
(35.9)
t
@Y St dW t : @S
(35.14)
Taking expectations, the right hand side is zero, leaving (35.8)
Yt D EŒYT D exp.r/EQ Œ .ST /jSt D s
(35.15)
35 Estimating Implied Probabilities from Option Prices and the Underlying
Note that we have replaced the solution of the PDE with a conditional expectation taken under the risk neutral measure Q. Regularity conditions are established in Karatzas and Shreve (1991). Hambly (2006) shows how we can use the Feynman– Kac to construct probability densities for the spot price. Set .s/ D I .s/, the indicator function of a set . The solution to the PDE (35.13) with this boundary condition is EŒI .ST /jSt D s D PrŒST 2 jSt D s
(35.16)
The probability density f obeys the Kolmogorov (or Fokker–Planck) forward equation, 1 @2 f 2 2 @f @f St C C St D 0 @S @t 2 @S 2
(35.17)
where we impose a terminal delta function. This equation describes the probability density of being at ST given that we are at St . In the case where the spot price process follows a Brownian process with drift r, dSt D rSt dt C St d Wt ;
(35.18)
we calculate that the transition density is the log-normal, f .ST / D
"
1
ST
p exp 2 2
Œln. SSTt / .r 2 =2/ 2 2 2
# :
517
35.3 Empirical Departures from Black Scholes Data on option prices appear to be inconsistent with the Black–Scholes assumptions. In particular, volatility seems to vary across strike prices – often with a parabolic shape called the volatility “smile.” The smile is often present on only one part of the distribution giving rise to a “smirk.” Tompkins (2001) shows that these patterns are quite typical in a wide range of options markets, including stock indices, Treasury bonds and exchanges rates for Germany, the U.K. and the U.S. The motivating examples in this section are drawn from the foreign exchange market and draw upon the work of Haas, Mittnik and Mizrach (2006). I analyze the US Dollar/ British Pound (US$/BP) option which trades on the Philadelphia exchange. Both American and European options are traded. The BP options are for 31,250 Pounds. I use daily closing option prices that are quoted in cents. Spot exchange rates are expressed as US$ per unit of foreign currency and are recorded contemporaneously with the closing trade. Foreign currency appreciation (depreciation) will increase the moneyness of a call (put) option. Interest rates are the Eurodeposit rates closest in maturity to the term of the option. To obtain a rough idea about the implied volatility pattern in the currency options, I look at sample averages. I sort the data into bins based on the strike/spot ratio, S=K, and compute implied volatilities using the Black–Scholes formula. In Fig. 35.1, I plot the data for all of 1992 and 1993, for
Averages of Implied Volatility BP/US$ Options 1992-93 17
Implied Volatility (% Per Annum)
16
15
14
13
12
11
10 <0.92 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 >1.08
Strike Price/Spot Price
Fig. 35.1 Averages of implied volatility BP/US$ options 1992–1993. Notes: The figure reports implied volatility for BP/US$ options from the Philadelphia Exchange for 1992–1993
518
B. Mizrach
the British Pound. It displays the characteristic pattern, with the minima of the implied volatility at the money, and with higher implied volatilities in the two tails. Black–Scholes cannot account for this pattern in implied volatility. The next three sections discuss how to move beyond the restrictive Black–Scholes assumptions.
35.4 Beyond Black Scholes I extend the Black–Scholes model by allowing volatility to vary with the strike price. The formal link is established using the forward Kolmogorov equation.
35.4.1 How Volatility Varies with the Strike The Feynman–Kac analysis enables us to define a risk neutral probability in which we can price options. Let f .ST / denote the terminal risk neutral (Q-measure) probability at time T , and let F .ST / denote the cumulative probability. A European call option at time t, expiring at T , with strike price K, is priced Z
.ST K/f .ST /dST ;
Dupire (1994) clarifies the isomorphism between the approaches that specify the density and those that specify price process. He shows that for driftless diffusions, there is a unique stochastic process corresponding to a given implied probability density. From the Kolmogorov equation (35.17) for this density, I rearrange 1 @2 C @C @C D 2 K 2 2 rK @T 2 @K @K
(35.22)
and obtain the local volatility formula, @C @C C rK @K : 2 .K; t/ D @T 1 2 @2 C K 2 @K 2
(35.23)
The principal problem in estimating f is that one does not observe a continuous function of option prices and strikes. My first attempt to estimate f utilizes a histogram method to try and trace out the density.
35.5 Histogram Estimators
1
C.K; / D exp.r/
35.4.2 The Kolmogorov Equation
(35.19)
K
In the case where f ./ is the log-normal density and volatility is constant with respect to K, this yields the Black–Scholes formula, In this benchmark case, implied volatility is a sufficient statistic for the entire implied probability density which is centered at the risk-free rate. In the baseline case, practitioners often try to invert (35.2) to estimate “implied volatility.” The academic literature has focused on whether volatility in the physical measure is better predicted by implied volatility, GARCH or stochastic volatility. Under basic no-arbitrage restrictions, we can consider more general densities than the log-normal for the underlying. Breeden and Litzenberger (1978) show that the first derivative is a function of the cumulative distribution, @C =@K jKDST D e r .1 F .ST //:
(35.20)
The second derivative then extracts the density, @2 C =@K 2 jKDST D e r f .ST /:
(35.21)
This section develops two nonparametric estimators of the risk neutral density.
35.5.1 A Crude Histogram Estimator This section describes a method first proposed by Longstaff (1990) and then modified by Rubinstein (1994). Let S be the current price of the underlying asset. Let C1 to Cn be the prices of n associated call options with striking prices K1 < K2 < < Kn all with the same time to expiration. Denote by ST the price of the asset at expiration, and let r be the riskless rate of interest, and ı be the payout rate on the underlying asset (e.g. a dividend for a stock or in the case of a foreign exchange option, the interest differential). As ST goes from Kn to Kn1 the payoff goes from 0 to Kn Kn1 with an average payoff of 0:5.Kn Kn1 /. It follows that Cn1 D 0:5pn .Kn Kn1 /=.1 C r/ with pn the risk neutral probability associated with the histogram bin n. Consider now the price of the call with the next lowest striking price. There are three outcomes one must consider.
35 Estimating Implied Probabilities from Option Prices and the Underlying
519
Histogram Estimators Probability 0.6 0.5 0.4 0.3 0.2 0.1 0 −0.1 −0.2 −0.3 Longstaff/Rubinstein
Normal Distribution
Optimized Histogram
−0.4 75
80
85
90
95
100
105
110
115
120
125
140
Strike
Fig. 35.2 Histogram estimators Notes: The figure reports various different histogram estimators
If ST < Kn2 then the payoff is zero. In the interval between Kn1 and Kn2 the average payoff is 0:5.Kn1 Kn2 /. Above Kn1 the payoff is pn Œ.Kn1 Kn2 / C 1=2.Kn Kn1 / . The latter part is easily seen as just .1 C r/Cn1 . Adding up these components, gives us the recursion, h Xj Cnj D 0:5pnj C1 C
i
kD2
pnj Ck
.Knj C1 Knj /=.1 C r/ C Cnj : (35.24) I can view the underlying asset as a call with strike K0 D 0. As before, I have three outcomes to consider. If ST < 0 then the payoff is zero. For 0 ST < K1 , then the payoff is 1=2.K1 0/p1 , and for ST > K1 , ı .K1 0/Œp2 C C pn .1 C r/ C C1 :
(35.25)
Adding up these payoff components gives us, h i Xn pk .1 C r/ C C1 : S=.1 C ı/ D K1 0:5p1 C
Pj 1 More generally, I can write pj D 2Œ1 kD1 pk .1 C r/ .Cj 1 Cj /.Kj Kj 1 /1 . I implement this histogram method in Fig. 35.2. The baseline is a normal distribution for which I have priced 11 Black–Scholes calls for strikes from 75 to 125, with a spot price of 100. For this example, the Longstaff–Rubinstein estimator does not provide even a good rough approximation. The probabilities whipsaw back and forth; every other bin turns negative. After some trial and error, it was apparent this property was not uncommon. I turn next to an improved estimator.
35.5.2 An Improved Histogram Estimator Let bi > 0; i D 0; : : : ; n C 1 be the cardinality of a set uniformly distributed with mean Ki C1 Ki for i D 1 to n1, and KN i at the end points. To transform bi into a probability measure, define
kD2
pj D b j
(35.26) Exploiting the fact that the probabilities sum to one, ı S=.1Cı/ D K1 Œ0:5p1 C.1p/1 .1Cr/CC1 ;
(35.27)
p1 D 2Œ1 .1 C r/.S=.1 C ı/ C1 /K11 :
b0 ;:::;bnC1
(35.28)
j D1
bj
(35.29)
The improved estimator minimizes the percentage deviation from the simulated and observed call prices, min
I can rearrange (35.27) and solve for P1 ,
ı XnC1
Xn j D1
.CO j Cj /=Cj ;
(35.30)
where the c C 0 s can be determined using the recursion (35.24).
520
B. Mizrach
I used this optimized histogram estimator to fit the same 11 Black–Scholes calls from the prior example. For a small number of calls, this algorithm can be solved in a spreadsheet. I plot the risk neutral probabilities in Fig. 35.2. The optimized histogram is a substantial improvement over the crude estimator. There are no negative probabilities (by construction) and with the exception of the long bin from 0 to 75, it provides a good approximation to the underlying normal. This method is quite reliable for back of the envelope calculations, but for more complicated processes, one needs more powerful techniques.
and Cakici and Foster (2002). The first key modification is to center the tree at the forward price Fi;j , Fi;j D pi;j Si;j C1 C .1 pi;j /Si;j :
(35.34)
The other change is to value options using Black–Scholes with interpolated volatilities rather than the Arrow–Debreu prices i;j , BSCi;j .S0;0 ; K; T; / Xn i;j exp.r t/ max.Si;j K; 0/; (35.35) D j D1
which is a call with strike price K, expiration t, and volatility . To grow the tree from the initial spot price (or any odd number step in the tree), we have two even nodes. The upper central node is
35.6 Tree Methods 35.6.1 A Standard Tree I begin the discussion by defining a notation for a standard binomial tree. Let S0;0 denote the current spot price. Any element of the spot price tree is a sequence of recombining up and down moves u; d :
Si C1;j C1 D Fi;j
i;j Fi;j C Ci;j
(35.36)
i;j Fi;j Ci;j
where Si;j D S0;0 uj d i j ; i D 0; : : : ; N; j D 0; : : : i; (35.31) where p u D exp T =N ; d D 1=u; N is the number of steps in the tree, T is the time to expiration, and is the annualized volatility. The risk neutral probability of an up move is pD
exprT =N d ud
(35.32)
where r is the risk free rate of return. The main advantage of the standard approach is simplicity. One of the important limitations for our purposes is the assumption that the probabilities are constant throughout the tree. This implies that the nodal probability of reaching any particular terminal point SN;j is just the binomial probability
Ci;j D exp.r t/Ci;j .Fi;j ; t/
Xi kDj C1
i;k .Fi;k Fi;j /; (35.37)
The lower central node must then satisfy Si C1;j Si C1;j C1 D 2 . Fi;j We then move up from the central node using the recursions, Si;j C1 D
Ci1;j Si;j i;j Fi 1;j .Fi 1;j Si;j /
Ci1;j i;j .Fi 1;j Si;j /
:
(35.38) Moves down from the central node are given by Si;j D
i 1;j Fi 1;j .Si;j C1 Fi 1;j / Pi1;j Si;j C1 i 1;j .Si;j C1 Fi 1;j / Pi1;j (35.39)
where pN;j D PrŒST D SN;j D ˆ.j; N; p/; j Š.N Š j /Š
p j .1 p/N j : D NŠ
(35.33)
Pi;j D exp.r t/Pi;j .Fi;j ; t/
Xj 1 kD1
i;k .Fi;k Fi;j / (35.40)
The Arrow–Debreu probabilities are then updated,
35.6.2 Implied Binomial Trees The implied binomial tree approach of Rubinstein (1994) and Jackwerth and Rubinstein (1996) relaxes the assumption of constant probabilities as in (35.32). Our discussion here follows the developments in Barle and Cakici (1998)
i C1;j exp.r t/ 9 8 for j D n C 1 = pi;j i;j < D pi;j 1 i;j 1 C .1 pi;j /i;j for 2 j n : ; : for j D 1 .1 pi;1 /i;1 (35.41)
35 Estimating Implied Probabilities from Option Prices and the Underlying
35.6.3 Numerical Example We illustrate the implied binomial trees using a numerical example. Consider an 2-step tree, 6 months from expiration, which implies t D 0:25. The initial price S0;0 D 100. Volatility is a smirk, rising when the strike price is above 100, K D max.20; 20 Fi;j =S0;0 /. The continuously compounded interest rate is r D ln.4:8790/ D 5%. Starting from S0;0 , we compute the forward price, F0;0 D 100 exp.0:04879 0:25/ D 101:2272. The Arrow–Debreu price at the origin is 0;0 D 1:0. The Black–Scholes call struck at the futures price, expiring next period and volatility D 20 101:2272=100 D 20:2454 is BSCi;j .S D S0;0 ; K D F0;0 ; T D 0:25; D 20:2454/ D 4:0367; which implies
Ci;j D exp.0:04879 0:25/4:0367 D 4:0862:
Having found the new spot prices, we can update the transition probabilities, p0;0 D
.F0;0 S1;0 / D 0:4798; .S1;1 S1;0 /
and the Arrow–Debreu prices, 1;1 D 0;0 p0;0 exp.r t/ D 0:4740; 1;0 D 0;0 .1 p0;0 / exp.r t/ D 0:5139: The next central node is at the 2-step ahead forward price, S2;1 D 100 exp.0:04879 0:25 2/ D 102:4695: We work up from there using (35.38). This requires the forward prices at the prior node, F1;1 D exp.r t/S1;1 D 111:0902; and the call prices. We compute the new local volatility D 22:2180, and price the call
We then use (35.36) to find S1;1 D 101:2272
521
.1:0/.101:2272/ C 4:0862 D 109:7434: .1:0/.101:2272/ 4:0862
We find the lower central node by equating spot and futures prices, 2 S1;0 D F0;0 =S1;1 D 93:3719:
S2;2 D
BSCi;j .S0;0 ; K D F1;1 ; T D 0:5; D 22:2180/ D 3:1597: Since this call is at expiry, we don’t need to discount,
Ci;j D BSCi;j . Substituting into (35.38),
.3:1597/.102:4695/ .0:4740/.111:0902/.111:0902 102:4695/ 3:1597 .0:4740/.111:0902 102:4695/
D 140:4882:
Moving down from the central node requires (35.39). We use the futures price F1;0 D exp.r t/S1;0 D 94:5178;
and the volatility D 20, to price the put, BSPi;j .S0;0 ; K D F1;0 ; T D 0:5; D 20/ D 2:3970: Substituting into (35.39),
S2;0 D
.0:5139/.94:5178/.102:4695 94:5178/ .2:3970/.102:4695/ .0:5139/.102:4695 94:5178/ 2:3970
D 83:2339:
522
B. Mizrach
This completes this simple example. A simpler approach which interpolates the implied volatility surface is my next topic.
35.7 Local Volatility Functions

The most popular practitioners' method is the use of local volatility functions. The seminal references are Shimko (1993) and Dumas, Fleming and Whaley (1998). Shimko (1993) proposed fitting a polynomial to the smile so as to infer how volatility varies with the striking price. He chose to use a quadratic,

σ̂(K, t) = a₀ + a₁K + a₂K².   (35.42)

In the case where volatility varies in this fashion, the density f(S_T) can be written

f(S_T) = N′(d₂) [ d_{2K} − (a₁ + 2a₂K)(1 − d₂ d_{2K}) − 2a₂K ].   (35.43)

Shimko's method has several advantages. It can be used to generate call prices for a continuum of strikes. In this way, I can fill in the histogram proposed by Longstaff (1990) and Rubinstein (1994). The downside is that it can easily generate either negative probabilities or pricing inconsistencies. I can illustrate this in an empirical example now.

The difficulties with Shimko's method arise when ∂σ/∂K is too steep. At some point, the volatility may become so high that it implies that a more deeply out-of-the-money option has a higher price than one closer to being in the money. I utilize data from the Philadelphia Exchange for June 25, 1992 DM/$ puts and calls. I fit a quadratic to the smile, 0.4035(%M)² + 0.288(%M) + 10.362, in Fig. 35.3(a). At 4% out of the money, the implied volatility rises to 15.67, and at 6%, to 23.16. Once I translate these volatilities into prices in Fig. 35.3(b), one can see that an arbitrage exists: the 6% call is more valuable than the 4%, 0.16 versus 0.10.

A further difficulty is that even when this arbitrage is not a problem, there is no guarantee that the bracketed term in (35.43) will remain positive for all choices of the a coefficients. I also conjecture that one set of restrictions does not necessarily imply the other, making this a fairly complicated constrained optimization. Brunner and Hafner (2003) is a promising step in this direction. Bliss and Panigirtzoglou (2002) endorse the use of local volatility functions¹ over the mixture-of-normals approach in the next section, but this conclusion does not seem to hold when we look just at the tails of the distribution.

¹ They use natural splines in the implied volatility–delta space. When fitting the mixture of normals, they use the call price–strike metric.
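A minimal sketch of Shimko's procedure follows; the quotes below are hypothetical stand-ins for the PHLX data, and the density is recovered as the discounted second strike derivative of the call price (Breeden and Litzenberger 1978), computed by finite differences. Negative density values and non-monotone call prices flag exactly the inconsistencies discussed above.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical inputs: spot, rate, maturity, and a few strike/implied-vol pairs.
S, r, T = 0.58, 0.03, 0.25
K_obs = np.array([0.55, 0.57, 0.58, 0.59, 0.61])
iv_obs = np.array([0.125, 0.110, 0.104, 0.109, 0.122])
a2, a1, a0 = np.polyfit(K_obs, iv_obs, 2)        # sigma(K) = a0 + a1 K + a2 K^2

def bs_call(S, K, r, T, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

# Price calls on a fine strike grid using the fitted local volatility.
K = np.linspace(0.50, 0.66, 401)
C = bs_call(S, K, r, T, a0 + a1 * K + a2 * K**2)
dK = K[1] - K[0]
density = np.exp(r * T) * np.diff(C, 2) / dK**2  # f(K) ~ e^{rT} d2C/dK2

print("min density:", density.min())             # negative => bad probabilities
print("calls decreasing in strike:", bool(np.all(np.diff(C) <= 0)))
```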
35.8 PDF Approaches

35.8.1 Mixture-of-Log-Normals Specification

I assume that the stock price process is a draw from a mixture of three (nonstandard) normal distributions, Φ(μⱼ, σⱼ), j = 1, 2, 3, with μ₃ ≥ μ₂ ≥ μ₁. Three additional parameters λ₁, λ₂ and λ₃ define the probabilities of drawing from each normal. To nest the Black–Scholes, we restrict the mean to equal the interest differential, μ₂ = i_d − i_f. Risk-neutral pricing then implies restrictions on either the other means or the probabilities. I chose to let μ₁, λ₁ and λ₃ vary, which implies

μ₃ = μ₂ + (λ₁/λ₃)(μ₂ − μ₁)   (35.44)

and

λ₂ = 1 − λ₁ − λ₃.   (35.45)

For estimation purposes, this leaves six free parameters, θ = (θ₁, θ₂, …, θ₆). To keep the estimates positive, I embed the parameters into the exponential function. The left-hand tail distribution is given by

Φ(μ₁, σ₁) = Φ(i_d − i_f − e^{θ₁}, e^{θ₂}/100).   (35.46)

The only free parameter of the middle normal density is the standard deviation,

Φ(μ₂, σ₂) = Φ(i_d − i_f, e^{θ₃}/100).   (35.47)

I use the logistic function for the probabilities to bound them on [0, 1],

λ₁ = e^{θ₄}/(1 + e^{θ₄}),   (35.48)

λ₃ = e^{θ₅}/(1 + e^{θ₅}).   (35.49)

The probability specification implies the following mean restriction on the third normal,

Φ(μ₃, σ₃) = Φ( i_d − i_f + e^{θ₁} [e^{θ₄}/(1 + e^{θ₄})] / [e^{θ₅}/(1 + e^{θ₅})], e^{θ₆}/100 ).   (35.50)
Mizrach (2006) shows that this data generating mechanism can match a wide range of shapes for the volatility smile.
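The parameterization can be sketched in a few lines. The code below, using hypothetical θ values, builds the probabilities and means from (35.44) to (35.50) and verifies that the mixture mean equals the interest differential, as risk neutrality requires; the rate values are also hypothetical.

```python
import numpy as np

# Hypothetical domestic/foreign rates and theta vector.
i_d, i_f = 0.03, 0.08
theta = np.array([-2.0, -1.9, -0.8, -1.9, -2.1, -2.4])

lam1 = np.exp(theta[3]) / (1 + np.exp(theta[3]))   # (35.48)
lam3 = np.exp(theta[4]) / (1 + np.exp(theta[4]))   # (35.49)
lam2 = 1 - lam1 - lam3                             # (35.45)

mu2 = i_d - i_f                                    # nests Black-Scholes
mu1 = mu2 - np.exp(theta[0])                       # (35.46): left tail mean
mu3 = mu2 + np.exp(theta[0]) * lam1 / lam3         # (35.50) mean restriction
sig1, sig2, sig3 = np.exp(theta[[1, 2, 5]]) / 100  # standard deviations

# The risk-neutral mean restriction (35.44) holds by construction.
assert abs(lam1 * mu1 + lam2 * mu2 + lam3 * mu3 - mu2) < 1e-12
print(lam1, lam2, lam3, mu1, mu3)
```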
Fig. 35.3 Local volatility functions and arbitrage violations Notes: Panel (a) plots implied volatilities for DM/$ options on June 25, 1992 against the percent in (+) or out (−) of the money; the dark line is a quadratic function 0.4035(%M)² + 0.288(%M) + 10.362 fit to the implied volatility of the DM/US$ options. Panel (b) plots the option prices implied by Shimko's method. As we extrapolate the local volatility surface further out of the money, the model produces an arbitrage violation in panel (b) for the calls more than 5% out of the money
35.8.2 Data and Estimation Results
I focus on a subset of the results from Haas, Mittnik and Mizrach (2006). I utilize the British Pound/US$ option data described in Sect. 35.3. For estimation purposes, I excluded options that were less than 5 or more than 75 days to maturity, more than 10% in or out of the money, or had volumes of less than 5 contracts. This screen eliminated most data points with unreasonably high implied volatilities.

There are two key issues in fitting a stochastic process to the options data. The first is to extend the analysis to American options, which can be exercised before expiration. The second is choosing the loss function for estimation. I approximate American puts and calls using the Bjerksund and Stensland (1993) approach. Hoffman (2000) shows that the Bjerksund–Stensland algorithm compares favorably in accuracy and computational efficiency to the Barone-Adesi and Whaley (1987) quadratic approximation. Our estimates were also quite similar using implied binomial trees.²

² We can also price exotics in this framework, pricing options using Monte Carlo or other numerical procedures. Conditions under which these simulated moments models are appropriate are considered in Mizrach (2006).
35.9 Inferences from the Mixture Model

Let {d_{j,t}}_{j=1}^n = [c(τ₁, K₁), …, c(τ_m, K_m), p(τ_{m+1}, K_{m+1}), …, p(τ_n, K_n)] denote a sample of size n of the calls c and puts p traded at time t, with strike prices K_j and expirations τ_j years ahead, and denote the pricing estimates from the model by {d_{j,t}(θ)}. In estimation, Christoffersen and Jacobs (2004) emphasize that the choice of loss function is important. Bakshi, Cao and Chen (1997), for example, match the model to the data using option prices. This can lead to substantial errors among the low-priced options, though. Since these options are associated with tail probability events, this is not the best metric for our exercise. We obtained the best fit overall using the Bjerksund–Stensland implied volatility,

σ_{j,t} = BJST⁻¹(d_{j,t}, S_t, i_t).   (35.51)

Let the estimated volatility be denoted by

σ_{j,t}(θ) = BJST⁻¹(d_{j,t}(θ), S_t, i_t).   (35.52)

We then minimize the sum of squared deviations from the implied volatility in the data,

min_θ Σ_{j=1}^n (σ_{j,t}(θ) − σ_{j,t})².   (35.53)

As Christoffersen and Jacobs note, this is just a weighted least squares problem that, with the monotonicity of the option price in σ, satisfies the usual regularity conditions. I report estimates for two sample days around the ERM crisis in Table 35.1. Haas, Mittnik and Mizrach (2006) establish that the model fits well both before and after the crisis. Inferences drawn from the BP/US$ options are analyzed in the next section.

35.9.1 Tests of the Adequacy of Black–Scholes

There is often a belief that the standard model is sufficient in all but the most extreme circumstances. It turns out that a formal test of this kind in a mixture model is not at all straightforward. The forecast comparisons discussed in the second and third parts may make more sense for practitioners.

Hansen (1996) considers the case of additive nonlinear models like our mixture model,

σ_{j,t} = (1 − λ₁ − λ₃) Φ(i_d − i_f, σ₂(θ₃)) + λ₁ Φ(μ₁(θ₁), σ₁(θ₂)) + λ₃ Φ(μ₃, σ₃(θ₆)).

Under Black–Scholes, λ₁ = λ₃ = 0, and it is tempting to compare the models using the likelihood ratio. Unfortunately, under the Black–Scholes alternative, the parameters in the two tail log normals, θ₁, θ₂ and θ₆, are nuisance parameters. Formally, the derivative of the likelihood function is flat with respect to these parameters. Their nonidentification invalidates the distribution theory for the standard LR test. Hansen (1996) reports severe size distortions in several cases for the standard test.

Fortunately, Hansen also offers a constructive alternative. In our present setting, we can take minimizing the squared residuals, ε_t² = (σ_{j,t}(θ) − σ_{j,t})², as the objective function. The restricted alternative is given by ε_{R,t}² = (σ_{j,t}(θ₃) − σ_{j,t})². Construct the F-test,
F_n = n Σ_{j=1}^n (ε_{R,t}² − ε_t²) / Σ_{j=1}^n ε_{R,t}².   (35.54)

Table 35.1 Estimates of the mixture of log normals model

Date          θ₁       θ₂       θ₃       θ₄       θ₅       θ₆       %VaR₀.₀₅   %VaR₀.₀₁
20-Aug-1992   2.179    1.908    0.816    1.855    1.908    2.440    4.84%      7.14%
              (0.01)   (0.15)   (0.01)   (0.01)   (0.07)   (0.63)
17-Sep-1992   0.882    0.943    3.634    3.108    2.051    1.933    6.11%      8.73%
              (0.07)   (0.48)   (0.48)   (0.02)   (0.09)   (2.12)

LR₀.₀₅ = 11.87 (0.00)    LR₀.₀₁ = 0.09 (0.76)

Notes: The θ's are estimates of the model (35.14). t-ratios are in parentheses. %VaR₀.₀₅ and %VaR₀.₀₁ are the value at risk, in percent, of a long position in the BP over a 4-week time horizon at the 95 and 99% confidence levels. The LR statistic is given by (35.56) and compares the VaR across the two dates. The simulation size is N = 250 observations. The LR test, with p-values underneath, is distributed χ²(1).
Hansen then shows that the appropriate statistic is

F̄_n = sup_{θ∈Θ} F_n(θ),

whose distribution is not asymptotically F and must be obtained by bootstrap. Since coefficients are likely to change daily, this is a potentially tedious procedure.
35.9.2 Hypothesis Tests on the Forecast Intervals

To compare the value-at-risk on the two days at both confidence levels, I adapt the framework of Christoffersen (1998). In each of the three cases, I construct the test of the null hypothesis from a sequence of Bernoulli trials,

I_t^α = { 1 if Pr[(S_T − S_t)/S_t < −%VaR_α] > α;  0 if Pr[(S_T − S_t)/S_t < −%VaR_α] < α },   (35.55)

where S_t is the current spot price, T − t is 4 weeks, and α is the critical level of the VaR. I take the VaR loss intervals from August 20 as the null, and compute %VaR₀.₀₅ = 4.84% and %VaR₀.₀₁ = 7.14%. Since the VaR is assessing tail risk, we are concerned with the coverage of the forecast interval (−∞, −%VaR_α). Under what Christoffersen calls unconditional coverage, I test

H₀: E[I_t^α] = α,

using the likelihood ratio

LR_α = −2 ln[ α^{n₀}(1 − α)^{n₁} / (α̂^{n₀}(1 − α̂)^{n₁}) ],   (35.56)

where α̂ = n₀/(n₁ + n₀) is the maximum likelihood estimator of α. Under H₀, LR_α is distributed χ²(1).

As one might expect, the VaR rose substantially with the higher volatility during the crisis. Value at risk rises to %VaR₀.₀₅ = 6.11% and %VaR₀.₀₁ = 8.73%. To assess the statistical significance of this, I implement the test in (35.56) in Table 35.1. I simulate N = 250 forecasts from the September 17 parameterization. Let n₀ be the number of times that (S_T − S_t)/S_t is less than −%VaR_α. I find that n₀ under the August loss coverage rejects too often, 26 times, rather than the expected 0.05 × 250 = 12.5 times under the null. The likelihood ratio is 11.87, which rejects the null at better than 1% significance. At the 99% level, I find n₀ = 3, and the test, with LR₀.₀₁ = 0.094, is not nearly powerful enough to reject that the forecasts are different.

35.9.3 Comparing the Entire Density

Next, we evaluate the forecast densities produced across their entire support. The approach we take is the one originally proposed by Berkowitz (2001). He notes that the probability integral transform

F̂(s_t) = ∫_{−∞}^{s_t} f(u) du

generates uniform, independent and identically distributed estimates under fairly weak assumptions. Testing for an independent uniform density in small samples can be problematic, so Berkowitz suggests transforming the data into normal random variates,

z_t = Φ⁻¹(F̂(s_t)).

A simple test of the null hypothesis that the transformed forecast statistics, z_t, have mean zero can be performed using the likelihood ratio,

LR = ( Σ_{t=1}^T z_t )² / (T σ̂²),   (35.57)

where σ̂ is the forecast standard deviation. LR is then approximately distributed χ²(1).

I graph the forecast density for August 20, 1992 in Fig. 35.4. The realized 4-week returns are then plotted in comparison to the density. To the naked eye, the left tail, associated with BP depreciation, is substantially longer. This was indeed detected in our VaR exercise in the prior section. When looking across the entire density, the power of the test is substantially weaker. The average forecast is 0.9756, or approximately a 2.5% decline. The average z-value is −0.4261, and σ̂² = 2.4195. I compute LR = 1.5008, which has a p-value of only 0.2206. Clearly, when the risks are one-sided, you want to exploit this information.
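Both tests are easy to reproduce. The sketch below implements the unconditional coverage statistic (35.56) and the zero-mean statistic, written here as T z̄²/σ̂² under my reading of (35.57); with the counts and moments reported above, it returns the same values as the text.

```python
import math

def lr_coverage(n0, n, alpha):
    # Unconditional coverage LR (35.56): n0 exceedances in n Bernoulli trials.
    n1 = n - n0
    a_hat = n0 / n
    return -2 * (n0 * math.log(alpha) + n1 * math.log(1 - alpha)
                 - n0 * math.log(a_hat) - n1 * math.log(1 - a_hat))

def lr_berkowitz(zbar, sigma2, T):
    # Zero-mean test on the transformed forecasts z_t, as in (35.57).
    return T * zbar**2 / sigma2

print(round(lr_coverage(26, 250, 0.05), 2))          # 11.87: rejects the August VaR
print(round(lr_coverage(3, 250, 0.01), 3))           # 0.095: cannot reject at 99%
print(round(lr_berkowitz(-0.4261, 2.4195, 20), 4))   # 1.5008, p-value 0.2206
```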
Fig. 35.4 Forecast density comparison Notes: The light bars are the forecast density for the 4-week return on the spot exchange rate, S(t+20)/S(t), using the mixture model parameters from August 20, 1992 in Table 35.1; frequencies are in percent. The dark bars are the subsequent 20 realizations. A formal statistical comparison is conducted using (35.57)
35.10 Jump Processes

I now move back to the case of a single process for the underlying. The key step is to introduce discontinuities in the price process through jumps. I begin with the Merton (1976) model as a baseline, and then allow for stochastic volatility in the second part.

35.10.1 Merton Model

Merton (1976) has proposed a jump diffusion model,

dS_t = (μ − λk) S_t dt + σ S_t dW_t + dq_t,   (35.58)

where dW_t is a Wiener process, dq_t is the Poisson process generating the jumps, and σ is the volatility. dW and dq are considered independent. This assumption is important because we cannot apply risk-neutral valuation to situations where the jump size is systematic. λ is the rate at which jumps happen, μ is the expected return, and k is the average jump size. This model gives rise to fatter left and right tails than Black–Scholes.

If we assume that the log of k is normal with standard deviation δ, the European call option price is

C = Σ_{n=0}^∞ [ e^{−λ′T} (λ′T)^n / n! ] f_n,   (35.59)

where λ′ = λ(1 + k) and f_n is the Black–Scholes option price when the variance is σ² + nδ²/T and the risk-free rate is r − λk + n ln(1 + k)/T.

Carr and Wu (2004) consider the generalization of the jump diffusion model to Lévy processes. Bates (1991) has estimated this model to infer the risk in options prior to the 1987 stock market crash.
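The series in (35.59) converges quickly and is straightforward to truncate. Below is a minimal sketch of the Merton price with hypothetical parameter values; bs_call is the standard Black–Scholes formula.

```python
import math
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm.cdf(d1) - K * math.exp(-r * T) * norm.cdf(d2)

def merton_call(S, K, T, r, sigma, lam, k, delta, n_max=50):
    lam_p = lam * (1 + k)                            # lambda' = lambda (1 + k)
    price = 0.0
    for n in range(n_max):
        w = math.exp(-lam_p * T) * (lam_p * T) ** n / math.factorial(n)
        sig_n = math.sqrt(sigma**2 + n * delta**2 / T)   # variance sigma^2 + n delta^2/T
        r_n = r - lam * k + n * math.log(1 + k) / T      # adjusted risk free rate
        price += w * bs_call(S, K, T, r_n, sig_n)
    return price

# Hypothetical inputs: one jump every two years with mean size -5%.
print(merton_call(100, 100, 0.5, 0.05, 0.2, lam=0.5, k=-0.05, delta=0.1))
```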
35.10.2 Bipower Variation

I follow Andersen, Bollerslev and Diebold (2007) and consider a stochastic volatility model with jumps,

dp_t = μ_t dt + σ_t dW_t + κ_t dq_t,   (35.60)

where p_t = ln(S_t), q_t is a counting process with intensity λ_t, and κ_t is the jump size with mean μ_κ and standard deviation σ_κ. The quadratic variation for the cumulative return process, r_t = p_t − p_0, is then

[r, r]_t = ∫₀^t σ_s² ds + Σ_{0<s≤t} κ²(s).   (35.61)

Estimation of the quadratic variation proceeds with discrete sampling from the log price process. Denote the Δ-period returns by r_{t+1,Δ} = p_{t+1} − p_{t+1−Δ}. The realized volatility is

RV_{t+1,Δ} = Σ_{j=1}^{1/Δ} r²_{t+jΔ,Δ}.   (35.62)

In the now vast literature on the stochastic volatility model with λ = 0, researchers have employed realized volatility as an estimator of the integrated volatility, ∫₀^t σ_s² ds. In the case of discontinuous price paths, Barndorff-Nielsen and Shephard (2006) show that the realized volatility will also include the jump component,

plim_{Δ→0} RV_{t+1,Δ} = ∫_t^{t+1} σ_s² ds + Σ_{t<s≤t+1} κ²(s).   (35.63)

To extract the integrated volatility from (35.60), Barndorff-Nielsen and Shephard have also introduced the realized bipower variation,

BV_{t+1,Δ} = μ₁⁻² Σ_{j=2}^{1/Δ} |r_{t+jΔ,Δ}| |r_{t+(j−1)Δ,Δ}|,   (35.64)

where μ₁ = √(2/π). It is then possible to show

plim_{Δ→0} BV_{t+1,Δ} = ∫_t^{t+1} σ_s² ds.   (35.65)

By comparing (35.63) and (35.65), we have an estimate of just the jump portion of the process,

plim_{Δ→0} (RV_{t+1,Δ} − BV_{t+1,Δ}) = Σ_{t<s≤t+1} κ²(s).   (35.66)

I next turn to an application of this analysis in the ERM exchange rate context.

35.10.3 An Application

I will now take the sampling interval to be daily changes, and compute 50-period rolling sample estimates of realized volatility,

RV_t^{1/2} = ( Σ_{j=1}^{50} r²_{t−j} / 50 )^{1/2},   (35.67)

and bipower variation,

BV_t = (π/2) Σ_{j=2}^{50} |r_{t−j}| |r_{t−(j−1)}| / 50.   (35.68)

We constrain the jump risk to be positive,

κ_t = ( max[RV_t − BV_t, 0] )^{1/2}.   (35.69)

I again use the British Pound/US$ exchange rate sample for January to October 1992 analyzed in Mizrach (1995), and plot the realized standard deviation and jump risk in Fig. 35.5.
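A minimal sketch of the rolling estimators (35.67)–(35.69) is below, applied to hypothetical daily percentage returns with one injected jump; the jump component spikes near the injected discontinuity while the diffusive volatility stays smooth.

```python
import numpy as np

# Hypothetical daily % returns with a single jump inserted at t = 200.
rng = np.random.default_rng(0)
r = rng.normal(0, 0.5, 400)
r[200] += 3.0

def jump_risk(r, w=50):
    # 50-day rolling realized volatility (35.67) and bipower variation (35.68);
    # the jump risk (35.69) is the positive part of their difference.
    rv = np.array([np.sum(r[t - w:t] ** 2) / w for t in range(w, len(r))])
    bv = np.array([(np.pi / 2) *
                   np.sum(np.abs(r[t - w + 1:t]) * np.abs(r[t - w:t - 1])) / w
                   for t in range(w, len(r))])
    return np.sqrt(rv), np.sqrt(np.maximum(rv - bv, 0.0))

vol, jump = jump_risk(r)
print(vol.max(), jump.max())   # jump estimate peaks around the injected jump
```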
The figure has several features worth noting. The first is that while realized volatility for the British Pound was quite high leading into the Spring of 1992, jump risk is zero until April 24. Jump risk rises rather steadily through the rest of the spring, peaking at 0.2836% on June 1. Jump risk then falls through the summer, returning to zero from July 24 to August 24. On August 25, the jump risk spikes to 0.2188% and then falls back until September 14, when it spikes again to 0.4155%. Following a third local maximum of 0.3850% on Britain's September 17 ERM withdrawal, jump risk falls back to zero by September 22. This is true even though the realized volatility remains as high as it was during the crisis. Although not shown in the figure, both the volatility and jump risks remain high into 1993. At the end of 1992, more than 3 months after the crisis, the realized volatility is 0.9007%. The jump risk spikes twice above 0.30%, on October 19 and December 2.

Fig. 35.5 Jump risk estimates Notes: The figure plots the realized daily standard deviation (35.67) and the jump risk (35.69) for the British Pound spot rate between March 18 and September 30, 1992
35.11 Conclusion

It is important to stress what this paper has not discussed: the preservation of moments in the change to the risk-neutral measure. While straightforward in the Black–Scholes case, the results do not hold in general. This paper has only shown, empirically, that there is useful information in the risk-neutral measure. The stochastic volatility model was not addressed in great detail; my emphasis on tail behavior gave priority to jumps. I also have not discussed approaches like Bates (2006) that rely on the characteristic function.

The paper has covered ground in three important areas: (1) nonparametric approaches estimating implied densities directly from option prices; (2) parametric modeling of the local volatility surface; (3) generalizations of the stochastic process for the underlying. I also evaluate and implement tools for hypothesis testing and evaluating forecasts from alternative models. The nuisance parameter issue is addressed with a bootstrap. Value-at-risk and forecast comparison rely on the formal statistical procedures introduced by Christoffersen (1998) and Berkowitz (2001).

I last considered the very recent work of Barndorff-Nielsen and Shephard on bipower variation. This enables researchers to easily extract the jump component from the underlying. These discontinuities are unhedgeable risks that may be systemic. Policy makers may find these tools and inferences worthwhile in a variety of contexts. Their subjective weights between type I and type II errors should not only be tested ex post but incorporated directly in the estimation. Both Skouras (2007) and Christoffersen and Jacobs (2004) have made progress along these lines.
Acknowledgements I would like to thank seminar participants at the OCC, the Federal Reserve Bank of Philadelphia, and the FDIC Derivative Securities and Risk Management Conference for helpful comments.
References

Andersen, T. G., T. Bollerslev, and F. X. Diebold. 2007. "Roughing it up: including jump components in the measurement, modeling and forecasting of return volatility." Review of Economics and Statistics 89, 701–720.
Bakshi, G., C. Cao, and Z. Chen. 1997. "Empirical performance of alternative option pricing models." Journal of Finance 52, 2003–2049.
Barle, S. and N. Cakici. 1998. "How to grow a smiling tree." Journal of Financial Engineering 7, 127–146.
Barndorff-Nielsen, O. and N. Shephard. 2006. "Econometrics of testing for jumps in financial economics using bipower variation." Journal of Financial Econometrics 4, 1–30.
Barone-Adesi, G. and R. E. Whaley. 1987. "Efficient analytic approximation of American option values." Journal of Finance 42, 301–320.
Bates, D. 1991. "The crash of '87: was it expected? The evidence from options markets." Journal of Finance 46, 1009–1044.
Bates, D. 2006. "Maximum likelihood estimation of latent affine processes." Review of Financial Studies 19, 909–965.
Berkowitz, J. 2001. "Testing density forecasts with applications to risk management." Journal of Business and Economic Statistics 19, 466–474.
Bjerksund, P. and G. Stensland. 1993. "Closed-form approximation of American options." Scandinavian Journal of Management 9, 87–99.
Bliss, R. and N. Panigirtzoglou. 2002. "Testing the stability of implied probability density functions." Journal of Banking and Finance 26, 381–422.
Breeden, D. and R. Litzenberger. 1978. "State contingent prices implicit in option prices." Journal of Business 51, 621–651.
Brunner, B. and R. Hafner. 2003. "Arbitrage-free estimation of the risk-neutral density from the implied volatility smile." Journal of Computational Finance 7, 76–106.
Cakici, N. and K. R. Foster. 2002. "Risk-neutralized at-the-money consistent historical distributions in currency options pricing." Journal of Computational Finance 6, 25–47.
Carr, P. and L. Wu. 2004. "Time-changed Levy processes and option pricing." Journal of Financial Economics 71, 113–141.
Christoffersen, P. 1998. "Evaluating interval forecasts." International Economic Review 39, 841–862.
Christoffersen, P. and K. Jacobs. 2004. "The importance of the loss function in option valuation." Journal of Financial Economics 72, 291–318.
Dumas, B., J. Fleming, and R. Whaley. 1998. "Implied volatility functions: empirical tests." Journal of Finance 53, 2059–2106.
Dupire, B. 1994. "Pricing with a smile." Risk 7, 18–20.
Haas, M., S. Mittnik, and B. Mizrach. 2006. "Assessing central bank credibility during the EMS crises: comparing option and spot market-based forecasts." Journal of Financial Stability 2, 28–54.
Hambly, B. 2006. Lecture notes on financial derivatives, Oxford University.
Hansen, B. E. 1996. "Inference when a nuisance parameter is not identified under the null hypothesis." Econometrica 64, 413–430.
Hoffman, C. 2000. Valuation of American options, Oxford University thesis.
Karatzas, I. and S. Shreve. 1991. Brownian motion and stochastic calculus, 2nd ed. Springer, New York.
Melick, W. and C. Thomas. 1997. "Recovering an asset's implied PDF from option prices: an application to crude oil during the Gulf crisis." Journal of Financial and Quantitative Analysis 32, 91–115.
Merton, R. 1976. "Option pricing when underlying stock returns are discontinuous." Journal of Financial Economics 3, 125–144.
Mizrach, B. 1995. "Target zone models with stochastic realignments: an econometric evaluation." Journal of International Money and Finance 14, 641–657.
Mizrach, B. 2006. "The Enron bankruptcy: when did the options market lose its smirk." Review of Quantitative Finance and Accounting 27, 365–382.
Ritchey, R. 1990. "Call option valuation for discrete normal mixtures." Journal of Financial Research 13, 285–295.
Shimko, D. 1993. "Bounds of probability." Risk 6, 33–37.
Skouras, S. 2007. "Decisionmetrics: a decision-based approach to econometric modelling." Journal of Econometrics 127, 414–440.
Stein, E. M. and J. C. Stein. 1991. "Stock price distributions with stochastic volatility: an analytic approach." Review of Financial Studies 4, 727–752.
Tompkins, R. 2001. "Implied volatility surfaces: uncovering regularities for options on financial futures." The European Journal of Finance 7, 198–230.
Chapter 36

Are Tails Fat Enough to Explain Smile

Ren-Raw Chen, Oded Palmon, and John Wald

Abstract It has been well documented that using the Black-Scholes model to price options with different strikes generates the so-called volatility smile. Many previous papers have attributed the smile to the normality assumption in the Black-Scholes model, and hence generalize the Black-Scholes model to incorporate a richer distribution. In contrast to previous studies, our model allows not only a richer distribution but also the relaxation of another crucial assumption of Black-Scholes: continuous trading. We show, using S&P 500 call options, that relaxing continuous trading produces a non-trivial change in the implied volatility. When an empirical distribution is considered as well, the smile is almost completely removed.

Market prices of options that differ only in their strike prices are inconsistent with a single volatility for the underlying asset; this well-known feature is called the volatility smile or smirk. The volatility smile is often attributed either to the non-normality of stock returns or to the impossibility of performing costless arbitrage (rebalancing) in continuous time. In this paper, we consider option pricing models that incorporate no rebalancing and/or a nonparametrically estimated density. We attempt to empirically identify which of these two factors contributes more to the smile bias. Using S&P 500 index options, we find that the model with no rebalancing but with normality has a somewhat diminished smile, and the model with a nonparametric density but with continuous trading has a slight smile. Only the model with both a nonparametric density and no rebalancing has no perceptible smile bias.

R.-R. Chen, Graduate School of Business Administration, Fordham University, New York, NY 10019, USA. e-mail: [email protected]
O. Palmon, Rutgers Business School, Rutgers University, Piscataway, NJ 08854, USA. e-mail: [email protected]
J. Wald, Department of Finance, College of Business, University of Texas at San Antonio, San Antonio, TX 78249, USA. e-mail: [email protected]

Keywords Option pricing · Volatility smile · Nonparametric density
36.1 Introduction

The Black-Scholes model is known to bias option prices for in-the-money and out-of-the-money options. Two unrealistic assumptions of the Black-Scholes model that may be responsible for this bias are continuous rebalancing and normality. The literature has dealt with these two issues separately. Some studies (e.g., Leland 1985; Constantinides 1998) have introduced transaction costs to the Black-Scholes model to relax the continuous rebalancing assumption, while other studies allow for stochastic volatility (e.g., Hull and White 1987; Heston 1993) and jump diffusions (e.g., Merton 1976; Cox and Ross 1976) to enrich the distributional assumption of the Black-Scholes model. In this study we consider both unrealistic assumptions. We find that eliminating the continuous rebalancing assumption eliminates some of the volatility smile, accounting for the actual underlying density eliminates most of it, and replacing both assumptions simultaneously eliminates the volatility smile.

To examine the continuous trading assumption, we present a model in which returns are distributed normally, investors do not rebalance their portfolios,¹ and asset prices are determined by a standard asset pricing model. We find that a model without continuous rebalancing is able to eliminate a relatively small fraction of the volatility smile.

To examine the normality assumption, we avoid any parametric model and instead use the nonparametrically estimated density of the underlying asset. This design avoids the bias caused by any mispricing due to a rigid parametric structure and hence can truthfully gauge the impact of the bias caused by normality.² We first assume continuous rebalancing with the empirical distribution by forcing all discount rates to be the risk-free rate.³ We find that this model is more effective in eliminating the volatility smile. Then, we present a model based on an empirical distribution that does not permit rebalancing. Empirically, we find that once both the normality and continuous rebalancing assumptions are removed, the smile largely disappears.

Our work differs from that of Rubinstein (1994) and Ait-Sahalia (1996), who use option prices to back out the return distribution of the underlying asset. In their models, option prices by construction do not generate a smile because they are taken as an input. Rosenberg and Engle (2002) also estimate a monthly pricing kernel based on index option data. In contrast, our model uses the density estimated nonparametrically from the underlying asset's historical returns. The assumption of a nonparametric distribution for the underlying asset is not new,⁴ but prior models have not directly used the historical density to operationalize their model.

Figure 36.1 graphically illustrates the relationship between the models we consider, with continuous rebalancing models on top and the static, CAPM-based models below, and normal models on the left and nonparametrically estimated distributions on the right. After a brief review of the literature in Sect. 36.2, we discuss the basic Black-Scholes model with normality and continuous rebalancing in Sect. 36.3.1. We then remove the continuous rebalancing assumption in Sect. 36.3.2 and present a static CAPM option pricing model instead. In Sect. 36.3.3 we replace the normality assumption of Black-Scholes with a density nonparametrically estimated from the actual returns of the underlying asset, but assume costless rebalancing. In Sect. 36.3.4 we combine the static CAPM with our nonparametric density. Section 36.4 provides a graphical and econometric comparison of the implied volatilities from these models, and Sect. 36.5 concludes.

Fig. 36.1 From normal to empirical density. The figure arranges the four models by assumption, from continuous to discrete trading and from a normal to an empirical density: Black-Scholes, Risk-Neutral Empirical, Static Lognormal, and Static Empirical (Sects. 36.3.1–36.3.4).

¹ Investors may not be able to rebalance due to transaction costs.
² The use of an empirically based distribution instead of a parametric model, such as a jump diffusion or stochastic volatility model, is motivated by repeated empirical findings that those models fail to explain the Black-Scholes biases (see, for example, Das and Sundaram 1999).
³ Note that continuous rebalancing is an important necessary condition for the use of the risk-neutral pricing methodology. Typically, this linkage is provided by the Brownian motion mathematics. For the exposition here, we assume such a linkage exists for the empirical distribution without offering a proof and proceed to use the risk-free rate as the discount rate.
⁴ See Stutzer (1996), Ait-Sahalia (1996), and Ait-Sahalia and Lo (1998).
36.2 Literature Review

The empirical literature has consistently demonstrated that the Black-Scholes model biases option values (or, equivalently, implied volatilities) along the moneyness and maturity dimensions. These biases are known as the volatility smile, which is the focus of this paper, and the volatility term structure, respectively. We begin with a brief review of the relevant literature.⁵

In a well-known study, Rubinstein (1994) demonstrates that the implied volatility for S&P 500 index options is a "sneer."⁶ Other studies include Shimko (1993), who argues that the implied S&P 500 index distribution is negatively skewed and more leptokurtic than a lognormal distribution, and Jackwerth and Rubinstein (1996), who find that the S&P 500 index futures distribution before the crash of 1987 resembles the lognormal distribution, while the post-crash distribution exhibits leptokurtosis and negative skewness.⁷

⁵ We note that the empirical literature on the Black-Scholes model is voluminous and a comprehensive review is a paper in itself. Black and Scholes (1972) provide the first empirical study and find option prices for high (low) variance stocks to be lower (higher) than predicted by the model. Subsequent studies include MacBeth and Merville (1979), Emanuel and MacBeth (1982), Whaley (1982), Geske et al. (1983), Geske and Roll (1984), Rubinstein (1985), and Whaley (1986).
⁶ Note that the Black-Scholes sneer found by Rubinstein (1994) is based upon data from one day.
⁷ For additional evidence, see Bates (1991, 1996) and Dumas et al. (1998).
The literature attributes the smile bias in the Black-Scholes model to two unrealistic assumptions: normality of stock returns and costless continuous rebalancing. The assumption that stock returns are distributed normally may underestimate option prices (particularly for deep in-the-money or far out-of-the-money options) because it implies that the tail probabilities of the return distribution are underestimated. To increase the tail probability, the theoretical research has used stochastic volatility, jumps, and GARCH to extend the Black-Scholes model. However, the empirical results have not been overly supportive. Scott (1987) and Bates (1996) find that the stochastic volatility model cannot explain the volatility smile.⁸ Bakshi et al. (1997) document that even a model with both jumps and stochastic volatility does not explain the volatility smile. Using S&P 500 index options with maturities less than 60 days from 1988 through 1991, they observe noticeable smiles for the three alternative models: stochastic volatility, stochastic volatility with jumps, and stochastic volatility with stochastic interest rates.⁹ Duan (1995, 1996) uses a GARCH model to price call options on the FTSE-100 index and demonstrates that the GARCH model can generate prices with a smile similar to the Black-Scholes smile for most options. However, this result is based upon only a few days of data, and no evidence is provided for whether the model can consistently remove the smile.¹⁰ More recently, Das and Sundaram (1999) point out that jump-diffusion and stochastic volatility models generate skew and excess kurtosis patterns that are inconsistent with reality. For example, the excess kurtosis generated by jump diffusion models (stochastic volatility models) declines with the holding horizon faster (more slowly) than in reality.¹¹

In contrast, the literature includes a much smaller number of studies on the relationship between the cost of rebalancing and the smile. Assuming proportional transaction costs, Leland (1985, proposition 2) argues that transaction costs should depress the values of near-the-money options more than those of away-from-the-money options.¹²,¹³ Longstaff (1995), Jackwerth and Rubinstein (1996), Dumas et al. (1998), and Peña et al. (1999) report that transaction costs and liquidity contribute to, but do not completely explain, the volatility smile. In an empirical study, Dennis and Mayhew (2002) discover that the implied skewness of the stock return distribution (implied by option prices) is positively related to stock betas (the higher the beta, the more negative the skewness). This finding indicates that risk premiums are important in pricing options. Our static models take this feature explicitly into consideration when computing option prices.

In summary, the literature includes many attempts to explain the volatility smile.¹⁴ Most studies document that the curvature of the smile is negatively related to option maturity. However, to the best of our knowledge, none of the studies successfully and completely explains the phenomenon. In the rest of the paper, we examine to what degree eliminating the assumptions of continuous rebalancing (no transaction costs) and normality eliminates the Black-Scholes smile.

⁸ Scott (1987) uses selected stock options from July 1982 to June 1983, and Bates (1996) uses the Deutsche Mark options from 1984 through 1991.
⁹ See also Eberlein et al. (1998) and Peña et al. (1999).
¹⁰ Eberlein et al. (1998) use a hyperbolic function for the distribution of underlying returns and find that the smile and the time-to-maturity effects are reduced in comparison to the Black-Scholes model, but not completely eliminated.
¹¹ For instance, in our sample of S&P 500 returns between January 3, 1950 and September 5, 2003, the excess kurtosis for the three-month period is 22% of the excess kurtosis for the 1-week holding period. In contrast, Das and Sundaram (1999) demonstrate that, using jump diffusion models, the corresponding number is less than 8%, while using stochastic volatility models, this number is more than 70%.
¹² One intuitive explanation for this proposition is that at-the-money options have higher gamma, and thus would need to be dynamically hedged more frequently than in- and out-of-the-money options.
¹³ Boyle and Vorst (1992) and Cochrane and Saa-Requejo (2000) derive the upper and lower bounds for option prices if transaction costs are introduced but do not focus on the volatility smile.
¹⁴ In addition to the volatility smile, the literature also documents that the implied volatility depends on option maturity. For example, see Day and Lewis (1992), Canina and Figlewski (1993), Lamoureux and Lastrapes (1993), Heynen et al. (1994), Xu and Taylor (1994), Campa and Chang (1995), Jorion (1995), and Amin and Ng (1997). However, this issue is largely beyond the scope of this paper.
36.3 The Models

36.3.1 The Black-Scholes Model

The Black-Scholes model assumes the following distribution for the underlying asset:

ln S_T ~ normal( ln S_t + (r − σ²/2)(T − t), σ²(T − t) ),   (36.1)

where S_T is the underlying asset price at time T given information at time t, r is the continuously compounded risk-free rate, and σ is the annualized volatility of the underlying asset.¹⁵ Given that C is the call price and K is the strike price, the well-known Black-Scholes formula is:

C_t^{BS} = e^{−r(T−t)} Ê_t[max{S_T − K, 0}] = S_t N(d₁) − e^{−r(T−t)} K N(d₂),   (36.2)

where Ê_t[·] is the risk-neutral expectation taken conditional on information given at time t and

d₁ = [ln S − ln K + (r + σ²/2)(T − t)] / (σ√(T − t)),
d₂ = d₁ − σ√(T − t).

Equation (36.2) is the standard no-arbitrage pricing theory: under the risk-neutral measure, all assets must yield the risk-free return. A necessary condition for this pricing theory is continuous rebalancing, which under normality is equivalent to complete markets. In the empirical section, we calibrate the Black-Scholes model to the market price of the option and solve for the implied volatility, σ_{t,T}^{BS}, for a given trade date and maturity date (t, T).

36.3.2 The Static Lognormal Model

In the absence of continuous rebalancing, the risk-neutral expectation in Equation (36.2) is not obtainable, and thus we need to rely on an asset pricing model.¹⁶ Under the normality assumption, the standard Capital Asset Pricing Model implies,¹⁷ for options written on the market portfolio:

E_t[C_T]/C_t − 1 = r(T − t) + β_{t,T} ( E_t[S_T/S_t] − 1 − r(T − t) ).   (36.3)

The definition of β_{t,T} under the CAPM is:

β_{t,T} = cov(S_T/S_t, C_T/C_t) / var(S_T/S_t),   (36.4)

where E_t[·] is the conditional expectation under the real measure, β_{t,T} is the systematic risk, assumed fixed between t and T, and S_t is the value of the market portfolio at time t.

Solving (36.3) and (36.4) for the option price, we obtain the following:

C_t^{SL} = [1/(1 + r(T − t))] { E_t[C_T] − (cov[S_T, C_T]/var[S_T]) ( E_t[S_T] − [1 + r(T − t)] S_t ) }.   (36.5)

The large bracketed term in (36.5) can be regarded as a "certainty equivalent" expected cash flow. Since it carries no risk, it can be discounted at the risk-free rate. The "certainty equivalent" expected cash flow is the original expected cash flow, E_t[C_T], minus the "dollar risk premium;" that is, the "market risk premium" adjusted by the "beta" of the option. In addition to the option price, we are also able to obtain the call's beta and the risk-adjusted rate.

In order to derive the closed-form solution for the lognormal distribution (i.e., (36.5)), we first define:

h₁ = [ln S − ln K + (μ + σ²/2)(T − t)] / (σ√(T − t)),
h₂ = h₁ − σ√(T − t),
h₃ = h₁ + σ√(T − t).   (36.6)

Also note that:

var[S_T] = S_t² e^{2μ(T−t)} ( e^{σ²(T−t)} − 1 ),
cov[S_T, C_T] = E[S_T C_T] − E[S_T] E[C_T],   (36.7)

where:

E[S_T C_T] = S_t² e^{(2μ+σ²)(T−t)} N(h₃) − K S_t e^{μ(T−t)} N(h₁),
E[S_T] = S_t e^{μ(T−t)},
E[C_T] = S_t e^{μ(T−t)} N(h₁) − K N(h₂).   (36.8)

These terms can then be substituted into (36.7) to provide a value for the lognormal call value, C_t^{SL} (see Appendix 36.A for further details). Alternatively, we can solve for the implied volatility in the Static Lognormal model, σ_{t,T}^{SL} = √(var[S_T])/S_t, by solving (36.5) given actual market prices.

Note that the model presented here is static. It assumes no trading/hedging prior to the maturity date of the option. To demonstrate the relationship between our static model and a general model that allows a finite number of trading/hedging times between now and maturity, we derive a "two-period" model in which investors are allowed to trade/hedge once before maturity. In this case, at the only time for trading/hedging, symbolized by t₁, the option price should be exactly the same as (36.5) with the substitution of t₁ for t. And today's option price uses (36.5) again with the substitution of t₁ for T. However, in this case there is no closed-form solution like (36.5), and an iterative solution process is necessary. In Appendix 36.B, we demonstrate that when continuous trading is allowed, the result converges to the Black-Scholes model.

Note that (36.5) does not provide a solution for volatility when the market price is too low. In the empirical analysis of this model, we omit the approximately 8% of the observations where a solution does not exist.

¹⁵ The observed S&P 500 index (SPX) is an ex-dividend series. Usually, given the large number of stocks in the index, the ex-dividend process is assumed to be continuous, and hence Equation (36.1) needs to be adjusted by the dividend yield (r must be replaced by r − d). This replacement could potentially have a large effect for American options, but the effect for European options, like the ones we consider, is small. Further, note that our empirical distribution is formed by returns of ex-dividend levels, which already account for the dividend effect, and hence no additional adjustment is necessary.
¹⁶ It is possible for the risk-neutral pricing methodology to co-exist with discrete trading, but the option must be written on the only asset in the economy, usually assumed to be the aggregated consumption. See Brennan (1979) for a detailed discussion of this point.
¹⁷ In a recent working paper, Bondarenko (2003) uses the CAPM to examine put option prices.
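The Static Lognormal closed form, (36.5)-(36.8), is short enough to implement directly. Below is a minimal sketch with hypothetical values for the real-measure drift μ and volatility σ:

```python
import numpy as np
from scipy.stats import norm

def static_lognormal_call(S, K, tau, r, mu, sigma):
    # (36.6): the h terms under the real-measure drift mu.
    h1 = (np.log(S / K) + (mu + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    h2 = h1 - sigma * np.sqrt(tau)
    h3 = h1 + sigma * np.sqrt(tau)
    # (36.8): real-measure moments of S_T and the call payoff.
    ES = S * np.exp(mu * tau)                                    # E[S_T]
    EC = ES * norm.cdf(h1) - K * norm.cdf(h2)                    # E[C_T]
    ESC = S**2 * np.exp((2 * mu + sigma**2) * tau) * norm.cdf(h3) - K * ES * norm.cdf(h1)
    # (36.7): covariance and variance.
    cov = ESC - ES * EC
    var = S**2 * np.exp(2 * mu * tau) * (np.exp(sigma**2 * tau) - 1)
    # (36.5): certainty-equivalent cash flow discounted at the risk free rate.
    return (EC - (cov / var) * (ES - (1 + r * tau) * S)) / (1 + r * tau)

print(static_lognormal_call(S=100, K=100, tau=0.25, r=0.05, mu=0.08, sigma=0.2))
```

With μ set near r, the dollar risk premium term is small and the static price lands close to the Black-Scholes value.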
36.3.3 The "Risk-Neutral" Empirical Model

In this subsection, we present an empirically based option pricing model. We use a pre-specified moving window to nonparametrically estimate the density of the underlying asset's return. However, we impose the assumption that continuous rebalancing is available, and hence all assets should yield the risk-free return. The nonparametric estimation techniques used are standard, and are briefly discussed in Appendix 36.C. Figure 36.2 provides a graph comparing the nonparametrically estimated density with the normal density fitted to all the data. The differences in skewness and kurtosis are evident.

Fig. 36.2 Density function. The nonparametrically estimated density of S&P 500 returns compared with a fitted normal density.

The valuation equation under this model is:

C_t^{RE} = e^{−r(T−t)} Ê_t[max{S_T − K, 0}].   (36.9)

To implement this empirical model, we normalize the variables as follows:

C_t* ≡ C_t/S_t,   R_{t,T} ≡ S_T/S_t,   K* ≡ K/S_t,   (36.10)

where S_t represents the current ex-dividend S&P 500 index (SPX) level. In order to impose the risk-neutral pricing methodology, we define:

R̂_{t,T,i} = R_{t,T,i} − R̄_{t,T} + e^{r(T−t)},   i = 1, …, N,   (36.11)

where

R̄_{t,T} = (1/N) Σ_{i=1}^N R_{t,T,i}   (36.12)

is the average return on the market portfolio, represented by the S&P 500 index, for the period [t, T]. This transformation shifts the mean of the distribution to the risk-free rate. Thus, (36.9) turns into:

C_t^{RE*} = e^{−r(T−t)} E_t[max{R̂_{t,T} − K*, 0}].   (36.13)

In order to compute (36.13), the distribution of R̂_{t,T} needs to be estimated. We use past realized returns from a 5-year window where each return's holding period length (i.e., T − t) matches the time to expiration of the option.¹⁸ Then the distribution constructed from realized returns is smoothed using a Gaussian kernel, as described in Appendix 36.C. The valuation equation becomes:

E[C_T/S_t] = E[max{R̂_{t,T} − K*, 0}]
 = ∫_{K*}^∞ (R̂_{t,T} − K*) φ(R̂_{t,T}) dR̂_{t,T}
 = ∫_{K*}^∞ x f(x) dx − K* ∫_{K*}^∞ f(x) dx
 = (1/n) Σ_{i=1}^n [ (R̂_{t,T,i} − K*) N( (R̂_{t,T,i} − K*)/h ) + (h/√(2π)) exp( −½ ((R̂_{t,T,i} − K*)/h)² ) ],   (36.14)

where f(x) is a Gaussian kernel density defined by:

f(x) = (1/nh) Σ_{i=1}^n φ( (x − X_i)/h ),   (36.15)

where n is the total number of observations, h is the bandwidth, X_i is the i-th observation in the data series, and φ(·) is the kernel function, chosen to be Gaussian:

φ(z) = (1/√(2π)) e^{−z²/2}.   (36.16)

Plugging (36.15) back into (36.14) results in the call option pricing formula. We calibrate the return distribution to fit the model price generated by (36.13) to the market price of the call by scaling the density defined in (36.15) as follows:¹⁹

R̃_{t,T} = (R̂_{t,T} − r(T − t)) (σ_{t,T}^{RE}/σ_{t,T}) + r(T − t),   (36.17)

where

σ_{t,T} = √[ (1/(N − 1)) Σ_{i=1}^N (R̂_{t,T,i} − r(T − t))² ],   (36.18)

and σ_{t,T}^{RE} is the implied volatility for the Risk-Neutral Empirical model.

36.3.4 The Static Empirical Model

The empirical model described in this subsection assumes no rebalancing, and therefore employs a static asset pricing model for the option (as in Sect. 36.3.2), with a nonparametrically estimated stock return density (as in Sect. 36.3.3).²⁰ Dividing (36.9) by S_t on both sides and using the notation in (36.10), we obtain:

C_t^{SE*} = [1/(1 + r(T − t))] { E_t[C_T*] − (cov[R_{t,T}, C_T*]/var[R_{t,T}]) ( E_t[R_{t,T}] − [1 + r(T − t)] ) }.   (36.19)

We then need to compute cov[R_{t,T}, C_T*] and var[R_{t,T}]. Note that:

cov[R_{t,T}, C_T*] = E[R_{t,T} C_T*] − E[R_{t,T}] E[C_T*],   (36.20)

where

E[R_{t,T} C_T*] = E[ R_{t,T} max{R_{t,T} − K*, 0} ]
 = ∫_{K*}^∞ R (R − K*) φ(R) dR
 = ∫_{K*}^∞ x (x − K*) f(x) dx
 = −K* ∫_{K*}^∞ x f(x) dx + ∫_{K*}^∞ x² f(x) dx.   (36.21)

The variance term is more straightforward:

var[R_{t,T}] = E[R_{t,T}²] − E[R_{t,T}]²,   (36.22)

where

E[R] = (1/n) Σ_{i=1}^n R_i,   E[R²] = (1/n) Σ_{i=1}^n R_i².

Substituting these results back into (36.19) leads to the final valuation equation. Implied volatility for the Static Empirical model, σ_{t,T}^{SE}, is estimated from call prices as in the Risk-Neutral Empirical model, by calibrating the distribution as follows:

R̂*_{t,T} = (R_{t,T} − R̄_{t,T}) (σ_{t,T}^{SE}/σ_{t,T}) + R̄_{t,T},   (36.23)

where R̄_{t,T} is defined in (36.12). Again, this transformation allows us to calibrate the standard deviation of the distribution without changing the mean, skewness, or kurtosis. As in the lognormal model without rebalancing, there is no guarantee that the model can always be solved for the implied volatility; when the option price is too low, the model may not lead to a positive volatility. However, we find that once a nonparametric density is used, all the implied volatilities are positive.

¹⁸ Our results are robust to other choices of window length from 2 to 30 years.
¹⁹ This transformation preserves the mean, skewness, and kurtosis of the distribution while setting the standard deviation to σ_{t,T}^{RE}.
²⁰ This model is similar to the one in Chen and Palmon (2005). Here, we smooth the histogram with a Gaussian kernel to improve estimation reliability. Furthermore, with the Gaussian kernel, the model has a closed form, as opposed to the numerical implementation in Chen and Palmon.
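A compact sketch of the Risk-Neutral Empirical calculation, (36.11)-(36.14), follows; the return history, bandwidth h, and parameter values are hypothetical stand-ins:

```python
import numpy as np
from scipy.stats import norm

def risk_neutral_empirical_call(S, K, tau, r, R_hist, h=0.02):
    # (36.11): shift the mean of historical gross returns to the risk free rate.
    R_hat = R_hist - R_hist.mean() + np.exp(r * tau)
    Ks = K / S                                   # normalized strike, (36.10)
    u = (R_hat - Ks) / h
    # (36.14): kernel-smoothed expected payoff, in return units.
    payoff = np.mean((R_hat - Ks) * norm.cdf(u) + h * norm.pdf(u))
    return S * np.exp(-r * tau) * payoff         # back to dollar units, (36.13)

# Hypothetical 5-year history of holding-period gross returns.
rng = np.random.default_rng(1)
R_hist = np.exp(rng.normal(0.02, 0.1, 1250))
print(risk_neutral_empirical_call(100, 100, 0.25, 0.05, R_hist))
```

The same kernel moments, plugged into (36.20)-(36.22), deliver the Static Empirical price (36.19).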
36.4 Data and Empirical Results

We collect daily closing levels for the S&P 500 index for the period of January 3, 1950 to September 5, 2003, and we use subsets of these returns to form the nonparametric distributions. Table 36.1 provides summary statistics for the returns derived from these prices. For shorter return horizons, the historical return distribution presents skewness and excess kurtosis that are substantially larger than the corresponding values for the normal distribution. Consistent with the literature (e.g., Das and Sundaram 1999), the skewness and excess kurtosis dissipate over longer time horizons.²¹

We use synchronized S&P 500 index call options for the period of June 1988 through December 1991, as in Bakshi et al. (1997).²² Table 36.2 presents the moneyness, defined as (S − K)/K, and maturity distributions of the contracts in our sample. The sample contains 46,540 option price observations across various dates, strike prices, and maturities.

We next estimate the implied volatilities for each of the models presented above. When estimating the underlying nonparametric returns density (models C and D), we use the actual returns on the S&P 500 from a 5-year moving window that ends just prior to the option pricing date.²³

Figure 36.3 presents a smoothed graph of the implied volatility from the Black-Scholes model (IV_BS) on maturity (DTM) and moneyness (MNY).²⁴ As expected, the graph shows a considerable smile (or smirk), where the implied volatilities are particularly high for shorter-maturity in-the-money options. Figure 36.4 presents a similar graph of the implied volatilities from the Static Lognormal model (Sect. 36.3.2 above). Again, a similar smile is clearly visible. Figure 36.5 presents the implied volatility from the Risk-Neutral Empirical model (Sect. 36.3.3 above). Now the surface appears much smoother, and the smile is almost absent. Similarly, Fig. 36.6 presents the graph of implied volatility from the Static Empirical model (Sect. 36.3.4 above), and again the graph appears smoother. Thus, visually, it appears that much of the smile is related to the normality assumption, and using a nonparametric return density appears to remove much of the curvature.

One possible concern is that the volatilities are muted relative to prices for some of our models, thus not providing a fair comparison. We therefore also examine the price impact of volatility for each of the four models and find them to be nearly identical. Thus the relationships we find when examining volatility are due to the impact of moneyness and not an ancillary artifact of the models.

In order to quantitatively measure the magnitude and statistical significance of the smile under the various models, we regress implied volatility from each model on moneyness, moneyness squared, moneyness cubed, and moneyness raised to the fourth power.

²¹ By excess kurtosis we mean the difference between the fourth centralized moment and the kurtosis for a normal distribution, which equals three.
²² We thank Charles Cao for kindly supplying us this data, which was originally produced by the CBOE.
²³ Recall that the choice of time horizon does not qualitatively affect the empirical results.
²⁴ The smoothing algorithm used is the spline method provided in SAS.

Table 36.1 Distribution statistics of S&P 500 returns (annualized) for various holding horizons (in business days) from 1/3/1950 to 9/5/2003

            1-day    7-day   10-day   21-day   63-day   126-day   252-day
Mean        0.087    0.087   0.087    0.087    0.087    0.087     0.089
Std. dev    0.143    0.148   0.146    0.146    0.145    0.152     0.160
Skew        0.922    0.418   0.486    0.439    0.302    0.117     0.090
Kurt        24.867   5.399   4.876    2.473    1.447    0.228     0.310
Table 36.2 Moneyness and maturity of the S&P 500 call options

Maturity:        <1 month   1–3 months   3–6 months   >6 months   Total
>20% OTM         0          0            2            49          51
10–20% OTM       0          128          955          1475        2558
5–10% OTM        182        1810         2692         2027        6711
<5% OTM          2108       4037         3058         2122        11325
<5% ITM          2270       3515         2583         1652        10020
5–10% ITM        1662       2586         1856         1152        7256
10–20% ITM       1049       1898         1751         1515        6213
>20% ITM         360        628          658          760         2405
Total            7631       14602        13555        10752       46540

The number of observations of CBOE S&P 500 in-the-money (ITM) and out-of-the-money (OTM) call options. The trading day for all option contracts is not earlier than June 1988 and not later than December 1991. The moneyness is defined as (S − K)/K.

Fig. 36.3 Black-Scholes. Note: This is the implied volatility plot for the Black-Scholes model. MNY is moneyness and DTM is days to maturity

Fig. 36.4 Static lognormal. Note: This is the implied volatility plot for the model with a lognormal density but with no rebalancing (i.e., the Static Lognormal model). MNY is moneyness and DTM is days to maturity

Fig. 36.5 Risk-neutral empirical. Note: This is the implied volatility plot for the kernel-smoothed empirical (nonparametric) density with a risk-free discount rate (i.e., the Risk-Neutral Empirical model). MNY is moneyness and DTM is days to maturity

Fig. 36.6 Static empirical. Note: This is the implied volatility plot for the kernel-smoothed (nonparametric) empirical density with no rebalancing (i.e., the Static Empirical model). MNY is moneyness and DTM is days to maturity
In order to control for varying volatility across days and maturities, we include a separate dummy variable for every pricing day as well as for every maturity. Thus our specification is:

V_{s,m,t} = Σ_{t=1}^T α_t D_t + Σ_{m=1}^M γ_m D_m + β₁ MNY + β₂ MNY² + β₃ MNY³ + β₄ MNY⁴ + ε_{s,m,t},   (36.24)

where V_{s,m,t} is the implied volatility for a given strike, maturity, and date, D_t are the 901 dummy variables corresponding to each pricing date, D_m are the 224 dummy variables corresponding to each maturity, MNY is the moneyness, and ε_{s,m,t} is the error term.

Table 36.3 presents summary statistics for the implied volatility from each model and for moneyness. The average implied volatilities are similar for the various models considered. As discussed above, observations where the Static Lognormal model implies a zero or negative volatility (approximately 8% of the cases) are excluded from the sample used in the Static Lognormal case.

Table 36.3 Summary statistics

                                    Mean    Standard deviation
Black-Scholes volatility            0.203   0.067
Static Lognormal volatility         0.181   0.060
Risk-neutral empirical volatility   0.192   0.041
Static empirical volatility         0.201   0.040
Moneyness                           0.026   0.094

Means and standard deviations for the volatilities implied by the four models, and also for the moneyness, defined as (S − K)/K. The observations in the Static Lognormal model exclude zero volatility values.
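To connect the specification to code, here is a minimal sketch of (36.24) using statsmodels; the column names and the simulated inputs are hypothetical placeholders for the option panel:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical panel: implied vol, moneyness, pricing date, and maturity.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "iv": rng.random(500) * 0.1 + 0.15,
    "mny": rng.normal(0, 0.09, 500),
    "date": rng.choice(pd.date_range("1988-06-01", periods=50), 500),
    "maturity": rng.choice([30, 60, 90, 180], 500),
})

# (36.24): date and maturity fixed effects plus a quartic in moneyness.
model = smf.ols(
    "iv ~ mny + I(mny**2) + I(mny**3) + I(mny**4) + C(date) + C(maturity) - 1",
    data=df,
).fit()
print(model.params.filter(like="mny"))   # the beta_1..beta_4 estimates
```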
Table 36.4 Smile regressions

                         Black-Scholes        Static lognormal     Risk-neutral empirical   Static empirical
Moneyness                0.421** (108.657)    0.264** (70.865)     −0.015** (−6.182)        0.002 (0.955)
Moneyness²               1.313** (50.971)     0.994** (39.040)     −0.104** (−6.551)        0.045** (2.801)
Moneyness³               −4.563** (−24.983)   −1.196** (−6.695)    0.594** (5.261)          −0.318** (−2.807)
Moneyness⁴               4.137** (9.415)      1.937** (4.381)      1.075** (3.960)          0.486* (1.785)
Adjusted R²              0.677                0.670                0.677                    0.656
Observations             46,540               42,768               46,540                   46,540

Fixed effect regressions of implied volatility on moneyness, defined as (S − K)/K, and moneyness², moneyness³, and moneyness⁴. All regressions include dummy variables (901 fixed effects) for the pricing date and dummy variables for maturity (224 dummy variables). We exclude volatility values in the Static Lognormal model if there is no positive solution. The implied volatilities for the four models are obtained by equating Equation (36.2) (Black-Scholes), Equation (36.5) (Static Lognormal), Equation (36.13) (Risk-Neutral Empirical), and Equation (36.19) (Static Empirical), respectively, to the market price of the option. t-ratios are in parentheses; * denotes significance at the 10% level, while ** denotes significance at the 1% level.
Black-Scholes
Static lognormal
Risk-neutral empirical
Static empirical
1.359 (112.371) 10.031 (76.556) 15:045 .17:669/ 8:042 .4:680/ 0:203 .79:044/ 1:974 .78:450/ 2.453 (15.292) 2.616 (7.727) 0.901 46,540
1.113 (82.971) 8.435 (58.583) 3.072 (3.204) 42:476 .20:914/ 0:182 .63:506/ 1:746 .62:367/ 0:815 .4:511/ 9.666 (23.728) 0.864 42,768
0.022 (1.667) 1.565 (10.948) 3.189 (3.432) 13:837 .7:380/ 0:008 .2:930/ 0:377 .13:720/ 0:517 .2:954/ 3.207 (8.681) 0.691 46,540
0.013 (0.977) 0.028 (0.190) 0:571 .0:599/ 1.841 (0.958) 0:002 .0:805/ 0.003 (0.116) 0.057 (0.315) 0:301 .0:795/ 0.656 46,540
Fixed effect regressions of implied volatility on moneyness, defined as .S K/=K, and moneyness2 , moneyness3 , and moneyness4 , and the interaction of moneyness with log(maturity). All regressions include dummy variables (901 fixed effects) for the pricing date and dummy variables for maturity (224 dummy variables). We exclude volatility values in the Static Lognormal model if there is no positive solution. The implied volatilities for the four models are obtained by equating Equation (36.2) (Black-Scholes), Equation (36.5) (Static Lognormal), Equation (36.13) (Risk-Neutral Empirical), and Equation (36.19) (Static Empirical), respectively, to the market price of the option. denotes significance at the 10% level, while denotes significance at the 1% level
Table 36.4 presents the results from regressions of implied volatility for each model on our dummy variables and linear and nonlinear terms in moneyness. The curvature visible in the Black-Scholes case is striking. All the coefficients are highly significant, and the coefficients on the squared and fourth powers of moneyness are positive, suggesting a substantial smile pattern. Removing the assumption of continuous trading brings us to the Static Lognormal model in the next column. Here the coefficients are 24–74% smaller, suggesting a shallower smile; however, the overall pattern is similar. A much more dramatic change is evident when we compare these coefficients to the implied volatilities from the Risk-Neutral Empirical model. For example, the magnitude of the coefficient on MNY² is only 8% of that in the Black-Scholes case. Moreover, while the coefficient on MNY⁴ is still positive, the coefficient on MNY² is now negative, suggesting a much less noticeable smile. As suggested by Figs. 36.3 through 36.6, most of the curvature is eliminated with the nonparametric density. The last column in Table 36.4 gives the results for the Static Empirical model. Here the coefficients are much closer to zero, the coefficient on MNY is no longer statistically different from zero (and is only 1/200th the size of the coefficient in the Black-Scholes case), and the coefficient on MNY⁴ is only different from zero at the 10% level. Clearly, each of the restrictive assumptions, continuous trading and normality, appears to be responsible for part of the volatility smile, but eliminating the assumption of normality appears to remove most of the smile.

Figures 36.3 through 36.6 indicate that the smile is present primarily for shorter maturities. Thus, we introduce an alternative specification that includes interaction variables between log(maturity) and the four moneyness variables. The estimates of this specification are reported in Table 36.5. Again, the smile is evident in the Black-Scholes model, where the coefficients on moneyness as well as on the interactions of moneyness and maturity appear highly significant. The coefficients on the second power of moneyness are somewhat smaller under the Static Lognormal model, and considerably smaller under the Risk-Neutral Empirical model. Strikingly, using the Static Empirical framework appears to eliminate the significance of all the coefficients on moneyness as well as its interaction with maturity.

As a further robustness check, we split our sample chronologically and run similar regressions on the two subsamples.²⁵ Again, we find highly significant coefficients for the moneyness variables in the Black-Scholes model, and these decline to insignificance in the Static Empirical model. Further, although highly significant, the coefficients on the moneyness variables are not stable in the Black-Scholes model, while the near-zero coefficients in the Static Empirical model are stable.

²⁵ This robustness test is similar in spirit to the test in Dumas et al. (1998) in which two subsamples are used to test out-of-sample performance. The detailed regressions are included as Tables 36A.1 and 36A.2 in the appendix.
36.5 Conclusion

The literature has separately considered two unrealistic assumptions as possible causes of the smile: continuous rebalancing and a lognormal distribution. However, neither adding transaction costs (and thereby reducing rebalancing frequency) nor moving to a stochastic volatility or a jump-diffusion model has been able to completely account for the smile bias. We examine models that replace the unrealistic assumptions individually and jointly. We consider two alternative distributions: the lognormal and a return distribution nonparametrically estimated from the actual density of past returns. We also consider two alternative assumptions regarding rebalancing: continuous rebalancing and no rebalancing. Thus, we consider four possible models: the Black-Scholes with continuous rebalancing and a lognormal returns distribution, the Static Lognormal with no rebalancing and a lognormal returns distribution, the Risk-Neutral Empirical with continuous rebalancing and an empirically estimated returns distribution, and the Static Empirical with no rebalancing and an empirically estimated returns distribution. We find that the rebalancing assumption accounts for part of the smile. A larger fraction of the smile is accounted for by the density assumption. However, the smile virtually disappears only when we eliminate both unrealistic assumptions. One further implication of these results is that models like the Static Empirical may significantly modify estimates of the implied pricing kernel. We leave this extension as a topic for further research.

25 This robustness test is similar in spirit to the test in Dumas et al. (1998), in which two subsamples are used to test out-of-sample performance. The detailed regressions are included as Tables 36A.1 and 36A.2 in the appendix.

Acknowledgments We would like to thank Shashi Murthy for his insightful comments.
References

Aït-Sahalia, Y. 1996. "Nonparametric pricing of interest rate derivative securities." Econometrica 64(3), 527–560.
Aït-Sahalia, Y. and A. W. Lo. 1998. "Nonparametric estimation of state-price densities implicit in financial asset prices." The Journal of Finance 53(2), 499–547.
Amin, K. I. and V. K. Ng. 1997. "Inferring future volatility from the information in implied volatility in eurodollar options: a new approach." Review of Financial Studies 10(2), 333–367.
Bakshi, G. S., C. Cao, and Z. Chen. 1997. "Empirical performance of alternative option pricing models." Journal of Finance 52, 2003–2049.
Bates, D. 1991. "The crash of '87: was it expected? The evidence from options markets." Journal of Finance 46, 1009–1044.
Bates, D. 1996. "Jumps and stochastic volatility: exchange rate processes implicit in deutsche mark options." Review of Financial Studies 9, 69–107.
Black, F. and M. Scholes. 1972. "The valuation of option contracts and a test of market efficiency." Journal of Finance 27, 399–417.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–659.
Bondarenko, O. 2003. "Why are put options so expensive?" CalTech working paper.
Brennan, M. 1979. "The pricing of contingent claims in discrete time models." Journal of Finance 24(1), 53–68.
Campa, J. and K. H. Chang. 1995. "Testing the expectation hypothesis on the term structure of volatilities." Journal of Finance 50, 529–547.
Canina, L. and S. Figlewski. 1993. "The informational content of implied volatility." Review of Financial Studies 6(3), 659–681.
Chen, R.-R. and O. Palmon. 2005. "A non-parametric option pricing model: theory and empirical evidence." Review of Quantitative Finance and Accounting 24(2), 115–134.
Cochrane, J. H. and J. Saá-Requejo. 2000. "Beyond arbitrage: good-deal asset price bounds in incomplete markets." Journal of Political Economy 108, 79–119.
Cox, J. and S. Ross. 1976. "The pricing of options for the jump processes." Journal of Financial Economics.
Das, S. R. and R. K. Sundaram. 1999. "Of smiles and smirks: a term structure perspective." Journal of Financial and Quantitative Analysis 34, 211–239.
Day, T. E. and C. M. Lewis. 1992. "Stock market volatility and the information content of stock index options." Journal of Econometrics 52, 267–287.
Dennis, P. and S. Mayhew. 2002. "Risk-neutral skewness: evidence from stock options." Journal of Financial and Quantitative Analysis 37(3), 471–493.
Duan, J. C. 1995. "The GARCH option pricing model." Mathematical Finance 5(1), 13–32.
Duan, J. C. 1996. "Cracking the smile." Risk 9, 55–56, 58, 60.
Dumas, B., J. Fleming, and R. Whaley. 1998. "Implied volatility smiles: empirical tests." Journal of Finance 53, 2059–2106.
Eberlein, E., U. Keller, and K. Prause. 1998. "New insights into smile, mispricing, and value at risk: the hyperbolic model." Journal of Business 71, 371–405.
Emanuel, D. and J. D. MacBeth. 1982. "Further results on constant elasticity of variance call option models." Journal of Financial and Quantitative Analysis 17, 533–554.
Geske, R. and R. Roll. 1984. "On valuing American call options with the Black-Scholes European formula." Journal of Finance 39, 443–455.
Geske, R., R. Roll, and K. Shastri. 1983. "Over-the-counter option market dividend protection and 'biases' in the Black-Scholes model: a note." Journal of Finance 38, 1271–1277.
Härdle, W. 1990. Applied nonparametric regression, Cambridge University Press, Cambridge.
Heston, S. L. 1993. "A closed-form solution for options with stochastic volatility with applications to bond and currency options." Review of Financial Studies 6, 327–343.
Heynen, R., A. G. Z. Kemna, and T. Vorst. 1994. "Analysis of the term structure of implied volatilities." Journal of Financial and Quantitative Analysis 29, 31–56.
Hull, J. and A. White. 1987. "The pricing of options on assets with stochastic volatilities." Journal of Finance 42, 281–300.
Jackwerth, J. C. and M. Rubinstein. 1996. "Recovering probability distributions from option prices." Journal of Finance 51, 1611–1631.
Jorion, P. 1995. "Predicting volatility in the foreign exchange market." Journal of Finance 50(2), 507–528.
Leland, H. E. 1985. "Option pricing and replication with transactions costs." Journal of Finance 40(5), 1283–1301.
Longstaff, F. 1995. "Option pricing and the martingale restriction." Review of Financial Studies 8, 1091–1124.
MacBeth, J. D. and L. J. Merville. 1979. "An empirical examination of the Black-Scholes call option pricing model." Journal of Finance 34, 1173–1186.
Rosenberg, J. V. and R. F. Engle. 2002. "Empirical pricing kernels." Journal of Financial Economics 64(3), 341–372.
Rubinstein, M. 1985. "Nonparametric tests of alternative option pricing models using all reported trades and quotes on the 30 most active CBOE option classes from August 23, 1976 through August 31, 1978." Journal of Finance 40, 455–480.
Rubinstein, M. 1994. "Implied binomial trees." Journal of Finance 49, 771–818.
Scott, L. O. 1987. "Option pricing when the variance changes randomly: theory, estimation, and an application." Journal of Financial and Quantitative Analysis 22, 419–438.
Shimko, D. 1993. "Bounds of probability." Risk 6, 33–37.
Silverman, B. W. 1986. Density estimation, Chapman and Hall, London.
Stutzer, M. 1996. "A simple nonparametric approach to derivative security valuation." Journal of Finance 51(5), 1633–1652.
Whaley, R. E. 1982. "Valuation of American call options on dividend-paying stocks: empirical tests." Journal of Financial Economics 10, 29–58.
Whaley, R. E. 1986. "Valuation of American futures options: theory and empirical tests." Journal of Finance 41, 127–150.
Xu, X. and S. J. Taylor. 1994. "The term structure of volatility implied by foreign exchange options." Journal of Financial and Quantitative Analysis 29, 57–74.
Appendix 36A

36A.1 The Derivation of the Lognormal Model Under No Rebalancing

In order to derive the closed form for the lognormal model under no rebalancing, each component in the calculation of beta is shown below. The variance term is:

$$\operatorname{var}\left(\frac{S_T}{S_t}\right) = \frac{1}{S_t^2}\operatorname{var}[S_T] = \frac{1}{S_t^2}\left(E[S_T^2]-(E[S_T])^2\right) = \frac{1}{S_t^2}\left(S_t^2 e^{(2\mu+\sigma^2)(T-t)}-S_t^2 e^{2\mu(T-t)}\right) = e^{2\mu(T-t)}\left(e^{\sigma^2(T-t)}-1\right).$$

The covariance term is:

$$\operatorname{cov}\left(\frac{S_T}{S_t},\frac{C_T}{C_t}\right) = \frac{1}{S_t C_t}\left(E[S_T C_T]-E[S_T]\,E[C_T]\right),$$

in which the expectation of the cross-product term can be derived as follows:

$$E[S_T C_T] = \int_0^{\infty} S_T \max\{S_T-K,0\}\,\phi(S_T)\,dS_T = \int_K^{\infty} S_T^2\,\phi(S_T)\,dS_T - K\int_K^{\infty} S_T\,\phi(S_T)\,dS_T = S_t^2 e^{(2\mu+\sigma^2)(T-t)} N(h_3) - K S_t e^{\mu(T-t)} N(h_1),$$

where

$$h_1 = \frac{\ln S-\ln K+\left(\mu+\frac{\sigma^2}{2}\right)(T-t)}{\sigma\sqrt{T-t}}, \qquad h_3 = \frac{\ln S-\ln K+\left(\mu+\frac{3\sigma^2}{2}\right)(T-t)}{\sigma\sqrt{T-t}},$$

and the expectation of each term is:

$$E[S_T] = S_t e^{\mu(T-t)},$$

$$E[C_T] = \int_0^{\infty}\max\{S_T-K,0\}\,\phi(S_T)\,dS_T = S_t e^{\mu(T-t)} N(h_1) - K N(h_2),$$

in which

$$h_2 = \frac{\ln S-\ln K+\left(\mu-\frac{\sigma^2}{2}\right)(T-t)}{\sigma\sqrt{T-t}}.$$

36A.2 Continuous Rebalancing

In the following, we prove that when the number of trading/hedging times approaches infinity, the pricing model of Equation (36.5) approaches the Black-Scholes model (Equation (36.2)). Note that when the number of trading/hedging times approaches infinity, or the trading horizon approaches dt, the Brownian motion becomes a random walk (i.e., a one-period binomial model). Hence, we only need to show that the equilibrium model of Equation (36.5) is identical to Equation (36.2) in a short time interval dt. To do that, we rewrite Equations (36.7) and (36.8) as follows:

$$E[S_1 C_1] = p\,S_{11}C_{11} + (1-p)\,S_{10}C_{10}$$
$$E[S_1] = p\,S_{11} + (1-p)\,S_{10}$$
$$E[C_1] = p\,C_{11} + (1-p)\,C_{10} \qquad (36.25)$$

Hence,

$$\operatorname{var}[S_1] = p(1-p)(S_{11}-S_{10})^2$$
$$\operatorname{cov}[S_1,C_1] = p(1-p)(S_{11}-S_{10})(C_{11}-C_{10}) \qquad (36.26)$$

Define the expected gross stock return as μ ≡ E[S₁]/S₀ and let R denote the gross risk-free return over the period. Plugging these results back into Equation (36.5) yields:

$$R\,C_0 = E[C_1] - \frac{C_{11}-C_{10}}{S_{11}-S_{10}}\left(E[S_1]-R\,S_0\right) = p\,C_{11}+(1-p)\,C_{10} - \frac{C_{11}-C_{10}}{u-d}\left(p(u-d)+d-R\right) = \frac{R-d}{u-d}\,C_{11} + \frac{u-R}{u-d}\,C_{10}, \qquad (36.27)$$

which is the risk-neutral result and the proof is complete. Note that the subjective probability p disappears from the equation.

36A.3 Smoothing Techniques

We use standard nonparametric kernel smoothing procedures to estimate the density of returns (see Härdle 1990; Silverman 1986). Thus, the density function estimated is:

$$\hat f(x_j) = \frac{1}{nh}\sum_{i=1}^{n} w_i\,\phi\!\left(\frac{x_j-z_i}{h}\right), \qquad j = 1,\ldots,T,$$

where z is a vector of length n of returns, h is the bandwidth, w_i is a vector of weights, set equal to one in our case, and x is a vector of length T over which the kernel is estimated. We use a standard Gaussian kernel, and the results are similar with an Epanechnikov kernel. We use the bandwidth calculated as in Silverman (1986), Equation (36.31):

$$h = \frac{0.9\,\min\left(s,\ \mathrm{IQR}/1.34\right)}{n^{1/5}},$$

where s is the sample standard deviation and IQR is the interquartile range of the data points.
36A.4 Results of Sub-Sample Testing
Table 36A.1 Smile regressions with an interaction of moneyness and maturity for the first half of the sample
                             Black-Scholes       Static lognormal     Risk-neutral empirical   Static empirical
Moneyness                    1.336 (83.741)      1.114 (64.859)       0.137 (7.919)            0.012 (0.645)
Moneyness²                   10.184 (58.425)     9.320 (49.428)       -0.980 (-5.199)          0.079 (0.378)
Moneyness³                   -18.811 (-17.359)   -6.697 (-5.655)      9.102 (7.768)            -0.626 (-0.484)
Moneyness⁴                   2.551 (1.233)       -22.021 (-9.307)     -13.275 (-5.932)         1.404 (0.568)
Log(Maturity) × Moneyness    -0.207 (-61.201)    -0.175 (-48.089)     -0.045 (-12.408)         -0.002 (-0.592)
Log(Maturity) × Moneyness²   -1.947 (-59.135)    -1.874 (-51.976)     0.141 (3.958)            -0.015 (-0.390)
Log(Maturity) × Moneyness³   3.151 (15.698)      0.793 (3.603)        -1.374 (-6.329)          0.103 (0.431)
Log(Maturity) × Moneyness⁴   0.374 (0.936)       6.022 (12.877)       2.387 (5.514)            -0.239 (-0.500)
Adjusted R²                  0.902               0.858                0.620                    0.550
Number of observations       23,311              22,424               23,311                   23,311
Fixed-effect regressions of implied volatility on moneyness, defined as (S − K)/K, and moneyness², moneyness³, and moneyness⁴, and the interaction of moneyness with log(maturity). All regressions include dummy variables (495 fixed effects) for the pricing date and dummy variables for maturity (224 dummy variables). We exclude volatility values in the Static Lognormal model if there is no positive solution. The implied volatilities for the four models are obtained by equating Equation (36.2) (Black-Scholes), Equation (36.5) (Static Lognormal), Equation (36.13) (Risk-Neutral Empirical), and Equation (36.19) (Static Empirical), respectively, to the market price of the option. * denotes significance at the 10% level, ** denotes significance at the 5% level, and *** denotes significance at the 1% level
Table 36A.2 Smile regressions with an interaction of moneyness and maturity for the second half of the sample
                             Black-Scholes       Static lognormal     Risk-neutral empirical   Static empirical
Moneyness                    1.262 (65.000)      1.049 (47.530)       -0.129 (-6.169)          0.023 (1.125)
Moneyness²                   11.000 (54.256)     7.247 (31.578)       5.494 (25.245)           -0.281 (-1.293)
Moneyness³                   -3.420 (-2.068)     29.819 (15.420)      -5.394 (-3.038)          -0.460 (-0.260)
Moneyness⁴                   -59.617 (-13.906)   -114.082 (-20.490)   -24.747 (-5.378)         5.255 (1.145)
Log(Maturity) × Moneyness    -0.176 (-42.636)    -0.183 (-39.133)     0.038 (8.640)            -0.004 (-0.863)
Log(Maturity) × Moneyness²   -2.247 (-55.389)    -1.519 (-31.744)     -1.216 (-27.922)         0.080 (1.832)
Log(Maturity) × Moneyness³   0.234 (0.737)       -5.347 (-14.506)     0.743 (2.177)            -0.015 (-0.045)
Log(Maturity) × Moneyness⁴   13.182 (15.399)     20.512 (18.219)      7.139 (7.769)            -1.007 (-1.099)
Adjusted R²                  0.904               0.879                0.746                    0.724
Number of observations       23,229              20,344               23,229                   23,229
Fixed-effect regressions of implied volatility on moneyness, defined as (S − K)/K, and moneyness², moneyness³, and moneyness⁴, and the interaction of moneyness with log(maturity). All regressions include dummy variables (406 fixed effects) for the pricing date and dummy variables for maturity (224 dummy variables). We exclude volatility values in the Static Lognormal model if there is no positive solution. The implied volatilities for the four models are obtained by equating Equation (36.2) (Black-Scholes), Equation (36.5) (Static Lognormal), Equation (36.13) (Risk-Neutral Empirical), and Equation (36.19) (Static Empirical), respectively, to the market price of the option. * denotes significance at the 10% level, while *** denotes significance at the 1% level
Chapter 37
Option Pricing and Hedging Performance Under Stochastic Volatility and Stochastic Interest Rates Gurdip Bakshi, Charles Cao, and Zhiwu Chen
Abstract Recent studies have extended the Black–Scholes model to incorporate either stochastic interest rates or stochastic volatility. But, there is not yet any comprehensive empirical study demonstrating whether and by how much each generalized feature will improve option pricing and hedging performance. This paper fills this gap by first developing an implementable option model in closed-form that admits both stochastic volatility and stochastic interest rates and that is parsimonious in the number of parameters. The model includes many known ones as special cases. Based on the model, both delta-neutral and single-instrument minimum-variance hedging strategies are derived analytically. Using S&P 500 option prices, we then compare the pricing and hedging performance of this model with that of three existing ones that respectively allow for (i) constant volatility and constant interest rates (the Black–Scholes), (ii) constant volatility but stochastic interest rates, and (iii) stochastic volatility but constant interest rates. Overall, incorporating stochastic volatility and stochastic interest rates produces the best performance in pricing and hedging, with the remaining pricing and hedging errors no longer systematically related to contract features. The second performer in the horse-race is the stochastic volatility model, followed by the stochastic interest rates model and then by the Black–Scholes.

Keywords Stock option pricing · Hedge ratios · Hedging · Pricing performance · Hedging performance
37.1 Introduction

Option pricing has, in the last two decades, witnessed an explosion of new models that each relax some of the restrictive assumptions underlying the seminal Black–Scholes (1973) model. In doing so, most of the focus has been on the counterfactual constant-volatility and constant-interest-rates assumptions. For example, Merton's (1973) option pricing model allows interest rates to be stochastic but keeps a constant volatility for the underlying asset, while Amin and Jarrow (1992) develop a similar model where, unlike in Merton's, interest rate risk is also priced. A second class of option models admits stochastic conditional volatility for the underlying asset, but maintains the constant-interest-rates assumption. These include the Cox and Ross (1976) constant-elasticity-of-variance model and the stochastic volatility models of Bailey and Stulz (1989), Bates (1996b, 2000), Heston (1993), Hull and White (1987a), Scott (1987), Stein and Stein (1991), and Wiggins (1987). Recently, Bakshi and Chen (1997) and Scott (1997) have developed closed-form equity option formulas that admit both stochastic volatility and stochastic interest rates.1 Their efforts have, in some sense, helped reach the ultimate possibility of completely relaxing the Black–Scholes assumptions of constant volatility and constant interest rates. As a practical matter, these sufficiently general pricing formulas should in principle result in significant improvement in pricing and hedging performance over the Black–Scholes model. While option pricing theory has made such impressive progress, the empirical front is nonetheless far behind.2 Will incorporating these general
G. Bakshi, Department of Finance, College of Business, University of Maryland, MD, USA
C. Cao (corresponding author), Department of Finance, Smeal College of Business, Pennsylvania State University, University Park, PA 16802, USA, e-mail: [email protected]
Z. Chen, School of Management, Yale University, Prospect St., New Haven, CT, USA
1 Amin and Ng (1993), Bailey and Stulz (1989), and Heston (1993) also incorporate both stochastic volatility and stochastic interest rates, but their option pricing formulas are not given in closed form, which makes applications difficult. Consequently, comparative statics and hedge ratios are difficult to obtain in their cases.
2 There have been a few empirical studies that investigate the pricing, but not the hedging, performance of versions of the stochastic volatility model, relative to the Black–Scholes model. These include Bates (1996b, 2000), Dumas et al. (1998), Madan et al. (1998), Nandi (1996), and Rubinstein (1985). In Bates' work, currency and equity index options data are respectively used to test a stochastic volatility model with Poisson jumps included. Nandi does investigate the pricing and hedging performance of Heston's stochastic volatility model, but he focuses exclusively on a single-instrument minimum-variance hedge that involves only the S&P 500 futures. As will be clear shortly, we address in this paper both the pricing and the hedging effectiveness issues from different perspectives and for four distinct classes of option models.
features improve both pricing and hedging effectiveness? If so, by how much? Can these relaxed assumptions help resolve the well known empirical biases associated with the Black–Scholes formula, such as the volatility smiles [e.g., Rubinstein (1985, 1994)]? – These empirical questions must be answered before the potential of the general models can be fully realized in practical applications. In this paper, we first develop a practically implementable version of the general equity option pricing models in Bakshi and Chen (1997) and Scott (1997), that admits stochastic interest rates and stochastic volatility, yet resembles to the extent possible the Black–Scholes model in its implementability. We present procedures for applying the resulting model to price and hedge option-like derivative products. Next, we conduct a complete analysis of the relative empirical performance, in both pricing and hedging, of the four classes of models that respectively allow for (i) constant volatility and constant interest rates (the BS model), (ii) constant volatility but stochastic interest rates (the SI model), (iii) stochastic volatility but constant interest rates (the SV model), and (iv) stochastic volatility and stochastic interest rates (the SVSI model). As the SVSI model has all the other three models nested, one should expect its static pricing and dynamic hedging performance to surpass that of the other classes. But, this performance improvement must come at the cost of potentially more complex implementation steps. In this sense, conducting such a horse-race study can at least offer a clear picture of possible tradeoffs between costs and benefits that each model may present. Specifically, the SVSI option pricing formula is expressed in terms of the underlying stock price, the stock’s volatility and the short-term interest rate. The spot volatility and the short interest rate are each assumed to follow a Markov mean-reverting square-root process. Consequently, seven structural parameters need to be estimated as input to the model. These parameters can be estimated using the Generalized Method of Moments (GMM) of Hansen (1982), as is done in, for instance, Andersen and Lund (1997), Chan et al. (1992), and Day and Lewis (1997). Or, they can be backed out from the pricing model itself by using observed option prices, as is similarly done for the BS model both in the existing literature and in Wall Street practice. In our empirical investigation, we will adopt this implied parameter approach to implement the four models. In this regard, it is important to realize that the BS model is implemented as if the spot volatility and the spot interest rates were assumed to be time-varying within the model, that is, the spot
performance of Heston’s stochastic volatility model, but he focuses exclusively on a single-instrument minimum-variance hedge that involves only the S&P 500 futures. As will be clear shortly, we address in this paper both the pricing and the hedging effectiveness issues from different perspectives and for four distinct classes of option models.
volatility is backed out from option prices each day and used, together with the current yield curve, to price the following day's options. The SI and the SV models are implemented with a similarly internally inconsistent treatment, though to a lesser degree. Since this implementation is how one would expect each model to be applied, we chose to follow this convention in order to give the alternatives to the standard BS model the "toughest hurdle." Clearly, such a treatment works in the strongest favor of the BS model and is especially biased against the SVSI model. Based on 38,749 S&P 500 call (and put) option prices for the sample period from June 1988 to May 1991, our empirical investigation leads to the following conclusions. First, on the basis of two out-of-sample pricing error measures, the SVSI model is found to perform slightly better than the SV model, while they both perform substantially better than the SI (the third-place performer) and the BS model. That is, when volatility is kept constant, allowing interest rates to vary stochastically can produce respectable pricing improvement over the BS model. However, in the presence of stochastic volatility, doing so no longer seems to improve pricing performance much further. Thus, modeling stochastic volatility is far more important than stochastic interest rates, at least for the purpose of pricing options. It is nonetheless encouraging to know that based on our sample both the SVSI and the SV models typically reduce the BS model's pricing errors by more than half, whereas the SI model helps reduce the BS pricing errors by 20% or more. While all four models inherit moneyness- and maturity-related pricing biases, the severity of these types of bias is increasingly reduced by the SI, the SV, and the SVSI models. In other words, the SVSI model produces pricing errors that are the least moneyness- or maturity-related. This conclusion is also confirmed when the Rubinstein (1985) implied-volatility-smile diagnostic is adopted to examine each model. Two types of hedging strategy are employed in this study to gauge the relative hedging effectiveness. The first type is the conventional delta-neutral hedge, in which as many distinct hedging instruments as the number of risk sources affecting the hedging target's value are used so as to make the net position completely risk-immunized (locally). Take the SVSI model as an example. The call option value is driven by three risk sources: the underlying price shocks, volatility shocks, and shocks to interest rates. Accordingly, we employ the underlying stock, a different call option, and a position in a discount bond to create a delta-neutral hedge for a target call option. That closed-form expressions are derived for each hedge ratio is of great value for devising hedging strategies analytically. Similarly, for the SV model, we only need to rely on the underlying stock and an option contract to design a delta-neutral hedge. Based on the delta-neutral hedging errors, the same performance ranking of the four models obtains as that determined by their static pricing
performance, except that now the SVSI and the SV models, and the SI and the BS models, are respectively pairwise virtually indistinguishable. This reinforces the view that adding stochastic interest rates may not affect performance much. However, it is found that the average hedging errors by the SVSI and the SV models are typically less than one third of the corresponding BS model's hedging errors. Furthermore, reducing the frequency of hedge rebalancing does not tend to reduce the SV and the SVSI models' hedging effectiveness, whereas the BS and the SI models' hedging errors are often doubled when rebalancing frequency changes from daily to once every 5 days. Therefore, after stochastic volatility is controlled for, the frequency of hedge rebalancing will have relatively little impact on hedging effectiveness. This finding is in accord with Galai's (1983a) results that in any hedging scheme it is probably more important to control for stochastic volatility than for discrete hedging [see Hull and White (1987b) for a similar, simulation-based result for currency options]. To see how the models perform under different hedging schemes, we also look at minimum-variance hedges involving only a position in the underlying asset. As argued by Ross (1995), the need for this type of hedge may arise in contexts where a perfect delta-neutral hedge may not be feasible, either because some of the underlying risks are not traded or even reflected in any traded financial instruments, or because model misspecifications and transaction costs render it undesirable to use as many instruments to create a perfect hedge. In the present context, both volatility risk and interest rate risk are, of course, traded and hence can, as indicated above, be controlled for by employing an option and a bond. But, a point can be made that it is sometimes preferable to adopt a single-instrument minimum-variance hedge. To study this type of hedge, we again calculate the absolute and the dollar-value hedging errors for each model. Results from this exercise indicate that the SV model performs the best among all four, while the BS and the SV models outperform their respective stochastic-interest-rates counterparts, the SI and the SVSI models. Therefore, under the single-instrument hedges, incorporating stochastic interest rates actually worsens hedging performance. It is also true that hedging errors under this type of hedge are always significantly higher than those under the conventional delta-neutral hedges, for each given moneyness and maturity option category. Thus, whenever possible, including more instruments in a hedge will in general produce better hedging effectiveness. While our discussion is mainly focused on results obtained using the entire sample period and under specific model implementation designs, robustness of these empirical results is also checked by examining alternative implementation designs, different subperiods as well as option transaction price data. Especially, given the popularity of the "implied-volatility matrix" method among practitioners, we
will also implement each of the four models, and compare their pricing and hedging performance, by using only option contracts from a given moneyness-maturity category. It turns out that this alternative implementation scheme does not change the rankings of the four models. The rest of the paper proceeds as follows. Section 37.2 develops the SVSI option pricing formula. It discusses issues pertaining to the implementation of the formula and derives the hedge ratios analytically. Section 37.3 provides a description of the S&P 500 option data. In Sect. 37.4 we evaluate the static pricing and the dynamic hedging performance of the four models. Concluding remarks are offered in Sect. 37.5.
37.2 The Option Pricing Model

Consider an economy in which the instantaneous interest rate at time t, denoted R(t), follows a Markov diffusion process:

$$dR(t) = [\bar\theta_R - \bar\kappa_R R(t)]\,dt + \sigma_R\sqrt{R(t)}\,d\omega_R(t), \qquad t\in[0,T], \qquad (37.1)$$

where κ̄_R regulates the speed at which the interest rate adjusts to its long-run stationary value θ̄_R/κ̄_R, and ω_R = {ω_R(t): t ∈ [0,T]} is a standard Brownian motion.3 This single-factor interest rate structure of Cox et al. (1985) is adopted as it requires the estimation of only three structural parameters. Adding more factors to the term structure model will of course lead to more plausible formulas for bond prices, but it can make the resulting option formula harder to implement. Take a generic non-dividend-paying stock whose price dynamics are described by

$$\frac{dS(t)}{S(t)} = \mu(S,t)\,dt + \sqrt{V(t)}\,d\omega_S(t), \qquad t\in[0,T], \qquad (37.2)$$

where μ(S,t), which is left unspecified, is the instantaneous expected return, and ω_S a standard Brownian motion. The instantaneous stock return variance, V(t), is assumed to follow a Markov process:

$$dV(t) = [\bar\theta_v - \bar\kappa_v V(t)]\,dt + \sigma_v\sqrt{V(t)}\,d\omega_v(t), \qquad t\in[0,T], \qquad (37.3)$$

where again ω_v is a standard Brownian motion and the structural parameters have the usual interpretation. We refer to V(t) as the spot volatility or, simply, volatility. This process is also frugal in the number of parameters to be estimated and is similar to the one in Heston (1993). Letting ρ denote the correlation coefficient between ω_S and ω_v, the covariance between changes in S(t) and in V(t) is Cov_t[dS(t), dV(t)] = ρσ_v S(t)V(t) dt, which can take either sign and is time-varying. According to Bakshi et al. (1997, 2000a, b), Bakshi and Chen (1997), Bates (1996a), Cao and Huang (2008), and Rubinstein (1985), this additional feature is important for explaining the skewness- and kurtosis-related biases associated with the BS formula. Finally, for ease of presentation, assume that the equity-related shocks and the interest rate shocks are uncorrelated:4 Cov_t(dω_S, dω_R) = Cov_t(dω_v, dω_R) = 0.

By a result from Harrison and Kreps (1979), there are no free lunches in the economy if and only if there exists an equivalent martingale measure with which one can value claims as if the economy were risk-neutral. For instance, the time-t price B(t,τ) of a zero-coupon bond that pays $1 in τ periods can be determined via

$$B(t,\tau) = E_Q\left[\exp\left(-\int_t^{t+\tau} R(s)\,ds\right)\right], \qquad (37.4)$$

where E_Q denotes the expectation with respect to an equivalent martingale measure and conditional on the information generated by R(t) and V(t). Assume that the factor risk premiums for R(t) and V(t) are respectively given by λ_R R(t) and λ_v V(t), for two constants λ_R and λ_v. Bakshi and Chen (1997) provide a general equilibrium model in which risk premiums have precisely this form and in which the interest rate and stock price processes are as assumed here. Under this assumption, we obtain the risk-neutralized processes for R(t) and V(t) below:

$$dR(t) = [\theta_R - \kappa_R R(t)]\,dt + \sigma_R\sqrt{R(t)}\,d\omega_R(t), \qquad (37.5)$$
$$dV(t) = [\theta_v - \kappa_v V(t)]\,dt + \sigma_v\sqrt{V(t)}\,d\omega_v(t), \qquad (37.6)$$

where κ_R ≡ κ̄_R + λ_R and κ_v ≡ κ̄_v + λ_v. The risk-neutralized stock price process becomes

$$\frac{dS(t)}{S(t)} = R(t)\,dt + \sqrt{V(t)}\,d\omega_S(t). \qquad (37.7)$$

That is, under the martingale measure, the stock should earn no more and no less than the risk-free rate. With these adjustments, we solve the conditional expectation in (37.4) and obtain the familiar bond price equation below:

$$B(t,\tau) = \exp\left[-\varphi(\tau) - \varrho(\tau)R(t)\right], \qquad (37.8)$$

where

$$\varphi(\tau) = \frac{\theta_R}{\sigma_R^2}\left\{(\varsigma-\kappa_R)\tau + 2\ln\left[1-\frac{(1-e^{-\varsigma\tau})(\varsigma-\kappa_R)}{2\varsigma}\right]\right\}, \qquad \varrho(\tau) = \frac{2(1-e^{-\varsigma\tau})}{2\varsigma-(\varsigma-\kappa_R)(1-e^{-\varsigma\tau})},$$

and ς ≡ √(κ_R² + 2σ_R²). See Cox et al. (1985) for an analysis of this class of term structure models.

3 Here we follow a common practice to assume from the outset a structure for the underlying price and rate processes, rather than derive them from a full-blown general equilibrium. See Bates (1996a), Heston (1993), Melino and Turnbull (1990, 1995), and Scott (1987, 1997). The simple structure assumed in this section can, however, be derived from the general equilibrium model of Bakshi and Chen (1997).

4 This assumption on the correlation between stock returns and interest rates is somewhat severe and likely counterfactual. To gauge the potential impact of this assumption on the resulting option model's performance, we initially adopted the following stock price dynamics:

$$\frac{dS(t)}{S(t)} = \mu(S,t)\,dt + \sqrt{V(t)}\,d\omega_S(t) + \sigma_{S,R}\sqrt{R(t)}\,d\omega_R(t), \qquad t\in[0,T],$$

with the rest of the stochastic structure remaining the same as given above. Under this more realistic structure, the covariance between stock price changes and interest rate shocks is Cov_t[dS(t), dR(t)] = σ_{S,R}σ_R R(t)S(t) dt, so bond market innovations can be transmitted to the stock market and vice versa. The closed-form option pricing formula obtained under this scenario would have one more parameter (σ_{S,R}) than the one presented shortly, but when we implemented this slightly more general model, we found its pricing and hedging performance to be indistinguishable from that of the SVSI model studied in this paper. For this reason, we chose to present the more parsimonious SVSI model derived under the stock price process in (37.2). We could also make both the drift and the diffusion terms of V(t) a linear function of R(t) and ω_R(t). In such cases, the stock returns, volatility and interest rates would all be correlated with each other (at least globally), and we could still derive the desired equity option valuation formula. But, that would again make the resulting formula more complex while not improving its performance.
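The bond price in (37.8) is easy to evaluate numerically. The following Python sketch of the closed form under the risk-neutralized square-root dynamics (37.5) is an illustration only; the function name and the sample inputs in the usage comment are ours and merely hypothetical.

import numpy as np

def cir_bond_price(R, tau, kappa_R, theta_R, sigma_R):
    # B(t, tau) = exp(-varphi(tau) - varrho(tau) * R(t)), Equation (37.8)
    zeta = np.sqrt(kappa_R**2 + 2.0 * sigma_R**2)     # the constant sigma-varsigma above
    decay = 1.0 - np.exp(-zeta * tau)
    varphi = (theta_R / sigma_R**2) * (
        (zeta - kappa_R) * tau
        + 2.0 * np.log(1.0 - decay * (zeta - kappa_R) / (2.0 * zeta))
    )
    varrho = 2.0 * decay / (2.0 * zeta - (zeta - kappa_R) * decay)
    return np.exp(-varphi - varrho * R)

# Hypothetical usage: a 45-day discount bond with R(t) = 6.27%; for small tau,
# varrho(tau) is close to tau, so the price is close to exp(-R * tau).
# b = cir_bond_price(R=0.0627, tau=45/365, kappa_R=0.481, theta_R=0.037, sigma_R=0.043)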
37.2.1 Pricing Formula for European Options

Now, consider a European call option written on the stock, with strike price K and term-to-expiration τ. Let its time-t price be denoted by C(t,τ). As (S,R,V) form a joint Markov process, the price C(t,τ) must be a function of S(t), R(t) and V(t) (in addition to τ). By a standard argument, the option price must solve

$$\frac{1}{2}VS^2\frac{\partial^2 C}{\partial S^2} + \rho\sigma_v VS\frac{\partial^2 C}{\partial S\,\partial V} + \frac{1}{2}\sigma_v^2 V\frac{\partial^2 C}{\partial V^2} + \frac{1}{2}\sigma_R^2 R\frac{\partial^2 C}{\partial R^2} + RS\frac{\partial C}{\partial S} + [\theta_v-\kappa_v V]\frac{\partial C}{\partial V} + [\theta_R-\kappa_R R]\frac{\partial C}{\partial R} - RC - \frac{\partial C}{\partial\tau} = 0, \qquad (37.9)$$

subject to C(t+τ, 0) = max{S(t+τ) − K, 0}. In the Appendix it is shown that

$$C(t,\tau) = S(t)\,\Pi_1(t,\tau;S,R,V) - KB(t,\tau)\,\Pi_2(t,\tau;S,R,V), \qquad (37.10)$$

where the risk-neutral probabilities, Π₁ and Π₂, are recovered from inverting the respective characteristic functions [see Heston (1993) and Scott (1997) for similar treatments]:
$$\Pi_j(t,\tau;S(t),R(t),V(t)) = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty}\mathrm{Re}\left[\frac{e^{-i\phi\ln K}\,f_j(t,\tau,S(t),R(t),V(t);\phi)}{i\phi}\right]d\phi, \qquad (37.11)$$

for j = 1, 2. The characteristic functions f_j are respectively given by
$$f_1(t,\tau) = \exp\Bigg\{-\frac{\theta_R}{\sigma_R^2}\left[2\ln\left(1-\frac{(\xi_R-\kappa_R)(1-e^{-\xi_R\tau})}{2\xi_R}\right)+(\xi_R-\kappa_R)\tau\right] - \frac{\theta_v}{\sigma_v^2}\left[2\ln\left(1-\frac{[\xi_v-\kappa_v+(1+i\phi)\rho\sigma_v](1-e^{-\xi_v\tau})}{2\xi_v}\right)+[\xi_v-\kappa_v+(1+i\phi)\rho\sigma_v]\tau\right] + i\phi\ln S(t) + \frac{2i\phi(1-e^{-\xi_R\tau})}{2\xi_R-(\xi_R-\kappa_R)(1-e^{-\xi_R\tau})}R(t) + \frac{i\phi(i\phi+1)(1-e^{-\xi_v\tau})}{2\xi_v-[\xi_v-\kappa_v+(1+i\phi)\rho\sigma_v](1-e^{-\xi_v\tau})}V(t)\Bigg\}, \qquad (37.12)$$

and

$$f_2(t,\tau) = \exp\Bigg\{-\frac{\theta_R}{\sigma_R^2}\left[2\ln\left(1-\frac{(\xi_R^*-\kappa_R)(1-e^{-\xi_R^*\tau})}{2\xi_R^*}\right)+(\xi_R^*-\kappa_R)\tau\right] - \frac{\theta_v}{\sigma_v^2}\left[2\ln\left(1-\frac{[\xi_v^*-\kappa_v+i\phi\rho\sigma_v](1-e^{-\xi_v^*\tau})}{2\xi_v^*}\right)+[\xi_v^*-\kappa_v+i\phi\rho\sigma_v]\tau\right] + i\phi\ln S(t) - \ln B(t,\tau) + \frac{2(i\phi-1)(1-e^{-\xi_R^*\tau})}{2\xi_R^*-(\xi_R^*-\kappa_R)(1-e^{-\xi_R^*\tau})}R(t) + \frac{i\phi(i\phi-1)(1-e^{-\xi_v^*\tau})}{2\xi_v^*-[\xi_v^*-\kappa_v+i\phi\rho\sigma_v](1-e^{-\xi_v^*\tau})}V(t)\Bigg\}, \qquad (37.13)$$

where ξ_R = √(κ_R² − 2σ_R² iϕ), ξ_v = √([κ_v − (1+iϕ)ρσ_v]² − iϕ(iϕ+1)σ_v²), ξ_R* = √(κ_R² − 2σ_R²(iϕ−1)), and ξ_v* = √([κ_v − iϕρσ_v]² − iϕ(iϕ−1)σ_v²). The price of a European put on the same stock can be determined from put-call parity.

The option valuation model in (37.10) has several distinctive features. First, it applies to cases with stochastically varying interest rates and volatility. It contains as special cases most existing models, such as the SV models, the SI models, and clearly the BS model. Second, as mentioned earlier, it allows for a flexible correlation structure between the stock return and its volatility, as opposed to the perfect correlation assumed in, for instance, Heston's (1993) model.
Furthermore, the volatility risk premium is time-varying and state-dependent. This is a departure from Hull and White (1987), Scott (1987), Stein and Stein (1991), and Wiggins (1987), where the volatility risk premium is either a constant or zero. Third, when compared to the general models in Bakshi and Chen (1997) and Scott (1997), the formula in (37.10) is parsimonious in the number of parameters; in particular, it is given only as a function of identifiable variables, such that all parameters can be estimated based on available financial market data. The pricing formula in (37.10) applies to European equity options. But, in reality most of the traded option contracts are American in nature. While it is beyond the scope of the present paper to derive a model for American options,
it is nevertheless possible to capture the first-order effect of early exercise in the following manner. For options with early exercise potential, compute the Barone-Adesi and Whaley (1987) or Kim (1990) early-exercise premium, treating it as if the stock volatility and the yield-curve were time-invariant. Adding this early-exercise adjustment component to the European option price in (37.10) should deliver a reasonable approximation of the corresponding American option price [e.g., Bates (1996b)].
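Numerically, the price in (37.10) reduces to two one-dimensional integrals via (37.11), which standard quadrature handles well. The Python sketch below shows the inversion step; `f1` and `f2` stand for user-supplied implementations of the characteristic functions (37.12)-(37.13), and the finite truncation of the integral is our numerical convenience, not part of the model.

import numpy as np
from scipy.integrate import quad

def risk_neutral_prob(f_j, ln_K, upper=200.0):
    # Pi_j = 1/2 + (1/pi) * Int_0^inf Re[e^{-i phi ln K} f_j(phi)/(i phi)] dphi, Eq. (37.11)
    def integrand(phi):
        return (np.exp(-1j * phi * ln_K) * f_j(phi) / (1j * phi)).real
    value, _ = quad(integrand, 1e-8, upper, limit=500)
    return 0.5 + value / np.pi

def svsi_call_price(S, K, B, f1, f2):
    # C(t,tau) = S * Pi_1 - K * B(t,tau) * Pi_2, Eq. (37.10); f1 and f2 are
    # assumed to close over (S, R, V, tau) and the structural parameters
    ln_K = np.log(K)
    return S * risk_neutral_prob(f1, ln_K) - K * B * risk_neutral_prob(f2, ln_K)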
37.2.2 Hedging and Hedge Ratios

One appealing feature of a closed-form option pricing formula, such as the one in (37.10), is the possibility of deriving comparative statics and hedge ratios analytically. In the present context, there are three sources of stochastic variation over time: price risk S(t), volatility risk V(t) and interest rate risk R(t). Consequently, there are three deltas:

$$\Delta_S(t,\tau;K) \equiv \frac{\partial C(t,\tau)}{\partial S} = \Pi_1 > 0, \qquad (37.14)$$

$$\Delta_V(t,\tau;K) \equiv \frac{\partial C(t,\tau)}{\partial V} = S(t)\frac{\partial\Pi_1}{\partial V} - KB(t,\tau)\frac{\partial\Pi_2}{\partial V} > 0, \qquad (37.15)$$

$$\Delta_R(t,\tau;K) \equiv \frac{\partial C(t,\tau)}{\partial R} = S(t)\frac{\partial\Pi_1}{\partial R} - KB(t,\tau)\left[\frac{\partial\Pi_2}{\partial R} - \varrho(\tau)\Pi_2\right] > 0, \qquad (37.16)$$

where, for g = V, R and j = 1, 2,

$$\frac{\partial\Pi_j}{\partial g} = \frac{1}{\pi}\int_0^{\infty}\mathrm{Re}\left[(i\phi)^{-1}e^{-i\phi\ln K}\,\frac{\partial f_j}{\partial g}\right]d\phi. \qquad (37.17)$$

The second-order partial derivatives with respect to these variables are provided in the Appendix. As V(t) and R(t) are both stochastic in our model, these deltas will in general differ from their Black–Scholes counterparts. To see how they may differ, let's resort to an example in which we set R(t) = 6.27%, S(t) = 270, √V(t) = 22.12%, κ_R = 0.481, θ_R = 0.037, σ_R = 0.043, κ_v = 1.072, θ_v = 0.041, σ_v = 0.284, and ρ = −0.60. These values are backed out from the S&P 500 option prices as of July 5, 1988. Fix K = $270 and τ = 45 days. Let Δ_S be as given in (37.14) for the SVSI model and Δ_S^BS its BS counterpart, with Δ_S^BS calculated using the same implied volatility. Figure 37.1 plots the difference between Δ_S and Δ_S^BS, across different spot price levels and different correlation values. The correlation coefficient ρ is chosen to be the focus as it is known to play a crucial role in determining the skewness of the stock return distribution. When ρ is respectively at −0.50 and −1.0 (see the corresponding curves in Fig. 37.1), the difference between the deltas is W-shaped, and it reaches the highest value when the option is at the money. The reverse is true when ρ is positive. Thus, Δ_S is generally different from Δ_S^BS.

Fig. 37.1 The five curves respectively plot the difference between the SVSI call option delta (with respect to the stock) and its Black–Scholes counterpart, as ρ varies from −1.0, −0.50, 0, 0.50, to 1.0. The structural parameter values used in the computation of the delta in (37.14) are backed out using Procedure B described in Sect. 37.2.3 and correspond to the calendar date July 5, 1988. The values of the structural parameters are: κ_R = 0.4811, θ_R = 0.0370, σ_R = 0.0429, κ_v = 1.072, θ_v = 0.0409, σ_v = 0.284, ρ = −0.60. The initial (time-t) R = 0.062733, √V = 22.12%, B(t, 0.1232) = 0.99163. The strike price is fixed at $270 and the term-to-expiration of the option is 45 days
Fig. 37.2 The five curves respectively plot the difference between the SVSI call option delta (with respect to the standard deviation) and the Black–Scholes counterpart, as ρ varies from −1.0, −0.50, 0, 0.50, to 1.0. The strike price is fixed at $270 and the term-to-expiration of the option is 45 days. All computations are based on the parameter values given in the note to Fig. 37.1

Fig. 37.3 The five curves respectively plot the difference between the SVSI call option delta (with respect to the spot interest rate) and the Black–Scholes counterpart, as ρ varies from −1.0, −0.50, 0, 0.50, to 1.0. The strike price is fixed at $270 and the term-to-expiration of the option is 45 days. All computations are based upon the parameter values reported in the note to Fig. 37.1
Analogous difference patterns emerge when the other option deltas are compared with their respective BS counterparts. From Figs. 37.2 and 37.3, one can observe the following. (i) The volatility hedge ratio Δ_V from the SVSI model is, at each spot price, lower than its BS counterpart (except for deep in-the-money options when ρ < 0, and for deep out-of-the-money options when ρ > 0).5 (ii) The interest-rate delta, Δ_R, and its BS counterpart, Δ_R^BS, are almost not different from each other for slightly out-of-the-money options, but can be dramatically different for at-the-money options as well as for sufficiently deep in-the-money or deep out-of-the-money calls. For example, pick ρ = −1.0. When S = $315, we have Δ_R = 30.94 and Δ_R^BS = 32.35; when S = $226, we have Δ_R = 0.003 and Δ_R^BS = 0.430. (iii) As expected, out-of-the-money options are overall less sensitive to changes in the spot interest rate, regardless of the model used. In summary, if a portfolio manager/trader relies, in an environment with stochastic interest rates and stochastic volatility, on the BS model to design a hedge for option positions, the manager/trader will likely fail. Analytical expressions for the deltas are useful for constructing hedges based on an option formula. Below, we present two types of hedge using the SVSI model as an example.

5 In making such a comparison, one should apply sufficient caution. In the BS model, the volatility delta is only a comparative static, not a hedge ratio, as volatility is assumed to be constant. In the context of the SVSI model, however, Δ_V is a time-varying hedge ratio as volatility is stochastic. This distinction also applies to the case of the interest-rate delta Δ_R.
37.2.2.1 Delta-Neutral Hedges

To demonstrate how the deltas may be used to construct a delta-neutral hedge, consider an example in which a financial institution intends to hedge a short position in a call option with τ periods to expiration and strike price K. In the stochastic interest-rate, stochastic-volatility environment, a perfectly delta-neutral hedge can be achieved by taking
a long position in the replicating portfolio of the call. As three traded assets are needed to control the three sources of uncertainty, the replicating portfolio will involve a position in (i) some X_S(t) shares of the underlying stock (to control for the S(t) risk), (ii) some X_B(t) units of a τ-period discount bond (to control for the R(t) risk), and (iii) some X_C(t) units of another call option with strike price K̄ (or any option on the stock with a different maturity) in order to control for the volatility risk V(t). Denote the time-t price of the replicating portfolio by G(t): G(t) = X₀(t) + X_S(t)S(t) + X_B(t)B(t,τ) + X_C(t)C(t,τ;K̄), where X₀(t) denotes the amount put into the instantaneously-maturing risk-free bond and serves as a residual "cash position." Deriving the dynamics for G(t) and comparing them with those of C(t,τ;K), we find the following solution for the delta-neutral hedge:

$$X_C(t) = \frac{\Delta_V(t,\tau;K)}{\Delta_V(t,\tau;\bar K)}, \qquad (37.18)$$

$$X_S(t) = \Delta_S(t,\tau;K) - \Delta_S(t,\tau;\bar K)\,X_C(t), \qquad (37.19)$$

$$X_B(t) = \frac{1}{B(t,\tau)\,\varrho(\tau)}\left\{\Delta_R(t,\tau;K) - \Delta_R(t,\tau;\bar K)\,X_C(t)\right\}, \qquad (37.20)$$

and the residual amount put into the instantaneously-maturing bond is

$$X_0(t) = C(t,\tau;K) - X_S(t)\,S(t) - X_C(t)\,C(t,\tau;\bar K) - X_B(t)\,B(t,\tau), \qquad (37.21)$$

where all the primitive deltas, Δ_S, Δ_R and Δ_V, are as determined in Equations (37.14)-(37.16). Like the option prices, these hedge ratios all depend on the values taken by S(t), V(t) and R(t) and those by the structural parameters. Such a hedge created using the general option pricing model should in principle perform better than one using the BS model. In the latter case, only the underlying price uncertainty is controlled for, but not the uncertainties associated with volatility and interest rate fluctuations.

In theory this delta-neutral hedge requires continuous rebalancing to reflect the changing market conditions. In practice, of course, only discrete rebalancing is possible. To derive a hedging effectiveness measure, suppose that portfolio rebalancing takes place at intervals of length Δt. Then, precisely as described above, at time t short the call option, go long in (i) X_S(t) shares of the underlying asset, (ii) X_B(t) units of the τ-period bond, and (iii) X_C(t) contracts of a call option with the same term-to-expiration but a different strike price K̄, and invest the residual, X₀, in an instantaneously maturing risk-free bond. After the next interval, compute the hedging error according to

$$H(t+\Delta t) = X_0\,e^{R(t)\Delta t} + X_S(t)\,S(t+\Delta t) + X_B(t)\,B(t+\Delta t,\tau-\Delta t) + X_C(t)\,C(t+\Delta t,\tau-\Delta t;\bar K) - C(t+\Delta t,\tau-\Delta t;K). \qquad (37.22)$$

Then, at time t+Δt, reconstruct the self-financed portfolio, repeat the hedging error calculation at time t+2Δt, and so on. Record the hedging errors H(t+jΔt), for j = 1, …, J. Finally, compute the average absolute hedging error as a function of rebalancing frequency Δt, H(Δt) = (1/J)Σ_{j=1}^{J} |H(t+jΔt)|, and the average dollar-value hedging error, H̄(Δt) = (1/J)Σ_{j=1}^{J} H(t+jΔt). In comparison, if one relies on the BS model to construct a delta-neutral hedge, the hedging error measures can be similarly defined as in (37.22), except that X_B(t) and X_C(t) must be restricted to zero and X_S(t) must be the BS delta. Likewise, if the SI model is applied, the only change is to set X_C(t) to zero with Δ_S and Δ_R determined by the SI model; in the case of the SV model, set X_B(t) = 0 and let Δ_S and Δ_V be as determined in the SV model. The Appendix provides in closed form an SI option pricing formula and an SV option formula.
37.2.2.2 Single-Instrument Minimum-Variance Hedges

As discussed before, consideration of such factors as model misspecification and transaction costs may render it more practical to use only the underlying asset of the target option as the hedging instrument. Under this single-instrument constraint, a standard design is to choose a position in the underlying stock so as to minimize the variance of instantaneous changes in the value of the hedge. Letting X_S(t) again be the number of shares of the stock to be purchased, solving the standard minimum-variance hedging problem under the SVSI model gives

$$X_S(t) = \frac{\mathrm{Cov}_t[dS(t),\,dC(t,\tau)]}{\mathrm{Var}_t[dS(t)]} = \Delta_S + \rho\sigma_v\,\frac{\Delta_V(t,\tau)}{S(t)}, \qquad (37.23)$$

and the resulting residual cash position for the replicating portfolio is

$$X_0(t) = C(t,\tau) - X_S(t)\,S(t). \qquad (37.24)$$

This minimum-variance hedge solution is quite intuitive, as it says that if stock volatility is deterministic (i.e., σ_v = 0), or if stock returns are not correlated with volatility changes (i.e., ρ = 0), one only needs to long Δ_S(t) shares of the stock and no other adjustment is necessary. However, if volatility
is stochastic and correlated with stock returns, the position to be taken in the stock must control not only for the direct impact of underlying stock price changes on the target option value, but also for the indirect impact of that part of volatility changes which is correlated with stock price fluctuations. This effect is reflected in the last term in (37.23), which shows that the additional number of shares needed besides Δ_S is increasing in ρ (assuming Δ_V > 0). As for the previous case, suppose that the target call is shorted and that X_S(t) shares are bought and X₀(t) dollars are put into the instantaneous risk-free bond, at time t. The combined position is a self-financed portfolio. At time t+Δt, the hedging error of this minimum-variance hedge is calculated as

$$H(t+\Delta t) = X_S(t)\,S(t+\Delta t) + X_0(t)\,e^{R(t)\Delta t} - C(t+\Delta t,\tau-\Delta t). \qquad (37.25)$$
Unlike Nandi (1996), who uses the remaining variance of the hedge as a hedging effectiveness gauge, we compute, based on the entire sample period, the average absolute and the average dollar hedging errors to measure the effectiveness of the hedge. Minimum-variance hedging errors under the SV model as well as under the SI model can be similarly determined, accounting for their modeling differences. In the case of the SV model, there is still an adjustment term for the single stock position as in (37.23). But, for the SI model, the corresponding X_S(t) is the same as its Δ_S. For the BS model, this single-instrument minimum-variance hedge is the same as the delta-neutral hedge. Both types of hedging strategy will be examined under each of the four alternative models.
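The single-instrument hedge is even simpler in code. A Python sketch of (37.23)-(37.25), again with the deltas assumed given and the function names ours:

import numpy as np

def min_variance_hedge(C, S, dS, dV, rho, sigma_v):
    # X_S = Delta_S + rho * sigma_v * Delta_V / S, Eq. (37.23)
    X_S = dS + rho * sigma_v * dV / S
    X_0 = C - X_S * S              # residual cash position, Eq. (37.24)
    return X_S, X_0

def hedging_error(X_S, X_0, S_next, C_next, R, dt):
    # H(t+dt) = X_S * S(t+dt) + X_0 * e^{R dt} - C(t+dt, tau-dt), Eq. (37.25)
    return X_S * S_next + X_0 * np.exp(R * dt) - C_next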
37.2.3 Implementation

In addition to the strike price and the term-to-expiration (which are specified in the contract), the SVSI pricing formula in (37.10) requires the following values as input:

- The spot stock price. If the stock pays dividends, the stock price must be adjusted by the present value of future dividends;
- The spot volatility;
- The spot interest rate;
- The matching τ-period yield-to-maturity (or the bond price);
- The seven structural parameters: {κ_R, θ_R, σ_R, κ_v, θ_v, σ_v, ρ}.

For computing the price of a European option, we offer two alternative two-step procedures below. One can implement these steps on any personal computer:

Procedure A:
Step 1. Obtain a time-series each for the short rate, the stock return, and the stock volatility. Jointly estimate the structural parameters, {κ_R, θ_R, σ_R, κ_v, θ_v, σ_v, ρ}, using Hansen's (1982) GMM.
Step 2. Determine the risk-neutral probabilities, Π₁ and Π₂, from the characteristic functions in (37.12) and (37.13). Substitute (i) the two probabilities, (ii) the stock price, and (iii) the yield-to-maturity into (37.10) to compute the option price.

While offering an econometrically rigorous method to estimate the structural parameters, Step 1 in Procedure A may not be as practical or convenient, because of its requirement on historical data. A further difficulty with this approach is its dependence on the measurement of stock volatility. In implementing the BS model, practitioners predominantly use the implied volatility from the model itself, rather than relying on historical data. This practice has not only reduced data requirements dramatically but also resulted in significant performance improvement [e.g., Bates (2000), and Melino and Turnbull (1990, 1995)]. Clearly, one can also follow this practice to implement the SVSI model:

Procedure B:
Step 1. Collect N option prices on the same stock and taken from the same point in time (or same day), for any N ≥ 8. Let Ĉ_n(t,τ_n,K_n) be the observed price, and C_n(t,τ_n,K_n) the model price as determined by (37.10) with S(t) and R(t) taken from the market, for the nth option with τ_n periods to expiration and strike price K_n, for each n = 1, …, N. Clearly, the difference between Ĉ_n and C_n is a function of the values taken by V(t) and by Φ ≡ {κ_R, θ_R, σ_R, κ_v, θ_v, σ_v, ρ}. Define

$$\epsilon_n[V(t),\Phi] \equiv \hat C_n(t,\tau_n,K_n) - C_n(t,\tau_n,K_n), \qquad (37.26)$$

for each n. Then, find V(t) and parameter vector Φ (a total of eight values), so as to minimize the sum of squared errors:

$$\min_{V(t),\,\Phi}\ \sum_{n=1}^{N}\left|\epsilon_n[V(t),\Phi]\right|^2. \qquad (37.27)$$

The result from this step is an estimate of the implied spot variance and seven structural parameter values, for date t. See Bates (1996b, 2000), Day and Lewis (1997), Dumas
et al. (1998), Longstaff (1995), Madan et al. (1998), and Nandi (1996) where they adopt this technique for similar purposes.
Step 2. Based on the estimate from the first step, follow Step 2 of Procedure A to compute date-(t+1)'s option prices on the same stock.

In the existing literature, the performance of a new option pricing model is often judged relative to that of the BS model when the latter is implemented using the model's own implied volatility and the time-varying interest rates. Since volatility and interest rates in the BS are assumed to be constant over time, this internally inconsistent practice will clearly and significantly bias the application results in favor of the BS model. But, as this is the current standard in judging performance, we will follow Procedure B to implement the SVSI model and similar procedures to implement the BS, the SV, and the SI models. Then, the models will be ranked relative to each other according to their performance so determined.
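In code, Step 1 of Procedure B is a bounded nonlinear least-squares problem in eight unknowns. The Python sketch below assumes a callable `model_price` implementing (37.10); the starting values and bounds are illustrative choices of ours, not the authors'.

import numpy as np
from scipy.optimize import least_squares

def calibrate(option_quotes, S, R, model_price):
    # option_quotes: list of (market_price, tau_n, K_n) tuples with N >= 8
    def residuals(x):
        V = x[0]            # implied spot variance V(t)
        params = x[1:]      # {kappa_R, theta_R, sigma_R, kappa_v, theta_v, sigma_v, rho}
        return [c_hat - model_price(S, K, tau, R, V, params)
                for c_hat, tau, K in option_quotes]      # epsilon_n, Eq. (37.26)
    x0 = np.array([0.04, 0.5, 0.04, 0.04, 1.0, 0.04, 0.3, -0.5])   # illustrative start
    lb = [1e-6, 1e-4, 1e-6, 1e-6, 1e-4, 1e-6, 1e-6, -0.999]
    ub = [4.0, 20.0, 1.0, 1.0, 20.0, 1.0, 5.0, 0.999]
    # Minimize the sum of squared errors in Eq. (37.27)
    return least_squares(residuals, x0, bounds=(lb, ub))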
37.3 Data Description

For all the tests to follow, we use, based on the following considerations, S&P 500 call option prices as the basis. First, options written on this index are the most actively traded European-style contracts. Recall that, like the BS model, formula (37.10) applies to European options. Second, the daily dividend distributions are available for the index (from the S&P 500 Information Bulletin). Harvey and Whaley (1992a, b), for instance, emphasize that critical pricing errors can result when dividends are omitted from empirical tests of any option valuation model. Furthermore, S&P 500 options and options on S&P 500 futures have been the focus of many existing empirical investigations including, among others, Bates (2000), Dumas et al. (1998), Madan et al. (1998), Nandi (1996), and Rubinstein (1994). Finally, we also used S&P 500 put option prices to estimate the pricing and hedging errors of all four models and found the results to be similar, both qualitatively and quantitatively, to those reported in the paper. To save space, we chose to focus on the results based on the call option prices.

The sample period extends from June 1, 1988 through May 31, 1991. The intradaily transaction prices and bid-ask quotes for S&P 500 options are obtained from the Berkeley Option Database. Note that the recorded S&P 500 index values are not the daily closing index levels. Rather, they were the corresponding index levels at the moment when the recorded option transaction took place or when an option price quote was recorded. Thus, there is no non-synchronous price issue here, except that the S&P 500 index level itself may contain stale component stock prices at each point in time.

The data on the daily Treasury-bill bid and ask discounts with maturities up to 1 year are hand-collected from the Wall Street Journal and provided to us by Hyuk Choe and Steve Freund. By convention, the average of the bid and ask Treasury bill discounts is used and converted to an annualized interest rate. Careful attention is given to this construction since Treasury bills mature on Thursdays while index options expire on the third Friday of the month. In such cases, we utilize the two Treasury-bill rates straddling the option's expiration date to obtain the interest rate of that maturity, which is done for each contract and each day in the sample. The Treasury bill rate with 30 days to maturity is the surrogate used for the short rate in (37.1) [and in the determination of the probabilities in (37.10)]. For European options, the spot stock price must be adjusted for discrete dividends. For each option contract with τ periods to expiration from time t, we first obtain the present value of the daily dividends D(t) by computing

$$\bar D(t,\tau) = \sum_{s=1}^{\tau} e^{-R(t,s)\,s}\,D(t+s), \qquad (37.28)$$
where R(t,s) is the s-period yield-to-maturity. This procedure is repeated for all option maturities and for each day in our sample. In the next step, we subtract the present value of future dividends from the time-t index level, in order to obtain the dividend-exclusive S&P 500 spot index series that is later used as input into the option models. Several exclusion filters are applied to construct the option price data set. First, option prices that are time-stamped later than 3:00 PM Central Daylight Time are eliminated. This ensures that the spot price is recorded synchronously with its option counterpart. Second, as options with less than 6 days to expiration may induce liquidity-related biases, they are excluded from the sample. Third, to mitigate the impact of price discreteness on option valuation, option prices lower than $3/8 are not included. Finally, quote prices that are less than the intrinsic value of the option are taken out of the sample. We divide the option data into several categories according to either moneyness or term to expiration. A call option is said to be at-the-money (ATM) if S/K ∈ (0.97, 1.03); out-of-the-money (OTM) if S/K ≤ 0.97; and in-the-money (ITM) if S/K ≥ 1.03, where S is the spot price and K the strike. A finer partition resulted in nine moneyness categories. By the term to expiration, each option can be classified as [e.g., Rubinstein (1985)]: (i) extremely short-term (<30 days); (ii) short-term (30–60 days); (iii) near-term (60–120 days); (iv) middle-maturity (120–180 days); and (v) long-term (>180 days). The proposed moneyness and term-to-expiration classifications resulted in 54 categories for which the empirical results will be reported. Table 37.1 describes sample properties of the S&P 500 call option prices used in the tests. Summary statistics are
Table 37.1 Sample properties of S&P 500 index options. The reported numbers are respectively the average quoted bid-ask mid-point price and the number of observations. Each option contract is consolidated across moneyness and term-to-expiration categories. The sample period extends from June 1, 1988 through May 31, 1991 for a total of 38,749 calls. Daily information from the last quote of each option contract is used to obtain the summary statistics
                                         Term-to-expiration (days)
Moneyness S/K   <30            30–60          60–90          90–120         120–180        ≥180           Subtotal
<0.93           0.78 {23}      1.33 {246}     1.99 {266}     2.84 {431}     4.88 {1080}    7.82 {1538}    {3584}
0.93–0.95       1.02 {121}     1.91 {595}     3.30 {267}     5.08 {319}     8.14 {596}     12.86 {646}    {2544}
0.95–0.97       1.35 {488}     3.05 {1012}    5.35 {316}     7.45 {351}     10.87 {670}    15.91 {628}    {3465}
0.97–0.99       2.47 {838}     5.53 {1020}    8.23 {312}     10.83 {336}    14.19 {676}    19.33 {706}    {3888}
0.99–1.01       5.27 {776}     8.99 {954}     11.96 {285}    14.55 {308}    17.95 {629}    23.20 {631}    {3583}
1.01–1.03       9.65 {752}     13.17 {906}    15.99 {276}    18.84 {283}    22.06 {607}    27.74 {597}    {3421}
1.03–1.05       14.79 {675}    17.80 {844}    20.80 {241}    23.36 {264}    26.39 {542}    31.91 {501}    {3067}
1.05–1.07       20.20 {620}    22.63 {760}    25.83 {224}    27.83 {242}    30.69 {449}    35.70 {473}    {2818}
≥1.07           41.23 {2143}   42.28 {2350}   47.50 {1284}   49.27 {1355}   51.34 {2184}   59.82 {3063}   {12379}
Subtotal        {6436}         {8687}         {3471}         {3889}         {7483}         {8783}         {38749}

S denotes the spot S&P 500 Index level and K is the exercise price.
reported for the average bid-ask mid-point price and the total number of observations, for each moneyness-maturity category. Note that there is a total of 38,749 call price observations, with deep in-the-money and at-the-money options respectively taking up 32% and 28% of the total sample, and that the average call price ranges from $0.78 for extremely short-term, deep out-of-the-money options to $59.82 for long-term, deep in-the-money options.
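To make the data construction concrete, the following is a minimal Python sketch of the exclusion filters and the moneyness/maturity classifications of this section. The record layout (quote_time, days_to_expiry, price, intrinsic) and the function names are illustrative assumptions, not part of the original study.

```python
from datetime import time

def keep_quote(q):
    """Apply the four exclusion filters described in Sect. 37.3."""
    if q["quote_time"] > time(15, 0):   # time-stamped after 3:00 PM Central
        return False
    if q["days_to_expiry"] < 6:         # near expiration: liquidity biases
        return False
    if q["price"] < 3.0 / 8.0:          # below $3/8: price-discreteness effects
        return False
    if q["price"] < q["intrinsic"]:     # quote below intrinsic value
        return False
    return True

def moneyness_bucket(spot, strike):
    """Coarse OTM/ATM/ITM classification by S/K."""
    m = spot / strike
    if m <= 0.97:
        return "OTM"
    return "ATM" if m < 1.03 else "ITM"

def maturity_bucket(days):
    """Five term-to-expiration classes, as in Rubinstein (1985)."""
    if days < 30:
        return "extremely short-term"
    if days <= 60:
        return "short-term"
    if days <= 120:
        return "near-term"
    if days <= 180:
        return "middle-maturity"
    return "long-term"
```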
37.4 Empirical Tests

This section examines the relative empirical performance of the four models. The analysis is intended to present a complete picture of what each generalization of the benchmark BS model really buys in terms of performance improvement, and whether each generalization produces a worthwhile tradeoff between benefits and costs. We pursue this analysis using three yardsticks: (i) the size of the out-of-sample cross-sectional pricing errors (static performance); (ii) the size of model-based hedging errors (dynamic performance); and (iii) the existence of systematic biases across strike prices or across maturities (i.e., does the implied volatility still smile?). Based on Procedure B of Sect. 37.2.3, Table 37.2 reports the summary statistics for the daily estimated structural parameters and the implied spot standard deviation,
respectively for the SVSI, the SV, the SI and the BS models. Take the SVSI model as an example. Over the entire sample period 06:1988–05:1991, κ_v = 0.906, θ_v = 0.042, and σ_v = 0.414. These estimates imply a long-run mean of 21.53% for the volatility process (computed as √(θ_v/κ_v), given the risk-neutral variance drift [θ_v − κ_v V(t)]). The implicit (average) half-life for variance mean-reversion, ln(2)/κ_v, is 9.18 months. These estimates are similar in magnitude to those reported in Bates (1996b, 2000) for S&P 500 futures options. The estimated parameters for the (risk-neutralized) short rate process are also reasonable and comparable to those in Chan et al. (1992). The average correlation estimate, ρ, is −0.763. The average implied standard deviation is 19.27%. As seen from the reported standard errors in Table 37.2, for each given model the daily parameter and spot volatility estimates are quite stable from subperiod to subperiod. Histogram-based inferences (not reported) indicate that the majority of the estimated values are centered around the mean. In estimating the structural parameters and the implied volatility for a given day, we used all S&P 500 options collected in the sample for that day (regardless of maturity and moneyness). This is the treatment applied to the SI, the SV and the SVSI models. For the BS model, however, Whaley (1982) makes the point that ATM options may give an implied-volatility estimate which produces the best pricing and hedging results. Based on his justification, we used, for each given day, one ATM option that had at least 15 days to expiration to back out the BS model's implied-volatility value. This estimate was then used to determine the next
Table 37.2 Estimates of the structural parameters for the stochastic interest rate (SI), stochastic volatility (SV), and stochastic volatility and stochastic interest rate (SVSI) models. Each day in the sample, the structural parameters of a given model are estimated by minimizing the sum of squared pricing errors between the market price and the model-determined price for each option. The daily average of the estimated parameters is reported first, followed by its standard error in parentheses. The average implied volatility obtained from inverting the Black–Scholes model (using a short-term at-the-money option) is respectively 18.47%, 17.72%, 17.41%, and 20.52% over the sample periods 06:1988–05:1991, 06:1988–05:1989, 06:1989–05:1990, and 06:1990–05:1991

Period            Model   κ_R          θ_R          σ_R          κ_v          θ_v          σ_v          ρ             √V(t) (%)
06:1988–05:1991   SI      2.35 (0.03)  0.35 (0.00)  0.04 (0.00)                                                       18.14 (0.13)
                  SV                                             1.10 (0.02)  0.04 (0.00)  0.38 (0.00)  −0.64 (0.01)  19.02 (0.15)
                  SVSI    0.61 (0.02)  0.02 (0.00)  0.03 (0.00)  0.91 (0.03)  0.04 (0.01)  0.41 (0.00)  −0.76 (0.01)  19.27 (0.15)
06:1988–05:1989   SI      2.51 (0.06)  0.33 (0.01)  0.04 (0.00)                                                       17.22 (0.15)
                  SV                                             1.29 (0.05)  0.04 (0.00)  0.30 (0.01)  −0.54 (0.01)  18.41 (0.17)
                  SVSI    0.49 (0.02)  0.02 (0.00)  0.04 (0.00)  1.16 (0.07)  0.04 (0.00)  0.35 (0.01)  −0.73 (0.01)  18.50 (0.18)
06:1989–05:1990   SI      2.29 (0.05)  0.34 (0.01)  0.04 (0.00)                                                       17.43 (0.13)
                  SV                                             1.05 (0.03)  0.04 (0.00)  0.42 (0.01)  −0.62 (0.01)  17.61 (0.18)
                  SVSI    0.61 (0.03)  0.02 (0.00)  0.03 (0.00)  0.78 (0.05)  0.04 (0.00)  0.44 (0.01)  −0.74 (0.01)  18.10 (0.21)
06:1990–05:1991   SI      2.03 (0.05)  0.35 (0.01)  0.04 (0.01)                                                       20.14 (0.25)
                  SV                                             0.94 (0.04)  0.04 (0.00)  0.43 (0.00)  −0.76 (0.01)  21.19 (0.35)
                  SVSI    0.76 (0.04)  0.02 (0.00)  0.03 (0.00)  0.76 (0.04)  0.04 (0.00)  0.46 (0.00)  −0.82 (0.01)  21.47 (0.35)
day’s pricing and hedging errors of the BS model. See Bates (1996a) for a review of alternative approaches to estimating the BS model’s implied volatility. Observe in Table 37.2 that for the overall sample period, the average implied standard-deviation is 19.27% by the SVSI model, 19.02% by the SV, 18.14% by the SI, and 18.47% by the BS model, where the difference between the highest and the lowest is only 1.13%. For each subperiod the implied-volatility estimates are similarly close across the four models. This is somewhat surprising. It should however be recognized that this comparison is based only on the average estimates over a given period. When we examined the day-to-day time-series paths of the four models’ implied-volatility estimates, we found the difference between two models’ implied standard-deviations to be sometimes as high as 6%. Economically, option prices and hedge ratios are generally quite sensitive to the volatility input [see Figlewski (1989)]. Even small differences in the implied-volatility estimate can lead to significantly different pricing and hedging results.
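As a quick check of the SVSI magnitudes quoted above, the sketch below reproduces the long-run volatility and half-life figures from κ_v and θ_v, assuming the risk-neutral variance drift is written as [θ_v − κ_v V(t)] so that the long-run mean variance is θ_v/κ_v; the parameter values are those quoted in the text.

```python
import math

# Long-run volatility and mean-reversion half-life implied by the SVSI
# estimates quoted in the text (kappa_v = 0.906, theta_v = 0.042), under
# the variance drift convention [theta_v - kappa_v * V(t)].
kappa_v, theta_v = 0.906, 0.042

long_run_vol = math.sqrt(theta_v / kappa_v)        # ~0.2153, i.e. 21.53%
half_life_months = 12.0 * math.log(2.0) / kappa_v  # ~9.18 months

print(f"long-run volatility: {long_run_vol:.2%}")
print(f"half-life of variance mean-reversion: {half_life_months:.2f} months")
```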
37.4.1 Static Performance

To examine out-of-sample cross-sectional pricing performance for each model, we use the previous day's option prices to back out the required parameter values and then use them as input to compute the current day's model-based option prices.
Next, subtract the model-determined price from the observed market price, to compute both the absolute pricing error and the percentage pricing error. This procedure is repeated for each call and each day in the sample, to obtain the average absolute and the average percentage pricing errors and their associated standard errors. These steps are separately followed for each of the BS, the SI, the SV and the SVSI models. The results from this exercise are reported in Table 37.3. Let's first examine the relative performance in pricing OTM options. Overpricing of OTM options is often considered a critical problem for the BS model [e.g., McBeth and Merville (1979) and Rubinstein (1985)]. Panel A of Table 37.3 reports the absolute and the percentage pricing error estimates for OTM options. According to both error measures, the overall ranking of the four models is consistent with our priors: the SVSI model outperforms all others, followed by the SV, the SI and finally the BS model. For extremely short-term (<30 days) and extremely out-of-the-money (S/K < 0.93) options, for example, the average absolute pricing error by the SVSI model is $0.23 versus $0.53 by the BS, $0.28 by the SI, and $0.25 by the SV model. For this category, the BS model's absolute pricing error is cut by more than a half by each of the other three models. Fix the moneyness category at S/K ∈ (0.93, 0.95). Then, for medium-term (120–180 days) options, the SVSI model produces an average absolute pricing error of $0.44 versus $1.38 by the BS, $0.72 by the SI, and $0.39 by the SV model. For short-term (30–60 days) calls, the absolute pricing errors are $0.44 by the SVSI, $0.48 by the SV, $0.73 by the SI, and $0.90 by the
Table 37.3 Out-of-sample pricing errors. Panel A: Out-of-the-money options. For a given model, compute the price of each option using the previous day's implied parameters and implied stock volatility. The reported percentage pricing error is the sample average of the market price minus the model price, divided by the market price. The reported absolute pricing error is the sample average of the absolute difference between the market price and the model price for each call. The corresponding standard errors are recorded in parentheses. The sample period is 06:1988–05:1991, with a total of 38,749 call option prices

Percentage pricing error, by term-to-expiration (days)
Moneyness S/K  Model  <30             30–60           60–90           90–120          120–180         ≥180
<0.93          BS     −65.99 (12.02)  −86.80 (4.51)   −62.45 (2.96)   −57.63 (2.92)   −47.71 (1.37)   −33.72 (1.05)
               SI     −24.53 (6.59)   −58.13 (3.81)   −40.04 (2.60)   −28.43 (1.67)   −16.70 (0.95)   −3.92 (0.63)
               SV     −22.08 (6.90)   −30.38 (3.07)   −12.43 (1.54)   −4.02 (0.90)    0.89 (0.47)     6.08 (0.39)
               SVSI   −16.29 (7.79)   −21.96 (2.64)   −5.68 (1.40)    −1.68 (0.93)    0.92 (0.51)     0.18 (0.64)
0.93–0.95      BS     −53.68 (5.31)   −54.50 (2.08)   −33.82 (1.79)   −21.88 (1.25)   −16.43 (0.61)   −11.25 (0.56)
               SI     −42.06 (5.32)   −49.30 (2.18)   −32.22 (2.07)   −15.78 (1.07)   −10.18 (0.55)   −5.91 (0.43)
               SV     −25.68 (4.61)   −26.16 (1.43)   −8.83 (0.81)    −3.39 (0.61)    −0.55 (0.30)    1.23 (0.24)
               SVSI   −22.50 (4.53)   −18.85 (1.43)   −4.84 (0.85)    −2.39 (0.74)    0.66 (0.32)     0.71 (0.26)
0.95–0.97      BS     −36.61 (2.33)   −28.83 (0.93)   −16.21 (0.95)   −9.91 (0.84)    −7.75 (0.41)    −5.77 (0.45)
               SI     −35.83 (2.45)   −30.09 (1.09)   −18.97 (1.30)   −7.44 (0.68)    −5.70 (0.41)    −3.62 (0.32)
               SV     −23.68 (2.06)   −16.94 (0.68)   −5.63 (0.58)    −1.63 (0.42)    −0.26 (0.22)    0.56 (0.20)
               SVSI   −16.90 (2.01)   −13.53 (0.72)   −3.59 (0.60)    −1.80 (0.51)    0.05 (0.23)     0.30 (0.21)

Absolute pricing error, by term-to-expiration (days)
Moneyness S/K  Model  <30           30–60         60–90         90–120        120–180       ≥180
<0.93          BS     0.53 (0.10)   1.00 (0.04)   1.14 (0.05)   1.50 (0.06)   1.96 (0.05)   2.36 (0.06)
               SI     0.24 (0.04)   0.66 (0.03)   0.72 (0.04)   0.80 (0.03)   0.91 (0.05)   0.96 (0.05)
               SV     0.25 (0.04)   0.44 (0.03)   0.34 (0.02)   0.33 (0.02)   0.43 (0.04)   0.62 (0.05)
               SVSI   0.23 (0.05)   0.38 (0.02)   0.29 (0.02)   0.33 (0.02)   0.46 (0.04)   0.66 (0.04)
0.93–0.95      BS     0.56 (0.04)   0.90 (0.03)   1.05 (0.04)   1.24 (0.06)   1.38 (0.04)   1.80 (0.06)
               SI     0.42 (0.03)   0.77 (0.02)   0.92 (0.05)   0.83 (0.05)   0.85 (0.03)   0.98 (0.05)
               SV     0.40 (0.03)   0.48 (0.02)   0.35 (0.02)   0.39 (0.02)   0.39 (0.02)   0.52 (0.02)
               SVSI   0.38 (0.03)   0.44 (0.02)   0.31 (0.02)   0.42 (0.03)   0.44 (0.02)   0.58 (0.03)
0.95–0.97      BS     0.55 (0.03)   0.81 (0.02)   0.87 (0.04)   1.03 (0.05)   1.05 (0.04)   1.44 (0.06)
               SI     0.51 (0.04)   0.81 (0.02)   0.92 (0.05)   0.69 (0.04)   0.79 (0.03)   0.86 (0.04)
               SV     0.45 (0.03)   0.51 (0.02)   0.40 (0.02)   0.40 (0.02)   0.41 (0.02)   0.49 (0.02)
               SVSI   0.42 (0.03)   0.49 (0.02)   0.38 (0.02)   0.47 (0.03)   0.45 (0.02)   0.56 (0.02)
BS model. Clearly, the performance improvement is significant for each moneyness and maturity category in Panel A, from the BS to the SI, to the SV, and to the SVSI model. This pricing performance ranking of the four models can also be seen from the average percentage pricing errors, as given in the same table. Here, the SVSI model produces percentage pricing errors that are the lowest in magnitude. As an example, take OTM options with term-to-expiration of 30–60 days and with S/K ∈ (0.93, 0.95). In this category the BS, the SI, the SV, and the SVSI models respectively have average percentage pricing errors of −54.50%, −46.20%, −26.16%, and −18.85%. For long-term options with S/K ∈ (0.93, 0.95) and with S/K ∈ (0.95, 0.97), the SVSI model results in a percentage pricing error that is as low as 0.71% and 0.30%, respectively. For ATM calls, recall that the BS model's implied-volatility input is backed out from the (previous day's) short-term ATM options, which should give the BS model a relative advantage in pricing ATM options. In contrast, the implied spot variance for the other models is obtained by minimizing the sum of squared errors for all options of the previous day. Thus, for ATM options, one would expect the BS model to perform relatively better. As seen from Panel B of Table 37.3, except for the shortest-term ATM calls, the SVSI model typically generates the lowest absolute and percentage pricing errors (especially for longer-term options), followed by the SV, by the SI and finally by the BS model. For the shortest-term options with S/K ∈ (0.97, 0.99) and S/K ∈ (0.99, 1.01), the BS and the SI models perform somewhat better than the other two. Panel C of Table 37.3 reports the average absolute and percentage pricing errors of ITM calls by all four models. While the previous ranking of the models based on OTM and ATM options is preserved by Panel C, it can be noted that the average percentage pricing error is below 1.0% in magnitude for 12 out of the 18 categories in the case of the SVSI model, for 8 out of the 18 categories in the case of the SV model, for 3 categories out of 18 for the SI model, and for none of the 18 categories
Table 37.3 Out-of-sample pricing errors. Panel B: At-the-money options

Percentage pricing error, by term-to-expiration (days)
Moneyness S/K  Model  <30            30–60          60–90          90–120        120–180       ≥180
0.97–0.99      BS     −19.94 (1.03)  −10.16 (0.49)  −4.83 (0.58)   −2.66 (0.56)  −2.21 (0.30)  −1.42 (0.35)
               SI     −22.85 (1.16)  −11.38 (0.59)  −7.76 (0.70)   −1.69 (0.46)  −2.29 (0.30)  −1.76 (0.23)
               SV     −18.93 (1.01)  −8.37 (0.40)   −2.76 (0.37)   −0.44 (0.28)  −0.16 (0.17)  −0.04 (0.15)
               SVSI   −17.10 (1.05)  −7.74 (0.45)   −2.36 (0.39)   −1.04 (0.37)  −0.42 (0.17)  −0.29 (0.15)
0.99–1.01      BS     −4.42 (0.47)   −1.29 (0.30)   1.14 (0.39)    1.44 (0.45)   1.06 (0.24)   0.86 (0.26)
               SI     −5.97 (0.57)   −2.28 (0.38)   −1.08 (0.48)   1.81 (0.39)   0.29 (0.25)   −0.63 (0.22)
               SV     −7.93 (0.47)   3.72 (0.26)    0.76 (0.27)    0.51 (0.25)   0.13 (0.14)   0.05 (0.13)
               SVSI   8.22 (0.49)    3.93 (0.29)    0.74 (0.30)    0.40 (0.31)   0.31 (0.15)   0.25 (0.15)
1.01–1.03      BS     2.43 (0.23)    3.01 (0.19)    4.15 (0.29)    3.76 (0.38)   3.36 (0.19)   2.47 (0.28)
               SI     1.65 (0.27)    2.29 (0.23)    2.08 (0.38)    3.52 (0.30)   1.76 (0.19)   0.34 (0.20)
               SV     0.99 (0.23)    0.69 (0.17)    0.46 (0.22)    1.04 (0.20)   0.38 (0.11)   0.21 (0.11)
               SVSI   1.49 (0.23)    1.16 (0.18)    0.12 (0.24)    0.27 (0.24)   0.19 (0.12)   0.55 (0.11)

Absolute pricing error, by term-to-expiration (days)
Moneyness S/K  Model  <30           30–60         60–90         90–120        120–180       ≥180
0.97–0.99      BS     0.51 (0.02)   0.66 (0.02)   0.63 (0.03)   0.77 (0.05)   0.82 (0.03)   1.18 (0.05)
               SI     0.53 (0.02)   0.68 (0.02)   0.71 (0.04)   0.58 (0.03)   0.67 (0.03)   0.74 (0.03)
               SV     0.50 (0.02)   0.52 (0.02)   0.37 (0.02)   0.37 (0.02)   0.40 (0.02)   0.48 (0.02)
               SVSI   0.49 (0.02)   0.53 (0.02)   0.38 (0.02)   0.46 (0.03)   0.44 (0.02)   0.52 (0.02)
0.99–1.01      BS     0.44 (0.01)   0.57 (0.02)   0.54 (0.03)   0.74 (0.04)   0.77 (0.03)   0.97 (0.04)
               SI     0.50 (0.02)   0.63 (0.02)   0.59 (0.04)   0.68 (0.05)   0.70 (0.03)   0.75 (0.04)
               SV     0.50 (0.02)   0.50 (0.02)   0.38 (0.02)   0.43 (0.02)   0.42 (0.02)   0.51 (0.02)
               SVSI   0.51 (0.02)   0.55 (0.02)   0.42 (0.03)   0.51 (0.03)   0.46 (0.02)   0.59 (0.03)
1.01–1.03      BS     0.46 (0.01)   0.63 (0.02)   0.77 (0.04)   0.96 (0.05)   1.01 (0.03)   1.42 (0.05)
               SI     0.47 (0.02)   0.65 (0.02)   0.70 (0.04)   0.88 (0.05)   0.76 (0.03)   0.81 (0.04)
               SV     0.39 (0.01)   0.42 (0.02)   0.37 (0.02)   0.44 (0.03)   0.42 (0.02)   0.49 (0.02)
               SVSI   0.41 (0.02)   0.46 (0.02)   0.39 (0.03)   0.48 (0.03)   0.46 (0.02)   0.54 (0.02)
in the case of the BS model. The pricing improvement by the SV and the SVSI models over the BS and the SI is quite substantial for ITM options, especially for long-term options. Some patterns of mispricing can, however, be noted across all moneyness-maturity categories. First, all four models produce negative percentage pricing errors for options with moneyness S/K ≤ 0.99, and positive percentage pricing errors for options with S/K ≥ 1.03, provided their time-to-expiration does not exceed 120 days. This means that the models systematically overprice OTM call options while underpricing ITM calls. But the magnitude of such mispricing varies dramatically across the models, with the BS producing the strongest and the SVSI model the weakest systematic biases. Next, according to the absolute pricing error measure, the SV model seems to perform slightly better than the SVSI in pricing calls with more than 90 days to expiration. This pattern is, however, not supported by the percentage pricing errors reported in Table 37.3, possibly because for these relatively long-term calls the two models produce pricing errors that have mixed signs, in which case taking the average absolute value of the pricing errors can sometimes distort the picture. According to the percentage pricing errors, the SVSI model does slightly better than the SV in pricing those longer-term options. Finally, for the BS model, its absolute pricing error has a U-shaped relationship (i.e., a "smile") with moneyness, and the magnitude of its percentage pricing error increases as the call goes from deep in the money to deep out of the money, regardless of time to expiration. These patterns are reduced by each relaxation of the BS model assumptions.
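The two error measures tabulated in Table 37.3 can be summarized compactly. The sketch below is illustrative rather than the authors' code, and assumes each record already pairs a market price with its out-of-sample model price.

```python
def pricing_error_summary(records):
    """Average absolute and percentage pricing errors for one
    moneyness-maturity category; each record pairs a market price with a
    model price computed from the previous day's implied parameters."""
    n = len(records)
    abs_err = sum(abs(r["market"] - r["model"]) for r in records) / n
    pct_err = sum((r["market"] - r["model"]) / r["market"] for r in records) / n
    return {"absolute": abs_err, "percentage": pct_err}
```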
37.4.2 Dynamic Hedging Performance

Recall that in implementing a hedge using any of the four models, we follow three basic steps. First, based on Procedure B of Sect. 37.2.3, estimate the structural parameters and spot variance by using day 1's option prices. Next, on day 2, use the previous day's parameter and spot volatility estimates and the current day's spot price and interest rates, to construct the desired hedge as given in Sect. 37.2.2. Finally, rely on either equation (37.22) or equation (37.25) to calculate the
Table 37.3 Out-of-sample pricing errors. Panel C: In-the-money options

Percentage pricing error, by term-to-expiration (days)
Moneyness S/K  Model  <30           30–60         60–90         90–120        120–180       ≥180
1.03–1.05      BS     3.69 (0.13)   4.45 (0.14)   5.31 (0.24)   4.76 (0.29)   4.38 (0.17)   2.98 (0.24)
               SI     3.37 (0.15)   3.83 (0.18)   3.64 (0.28)   4.08 (0.25)   2.51 (0.17)   0.17 (0.19)
               SV     1.27 (0.13)   0.79 (0.13)   1.09 (0.17)   1.11 (0.15)   0.43 (0.10)   0.20 (0.11)
               SVSI   0.84 (0.13)   0.32 (0.14)   0.83 (0.19)   0.29 (0.20)   0.03 (0.10)   0.41 (0.11)
1.05–1.07      BS     3.37 (0.10)   4.54 (0.11)   5.57 (0.22)   5.08 (0.27)   4.82 (0.14)   4.27 (0.22)
               SI     3.28 (0.12)   4.02 (0.14)   4.08 (0.24)   4.47 (0.22)   2.65 (0.15)   0.59 (0.18)
               SV     1.82 (0.09)   1.41 (0.09)   1.47 (0.16)   1.44 (0.14)   0.54 (0.09)   0.40 (0.11)
               SVSI   1.59 (0.09)   1.12 (0.10)   1.35 (0.17)   0.83 (0.17)   0.17 (0.09)   0.52 (0.11)
>1.07          BS     1.79 (0.04)   2.65 (0.05)   2.96 (0.07)   3.10 (0.08)   3.36 (0.05)   2.61 (0.05)
               SI     1.86 (0.05)   2.50 (0.06)   2.14 (0.07)   2.45 (0.08)   1.63 (0.05)   0.76 (0.05)
               SV     1.36 (0.13)   1.33 (0.13)   1.06 (0.17)   0.92 (0.15)   0.45 (0.10)   0.64 (0.11)
               SVSI   1.29 (0.03)   1.26 (0.03)   1.18 (0.04)   0.81 (0.04)   0.40 (0.03)   0.37 (0.03)

Absolute pricing error, by term-to-expiration (days)
Moneyness S/K  Model  <30           30–60         60–90         90–120        120–180       ≥180
1.03–1.05      BS     0.59 (0.02)   0.85 (0.03)   1.14 (0.05)   1.20 (0.07)   1.29 (0.04)   1.40 (0.06)
               SI     0.57 (0.02)   0.82 (0.03)   0.92 (0.05)   1.04 (0.06)   0.90 (0.04)   0.83 (0.04)
               SV     0.38 (0.01)   0.42 (0.02)   0.42 (0.03)   0.43 (0.03)   0.42 (0.02)   0.50 (0.02)
               SVSI   0.37 (0.01)   0.45 (0.02)   0.42 (0.03)   0.49 (0.03)   0.45 (0.02)   0.54 (0.03)
1.05–1.07      BS     0.70 (0.02)   1.06 (0.02)   1.46 (0.05)   1.47 (0.06)   1.56 (0.04)   1.73 (0.06)
               SI     0.69 (0.02)   0.97 (0.03)   1.13 (0.06)   1.29 (0.06)   0.99 (0.04)   0.88 (0.05)
               SV     0.45 (0.02)   0.46 (0.02)   0.53 (0.03)   0.50 (0.03)   0.45 (0.02)   0.57 (0.03)
               SVSI   0.42 (0.01)   0.46 (0.02)   0.52 (0.03)   0.51 (0.04)   0.44 (0.02)   0.58 (0.03)
>1.07          BS     0.60 (0.01)   0.95 (0.01)   1.22 (0.02)   1.35 (0.02)   1.56 (0.02)   1.58 (0.02)
               SI     0.59 (0.01)   0.89 (0.02)   0.88 (0.02)   1.04 (0.03)   0.86 (0.02)   1.09 (0.02)
               SV     0.50 (0.01)   0.55 (0.01)   0.52 (0.01)   0.49 (0.02)   0.42 (0.01)   0.64 (0.01)
               SVSI   0.46 (0.01)   0.55 (0.01)   0.58 (0.01)   0.52 (0.02)   0.44 (0.01)   0.57 (0.01)
hedging error as of day 3. We then compute both the average absolute and the average dollar hedging errors of all call options in a given moneyness-maturity category, to gauge the relative hedging performance of each model. It should be recognized that in both the delta-neutral and the minimum-variance hedging exercises conducted in the two subsections below, the spot S&P 500 index, rather than an S&P 500 futures contract, is used in place of the "spot asset" for the hedges devised in Sect. 37.2.2. This is done out of two considerations. First, the spot S&P 500 and the immediate-expiration-month S&P 500 futures price generally have a correlation coefficient close to one. This means that whether the spot index or the futures price is used in the hedging exercises, the qualitative as well as the quantitative conclusions are most likely the same. In other words, if it is demonstrated using the spot index that one model results in better hedging performance than another, the same hedging performance ranking of the two models will likely be achieved by using an S&P 500 futures contract. After all, our main interest here lies in the relative performance of the models. Second, when a futures contract is used in constructing a hedge, a futures pricing formula has to be adopted. That would introduce another dimension of model misspecification (due to stochastic interest rates), which can in turn produce a compounded effect on the hedging results. For these reasons, using the spot index may lead to a cleaner comparison among the four option models.
37.4.2.1 Effectiveness of Delta-Neutral Hedges

Observe that the construction and the execution of the hedging strategy in (37.22) require, in the cases of the SV and the SVSI models, (i) the availability of prices for four time-matched target and hedging-instrument options, C(t, τ; K), C(t, τ̄; K̄), C(t + Δt, τ − Δt; K), and C(t + Δt, τ̄ − Δt; K̄), and (ii) the computation of Δ_S, Δ_V and Δ_R for the target and the instrumental option. Due to this requirement, it is important to match as closely as possible the time points at which the target and the instrumental option prices were respectively taken, in order to ensure that the hedge ratios are properly determined. For this reason,
we use as hedging instruments only options whose prices on both the hedge construction day and the following liquidation day were quoted no more than 15 s apart from the times when the respective prices for the target option were quoted. This requirement makes the overall sample for the hedging exercise smaller than that used for the preceding pricing exercise, but it nonetheless guarantees that the deltas for the target and instrumental options on the same day are computed based on the same spot price. The remaining sample contains 15,041 matched pairs when hedge revision occurs at 1-day intervals, and 11,704 matched pairs when rebalancing takes place at 5-day intervals. In addition, we partition the target options into three maturity classes: less than 60 days, 60–180 days, and greater than 180 days, and report hedging results accordingly. In theory, a call option with any expiration date and any strike price can be chosen as a hedging instrument for any given target option. In practice, however, different choices can mean different hedging effectiveness, even for the same option pricing model. Out of this consideration, we employ as a hedging instrument the call option which has the same expiration date as the target option and whose strike price is the closest, but not identical, to the target option's. Table 37.4 presents delta-neutral hedging results for the four models. Several patterns emerge from this table. First, the BS model produces the worst hedging performance by most measures; the SI shows noticeable improvement according to the average dollar hedging errors (especially in the 5-day hedging revision categories) but not so according to the average absolute hedging errors; while the SV and the SVSI models have average absolute and average dollar hedging errors that are typically one-third of the corresponding BS hedging errors, or lower. The improvement by the SV and the SVSI is thus remarkable. Second, as the portfolio adjustment frequency decreases from daily to once every 5 days, hedging effectiveness deteriorates, regardless of the model used. The deterioration is especially apparent for OTM and ATM options with S/K ≤ 1.05. It should however be noted with emphasis that for both the SV and the SVSI models, hedging effectiveness is relatively stable, whether the hedges are rebalanced each day or once every 5 days. For the BS and the SI models, such a change in revision frequency can mean doubling their hedging errors. This finding is strong evidence in support of the SV and the SVSI models for hedging. Third, the BS model-based delta-neutral hedging strategy always overhedges a target call option, as its average dollar hedging error is negative for each moneyness-maturity category and at either frequency of portfolio rebalancing. In contrast, the dollar hedging errors based on the SV and the SVSI models are more random and can take either sign. Therefore, the BS formula has a systematic hedging bias pattern, whereas the SV and the SVSI do not.
Fourth, the SVSI model is indistinguishable from the SV according to their absolute hedging errors, but is slightly better than the latter when judged using their average dollar hedging errors. Similarly, the SI model has worse hedging performance than the BS according to their absolute hedging error values, but the reverse is true according to their dollar hedging errors. This phenomenon exists possibly because with stochastic interest rates there are larger hedging errors of opposite signs, so that when added together, these errors cancel out, but the sum of their absolute values is nonetheless large. Finally, no matter which model is used, there do not appear to be moneyness- or maturity-related bias patterns in the hedging errors. In other words, hedging errors do not seem to “smile” across exercise prices or times to expiration, as pricing errors do. This is a striking disparity between pricing and hedging results.
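A heavily simplified sketch of the mechanics behind Table 37.4 follows. It combines the instrument-matching rule described above with one revision interval of a delta-neutral hedge in the spirit of equation (37.22), which is not reproduced in this section; the dictionaries, field names, and deltas are assumed inputs, not the chapter's formulas.

```python
import math

def pick_instrument(target, candidates, max_gap_seconds=15.0):
    """Instrument-matching rule described above: same expiration as the
    target, nearest (but not identical) strike, quotes within 15 seconds."""
    best = None
    for c in candidates:
        if c["expiry"] != target["expiry"] or c["strike"] == target["strike"]:
            continue
        if abs(c["quote_seconds"] - target["quote_seconds"]) > max_gap_seconds:
            continue
        if best is None or (abs(c["strike"] - target["strike"]) <
                            abs(best["strike"] - target["strike"])):
            best = c
    return best  # None if no admissible instrument exists

def delta_neutral_hedge_error(tgt, ins, spot0, spot1, r, dt):
    """One revision interval of a two-option delta-neutral hedge: the
    instrument option neutralizes volatility risk, the spot position
    neutralizes the remaining price risk, and residual cash earns r."""
    x_ins = tgt["delta_V"] / ins["delta_V"]           # volatility-risk offset
    x_spot = tgt["delta_S"] - x_ins * ins["delta_S"]  # residual price-risk offset
    cash = tgt["price0"] - x_ins * ins["price0"] - x_spot * spot0
    replicating1 = (x_ins * ins["price1"] + x_spot * spot1
                    + cash * math.exp(r * dt))
    # hedging error = market price of the call minus the replicating portfolio
    return tgt["price1"] - replicating1
```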
37.4.2.2 Effectiveness of Single-Instrument Minimum-Variance Hedges

If one is, for reasons given before, constrained to using only the underlying stock to hedge a target call option, dimensions of uncertainty that move the target option value but are uncorrelated with the underlying stock price cannot be hedged by any position in the stock, and will necessarily be left uncontrolled in such a single-instrument minimum-variance hedge. Based on the sample option data, the average absolute and the average dollar hedging errors, with either a daily or a 5-day rebalancing frequency, are given in Table 37.5 for each of the four models and each of the moneyness-maturity categories. With this type of hedge, the relative performance of the models is no longer clear-cut. For OTM options with S/K ≤ 0.97, the SV model has, regardless of the hedging error measure used and the hedge revision frequency adopted, the lowest hedging errors, followed by the SVSI, then by the BS, and lastly by the SI model. For ATM options, the hedging performance by the BS and the SV models is almost indistinguishable, but still better, by a small margin, than that by both the SI and the SVSI models, whereas the latter two models' performance is also indistinguishable. Finally, for ITM options, the BS model has the best hedging performance, followed by the SV, the SVSI, and then by the SI model. Having said the above, it should nonetheless be noted that for virtually all cases in Table 37.5 the hedging error differences among the BS, the SV and the SVSI models are economically insignificant because of their low magnitude. Only the SI model's performance appears to be significantly poorer than the others'. The fact that the SI model performs worse than the BS and that the SVSI model performs slightly worse than the
Table 37.4 Delta-neutral hedging errors. Panel A: Out-of-the-money options. For each call option, calculate the hedging error, which is the difference between the market price of the call and that of the replicating portfolio. The average dollar hedging error and the average absolute hedging error are reported for each model. The standard errors are given in parentheses. The sample period is 06:1988–05:1991. In calculating the hedging errors generated with daily (once every 5 days) hedge rebalancing, 15,041 (11,704) observations are used

Dollar hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
<0.93          BS     NA            0.06 (0.03)   0.04 (0.02)  NA            0.33 (0.09)   0.21 (0.06)
               SI     NA            0.08 (0.02)   0.05 (0.02)  NA            0.40 (0.05)   0.36 (0.06)
               SV     NA            0.02 (0.02)   0.01 (0.02)  NA            0.03 (0.02)   0.03 (0.02)
               SVSI   NA            0.02 (0.02)   0.00 (0.02)  NA            0.03 (0.02)   0.00 (0.03)
0.93–0.95      BS     NA            0.06 (0.02)   0.01 (0.03)  NA            0.24 (0.07)   0.02 (0.07)
               SI     NA            0.08 (0.02)   0.00 (0.04)  NA            0.29 (0.05)   0.22 (0.07)
               SV     NA            0.01 (0.01)   0.01 (0.02)  NA            0.00 (0.01)   0.00 (0.03)
               SVSI   NA            0.01 (0.01)   0.01 (0.02)  NA            0.00 (0.01)   0.00 (0.03)
0.95–0.97      BS     0.08 (0.06)   0.06 (0.02)   0.01 (0.03)  0.55 (0.16)   0.21 (0.06)   0.12 (0.08)
               SI     0.06 (0.02)   0.06 (0.02)   0.06 (0.04)  0.22 (0.05)   0.31 (0.05)   0.34 (0.09)
               SV     0.03 (0.03)   0.01 (0.01)   0.01 (0.02)  0.03 (0.03)   0.00 (0.01)   0.02 (0.03)
               SVSI   0.02 (0.03)   0.01 (0.01)   0.01 (0.02)  0.02 (0.02)   0.01 (0.01)   0.01 (0.03)

Absolute hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
<0.93          BS     NA            0.37 (0.01)   0.45 (0.01)  NA            0.91 (0.06)   0.87 (0.04)
               SI     NA            0.35 (0.01)   0.40 (0.01)  NA            0.69 (0.03)   0.81 (0.04)
               SV     NA            0.15 (0.01)   0.14 (0.01)  NA            0.17 (0.02)   0.21 (0.02)
               SVSI   NA            0.16 (0.01)   0.14 (0.01)  NA            0.17 (0.01)   0.22 (0.02)
0.93–0.95      BS     NA            0.32 (0.01)   0.46 (0.02)  NA            0.79 (0.04)   0.82 (0.05)
               SI     NA            0.33 (0.01)   0.43 (0.02)  NA            0.67 (0.03)   0.66 (0.05)
               SV     NA            0.12 (0.01)   0.23 (0.01)  NA            0.13 (0.01)   0.18 (0.02)
               SVSI   NA            0.12 (0.01)   0.24 (0.02)  NA            0.13 (0.01)   0.18 (0.02)
0.95–0.97      BS     0.23 (0.04)   0.33 (0.02)   0.45 (0.03)  0.66 (0.13)   0.77 (0.04)   0.85 (0.05)
               SI     0.27 (0.01)   0.34 (0.01)   0.41 (0.02)  0.61 (0.03)   0.71 (0.03)   0.89 (0.06)
               SV     0.10 (0.02)   0.13 (0.01)   0.19 (0.01)  0.10 (0.02)   0.14 (0.01)   0.26 (0.02)
               SVSI   0.09 (0.02)   0.13 (0.01)   0.20 (0.01)  0.08 (0.02)   0.14 (0.01)   0.27 (0.02)
SV suggests that adding stochastic interest rates to the option pricing framework actually makes the single-instrument hedge's performance worse. This can be explained as follows. In the setup of the present paper, interest rate shocks are assumed to be independent of shocks to the stock price and/or to the stochastic volatility. Therefore, in the single-instrument minimum-variance hedges, no adjustment for interest rate risk is made to the optimal position taken in the underlying stock. The hedging results in Table 37.5 show that if interest rate risk is not to be controlled by any position in the hedging instrument, then it is perhaps better to design a single-instrument hedge based on an option model that assumes no interest rate risk. Assuming interest rate risk in an option pricing model and yet not controlling for this risk in a hedge can make the hedging effectiveness worse. In the case of the SV versus the BS model, the situation is somewhat different. As volatility shocks are assumed to be correlated with stock price shocks, the position to be taken in the underlying stock (i.e., the hedging instrument) needs to be adjusted relative to the BS model-determined hedge, so that this single position not only helps contain the underlying stock's price risk but also neutralizes that part of volatility risk which is related to stock price fluctuations [see equation (37.23)]. Thus, by rendering it possible to use the single hedging position to control for both stock price risk and volatility risk, introducing stochastic volatility into the BS framework helps improve the single-instrument hedging performance, albeit by a small amount. Nandi (1996) uses the remaining variance of a hedged position as a hedging effectiveness measure, according to which he finds that the SV model performs better than the BS model. Our single-instrument hedging results are hence consistent with his, regarding the SV versus the BS model.
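For concreteness, the variance-minimizing stock position described here (cf. equation (37.23), which is not reproduced in this section) can be sketched as follows; the signature and inputs are assumptions made for illustration.

```python
def min_variance_stock_position(delta_S, delta_V, rho, sigma_v, spot):
    """Single-instrument minimum-variance stock position under stochastic
    volatility: the ordinary delta plus a correction that offsets the part
    of volatility risk correlated with the stock price. With rho = 0 (as in
    the BS and SI cases) it collapses to the ordinary delta, which is why
    interest rate risk, being uncorrelated, is simply left unhedged."""
    return delta_S + (rho * sigma_v / spot) * delta_V
```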
Table 37.4 Delta-neutral hedging errors. Panel B: At-the-money options

Dollar hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
0.97–0.99      BS     0.01 (0.05)   0.04 (0.02)   0.04 (0.03)  0.36 (0.08)   0.11 (0.05)   0.21 (0.07)
               SI     0.08 (0.01)   0.05 (0.02)   0.07 (0.02)  0.30 (0.05)   0.25 (0.05)   0.44 (0.08)
               SV     0.03 (0.02)   0.01 (0.01)   0.00 (0.01)  0.04 (0.03)   0.00 (0.01)   0.01 (0.02)
               SVSI   0.02 (0.02)   0.01 (0.01)   0.00 (0.01)  0.02 (0.03)   0.00 (0.01)   0.01 (0.02)
0.99–1.01      BS     0.10 (0.01)   0.02 (0.02)   0.01 (0.03)  0.43 (0.09)   0.08 (0.05)   0.10 (0.07)
               SI     0.08 (0.02)   0.05 (0.02)   0.03 (0.03)  0.29 (0.05)   0.24 (0.05)   0.18 (0.08)
               SV     0.01 (0.02)   0.00 (0.01)   0.01 (0.01)  0.02 (0.02)   0.01 (0.01)   0.03 (0.02)
               SVSI   0.01 (0.02)   0.00 (0.01)   0.01 (0.01)  0.02 (0.02)   0.00 (0.01)   0.04 (0.02)
1.01–1.03      BS     0.09 (0.03)   0.02 (0.02)   0.01 (0.03)  0.40 (0.08)   0.11 (0.05)   0.09 (0.07)
               SI     0.09 (0.02)   0.05 (0.02)   0.07 (0.04)  0.30 (0.05)   0.25 (0.05)   0.27 (0.09)
               SV     0.00 (0.03)   0.00 (0.01)   0.03 (0.01)  0.01 (0.02)   0.01 (0.01)   0.05 (0.02)
               SVSI   0.00 (0.01)   0.00 (0.01)   0.03 (0.01)  0.00 (0.01)   0.01 (0.01)   0.05 (0.02)

Absolute hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
0.97–0.99      BS     0.34 (0.03)   0.34 (0.01)   0.46 (0.02)  0.54 (0.06)   0.75 (0.03)   0.89 (0.05)
               SI     0.29 (0.01)   0.36 (0.01)   0.41 (0.02)  0.73 (0.03)   0.70 (0.03)   0.79 (0.06)
               SV     0.12 (0.01)   0.13 (0.01)   0.17 (0.01)  0.14 (0.02)   0.14 (0.01)   0.23 (0.01)
               SVSI   0.12 (0.01)   0.13 (0.01)   0.17 (0.01)  0.13 (0.02)   0.14 (0.01)   0.24 (0.01)
0.99–1.01      BS     0.37 (0.01)   0.37 (0.01)   0.47 (0.02)  0.80 (0.06)   0.77 (0.03)   0.77 (0.05)
               SI     0.36 (0.01)   0.37 (0.01)   0.42 (0.02)  0.81 (0.03)   0.67 (0.03)   0.61 (0.05)
               SV     0.14 (0.01)   0.13 (0.01)   0.17 (0.01)  0.15 (0.02)   0.15 (0.01)   0.25 (0.02)
               SVSI   0.14 (0.03)   0.13 (0.01)   0.17 (0.01)  0.16 (0.02)   0.15 (0.01)   0.25 (0.01)
1.01–1.03      BS     0.40 (0.02)   0.39 (0.01)   0.46 (0.02)  0.82 (0.05)   0.75 (0.03)   0.82 (0.05)
               SI     0.38 (0.01)   0.36 (0.01)   0.43 (0.01)  0.75 (0.03)   0.65 (0.03)   0.71 (0.06)
               SV     0.13 (0.01)   0.14 (0.01)   0.17 (0.01)  0.13 (0.01)   0.17 (0.01)   0.24 (0.02)
               SVSI   0.14 (0.01)   0.14 (0.01)   0.17 (0.01)  0.13 (0.01)   0.17 (0.01)   0.24 (0.02)
It is useful to recall that all four models are implemented allowing both the spot volatility and the spot interest rates to vary from day to day, which is, except in the sole case of the SVSI model, not consistent with the models' assumptions. Given this practical ad hoc treatment, it may not come as a surprise that when only the underlying asset is used as the hedging instrument, the four models perform virtually indistinguishably, with the magnitude of their hedging error differences being generally small. If all four models were instead implemented in a manner consistent with their respective setups, the single-instrument hedges based on the SVSI model would surely perform the best. Comparing Tables 37.4 and 37.5, one can conclude that, for a given option model, the conventional delta-neutral hedges perform far better than their single-instrument counterparts, for every moneyness-maturity category. This is not surprising, as the former type of hedge involves more hedging instruments (except under the BS model).
37.4.3 Regression Analysis of Option Pricing and Hedging Errors

So far we have examined pricing and hedging performance according to option moneyness-maturity categories. The purpose was to see whether the errors have clear moneyness- and maturity-related biases. By appealing to a regression analysis, we can more rigorously study the association of the errors with factors that are either contract-specific or market condition-dependent. Fix an option pricing model, and let ε_n(t) denote the nth call option's percentage pricing error on day t. Then, run the regression below for the entire sample:

    ε_n(t) = β_0 + β_1 S(t)/K_n + β_2 τ_n + β_3 SPREAD_n(t) + β_4 SLOPE(t) + β_5 LAGVOL(t − 1) + η_n(t),        (37.29)
Table 37.4 Delta-neutral hedging errors. Panel C: In-the-money options

Dollar hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
1.03–1.05      BS     0.06 (0.02)   0.03 (0.02)   0.05 (0.03)  0.36 (0.05)   0.09 (0.04)   0.23 (0.08)
               SI     0.09 (0.02)   0.05 (0.02)   0.07 (0.04)  0.31 (0.06)   0.19 (0.05)   0.35 (0.09)
               SV     0.01 (0.03)   0.01 (0.01)   0.01 (0.02)  0.00 (0.01)   0.00 (0.01)   0.03 (0.03)
               SVSI   0.01 (0.01)   0.00 (0.01)   0.00 (0.02)  0.00 (0.01)   0.00 (0.01)   0.03 (0.02)
1.05–1.07      BS     0.05 (0.02)   0.02 (0.02)   0.06 (0.04)  0.35 (0.04)   0.06 (0.04)   0.22 (0.07)
               SI     0.07 (0.02)   0.04 (0.02)   0.11 (0.04)  0.26 (0.04)   0.12 (0.04)   0.56 (0.08)
               SV     0.00 (0.01)   0.00 (0.01)   0.01 (0.02)  0.05 (0.01)   0.02 (0.01)   0.00 (0.02)
               SVSI   0.00 (0.01)   0.00 (0.01)   0.00 (0.02)  0.03 (0.01)   0.02 (0.00)   0.00 (0.02)
>1.07          BS     0.04 (0.01)   0.03 (0.00)   0.02 (0.01)  0.15 (0.02)   0.07 (0.02)   0.10 (0.03)
               SI     0.05 (0.03)   0.04 (0.01)   0.03 (0.02)  0.18 (0.02)   0.08 (0.02)   0.21 (0.08)
               SV     0.01 (0.01)   0.01 (0.00)   0.00 (0.01)  0.03 (0.01)   0.01 (0.00)   0.01 (0.01)
               SVSI   0.00 (0.00)   0.00 (0.00)   0.00 (0.01)  0.01 (0.01)   0.00 (0.00)   0.01 (0.01)

Absolute hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
1.03–1.05      BS     0.40 (0.02)   0.38 (0.01)   0.47 (0.02)  0.70 (0.03)   0.69 (0.03)   0.90 (0.05)
               SI     0.39 (0.01)   0.36 (0.01)   0.41 (0.02)  0.65 (0.03)   0.61 (0.03)   0.81 (0.06)
               SV     0.15 (0.01)   0.13 (0.01)   0.15 (0.01)  0.17 (0.01)   0.15 (0.01)   0.24 (0.02)
               SVSI   0.15 (0.01)   0.12 (0.01)   0.16 (0.01)  0.16 (0.01)   0.14 (0.01)   0.25 (0.01)
1.05–1.07      BS     0.41 (0.01)   0.40 (0.01)   0.47 (0.02)  0.68 (0.02)   0.64 (0.03)   0.77 (0.04)
               SI     0.40 (0.01)   0.37 (0.02)   0.44 (0.02)  0.57 (0.03)   0.51 (0.03)   0.59 (0.01)
               SV     0.16 (0.01)   0.13 (0.01)   0.18 (0.01)  0.18 (0.01)   0.15 (0.01)   0.22 (0.02)
               SVSI   0.15 (0.01)   0.12 (0.01)   0.17 (0.01)  0.17 (0.01)   0.15 (0.01)   0.22 (0.01)
>1.07          BS     0.36 (0.01)   0.39 (0.01)   0.48 (0.01)  0.51 (0.01)   0.58 (0.01)   0.72 (0.02)
               SI     0.35 (0.01)   0.37 (0.01)   0.43 (0.01)  0.45 (0.01)   0.51 (0.01)   0.66 (0.02)
               SV     0.15 (0.01)   0.14 (0.01)   0.20 (0.01)  0.17 (0.00)   0.18 (0.00)   0.27 (0.00)
               SVSI   0.15 (0.00)   0.14 (0.00)   0.20 (0.00)  0.16 (0.00)   0.17 (0.00)   0.27 (0.01)
where K_n is the strike price of the call, τ_n the remaining time to expiration, and SPREAD_n(t) the percentage bid-ask spread at date t of the call (constructed as (Ask − Bid)/[0.5(Ask + Bid)]), all of which are contract-specific variables. The variable LAGVOL(t − 1) is the (annualized) standard deviation of the previous day's intraday S&P 500 returns computed over 5-min intervals, and it is included in the regression to see whether the previous day's volatility of the underlying may cause systematic pricing biases. The variable SLOPE(t) represents the yield differential between 1-year and 30-day Treasury bills. This variable can provide information on whether the single-factor Cox-Ingersoll-Ross (1985) term structure model assumed in the present paper is sufficient to make the resulting option formula capture all term structure-related effects on the S&P 500 index options. In some sense, the contract-specific variables help detect the existence of cross-sectional pricing biases, whereas LAGVOL(t − 1) and SLOPE(t) serve to indicate whether the pricing errors over time are related to the dynamically changing market conditions. Similar regression analyses have been
done for the BS pricing errors in, for example, Galai (1983b), George and Longstaff (1993), and Madan et al. (1998). For each given option model, the same regression as in (37.29) is also run for the conventional delta-neutral hedging errors, with ε_n(t) in (37.29) replaced by the dollar hedging error for the nth option on day t. Table 37.6 reports the regression results based on the entire sample period, where the standard error for each coefficient estimate is adjusted according to the White (1980) heteroskedasticity-consistent estimator and is given in parentheses. Let's first examine the pricing error regressions. For every option model, each independent variable has statistically significant explanatory power for the remaining pricing errors. That is, the pricing errors from each model have some moneyness, maturity, intraday volatility, bid-ask spread, and term structure related biases. The magnitude of each such bias, however, decreases from the BS to the SI, to the SV, and to the SVSI model. For instance, the BS percentage pricing errors will on average be 2.29 points higher when the yield spread SLOPE(t) increases by one point, whereas
Table 37.5 Single-instrument hedging errors. Panel A: Out-of-the-money options. For each call option, calculate the hedging error, which is the difference between the market price of the call and that of the replicating portfolio. The average dollar hedging error and the average absolute hedging error are reported for each model. The standard errors are shown in parentheses. The sample period is 06:1988–05:1991. In calculating the hedging errors generated with daily (once every 5 days) hedge rebalancing, 15,041 (11,704) observations are used

Dollar hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
<0.93          BS     NA            0.06 (0.03)   0.04 (0.02)  NA            0.33 (0.09)   0.21 (0.06)
               SI     NA            0.09 (0.05)   0.06 (0.04)  NA            0.51 (0.17)   0.45 (0.08)
               SV     NA            0.02 (0.03)   0.05 (0.02)  NA            0.03 (0.08)   0.09 (0.05)
               SVSI   NA            0.02 (0.03)   0.04 (0.03)  NA            0.11 (0.08)   0.13 (0.05)
0.93–0.95      BS     NA            0.06 (0.02)   0.01 (0.03)  NA            0.24 (0.07)   0.02 (0.07)
               SI     NA            0.10 (0.03)   0.00 (0.04)  NA            0.36 (0.10)   0.38 (0.08)
               SV     NA            0.06 (0.02)   0.00 (0.03)  NA            0.14 (0.07)   0.00 (0.06)
               SVSI   NA            0.04 (0.03)   0.00 (0.03)  NA            0.17 (0.07)   0.17 (0.07)
0.95–0.97      BS     0.08 (0.06)   0.06 (0.02)   0.01 (0.03)  0.55 (0.16)   0.21 (0.06)   0.12 (0.08)
               SI     0.04 (0.11)   0.09 (0.03)   0.04 (0.05)  0.44 (0.05)   0.38 (0.09)   0.42 (0.10)
               SV     0.03 (0.06)   0.05 (0.02)   0.01 (0.03)  0.06 (0.16)   0.14 (0.06)   0.16 (0.07)
               SVSI   0.04 (0.01)   0.02 (0.02)   0.00 (0.03)  0.21 (0.03)   0.14 (0.04)   0.23 (0.06)

Absolute hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
<0.93          BS     NA            0.37 (0.01)   0.45 (0.01)  NA            0.91 (0.06)   0.87 (0.04)
               SI     NA            0.49 (0.03)   0.52 (0.03)  NA            1.33 (0.10)   0.91 (0.05)
               SV     NA            0.30 (0.02)   0.39 (0.02)  NA            0.75 (0.05)   0.75 (0.04)
               SVSI   NA            0.35 (0.03)   0.43 (0.02)  NA            0.76 (0.05)   0.82 (0.04)
0.93–0.95      BS     NA            0.32 (0.01)   0.46 (0.02)  NA            0.79 (0.04)   0.82 (0.05)
               SI     NA            0.35 (0.02)   0.52 (0.03)  NA            0.97 (0.06)   0.72 (0.06)
               SV     NA            0.33 (0.02)   0.43 (0.02)  NA            0.79 (0.04)   0.73 (0.04)
               SVSI   NA            0.36 (0.02)   0.47 (0.02)  NA            0.82 (0.04)   0.78 (0.04)
0.95–0.97      BS     0.23 (0.04)   0.33 (0.02)   0.45 (0.03)  0.66 (0.13)   0.77 (0.04)   0.85 (0.05)
               SI     0.33 (0.01)   0.39 (0.01)   0.47 (0.03)  0.74 (0.09)   0.94 (0.06)   0.92 (0.07)
               SV     0.21 (0.04)   0.32 (0.02)   0.42 (0.02)  0.48 (0.09)   0.77 (0.04)   0.74 (0.04)
               SVSI   0.22 (0.03)   0.35 (0.01)   0.46 (0.02)  0.65 (0.11)   0.83 (0.04)   0.79 (0.04)
the SV and the SVSI percentage errors will only be, respectively, 0.32 and 0.34 points higher in response. Thus, a higher yield spread on the term structure means higher pricing errors, regardless of the option model used. This points to a possible direction for further improving pricing performance: include the yield spread as a second factor in the term structure model of interest rates. Other noticeable patterns include the following. The BS pricing errors are decreasing, while the SI, the SV and the SVSI pricing errors are increasing, in both the option's time-to-expiration and the underlying stock's volatility on the previous day. The deeper in-the-money the call or the wider its bid-ask spread, the lower the SI's, the SV's and the SVSI model's mispricing. But, for the BS model, its mispricing increases with moneyness and decreases with bid-ask spread. Even though all four models' pricing errors are significantly related to each independent variable, the collective explanatory power of these variables is not so impressive. The adjusted R² is 29% for the BS formula's pricing errors, 22% for the SI's, 12% for the SV's, and 7% for the SVSI model's. Therefore, while both the BS and the SI model have significant overall biases related to contract terms and market conditions (indicating systematic model misspecification), the remaining pricing errors under the SV and the SVSI are not as significantly associated with these variables. About 93% of the variation in the SVSI model's pricing errors cannot be explained by these variables! As reported in Table 37.6, delta-neutral hedging errors by the BS and the SI model tend to increase with the moneyness and the bid-ask spread of the target call, but decrease with the non-contract-specific yield spread and lagged stock volatility variables. Therefore, the two models are misspecified for hedging purposes, and they lead to systematic hedging biases. But, overall, these variables can explain only 1%
Table 37.5 Single-instrument hedging errors. Panel B: At-the-money options

Dollar hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
0.97–0.99      BS     0.01 (0.05)   0.04 (0.02)   0.04 (0.03)  0.36 (0.08)   0.11 (0.05)   0.21 (0.07)
               SI     0.00 (0.06)   0.06 (0.03)   0.06 (0.04)  0.34 (0.05)   0.11 (0.09)   0.42 (0.08)
               SV     0.01 (0.06)   0.05 (0.02)   0.06 (0.03)  0.20 (0.12)   0.11 (0.06)   0.19 (0.07)
               SVSI   0.03 (0.06)   0.05 (0.02)   0.03 (0.03)  0.15 (0.11)   0.16 (0.06)   0.27 (0.07)
0.99–1.01      BS     0.10 (0.01)   0.02 (0.02)   0.01 (0.03)  0.43 (0.09)   0.08 (0.05)   0.10 (0.07)
               SI     0.15 (0.05)   0.04 (0.03)   0.03 (0.05)  0.34 (0.15)   0.18 (0.07)   0.31 (0.09)
               SV     0.12 (0.04)   0.02 (0.02)   0.00 (0.03)  0.22 (0.11)   0.12 (0.05)   0.06 (0.06)
               SVSI   0.04 (0.01)   0.01 (0.02)   0.00 (0.03)  0.27 (0.11)   0.13 (0.05)   0.15 (0.06)
1.01–1.03      BS     0.09 (0.03)   0.03 (0.02)   0.01 (0.03)  0.40 (0.08)   0.11 (0.05)   0.09 (0.07)
               SI     0.10 (0.04)   0.05 (0.03)   0.04 (0.05)  0.43 (0.11)   0.16 (0.07)   0.34 (0.10)
               SV     0.06 (0.03)   0.04 (0.02)   0.00 (0.03)  0.21 (0.09)   0.15 (0.05)   0.03 (0.06)
               SVSI   0.06 (0.03)   0.04 (0.02)   0.01 (0.03)  0.29 (0.08)   0.18 (0.05)   0.17 (0.06)

Absolute hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
0.97–0.99      BS     0.34 (0.03)   0.34 (0.01)   0.46 (0.02)  0.54 (0.06)   0.75 (0.03)   0.89 (0.05)
               SI     0.34 (0.05)   0.40 (0.02)   0.51 (0.03)  0.82 (0.09)   0.90 (0.06)   0.82 (0.06)
               SV     0.35 (0.04)   0.35 (0.01)   0.44 (0.02)  0.56 (0.07)   0.81 (0.03)   0.84 (0.04)
               SVSI   0.36 (0.04)   0.37 (0.01)   0.45 (0.02)  0.59 (0.07)   0.86 (0.03)   0.89 (0.05)
0.99–1.01      BS     0.37 (0.01)   0.37 (0.01)   0.47 (0.02)  0.80 (0.06)   0.77 (0.03)   0.77 (0.05)
               SI     0.39 (0.03)   0.41 (0.02)   0.55 (0.03)  0.77 (0.09)   0.82 (0.05)   0.78 (0.06)
               SV     0.38 (0.03)   0.37 (0.01)   0.45 (0.02)  0.78 (0.06)   0.79 (0.03)   0.69 (0.04)
               SVSI   0.38 (0.03)   0.40 (0.01)   0.47 (0.02)  0.88 (0.07)   0.89 (0.04)   0.80 (0.04)
1.01–1.03      BS     0.40 (0.02)   0.39 (0.01)   0.46 (0.02)  0.82 (0.05)   0.75 (0.03)   0.82 (0.05)
               SI     0.41 (0.02)   0.41 (0.01)   0.51 (0.03)  0.86 (0.07)   0.83 (0.05)   0.89 (0.07)
               SV     0.38 (0.02)   0.39 (0.02)   0.43 (0.02)  0.84 (0.05)   0.81 (0.03)   0.72 (0.04)
               SVSI   0.41 (0.02)   0.42 (0.01)   0.45 (0.02)  0.91 (0.05)   0.85 (0.03)   0.76 (0.04)
of the hedging errors by the two models. And, even more impressively, none of the included independent variables can explain any of the remaining hedging errors by the SV and the SVSI model, as their R2 values are both zero. Finally, when the dollar pricing errors are used to replace the percentage pricing errors or when the percentage hedging errors are employed to replace the dollar hedging errors in the above regressions, the sign of each resulting coefficient estimate and the magnitude of each R2 value in Table 37.6 remain unchanged. Thus, the conclusions drawn from Table 37.6 are independent of the choice of the pricing or hedging error measure. Results from these exercises are not reported here but available upon request.
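Regression (37.29) with White (1980) standard errors is straightforward to reproduce. A minimal sketch using statsmodels is shown below, assuming a data frame df whose columns match the regressors named in the text; the column and function names are illustrative.

```python
import statsmodels.api as sm

def error_regression(df):
    """OLS of pricing (or hedging) errors on contract- and market-specific
    variables, with White's heteroskedasticity-consistent standard errors."""
    X = sm.add_constant(df[["moneyness", "tau", "SPREAD", "SLOPE", "LAGVOL"]])
    fit = sm.OLS(df["error"], X).fit(cov_type="HC0")  # White (1980) estimator
    return fit.params, fit.bse, fit.rsquared_adj
```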
37.4.4 Robustness of Empirical Results

Using the entire sample period data, we have concluded that the evidence, based on both static and dynamic performance measures, is in favor of both the SVSI and the SV model. However, it is important to demonstrate that this conclusion still holds when alternative test designs and different sample periods are used. Below we briefly report results from two controlled experiments. According to Rubinstein (1985), the volatility smile pattern and the nature of pricing biases are time period-dependent. To see whether our conclusion may be reversed, we separately examined the pricing and hedging performance of the models in three sub-periods: 06:1988–05:1989, 06:1989–05:1990, and 06:1990–05:1991. Each sub-period contains about 10,000 call option observations. As the results are similar for each subperiod, we provide the percentage pricing errors in Panel A and the absolute delta-neutral hedging errors in Panel B of Table 37.7, for the subperiod 06:1990–05:1991. These results are qualitatively the same as those in Tables 37.3 and 37.4. We also examined the pricing and hedging error measures of each model when the structural parameters were not updated daily. Rather, we retain the structural parameter values estimated
Table 37.5 Single-instrument hedging errors. Panel C: In-the-money options

Dollar hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
1.03–1.05      BS     0.06 (0.02)   0.03 (0.02)   0.05 (0.03)  0.36 (0.05)   0.09 (0.04)   0.23 (0.08)
               SI     0.08 (0.03)   0.03 (0.03)   0.04 (0.05)  0.47 (0.08)   0.11 (0.06)   0.52 (0.09)
               SV     0.04 (0.03)   0.02 (0.02)   0.07 (0.04)  0.23 (0.06)   0.09 (0.05)   0.23 (0.08)
               SVSI   0.05 (0.02)   0.02 (0.02)   0.04 (0.03)  0.27 (0.05)   0.14 (0.05)   0.31 (0.08)
1.05–1.07      BS     0.05 (0.02)   0.02 (0.02)   0.06 (0.04)  0.35 (0.04)   0.06 (0.04)   0.22 (0.07)
               SI     0.07 (0.03)   0.04 (0.03)   0.07 (0.05)  0.37 (0.05)   0.09 (0.06)   0.55 (0.08)
               SV     0.06 (0.02)   0.03 (0.02)   0.05 (0.04)  0.32 (0.04)   0.09 (0.04)   0.10 (0.07)
               SVSI   0.05 (0.02)   0.02 (0.02)   0.02 (0.04)  0.31 (0.04)   0.09 (0.04)   0.09 (0.07)
>1.07          BS     0.04 (0.01)   0.03 (0.00)   0.02 (0.01)  0.15 (0.02)   0.07 (0.02)   0.10 (0.03)
               SI     0.05 (0.01)   0.04 (0.01)   0.04 (0.02)  0.17 (0.03)   0.07 (0.03)   0.26 (0.03)
               SV     0.04 (0.04)   0.04 (0.02)   0.02 (0.02)  0.18 (0.02)   0.14 (0.02)   0.09 (0.02)
               SVSI   0.04 (0.01)   0.03 (0.01)   0.01 (0.01)  0.18 (0.01)   0.14 (0.02)   0.10 (0.02)

Absolute hedging error, by term-to-expiration (days)
                      1-day revision                           5-day revision
Moneyness S/K  Model  <60           60–180        >180         <60           60–180        >180
1.03–1.05      BS     0.40 (0.02)   0.38 (0.01)   0.47 (0.02)  0.70 (0.03)   0.69 (0.03)   0.90 (0.05)
               SI     0.43 (0.02)   0.40 (0.02)   0.48 (0.03)  0.84 (0.05)   0.77 (0.04)   0.97 (0.07)
               SV     0.41 (0.02)   0.39 (0.01)   0.46 (0.02)  0.70 (0.03)   0.76 (0.03)   0.88 (0.05)
               SVSI   0.41 (0.01)   0.40 (0.01)   0.47 (0.02)  0.77 (0.03)   0.80 (0.03)   0.90 (0.05)
1.05–1.07      BS     0.41 (0.01)   0.40 (0.01)   0.47 (0.02)  0.68 (0.02)   0.64 (0.03)   0.77 (0.04)
               SI     0.45 (0.02)   0.42 (0.02)   0.50 (0.03)  0.74 (0.04)   0.66 (0.04)   0.81 (0.06)
               SV     0.43 (0.02)   0.40 (0.01)   0.44 (0.03)  0.71 (0.02)   0.68 (0.03)   0.69 (0.04)
               SVSI   0.42 (0.01)   0.43 (0.01)   0.46 (0.02)  0.74 (0.03)   0.79 (0.03)   0.75 (0.05)
>1.07          BS     0.36 (0.01)   0.39 (0.01)   0.48 (0.01)  0.51 (0.01)   0.58 (0.01)   0.72 (0.02)
               SI     0.40 (0.01)   0.41 (0.01)   0.47 (0.01)  0.61 (0.02)   0.66 (0.02)   0.79 (0.02)
               SV     0.35 (0.01)   0.39 (0.01)   0.44 (0.01)  0.50 (0.01)   0.58 (0.01)   0.64 (0.02)
               SVSI   0.36 (0.01)   0.41 (0.01)   0.46 (0.01)  0.50 (0.01)   0.63 (0.01)   0.70 (0.02)
Table 37.6 Regression analysis of pricing and hedging errors. The regression results below are based on the equation:
ε_n(t) = β_0 + β_1 S(t)/K_n + β_2 τ_n + β_3 SPREAD_n(t) + β_4 SLOPE(t) + β_5 LAGVOL(t − 1) + η_n(t),
where ε_n(t) denotes either the percentage pricing error or the dollar hedging error of the nth call on date t; S/K_n and τ_n respectively represent the moneyness and the term-to-expiration of the option contract; the variable SPREAD_n(t) is the percentage bid-ask spread; SLOPE(t) the yield differential between the 1-year and the 30-day Treasury bill rates; and LAGVOL(t − 1) the previous day's (annualized) standard deviation of S&P 500 index returns computed from 5-min intradaily returns. The standard errors, reported in parentheses, are based on White's (1980) heteroskedasticity-consistent estimator. The sample period is 06:1988–05:1991 for a total of 38,749 observations

              Percentage pricing errors                            Hedging errors
Coefficient   BS           SI           SV           SVSI          BS           SI           SV           SVSI
Constant      0.05 (0.03)  0.28 (0.03)  0.24 (0.02)  0.11 (0.02)   0.41 (0.11)  0.30 (0.10)  0.00 (0.05)  0.03 (0.05)
S/K           0.22 (0.03)  0.18 (0.02)  0.20 (0.01)  0.09 (0.02)   0.34 (0.09)  0.29 (0.08)  0.00 (0.04)  0.03 (0.04)
τ             0.04 (0.01)  0.04 (0.01)  0.08 (0.01)  0.05 (0.01)   0.03 (0.02)  0.08 (0.02)  0.00 (0.01)  0.00 (0.01)
SPREAD        5.24 (0.12)  4.48 (0.11)  2.13 (0.07)  1.57 (0.08)   2.26 (0.48)  1.04 (0.32)  0.32 (0.23)  0.34 (0.23)
SLOPE         2.29 (0.16)  1.33 (0.13)  0.32 (0.08)  0.34 (0.11)   2.09 (0.58)  2.01 (0.65)  0.39 (0.26)  0.39 (0.25)
LAGVOL        0.16 (0.02)  0.12 (0.02)  0.06 (0.01)  0.04 (0.01)   0.31 (0.07)  0.51 (0.05)  0.06 (0.02)  0.05 (0.02)
Adj. R²       0.29         0.22         0.12         0.07          0.01         0.01         0.00         0.00
Table 37.7 Robustness analysis. Panel A: Percentage pricing errors, 06:1990–05:1991. The reported percentage pricing error is the sample average of the market price minus the model price, divided by the market price. The sample period is 06:1990–05:1991 for a total of 11,979 call options

                      Term-to-expiration (days)
Moneyness S/K  Model  <30      30–60    60–90    90–120   120–180  ≥180
<0.93          BS     76.51    96.75    74.73    78.71    61.36    46.83
               SI     26.88    62.89    38.89    34.40    18.68    5.14
               SV     25.05    35.80    12.59    1.25     1.64     8.51
               SVSI   20.56    31.82    8.15     3.00     0.58     3.57
0.93–0.95      BS     54.99    59.06    31.97    24.29    19.28    12.36
               SI     46.25    46.77    28.99    10.62    9.55     5.37
               SV     26.57    25.32    8.57     1.60     0.33     1.50
               SVSI   28.25    21.71    6.26     2.18     0.04     0.79
0.95–0.97      BS     34.72    29.97    16.71    12.74    10.27    7.33
               SI     31.85    24.18    16.79    4.92     5.35     4.49
               SV     20.09    13.89    5.54     1.68     0.56     0.61
               SVSI   15.83    13.08    4.20     3.29     0.91     0.18
0.97–0.99      BS     15.93    10.36    5.84     3.22     3.46     2.14
               SI     15.27    7.45     7.22     2.36     2.05     1.24
               SV     13.09    7.04     3.64     0.77     0.47     0.15
               SVSI   12.38    7.49     3.37     0.90     0.83     0.33
0.99–1.01      BS     3.92     1.23     0.62     1.54     0.65     1.99
               SI     3.24     0.09     0.87     5.22     0.38     0.09
               SV     6.69     3.43     1.23     1.22     0.10     0.14
               SVSI   7.46     4.17     1.49     0.33     0.48     0.27
1.01–1.03      BS     2.48     3.36     4.17     4.05     3.28     2.93
               SI     2.73     3.75     2.78     5.57     1.82     0.60
               SV     0.92     0.56     0.24     1.44     0.19     0.24
               SVSI   1.41     1.02     0.09     0.59     0.16     0.53
1.03–1.05      BS     3.93     4.86     5.35     5.57     4.62     3.41
               SI     3.95     4.92     4.08     6.62     2.61     0.08
               SV     1.21     0.74     0.82     1.87     0.39     0.17
               SVSI   0.85     0.19     0.48     0.79     0.08     0.40
1.05–1.07      BS     3.69     5.07     5.84     6.36     5.12     4.93
               SI     3.82     5.08     4.67     6.55     2.87     1.29
               SV     1.83     1.52     1.49     2.07     0.51     0.54
               SVSI   1.68     1.18     1.17     1.38     0.32     0.57
>1.07          BS     1.99     2.98     3.58     4.35     4.00     3.59
               SI     2.44     2.90     2.77     4.08     1.93     1.01
               SV     1.43     1.40     1.21     1.47     0.45     0.69
               SVSI   1.38     1.30     1.18     1.18     0.46     0.40
Table 37.7 Robustness analysis. Panel B: Absolute hedging errors (1- and 5-day revision), 06:1990–05:1991. The average absolute hedging error for each model is reported based on the sub-sample period 06:1990–05:1991 (with a total of 6,440 observations)

                      1-day revision           5-day revision
                      Term-to-expiration (days)
Moneyness S/K  Model  <60     60–180  >180     <60     60–180  >180
<0.93          BS     NA      0.42    0.48     NA      1.13    0.99
               SI     NA      0.46    0.45     NA      0.77    0.82
               SV     NA      0.17    0.22     NA      0.18    0.30
               SVSI   NA      0.17    0.22     NA      0.18    0.30
0.93–0.95      BS     NA      0.40    0.50     NA      0.97    0.95
               SI     NA      0.45    0.47     NA      0.73    0.75
               SV     NA      0.13    0.25     NA      0.15    0.30
               SVSI   NA      0.13    0.25     NA      0.15    0.30
0.95–0.97      BS     NA      0.37    0.44     NA      0.96    0.85
               SI     NA      0.45    0.44     NA      0.77    0.86
               SV     NA      0.16    0.22     NA      0.16    0.29
               SVSI   NA      0.16    0.22     NA      0.16    0.29
0.97–0.99      BS     0.39    0.42    0.47     0.72    0.95    0.97
               SI     0.33    0.45    0.41     0.66    0.74    0.76
               SV     0.14    0.17    0.17     0.16    0.17    0.24
               SVSI   0.14    0.17    0.16     0.15    0.17    0.23
0.99–1.01      BS     0.41    0.43    0.50     0.99    0.91    0.89
               SI     0.40    0.48    0.50     0.79    0.71    0.78
               SV     0.16    0.16    0.17     0.20    0.17    0.28
               SVSI   0.16    0.16    0.17     0.19    0.17    0.26
1.01–1.03      BS     0.40    0.46    0.47     0.99    0.89    0.83
               SI     0.45    0.44    0.45     0.74    0.71    0.73
               SV     0.17    0.17    0.18     0.19    0.20    0.25
               SVSI   0.17    0.17    0.17     0.19    0.20    0.25
1.03–1.05      BS     0.45    0.43    0.50     0.88    0.85    0.97
               SI     0.46    0.44    0.48     0.71    0.72    0.68
               SV     0.17    0.14    0.17     0.18    0.16    0.27
               SVSI   0.17    0.14    0.17     0.18    0.16    0.27
1.05–1.07      BS     0.46    0.47    0.51     0.73    0.78    0.77
               SI     0.47    0.45    0.50     0.61    0.67    0.68
               SV     0.18    0.14    0.22     0.19    0.16    0.24
               SVSI   0.17    0.14    0.21     0.19    0.16    0.22
>1.07          BS     0.41    0.45    0.53     0.62    0.70    0.81
               SI     0.38    0.46    0.50     0.48    0.64    0.75
               SV     0.17    0.15    0.22     0.18    0.19    0.32
               SVSI   0.16    0.15    0.21     0.18    0.18    0.31
37 Option Pricing and Hedging Performance Under Stochastic Volatility and Stochastic Interest Rates
from the options of the first day of each month and then, for the remainder of the month, use them as input to compute the corresponding model-based price for each traded option, except that the implied spot volatility is updated each day based on the previous day’s option prices. The obtained absolute pricing errors for the subperiod 06:1990–05:1991 indicate that the performance ranking of the four models remains the same as before. In addition, when we used only ATM (or only ITM or only OTM) option prices to back out each model’s parameter values, the resulting pricing and hedging errors did not change the performance ranking of the models either. This means that even if one would estimate and use a matrix of implied volatilities (across moneynesses and maturities) to accordingly price and hedge options in different moneynessmaturity categories, it would still not change the fact that the SV and the SVSI models are better specified than the other two for pricing and hedging. Given that the implied-volatility matrix method has gained some popularity among practitioners, our results should be appealing. On the one hand, they suggest that with the SV and the SVSI models there is far less a need to engage in moneyness- and maturity-related fitting. On the other hand, if one is still interested in the matrix method, the SV and the SVSI models should be better model choices. Early in the project we used only option transaction price data for the pricing and hedging estimations. But, that meant a far smaller data set, especially for the hedging estimations. Nonetheless, the results obtained from the transaction prices were similar to these presented and discussed in this paper.
37.5 Conclusions We have developed and analyzed a simple option pricing model that admits both stochastic volatility and stochastic interest rates. It is shown that this closed-form pricing formula is practically implementable, leads to useful analytical hedge ratios, and contains many known option formulas as special cases. This last feature has made it relatively straightforward to conduct a comparative empirical study of the four classes of option pricing models. According to the pricing and hedging performance measures, the SVSI and the SV models both perform much better than the SI and the BS models, as the former typically reduce the pricing and hedging errors of the latter by more than a half. These error reductions are also economically significant. Furthermore, the hedging errors by the SV and the SVSI models are relatively insensitive to the frequency of portfolio revision, whereas those of the SI and the BS
571
models are sensitive. Given that both the SV and the SVSI models can be easily implemented on a personal computer, they should thus be better alternatives to the widely applied BS formula. A regression-based analysis of the pricing and hedging errors indicates that while the BS and the SI models show significant pricing biases related to moneyness, time-to-expiration, bid-ask spread, lagged stock volatility and interest rate term spread, pricing errors by the SV and the SVSI models are not as systematically related to either contract-specific or market-dependent variables. Overall, the results lend empirical support to the claim that incorporating stochastic interest rates and, especially, stochastic volatility, can both improve option pricing and hedging performance substantially and resolve some known empirical biases associated with the BS model. The empirical issues and questions addressed in this paper can also be re-examined using data from individual stock options, American-style index options, options on futures, currency and commodity options, and so on. Eventually, the acceptability of option pricing models with the added features will be judged not only by its easy implementability or even its impressive pricing and hedging performance as demonstrated in this paper using European-style index calls, but also by its success or failure in pricing and hedging other types of options. These extensions are left for future research. Acknowledgements We would like to thank Sanjiv Das, Ranjan D’Mello, Helyette Geman, Eric Ghysels, Frank Hatheway, Steward Hodges, Ravi Jagannathan, Andrew Karolyi, Bill Kracaw, C. F. Lee, Dilip Madan, Louis Scott, René Stulz, Stephen Taylor, Siegfried Trautmann, Alex Triantis, and Alan White for their helpful suggestions. Any remaining errors are our responsibility alone.
References Amin, K. and R. Jarrow. 1992. “Pricing options on risky assets in a stochastic interest rate economy.” Mathematical Finance 2, 217–237. Amin, K. and V. Ng. 1993. “Option valuation with systematic stochastic volatility.” Journal of Finance 48, 881–910. Andersen, T. and J. Lund. 1997. “Estimating continuous time stochastic volatility models of the short term interest rate.” Journal of Econometrics 77, 343–377. Bailey, W. and R. Stulz. 1989. “The pricing of stock index options in a general equilibrium model.” Journal of Financial and Quantitative Analysis 24, 1–12. Bakshi, G. and Z. Chen. 1997. “An alternative valuation model for contingent claims.” Journal of Financial Economics 44, 123–165. Bakshi, G., C. Cao, and Z. Chen. 1997. “Empirical performance of alternative option pricing models.” Journal of Finance 52, 2003–2049. Bakshi, G., C. Cao, and Z. Chen. 2000a. “Do call prices and the underlying stock always move in the same direction?” Review of Financial Studies 13, 549–584. Bakshi, G., C. Cao, and Z. Chen. 2000b. “Pricing and hedging longterm options.” Journal of Econometrics 94, 277–318.
572 Barone-Adesi, G. and R. Whaley. 1987. “Efficient analytic approximation of American option values.” Journal of Finance 42, 301–320. Bates, D. 1996a. “Testing option pricing models,” in Statistical methods in finance (Handbook of statistics, Vol. 14), G. S. Maddala and C. R. Rao (Eds.). Amsterdam: Elsevier, pp. 567–611. Bates, D. 1996b. “Jumps and stochastic volatility: exchange rate processes implicit in Deutschemark options.” Review of Financial Studies 9(1), 69–108. Bates, D. 2000. “Post-87 crash fears in S&P 500 futures options.” Journal of Econometrics 94, 181–238. Black, F. and M. Scholes. 1973. “ The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–659. Cao, C. and J. Huang. 2008. “Determinants of S&P 500 index option returns.” Review of Derivatives Research 10, 1–38. Chan, K., A. Karolyi, F. Longstaff, and A. Sanders. 1992. “An empirical comparison of alternative models of the short-term interest rate.” Journal of Finance 47, 1209–1227. Cox, J. and S. Ross. 1976. “The valuation of options for alternative stochastic processes.” Journal of Financial Economics 3, 145–166. Cox, J., J. Ingersoll, and S. Ross. 1985. “A theory of the term structure of interest rates.” Econometrica 53, 385–408. Day, T. and C. Lewis. 1997. “Initial margin policy and stochastic volatility in the crude oil futures market.” Review of Financial Studies 10, 303–332. Dumas, B., J. Fleming, R. Whaley. 1998. “Implied volatility functions: empirical tests.” Journal of Finance 53(6), 2059–2106. Figlewski, S. 1989. “Option arbitrage in imperfect markets.” Journal of Finance 44, 1289–1311. Galai, D. 1983a. “The components of the return from hedging options against stocks.” Journal of Business 56, 45–54. Galai, D. 1983b. “A survey of empirical tests of option pricing models,” in Option pricing, M. Brenner (Ed.). Lexington, MA: Heath, pp. 45–80. George, T. and F. Longstaff. 1993. “Bid-ask spreads and trading activity in the S&P 100 index options market.” Journal of Financial and Quantitative Analysis 28, 381–397. Hansen, L. 1982. “Large sample properties of generalized method of moments estimators.” Econometrica 50, 1029– 1054. Harrison, M. and D. Kreps. 1979. “Martingales and arbitrage in multiperiod securities markets.” Journal of Economic Theory 20, 381–408. Harvey, C. and R. Whaley. 1992a. “Market volatility and the efficiency of the S&P 100 index option market.” Journal of Financial Economics 31, 43–73. Harvey, C. and R. Whaley. 1992b. “Dividends and S&P 100 index option valuation.” Journal of Futures Markets 12, 123–137. Heston, S. 1993. “A closed-form solution for options with stochastic volatility with applications to bond and currency options.” Review of Financial Studies 6, 327–343. Hull, J. and A. White. 1987a. “The pricing of options with stochastic volatilities.” Journal of Finance 42, 281–300. Hull, J. and A. White. 1987b. “Hedging the risks from writing foreign currency options.” Journal of International Money and Finance 6, 131–152. Kim, I.-J. 1990. “The analytical valuation of American options.” Review of Financial Studies 3(4), 547–572. Longstaff, F. 1995. “Option pricing and the Martingale restriction.” Review of Financial Studies 8(4), 1091– 1124. Madan, D., P. Carr and E. Chang. 1998. The variance gamma process and option pricing, European Finance Review 2, 79–105. McBeth, J. and L. Merville. 1979. “An empirical examination of the Black–Scholes call option pricing model.” Journal of Finance 34, 1173–1186. Melino, A. and S. Turnbull. 1990. 
“Pricing foreign currency options with stochastic volatility.” Journal of Econometrics 45, 239–265.
G. Bakshi et al. Melino, A. and S. Turnbull. 1995. “Misspecification and the pricing and hedging of long-term foreign currency options.” Journal of International Money and Finance 45, 239–265. Merton, R. 1973. “Theory of rational option pricing.” Bell Journal of Economics 4, 141–183. Nandi, S. 1996. “Pricing and hedging index options under stochastic volatility.” Working Paper, Federal Reserve Bank of Atlanta. Ross, S. 1995. “Hedging long-run commitments: exercises in incomplete market pricing.” Working Paper, Yale School of Management. Rubinstein, M. 1985. “Nonparametric tests of alternative option pricing models using all reported trades and quotes on the 30 most active CBOE options classes from August 23, 1976 through August 31, 1978.” Journal of Finance 455–480. Rubinstein, M. 1994. “Implied binomial trees.” Journal of Finance 49, 771–818. Scott, L. 1987. “Option pricing when the variance changes randomly: theory, estimators, and applications.” Journal of Financial and Quantitative Analysis 22, 419–438. Scott, L., 1997. “Pricing stock options in a jump-diffusion model with stochastic volatility and interest rates: application of Fourier inversion methods.” Mathematical Finance 7, 413–426. Stein, E. and J. Stein. 1991.“Stock price distributions with stochastic volatility.”Review of Financial Studies 4, 727–752. Whaley, R. 1982. “Valuation of American call options on dividend paying stocks.” Journal of Financial Economics 10, 29–58. Wiggins, J. 1987. “Option values under stochastic volatilities.” Journal of Financial Economics 19, 351–372.
Appendix 37A Proof of the option pricing formula in (37.10). The valuation PDE in (37.9) can be re-written as:
@C 1 @2 C 1 @2 C V C C R V v 2 @L2 2 @L @L@V 1 @2 C @C C v2 V C Œv v V 2 2 @V @V @C @2 C @C 1 R C D 0; C R2 R 2 C ŒR R R 2 @R @R @ (37.30) where we have applied the transformation L.t/ lnŒS.t/ . Inserting the conjectured solution in (37.10) into (37.30) produces the PDEs for the risk-neutralized probabilities, ˘j for j D 1; 2:
@˘1 1 @2 ˘1 1 @2 ˘1 V C C R C V v 2 @L2 2 @L @L@V @2 ˘1 @˘1 1 C Œv .v v /V C v2 V 2 @V 2 @V @2 ˘1 @˘1 @˘1 1 D 0; C ŒR R R C R2 R 2 2 @R @R @ (37.31)
37 Option Pricing and Hedging Performance Under Stochastic Volatility and Stochastic Interest Rates
and
573
with the boundary condition:
1 @2 ˘2 1 @2 ˘2 @˘2 C R V V C v 2 @L2 2 @L @L@V
fj .t C ; 0I / D e i L.t C /
1 2 @2 ˘2 1 @2 ˘2 @˘2 C R C v2 V C Œ V v v 2 @V 2 @V 2 R @R2
@˘2 @˘2 R2 @B.t; / R D 0: C R R B.t; / @R @R @
˘j .t C ; 0/ D 1L.t C /K
j D 1; 2:
f1 .t; ; S.t/; V .t/; R.t/I / D exp fur ./ C uv ./ Cxr ./ R.t/ C xv ./ V .t/ Ci lnŒS.t/ g (37.37)
f2 .t; ; S.t/; V .t/; R.t/I / D exp fzr ./ C zv ./ Cyr ./ R.t/ C yv ./ V .t/
(37.33)
The corresponding characteristic functions for ˘1 and ˘2 will also satisfy similar PDEs:
@f1 1 @2 f1 1 @2 f1 V C C R C V v 2 @L2 2 @L @L@V 1 @2 f1 @f1 C v2 V C Œv .v v /V 2 2 @V @V @2 f1 @f1 @f1 1 D 0; C R2 R 2 C ŒR R R 2 @R @R @ (37.34) and
1 @2 f2 @f2 1 @2 f2 V C C R V v 2 @L2 2 @L @L@V 1 1 @2 f2 @f2 @2 f2 C v2 V C R2 R 2 C Œv v V 2 2 @V @V 2 @R
@f2 @f2 R2 @B.t; / R D 0; C R R B.t; / @R @R @
(37.36)
Conjecture that the solution to the PDEs (37.34) and (37.35) is respectively given by
(37.32) Observe that (37.31) and (37.32) are the Fokker-Planck forward equations for probability functions. This implies that ˘1 and ˘2 must indeed be valid probability functions, with values bounded between 0 and 1. These PDEs must be separately solved subject to the terminal condition:
j D 1; 2:
Ci lnŒS.t/ lnŒB.t; / g (37.38) with ur .0/ D uv .0/ D xr .0/ D xv .0/ D 0 and zr .0/ D zv .0/ D yr .0/ D yv .0/ D 0. Solving the resulting system of differential equations and noting that B.t C ; 0/ D 1 will respectively produce the desired characteristic functions in (37.12) and (37.13). Both the constant interest rate–stochastic volatility and constant volatility–stochastic interest rate option pricing models are nested in (37.10). In the constant interest rate– stochastic volatility model, for instance, the partial derivatives with respect to R vanishes in (37.30). The general solution in (37.37)–(37.38) will still apply except that now R.t/ D R (a constant), B.t; / D e R , xr ./ D i , yr ./ D .i 1/, and ur ./ D zr ./ D 0. The final characteristic functions fOj for the constant interest rate–stochastic volatility option model are respectively given by
(37.35)
Œv v C .1 C i /v .1 e v / v fO1 D exp i lnŒB.t; / 2 2 ln 1 v 2v
i .i C 1/.1 e v / v V .t/ ; (37.39) 2 Œv v C .1 C i /v / C i lnŒS.t/ C v 2v Œv v C .1 C i /v .1 e v /
574
G. Bakshi et al.
and, ( " !# v Œv v C i v .1 e v / O f2 D exp i lnŒB.t; / 2 2 ln 1 v 2v
) v i .i 1/.1 e v / 2 v v C i v C i lnŒS.t/ C V .t/ ; v 2v Œv v C i v .1 e v /
Similarly, the constant volatility–stochastic interest rate option model obtains with V .t/ D V (a constant), xv ./ D 1 1 2 i .1Ci /, yv ./ D 2 i .i 1/, and uv ./ D zv ./ D 0. The final characteristic functions fQj for the stochastic interest rate–constant volatility model are:
@2 C.t; / @˘1 D 2 @S @S Z 1 1 i d : > 0: D Re .i /1 e i lnŒK f1 0 S
S
1 fQ1 D exp i .1 C i /V C i lnŒS.t/ 2
R ŒR R .1 e R / 2 2 ln 1 C ŒR R 2R R 2i .1 e R / C R.t/ ; (37.41) 2R ŒR R .1 e R /
(37.43) V
Expressions for the gamma measures. The various secondorder partial derivatives of the call price in (37.10), which are commonly referred to as Gamma measures, are given below:
2
2
2
2
2
@ C.t; / @ ˘1 @ ˘2 D S.t/ KB.t; / 2 2 @V @V @V 2 (37.44)
R
and, 1 fQ2 D exp i .i 1/V C i lnŒS.t/ lnŒB.t; / 2 # " ! ŒR R .1 e R / R 2 2 ln 1 CŒR R 2R R ) 2.i 1/.1 e R / C R.t/ ; (37.42) 2R ŒR R .1 e R /
(37.40)
@ C.t; / @ ˘1 D S.t/ 2 @R @R2 2 @ ˘2 @˘2 2 C % KB.t; / 2%./ ./˘ 2 : @R2 @R (37.45)
@˘1 @2 C.t; / D @S @V @V Z 1 1 @f1 d : D Re .i /1 e i lnŒK 0 @V
S;V
(37.46)
where for g D V; R and j D 1; 2 @2 ˘j 1 D @g 2
Z
1 0
@2 fj d : Re .i /1 e i lnŒK @g 2 (37.47)
Chapter 38
Application of the Characteristic Function in Financial Research H.W. Chuang, Y.L. Hsu, and C.F. Lee
Abstract In this chapter we introduce the application of the characteristic function in financial research. We consider the technique of the characteristic function useful for many option pricing models. Two option pricing models are derived in details based on the characteristic functions. Keywords Characteristic function r Constant elasticity of variance r Option pricing r Stochastic volatility
38.2 The Characteristic Functions In probability theory, the characteristic function of any random variable completely defines its probability distribution. On the real line it is given by the following formula, where X is any random variable with the distribution in question: 'X .t/ D E.e itX / D E.cos.tX// C iE.sin.tX//
38.1 Introduction The characteristic function in nonprobabilistic contexts is called the Fourier transform. The characteristic function was used widely in applied physics (signal process, quantum mechanics). The technique of the characteristic function is also useful for determining the option prices. In Heston (1993), the importance of the characteristic function was demonstrated to find a closed-form solution for options with stochastic volatility. It was subsequently considered by many authors, including Bates (1996), Bakshi and Chen (1997), Scott (1997), Carr and Madan (1999), among others. In addition, we consider the constant elasticity of variance (CEV) option pricing model (see Cox 1996; Schroder 1989; and Hsu et al. 2008) and options with stochastic volatility (see Heston 1993) based on the characteristic functions.
Y.L. Hsu () Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taichung, Taiwan e-mail: [email protected]
(38.1)
where t is a real number, I is the imaginary unit, and E denotes the expected value. The characteristic function is thus defined as the moment generating function but with the real argument s replaced by it; it has the advantage that it always exists because e itx is bounded. If FX is the cumulative distribution function, then the characteristic function is given by the Riemann-Stieltjes integral Z1 itX E.e / D e itx dFX .x/: (38.2) 1
If there is a probability density function, fX , this becomes Z1 E.e / D itX
e itx fX .x/dx:
(38.3)
1
If X is a vector-valued random variable, one takes the argument t to be a vector and t0 X to be an inner product. The characteristic functions for the common distributions are given in Table 38.1. Besides, if F is a one-dimensional distribution function and f is corresponding characteristic function, then the cumulative distribution function FX and its corresponding probability density function .x/ D F 0 .x/ can be retrieved via:
H.W. Chuang Department of Finance, National Taiwan University, Taipei, Taiwan C.F. Lee Department of Finance, Rutgers University, New Brunswick, NJ, USA
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_38,
1
.x/ D 2
Z1 e itz f .t/dt
(38.4)
1
575
576
H.W. Chuang et al.
Table 38.1 The characteristic functions of the specific probability functions
Probability function
Distribution Normal Uniform Exponential Chi-Squared
Interval
Characteristic function
2 p1 e x =2 2
1 < x < 1
e t
1 e x
0<x<1 0<x<1 0<x<1
e it 1 it 1 1it
.1 2it/v=2
x D 0; 1; : : : ; n
.1 p C peit /n
x D 0; 1; : : :
e .e
x .v2/=2 e x=2 v .v=2/ 2!
n p x .1 p/nx x
Binomial
e x xŠ
Poisson
1 1 F .x/ D C 2 2
Z1
e itx f .t/ e itx f .t/ dt it
… D U S:
1 1 F .x/ D 2
Z1
e itx f .t/ dt: Re it
@U 1 @2 U @U @U 2 dt C dS C dS .dS/ @t @S 2 @S 2 @S
2 1@ U 2 ˇ @U C (38.9) S dt: D @t 2 @S 2
(38.6)
d… D
0
If F1 and F2 have respective characteristic functions '1 .t/ and '2 .t/, then the convolution of F1 and F2 has characteristic function '1 .t/'2 .t/. Although convolution is essential to the study of sums of independent random variables, it is a complicated operation, and it is often simpler to study the products of the corresponding characteristic functions. Every probability distribution on R or on Rn has a characteristic function, because one is integrating a bounded function over a space whose measure is finite, and for every characteristic function there is exactly one probability distribution.
Cox (1996) has derived the renowned CEV option pricing model. The CEV option pricing model assumes that the stock price is governed by the diffusion process dz;
ˇ<2
Since there is no diffusion term, the value of the arbitrage portfolio is certain. In order to preclude arbitrage, the payoff must equal …rdt; that is,
1 @2 U 2 ˇ @U @U C S rdt: dt D U S @t 2 @S 2 @S
(38.7)
where dz is a Wiener process and is a positive constant. If ˇ D 2, the stock prices are log-normally distributed as in Black–Scholes model (Black and Scholes 1973). In the Black–Scholes case, there is only one source of randomness – the stock price, which can be hedged with the stock. From the view of no-arbitrage, we can form a portfolio that grows in the risk-free rate. We then have the partial differential equation (PDE) subject to some boundary conditions. Similarly, we can get the option price derived from the CEV model by the way of Black-Scholes. Now, we set up a portfolio … containing the option being priced whose valueis denoted by U.S; t/, a quantity of . That is, the stock D @U @S
(38.10)
We have the PDE @U 1 @2 U 2 ˇ @U S rU D 0: C rS C @t @S 2 @S 2
38.3 CEV Option Pricing Model
dS D Sdt C S
(38.8)
The small change of the portfolio in a time interval dt is given by
ˇ=2
it 1/
(38.5)
0
or
2 =2
(38.11)
Hence, we have the derivative’s price by solving Equation (38.11) and giving the boundary conditions. If U.S; t/ is a call option, CT D max.ST K; 0/, then 8 @C 1 @2 C 2 ˇ < @C C rS C S rC D 0 @S 2 @S 2 : C@t D max.S T T K; 0/
(38.12)
where K is the strike price of the option. Now, we can simplify Equation (38.11) by rewriting it in terms of the logarithm of the spot price; that is, x D ln.S /, thus @C 1 @C @C 1 @2 C 1 @C D ; : D 2 2 2 @S @x S @S S @x S @x
(38.13)
We transform the problem into 8
1 2 @C 1 @2 C 2 x.ˇ2/ < @C C r C e rC D 0 @t 2 @x 2 @x 2 : xT CT D max.e K; 0/: (38.14)
38 Application of the Characteristic Function in Financial Research
577 j @2 fQj D 2 e C Ci x ; 2 @x j j @fQj @C D e C Ci x : @ @
By analogy with the Black–Scholes formula, we guess a solution of the form: Ct .x; t/ D e x P1 .x; t/ Ker.T t / P2 .x; t/
(38.15)
where the first term is the present value of the spot asset upon optimal exercise, and the second term is the present value of the strike-price payment. And we substitute the proposed solution Equation (38.15) into (38.14). Substituting the proposed solution into the PDE shows that both P1 .x; t/ and P2 .x; t/ are satisfied
lim Pj .x; tI lnŒK / D
t!0
1; if x T > lnŒK : 0; if xT lnŒK
@C1 1 D ri C 2 Œi e x.ˇ2/ ; @ 2 C1D0 D 0;
Pj .xt / D Pj .xT lnŒK jxt / 1 D 2
Z1
e i' lnŒK Q fj .xT;' jxt /d': (38.19) i'
C2D0 D 0:
1 2 @fQ2 1 @2 fQ2 2 x.ˇ2/ @fQ2 C r C e D 0; @ 2 @x 2 @x 2
(38.21)
subject to fQ2 .x D0 ; '/ D e i'xŒ D0 . j We guess the solution fQj .x D0 ; 'jx / D e C Ci'x . Thus, we have for j D 1, 2 j @fQj D i e C Ci x ; @x
(38.22)
(38.28)
1 C1 D ri C 2 Œi e x.ˇ2/ ; 2 1 C2 D ri 2 Œi C e x.ˇ2/ : 2
(38.29) (38.30)
Hence, we have fQj .x D0 ; 'jx / D e C Ci'x where C1 and C2 are given by above equations. Finally, we can found the analytical form of probability functions Pj ; that is, j
Pj .x D0 lnŒK jx / 1 1 D C 2
Using the above formula, we can shift Equation (38.14) in Fourier space and define D T t, we have
subject to fQ1 .x D0 ; '/ D e i'xŒ D0 ,
(38.27)
They are all first-order ordinary differential equations with constant coefficient, and they can be solved using a single integration:
1
1 2 @fQ1 1 @2 fQ1 2 x.ˇ2/ @fQ1 C rC C e D 0; (38.20) @ 2 @x 2 @x 2
(38.25) (38.26)
1 @C2 D ri 2 Œi C e x.ˇ2/ ; @ 2
(38.18)
We can apply the characteristic function of Pj .x; t/ D Pj .xT lnŒK jxt /. It can be taken the following form
(38.24)
Substituting the above Equations into (38.20) and (38.21), we have
1 2 @P1 1 @2 P1 2 x.ˇ2/ @P1 C rC C e D 0 (38.16) @t 2 @x 2 @x 2
1 2 @P2 1 @2 P2 2 x.ˇ2/ @P2 C r C e D 0; (38.17) @t 2 @x 2 @x 2 Respectively, and subject to the terminal condition for j D 1, 2
(38.23)
Z1 Re
e i' lnŒK Q fj .xT;' jxt / d' i'
(38.31)
0
or Pj .x D0 lnŒK jx / 1 1 D C 2
Z1
e i lnŒK Q Im fj .xT; jxt / d : (38.32) i
0
38.4 Options with Stochastic Volatility Heston (1993) used the characteristic functions to derive a closed-form solution for the price of a European call option on an asset with stochastic volatility.
578
H.W. Chuang et al.
We begin by assume that the spot asset at time t follows the diffusion dS.t/ D Sdt C
p v.t/Sd z1 .t/;
(38.33)
where z1 .t/ is a Wiener process. If the volatility follows an Ornstein–Uhlenbeck process (Uhlenbeck and Ornstein 1930), d
p p v.t/ D ˇ v.t/dt C ıdz2 .t/;
(38.34)
then Itô’s Lemma (Itô 1944) shows that the variance follows the process p dv.t/ D Œı 2 2ˇv.t/ dt C 2ı v.t/dz2 .t/;
(38.35)
where z2 .t/ has correlation with z1 .t/. This can be written as the familiar square-root process p dv.t/ D Œ v.t/ dt C v.t/dz2 .t/:
(38.36)
For simplicity, we assume a constant interest rate r. Therefore, the price at time t of a unit discount bond that matures at time t C is P .t; t C / D e r :
(38.37)
In this case, there two random sources – the stock price and random change in volatility – which needs to be hedged to form a riskless portfolio. Thus, we set up a portfolio … containing the option being priced the value of which is denoted by U.S; v; t/, a quantity of the stock and a quantity 1 of another asset whose value U1 depends on volatility. That is, … D U S 1 U1 (38.38) The small change of the portfolio in a time interval dt is given by
d… D
1 @2 U @2 U @U C vS2 .t/ 2 C vS.t/ @t 2 @S @v@S 1 @2 U C v 2 2 dt 2 @v 1 @U1 @2 U1 @2 U1 C vS2 .t/ 1 C vS.t/ 2 @t 2 @S @v@S 2 @ U1 1 C v 2 2 dt 2 @v @U @U1 C 1 dS @S @S @U1 @U 1 dv: (38.39) C @v @v
In order to make the portfolio instantaneously risk-free, we choose @U @U1 1 D 0: @v @v (38.40) The portfolio grows the risk-free rate. We have @U1 @U 1 D0 @S @S
and
d… D
1 @U @2 U @2 U C vS2 .t/ 2 C vS.t/ @t 2 @S @v@S 1 @2 U C v 2 2 dt 2 @v 1 @U1 @2 U1 @2 U1 1 C vS2 .t/ C vS.t/ 2 @t 2 @S @v@S 2 1 @ U1 C v 2 2 dt 2 @v
D r…dt D r .U S 1 U1 / dt:
(38.41)
If we use some simple algebra to collect all U terms on the left-hand side and U1 terms on the left-hand side, we get @2 U @2 U @U @U @2 U 1 C vS2 .t/ 2 C 2ıvS.t/ C 2vı 2 2 C rS rU @t 2 @S @v@S @v @S @U @v @U1 1 @2 U1 @2 U1 @U1 @2 U1 C vS2 .t/ C 2vı 2 rU 1 C 2ıvS.t/ C rS 2 @t 2 @S @v@S @v2 @S D : @U1 @v (38.42)
From the factor model, the two risk assets must satisfy the internal consistent relationship. Then the value of any asset U.S; v; t/ must satisfy the PDE 1 @2 U @2 U @U C vS2 .t/ 2 C vS.t/ @t 2 @S @v@S @2 U @U 1 rU C v 2 2 C rS 2 @v @S @U D fŒ v.t/ .S; v; t/g : @v
(38.43)
The unspecified term .S; v; t/ represents the price of volatility risk, and must be independent of the particular asset. Conventionally, .S; v; t/ is called the market price of volatility risk. We note that in Breeden’s (1979) consumptionbased model,
dC ; .S; v; t/dt D Cov dv; C
(38.44)
38 Application of the Characteristic Function in Financial Research
where C.t/ is the consumption rate and is the relative-risk aversion of an investor. And we note that in Cox et al. (1985) model, dC.t/ D c v.t/Cdt C c
p v.t/Cdz3 .t/;
(38.45)
where consumption growth has constant correlation with the spot asset return. These two papers motivate us with the choice of .S; v; t/ to v; .S; v; t/ D v. A European call option with strike price K and maturing at time T satisfies Equation (38.43) subject to the following boundary conditions:
rS
U.S; v; T / D Max.0; S K/;
(38.46)
U.0; v; t/ D 0;
(38.47)
@U .1; v; t/ D 1; @S
(38.48)
@U @U .S; 0; t/ C ı 2 .S; 0; t/ rU.S; 0; t/ @S @v @U C .S; 0; t/ D 0; @t U.S; 1; t/ D S:
1 @2 P1 1 @2 P1 @2 P1 @P1 C v 2 C v C v 2 2 @t 2 @x @v@x 2 @v
@P1 @P1 1 CŒ.v/Cv v D 0: (38.54) C rC v 2 @x @v Besides, both P1 .x; v; t/ and P2 .x; v; t/ are subject to the terminal condition 1; if x > lnŒK lim Pj .x; v; tI lnŒK / D : (38.55) 0; if x lnŒK t!0 The probabilities of P1 .x; v; t/ and P2 .x; v; t/ are not immediately available in close form. We will show that their characteristic function satisfies Equation (38.51). Suppose that we have two processes
(38.49) (38.50)
cov ŒdW 1 .t/; dW 2 .t/ D dt;
1 @2 V 1 @2 V @V @2 V C v 2 C v C v 2 2 @t 2 @x @v@x 2 @v
1 @V @V C r v C Œ. v/ v rV D 0: 2 @x @v (38.51) By analogy with the Black–Scholes formula, we can guess a solution of the form (38.52)
where the first term is the present value of the spot asset upon optimal exercise, and the second term is the present value of the strike-price payment. And we substitute the proposed solution into Equation (38.51). For P2 .x; v; t/, we have 1 @2 P2 1 @2 P2 @2 P2 @P2 C v 2 C v C v 2 2 @t 2 @x @v@x 2 @v
@P2 @P2 1 C Œ. v/ v D 0: C r v 2 @x @v
For P1 .x; v; t/, we have
p 1 dx.t/ D r v.t/ dt C v.t/dW 1 .t/; (38.56) 2 p dv.t/ D fŒ v.t/ v.t/g dt C v.t/dW 2 .t/; (38.57)
Now, we can simplify Equation (38.43) by rewriting them in terms of the logarithm of the spot price; that is, x D ln.S / and V .x; v; t/ D U.S; v; t/.
C.S; v; t/ D SP1 .x; v; t/ Ker.T t / P2 .x; v; t/;
579
(38.53)
(38.58)
and a twice-differentiable function f .x.t/; v.t/; t / D E Œg.x.T /; v.T //jx.t/ D x; v.t/ D v : (38.59) From Itô’s Lemma we have df D
1 @2 f @f @2 f 1 1 2 @2 f v C v v C v C r 2 @v2 @v@x 2 @x 2 2 @x
@f @f dt C C Œ. v/ v @v @t
1 @f C r v dW 1 CfŒ v vgdW 2 (38.60) 2 @x
Besides, by integrated expectations, we know that f .x.t/; v.t/; t/ is a martingale, then the df coefficient must vanish; that is,
1 @2 f @f 1 2 @2 f @2 f 1 v C v 2 C r v C v 2 2 @v @v@x 2 @x 2 @x C Œ. v/ v
@f @f C D 0: (38.61) @v @t
The final condition f .x; v; T / D g.x; v/ which depends on the choice of g. Choosing the g.x; v/ D e i'x the solution is the characteristic function, which is available in closed form.
580
H.W. Chuang et al.
In order to solve Equation (38.61) with above condition, we invert the time direction: D T t. It means that we have solved
1 @2 f @f 1 2 @2 f @2 f 1 v C v v C v C r 2 2 2 @v @v@x 2 @x 2 @x C Œ. v/ v
@f @f D0 @v @t
E./ D Ae y1 C Be y2 ; and y2 D . i'/d . where y1 D . i'/Cd 2 2 A and B can get from the boundary conditions
f .x; v; 0/ D e
Hence, we obtain
:
ŒC. /CD. /v Ci'x
(38.64)
with C.0/ D 0 and D.0/ D 0. By substituting the function form Equation (38.64) into (38.62) we have:
1 1 1 2 2 v D f C vi Df v 2 f C r v i f 2 2 2 C Œ. v/ v Df .C 0 C D 0 v/f D 0:
E./ D
E.0/ .ge y1 e y2 / ; g1
E 0 ./ D
E.0/ .gy1 e y1 y2 e y2 / ; g1
gD
i d y1 ; D y2 i C d
(38.63)
To solve the characteristic function explicitly, we guess the functional form to be f .x; v; t/ D e
E.0/ D A C B : Ay1 C By2 D 0
(38.62)
subject to the initial condition: i x
(38.65)
and D./ D
2 e y2 e y1 y2 y 2 e 2 ge y1
C C d i D 2
1 2 2 1 1 D C i D 2 C i . C /D D 0 vf 2 2 2 0 C ir C D C f D 0; (38.66) to reduce it two ordinary differential equations, respectively, 1 2 2 1 1 D C i'D ' 2 C i' .C/D 2 2 2
1 2 2 1 1 D C i'D ' 2 C i' .C/D 2 2 2
by using D D
E0 . 2 2 E
(38.68)
(38.69)
It follows that E 00 . i' /E 0 1 2 1 D 0. 2 ' C 2 i' p Let d D . i' /2 C 2 .' 2 C i'/, then the above equation has the general solution 1 2 2
1 ed : 1 ge d
C./ D ir' C
2 D ir' 2
E 0 .s/ ds 2 E.s/ 2
Z0
E 0 .s/ ds E.s/
2 E./ ln 2 E.0/ D ir' C 2 . C C d i'/
1 ge d : (38.71) 2 ln 1 ed
D ir'
For Equation (38.67), we shall solve the Riccati differential equation D0 D
Z0
(38.67)
and C 0 D ir C D:
For Equation (38.68), the C./ can be solved integration merely,
Therefore, we get the PDE,
D0 D
(38.70)
As a result, we can invert the characteristic function to get the desired probabilities. 1 1 Pj .x; v; tI lnŒK / D C 2
# Z1 " i' lnŒK e fj .x; v; T I '/ Re d': i' 0
(38.72)
38 Application of the Characteristic Function in Financial Research
38.5 Conclusion In this chapter, we use characteristic functions to solve option pricing problems. The characteristic functions are widely used in solving differential equations and the inversion formula permits one to determine the underlying distribution function from the characteristic function. The use of the characteristic functions in finance will provide an effective and practical means of dealing with the option pricing.
References Bakshi, G. and Z. Chen. 1997. “An alternative valuation model for contingent claims.” Journal of Financial Economics 44(1), 123–165. Bates, D. 1996. “Jumps and stochastic volatility: exchange rate processes implicit in Deutschemark Options.” Review of Financial Studies 5, 613–636. Black, F. and M. Scholes. 1973. “The valuation of option contracts and corporate liabilities.” Journal of Political Economy 81, 637–654.
581 Breeden, D. T. 1979. “An intertemporal asset pricing model with stochastic consumption and investment opportunities.” Journal of Financial Economics 7, 265–296. Carr, P. and D. Madan. 1999. “Option valuation using the fast Fourier transform.” Journal of Computational Finance 2(4), 61–73. Cox, J. 1996. “Notes on option pricing I: constant elasticity of variance diffusion.” Journal of Portfolio Management 23, 15–17. Cox, J. C., E. Ingersoll, and S. A. Ross. 1985. “A theory of the term structure of interest rates.” Econometrica 53, 385–408. Heston, S. 1993. “A closed-form solution for options with stochastic volatility with applications to bond and currency options.” Review of Financial Studies 6, 327–343. Hsu, Y. L., T. I. Lin, and C. F. Lee. 2008. “Constant elasticity of variance (CEV) option pricing model: integration and detailed derivation.” Mathematics and Computers in Simulation 79(1), 60–71. Itô, K. 1944. “Stochastic integral.” Proc Imperial Acad, Tokyo 20, 519–524. Schroder, M. 1989. “Computing the constant elasticity of variance option pricing formula.” Journal of Finance 44, 211–219. Scott, L. 1997. “Pricing stock options in a jump-diffusion model with stochastic volatility and interest rates: application of Fourier inversion methods.” Mathematical Finance 7, 413–426. Uhlenbeck, G. E. and L. S. Ornstein. 1930. “On the theory of Brownian motion.” Physical Review 36, 823–841.
Chapter 39
Asian Options Itzhak Venezia
Abstract The payoffs on the expiration dates of Asian options depend on the underlying asset’s average price over some prespecified period rather than on its price at expiration. In this chapter we outline the possible applications of these options and describe the different methodologies and techniques that exist for their evaluation as well as their advantages and disadvantages. Keywords Options Valuation
r
Asian options
r
Exotic options
r
39.1 Introduction Asian options are a type of options in the class of exotic options. The payoffs on expiration of Asian options are not dependent on the underlying asset’s price at that time as in plain vanilla options, but rather on its average price at some prespecified n periods. The sensitivity of the option’s value on expiration date to the underlying asset’s price at that time is thus lower for Asian options than for regular options making, Asian options less susceptible to price manipulations. These options were named Asian options because they were introduced in some Asian markets to discourage price abuse on expiration.1 Asian options, also called Average Rate Options (AROs), are not traded in any well-known exchange but are quite popular in OTC trading, mainly in energy, oil, and currency markets. In addition to being less sensitive to manipulation they are also effective in hedging cash flows that follow patterns resembling averages over time. For example, many firms periodically make or receive foreign currency payments; Asian options would more closely match such cash flows than options that are defined on the price of the currency on some specific date. Similarly, an airline that continuously buys oil is better off hedging its costs by Asian options rather than by ordinary ones. AROs can also I. Venezia () The Hebrew University, Jerusalem, Israel e-mail: [email protected]
be employed in balance sheet hedging, as firms use average rather than year-end rates to hedge their exposures. Another feature of the Asian options that makes them popular with some investors (and unpopular with others) is their lower volatility. Since the value of the option at expiration is composed of the average of the price of the underlying asset over n periods, its variance is lower than that of a similar plain vanilla option. Asian options are sometimes embedded in other securities. In the late 1970s, commodity-linked bond contracts with an average-value settlement price were introduced. These contracts entitle the investors to the average value of the underlying commodity over a certain time interval or the nominal value of the bond, whichever is higher. Hence, the investors are offered a straightforward bond plus an option on the average value of the commodity where the exercise price is equal to the nominal value of the bond. Early examples of such bonds included: Oranje Nassau bonds, 1985,; Mexican Petrobonds 1977; Delaware Gold indexed bond; and BT Gold Limited Notes, 1988. Some warrant issues include Asian options features as poison pills to preclude hostile takeovers. Firms would issue such warrants to friendly investors. In case of a hostile takeover, the warrants would be exercised by these investors, and because of the averaging of prices, the dilution factor would be low. These types of warrants have been used by such French companies as Compagnie Financiere Indo-Suez and Bouygues. Asian options are issued either in the European version (Eurasian) in which case they can be exercised only on expiration date or in the American version (Amerasians) where they can be exercised any time prior to expiration. Most of the Asian options appear as Eurasians. Since AROs of the American type can be exercised at any time prior to expiration, the averaging that is designed to protect the investors from manipulation is not as effective as that provided by their European- style counterparts. 1
According to legend such options were originally used in 1987 when Bankers Trust’s Tokyo office used them for pricing average options on crude oil contracts.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_39,
583
584
Arithmetic averages are usually used in Asian options but geometric averages can be used as well. Asian options can also be broken down into several other categories: those where the average price of the underlying asset is considered or those where different strike prices are averaged. In addition, they can be applied to other exotic options such as basket options (see e.g., Castellacci and Siclari 2003). If the prices are averaged over the entire life of the option they are sometimes called plain vanilla Asian options. If they are averaged over a shorter period closer to expiration they may be called forward starting options.
I. Venezia
39.2.1 Monte Carlo Simulations Monte Carlo simulations are usually considered as the most accurate way to evaluate the distribution of the average returns, and hence options. The accuracies of other approximation methods are often measured against the values obtained by the simulations. Unfortunately, simulations are computationally time consuming, especially if the Greeks also need to be calculated (see e.g., Boyle 1977).
39.2.2 Approximations 39.2 Valuation Most option pricing methods based on Black-Scholes’, model, 1973, assume that the underlying asset prices are lognormally distributed. When geometric averaging is used the valuation of Asian options is straightforward because the average price is also lognormally distributed with parameters that can easily be calculated. A straightforward application of the Black Scholes formula is then possible using the known parameters of the distribution (see Turnbull and Wakeman 1991; Curran 1992; Hull 2006). On the other hand, if an arithmetic averaging is used, the distribution of the average is unknown and there are no known closed form formulas for the options valuation. Various methods have been proposed to estimate or approximate the value of the options in this case. These methods include the following: Monte Carlo simulations, diverse mathematical and numerical techniques designed to estimate the distribution of the returns and thereby the value of the options, techniques designed to find lower and upper bounds for the options’ values using arbitrage arguments, binomial tree models, and the application of insurance formulas to the evaluation of options. As there is no one method that dominates all others, the choice of the appropriate procedure depends on computational convenience, on the required accuracy, and on the parameters of the options being evaluated. Some models would provide excellent approximations for several sets of values of the parameters but would not perform well for others.2 Given the many strong assumptions that underlie most valuation methods (e.g., knowledge of the standard deviation, lognormality, flat yield curve, efficient markets), the inaccuracies involved in most of the extant evaluation techniques do not seem to be too ominous. Below are some suggested techniques to evaluate average Asian options. 2 Since most of the options are of the European style, evaluations are provided for the calls, and the puts can be evaluated from the Put-Call parity. For the valuation of American type options, see Ben-Ameur et al. 2002, and Longstaff and Schwartz 2001.
The simplest approximation for the value of an arithmetic Asian option and the easiest one to calculate is the an equivalent geometric option (see e.g., Kemna and Vorst 1990; Curran 1992; Turnbull and Wakeman 1991; Hull 2006). Consider a newly issued Asian (Eurasian) option on an underlying asset with a volatility of returns ¢ and a continuous dividend yield (or growth rate) q and suppose the option’s value on expiration depends on the average underlying asset’s price from issue until that date.3 The value of the geometric option is obtained by using thepusual BlackScholes formula with a standard deviation, = 3, and a dividend yield, 0:5.r C q C 2 =6/, where r is the risk free interest rate. Since the geometric average is always lower than or equal to the arithmetic average, the approximations thus obtained underestimate the call options and overestimate the put options. Nevertheless, Turnbull and Wakeman show that estimates based on this method are quite accurate for low standard deviations (20% or lower). Their quality, however, declines when higher standard deviations are considered. Another relatively easy method for approximating the value of Asian options is to estimate the arithmetic average price by an ad hoc distribution, either lognormal or inverse Gaussian, with the same first and second moment (or higher moments). Such approximations were introduced by Levy (1992), Turnbull and Wakeman (1991), Jacques (1996), and Hull (2006). Turnbull and Wakeman use the Edgeworth series expansion around the log normal distribution to provide an algorithm to compute moments of the arithmetic average. Levy (1992), uses a simpler approach, employing the Wilkinson approximation that utilizes just the first two moments of the distribution. In the above example, suppose the risk free rate is r, the current price of the underlying asset is S0 , the strike price
3
If the option is not newly issued and some of the prices that comprise the average have already been observed, then taking that into account, the strike price and the definition of the random variable are adjusted in a straightforward manner.
39 Asian Options
585
is K, and that the time to maturity is T. The distribution of the average returns is then approximated by a lognormal distribution with first two moments M1 and M2 given by: M1 D
e .rq/T 1 S0 .r q/T
and M2 D
2 2e Œ2.rq/C T S02 .r q C 2 /.2r 2q C 2 /T 2 1 2S02 e .rq/T C .r q/T 2 2.r q/ C 2 r q C 2
when q ¤ r. The Asian option can then be regarded as a regular option on a futures contract with current value of F0 and volatility 2 given by: F0 D M1 and
1 M2 D ln T M12 2
The values of the call option, c, and the put option, p, are then given by: c D e rT ŒF0 N.d1 / KN.d2 / p D e rT ŒKN.d2 / F0 N.d1 / ı ln.F0 K/ C 2 T =2 d1 D p T ı p ln.F0 K/ 2 T =2 d2 D D d1 T p T Levy shows that his method provides good approximations for the value of the commonly used Asian options that are usually characterized by low variances or short times to expiration. For options with ap standard deviation, ¢, and time to expiration, t, satisfying ¢ t 0:2, he shows that the approximations are quite accurate, although their accuracy falls as the standard deviations get larger. Since the approximations described above have an analytical form, one can explicitly calculate the Greeks that are extremely important for market traders in managing their risks.
39.2.3 Other Mathematical and Numerical Methods Kemna and Vorst (1990), present a dynamic hedging strategy from which they derive the value of average Asian options
using arbitrage arguments. They also show that the price of an ARO is always less than or equal to that of an equivalent plain vanilla option.4 In addition they also provide some variance reduction methods that are instrumental in calculating confidence intervals and in reducing the computation times. Jacques (1996), extends their method to calculate the values of the replicating portfolios in hedging Asian type options. Bouaziz et al. (1994), propose a slightly different arbitrage-based model, which works quite well, to approximate the value of Asian options. The advantage of their method is that they also provide upper bounds to the approximation errors. This is an important issue that some of the previous methods have failed to address. Geman and Yor (1993), have obtained an expression for the Laplace transform of the price of Asian options. However, calculation of the inverse of this transform is quite time consuming. Fu et al. (1998/1999), and Linetsky (2002), also use Laplace transforms to price Asian options and have reached a reasonable degree of accuracy; In addition, Carverhill and Clewlow (1990), make use of Fourier transform techniques. In general, the price of an Asian option can be found by solving a partial differential equation (PDE) in two space dimensions (see e.g., Ingersoll 1987). Rogers and Shi (1995), have formulated a one-dimensional PDE that can model both floating and fixed strike Asian options. Zhang (1995) presents a semi-analytical approach that is shown to be fast and stable by means of a singularity removal technique and then numerically solving the resultant PDE. Vecer (2001), provides a simple and numerically stable two-term PDE characterizing the price of any type of arithmetically averaged Asian option. The approach includes both continuously and discretely sampled options, which is easily extended to handle continuous or discrete dividend yields.
39.2.4 Binomial Models As is true for any other option, Asian options can be priced using tree models. This type of method, however, is not used frequently because it could be computationally inconvenient. At any point in time on the tree, the value of the option depends on the average of the price along the path it has taken, and one must then determine a minimum and maximum range at each node. As the number of nodes on a tree grows, so does the number of averages that must be taken, 4 Turnbull and Wakeman claim however that Kemna and Vorst’s result holds true only if the averaging period starts after the initiation (and valuation) of the options. They provide examples showing that when the time to maturity of the option is lower than the averaging period, the value of the Asian option could be higher than that of a standard European option. Geman and Yor 1993, arrive at a similar conclusion.
586
and this number is exponentially related to the number of possible asset paths. Hull and White (1993), show a way to alleviate this problem by adding a state variable to the tree nodes and approximations are undertaken with interpolation techniques in backward induction.
39.2.5 Applying Insurance Models The equivalence between insurance and options is well known. Many insurance contracts have payoffs defined in terms of claims accumulations rather than the end-of-period values of the underlying state variables. This property suggests that the analogy between insurance and Asian options is more appropriate in some cases than the analogy between insurance and regular options. The problem of pricing Asian options is then similar to that of calculating stop-loss premiums of a sum of dependent random variables. Results previously used for calculating premiums for such insurance contracts (see e.g., Cummins and Geman 1994; Goovaerts et al. 2000; Jacques 1996; Kaas et al. 2000; Dhaene et al. 2002a, b) can therefore be applied with slight modifications to find analytical lower and upper bounds for the price of Asian options.
39.3 Conclusion Asian options are popular in markets with thin trading because it is harder to manipulate their values close to expiration than the values on expiration of ordinary options. They are also attractive to some sets of particularly risk averse investors. In addition, many financial instruments may be modeled as Asian options because their values are contingent on the average price of other “underlying” assets. The popularity of Asian options and the absence of a universally accepted formula for their evaluation created the need to rely on approximations. While users of Asian options enjoy the choice of many models for approximations that are quite accurate and easy to use, practitioners still wait for a breakthrough that will make the evaluation more accessible, accurate, and user-friendly.
References Ben-Ameur, H., M. Breton, and P. L’Ecuyer. 2002. “A dynamic programming procedure for pricing American-style Asian options.” Management Science 48, 625–643.
I. Venezia Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–654. Bouaziz, L., E. Briys, and M. Crouhy. 1994. “The pricing of forwardstarting Asian options.” Journal of Banking and Finance 18, 823–839. Boyle, P. 1977. “Options: a Monte-Carlo approach.” Journal of Financial Economics 4, 323–338. Carverhill, A. P. and L. J. Clewlow. 1990. “Valuing average rate (‘Asian’) options.” Risk 3, 25–29. Castellacci, G. M. and J. Siclari. 2003. “Asian basket spreads and other exotic options.” Power Risk Management. http://www.olf.com/news/articles/2005/articles_20050610.aspx Cummins, D. J. and H. Geman. 1994. “An Asian option approach to the valuation of insurance futures contracts.” Working paper, Financial Institutions Center University of Pennsylvania 03–94. Curran, M. 1992. “Valuing Asian and portfolio options by conditioning on the geometric mean price.” Management Science 40, 1705–1711. Dhaene, J., M. Denuit, M. J. Goovaerts, R. Kaas, and D. Vyncke. 2002a. “The concept of comonotonicity in actuarial science and finance: applications.” Insurance Mathematics and Economics 31, 133–161. Dhaene, J., M. Denuit, M. J. Goovaerts, R. Kaas, and D. Vyncke. 2002b. “The concept of comonotonicity in actuarial science and finance: theory.” Insurance Mathematics and Economics 31, 3–33. Fu, M., D. Madan, and T. Wang. 1998/1999. “Pricing continuous Asian options: a comparison of Monte Carlo and Laplace transform inversion methods.” The Journal of Computational Finance 2, 49–74. Geman, H. and M. Yor. 1993. “Bessel processes, Asian options, and perpetuities.” Mathematical Finance 3, 349–375. Goovaerts, M. J., J. Dhaene, and A. De Schepper. 2000. “Stochastic upper bounds for present value functions.” Journal of Risk and Insurance Theory 67, 1–14. Hull, J. 2006. Options, Futures, and Other Derivatives, Prentice Hall, NJ. Hull, J. and A. White. 1993. “Efficient procedures for valuing European and American path-dependent options.” Journal of Derivatives 1, 21–31. Ingersoll, J. 1987. Theory of Financial Decision Making, Rowman & Littlefield Publishers. Jacques, M. 1996. “On the hedging portfolio of Asian options” ASTIN Bulletin 26, 165–183. Kaas, R., J. Dhaene, M. J. Goovaerts 2000. “Upper and lower bounds for sums of random variables.” Insurance: Mathematics and Economics 27, 151–168. Kemna, A. G. Z. and A. C. F. Vorst 1990. “A pricing method for options based on average asset values.” Journal of Banking and Finance 14, 113–129. Levy, E. 1992. “Pricing European average rate currency options” Journal of International Money and Finance 11, 474–491. Linetsky, V. 2002. “Exact pricing of Asian options: an application of spectral theory,” Working Paper, Northwestern University. Longstaff, F. A. and E. S. Schwartz 2001. “The Review of Financial Studies” 14, 113–147. Rogers, L. and Z. Shi 1995. “The value of an Asian option” Journal of Applied Probability 32, 1077–1088. Turnbull, S. M. and L. M. Wakeman 1991. “A quick algorithm for pricing European average options.” Journal of Financial and Quantitative Analysis 26, 377–389. Vecer, J. 2001. “A new PDE approach for pricing arithmetic average Asian options.” The Journal of Computational Finance 4, 105–113. Zhang, P. 1995. “Flexible arithmetic Asian options.” Journal of Derivatives 2, 53–63.
Chapter 40
Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution Kehluh Wang and Ming-Feng Hsu
Abstract We have developed a modified Edgeworth binomial model with higher moment consideration for pricing European or American Asian options. If the number of the time steps increases, our numerical algorithm is as precise as that of Chalasani et al. (1999), with lognormal underlying distribution for benchmark comparison. If the underlying distribution displays negative skewness and leptokurtosis, as often observed for stock index returns, our estimates are better and very similar to the benchmarks in Hull and White (1993). The results show that our modified Edgeworth binomial model can value European and American Asian options with greater accuracy and speed given higher moments in their underlying distribution. Keywords Asian options Numerical analysis
r
Edgeworth binomial model
r
40.1 Introduction The payoff of an Asian option depends on the arithmetic or geometric price average of the underlying asset during the life of the option. Its value is path-dependent, normally without the closed form solution, and therefore more difficult to calculate than that of a standard option. However, the hedging effect of an Asian option, which is widely used in the foreign exchange market, is better than that of a standard option and offers convenience and lower cost. Many researchers have applied various methods to value Asian options. There are mainly two approaches. Analytic approximations that produce closed-form solutions were proposed by Turnbull and Wakeman (1991), Lévy (1992),
K. Wang () National Chiao Tung University, Hsinchu, Taiwan, ROC e-mail: [email protected] M.-F. Hsu National Chiayi University, Chiayi City, Taiwan, ROC e-mail: [email protected]
Zhang (2001, 2003), Curran (1994), Rogers and Shi (1995), and Thompson (2000). However, obtaining a pricing formula is still a challenging problem for many applications. In recent years, numerical methods have assumed increasing importance (Grant et al. 1997). Three popular methods, the Monte Carlo simulation, the finite difference method, and the binomial lattice, have been widely used to value Asian options. Although the Monte Carlo simulation is often used instead of other pricing methods due to its convenience and flexibility, its calculation is considered inefficient and very time-consuming. The difference equations transformed from partial differential equations can be solved quickly using numerical method. However, Barraquand and Pudet (1996) point out that for the path-dependent pricing problem, an augmentation of the state space is not viable in this approach. Binomial tree models have been extensively applied in various option valuations (Amin 1993; Rubinstein 1994; Hilliard and Schwartz 1996; Chang and Fu 2001; Klassen 2001). In particular, Hull and White (1993), Chalasani et al. (1998, 1999), Neave and Ye (2003), and Costabile et al. (2006) all adopt extended binomial models to value European or American path-dependent options. They claim that their algorithms are considerably faster and provide more accurate results compared with the analytical approximation used by Turnbull and Wakeman (1991). Most of the valuation methods for Asian options assume that the return distribution of the underlying asset is lognormal. However, practitioners and academics are well aware that the finite sum of the correlated lognormal random variables is not lognormal. Corrado and Su (1997), and Kim and White (2003) both find significant negative skewness and positive excess kurtosis in the implied distribution of S&P 500 index options. Ahn et al. (1999) argues nonnormal distribution when using put options to reduce the cost of risk management. It is for this reason that some researchers have tried to investigate other alternatives by considering the number of moments. For example, Milevsky and Posner (1998a, b) and Posner and Milevsky (1998) apply moment-matching analytical methods to approximate the density function of the underlying average for the European Asian options.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_40,
587
588
K. Wang and M.-F. Hsu
The purpose of this paper is to introduce a numerical algorithm in pricing European and American Asian options while considering the higher moments of the underlying asset distribution. We have developed a modified Edgeworth binomial model that applies the refined binomial lattice (Chalasani et al. 1998, 1999) and use the Edgeworth expansible distribution (Rubinstein 1998) to include the parameters for higher moments. The numerical results show that this approach can effectively deal with the higher moments of the underlying distribution and provide better option value estimates than those found in various studies in the literature. The rest of the paper is organized as follows: The next section introduces the pricing process of a binomial tree for an Asian option. The third section presents our Edgeworth binomial model for pricing Asian options with higher moment consideration. In the fourth and fifth sections, we discuss the lower and upper bounds of the prices of the European and American Asian options. The numerical examples for our approach are then provided in Sect. 40.6. Specifically, the algorithmic process with three periods is shown and our estimates are compared with the benchmarks discussed in literature. Finally, we offer our conclusion in the last Section.
40.2 Definitions and the Basic Binomial Model
S(0)u,
S(0)
S(0)u-1.
Now consider a call option with two periods .k D 2/ before its expiry date. The price process of the stock will show three possible values after two periods, S(0)u2, S(0)u S(0)
S(0), S(0)u-1
S(0)u-2.
This price process of the stock can be extended to n time steps. The stochastic differential equation describing this price process, i.e., dS.t/ D S.t/dt C S.t/dB.t/, has the following solution, p
1 2 S.t/ D S.0/e . 2 /t C ˆ t ;
An underlying variable, S.t/, of an option at time t is generally assumed to satisfy the stochastic differential equation in a risk-neutral world: dS.t/ D S.t/dt C S.t/dB.t/; where the drift and volatility are constant, and fB.t/g denotes a Brownian motion process. Assume that the riskfree interest rate r is a constant, and that the option expires at time T . A binomial tree can approximate the continuous time function S.t/, where one divides the life of the option into n time steps of length t D T=n. In each time step, the underlying asset may move up by a factor u with probability pc , or down by a factor d D u1 with probability qc D 1 pc , with 0 < d < 1 < u. Firstly, we consider the one period case, i.e., time step k D 1. The stock price at the end of the period will have two possible values, either up to a value S.0/u with probability pc or down to a value S.0/u1 with probability 1pc . These price movements can be represented in the following diagram,
where ˆ is a standardized normal random variable. For a binomial random walk to have the correct drift over a time period of t, we need h p i 1 2 pc Su C .1 pc /Sd D SE e . 2 / t C ˆ t D Se t I namely, pc u C .1 pc /d D e t . Rearranging this equation we can obtain e t d pc D ud p
with u D e t . Here, let n be a sample space of an experiment including all possible sequences of n upticks and downticks. A typical element of n is presented as ! D !1 !2 : : : : !n , where !i denotes the i th uptick or downtick. Let fHk .!/g be an associated family of random variables, where Hk .!/ denotes the number of upticks at time k and H0 .!/ D 0 for all !. We can define a symmetric random walk Xk , such that for each k 1; Xk D Hk .k Hk / D 2Hk k, which represents
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution
S0u2
S0 (0,0)
S0u
(2,2)
(1,1)
S0
S0u-1
(2,1)
(1,0)
S0u-2
S0u3 (3,3)
S0u (3,2) -1
S0u (3,1)
589
variable yh D Œ2h n =n1=2 with a standardized binomial density b.yh / D ŒnŠ= hŠ.n h/Š .1=2/n. Giving predetermined skewness and kurtosis, the binomial density is transformed by the Edgeworth expansion up to the fourth moment. The result is F .yh / D f .yh / b.yh / 1 D 1 C 1 yh3 3yh 6 1 .2 3/ yh4 6yh2 C 3 24 1 C 1 yh6 15yh4 C 45yh2 15 72 n 1 nŠ ; (40.1)
2 hŠ.n h/Š
C
(2,0) S0u-3 (3,0) Fig. 40.1 A binomial tree. Node (3, 2) means there are 2 upticks in any path reaching this node at time 3
the number of upticks minus the number of downticks up to time k. It is used to define the nodes in a binomial lattice corresponding to the possible positions of the underlying random walk at different times. Specifically, a tree path ! is displayed to pass through or reach node .k; h/ if and only if Hk .!/ D h for times k D 0; 1; : : : ; n and the number of possible upticks h D 0; 1; : : : :; k. Consequently, the underlying asset price at time k is Sk .k D 0; 1; : : : :; n/, where Sk D S0 uXk D S0 u2Hk k : For example, S3 D S0 u.22–3/ D S0 u at node (3, 2) in the lattice diagram of Fig. 40.1. The underlying asset price at node .k; h/ is given by S0 u2hk , whose average at time k is defined as Ak D .S0 CS1 C: : : :CSk /=.kC1/; k 0. Therefore, the payoff of an Asian call with strike price L at time n is VnC D .An L/C D max fAn L; 0g. The price of a European option is the present value of the expected payoff discounted to time 0; that is, C D E VnC =.1 C r/n . And the price of an American option at time 0 is the maximum discounted expected payoff from possible exercise strategy all ; that is, C D max E VnC =.1 C r/n . Note that E VnC is a P probability-weighted average given by k Pk .Ak L/C , where Pk denotes the risk-neutral probability associated with Ak at the expiration date.
40.3 Edgeworth Binomial Model for Asian Option Valuation To consider the higher moments, we first apply the Edgeworth binomial tree model (Rubinstein 1998). Assume that the tree has n time steps and n C 1 nodes .h D 0; 1; : : : ; n/ at step n. At each node h, there is a random
4 3 with f .yh / D 1C.1=6/” 1 yh 3yh C.1=24/.” 23/ yh 6 4 2 6yh2 C 3 C.1=72/” 1 3yh 15yh C 45yh 15 , Q where ”1 D E yh is the skewness and ”2 D EQ yh 4 is the kurtosis of the underlying distribution under riskneutral measure. While the sum of F .yh / is not one, we normalize F .yh / by F .yh /=†j F .yj / and denote it as Ph . The variable yh , which has probability Ph , can be standardized as xh D .yh M/=V with M D †h Ph yh and V 2 D †h Ph .yh M/2 . The variable xh is used later in Equation (40.2) to obtain the asset price and the corresponding risk-neutral probability, Ph , for a path to node h. Consider a tree model of n steps. The asset price at the hth node .h D 0; 1; : : : ::; n/ during the final step, SOn;h , is ^
Sn;h D S0 e T C with D r
1 T
ln
n P
Ph e
p
T xh
p T xh
(40.2)
,
hD0
where S0 is the initial asset price, r is the continuously compounded annual risk-free rate, T is the time for expiration of the option (in years), is the annualized volatility rate for the cumulative asset return, and xh is a random variable from probability distribution Ph with mean 0 and variance 1. Ph is determined by modifying the binomial distribution using the Edgeworth expansion up to the fourth moment of ln.SOn;h= S0 /. Finally, is used to ensure that the expected risk-neutral asset return equals r. Solving backwards recursively from the end of the tree, the nodal value, Sn1;h , is
^ ^ rT Sn1;h D pe Sn;hC1 C qe Sn;h exp n
(40.3)
with pe D pn;hC1 =.pn;hC1 C pn;h / and qe D .1 pe /, where pn;h is Ph =ŒnŠ= hŠ.nh/Š . To demonstrate these nodal values in the binomial lattice, a numerical example with three-time steps is shown in a later section.
590
K. Wang and M.-F. Hsu
(6,3) (5,2)
Node (k+1, h+1) = (6,3) Area a M(k+1, h+1, a) 0 1 1 1 2 2 3 3 4 3 5 3 6 3 7 2 8 1 9 1
Node (k, h) = (5,2) Area a M(k, h, a) 0 1 1 1 2 2 3 2 4 2 5 1 6 1
(6,2)
W6,2 (w)=5
Node (k+1, h) = (6,2) Area a M(k+1, h, a+h) 0 1 1 1 2 2 3 2 4 3 5 2 6 2 7 1 8 1
Fig. 40.2 A binomial lattice. The shaded area shows the diamondshaped boxes for node (6,2)
The path dependence of Asian options is analyzed using the approach by Chalasani et al. (1999). In order to represent the refined binomial lattice, a new random variable Wk;h denoting an area at time k is assigned. Its initial value W0 is zero. For any node .k; h/ in the tree, a lowest path reaching .k; h/ is defined as the path with k h downticks followed by h upticks, and a highest path reaching .k; h/ means the one with h upticks followed by k h downticks. The area Wk;h .!/ of a path ! reaching .k; h/ can be defined as the number of diamond-shaped boxes enclosed between this path ! and the lowest path reaching this node. For example, the node (5, 2) means that the paths reaching it have 2 upticks at time 5. As demonstrated in Fig. 40.2, a path passing through (5, 2) and reaching node (6, 2) is shown by the thick line segments. The area W6; 2 .!/ of this path is the number of diamond-shaped boxes contained between this path and the lowest path reaching node (6, 2), as shown by the shaded area in the graph. The maximum area of any path reaching .k; h/ is the number of boxes between the highest and the lowest paths reaching .k; h/; that is, h.k h/. The minimum area of any path reaching .k; h/ is zero. The set of possible areas of paths reaching node .k; h/ is therefore f0; 1; : : : ; h.k h/g. Each node of the binomial lattice can be partitioned into “nodelets” based on the areas of the paths reaching this node. Therefore, any path reaching a given nodelet .k; h; a/ has an area Wk;h .!/ D a with h upticks at time k. For instance, Fig. 40.3 shows the nodelets in the nodes (5, 2), (6, 3) and (6, 2). As noted in Chalasani et al. (1999), there is a oneone correspondence between the possible areas and the possible geometric averages of underlying asset prices for paths reaching .k; h/. Therefore, .k; h; a/ represents all the paths in the binomial tree that reach node .k; h/ and has the same geometric average asset price from time 0 to time k. Suppose the area of a path A reaching .k; h/ is Wk;h.A/ D a. If A has an uptick after this point, it reaches node .k C 1; h C 1/ at the next time-step. The path A and the lowest path B
Fig. 40.3 Nodelets in the nodes (5, 2), (6, 3) and (6, 2) as circled in Figure 40.2. This Figure exhibits the number of paths M.k; h; a/ reaching each nodelet .k; h; a/. An example is shown for nodelet (5, 2, 2) which is updated as in the nodelets (6, 3, 2) and (6, 2, 4)
reaching .k C 1; h C 1/ share the same edge linking .k; h/ and .k C 1; h C 1/ in the lattice. Hence, the number of boxes between A and B at time k C 1 is the same as the number at time k. In this way the path A reaches nodelet .k C 1; h C 1; a/. On the other hand, if A has a downtick after time k, it will reach node .kC1; h/. In this case, the number of boxes at time k C1 between A and the lowest path reaching .k C1; h/ will be increased by h to get a C h. The path A then reaches nodelet .k C 1; h; a C h/. We now show how the arithmetic average of underlying asset prices over all paths reaching .k; h; a/ is computed and denote it by AN .k; h; a/ D EŒAk j Hk D h; Wk;h .!/ D a , k D 0; 1; : : : ; n; h k. It is simply the average of Ak over these paths. So the arithmetic average of stock prices over all paths reaching nodelet .k; h; a/ can be expressed as:
A.k; h; a/ D
S 00 .k; h; a/ ; .k C 1/M.k; h; a/
where S 00 .k; h; a/ D k P i D0
M P mD1
(40.4)
Sm0 .k; h; a/; Sm0 .k; h; a/ D
Si;h , with k D 0; 1; : : : ; nI h D 0; 1; : : : ; kI a D
0; 1; : : : ; h.k h/I m D 1; 2; : : : ; M.k; h; a/; and M.k; h; a/ is the number of paths reaching .k; h; a/ with M.0; 0; 0/ D 1. Here, S 00 .k; h; a/ is the sum of Sm0 .k; h; a/ over all paths passing through .k; h; a/ with S 00 .0; 0; 0/ D S0 ,
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution
while Sm0 .k; h; a/ is the sum of the asset prices along any possible path passing through .k; h; a/ from time 0 to k. Any path passing through nodelet .k; h; a/ and having an uptick will get to nodelet .k C 1; h C 1; a/ at time k C 1. Thus, the number of paths reaching nodelet .k C 1; h C 1; a/; namely, M.k C 1; h C 1; a/, should include M.k; h; a/ paths through .k; h; a/. The sum of the prices from all the paths reaching .k C 1; h C 1; a/; namely, S 00 .k C 1; h C 1; a/, would be S 00 .k; h; a/CM.k; h; a/ SkC1;hC1 for paths passing .k; h; a/. Likewise, all paths passing through nodelet .k; h; a/ with a downtick will reach nodelet .k C 1; h; a C h/ at time k C 1. Similarly, M.k C 1; h; a C h/ should also include M.k; h; a/ and S 00 .k C 1; h; a C h/ would be S 00 .k; h; a/ C M.k; h; a/ SkC1;h in the forward induction process.
591
N h; a/, bound, EŒAn j Wn;h ; Sn;h , can be expressed as A.n; N as in Equation (40.4). A.n; h; a/ is the expectation of the average stock price An at node .n; h/, where An D .S0 C S1;h C : : : : C Sn;h /=.n C 1/ on a tree path passing through .n; h; a/. All paths through this nodelet have the same probability P .Wn;h ; Sn;h /, which is M.n; h; a/pe h qe nh . Thus, we can calculate the lower bound as i h Cd0 D E ŒE .An jWn;h ; Sn;h / L C D
n h.nh/ X X
C P .Wn;h ; Sn;h / A.n; h; a/ L
hD0 aD0
D
n h.nh/ X X
C M.n; h; a/phe qenh A.n; h; a/ L :
hD0 aD0
40.4 Upper Bound and Lower Bound for European Asian Options
As a result, the error bound is E E .An L/C jW n;h ; Sn;h h i E E Œ.An L/ jW n;h ; Sn;h C D E E .An L/C jW n;h ; Sn;h i E Œ.An L/ jW n;h ; Sn;h C
Next, we present how the value of an Asian option after obtaining the arithmetic average of the stock prices from Equation (40.4) is estimated. We apply the approach used by Rogers and Shi (1995), where the lower bound and the error bound are calculated for the price of an Asian option. This lower bound on the price of an Asian call option with strike L can be expressed as: E .An L/C D E E .An L/C jZ i h i h E E ŒAn L jZ C D E ŒE .An jZ / L C
(40.5)
where Z D .Wn;h ; Sn;h / in which the random variable Wn;h denotes the area at node .n; h/, and Sn;h represents the stock price reaching .n; h/. The composition in the lower
1X 2
h.nh/
hD0
aD0 Ami n .n;h;a/L
1X 2
ˇ 1 i 1 h E var An L ˇWn;h ; Sn;h 2 ; 2
(40.6)
assuming Vn min .Z/ < 0 and Vn max .Z/ > 0, where Vn D An L.1 Note that var .An L/ D var An D E A2 n .E An /2 and AN2 .n; h; a/ D E A2 n jWn;h ; Sn;h . Let Amin .k; h; a/ denote the minimal value of Ak and Amax .k; h; a/ its maximum over all paths passing through the nodelet .k; h; a/. Thus, the error bound by Equation (40.6) equals
12 P.Wn;h ; Sn;h / A2 .n; h; a/ A.n; h; a/2
X
n
D
X
n
h.nh/
hD0
aD0 Ami n .n;h;a/L
1 2 .n; h; a/ A.n; h; a/2 2 : M.n; h; a/phe qnh A e
1
(40.7)
We set the minimum and maximum values of Vn over paths ¨ with Z.¨/ D z, to be Vn max .z/ D max¨2 fVn .¨/jZ.¨/ D zg and Vn min .z/ D min¨2 fVn .¨/jZ.¨/ D zg. If Vn max .zi / 0, then C for all paths ¨ with Z.¨/ D zi , we can deduce Vn .¨/ D 0, which implies E Vn C j zi D 0, and also E.Vn jzi / 0, which implies E.Vn jzi /C D 0. Hence, the error bound is zero. Similarly, if Vn min .zi / 0, then for all paths ¨ with Z.¨/ D zi , we can deduce Vn C .¨/ D Vn .¨/, which implies E Vn C jzi D E.Vn jzi /, and also E.Vn jzi / 0, which implies E.Vn jzi /C D E.Vn jzi /. Therefore, the error bound is again zero.
592
K. Wang and M.-F. Hsu
The AN .n; h; a/ can be derived from Equation (40.4). Meanwhile, Amin .k; h; a/ D S min .k; h; a/=.k C 1/ and Amax .k; h; a/ D S max .k; h; a/=.k C 1/, where S min .k; h; a/ and S max .k; h; a/ are the minimum value and maximum value of Sk;h , respectively, over these paths reaching .k; h; a/. AN2 .n; h; a/ can also be calculated from the following:
.k; h; a/ C 2§.k; h; a/ ; .k C 1/2 M.k; h; a/ k P 2 where ®.k; h; a/ is the sum of Si;h and §.k; h; a/ is the i D0 P sum of Si;h Sj;h .2 With the lower bound and the error A2 .k; h; a/ D
0i j k
bound, we can obtain the upper bound. Suppose one upward probability pe , denoting the probability of the stock price moving up for the next step in the Edgeworth binomial tree, is lower than the other upward probability pc , the probability of the stock price moving up in the binomial tree of Chalasani et al. (1999). The average stock price in a path with upward drift causes higher probability of Amin .k; h; a/ > L; that is, higher probability of zero variance. So the total variance of the average stock price will be smaller. According to Equation (40.7), the error bound of the option price with upward probability pe will be smaller than the error bound with probability pc . We can show in the following proposition that this can lead to tighter bounds on the error from approximating E Vn C if its upward probability is lower. 3 X
2 X
hD0
aD0 Amax >L;Ami n
1
M.3; h; a/pch .1 pc /3h .var.A3 ja; S3;h // 2
show that the error bound in approximating We first E Vn C from a modified Edgeworth binomial tree model and that from a binomial tree model employed by Chalasani et al. (1999) are proportional to their upward probabilities, respectively, in the binomial paths. For this, we use a discrete approximation method similar to the lattice approach. Let T be the time for expiration of the option. At time T , let Ye .T / denote the variance of the arithmetic average of the stock prices in our modified Edgeworth binomial tree with upward probabilities pe , and Yc .T / denote the variance of the average price in a binomial tree from Chalasani et al. with upward probability pc . From Equation (40.2), the asset price in the Edgeworth model is affected by the drift with upward trend, resulting in higher average than the case in Chalasani et al. From Equation (40.7), higher average price increases the probability of Amin .k; h; a/ > L; that is, higher probability of zero variance based on explanations shown in Footnote 1. We thus have Yc .T / Ye .T /. Assume for the moment that pe is less than pc . At time t D T=3, the conditional expectations of the variances with upward probabilities pe and pc are given by E3t ŒYe .T / and E3t ŒYc .T / , respectively. We can see (ignoring the discount factors) that 1
1
E3t ŒYc .T / 2 E3t ŒYe .T / 2 ; because 3 X
2 X
hD0
aD0 Amax >L;Ami n
1
M.3; h; a/peh .1 pe /3h .var.A3 ja; S3;h // 2 : (40.8)
Similarly, at time t D T =4, 1
1
E4t ŒYc .T / 2 E4t ŒYe .T / 2 ; since 4 X
4 X
hD0
aD0 Amax >L;Ami n
1
M.4; h; a/pch .1pc /4h .var.A4 ja; S4;h // 2
4 X
4 X
hD0
aD0 Amax >L;Ami n
1
M.4; h; a/peh .1 pe /4h .var.A4 j a; S4;h // 2 : (40.9)
2
To show how AN2 .k; h; a/ is derived, we can write .k C 1/2 A2k D !2 k k P P P Si;h D S2i;h C2 Si;h Sj;h . Because all paths reaching
iD0
iD0
0ij k
2 N2 .k; h; a/ have the same probability, ! A .k; h; a/ is the average of Ak D k P P S2i;h C 2 Si;h Sj;h .k C 1/2 over these paths. iD0
0ij k
Therefore, as long as the above inequality continuously holds for all time t T=5, the error bound for a tree model with upward probability pe will be tighter than that for a tree with upward probability pc , given that pe is less than pc . The following proposition shows that this can lead to tighter bounds
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution
on the error from approximating E Vn C if its upward probability is lower. The detailed analytical explanation is discussed in Lo et al. (2008). Proposition 1. The error bound in pricing a European Asian option from the modified Edgeworth binomial model is tighter than the error bound from the model by Chalasani et al. (1999).
40.5 Upper Bound and Lower Bound for American Asian Options Define CU .k; h; x/ as the value of an American Asian option at time k, given that the number of upticks is h and the arithmetic asset price average Ak equals x. CU can be expressed as follows: CU .n; h; x/ D .x L/C ; h n CU .k; h; x/ D maxf.x L/C ; .pe CU .k C 1; h C 1; x U .k; h// Cqe CU .k C 1; h; x L .k; h/// exp.rT =n/g; k < n; h Œk ;
(40.10)
where L is the exercise price, pe D pkC1;hC1 =.pkC1;hC1 C pkC1;h / and qe D .1 pe / with pkC1;h being Ph =Œ.kC1/Š= .hŠ..k C 1/ h/Š/ . Here x U .k; h/ D Œx.k C 1/ C SkC1;hC1 =.k C 2/ is the arithmetic asset price average AkC1 , given that x is the arithmetic asset price average Ak and an uptick occurring at time k C 1. Similarly x L .k; h/ D Œx.k C 1/ C SkC1;h =.k C 2/ is the arithmetic asset price average AkC1 , given that x equals Ak and there is a downtick at time k C1. Therefore, for any path ! such that Hk .!/ D h and Ak .!/ D x, the price of an American Asian option at time k on ! is CU .k; h; x/ whose value at time 0 is CU .0; 0; S0 /. However, we cannot directly calculate CU .0; 0; S0 /. The quantity x U .k; h/ may not equal any of the possible averages of asset prices over all paths reaching node .kC1; hC1/ with area a. And similarly, x L .k; h/ may not equal any average at node .k C 1; h/. But we can use linear interpolation to N h; a/ is a strictly derive these missing values because A.k; increasing function in a. First, we compute the upper bound of the American Asian option using an idea similar to Hull and White (1993). For N h; a/, we can find b such that for some a given x D A.k; 0 1, N N hC1; b/C.1/ A.kC1; hC1; bC1/: x U .k; h/ D A.kC1;
593
Therefore, on the right hand side of Equation (40.8), CU (k+1,h C 1; x U .k; h// can be replaced by W U .k; h; a/ D w1 C .1/w2 ; N C 1; h C 1; b/ L C where w1 is W .k C 1; h C 1; b/ D ŒA.k N C 1; h C 1, and w2 is W .k C 1; h C 1; b C 1/ D ŒA.k C L b C 1/ L . Similarly CU .k C 1; h; x .k; h// can be substituted by W L .k; h; a/ D œw0 1 C .1 œ/w0 2 , where w0 1 N C 1; h; b/ L C and w0 2 is is W .k C 1; h; b/ D ŒA.k N C 1; h; b C 1/ L C . Using W .k C 1; h; b C 1/ D ŒA.k N h; a/, this procedure backward recursively, for x D A.k; a D 0; 1; : : : ; h.k h/, we have
W .k; h; a/ D max .x L/C ; pe W U .k; h; a/
rT (40.11) Cqe W L .k; h; a/ exp n for all (k, h, a) in the tree. It follows that the estimate W .0; 0; 0/ is an upper bound for the value of an American Asian option at time 0. We now provide a lower bound of the American Asian option, which can be obtained during the process for upper bound calculation applying a proper exercise rule. Such lower bound has been proposed for European Asian options by Rogers and Shi (1995) and generalized by Dhaene et al. (2002a, b). Let Z be a random variable with the property that all random variables EŒSi jZ are non-increasing or non-decreasing functions of Z. The notation S l represents a comonotonic sum of n lognormal random variables.3 The cdf of this sum can be obtained by Theorem 5 from Dhaene et al. (2002a). If we assume that the cdfs of the random variables EŒSi jZ are strictly increasing and continuous, then the cdf of S l is also strictly increasing and continuous. From Equation (48) of Dhaene et al. (2002a), we obtain that for all 1C 1 .0/; FEŒS .1/ , nL 2 FEŒS i jZ i jZ n1 X
1 FEŒS .FS l .nL// D nL; i jZ
i D0
where FX1 .p/ is the inverse of the cumulative distribution function of X. Or equivalently, n1 X
E Si jZ D FZ1C .1 FS l .nL// D nL:
i D0
3
The detailed representation of comonotonic random variables is described in Section 40.4 from Dhaene et al. (2002a).
594
K. Wang and M.-F. Hsu
This determines the cdf of the convex order random variable n1 P Si . S l D EŒS 0 m jZ for Sm0 D i D0
Under the same assumptions, the stop-loss premiums can be determined from Equation (40.55) of Dhaene et al. (2002a):4 n1 X E .S l d /C D E Œ.EŒSi jZ EŒSi jZ i D0
D FZ1C .1 FS l .d // /C ;
1 which holds for all retentions d 2 FS1C .0/; F .1/ . l Sl Hence, the lower bound on the price of an American Asian call option with strike L, applying Jensen’s rule, is:
.A L/C E R
.A L/C DE E jZ R "
C # A L E E jZ R " # ˇ .E.A ˇZ / L/C DE R D
D
e
rT n
n e
rT n
n
E
h C i E Sm0 .n 1; h; a/jZ nL 2
E4
n1 X
E Si;h jZ
!C 3 nL 5 (40.12)
iD0
where represents a stopping time for a fixed exercise rule, Zk is ˆk -measurable and ˆk represents the information set at time k, and R is the discount factor; that is, R D e rT=n . Note that each possible value of the random variable Zk corresponds to a nodelet in the refined lattice. For an exercise rule , the calculation of the upper bound in inequality Equation (40.10) will generate the estimate for the lower bound. While calculating the upper bound, whenever W .k; h; x/ > Œpe W U .k; h; a/Cqe W L .k; h; a/ exp.rT=n/ in Equation (40.9), the option will be exercised at nodelet .k; h; a/. So CL .k; h; a/ D ŒS 0 .k; h; a/ .k C 1/L C at each exercised nodelet .k; h; a/. For a nodelet at which the option is not exercised,
where S 0 .k; h; a/=.k C 1/ D E.Ak jZ£ D .k; h; a// D k P E ŒSi;h jZ =.k C 1/, and pe D pkC1;hC1 =.pkC1;hC1 C i D0
pkC1;h / and qe D .1 pe / with pkC1;h being Ph =Œ.k C 1/Š= .hŠ..k C 1/ h/Š/ . Thus S 0 .k; h; a/ is simply the average of †S 0 m .k; h; a/ over all paths at nodelet (k, h, a), while S 0 m .k; h; a/ is the sum of the asset prices along any path with exercise point in its m paths reaching (k, h, a) from time 0 to k. We can easily see that CL .0; 0; 0/ equals the expected value at time 0 in Equation (40.10), which is the lower bound.
40.6 Numerical Examples To explain the Edgeworth binomial pricing model for American and European Asian options under distributions with higher moments, consider the following examples of call options with three periods. The initial stock price S0 D 100, the maturity T D 1 year, and the strike prices L D 100. The underlying distribution has volatility D 0:3 and risk-free rate r D 0:1.
40.6.1 Pricing European Asian Options Under Lognormal Distribution We construct the binomial lattice for the asset value process as described in the third section. Exhibit 40.1 shows the nodal values with lognormal distribution in the binomial lattice. We can write down a simple algebraic expression for the
177.66 146.68 121.11 100
125.64 103.73
85.65
CL .k; h; x/ D Œpe CL .k C 1; h C 1; a/
88.87 73.36
Cqe CL .k C 1; h; a C h/ exp.rT =n/; 62.84
Time: 0
1
2
3
4
The stop-loss premium with retention d of a random variable X is shown by EŒ.X d /C . See page 7 of Dhaene et al. (2002a) for detailed discussion.
Exhibit 40.1 The nodal values in the binomial lattice under lognormal distribution
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution
underlying asset value of the node (3,3) and (2,2). The steps are described below: p p D b. 3/ D 0:125 from Equation 1. We can get F . 3/ p (40.1) because y3 D 3 at node (3,3) when the underlying distribution is lognormal. Therefore P3 D 0:125 and p x3 D 3 can be obtained. 2. From P3 , x3 and the assumed D 0:3, r D 0:1 and p 3 P T D 1, the drift D 0:1 11 ln Ph e 0:3 1xh 0:0552. hD0
Therefore, the passet price at node (3,3) is S3;3 D p 0:0552.1/C0:3 1. 3/ 100e D 177:66. The asset prices of the other nodes at the final step in the binomial lattice can be calculated similarly. 3. Upward and downward probabilities are pe D qe D 0:5 under lognormal distribution. Solving backwards recursively from the node (3,3) and (3,2), the nodal value, S2;2 , D is S2;2 D Œ0:5.177:66/ C 0:5.125:64/ exp 0:1.1/ 3 146:68, according to Equation (40.3). Using the same method, we can compute the remainder of the asset prices in the binomial lattice. After the underlying asset value is obtained for each node in the binomial lattice, we next calculate the arithmetic average
C0d D
8 3 h.3h/ <X X :
3h
M.3; h; a/phe qe
hD0 aD0
595
of the stock prices under lognormal distribution as shown in Exhibit 40.2. The algorithmic process for the average price at node (3,2) is described as follows: 1. We first compute the number of paths reaching node (3,2). There are three paths; (3,2,0), (3,2,1) and (3,2,2). The number of the paths reaching these nodelets, M.3; 2; 0/; M.3; 2; 1/ and M.3; 2; 2/, is equal to one. 2. The sums of the asset prices over all paths passing through these nodelets are S 00 .3; 2; 0/ D 100 C 85:65 C 103:73 C 125:64 D 415:02; S 00 .3; 2; 1/ D 450:48 and S 00 .3; 2; 2/ D 493:43, respectively. 3. According to Equation (40.4), the arithmetic averages of the stock prices over all paths reaching nodelets (3,2,0), (3,2,1) and (3,2,2) can be obtained. The average price at nodelet (3,2,0), AN (3,2,0), is then equal to 415:02=Œ.3 C N 2; 1/ D 112:62 and 1/.1/ D 103:76. Similarly, A.3; N 2; 2/ D 123:36. A.3; According to the average values of the underlying asset for the nodes at the end of the tree, the lower bound formula in the fourth section can be applied to calculate the price of a European Asian option. The numerical result is:
9 C = ˚ A.3; h; a/ L
e rT D M.3; 0; 0/pe0 qe3 A.3; 0; 0/ L ;
CM.3; 1; 0/pe1 qe2 A.3; 1; 0/ L C M.3; 1; 1/pe1qe2 A.3; 1; 1/ L C M.3; 1; 2/pe1 qe2 A.3; 1; 2/ L CM.3; 2; 0/pe2 qe1 A.3; 2; 0/ L C M.3; 2; 1/pe2 qe1 A.3; 2; 1/ L C M.3; 2; 2/pe2qe1 A.3; 2; 2/ L CM.3; 3; 0/pe3 qe0 A.3; 3; 0/ L e rT ˚ D .1/.0:5/0 .0:5/3 .0/ C .1/.0:5/1 .0:5/2 .0/ C .1/.0:5/1 .0:5/2 .0/ C .1/.0:5/1 .0:5/2 .3:43/ C .1/.0:5/2 .0:5/1 .3:76/ C.1/.0:5/2 .0:5/1 .12:62/ C .1/.0:5/2 .0:5/1 .23:36/ C .1/.0:5/3 .0:5/0 .36:36/ e .0:1/.1/ D 8:99
The error bound from Equation (40.7) is then used to obtain the upper bound of this option. That is,
er D
8 ˆ 3 <1 X ˆ :2
D
hD0
X
9 > = 1
h.3h/
M.3; h; a/phe qe3h .A2 .3; h; a/ A.3; h; a/2 / 2
aD0 Ami n .3;h;a/L
> ;
e rT
1 1 1
M.3; 0; 0/pe0 qe3 .A2 .3; 0; 0/ A.3; 0; 0/2/ 2 C M.3; 1; 0/pe1qe2 .A2 .3; 1; 0/ A.3; 1; 0/2 / 2 2 1
1
CM.3; 1; 1/pe1 qe2 .A2 .3; 1; 1/ A.3; 1; 1/2 / 2 C M.3; 1; 2/pe1qe2 .A2 .3; 1; 2/ A.3; 1; 2/2 / 2 1
1
CM.3; 2; 0/pe2 qe1 .A2 .3; 2; 0/ A.3; 2; 0/2 / 2 C M.3; 2; 1/pe2 qe1 .A2 .3; 2; 1/ A.3; 2; 1/2 / 2 1
CM.3; 2; 2/pe2 qe1 .A2 .3; 2; 2/ A.3; 2; 2/2 / 2
i o 1 CM.3; 3; 0/pe3 qe0 .A2 .3; 3; 0/ A.3; 3; 0/2 / 2 1Ami n .3;h;a/L e rT D 0
596
K. Wang and M.-F. Hsu 136.36 (3,3,0) 122.60 (2,2,0) 110.56 (1,1,0)
100.00 (0,0,0)
108.28 (2,1,1) 96.46 (2,1,0)
92.82 (1,0,0) 86.34 (2,0,0)
36.36 (3,3,0)
123.36 (3,2,2) 112.62 (3,2,1) 103.76 (3,2,0) 103.43 (3,1,2) 94.56 (3,1,1) 86.97 (3,1,0)
23.36 (3,2,2) 12.62 (3,2,1) 3.76 (3,2,0)
8.99 (0,0,0)
3.43 (3,1,2) 0 (3,1,1) 0 (3,1,0)
80.46 (3,0,0)
1
Time: 0
2
Exhibit 40.2 The arithmetic averages of the stock prices under lognormal distribution. The top number at each node denotes the arithmetic average of the stock prices and its nodelet position in the binomial lattice is shown in parenthesis
The number of all paths used in the above equation is equal to one. And the upward and downward probabilities are both 0.5 in calculating the lower bound. The error bound is zero because the conditions Amin .k; h; a/ < L and Amax .k; h; a/ > L are not satisfied for all the paths. We illustrate the calculating process of the AN2 (3,1,0) as follows: A2 .3; 1; 0/ D D
'.3; 1; 0/ C 2§.3; 2; 0/ .3 C 1/2 M.3; 1; 0/ 30615:49 C 2 45202:5 D 7563:78 .3 C 1/2 .1/
where '.3; 1; 0/ D
3 X
2 Si;1
i D0
D .1002 C85:652 C73:362 C88:872 / D 30615:49
and §.3; 1; 0/ D
X
0 (3,0,0)
3
Si;h Sj;h
0i j 3
D 100 85:65 C .100 C 85:65/ 73:36 C.100 C 85:65 C 73:36/ 88:87 D 45202:5 From the lower bound and the error bound, we can obtain the upper bound, 8.99. This result is shown in Exhibit 40.3.
Time: 0
1
2
3
Exhibit 40.3 Pricing European Asian option under lognormal distribution. The top number at each node denotes the value of the call option, and its nodelet position in the binomial lattice is shown in parenthesis
Considering first the normal skewness and kurtosis, the results are tested and compared with those in the literature. The call option to be valued has the initial stock price S0 D 100, the maturity T D 1 year, and the strike prices L D 95, 100, 105, and 110, respectively. The underlying distribution has volatility D 0:1, 0.3 and 0.5, respectively, with normal skewness ”1 D 0 and kurtosis ”2 D 3. The risk-free rate r is set to be 0.09. The time steps N equals 30, and the computing time and memory space needed in our algorithm are similar to those of Chalasani et al. (1998). We present our simulation results in Tables 40.1 and 40.2. In Table 40.1, we compare our results with those of Rogers and Shi (1995) and of Chalasani et al. (1998). When the call is in-the-money, our valuation in general is smaller than those of Chalasani et al. (1998). However, the range of our lower and upper bounds is narrower than theirs. For at-the-money and out-of-the-money calls, our estimates are greater than theirs and closer to those of Rogers and Shi’s, but the distance between our lower and upper bounds is almost the same as that of Chalasani et al. The difference between our calculations and those of Rogers and Shi’s is due to our numerical approximation versus their continuous-time integrals. In Table 40.2, we compare our results with those of Monte Carlo simulations from Lévy and Turnbull (1992). Because Chalasani et al. (1998) claim that their results are closer to Monte Carlo estimations than those of Rogers and Shi (1995), we also listed their bounds. As depicted in the table, our estimates are much closer to the Monte Carlo
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution
Table 40.1 Model comparisons for European Asian option valuations under normal skewness and kurtosis
597
Strike .L/
Vol.
r
E-LB
E-UB
RS-LB
RS-UB
C-LB
C-UB
95 100 105 95 100 105 95 100 105 90 100 110 90 100 110 90 100 110
0:05 0:05 0:05 0:05 0:05 0:05 0:05 0:05 0:05 0:10 0:10 0:10 0:10 0:10 0:10 0:10 0:10 0:10
0:05 0:05 0:05 0:09 0:09 0:09 0:15 0:15 0:15 0:05 0:05 0:05 0:09 0:09 0:09 0:15 0:15 0:15
7:177 2:712 0:332 8:811 4:306 0:957 11:100 6:799 2:745 11:947 3:635 0:319 13:385 4:909 0:621 15:404 7:024 1:411
7:177 2:712 0:332 8:811 4:306 0:957 11:100 6:799 2:745 11:947 3:635 0:320 13:385 4:909 0:621 15:404 7:024 1:412
7:178 2:716 0:337 8:809 4:308 0:958 11:094 6:794 2:744 11:951 3:641 0:331 13:385 4:915 0:630 15:399 7:028 1:413
7:183 2:722 0:343 8:821 4:318 0:968 11:114 6:810 2:761 11:973 3:663 0:353 13:410 4:942 0:657 15:445 7:066 1:451
7:178 2:708 0:309 8:811 4:301 0:892 11:100 6:798 2:667 11:949 3:632 0:306 13:386 4:902 0:582 15:404 7:015 1:316
7:178 2:708 0:309 8:811 4:301 0:892 11:100 6:798 2:667 11:949 3:632 0:306 13:386 4:902 0:583 15:404 7:015 1:317
90 100 110 90 100 110 90 100 110
0:30 0:30 0:30 0:30 0:30 0:30 0:30 0:30 0:30
0:05 0:05 0:05 0:09 0:09 0:09 0:15 0:15 0:15
13:928 7:924 4:041 14:961 8:811 4:672 16:494 10:197 5:715
13:936 7:932 4:051 14:968 8:818 4:682 16:500 10:205 5:725
13:952 7:944 4:070 14:983 8:827 4:695 16:512 10:208 5:728
14:161 8:153 4:279 15:194 9:039 4:906 16:732 10:429 5:948
13:929 7:924 4:040 14:964 8:807 4:661 16:499 10:187 5:685
13:938 7:932 4:049 14:972 8:815 4:671 16:506 10:195 5:696
Note: The European Asian option to be valued has initial stock price S0 D 100 dollars and option life T D 1:0 year. Using time steps N D 30, the lower and upper bounds from our algorithm are indicated by E-LB and E-UB, respectively, while those from Rogers and Shi (1995) are indicated by RS-LB and RS-UB, and those from Chalasani et al. (1998) by C-LB and C-UB. We used normal skewness 1 D 0 and kurtosis 2 D 3 in our algorithm
Table 40.2 Comparisons with Monte Carlo simulations under normal skewness and kurtosis
Strike .L/ 95 100 105 90 100 110 90 100 110
Vol. 0:10 0:10 0:10 0:30 0:30 0:30 0:50 0:50 0:50
R 0:09 0:09 0:09 0:09 0:09 0:09 0:09 0:09 0:09
Monte Carlo
E-LB
E-UB
C-LB
C-UB
8:91 4:91 2:06 14:96 8:81 4:68 18:14 12:98 9:10
8:91 4:91 2:07 14:96 8:81 4:67 18:14 12:98 9:07
8:91 4:91 2:07 14:97 8:82 4:68 18:18 13:02 9:11
8:91 4:90 2:03 14:96 8:81 4:66 18:15 12:99 9:08
8:91 4:90 2:03 14:97 8:82 4:67 18:19 13:03 9:12
Note: The European Asian option to be valued has initial stock price S0 D 100 dollars and option life T D 1:0 year. Using time steps N D 30, the lower and upper bounds from our algorithm are indicated by E-LB and E-UB, respectively, while Monte Carlo estimates from Lévy and Turnbull (1992) are indicated by Monte Carlo, and those from Chalasani et al. (1998) are indicated by C-LB and C-UB. We used normal skewness 1 D 0 and kurtosis 2 D 3 in our algorithm
598
K. Wang and M.-F. Hsu
simulations; therefore our algorithm in pricing Asian options performs better than that of Chalasani et al. (1998).
40.6.2 Pricing American Asian Options Under Lognormal Distribution We can use the example in Exhibits 40.1 and 40.2 to explain the process for computing an American Asian option under lognormal distribution. Equation (40.9) in the fifth section is applied to estimate an upper bound for the value of an American Asian option at time 0. The following illustration for nodelet (2,2,0) explains this recursive backward procedure. ( _ W .2; 2; 0/ D max .A.2; 2; 0/ L/C ; pe W U .2; 2; 0/
rT C qe W .2; 2; 0/ exp 3 L
)
where W U .2; 2; 0/ D w1 C .1 / w2 with w1 being N W .2C1; 2C1; b/ D ŒA.2C1; 2C1; b/L C and w2 being N C 1; 2 C 1; b C 1/ L C , W .2 C 1; 2 C 1; b C 1/ D ŒA.2 L 0 and W .2; 2; 0/ D œ w 1 C .1 œ/ w0 2 with w0 1 being N C 1; 2; b/ L C and w0 2 being W .2 C 1; 2; b/ D ŒA.2 N C 1; 2; b C 1/ L C . W .2 C 1; 2; b C 1/ D ŒA.2 We need to calculate values of ; w1 and w2 before W U .2; 2; 0/ can be estimated. N 2; 0/.2 C 1/ C S2C1;2C1 =.2 C 2/ x U .2; 2/ D A.2;
Hence, we can get W (2, 2, 0) below, (
_ W .2; 2; 0/ D max .A.2; 2; 0/ L/C ; pe W U .2; 2; 0/
) rT L Cqe W .2; 2; 0/ exp 3 (
.122:6 100/C ; ..0:5/.36:36/
) .0:1/.1/ C.0:5/.23:36//e 3 D 28:88; D max
Values at other nodes can be calculated in the same way. As shown in Exhibit 40.4, the estimated W .0; 0; 0/ D 9:12, which is the upper bound for the value of the American Asian option at time 0. The exercise rule discussed in the fifth section can be applied to find the lower bound. Whenever W .k; h; x/ > Œ.pkC1;hC1 =p/ W U .k; h; a/ C .pkC1;h =p/ W L .k; h; a/ exp.rt=n/ in Equation (40.9), the option will be exercised at nodelet (k, h, a); otherwise it will not be exercised. We again use the illustration of nodelet (2,2,0) to account for the exercise rule. At nodelet (2,2,0), CL .2; 2; 0/ is equal to ŒS.2; 2; 0/.2C1/L C =.2C1/ D Œ.100C121:11C146:68/ .2 C 1/.100/ =.2 C 1/ D 22:6 if the option is exercised. However, if it is not exercised, CL .2; 2; 0/ equals Œpe CL .2 C 1; 2 C 1; 0/ C qe CL .2 C 1; 2; 0 C 2/ exp..0:1/.1/=3/ D Œ.0:5/.36:36/ C .0:5/.23:36/ exp..0:1/.1/=3/ D 28:88. Therefore, the option will not be exercised at nodelet (2,2,0),
D Œ122:6 3 C 177:66 =4 D 136:36; _
x .2; 2/ D A.2 C 1; 2 C 1; 0/ U
36.36 (3,3,0)
_
C.1 / A.2 C 1; 2 C 1; 1/; _
D
x U .2; 2/ A.3; 3; 1/
_
_
D
28.88 (2,2,0)
136:36 0 D 1; 136:36 0
A.3; 3; 0/ A.3; 3; 1/ _ C W U .2; 2; 0/ D A.3; 3; 0/ L _ C C.1 / A.3; 3; 1/ L
17.97 (1,1,0) 9.12 (0,0,0)
D .1/ Œ136:36 100 C
C.1 1/ Œ0 100 C D 23:36:
3.43 (3,1,2) 0 (3,1,1) 0 (3,1,0)
0 (2,0,0)
Similarly, value of W L .2; 2; 0/ is:
D .1/ Œ123:36 100 C
8.28 (2,1,1) 1.82 (2,1,0)
0.88 (1,0,0)
C.1 1/ Œ0 100 C D 36:36:
_ C W L .2; 2; 0/ D A.3; 2; 2/ L _ C C.1 / A.3; 2; 3/ L
23.36 (3,2,2) 12.62 (3,2,1) 3.76 (3,2,0)
0 (3,0,0)
Time: 0
1
2
3
Exhibit 40.4 Pricing American Asian option under lognormal distribution. The top number at each node denotes the value of the call option and its nodelet position in the binomial lattice is shown in parenthesis
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution
so its value is equal to 28.88. Because all the nodelets before the maturity time in Exhibit 40.4 are not exercised, the lower bound is equal to the upper bound. In order to examine the possible errors from the Edgeworth approximations, we first compare our results with those of Chalasani et al. (1999) and Hull and White (1993) for lognormal underlying distribution. Table 40.3 shows the estimated value of an at-the-money American Asian option using different time steps with yearly volatility at 0.3, the risk-free rate at 0.1, and the initial stock price at 50 dollars. Table 40.4 further provides the estimates of the values with 40 time steps but under various maturities and strike prices. Table 40.3 Model comparisons for American Asian option valuations under lognormal distribution
599
Hull and White’s approximation for American Asian options is used as our benchmark. In Table 40.3, we find that the lower and the upper price bounds of an American Asian option using our method are lower than those from Chalasani et al. (1999) and Hull and White (1993). However, when we gradually increase the number of the time steps our results are almost the same as theirs. Although the Edgeworth density f(x) is not exactly a probability measure, the approximation errors tend to be very small if more time steps are used in the simulations. Table 40.4 shows the results under different strikes and option lives using the same number of time steps. When
Steps .n/
E-LB
E-UB
C-LB
C-UB
HW .h D 0:005/
HW .h D 0:003/
20 40 60 80
4.811 4.886 4.916 4.932
4.813 4.888 4.917 4.933
4.812 4.888 4.917 4.933
4.815 4.889 4.918 4.934
4.815 4.892 4.924 4.942
4.814 4.890 4.920 4.936
Note: The American Asian options are valued with initial stock price S0 D 50 dollars, strike K D 50 dollars, option life T D 1:0 year, volatility D 0:3 per year, and risk-free rate r D 0:1 per year. The estimated lower and upper bounds from our Edgeworth binomial model are indicated by E-LB and E-UB, respectively, while the estimates from Hull-White (1993) with different gridsize h are denoted by HW, and those from Chalasani et al. (1999) are denoted by C-LB, C-UB. All simulations are conducted under different time steps with normal skewness 1 D 0 and kurtosis 2 D 3. Parameter values are selected so the results can be compared with those in the literature
Table 40.4 Model comparisons for American Asian option valuations under lognormal distribution with different option lives and strikes
Option life .T /
Strike .L/
E-LB
E-UB
C-LB
C-UB
HW
0.5 0.5 0.5 0.5 0.5
40 45 50 55 60
12:105 7:248 3:268 1:150 0:323
12:105 7:248 3:269 1:151 0:323
12:111 7:255 3:269 1:148 0:320
12:112 7:255 3:270 1:148 0:320
12:115 7:261 3:275 1:152 0:322
1.0 1.0 1.0 1.0 1.0
40 45 50 55 60
13:136 8:535 4:886 2:537 1:211
13:137 8:537 4:888 2:539 1:213
13:150 8:546 4:888 2:532 1:204
13:151 8:547 4:889 2:534 1:206
13:153 8:551 4:892 2:536 1:208
1.5 1.5 1.5 1.5 1.5
40 45 50 55 60
13:967 9:636 6:193 3:774 2:201
13:969 9:639 6:195 3:777 2:204
13:984 9:648 6:195 3:767 2:190
13:985 9:650 6:197 3:770 2:193
19:988 9:652 6:199 3:771 2:194
2.0 2.0 2.0 2.0 2.0
40 45 50 55 60
14:685 10:605 7:320 4:889 3:180
14:688 10:609 7:323 4:893 3:184
14:709 10:620 7:322 4:881 3:167
14:712 10:623 7:325 4:885 3:170
14:713 10:623 7:326 4:886 3:171
Note: The American Asian options are valued with initial stock price S0 D 50 dollars, time steps n D 40, volatility D 0:3 per year, and risk-free rate r D 0:1 per year. The estimated lower and upper bounds from our Edgeworth binomial model are indicated by E-LB and E-UB, respectively, while the estimates from Hull-White (1993) are denoted by HW, and those from Chalasani et al. (1999) are denoted by C-LB, C-UB. All simulations are conducted under 40 time steps with normal skewness 1 D 0 and kurtosis 2 D 3, but with various option lives and strikes. Parameter values are selected so the results can be compared with those in the literature
600
K. Wang and M.-F. Hsu
the options are out of the money, our estimates are slightly greater than those from Chalasani et al. (1999) and Hull and White (1993). However, for in-the-money options, ours are slightly lower than theirs. The differences between our results and theirs are lowered if the options are closer to the expiration. Our simulations confirm the discussions by Ju (2002) that the Edgeworth expansion method works fine for shorter maturities, but not for long maturities, when pricing Asian options.
2 4 p1 p1 .3:053/ 6 24 3 3 4 2 6 1 C3/ C 72 .0:05/ p1 15 p1 C 45 p1 p1 3
3
3
p1 3
C
1
3
3
3
15/ binomial density is the resulting
D 0:9894, F p1 D f p1 b p1 D 0:9894 0:375 D 0:371. 3 3 3 Therefore, P2 D 0:372 and x2 D 0:585 given p F .1= 3/ D 0:371 and y2 D p13 . p 3 P Ph e 0:3 1xh 2. We estimate the drift D 0:1 11 ln hD0
40.6.3 Pricing European Asian Options Under Distributions with Higher Moments We now consider the examples of call options with higher moments in the underlining distribution. Most initial presumptions are the same as those used for lognormal distribution, except that skewness ”1 D 0:05 and kurtosis ”2 D 3:05. We just demonstrate the process for computing the underlying values and their arithmetic averages along each path in the binomial lattice with higher moment consideration. Other pricing procedures discussed earlier for lognormal distribution can be similarly applied to the cases with higher moments in the underlining distribution. Exhibit 40.5 provides the nodal values in the binomial lattice under distribution with higher moments. The procedure to get these values is demonstrated below: 1. At node (3,2), the random variable y2 D p13 . After calculating the Edgeworth expansion 1 C 16 .0:05/
178.28 147.10 121.17 100
125.93 103.69
85.81
88.95 73.57
62.83
Time: 0
1
2
3
Exhibit 40.5 The nodal values in the binomial lattice under distribution with higher moments
0:05509 using P2 ; x2 and the assumed D 0:3; r D 0:1 and T D 1. The asset price at node (3,2) is then equal p 0:05509.1/C0:3 1.0:585/ to S3;2 D 100e D 125:93. The asset prices at other nodes in the final step of the binomial lattice can be calculated similarly. 3. The upward and downward probabilities under distribution with higher moments are pe D 0:4936 and qe D 0:5064. The nodal value, S2;1 , can be obtained by solving backwards from the values at nodes (3,2) and (3,1): S2;1 D Œ0:4936.125:93/ C 0:5064.88:95/
0:1.1/ D 103:69:
exp 3 We next calculate the arithmetic average of the stock prices under distribution with higher moments. The algorithmic process for the average price at node (3,1) is explained below: 1. First, count the number of the paths reaching node (3,1). M.3; 1; 0/; M.3; 1; 1/ and M.3; 1; 2/ all equal to one. 2. The sum of the asset prices over all paths passing through the nodelet (3,1,0), S 00 .3; 1; 0/ D 100 C 85:81 C 73:57 C 88:95 D 348:33. Similarly S 00 .3; 1; 1/ D 378:45 and S 00 .3; 1; 2/ D 413:8. 3. According to Equation (40.4), the average price at nodelet N 1; 0/ is then equal to 348:33=Œ.3 C 1/.1/ D (3,1,0), A.3; N 1; 1/ D 94:61 and A.3; N 1; 2/ D 87:08. Similarly A.3; 103:45. The average of the underlying asset values at each nodelet is denoted in Exhibit 40.6. The valuation performance of the European Asian option in Table 40.5 is based on the initial stock price S0 D 100, the risk-free rate r D 0:09, and the maturity T D 1 year with varying skewness and kurtosis. The results of the numerical analysis are compared with those of the Edgeworth expansion model by Turnbull and Wakeman (1991) (TW), the modified Edgeworth expansion method by Lévy and Turnbull (1992) (LT), and the four-moment approximation model by Posner and Milevsky (1998) (PM). All these models are considered up to the fourth moments in their valuations.
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution
Table 40.5 Model comparisons for European Asian option valuations under various skewness and kurtosis
Strike .L/ 95 100 105 95 100 105 90 100 110 90 100 110
Vol. 0:05 0:05 0:05 0:1 0:1 0:1 0:3 0:3 0:3 0:5 0:5 0:5
601
1
2
MC
E-LB
E-UB
LT
TW
PM
0 0 0:03 0 0 0:02 0 0:01 0 0:01 0 0
3 3 3 3 3 3 3 3 3 3 3:02 3
8.81 (0.00) 4.31 (0.00) 0.95 (0.00) 8.91(0.00) 4.91 (0.00) 2.06 (0.00) 14.96(0.01) 8.81 (0.01) 4.68 (0.01) 18.14 (0.03) 12.98 (0.03) 9.10 (0.03)
8:81 4:31 0:95 8:91 4:91 2:06 14:97 8:80 4:68 18:14 12:97 9:09
8:81 4:31 0:95 8:91 4:91 2:06 14:98 8:82 4:70 18:21 13:03 9:16
8:81 4:31 0:95 8:91 4:91 2:06 15:00 8:84 4:69 18:13 13:00 9:12
8:81 4:31 0:95 8:91 4:91 2:06 14:91 8:78 4:69 17:66 12:86 9:22
NAN NAN NAN NAN NAN NAN 14.96 8.80 4.67 18.14 12.97 9.07
Note: E-LB and E-UB indicate the lower and the upper bounds from our model with various skewness .1 / and kurtosis .2 /. The approximations of Lévy and Turnbull (1992) are represented by LT, and of Turnbull and Wakeman (1991) by TW, MC represents the Monte Carlo estimates in the Lévy and Turnbull (1992), and PM represents the four-moment approximation by Posner and Milevsky (1998). The simulations assume the option life T D 1 year, the domestic interest rate r D 0:09, the time steps N D 52 and the initial spot price S0 D 100
136.64 (3,3,0) 122.76 (2,2,0) 110.58 (1,1,0) 100.00 (0,0,0)
108.29 (2,1,1) 96.50 (2,1,0)
92.90 (1,0,0) 86.46 (2,0,0)
123.55 (3,2,2) 112.70 (3,2,1) 103.86 (3,2,0) 103.45 (3,1,2) 94.61 (3,1,1) 87.08 (3,1,0) 80.55 (3,0,0)
Time: 0
1
2
3
Exhibit 40.6 The arithmetic averages of stock prices under distribution with higher moments. The top number at each node denotes the arithmetic average of the stock prices and its nodelet position in the binomial lattice is shown in parenthesis
For low volatility cases ( D 0:05 and 0.1) in Table 40.5, our results from in-the-money or at-the-money calls are very close to Monte Carlo estimates, which is the benchmark used by Lévy and Turnbull (1992) under lognormal distribution. This is similar to those of LT and TW. For out-of-the-money calls, our outcomes are the same as theirs under right-skewed conditions. For high volatility cases ( D 0:3 and 0.5), our outcomes for at-the-money or deep-in-the-money calls approach the results from Monte Carlo simulation with positive skewness and a slight leptokurtic. Under lognormal
distribution, when the call is deep-out-of-money, our lower bounds are more accurate than the estimates from all the other methods. The results from Monte Carlo method are consistent with our lower and upper bounds. Overall, our outcomes are better than those of LT and TW, and similar to PM.
40.6.4 Pricing American Asian Options Under Distributions with Higher Moments Finally, we compare the price estimates of American Asian options from our Edgeworth binomial model with those from Chalasani et al. (1999) and Hull and White (1993) under distributions having higher moments. As indicated in Table 40.6, if the underlying distribution has negative skewness and positive excess kurtosis while other parameters are the same as in Table 40.3, we find that the results from our model (E-LB D 4.813, 4.890, 4.920, 4.937 and E-UB D 4.815, 4.891, 4.920, 4.937) are closer to the estimates from Hull-White (with grid size h D 0:003 as used in Hull and White (1993)) than those from Chalasani et al. (C-LB D 4.812, 4.888, 4.917, 4.933 and C-UB D 4.815, 4.889, 4.918, 4.934). We also adopt the same parameter values used in Table 40.4 but assume an underlying distribution with leftskewness and leptokurtosis for calculating the option price. Our results, as shown in Table 40.7, are again closer to those from Hull-White (with h D 0:005 as used in Hull and White (1993)). The evidence in Tables 40.6 and 40.7 demonstrates that our modified Edgeworth binomial model performs better in simulating American Asian options.
602
Table 40.6 Model comparisons for American Asian option valuations with various time steps, skewness and kurtosis
K. Wang and M.-F. Hsu
Steps .n/
1
2
E-LB
E-UB
C-LB
C-UB
HW .h D 0:005/
HW .h D 0:003/
20 40 60 80
0.002 0.046 0.051 0.050
3.00 3.06 3.06 3.05
4.813 4.890 4.920 4.937
4.815 4.891 4.920 4.937
4.812 4.888 4.917 4.933
4.815 4.889 4.918 4.934
4.815 4.892 4.924 4.942
4.814 4.890 4.920 4.936
Note: The American Asian options are valued with initial stock price S0 D 50 dollars, strike K D 50 dollars, option life T D 1:0 year, volatility D 0:3 per year, and risk-free rate r D 0:1 per year. The estimated lower and upper bounds from our Edgeworth binomial model are indicated by E-LB and E-UB, respectively, while the estimates from Hull-White (1993) with different grid-size h are denoted by HW, and those from Chalasani et al. (1999) are denoted by C-LB, C-UB. All simulations are conducted under various time steps, skewness .1 / and kurtosis .2 /. Parameter values are selected so the results can be compared with those in the literature
Table 40.7 Model comparisons for American Asian option valuations with various option lives, strikes, skewness and kurtosis
Option life .T /
Strike .L/
1
2
E-LB
E-UB
C-LB
C-UB
HW
0.5 0.5 0.5 0.5 0.5
40 45 50 55 60
0.030 0.030 0.040 0.000 0.020
3:06 3:06 3:04 3:00 3:25
12:115 7:260 3:275 1:150 0:321
12:115 7:261 3:275 1:151 0:322
12:111 7:255 3:269 1:148 0:320
12:112 7:255 3:270 1:148 0:320
12:115 7:261 3:275 1:152 0:322
1.0 1.0 1.0 1.0 1.0
40 45 50 55 60
0.041 0.040 0.040 0.003 0.003
3:09 3:10 3:05 3:00 3:01
13:152 8:551 4:891 2:536 1:207
13:153 8:552 4:892 2:538 1:209
13:150 8:546 4:888 2:532 1:204
13:151 8:547 4:889 2:534 1:206
13:153 8:551 4:892 2:536 1:208
1.5 1.5 1.5 1.5 1.5
40 45 50 55 60
0.017 0.017 0.007 0.050 0.007
3:00 3:02 3:00 3:01 3:01
13:987 9:651 6:199 3:770 2:192
13:989 9:653 6:201 3:771 2:194
13:984 9:648 6:195 3:767 2:190
13:985 9:650 6:197 3:770 2:193
19:988 9:652 6:199 3:771 2:194
2.0 2.0 2.0 2.0 2.0
40 45 50 55 60
0.036 0.029 0.026 0.028 0.001
3:05 3:05 3:03 3:01 3:02
14:712 10:623 7:325 4:886 3:168
14:715 10:625 7:327 4:889 3:172
14:709 10:620 7:322 4:881 3:167
14:712 10:623 7:325 4:885 3:170
14:713 10:623 7:326 4:886 3:171
Note: The American Asian options are valued with initial stock price S0 D 50 dollars, time steps n D 40, volatility D 0:3 per year, and risk-free rate r D 0:1 per year. The estimated lower and upper bounds from our Edgeworth binomial model are indicated by E-LB and E-UB, respectively, while the estimates from Hull-White (1993) are denoted by HW, and those from Chalasani et al. (1999) are denoted by C-LB, C-UB. All simulations are conducted under various option lives, strikes, skewness .1 / and kurtosis .2 /. Parameter values are selected so the results can be compared with those in the literature
40.7 Conclusion In this chapter, we have developed the modified Edgeworth binomial model to price European and American Asian options with higher moments in the underlying distribution. Many studies in the literature have illustrated significant leftskewness and leptokurtosis for the distribution of the underlying asset. Our model combines the refined binomial lattice with the Edgeworth expansible distribution to include the high-moment parameters. As the algorithm of the Edgeworth binomial tree can greatly enhance the computing efficiency, our modified model would be able to price Asian options faster and with higher precision when the underlying
distribution displays higher moments. The numerical results show that this approach can effectively deal with the higher moment issue and provide better option value estimates than those found in various studies in the literature.
References Ahn, D.-H., J. Boudoukh, M. Richardson, and R. F. Whitelaw. 1999. “Optimal risk management using options.” Journal of Finance 54(1), 359–375. Amin, K. I. 1993. “Jump diffusion option valuation in discrete time.” Journal of Finance 48(5), 1833–1863.
40 Numerical Valuation of Asian Options with Higher Moments in the Underlying Distribution Barraquand, J. and T. Pudet. 1996. “Pricing of American pathdependent contingent claims.” Mathematical Finance 6, 17–51. Chalasani, P., S. Jha, and A. Varikooty. 1998. “Accurate approximations for European Asian options.” Journal of Computational Finance 1(4), 11–29. Chalasani, P., S. Jha, F. Egriboyun, and A. Varikooty. 1999. “A refined binomial lattice for pricing American Asian options.” Review of Derivatives Research 3(1), 85–105. Chang, C. C. and H. C. Fu. 2001. “A binomial options pricing model under stochastic volatility and jump.” CJAS 18(3), 192–203. Corrado, J. C. and T. Su. 1997. “Implied volatility skews and stock index skewness and kurtosis implied by S&P 500 index option prices.” Journal of Derivatives 4(4), 8–19. Costabile, M., I. Massabo, and E. Russo. 2006. “An adjusted binomial model for pricing Asian options.” Review Quantitative Finance and Accounting 27(3), 285–296. Curran, M. 1994. “Valuing Asian and portfolio options by conditioning on the geometric mean.” Management Science 40, 1705–1711. Dhaene, J., M. Denuit, M. J. Goovaerts, R. Kaas, and D. Vyncke. 2002. “The concept of comonotonicity in actuarial science and finance: theory.” Insurance: Mathematics and Economics 31, 3–33. Dhaene, J., M. Denuit, M. J. Goovaerts, R. Kaas, and D. Vyncke. 2002. “The concept of comonotonicity in actuarial science and finance: applications.” Insurance: Mathematics and Economics 31, 133–161. Grant, D., G. Vora, and D. Weeks. 1997. “Path-dependent options: extending the Monte Carlo simulation approach.” Management Science 43(11), 1589–1602. Hilliard, J. E. and A. L. Schwartz. (1996. Fall). “Binomial option pricing under stochastic volatility and correlated state variable.” Journal of Derivatives 4, 23–39. Hull, J. and A. White. (1993. Fall). “Efficient procedures for valuing European and American path-dependent options.” The Journal of Derivatives 1, 21–31. Ju, N. 2002. “Pricing Asian and basket options via Taylor expansion.” Journal of Computational Finance 5(3), 1–33. Kim, T. H. and H. White. 2003. On more robust estimation of skewness and kurtosis: simulation and application to the S&P 500 Index, Working paper, Department of Economics, UC San Diego.
603
Klassen, T. R. 2001. “Simple, fast, and flexible pricing of Asian options.” Journal of Computational Finance 4, 89–124. Lévy, E. 1992. “Pricing European average rate currency options.” Journal of International Money and Finance 11, 474–491. Lévy, E. and S. Turnbull. 1992. “Average intelligence.” RISK 5, 53–58. Lo, K., K. Wang, and M. Hsu. 2008. “Pricing European Asian options with skewness and kurtosis in the underlying distribution.” Journal of Futures Markets 28(6), 598–616. Milevsky, M. A. and S. E. Posner. 1998a. “A closed-form approximation for valuing basket options.” The Journal of Derivatives 5(4), 54–61. Milevsky, M. A. and S. E. Posner. 1998b. “Asian options, the sum of lognormals, and the reciprocal Gamma distribution.” Journal of Financial and Quantitative Analysis 33, 409–422. Neave, E. H. and G. L. Ye. 2003. “Pricing Asian options in the framework of the binomial model: a quick algorithm.” Derivative Use, Trading and Regulation 9(3), 203–216. Posner, S. E. and M. A. Milevsky. 1998. “Valuing exotic options by approximating the SPD with higher moments.” The Journal of Financial Engineering 7(2), 109–125. Rogers, L. C. G. and Z. Shi. 1995. “The value of an Asian option.” Journal of Applied Probability 32, 1077–1088. Rubinstein, M. 1994. “Implied binomial trees.” Journal of Finance 49(3), 771–818. Rubinstein, M. 1998. “Edgeworth binomial trees.” The Journal of Derivatives 5(3), 20–27. Thompson, G. W. P. 2000. Fast narrow bounds on the value of Asian options, Working paper, Centre for Financial Research, Judge Institute of Management, University of Cambridge. Turnbull, S. and L. Wakeman. 1991. “A quick algorithm for pricing European average options.” Journal of Financial and Quantitative Analysis 26, 377–389. Zhang, J. E. 2001. “A semi-analytical method for pricing and hedging continuously sampled arithmetic average rate options.” Journal of Computational Finance 5, 59–79. Zhang, J. E. 2003. “Pricing continuously sampled Asian options with perturbation method.” Journal of Futures Markets 23(6), 535–560.
Chapter 41
The Valuation of Uncertain Income Streams and the Pricing of Options* Mark Rubinstein
Abstract A simple formula is developed for the valuation of uncertain income streams consistent with rational investor behavior and equilibrium in financial markets. Applying this formula to the pricing of an option as a function of its associated stock, the Black–Scholes formula is derived even though investors can trade only at discrete points in time.
Keywords CRRA intertemporal CAPM · Pricing uncertain income streams · Single-price law of markets · Arbitrage · State-prices · Consumption-based CAPM · Local expectations hypothesis · Unbiased term structure · Random walk · Option pricing · Time-additive utility · Logarithmic utility · Black–Scholes formula · Equity risk premium puzzle · Joint normality covariance theorem
41.1 Introduction

Since the theoretical breakthrough leading to a simple single-period valuation formula for uncertain income consistent with rational risk-averse investor behavior and equilibrium in financial markets (Sharpe 1964), there have been various attempts to extend this model to a multiperiod context. Fama (1970) proved that even though a risk averter maximizes the expected utility from the stream of consumption over his lifetime, his choices in each period would be indistinguishable from those of a properly specified rational investor with a single-period horizon. Moreover, if his investment (and consumption) opportunities followed a (possibly nonstationary) random walk, then the induced single-period utility would be a nonstochastic function. These observations implied that the simple single-period model could apply to successive periods in a multiperiod setting. Subsequently, Merton (1973b) and later Long (1974) generalized the single-period formula to intertemporally stochastically dependent opportunity sets.
M. Rubinstein, University of California, Berkeley, CA, USA e-mail: [email protected]
While their resulting formulas were more complex, they still possessed the merit of empirical promise. However, of greater interest than the anchoring of a single-period valuation formula in a multiperiod setting is the development of a simple multiperiod formula; that is, a method of determining the present value of a series of cash flows received over many future dates. The early certainty-equivalent and risk-adjusted discount rate approaches (Robichek and Myers 1965) were flawed by failing to specify determinants of the adjustment parameters. In effect, they remained little more than definitions of these parameters. More recently, implicitly making use of Fama's result that the simple single-period model could be applied successfully, Bogue and Roll (1974) have indeed derived a method for discounting an uncertain income stream consistent with rational risk-averse investor behavior. In the spirit of dynamic programming, they apply the single-period mean–variance model to the last period, determining the date T−1 present value of income received at date T. Knowing the determinants of this uncertain present value and the additional uncertain income received at date T−1, the single-period model is again applied, determining the date T−2 present value, and so forth. By this means, the date zero present value of the income stream is determined. Unfortunately, the resulting formula, even for convenient special cases, is far from simple. The comfortable elegance of the formula for discounting a certain income stream does not seem to carry over to uncertainty. This is all the more discouraging considering the obvious importance of valuing uncertain income streams for research in financial markets. The primary purpose of this paper is to develop the needed valuation formula that satisfies the tests of uncertain income received over time, consistency with rational risk-averse investor behavior and equilibrium in financial markets, as well as simplicity and statement in terms of empirically observable inputs. Section 41.2 presents a general approach for valuation of uncertain income streams, but the formula contains inputs that are not easily observable.
* This paper is reprinted from Bell Journal of Economics, 7 (Autumn 1976), No. 2, pp. 407–425.
Section 41.3 remedies this defect for the special case of constant proportional risk aversion (CPRA). With a simple valuation formula in hand, in Sect. 41.4 it is applied to the multiperiod problem of the valuation of an option in terms of its associated stock. To my surprise, the resulting option pricing formula is identical to the Black and Scholes (1973) formula, even though only costless discrete-time trading opportunities are available (so that investors cannot create a perfect hedge), investors are risk-averse, and investors must simultaneously choose among a large number of securities.
41.2 Uncertain Income Streams: General Case

Assume there exist states of nature such that, given the revelation of the state at any date, the cash flow (i.e., dividend) received by the owner of any security is known with certainty. Let $t = 0, 1, 2, \ldots$ denote dates, $s(t)$ (a random variable) denote the state at date $t$, $X[s(t)]$ the dividend received from a security in state $s(t)$, and $P[s(t)]$ the price of the security in state $s(t)$. The requirement that securities with identical returns should have the same value is probably the most basic condition for equilibrium in financial markets. In particular, assume

1. Single-price law of markets: If two securities, or more generally two portfolios of securities,[1] yield the same dividends for every future state, then their current prices are the same.

Since there then can be no more linearly independent securities than states, there must exist a set of random variables $\{Z[s(t)]\}$, the same for all securities, such that for any security

$$P_0 = \sum_{t=1}^{\infty} \sum_{s(t)} Z[s(t)]\, X[s(t)].$$

Of course, the random variables will not be unique unless the number of linearly independent securities equals the number of states. A normalized riskless security for any date $t$ has a certain dividend $X[s(t)] = 1$ at date $t$ and a certain dividend of zero at any other date. Therefore, denoting $R_{Ft}^{-1}$ as the current price of a date $t$ normalized riskless security, then $R_{Ft}^{-1} = \sum_{s(t)} Z[s(t)]$. If, additionally, we assume

2. Nonsatiation: Ceteris paribus, the larger its dividends for any state, the greater the current price of a security,

then $Z[s(t)] > 0$ for all states.[2] These two assumptions motivate the following theorem:

Theorem 1. Given assumptions 1 and 2, there exists a positive random variable $\tilde Y[s(t)]$, the same for all securities, such that for any security[3]

$$P_0 = \sum_{t=1}^{\infty} \frac{E(X_t) + \mathrm{Cov}(X_t, Y_t)/E(Y_t)}{R_{Ft}} \qquad (41.1)$$

Proof. Let $\pi[s(t)]$ be a probability,[4] assessed at date $t = 0$, that state $s(t)$ will occur. Define the random variable $Z'[s(t)] \equiv Z[s(t)]/\pi[s(t)]$; then $P_0 = \sum_t E(X_t Z'_t)$. Second, define the random variable $Y[s(t)] \equiv E(Y_t)\, R_{Ft}\, Z'[s(t)]$; then $P_0 = \sum_t E(X_t Y_t)/[R_{Ft} E(Y_t)]$. The result follows from the definition of covariance. Q.E.D.

The valuation formula implies that the current price of a security (or portfolio of securities) equals the sum of the discounted certainty equivalents of its dividends, discounted at the riskless rate. The risk adjustment factor applied to the mean dividend at each date is its "coefficient of covariation" with a random variable common to all securities. Not only will this random variable $\tilde Y_t$ not be unique if the number of states exceeds the number of different securities, but even if the number of states and different securities are equal, $\tilde Y_t$ will be unique only up to a positive multiplicative constant.[5] However, until more is said about $\tilde Y_t$ other than its sign, the theory has little empirical content. While we shall shortly provide a simple characterization of $\tilde Y_t$, first we develop the natural uncertainty analogue to the Williams (1938) and Gordon and Shapiro (1956) perpetual dividend growth model. Let one plus the rate of growth of $\tilde X_t$ be denoted by $\tilde g_t$ and one plus the rate of growth of $\tilde Y_t$ be denoted by $\tilde y_t$, so that $\tilde X_t = X_0(\tilde g_1 \tilde g_2 \cdots \tilde g_t)$ and $\tilde Y_t = Y_0(\tilde y_1 \tilde y_2 \cdots \tilde y_t)$. Letting $R_{Ft} = r_{F1} r_{F2} \cdots r_{Ft}$, Equation (41.1) becomes

$$P_0 = X_0 \left\{ \sum_{t=1}^{\infty} \frac{E[g_1 g_2 \cdots g_t\, y_1 y_2 \cdots y_t]\,/\,E[y_1 y_2 \cdots y_t]}{r_{F1} r_{F2} \cdots r_{Ft}} \right\}.$$

[1] Extension of the single-price law of markets to portfolios of securities (i.e., convex combinations of dividends) is inconsistent with transaction costs that vary with the scale of investment in any security. However, it insures that any pricing formula for arbitrary securities will possess the "portfolio property" that it applies to arbitrary portfolios as well.
[2] If it were also assumed that a security existed insuring a non-negative rate of return to which all investors had access (i.e., cash), then $0 < Z[s(t)] < 1$ and indeed $\sum_{s(t)} Z[s(t)] \le 1$ for all dates. Moreover, $R_{F1} \le R_{F2} \le R_{F3} \le \cdots$.
[3] $E(\cdot)$ denotes expectation, $\mathrm{Cov}(\cdot)$ covariance, $\mathrm{Var}(\cdot)$ variance, and $\mathrm{Std}(\cdot)$ standard deviation.
[4] These probabilities need not be held homogeneously by all investors, or indeed by any investor. They may be viewed as a purely mathematical construct. However, every state must be interpreted as possible, so that $\pi[s(t)] > 0$.
[5] Beja (1971) has developed similar results to Equation (41.1).
Analogous to the stationary riskless rate and stationary dividend growth rate assumptions adopted by Williams and Gordon under certainty, we assume that $r_{Ft} \equiv r_F$ is stationary, that $\tilde g_t \equiv \tilde g$ follows a stationary random walk with a stationary correlation with $\tilde y_t$, and that $\tilde y_t \equiv \tilde y$ follows a stationary random walk.[6]

Theorem 2. Given assumptions 1 and 2, if the riskless rate is stationary and the rate of growth of a security's dividend stream follows a stationary random walk with stationary correlation with the rate of growth of $\tilde Y_t$, which also follows a stationary random walk, then

$$P_0 = X_0\, \frac{\mu_g + (\sigma_{gy}/\mu_y)}{r_F - (\mu_g + (\sigma_{gy}/\mu_y))}.$$

Proof. From the premise, $\tilde g_t$ is serially uncorrelated, $\tilde y_t$ is serially uncorrelated, and lagged values of $\tilde g_t$ and $\tilde y_t$ are uncorrelated. Therefore, $E[g_1 g_2 \cdots g_t\, y_1 y_2 \cdots y_t] = E(g_1 y_1) E(g_2 y_2) \cdots E(g_t y_t)$; moreover, $E[y_1 y_2 \cdots y_t] = E(y_1) E(y_2) \cdots E(y_t)$. Applying the stationarity assumptions,

$$P_0 = X_0 \sum_{t=1}^{\infty} \left( \frac{\mu_{gy}/\mu_y}{r_F} \right)^{t}.$$

The result then follows from the formula for the sum of an infinite geometric series and the definition of covariance. Q.E.D.

Observe that, except for the stationarity of the stochastic process of the dividend stream, the formula is quite general, relying primarily on the single-price law of markets. If a security has no "nondiversifiable risk," so that $\sigma_{gy} = 0$, then the formula simplifies to $P_0 = X_0\, \mu_g/(r_F - \mu_g)$. However, again, since $\tilde y$ is not well-specified, the formula appears empty of empirical content. But this is not quite the case. Let the expression in brackets be denoted $\lambda$, so that $P_0 \equiv X_0 \lambda$. At date $t = 1$, $\tilde P_1 = \tilde X_1 \lambda$. Moreover, since $\tilde X_1 = X_0 \tilde g_1$, the one plus rate of return of the security $(\tilde X_1 + \tilde P_1)/P_0 = \tilde g_1 [(1+\lambda)/\lambda]$ follows a stationary random walk. In short, we have derived important properties of the stochastic process of the rate of return of a security from the stochastic process of its dividend stream.[7]

[6] By definition, $\mu_g \equiv E(g)$, $\mu_y \equiv E(y)$, $\sigma_{gy} \equiv \mathrm{Cov}(g, y)$, $\mu_{gy} \equiv E(gy)$.
[7] From much of the literature on "efficient markets," it may have been expected that security rates of return should follow a random walk irrespective of the stochastic process of dividends. For example, see Granger and Morgenstern (1970, p. 26). Clearly, this is not true for default-free pure-discount bonds. Moreover, I have argued elsewhere (1974b) that in an important case, the rate of return of the market portfolio will be serially independent if and only if the rate of growth of aggregate consumption (i.e., the social dividend) is serially independent. This result is generalized in Section 41.3.
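To make Theorem 2 concrete, the following minimal Python sketch (not part of the original paper; all parameter values are hypothetical) evaluates the closed-form growth formula and checks it against a truncated version of the infinite sum in the proof:

```python
# Sketch of Theorem 2 with hypothetical parameters.
# mu_g, mu_y: means of one plus the growth rates of dividends (g) and of Y (y);
# sigma_gy: Cov(g, y); r_F: one plus the stationary riskless rate.
mu_g, mu_y, sigma_gy, r_F = 1.03, 0.97, -0.002, 1.06
X0 = 2.00  # current dividend

a = mu_g + sigma_gy / mu_y           # E(g*y)/E(y) = mu_g + Cov(g, y)/E(y)
P0 = X0 * a / (r_F - a)              # closed form from Theorem 2

# Check against a truncated version of the geometric series in the proof.
P0_series = X0 * sum((a / r_F) ** t for t in range(1, 2000))
print(round(P0, 6), round(P0_series, 6))  # the two values agree

# With sigma_gy = 0 (no "nondiversifiable risk") the formula reduces to
# the certainty Williams-Gordon form X0 * mu_g / (r_F - mu_g).
```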
Corollary 1. Under the conditions of Theorem 2, the rate of return of a security follows a stationary random walk.[8]

Define $\tilde X_1/(\tilde X_1 + \tilde P_1 - P_0)$ as the date $t = 1$ "payout ratio" from market-determined earnings. Since this equals $[1 + \lambda(1 - \tilde g_1^{-1})]^{-1}$, the payout ratio also follows a stationary random walk, varying inversely with the growth rate of dividends. That is, although the absolute level of dividends increases, it must be accompanied by a lower payout to assure continuation of the stationary dividend process. This harmonizes with Lintner's (1956) empirical observation that in order to maintain a target payout ratio (i.e., $[1 + \lambda(1 - \mu_g^{-1})]^{-1}$), firms will under (over) shoot in times of unusually high (low) earnings.

Corollary 2. Under the conditions of Theorem 2, the payout ratio varies inversely with the growth rate of dividends.

Returning now to general dividend processes, we shall identify the positive random variable $\tilde Y_t$ by enriching the economic environment. Specifically, we shall assume

3. Perfect, competitive and Pareto-efficient financial markets: There are no scale economies in the purchase of any security, every participant acts as if he cannot influence the price of any security, all investors can purchase the same security at the same date for the same price, and there exists a sufficient diversity of securities such that no investor desires the costless creation of a new security.[9]

4. Rational time-additive tastes: Every investor acts as if he maximizes his expected utility over his lifetime dollar value of consumption, which is concave and additive in consumption at each date.

5. Weak aggregation[10]: There exists an average investor such that
(a) homogeneity: every homogeneous economic characteristic also applies to the average investor

[8] Samuelson (1973) has derived a more general proposition that, unlike the simple case here, does not require stationary discount rates over time. However, like the present case, it does require that discount rates be nonstochastic functions of time; that is, future interest rates are assumed known in advance. I have shown elsewhere (1974a) that this may lead to unrealistic implications.
[9] This last condition would be satisfied if the financial market were complete, or if, as in the familiar mean–variance model, a riskless security exists and all investors divide their wealth after consumption between it and the market portfolio. The condition is required for Pareto-efficient exchange arrangements.
[10] If all investors were identical, this assumption is trivially satisfied. I have developed elsewhere (1974a, 1974b) more general sets of conditions. Perhaps the most appealing is the case where investors are heterogeneous with respect to the scale and composition of endowed resources, lifetime, time-preference, beliefs, and whether proportional risk aversion is increasing, constant, or decreasing. Additive generalized logarithmic utility is the only homogeneity requirement.
(b) commensurability: if an economic characteristic is denominated in units of wealth, then this characteristic for the average investor is an unweighted arithmetic average over the corresponding characteristic of every investor
(c) consensus: prices are determined as if every investor were average.

Under these assumptions, we can meaningfully define an average investor who solves

$$\max_{\{C[s(t)]\}} \sum_t \sum_{s(t)} \pi[s(t)]\, U_t\{C[s(t)]\} \quad \text{subject to} \quad W_0 = \sum_t \sum_{s(t)} Z[s(t)]\, C[s(t)],$$

where $U_t$ denotes his date $t$ utility over consumption $C[s(t)]$ and $W_0 \equiv \sum_t \sum_{s(t)} Z[s(t)]\, C[s(t)]$ is his initial wealth, defined in terms of endowed claims to consumption $C[s(t)]$. With these assumptions, we can identify $\tilde Y_t$.

Theorem 3. Given assumptions 1–5, $\tilde Y_t = U_t'(\tilde C_t)$ for all dates and states.

Since this is a classic and easily obtainable result from the first-order conditions to the programming problem, no proof will be given. The economic content of this theorem is that (1) the randomness of $\tilde Y_t$ is solely determined by per capita consumption $\tilde C_t$, and (2) $\tilde Y_t$ is a decreasing function of $\tilde C_t$ (since $U_t$ is concave). From Equation (41.1), this implies that securities tend to be more valuable if they have high dividends in dates and states with relatively low per capita consumption. Time-additive tastes prevent $\tilde Y_t$ from depending as well on past per capita consumption, and in general, weak aggregation is required to prevent $\tilde Y_t$ from depending on the distribution of consumption across investors. Since we are aiming at simple formulas, these two assumptions are useful. However, in one well-known case we can obtain even greater simplicity without requiring weak aggregation, but at the price of an alternative assumption: the joint distribution of $\tilde X_t$ and $\tilde C_t$ is bivariate normal for each date $t$ and all investors have homogeneous beliefs. The Appendix shows that, subject to a mild regularity condition, if $x$ and $y$ are bivariate normal and $g(y)$ is any at least once differentiable function of $y$, then $\mathrm{Cov}(x, g(y)) = E[g'(y)]\, \mathrm{Cov}(x, y)$. Applying this to Equation (41.1), individual $i$ would choose his lifetime consumption $C_0^i, C_1^i, \ldots, C_t^i, \ldots$ such that for every security

$$P_0 = \sum_{t=1}^{\infty} \frac{E(X_t) - \theta_t^i\, \mathrm{Cov}(X_t, C_t^i)}{R_{Ft}},$$

where $\theta_t^i \equiv -E\!\left[U_t^{i\,\prime\prime}(\tilde C_t^i)\right] \big/ E\!\left[U_t^{i\,\prime}(\tilde C_t^i)\right]$. Since financial markets are Pareto-efficient, prices are set as if there were complete markets. For every date $t$, we can therefore "manufacture" a risky security that pays off only at that date, so that $P_0 R_{Ft} = E(X_t) - \theta_t^i\, \mathrm{Cov}(X_t, C_t^i)$. Summing this over all investors $i$ and dividing by the number of investors $I$ in the economy produces a result similar to that found in single-period models, $P_0 R_{Ft} = E(X_t) - \theta_t\, \mathrm{Cov}(X_t, C_t)$, where $\tilde C_t \equiv \sum_i \tilde C_t^i / I$ and $\theta_t \equiv \left[\sum_i (\theta_t^i)^{-1}/I\right]^{-1}$. Applying this for each date,

$$P_0 = \sum_{t=1}^{\infty} \frac{E(X_t) - \theta_t\, \mathrm{Cov}(X_t, C_t)}{R_{Ft}}.$$

Again, since $\theta_t > 0$, securities that, ceteris paribus, have high dividends in dates and states with relatively high per capita consumption are penalized.
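As a purely illustrative numerical sketch of this consumption-based pricing rule (all numbers are hypothetical, not from the paper), the Python fragment below discounts a two-date dividend stream using the last formula above:

```python
# Hypothetical two-date example of P0 = sum_t [E(X_t) - theta_t*Cov(X_t, C_t)] / R_Ft.
# E_X[t], cov_XC[t]: mean dividend and its covariance with per capita consumption;
# theta[t]: average risk-aversion parameter; R_F[t]: compound riskless return to date t.
E_X    = {1: 5.00, 2: 5.25}
cov_XC = {1: 0.40, 2: 0.55}   # positive: high dividends when consumption is high
theta  = {1: 0.8,  2: 0.8}
R_F    = {1: 1.05, 2: 1.05**2}

P0 = sum((E_X[t] - theta[t] * cov_XC[t]) / R_F[t] for t in (1, 2))
print(round(P0, 4))

# Setting cov_XC to zero recovers riskless discounting of the mean dividends,
# and a negative covariance (high dividends when consumption is low) raises P0.
```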
41.3 Uncertain Income Streams: Special Case

While our description of the economic environment has identified $\tilde Y_t$, the resulting valuation formula is not easily empirically useful. With weak aggregation, we need to determine the average utility function, and even with the apparently simplified result from homogeneous joint normality, we require measurement of the stochastic process of per capita consumption. Moreover, to obtain the simple relationships of the familiar single-period model, we need to characterize $U_t'(\tilde C_t)$ as a function of the market rates of return $\tilde r_{Mt} \equiv \tilde W_t/(\tilde W_{t-1} - \tilde C_{t-1})$, where $\tilde W_t$ is per capita wealth at date $t$. $\tilde C_t$ has the natural interpretation of a social dividend on the market portfolio. In other words, we need to inquire under what circumstances there exists a nonrandom function $g_t$ such that $g_t(\tilde r_{M1}, \tilde r_{M2}, \ldots, \tilde r_{Mt}) = U_t'(\tilde C_t)$. In particular, $g_t(\cdot)$ cannot be separately dependent on per capita wealth $\tilde W_\tau$ for any date $0 < \tau \le t$, for then $g_t(\cdot)$ would be a random function. It is known from portfolio theory that optimal portfolio composition and consumption are independent of wealth if and only if utility is constant proportional risk-averse (CPRA); that is, if and only if

$$U_t(\tilde C_t) = \rho_t\, \frac{1}{1-b}\, \tilde C_t^{\,1-b},$$

where $b > 0$ and $\rho_t > 0$.[11] $\rho_t$ is a measure of time-preference, and since $b = -\tilde C_t U_t''(\tilde C_t)/U_t'(\tilde C_t)$, $b$ is the measure of CPRA. Since only in this case can $U_t'(\tilde C_t)$ be expressed in terms of $\tilde r_{M1}, \tilde r_{M2}, \ldots, \tilde r_{Mt}$ and otherwise independent of $\tilde W_\tau$ for $0 < \tau \le t$, CPRA acquires empirical significance.

[11] In the limiting case of $b = 1$, $U_t(\tilde C_t) = \rho_t \log \tilde C_t$.
Theorem 4. Given assumptions 1–5, if average utility is CPRA and if either $b = 1$ or the rate of growth of per capita consumption follows a (possibly nonstationary) random walk, then

$$\tilde Y_t = \tilde R_{Mt}^{-b}$$

for all dates and states, where $\tilde R_{Mt} \equiv \tilde r_{M1} \tilde r_{M2} \cdots \tilde r_{Mt}$.

Proof. Hakansson (1971) and Rubinstein (1974b) have shown that for an investor with CPRA, his average propensity to consume wealth at any date is independent of his wealth. Indeed, this independence holds only under CPRA. In particular, in this case there exists a random variable $\tilde k_t$, independent of wealth at any date, such that $\tilde C_t = \tilde k_t \tilde W_t$. Moreover, if $b = 1$ (logarithmic utility), then while $k_t$ depends on the date (through date-dependent time-preference $\rho_t$ and the time remaining until his death), it is nonstochastic. When $b \ne 1$, then $k_t$ will also be nonstochastic if the rate of growth of per capita consumption follows a (possibly nonstationary) random walk.[12] The assumptions of the theorem therefore imply that $k_t$ is a nonstochastic function of time. Since $\tilde C_t = k_t \tilde W_t$ and $\tilde r_{Mt} \equiv \tilde W_t/(\tilde W_{t-1} - \tilde C_{t-1})$, then $\tilde C_t = (\tilde W_{t-1} - \tilde C_{t-1})\, k_t\, \tilde r_{Mt}$. Similarly, $\tilde C_t = (\tilde W_{t-2} - \tilde C_{t-2})(1 - k_{t-1})\, k_t\, \tilde r_{M,t-1} \tilde r_{Mt}$. Continuing to work backwards,

$$\tilde C_t = [W_0 (1-k_0)(1-k_1) \cdots (1-k_{t-1})]\, k_t\, \tilde R_{Mt}.$$

Substituting this into Equation (41.1), noting that $[\cdot]$ is nonstochastic and $U_t'(\tilde C_t) = \rho_t \tilde C_t^{-b}$, yields the conclusion of the theorem. Q.E.D.

In brief, given the conditions of the theorem, for any security (or portfolio)

$$P_0 = \sum_{t=1}^{\infty} \frac{E(X_t) + \theta_t\, \rho\!\left(X_t, R_{Mt}^{-b}\right) \mathrm{Std}(X_t)}{R_{Ft}} \qquad (41.2)$$

where $\theta_t \equiv \mathrm{Std}\!\left(R_{Mt}^{-b}\right)/E\!\left(R_{Mt}^{-b}\right)$ and $\rho(\cdot)$ is a correlation coefficient.

Although no stochastic restrictions have been placed on the rates of return of securities, the premise that the rate of growth of per capita consumption follows a random walk is sufficiently strong to imply important stochastic properties for "basic" portfolios.

Corollary 3. Under the conditions of Theorem 4, if $b \ne 1$:

1. market portfolio: the rate of return of the market portfolio follows a (possibly nonstationary) random walk;
2. default-free bonds: the term structure of interest rates is unbiased in the sense that at each date the next period expected rates of return of default-free pure discount bonds of all maturities are the same.

Moreover, if $b = 1$, then conclusion (1) holds if and only if the rate of growth of per capita consumption follows a (possibly nonstationary) random walk; and conclusion (2) holds if and only if the inverse of one plus the rate of growth of per capita consumption is serially uncorrelated.

Proof. To prove assertion (1), recall that $\tilde C_t = k_t \tilde W_t$, $\tilde r_{Mt} \equiv \tilde W_t/(\tilde W_{t-1} - \tilde C_{t-1})$, and one plus the rate of growth of per capita consumption $\tilde r_{Ct} \equiv \tilde C_t/\tilde C_{t-1}$ for all $t$. Therefore, $\tilde r_{Mt} = \tilde r_{Ct}\,[k_{t-1}/k_t(1-k_{t-1})]$, so that $\rho(r_{Mt}, r_{Ct}) = 1$. When $b = 1$, we do not require the premise that $\tilde r_{Ct}$ follow a random walk to keep $k_t$ and $k_{t-1}$ nonstochastic. To prove assertion (2), let ${}_{t-1}R_{F,t+1}^{-1}$ denote the price at $t-1$ of one dollar for sure at date $t+1$. From the first-order conditions of the programming problem of Sect. 41.2, it follows that

$$r_{Ft}^{-1} = \rho_t\, E_{t-1}\!\left(r_{Ct}^{-b}\right), \qquad r_{F,t+1}^{*\,-1} = \rho_{t+1}\, E_t\!\left(r_{C,t+1}^{-b}\right)$$

and

$${}_{t-1}R_{F,t+1}^{-1} = \rho_t \rho_{t+1}\, E_{t-1}\!\left(r_{Ct}^{-b}\, r_{C,t+1}^{-b}\right)$$

if utility is CPRA, where $*$ indicates random variables from the perspective of information available at date $t-1$. Additionally, if ${}_{t-1}\tilde r_{Ft}$ denotes the one plus rate of return on the two-period bond from date $t-1$ to date $t$, then in conformity with the single-price law of markets

$${}_{t-1}R_{F,t+1} = {}_{t-1}\tilde r_{Ft}\; \tilde r_{F,t+1}^{*}.$$

From this it follows that

$$E_{t-1}({}_{t-1}r_{Ft}) = E_{t-1}\!\left(r_{C,t+1}^{-b}\right) \left[\rho_t\, E_{t-1}\!\left(r_{Ct}^{-b}\, r_{C,t+1}^{-b}\right)\right]^{-1}.$$

When $b \ne 1$ and, by premise, $\tilde r_{Ct}$ and $\tilde r_{C,t+1}$ follow a random walk, this simplifies to $E_{t-1}({}_{t-1}r_{Ft}) = \left[\rho_t\, E_{t-1}\!\left(r_{Ct}^{-b}\right)\right]^{-1}$, so that $r_{Ft} = E_{t-1}({}_{t-1}r_{Ft})$. When $b = 1$, this same conclusion is reached if and only if $r_{Ct}^{-1}$ and $r_{C,t+1}^{-1}$ are serially uncorrelated. This proof is easily extended to default-free bonds of all maturities. Q.E.D.

[12] Specifically, it follows from Rubinstein (1974b) that

$$k_t^{-1} = 1 + \rho_{t+1}^{B}\left[E_t\!\left(\tilde r_{C,t+1}^{\,1-b}\right)\right]^{B} + \rho_{t+2}^{B}\left[E_t\!\left(\tilde r_{C,t+1}^{\,1-b}\, \tilde r_{C,t+2}^{\,1-b}\right)\right]^{B} + \cdots,$$

where $B \equiv b^{-1}$, $\tilde r_{Ct} \equiv \tilde C_t/\tilde C_{t-1}$, and expectations are assessed with respect to information available at date $t$. When the rate of growth of per capita consumption follows a random walk, then $k_t$ is nonstochastic, since

$$k_t^{-1} = 1 + \rho_{t+1}^{B}\left[E_t\!\left(\tilde r_{C,t+1}^{\,1-b}\right)\right]^{B} + \rho_{t+2}^{B}\left[E_t\!\left(\tilde r_{C,t+1}^{\,1-b}\right) E_t\!\left(\tilde r_{C,t+2}^{\,1-b}\right)\right]^{B} + \cdots.$$

Moreover, it is easy to see when $b = 1$ that $k_t$ is nonrandom even if the rate of growth of per capita consumption does not follow a random walk.
Whether or not (1) the market portfolio follows a random walk or (2) the term structure is unbiased depends critically on the stochastic process of per capita consumption over time. The underlying real intertemporal stochastic process of per capita consumption is mirrored in the equilibrium financial process governing the prices of "basic" portfolios. Moreover, when $\tilde r_{Ct}$ follows a random walk or $b = 1$, the real contemporaneous stochastic process of per capita consumption is also mirrored in the market portfolio. From the above proof, $\tilde r_{Mt} = \tilde r_{Ct}\,[k_{t-1}/k_t(1-k_{t-1})]$, so that $\tilde r_{Ct}$ is normal (lognormal) if and only if $\tilde r_{Mt}$ is normal (lognormal). If the rate of growth of per capita consumption follows a stationary random walk and average time-preference is stationary, then Equation (41.2) can be further simplified.

Corollary 4. Under the conditions of Theorem 4, if the rate of growth of per capita consumption follows a stationary random walk and average time-preference is stationary over an infinite lifetime, then

$$P_0 = \sum_{t=1}^{\infty} \frac{E(X_t) + \theta_t\, \rho\!\left(X_t, R_{Mt}^{-b}\right) \mathrm{Std}(X_t)}{r_F^{\,t}},$$

where $\theta_t = \sqrt{(1+\theta^2)^t - 1}$ and $\theta \equiv \mathrm{Std}\!\left(r_M^{-b}\right)/E\!\left(r_M^{-b}\right)$.

Proof. From the proof of the previous corollary, $\tilde r_{Mt} = \tilde r_{Ct}\,[k_{t-1}/k_t(1-k_{t-1})]$. From the expression for $k_t$ in Footnote 12, under the premise of the above corollary, since $\tilde r_{Ct} \equiv \tilde r_C$ and $\rho_t \equiv \rho$, then $k_t \equiv k$ is stationary. Therefore, $\tilde r_{Mt} \equiv \tilde r_M$ is likewise stationary. In general, if any random variable $\tilde x_t$ follows a stationary random walk (or is merely stationary and serially uncorrelated), then

$$\mathrm{Var}(x_1 x_2 \cdots x_t) = \left(\sigma_x^2 + \mu_x^2\right)^t - \mu_x^{2t}.$$

And, of course, $E(x_1 x_2 \cdots x_t) = \mu_x^t$. From these properties, it follows that $\theta_t = \sqrt{(1+\theta^2)^t - 1}$. Similarly, from the proof of the previous corollary, since $\tilde r_{Ft}^{-1} = \rho_t\, E_{t-1}\!\left(\tilde r_{Ct}^{-b}\right)$, then $\tilde r_{Ft} = r_{Ft} \equiv r_F$ is a stationary constant. Q.E.D.
r
g2 C 2g
t
t 2g ;
which clearly increases with t. However, since rFt increases with t (assuming rF > 1), the net effect on present value of the influence of time to receipt of income on risk is indefinite. eb et as R Although Theorem 4 identifies Y Mt , unless b, the value of CPRA, were known in advance, empirical application of the valuation formula Equation (41.2) would prove difficult. Even the implied single-period formula Pt 1 D
b E.X t / O t C ov.X t ; rM t /Std.X t / rF t
(41.3)
b b et C P e t and O t Std rMt =E rMt is where X t X difficult to apply. Standard linear regression techniques cannot be used to estimate b, since the formula is nonlinear in b. Nonetheless, there are three interesting methods to overcome this difficulty. As Blume and Friend (1975) and Cohen et al. (1975) have recently attempted, it may be possible to measure b from empirical surveys of consumer wealth allocation. Second, and possibly more promising, is to infer b by trial and error from the behavior of the rate of return of the market portfolio. Since Equation (41.3) holds for the market 1b portfolio, b dividing by Pt 1 and rearranging rFt D E rMt = E rMt . Partially differentiating the righthand side by b, the Schwartz b inequality can be used to show 1b = E rMt =@b > 0. Therefore, a trial-andthat @ E rMt error procedure will quickly converge to a good estimate of b that approximately satisfies the above equality. Third, is to infer b from the rate of return of the market portfolio when it is assumed to conform to a given stochastic process. Corollary 5. Under the conditions of Theorem 4, if the single-period one plus rate of return of the market portfolio is lognormal, then b D
1 E.log rM t / log rF t C : Var.log rM t / 2
Proof. Since Equation (41.3) holds for the market portfolio, b by Pt 1 and rearranging, rFt D 1b dividing = E rMt . From the Appendix, ife r Mt is lognormal, E rMt 1b 1b D and e r b are lognormal. Therefore, E rMt thene r Mt Mt exp .1 b/E.log rMt / C 12 .1 b/2 Var.log rMt / and b D exp bE.log rMt / C 12 b 2 Var.log rMt / . The E rMt result follows by dividing these two expressions, setting the quotient equal to rFt and taking logarithms. Q.E.D. Equation (41.2) with b estimated from the corollary, requires only that e r Mt be lognormal and (unless b D 1) follow a (possibly nonstationary) random walk; no distributional restrictions have been placed on the rates of return of other securities. This is fortunate since the rates of return of many securities such as options and default-free bonds near maturity, as well as the rates of return of capital budgeting
41 The Valuation of Uncertain Income Streams and the Pricing of Options
projects, are neither lognormal nor follow a random walk. However, for securities with one plus rates of returne r t jointly lognormal with e r Mt , a similar argument to the corollary can be used to show
611
be the respective values of the option and its associated stock e t D 0 if e et D e S t < K and Q St K at expiration. Since15 Q e if S t K, then E.Qt / D EŒSt KjSQt K :
logŒE.rt / D log rF t C b C ov.log rt ; log rM t /: Defining Although this formula has been derived in discrete time, it is consistent with Merton’s (1973b) continuous-time model as interpreted over discrete intervals by Jensen (1972, p. 386). CPRA has substituted for continuous trading opportunities to achieve the same end.13
The similarity between the valuation results derived in discrete and continuous time suggests that other security pricing relationships may also be immune from this distinction. In particular, it is known that the Black and Scholes (1973) option pricing equation for dividend-protected options, p Q D SN.z C t / KrF t N.z /
z
E.Qt / D SEŒR .K=S /jRQ .K=S / :
Defining Q rQ log R;
E.Qt / D SEŒe r .K=S /jQr log.K=S / :
e is lognormal and e r is normal, Since e S t is lognormal, R therefore
41.4 Options
where
RQ SQt =S;
(41.4)
Z E.Qt / D S
where f ./ is the normal density function. Breaking this apart into the difference between two integrals, using Equations (41.1) and (41.2) of the Appendix to evaluate them, and noting the transformation of the mean of a lognormal variable, we obtain
log.S=K/ C .log rF 12 2 /t p ; t
holds either with continuous trading opportunities or risk neutrality.14 Clearing up the notation: Q D current price of option (i.e., call) S D current price of associated stock K D strike price t D time to expiration rF D one plus the interest rate 2 D the variance of the logarithm of 1 plus the rate of return of the associated stock N./ D the standard normal cumulative density function Although the proofs are available elsewhere, to ease understanding of the subsequent discrete-time analysis under risk aversion, it will be useful to sketch a proof here. Following Sprenkle (1961), we first derive a simplified expression for the expected future value of the option at expiration and e t and e St then discount this value back to the present. Let Q
K f .r/dr; er S log.K=S / 1
E.Qt / D SR N.z C r / KN.z /; where
log.S=K/ C r : r Since R D exp r C 12 r2 , then r D log R 12 r2 . Finally discounting E.Qt / back to the present by the expected compound rate of return Q of the option through expiration, we obtain z
Q D ŒSR N.z C r / KN.z / =Q where z
log.S=K/ C log R 12 r2 : r
The usefulness of this result is hampered primarily by measurement of Q and secondarily by measurement of R . However, assuming risk neutrality, the expected rates of return of the stock and option are equal, and moreover, equal to the riskless rate so that RF D Q D R . If, additionally, the
13
See Kraus and Litzenberger (1976) for similar remarks. Merton (1973a) mentions that Samuelson and Merton (1969) have uncovered yet a third set of assumptions for Equation (41.4): (a) the average investor has CRPA, (b) only three securities exist – a defaultfree bond, the option, and its associated stock, and (c) the net supply of both the options and the bonds is zero. 14
15
Merton (1973a) has shown that rational nonsatiated investors will not exercise a call option prior to expiration if it is properly payoutprotected and if the strike price is fixed. He also proves this will not generally be true for puts.
612
M. Rubinstein
riskless rate is constant over time so that rF D .R /1=t D .Q /1=t , and the rate of return of the associated stock follows a stationary random walk so that r2 D t 2 , then the Black–Scholes equation follows exactly. Recently, Black and Scholes have stimulated academic interest in options by showing that the valuation formula derived under risk neutrality holds under seemingly much more general conditions. They argue that if the associated stock price follows a continuous stochastic process in time,16 and if investors can costlessly continuously revise their portfolios, then investors can create a perfectly hedged (i.e., riskless) portfolio by selling a call against a long position in the associated stock. Cox and Ross (1975) have argued that since, in equilibrium, this portfolio must earn the riskless rate, then given the current price of the stock, the interest rate, and agreement about the relevant statistics characterizing the rate of return of the stock, the current price of the call will be independent of the characteristics of other securities and the preferences of investors. Therefore, the same current price of the option will be derived irrespective of investor preferences. To simplify matters, we are then free to assume risk neutrality without loss of generality. But, by the previous analysis, the Black–Scholes equation again holds. On the surface, the existence of continuous trading opportunities to insure the riskless hedge seems critical to the Black–Scholes analysis. How else (except in the trivial risk neutrality case) could the pricing formula be independent of investor preference datum and opportunities for diversification? Although Merton (1975, p. 2) claims that the continuous trading formula appears in the limit as the trading interval shrinks to zero, it remains unclear what kind of an approximation is involved even for small finite trading intervals. The following theorem proves that the Black–Scholes pricing formula is robust within lognormality conditions, even with only discrete trading opportunities and risk aversion. Theorem 5. Given assumptions 1 and 2, if St and Yt are jointly lognormal, investors agree on ¢, and the associated stock pays no dividends through expiration of the option, then Q D SN.z C / KRF1 N.z /
(41.5)
where
1 log.S=K/ C log RF 2 and RF is the riskless discount rate through the expiration of the option and is the standard deviation of the logarithm of the compound one plus rate of return of the associated stock through the expiration of the option (i.e., Std.log R/). z
16
See Cox and Ross (1975) and Merton (1975) for analyses of option pricing when the associated stock price follows a jump process.
Proof. Recalling the proof to Theorem 1, we can define the random variable $\tilde Z'_t$ such that $\tilde Z'_t \equiv \tilde Y_t R_{Ft}^{-1}/E(\tilde Y_t)$. Since the stock is assumed to pay no dividends, $S = E(\tilde S_t Z'_t)$. Moreover, $Q = E[(S_t - K) Z'_t \mid \tilde S_t \ge K]$ and $R_F^{-1} = E(Z'_t)$. By defining $\tilde R \equiv \tilde S_t/S$, these relationships become

$$E(R Z'_t) = 1, \qquad R_{Ft}^{-1} = E(Z'_t)$$

and

$$Q = S\, E[(R - (K/S))\, Z'_t \mid \tilde R > (K/S)].$$

Defining $\tilde r \equiv \log \tilde R$ and $\tilde y \equiv \log \tilde Z'_t$, $Q = S\, E[(e^r - (K/S))\, e^y \mid \tilde r \ge \log(K/S)]$. Since $\tilde S_t$ and $\tilde Z'_t$ are jointly lognormal, $\tilde r$ and $\tilde y$ are jointly normal; therefore

$$Q = S \int_{-\infty}^{\infty} \int_{\log(K/S)}^{\infty} \left[ e^r - \frac{K}{S} \right] e^y f(r, y)\, dr\, dy,$$

where $f(\cdot)$ is the bivariate normal density function. Breaking this apart into the difference between two integrals, using Equations (41A.3) and (41A.4) of the Appendix to evaluate them, and noting the transformations of the mean of a lognormal variable and the mean of the product of two lognormal variables, we obtain

$$Q = S\, E(R Z'_t)\, N(z + \sigma_r) - K\, E(Z'_t)\, N(z),$$

where

$$z \equiv \frac{\log(S/K) + \mu_r + \rho\sigma_r\sigma_y}{\sigma_r}.$$

Since $E(R Z'_t) = 1$ and $E(Z'_t) = R_F^{-1}$, this simplifies to

$$Q = S\, N(z + \sigma_r) - K R_F^{-1}\, N(z).$$

Since $E(R Z'_t) = \exp\left[\mu_r + \tfrac12\sigma_r^2 + \mu_y + \tfrac12\sigma_y^2 + \rho\sigma_r\sigma_y\right] = 1$ and $E(Z'_t) = \exp\left[\mu_y + \tfrac12\sigma_y^2\right] = R_F^{-1}$, then $\mu_r + \tfrac12\sigma_r^2 - \log R_F + \rho\sigma_r\sigma_y = 0$. Substituting this into the expression for $z$ proves the theorem. Q.E.D.
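As a quick illustration of Equation (41.5), the Python sketch below evaluates the discrete-time formula for hypothetical inputs (the numbers are illustrative only, not from the paper). With the stationarity substitutions $R_F = r_F^{\,t}$ and $\sigma = \sigma_{\text{annual}}\sqrt{t}$, it reproduces the Black–Scholes value of Equation (41.4):

```python
# Evaluate Q = S*N(z + sigma) - K*(1/R_F)*N(z), Equation (41.5),
# with z = [log(S/K) + log(R_F)]/sigma - sigma/2. Hypothetical inputs.
from math import log, sqrt, erf

def N(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def call_price(S: float, K: float, R_F: float, sigma: float) -> float:
    # R_F: riskless return through expiration; sigma: Std(log R) through expiration.
    z = (log(S / K) + log(R_F)) / sigma - sigma / 2.0
    return S * N(z + sigma) - K * N(z) / R_F

# Black-Scholes stationarity case: r_F = 1.05 per year, annual sigma 0.2, t = 2 years.
S, K, r_F, sig, t = 40.0, 35.0, 1.05, 0.20, 2.0
print(round(call_price(S, K, r_F ** t, sig * sqrt(t)), 4))
```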
Appending the Black–Scholes stationarity conditions converts Equation (41.5) into Equation (41.4). However, the derivation presented here shows that under the strengthened lognormality assumption, these stationarity conditions are in no way necessary to retain the important features of Equation (41.4).[17]

[17] Merton (1973a) has relaxed the stationarity conditions on the interest rate and allows the standard deviation of the logarithm of one plus the rate of return of the associated stock ($\sigma$) to be a known nonstochastic function of time; here $\sigma$ can be stochastic as well, extending the model to more realistic stochastic processes. See Rosenberg (1972).
Under what conditions will $\tilde Y_t$ be jointly lognormal with $\tilde S_t$? Theorem 4 supplies an interesting sufficient condition: $\tilde Y_t = \tilde R_{Mt}^{-b}$ is jointly lognormal with $\tilde S_t$ if $\tilde R_{Mt}$ and $\tilde S_t$ are jointly lognormal.[18]

To extend Equation (41.5) to dividend-paying stock, assume that through the life of the option, the dividend yield of the associated stock is a known nonstochastic function of time. Let $\delta_t \equiv \tilde X_t/\tilde S_t$ denote the dividend yield of the stock at any date $t$ through the life of the option. Then one plus the compound rate of return of the stock to the expiration date is $\tilde R = (\tilde S_t/S)(1+\delta_1)(1+\delta_2)\cdots(1+\delta_t)$. Defining $\lambda \equiv (1+\delta_1)(1+\delta_2)\cdots(1+\delta_t)$, then $\tilde R = \tilde S_t/(S\lambda^{-1})$. In the proof of Theorem 5, replacing $S$ with $S\lambda^{-1}$:

Corollary 6. Under the conditions of Theorem 5, except that the associated stock, through the expiration of the option, has a dividend yield that is a known nonstochastic function of time, and the option cannot be exercised prior to its expiration date,

$$Q = S\lambda^{-1}\, N(z + \sigma) - K R_F^{-1}\, N(z)$$

where

$$z \equiv \frac{\log(S\lambda^{-1}/K) + \log R_F}{\sigma} - \frac{\sigma}{2}$$

and $R_F$ is the riskless discount rate through the expiration of the option and $\sigma$ is the standard deviation of the logarithm of the compound one plus rate of return of the associated stock through the expiration of the option (i.e., $\mathrm{Std}(\log R)$).

When will the dividend yield be nonstochastic? Theorem 2 provides a sufficient condition in its premise. An "American" option, one that can be exercised prior to its expiration date, will be worth more than $Q$ as given by the above formula.[19]

[18] The Samuelson–Merton (1969) assumptions (see Footnote 14) are clearly a special case. Their assumptions (b) and (c) imply that $\tilde Y_t = \tilde R_{Mt}^{-b}$, since the associated stock is the market portfolio. The risk-neutrality justification is a degenerate special case where $\tilde Y_t$ is nonstochastic. It should also be noted that the premise of Theorem 5 is stronger than necessary. We actually require a random variable $\tilde Y_t$ shared only by the option, its associated stock, and the default-free bond, which is jointly lognormal with $\tilde S_t$. In effect, as in the Black–Scholes analysis, Equation (41.5) will apply even if $S$ and $R_{Ft}$ are not in equilibrium with the other securities in the market. Finally, agreement on $\mathrm{Var}(\log S_t)$ is not required if heterogeneous investor beliefs can be meaningfully aggregated; see Rubinstein (1976).
[19] See Samuelson and Merton (1969). The corollary generalizes Merton (1973a, p. 171) by allowing the dividend yield to depend on time. In a succeeding working paper, Geske (1976) has further generalized the option pricing model for underlying stock with a stochastic but lognormal dividend yield. Since a perfect hedge cannot be constructed in this case, even if investors can make continuous revisions, his generalization requires the preference-specific approach developed in this paper.
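Corollary 6 only rescales the stock price by the cumulative dividend yield. A minimal, self-contained Python sketch (hypothetical yields and inputs, not from the paper):

```python
# Corollary 6 sketch: with a known nonstochastic dividend yield, replace S by
# S/lambda, where lambda = (1+delta_1)*...*(1+delta_t). Hypothetical values.
from math import log, sqrt, erf

def N(x: float) -> float:
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))  # standard normal CDF

S, K, R_F, sigma = 40.0, 35.0, 1.05**2, 0.20 * sqrt(2.0)
lam = (1 + 0.02) * (1 + 0.02)   # two 2% dividend yields through expiration
S_adj = S / lam
z = (log(S_adj / K) + log(R_F)) / sigma - sigma / 2.0
Q = S_adj * N(z + sigma) - K * N(z) / R_F
print(round(Q, 4))  # lower than the no-dividend value, as expected
```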
41.5 Conclusion

This paper has developed a simple and practical formula for the valuation of uncertain income streams, consistent with rational risk-averse investor behavior and equilibrium in financial markets. Since the formula places virtually no stochastic restrictions on the income stream, it can be used to value securities or other assets such as capital budgeting projects with serially correlated cash flows or rates of return, securities with or without limited liability, and derivative securities such as options in terms of their underlying securities. The formula was used to derive simple sufficient restrictions on real economic variables for the rate of return of the market portfolio to follow a random walk and the term structure to be unbiased. When applied to the pricing of options, it was shown, given appropriate stochastic restrictions, to imply the Black–Scholes option pricing formula, although a perfect hedge cannot be constructed.

Acknowledgment Research for this paper was supported in part by a grant from the Dean Witter Foundation. The author would like to acknowledge comments on the original version of this paper from John Cox, Robert Geske, Robert Merton, and Stephen Ross.
References Beja, A. 1971. “The structure of the cost of capital under uncertainty.” Review of Economic Studies 38(3), 359–368. Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–654. Blume, M. and I. Friend. 1975. “The asset structure of individual portfolios and some implications for utility functions.” Journal of Finance 10(2), 585–603. Bogue, M. and R. Roll. 1974. “Capital budgeting of risky projects with ‘imperfect’ markets for physical capital.” Journal of Finance 29(2), 601–613. Cohen, R., W. Lewellen, R. Lease, and G. Schlarbaum. 1975. “Individual investor risk aversion and investment portfolio composition.” Journal of Finance 30, 605–620. Cox, J. and S. Ross. 1975. The pricing of options for jump processes, Wharton Working Paper #2–75, University of Pennsylvania. Fama, E. 1970. “Multiperiod consumption-investment decisions.” American Economic Review 60(1), 163–174. Geske, R. 1976. The pricing of options with stochastic dividend yield, unpublished manuscript. Gordon, M. J. and E. Shapiro. 1956. “Capital equipment analysis: the required rate of profit.” Management Science 3(1), 102–110. Granger, C. and O. Morgenstern. 1970. Predictability of stock market prices, Heath Lexington. Hakansson, N. 1971. “Optimal entrepreneurial decisions in a completely stochastic environment.” Management Science 17(7), 427–449. Jensen, M. 1972. “Capital markets: theory and evidence.” Bell Journal of Economics and Management Science 3(2). Kraus, A. and R. Litzenberger. 1976. “Market equilibrium in a multiperiod state preference model with logarithmic utility.” Journal of Finance (forthcoming).
Lintner, J. 1956. "Distribution of income of corporations among dividends, retained earnings, and taxes." American Economic Review 46(2), 97–113. Long, J. 1974. "Stock prices, inflation and the term structure of interest rates." Journal of Financial Economics 1, 131–170. Merton, R. 1973a. "Theory of rational option pricing." Bell Journal of Economics and Management Science 4(1), 141–183. Merton, R. 1973b. "An intertemporal capital asset pricing model." Econometrica 41, 867–880. Merton, R. 1975. Option pricing when underlying stock returns are discontinuous, Working Paper #787–75, Massachusetts Institute of Technology. Miller, M. and F. Modigliani. 1961. "Dividend policy, growth and the valuation of shares." Journal of Business 34, 411–433. Robichek, A. and S. Myers. 1965. Optimal financing decisions, Prentice-Hall, Englewood Cliffs, NJ. Rosenberg, B. 1972. The behavior of random variables with nonstationary variance and the distribution of security prices, Working Paper #11, Research Program in Finance, University of California, Berkeley. Rubinstein, M. 1973. "A comparative statics analysis of risk premiums," essay 2 in Four essays in the theory of competitive security markets, Ph.D. dissertation (UCLA 1971), appeared in Journal of Business. Rubinstein, M. 1974a. "An aggregation theorem for security markets." Journal of Financial Economics 1(3), 225–244. Rubinstein, M. 1974b. A discrete-time synthesis of financial theory, Working Papers #20 and #21, Research Program in Finance, University of California, Berkeley. Rubinstein, M. 1975. "Securities market efficiency in an Arrow-Debreu economy." American Economic Review 65(5), 812–824. Rubinstein, M. 1976. "The strong case for the generalized logarithmic utility model as the premier model of financial markets." Journal of Finance 3, 551–571. Samuelson, P. 1973. "Proof that properly discounted present values of assets vibrate randomly." Bell Journal of Economics and Management Science 4(2), 369–374. Samuelson, P. and R. Merton. 1969. "A complete model of warrant pricing that maximizes utility." Industrial Management Review 10, 17–46. Sharpe, W. 1964. "Capital asset prices: a theory of market equilibrium under conditions of risk." Journal of Finance 19, 425–442. Sprenkle, C. 1961. "Warrant prices as indicators of expectations and preferences." Yale Economic Essays 1(2), 178–231. Stein, C. 1973. "Estimation of the mean of a multivariate normal distribution." Proceedings of the Prague Symposium on Asymptotic Statistics, 345–381. Williams, J. 1938. Theory of investment value, North Holland, Amsterdam.
Appendix 41A The Bivariate Normal Density Function

Random variables $x$ and $y$ are said to be bivariate normal if their joint density function is

$$f(x,y) = \frac{1}{2\pi (1-\rho^2)^{1/2} \sigma_x \sigma_y} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\mu_x)^2}{\sigma_x^2} - 2\rho \frac{(x-\mu_x)(y-\mu_y)}{\sigma_x \sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2} \right] \right\},$$

where $\mu_x \equiv E(x)$, $\mu_y \equiv E(y)$, $\sigma_x^2 \equiv \mathrm{Var}\,x$, $\sigma_y^2 \equiv \mathrm{Var}\,y$, and $\rho \equiv \rho(x,y)$. Defining the marginal density function of $x$, $f(x) \equiv \int_{-\infty}^{\infty} f(x,y)\,dy$ (and similarly for $y$), it is not difficult to prove that $f(x)$ (and $f(y)$) is a normal density function; that is,

$$f(x) = \frac{1}{\sigma_x \sqrt{2\pi}} \exp\left[ -\frac{1}{2\sigma_x^2}(x-\mu_x)^2 \right].$$

Defining the conditional density function of $y$ given $x$, $f(y|x) \equiv f(x,y)/f(x)$, it follows that

$$f(y|x) = \frac{1}{\sqrt{2\pi (1-\rho^2) \sigma_y^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)\sigma_y^2} \left[ (y-\mu_y) - \rho \frac{\sigma_y}{\sigma_x}(x-\mu_x) \right]^2 \right\}.$$

Observe that this conditional density is itself a normal density with mean and variance, respectively,

$$E(y|x) = \mu_y + \rho(\sigma_y/\sigma_x)(x-\mu_x) \quad \text{and} \quad \mathrm{Var}(y|x) = (1-\rho^2)\sigma_y^2.$$

One very useful but little known[20] property of this bivariate normal distribution is that if $x$ and $y$ are bivariate normal and $g(y)$ is any at least once differentiable function of $y$, then, subject to mild regularity conditions,

$$\mathrm{Cov}(x, g(y)) = E[g'(y)]\, \mathrm{Cov}(x, y).$$

[20] To my knowledge, this property of bivariate normal variables was first noted in Rubinstein (1971). Stein (1973) has independently made its discovery.

To see this, since

$$\mathrm{Cov}(x, g(y)) = \iint x\, g(y)\, f(x,y)\, dx\, dy - \mu_x E(g(y)),$$

then using the conditional density,

$$\mathrm{Cov}(x, g(y)) = \int g(y)\, E(x|y)\, f(y)\, dy - \mu_x E(g(y))$$
or, more particularly, for bivariate normal variables,

$$\mathrm{Cov}(x, g(y)) = \mathrm{Cov}(x, y) \int g(y) \left[ \frac{y - \mu_y}{\sigma_y^2} \right] f(y)\, dy.$$

Defining $h'(y) \equiv \left[ (y - \mu_y)/\sigma_y^2 \right] f(y)$ and integrating by parts, we obtain

$$\mathrm{Cov}(x, g(y)) = \mathrm{Cov}(x, y) \left[ g(\infty)h(\infty) - g(-\infty)h(-\infty) - \int_{-\infty}^{\infty} g'(y)\, h(y)\, dy \right].$$

Since $f(y) = \frac{1}{\sigma_y \sqrt{2\pi}}\, e^{-(y-\mu_y)^2/2\sigma_y^2}$, then $h(y) = -f(y)$. In addition, since then $f(-\infty) = f(\infty) = 0$, and if $g(y)$ is bounded above and below or, more generally, if

$$\lim_{y \to -\infty} g(y) f(y) = \lim_{y \to \infty} g(y) f(y) = 0,$$

then

$$\mathrm{Cov}(x, g(y)) = \mathrm{Cov}(x, y) \int_{-\infty}^{\infty} g'(y) f(y)\, dy = E[g'(y)]\, \mathrm{Cov}(x, y).$$

In the process of deriving an analytic solution for the discrete-time valuation of options, we need to evaluate several definite integrals over the marginal and conditional normal density functions. First,

$$\int_a^{\infty} f(x)\, dx = N\!\left( \frac{-a + \mu_x}{\sigma_x} \right) \qquad (41A.1)$$

where $N(z^*)$ is the standard cumulative normal distribution from $-\infty$ to $z^*$; that is, $N(z^*) \equiv \int_{-\infty}^{z^*} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\, dz$. This is proved by first converting $f(x)$ into the standard normal density function $f(z) \equiv \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ and then observing that since $z$ is symmetric around zero, the limits of integration can be interchanged. That is,

$$\int_a^{\infty} f(x)\, dx = \int_{(a-\mu_x)/\sigma_x}^{\infty} f(z)\, dz = \int_{-\infty}^{(-a+\mu_x)/\sigma_x} f(z)\, dz = N\!\left( \frac{-a+\mu_x}{\sigma_x} \right).$$

Second, we show that

$$\int_a^{\infty} e^x f(x)\, dx = e^{\mu_x + \frac12 \sigma_x^2}\, N\!\left( \frac{-a + \mu_x}{\sigma_x} + \sigma_x \right). \qquad (41A.2)$$

To see this,

$$\int_a^{\infty} e^x f(x)\, dx = \int_a^{\infty} \frac{1}{\sigma_x \sqrt{2\pi}} \exp\left[ -\frac{1}{2\sigma_x^2}(x - \mu_x)^2 + x \right] dx.$$

Since the exponent of $e$ is equal to

$$\mu_x + \frac12 \sigma_x^2 - \frac{1}{2\sigma_x^2}\left( x - \mu_x - \sigma_x^2 \right)^2,$$

$$\int_a^{\infty} e^x f(x)\, dx = e^{\mu_x + \frac12 \sigma_x^2} \int_a^{\infty} \frac{1}{\sigma_x \sqrt{2\pi}} \exp\left[ -\frac{1}{2\sigma_x^2}\left( x - \mu_x - \sigma_x^2 \right)^2 \right] dx.$$

Again, the result follows by converting this density into the standard normal density and interchanging the limits.

As an interim result, we will need to prove that

$$\int_{-\infty}^{\infty} e^y f(y \mid x)\, dy = \exp\left[ \mu_y + \rho \frac{\sigma_y}{\sigma_x}(x - \mu_x) + \frac12 (1 - \rho^2) \sigma_y^2 \right].$$

This is equivalent to finding the expected value of $e^y$ where $y$ is normally distributed with mean $\mu_y + \rho(\sigma_y/\sigma_x)(x - \mu_x)$ and variance $(1-\rho^2)\sigma_y^2$. This is analogous to integral (41A.2) where $a = -\infty$. In this case, $N(\cdot) = N(\infty) = 1$, so that the exponent of $e$ may be replaced by $\mu_y + \rho(\sigma_y/\sigma_x)(x - \mu_x) + \frac12(1-\rho^2)\sigma_y^2$.

Now we are prepared to prove

$$\int_{-\infty}^{\infty} \int_a^{\infty} e^y f(x, y)\, dx\, dy = e^{\mu_y + \frac12 \sigma_y^2}\, N\!\left( \frac{-a + \mu_x}{\sigma_x} + \rho\sigma_y \right). \qquad (41A.3)$$

From our interim result,

$$\int_{-\infty}^{\infty} \int_a^{\infty} e^y f(x, y)\, dx\, dy = \int_a^{\infty} f(x) \left[ \int_{-\infty}^{\infty} e^y f(y \mid x)\, dy \right] dx = \int_a^{\infty} e^{\mu_y + \rho(\sigma_y/\sigma_x)(x - \mu_x) + \frac12(1-\rho^2)\sigma_y^2}\, f(x)\, dx.$$

Combining terms in the exponents of $e$, this, in turn, equals

$$\int_a^{\infty} \frac{1}{\sigma_x\sqrt{2\pi}} \exp\left[ \mu_y + \rho\frac{\sigma_y}{\sigma_x}(x - \mu_x) + \frac12(1-\rho^2)\sigma_y^2 - \frac{1}{2\sigma_x^2}(x - \mu_x)^2 \right] dx.$$

Since the exponent of $e$ is equal to

$$\mu_y + \frac12\sigma_y^2 - \frac{1}{2\sigma_x^2}\left[ (x - \mu_x) - \rho\sigma_x\sigma_y \right]^2,$$

$$\int_{-\infty}^{\infty} \int_a^{\infty} e^y f(x, y)\, dx\, dy = e^{\mu_y + \frac12\sigma_y^2} \int_a^{\infty} \frac{1}{\sigma_x\sqrt{2\pi}} \exp\left[ -\frac{1}{2\sigma_x^2}\left[ x - (\mu_x + \rho\sigma_x\sigma_y) \right]^2 \right] dx.$$

Again, the result follows by converting this density into the standard normal density and interchanging the limits. Finally,

$$\int_{-\infty}^{\infty} \int_a^{\infty} e^x e^y f(x, y)\, dx\, dy = e^{\mu_x + \mu_y + \frac12(\sigma_x^2 + 2\rho\sigma_x\sigma_y + \sigma_y^2)}\, N\!\left( \frac{-a + \mu_x}{\sigma_x} + \sigma_x + \rho\sigma_y \right). \qquad (41A.4)$$

Again from our interim result,

$$\int_{-\infty}^{\infty} \int_a^{\infty} e^x e^y f(x, y)\, dx\, dy = \int_a^{\infty} e^x f(x) \left[ \int_{-\infty}^{\infty} e^y f(y \mid x)\, dy \right] dx = \int_a^{\infty} e^{x + \mu_y + \rho(\sigma_y/\sigma_x)(x - \mu_x) + \frac12(1-\rho^2)\sigma_y^2}\, f(x)\, dx.$$

Combining terms in the exponents of $e$, this, in turn, equals

$$\int_a^{\infty} \frac{1}{\sigma_x\sqrt{2\pi}} \exp\left[ \mu_y + \rho\frac{\sigma_y}{\sigma_x}(x - \mu_x) + \frac12(1-\rho^2)\sigma_y^2 + x - \frac{1}{2\sigma_x^2}(x - \mu_x)^2 \right] dx.$$

Since the exponent of $e$ is equal to

$$\mu_x + \mu_y + \frac12\left( \sigma_x^2 + 2\rho\sigma_x\sigma_y + \sigma_y^2 \right) - \frac{1}{2\sigma_x^2}\left[ (x - \mu_x) - \sigma_x(\rho\sigma_y + \sigma_x) \right]^2,$$

substituting this into the integral, converting this density into the standard normal density, and interchanging the limits yields (41A.4).

If $x$ and $y$ are bivariate normal, then the random variables $X \equiv e^x$ and $Y \equiv e^y$ are said to be bivariate lognormal. Since sums of jointly normal variables are normal, products of jointly lognormal variables are lognormal. Moreover, if $a$ and $b$ are constants, since $y = a + bx$ is normally distributed if $x$ is normally distributed, then $Y = aX^b$ is lognormally distributed if $X$ is lognormally distributed. Since $x = \log X$ and $y = \log Y$ are normally distributed, and since $d(\log X) = dX/X$ and $d(\log Y) = dY/Y$, then from the bivariate normal density function, the bivariate lognormal density function of $X$ and $Y$ is $F(X, Y) = \frac{1}{XY} f(x, y)$, where $f(x, y)$ is again the bivariate normal density function. Consequently, the marginal lognormal density function of $X$, usually called simply the lognormal density, is $F(X) = \frac{1}{X} f(x)$. Using the lognormal density and integrating

$$\mu_X \equiv \int_0^{\infty} X F(X)\, dX \quad \text{and} \quad \sigma_X^2 \equiv \int_0^{\infty} (X - \mu_X)^2 F(X)\, dX,$$

it can be shown that the mean and variance of the lognormal variable, $\mu_X$ and $\sigma_X^2$, are related to the mean and variance of the corresponding normal variable, $\mu_x$ and $\sigma_x^2$, by[21]

$$\mu_X = e^{\mu_x + \frac12 \sigma_x^2} \quad \text{and} \quad \sigma_X^2 = e^{2\mu_x + \sigma_x^2}\left( e^{\sigma_x^2} - 1 \right).$$

With these relationships, since the product $XY$ is itself lognormal,

$$E(XY) = \exp\left[ \mu_x + \mu_y + \frac12 \sigma_x^2 + \rho\sigma_x\sigma_y + \frac12 \sigma_y^2 \right].$$

[21] The general formula for the $n$th order central moment of a lognormal variable is

$$E\left[ (X - \mu_X)^n \right] = (\mu_X)^n \sum_{r=0}^{n} (-1)^{n-r}\, e^{r(r-1)\sigma_x^2/2}\, \frac{n!}{r!\,(n-r)!}.$$
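A quick Monte Carlo check of the covariance property above (often attributed to Stein) is straightforward; the following Python sketch uses $g(y) = y^3$ and hypothetical parameters:

```python
# Monte Carlo check of Cov(x, g(y)) = E[g'(y)] * Cov(x, y) for bivariate normal (x, y).
# Hypothetical parameters; g(y) = y**3, so g'(y) = 3*y**2.
import random
from statistics import mean

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

random.seed(1)
mu_x, mu_y, sd_x, sd_y, rho = 0.5, -0.2, 1.0, 0.7, 0.6
xs, ys = [], []
for _ in range(200_000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    xs.append(mu_x + sd_x * z1)
    ys.append(mu_y + sd_y * (rho * z1 + (1 - rho**2) ** 0.5 * z2))

lhs = cov(xs, [y**3 for y in ys])
rhs = mean(3 * y**2 for y in ys) * cov(xs, ys)
print(round(lhs, 3), round(rhs, 3))  # the two sides agree up to sampling error
```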
Chapter 42
Binomial OPM, Black-Scholes OPM and Their Relationship: Decision Tree and Microsoft Excel Approach John Lee
Abstract This paper will first demonstrate how Microsoft Excel can be used to create the Decision Trees for the Binomial Option Pricing Model. At the same time, this paper will discuss the Binomial Option Pricing Model in a less mathematical fashion. All the mathematical calculations will be done by the Microsoft Excel program that is presented in this paper. Finally, this paper uses the Decision Tree approach to demonstrate the relationship between the Binomial Option Pricing Model and the Black-Scholes Option Pricing Model.
Keywords Binomial Option Pricing Model · Decision trees · Black-Scholes Option Pricing Model · Call options · Put options · Microsoft Excel · Visual Basic for Applications (VBA) · Put-call parity · Sigma · Volatility · Recursive programming
42.1 Introduction

The Binomial Option Pricing Model derived by Rendleman and Barter (1979) and Cox et al. (1979) is one of the most famous models used to price options. Only the Black-Scholes model (1973) is more famous. One problem with learning the Binomial Option Pricing Model is that it is computationally intensive, which makes it a very complicated formula. The complexity of the Binomial Option Pricing Model makes it a challenge to learn the model. Most books teach the Binomial Option Pricing Model by describing the formula. This is not very effective, because it usually requires the learner to mentally keep track of many details, often to the point of information overload. There is a well-known principle in psychology that the average number of things that a person can remember at one time is seven. This paper will first demonstrate the power of Microsoft Excel. It will do this by demonstrating that it is possible to create large decision trees for the Binomial Option Pricing Model using Microsoft Excel. A ten-period decision tree would
J. Lee, Center for PBBEF Research, North Brunswick, NJ, USA e-mail: [email protected]
require 2,047 call calculations and 2,047 put calculations. This paper will also show the decision tree for pricing a stock and a bond, each requiring 2,047 calculations. Therefore, there would be 2,047 × 4 = 8,188 calculations for a complete set of ten-period decision trees (see the short sketch below). Second, this paper will present the Binomial Option Pricing Model in a less mathematical fashion. It will try to make it so that the reader will not have to keep track of many things at one time. It will do this by using decision trees to price call and put options. Finally, we will show the relationship between the Binomial Option Pricing Model and the Black-Scholes Option Pricing Model. This paper uses a Microsoft Excel workbook called binomialBS_OPM.xls that contains the VBA code to create the decision trees for the Binomial Option Pricing Model. The VBA code is published in Appendix 42A. E-mail me at [email protected] and indicate the password "bigsky" to obtain a copy of this Microsoft Excel workbook. Section 42.2 discusses the basic concepts of call and put options. Section 42.3 demonstrates the one-period call and put option-pricing models. Section 42.4 presents the two-period option-pricing model. Section 42.5 demonstrates how to use the Microsoft Excel workbook binomialBS_OPM.xls to create the decision trees for an n-period Binomial Option Pricing Model. Section 42.6 demonstrates the use of the Black-Scholes model. Section 42.7 shows the relationship between the Binomial Option Pricing Model and the Black-Scholes Option Pricing Model. Finally, Sect. 42.8 shows how to use the Microsoft Excel workbook binomialBS_OPM.xls to demonstrate the relationship between the Binomial Option Pricing Model and the Black-Scholes Option Pricing Model.
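The node counts quoted above follow from the structure of a non-recombining binomial tree: an n-period tree has 2^(n+1) − 1 nodes. A small Python check (illustrative only; the chapter's own implementation is the VBA code in Appendix 42A):

```python
# Number of nodes in an n-period non-recombining binomial decision tree:
# 1 root + 2 + 4 + ... + 2**n = 2**(n+1) - 1 nodes (one calculation per node).
def tree_nodes(n_periods: int) -> int:
    return 2 ** (n_periods + 1) - 1

print(tree_nodes(10))      # 2047 calculations for one ten-period tree
print(4 * tree_nodes(10))  # 8188 for the stock, bond, call, and put trees together
```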
42.2 Call and Put Options

A call option gives the owner the right but not the obligation to buy the underlying security at a specified price. The price at which the owner can buy the underlying security is called
Fig. 42.1 Value of JPM call option (strike price = $30): option payoff in dollars plotted against the stock price.
Fig. 42.2 Value of JPM put option (strike price = $30): option payoff in dollars plotted against the stock price.
the exercise price. A call option becomes valuable when the exercise price is less than the current price of the underlying stock. For example, a call option on JPM stock with an exercise price of $30 when the JPM stock price is $35 is worth $5. The reason it is worth $5 is that a holder of the call option can buy the JPM stock at $30 and then sell the JPM stock at the prevailing price of $35 for a profit of $5. Also, a call option on JPM stock with an exercise price of $30 when the JPM stock price is $25 is worth zero. A put option gives the owner the right but not the obligation to sell the underlying security at a specified price. A put option becomes valuable when the exercise price is more than the current price of the underlying stock. For example, a put option on JPM stock with an exercise price of $30 when the JPM stock price is $25 is worth $5. The reason it is worth $5 is that a holder of the put option can buy the JPM stock at the prevailing price
of $25 and then sell the JPM stock at the put price of $30 for a profit of $5. Also, a put option on a JPM stock with an exercise price of $30 when the stock price of the JPM stock is $35 is worth zero. Figures 42.1 and 42.2 are charts showing the value of call and put options of the above JPM stock at varying prices.
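In code, these payoffs are simply Max(S − X, 0) for a call and Max(X − S, 0) for a put. As a minimal sketch (these two functions are ours, for illustration only; they are not part of the binomialBS_OPM.xls workbook), they can be written in VBA as:

' Intrinsic value of a call option: the right to buy at X when the stock trades at S.
Function CallPayoff(S As Double, X As Double) As Double
    If S - X > 0 Then CallPayoff = S - X Else CallPayoff = 0
End Function

' Intrinsic value of a put option: the right to sell at X when the stock trades at S.
Function PutPayoff(S As Double, X As Double) As Double
    If X - S > 0 Then PutPayoff = X - S Else PutPayoff = 0
End Function

For the JPM examples above, CallPayoff(35, 30) and PutPayoff(25, 30) both return 5, while CallPayoff(25, 30) and PutPayoff(35, 30) both return 0.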
42.3 One-Period Option Pricing Model

What should the value of these options be? Let's look at a case where we are only concerned with the value of options for one period. In the next period a stock price can either go up or go down. Consider a case where we know for certain that a JPM stock with a price of $30 will either go up 5% or go down 5% in the next period, and the exercise price after one period is $30. Figures 42.3–42.5 show the decision trees for the JPM stock price, the JPM call option price, and the JPM put option price, respectively.
Fig. 42.3 JPM Stock Price (period 0: 30; period 1: 31.50 or 28.50)
Fig. 42.4 JPM Call Option Price (period 0: ??; period 1: 1.50 or 0)
Fig. 42.5 JPM Put Option Price (period 0: ??; period 1: 0 or 1.50)

Let's first consider the issue of pricing a JPM call option. Using a one-period decision tree, we can illustrate the price of the JPM stock if it goes up 5% and its price if it goes down 5%. Since we know the possible ending values of the JPM stock, we can derive the possible ending values of the call option. If the stock price increases to $31.50, the price of the JPM call option will be $1.50 ($31.50 − $30). If the JPM stock price declines to $28.50, the value of the call option will be zero, because the stock price would be below the exercise price of $30.

We have just discussed the possible ending values of a JPM call option in period 1. But what we are really interested in is the value of the JPM call option in period 0, knowing the two resulting values of the option. To help determine the value of a one-period JPM call option, it is useful to know that it is possible to replicate the two resulting states of the value of the JPM call option by buying a combination of stocks and bonds. The equations below replicate the situations where the price increases to $31.50 and decreases to $28.50, respectively; we will assume that the interest rate for the bond is 3%:

31.5S + 1.03B = 1.5
28.5S + 1.03B = 0

We can use simple algebra to solve for both S and B. The first thing that we need to do is to rearrange the second equation as follows:

1.03B = −28.5S

With the above equation, we can rewrite the first equation as

31.5S + (−28.5S) = 1.5
3S = 1.5
S = 0.5

We can solve for B by substituting the value 0.5 for S in the first equation:

31.5(0.5) + 1.03B = 1.5
15.75 + 1.03B = 1.5
1.03B = −14.25
B = −13.8350

Therefore, from the above simple algebraic exercise, we should at period 0 buy 0.5 shares of JPM stock and borrow $13.8350 at 3% to replicate the payoff of the JPM call option. This means the value of a JPM call option should be 0.5 × 30 − 13.8350 = 1.165. If this were not the case, there would be arbitrage profits. For example, if the call option were sold for $3, there would be a profit of $1.835; this would induce increased selling of the JPM call option, and the increase in the supply of JPM call options would push their price down. If the call option were sold for $1, there would be a saving of $0.165; this saving would increase the demand for the JPM call option, which would in turn push its price up. The equilibrium point is $1.165.

Using the above-mentioned concept and procedure, Benninga (2000) has derived a one-period call option model as

C = q_u Max[S(1 + u) − X, 0] + q_d Max[S(1 + d) − X, 0]   (42.1)

where

q_u = (i − d) / [(1 + i)(u − d)]
q_d = (u − i) / [(1 + i)(u − d)]
u = increase factor
d = down factor
i = interest rate

If we let i = r, p = (r − d)/(u − d), 1 − p = (u − r)/(u − d), R = 1 + r, C_u = Max[S(1 + u) − X, 0], and C_d = Max[S(1 + d) − X, 0], then we have

C = [pC_u + (1 − p)C_d]/R   (42.2)

where

C_u = call option price after an increase
C_d = call option price after a decrease

Equation (42.2) is identical to Equation (6B.6) in Lee et al. (2000, p. 234).¹

¹ Please note that in Lee et al. (2000, p. 234), u = 1 + percentage of price increase and d = 1 − percentage of price decrease.
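As a compact illustration of Equation (42.2), the one-period call value can be computed in VBA. This is a sketch of ours, not code taken from binomialBS_OPM.xls; here u and d are the price factors (1.05 and 0.95 in the JPM example) and r is the one-period interest rate, so that R = 1 + r:

' One-period binomial call price, Equation (42.2), with R = 1 + r.
Function OnePeriodCall(S As Double, X As Double, _
                       u As Double, d As Double, r As Double) As Double
    Dim p As Double, Cu As Double, Cd As Double
    p = ((1 + r) - d) / (u - d)             ' risk-neutral probability p
    Cu = S * u - X: If Cu < 0 Then Cu = 0   ' call payoff after an up move
    Cd = S * d - X: If Cd < 0 Then Cd = 0   ' call payoff after a down move
    OnePeriodCall = (p * Cu + (1 - p) * Cd) / (1 + r)
End Function

For the JPM example, OnePeriodCall(30, 30, 1.05, 0.95, 0.03) returns 1.1650, the same value obtained from the replicating portfolio above.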
To calculate the value of the above one-period call option, where the strike price X is $30 and the risk-free interest rate is 3%, we assume that the price of the stock in any given period will either increase or decrease by 5%:

X = $30
S = $30
u = 1.05
d = 0.95
R = 1 + r = 1.03
p = (1.03 − 0.95)/(1.05 − 0.95) = 0.8
C = [0.8(1.5) + 0.2(0)]/1.03 = $1.1650

Therefore, from the above calculations, the value of the call option is $1.1650. Figure 42.6 shows the resulting decision tree for the above call option.

Fig. 42.6 Call Option Price (period 0: 1.1650; period 1: 1.500 or 0)

Like the call option, it is possible to replicate the two resulting states of the value of the put option by buying a combination of stocks and bonds. The equations below replicate the situations where the price increases to $31.50 and decreases to $28.50, respectively:

31.5S + 1.03B = 0
28.5S + 1.03B = 1.5

We will use simple algebra to solve for both S and B. The first thing we will do is to rewrite the second equation as follows:

1.03B = 1.5 − 28.5S

The next thing is to substitute the above equation into the first put option equation. Doing this results in the following:

31.5S + 1.5 − 28.5S = 0

Solving for S:

3S = −1.5
S = −0.5

Now let us solve for B by putting the value of S into the first equation:

31.5(−0.5) + 1.03B = 0
1.03B = 15.75
B = 15.2913

From the above simple algebra exercise we have S = −0.5 and B = 15.2913. This tells us that in period 0 we should lend $15.2913 at 3% and sell short 0.5 shares of stock to replicate the put option payoff for period 1. The value of the JPM put option should then be 30(−0.5) + 15.2913 = 0.2913. Using the same arbitrage argument that we used in discussing the call option, 0.2913 has to be the equilibrium price of the put option. As with the call option, Benninga (2000) has derived a one-period put option model as

P = q_u Max[X − S(1 + u), 0] + q_d Max[X − S(1 + d), 0]   (42.3)

where
q_u = (i − d) / [(1 + i)(u − d)]
q_d = (u − i) / [(1 + i)(u − d)]
u = increase factor
d = down factor
i = interest rate

If we let i = r, p = (r − d)/(u − d), 1 − p = (u − r)/(u − d), R = 1 + r, P_u = Max[X − S(1 + u), 0], and P_d = Max[X − S(1 + d), 0], then we have

P = [pP_u + (1 − p)P_d]/R   (42.4)

where

P_u = put option price after an increase
P_d = put option price after a decrease

Below we calculate the value of the above one-period put option, where the strike price X is $30 and the risk-free interest rate is 3%:

P = [0.8(0) + 0.2(1.5)]/1.03 = $0.2913

Figure 42.7 shows the resulting decision tree for the above put option.

There is a relationship between the price of a put option and the price of a call option, called put-call parity. Equation (42.5) shows this relationship:

P = C + X/R − S   (42.5)
where

C = call price
X = strike price
R = 1 + interest rate
S = stock price

The following uses put-call parity to calculate the price of the JPM put option:

P = $1.165 + $30/1.03 − $30 = 1.165 + 29.1262 − 30 = $0.2912

Fig. 42.7 JPM Put Option Price (period 0: 0.2913; period 1: 0 or 1.50)
Fig. 42.8 JPM Stock Price (two-period tree: 30.000 at period 0; 31.500 and 28.500 at period 1; 33.075, 29.925, and 27.075 at period 2)
Fig. 42.9 JPM Call Option (period 2 values: 3.0750, 0.0000, 0.0000)
Fig. 42.10 JPM Put Option (period 2 values: 0.0000, 0.0750, 2.9250)
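Equation (42.5) translates directly into code. The following one-line VBA function is an illustrative sketch of ours (not part of the workbook) that recovers the put price from the call price:

' Put price from put-call parity, Equation (42.5), with R = 1 + r.
Function PutFromParity(C As Double, S As Double, X As Double, r As Double) As Double
    PutFromParity = C + X / (1 + r) - S
End Function

PutFromParity(1.165, 30, 30, 0.03) returns approximately 0.2912, agreeing with the value obtained from the replicating portfolio.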
42.4 Two-Period Option Pricing Model

We will now look at pricing options for two periods. Figure 42.8 shows the stock price decision tree based on the parameters indicated in the last section; this decision tree was created based on the assumption that the stock price will either increase or decrease by 5% in each period. How do we price the value of a call and a put option for two periods?

The highest possible value for our stock based on our assumptions is $33.075. We get this value by first multiplying the stock price at period 0 by 105% to get the resulting value of $31.50 for period 1, and then multiplying the stock price in period 1 by 105% to get the resulting value of $33.075. In period 2, the value of a call option when the stock price is $33.075 is the stock price minus the exercise price, $33.075 − $30, or $3.075. In period 2, the value of a put option when the stock price is $33.075 is the exercise price minus the stock price, $30 − $33.075, or −$3.075. A negative value has no value to an investor, so the value of the put option would be zero.

The lowest possible value for our stock based on our assumptions is $27.075. We get this value by first multiplying the stock price at period 0 by 95% (decreasing the value of the stock by 5%) to get the resulting value of $28.50 for period 1, and then again multiplying the stock price in period 1 by 95% to get the resulting value of $27.075. In period 2, the value of a call option when the stock price is $27.075 is the stock price minus the exercise price, $27.075 − $30, or −$2.925; a negative value has no value to an investor, so the value of the call option would be zero. In period 2, the value of a put option when the stock price is $27.075 is the exercise price minus the stock price, $30 − $27.075, or $2.925. We can derive the call and put option values for the other possible period-2 stock values in the same fashion. Figures 42.9 and 42.10 show the possible call and put option values for period 2.

We cannot calculate the values of the call and put options in period 1 the same way we did in period 2, because period 1 does not contain the ending values of the stock. In period 1 there are two possible call values: one when the stock price has increased and one when it has decreased. The call option decision tree shown in Fig. 42.9 shows two possible values for a call option in period 1. If we focus on the value of a call option when the stock price increases from period 0, we notice that it looks like the decision tree for a one-period call option; this is shown in Fig. 42.11. Using the same method as for pricing a one-period call option, the price of the call option when the stock price increases
from period 0 will be $2.3883. The resulting decision tree is shown in Fig. 42.12. In the same fashion, we can price the value of the call option when the stock price decreases; the price of the call option when the stock price decreases from period 0 is zero. The resulting decision tree is shown in Fig. 42.13. In the same fashion, we can price the value of the call option in period 0; the resulting decision tree is shown in Fig. 42.14. We can calculate the value of a put option in the same manner as we calculated the value of a call option. The decision tree for the put option is shown in Fig. 42.15.

Fig. 42.11 JPM Call Option (one-period subtree after an up move: period 2 values 3.0750 and 0.0000)
Fig. 42.12 JPM Call Option (up-state value in period 1: 2.3883)
Fig. 42.13 JPM Call Option (down-state value in period 1: 0.0000)
Fig. 42.14 JPM Call Option (complete two-period tree: 1.8550 at period 0; 2.3883 and 0.0000 at period 1)
Fig. 42.15 JPM Put Option (complete two-period tree: 0.1329 at period 0; 0.0146 and 0.6262 at period 1)

Table 42.1 Binomial calculations

Periods    Calculations
1          3
2          7
3          15
4          31
5          63
6          127
7          255
8          511
9          1,023
10         2,047
11         4,095
12         8,191
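The counts in Table 42.1 follow a simple rule: a full (non-recombining) n-period binomial tree contains 2^(n+1) − 1 nodes, and each node is one calculation. A one-line VBA check of ours:

' Number of calculations (nodes) in a full n-period binomial decision tree.
Function TreeCalcCount(n As Long) As Long
    TreeCalcCount = 2 ^ (n + 1) - 1   ' e.g., n = 4 gives 31, n = 10 gives 2,047
End Function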
42.5 Using Microsoft Excel to Create the Binomial Option Trees

In the previous section we priced the value of call and put options by pricing backwards, from the last period to the first period. This method of pricing call and put options will work for any number of periods n. Pricing a call option for two periods required seven calculations. The number of calculations increases dramatically as n increases; Table 42.1 lists the number of calculations for specific numbers of periods. After two periods it becomes very cumbersome to calculate and create the decision trees for call and put options by hand. In the previous section we saw that the calculations were very repetitive and mechanical. To solve this problem, we will use
Microsoft Excel to do the calculations and create the decision trees for the call and put options. We will also use Microsoft Excel to calculate and draw the related decision trees for the underlying stock and bond. To avoid the repetitive and mechanical calculations of the Binomial Option Pricing Model, we will look at a Microsoft Excel workbook called binomialBS_OPM.xls and use it to produce four decision trees for the JPM stock discussed in the previous sections. The four decision trees are as follows:

1. Stock Price
2. Call Option Price
3. Put Option Price
4. Bond Price
Figure 42.16 shows the Excel file binomialBS_OPM.xls after the file is opened. Pushing the button shown in Fig. 42.16 brings up the dialog box shown in Fig. 42.17. The dialog box in Fig. 42.17 shows the parameters for the Binomial Option Pricing Model; these parameters are changeable, and the dialog box shows the default values. Pushing the Calculate button shown in Fig. 42.17 will produce the four decision trees shown in Figs. 42.18–42.21. Table 42.1 indicated that 31 calculations are required to create a decision tree that has four periods. This section shows four decision trees; therefore, the Excel file performed 31 × 4 = 124 calculations to create them. Benninga (2000, p. 260) has defined the price of a call option in a Binomial Option Pricing Model with n periods as
Fig. 42.17 Dialog Box Showing Parameters for the Binomial Option Pricing Model
C = Σ_{i=0}^{n} [n!/(i!(n−i)!)] q_u^i q_d^(n−i) max[S(1 + u)^i (1 + d)^(n−i) − X, 0]   (42.6)

and the price of a put option in a Binomial Option Pricing Model with n periods as

P = Σ_{i=0}^{n} [n!/(i!(n−i)!)] q_u^i q_d^(n−i) max[X − S(1 + u)^i (1 + d)^(n−i), 0]   (42.7)

Lee et al. (2000, p. 237) have defined the pricing of a call option in a Binomial Option Pricing Model with n periods as

C = (1/R^n) Σ_{k=0}^{n} [n!/(k!(n−k)!)] p^k (1 − p)^(n−k) max[0, S(1 + u)^k (1 + d)^(n−k) − X]   (42.8)

The pricing of a put option in a Binomial Option Pricing Model with n periods would then be defined as

P = (1/R^n) Σ_{k=0}^{n} [n!/(k!(n−k)!)] p^k (1 − p)^(n−k) max[0, X − S(1 + u)^k (1 + d)^(n−k)]   (42.9)

Fig. 42.16 Excel File binomialBS_OPM.xls
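Equations (42.8) and (42.9) can be computed directly, without building a tree. The following VBA function is an illustrative sketch of ours (the appendix workbook instead builds the full decision trees); here u and d are the up and down percentage moves (0.05 and −0.05 in the JPM example) and r is the per-period interest rate:

' n-period binomial option price from the closed-form sums of
' Equations (42.8) and (42.9), with R = 1 + r.
Function BinomialPrice(S As Double, X As Double, u As Double, d As Double, _
                       r As Double, n As Long, isCall As Boolean) As Double
    Dim k As Long, p As Double, sT As Double, payoff As Double, total As Double
    p = (r - d) / (u - d)                         ' risk-neutral probability
    For k = 0 To n
        sT = S * (1 + u) ^ k * (1 + d) ^ (n - k)  ' terminal stock price after k up moves
        If isCall Then payoff = sT - X Else payoff = X - sT
        If payoff < 0 Then payoff = 0
        total = total + WorksheetFunction.Combin(n, k) * p ^ k * (1 - p) ^ (n - k) * payoff
    Next k
    BinomialPrice = total / (1 + r) ^ n           ' discount by R^n
End Function

With the default parameters, BinomialPrice(30, 30, 0.05, -0.05, 0.03, 4, True) returns 3.4418, matching the Binomial Call Price shown in Fig. 42.19.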
Fig. 42.18 Stock Price Decision Tree (Price = 30, Exercise = 30, U = 1.05, D = 0.95, N = 4, R = 0.03; number of calculations: 31)
Fig. 42.19 Call Option Pricing Decision Tree (Price = 30, Exercise = 30, U = 1.05, D = 0.95, N = 4, R = 0.03; number of calculations: 31; Binomial Call Price: 3.4418)
42.6 Black-Scholes Option Pricing Model

The most famous option pricing model is the Black-Scholes Option Pricing Model. In this section we will demonstrate the use of the Black-Scholes Option Pricing Model; in later sections we will demonstrate the relationship between the Binomial Option Pricing Model and the Black-Scholes Option Pricing Model. The Black-Scholes model prices European call and put options. The Black-Scholes model for a European call option is

C = S·N(d1) − X·e^(−rT)·N(d2)   (42.10)

where

C = call price
S = stock price
X = strike price
r = risk-free interest rate
T = time to maturity of the option in years
N(·) = cumulative standard normal distribution
σ = stock volatility

d1 = [ln(S/X) + (r + σ²/2)T] / (σ√T)
d2 = d1 − σ√T

Let's manually calculate the price of a European call option in terms of Equation (42.10) with the following parameter values: S = 30, X = 30, r = 3%, T = 4, σ = 20%.
Fig. 42.20 Put Option Pricing Decision Tree (Price = 30, Exercise = 30, U = 1.05, D = 0.95, N = 4, R = 0.03; number of calculations: 31; Binomial Put Price: 0.0964)
Fig. 42.21 Bond Pricing Decision Tree (Price = 30, Exercise = 30, U = 1.05, D = 0.95, N = 4, R = 0.03; number of calculations: 31)
Solution:

d1 = [ln(30/30) + (0.03 + 0.2²/2)(4)] / (0.2√4) = [(0.03 + 0.02)(4)] / 0.4 = 0.2/0.4 = 0.5
d2 = 0.5 − 0.2√4 = 0.1
N(d1) = 0.69146, N(d2) = 0.5398, e^(−rT) = 0.8869
C = (30)(0.69146) − (30)(0.8869)(0.5398) = 20.7438 − 14.3624 = 6.3813

The Black-Scholes put-call parity equation is

P = C − S + X·e^(−rT)

The put option value for the stock would be

P = 6.38 − 30 + 30(0.8869) = 2.987

42.7 Relationship Between the Binomial OPM and the Black-Scholes OPM

We can use either the Binomial model or the Black-Scholes model to price an option; both should result in similar values. If we look at the parameters in both models, we will notice
that the Binomial model has an Increase Factor (U), a Decrease Factor (D), and an n-period parameter that the Black-Scholes model does not have. We also notice that the Black-Scholes model has the σ and T parameters, whereas the Binomial model does not. Benninga (2008) suggests the following translation between the Binomial and Black-Scholes parameters:

Δt = T/n
R = e^(r·Δt)
U = e^(σ√Δt)
D = e^(−σ√Δt)

In the Excel program shown in Appendix 42A, we use Benninga's (2008) Increase Factor and Decrease Factor definitions. They are defined as follows:

q_U = (R − D) / [R(U − D)]
q_D = (U − R) / [R(U − D)]

where

U = 1 + percentage of price increase
D = 1 − percentage of price decrease
R = 1 + interest rate
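As an illustration of this translation (a sketch of ours; the workbook performs the same computation inside its qU and qD properties), the following VBA routine converts Black-Scholes parameters into binomial ones:

' Translate Black-Scholes parameters (sigma, T, r) into binomial
' Increase/Decrease factors for an n-period tree, following Benninga (2008).
Sub TranslateBSParameters(sigma As Double, T As Double, r As Double, n As Long)
    Dim deltaT As Double, U As Double, D As Double, R As Double
    deltaT = T / n
    R = Exp(r * deltaT)            ' per-period gross interest factor
    U = Exp(sigma * Sqr(deltaT))   ' increase factor
    D = Exp(-sigma * Sqr(deltaT))  ' decrease factor
    Debug.Print U, D, R
End Sub

With sigma = 0.2, T = 4, r = 0.03, and n = 4, this gives U = 1.2214 and D = 0.8187, the parameter values shown in Figs. 42.23 and 42.24.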
42.8 Decision Tree Black-Scholes Calculation

We will now use the binomialBS_OPM.xls Excel file to calculate the Binomial and Black-Scholes call and put values illustrated in Sect. 42.5. Note that in Fig. 42.22 the Binomial Black-Scholes Approximation box is checked. Checking this box will cause the T and Sigma parameters to appear and will adjust the Increase Factor (u) and Decrease Factor (d) parameters. The adjustment is done as indicated in Sect. 42.7.

Fig. 42.23 Decision Tree approximation of Black-Scholes Call Pricing (Price = 30, Exercise = 30, U = 1.2214, D = 0.8187, N = 4, R = 0.03; number of calculations: 31; Binomial Call Price = 6.1006; Black-Scholes Call Price = 6.3803, d1 = 0.5000, d2 = 0.1000, N(d1) = 0.6915, N(d2) = 0.5398)

Note in Figs. 42.23 and 42.24 that the Binomial Option Pricing Model value does not agree exactly with the Black-Scholes Option Pricing Model value. The Binomial OPM value will get very close to the Black-Scholes OPM value once the Binomial parameter n becomes very large; Benninga (2008) demonstrated that the Binomial value will be close to the Black-Scholes value when the Binomial n parameter is larger than 500.
Fig. 42.22 Dialog Box Showing Parameters for the Binomial Option Pricing Model
Fig. 42.24 Decision Tree approximation of Black-Scholes Put Pricing (Price = 30, Exercise = 30, U = 1.2214, D = 0.8187, N = 4, R = 0.03; number of calculations: 31; Binomial Put Price: 2.7082; Black-Scholes Put Price: 2.9880)

42.9 Conclusion

This paper has demonstrated, using Microsoft Excel and decision trees, the Binomial Option Pricing Model in a less mathematical fashion. It allowed the reader to focus more on the decision trees, which were created by Microsoft Excel. This paper also demonstrates that using Microsoft Excel releases the reader from the computational burden of the Binomial Option Pricing Model and allows a better focus on the concepts by studying the associated decision trees.

This paper also published the Microsoft Excel Visual Basic for Applications (VBA) code that created the Binomial Option decision trees. A major computer science programming concept used by the Excel VBA program in this paper is recursive programming: a recursive procedure repeatedly calls itself, and statements inside the procedure decide when to stop calling itself. This paper also used decision trees to demonstrate the relationship between the Binomial Option Pricing Model and the Black-Scholes Option Pricing Model.

References

Benninga, S. 2000. Financial modeling. MIT Press, Cambridge.
Benninga, S. 2008. Financial modeling. MIT Press, Cambridge.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–659.
Cox, J., S. A. Ross, and M. Rubinstein. 1979. "Option pricing: a simplified approach." Journal of Financial Economics 7, 229–263.
Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics. World Scientific, NJ.
Rendleman, R. J., Jr., and B. J. Bartter. 1979. "Two-state option pricing." Journal of Finance 34(5), 1093–1110.
Wells, E. and S. Harshbarger. 1997. Microsoft Excel 97 Developer's Handbook. Microsoft Press, Redmond.

Appendix 42A Excel VBA Code – Binomial Option Pricing Model

It is important to note that what makes Microsoft Excel powerful is that it offers a professional programming language, Visual Basic for Applications (VBA). This section shows the VBA code that generated the decision trees for the Binomial Option Pricing Model. This code is in the form frmBinomiaOption. The procedure cmdCalculate_Click is the first procedure to run.
Decision Tree Visual Basic for Applications Programming Code

'/***************************************************************************
'/ Relationship Between the Binomial OPM
'/ and Black-Scholes OPM:
'/ Decision Tree and Microsoft Excel Approach
'/
'/ by John Lee
'/ [email protected]
'/ All Rights Reserved
'/***************************************************************************
Option Explicit
Dim mwbTreeWorkbook As Workbook
Dim mwsTreeWorksheet As Worksheet
Dim mwsCallTree As Worksheet
Dim mwsPutTree As Worksheet
Dim mwsBondTree As Worksheet
Dim mdblPFactor As Double
Dim mBinomialCalc As Long
Dim mCallPrice As Double 'jcl 12/8/2008
Dim mPutPrice As Double 'jcl 12/8/2008
'/**************************************************
'/Purpose: Keep track of the number of binomial calcs
'/*************************************************
Property Let BinomialCalc(l As Long)
    mBinomialCalc = l
End Property
Property Get BinomialCalc() As Long
    BinomialCalc = mBinomialCalc
End Property
Property Set TreeWorkbook(wb As Workbook)
    Set mwbTreeWorkbook = wb
End Property
Property Get TreeWorkbook() As Workbook
    Set TreeWorkbook = mwbTreeWorkbook
End Property
Property Set TreeWorksheet(ws As Worksheet)
    Set mwsTreeWorksheet = ws
End Property
Property Get TreeWorksheet() As Worksheet
    Set TreeWorksheet = mwsTreeWorksheet
End Property
Property Set CallTree(ws As Worksheet)
    Set mwsCallTree = ws
End Property
Property Get CallTree() As Worksheet
    Set CallTree = mwsCallTree
End Property
Property Set PutTree(ws As Worksheet)
    Set mwsPutTree = ws
End Property
Property Get PutTree() As Worksheet
    Set PutTree = mwsPutTree
End Property
Property Set BondTree(ws As Worksheet)
    Set mwsBondTree = ws
End Property
Property Get BondTree() As Worksheet
    Set BondTree = mwsBondTree
End Property
Property Let CallPrice(dCallPrice As Double) '12/8/2008
    mCallPrice = dCallPrice
End Property
Property Get CallPrice() As Double
    CallPrice = mCallPrice
End Property
Property Let PutPrice(dPutPrice As Double) '12/10/2008
    mPutPrice = dPutPrice
End Property
Property Get PutPrice() As Double '12/10/2008
    PutPrice = mPutPrice
End Property
Property Let PFactor(r As Double)
    Dim dRate As Double
    dRate = ((1 + r) - Me.txtBinomialD) / (Me.txtBinomialU - Me.txtBinomialD)
    mdblPFactor = dRate
End Property
Property Get PFactor() As Double
    PFactor = mdblPFactor
End Property
Property Get qU() As Double
    Dim dblDeltaT As Double
    Dim dblDown As Double
    Dim dblUp As Double
    Dim dblR As Double
    dblDeltaT = Me.txtTimeT / Me.txtBinomialN
    dblR = Exp(Me.txtBinomialr * dblDeltaT)
    dblUp = Exp(Me.txtSigma * VBA.Sqr(dblDeltaT))
    dblDown = Exp(-Me.txtSigma * VBA.Sqr(dblDeltaT))
    qU = (dblR - dblDown) / (dblR * (dblUp - dblDown))
End Property
Property Get qD() As Double
    Dim dblDeltaT As Double
    Dim dblDown As Double
    Dim dblUp As Double
    Dim dblR As Double
    dblDeltaT = Me.txtTimeT / Me.txtBinomialN
    dblR = Exp(Me.txtBinomialr * dblDeltaT)
    dblUp = Exp(Me.txtSigma * VBA.Sqr(dblDeltaT))
    dblDown = Exp(-Me.txtSigma * VBA.Sqr(dblDeltaT))
    qD = (dblUp - dblR) / (dblR * (dblUp - dblDown))
End Property
Private Sub chkBinomialBSApproximation_Click()
    On Error Resume Next
    'Time and Sigma are the Black-Scholes-only parameters
    Me.txtTimeT.Visible = Me.chkBinomialBSApproximation
    Me.lblTimeT.Visible = Me.chkBinomialBSApproximation
    Me.txtSigma.Visible = Me.chkBinomialBSApproximation
    Me.lblSigma.Visible = Me.chkBinomialBSApproximation
    txtTimeT_Change
End Sub
Private Sub cmdCalculate_Click() Me.Hide BinomialOption Unload Me End Sub Private Sub cmdCancel_Click() Unload Me End Sub
Private Sub txtBinomialN_Change()
    'jcl 12/8/2008
    On Error Resume Next
    If Me.chkBinomialBSApproximation Then
        Me.txtBinomialU = Exp(Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
        Me.txtBinomialD = Exp(-Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
    End If
End Sub
Private Sub txtTimeT_Change()
    'jcl 12/8/2008
    On Error Resume Next
    If Me.chkBinomialBSApproximation Then
        Me.txtBinomialU = Exp(Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
        Me.txtBinomialD = Exp(-Me.txtSigma * Sqr(Me.txtTimeT / Me.txtBinomialN))
    End If
End Sub
Private Sub UserForm_Initialize()
    With Me
        .txtBinomialS = 30
        .txtBinomialX = 30
        .txtBinomialD = 0.95
        .txtBinomialU = 1.05
        .txtBinomialN = 4
        .txtBinomialr = 0.03
        .txtSigma = 0.2
        .txtTimeT = 4
        Me.chkBinomialBSApproximation = False
    End With
    chkBinomialBSApproximation_Click
    Me.Hide
End Sub
Sub BinomialOption()
    Dim wbTree As Workbook
    Dim wsTree As Worksheet
    Dim rColumn As Range
    Dim ws As Worksheet
    Set Me.TreeWorkbook = Workbooks.Add
    Set Me.BondTree = Me.TreeWorkbook.Worksheets.Add
    Set Me.PutTree = Me.TreeWorkbook.Worksheets.Add
    Set Me.CallTree = Me.TreeWorkbook.Worksheets.Add
    Set Me.TreeWorksheet = Me.TreeWorkbook.Worksheets.Add
Set rColumn = Me.TreeWorksheet.Range("a1") With Me .BinomialCalc = 0 .PFactor = Me.txtBinomialr .CallTree.Name = "Call Option Price" .PutTree.Name = "Put Option Price" .TreeWorksheet.Name = "Stock Price" .BondTree.Name = "Bond" End With DecisionTree rCell:=rColumn, nPeriod:=Me.txtBinomialN + 1, _ dblPrice:=Me.txtBinomialS, sngU:=Me.txtBinomialU, _ sngD:=Me.txtBinomialD DecitionTreeFormat TreeTitle wsTree:=Me.TreeWorksheet, sTitle:="Stock Price " TreeTitle wsTree:=Me.CallTree, sTitle:="Call Option Pricing" TreeTitle wsTree:=Me.PutTree, sTitle:="Put Option Pricing" TreeTitle wsTree:=Me.BondTree, sTitle:="Bond Pricing" Application.DisplayAlerts = False For Each ws In Me.TreeWorkbook.Worksheets If Left(ws.Name, 5) = "Sheet" Then ws.Delete Else ws.Activate ActiveWindow.DisplayGridlines = False ws.UsedRange.NumberFormat = "#,##0.0000_);(#,##0.0000)" End If Next Application.DisplayAlerts = True Me.TreeWorksheet.Activate End Sub Sub TreeTitle(wsTree As Worksheet, sTitle As String) wsTree.Range("A1:A5").EntireRow.Insert (xlShiftDown) With wsTree With .Cells(1) .Value = sTitle .Font.Size = 20 .Font.Italic = True End With With .Cells(2, 1) .Value = "Decision Tree" .Font.Size = 16 .Font.Italic = True End With With .Cells(3, 1) .Value = "Price = " & Me.txtBinomialS & _ ",Exercise = " & Me.txtBinomialX & _ ",U = " & Format(Me.txtBinomialU, "#,##0.0000") & _ ",D = " & Format(Me.txtBinomialD, "#,##0.0000") & _ ",N = " & Me.txtBinomialN & _ ",R = " & Me.txtBinomialr .Font.Size = 14 End With With .Cells(4, 1) .Value = "Number of calculations: " & Me.BinomialCalc .Font.Size = 14 End With
If wsTree Is Me.CallTree Then With .Cells(5, 1) .Value = "Binomial Call Price= " & Format(Me.CallPrice, "#,##0.0000") .Font.Size = 14 End With If Me.chkBinomialBSApproximation Then wsTree.Range("A6:A7").EntireRow.Insert (xlShiftDown) With .Cells(6, 1) .Value = "Black-Scholes Call Price= " & Format(Me.BS_Call, "#,##0.0000") _ & ",d1=" & Format(Me.BS_D1, "#,##0.0000") _ & ",d2=" & Format(Me.BS_D2, "#,##0.0000") _ & ",N(d1)=" & Format(WorksheetFunction.NormSDist(BS_D1), "#,##0.0000") _ & ",N(d2)=" & Format(WorksheetFunction.NormSDist(BS_D2), "#,##0.0000") .Font.Size = 14 End With End If ElseIf wsTree Is Me.PutTree Then With .Cells(5, 1) .Value = "Binomial Put Price: " & Format(Me.PutPrice, "#,##0.0000") .Font.Size = 14 End With If Me.chkBinomialBSApproximation Then wsTree.Range("A6:A7").EntireRow.Insert (xlShiftDown) With .Cells(6, 1) .Value = "Black-Scholes Put Price: " & Format(Me.BS_PUT, "#,##0.0000") .Font.Size = 14 End With End If End If End With End Sub Sub BondDecisionTree(rPrice As Range, arCell As Variant, iCount As Long) Dim rBond As Range Dim rPup As Range Dim rPDown As Range Set rBond = Me.BondTree.Cells(rPrice.Row, rPrice.Column) Set rPup = Me.BondTree.Cells(arCell(iCount - 1).Row, arCell(iCount - 1).Column) Set rPDown = Me.BondTree.Cells(arCell(iCount).Row, arCell(iCount).Column) If rPup.Column = Me.TreeWorksheet.UsedRange.Columns.Count Then rPup.Value = (1 + Me.txtBinomialr) ^ (rPup.Column - 1) rPDown.Value = rPup.Value End If With rBond .Value = (1 + Me.txtBinomialr) ^ (rBond.Column - 1) .Borders(xlBottom).LineStyle = xlContinuous End With rPDown.Borders(xlBottom).LineStyle = xlContinuous With rPup .Borders(xlBottom).LineStyle = xlContinuous .Offset(1, 0).Resize((rPDown.Row - rPup.Row), 1). _ Borders(xlEdgeLeft).LineStyle = xlContinuous End With End Sub
Sub PutDecisionTree(rPrice As Range, arCell As Variant, iCount As Long)
    Dim rCall As Range
    Dim rPup As Range
    Dim rPDown As Range
    Set rCall = Me.PutTree.Cells(rPrice.Row, rPrice.Column)
    Set rPup = Me.PutTree.Cells(arCell(iCount - 1).Row, arCell(iCount - 1).Column)
    Set rPDown = Me.PutTree.Cells(arCell(iCount).Row, arCell(iCount).Column)
    If rPup.Column = Me.TreeWorksheet.UsedRange.Columns.Count Then
        rPup.Value = WorksheetFunction.Max(Me.txtBinomialX - arCell(iCount - 1), 0)
        rPDown.Value = WorksheetFunction.Max(Me.txtBinomialX - arCell(iCount), 0)
    End If
    With rCall '12/10/2008
        If Not Me.chkBinomialBSApproximation Then
            .Value = (Me.PFactor * rPup + (1 - Me.PFactor) * rPDown) / (1 + Me.txtBinomialr)
        Else
            .Value = (Me.qU * rPup) + (Me.qD * rPDown)
        End If
        Me.PutPrice = .Value '12/8/2008
        .Borders(xlBottom).LineStyle = xlContinuous
    End With
    rPDown.Borders(xlBottom).LineStyle = xlContinuous
    With rPup
        .Borders(xlBottom).LineStyle = xlContinuous
        .Offset(1, 0).Resize((rPDown.Row - rPup.Row), 1). _
            Borders(xlEdgeLeft).LineStyle = xlContinuous
    End With
End Sub
Sub CallDecisionTree(rPrice As Range, arCell As Variant, iCount As Long)
    Dim rCall As Range
    Dim rCup As Range
    Dim rCDown As Range
    Set rCall = Me.CallTree.Cells(rPrice.Row, rPrice.Column)
    Set rCup = Me.CallTree.Cells(arCell(iCount - 1).Row, arCell(iCount - 1).Column)
    Set rCDown = Me.CallTree.Cells(arCell(iCount).Row, arCell(iCount).Column)
    If rCup.Column = Me.TreeWorksheet.UsedRange.Columns.Count Then
        With rCup
            .Value = WorksheetFunction.Max(arCell(iCount - 1) - Me.txtBinomialX, 0)
            .Borders(xlBottom).LineStyle = xlContinuous
        End With
        With rCDown
            .Value = WorksheetFunction.Max(arCell(iCount) - Me.txtBinomialX, 0)
            .Borders(xlBottom).LineStyle = xlContinuous
        End With
    End If
    With rCall
        If Not Me.chkBinomialBSApproximation Then
            .Value = (Me.PFactor * rCup + (1 - Me.PFactor) * rCDown) / (1 + Me.txtBinomialr)
        Else
            .Value = (Me.qU * rCup) + (Me.qD * rCDown)
        End If
Me.CallPrice = .Value
'12/8/2008
        .Borders(xlBottom).LineStyle = xlContinuous
    End With
    rCup.Offset(1, 0).Resize((rCDown.Row - rCup.Row), 1). _
        Borders(xlEdgeLeft).LineStyle = xlContinuous
End Sub
Sub DecitionTreeFormat()
    Dim rTree As Range
    Dim nColumns As Integer
    Dim rLast As Range
    Dim rCell As Range
    Dim lCount As Long
    Dim lCellSize As Long
    Dim vntColumn As Variant
    Dim iCount As Long
    Dim lTimes As Long
    Dim arCell() As Range
    Dim sFormatColumn As String
    Dim rPrice As Range

    Application.StatusBar = "Formatting Tree.. "
    Set rTree = Me.TreeWorksheet.UsedRange
    nColumns = rTree.Columns.Count
    Set rLast = rTree.Columns(nColumns).EntireColumn.SpecialCells(xlCellTypeConstants, 23)
    lCellSize = rLast.Cells.Count
    For lCount = nColumns To 2 Step -1
        sFormatColumn = rLast.Parent.Columns(lCount).EntireColumn.Address
        Application.StatusBar = "Formatting column " & sFormatColumn
        ReDim vntColumn(1 To (rLast.Cells.Count / 2), 1)
        Application.StatusBar = "Assigning values to array for column " & _
            rLast.Parent.Columns(lCount).EntireColumn.Address
        vntColumn = rLast.Offset(0, -1).EntireColumn.Cells(1).Resize(rLast.Cells.Count / 2, 1)
        rLast.Offset(0, -1).EntireColumn.ClearContents
        ReDim arCell(1 To rLast.Cells.Count)
        lTimes = 1
        Application.StatusBar = "Assigning cells to arrays. Total number of cells: " & lCellSize
        For Each rCell In rLast.Cells
            Application.StatusBar = "Array to column " & sFormatColumn & " Cells " & rCell.Row
            Set arCell(lTimes) = rCell
            lTimes = lTimes + 1
        Next
        lTimes = 1
        Application.StatusBar = "Formatting leaves for column " & sFormatColumn
        For iCount = 2 To lCellSize Step 2
            Application.StatusBar = "Formatting leaves for cell " & arCell(iCount).Address
            If rLast.Cells.Count <> 2 Then
                Set rPrice = arCell(iCount).Offset(-1 * ((arCell(iCount).Row - arCell(iCount - 1).Row) / 2), -1)
                rPrice.Value = vntColumn(lTimes, 1)
            Else
                Set rPrice = arCell(iCount).Offset(-1 * ((arCell(iCount).Row - arCell(iCount - 1).Row) / 2), -1)
                rPrice.Value = vntColumn
            End If
            arCell(iCount).Borders(xlBottom).LineStyle = xlContinuous
            With arCell(iCount - 1)
                .Borders(xlBottom).LineStyle = xlContinuous
                .Offset(1, 0).Resize((arCell(iCount).Row - arCell(iCount - 1).Row), 1). _
                    Borders(xlEdgeLeft).LineStyle = xlContinuous
            End With
            lTimes = 1 + lTimes
            CallDecisionTree rPrice:=rPrice, arCell:=arCell, iCount:=iCount
            PutDecisionTree rPrice:=rPrice, arCell:=arCell, iCount:=iCount
            BondDecisionTree rPrice:=rPrice, arCell:=arCell, iCount:=iCount
        Next
        Set rLast = rTree.Columns(lCount - 1).EntireColumn.SpecialCells(xlCellTypeConstants, 23)
        lCellSize = rLast.Cells.Count
    Next ' / outer next
    rLast.Borders(xlBottom).LineStyle = xlContinuous
    Application.StatusBar = False
End Sub
'/*********************************************************************
'/Purpose: To calculate the price value of every state of the binomial
'/         decision tree
'/*********************************************************************
Sub DecisionTree(rCell As Range, nPeriod As Integer, _
                 dblPrice As Double, sngU As Single, sngD As Single)
    Dim lIteminColumn As Long
    If Not nPeriod = 1 Then
        'Do Up
        DecisionTree rCell:=rCell.Offset(0, 1), nPeriod:=nPeriod - 1, _
            dblPrice:=dblPrice * sngU, sngU:=sngU, sngD:=sngD
        'Do Down
        DecisionTree rCell:=rCell.Offset(0, 1), nPeriod:=nPeriod - 1, _
            dblPrice:=dblPrice * sngD, sngU:=sngU, sngD:=sngD
    End If
lIteminColumn = WorksheetFunction.CountA(rCell.EntireColumn)
    If lIteminColumn = 0 Then
        rCell = dblPrice
    Else
        If nPeriod <> 1 Then
            rCell.EntireColumn.Cells(lIteminColumn + 1) = dblPrice
        Else
            rCell.EntireColumn.Cells(((lIteminColumn + 1) * 2) - 1) = dblPrice
            Application.StatusBar = "The number of binomial calcs are : " & Me.BinomialCalc _
                & " at cell " & rCell.EntireColumn.Cells(((lIteminColumn + 1) * 2) - 1).Address
        End If
    End If
    Me.BinomialCalc = Me.BinomialCalc + 1
End Sub
Function BS_D1() As Double
    Dim dblNumerator As Double
    Dim dblDenominator As Double
    On Error Resume Next
    dblNumerator = VBA.Log(Me.txtBinomialS / Me.txtBinomialX) + _
        ((Me.txtBinomialr + Me.txtSigma ^ 2 / 2) * Me.txtTimeT)
    dblDenominator = Me.txtSigma * Sqr(Me.txtTimeT)
    BS_D1 = dblNumerator / dblDenominator
End Function
Function BS_D2() As Double
    On Error Resume Next
    BS_D2 = BS_D1 - (Me.txtSigma * VBA.Sqr(Me.txtTimeT))
End Function
Function BS_Call() As Double
    BS_Call = (Me.txtBinomialS * WorksheetFunction.NormSDist(BS_D1)) _
        - Me.txtBinomialX * Exp(-Me.txtBinomialr * Me.txtTimeT) * _
        WorksheetFunction.NormSDist(BS_D2)
End Function
'Uses the put-call parity theorem to price the put option
Function BS_PUT() As Double
    BS_PUT = BS_Call - Me.txtBinomialS + _
        (Me.txtBinomialX * Exp(-Me.txtBinomialr * Me.txtTimeT))
End Function
Chapter 43
Combinatorial Methods for Constructing Credit Risk Ratings

Alexander Kogan and Miguel A. Lejeune
Abstract This study uses a novel combinatorial method, the Logical Analysis of Data (LAD), to reverse-engineer and construct credit risk ratings, which represent the creditworthiness of financial institutions and countries. The proposed approaches are shown to generate transparent, consistent, self-contained, and predictive credit risk rating models, closely approximating the risk ratings provided by some of the major rating agencies. The scope of applicability of the proposed method extends beyond the rating problems discussed in this study, and it can be used in many other contexts where ratings are relevant. The proposed methodology is applicable in the general case of inferring an objective rating system from archival data, given that the rated objects are characterized by vectors of attributes taking numerical or ordinal values.

Keywords Credit risk rating · Reverse-engineering · Logical analysis of data · Combinatorial optimization · Data mining
A. Kogan: Rutgers Business School, Rutgers University, Newark-New Brunswick, NJ, USA and RUTCOR – Rutgers Center for Operations Research, Piscataway, NJ, USA; e-mail: [email protected]
M.A. Lejeune: George Washington University, Washington, DC, USA; e-mail: [email protected]

43.1 Introduction

43.1.1 Importance of Credit Risk Ratings

Credit ratings published by such agencies as Moody's, Standard & Poor's, and Fitch are considered important indicators for financial markets, providing critical information about the likelihood of future default. The importance of credit ratings is recognized by international bodies, as manifested by the Basel Capital Accord (2001, 2006). The progressively
increasing importance of credit risk ratings is driven by the dramatic expansion of investment opportunities associated with the globalization of the world economies. Since these opportunities are often risky, the internationalized financial markets have to rely on agencies' ratings for the assessment of credit risk. The importance of credit risk rating systems is manifold, as shown below:

Credit approval: the credit risk rating plays a major role in the credit pre-approval decisions. Indeed, a binary decision model in which the credit risk rating of the obligor is an explanatory variable is typically used to pre-approve the decision to grant or not to grant the credit.

Pricing: in case of loan pre-approval, the credit risk rating impacts the conditions (interest rate, covenants, collaterals, etc.) under which the final credit is granted (Treacy and Carey 2000). The credit risk rating is also utilized in the operations subsequent to the pre-approval stage.

Provisioning: the credit risk rating is a variable used for calculating the expected losses, which, in turn, are used to determine the amount of economic capital that a bank must keep to hedge against possible defaults of its borrowers. The expected loss of a credit facility is a function of the probability of default of the borrower to which the credit line is granted, as well as the exposure-at-default and the loss given default associated with the credit facility. Since there is usually a mapping between the credit risk rating and the borrower's probability of default, and provided that the rating of a borrower is a key predictor of the recovery rate associated with a credit facility granted to this borrower, the importance of the credit risk rating in calculating the amounts of economic capital is evident.

Moral hazard: the reliance upon an objective and accurate credit risk rating system is a valuable tool against some moral hazard situations. Some financial institutions do not use a credit risk rating model but instead let lending officers assign credit ratings based on their judgment. The lending officers are in charge of the marketing of banking services, and their performance and therefore
compensation are determined with respect to the “profitability” of the relationships between the bank and its customers. Clearly, the credit risk ratings assigned by the lending officers will affect the volume of the approved loans and the compensation of the officer who, as a result, could have an incentive to assign ratings in a way that is not consistent with the employer’s interests. Thus, the use of a reliable credit risk rating system could lead to avoiding such perverse incentive situations. Basel compliance: the New Basel Capital Accord (Basel II) requires banks to implement a robust framework for the evaluation of credit risk exposures of financial institutions and the capital requirements they must bear. This involves the construction and the cross-validation of accurate and predictive credit risk rating systems. Financial institutions, while taking into account external ratings (e.g., those provided by Fitch, Moody’s, S&P), have increasingly been developing efficient and refined internal rating systems over the past years. The reasons for this trend are multiple. First, the work of rating agencies, providing external, public ratings, has recently come under intense scrutiny and criticism, justified partly by the fact that some of the largest financial collapses of the decade (Enron Corp, and so forth) were not anticipated by the ratings. At his March 2006 testimony before the Senate Banking Committee, the president and CEO of the CFA Institute1 highlighted the conflicts of interest in credit ratings agencies (Wall Street Letter 2006). He regretted that credit rating agencies “have been reluctant to embrace any type of regulation over the services they provide,” and reported that credit rating agencies “should be held to the highest standards of transparency, disclosure and professional conduct. Instead, there are no standards.” The problem is reinforced by the fact that rating agencies, because they charge fees to rated countries, may be suspected of reluctance to downgrade them, due to the possibility of jeopardizing their income sources. This is claimed, for example, by Tom McGuire, an executive vice-president of Moody’s, who states that “the pressure from fee-paying issuers for higher ratings must always be in a delicate balance with the agencies’ need to retain credibility among investors.”2 The necessity to please the payers of the ratings opens the door to many possible issues. Kunczik (2000) notes that the IMF fears the danger that “issuers and intermediaries could be encouraged to engage in rating shopping – a process in which the issuer searches for the least expensive and/or least demanding rating.” The reader is referred to Hammer et al.
(2006) for a discussion of some other criticisms (lack of comprehensibility, procyclicality, black box, lack of predictive and crisis-warning power, regional bias, and so forth) commonly addressed to rating agencies. Second, an internal rating system provides autonomy to a bank’s management in defining credit risk in line with that bank’s core business and best international practices. Internal ratings are also used to report to senior management various key metrics such as risk positions, loan loss reserves, economic capital allocation, and employee compensation (Treacy and Carey 2000). Third, although the Basel Committee on Banking Supervision of the Bank for International Settlements originally favored (i.e., in the 1988 Capital Accord) the ratings provided by external credit ratings agencies, it is now encouraging the Internal Ratings-Based (IRB) approach under which banks use their own internal rating estimates to define and calculate default risk, on the condition that the robust regulatory standards are met and the internal rating system is validated by the national supervisory authorities. The committee has defined strict rules for credit risk models used by financial institutions, and requires them to develop and cross-validate these models to comply with the Basel II standard (Basel Committee on Banking Supervision 2001, 2006). Fourth, aside from the Basel II requirements, banks are developing internal risk models to make the evaluation of credit risk exposure more accurate and transparent. Credit policies and processes will be more efficient, and the quality of data will be improved. This is expected to translate into substantial savings on capital requirements. Today’s eight cents out of every dollar that banks hold in capital reserves could be reduced in banks with conservative credit risk policies, resulting in higher profitability.
¹ Wall Street Letter, 2006. CFA to Senate: follow our lead on credit rating.
² The Economist, July 15, 1995, 62.

43.1.2 Contribution and Structure

The above discussion outlines the impact of credit risk ratings, the possible problems of objectivity and transparency of external (i.e., provided by rating agencies) credit risk ratings, and the importance of and need for financial institutions to develop their own, internal credit risk rating systems. In this paper, we use the novel combinatorial pattern extraction method called the Logical Analysis of Data (LAD) to understand a given credit risk rating system and to develop on its basis a rating system having the following characteristics:

Self-containment: the rating system does not use as predictor variables any other credit risk rating information (non-recursiveness). Clearly, this requirement precludes the use of lagged ratings as independent variables. It is
important to note that this approach is in marked contrast with that of the current literature (see Hammer et al. 2006, 2007 for a discussion). The significant advantage of the non-recursive nature of the rating system is its applicability to not-yet-rated obligors.

Objectivity: the rating system relies only on measurable characteristics of the rated entities.

Transparency: the rating system has a formal, explicit specification.

Accuracy: it is in close agreement with the learned, opaque rating system.

Consistency: the discrepancies between the learned rating system and the constructed one are resolved by subsequent changes in the learned rating system.

Generalizability: the rating system is applicable to evaluating the creditworthiness of obligors in subsequent years or of obligors that were not previously rated.

Basel compliance: it satisfies the Basel II Accord requirements (cross-validation, and so forth).
In this study, we derive a rating system for two types of obligors:

Countries: Eliasson (2002) defines country risk as the "risk of national governments defaulting on their obligations," while Afonso et al. (2007) state that "sovereign credit ratings are a condensed assessment of a government's ability and willingness to repay its public debt both in principal and in interests on time." Haque et al. (1996) define country credit risk ratings compiled by commercial sources as an attempt "to estimate country-specific risks, particularly the probability that a country will default on its debt-servicing obligations."

Financial institutions: bank financial strength ratings represent the "bank's intrinsic safety and soundness" (Moody's 2006).

Two different approaches developed in this paper for reverse-engineering and constructing credit risk ratings will be based on:

Absolute creditworthiness, which evaluates the riskiness of individual obligors; and

Relative creditworthiness, which first evaluates the comparative riskiness of pairs of obligors and then derives the absolute riskiness of entities from their relative riskiness using the combinatorial techniques of partially ordered sets.

As was noted in the literature (de Servigny and Renault 2004), banks and other financial services organizations are not fully utilizing the opportunities provided by the significant increase in the availability of financial data. More specifically, the area of credit rating and scoring is lacking in up-to-date methodological advances (Galindo and Tamayo 2000; de Servigny and Renault 2004; Huang et al. 2004). This provides a tremendous opportunity for the application of modern data mining and machine learning techniques, based on statistics (Jain et al. 2000) and combinatorial pattern extraction (Hammer 1986). The above-described approaches are implemented using the LAD combinatorial pattern extraction method, and it turns out to be a very conclusive case for the application and the contribution of data mining to the credit risk industry.

The manuscript is structured as follows. Section 43.2 provides an overview of LAD (Hammer 1986), which is used to develop a new methodology for reverse-engineering and rating banks and countries with respect to their creditworthiness. Section 43.3 details the absolute creditworthiness model constructed for rating financial institutions and describes the obtained results. Section 43.4 is devoted to the description of two relative creditworthiness models for rating countries and contains an extensive description of the results. Section 43.5 provides concluding remarks.

43.2 Logical Analysis of Data: An Overview³
The logical analysis of data (LAD) is a modern data mining methodology based on combinatorics, optimization, and Boolean logic. LAD can be applied for the analysis and classification of archives containing both binary (Hammer 1986; Crama et al. 1988) and numerical (Boros et al. 1997) data. The novelty of LAD consists in utilizing combinatorial search techniques to discover various combinations of attribute values that are characteristic of the positive or negative character of observations (such as whether a bank is solvent or not, or whether a patient is healthy or sick). Then LAD selects (usually small) subsets of such combinations (usually optimizing a certain quality objective) to construct what is called a model (Boros et al. 2000). We briefly describe below the basic concepts of LAD, referring the reader for a more detailed description to Boros et al. (2000) and Alexe et al. (2007). Observations in archives analyzed by LAD are represented by n-dimensional real-valued vectors that are called positive or negative based on the value of the additional binary (0,1) attribute called the outcome or the class of the observation. Consider a dataset as a collection of M points .aj ; zj / where the outcome zj of observation j has value 1 for a positive outcome and 0 for a negative one, and aj is an n-dimensional vector. Figure 43.1 illustrates a dataset containing five observations described by three variables. Each component aj Œi of the Œ5 3 -matrix in Fig. 43.1 gives the value taken by variable i in observation j . We will use aŒi 3 The presentation in this section is partly based on Hammer et al. (2006).
642
A. Kogan and M.A. Lejeune
Observation
Variables
Outcome
j
a[1]
a[2]
a[3]
zj
1 2 3 4 5
3.5 2.6 1 3.5 2.3
3.8 1.6 2.2 1.4 2.1
2.8 5 3.7 3.9 1
1 1 1 0 0
Fig. 43.1 Set of observations
At least one of the pattern conditions is violated by a suf-
ficiently high proportion of the negative (positive) observations. The number of variables the values of which are constrained in the definition of a pattern is called the degree of the pattern. The fraction of positive (negative) observations covered by a positive (negative) pattern is called the prevalence of it. The fraction of positive (negative) observations among those covered by a positive (negative) pattern is called the homogeneity. Considering the above example, the pattern y1;3 D 0 and y3;1 D 1
to denote the variables corresponding to the components of this datatset. The rightmost column provides the outcome of the observation. LAD discriminates positive and negative observations by constructing a binary-valued function f depending on the n input variables, in such a way that it closely approximates the unknown actual discriminator. LAD constructs this function f as a weighed sum of combinatorial patterns. In order to specify how such a function f is found, we first transform the original dataset into a binarized dataset in which the variables can only take the values 0 and 1. We shall achieve this goal by using indicator variables that show whether the values the variables take in a particular observation are “large” or “small”; more precisely, each indicator variable shows whether the value of a numerical variable does or does not exceed a specified level. This is achieved by defining, for each variable aŒi , a set of K.i / values fci;k jk D 1; : : : ; K.i /g, called cutpoints to which binary variables fyi;k jk D 1; : : : ; K.i /g are associated. The values of these binary variables for each observation .zj ; aj / are then defined as: 1 if aj Œi ci;k j yi;k D 0 otherwise Figure 43.2 provides the binarized dataset corresponding to the data displayed in Fig. 43.1, and shows the values ci;k of the cutpoints k for each variable aŒi and those of the binary j variables yi;k associated with any cutpoint k of variable i in 1 D 1 since a1 Œ1 D 3:5 is observation j . For example, y1;1 greater than c1;1 D 3. Positive (negative) patterns are combinatorial rules obtained as conjunctions of binary variables and their negations, which, when translated to the original variables, constrain a subset of input variables to take values between identified upper and lower bounds, so that: All the pattern conditions are satisfied by a sufficiently
high proportion of the positive (negative) observations in the dataset, and
is a positive one of degree 2 covering one positive observation and no negative observation. Therefore, its prevalence is equal to 100/3% and its homogeneity is equal to 100%. In terms of the original variables, it imposes a strict upper bound (3) on the value of aŒ1 and a strict lower bound (4) on the value of aŒ3 . LAD starts its analysis of a dataset by generating the pandect; that is, the collection of all patterns in a dataset. Note that the pandect of a dataset of typically occurring dimension can contain exponentially large number of patterns, but many of these patterns are either subsumed by other patterns or similar to them. It is therefore important to impose a number of limitations on the set of patterns to be generated, by restricting their degrees (to low values), their prevalence (to high values), and their homogeneity (to high values); these bounds are known as LAD control parameters. The quality of patterns satisfying these conditions is usually much higher than that of patterns having high degrees, or low prevalence, or low homogeneity. Several algorithms have been developed for the efficient generation of large subsets of the pandect corresponding to reasonable values of the control parameters (Boros et al. 2000). The collections of patterns sufficient for classifying the observations in the dataset are called models. A model is supposed to include sufficiently many positive (negative) patterns to guarantee that each of the positive (negative) observations in the dataset is “covered” by (i.e., satisfies the conditions of) at least one of the positive (negative) patterns in the model. Furthermore, good models tend to minimize the number of points in the dataset covered simultaneously by both positive and negative patterns in the model. LAD models can be constructed using the Datascope software (Alexe 2002). LAD classifies observations on the basis of model’s evaluation of them in the following way. An observation (contained in the given dataset or not) satisfying the conditions of some of the positive (negative) patterns in the model, and not satisfying the conditions of any of the negative (positive) patterns in the model, is classified as positive (negative). To classify an observation that satisfies both positive and negative
patterns in the model, LAD utilizes a discriminant that assigns specific weights to the patterns in the model (Boros et al. 2000). To define the simplest discriminant, which assigns equal weights to all positive (negative) patterns, let p and q denote the numbers of positive and negative patterns in a model, and let h and k denote the numbers of those positive and negative patterns in the model that cover a new observation ω. The discriminant Δ(ω) is then calculated as

Δ(ω) = h/p − k/q        (43.1)

and the corresponding classification is determined by the sign of this expression. LAD leaves unclassified any observation for which Δ(ω) = 0, since in this case either the model does not provide sufficient evidence, or the evidence it provides is contradictory. Computational experience with real-life problems has shown that the number of unclassified observations is usually small.

The results of classifying the observations in a dataset can be represented in the form of a classification matrix (Table 43.1). Here, the percentage of positive (negative) observations that are correctly classified is represented by a (respectively, d). The percentage of positive (negative) observations that are misclassified is represented by c (respectively, b). The percentage of positive (negative) observations that remain unclassified is represented by e (respectively, f). Clearly, a + c + e = 100% and b + d + f = 100%. The quality of the classification is defined by:

Q = (a + d)/2 + (e + f)/4        (43.2)

Table 43.1 Classification matrix
                       Classification of observations
Observation classes    Positive    Negative    Unclassified
Positive               a           c           e
Negative               b           d           f

Fig. 43.2 Binarized dataset (the figure shows, for the variables a[1], a[2], a[3] of Fig. 43.1, the cutpoints c_{1,1} = 3, c_{1,2} = 2.4, c_{1,3} = 1.5; c_{2,1} = 3, c_{2,2} = 2; c_{3,1} = 4, c_{3,2} = 3, c_{3,3} = 2, together with the values of the binary variables y^j_{i,k} and the outcomes z = 1, 0, 1, 0, 1 for the observations j = 1, ..., 5)
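To make these mechanics concrete, the following sketch (ours, not part of the chapter; Python with NumPy is assumed, and the cutpoints are those of Fig. 43.2) binarizes a numerical observation and evaluates the equal-weights discriminant of Equation (43.1) over hypothetical pattern lists.

```python
import numpy as np

# Cutpoints of Fig. 43.2, indexed by variable (0-based: a[1] -> 0, etc.).
CUTPOINTS = {0: [3.0, 2.4, 1.5], 1: [3.0, 2.0], 2: [4.0, 3.0, 2.0]}

def binarize(a):
    """Binary indicators y[(i, k)] = 1 iff a[i] >= c_{i,k}."""
    return {(i, k): int(a[i] >= c)
            for i, cs in CUTPOINTS.items() for k, c in enumerate(cs)}

# A pattern is a conjunction of literals (i, k, v) requiring y[(i, k)] == v.
# E.g., the degree-2 positive pattern of the text (upper bound 3 on a[1],
# lower bound 4 on a[3]) becomes [(0, 0, 0), (2, 0, 1)] in this encoding.
def covers(pattern, a):
    y = binarize(a)
    return all(y[i, k] == v for i, k, v in pattern)

def discriminant(pos_patterns, neg_patterns, a):
    """Equal-weights LAD discriminant Delta(omega) = h/p - k/q of Eq. (43.1)."""
    h = sum(covers(pat, a) for pat in pos_patterns)
    k = sum(covers(pat, a) for pat in neg_patterns)
    return h / len(pos_patterns) - k / len(neg_patterns)

# Hypothetical model with one pattern on each side.
pos, neg = [[(0, 0, 0), (2, 0, 1)]], [[(0, 0, 1), (2, 2, 0)]]
delta = discriminant(pos, neg, np.array([2.1, 1.8, 4.4]))
label = "positive" if delta > 0 else "negative" if delta < 0 else "unclassified"
```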
43.3 Absolute Creditworthiness: Credit Risk Ratings of Financial Institutions

(The presentation in this section is based on Hammer et al. (2007b).)

43.3.1 Problem Description
The capability of evaluating the credit quality of banks has become extremely important in the last 30 years, given the increase in the number of bank failures: during the period from 1950 to 1980, bank failures averaged fewer than seven per year, whereas during the period from 1986 to 1991, they averaged 175 per year (Barr and Siems 1994). Curry and Shibut (2000) report that the so-called savings and loan crisis cost about $123.8 billion. Central banks fear widespread bank failures, as these could exacerbate cyclical recessions and result in more severe financial crises (Basel Committee on Banking Supervision 2004). More accurate credit risk models for banks could enable the early identification of problematic banks, which is seen by the Bank of International Settlements (2004) as a necessary condition for avoiding failures, and could serve the regulators in their efforts to minimize bailout costs. The evaluation and rating of the creditworthiness of banks and other financial organizations is particularly challenging, since banks and insurance companies appear to be more opaque than firms operating in other industrial sectors. Morgan (2002) attributes this to the fact that banks hold certain assets (loans, trading assets, and so forth) whose risks change fast and are very difficult to assess, an opacity that is further compounded by banks' high leverage. Therefore, it is not surprising that the main rating agencies (Moody's and S&P) disagree much more often about the ratings given to banks than about those given to obligors in other sectors. The difficulty of accurately rating those organizations is also due to the fact that the rating migration volatility of banks
is historically significantly higher than it is for corporations and countries, and that banks tend to have higher default rates than corporations (de Servigny and Renault 2004). Another distinguishing characteristic of the banking sector is the external support (i.e., from governments) that banks receive and that other corporate sectors do not (Fitch Ratings 2006). A thorough review of the literature pertaining to the rating and evaluation of the credit risk of financial institutions can be found in Hammer et al. (2007b). In the next sections, we shall:
- Identify a set of variables that provides sufficient information to accurately replicate the Fitch bank ratings;
- Construct LAD patterns to discriminate between banks with high and low ratings;
- Construct an optimized model utilizing (some of) the LAD patterns, which is capable of distinguishing between banks with high and low ratings;
- Define an accurate bank rating system on the basis of the discriminant values provided by the constructed model; and
- Cross-validate the proposed rating system.
Table 43.2 Fitch individual rating system (Fitch Ratings 2001)
Category (numerical scale): Description
A (9): A very strong bank. Characteristics may include outstanding profitability and balance sheet integrity, management, or operating environment
B (7): A strong bank. There are no major concerns regarding the bank
C (5): An adequate bank that, however, possesses one or more troublesome aspects. There may be some concerns regarding its profitability, balance sheet integrity, management, operating environment or prospects
D (3): A bank that has weaknesses of internal and/or external origin. There are concerns regarding its profitability, management, balance sheet integrity, franchise, operating environment or prospects
E (1): A bank with very serious problems that either requires or is likely to require external support
43.3.2 Data

43.3.2.1 External Credit Risk Ratings of Financial Institutions

This section starts with a brief description of the Fitch Individual Bank rating system. Long- and short-term credit ratings provided by Fitch constitute an opinion on the ability of an entity to meet financial commitments (interest, preferred dividends, or repayment of principal) on a timely basis (Fitch Ratings 2001). These ratings are comparable worldwide and are assigned to countries and corporations, including banks. Fitch bank ratings include individual and support ratings. Support ratings comprise five rating categories that reflect the likelihood that a banking institution will receive support either from its owners or from the governmental authorities if it runs into difficulties. The availability of support, though critical, does not completely reflect the likelihood that a bank will remain solvent in adverse situations. To complement a support rating, Fitch also provides an individual bank rating that evaluates credit quality separately from any consideration of outside support. This rating is commonly viewed as assessing how a bank would be rated if it were entirely independent and could not rely on external support. It supposedly takes into consideration such factors as profitability and balance sheet integrity, franchise, management, operating environment, and prospects.
Table 43.2 presents a detailed description of the main rating categories characterizing the Fitch individual bank credit rating system. In addition, Fitch uses gradations among these five ratings: A/B, B/C, C/D and D/E, with corresponding numerical values of 8, 6, 4 and 2, respectively, yielding nine rating categories in total. This conversion of the Fitch individual bank ratings into a numerical scale is commonly used (see, e.g., Poon et al. 1999). Since individual bank credit ratings are comparable across different countries, they will be used in the remaining part of this paper.
43.3.2.2 Variables and Observations

We use the following 11 financial variables (loans, other earning assets, total earning assets, non-earning assets, net interest revenue, customer and short-term funding, overheads, equity, net income, total liability and equity, operating income) and 9 representative financial ratios as predictors in our model. The ratios are: the ratio of equity to total assets (asset quality); the net interest margin; the ratio of interest income to average assets; the ratio of other operating income to average assets; the ratio of non-interest expenses to average assets; the return on average assets (ROAA); the return on average equity
(ROAE); the cost to income ratio (operations); and the ratio of net loans to total assets (liquidity). The values of these variables were collected at the end of 2000 and come from the Bankscope database. As an additional variable, we use the S&P risk rating of the country where the bank is located. The S&P country risk rating scale comprises 22 different categories (from AAA to D). We convert these categorical ratings into a numerical scale, assigning the largest numerical value (21) to the countries with the highest rating (AAA). Similar numerical conversions of country risk ratings are also used by Ferri et al. (1999) and Sy (2004), and Bloomberg has developed a standard cardinal scale for comparing Moody's, S&P, and Fitch-BCA ratings. Our dataset consists of 800 banks rated by Fitch and operating in 70 different countries (247 in Western Europe, 51 in Eastern Europe, 198 in Canada and the USA, 45 in developing Latin-American countries, 47 in the Middle East, 6 in Oceania, 6 in Africa, 145 in developing Asian countries, and 55 in Hong Kong, Japan, and Singapore).
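For illustration, this conversion can be written as a simple lookup (a sketch of ours; only the endpoints AAA → 21 and D → 0 are fixed by the text, so the intermediate category list below is an assumption):

```python
# Hypothetical ordering of the 22 S&P categories from best to worst;
# only the endpoints (AAA -> 21, D -> 0) are fixed by the text.
SP_SCALE = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
            "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-",
            "B+", "B", "B-", "CCC+", "CCC", "CCC-", "CC", "C", "D"]

SP_TO_NUM = {cat: 21 - rank for rank, cat in enumerate(SP_SCALE)}

assert SP_TO_NUM["AAA"] == 21 and SP_TO_NUM["D"] == 0
```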
43.3.3 LAD Model for Bank Ratings

Our design of an objective and transparent bank rating system on the basis of the LAD methodology is guided by the properties of LAD as a classification system. Therefore, we define a classification problem associated with the bank rating problem, construct an LAD model for it, and then define a bank rating system rooted in this LAD model. We define as positive observations the banks that have been rated by Fitch
as A, A/B or B, and as negative observations those whose Fitch rating is D, D/E or E. In the binarization process, cutpoints were introduced for 19 of the 24 numerical variables. Actually, the other five numerical variables were also binarized, but since they turned out to be redundant, only the variables shown in Table 43.3 were retained for constructing the model. Table 43.3 provides all the cutpoints used in pattern and model construction. For example, two cutpoints (24.8 and 111.97) are used to binarize the numerical variable "Profit before Tax" (PbT); that is, two binary indicator variables replace PbT, one telling whether PbT exceeds 24.8, and the other telling whether PbT exceeds 111.97. The first step in applying the LAD technique to the problem binarized with the help of these variable cutpoints was the identification of a collection of powerful patterns. One example of such a powerful negative pattern is: (1) the country risk rating is strictly lower than A, and (2) the profits before tax are at most equal to 111.96 million. One can see that these conditions describe a negative pattern, since none of the positive observations (i.e., banks rated A, A/B or B) satisfy both of them, while no less than 69.11% of the negative observations (i.e., those banks rated D, D/E or E) do satisfy both conditions. This pattern has degree 2, prevalence 69.11%, and homogeneity 100%. The model we have developed for bank ratings is very parsimonious, consisting of only 11 positive and 11 negative patterns, and is built on a support set of only 19 of the 24 original variables. All the patterns in the model are of degree at most 3, have perfect homogeneity (100%), and very substantial prevalence (averaging 50.9% for the positive and 37.7% for the negative patterns).
Table 43.3 Cutpoints
Numerical variables                Cutpoints
Country Risk Rating                11, 16, 19.5, 20
Non Int Exp/Avg Assets             2.00, 2.77, 3.71, 4.93
Profit before Tax                  24.8, 111.97
Non-Earning Assets                 364
Return on Average Equity           11.82, 15.85, 19.23
Equity                             370.45
Customer and Short Term Funding    4151.3, 9643.8
Net Loans/Total Assets             44.95, 53.90, 59.67, 66.50
Net Int Rev/Avg Assets             3.02
Oth Op Inc/Avg Assets              0.83, 1.28, 1.86, 3.10
Operating Income (Memo)            1360.74
Other Earning Assets               3809, 9564.9
Return on Average Assets           0.30, 0.80, 1.52
Equity/Total Assets                4.90
Total Assets                       5735.8
Overhead                           127, 324
Net Interest Margin                1.87
Cost to Income Ratio               50.76, 71.92
Other Operating Income             46.8, 155.5, 470.5
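The prevalence and homogeneity of a candidate pattern can be verified directly on the data. The sketch below (ours; the pandas column names and the boolean high_rating label are assumptions) tests the degree-2 negative pattern quoted above; the threshold 16 stands in for "strictly lower than A", consistent with the country-risk-rating cutpoint 16 in Table 43.3.

```python
import pandas as pd

def negative_pattern_stats(df, mask, label="high_rating"):
    """Prevalence = share of negative (low-rated) banks covered by the pattern;
    homogeneity = share of covered banks that are indeed negative."""
    covered = df[mask]
    prevalence = (~covered[label]).sum() / (~df[label]).sum()
    homogeneity = (~covered[label]).mean() if len(covered) else 1.0
    return prevalence, homogeneity

# The pattern: country risk rating below A and profit before tax <= 111.96.
# mask = (df["country_risk_rating"] < 16) & (df["profit_before_tax"] <= 111.96)
# On the chapter's data this should yield prevalence 69.11%, homogeneity 100%.
```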
43.3.4 LAD Model Evaluation
43.3.4.1 Accuracy and Robustness of the LAD Model

The classification of the banks whose ratings are A, A/B, B, D, D/E or E with the above LAD model is 100% accurate. We use ten two-folding experiments to cross-validate the model. In each of these experiments, we randomly assign the observations to a training set and a testing set of equal size; we use the training set to build the model, and we then apply the model to classify the observations in the testing set. In the second half of the experiment, we reverse the roles of the two sets. The average accuracy and the standard deviation of the accuracy are equal to 95.12% and 0.03, respectively. These results highlight the predictive capability and the robustness of the derived LAD model. We have also computed the correlation between the discriminant values of the LAD model and the bank credit risk ratings (represented on their numerical scale). Despite the fact that the correlation measure takes into account all the banks in the dataset (i.e., not only those that were used in creating the LAD model, but also those rated B/C, C or C/D, which were not used in the learning process), the correlation is very high, equal to 80.70%, attesting to the strong predictive power of the LAD model. The ten two-folding experiments described above were then used to verify the stability of the correlation between the LAD discriminant values and the bank ratings. The average correlation is equal to 80.04%, with a standard deviation of 0.04, another testimony to the stability of the close positive association between the LAD discriminant values and the bank ratings.
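A minimal rendering of these ten two-folding experiments follows (ours; fit and accuracy are stand-ins for the LAD training and evaluation routines, which the chapter delegates to the Datascope software):

```python
import numpy as np

def two_folding_cv(X, y, fit, accuracy, n_experiments=10, seed=0):
    """Each experiment splits the data into two random halves; each half is
    used once for training and once for testing (two accuracies per run)."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_experiments):
        idx = rng.permutation(len(y))
        a, b = idx[: len(y) // 2], idx[len(y) // 2:]
        for train, test in ((a, b), (b, a)):
            model = fit(X[train], y[train])
            accs.append(accuracy(model, X[test], y[test]))
    return float(np.mean(accs)), float(np.std(accs))
```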
43.3.4.2 From LAD Discriminant Values to Ratings

The objective of this section is to map the numerical values of the LAD discriminant to the nine bank rating categories (A, A/B, ..., E). This is accomplished by solving a nonlinear optimization problem that partitions the interval of discriminant values into nine sub-intervals, which we associate with the nine rating categories. The partitioning is determined through cutpoints x_i such that −1 = x_0 ≤ x_1 ≤ x_2 ≤ ... ≤ x_8 ≤ x_9 = 1; a bank whose rating category is i should have its discriminant value in the interval (x_i, x_{i+1}], the categories being indexed from the lowest (E) to the highest (A). Since such a perfect partitioning may not exist, we replace the LAD discriminant values d_i by adjusted discriminant values δ_i, and find values of δ_i for which such a partitioning exists and which are "as close as possible" to the values d_i, in the sense of minimizing the mean square approximation error. Referring to the rating category of bank i by j(i) and to the set of banks by N, we solve the convex nonlinear problem

minimize Σ_{i∈N} (δ_i − d_i)²
subject to δ_i ≤ x_{j(i)+1}, i ∈ N
           x_{j(i)} < δ_i, i ∈ N
           −1 = x_0 ≤ x_1 ≤ x_2 ≤ ... ≤ x_8 ≤ x_9 = 1
           −1 ≤ δ_i ≤ 1, i ∈ N        (43.3)

on the NEOS server (Czyzyk et al. 1998) with the solver Lancelot to determine the cutpoints x_j and the adjusted discriminant values δ_i. These cutpoints can then be used for rating not only the banks in the training sample, but also banks outside it: the rating of a bank is defined by the particular sub-interval containing its LAD discriminant value. As compared to ordered logistic regression, the LAD-based approach has the additional advantage that it can be used to generate any desired number of rating categories. The LAD model can take the form of a binary classification model and can be used to pre-approve a loan to a bank. The LAD rating approach can also be used to derive models with higher granularity (i.e., more than nine rating categories), which are used by banks to further differentiate their customers and to tailor their credit pricing policies accordingly.
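Problem (43.3) was solved on NEOS with Lancelot; for a local prototype one can use a general-purpose NLP solver instead. The sketch below (ours) uses scipy.optimize.minimize with SLSQP, indexes the nine categories 0, ..., 8 so that category c corresponds to the interval (x_c, x_{c+1}], and replaces the strict inequality with a small margin eps.

```python
import numpy as np
from scipy.optimize import minimize

def fit_rating_cutpoints(d, cat, eps=1e-6):
    """d: LAD discriminant values of the banks; cat: array of their rating
    category indices in 0..8. Returns the cutpoints x (with x[0] = -1,
    x[9] = 1) and the adjusted discriminants delta solving problem (43.3)."""
    n = len(d)
    unpack = lambda z: (np.concatenate(([-1.0], z[:8], [1.0])), z[8:])

    objective = lambda z: np.sum((unpack(z)[1] - d) ** 2)
    cons = [
        {"type": "ineq", "fun": lambda z: np.diff(unpack(z)[0])},                  # x monotone
        {"type": "ineq", "fun": lambda z: unpack(z)[0][cat + 1] - unpack(z)[1]},   # delta_i <= x_{j(i)+1}
        {"type": "ineq", "fun": lambda z: unpack(z)[1] - unpack(z)[0][cat] - eps}, # x_{j(i)} < delta_i
    ]
    z0 = np.concatenate((np.linspace(-0.8, 0.8, 8), np.clip(d, -1.0, 1.0)))
    res = minimize(objective, z0, bounds=[(-1, 1)] * (8 + n),
                   constraints=cons, method="SLSQP")
    return unpack(res.x)
```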
43.3.4.3 Conformity of Fitch and LAD Bank Ratings

In this section, the goal is to investigate the goodness-of-fit of our rating system and to observe to what extent the original LAD discriminant values fit into the identified rating sub-intervals. We denote by q_k (k = 0, ..., 8) the fraction of banks whose rating category, determined in this way, differs from its Fitch rating by exactly k categories. It is worth recalling that the rating cutpoints were derived using all the banks in the sample, but that the LAD discriminant values were obtained by taking into account only the banks rated A, A/B, B, D, D/E and E. This is why we calculate the discrepancy counts separately for the banks rated A, A/B, B, D, D/E and E, and for those rated B/C, C and C/D. Table 43.4 highlights the high goodness-of-fit of the proposed LAD model. More than 95% of the banks are rated within two categories of their Fitch rating, with about 30% of the banks receiving a rating identical to their Fitch rating, and another 51% being off by exactly one rating category. The very high concordance between the LAD and the Fitch ratings is illustrated by the weighted average distances between the two ratings, equal to (1) 0.93 for the categories A, A/B, B, D, D/E and E; (2) 0.98 for the categories B/C, C and C/D; and (3) 0.95 for all banks in the data set.
Table 43.4 Discrepancy analysis (q_k)
k        N = {A, A/B, B, D, D/E, E}    N = {B/C, C, C/D}    N = {A, A/B, B, B/C, C, C/D, D, D/E, E}
0        29.60%                        29.36%               29.50%
1        53.28%                        47.71%               51.00%
2        12.05%                        18.96%               14.88%
3        4.23%                         3.36%                3.88%
4        0.85%                         0.61%                0.75%
5-6-7-8  0.00%                         0.00%                0.00%
Table 43.5 Cross-validated discrepancy analysis
k      0        1        2        3       4       5       6       7       8
q̄_k    29.01%   51.18%   14.10%   4.36%   1.14%   0.20%   0.01%   0.00%   0.00%
The stability of the proposed rating system, and its suitability for evaluating the creditworthiness of "new" banks (that is, banks that are not rated by agencies, or banks with which the rater has not dealt previously), is underscored by the fact that the goodness-of-fit of the ratings for the banks not used in deriving the LAD model is very close to the goodness-of-fit for the banks used in deriving it (i.e., those rated A, A/B, B, D, D/E and E). In order to appraise the robustness of the proposed rating system, we apply the two-folding procedure described above ten times to derive the average discrepancy counts q̄_k (Table 43.5) of the bank ratings predicted for the testing sets. The fact that, on average, the difference between the Fitch and the LAD ratings is only 0.98 is a very strong indicator of the LAD model's stability and of the absence of overfitting.
43.3.4.4 Importance of Variables

While the focus in LAD is on discovering how the interactions between the values of small groups of variables (as expressed in patterns) affect the outcome (i.e., the bank ratings), one can also use the LAD model to learn about the importance of individual variables. A natural measure of the importance of a variable in an LAD model is the frequency of its appearance in the model's patterns. The three most important variables in the 22 patterns constituting the LAD model are the credit risk rating of the country where the bank is located, the return on average total assets, and the return on average equity. The importance of the country risk rating variable, which appears in 18 of the 22 patterns, can be explained by the fact that credit rating agencies are reluctant to give an entity a better credit risk rating than that of the country where it is located. Both the return on average assets and the return on average equity variables appear in six patterns. These two ratios, respectively representing the efficiency of assets in generating profits and that of shareholders' equity in generating profits, are critical indicators of a company's prosperity, and are presented by Sarkar and Sriram (2001) as key predictors in assessing the wealth of a bank. The return on average equity is also found to be significant for predicting the rating of U.S. banks (Huang et al. 2004).

43.3.5 Remarks on Reverse-Engineering Bank Ratings

The results presented above demonstrate that the LAD approach can be used to derive rating models with varying granularity levels:
- A binary classification model to be used for the pre-approval operations;
- A model with the same granularity as the benchmarked rating model; and
- A model with higher discrimination power; that is, with higher granularity than that of the benchmarked rating model, to allow the bank to refine its pricing policies and allocate regulatory capital.

We show that the LAD model cross-validates extremely well and is therefore highly generalizable, and could be used by financial institutions to develop internal, Basel-compliant rating models. The availability and processing of data have been major obstacles in using credit risk rating models. Until recently, many banks did not maintain such data sets and were heavily dependent on qualitative judgments. It is only after the currency crises of the 1990s and the requirements imposed by the Basel Accord that financial institutions have had an incentive to collect and maintain the necessary data and databases.
The move toward heavier reliance on rating models is based on the assumption that models produce more consistent ratings and that, over the long haul, operating costs will diminish since less labor will be required to produce ratings. The model proposed in this paper will reinforce the incentives to develop and rely on credit risk models in bank operations because:
- The accuracy and predictive ability of the proposed model will guarantee dependable ratings;
- Its parsimony will alleviate the costs of extracting and maintaining large data sets; and
- It will result in leaner loan approval operations and faster decisions, and thus reduce the overall operating costs.
43.4 Relative Creditworthiness: Country Risk Ratings

(The presentation in this section is based on Hammer et al. (2006, 2007b).)

43.4.1 Problem Description

Country risk ratings have critical importance in international financial markets, since they are the primary determinants of the interest rates at which countries can obtain credit. There are numerous examples of countries having to pay higher rates on their borrowing following a rating downgrade; an often-cited example is Japan. As mentioned above, another critical aspect of country risk ratings is their influence on the ratings of national banks and companies, which makes those entities more or less attractive to foreign investors. That is why the extant literature calls country risk ratings the "pivot of all other country's ratings" (Ferri et al. 1999), and considers them the credit risk ceiling for all obligors located in a country (Eliasson 2002; Mora 2006). The historical record shows the reluctance of raters to give a company a higher credit rating than that of the sovereign country where the company operates. Contractual provisions sometimes prohibit institutional investors from investing in debt rated below a prescribed level. It has been demonstrated (Ferri et al. 1999) that variations in sovereign ratings drastically affect the ratings of banks operating in low-income countries, while the ratings of banks operating in high-income countries (Kaminsky and Schmukler 2002; Larrain et al. 1997) do not depend as much on country ratings. Banks, insurance companies, and public institutions frequently assess the amount of exposure they have in each country and establish lending limits taking into account the estimated level of country risk. Credit managers in multinational corporations have to assess evolving
conditions in foreign countries in order to decide whether to request letters of credit for particular transactions. Country risk estimates and their updates are utilized on a real-time basis by multinational corporations facing constant fluctuations in international currency values and difficulties associated with moving capital and profits across national boundaries. Financial institutions have to rely on accurate assessments of credit risk to comply with the Basel requirements of the Bank of International Settlements. The pressure to obtain higher ratings can even lead to fraud attempts: a notorious case was Ukraine's attempt to obtain IMF credits through misleading reporting of its foreign exchange reserves data. The existing literature on country risk, defined by Bourke and Shanmugam (1990) as "the risk that a country will be unable to service its external debt due to an inability to generate sufficient foreign exchange," recognizes both financial/economic and political components of country risk. There are two basic approaches to interpreting the reasons for defaulting. The first one is the debt-service capacity approach, which considers the deterioration of the solvency of a country as preventing it from fulfilling its commitments. This approach views country risk as a function of various financial and economic country parameters. The second one is the cost-benefit approach, which considers a default on commitments or a rescheduling of debt as a deliberate choice of the country; the country accepts possible long-term negative effects (e.g., the country's exclusion from certain capital markets (Reinhart 2002)) as preferable to repayment. As this choice is politically driven, this approach includes political country parameters, in addition to the financial and economic ones, in country risk modeling (Brewer and Rivoli 1990; Rivoli and Brewer 1997; Citron and Nickelsburg 1987).
43.4.2 Data

43.4.2.1 Ratings

We analyze Standard & Poor's foreign currency country ratings, as opposed to the ratings for local currency debt. The former is the more important problem, since a sovereign government usually has a lower capacity to repay external (as opposed to domestic) debt and, as an implication, the international bond market views foreign currency ratings as the decisive factor (Cantor and Packer 1996). This is manifested by the much higher likelihood that international investors acquire foreign currency obligations rather than domestic ones. The evaluation of foreign currency ratings has to take into account not only the economic factors but also the country intervention risk; that is, the risk that a country imposes measures such as exchange controls or a debt moratorium. The evaluation of local currency ratings need not take this country intervention risk into account.
Table 43.16 lists the different country risk levels used by S&P's country ratings, and it also provides the descriptions associated with these labels. A rating inferior to BBB− indicates that a country is non-investment grade (speculative). A rating of CCC+ or lower indicates that a country presents serious default risks. BB indicates the least degree of speculation and CC the highest. The addition of a plus or minus sign modifies the rating (between AA and CCC) to indicate relative standing within the major rating category. These subcategories are treated as separate ratings in our analysis.
43.4.2.2 Selected Variables

The selection of relevant variables is based on three criteria. The first criterion is the variable's significance (relevance) in assessing countries' creditworthiness; the review of the literature on predictors for country risk ratings (see Hammer et al. 2007) played an important role in defining the set of candidate variables to include in our model. The second criterion is the availability of complete and reliable statistics, which allows us to maintain the significance and the scope of our analysis. The third criterion is the uniformity of data across countries. This is why we decided against incorporating the unemployment rate statistics provided by the World Bank, since they are compiled according to different definitions: there are significant differences between countries in the treatment of temporarily laid-off workers and of those looking for their first job, and in the criteria for being considered unemployed. After applying the criteria of relevance, availability and uniformity described above, the following variables were incorporated in our model: Gross Domestic Product per capita (GDPc), Inflation Rate (IR), Trade Balance (TB), Exports' Growth Rate (EGR), International Reserves (RES), Fiscal Balance (FB), Debt to GDP (DGDP), Political Stability (PS), Government Effectiveness (GE), Corruption (COR), Exchange Rate (ER), and Financial Depth and Efficiency (FDE). In addition to the eight economic variables that have already been used in the country credit risk rating literature, a new one, called financial depth and efficiency (FDE), was added. This variable is measured as the ratio of the domestic credit provided by the banking sector to the GDP, and it captures the growth of the banking system, since it reflects the extent to which savings are financial. The financial depth and efficiency variable was first considered in Hammer et al. (2006, 2007) for the evaluation of country risk ratings. This variable is included because it captures some information that is relevant to the assessment of the creditworthiness of a country and that is not accounted for by the other variables, in particular by the fiscal balance of a country. Our analysis utilized the values of these nine economic/financial variables and three political variables taken
at the end of 1998. Our dataset included the values of these 12 variables for the 69 countries considered: 24 industrialized countries, 11 Eastern European countries, 8 Asian countries, 10 Middle Eastern countries, 15 Latin American countries, and South Africa. The dependent variable is the S&P’s country risk ratings for these countries at the end of 1998. The sources utilized for compiling the values of the economic/financial variables include the International Monetary Fund (World Economic Outlook database), the World Bank (World Development Indicators database) and, for the ratio of debt to gross domestic product, Moody’s publications. Values of political variables are taken from Kaufmann et al. (1999a, b), whose database is a joint product of the Macroeconomics and Growth, Development Research Group and Governance, Regulation and Finance Institutes affiliated with the World Bank.
43.4.3 Rating Methodologies

In the next subsections, we derive two new combinatorial models for country risk rating by reverse-engineering and "learning" from past S&P's ratings. The models are developed using the novel combinatorial-logical technique of the Logical Analysis of Data, which derives a new rating system solely from the qualitative information representing pairwise comparisons of country riskiness. The approach is based on a relative creditworthiness concept, which posits that the knowledge of a (pre)order of obligors with respect to their creditworthiness should be the sole source for creating a credit risk rating system. Stated differently, only the order relation between countries should determine the ratings. Thus, the inference of a model of the order relation between countries is the objective of this study. This is in perfect accordance with the general view of the credit risk rating industry. Altman and Rijken (2004) state that the objective of rating agencies is "to provide an accurate relative (i.e., ordinal) ranking of credit risk," which is confirmed by Fitch Ratings (2006), saying that "Credit ratings express risk in relative rank order, which is to say they are ordinal measures of credit." Bhatia (2002) adds that, "Although ratings are measures of absolute creditworthiness, in practice, the ratings exercise is highly comparative in nature... On one level, the ratings task is one of continuously sorting the universe of rated sovereigns – assessed under one uniform set of criteria – to ensure that the resulting list of sovereigns presents a meaningful global order of credit standing. On another level, the sorting task is constrained by a parallel need to respect each sovereign's progression over time, such that shifting peer comparisons is a necessary condition – but not a sufficient one – for upward or downward ratings action."
A self-contained model of country risk ratings can hardly be developed by standard econometric methods, since the dataset contains information on only 69 countries, each described by 12 explanatory variables. A possible alternative using combinatorial techniques is to examine the relative riskiness of one country compared to another, rather than modeling the riskiness of each individual country. This approach has the advantage of allowing the modeling to be based on a much richer dataset (2,346 pairs of countries), which consists of the comparative descriptions of all pairs of countries in the current dataset. The models utilize the values of the nine economic and three political variables associated with a country, but do not use previous years' ratings, directly or indirectly. This is a very important feature, because the inclusion of information from past ratings (lagged ratings, rating history) does not allow the construction of a self-contained rating system and does not make it possible to rate the creditworthiness of not-yet-rated countries. We refer to Hammer et al. (2006, 2007) for a more detailed discussion of the advantages of building a non-recursive country risk rating system. Moreover, the proposed LAD models completely eliminate the need to view the ratings as numbers. Section 43.4.3.1 presents the common features of the two developed LAD models, while Sects. 43.4.3.2 and 43.4.3.3 discuss the specifics of the LAD-based Condorcet ratings and the logical rating scores, respectively.
43.4.3.1 Commonalities

43.4.3.1.1 Pairwise Comparison of Countries: Pseudo-observations

Every country i ∈ I = {1, ..., 69} in this study is described by a 13-dimensional vector C_i, whose first component is the country risk rating given by Standard and Poor's, while the remaining 12 components specify the values of the nine economic/financial and three political variables. A pseudo-observation P_ij, associated with every pair of countries i, j ∈ I, provides a comparative description of the two countries. Every pseudo-observation is also described by a 13-dimensional vector. The first component is an indicator that takes the value 1 if country i in the pseudo-observation P_ij has a higher rating than country j; the value −1 if country j has a higher rating than country i; and the value 0 if the two countries have the same rating. The other components k, k = 2, ..., 13, of the pseudo-observation P_ij are derived by taking the differences of the corresponding components of C_i and C_j:

P_ij[k] = C_i[k] − C_j[k],  k = 2, ..., 13        (43.4)
Transformation (43.4) alleviates the problems related to the small size |I| of the original dataset by constructing a substantially larger dataset containing |I|·(|I|−1) pseudo-observations. While the pseudo-observations are not independent, since P_hi + P_ij = P_hj, Hammer et al. (2006) show that this does not create any problems for the LAD-based combinatorial data analysis techniques. We illustrate the construction of pseudo-observations with Japan and Canada. Rows 2 and 3 in Table 43.6 display the values of the 12 economic/financial and political variables, as well as the S&P's rating, for Japan and Canada (at the end of December 1998), while rows 4 and 5 report the pseudo-observations P_Japan,Canada and P_Canada,Japan derived from the country observations C_Japan and C_Canada. Since Japan and Canada are rated AAA and AA+, respectively, by S&P's at the end of December 1998, and the rating AAA is better than the rating AA+, the first component of the pseudo-observation P_Japan,Canada is equal to 1. The set of pseudo-observations is anti-symmetric.
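Construction (43.4) vectorizes directly; a sketch (ours, with NumPy) follows.

```python
import numpy as np

def pseudo_observations(ratings, X):
    """ratings: numerical country ratings (length |I|); X: |I| x 12 matrix of
    the economic/financial and political variables. Returns all ordered pairs
    with the outcome indicator z_ij and the differences P_ij of Eq. (43.4)."""
    n = len(ratings)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    z = np.array([np.sign(ratings[i] - ratings[j]) for i, j in pairs])
    P = np.array([X[i] - X[j] for i, j in pairs])
    return pairs, z, P
```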
Table 43.6 Examples of country and pseudo-observations
                 S&P's rating  FDE     RES      IR     TB       EGR     GDPc      ER     FB    DGDP   PS      GE      COR
C_Japan          AAA           138.44  5.168    0.65   21.7471  −2.54   24,314.2  0.839  −7.7  0.47   1.153   0.839   0.724
C_Canada         AA+           94.69   1.01964  0.99   55.9177  8.79    24,855.7  0.939  0.9   0.5    1.027   1.717   2.055
P_Japan,Canada   1             43.75   4.15     −0.34  −34.17   −11.33  −541.5    −0.1   −8.6  −0.03  0.126   −0.878  −1.331
P_Canada,Japan   −1            −43.75  −4.15    0.34   34.17    11.33   541.5     0.1    8.6   0.03   −0.126  0.878   1.331

43.4.3.1.2 Construction of Relative Preferences

The LAD-based reverse-engineering of Standard & Poor's rating system starts with deriving an LAD model (constructed as a weighted sum of patterns) from the archive of all those pseudo-observations P_ij corresponding to pairs of countries i and j having different S&P's ratings. A model resulting from applying LAD to the 1998 dataset consists of 320 patterns. As an example, let us describe two of these patterns. The positive pattern

FDE > 28.82; GDPc > 1,539.135; GE > 0.553

can be interpreted in the following way. If country i is characterized by (i) a financial depth and efficiency (FDE) exceeding that of country j by at least 28.82, (ii) a gross domestic product per capita (GDPc) exceeding that of country j by at least 1,539.135, and (iii) a government effectiveness (GE) exceeding that of country j by at least 0.553, then country i is perceived as more creditworthy than country j. Similarly, the negative pattern

GDPc < −4,886.96; ER < −0.195; COR < −0.213
can be interpreted in the following way. If country j is characterized by (i) a gross domestic product per capita (GDPc) exceeding that of country i by 4,886.96, (ii) an exchange rate (ER) exceeding that of country i by 0.195, and (iii) a level of incorruptibility (COR) exceeding that of country i by 0.213, then country i is perceived as less creditworthy than country j.

The constructed LAD model allows us to compute the discriminant Δ(P_ij) for each pseudo-observation P_ij (i ≠ j). We call the values Δ(P_ij) of the discriminant the relative preferences. They can be interpreted as measuring how "superior" country i's rating is to that of country j. We call the [69 × 69]-dimensional anti-symmetric matrix Δ, having the relative preferences as components, the relative preference matrix. Its components are the relative preferences Δ(P_ij) (i ≠ j) associated with every pair of countries, including those that have the same S&P's ratings, although only those pseudo-observations P_ij for which i and j had different ratings were used in deriving the LAD model.
43.4.3.1.3 Classification of Pseudo-observations and Cross-Validation

The dataset used to derive the LAD model contains 4,360 pseudo-observations, with an equal number of positive and negative pseudo-observations, and is anti-symmetric (P_ji = −P_ij). The results of applying the LAD model to classify the observations in the dataset are presented in Table 43.7. The overall classification quality of the LAD model (according to Equation (43.2)) is 95.425%. This very high classification accuracy may be misleading, since the constructed LAD model could be overfitting the data; that is, it may have adapted so well to random noise in the training data that, despite an excellent accuracy on the training data, it would perform very poorly on new observations. We therefore utilize a statistical technique known as the "jackknife" (Quenouille 1949), or "leave-one-out," to show that the high classification accuracy is not due to overfitting. This technique removes from the dataset one observation at a time, learns a model from all the remaining observations, and evaluates the resulting model on the removed observation; these steps are repeated for each observation in the dataset. If, on average, the predicted evaluations are "close to" the actual ones, then the model is not affected by overfitting. In the case at hand, it would not be statistically sound to implement the jackknife technique in a straightforward way because of the dependencies among pseudo-observations (Hammer et al. 2006).
Table 43.7 Classification matrix
                         Classified as
                         Positive   Negative   Unclassified   Total
Positive observations    93.90%     3.72%      2.38%          100%
Negative observations    3.72%      93.90%     2.38%          100%

Table 43.8 Classification matrix for Δ_JK
                         Classified as
                         Positive   Negative   Unclassified   Total
Positive observations    93.49%     3.67%      2.84%          100%
Negative observations    3.67%      93.49%     2.84%          100%
Indeed, even after explicitly eliminating a single pseudo-observation P_ij from the dataset, it would still remain in the dataset implicitly, since P_ih + P_hj = P_ij. This problem can be resolved by modifying the above procedure so that at each step, instead of just removing a single pseudo-observation P_ij, all the pseudo-observations that involve a particular country i are removed. The LAD discriminant is derived on the basis of the remaining pseudo-observations, and is then used to evaluate the relative preferences for every removed pseudo-observation, resulting in a row of relative preferences for all pseudo-observations P_ij that involve the country i. This modified procedure is repeated for every country in the dataset, and the obtained rows are combined into a matrix of relative preferences denoted by Δ_JK. The absence of overfitting is indicated by a very high correlation level of 96.48% between the matrix Δ_JK and the original relative preference matrix Δ. To further test for overfitting, we use the obtained matrix of relative preferences Δ_JK to classify the dataset of 4,360 pseudo-observations. The results of this classification are presented in Table 43.8, and the overall classification quality of the LAD model (according to Equation (43.2)) is 95.10%. The results in Table 43.8 are virtually identical to those shown in Table 43.7, thus confirming the absence of overfitting. Since the derived LAD model does not suffer from overfitting, it is tempting to interpret the signs of relative preferences as indicative of rating superiority, and to infer that a positive value of Δ(P_ij) indicates that country i is more creditworthy than country j, while a negative value of Δ(P_ij) justifies the opposite conclusion. However, this naïve approach to relative rating superiority ignores the potential noise in the data and in the relative preferences, which would make it difficult to transform classifications of pseudo-observations into a consistent ordering of countries by their creditworthiness. It is shown in Hammer et al. (2006) that the relationship based on the naïve interpretation of the relative
preferences can violate the transitivity requirement of an order relation, and therefore does not provide a consistent partially ordered set of countries. The following section is devoted to overcoming this issue by relaxing the overly constrained search for (possibly non-existent) country ratings whose pairwise orderings are in precise agreement with the signs of relative preferences. Instead, we utilize a more flexible search for a partial order on the set of countries, which satisfies the transitivity requirements, and approximates well the set of relative preferences.
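The country-level jackknife described above can be sketched as follows (ours; fit_lad stands in for the LAD learning routine and is assumed to return a callable discriminant):

```python
import numpy as np

def country_jackknife(n, pairs, z, P, fit_lad):
    """For each country c, refit the LAD discriminant on all pseudo-observations
    not involving c, then fill row c of Delta_JK with the refitted discriminant
    evaluated on the removed pseudo-observations P_cj."""
    delta_jk = np.zeros((n, n))
    for c in range(n):
        keep = [t for t, (i, j) in enumerate(pairs) if c not in (i, j)]
        model = fit_lad(P[keep], z[keep])
        for t, (i, j) in enumerate(pairs):
            if i == c:
                delta_jk[c, j] = model(P[t])
    return delta_jk
```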
43.4.3.2 Condorcet Ratings

The ultimate objective of this section, which is based on the results of Hammer et al. (2006), is to derive a rating system from the LAD relative preferences. This is accomplished in two stages. First, we use the LAD relative preferences to define a partial order on the set of countries, which represents their creditworthiness dominance relationship. Then, we extend the derived dominance relationship to two rating systems, respectively based on the so-called weak Condorcet winners and losers. The number of rating categories in these two rating systems is the same, and is determined by the structure of the partial order obtained in the first stage; that is, it is not fixed a priori.
43.4.3.2.1 From LAD Relative Preferences to a Partial Order on the Set of Countries

A partial order Π(X) is a reflexive, anti-symmetric, and transitive binary relation on a set X. Any two distinct elements x and y of X such that (x, y) ∈ Π(X) are said to be comparable, denoted x ≻ y. If neither (x, y) ∈ Π(X) nor (y, x) ∈ Π(X), then x and y are called incomparable, denoted x || y.
As mentioned above, the naïve rating superiority relation based on the LAD model is not transitive. This difficulty can be overcome by defining a strengthened version of the naïve rating superiority relation, to be called the dominance relationship. While the former relies only on the sign of the relative preference Δ(P_ij), the definition of the dominance of a country i over another country j takes into account not only the sign of the relative preference Δ(P_ij), but also the values of the relative preferences of each of these two countries i and j over every other country k. We will need the following notation to define the dominance relationship. Let

S_ij(k) = Δ(P_ik) − Δ(P_jk)        (43.5)

define the external preference of country i over country j with respect to country k; let

S̄_ij = (Σ_{k∈I} S_ij(k)) / |I|        (43.6)

define the average external preference of i over j; and let

σ_ij = √( Σ_{k∈I} [(Δ(P_ik) − Δ(P_jk)) − S̄_ij]² / |I| )        (43.7)

define the standard deviation of the external preference of i over j.

The dominance relationship of a country i over another country j will be defined by two conditions. The first one requires that Δ(P_ij) > 0. It might also seem logical to require that S_ij(k) > 0 for every country k, k ≠ i, j. However, this condition is so difficult to satisfy that the resulting partially ordered set is extremely sparse, with very few country pairs being comparable. Thus, a more relaxed second condition requires only that the external preference of i over j be positive at a certain confidence level. We parameterize the level of confidence by the multiplier λ of the standard deviation σ_ij. More formally, for a given λ > 0:

A country i is said to dominate another country j if

Δ(P_ij) > 0 and S̄_ij − λσ_ij > 0.        (43.8)

A country i is said to be dominated by another country j if

Δ(P_ij) < 0 and S̄_ij + λσ_ij < 0.        (43.9)

In all the other cases, countries i and j are said to be not comparable; this can be due to the lack of evidence, or to conflicting evidence, about the dominance of i over j.
Note that the larger the value of λ is, the stronger the conditions are, and the fewer pairs of countries are comparable. If λ is sufficiently large, then the dominance relationship is transitive and is a partial order, since it becomes progressively sparser until being reduced to the empty relation. On the other hand, if λ is small (and in particular for λ = 0), then the dominance relationship becomes denser, but is not necessarily transitive. We are looking for a "rich" dominance relationship (applying to as many country pairs as possible) that is transitive. Formally, our objective is to maximize the number of comparable country pairs subject to preserving the transitivity of the dominance relationship. The richest dominance relationship defined by the two conditions (43.8) and (43.9), in the case of λ = 0, will be called the base dominance relationship. If i and j are any two countries comparable in the base dominance relationship, then it is possible to calculate the smallest value λ_ij = |S̄_ij/σ_ij| of the parameter λ such that i and j are not comparable in any dominance relationship defined by a parameter value exceeding or equal to λ_ij. This calculation is based on the fact that, given a value of λ, it is possible to check in polynomial time (Tarjan 1972) whether the corresponding dominance relationship is transitively closed. The algorithm that determines in polynomial time the minimum value of λ for which the corresponding dominance relationship is still transitive sorts the at most |I|² numbers λ_ij in ascending order and then checks, one by one, the transitivity of the corresponding dominance relationships. When the dominance relationship becomes transitive for the first time, the algorithm stops and outputs the corresponding value of the parameter λ_ij. This study utilizes this value and the corresponding dominance relationship between countries, called here the logical dominance relationship and denoted by the subscript LAD (e.g., ≻_LAD).
The definition of the dominance relationship between countries on the basis of average external preferences bears some similarities to the so-called "column sum methods" (Choo and Wedley 2004) utilized in reconciling inconsistencies resulting from the application of pairwise comparison matrix methods.
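Conditions (43.8) and (43.9) translate directly into code; the following sketch (ours) computes the dominance relation for a given multiplier λ from the relative preference matrix.

```python
import numpy as np

def dominance(delta, lam):
    """delta: anti-symmetric |I| x |I| matrix of relative preferences.
    Returns D with D[i, j] True iff i dominates j under (43.8)-(43.9)."""
    n = delta.shape[0]
    D = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j:
                s = delta[i, :] - delta[j, :]      # external preferences S_ij(k)
                D[i, j] = delta[i, j] > 0 and s.mean() - lam * s.std() > 0
            # condition (43.9) is the mirror image and is covered by D[j, i]
    return D
```

The smallest λ for which this relation is transitive can then be found by sorting the values λ_ij = |S̄_ij/σ_ij| and testing transitivity at each of them, as described above.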
43.4.3.2.2 Extending Partially Ordered Sets to "Extreme" Linear Preorders

The information about country preferences contained in the economic and political attributes is represented most faithfully by the logical dominance relationship defined above. However, it is impractical to use, since this partial order requires a large amount of data to describe. On the other hand, country preferences can be expressed very compactly
by country ratings, since the latter are a very special type of partial order called linear preorders. A partial order Π(X) is called a linear preorder if there exists a mapping M: X → {0, 1, ..., k} such that x ≻ y if and only if M(x) > M(y). Therefore, a linear preorder is completely described by specifying its mapping M. Without loss of generality, one can assume that for every i ∈ {0, 1, ..., k} there exists x ∈ X such that M(x) = i; such a linear preorder is said to have k + 1 levels. To make the logical dominance of countries practically utilizable, this relationship should be transformed into a linear preorder that preserves all the order relations between countries (i.e., an extension of the partial order) and is as close as possible to it. The logical dominance relationship can be extended in a multitude of ways to a variety of linear preorders. In particular, two extreme linear preorders, called the optimistic and the pessimistic extensions and denoted OE and PE, respectively, are constructed below. The names are justified since the former assigns to each country the highest level it can expect, while the latter assigns to each country the lowest level it can expect:
- In the first step of the OE construction, those countries that are not dominated by any other country are assigned the highest level, and are then removed from the set of countries under consideration. Iteratively, nondominated countries in the remaining set of countries are assigned the highest remaining level, until every country is assigned a level denoted by OE_i.
- In the first step of the PE construction, those countries that do not dominate any other country are assigned the lowest level, and are then removed from the set of countries under consideration. Iteratively, nondominating countries in the remaining set of countries are assigned the lowest remaining level, until every country is assigned a level denoted by PE_i.

The method utilized above to construct OE and PE is known as the Condorcet method. It represents a specific type of voting system (Gehrlein and Lepelley 1998), and it is often used to determine the winner of an election. The Condorcet winner(s) of an election is generally defined as the candidate(s) who, when compared in turn with every other candidate, is preferred over each of them. Given an election with preferential votes, one can determine the weak Condorcet winners (Ng et al. 1996) by constructing the Schwartz set as the union of all possible candidates such that (1) every candidate inside the set is pairwise unbeatable by any other candidate outside the set (ties are allowed), and (2) no proper subset of the set satisfies the first property. The Schwartz set consists exactly of all the weak Condorcet winners. The weak Condorcet losers are the reverse of the weak Condorcet winners; that is, those losing pairwise to
every other candidate. OE assigns the highest level to those countries that are the weak Condorcet winners; that is, better than or incomparable with every other country. PE assigns the lowest level to those countries that are the weak Condorcet losers; that is, worse than or incomparable with every other country. Note that both OE and PE have the minimum possible number of levels. Indeed, in a directed graph whose vertices are the countries and whose arcs connect comparable countries in the dominance relationship, the length of the longest directed path bounds from below the number of levels in any linear preorder extending the dominance relationship, and this length equals the number of levels in OE and PE.
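Both extensions are simple "peeling" procedures on the dominance relation; a sketch (ours) follows, with levels encoded so that larger numbers mean higher creditworthiness.

```python
import numpy as np

def _peel(D, is_in_layer):
    order = np.full(D.shape[0], -1)
    remaining, step = set(range(D.shape[0])), 0
    while remaining:
        layer = {i for i in remaining if is_in_layer(D, i, remaining)}
        for i in layer:
            order[i] = step
        remaining -= layer
        step += 1
    return order

def optimistic_extension(D):
    """Peel the currently non-dominated countries first: they get the top level."""
    order = _peel(D, lambda D, i, rem: not any(D[j, i] for j in rem if j != i))
    return order.max() - order

def pessimistic_extension(D):
    """Peel the currently non-dominating countries first: they get the bottom level."""
    return _peel(D, lambda D, i, rem: not any(D[i, j] for j in rem if j != i))
```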
43.4.3.3 Logical Rating Scores

The LAD relative preferences can be utilized in a completely different way (compared to OE or PE) to construct country risk ratings. We describe here how to derive new numerical ratings for all countries, called "logical rating scores" (LRS), by applying multiple linear regression. The LRS were defined by Hammer et al. (2007) as numerical values whose pairwise differences optimally approximate the relative preferences over countries as expressed in their risk ratings. A common way to calculate the relative preferences is based on interpreting sovereign ratings as cardinal values (see, e.g., Ferri et al. 1999; Hu et al. 2002; Sy 2004). If the sovereign ratings β are viewed as cardinal values, then one can view the relative preferences as differences of the corresponding ratings:

Δ(P_ij) = β_i − β_j, for all i, j ∈ I, i ≠ j        (43.10)

Since Equation (43.10) is not necessarily consistent, it should be relaxed in the following way:

Δ(P_ij) = β_i − β_j + ε_ij, for all i, j ∈ I, i ≠ j        (43.11)

The values of the β's providing the best L2 approximation of the Δ's can be found by solving the following multiple linear regression problem:

Δ(ω) = Σ_{k∈I} β_k x_k(ω) + ε(ω), ω ∈ Ω        (43.12)

where Ω = {(i, j) | i, j ∈ I, i ≠ j} and

x_k(i, j) = 1 for k = i; −1 for k = j; 0 otherwise.

The logical rating scores β_k obtained by fitting the regression model are given in Column 8 of Table 43.16.
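Regression (43.12) is an ordinary least-squares problem on a ±1 design matrix; a sketch (ours, with NumPy) follows. Because the design matrix annihilates constant vectors, β is identified only up to an additive constant, which we fix by centering.

```python
import numpy as np

def logical_rating_scores(delta):
    """delta: anti-symmetric |I| x |I| matrix of LAD relative preferences.
    Fits Delta(P_ij) = beta_i - beta_j + eps_ij by least squares (Eq. 43.12)."""
    n = delta.shape[0]
    rows, y = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                x = np.zeros(n)
                x[i], x[j] = 1.0, -1.0
                rows.append(x)
                y.append(delta[i, j])
    beta, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(y), rcond=None)
    return beta - beta.mean()
```

For an anti-symmetric Δ, the normal equations give, after centering, β_i = (1/|I|) Σ_k Δ(P_ik); that is, each score is the average relative preference of country i over all countries.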
43.4.4 Evaluation of the Results

In this section, we analyze the results obtained with the proposed rating systems. The evaluation of the results involves the analysis of:
- The relative preferences;
- The partially ordered set (i.e., the logical dominance relationship); and
- The LRS scores and the Condorcet ratings (extremal linear extensions), with respect to the rating systems of S&P's, Moody's, and The Institutional Investor.
43.4.4.1 Canonical Relative Preferences

In this section, we evaluate the quality of the informational content of the matrix of relative preferences. To reach this goal, we first define, for any set of numerical scores s_i representing sovereign ratings, the canonical relative preferences d_ij = s_i − s_j for each pair of countries. Second, we compare the LAD relative preferences Δ(P_ij) with the canonical relative preferences d^S&P_ij, d^M_ij, d^II_ij, and d^LRS_ij associated with the S&P's ratings, Moody's ratings, The Institutional Investor's scores, and the logical rating scores (Hammer et al. 2007), respectively. The corresponding matrices of relative preferences are denoted d^S&P, d^M, d^II, and d^LRS. We evaluate the proximity between the LAD relative preferences (Hammer et al. 2006) and the canonical relative preferences on the basis of their correlation levels (Table 43.9). The high correlation levels show that the LAD relative preferences are in strong agreement with both the ratings of S&P's and those of the other agencies, as well as with the logical rating scores. The superior performance of the logical rating scores compared to the relative preferences associated with the LAD discriminant is explained by the filtering of noise carried out when deriving the LRS.
(viewed as partially ordered sets) and the partially ordered sets associated with The Institutional Investor’s and the logical rating scores. The partially ordered sets corresponding to The Institutional Investor’s scores and the logical rating scores are obtained as follows: i j if si sj > , i j if si sj < , i jjj otherwise,
where is the positive number chosen so as to obtain a partially ordered set of the same density as the dominance relationship, and si represents the numerical score given to country i by the respective rating system. The incomparability between two countries means, for S&P’s and Moody’s, that the two countries are equally creditworthy, while, for the logical dominance relationship, it means that the evidence about the relative creditworthiness of the two countries is either missing or conflicting. The concept of density of a partially ordered set is defined in Hammer et al. (2006) and represents the extent to which a partial order on a set of countries differentiates them by their creditworthiness. To assess the extent to which the preference orders agree with each other, the following concepts are introduced (Hammer et al. 2006). Given a pair of countries .i; j /, two partially ordered sets are: In concordance if one of the following relations i j;
i j , or i jjj holds for both partially ordered sets; In discordance if i j for one of the partially ordered
sets, and i j for the other one; Incomparable otherwise; that is, if i jjj for one of the par-
tially ordered sets, and either i j or i j for the other partially ordered set. We measure the levels of concordance, discordance, or incomparability between two partially ordered sets by the fractions of pairs of countries for which the two partially ordered sets are respectively in concordance, discordance, or incomparable. Hammer et al. (2006) have shown that there is a very high level of agreement between The logical dominance relationship and the preference or-
43.4.4.2 Preference Orders The logical dominance relationship is compared to the preference orders derived from the S&P’s and Moody’s ratings Table 43.9 Correlation levels between LAD and canonical relative preferences d S&P dM d II d LRS d S&P dM d II d LRS
100% 93.21% 92.89% 91.82% 97.57%
93.21% 100% 98.01% 96.18% 95.54%
92.89% 98.01% 100% 96.31% 95.20%
91.82% 96.18% 96.31% 100% 94.11%
97.57% 95.54% 95.20% 94.11% 100%
ders associated with S&P’s and Moody’s ratings and The Institutional Investor scores. The logical dominance relationship and the logical rating scores.
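The agreement levels above are straightforward to compute from score vectors and thresholds. The sketch below is a minimal illustration assuming two score-based partial orders with thresholds eps1 and eps2; function and variable names are ours, not the authors'.

```python
from itertools import combinations

# Relation of i to j under a score-based partial order with threshold eps.
def relation(si, sj, eps):
    if si - sj > eps:
        return 1    # i succeeds j
    if si - sj < -eps:
        return -1   # i precedes j
    return 0        # i and j incomparable

# Fractions of country pairs in concordance, discordance, or incomparable.
def agreement_levels(s1, s2, eps1, eps2):
    conc = disc = inc = 0
    n = len(s1)
    for i, j in combinations(range(n), 2):
        r1 = relation(s1[i], s1[j], eps1)
        r2 = relation(s2[i], s2[j], eps2)
        if r1 == r2:
            conc += 1          # same relation holds for both partial orders
        elif r1 * r2 == -1:
            disc += 1          # strict relations in opposite directions
        else:
            inc += 1           # incomparable in one, strict in the other
    total = n * (n - 1) / 2
    return conc / total, disc / total, inc / total
```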
43.4.4.3 Discrepancies with S&P's

43.4.4.3.1 Logical Dominance Relationship

This section is devoted to the study of the discordance between the logical dominance relationship and the preference order of S&P's. We define as a discrepancy (Hammer et al. 2006) a country pair for which the logical dominance relationship and the preference order of S&P's are in discordance. The 2.17% discordance level between the logical dominance relationship and the preference order of S&P's represents 51 discrepancies. Next, in order to determine the minimum number of countries for which the S&P's ratings must be changed so that the new adjusted S&P's preference order has a 0% discordance level with the dominance relationship, we solve the integer program below:

\min \sum_{i \in I} a_i

subject to

|\tilde{S}_i - S_i| \leq M a_i, \quad \text{for all } i \in I
\tilde{S}_i \geq \tilde{S}_j, \quad \text{for every pair } (i,j) \text{ such that } i \succ_{\mathrm{LAD}} j
a_i \in \{0,1\}, \ \tilde{S}_i \in \{0,1,\ldots,21\}, \quad \text{for all } i \in I \qquad (43.13)

where a_i takes the value 1 if the S&P's rating of country i must be modified, and the value 0 otherwise; S_i is the original S&P's rating of country i; S̃_i is the adjusted S&P's rating of country i; and M is a sufficiently large positive number (e.g., M = 22). The optimal solution of Equation (43.13) shows that the 0% discordance level can be achieved by adjusting the S&P's ratings of nine countries: France, India, Japan, Colombia, Latvia, Lithuania, Croatia, Iceland, and Romania. To check the relevance of the proposed rating adjustments, we examine the S&P's ratings posterior to December 1998. We observe that the S&P's ratings of Romania, Japan, and Colombia have been modified in the direction suggested by our model. More precisely, Colombia was downgraded by S&P's twice, moving from BBB− in December 1998 to BB+ in September 1999, and then to BB in March 2000. Japan was downgraded to AA+ in February 2001 and to AA in November 2001. Romania was upgraded to B in June 2001. The S&P's ratings of the other countries (Iceland, France, India, Croatia, Latvia, and Lithuania) remained unchanged.
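Program (43.13) is a small mixed-integer program that any off-the-shelf solver can handle. The following sketch uses the open-source PuLP library purely for illustration; the inputs `ratings` (country to numerical preorder value) and `dominance` (pairs (i, j) with i dominating j under the LAD relation) are hypothetical, and this is not the solver setup used in the study.

```python
import pulp

def min_rating_changes(ratings, dominance, M=22):
    prob = pulp.LpProblem("min_discrepancies", pulp.LpMinimize)
    a = {i: pulp.LpVariable(f"a_{i}", cat="Binary") for i in ratings}
    S = {i: pulp.LpVariable(f"S_{i}", lowBound=0, upBound=21, cat="Integer")
         for i in ratings}
    prob += pulp.lpSum(a.values())            # change as few ratings as possible
    for i, r in ratings.items():              # |S~_i - S_i| <= M a_i, linearized
        prob += S[i] - r <= M * a[i]
        prob += r - S[i] <= M * a[i]
    for i, j in dominance:                    # adjusted order respects LAD dominance
        prob += S[i] >= S[j]
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i in ratings if a[i].value() is not None and a[i].value() > 0.5]
```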
43.4.4.3.2 Logical Rating Scores

We have already shown that the LRS and the S&P's ratings are in close agreement. However, since the LRS and the S&P's ratings are not expressed on the same scale, the comparison of the two scores of an individual country presents a challenge. In order to bring the LRS and the S&P's ratings to the same scale, we apply a linear transformation a β_i + c to the logical rating scores β_i in such a way that the mean square difference between the transformed LRS and the S&P's ratings is minimized. This is obtained by solving a series of quadratic optimization problems in which the decision variables are a and c. Clearly, the consistency of the LRS and S&P's ratings is not affected by this transformation.

In 1998, it appears that five countries (Colombia, Hong Kong, Malaysia, Pakistan, and Russia) had an S&P's rating that did not fall within the confidence interval of the transformed LRS. The modification of the S&P's ratings of Colombia, Pakistan, and Russia within 1 year is in agreement with the 1998 LRS of these countries, and highlights the predictive power of the LRS model. Moreover, the evolution of the S&P's ratings of Malaysia and Hong Kong is also in agreement with their 1998 LRS: both Malaysia and Hong Kong were upgraded shortly thereafter, the former moving from BBB− to BBB in November 1999, and the latter from A to A+ in February 2001.
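Since the minimization over a and c is an ordinary least-squares fit of the numerical S&P's ratings on the LRS, it can be computed in one line. The sketch below is illustrative and assumes the S&P's ratings have already been converted to their numerical preorder values.

```python
import numpy as np

# Fit the linear map a*beta + c minimizing the mean squared difference
# between the transformed LRS and the numerical S&P's ratings.
def align_scales(lrs, sp_numeric):
    a, c = np.polyfit(lrs, sp_numeric, deg=1)   # slope a, intercept c
    return a * np.asarray(lrs) + c
```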
43.4.4.4 Optimistic and Pessimistic Extensions

The optimistic and pessimistic extensions of the logical dominance relationship (Table 43.16) comprise 21 levels, while the S&P's rating system contains 22 rating categories. Table 43.10 provides the correlation levels between all the ratings (scores). The analysis reconfirms the high level of agreement between the proposed rating models and that of S&P's. The high correlation levels attest that the LRS approximates the S&P's ratings very well, as well as those of the other rating agencies.
43.4.4.5 Temporal Validity

In this section, we apply the "out-of-time" or "walk-forward" validation approach (Sobehart et al. 2000; Stein 2002) to further verify the robustness and the relevance of our rating models. This involves testing how well the LAD model derived from the 1998 data performs when applied to the 1999 data. To evaluate the "temporal validity" of the proposed models, we proceed as follows: (1) we derive the LAD relative preferences; (2) we build the logical dominance relationship and run the regression model for the LRS scores; (3) we calculate the weak Condorcet ratings (pessimistic and optimistic extensions) and the logical rating scores; and (4) we compare these to the rating systems of the rating agencies (S&P's, Moody's, and The Institutional Investor).
43.4.4.5.1 Relative Preferences

Table 43.11 shows that the LAD relative preferences are highly correlated with those of the S&P's rating system, as well as with the logical rating scores. The LAD relative preferences and the LRS were obtained by applying to the 1999 data the models derived from the 1998 data.
Table 43.10 Correlation analysis

        | S&P's  | Moody  | II     | OE     | PE     | LRS
S&P's   | 100%   | 98.01% | 96.18% | 94.31% | 95.40% | 95.54%
Moody   | 98.01% | 100%   | 96.31% | 94.13% | 95.42% | 95.20%
II      | 96.18% | 96.31% | 100%   | 93.26% | 94.62% | 94.11%
OE      | 94.31% | 94.13% | 93.26% | 100%   | 99.15% | 99.24%
PE      | 95.40% | 95.42% | 94.62% | 99.15% | 100%   | 99.10%
LRS     | 95.54% | 95.20% | 94.11% | 99.24% | 99.10% | 100%

Table 43.11 Correlation levels between relative preference matrices (Δ denotes the matrix of LAD relative preferences)

        | Δ      | d^S&P  | d^LRS
Δ       | 100%   | 91.70% | 94.12%
d^S&P   | 91.70% | 100%   | 96.98%
d^LRS   | 94.12% | 96.98% | 100%

The high levels of pairwise correlations between the S&P's 1999 ratings, the relative preferences given by the LAD discriminant, and the canonical relative preferences corresponding to the LRS show that the LRS model has a very strong temporal stability, and indicate its high predictive power.

43.4.4.5.2 Preorders

The logical dominance relationship is compared to the 1999 preference order of S&P's and the partially ordered set associated with the logical rating scores. Table 43.12 displays the concordance, discordance, and incomparability levels among the logical dominance relationship, the preference order of S&P's, and the partial order associated with the logical rating scores, and underlines their very strong level of agreement.

Table 43.12 Concordance, discordance, and incomparability levels

                                                  | Concordance | Discordance | Incomparability
Logical dominance relationship vs. S&P's          | 84.21%      | 2.64%       | 13.15%
Logical dominance relationship vs. LRS            | 93.43%      | 0.08%       | 6.49%
S&P's vs. logical rating score                    | 83.46%      | 3.85%       | 12.69%

43.4.4.5.3 Discrepancies with S&P's

The discordance level between the logical dominance relationship obtained using the 1999 data and the preference order of the 1999 S&P's ratings is equal to 2.64%. The solution of the integer programming problem (43.13)
reveals that the discrepancies would disappear if one modified the ratings of eight countries (France, Japan, India, Colombia, Latvia, Croatia, Iceland, and Hong Kong). The relevance of the ratings obtained with the logical dominance relationship is confirmed by the rating changes published by S&P's subsequent to December 1999.

The identification of the discrepancies between the S&P's ratings and the LRS scores requires the derivation of the confidence intervals for the new 1999 observations, and therefore for the transformed LRS of each country (Hammer et al. 2007b). We denote by n and p the number of observations and predictors, respectively. The expression t(1 − α/2; n − p) refers to the Student t-distribution with (n − p) degrees of freedom and with upper and lower tail areas of α/2; X_j is the p-dimensional vector of the values taken by the observation Y_j on the p predictors, X'_j is the transpose of X_j, and (X'X)^{-1} is the inverse of the [p × p]-dimensional matrix (X'X). Denoting by MSE the mean square of errors, the estimated variance of the predicted rating Ŷ_j is

s^2[\hat{Y}_j] = \mathrm{MSE}\,\left[1 + X_j'(X'X)^{-1}X_j\right],

and the (1 − α) confidence interval for Ŷ_{j,n} is given by:

\left\{ \hat{Y}_{j,n} - t(1-\alpha/2; n-p)\, s[\mathrm{pred}], \ \hat{Y}_{j,n} + t(1-\alpha/2; n-p)\, s[\mathrm{pred}] \right\} \qquad (43.14)
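A minimal sketch of interval (43.14), assuming the design matrix X and the residual mean square MSE from the LRS regression are available (function and variable names are ours):

```python
import numpy as np
from scipy import stats

# Prediction interval (43.14) for a new observation with predictor vector x_new.
def prediction_interval(x_new, X, mse, y_hat, alpha=0.1):
    n, p = X.shape
    xtx_inv = np.linalg.pinv(X.T @ X)        # (X'X)^{-1}; pseudo-inverse for safety
    s_pred = np.sqrt(mse * (1.0 + x_new @ xtx_inv @ x_new))
    t = stats.t.ppf(1.0 - alpha / 2.0, df=n - p)
    return y_hat - t * s_pred, y_hat + t * s_pred
```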
Table 43.13 Correlation analysis

        | S&P's   | OE      | PE      | LRS
S&P's   | 100.00% | 95.09%  | 95.15%  | 94.12%
OE      | 95.09%  | 100.00% | 99.59%  | 98.43%
PE      | 95.15%  | 99.59%  | 100.00% | 98.24%
LRS     | 94.12%  | 98.43%  | 98.24%  | 100.00%
Hammer et al. (2007b) say that there is a discrepancy between the S&P's rating R_j^{SP} and the logical rating score if

R_j^{SP} \notin \left\{ \hat{Y}_{j,n} - t(1-\alpha/2; n-p)\, s[\mathrm{pred}], \ \hat{Y}_{j,n} + t(1-\alpha/2; n-p)\, s[\mathrm{pred}] \right\} \quad \text{for } \alpha = 0.1.

Applying the 1998 LRS model to the 1999 data, only two countries (Russia and Hong Kong) have S&P's ratings that fall outside the confidence intervals of the corresponding transformed LRS. The creditworthiness of these two countries seems to have been underestimated by S&P's in 1998: the ratings of both were upgraded in 1999.
43.4.4.5.4 Condorcet Ratings and LRS Scores

The optimistic and pessimistic extensions (Table 43.16) of the logical dominance relationship obtained using the 1999 data both comprise 20 levels, while the S&P's rating system contains 22 rating categories. Table 43.13 provides the correlation levels among the 1999 S&P's ratings, the optimistic and pessimistic extension levels, and the logical rating scores. The high levels of correlation and their comparison with those presented in Table 43.10 provide further evidence of the temporal validity of the proposed models.

43.4.4.6 Predicting Creditworthiness of Unrated Countries

43.4.4.6.1 Condorcet Approach

The application of the logical dominance relationship to predict the rating of countries not included in the original dataset, and for years subsequent to 1998, is an additional validation procedure, sometimes referred to as "out-of-universe" cross-validation (Sobehart et al. 2000; Stein 2002). We use the 1998 LAD model to calculate the relative preferences for all pseudo-observations involving one or two of the four "new" countries (Ecuador, Guatemala, Jamaica, Papua New Guinea), which allows us to derive the logical dominance relationship and to compute the optimistic and pessimistic extensions of previously unrated countries.
The levels assigned to these countries by the recalculated optimistic and pessimistic extensions are shown in Table 43.14. It appears that:
- Guatemala's first S&P's rating (in 2001) was BB. Guatemala's OE/PE levels are the same as Morocco's (the only country with OE = PE = 5), and Morocco's S&P's rating in 1999 was BB.
- Jamaica's first S&P's rating (1999) was B. Its OE/PE levels (OE = 3, PE = 2) are identical to those of Paraguay, Brazil, the Dominican Republic, and Bolivia, which had 1999 S&P's ratings of B, B+, B+, and BB, respectively.
- The first S&P's rating for Papua New Guinea was B+. Its OE/PE levels (OE = 3, PE = 3) are the same as those of Peru and Mexico, which both had 1999 S&P's ratings of BB.
- Ecuador's OE/PE levels (OE = 3, PE = 2) are the same as those of Paraguay, Brazil, the Dominican Republic, and Bolivia, which had 1999 S&P's ratings of B, B+, B+, and BB, respectively. Interestingly, although the initial S&P's rating of Ecuador was SD (in July 2000), it was upgraded in August 2000 (1 month later) to B.

The striking similarity between the initial S&P's rating of each of the four countries discussed above and the S&P's ratings of those countries that have the same OE/PE levels validates the proposed model, indicating its power to predict the creditworthiness of previously unrated countries.
43.4.4.6.2 LRS Approach

The LAD discriminant, which does not involve in any way the previous years' S&P's ratings, allows the rating of previously unrated countries in the following way. First, we construct all the pseudo-observations involving the new countries to be evaluated. Second, we calculate the relative preferences for these pseudo-observations, and we add the resulting columns and rows to the matrix of relative preferences. Third, we determine the new LRS for all the countries (new and old) by running the multiple linear regression model (43.12). Fourth, we apply the linear transformation defined above to the LRS so that the transformed LRS and the S&P's ratings are on the same scale.

The ability of the LRS to accurately predict S&P's ratings is assessed by comparing the predicted LRS (obtained as described above) with the S&P's ratings (when they first become available). We compute the confidence intervals (43.14) for the transformed LRS of the four countries not rated by S&P's as of December 1998. The predictions for Guatemala, Jamaica, and Papua New Guinea correspond perfectly to the first (subsequent) S&P's ratings. The
Table 43.14 Out-of-universe validation

Country            | Optimistic extension | Pessimistic extension | First S&P's rating | S&P's linear extension
Ecuador            | 3                    | 2                     | SD (07/2000)       | 0
Guatemala          | 5                    | 5                     | BB (10/2001)       | 10
Jamaica            | 3                    | 2                     | B (11/1999)        | 7
Papua New Guinea   | 3                    | 3                     | B+ (01/1999)       | 8
comparison between the LRS and the first (July 2000) S&P's rating (SD) assigned to Ecuador shows that S&P's rated it too harshly: 1 month later, S&P's raised the rating to B, justifying the LRS prediction.
43.4.5 Importance of Variables

The methodology developed in this paper permits the assessment of the importance of the variables in rating countries' creditworthiness. The importance of a variable is reflected by its use in the patterns of the LAD model and is usually measured by the proportion of patterns containing that variable. The patterns of the 1998 LAD model show that the three most frequently used variables are financial depth and efficiency, political stability, and gross domestic product per capita (appearing in 47.5%, 39.4%, and 35.6% of the LAD patterns, respectively). In addition, the presence of the political stability variable among the three most significant variables in the selected set justifies its inclusion in country risk rating models. This result is in agreement with the cost-benefit approach to country risk (i.e., the risk of defaulting is heavily impacted by the political environment; see Brewer and Rivoli 1990; Rivoli and Brewer 1997; Citron and Neckelburg 1987; Afonso 2003), which is not a view shared by all (Haque et al. 1996, 1998). The fact that the LAD approach identifies gross domestic product per capita as significant was expected, since most studies on country risk ratings acknowledge its crucial importance in evaluating the creditworthiness of a country. A key new result is the identification of the financial depth and efficiency variable as a major factor in determining country risk ratings.
43.5 Conclusions

The central objective of this study is to develop transparent, consistent, self-contained, and stable credit risk rating models, closely approximating the risk ratings provided by some of the main rating agencies. We use the combinatorial optimization method called LAD to develop a relative creditworthiness approach for assessing the credit risk of countries, while using an absolute creditworthiness approach for financial institutions.

The evaluation of the creditworthiness of financial organizations is particularly important due to the growing number of bankruptcies and the magnitude of the losses caused by such bankruptcies; in addition, this evaluation is challenging due to the opaqueness of the banking sector and the high variability of banks' creditworthiness. We use the logical analysis of data (LAD) to reverse-engineer the Fitch bank credit ratings. The LAD method identifies strong combinatorial patterns distinguishing banks with high and low ratings. These patterns constitute the core of the rating model developed here for assessing the credit risk of banks. The results show that the LAD ratings are in very close agreement with the Fitch ratings. In that respect, it is important to note that the critical component of the LAD rating system – the LAD discriminant – is derived utilizing only information about whether a bank's rating is "high" or "low," without the exact specification of the bank's rating category. Moreover, the LAD approach uses only a fraction of the observations in the dataset. The high classification accuracy of LAD appears even more clearly when performing cross-validation and applying the LAD model derived using the banks in the training set to those in the testing set.

The study also shows that the LAD-based approach to reverse-engineering bank ratings provides a model that is parsimonious and robust. This approach allows rating models with varying levels of granularity that can be used at different stages of the credit granting decision process, and can be employed to develop internal rating systems that are Basel II compliant. Besides impacting the credit risk of a financial institution, the use of the generalizable and accurate credit risk rating system proposed here will also be critical in mitigating the financial institution's operational risk due to breakdowns in established processes and risk management operations, or to inadequate process mapping within business lines. In particular, the reliance on such a risk rating system will reduce the losses due to mistakes made in
executing transactions, such as settlement failures, failures to meet regulatory capital requirements, untimely debt collections, losses due to the offering of inappropriate financial products or credit conditions, or the giving of incorrect advice to a counterparty.

The evaluation of the creditworthiness of countries is also of utmost importance, since a country's risk rating is generally viewed as the upper bound on the rating that entities within that country can be assigned. This study proposes an LAD methodology for inducing a credit risk rating system from a set of country risk rating evaluations. It uses nine economic and three political variables to construct the relative preferences of countries on the basis of their creditworthiness. Two methods are then developed to construct countries' credit rating systems on the basis of their relative creditworthiness. The first one is based on extending the preorder of countries using the Condorcet voting technique and provides two rating systems (weak Condorcet winners and losers), while the second one uses linear regression to determine the logical rating scores. The proposed rating systems correlate highly with the utilized rating system (S&P's) and with those of the other rating agencies (Moody's and The Institutional Investor), and are shown to be stable, having an excellent classification accuracy when applied to the following years' data or to the ratings of previously unrated countries. Rating changes implemented by S&P's in subsequent years have resolved most of the (few) discrepancies between the constructed partially ordered set and S&P's initial ratings. This study provides new insights regarding the importance of economic, political, and financial depth and efficiency variables, and their necessary inclusion in the analysis of country risk.

The rating systems proposed here for banks as well as countries:
- avoid overfitting, as attested by the back-testing analysis (i.e., the extremely high concordance between in- and out-of-sample rating predictions calculated using the k-folding and jackknife cross-validation methods), and
- distinguish themselves from the rating models in the existing literature by their self-contained nature; that is, by their non-reliance on any information derived from lagged ratings.

Therefore, the high level of correlation between predicted and actual ratings cannot be attributed to the reliance on lagged ratings and is a reflection of the predictive power of the independent variables included in these models. An important advantage of the nonrecursive nature of the proposed models is their applicability to not-yet-rated obligors.

The scope of the proposed methodology extends beyond the rating problems discussed in this study, as it can be used in many other contexts where ratings are relevant. The proposed
methodology is applicable in the general case of inferring an objective rating system from archival data, given that the rated objects are characterized by vectors of attributes with numerical or ordinal values.
References

Afonso, A. 2003. "Understanding the determinants of sovereign debt ratings: evidence for the two leading agencies." Journal of Economics and Finance 27(1), 56–74.
Afonso, A., P. Gomes, and P. Rother. 2007. What "hides" behind sovereign debt ratings? European Central Bank Working Paper Series 711.
Alexe, S. 2002. "Datascope – A new tool for logical analysis of data." DIMACS Mixer Series. DIMACS, Rutgers University.
Alexe, G., S. Alexe, T. O. Bonates, and A. Kogan. 2007. "Logical analysis of data – The vision of Peter L. Hammer." Annals of Mathematics and Artificial Intelligence 49, 265–312.
Altman, E. I., and H. A. Rijken. 2004. "How rating agencies achieve rating stability." Journal of Banking and Finance 28, 2679–2714.
Barr, R. S., and T. F. Siems. 1994. "Predicting bank failure using DEA to quantify management quality." Federal Reserve Bank of Dallas Financial Industry Studies 1, 1–31.
Basel Committee on Banking Supervision. 2001. "The internal ratings based approach." Basel Committee on Banking Supervision.
Basel Committee on Banking Supervision. 2004. Bank failures in mature economies, Working Paper 13, Basel Committee on Banking Supervision.
Basel Committee on Banking Supervision. 2006. "International convergence of capital measurement and capital standards: a revised framework (Basel II)." Basel Committee on Banking Supervision.
Bhatia, A. V. 2002. Sovereign credit risk ratings: an evaluation, IMF Working Paper WP/03/170.
Boros, E., P. L. Hammer, T. Ibaraki, and A. Kogan. 1997. "Logical analysis of numerical data." Mathematical Programming 79, 163–190.
Boros, E., P. L. Hammer, T. Ibaraki, A. Kogan, E. Mayoraz, and I. Muchnik. 2000. "An implementation of logical analysis of data." IEEE Transactions on Knowledge and Data Engineering 12(2), 292–306.
Bouchet, M. H., E. Clark, and B. Groslambert. 2003. Country risk assessment: a guide to global investment strategy, Wiley, Chichester, England.
Bourke, P., and B. Shanmugam. 1990. An introduction to bank lending. Addison-Wesley Business Series, Sydney.
Brewer, T. L., and P. Rivoli. 1990. "Politics and perceived country creditworthiness in international banking." Journal of Money, Credit and Banking 22, 357–369.
Cantor, R., and F. Packer. 1996. "Determinants and impact of sovereign credit ratings." FRBNY Economic Policy Review 2, 37–53.
Choo, E. U., and W. C. Wedley. 2004. "A common framework for deriving preference values from pairwise comparison matrices." Computers and Operations Research 31, 893–908.
Citron, J. T., and G. Neckelburg. 1987. "Country risk and political instability." Journal of Development Economics 25, 385–395.
Crama, Y., P. L. Hammer, and T. Ibaraki. 1988. "Cause-effect relationships and partially defined Boolean functions." Annals of Operations Research 16, 299–326.
Curry, T., and L. Shibut. 2000. "Cost of the S&L crisis." FDIC Banking Review 13(2), 26–35.
Czyzyk, J., M. P. Mesnier, and J. J. Moré. 1998. "The NEOS server." IEEE Computer Science Engineering 5(3), 68–75.
de Servigny, A., and O. Renault. 2004. Measuring and managing credit risk, McGraw-Hill, New York, NY.
Eliasson, A. 2002. Sovereign credit ratings, Working Papers 02-1, Deutsche Bank.
Ferri, G., L.-G. Liu, and J. Stiglitz. 1999. "The procyclical role of rating agencies: evidence from the East Asian crisis." Economic Notes 3, 335–355.
Fitch Ratings. 2001. "Fitch simplifies bank rating scales." Technical Report.
Fitch Ratings. 2006. "The role of support and joint probability analysis in bank ratings." Fitch Special Report.
Galindo, J., and P. Tamayo. 2000. "Credit risk assessment using statistical and machine learning/basic methodology and risk modeling applications." Computational Economics 15, 107–143.
Gehrlein, W. V., and D. Lepelley. 1998. "The Condorcet efficiency of approval voting and the probability of electing the Condorcet loser." Journal of Mathematical Economics 29, 271–283.
Hammer, P. L. 1986. "Partially defined Boolean functions and cause-effect relationships." International Conference on Multi-Attribute Decision Making Via OR-Based Expert Systems. University of Passau, Passau, Germany.
Hammer, P. L., A. Kogan, and M. A. Lejeune. 2006. "Modeling country risk ratings using partial orders." European Journal of Operational Research 175(2), 836–859.
Hammer, P. L., A. Kogan, and M. A. Lejeune. 2007b. "Reverse-engineering banks' financial strength ratings using logical analysis of data." RUTCOR Research Report 10-2007, Rutgers University, New Brunswick, NJ.
Hammer, P. L., A. Kogan, and M. A. Lejeune. 2010. "Reverse-engineering country risk ratings: a combinatorial non-recursive model." Annals of Operations Research, in press. DOI 10.1007/s10479-009-0529-0.
Haque, N. U., M. S. Kumar, N. Mark, and D. Mathieson. 1996. "The economic content of indicators of developing country creditworthiness." International Monetary Fund Working Paper 43(4), 688–724.
Haque, N. U., M. S. Kumar, N. Mark, and D. Mathieson. 1998. "The relative importance of political and economic variables in creditworthiness ratings." International Monetary Fund Working Paper 46, 1–13.
Hu, Y.-T., R. Kiesel, and W. Perraudin. 2002. "The estimation of transition matrices for sovereign credit ratings." Journal of Banking and Finance 26(7), 1383–1406.
Huang, Z., H. Chen, C.-J. Hsu, W.-H. Chen, and S. Wu. 2004. "Credit rating analysis with support vector machines and neural networks: a market comparative study." Decision Support Systems 37, 543–558.
Jain, A. K., R. P. W. Duin, and J. Mao. 2000. "Statistical pattern recognition: a review." IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 4–37.
Kaminsky, G., and S. L. Schmukler. 2002. "Emerging market instability: do sovereign ratings affect country risk and stock returns?" World Bank Economic Review 16, 171–195.
Kaufmann, D., A. Kraay, and P. Zoido-Lobaton. 1999a. Aggregating governance indicators, World Bank Policy Research Department Working Paper 2195.
Kaufmann, D., A. Kraay, and P. Zoido-Lobaton. 1999b. Governance matters, World Bank Policy Research Department Working Paper 2196.
Kunczik, M. 2000. "Globalization: news media, images of nations and the flow of international capital with special reference to the role of rating agencies." Paper presented at the IAMCR Conference, Singapore, pp. 1–49.
Larrain, G., H. Reisen, and J. von Maltzan. 1997. "Emerging market risk and sovereign credit ratings." OECD Development Centre, 1–30.
Moody's. 2006. "Bank financial strength ratings: revised methodology." Moody's Global Credit Research Report.
Mora, N. 2006. "Sovereign credit ratings: guilty beyond reasonable doubt?" Journal of Banking and Finance 30(7), 2041–2062.
Morgan, D. P. 2002. "Rating banks: risk and uncertainty in an opaque industry." The American Economic Review 92(4), 874–888.
Ng, W.-Y., K.-W. Choi, and K.-H. Shum. 1996. "Arbitrated matching: formulation and protocol." European Journal of Operational Research 88(2), 348–357.
Poon, W. P. H., M. Firth, and H.-G. Fung. 1999. "A multivariate analysis of the determinants of Moody's bank financial strength ratings." Journal of International Financial Markets, Institutions & Money 9(3), 267–283.
Quenouille, M. 1949. "Approximate tests of correlation in time series." Journal of the Royal Statistical Society, Series B 11, 18–84.
Reinhart, C. M. 2002. "Default, currency crises, and sovereign credit ratings." World Bank Economic Review 16, 151–170.
Rivoli, P., and T. L. Brewer. 1997. "Political instability and country risk." Global Finance Journal 8(2), 309–321.
Sarkar, S., and R. S. Sriram. 2001. "Bayesian models for early warning of bank failures." Management Science 47(11), 1457–1475.
Sobehart, J. R., S. C. Keenan, and R. M. Stein. 2000. "Benchmarking quantitative default risk models: a validation methodology." Moody's Investors Service, New York, NY.
Stein, R. M. 2002. "Benchmarking default prediction models: pitfalls and remedies in model validation." Moody's KMV, New York, NY.
Sy, A. N. R. 2004. "Rating the rating agencies: anticipating currency crises or debt crises?" Journal of Banking and Finance 28(11), 2845–2867.
Tarjan, R. E. 1972. "Depth-first search and linear graph algorithms." SIAM Journal on Computing 1, 146–160.
The International Monetary Fund. 2001. World economic outlook, Washington, DC.
Treacy, W. F., and M. S. Carey. 2000. "Credit risk rating systems at large US banks." Journal of Banking & Finance 24(1–2), 167–201.
Appendix 43A
Table 43.15 Standard & Poor's country rating system

Investment ratings:
AAA: An obligor rated AAA has extremely strong capacity to meet its financial obligations. AAA is the highest issuer credit rating assigned by S&P's.
AA: An obligor rated AA has very strong capacity to meet its financial commitments. It differs from the highest-rated obligors only in small degree.
A: An obligor rated A has strong capacity to meet its financial commitments but is somewhat more susceptible to the adverse effects of changes in circumstances and economic conditions than obligors in higher-rated categories.
BBB: An obligor rated BBB has adequate capacity to meet its financial commitments. However, adverse economic conditions or changing circumstances are more likely to lead to a weakened capacity of the obligor to meet its financial commitments.

Speculative ratings:
BB: An obligor rated BB is less vulnerable in the near term than other lower-rated obligors. However, it faces major ongoing uncertainties and exposure to adverse financial or economic conditions, which could lead to its inability to meet financial commitments.
B: An obligor rated B is more vulnerable than the obligors rated BB, but, at the time of the rating, it has the capacity to meet financial commitments. Adverse business or financial conditions would likely impair its capacity to meet financial commitments.
CCC: An obligor rated CCC is vulnerable at the time of the rating, and is dependent upon favorable business, financial, and economic conditions to meet financial commitments.
CC: An obligor rated CC is highly vulnerable at the time of the rating.
C: An obligor rated C is vulnerable to nonpayment at the time of the rating and is dependent upon favorable business, financial, and economic conditions to meet financial commitments.

Default ratings:
D: An obligor rated D is predicted to default.
SD: An obligor rated SD (selective default) is presumed to be unwilling to repay.
S&P’s ratings (1998)
BB AA AAA AAC BB BB AAC A BBBC BBB BB BBB AC A AAC BC BBB BB BBBC AA AAA AAA BBB A BBB AC BB CCCC AAC A AA AAA BB BC BBC
Countries
Argentina Australia Austria Belgium Bolivia Brazil Canada Chile China Colombia Costa Rica Croatia Cyprus Czech Republic Denmark Dominican Rep Egypt El Salvador Estonia Finland France Germany Greece Hong-Kong Hungary Iceland India Indonesia Ireland Israel Italy Japan Jordan Kazakhstan Korea. Rep.
Table 43.16 1998 ratings
Moody’s ratings (1998) 9 19 21 20 8 7 20 14 15 12 11 12 16 14 20 10 11 12 14 21 21 21 14 15 13 18 10 6 21 15 18 20 9 9 11
S&P’s preorder (1998)
10 19 21 20 9 9 20 15 14 12 10 12 17 15 20 8 12 10 14 19 21 21 13 16 13 17 10 5 20 15 19 21 9 8 11
42:7 74:3 88:7 83:5 28 37:4 83 61:8 57:2 44:5 38:4 39:03 57:3 59:7 84:7 28:1 44:4 31:2 42:8 82:2 90:8 92:5 56:1 61:8 55:9 67 44:5 27:9 81:8 54:3 79:1 86:5 37:3 27:9 52:7
7 16 17 15 3 2 16 11 10 2 7 5 12 10 15 3 6 4 9 14 13 18 9 17 9 15 1 0 17 11 12 15 4 1 9
The institutional investor Optimistic ratings extension (1998) ratings (1998) 6 16 17 14 3 2 16 11 9 2 6 5 12 10 15 2 5 4 8 14 13 18 9 13 8 15 1 0 16 9 12 14 3 1 6
Pessimistic extension ratings (1998)
S&P’s ratings (1999) BB AAC AAA AAC BB BC AAC A BBB BBC BB BBB A A AAC BC BBB BBC BBBC AAC AAA AAA A A BBB AC BB CCCC AAC A AA AAA BB BC BBB
LRS scores (1998) 0:2768 0:0289 0:0094 0:0476 0:366 0:3744 0:0241 0:191 0:2159 0:3854 0:2748 0:297 0:1081 0:2088 0:048 0:3568 0:2915 0:3379 0:2518 0:064 0:0828 0:001 0:2255 0:017 0:2442 0:047 0:4063 0:4576 0:0179 0:2215 0:1064 0:0604 0:323 0:4095 0:2649 10 20 21 20 9 8 20 15 13 11 10 12 16 15 20 8 12 11 14 20 21 21 15 16 13 17 10 5 20 15 19 21 9 8 13
S&P’s preorder (1999) 7 16 17 15 3 3 16 11 10 2 7 6 12 10 15 3 6 5 8 14 13 18 9 16 9 15 2 0 17 10 12 15 4 1 10
Optimistic extension ratings (1999) 6 16 16 15 2 2 16 11 10 2 6 6 12 10 15 2 6 4 8 13 13 17 9 15 8 14 1 0 16 9 12 14 3 1 8
Pessimistic extension ratings (1999)
(continued)
0:263 0:0128 0:0038 0:0439 0:3518 0:4016 0:0112 0:1841 0:224 0:3964 0:257 0:3202 0:1021 0:1904 0:0492 0:3431 0:3067 0:3301 0:245 0:0458 0:0614 0:0126 0:1917 0:0213 0:247 0:0378 0:3994 0:4316 0:0249 0:2189 0:1122 0:0506 0:2818 0:4048 0:2182
LRS scores (1999)
43 Combinatorial Methods for Constructing Credit Risk Ratings 663
BBB BB BBB BBB AC BB BB AAA AAC AAA CC BBC BB BB BBC BBB AA B CCC AAA BBC A BBC AA AAC AAA BBB BBC BBB B AAA AAA BBB BC
Countries
Latvia Lebanon Lithuania Malaysia Malta Mexico Morocco The Netherlands New Zealand Norway Pakistan Panama Paraguay Peru Philippines Poland Portugal Romania Russia Singapore Slovak Republic Slovenia South Africa Spain Sweden Switzerland Thailand Trinidad & Tob Tunisia Turkey UK United States Uruguay Venezuela
Moody’s ratings (1998) 13 8 11 12 15 10 11 21 19 21 5 11 7 9 11 12 19 6 6 20 11 15 12 19 19 21 11 11 12 8 21 21 12 7
S&P’s preorder (1998)
13 9 12 12 17 10 10 21 20 21 2 11 9 10 11 12 19 6 3 21 11 16 11 19 20 21 12 11 12 7 21 21 12 8
38 31:9 36:1 51 61:7 46 43:2 91:7 73:1 86:8 20:4 39:9 31:3 35 41:3 56:7 76:1 31:2 20 81:3 41:3 58:4 45:8 80:3 79:7 92:7 46:9 43:3 50:3 36:9 90:2 92:2 46:5 34:4
5 4 4 11 13 3 6 19 18 19 0 7 2 3 4 7 14 1 0 19 7 11 8 13 18 20 8 6 9 0 18 19 7 0
The institutional investor Optimistic extension ratings ratings (1998) (1998) 5 4 4 9 12 3 5 19 17 18 0 6 2 3 4 6 13 1 0 18 6 9 6 12 17 20 8 6 8 0 18 19 7 0
Pessimistic extension ratings (1998)
S&P’s ratings (1999) BBB BB BBB BBB A BB BB AAA AAC AAA B BBC B BB BBC BBB AA B SD AAA BBC A BBC AAC AAC AAA BBB BBB BBB B AAA AAA BBB B
LRS scores (1998) 0:3026 0:3223 0:3247 0:1676 0:0999 0:3608 0:2952 0:0251 0:0001 0:0125 0:4563 0:2712 0:3865 0:3536 0:3242 0:2772 0:0742 0:3987 0:4428 0:0073 0:2814 0:1922 0:2523 0:0924 0:0106 0:071 0:2452 0:2824 0:2488 0:4458 0:0057 0:0205 0:2695 0:4444 13 9 12 13 16 10 10 21 20 21 6 11 7 10 11 13 19 6 0 21 11 16 11 20 20 21 12 12 12 7 21 21 12 7
S&P’s preorder (1999) 5 4 4 11 12 3 5 19 17 18 0 6 3 3 4 7 14 1 0 18 7 10 7 13 17 19 8 6 8 0 18 19 7 1
Optimistic extension ratings (1999) 5 4 4 11 12 3 5 18 16 18 0 6 2 3 3 7 13 1 0 18 6 9 6 13 17 19 8 6 8 0 18 18 7 0
Pessimistic extension ratings (1999)
0:3039 0:3121 0:3233 0:1712 0:2402 0:3284 0:2881 0:0337 0:0001 0:0076 0:4501 0:2487 0:4066 0:3644 0:349 0:2743 0:0706 0:3942 0:4197 0:0225 0:269 0:1878 0:2386 0:0798 0:0143 0:0613 0:2383 0:248 0:242 0:4177 0:0062 0:0264 0:2409 0:3921
LRS scores (1999)
We have converted the Standard & Poor’s rating scale (columns 1 and 4) into a numerical scale (columns 2 and 5). Such a conversion is not specific to us. Bouchet et al. (2003), Ferri et al. (1999), Sy (2004) proceed similarly. Moreover, Bloomberg, a major provider of financial data services, developed a standard cardinal scale for comparing Moody’s, S&P’s and Fitch-BCA ratings (Kaminsky and Schmukler 2002). A higher numerical value denotes a higher probability of default. The numerical scale is referred to in this paper as Standard & Poor’s preorder
S&P’s ratings (1998)
Table 43.16 (continued)
664 A. Kogan and M.A. Lejeune
Chapter 44
The Structural Approach to Modeling Credit Risk

Jing-zhi Huang
Abstract In this article we present a survey of recent developments in the structural approach to modeling credit risk. We first review some models for measuring credit risk based on the structural approach. We then discuss the empirical evidence in the literature on the performance of structural models of credit risk.

Keywords Credit risk · Default risk · Structural credit risk models · Credit spreads · Corporate bonds · Credit default swaps · Credit derivatives
44.1 Introduction

Assessing and managing credit risk of financial assets has been a major area of interest and concern to academics, practitioners, and regulators. There are three widely used approaches to credit risk modeling in practice. One method for estimating a firm's probability of default is the credit scoring approach based on statistical analysis (e.g., Altman 1968). Another one is the reduced-form approach of Jarrow and Turnbull (1995) and Duffie and Singleton (1999). See also Das and Tufano (1996), Jarrow (2001), Jarrow et al. (1997), Lando (1998), and Madan and Unal (1998). The third approach is the so-called structural approach originated with Black and Scholes (1973) and Merton (1974). In this article we focus on the structural approach. We first review the well-known Merton (1974) model and some of its extensions. We then discuss the empirical evidence in the literature on the performance of structural credit risk models in both estimating a firm's probability of default and predicting the price of a credit-sensitive instrument such as corporate bonds and credit default swaps.
J.-z. Huang () Department of Finance, Smeal College of Business, Pennsylvania State University, University Park, PA 16802, USA e-mail: [email protected]
44.2 Structural Credit Risk Models

Under the structural approach to credit risk modeling, securities issued by a firm are considered to be contingent claims written on the asset value of the firm. The firm's asset value evolves according to a stochastic process. Default occurs when the asset value crosses a boundary, which can be either endogenous or exogenous. Once the risk-neutral default probability (over a given horizon) of the firm is determined, the price of any corporate security can then be obtained given the payoff of the security. Below we first describe the Merton (1974) model, which can be considered to be the benchmark structural model. We then discuss some extensions of the Merton model.
44.2.1 The Merton (1974) Model

In this model the firm is capitalized with common stock and one zero-coupon bond. The bond has a face value F and maturity date T. Default occurs when the asset value of the firm at maturity is below the face value. In the event of default, bondholders get the entire firm and shareholders get nothing. In addition, the asset value is assumed to follow a geometric Brownian motion process. Let V_t be the time-t value of the firm's assets and r be the constant risk-free interest rate. The dynamics of the firm's asset value are given as follows:

dV_t = (r - \delta)\, V_t\, dt + \sigma_v V_t\, dZ_t \qquad (44.1)

where \sigma_v and \delta are constants and Z is a one-dimensional standard Brownian motion process under the risk-neutral measure. The value of the zero-coupon bond at time t, denoted by B_t, is given by

B_t = F e^{-r(T-t)} N(d_2) + V_t N(-d_1) \qquad (44.2)

where N(\cdot) represents the cumulative standard normal distribution function and

d_1 = \frac{\ln(V_t/F) + (r + \sigma_v^2/2)(T-t)}{\sigma_v \sqrt{T-t}} \qquad (44.3)

d_2 = d_1 - \sigma_v \sqrt{T-t} \qquad (44.4)
The net payout parameter δ is assumed to be zero in the original Merton model. However, the pricing formula (44.2) can easily be extended to incorporate a nonzero δ. The Merton model is very intuitive and also easy to implement. As mentioned later in Sect. 44.3, the model is found to systematically underestimate yield spreads of corporate bonds. This stylized fact has led researchers to extend the Merton model in order to generate more realistic levels of yield spreads.
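For illustration, the following is a minimal sketch of formulas (44.2)–(44.4), computing the Merton bond price and the implied credit spread over the risk-free rate; the parameter values in the usage line are purely hypothetical.

```python
import numpy as np
from scipy.stats import norm

# Merton (1974) zero-coupon bond price (44.2) and credit spread, with delta = 0.
def merton_bond(V, F, r, sigma_v, tau):
    d1 = (np.log(V / F) + (r + 0.5 * sigma_v**2) * tau) / (sigma_v * np.sqrt(tau))
    d2 = d1 - sigma_v * np.sqrt(tau)
    B = F * np.exp(-r * tau) * norm.cdf(d2) + V * norm.cdf(-d1)
    y = -np.log(B / F) / tau           # continuously compounded bond yield
    return B, y - r                    # price and credit spread

price, spread = merton_bond(V=100.0, F=70.0, r=0.05, sigma_v=0.25, tau=5.0)
```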
44.2.2 Extensions of the Merton (1974) Model

The recent theoretical literature on structural credit risk models includes a variety of extensions and improvements of the Merton (1974) model, such as allowing for coupons, default before maturity, and stochastic interest rates. Below we review some recently developed models in the literature.

44.2.2.1 First-Passage Models

Black and Cox (1976) propose the first barrier-based model, where the default boundary is a curve and serves as a lower (default) barrier. The default boundary can be either exogenous or endogenous. The former can be considered to be a safety covenant, whereas the latter can result from the equity holder's option to default, as in Geske (1977). (We focus on the former case in this subsection.) One advantage of the Black–Cox model over Merton's is that it allows default to occur prior to the debt maturity. Kim et al. (1993), Nielsen et al. (1993), Longstaff and Schwartz (1995), and Briys and De Varenne (1997) incorporate stochastic interest rates into a barrier-based model with an exogenous default boundary. Collin-Dufresne and Goldstein (2001) and Tauren (1999) extend the Longstaff–Schwartz model to allow for stationary leverage ratios. Zhou (2001) and Huang and Huang (2003) incorporate jump risk into the Black–Cox model with an exogenous default boundary. The latter model is based on a double-exponential jump-diffusion asset return process and is much more tractable than the former when dealing with risky coupon bonds, thanks to a result of Kou and Wang (2003).

As observed in Huang and Huang (2003), although barrier models differ in certain economic assumptions, they can be embedded in the same underlying structure that includes specifications of the underlying firm's asset process, the default boundary, the recovery rate, etc. Let V be the firm's asset process, K the default boundary, and r the default-free interest rate process. Assume that, under a risk-neutral measure,

\frac{dV_t}{V_t} = (r_t - \delta)\, dt + \sigma_v\, dW_t^Q + d\left[ \sum_{i=1}^{N_t^Q} \left( Z_i^Q - 1 \right) \right] - \lambda^Q \zeta^Q\, dt \qquad (44.5)

d \ln K_t = \kappa_\ell \left[ \nu(r_t) - \ln(K_t/V_t) \right] dt \qquad (44.6)

dr_t = (\alpha - \beta r_t)\, dt + \sigma_r\, dZ_t^Q \qquad (44.7)

where \delta, \sigma_v, \kappa_\ell, \lambda^Q, \alpha, \beta, \sigma_r, and \theta = \alpha/\beta are constants, and W^Q and Z^Q are both one-dimensional standard Brownian motions under the risk-neutral measure and are assumed to have a constant correlation coefficient of \rho. In Equation (44.5), the process N^Q is a Poisson process with a constant intensity \lambda^Q > 0, the Z_i^Q's are i.i.d. random variables, and Y^Q \equiv \ln(Z_1^Q) has a double-exponential distribution with a density given by

f_{Y^Q}(y) = p_u^Q \eta_u^Q e^{-\eta_u^Q y}\, 1_{\{y \geq 0\}} + p_d^Q \eta_d^Q e^{\eta_d^Q y}\, 1_{\{y < 0\}} \qquad (44.8)

In Equation (44.8), the parameters \eta_u^Q, \eta_d^Q > 0 and p_u^Q, p_d^Q \geq 0 are all constants, with p_u^Q + p_d^Q = 1. The mean percentage jump size \zeta^Q is given by

\zeta^Q = E^Q\left[ e^{Y^Q} \right] - 1 = \frac{p_u^Q \eta_u^Q}{\eta_u^Q - 1} + \frac{p_d^Q \eta_d^Q}{\eta_d^Q + 1} - 1 \qquad (44.9)
(44.9)
Most first-passage models are special cases of the general specification in Equations (44.5)–(44.7). For instance, if the jump intensity is zero, then the asset process is a geometric Brownian motion. This specification is used in the diffusion models, such as Black and Cox (BC), Longstaff and Schwartz (LS), and Collin-Dufresne and Goldstein (CDG). Regarding the specification of the default boundary K, it is a point at the bond maturity in the Merton model. If ` is set to be zero, then the default boundary is flat (a continuous barrier), an assumption made in Black and Cox (BC), Longstaff and Schwartz (LS), and the jump diffusion (HH) models. The mean-reverting specification in Equation (44.6) is used in the Collin-Dufresne and Goldstein (CDG) model. The Vasicek model in Equation (44.7) is used to describe the dynamics of
44 The Structural Approach to Modeling Credit Risk
the risk-free rate in the two-factor models of Longstaff and Schwartz (LS) and Collin-Dufresne and Goldstein (CDG) models. If both ˇ and r are zero, then the interest rate is constant, an assumption made in the three one-factor models. Prices of coupon bonds under the first-passage models specified in Equation (44.5) can be calculated straightforwardly. See, for example, Huang and Huang (2003).
667
44.2.2.2 Models with an Endogenous Default Boundary

Black and Cox (1976) and Geske (1977) propose models where the default boundary is endogenously determined by the equityholder's option to default. The former focuses on continuous-coupon bonds and the latter considers bonds with discrete coupons. Leland (1994) extends the Black and Cox model to include taxes and bankruptcy costs and obtains a model of the optimal capital structure. Leland and Toft (1996) consider a specific stationary debt structure such that their model can apply to finite-maturity debt. See also Leland (1998). The above models have an analytical representation for the price of defaultable bonds, but Geske's formula is not easy to implement for a long-maturity bond. However, as shown in Huang (1997) and Eom et al. (2004), the Geske model can be implemented directly using the binomial method. Chen and Kou (2005) and Hilberink and Rogers (2002) incorporate jumps in the asset return process into the Leland model.

44.2.2.3 Models with Strategic Default

Anderson and Sundaresan (1996) and Mella-Barral and Perraudin (1997) incorporate strategic defaults. Anderson et al. (1996) and Fan and Sundaresan (2000) provide analytical results for the prices of an infinite-maturity bond with strategic default. Huang (1997) extends Anderson and Sundaresan (1996) to allow for equity financing. Acharya and Carpenter (2002) incorporate stochastic interest rates into Anderson and Sundaresan (1996). Acharya et al. (2006) consider the interactions between the equityholder's options, specifically, the option to carry cash reserves, the option to service debt strategically, and the option to liquidate. Broadie et al. (2007) explicitly model the impact of bankruptcy procedures on the valuation of defaultable bonds. Francois and Morellec (2004) consider a similar issue, albeit in a setting without strategic default.

44.2.2.4 Models with Incomplete Accounting Information

Duffie and Lando (2001) take incomplete accounting information into account and show that their structural model with imperfect information about the firm asset process is equivalent to a reduced-form model. See also Cetin et al. (2004), Giesecke (2006), and Guo et al. (2008).

44.2.2.5 Models of Optimal Capital Structure

Brennan and Schwartz (1978) examine optimal leverage using a first-passage model with an exogenous default boundary. Leland (1994) considers a perpetual debt structure and obtains closed-form solutions for optimal capital structure with an endogenous default boundary. Leland and Toft (1996) extend the Leland model to a finite-maturity stationary debt structure. Other examples of optimal capital structure models include Fischer et al. (1989), Mello and Parsons (1992), Leland (1998), and Hackbarth et al. (2006).

44.2.2.6 Affine Structural Models with Lévy Jumps

Both the Merton (1974) and first-passage models can be extended to an affine jump-diffusion setting. In particular, in the case of defaultable zero-coupon bonds, if we follow Merton in assuming that default is possible only at the bond maturity, we can incorporate stochastic interest rates, stochastic asset volatility, and jumps into the Merton model and still obtain a closed-form solution for bond prices, thanks to advances in the option pricing literature. For instance, Shimko et al. (1993) incorporate the one-factor Vasicek (1977) term structure model into Merton (1974), and Delianedis and Geske (2001) extend the Merton model to allow for Poisson jumps in the asset return process. Huang (2005) considers an affine class of structural models of corporate bond pricing, in which the underlying asset volatility can be stochastic, the underlying asset return can include a Lévy jump component, and the interest rate process can be driven by multiple factors. As mentioned in Huang (2005), quasianalytical solutions can be obtained for prices of zero-coupon bonds with default possible only at maturity by following Duffie et al. (2000) and Carr and Wu (2004), and, as a result, can also be obtained for prices of coupon bonds by following Chen and Huang (2001).

Affine structural models can be motivated by the stylized empirical facts documented in the literature. In particular, Huang and Huang (2003) mention that a model with stochastic asset return volatility may explain the credit risk puzzle. Huang (2005) provides some preliminary evidence on
this. Huang and Zhou (2008) find that the standard structural models cannot fit either the level or the time variation of equity volatility, and argue that this may call for a stochastic asset volatility model. Zhang et al. (in press) provide further evidence based on credit default swap data. The use of a multifactor term structure model can be somewhat motivated by the finding that we may need a more sophisticated model than the one-factor Vasicek model used in the existing structural models (Eom et al. 2004).
44.2.3 Models of Default Probabilities

In addition to prices of corporate securities, the structural approach can also be used to estimate default probabilities (under the physical measure). For instance, it is known that Moody's KMV uses the insight of the Merton model to develop a method for estimating expected default frequency. Given a structural credit risk model along with the specification of its underlying structure under the physical measure, the probability of default over a particular horizon is straightforward to calculate. In particular, as shown in Huang and Huang (2003), analytical or quasianalytical solutions are available for default probabilities under the class of first-passage models specified in Equations (44.5)–(44.7) and the assumption that the asset risk premium is a constant or follows a mean-reverting process.
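As a minimal illustration of this calculation in the benchmark case, the sketch below computes the Merton-style default probability under the physical measure, replacing the risk-free rate with an assumed asset drift mu. This is a textbook special case, not the calibration used by Moody's KMV or Huang and Huang (2003).

```python
import numpy as np
from scipy.stats import norm

# Physical default probability in the Merton setting: P[V_T < F] = N(-d2),
# with the asset drift mu in place of the risk-free rate r.
def physical_default_prob(V, F, mu, sigma_v, tau):
    d2 = (np.log(V / F) + (mu - 0.5 * sigma_v**2) * tau) / (sigma_v * np.sqrt(tau))
    return norm.cdf(-d2)
```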
44.3 Empirical Evidence

This section reviews empirical studies of structural models based on information from the corporate bond market, the credit default swap market, and default rates.
44.3.1 Evidence from the Corporate Bond Market

Structural models can be used to predict both the level of corporate bond spreads and the shape of yield spread curves. Below we present the empirical evidence on the performance of structural models in each of these two applications.
44.3.1.1 On Corporate Bond Pricing

Jones et al. (1984) provide the first empirical test of a structural model using a sample of corporate bonds. However, they consider a sample of callable corporate bonds in their empirical analysis. As such, their study does not actually test the Merton model; it tests only a Merton-type model that is applicable to callable bonds. Ogden (1987) conducts a similar study using a sample of new bond offerings. Recently, reliable bond price data have become available to academics and, as a result, there have been a number of empirical studies of structural models for noncallable bonds. For instance, Lyden and Saraniti (2000), Delianedis and Geske (2001), Eom et al. (2004), Arora et al. (2005), and Ericsson and Reneby (2005) examine the performance of some models for the valuation of corporate bonds.

The empirical evidence so far indicates that structural models have difficulty predicting the level of corporate bond spreads accurately. In particular, Eom et al. (2004) construct a sample of 182 bond prices from firms with simple capital structures during the period 1986–1997 and test five structural models: those of Merton (1974), Geske (1977), Longstaff and Schwartz (1995), Leland and Toft (1996), and Collin-Dufresne and Goldstein (2001). Their main finding is that, on average, the first two models underestimate corporate bond spreads whereas the latter three newer models overestimate them. However, Schaefer and Strebulaev (2008) find that although not accurate in terms of pricing, the Merton model is useful for hedging.

Structural models are also examined indirectly using regression analyses that link individual bond yield spreads with certain structural model variables. For instance, Collin-Dufresne et al. (2001) find that structural model implied variables can explain only a small portion of changes in corporate bond spreads. Campbell and Taksler (2003) document that idiosyncratic volatility can explain one third of corporate bond spreads cross-sectionally. Davydenko and Strebulaev (2007) do not find strong evidence that links bond spreads with strategic actions of equity holders. Cremers et al. (in press) demonstrate that option implied volatility can help explain the credit spread puzzle documented in Huang and Huang (2003).

Some empirical studies are based on aggregate credit spread data. For instance, Wei and Guo (1997) implement the Merton (1974) and Longstaff and Schwartz (1995) models using aggregate data and find that the former is more accurate than the latter. Anderson and Sundaresan (2000) fit the Merton model to corporate bond indices. Huang and Kong (2003) present an examination of the determinants of corporate bond credit spreads using both weekly and monthly option-adjusted spreads for nine corporate bond indexes from Merrill Lynch from January 1997 to July 2002. They find that the Russell 2000 index historical return volatility and the Conference Board composite leading and coincident economic indicators have significant power in explaining credit spread changes, especially for high yield indexes. Using
the same Merrill data over a slightly longer sample period, Huang and Kong (2005) find that the announcement surprises in leading economic indicators and employment reports have a significant impact on credit spreads of high-yield bonds.
and Kurbat 2001), Leland (2004), and Bharath and Shumway (in press) present evidence on the usefulness of structural models for predicting default probabilities.
44.3.1.2 On the Slope of Credit Spread Curves
44.3.3 The Credit Spread Puzzle and Its Implications
Another issue is whether structural models can correctly predict the slope of the credit spread curve. Fons (1994), Sarig and Warga (1989), and He et al. (2000) examine the problem and find that spread curves for high yield bonds tend to be downward-sloping. Helwege and Turner (1999) point out that the results of such studies are biased by failing to properly control for issuers’ credit quality. Using a matched sample of bonds issued by the same firm, they find that the credit curve is upward-sloping even for speculative-grade bonds. In addition to precisely controlling for credit quality, an important element of testing the shape of the credit spread curve is to consider the role of the coupon. Some structural models, such as the Merton (1974) model, apply only to zerocoupon bonds. This raises the question as to whether Helwege and Turner’s findings, which seem to overturn those of Sarig and Warga, differ because they use a sample of coupon bonds while the latter study is based on zero-coupon bonds. Huang and Zhang (2008) examine the shape of credit spread curves for both investment-grade and speculativegrade bonds using an approach that deals with the presence of coupons in both Treasury and corporate bonds. In particular, they calculate a spot credit spread curve for each set of matched bonds in order to test the predicted slopes of zero-coupon based structural models. Using a sample of newly issued corporate bonds from Securities Data Corporation (SDC) over a thirty year period, they find that the term structure of credit spreads for both zero-coupon and coupon bonds is usually upward-sloping. This is true for both investment-grade and high yield bonds. This result suggests that structural models that predict a downward-sloping credit yield curve for high yield bonds cannot price corporate bonds well, regardless of coupons. However, the sample of highyield bonds used in their analysis includes only those rated B or higher. As such, spread curves for credit ratings below B may not be upward sloping. In fact, Lando and Mortensen (2005) document some evidence consistent with this in the credit default swap market.
44.3.3 The Credit Spread Puzzle and Its Implications
Huang and Huang (2003) examine the implications of structural models for both the prices of defaultable bonds and default probabilities. Specifically, they consider several existing models plus two new ones. (The two proposed models are a double-exponential jump diffusion model and a first-passage model with a mean-reverting asset risk premium.) They then conduct a two-step calibration analysis. First, they calibrate each of the models such that it is consistent with data on historical default loss experience and the equity risk premium. They then calculate bond yield spreads using the calibrated model. Huang and Huang find that credit risk accounts for only a small fraction of the observed corporate-Treasury yield spreads for investment-grade bonds of all maturities, and that it accounts for a much higher fraction of yield spreads for junk bonds. In particular, contrary to the conventional wisdom, the jump diffusion model is found to be unable to generate high enough yield spreads even for short-maturity bonds once such a model is calibrated to historical default rates. In other words, the main finding of Huang and Huang is that structural models have difficulty explaining yield spreads and default rates simultaneously. This stylized fact has been referred to as the credit spread puzzle by some researchers in the literature. Notice that the credit spread puzzle does not simply mean that structural models underestimate bond yield spreads; in fact, as documented in the empirical literature, some structural models can overestimate bond yield spreads. The puzzle refers to the failure of structural models to explain both yield spreads and default rates, and involves the risk premium that links spreads (under the so-called Q-measure) with default probabilities (under the P-measure). One possible implication of the credit spread puzzle is that existing structural models are misspecified and that more sophisticated models are needed in order to generate both realistic yield spreads and default rates. For instance, Huang and Huang (2003) conjecture that a structural model with stochastic asset return volatility may explain the credit spread puzzle. As mentioned earlier in Sect. 44.2.2.6, there seems to be some recent empirical support for such models. Cremers et al. (in press) show that option implied volatility can help explain the puzzle. Several studies go beyond the standard structural approach and examine whether general equilibrium models can explain the credit spread puzzle. See, for example,
Bhamra et al. (2008), Chen (2007), Chen et al. (in press), David (2007), and Tang and Yan (2006). One alternative implication of the credit spread puzzle is that perhaps the unexplained portion of bond yield spreads is due to noncredit factors such as liquidity (and thus that existing purely credit-based structural models are "good enough," at least in terms of explaining the credit portion of yield spreads). Recently, several empirical studies have documented that corporate bond yields include a significant liquidity component, using a variety of liquidity measures. For instance, Chen et al. (2007) apply the measure of liquidity developed in Lesmond et al. (1999) to a sample of corporate bonds from Datastream and find that liquidity alone can explain 7% of the cross-sectional variation in yield spreads for investment-grade bonds and up to 22% of the variation for speculative-grade bonds. Mahanti et al. (2008) propose a latent measure of liquidity that does not rely on trading activity or bid-ask spreads. Using the CDS spread as the measure of credit risk, Longstaff et al. (2005) show that credit risk explains a significant portion, but not all, of bond yield spreads. Other studies on corporate bond liquidity include Bao et al. (2008), Han and Zhou (2008), and Mahanti et al. (2007). Some studies model bond liquidity explicitly. Leland (2006) shows that incorporating a liquidity variable into a structural model can help explain the credit spread puzzle. Ericsson and Renault (2006) also consider the impact of liquidity on spreads, although they do not focus on the credit spread puzzle.
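The two-step calibration that generates the puzzle can be illustrated in a drastically simplified Merton-style setting: asset volatility is first chosen so that the model's real-world (P-measure) default probability matches a historical default rate, given an assumed asset risk premium, and the risk-neutral (Q-measure) spread is then computed from that calibrated volatility. All numerical inputs below are illustrative assumptions; the actual calibrations also match recovery rates and the equity premium and use richer dynamics.

```python
# Simplified two-step calibration: (1) solve for asset volatility matching a
# historical (P-measure) default rate; (2) compute the Q-measure spread.
from math import log, sqrt, exp
from scipy.stats import norm
from scipy.optimize import brentq

V, F, T, r, premium = 100.0, 45.0, 10.0, 0.05, 0.04  # assumed asset risk premium
target_pd = 0.20                                     # illustrative 10-year default rate

def real_world_pd(sigma):
    mu = r + premium                                 # P-measure asset drift
    return norm.cdf(-(log(V / F) + (mu - 0.5 * sigma**2) * T) / (sigma * sqrt(T)))

sigma_star = brentq(lambda s: real_world_pd(s) - target_pd, 0.01, 1.5)

def q_spread(sigma):
    d1 = (log(V / F) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    debt = F * exp(-r * T) * norm.cdf(d2) + V * norm.cdf(-d1)
    return -log(debt / F) / T - r

print(sigma_star, q_spread(sigma_star))  # spread implied by the matched default rate
```

Matching the historical default rate pins down the volatility, and the spread that the same parameters imply under the Q-measure can then be compared with observed market spreads; in the richer calibrations of Huang and Huang, this comparison produces the shortfall described above.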
44.3.4 Evidence from the CDS Market

Credit default swaps are a very popular credit derivative instrument. The CDS market accounts for about half of the credit derivatives market and had a notional value of over $62 trillion at the end of 2007, according to the International Swaps and Derivatives Association. CDS spreads are generally considered to be a purer measure of credit risk than corporate bond spreads. As such, empirical evidence on the presence of a liquidity component in corporate bond spreads has motivated researchers to test credit risk models using data on CDS spreads. For instance, Predescu (2005) examines the Merton (1974) model and a Black and Cox (1976) type model with a rolling estimation procedure combined with the MLE approach proposed in Duan (1994) and finds that both models underestimate the CDS spread. Arora et al. (2005) and Chen et al. (2006) also find that the Merton model underestimates the spread. In addition, both studies show that first-passage models overestimate the CDS spread. (However, the Longstaff–Schwartz model implemented in the latter study
is based on an approximated solution.) Ericsson et al. (2006) find that whereas the Leland (1994) and Fan–Sundaresan (2000) models underestimate the CDS spread, the Leland–Toft (1996) model overestimates the spread. Finally, Hull et al. (2004) investigate the Merton model using a calibration approach. All of these studies focus on how well a structural model can fit the spread of 5-year CDS contracts (the most liquid contracts in the CDS market). Huang and Zhou (2008) conduct a specification analysis of structural models using information from both the CDS and equity markets. More specifically, unlike the existing studies, they use the entire term structure of CDS spreads and equity volatility from high-frequency return data. This allows them to provide a consistent econometric estimation of the pricing model parameters and specification tests based on the joint behavior of time-series asset dynamics and cross-sectional pricing errors. They test four first-passage models: Black and Cox (1976), Longstaff and Schwartz (1995), the stationary leverage model of Collin-Dufresne and Goldstein (2001), and the double-exponential jump-diffusion model considered in Huang and Huang (2003). One main finding of Huang and Zhou is that although on average these four models all underestimate the CDS spread, the two newer ones improve significantly over the other two (judged by pricing error). Furthermore, the CDS pricing error of the best-performing model here (the CDG model) is much smaller than the bond pricing error of the best-performing model, that of Geske (1977), reported in Eom et al. (2004). Still, Huang and Zhou document that the existing structural models have difficulty capturing the time-series behavior of CDS spreads and equity volatility, especially for investment-grade names. This points to a potential role for time-varying asset volatility, a feature that is missing in the standard structural models. Like its counterpart in the corporate bond market, the unexplained portion of the CDS spread documented in Huang and Zhou (2008) may be due to liquidity. (However, Huang and Zhou focus only on the pricing implications of the four structural models and thus do not examine the credit spread puzzle.) See Bongaerts et al. (2008) and Tang and Yan (2007) for evidence on CDS liquidity. There are also studies that link CDS spreads with structural-model implied variables plus some other variables using regression analysis. See, for example, Cossin and Hricko (2001), Ericsson et al. (2005), Houweling and Vorst (2005), and Cao et al. (2007). In particular, Zhang et al. (in press) provide empirical evidence that jumps in equity returns and stochastic asset volatility help raise the explanatory power of regression models. Overall, the empirical evidence so far indicates that structural models seem to fit CDS spreads better than corporate bond spreads, but they still cannot fully explain CDS spreads or capture the time-series behavior of the CDS term structure.
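Fitting a structural model to the term structure of CDS spreads requires converting the model's survival probabilities into par CDS spreads by equating the protection and premium legs. The sketch below does this for a flat hazard rate standing in for a model-implied survival curve; the hazard rate, discount rate, and recovery value are illustrative assumptions only.

```python
# Par CDS spread from a survival curve: the spread equates the protection leg
# (payments of 1 - R on default) with the quarterly premium annuity.
from math import exp

def par_cds_spread(h, r, recovery, T, dt=0.25):
    n = int(T / dt)
    surv = [exp(-h * i * dt) for i in range(n + 1)]  # survival probabilities
    df = [exp(-r * i * dt) for i in range(n + 1)]    # risk-free discount factors
    protection = sum((1 - recovery) * df[i] * (surv[i - 1] - surv[i])
                     for i in range(1, n + 1))
    premium_annuity = sum(dt * df[i] * surv[i] for i in range(1, n + 1))
    return protection / premium_annuity

print(par_cds_spread(h=0.02, r=0.05, recovery=0.4, T=5.0))  # roughly (1 - R) * h
```

A structural model enters through the survival curve: replacing the flat hazard with first-passage survival probabilities yields the model-implied CDS term structure that is compared with market quotes.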
44.4 Conclusion

The structural approach to modeling credit risk is widely used by both academics and practitioners. The approach has proved quite useful in many respects, but the empirical evidence also points to several limitations of the existing structural credit risk models. This chapter covers only part of a large and fast-growing literature on credit risk modeling. For a more complete review and references, see Bielecki and Rutkowski (2002), Dai and Singleton (2003), Duffie and Singleton (2003), Lando (2004), Saunders and Allen (2002), and Schönbucher (2003).
References

Acharya, V. and J. Carpenter. 2002. "Corporate bond valuation and hedging with stochastic interest rates and endogenous bankruptcy." Review of Financial Studies 15, 1355–1383.
Acharya, V., J.-Z. Huang, M. Subrahmanyam, and R. Sundaram. 2006. "When does strategic debt-service matter?" Economic Theory 29, 363–378.
Altman, E. I. 1968. "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy." The Journal of Finance 23, 589–609.
Anderson, R. W. and S. Sundaresan. 1996. "Design and valuation of debt contracts." Review of Financial Studies 9, 37–68.
Anderson, R. and S. Sundaresan. 2000. "A comparative study of structural models of corporate bond yields: an exploratory investigation." Journal of Banking and Finance 24, 255–269.
Anderson, R. W., S. Sundaresan, and P. Tychon. 1996. "Strategic analysis of contingent claims." European Economic Review 40, 871–881.
Arora, N., J. R. Bohn, and F. Zhu. 2005. "Reduced form vs. structural models of credit risk: a case study of three models." Journal of Investment Management 3, 43–67.
Bao, J., J. Pan, and J. Wang. 2008. "Liquidity of corporate bonds." SSRN eLibrary, http://ssrn.com/paper=1106852.
Bhamra, H. S., L.-A. Kuehn, and I. A. Strebulaev. 2008. "The levered equity risk premium and credit spreads: a unified framework." SSRN eLibrary, http://ssrn.com/paper=1016891.
Bharath, S. and T. Shumway. in press. "Forecasting default with the KMV-Merton model." Review of Financial Studies.
Bielecki, T. R. and M. Rutkowski. 2002. Credit risk: modeling, valuation and hedging, Springer, New York.
Black, F. and J. Cox. 1976. "Valuing corporate securities: some effects of bond indenture provisions." Journal of Finance 31, 351–367.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–654.
Bongaerts, D., F. de Jong, and J. Driessen. 2008. Derivative pricing with liquidity risk: theory and evidence from the credit default swap market, Working paper.
Brennan, M. J. and E. S. Schwartz. 1978. "Corporate income taxes, valuation, and the problem of optimal capital structure." Journal of Business 51, 103.
Briys, E. and F. De Varenne. 1997. "Valuing risky fixed rate debt: an extension." Journal of Financial and Quantitative Analysis 32, 239–248.
Broadie, M., M. Chernov, and S. Sundaresan. 2007. "Optimal debt and equity values in the presence of Chapter 7 and Chapter 11." Journal of Finance 62, 1341–1377.
Campbell, J. and G. Taksler. 2003. "Equity volatility and corporate bond yields." Journal of Finance 58, 2321–2349.
Cao, C., F. Yu, and Z. Zhong. 2007. The information content of option-implied volatility for credit default swap valuation, Working paper.
Carr, P. and L. Wu. 2004. "Time-changed Lévy processes and option pricing." Journal of Financial Economics 71, 113–141.
Cetin, U., R. Jarrow, P. Protter, and Y. Yildirim. 2004. "Modeling credit risk with partial information." Annals of Applied Probability 14, 1167–1178.
Chen, H. 2007. Macroeconomic conditions and the puzzles of credit spreads and capital structure, Working paper, University of Chicago.
Chen, N. and S. Kou. 2005. Credit spreads, optimal capital structure, and implied volatility with endogenous default and jump risk, Working paper.
Chen, R. R., F. J. Fabozzi, G. G. Pan, and R. Sverdlove. 2006. "Sources of credit risk: evidence from credit default swaps." The Journal of Fixed Income 16, 7–21.
Chen, R. R. and J.-Z. Huang. 2001. "Credit spread bounds and their implications for credit risk modeling." SSRN eLibrary, http://ssrn.com/abstract=275339.
Chen, L., D. A. Lesmond, and J. Wei. 2007. "Corporate yield spreads and bond liquidity." Journal of Finance 62, 119–149.
Chen, L., P. Collin-Dufresne, and R. Goldstein. in press. "On the relation between the credit spread puzzle and the equity premium puzzle." Review of Financial Studies.
Collin-Dufresne, P. and R. Goldstein. 2001. "Do credit spreads reflect stationary leverage ratios?" Journal of Finance 56, 1929–1957.
Collin-Dufresne, P., R. Goldstein, and S. Martin. 2001. "The determinants of credit spread changes." Journal of Finance 56, 2177–2207.
Cossin, D. and T. Hricko. 2001. Exploring for the determinants of credit risk in credit default swap transaction data, Working paper.
Cremers, M., J. Driessen, P. Maenhout, and D. Weinbaum. in press. "Explaining the level of credit spreads: option-implied jump risk premia in a firm value model." Review of Financial Studies.
Dai, Q. and K. Singleton. 2003. "Term structure dynamics in theory and reality." Review of Financial Studies 16, 631–678.
Das, S. and P. Tufano. 1996. "Pricing credit sensitive debt when interest rates, credit ratings and credit spreads are stochastic." Journal of Financial Engineering 5, 161–198.
David, A. 2007. "Inflation uncertainty, asset valuations, and the credit spreads puzzle." Review of Financial Studies 20.
Davydenko, S. A. and I. A. Strebulaev. 2007. "Strategic actions and credit spreads: an empirical investigation." Journal of Finance 62, 2633–2671.
Delianedis, G. and R. Geske. 2001. "The components of corporate credit spreads: default, recovery, tax, jumps, liquidity, and market factors." SSRN eLibrary, http://ssrn.com/abstract=306479.
Duan, J. C. 1994. "Maximum likelihood estimation using price data of the derivative contract." Mathematical Finance 4, 155–167.
Duffie, D. and D. Lando. 2001. "Term structures of credit spreads with incomplete accounting information." Econometrica 69, 633–664.
Duffie, D. and K. Singleton. 1999. "Modeling term structure of defaultable bonds." Review of Financial Studies 12, 687–720.
Duffie, D. and K. J. Singleton. 2003. Credit risk: pricing, measurement, and management, Princeton University Press, Princeton, NJ.
Duffie, D., J. Pan, and K. Singleton. 2000. "Transform analysis and asset pricing for affine jump-diffusions." Econometrica 68, 1343–1376.
Eom, Y. H., J. Helwege, and J.-Z. Huang. 2004. "Structural models of corporate bond pricing: an empirical analysis." Review of Financial Studies 17, 499–544.
Ericsson, J. and O. Renault. 2006. "Liquidity and credit risk." Journal of Finance 61, 2219–2250.
Ericsson, J. and J. Reneby. 2005. "Estimating structural bond pricing models." Journal of Business 78, 707–735.
Ericsson, J., K. Jacobs, and R. Oviedo. 2005. The determinants of credit default swap premia, Working paper, McGill University.
Ericsson, J., J. Reneby, and H. Wang. 2006. Can structural models price default risk? Evidence from bond and credit derivative markets, Working paper, McGill University.
Fan, H. and S. M. Sundaresan. 2000. "Debt valuation, renegotiation, and optimal dividend policy." Review of Financial Studies 13, 1057–1099.
Fischer, E. O., R. Heinkel, and J. Zechner. 1989. "Dynamic capital structure choice: theory and tests." Journal of Finance 44, 19–40.
Fons, J. S. 1994. "Using default rates to model the term structure of credit risk." Financial Analysts Journal 50, 25–32.
Francois, P. and E. Morellec. 2004. "Capital structure and asset prices: some effects of bankruptcy procedures." Journal of Business 77, 387–411.
Geske, R. 1977. "The valuation of corporate liabilities as compound options." Journal of Financial and Quantitative Analysis 12, 541–552.
Giesecke, K. 2006. "Default and information." Journal of Economic Dynamics and Control 30, 2281–2303.
Guo, X., R. A. Jarrow, and Y. Zeng. 2008. Credit risk models with incomplete information, Working paper.
Hackbarth, D., J. Miao, and E. Morellec. 2006. "Capital structure, credit risk, and macroeconomic conditions." Journal of Financial Economics 82, 519–550.
Han, S. and H. Zhou. 2008. "Effects of liquidity on the nondefault component of corporate yield spreads: evidence from intraday transactions data." SSRN eLibrary, http://ssrn.com/paper=946802.
He, J., W. Hu, and L. H. P. Lang. 2000. "Credit spread curves and credit ratings." SSRN eLibrary, http://ssrn.com/abstract=224393.
Helwege, J. and C. M. Turner. 1999. "The slope of the credit yield curve for speculative-grade issuers." Journal of Finance 54, 1869–1884.
Hilberink, B. and L. C. G. Rogers. 2002. "Optimal capital structure and endogenous default." Finance and Stochastics 6, 237–263.
Houweling, P. and T. Vorst. 2005. "Pricing default swaps: empirical evidence." Journal of International Money and Finance 24, 1200–1225.
Huang, J.-Z. 1997. Default risk, renegotiation, and the valuation of corporate claims, PhD dissertation, New York University.
Huang, J.-Z. 2005. Affine structural models of corporate bond pricing, Working paper, Penn State University, presented at the 2005 FDIC Conference.
Huang, J.-Z. and M. Huang. 2003. "How much of the corporate-Treasury yield spread is due to credit risk?" SSRN eLibrary, http://ssrn.com/paper=307360.
Huang, J.-Z. and W. Kong. 2003. "Explaining credit spread changes: new evidence from option-adjusted bond indexes." Journal of Derivatives 11, 30–44.
Huang, J.-Z. and W. Kong. 2005. "Macroeconomic news announcements and corporate bond credit spreads." SSRN eLibrary, http://ssrn.com/paper=693341.
Huang, J.-Z. and X. Zhang. 2008. "The slope of credit spread curves." Journal of Fixed Income 18, 56–71.
Huang, J.-Z. and H. Zhou. 2008. "Specification analysis of structural credit risk models." SSRN eLibrary, http://ssrn.com/paper=1105640.
Hull, J., I. Nelken, and A. White. 2004. Merton's model, credit risk, and volatility skews, Working paper, University of Toronto.
Jarrow, R. 2001. "Default parameter estimation using market prices." Financial Analysts Journal 57, 75–92.
Jarrow, R. and S. Turnbull. 1995. "Pricing derivatives on financial securities subject to default risk." Journal of Finance 50, 53–86.
Jarrow, R., D. Lando, and S. Turnbull. 1997. "A Markov model for the term structure of credit spreads." Review of Financial Studies 10, 481–523.
Jones, E. P., S. Mason, and E. Rosenfeld. 1984. "Contingent claims analysis of corporate capital structures: an empirical investigation." Journal of Finance 39, 611–625.
Kealhofer, S. and M. Kurbat. 2001. The default prediction power of the Merton approach, relative to debt ratings and accounting variables, Working paper, Moody's KMV.
Kim, I. J., K. Ramaswamy, and S. Sundaresan. 1993. "Does default risk in coupons affect the valuation of corporate bonds?: a contingent claims model." Financial Management 22, 117–131.
Kou, S. G. and H. Wang. 2003. "First passage time of a jump diffusion process." Advances in Applied Probability 35, 504–531.
Lando, D. 1998. "On Cox processes and credit risky securities." Review of Derivatives Research 2, 99–120.
Lando, D. 2004. Credit risk modeling: theory and applications, Princeton University Press, Princeton, NJ.
Lando, D. and A. Mortensen. 2005. "Revisiting the slope of the credit curve." Journal of Investment Management 3, 6–32.
Leland, H. E. 1994. "Corporate debt value, bond covenants, and optimal capital structure." Journal of Finance 49, 1213–1252.
Leland, H. E. 1998. "Agency costs, risk management, and capital structure." Journal of Finance 53, 1213–1243.
Leland, H. E. 2004. "Predictions of default probabilities in structural models of debt." Journal of Investment Management 2, 5–20.
Leland, H. E. 2006. "Structural models in corporate finance," Princeton University Bendheim Lecture Series in Finance.
Leland, H. E. and K. B. Toft. 1996. "Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads." Journal of Finance 51, 987–1019.
Lesmond, D. A., J. P. Ogden, and C. A. Trzcinka. 1999. "A new estimate of transaction costs." Review of Financial Studies 12, 1113–1141.
Longstaff, F. and E. Schwartz. 1995. "A simple approach to valuing risky fixed and floating rate debt." Journal of Finance 50, 789–820.
Longstaff, F. A., S. Mithal, and E. Neis. 2005. "Corporate yield spreads: default risk or liquidity? New evidence from the credit-default-swap market." Journal of Finance 60, 2213–2253.
Lyden, S. and D. Saraniti. 2000. An empirical examination of the classical theory of corporate security valuation, Working paper, Barclays Global Investors.
Madan, D. and H. Unal. 1998. "Pricing the risks of default." Review of Derivatives Research 2, 120–160.
Mahanti, S., A. J. Nashikkar, and M. G. Subrahmanyam. 2007. "Latent liquidity and corporate bond yield spreads." SSRN eLibrary, http://ssrn.com/paper=966695.
Mahanti, S., A. Nashikkar, M. G. Subrahmanyam, G. Chacko, and G. Mallik. 2008. "Latent liquidity: a new measure of liquidity, with an application to corporate bonds." Journal of Financial Economics 88(2), 272–298.
Mella-Barral, P. and W. Perraudin. 1997. "Strategic debt service." The Journal of Finance 52, 531–556.
Mello, A. S. and J. E. Parsons. 1992. "Measuring the agency cost of debt." Journal of Finance 47, 1887–1904.
Merton, R. C. 1974. "On the pricing of corporate debt: the risk structure of interest rates." Journal of Finance 29, 449–470.
Nielsen, L., J. Saá-Requejo, and P. Santa-Clara. 1993. Default risk and interest rate risk: the term structure of default spreads, Working paper, INSEAD.
Ogden, J. P. 1987. "Determinants of the ratings and yields on corporate bonds: tests of the contingent claims model." Journal of Financial Research 10, 329–339.
Predescu, M. 2005. The performance of structural models of default for firms with liquid CDS spreads, Working paper, Rotman School of Management, University of Toronto.
Sarig, O. and A. Warga. 1989. "Some empirical estimates of the risk structure of interest rates." Journal of Finance 44, 1351–1360.
Saunders, A. and L. Allen. 2002. Credit risk measurement: new approaches to value at risk and other paradigms, Wiley, New York.
Schaefer, S. and I. A. Strebulaev. 2008. "Structural models of credit risk are useful: evidence from hedge ratios on corporate bonds." Journal of Financial Economics 90(1), 1–19.
Schönbucher, P. J. 2003. Credit derivatives pricing models: models, pricing and implementation, Wiley, New York.
Shimko, D. C., N. Tejima, and D. Van Deventer. 1993. "The pricing of risky debt when interest rates are stochastic." The Journal of Fixed Income 3, 58–65.
Tang, D. Y. and H. Yan. 2006. "Macroeconomic conditions, firm characteristics, and credit spreads." Journal of Financial Services Research 29, 177–210.
Tang, D. Y. and H. Yan. 2007. Liquidity and credit default swap spreads, Working paper.
Tauren, M. P. 1999. "A model of corporate bond prices with dynamic capital structure." SSRN eLibrary, http://ssrn.com/paper=154848.
Wei, D. G. and D. Guo. 1997. "Pricing risky debt: an empirical comparison of Longstaff and Schwartz and Merton models." Journal of Fixed Income 7, 8–28.
Zhang, B. Y., H. Zhou, and H. Zhu. in press. "Explaining credit default swap spreads with equity volatility and jump risks of individual firms." Review of Financial Studies.
Zhou, C. 2001. "The term structure of credit spreads with jump risk." Journal of Banking and Finance 25, 2015–2040.
Chapter 45
An Empirical Investigation of the Rationales for Integrated Risk-Management Behavior
Michael S. Pagano
Villanova School of Business, Villanova University, 800 Lancaster Avenue, Villanova, PA 19382, USA; e-mail: [email protected]
Abstract We develop a comprehensive empirical specification that treats risk-management and risk-taking as integrated facets of a financial intermediary's risk profile. Three main results emerge from a sample of 518 U.S. bank holding companies during 1991–2000: (1) the corporate risk-management theories most consistently supported are those related to financial distress costs and debt holder-related agency costs (with weaker support for the rationales related to managerial contracting costs, firm size, and hedge substitutes); (2) the asymmetric information theory for managing risk is not supported by our sample; and (3) a conventional linear model of risk-management adequately explains cross-sectional and time-series variation in the sample. The model's findings are robust to alternate definitions of the independent variables, major changes in bank regulation, firm-specific fixed effects, nonlinearities and interactions between the independent variables, as well as firm-specific controls for other key risks related to credit quality and operating efficiency.

Keywords Corporate finance · Risk-management · Banks · Empirical analysis
45.1 Introduction

The importance of risk-management and risk-taking behavior and its impact on firm value has increased greatly over the past decade as new derivative securities have proliferated and gained acceptance with financial institutions such as commercial banks. In fact, Saunders (1997) suggests that risk-management is the primary business of financial institutions. This paper reviews the rationales for corporate risk-taking and bank risk-management activities and develops several testable hypotheses from these theories in an integrated empirical framework. This framework synthesizes
key theoretical work in both the banking and corporate finance literatures. The tests are conducted on a sample of 518 publicly traded bank holding companies (BHCs) in the U.S. during 1991–2000. Although there is a growing list of empirical papers attempting to test the various risk-management theories described below, no study has developed an integrated, comprehensive test for financial service companies such as commercial banks. We view the banking firm as a collection of risks that must be managed carefully in order to maximize shareholder wealth.1 Thus, the bank managers' key risk-management function is to decide which risks to accept, increase, transfer, or avoid. In this context, risk-taking and risk-management are two sides of the same risk-return tradeoff related to investing in risky assets. That is, the literature on bank risk-taking focuses on the risks banks choose to accept or increase while ignoring the possibility that these actions also imply the transference or avoidance of other bank-related risks. Likewise, the literature on corporate risk-management typically focuses on the risks a firm chooses to transfer or avoid and ignores the implicit effects of such choices on the risks that are retained or increased. Our paper addresses these two perspectives holistically with a single, comprehensive empirical structure that treats risk-management and risk-taking as integrated facets of a bank's risk profile.2

The theoretical hedging literature of the past 20 years has demonstrated how widely held large corporations may hedge in order to: (1) reduce the expected costs of financial distress, (2) decrease other costs such as those associated with debt-related agency problems and asymmetric information, and (3) increase the expected level of after-tax cash flows (e.g., via a reduction in taxes as well as through lower transaction and contracting costs).3 The key insight of this literature is that hedging at the corporate level (rather than by individual investors) can increase the firm's market value by reducing the expected costs associated with the market imperfections noted above. Thus, corporate hedging activity can affect the level and riskiness of the firm's market value and, in particular, the market value of the firm's equity. In this context, we define "risk-management" broadly as any activity under management's control (both on- and off-balance sheet) that alters the expected return and volatility of the firm's market value of equity.4

We use the Flannery and James (1984) two-factor return-generating model as the theoretical and empirical foundation for our analysis. As is well known, a model's parameter estimates can be biased when important explanatory variables are omitted from the specification. This can create inference problems when, as in our case, there are multiple potentially competing theories. Our tests attempt to avoid this omitted-variables problem by incorporating proxy variables for all extant theories of corporate hedging. This approach also leads to an increase in the explanatory power of the model when compared with conventional tests found in the literature. In addition to the use of a comprehensive set of explanatory variables, this study explores how well corporate risk-management rationales can explain risk-taking and hedging activities in the commercial banking industry rather than in non-financial companies. Given the importance of risk-taking and risk-management in banking, this paper sheds light on which hedging rationales and interest rate risk exposures are motivating banks' decisions towards managing total risk. We examine interest rate risk as well as total bank risk because interest rate risk is a risk that most banks can easily hedge if they so desire. Thus, of the many risks a bank faces, interest rate risk is one over which senior management has the most discretion, and it therefore provides an interesting setting in which to test various corporate risk-management theories. Further, our tests examine the possibility of inter-relations and nonlinearities between various explanatory variables such as credit, operating, and liquidity risks by specifying a quadratic model and then comparing the results of this specification with those of a conventional linear model. We find that a simple linear model is sufficient for explaining the cross-sectional and time-series variation of the sensitivities of bank stock returns to an overall risk measure (i.e., the standard deviation of monthly stock returns) as well as an interest rate risk measure (defined as the interest rate parameter from a Flannery and James-type regression). Thus, nonlinearities and interactions between the model's explanatory variables do not provide a significantly more descriptive picture of the key variables affecting bank risk-taking and risk-management decisions in our sample.

The empirical tests presented here extend the earlier work on bank risk-taking behavior of Saunders et al. (1990), Gorton and Rosen (1995), Schrand and Unal (1998), and Spong and Sullivan (1998) as well as recent research on general corporate risk-management activities by Graham and Rogers (2002), Tufano (1996, 1998a), Berkman and Bradbury (1996), and Mian (1996), among others. Empirical research based on samples of non-financial companies has found mixed support for the above rationales. The lack of a strong consensus within this line of research appears to be due to difficulties in defining an appropriate dependent variable and specifying a comprehensive set of explanatory variables.5 Tests in the banking literature have typically focused on a particular theory (or subset of theories in banking) and attempt to explain risk-taking rather than risk-management behavior. For example, Saunders et al. (1990) focus on a bank's ownership structure and its impact on bank riskiness (measured in several different ways using bank stock returns). This approach directly tests the contracting cost theory of hedging (described later in Sect. 45.2). However, the test does not directly address several other related, or competing, hedging theories. Schrand and Unal (1998) is another example of this focused approach since its primary emphasis is on the risk-taking behavior of thrifts with respect to credit and interest rate risk. The paper finds thrifts substitute "compensated" risks (e.g., risks that earn a positive economic rent) for "hedgeable" risks. However, the paper does not explore the impact of all theoretically relevant factors influencing the thrift's choice between credit and interest rate risk.6

Empirical tests of our generalized model of interest rate risk-management behavior yield several insights. When we examine both the total risk and interest rate risk exposures of a bank holding company, we find strong support for hedging theories related to financial distress costs (i.e., banks facing higher financial distress costs are more likely to hedge their interest rate risk and yet face greater overall risk). We also find support for the effects of debtholder-related agency costs (with weaker support for the managerial contracting costs, firm size, and hedge substitutes rationales), while the asymmetric information rationale for managing risk is not supported by our sample. Our estimated effect of financial distress on bank risk-management activity during 1991–2000 is consistent with evidence found in Haushalter (2000) for 100 oil and gas producers during 1992–1994 and Graham and Rogers' (2002) results for 158 non-financial companies during 1994–1995. Further, banks facing greater potential debtholder-related agency conflicts (e.g., due to high financial leverage) are also more likely to have smaller interest rate risk exposures. Note that the financial distress costs and agency problems a bank faces can be inter-related and, interestingly, we find strong empirical support for each of these theories even when such inter-relationships (as well as nonlinearities) are explicitly modeled. In addition, Lin and Smith (2007) study non-financial companies during 1992–1996 and find significant inter-relationships between hedging, financing, and investment decisions. This result is consistent with our findings because the underlying factors affecting corporate hedging, financing, and investment are ultimately forces such as financial distress costs, agency problems, asymmetric information, and so forth. Overall, our findings are robust to alternate definitions of the independent variables, major changes in bank regulation, firm-specific fixed effects, nonlinearities and interactions between the independent variables, as well as firm-specific controls for other key risks related to credit quality and operating efficiency.

The paper is organized as follows. Section 45.2 briefly reviews the relevant theoretical and empirical literature in the areas of corporate finance and banking and presents several testable hypotheses. Section 45.3 describes the sample data, methodology, and proposed tests, while Sect. 45.4 discusses the empirical results of these tests. Section 45.5 provides some concluding remarks.

1 There is some discussion in the literature regarding hedging cash flows versus hedging market values of existing and anticipated risk exposures. Our view is that hedging particular exposures of either cash flows or market values will lead to meaningful impacts on shareholder wealth. Thus, we focus on risk-management's effects on shareholder value and do not make distinctions between cash flow- and market value-hedging since both forms of risk-management can equally affect total shareholder wealth.
2 See Pagano (1999, 2001) for reviews of the theoretical and empirical literature related to risk-management and risk-taking activities of U.S. commercial banks. Also, see Stulz (1996) and Graham and Rogers (2002) for a discussion of the concepts of a firm's net risk exposures, integrated risk-management, and their effects on shareholder value.
3 For reviews of the effects of these market imperfections on hedging, see Cummins et al. (1998), Tufano (1996), Culp et al. (1994), Smith and Stulz (1985), and Shapiro and Titman (1985).
4 For example, one can assume that two similar banks (A and B) both face an increase in demand for fixed rate loans and that both meet this demand by supplying these loans, yet they choose to manage the resulting increase in interest rate risk exposure differently. Suppose that bank A immediately sells the fixed rate loans in the secondary market while bank B holds the loans on its balance sheet and then uses derivatives to completely hedge the interest rate risk. The sensitivity to interest rate risk in economic terms is essentially the same for both banks, although they have chosen different ways to manage this risk. In our model, we focus on the economic significance of cross-sectional and time-varying differences in the interest rate risk (and total risk) exposures of U.S. commercial banks, and we do not make distinctions between whether the bank used on- or off-balance sheet transactions to manage these exposures. In this sense, our view of risk-management is broader than what is typically defined as hedging in the existing literature (i.e., the explicit use of derivatives). See Stulz (1996) and Meulbroek (2002) for a detailed discussion of a broader definition of risk-management, which is consistent with our perspective.
5 For example, Nance et al. (1993), Mian (1996), and Wall and Pringle (1989) employ binary dependent variables and lack a consistent test of all competing hedging theories. The use of a binary variable is problematic because it does not fully describe the extent of a firm's hedging activity. When a binary dependent variable is employed, a firm that hedges 1% or 100% of its risk exposure is treated the same in the model. Tufano (1996) attempts to test most (but not all) of the major hedging rationales and uses a continuous variable, an estimated hedge ratio. However, his sample covers a limited number of companies in a relatively small sector of the economy, the North American gold mining industry. Tufano (1996, 1998a) reports empirical results for 48 publicly traded gold mining firms in North America during 1991–1993. Graham and Rogers (2002) focus on a cross-section of non-financial companies and find that companies typically hedge to increase the tax benefits related to increased debt capacity and, to a lesser extent, to reduce financial distress costs and take advantage of economies of scale.
6 Additional examples in the banking literature of these focused empirical tests include Galloway et al. (1997) and Cebenoyan et al. (1999) concerning the effect of bank regulation on risk-taking as well as Gorton and Rosen (1995) on the potential agency and contracting problems related to bank riskiness.
45.2 Theories of Risk-Management, Previous Research, and Testable Hypotheses

45.2.1 Brief Review of Main Theories of Corporate Hedging

Until the early 1980s, the implications from theories such as the capital asset pricing model (CAPM) and Modigliani and Miller's (1958) Proposition I seriously challenged the usefulness of the early theories of hedging. However, recent research has argued that imperfect markets can explain why large, diversified firms actively engage in hedging activities. Large corporations will hedge in order to reduce the variance of cash flow for the following primary reasons: (1) to reduce the expected costs of financial distress or, more generally, the costs of external financing (e.g., see Smith and Stulz 1985; Shapiro and Titman 1985; Froot et al. 1993; and Copeland and Copeland 1999); (2) to decrease other costs such as those associated with agency problems and asymmetric information (e.g., Stulz 1990; Campbell and Kracaw 1990; and DeMarzo and Duffie 1995); and (3) to increase the expected level of after-tax cash flows via a reduction in taxes as well as lower transaction and contracting costs (e.g., Stulz 1984; Smith and Stulz 1985; Shapiro and Titman 1985; Nance et al. 1993). In addition, the corporate hedging decision will be influenced by the availability of "hedging substitutes" such as large cash balances. By relaxing the assumption of perfect capital markets, corporate finance theory has suggested several ways large, widely held corporations can increase their market value by engaging in risk-management activities.7

7 Note that a related strand of the accounting literature has, rather than using shareholder wealth maximization as the firm's objective function, focused on analyzing hedging decisions based on managing accounting earnings and book value-based regulatory capital requirements (e.g., see Wolf 2000). Instead, we follow the logic of developing tests where the effect of a bank's net interest rate risk exposure is measured in terms of its impact on the firm's stock returns. This is a well-accepted, objective approach that is commonly used in the corporate risk-management literature and therefore we readily adopt it here.

45.2.2 Review of Related Banking Theory

In addition to the empirical model of Flannery and James (1984) noted in the Introduction (and described in more detail in the following subsection), another aspect of the banking literature relevant to our analysis pertains to the theoretical results of Diamond (1984) concerning financial intermediation. In that paper, financial intermediaries can capture the value associated with resolving information asymmetries between borrowers and lenders. In addition, this view of a financial intermediary's delegated monitoring function implies the intermediary should hedge all systematic risk (e.g., risk associated with interest rate fluctuations) to minimize financial distress costs. Therefore, in theory, banks should not have a material exposure to systematic, hedgeable factors such as interest rate risk. Diamond's results imply the optimal "hedge ratio" for a financial intermediary's systematic risk such as interest rate risk is 100%. We can use this result to develop an empirical benchmark for risk-management activity by a BHC. For example, according to Diamond's line of reasoning, a BHC's return on its investment should have zero sensitivity to interest rate fluctuations if it is hedging optimally. In Diamond's model, the bank's liabilities can therefore be made asymptotically risk-free by combining the use of hedging and diversification. Thus, statistically significant non-zero sensitivities to interest rate movements would suggest less hedging compared to firms with zero sensitivities (after controlling for other relevant factors). It should be noted, however, that Froot and Stein (1998) argue against the optimality of a 100% hedge ratio when some risks are not tradable in the capital markets. Their theory of integrated risk-management suggests that non-zero risk exposures to market-wide factors such as interest rates can be optimal when the only means of hedging "non-tradable" risks is via holding costly bank capital and the bank faces convex external financing costs (i.e., financial distress costs are economically significant). In addition, Stein (1998) demonstrates that adverse selection problems can significantly affect a bank's asset/liability management activities when there are sizable information asymmetries associated with the quality of a bank's assets. Interestingly, these ideas from the banking literature are consistent with the financial distress and asymmetric information rationales of risk-management found in the corporate finance literature. Since a bank's maturity intermediation decision is a classic example of Niehans' (1978) definition of "qualitative asset transformation" by a financial intermediary, Deshmukh et al. (1983) provides a theory that links interest rates to key financial intermediation services, which, in turn, directly affect the cash flows and stock returns of a commercial bank. For example, Deshmukh et al. (1983) demonstrates that the level and uncertainty of long-term interest rates relative to short-term rates is an important determinant of the amount of interest rate risk a financial intermediary will assume. Consequently, the above banking literature provides theoretical support for our use of a long-term U.S. Treasury interest rate as the key interest rate factor in our empirical estimation of the two-factor Flannery and James-type model of bank stock returns presented in the following subsection.8 In addition to the above issues, banking research such as Saunders et al. (1990) and Brewer et al. (1996) reveals that banks face multiple risks associated with interest rates, credit quality, liquidity, and operating expenses. Further, it is possible these four key risks are inter-related and therefore need to be controlled for in our model. This realization, along with the results found in Flannery and James (1984) and Diamond (1984), provides us with a theoretical framework to construct a conventional linear model as well as a full quadratic empirical specification that includes linear, squared, and interaction terms for the relevant explanatory variables. This quadratic model approach can then be compared to the conventional linear specification in order to verify empirically whether or not these nonlinearities and interaction terms are relevant in a statistical sense.

8 As we will discuss later, we also test a short-term interest rate, the three-month U.S. T-bill rate, and find that the long-term rate is statistically more significant than this short-term rate.
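The comparison between the conventional linear specification and the full quadratic specification described above amounts to a nested-model F-test. The sketch below shows the mechanics on synthetic data for three hypothetical risk proxies; it is an illustration of the testing logic, not the paper's actual estimation.

```python
# Nested-model F-test: do squared and interaction terms of the risk proxies
# add explanatory power beyond the linear specification? Data are synthetic.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 400
credit, oper, liq = rng.normal(size=(3, n))          # hypothetical risk proxies
irr = 0.5 + 0.3 * credit - 0.2 * oper + 0.1 * liq + 0.3 * rng.normal(size=n)

X_lin = sm.add_constant(np.column_stack([credit, oper, liq]))
quad = np.column_stack([credit**2, oper**2, liq**2,
                        credit * oper, credit * liq, oper * liq])
X_quad = np.column_stack([X_lin, quad])

m_lin = sm.OLS(irr, X_lin).fit()
m_quad = sm.OLS(irr, X_quad).fit()
f_stat, p_value, df_diff = m_quad.compare_f_test(m_lin)
print(p_value)  # a large p-value means the quadratic terms add no explanatory power
```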
45.2.3 Relation to Previous Empirical Research

Recent empirical papers have attempted to test the theories presented above. The factors displayed below in Equation (45.1) are based on those forces that researchers in this area have most frequently cited as key influences on corporate hedging. The decision to hedge or not hedge with derivatives is a function of the six corporate finance-related rationales for hedging: taxes, τ(V); financial distress costs, BC; size-related and managerial contracting costs, CC; debt-related agency costs, AC; asymmetric information problems, AI; and alternative financial policies such as hedge substitutes, AFP.9 These relationships have been expressed in the format of a stylized function in empirical research as follows:

Hedge = f(τ(V), BC, CC, AC, AI, AFP)    (45.1)

where Hedge is either a continuous variable such as an estimated "hedge ratio" or a binary variable that takes on a value of 1 if the firm hedges with derivatives and 0 if the firm does not hedge with these instruments. Examples of empirical tests based on Equation (45.1) can be found in Nance et al. (1993), Mian (1996), Wall and Pringle (1989), Berkman and Bradbury (1996), Tufano (1996), Howton and Perfect (1998), Haushalter (2000), and Allayannis and Weston (2001). The authors find mixed support for several hedging rationales primarily related to taxes, financial distress costs, managerial contracting, debt-related agency costs, and hedge substitutes.10 Schrand and Unal (1998) take a different approach and examine the role of risk-management in terms of allocating rather than reducing risk. The authors suggest thrift institutions substitute risks that can earn a positive economic rent (e.g., credit risk) for "hedgeable" risks such as interest rate risk. They report a statistically significant shift from interest rate risk to credit risk in a sample of thrifts that converted from mutual to stock charters during 1984–1988. Cebenoyan et al. (1999) also examine thrift risk-taking and find that unprofitable risk-taking activities increase during periods of relatively lax regulation and low charter values. This result is similar to the findings reported in Galloway et al. (1997) for commercial banks. Another strand of the empirical literature on risk-management pertains to surveys of corporate treasury departments regarding derivatives usage.11 As summarized in Bodnar et al. (1996), the Wharton survey results suggest there are substantial participation or "set-up" costs related to derivatives usage and hedging that may not be economically justifiable for many firms. These findings are consistent with the contracting cost theory of hedging related to potential economies of scale available in a risk-management program.

9 Hedge substitutes in our case can include bank-related on-balance sheet activities such as altering the firm's asset and liability mix in order to reduce risk. Compared to conventional manufacturing and service companies, commercial banks have greater flexibility in employing these on-balance sheet techniques.
10 It should be noted that this strand of the literature treats the variables on the right-hand side of Equation (45.1) as exogenous and independent of one another when, in reality, most, if not all, of these factors may be inter-dependent. As will be seen in Sect. 45.3.5, our tests differ from the extant literature by allowing for potential interactions between the right-hand-side variables found in Equation (45.1).
11 Recent empirical studies by Geczy et al. (1997) and Schrand (1997) have also provided more formalized tests of the factors describing corporate derivatives usage. Since these papers focus exclusively on derivatives and do not directly investigate the hedging theories described in this paper, we do not summarize their findings here.
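A skeletal version of the tests built on Equation (45.1), when Hedge is a binary variable, is a logit regression of the hedge/no-hedge indicator on proxies for the six rationales. The sketch below uses synthetic data and hypothetical proxy names purely to show the structure of such a test; actual studies use firm-level proxies such as effective tax rates, leverage, and market-to-book ratios.

```python
# Binary-choice test of Equation (45.1): logit of a hedging indicator on
# proxies for the six hedging rationales. All data here are synthetic.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.normal(size=n),  # tax convexity proxy, tau(V)
    rng.normal(size=n),  # financial distress cost proxy, BC
    rng.normal(size=n),  # contracting cost / size proxy, CC
    rng.normal(size=n),  # debt-related agency cost proxy, AC
    rng.normal(size=n),  # asymmetric information proxy, AI
    rng.normal(size=n),  # hedge substitutes proxy, AFP
])
latent = X @ np.array([0.3, 0.5, 0.2, 0.4, 0.1, -0.3])
hedge = (latent + rng.logistic(size=n) > 0).astype(int)  # synthetic hedging choice

fit = sm.Logit(hedge, sm.add_constant(X)).fit(disp=0)
print(fit.summary())
```

As noted in footnote 10 above, such a binary specification cannot capture the extent of hedging, which is one motivation for the continuous risk measures developed next.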
Based on the above results and the findings reported in Saunders et al. (1990), we can use the standard deviation of the BHC's monthly stock returns as a measure of total risk (denoted here as TOTRISK). That is, if the firm's goal is to maximize shareholder value, then a bank's risk-taking and risk-management activities should ultimately influence the volatility of its common equity, which we have defined as TOTRISK. In addition, we posit a formula for determining a financial institution's sensitivity to interest rate risk via a Flannery and James-type model. In this way, we can examine how corporate risk-management factors influence both the bank's overall riskiness as well as its management of a key market-based risk (i.e., interest rate risk). The Flannery and James (1984) model is a good choice for estimating this latter risk exposure because it explicitly accounts for the effect of interest rate risk on the firm's stock returns. The relation incorporates risk premiums associated with the bank's exposure to market-wide risk factors such as the return on a market portfolio proxy and the relative change in a long-term interest rate. This relation can be summarized as follows:

r_{N,t} = β_{0,N} + β_{M,N} r_{m,t} + β_{I,N} ir_t + ε_{N,t}    (45.2)

where
r_{N,t} = stock return on the N-th BHC during month t,
r_{m,t} = total return on a value-weighted CRSP stock market index (S&P 500) during month t,
ir_t = the change in the yield on the constant-maturity 10-year U.S. Treasury note during month t, scaled by 1 plus the prior month's 10-year Treasury note yield,12
ε_{N,t} = a disturbance term with assumed zero mean and normal distribution,
β_{k,N} = parameter estimates of the model for the N-th BHC. In particular, β_{I,N} is in principle the "interest rate beta" used originally by Flannery and James (1984) to estimate a BHC's interest rate sensitivity.

With the estimates of β_{k,N} generated for each BHC during each year,13 we can then use these estimates as the dependent variable in a second-stage set of regressions. Equation (45.2) states that the required stock return for a BHC should be related to the investment's riskiness relative to the market portfolio (the first term on the right-hand side of the equation) and the bank's existing net interest rate risk exposure (the second term on the right-hand side of the equation). If β_{I,N} = 0, then we can interpret Equation (45.2) as a conventional one-factor market model. Note that the β_{I,N} parameter is an estimate of a bank's interest rate risk net of any interest rate hedging activities the bank has engaged in.14 However, Froot and Stein (1998), among others, argue that β_{I,N} will not necessarily be zero due to the presence of financial distress costs and other market imperfections. Equation (45.2) provides us with a theoretical framework to test the various incentives for risk-management via its straightforward decomposition of risk into equity market-related and interest rate-related risks. For our tests, we can estimate the relative risk parameters for the interest rate risk factor in Equation (45.2) for each bank and each year using monthly or daily data and then use the absolute value of these firm-specific parameter estimates in a panel data set to analyze interest rate risk-taking behavior at U.S. BHCs.15 Similar to Graham and Rogers' (2002) usage of absolute values of net derivatives positions, we use the absolute value of β_{I,N} as the dependent variable in our second-stage set of regressions because our review of the relevant literature suggests that banks with higher interest rate risk (and lower hedging activity, ceteris paribus) have interest rate betas that are either large positive or negative values (depending on whether the bank has chosen to be exposed to rising or falling interest rates). The absolute value of β_{I,N} (referred to as IRR in the following sections) captures these divergent interest rate risk exposures by reporting large values of |β_{I,N}| for banks with large positive or negative exposures, while firms with smaller exposures have values of |β_{I,N}| closer to zero. Note that we do not use the notional values of derivatives positions as a dependent variable in our model because, as Smith (1995) shows, two firms with the same notional value can have completely different hedging strategies. In order to specify the details of a second-stage regression that uses IRR as its dependent variable, we must first define our hypotheses related to this regression based on the current corporate hedging and banking literature.16 The following subsection presents nine hypotheses related to this literature.

12 The interest rate factor was computed in two ways: (1) simply as it is defined above using the ten-year interest rate (i.e., not orthogonalized) and (2) as an orthogonalized interest rate, where (per Unal and Kane 1988) the interest rate factor is regressed on the equity market portfolio proxy and the resulting residuals from this regression are used as the second factor in Equation (45.2) rather than the simple non-orthogonalized version of the interest rate factor. We find that the empirical results are qualitatively the same whether the interest rate factor is orthogonalized or non-orthogonalized. Using Occam's razor, we thus choose to report the simplest estimation method in our Empirical Results section (i.e., we report results based on the non-orthogonalized interest rate factor since it is simpler to compute and yields the same main results as the more complicated orthogonalization procedure). The results are essentially the same when either procedure is used because the simple correlation between the market portfolio proxy and the interest rate factor is quite low, and thus the two factors in Equation (45.2) are essentially already orthogonal before any orthogonalization technique is applied. Also, we find that using simple changes in the three-month T-bill rate provides statistically weaker but qualitatively similar results to those obtained for the ten-year interest rate factor. Thus, we focus our subsequent discussion on the empirical results related to the ten-year Treasury note rather than results based on the three-month T-bill rate.
13 This is done to allow the interest rate beta to vary over time as well as cross-sectionally. Note that our model is similar to Graham and Rogers' (2002) approach for estimating interest rate risk except that we use stock returns as the dependent variable (versus operating income) because we are interested in studying risk-management's effects on shareholder value rather than relying on indirect measures of market value such as accounting variables drawn from the firm's income statement.
14 If equity markets are informationally efficient, then we can interpret this interest rate beta (referred to here as IBETA in its raw form and IRR in its absolute value form) as a summary measure of the sensitivity of the bank's market value of common equity to changes in interest rates. This measure of interest rate risk is consistent with the corporate risk-management definitions of interest rate risk, hedging, and its net impact on a firm's market value of equity.
15 We find that our empirical beta estimates are similar when either monthly or daily data are used, albeit the estimates with the monthly data provide more explanatory power in terms of adjusted R² statistics. Thus, to conserve space, we report results from estimates computed using monthly return data.
16 To mitigate the potential downward bias in parameter estimates caused by using an estimated interest rate risk parameter as a dependent variable, we use a generalized method of moments (GMM) estimator. This nonlinear instrumental variables estimation technique explicitly accounts for the errors-in-variables problem that can exist when the dependent variable is estimated via a first-stage set of regressions (see Greene 1993 and Ferson and Harvey 1991 for more details on this econometric issue).
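The first stage of this procedure can be summarized in a short sketch: construct the scaled interest rate factor, estimate Equation (45.2) by OLS for a given bank-year, and record β_{I,N}, IRR = |β_{I,N}|, and TOTRISK. The arrays below are placeholder data standing in for CRSP returns and 10-year Treasury yields; the second-stage GMM regressions described in footnote 16 are not shown.

```python
# First-stage estimation of Equation (45.2) for one bank-year.
import numpy as np

def interest_rate_factor(yields):
    """ir_t = (y_t - y_{t-1}) / (1 + y_{t-1}), with yields in decimal form."""
    y = np.asarray(yields)
    return (y[1:] - y[:-1]) / (1.0 + y[:-1])

def first_stage(r_bank, r_market, ir):
    """OLS of Equation (45.2); returns (IBETA, IRR, TOTRISK) for the bank-year."""
    X = np.column_stack([np.ones_like(r_market), r_market, ir])
    beta = np.linalg.lstsq(X, r_bank, rcond=None)[0]
    return beta[2], abs(beta[2]), np.std(r_bank, ddof=1)

# Twelve months of made-up data for one BHC.
rng = np.random.default_rng(1)
yields = 0.05 + 0.002 * rng.standard_normal(13)   # 13 monthly 10-year yield levels
ir = interest_rate_factor(yields)                 # 12 monthly factor values
r_market = 0.01 + 0.04 * rng.standard_normal(12)
r_bank = 0.005 + 0.9 * r_market - 0.5 * ir + 0.02 * rng.standard_normal(12)

ibeta, irr, totrisk = first_stage(r_bank, r_market, ir)
print(ibeta, irr, totrisk)
```

An orthogonalized variant (per footnote 12) would first regress ir on r_market and use the residuals as the second factor; with a low correlation between the two factors, the estimates are essentially unchanged.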
45.2.4 Hypotheses

Our review of the corporate hedging and banking literature provides us with an important set of implications about bank risk-taking and risk-management behavior that we now formalize by establishing a set of empirically testable hypotheses. Following each hypothesis is a brief description of the primary empirical variables employed in the analysis. The positive or negative sign in parentheses following each variable denotes the variable's expected relationship with the BHC's relative interest rate risk measure, IRR. A more detailed description of the variables used in our analysis is also contained within Table 45.1.17 In addition, a discussion of TOTRISK's relation to our model's independent variables (along with relevant predicted signs) is provided below only when the variables' effects are different than the ones discussed for the IRR variable.18

H1. Tax hypothesis: A BHC will hedge more (and therefore assume less interest rate risk, ceteris paribus) when it faces a convex, or more progressive, tax schedule since
16 To mitigate the potential downward bias in parameter results caused by using an estimated interest rate risk parameter as a dependent variable, we use a generalized method of moments (GMM) estimator. This nonlinear instrumental variables estimation technique explicitly accounts for the errors-in-variables problem that can exist when the dependent variable is estimated via a first-stage set of regressions (see Greene 1993 and Ferson and Harvey 1991 for more details on this econometric issue).
17 Table 45.1 also contains a description of alternate explanatory variables that are used for robustness testing in a specification referred to as our "Alternative Model" of interest rate risk. This alternative specification is described in detail in the Empirical Results subsection.
18 Note that only hypotheses H2, H7, and H9 presented below might exhibit divergent effects for TOTRISK relative to IRR.
greater convexity implies larger potential reductions in expected tax liabilities. A lower marginal tax rate or little reported net income would imply greater tax convexity.
Since higher tax rates are related to a less convex tax schedule, a higher level of TAX corresponds to a lower incentive to hedge and more interest rate risk, ceteris paribus.
Table 45.1 Description of potential factors affecting risk-management decisions

Panel A. Summary of explanatory variables and their definitions (data sources^a in parentheses)

Tax variables
TAX: Annual average effective tax rate (Y9-C)
TAXDUM: Dummy variable equal to 1 when the firm's net income was above $100,000 (Y9-C)

Financial distress variables
PBANK: Variance of quarterly net income during the relevant year/(beginning Equity Capital + Mean of Quarterly Net Income during the relevant year)^2 (Y9-C)
PBANKDUM: Dummy variable equal to 1 when the firm's value for PBANK is greater than 0.01 (i.e., greater than 1%)^b (Y9-C)

Contracting cost variables
OPTGRANT: Number of common shares underlying stock options granted to senior officers/Total number of shares outstanding (Proxy Statements/SNL Securities/Compact Disclosure)
PBONUS: Percentage of senior officers' total cash compensation that is received in the form of bonuses (Proxy Statements/SNL Securities/Compact Disclosure)
MGMTOWN: Number of common shares held by officers and directors/Total number of shares outstanding (Proxy Statements/SNL Securities/Compact Disclosure)
MGMTDUM: Dummy variable equal to 1 when MGMTOWN is greater than the sample mean of 16% (Proxy Statements/SNL Securities/Compact Disclosure)
INST: Percentage of shares outstanding held by un-affiliated institutional investors (SNL Securities/Compact Disclosure)
EQBLOCK: Percentage of common shares owned by large blockholders (i.e., external investors with ownership > 5%) (Proxy Statements/SNL Securities/Compact Disclosure)
SIZE: Book Value of Total Assets (TA) (Y9-C)
REVENUE: Total annual interest and non-interest income earned (Y9-C)

Agency cost variables
LEVERAGE: TA/Market Value (MV) of Equity Capital (CRSP; Y9-C)
LEVDUM: Dummy variable equal to 1 when TA/Book Value of Equity Capital is greater than the sample mean of 12.3^c (Y9-C)

Asymmetric information
ASYMINFO: Number of security analysts publishing annual earnings forecasts for the BHC (Compact Disclosure)

Hedging substitutes
LIQUIDITY: (Cash + Investment Securities)/MV of Equity Capital (Y9-C; CRSP)
CASH/BVE: Cash/Book Value of Equity Capital^d (Y9-C)

Market-related variables
IR_t: Change in the end-of-month yield of the 10-year U.S. Treasury note scaled by 1.0 + the 10-year T-note's yield lagged 1 month (Federal Reserve)
R_{M,t}: Monthly total return on the CRSP value-weighted stock index (CRSP)

Control and miscellaneous variables
OPERISK: Non-interest Operating Expense/Total Revenue (Y9-C)
BADLOANS: (Non-Performing Loans + Loan Chargeoffs)/MV of Equity Capital (Y9-C; CRSP)
NONPERF/TA: (Non-Performing Loans + Loan Chargeoffs)/TA (Y9-C)

Panel B. Summary of dependent variables and their definitions
R_{N,t}: Monthly total return on the N-th BHC's common stock (CRSP)
IRR_{N,y}: Absolute value of the interest rate risk parameter estimated via a two-factor model for the N-th BHC during the y-th year (CRSP)

Panel A contains definitions of the explanatory variables employed in the empirical analysis for our Primary and Alternative Models. The panel also reports the data sources used to obtain the relevant data. Panel B provides the definitions and data sources for the empirical model's dependent variables.
a Data sources: Y9-C Call Report: detailed financial statement data reported to the relevant bank regulatory authorities on a quarterly basis. CRSP: the Center for Research in Securities Prices database of stock returns. Compact Disclosure: a database containing financial statement, management compensation, and equity ownership information. SNL Securities: compensation data from a publication titled SNL Executive Compensation Review (years 1991-2000).
b We also estimated the Alternative Model using a threshold value for PBANKDUM equal to the sample mean of 0.0053 and found essentially the same results as those reported in Table 45.3. We report our results using 0.01 as the threshold value for PBANKDUM because it provides a slightly better fit than the sample mean value of 0.0053 and also because 0.01 is a relatively simple, intuitive breakpoint that bank regulators would probably be interested in using for their monitoring activities.
c Note that we use the book value of equity capital (rather than the market value) in order to develop a dummy variable that is not too closely related to the LEVERAGE variable. However, we find that using a dummy variable based on LEVERAGE provides results that are consistent with those reported here using our book value-based LEVDUM variable.
d Note that this is a more narrowly defined liquidity measure that is based solely on book values.
Consequently, we expect a positive relation to exist between TAX and a bank's interest rate risk exposure.
TAX (+): This tax influence can be tested via the average effective tax rate of the BHC. This variable captures the convexity of the BHC's tax schedule because a very low effective tax rate suggests the firm is facing the most convex portion of the U.S. tax code.19

H2. Financial distress costs hypothesis: According to the hedging theories noted earlier, a BHC will hedge more when its costs related to financial distress are expected to be high in order to lower this expected cost as much as possible.
PBANK (-): Since it is not feasible to observe a BHC's expected financial distress costs directly, we use a variable that proxies for one key component of this expected cost: an estimate of the likelihood of experiencing financial distress. We expect this proxy variable to be highly, and positively, correlated with the bank's true, unobservable expected financial distress costs. An estimate of the probability

19 Including a dummy variable (denoted TAXDUM) that equals 1 when the firm's annual net income is above $100,000 provides an alternative way to test this tax influence. This TAXDUM measure of the tax-based incentive to hedge is used as a robustness check in our "Alternative Model" and is similar in spirit to the factors used to simulate the tax-related benefits of corporate hedging in Graham and Smith (1999). This dummy variable attempts to capture the convexity of the BHC's tax schedule because it is at this low level of net income that the U.S. tax code is most convex. Ideally, a variable measuring tax loss carryforwards and carrybacks would be useful in measuring a firm's tax convexity. However, these data are not readily available for commercial banks.
of financial distress first suggested by Blair and Heggestad (1978), PBANK, is therefore created by employing Chebyshev's inequality to derive an upper bound on the probability of financial distress for a commercial bank based on the mean and variance of the firm's quarterly net income during the year and the year-end level of equity capital. The actual calculation employed is described in Table 45.1, Panel A, and illustrated in the short sketch below.20 Conversely, we expect PBANK to have a positive effect on TOTRISK because an increased risk of financial distress is reflected in greater stock return volatility when bankruptcy and external financing costs are convex functions of the level of outstanding debt.

H3. Contracting costs hypothesis no. 1: Managers of BHCs might have an incentive to take on more interest rate risk when they own options to buy the BHC's stock. That is, the BHC's riskiness should be positively correlated with the level of stock options granted to senior management as it is well known that incurring increased risk can raise the value of these options.
OPTGRANT (+): This effect is measured by the number of shares underlying the options granted annually to senior management scaled by the total number of common shares outstanding.21
20 Although one might view this variable as endogenous, the level and quality of the bank's assets and the amount of equity capital on hand at the beginning of each year primarily determine these financial distress variables. Thus, we can consider the financial distress cost proxy as a predetermined variable in our empirical specification.
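Because the PBANK construction reduces to a single ratio, it can be illustrated in a few lines. The sketch below applies the formula from Table 45.1, Panel A: the variance of quarterly net income divided by the square of beginning equity capital plus mean quarterly net income. The function name, the use of the sample variance, and the example numbers are illustrative assumptions, not the chapter's code.

import numpy as np

def pbank(quarterly_ni, beginning_equity):
    """Chebyshev-type upper bound on the probability of financial distress
    (Blair and Heggestad 1978), per Table 45.1, Panel A."""
    ni = np.asarray(quarterly_ni, dtype=float)
    return ni.var(ddof=1) / (beginning_equity + ni.mean()) ** 2

# Illustrative inputs: four quarters of net income and beginning equity capital
print(pbank([2.1, 1.8, -0.5, 2.4], beginning_equity=40.0))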
A larger number of stock options implies a greater incentive to assume more risk, ceteris paribus.

H4. Contracting costs hypothesis no. 2: The relation between the level of equity owned by the company's managers and the degree of risk taken by the firm may be negative. That is, risk-averse managers may take on less risk than is optimal at the firm level if managerial stock ownership is very high since these risk-averse managers might not be able to adequately diversify their personal investments in the firm.22 Also, un-affiliated large institutional investors should act as influential bank monitors and, via exercising their voting rights, reduce risk-averse managers' ability to select lower, sub-optimal levels of risk.23 This theory predicts a positive relationship between institutional investor equity ownership levels and bank risk-taking activity, as well as a negative relation between managerial ownership and IRR.
21 Ideally, one would like to also see if the dollar value of these options is related to a bank's interest rate risk. However, these data would require the application of an option-pricing model with all the related assumptions such a model requires. After 1991, most firms reported some of the relevant data but the assumptions underlying these dollar estimates are not consistent across firms. Since option grants are typically issued "at the money" (i.e., at approximately the current stock price at the time of issuance), using the number of shares granted scaled by the total number of shares outstanding provides us with an objective, standardized measure of incentive-based compensation. As reported in Table 45.1, we also use an alternative proxy variable (PBONUS), which calculates the percentage of annual cash compensation received in bonus form to provide an alternative standardized measure of the risk-taking incentives of senior management.
22 As noted in Gorton and Rosen (1995) and Spong and Sullivan (1998), among others, the relationship between the level of equity owned by the company's managers and the degree of risk taken by the firm might be non-monotonic. That is, risk-averse managers might take on less risk than is optimal at the firm level when managerial stock ownership is very low (since there is no real incentive to take risk) or very high (because too much of the manager's wealth is at risk). At moderate levels of ownership, however, managers may take on more risk since their personal incentives may be better aligned with those of outside shareholders. In this case, we expect an inverted U-shaped relationship between risk-taking and equity ownership similar to the one reported in Gorton and Rosen (1995) where both low and high levels of management equity ownership exhibit lower levels of risk-taking than moderate ownership levels. Empirical tests of this alternative hypothesis showed no evidence in support of this non-monotonic relationship. These results (not reported here in order to conserve space) could be due to the more detailed specification of the factors affecting risk-taking in our model.
23 For robustness testing, we use an alternative proxy variable, EQBLOCK, which is computed as the percentage of total shares outstanding held by outside equity "blockholders" (those unaffiliated external investors that own 5% or more of the firm's common equity). We include only the percentage of shares owned by "true" outside blockholders such as mutual fund companies and exclude any shares owned by "quasi-insiders" such as the bank's Employee Stock Ownership Plan or relatives/associates of senior bank managers. We make a distinction between these types of groups because quasi-insiders might not be as effective monitors of the firm as true outside blockholders.
MGMTOWN (-) and INST (+): We use the percentages of shares outstanding held by directors or officers and unaffiliated institutional investors as proxies for this hypothesis (MGMTOWN and INST, respectively).

H5. Contracting costs hypothesis no. 3: The possibility of economies of scale related to risk-management activities suggests that larger firms are more likely to engage in hedging than smaller firms since the marginal costs of such activities are usually lower for larger companies. This hypothesis predicts a negative relationship between firm size and the bank's risk level.
SIZE (-): The SIZE variable (i.e., total assets of the BHC) can be employed as a proxy for the economies of scale available to a firm related to risk-management activities.

H6. Contracting costs hypothesis no. 4: The potential diversification effects of firm size may provide incentives to larger BHCs to assume greater exposures to systematic risk factors such as interest rate risk. In this case, larger firms are less likely to engage in hedging activities than smaller competitors. This alternative argument regarding firm size is based on empirical studies such as Demsetz and Strahan (1995), which have documented that larger BHCs have less idiosyncratic risk due to the positive effects of diversification on their larger loan portfolios. Thus, these larger institutions are able to take on more interest rate risk (and hedge less) if these firms wish to maintain a pre-determined level of total risk (i.e., systematic plus idiosyncratic risk). This effect would also be reinforced if "too big to fail" (TBTF) policies and mispriced deposit insurance are present within the U.S. banking system during our sample period (see hypothesis H8 below for more details on our control for TBTF effects). Consequently, there may be a positive relationship between firm size and risk.
SIZE (+): This hypothesis employs the same variable described in H5 but predicts a positive relationship between firm size and risk.

H7. Agency cost hypothesis: BHCs that face larger agency costs related to external debt will hedge more than BHCs with less conflict between their debtholders and shareholders. Thus, we would expect firms with greater leverage-related agency costs to hedge more than firms with relatively smaller agency problems. Financial leverage would therefore be negatively related to interest rate risk-taking activity.
LEVERAGE (-): A proxy variable for this debt-related agency cost is financial leverage (defined as the book value of total assets divided by the market value of common equity). Conversely, we expect a positive relation between LEVERAGE and TOTRISK because increased financial leverage creates higher financial risk and, as Hamada (1972) has shown within the CAPM framework, this increased financial risk can lead to greater stock return volatility.

H8. Asymmetric information hypothesis: A BHC that has superior management as well as a more severe degree of
asymmetric information between insiders and external investors will have a greater incentive to hedge than BHCs with less severe information asymmetries. According to DeMarzo and Duffie (1995), we expect well-run BHCs with more severe information asymmetries to hedge more and assume less risk than other BHCs, ceteris paribus, in order to generate a less-noisy signal of the firm's "quality" or cash flow prospects.
ASYMINFO (+): The number of security analysts publishing earnings forecasts for the company in a given year is employed as a measure of the degree of asymmetric information surrounding a particular company (e.g., a higher number of analysts following a company would imply a less acute asymmetric information problem). This hypothesis predicts that less acute information asymmetries should decrease the firm's need to hedge.24

H9. Hedge substitutes hypothesis: A firm with a large amount of hedge substitutes is "hedging" against future adverse business conditions because any potential earnings shortfall can be compensated for more easily if, for example, the firm has a large amount of liquid assets relative to the BHC's short-term liabilities (LIQUIDITY). This is effectively the same as conventional hedging via derivative securities because high liquidity reduces the chances that management will be forced to forgo future investment in order to conserve cash when earnings are unexpectedly low. A BHC that uses alternative financial policies or hedge substitutes, such as a large amount of liquid assets, will therefore have less of an incentive to hedge (and a stronger incentive to take interest rate risk). Conversely, firms with low liquidity are expected to have a greater incentive to reduce risk by hedging more.25 Consequently, the bank's liquidity level is expected to be positively related to interest rate risk.
LIQUIDITY (+): The BHC's liquidity level (cash plus marketable securities) relative to its market value of common equity is employed as a measure of hedge substitutes and is expected to have a positive relationship with BHC riskiness. This relationship is based on the notion that a firm with a high level of liquidity will have more resources available

24 It should be noted that ASYMINFO and SIZE are most likely inter-related. In our empirical tests discussed later, we accommodate this potential inter-relationship via a full quadratic model of interest rate risk. In addition, both of these variables may also be correlated with moral hazard incentives associated with "too big to fail" (TBTF) policies and mispriced deposit insurance. In effect, the inclusion of these two variables can also help control for risk-increasing incentives related to very large banks that are deemed "TBTF" by regulators.
25 Financial firms with low levels of liquidity face a greater degree of liquidity risk. This risk can be defined as the BHC's risk of not having sufficient funds on hand to meet depositors' withdrawal needs. Typically, this risk leads to higher borrowing costs since the BHC is forced to pay a premium to obtain the additional funds on short-term notice in the federal funds market (or at the Fed's discount window in extreme cases of liquidity needs).
in the case of an earnings shortfall or sudden drop in common equity. Thus, the firm will have an incentive to hedge less and take on more interest rate risk. Note that for overall risk (TOTRISK) the above effect might be offset by the ability of high levels of LIQUIDITY to act as a "shock absorber" against unexpected declines in cash flow and therefore could reduce overall stock return volatility.
45.2.5 Control Variables

As noted earlier in this section, the corporate finance and banking literature has identified some key factors other than interest rate risk that can influence the observed levels of risk-taking in the banking industry. Presented below is a description of the control variables related to asset quality, operating risk, and portfolio composition.26 As Stein (1998) has noted, adverse selection problems can affect a bank's asset/liability management choices when there are large information asymmetries related to a bank's assets (e.g., due to poor credit quality). Therefore, the informational asymmetry engendered by the credit quality of the bank's assets can affect a bank's interest rate risk exposure via potential mis-matches between asset and liability maturities, and thus this effect should be controlled for in our tests. We use the ratio of nonperforming assets plus gross chargeoffs relative to the bank's market value of equity (BADLOANS) as a proxy variable for a bank's asset quality. Keeton and Morris (1987) recommend a measure such as this because it removes the possibility of distortion created by a bank's attempt to "manage" the level of its chargeoffs for financial reporting purposes. The use of this variable as a proxy assumes that the quality of a loan portfolio is normally (strongly) positively correlated with the proportion of loans that are nonperforming or have been charged off. A substantial exposure to a risk factor such as high operating leverage can also lead to a greater level of risk for the BHC, ceteris paribus. As Hamada (1972) showed, operating leverage is positively related to systematic risk. Alternatively, high credit and operating risks (denoted BADLOANS and OPERISK, respectively) might create an incentive to hedge more interest rate risk in order to keep total risk at a lower, more tolerable level (from management's and regulators' perspectives).27 In addition, the composition of a bank's
26 Note that when TOTRISK is used as the dependent variable, we must also include our interest rate risk factor, IRR, as a control variable because interest rate risk (along with credit, operating, and liquidity risks) can affect the bank's overall riskiness.
27 Our definition of OPERISK is noninterest operating expense divided by total revenue and effectively measures operating leverage because OPERISK will be high for banks with a large amount of operating overhead (e.g., due to high fixed operating costs related to an extensive branch network, a large customer support staff, and so forth).
portfolio affects the firm's liquidity risk. Thus, the liquidity risk inherent in a bank's asset portfolio might also influence interest rate risk as well as act as a hedge substitute. Consequently, liquidity, asset quality, and operating risks might be negatively related to interest rate risk. Note, however, that the relationship between the above variables and interest rate risk might be positive if managers are attempting to raise the BHC's overall risk level via multiple sources of risk. For example, bank managers might try to increase the BHC's total risk by coordinating several risks such as those related to interest rates, credit quality, liquidity, and operations rather than concentrating on one type of risk. Other potential control variables such as details of the BHC's portfolio composition beyond the LIQUIDITY variable are not explicitly included in our cross-sectional empirical model. The omission of these variables is appropriate because portfolio composition variables such as total loans are endogenous to the set of influences already included in our model. That is, portfolio composition is a function of the risk-taking and risk-management incentives related to, for example, the contracting costs and tax effects outlined earlier in Sect. 45.2.4. As demonstrated in the empirical results of Kashyap and Stein (1995) and the theoretical findings of Stein (1998), the bank's management can coordinate the level of loans relative to its investment in marketable securities to minimize costs associated with information asymmetries. In particular, smaller banks, which are assumed to have greater information asymmetries, reduce their volume of lending when changes in monetary policy create an outflow of deposits. Thus, by not including specific portfolio composition variables in our model, we allow these variables to adjust optimally to the risk-taking and risk-management incentives explicitly included in our model.28 In addition to the issues described above, we must also consider the role of bank regulation since regulatory changes are likely to affect the firm's risk-taking behavior over our sample period. In particular, the introduction of stricter enforcement policies via the Federal Deposit Insurance Corporation Improvement Act (FDICIA) and the Bank for International Settlement's (BIS) higher capital standards during 1991-1993 might have influenced BHC riskiness by reducing bank risk-taking incentives. Thus, a dummy variable, REGDUM, is set to 1 for the years of our sample in which FDICIA was fully in effect (1994-2000).

28 It should also be noted that bank-specific fixed effects via a "Fixed Effects" model could be used to incorporate unidentified firm-specific factors that might affect a firm's interest rate risk exposure. To conserve space, we simply note here that the use of a fixed effects model does not alter the main findings of our basic linear model. That is, allowing for unidentified, firm-specific effects does not improve upon our basic model described later in Equation (45.3).
45.3 Data, Sample Selection, and Empirical Methodology

45.3.1 Data

The data used in this analysis are based on a total of 518 publicly traded, highest-level bank holding companies (BHCs) that reported annual data for at least one of the 10 years during the period of 1991-2000.29 The analysis employed here requires data from four distinct areas: (1) financial statement data from regulatory reports; (2) stock return data; (3) managerial compensation information (e.g., option grants, equity ownership); and (4) institutional stock ownership data. The following section on sample selection provides more detail on these data sources. Highest-level BHCs were chosen because it is at this level that one can obtain the best description of risk-taking and risk-management behavior across the entire banking organization. For example, an analysis of a two-bank BHC based on studying the effects of risk-taking at the individual subsidiary bank level would most likely give a distorted view of the BHC's overall riskiness and risk-management activities because this type of analysis would ignore (a) the potential effects of diversification across the two subsidiaries, and (b) the effects of nontraditional bank activities such as securities brokerage services on the BHC's risk exposure. The above rationales are consistent with those suggested by Hughes et al. (1996, 2006), among others. Publicly traded BHCs were selected for this sample in order to estimate the interest rate risk parameters and total risk measures outlined earlier. These estimated risk measures are more likely to be a better proxy for the BHC's exposures to interest rate risk compared to financial statement-based measures such as the BHC's "maturity gap." It should be noted that the use of publicly traded BHCs reduces the number of commercial banks we can examine because the vast majority of banks in the U.S. are small, private companies. The BHCs included in our sample are the most economically significant participants in the U.S. commercial banking industry since they held over $2.5 trillion in total industry-wide assets during 1991-2000. However, one must be cautious in drawing conclusions about the risk-management activities of smaller, privately held banks because we do not include these firms in our analysis.

29 A highest-level bank holding company is an entity that is not owned or controlled by any other organization. Its activities are restricted by regulatory agencies as well as by federal and state laws. However, a highest-level BHC can clearly own other lower-level BHCs that, in turn, can own one or more commercial banks. The highest-level BHC can also engage in some nonbank financial activities such as securities brokerage, mutual fund management, and some forms of securities underwriting. By focusing our analysis on highest-level BHCs, we can take into account the impact of these nontraditional bank activities on the BHC's risk-taking and risk-management behavior.
The time period of 1991–2000 was chosen to ensure our risk estimates were not unduly influenced by any single exogenous factor such as a particular stage of the business cycle.30 The period of analysis covers a full business cycle commencing with a recession in 1991, a recovery period in 1992, a vigorous expansion during 1993–1999, and subsequent deceleration and contraction in 2000. Monetary policy also changed from accommodating to tightening and then back to a looser stance during the period. Since it is likely that the state of the macroeconomy and monetary policy affect a BHC’s decisions, our test results based on the 1991– 2000 time period should not be a statistical artifact caused by studying an incomplete business cycle.
45.3.2 Sample Selection

The sample of 518 BHCs was constructed from the joint occurrence of available data from four key areas: (1) financial statement data from the Consolidated Financial Statements for Bank Holding Companies (FR Report Y-9C) filed by the BHCs with their relevant regulatory authorities; (2) stock return data from the Center for Research in Securities Prices (CRSP); (3) management compensation data from SNL Securities' Executive Compensation Review; and (4) institutional and insider ownership data from Compact Disclosure and/or SNL Securities' Quarterly Bank Digest. The joint occurrence of data from each of these sources during the period of 1991-2000 qualified a highest-level BHC for inclusion in our sample. Based on the limitations of the data (notably the management compensation data), our tests must be done on annual data for highest-level BHCs. Also, the period of 1991-2000 was a time of substantial consolidation in the banking industry. In order to mitigate the problem of survivorship bias, we perform our tests using an unbalanced panel data set of 518 BHCs that have at least 1 year of complete data. Thus, our tests based on the panel data set include firms that may have merged or gone public at some point during 1991-2000, as well as those BHCs that survived the entire 10-year period.

45.3.3 Our Measures of Bank Risk-Taking

In order to test our integrated risk-management framework, we must develop an empirical measure for interest rate risk that is consistent with shareholder wealth maximization. In particular, we must form a market-based, BHC-specific measure of interest rate risk since this factor cannot be readily obtained from the BHC's financial statements. This measure is defined as: IRR, the absolute value of an interest rate parameter estimated via a two-factor model based on Equation (45.2).31 As noted in the previous section, IRR can be justified as an indirect measure of interest rate risk-management activity due to Deshmukh et al.'s (1983) and Diamond's (1984) theoretical results. Thus, values of IRR estimates close to zero would represent more hedging and lower risk exposures, ceteris paribus.32 In addition to measuring interest rate risk, we examine how corporate risk-management theories affect the bank's overall risk-management activities by analyzing the monthly standard deviations of bank stock returns (measured annually and defined as TOTRISK).

30 Data limitations, particularly for management compensation data such as stock option grants, prohibit us from extending the analysis to time periods earlier than 1991. Reporting changes related to executive compensation and the concomitant hand-collection of these data precluded us from extending the analysis beyond the 1991-2000 period.
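Unlike IRR, the total risk measure requires no regression and can be computed directly. The following sketch (column names assumed, not the chapter's code) derives TOTRISK as the standard deviation of each BHC's monthly returns within a year, matching the definition above.

import pandas as pd

def totrisk(panel: pd.DataFrame) -> pd.DataFrame:
    """TOTRISK: annual standard deviation of monthly stock returns per BHC.

    Assumed columns: bank, year, ret. Multiply by 100 to express the result
    in percent, as reported in Table 45.2.
    """
    return (panel.groupby(["bank", "year"])["ret"]
                 .std()
                 .rename("TOTRISK")
                 .reset_index())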
45.3.4 The Empirical Model

The basis of our empirical approach follows Flannery and James (1984), Unal and Kane (1988), Tufano (1996), Saunders et al. (1990), Brewer et al. (1996), and Hirtle (1997).33 Similar to Saunders et al. (1990) and Hirtle (1997),
31 We do not include a measure of foreign exchange rate risk in our analysis because the sample of banks employed here operates primarily domestic-oriented businesses. Less than 15% of the observations in this sample have a non-zero exposure to foreign exchange rate risk (as measured by the gross notional value of the firm's foreign exchange futures/forwards/swaps). We also estimated Equation (45.2) with a trade-weighted U.S. dollar index as an additional independent variable to act as another check on the relevance of foreign currency risk but found that this variable was not a significant factor affecting the vast majority of banks in our sample. A separate analysis of exchange rate risk for the relatively small subset noted above is therefore beyond the scope of this paper.
32 As Froot and Stein (1998) demonstrate, a zero interest rate parameter is typically sub-optimal when the firm faces convex external financing costs. In addition, firms may want to deviate from a zero interest rate parameter when manager-owner agency problems exist. As discussed in Tufano (1998b), hedging can reduce the number of times the firm enters the capital markets and therefore reduces the amount of monitoring performed by external investors. Utility-maximizing managers may therefore have an incentive to hedge more in order to obscure their consumption of perquisites or creation of other agency costs (to the detriment of the firm's shareholders). In addition, as noted in Pennacchi (1987), among others, the potential mispricing of FDIC deposit insurance may create incentives to take on more risk (and therefore hedge less). To the extent that these problems are mitigated by the banking industry's frequent (i.e., daily) entrance into the capital markets, the use of a zero interest rate parameter can be viewed as an appropriate benchmark for gauging the relative degree of the financial institution's hedging activity vis-à-vis its peers.
33 Other important empirical papers related to this study are Schrand (1997) on derivatives usage; Amihud and Lev (1981), Gorton and Rosen (1995), Houston and James (1995), Morck et al. (1988), Crawford et al. (1995), and Hubbard and Palia (1995) on management ownership and compensation issues; Galloway et al. (1997) and Cebenoyan et al. (1999) on bank risk-taking; and Angbazo (1997) on bank profitability and off-balance sheet activities.
among others, the dependent variable used in our study is continuous in nature and can capture the full range of BHC risk-taking behavior. As noted earlier, using the theoretical framework of Equation (45.2), the two-factor model is estimated separately for each of the 518 BHCs during each year of the 1991–2000 sample period (where data are available).
Based on the above hypotheses, control variables, and the relations noted in Equations (45.1)–(45.2), we can specify a pooled model of interest rate risk-management for commercial banks using the IRR parameter estimates. The following stylized second-stage regression equation includes independent variables that are proxies for the full set of relevant factors influencing a BHC’s interest rate risk-taking behavior.35
TOTRISK_{N,y} or IRR_{N,y} = f1(Taxes, Financial Distress, Agency Costs, Contracting Costs, Asymmetric Information, Alternative Financial Policies, Control Variables)
= f1(TAX_N, PBANK_N, LEVERAGE_N, MGMTOWN_N, OPTGRANT_N, INST_N, SIZE_N, ASYMINFO_N, LIQUIDITY_N, BADLOANS_N, OPERISK_N, REGDUM)   (45.3)

where IRR_{N,y} = an average measure of the N-th BHC's interest rate risk exposure for the y-th year, defined as the absolute value of the parameter estimate for β_{I,N} from Equation (45.2), and TOTRISK_{N,y} is the standard deviation of the N-th BHC's monthly stock returns for the y-th year. The first six factors on the immediate right-hand side of the equation represent the main determinants of corporate risk-management from Equation (45.1) and the seventh factor represents control variables related to other firm-specific risk factors such as asset quality, portfolio composition, operating risk, and changes in bank regulation during 1991-2000.34 The second equality in Equation (45.3) identifies the specific variables used as proxies for the seven theoretical factors. All of the right-hand-side variables are annual values based on
data for 1991-2000 so that the model's parameters represent the average relation between the explanatory variables and IRR in the panel of BHCs. As noted earlier, higher values of the dependent variable can be interpreted as an indication of greater risk-taking or, conversely, less hedging. We refer to Equation (45.3) as our "Primary Model." As mentioned earlier, we can specify an "Alternative Model" for IRR and TOTRISK in order to ensure that our results are not driven by the specific proxy variables we have chosen to represent the corporate risk-management rationales (as summarized in hypotheses H1-H9). This model has the same form as Equation (45.3) but uses different right-hand-side variables to check the robustness of our "Primary Model" results. This Alternative Model is presented below:
TOTRISK_{N,y} or IRR_{N,y} = f1(TAXDUM_N, PBANKDUM_N, LEVDUM_N, MGMTDUM_N, PBONUS_N, EQBLOCK_N, REVENUE_N, ASYMINFO_N, CASH/BVE_N, NONPERF/TA_N, OPER EXP/TA_N, REGDUM)   (45.4)
where IRR_{N,y} = the same interest rate risk measure noted in Equation (45.3) (i.e., it is an average measure of the N-th BHC's interest rate risk exposure for the y-th year defined as the absolute value of the parameter estimate for β_{I,N} from Equation (45.2)) and TOTRISK_{N,y} is the same total risk measure described by Equation (45.3). Note that we have replaced all of the right-hand-side variables of Equation (45.3), except the ASYMINFO and REGDUM variables, with alternative proxy variables to check the robustness of our results to different formulations of our model's
34 As noted earlier, when TOTRISK_{N,y} is used as the dependent variable, the IRR_{N,y} variable is included as an independent variable to control for the BHC's interest rate risk exposure.
35 The stochastic disturbance term and parameters have been omitted from the following equations to streamline notation. In addition, the y-th year subscripts have been dropped from the right-hand-side variables.
independent variables.36 The purpose of Equation (45.4) is to re-estimate Equation (45.3) with different proxies for the various risk-management rationales and see if the results of Equation (45.3) are robust to these alternative definitions. The alternative proxy variables contained in Equation (45.4) represent either alternate definitions or completely different variables that are also reasonable proxies for the risk-management theories and hypotheses presented earlier in Sect. 45.2. For example, we have included four dummy variables that are based on variations of the Primary Model’s variables (i.e., TAXDUM, PBANKDUM, MGMTDUM, LEVDUM). These variables use a binary format rather than a continuous variable to describe hypotheses H1, H2, H4, H7, respectively. We use these binary variables because there may be a “threshold effect” where the saliency of a particular variable might only affect a bank’s IRR when it exceeds this threshold. For example, we can set TAXDUM to 1 when the bank’s net income is greater than $100,000 because it is above this threshold point that the U.S. tax code is no longer convex. For the other three dummy variables, we use the sample mean as the relevant threshold value or, as in the case of PBANKDUM, a value such as 0.01, which signifies a much larger probability of financial distress than the rest of the sample (and represents a probability level that regulators would most likely be concerned about). The new variables included in Equation (45.4) are PBONUS, EQBLOCK, and REVENUE. As previously described in earlier footnotes, PBONUS and EQBLOCK represent alternative proxies for some of the managerial contracting cost rationales related to risk-taking and bank monitoring incentives described in Sect. 45.2.4. In particular, the PBONUS variable provides an alternative measure of the risk-taking incentives of senior management since, like stock
36 The ASYMINFO variable is not replaced in Equation (45.4) because we do not have a suitable alternative for this asymmetric information proxy variable. Admittedly, finding a good proxy for the level of asymmetric information within a BHC is the most difficult aspect of estimating a model such as the one described by Equations (45.3) and (45.4) since information asymmetries are naturally difficult to quantify. One obvious alternative would be to use the bank's SIZE variable, but this variable is needed to control for size-related corporate risk-management effects. It is also highly correlated with the REVENUE variable specified in Equation (45.4), and thus the inclusion of both SIZE and REVENUE would introduce a high degree of multicollinearity into the model. Interestingly, when we simply drop the ASYMINFO variable from Equation (45.4) and allow SIZE to proxy for both size-related and asymmetric information effects, the parameter estimates for the other independent variables remain relatively unchanged. This suggests that our model is robust to using SIZE as an alternative proxy for information asymmetries.
option grants, large annual bonuses have option-like incentive characteristics. REVENUE acts as an alternative proxy for bank size and, as noted in Footnote 24, is most likely to be inter-related with ASYMINFO and might also act as a control for any "too big to fail" effects. The last set of variables included in Equation (45.4) contains alternative control variables (NONPERF/TA, OPER EXP/TA, CASH/BVE). For these final three variables, we scale the numerators of each of the Primary Model's counterparts by an alternate scalar (e.g., the book value of total assets for the first two variables noted above and the book value of common equity for the last one). Also, instead of including marketable securities in the numerator, we define a more restricted version of LIQUIDITY when we form the CASH/BVE variable by including solely cash balances in the numerator and the book value of equity in the denominator. These variables are meant to be reasonable alternative definitions of our independent variables in order to test whether or not our Primary Model's results are sensitive to the choice, and definition, of these explanatory variables.
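A compact sketch of the second-stage estimation may help fix ideas. The fragment below fits the Primary Model of Equation (45.3) by pooled OLS with Newey-West (HAC) standard errors; this only approximates the GMM treatment of the errors-in-variables problem described in footnote 16, and the data-frame layout and lag length are assumptions.

import pandas as pd
import statsmodels.formula.api as smf

PRIMARY = ("IRR ~ TAX + PBANK + LEVERAGE + OPTGRANT + MGMTOWN + INST"
           " + SIZE + ASYMINFO + LIQUIDITY + BADLOANS + OPERISK + REGDUM")

def second_stage(df: pd.DataFrame):
    """Pooled second-stage regression of IRR on the risk-management proxies."""
    # HAC covariance in the spirit of Newey and West (1987)
    return smf.ols(PRIMARY, data=df).fit(cov_type="HAC", cov_kwds={"maxlags": 1})

An analogous call with the Equation (45.4) variable list reproduces the Alternative Model.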
45.3.5 Accounting for Nonlinearities and Inter-relations Between Independent Variables

Lastly, a review of the explanatory variables described in Sects. 45.2.4 and 45.2.5 raises the question of potential interactions between some of these factors. To account for these potential interactions, as well as the possibility of nonlinear relations between the explanatory variables and IRR_N, we can specify Equation (45.3) in a full quadratic form. That is, we can square as well as interact each variable on the right-hand side of Equation (45.3) and include these transformed variables along with linear forms of these variables. This approach produces 89 right-hand-side variables for Equation (45.3) based on IRR and 103 variables when TOTRISK is used as the dependent variable. This relatively complicated form of Equation (45.3) can also be compared to the conventional linear form of Equation (45.3) that contains only 12 right-hand-side variables for IRR and 13 variables for TOTRISK. By including squared and interaction terms, the quadratic model might provide a much richer description of the factors that affect interest rate risk-taking at a commercial bank. It also allows us to address the possibility that some of the risk-management incentives may be interdependent. Any inter-dependencies between these variables can be captured by the numerous interaction terms included in the quadratic model.
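A minimal sketch of this expansion is shown below. It squares every regressor except the 0/1 REGDUM dummy (whose square equals itself) and adds all pairwise products, which reproduces the count cited above for the IRR model: 12 linear + 11 squared + 66 interaction terms = 89 regressors. The variable names follow Table 45.1, but the data layout is an assumption.

from itertools import combinations
import pandas as pd

LINEAR = ["TAX", "PBANK", "LEVERAGE", "MGMTOWN", "OPTGRANT", "INST",
          "SIZE", "ASYMINFO", "LIQUIDITY", "BADLOANS", "OPERISK", "REGDUM"]

def quadratic_design(df: pd.DataFrame) -> pd.DataFrame:
    """Linear, squared, and interaction terms for the quadratic Equation (45.3)."""
    X = df[LINEAR].copy()
    for v in LINEAR:
        if v != "REGDUM":  # a 0/1 dummy squared is itself, so it is dropped
            X[v + "^2"] = df[v] ** 2
    for a, b in combinations(LINEAR, 2):  # 66 pairwise interactions
        X[a + "*" + b] = df[a] * df[b]
    return X  # 12 + 11 + 66 = 89 columns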
45.4 Empirical Results

45.4.1 Descriptive Statistics and Industry Trends

Panel A of Table 45.2 provides descriptive statistics for the key variables employed in our Primary Model's analysis while Panel B reports relevant statistics for the independent variables used in our Alternative Model's robustness test. These descriptive statistics are based on 518 BHCs over a 10-year period, thus yielding 2,899 bank-year observations for our interest rate risk measures.37 The results of the two-factor model confirm our intuition that U.S. BHCs take on varying degrees of risk relative to interest rates. The average absolute value of the interest rate risk parameter (IRR) during 1991-2000 for the BHC sample was positive (0.5820) and statistically different from zero. The raw value of this measure, IBETA, averages 0.0215 and possesses a large degree of variation (e.g., the standard deviation of 0.7738 is nearly 40 times the mean). Due to a high level of variability, the average raw measure of interest rate risk is not significantly different from zero. However, this average masks the fact that 32% of the individual, bank-specific estimates of IBETA are significantly positive at the 0.10 level and another 35% of the estimates are significantly negative. Thus, two-thirds of the individual IBETA estimates are significantly different from zero even though the overall sample average is quite close to zero. This wide variation in IBETA and IRR shows that there is considerable time-series and cross-sectional dispersion in banks' sensitivities to interest rate risk. In addition, TOTRISK has a statistically significant mean of 7.67% and exhibits a large degree of variability with a standard deviation of 6.30%.
45.4.2 Multivariate Analysis for Interest Rate Risk

Table 45.3 summarizes the results of estimating Equations (45.3) and (45.4) via OLS using the dependent variable, IRR. Recall that these parameter estimates are based on annual regressions described by Equation (45.2). To conserve space, these 2,899 intermediate regressions are not included here.38 Table 45.3 shows that the Primary and
37 There are not 5,180 observations (i.e., 518 x 10 years) because all banks do not have data for all 10 years.
38 As shown in Table 45.3, the actual number of observations used to estimate Equations (45.3) and (45.4) is less than 2,899 because not all of the independent variables are available for each IRR estimate (particularly the option grants and bonus compensation data).
Alternative Models of Equations (45.3) and (45.4) provide substantially the same explanatory power. This finding can be readily seen by comparing the adjusted R² and t-statistics of the two models. As shown in the last column of Table 45.3, the results of Equation (45.4) based on the Alternative Model follow a pattern similar to that described above for our Primary Model. Table 45.3 also reports that the test results for all the independent variables except SIZE are qualitatively similar for both the Primary and Alternative Models. As we will see from the elasticity estimates of Table 45.4 (described below), the real economic differences between the two models are also not that great in terms of testing our hypotheses. Thus, our results are robust to the choice of independent variables employed in our model. In addition, the choice of interest rate factor in Equation (45.2) does not materially affect our results. As noted earlier, our results are qualitatively similar when other definitions of the interest rate factor are used. It is also helpful to estimate the elasticities of interest rate risk with respect to the model's primary explanatory variables in order to assess the economic, as well as statistical, significance of our independent variables. That is, we can use the parameter estimates to differentiate Equations (45.3) and (45.4) with respect to a specific explanatory variable. These derivatives can then be used with the mean values of the relevant variables to estimate the elasticity (or sensitivity) of a change in the explanatory variable (e.g., TAX) on our dependent variable (i.e., IRR). A Wald test statistic can then be constructed to test the significance of this elasticity estimate. This approach provides us with a more direct and reliable method of evaluating the importance of a specific variable on the BHC's interest rate risk-taking activity. Table 45.4 presents the elasticity estimates for the Primary and Alternative Models with respect to the key explanatory variables identified by our nine hypotheses. For the Primary Model, this table reports statistically significant elasticity estimates with theoretically correct signs for TAX, PBANK, LEVERAGE, INST, BADLOANS, and OPERISK. In addition, ASYMINFO and REGDUM are significant but report theoretically incorrect signs. These estimates indicate that, on average, a higher estimated probability of financial distress and debt-related agency costs reduced interest rate risk at commercial banks during the sample period while a higher effective tax rate and greater institutional investor ownership increased IRR. The Alternative Model confirms the Primary Model's support for the financial distress and agency cost theories of risk-management (as well as the OPERISK control variable) but not the tax rationale. Of particular note are the greater-than-unity elasticity estimates for the LEVERAGE and OPERISK variables in both the Primary and Alternative Models. These estimates suggest that bank interest rate risk sensitivities are most acutely affected by financial leverage and operating cost structures.
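In the linear specification this calculation is especially simple: the derivative of IRR with respect to a regressor is just its coefficient b_k, so the elasticity evaluated at the sample means is b_k * mean(x_k) / mean(IRR). The sketch below implements that special case only; the quadratic form would add squared and interaction terms to the derivative, and the Wald test is omitted. Function and column names are assumptions.

import pandas as pd

def linear_elasticities(params: pd.Series, df: pd.DataFrame, dep: str = "IRR") -> pd.Series:
    """Elasticity of the dependent variable w.r.t. each regressor, at the means."""
    means = df.mean(numeric_only=True)
    return pd.Series({name: b * means[name] / means[dep]
                      for name, b in params.items()
                      if name not in ("const", "Intercept")})

As a rough check, plugging the Table 45.3 coefficient for OPERISK (12.2212) and the Table 45.2 means (0.5941 for OPERISK and 0.5820 for IRR) into this formula gives approximately 12.5, in the same neighborhood as the 13.2 elasticity reported in Table 45.4 from the chapter's own calculations.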
Table 45.2 Descriptive data for the potential factors affecting risk-management decisions

Descriptive statistics for the full sample of 518 Bank Holding Companies (1991-2000). Entries are: Mean, Std. dev., Min., Max.

A. Primary model's variables
IRR (Absolute Value of Interest Rate Risk Parameter): 0.5820, 0.5101, 0, 4.901
IBETA (Raw Value of Interest Rate Risk Parameter): 0.0215, 0.7738, -3.2980, 4.901
RM (CRSP Stock Index Total Return): 0.1798, 0.1464, -0.1006, 0.3574
IR (Ten-year U.S. Treasury yield, %): 6.0910, 0.8661, 4.650, 7.858
TOTRISK (Monthly Standard Dev. of Stock Returns, %): 7.6682, 6.3010, 0.162, 82.412
SIZE (Total Assets, $ bil.): 12.2509, 47.9559, 0.0209, 902.210
TAX (Average Effective Tax Rate, %): 31.7912, 7.8573, 0, 93.4047
PBANK (Est. Probability of Bankruptcy): 0.0053, 0.0200, 0, 0.3508
LEVERAGE (Financial Leverage, x): 8.4095, 5.1785, 1.2227, 48.0177
OPTGRANT (Options Granted to Management, %): 0.3914, 0.9231, 0, 21.9060
MGMTOWN (Management Equity Ownership, %): 16.0640, 14.4142, 0, 93.9370
INST (Institutional Investors' Equity Ownership, %): 20.1839, 19.0650, 0, 90.8620
ASYMINFO (Asymmetric Information Proxy, No. Analysts): 4.7302, 7.1157, 0, 42.0000
BADLOANS (Non-Performing Assets Ratio): 0.0889, 1.6625, -0.2492, 8.5070
OPERISK (Operating Risk Ratio): 0.5941, 0.1622, 0.1707, 0.9934
LIQUIDITY (Liquidity Measure, x): 2.8081, 2.4603, 0.0299, 31.1815

B. Alternative model's variables
TAXDUM (Tax Dummy Variable): 0.9000, 0.3000, 0, 1
PBANKDUM (Prob. of Bankruptcy Dummy Variable): 0.0914, 0.2882, 0, 1
LEVDUM (Book Value Fin. Leverage Dummy Variable): 0.4170, 0.4932, 0, 1
PBONUS (Sr. Mgmt. Bonus as % of Mgmt. Comp.): 24.6132, 18.0054, 0, 93.953
MGMTDUM (Mgmt. Ownership Dummy Variable): 0.3356, 0.4723, 0, 1
EQBLOCK (Un-Affiliated Blockholder Ownership, %): 3.4747, 7.5096, 0, 85.2000
REVENUE (Total Revenue, $ bil.): 1.1340, 4.8050, 0.0005, 111.826
NONPERF/TA (Alternative Non-Performing Assets Ratio): 0.0074, 0.0137, -0.0345, 0.0755
OPER EXP/TA (Operating Exp. as a % of Total Assets): 0.0523, 0.0467, 0.0111, 0.1333
CASH/BVE (Cash Holdings as a % of BV of Equity): 0.8426, 0.8037, 0.0295, 14.238
Interestingly, OPERISK has a very large elasticity estimate (+13.2), which suggests that banks with high operating cost structures also assume relatively large amounts of interest rate risk. To the extent that a bank with a high
degree of OPERISK also has high financial leverage (as proxied by LEVERAGE), the overall net increase in a bank’s interest rate risk can be ameliorated. For example, based on the elasticity estimates of our Primary Model, a bank
Table 45.3 Relationship between interest rate risk and risk-management incentives

Primary Model of IRR (linear model). Variable (Pred. sign): Coefficient (t-statistic)
CONSTANT: -3.7992 (-7.92)
TAX (+): 8.9982 (2.60)
PBANK (-): -7.0068 (-2.12)
LEVERAGE (-): -0.1576 (-3.59)
OPTGRANT (+): 0.0438 (0.50)
MGMTOWN (-): 0.0086 (0.96)
INST (+): 0.0150 (1.83)
SIZE (+/-): 0.0006 (0.15)
ASYMINFO (+): -0.0534 (-3.57)
BADLOANS (+/-): 2.8802 (2.29)
OPERISK (+/-): 12.2212 (17.88)
LIQUIDITY (+): 0.0414 (0.40)
REGDUM (-): 0.7777 (4.85)
Adjusted R² = 0.2172; Durbin-Watson = 1.74; N = 1,959

Alternative Model of IRR (linear model, alternate variables). Variable (Pred. sign): Coefficient (t-statistic)
CONSTANT: -3.3534 (-4.19)
TAXDUM (+): 0.8354 (1.29)
PBANKDUM (-): -0.9941 (-5.13)
LEVDUM (-): -0.3824 (-1.78)
PBONUS (+): 0.0087 (1.12)
MGMTDUM (-): 0.1301 (0.52)
EQBLOCK (+): 0.0086 (0.57)
REVENUE (+/-): -0.0074 (-0.18)
ASYMINFO (+): -0.0777 (-5.73)
NONPERF/TA (+/-): 0.0847 (0.00)
OPER EXP/TA (+/-): 96.7454 (10.47)
CASH/BVE (+): 0.5347 (3.05)
REGDUM (-): 2.0011 (9.18)
Adjusted R² = 0.1916; Durbin-Watson = 1.67; N = 2,201

The ordinary least squares (OLS) cross-sectional regression results are based on linear forms of Equations (45.3) and (45.4). The dependent variables are estimates of the BHC's interest rate risk, IRR, obtained from annual, firm-specific regressions for 518 BHCs during 1991-2000 based on Equation (45.2). The expected signs of the parameter estimates are shown in parentheses after each variable name. The results for the Primary and Alternative Models of IRR are reported in the first and second panels, respectively. In the original table, a parameter estimate and its t-statistic (in parentheses) are printed in bold face when the estimate is significant at the 0.10 level. The standard errors of the parameter estimates are adjusted for heteroskedasticity and autocorrelation according to Newey and West (1987) using a Generalized Method of Moments technique.

Table 45.4 Estimated elasticities of interest rate risk with respect to the explanatory variables

Primary IRR Model. Variable (Pred. sign): Elasticity (p-value)
TAX (+): 1.1663 (0.0093)
PBANK (-): -0.1026 (0.0341)
LEVERAGE (-): -3.2120 (0.0003)
OPTGRANT (+): 0.0383 (0.6138)
MGMTOWN (-): 0.2546 (0.3361)
INST (+): 0.7876 (0.0675)
SIZE (+/-): 0.0183 (0.8786)
ASYMINFO (+): -0.8028 (0.0004)
BADLOANS (+/-): 0.2916 (0.0222)
OPERISK (+/-): 13.1534 (0.0001)
LIQUIDITY (+): 0.2736 (0.6926)
REGDUM (-): 1.1155 (0.0001)

Alternative IRR Model. Variable (Pred. sign): Elasticity (p-value)
TAXDUM (+): 2.0120 (0.1958)
PBANKDUM (-): -0.4330 (0.0001)
LEVDUM (-): -0.4663 (0.0756)
PBONUS (+): 0.5351 (0.2613)
MGMTDUM (-): 0.0999 (0.6022)
EQBLOCK (+): 0.0711 (0.5668)
REVENUE (+/-): -0.0191 (0.8563)
ASYMINFO (+): -1.2400 (0.0001)
NONPERF/TA (+/-): 0.0009 (0.9967)
OPER EXP/TA (+/-): 9.1487 (0.0001)
CASH/BVE (+): 1.0264 (0.0023)
REGDUM (-): 2.9780 (0.0001)

This table reports the estimated elasticities of the BHC's interest rate risk (IRR) with respect to changes in the Primary and Alternative Models' explanatory variables. The elasticity estimates are calculated based on the OLS parameter estimates of Equations (45.3) and (45.4) for IRR. The expected signs of the elasticity estimates are shown in parentheses after each variable name. In the original table, the elasticity estimates and p-values are printed in bold face when the estimate is significant at the 0.10 level.
that increases its OPERISK by 1% while also increasing its LEVERAGE by 1% will have a net increase in IRR of approximately 10% (i.e., [13.2 - 3.2] x 1%). This increase is still substantial, but it is somewhat lower than if the bank solely increased OPERISK with no concomitant increase in
LEVERAGE. These elasticity estimates suggest that a BHC might coordinate some of its key risks when deciding on the firm’s net exposure to interest rate risk. In sum, the parameter and elasticity estimates of the linear model for IRR provide strong support for the hypotheses
related to financial distress costs (H2) and debt-related agency costs (H7), with weaker support for the tax, managerial contracting cost, and hedge substitutes hypotheses (H1, H4, and H9). No consistent empirical support was found for the size-related, asymmetric information, and other managerial contracting cost hypotheses (H3, H5, H6, and H8). Although the details of the quadratic models are not reported here for space reasons, it should be noted that the inclusion of the nonlinear and interaction terms in Equation (45.3) does not significantly affect the model's parameter estimates. For example, the adjusted R2 statistic for a full quadratic version of Equation (45.3) is 0.2199 compared to our linear model's value of 0.2172. Thus, the explanatory power of our empirical model of risk-management is not greatly improved by formulating a full quadratic specification rather than a conventional linear model. A Hausman (1978) specification test reveals that the parameter estimates for the linear and quadratic forms of Equation (45.3) are not significantly different.39 This suggests that the interaction and nonlinear effects of the full quadratic form are not important determinants of interest rate risk sensitivities for our sample of banks. Thus, the simpler linear model is sufficient for adequately describing interest rate risk-taking and risk-management behavior for our sample. An additional series of Hausman tests confirms our earlier claims that models that contain squared terms or fixed effects do not provide any meaningful improvements over our Primary Model. In sum, our basic linear model is robust not only to alternative independent variables but also to the inclusion of controls for other key bank risks, nonlinearities, interaction terms, and firm-specific fixed effects.
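For readers who want the mechanics, a Hausman (1978) contrast of this kind can be computed directly from the two fitted models. The sketch below is a rough Python illustration, assuming the coefficient vectors on the shared linear terms and their covariance matrices have already been extracted from the linear and quadratic fits; it is not the author's code, and the usual caveat that the covariance difference can be near-singular applies (hence the pseudo-inverse).

    import numpy as np
    from scipy import stats

    def hausman_contrast(b_lin, V_lin, b_quad, V_quad):
        """Hausman (1978)-style test that the coefficients on the linear
        terms do not change significantly when squares and interactions
        are added.

        b_lin, b_quad : coefficient vectors on the shared linear terms
        V_lin, V_quad : their estimated covariance matrices
        """
        d = np.asarray(b_quad) - np.asarray(b_lin)   # contrast between estimators
        V = np.asarray(V_quad) - np.asarray(V_lin)   # covariance of the contrast under H0
        stat = float(d @ np.linalg.pinv(V) @ d)      # pinv guards against a singular V
        pval = stats.chi2.sf(stat, df=len(d))        # chi-squared with k degrees of freedom
        return stat, pval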
45.4.3 Multivariate Analysis for Total Risk

Table 45.5 summarizes the results of estimating Equations (45.3) and (45.4) via GMM using the dependent variable, TOTRISK. These parameter estimates are based on annual regressions described by Equation (45.2). Similar to our findings for IRR, Table 45.5 shows that the Primary and Alternative Models of Equations (45.3) and (45.4) provide substantially the same explanatory power. This finding can be seen by comparing the adjusted R2 and t-statistics of the two models. The last column of Table 45.5 shows that the results of Equation (45.4), based on the Alternative Model, follow a pattern similar to that described above for our Primary Model. Table 45.5 also reports that the results for all the independent variables except OPTGRANT and LIQUIDITY
are qualitatively similar for both the Primary and Alternative Models. As shown by the elasticity estimates of Table 45.6 (described below), the economic difference between the two models is also not that great in terms of testing our hypotheses. Thus, our TOTRISK results are robust to the choice of independent variables employed in our model.

Table 45.6 presents the elasticity estimates for the Primary and Alternative Models with respect to the key explanatory variables identified by our nine hypotheses. For the Primary Model, this table reports statistically significant elasticity estimates with theoretically correct signs for all variables except TAX, MGMTOWN, and ASYMINFO. These estimates indicate that, on average, a higher estimated probability of financial distress, more debt-related agency costs, larger firm size, and greater institutional ownership increased total risk at commercial banks during the sample period, while more liquidity and greater bank regulation decreased TOTRISK. The Alternative Model confirms the Primary Model's support for the financial distress, managerial contracting, and agency cost theories of risk-management (as well as the control variables: BADLOANS, OPERISK, REGDUM, and INTBETA). Of particular note are the relatively large elasticity estimates for the LEVERAGE and OPERISK variables in the Primary Model. These estimates suggest that bank total risk exposure is most acutely affected by the firm's financial leverage and operating cost structure.

In sum, the parameter and elasticity estimates of the linear model for TOTRISK provide strong support for the hypotheses related to financial distress costs (H2), debt-related agency costs (H7), and firm size (H6), with weaker support for the managerial contracting costs and hedge substitutes hypotheses (H4 and H9). No consistent empirical support was found for the other hypotheses related to taxes, size, asymmetric information, and other managerial contracting cost rationales (H1, H3, H5, and H8). Although the details of the quadratic models are not reported here for space reasons, it should be noted that the inclusion of the nonlinear and interaction terms in Equation (45.3) does not significantly affect the model's parameter estimates. For example, the adjusted R2 statistic for a full quadratic version of Equation (45.3) is 0.2998 versus our linear model's value of 0.2887. Thus, the explanatory power of our empirical model of risk-management is not greatly improved by formulating a full quadratic specification rather than a conventional linear model. A Hausman (1978) specification test reveals that the parameter estimates for the linear and quadratic forms of Equation (45.3) are not significantly different.40 Consistent with the Hausman test result for IRR, this finding for TOTRISK suggests that the interaction and
39 To conserve space, these statistics are not reported in the tables.
40 To conserve space, these statistics are not reported in the tables noted above.
Table 45.5 Relationship between total risk and risk-management incentives

Primary model of TOTRISK (linear model):
  Variable        Coefficient (t-statistic)
  CONSTANT         3.9298 (8.46)*
  TAX             -3.8534 (-1.35)
  PBANK           31.1975 (2.45)*
  LEVERAGE         0.2059 (4.31)*
  OPTGRANT         0.5770 (1.74)*
  MGMTOWN          0.0064 (1.12)
  INST             0.0339 (6.10)*
  SIZE             0.0063 (3.38)*
  ASYMINFO        -0.0786 (-5.42)*
  BADLOANS         3.3243 (2.38)*
  OPERISK          3.5888 (6.03)*
  LIQUIDITY       -0.3397 (-3.63)*
  REGDUM          -1.4172 (-5.83)*
  INTBETA          0.2206 (13.33)*
  Adjusted R2      0.2887
  Durbin-Watson    1.85
  N                1,959

Alternative model of TOTRISK (linear model, alternate variables):
  Variable        Coefficient (t-statistic)
  CONSTANT         9.1165 (8.32)*
  TAXDUM          -4.2161 (-4.13)*
  PBANKDUM         0.9721 (2.88)*
  LEVDUM           0.6185 (4.43)*
  PBONUS          -0.0065 (-1.36)
  MGMTDUM         -0.0818 (-0.53)
  EQBLOCK          0.0302 (2.88)*
  REVENUE          0.0849 (3.74)*
  ASYMINFO        -0.0421 (-3.65)*
  NONPERF/TA      47.8477 (2.90)*
  OPER EXP/TA     46.5962 (7.00)*
  CASH/BVE         0.1917 (2.55)*
  REGDUM          -1.5987 (-7.37)*
  INTBETA          0.2031 (13.15)*
  Adjusted R2      0.2772
  Durbin-Watson    1.87
  N                2,201
The ordinary least squares (OLS) cross-sectional regression results are based on linear forms of Equations (45.3) and (45.4). The dependent variables are estimates of the BHC's total risk, TOTRISK, obtained from annual, firm-specific regressions for 518 BHCs during 1991-2001 based on Equation (45.2). The expected signs of the parameter estimates follow from the hypotheses developed in Sect. 45.2.4. The results for the Primary and Alternative Models of TOTRISK are reported in the first and second panels, respectively. A parameter estimate and its t-statistic (in parentheses) are marked with an asterisk when the estimate is significant at the 0.10 level. The standard errors of the parameter estimates are adjusted for heteroskedasticity and autocorrelation according to Newey and West (1987) using a Generalized Method of Moments technique.

Table 45.6 Estimated elasticities of total risk with respect to the explanatory variables
Primary TOTRISK model:
  Variable      Elasticity   p-value
  TAX            -0.0375     0.1771
  PBANK           0.0181     0.0143*
  LEVERAGE        0.2621     0.0001*
  OPTGRANT        0.0318     0.0818*
  MGMTOWN         0.0151     0.2611
  INST            0.1150     0.0001*
  SIZE            0.0105     0.0007*
  ASYMINFO       -0.0624     0.0001*
  BADLOANS        0.0187     0.0172*
  OPERISK         0.3386     0.0001*
  LIQUIDITY      -0.1395     0.0003*
  REGDUM         -0.1815     0.0001*
  INTBETA         0.1138     0.0001*

Alternative model:
  Variable      Elasticity   p-value
  TAXDUM         -0.6746     0.0001*
  PBANKDUM        0.0138     0.0040*
  LEVDUM          0.0425     0.0001*
  PBONUS         -0.0255     0.1733
  MGMTDUM        -0.0046     0.5992
  EQBLOCK         0.0150     0.0040*
  REVENUE         0.0124     0.0002*
  ASYMINFO       -0.0331     0.0003*
  NONPERF/TA      0.0317     0.0038*
  OPER EXP/TA     0.3654     0.0001*
  CASH/BVE        0.0246     0.0107*
  REGDUM         -0.2021     0.0001*
  INTBETA         0.1036     0.0001*
This table reports the estimated elasticities of the BHC's total risk (TOTRISK) with respect to changes in the Primary and Alternative Models' explanatory variables. The elasticity estimates are calculated based on the OLS parameter estimates of Equations (45.3) and (45.4) for TOTRISK. The expected signs of the elasticity estimates follow from the hypotheses developed in Sect. 45.2.4. The elasticity estimates and p-values are marked with an asterisk when the estimate is significant at the 0.10 level.
nonlinear effects of the full quadratic form are also not important determinants of total risk sensitivities for our sample of banks. In sum, our basic linear model is robust not only to
alternative independent variables but also to the inclusion of controls for other key bank risks, nonlinearities, interaction terms, and firm-specific fixed effects.
45.5 Conclusion
We examine the rationales for risk-taking and risk-management behavior for U.S. bank holding companies during 1991-2000. By combining the theoretical insights from the corporate finance and banking literatures related to hedging and risk-taking, we formulate an empirical model based on Flannery and James (1984) to determine which of these theories are best supported by the data. Three main conclusions emerge from the analysis:

1. The corporate risk-management theories most consistently supported are those related to financial distress and debtholder-related agency costs (with weaker support for the managerial contracting costs, firm size, and hedge substitutes rationales). Thus, despite facing a different regulatory framework and investment opportunities than nonfinancial companies, the factors influencing risk-management in commercial banks are similar to those present in nonfinancial industries. This finding also corroborates our view that corporate risk-management and risk-taking are, in effect, two ways of looking at the same risk-return problem (i.e., how should a firm manage its risky assets to maximize value?).
2. The tax, asymmetric information, and some of the managerial contracting cost and firm size rationales for managing risk (Hypotheses H1, H3, H5, H8) are not well supported by our sample. Although these rationales may be operant within some banks, they are of lesser importance in our sample of BHCs during 1991-2000.
3. A conventional linear model of risk-management adequately explains cross-sectional and time-series variation in the sample. That is, a more detailed model containing quadratic and interaction terms between all independent variables does not yield significantly different (or better) results than a simple linear model.

Our model's findings are robust to alternate definitions of the independent variables, major changes in bank regulation, firm-specific fixed effects, nonlinearities and interactions between the independent variables, as well as firm-specific controls for other key risks related to credit quality and operating efficiency. Avenues for future research relate to developing more direct measures of bank risk-management activities in order to gauge the effectiveness of alternative hedging strategies.

Acknowledgments The author wishes to thank Joe Hughes, Oded Palmon, Bob Patrick, and especially Ivan Brick, for helpful comments that greatly improved this paper. The author has also benefited from comments by Fernando Alvarez, Bob Cangemi, Robert DeYoung, Larry Fisher, Bill Lang, C.F. Lee, Ben Sopranzetti, and Kenneth Spong, as well as from participants of the FMA International Conference, Chicago Risk Management Conference, New England Doctoral Students Conference, and Eastern Finance Association Conference. Scott Williams also provided capable research assistance. This research was based on my dissertation at Rutgers University and was partially supported by the New Jersey Center for Research in Financial Services.

References

Allayannis, G. and J. P. Weston. 2001. "The use of foreign currency derivatives and financial market value." Review of Financial Studies 14, 243-276.
Amihud, Y. and B. Lev. 1981. "Risk reduction as a managerial motive for conglomerate mergers." Bell Journal of Economics 12, 605-617.
Angbazo, L. 1997. "Commercial bank net interest margins, default risk, interest-rate risk, and off-balance-sheet banking." Journal of Banking and Finance 21, 55-87.
Berkman, H. and M. E. Bradbury. 1996. "Empirical evidence on the corporate use of derivatives." Financial Management 25, 5-13.
Blair, R. D. and A. A. Heggestad. 1978. "Bank portfolio regulation and the probability of failure." Journal of Money, Credit, and Banking 10, 88-93.
Bodnar, G. M., G. S. Hayt, and R. C. Marston. 1996. "1995 Wharton Survey of derivatives usage by US non-financial firms." Financial Management 25, 142.
Brewer, E., W. E. Jackson, and J. T. Moser. 1996. "Alligators in the swamp: The impact of derivatives on the financial performance of depository institutions." Journal of Money, Credit, and Banking 28, 482-497.
Campbell, T. S. and W. A. Kracaw. 1990. "Corporate risk-management and the incentive effects of debt." Journal of Finance 45, 1673-1686.
Campbell, T. S. and W. A. Kracaw. 1990b. "A comment on bank funding risks, risk aversion, and the choice of futures hedging instrument." Journal of Finance 45, 1705-1707.
Cebenoyan, A. S., E. S. Cooperman, and C. A. Register. 1999. "Ownership structure, charter value, and risk-taking behavior for thrifts." Financial Management 28(1), 43-60.
Copeland, T. and M. Copeland. 1999. "Managing corporate FX risk: a value-maximizing approach." Financial Management 28(3), 68-75.
Crawford, A. J., J. R. Ezzell, and J. A. Miles. 1995. "Bank CEO pay-performance relations and the effects of deregulation." Journal of Business 68, 231-256.
Culp, C. L., D. Furbush, and B. T. Kavanagh. 1994. "Structured debt and corporate risk-management." Journal of Applied Corporate Finance 7(3) (Fall), 73-84.
Cummins, J. D., R. D. Phillips, and S. D. Smith. 1998. "The rise of risk-management." Economic Review, Federal Reserve Bank of Atlanta 83, 30-40.
DeMarzo, P. and D. Duffie. 1995. "Corporate incentives for hedging and hedge accounting." Review of Financial Studies 8, 743-772.
Demsetz, R. S. and P. E. Strahan. 1995. "Historical patterns and recent changes in the relationship between bank holding company size and risk." FRBNY Economic Policy Review 1(2), 13-26.
Deshmukh, S. D., S. I. Greenbaum, and G. Kanatas. 1983. "Interest rate uncertainty and the financial intermediary's choice of exposure." Journal of Finance 38, 141-147.
Diamond, D. W. 1984. "Financial intermediation and delegated monitoring." Review of Economic Studies 51, 393-414.
Ferson, W. E. and C. R. Harvey. 1991. "The variation of economic risk premiums." Journal of Political Economy 99, 385-415.
Flannery, M. J. and C. M. James. 1984. "The effect of interest rate changes on the common stock returns of financial institutions." Journal of Finance 39, 1141-1153.
Froot, K. A. and J. C. Stein. 1998. "Risk management, capital budgeting, and capital structure policy for financial institutions: An integrated approach." Journal of Financial Economics 47, 55-82.
Froot, K. A., D. S. Scharfstein, and J. C. Stein. 1993. "Risk-management: Coordinating corporate investment and financing policies." Journal of Finance 48, 1629-1658.
Galloway, T. M., W. B. Lee, and D. M. Roden. 1997. "Banks' changing incentives and opportunities for risk taking." Journal of Banking and Finance 21, 509-527.
Geczy, C., B. M. Minton, and C. Schrand. 1997. "Why firms use currency derivatives." Journal of Finance 52, 1323-1354.
Gorton, G. and R. Rosen. 1995. "Corporate control, portfolio choice, and the decline of banking." Journal of Finance 50, 1377-1420.
Graham, J. R. and D. A. Rogers. 2002. "Do firms hedge in response to tax incentives?" Journal of Finance 57, 815-839.
Graham, J. R. and C. W. Smith Jr. 1999. "Tax incentives to hedge." Journal of Finance 54, 2241-2262.
Greene, W. H. 1993. Econometric analysis, Macmillan, New York, pp. 370-381.
Hamada, R. S. 1972. "The effect of the firm's capital structure on the systematic risk of common stocks." Journal of Finance 27, 435-452.
Haushalter, G. D. 2000. "Financing policy, basis risk, and corporate hedging: Evidence from oil and gas producers." Journal of Finance 55, 107-152.
Hausman, J. 1978. "Specification tests in econometrics." Econometrica 46, 1251-1271.
Hirtle, B. 1997. "Derivatives, portfolio composition and bank holding company interest rate risk exposure." Journal of Financial Services Research 12, 243-276.
Houston, J. F. and C. James. 1995. "CEO compensation and bank risk: Is compensation in banking structured to promote risk taking?" Journal of Monetary Economics 36, 405-431.
Howton, S. and S. Perfect. 1998. "Currency and interest-rate derivatives use in U.S. firms." Financial Management 27(4), 111-121.
Hubbard, R. G. and D. Palia. 1995. "Executive pay and performance: Evidence from the U.S. banking industry." Journal of Financial Economics 39, 105-130.
Hughes, J., W. Lang, L. J. Mester, and C.-G. Moon. 1996. "Efficient banking under interstate branching." Journal of Money, Credit, and Banking 28, 1045-1071.
Hughes, J., W. Lang, C.-G. Moon, and M. S. Pagano. 2006. Managerial incentives and the efficiency of capital structure in U.S. commercial banking, Rutgers University, Working Paper.
Kashyap, A. and J. C. Stein. 1995. "The impact of monetary policy on bank balance sheets." Carnegie-Rochester Conference Series on Public Policy 42, 151-195.
Keeley, M. C. 1990. "Deposit insurance, risk, and market power in banking." American Economic Review 80, 1183-1200.
Keeton, W. R. and C. S. Morris. 1987. "Why do banks' loan losses differ?" Federal Reserve Bank of Kansas City Economic Review, May, 3-21.
Leland, H. E. and D. H. Pyle. 1977. "Informational asymmetries, financial structure, and financial intermediation." Journal of Finance 32, 371-387.
Lin, C. M. and S. D. Smith. 2007. "Hedging, financing, and investment decisions: A simultaneous equations framework." Financial Review 42, 191-209.
Meulbroek, L. K. 2002. "A senior manager's guide to integrated risk management." Journal of Applied Corporate Finance 14, 56-70.
Mian, S. L. 1996. "Evidence on corporate hedging policy." Journal of Financial and Quantitative Analysis 31, 419-439.
Modigliani, F. and M. H. Miller. 1958. "The cost of capital, corporation finance, and the theory of investment." American Economic Review 48, 261-297.
Morck, R., A. Shleifer, and R. W. Vishny. 1988. "Management ownership and market valuation: an empirical analysis." Journal of Financial Economics 20, 293-315.
Nance, D. R., C. W. Smith Jr., and C. W. Smithson. 1993. "On the determinants of corporate hedging." Journal of Finance 48, 267-284.
Newey, W. and K. West. 1987. "A simple positive semi-definite, heteroscedasticity and autocorrelation consistent covariance matrix." Econometrica 55, 703-708.
Niehans, J. 1978. A theory of money, Johns Hopkins, Baltimore, MD, pp. 166-182.
Pagano, M. S. 1999. An empirical investigation of the rationales for risk-taking and risk-management behavior in the U.S. banking industry, Unpublished dissertation, Rutgers University.
Pagano, M. S. 2001. "How theories of financial intermediation and corporate risk-management influence bank risk-taking." Financial Markets, Institutions, and Instruments 10, 277-323.
Pennacchi, G. G. 1987. "A reexamination of the over- (or under-) pricing of deposit insurance." Journal of Money, Credit, and Banking 19, 340-360.
Saunders, A. 1997. Financial institutions management: a modern perspective, Irwin/McGraw-Hill, New York, NY, pp. 330-331.
Saunders, A., E. Strock, and N. G. Travlos. 1990. "Ownership structure, deregulation, and bank risk taking." Journal of Finance 45, 643-654.
Schrand, C. M. 1997. "The association between stock-price interest rate sensitivity and disclosures about derivatives instruments." The Accounting Review 72, 87-109.
Schrand, C. M. and H. Unal. 1998. "Hedging and coordinated risk-management: Evidence from thrift conversions." Journal of Finance 53, 979-1014.
Shapiro, A. C. and S. Titman. 1985. "An integrated approach to corporate risk-management." Midland Corporate Finance Journal 3, 41-56.
Sinkey, J. F., Jr. and D. Carter. 1994. "The derivatives activities of U.S. commercial banks," in The declining role of banking, Papers and proceedings of the 30th annual conference on bank structure and regulation, Federal Reserve Board of Chicago, pp. 165-185.
Smith, C. W., Jr. 1995. "Corporate risk management: Theory and practice." Journal of Derivatives 2, 21-30.
Smith, C. W., Jr. and R. M. Stulz. 1985. "The determinants of firms' hedging policies." Journal of Financial and Quantitative Analysis 20, 391-405.
Smith, C. W., Jr. and R. L. Watts. 1992. "The investment opportunity set and corporate financing, dividend, and compensation policies." Journal of Financial Economics 32, 263-292.
Spong, K. R. and R. J. Sullivan. 1998. "How does ownership structure and manager wealth influence risk?" Financial Industry Perspectives, Federal Reserve Bank of Kansas City, pp. 15-40.
Stein, J. C. 1998. "An adverse-selection model of bank asset and liability management with implications for the transmission of monetary policy." RAND Journal of Economics 29, 466-486.
Stulz, R. M. 1984. "Optimal hedging policies." Journal of Financial and Quantitative Analysis 19, 127-140.
Stulz, R. M. 1990. "Managerial discretion and optimal financing policies." Journal of Financial Economics 26, 3-27.
Stulz, R. M. 1996. "Rethinking risk-management." Journal of Applied Corporate Finance 9, 8-24.
Tufano, P. 1996. "Who manages risk? An empirical examination of risk-management practices in the gold mining industry." Journal of Finance 51, 1097-1137.
Tufano, P. 1996b. "How financial engineering can advance corporate strategy." Harvard Business Review 74, 136-140.
Tufano, P. 1998a. "The determinants of stock price exposure: financial engineering and the gold mining industry." Journal of Finance 53, 1015-1052.
Tufano, P. 1998b. "Agency costs of corporate risk-management." Financial Management 27, 67-77.
Unal, H. and E. J. Kane. 1988. "Two approaches to assessing the interest rate sensitivity of deposit institution equity returns." Research in Finance 7, 113-137.
Wall, L. D. and J. J. Pringle. 1989. "Alternative explanations of interest rate swaps: A theoretical and empirical analysis." Financial Management 18, 59-73.
Wolf, R. 2000. "Is there a place for balance-sheet ratios in the management of interest rate risk?" Bank Accounting and Finance 14(1), 21-26.
Chapter 45
An Empirical Investigation of the Rationales for Integrated Risk-Management Behavior Michael S. Pagano
Abstract We develop a comprehensive empirical specification that treats risk-management and risk-taking as integrated facets of a financial intermediary's risk profile. Three main results emerge from a sample of 518 U.S. bank holding companies during 1991-2000: (1) The corporate risk-management theories most consistently supported are those related to financial distress costs and debtholder-related agency costs (with weaker support for the rationales related to managerial contracting costs, firm size, and hedge substitutes); (2) the asymmetric information theory for managing risk is not supported by our sample; (3) a conventional linear model of risk-management adequately explains cross-sectional and time-series variation in the sample. The model's findings are robust to alternate definitions of the independent variables, major changes in bank regulation, firm-specific fixed effects, nonlinearities and interactions between the independent variables, as well as firm-specific controls for other key risks related to credit quality and operating efficiency.

Keywords Corporate finance · Risk-management · Banks · Empirical analysis
M.S. Pagano
Villanova School of Business, Villanova University, 800 Lancaster Avenue, Villanova, PA 19382, USA
e-mail: [email protected]

45.1 Introduction

The importance of risk-management and risk-taking behavior and its impact on firm value has increased greatly over the past decade as new derivative securities have proliferated and gained acceptance with financial institutions such as commercial banks. In fact, Saunders (1997) suggests that risk-management is the primary business of financial institutions. This paper reviews the rationales for corporate risk-taking and bank risk-management activities and develops several testable hypotheses from these theories in an integrated empirical framework. This framework synthesizes
key theoretical work in both the banking and corporate finance literatures. The tests are conducted on a sample of 518 publicly traded bank holding companies (BHC) in the U.S. during 1991-2000. Although there is a growing list of empirical papers attempting to test the various risk-management theories described below, no study has developed an integrated, comprehensive test for financial service companies such as commercial banks. We view the banking firm as a collection of risks that must be managed carefully in order to maximize shareholder wealth.1 Thus, the bank managers' key risk-management function is to decide which risks to accept, increase, transfer, or avoid. In this context, risk-taking and risk-management are two sides of the same risk-return tradeoff related to investing in risky assets. That is, the literature on bank risk-taking focuses on the risks banks choose to accept or increase while ignoring the possibility that these actions also imply the transference or avoidance of other bank-related risks. Likewise, the literature on corporate risk-management typically focuses on the risks a firm chooses to transfer or avoid and ignores the implicit effects of such choices on the risks that are retained or increased. Our paper addresses these two perspectives holistically with a single, comprehensive empirical structure that treats risk-management and risk-taking as integrated facets of a bank's risk profile.2 The theoretical hedging literature of the past 20 years has demonstrated how widely held large corporations may hedge in order to: (1) reduce the expected costs of financial
1 There is some discussion in the literature regarding hedging cash flows versus hedging market values of existing and anticipated risk exposures. Our view is that hedging particular exposures of either cash flows or market values will both lead to meaningful impacts on shareholder wealth. Thus, we focus on risk-management's effects on shareholder value and do not make distinctions between cash flow- and market value-hedging since both forms of risk-management can equally affect total shareholder wealth.
2 See Pagano (1999, 2001) for reviews of the theoretical and empirical literature related to risk-management and risk-taking activities of U.S. commercial banks. Also, see Stulz (1996) and Graham and Rogers (2002) for a discussion of the concepts of a firm's net risk exposures, integrated risk-management, and their effects on shareholder value.
distress, (2) decrease other costs such as those associated with debt-related agency problems and asymmetric information, and (3) increase the expected level of after-tax cash flows (e.g., via a reduction in taxes as well as through lower transaction and contracting costs).3 The key insight of this literature is that hedging at the corporate level (rather than by individual investors) can increase the firm's market value by reducing the expected costs associated with the market imperfections noted above. Thus, corporate hedging activity can affect the level and riskiness of the firm's market value and, in particular, the market value of the firm's equity. In this context, we define "risk-management" broadly as any activity under management's control (both on- and off-balance sheet) that alters the expected return and volatility of the firm's market value of equity.4 We use the Flannery and James (1984) two-factor return-generating model as the theoretical and empirical foundation for our analysis. As is well known, a model's parameter estimates can be biased when important explanatory variables are omitted from the specification. This can create inference problems when, as in our case, there are multiple potentially competing theories. Our tests attempt to avoid this omitted variables problem by incorporating proxy variables for all extant theories of corporate hedging. This approach also leads to an increase in the explanatory power of the model when compared with conventional tests found in the literature. In addition to the use of a comprehensive set of explanatory variables, this study explores how well corporate risk-management rationales can explain risk-taking and hedging activities in the commercial banking industry rather than in non-financial companies. Given the importance of risk-taking and risk-management in banking, this paper sheds light on which hedging rationales and interest rate risk
3 For reviews of the effects of these market imperfections on hedging, see Cummins et al. (1998), Tufano (1996), Culp et al. (1994), Smith and Stulz (1985), and Shapiro and Titman (1985).
4 For example, one can assume that two similar banks (A and B) both face an increase in demand for fixed rate loans and that both meet this demand by supplying these loans yet they choose to manage the resulting increase in interest rate risk exposure differently. Suppose that bank A immediately sells the fixed rate loans in the secondary market while bank B holds the loans on its balance sheet and then uses derivatives to completely hedge the interest rate risk. The sensitivity to interest rate risk in economic terms is essentially the same for both banks, although they have chosen different ways to manage this risk. In our model, we focus on the economic significance of cross-sectional and time-varying differences in the interest rate risk (and total risk) exposures of U.S. commercial banks and we do not make distinctions between whether the bank used on- or off-balance sheet transactions to manage these exposures. In this sense, our view of risk-management is broader than what is typically defined as hedging in the existing literature (i.e., the explicit use of derivatives). See Stulz (1996) and Meulbroek (2002) for a detailed discussion of a broader definition of risk-management, which is consistent with our perspective.
exposures are motivating banks' decisions towards managing total risk. We examine interest rate risk as well as total bank risk because interest rate risk is a risk that most banks can easily hedge if they so desire. Thus, of the many risks a bank faces, one can claim that interest rate risk is one over which senior management has the most discretion, and it thus provides an interesting area in which to test various corporate risk-management theories. Further, our tests examine the possibility of inter-relations and nonlinearities between various explanatory variables such as credit, operating, and liquidity risks by specifying a quadratic model and then comparing the results of this specification with those of a conventional linear model. We find that a simple linear model is sufficient for explaining the cross-sectional and time-series variation of the sensitivities of bank stock returns to an overall risk measure (i.e., the standard deviation of monthly stock returns) as well as an interest rate risk measure (defined as the interest rate parameter from a Flannery and James-type regression). Thus, nonlinearities and interactions between the model's explanatory variables do not provide a significantly more descriptive picture of the key variables affecting bank risk-taking and risk-management decisions in our sample. The empirical tests presented here extend the earlier work on bank risk-taking behavior of Saunders et al. (1990), Gorton and Rosen (1995), Schrand and Unal (1998), and Spong and Sullivan (1998) as well as recent research on general corporate risk-management activities by Graham and Rogers (2002), Tufano (1996, 1998a), Berkman and Bradbury (1996), and Mian (1996), among others. Empirical research based on samples of non-financial companies has found mixed support for the above rationales. The lack of a strong consensus within this line of research appears to be due to difficulties in defining an appropriate dependent variable and specifying a comprehensive set of explanatory variables.5 Tests in the banking literature have typically focused on a particular theory (or subset of theories in banking) and attempt to explain risk-taking rather than risk-management behavior. For example, Saunders et al. (1990) focus on a
5 For example, Nance et al. (1993), Mian (1996) and Wall and Pringle (1989) employ binary dependent variables and lack a consistent test of all competing hedging theories. The use of a binary variable is problematic because it does not fully describe the extent of a firm’s hedging activity. When a binary dependent variable is employed, a firm that hedges 1 or 100% of its risk exposure is treated the same in the model. Tufano (1996) attempts to test most (but not all) of the major hedging rationales and uses a continuous variable – an estimated hedge ratio. However, his sample covers a limited number of companies in a relatively small sector of the economy, the North American gold mining industry. Tufano (1996, 1998a) reports empirical results for 48 publicly traded gold mining firms in North America during 1991–1993. Graham and Rogers (2002) focuses on a cross-section of non-financial companies and finds that companies typically hedge to increase the tax benefits related to increased debt capacity and, to a lesser extent, to reduce financial distress costs and take advantage of economies of scale.
bank’s ownership structure and its impact on bank riskiness (measured in several different ways using bank stock returns). This approach directly tests the contracting cost theory of hedging (described later in Sect. 45.2). However, the test does not directly address several other related, or competing, hedging theories. Schrand and Unal (1998) is another example of this focused approach since its primary emphasis is on the risk-taking behavior of thrifts with respect to credit and interest rate risk. The paper finds thrifts substitute “compensated” risks (e.g., a risk that earn a positive economic rent) for “hedgeable” risks. However, the paper does not explore the impact of all theoretically relevant factors influencing the thrift’s choice between credit and interest rate risk.6 Empirical tests of our generalized model of interest rate risk-management behavior yield several insights. When we examine both the total risk and interest rate risk exposures of a bank holding company, we find strong support for hedging theories related to financial distress costs (i.e., banks facing higher financial distress costs are more likely to hedge their interest rate risk and yet face greater overall risk). We also find support for the effects of debtholder-related agency costs (with weaker support for the managerial contracting costs, firm size, and hedge substitutes rationales) while the asymmetric information rationale for managing risk is not supported by our sample. Our estimated effect of financial distress on bank risk-management activity during 1991–2000 is consistent with evidence found in Haushalter (2000) for 100 oil and gas producers during 1992–1994 and Graham and Rogers’ (2002) results for 158 non-financial companies during 1994–1995. Further, banks facing greater potential debtholder-related agency conflicts (e.g., due to high financial leverage) are also more likely to have smaller interest rate risk exposures. Note that the financial distress costs and agency problems a bank faces can be inter-related and, interestingly, we find strong empirical support for each of these theories even when such inter-relationships (as well as nonlinearities) are explicitly modeled. In addition, Lin and Smith (2007) study nonfinancial companies during 1992–1996 and find significant inter-relationships between hedging, financing, and investment decisions. This result is consistent with our findings because the underlying factors affecting corporate hedging, financing, and investment are ultimately forces such as financial distress costs, agency problems, asymmetric information, and so forth. Overall, our findings are robust to alternate definitions of the independent variables, major changes in bank regulation, firm-specific fixed effects, nonlinearities, 6
6 Additional examples in the banking literature of these focused empirical tests include Galloway et al. (1997) and Cebenoyan et al. (1999) concerning the effect of bank regulation on risk-taking as well as Gorton and Rosen (1995) on the potential agency and contracting problems related to bank riskiness.
and interactions between the independent variables, as well as firm-specific controls for other key risks related to credit quality and operating efficiency. The paper is organized as follows. Section 45.2 briefly reviews the relevant theoretical and empirical literature in the areas of corporate finance and banking and presents several testable hypotheses. Section 45.3 describes the sample data, methodology, and proposed tests while Sect. 45.4 discusses the empirical results of these tests. Section 45.5 provides some concluding remarks.
45.2 Theories of Risk-Management, Previous Research, and Testable Hypotheses

45.2.1 Brief Review of Main Theories of Corporate Hedging

Until the early 1980s, the implications from theories such as the capital asset pricing model (CAPM) and Modigliani and Miller's (1958) Proposition I seriously challenged the usefulness of the early theories of hedging. However, recent research has argued that imperfect markets can explain why large, diversified firms actively engage in hedging activities. Large corporations will hedge in order to reduce the variance of cash flow for the following primary reasons: (1) to reduce the expected costs of financial distress or, more generally, the costs of external financing (e.g., see Smith and Stulz 1985; Shapiro and Titman 1985; Froot et al. 1993; and Copeland and Copeland 1999); (2) to decrease other costs such as those associated with agency problems and asymmetric information (e.g., Stulz 1990; Campbell and Kracaw 1990; and DeMarzo and Duffie 1995); and (3) to increase the expected level of after-tax cash flows via a reduction in taxes as well as lower transaction and contracting costs (e.g., Stulz 1984; Smith and Stulz 1985; Shapiro and Titman 1985; Nance et al. 1993). In addition, the corporate hedging decision will be influenced by the availability of "hedging substitutes" such as large cash balances. By relaxing the assumption of perfect capital markets, corporate finance theory has suggested several ways large, widely held corporations can increase their market value by engaging in risk-management activities.7

7 Note that a related strand of the accounting literature has, rather than using shareholder wealth maximization as the firm's objective function, focused on analyzing hedging decisions based on managing accounting earnings and book value-based regulatory capital requirements (e.g., see Wolf 2000). Instead, we follow the logic of developing tests where the effect of a bank's net interest rate risk exposure is measured in terms of its impact on the firm's stock returns. This is a well-accepted, objective approach that is commonly used in the corporate risk-management literature and therefore we readily adopt it here.
45.2.2 Review of Related Banking Theory

In addition to the empirical model of Flannery and James (1984) noted in the Introduction (and described in more detail later in the following subsection), another aspect of the banking literature relevant to our analysis pertains to the theoretical results of Diamond (1984) concerning financial intermediation. In this paper, financial intermediaries can capture the value associated with resolving information asymmetries between borrowers and lenders. In addition, this view of a financial intermediary's delegated monitoring function implies the intermediary should hedge all systematic risk (e.g., risk associated with interest rate fluctuations) to minimize financial distress costs. Therefore, in theory, banks should not have a material exposure to systematic, hedgeable factors such as interest rate risk. Diamond's results imply the optimal "hedge ratio" for a financial intermediary's systematic risk such as interest rate risk is 100%. We can use this result to develop an empirical benchmark for risk-management activity by a BHC. For example, according to Diamond's line of reasoning, a BHC's return on its investment should have zero sensitivity to interest fluctuations if it is hedging optimally. In Diamond's model, the bank's liabilities can therefore be made asymptotically risk-free by combining the use of hedging and diversification. Thus, statistically significant non-zero sensitivities to interest rate movements would suggest less hedging compared to firms with zero sensitivities (after controlling for other relevant factors). It should be noted, however, that Froot and Stein (1998) argue against the optimality of a 100% hedge ratio when some risks are not tradable in the capital markets. Their theory of integrated risk-management suggests that non-zero risk exposures to market-wide factors such as interest rates can be optimal when the only means of hedging "non-tradable" risks is via holding costly bank capital and the bank faces convex external financing costs (i.e., financial distress costs are economically significant). In addition, Stein (1998) demonstrates that adverse selection problems can significantly affect a bank's asset/liability management activities when there are sizable information asymmetries associated with the quality of a bank's assets. Interestingly, these ideas from the banking literature are consistent with the financial distress and asymmetric information rationales of risk-management found in the corporate finance literature. Since a bank's maturity intermediation decision is a classic example of Niehans' (1978) definition of "qualitative asset transformation" by a financial intermediary, Deshmukh
et al. (1983) provides a theory that links interest rates to key financial intermediation services, which, in turn, directly affect the cash flows and stock returns of a commercial bank. For example, Deshmukh et al. (1983) demonstrates that the level and uncertainty of long-term interest rates relative to short-term rates is an important determinant of the amount of interest rate risk a financial intermediary will assume. Consequently, the above banking literature provides theoretical support for our use of a long-term U.S. Treasury interest rate as the key interest factor in our empirical estimation of the two-factor Flannery and James-type model of bank stock returns presented in the following subsection.8 In addition to the above issues, banking research such as Saunders et al. (1990) and Brewer et al. (1996) reveals banks face multiple risks associated with interest rates, credit quality, liquidity, and operating expenses. Further, it is possible these four key risks are inter-related and therefore need to be controlled for in our model. This realization, along with the results found in Flannery and James (1984) and Diamond (1984), provides us with a theoretical framework to construct a conventional linear model as well as a full quadratic empirical specification that includes linear, squared, and interaction terms for the relevant explanatory variables. This quadratic model approach can then be compared to the conventional linear specification in order to verify empirically whether or not these nonlinearities and interaction terms are relevant in a statistical sense.
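To make the comparison concrete, the full quadratic specification simply augments the linear regressors with their squares and pairwise interactions before estimation. A minimal Python sketch (the variable handling is illustrative, not the author's code):

    import pandas as pd

    def quadratic_design(X: pd.DataFrame) -> pd.DataFrame:
        """Augment linear regressors (e.g., credit, operating, and
        liquidity risk proxies) with squared and interaction terms."""
        out = X.copy()
        cols = list(X.columns)
        for i, a in enumerate(cols):
            out[a + "^2"] = X[a] ** 2              # squared term
            for b in cols[i + 1:]:
                out[a + "*" + b] = X[a] * X[b]     # pairwise interaction
        return out

Fitting both the linear and the expanded design by OLS and then comparing adjusted R2 values (or contrasting the shared coefficients) gives the linear-versus-quadratic verdict discussed in the empirical results.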
45.2.3 Relation to Previous Empirical Research

Recent empirical papers have attempted to test the theories presented above. The factors displayed below in Equation (45.1) are based on those forces that researchers in this area have most frequently cited as key influences on corporate hedging. The decision to hedge or not hedge with derivatives is a function of the six corporate finance-related rationales for hedging (i.e., taxes, t(V), financial distress costs, BC, size-related and managerial contracting costs, CC, debt-related agency costs, AC, asymmetric information problems, AI, and alternative financial policies such as hedge substitutes, AFP).9
8 As we will discuss later, we also test a short-term interest rate, the three-month U.S. T-bill rate, and find that the long-term rate is statistically more significant than this short-term rate.
9 Hedge substitutes in our case can include bank-related on-balance sheet activities such as altering the firm's asset and liability mix in order to reduce risk. Compared to conventional manufacturing and service companies, commercial banks have greater flexibility in employing these on-balance sheet techniques.
The above relationships have been expressed in the format of a stylized function in empirical research as follows:

\[
\text{Hedge} = f\left(t(V),\ BC,\ CC,\ AC,\ AI,\ AFP\right)
\tag{45.1}
\]
where Hedge = either a continuous variable such as an estimated "hedge ratio" or a binary variable, which takes on a value of 1 if the firm hedges with derivatives and 0 if the firm does not hedge with these instruments.

Examples of empirical tests based on Equation (45.1) can be found in Nance et al. (1993), Mian (1996), Wall and Pringle (1989), Berkman and Bradbury (1996), Tufano (1996), Howton and Perfect (1998), Haushalter (2000), and Allayannis and Weston (2001). The authors find mixed support for several hedging rationales primarily related to taxes, financial distress costs, managerial contracting, debt-related agency costs, and hedge substitutes.10

Schrand and Unal (1998) take a different approach and examine the role of risk-management in terms of allocating rather than reducing risk. The authors suggest thrift institutions substitute risks that can earn a positive economic rent (e.g., credit risk) for "hedgeable" risks such as interest rate risk. They report a statistically significant shift from interest rate to credit risk in a sample of thrifts that converted from mutual to stock charters during 1984-1988. Cebenoyan et al. (1999) also examine thrift risk-taking and find that unprofitable risk-taking activities increase during periods of relatively lax regulation and low charter values. This result is similar to the findings reported in Galloway et al. (1997) for commercial banks.

Another strand of the empirical literature on risk-management pertains to surveys of corporate treasury departments regarding derivatives usage.11 As summarized in Bodnar et al. (1996), the Wharton survey results suggest there are substantial participation or "set-up" costs related to derivatives usage and hedging that may not be economically justifiable for many firms. These findings are consistent with the contracting cost theory of hedging related to potential economies of scale available in a risk-management program.
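Many of the studies just cited estimate Equation (45.1) as a probit or logit of a hedge/no-hedge indicator on proxies for the six rationales. The following is a minimal, self-contained Python sketch of that style of test on synthetic data; every column name and coefficient here is hypothetical and is not taken from the paper or any of the cited studies.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 500
    # Hypothetical proxies for the six rationales in Equation (45.1):
    # taxes t(V), distress costs BC, contracting costs CC, agency costs AC,
    # asymmetric information AI, and hedge substitutes AFP.
    df = pd.DataFrame(rng.normal(size=(n, 6)),
                      columns=["tV", "BC", "CC", "AC", "AI", "AFP"])
    # Synthetic hedge/no-hedge indicator, for illustration only.
    latent = 0.5 * df["BC"] + 0.3 * df["AC"] + rng.normal(size=n)
    df["hedge"] = (latent > 0).astype(int)

    X = sm.add_constant(df[["tV", "BC", "CC", "AC", "AI", "AFP"]])
    fit = sm.Probit(df["hedge"], X).fit(disp=0)
    print(fit.summary())

As footnote 5 points out, a binary indicator discards information about the extent of hedging, which is one reason the present paper works with a continuous exposure measure instead.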
10 It should be noted that this strand of the literature treats the variables on the right-hand-side of Equation (45.1) as exogenous and independent of one another when, in reality, most, if not all, of these factors may be inter-dependent. As will be seen in Sect. 45.3.5, our tests differ from the extant literature by allowing for potential interactions between the right-hand-side variables found in Equation (45.1).
11 Recent empirical studies by Geczy et al. (1997) and Schrand (1997) have also provided more formalized tests of the factors describing corporate derivatives usage. Since these papers focus exclusively on derivatives and do not directly investigate the hedging theories described in this paper, we do not summarize their findings here.
Based on the above results and the findings reported in Saunders et al. (1990), we can use the standard deviation of the BHC's monthly stock returns as a measure of total risk (denoted here as TOTRISK). That is, if the firm's goal is to maximize shareholder value, then a bank's risk-taking and risk-management activities should ultimately influence the volatility of its common equity, which we have defined as TOTRISK. In addition, we posit a formula for determining a financial institution's sensitivity to interest rate risk via a Flannery and James-type model. In this way, we can examine how corporate risk-management factors influence both the bank's overall riskiness as well as its management of a key market-based risk (i.e., interest rate risk). The Flannery and James (1984) model is a good choice for estimating this latter risk exposure because it explicitly accounts for the effect of interest rate risk on the firm's stock returns. The relation incorporates risk premiums associated with the bank's exposure to market-wide risk factors such as the return on a market portfolio proxy and the relative change in a long-term interest rate. This relation can be summarized as follows:

\[
r_{N,t} = \beta_{0,N} + \beta_{M,N}\, r_{m,t} + \beta_{I,N}\, ir_t + \varepsilon_{N,t}
\tag{45.2}
\]

where
r_{N,t} = stock return on the N-th BHC during month-t,
r_{m,t} = total return on a value-weighted CRSP stock market index (S&P 500) during month-t,
ir_t = the change in the yield on the constant maturity 10-year U.S. Treasury note during month-t scaled by 1 plus the prior month's 10-year Treasury note yield,12
ε_{N,t} = a disturbance term with assumed zero mean and normal distribution,
12 The interest rate factor was computed in two ways: (1) simply as it is defined above using the ten-year interest rate (i.e., not orthogonalized) and (2) the orthogonalized interest rate, where (per Unal and Kane 1988) the interest rate factor is regressed on the equity market portfolio proxy and the resulting residuals from this regression are used as the second factor in Equation (45.2) rather than the simple non-orthogonalized version of the interest rate factor. We find that the empirical results are qualitatively the same whether the interest rate factor is orthogonalized or non-orthogonalized. Using Occam's razor, we thus choose to report the simplest estimation method in our Empirical Results section (i.e., we report our results based on the non-orthogonalized interest rate factor since they are simpler to compute and yield the same main results as when the more complicated orthogonalization procedure is employed). The results are essentially the same when either procedure is used because the simple correlation between the market portfolio proxy and the interest rate factor is quite low and thus the two factors in Equation (45.2) are essentially already orthogonalized before even applying an orthogonalization technique. Also, we find that using the simple changes in the three-month T-bill rate provides statistically weaker but qualitatively similar results as those obtained for the ten-year interest rate factor. Thus, we focus our subsequent discussion on the empirical results related to the ten-year Treasury note rather than results based on the three-month T-bill rate.
β_{k,N} = parameter estimates of the model for the N-th BHC. In particular, β_{I,N} is in principle the "interest rate beta" used originally by Flannery and James (1984) to estimate a BHC's interest rate sensitivity.

With the estimates of β_{k,N} generated for each BHC during each year,13 we can then use these estimates as the dependent variable in a second-stage set of regressions. Equation (45.2) states that the required stock return for a BHC should be related to the investment's riskiness relative to the market portfolio (i.e., the first term on the right-hand side of the equation) and the bank's existing net interest rate risk exposure (the second term on the right-hand side of the equation). If β_{I,N} = 0, then we can interpret Equation (45.2) as a conventional one-factor market model. Note that the β_{I,N} parameter is an estimate of a bank's interest rate risk net of any interest rate hedging activities the bank has engaged in.14 However, Froot and Stein (1998), among others, argue that β_{I,N} will not necessarily be zero due to the presence of financial distress costs and other market imperfections. Equation (45.2) provides us with a theoretical framework to test the various incentives for risk-management via its straightforward decomposition of risk into equity market-related and interest-related risks. For our tests, we can estimate the relative risk parameters for the interest rate risk factor in Equation (45.2) for each bank and each year using monthly or daily data and then use the absolute value of these firm-specific parameter estimates in a panel data set to analyze interest rate risk-taking behavior at U.S. BHCs.15 Similar to Graham and Rogers' (2002) usage of absolute values of net derivatives positions, we use the absolute value of β_{I,N} as the dependent variable in our second-stage set of regressions because our review of the relevant literature suggests that banks with higher interest rate risk (and lower hedging activity, ceteris paribus) have interest rate betas that are either
13 This is done to allow for the interest rate beta to vary over time as well as cross-sectionally. Note that our model is similar to Graham and Rogers' (2002) approach for estimating interest rate risk except we use stock returns as the dependent variable (versus operating income) because we are interested in studying risk-management's effects on shareholder value rather than rely on indirect measures of market value such as accounting variables drawn from the firm's income statement.
14 If equity markets are informationally efficient, then we can interpret this interest rate beta (referred to here as IBETA in its raw form and IRR in its absolute value form) as a summary measure of the sensitivity of the bank's market value of common equity to changes in interest rates. This measure of interest rate risk is consistent with the corporate risk-management definitions of interest rate risk, hedging, and its net impact on a firm's market value of equity.
15 We find that our empirical beta estimates are similar when either monthly or daily data are used, albeit the estimates with the monthly data provide more explanatory power in terms of adjusted R2 statistics. Thus, to conserve space, we report our results from estimates computed using monthly return data.
large positive or negative values (depending on whether the bank has chosen to be exposed to rising or falling interest rates). The absolute value of β_{I,N} (referred to as IRR in the following sections) captures these divergent interest rate risk exposures by reporting large values of |β_{I,N}| for banks with large positive or negative exposures, while firms with smaller exposures have values of |β_{I,N}| closer to zero. Note that we do not use the notional values of derivatives positions as a dependent variable in our model because, as Smith (1995) shows, two firms with the same notional value can have completely different hedging strategies. In order to specify the details of a second-stage regression that uses IRR as its dependent variable, we must first define our hypotheses related to this regression based on the current corporate hedging and banking literature.16 The following subsection presents nine hypotheses related to this literature.
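A stripped-down sketch of this two-stage procedure appears below. The function and column names are placeholders, the Newey-West lag length is an arbitrary assumption, and the second stage is shown as plain OLS with HAC standard errors, whereas the paper's actual estimation also applies a GMM errors-in-variables correction (see footnote 16).

    import pandas as pd
    import statsmodels.api as sm

    def first_stage(bank_returns, market_returns, yield_10y):
        """Stage 1 (Eq. 45.2): two-factor regression for one bank-year of
        monthly data; returns the IRR and TOTRISK proxies."""
        # Change in the 10-year yield scaled by 1 + the prior month's yield.
        ir = yield_10y.diff() / (1.0 + yield_10y.shift(1))
        X = sm.add_constant(
            pd.DataFrame({"rm": market_returns, "ir": ir}).dropna())
        y = bank_returns.loc[X.index]
        beta_i = sm.OLS(y, X).fit().params["ir"]   # interest rate beta
        return abs(beta_i), y.std()  # IRR = |beta_I|; TOTRISK = sd of returns

    def second_stage(panel, regressors):
        """Stage 2 (Eq. 45.3): cross-sectional regression of IRR on the
        risk-management proxies with Newey-West (HAC) standard errors."""
        X = sm.add_constant(panel[regressors])
        return sm.OLS(panel["IRR"], X).fit(cov_type="HAC",
                                           cov_kwds={"maxlags": 12})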
45.2.4 Hypotheses

Our review of the corporate hedging and banking literature provides us with an important set of implications about bank risk-taking and risk-management behavior that we now formalize by establishing a set of empirically testable hypotheses. Following each hypothesis is a brief description of the primary empirical variables employed in the analysis. The positive or negative sign in parentheses following each variable denotes the variable's expected relationship with the BHC's relative interest rate risk measure, IRR. A more detailed description of the variables used in our analysis is also contained within Table 45.1.17 In addition, a discussion of TOTRISK's relation to our model's independent variables (along with relevant predicted signs) is provided below only when the variables' effects are different than the ones discussed for the IRR variable.18

H1. Tax hypothesis: A BHC will hedge more (and therefore assume less interest rate risk, ceteris paribus) when it faces a convex, or more progressive, tax schedule since
¹⁶ To mitigate the potential downward bias in parameter results caused by using an estimated interest rate risk parameter as a dependent variable, we use a generalized method of moments (GMM) estimator. This nonlinear instrumental variables estimation technique explicitly accounts for the errors-in-variables problem that can exist when the dependent variable is estimated via a first-stage set of regressions (see Greene, 1993, and Ferson and Harvey, 1991, for more details on this econometric issue).
¹⁷ Table 45.1 also contains a description of alternate explanatory variables that are used for robustness testing in a specification referred to as our “Alternative Model” of interest rate risk. This alternative specification is described in detail in the Empirical Results subsection.
¹⁸ Note that only hypotheses H2, H7, and H9 presented below might exhibit divergent effects for TOTRISK relative to IRR.
greater convexity implies larger potential reductions in expected tax liabilities. A lower marginal tax rate or little reported net income would imply greater tax convexity.
Since higher tax rates are related to a less convex tax schedule, a higher level of TAX corresponds to a lower incentive to hedge and more interest rate risk, ceteris paribus.
Table 45.1 Description of potential factors affecting risk-management decisions

Panel A. Summary of explanatory variables and their definitions

Variable name | Definition | Data sources(a)

Tax variables
TAX | Annual average effective tax rate | Y9-C
TAXDUM | Dummy variable equal to 1 when the firm’s net income was above $100,000 | Y9-C

Financial distress variables
PBANK | Variance of quarterly net income during the relevant year / (beginning Equity Capital + Mean of Quarterly Net Income during the relevant year)² | Y9-C
PBANKDUM | Dummy variable equal to 1 when the firm’s value for PBANK is greater than 0.01 (i.e., greater than 1%)(b) | Y9-C

Contracting cost variables
OPTGRANT | Number of common shares underlying stock options granted to senior officers / Total number of shares outstanding | Proxy Statements/SNL Securities/Compact Disclosure
PBONUS | Percentage of senior officers’ total cash compensation that is received in the form of bonuses | Proxy Statements/SNL Securities/Compact Disclosure
MGMTOWN | Number of common shares held by officers and directors / Total number of shares outstanding | Proxy Statements/SNL Securities/Compact Disclosure
MGMTDUM | Dummy variable equal to 1 when MGMTOWN is greater than the sample mean of 16% | Proxy Statements/SNL Securities/Compact Disclosure
INST | Percentage of shares outstanding held by un-affiliated institutional investors | SNL Securities/Compact Disclosure
EQBLOCK | Percentage of common shares owned by large blockholders (i.e., external investors with ownership > 5%) | Proxy Statements/SNL Securities/Compact Disclosure
SIZE | Book Value of Total Assets (TA) | Y9-C
REVENUE | Total annual interest and non-interest income earned | Y9-C

Agency cost variables
LEVERAGE | TA / Market Value (MV) of Equity Capital | CRSP; Y9-C
LEVDUM | Dummy variable equal to 1 when TA / Book Value of Equity Capital is greater than the sample mean of 12.3(c) | Y9-C

Asymmetric information
ASYMINFO | Number of security analysts publishing annual earnings forecasts for the BHC | Compact Disclosure

Hedging substitutes
LIQUIDITY | (Cash + Investment Securities) / MV of Equity Capital | Y9-C; CRSP
CASH/BVE | Cash / BV of Equity Capital(d) | Y9-C

Market-related variables
IR_t | Change in the end-of-month yield of the 10-year U.S. Treasury note scaled by 1.0 + the 10-year T-note’s yield lagged 1 month | Federal Reserve
R_{M,t} | Monthly total return on the CRSP value-weighted stock index | CRSP

Control and miscellaneous variables
OPERISK | Non-interest Operating Expense / Total Revenue | Y9-C
BADLOANS | (Non-Performing Loans + Loan Chargeoffs) / MV of Equity Capital | Y9-C; CRSP
NONPERF/TA | (Non-Performing Loans + Loan Chargeoffs) / TA | Y9-C

Panel B. Summary of dependent variables and their definitions

Variable name | Definition | Data sources(a)
R_{N,t} | Monthly total return on the N-th BHC’s common stock | CRSP
IRR_{N,y} | Absolute value of the interest rate risk parameter estimated via a 2-factor model for the N-th BHC during the y-th year | CRSP

Panel A contains definitions of the explanatory variables employed in the empirical analysis for our Primary and Alternative Models. The panel also reports the data sources used to obtain the relevant data. Panel B provides the definitions and data sources for the empirical model’s dependent variables.
(a) Data sources: Y9-C Call Report: detailed financial statement data reported to the relevant bank regulatory authorities on a quarterly basis. CRSP: the Center for Research in Securities Prices database of stock returns. Compact Disclosure: a database containing financial statement, management compensation, and equity ownership information. SNL Securities: compensation data from a publication titled SNL Executive Compensation Review (years 1991–2000).
(b) We also estimated the Alternative Model using a threshold value for PBANKDUM equal to the sample mean of 0.0053 and found essentially the same results as those reported in Table 45.3. We report our results using 0.01 as the threshold value for PBANKDUM because it provides a slightly better fit than the sample mean value of 0.0053 and also because 0.01 is a relatively simple, intuitive breakpoint that bank regulators would probably be interested in using for their monitoring activities.
(c) Note that we use the book value of equity capital (rather than the market value) in order to develop a dummy variable that is not too closely related to the LEVERAGE variable. However, we find that using a dummy variable that is based on LEVERAGE provides results that are consistent with those reported here using our book value-based LEVDUM variable.
(d) Note that this is a more narrowly defined liquidity measure that is based solely on book values.
Consequently, we expect a positive relation to exist between TAX and a bank’s interest rate risk exposure.
TAX (+): This tax influence can be tested via the average effective tax rate of the BHC. This variable captures the convexity of the BHC’s tax schedule because a very low effective tax rate suggests the firm is facing the most convex portion of the U.S. tax code.¹⁹

H2. Financial distress costs hypothesis: According to the hedging theories noted earlier, a BHC will hedge more when its costs related to financial distress are expected to be high in order to lower this expected cost as much as possible.
PBANK (−): Since it is not feasible to observe a BHC’s expected financial distress costs directly, we use a variable that proxies for one key component of this expected cost: an estimate of the likelihood of experiencing financial distress. We expect this proxy variable to be highly, and positively, correlated with the bank’s true, unobservable expected financial distress costs. An estimate of the probability
¹⁹ Including a dummy variable (denoted TAXDUM) that equals 1 when the firm’s annual net income is above $100,000 provides an alternative way to test this tax influence. This TAXDUM measure of the tax-based incentive to hedge is used as a robustness check in our “Alternative Model” and is similar in spirit to the factors used to simulate the tax-related benefits of corporate hedging in Graham and Smith (1999). This dummy variable attempts to capture the convexity of the BHC’s tax schedule because it is at this low level of net income that the U.S. tax code is most convex. Ideally, a variable measuring tax loss carryforwards and carrybacks would be useful in measuring a firm’s tax convexity. However, these data are not readily available for commercial banks.
of financial distress first suggested by Blair and Heggestad (1978), PBANK, is therefore created by employing Chebyshev’s inequality to derive an upper bound on the probability of financial distress for a commercial bank based on the mean and variance of the firm’s quarterly net income during the year and the year-end level of equity capital. The actual calculation employed is described in Table 45.1, Panel A.²⁰ Conversely, we expect PBANK to have a positive effect on TOTRISK because an increased risk of financial distress is reflected in greater stock return volatility when bankruptcy and external financing costs are convex functions of the level of outstanding debt.

H3. Contracting costs hypothesis no. 1: Managers of BHCs might have an incentive to take on more interest rate risk when they own options to buy the BHC’s stock. That is, the BHC’s riskiness should be positively correlated with the level of stock options granted to senior management, as it is well known that incurring increased risk can raise the value of these options.
OPTGRANT (+): This effect is measured by the number of shares underlying the options granted annually to senior management scaled by the total number of common shares
²⁰ Although one might view this variable as endogenous, the level and quality of the bank’s assets and the amount of equity capital on hand at the beginning of each year primarily determine these financial distress variables. Thus, we can consider the financial distress cost proxy as a predetermined variable in our empirical specification.
outstanding.²¹ A larger number of stock options implies a greater incentive to assume more risk, ceteris paribus.

H4. Contracting costs hypothesis no. 2: The relation between the level of equity owned by the company’s managers and the degree of risk taken by the firm may be negative. That is, risk-averse managers may take on less risk than is optimal at the firm level if managerial stock ownership is very high, since these risk-averse managers might not be able to adequately diversify their personal investments in the firm.²² Also, un-affiliated large institutional investors should act as influential bank monitors and, via exercising their voting rights, reduce risk-averse managers’ ability to select lower, sub-optimal levels of risk.²³ This theory predicts a positive relationship between institutional investor equity ownership levels and bank risk-taking activity, as well as a negative relation between managerial ownership and IRR.
²¹ Ideally, one would like to also see if the dollar value of these options is related to a bank’s interest rate risk. However, these data would require the application of an option-pricing model with all the related assumptions such a model requires. After 1991, most firms reported some of the relevant data but the assumptions underlying these dollar estimates are not consistent across firms. Since option grants are typically issued “at the money” (i.e., at approximately the current stock price at the time of issuance), using the number of shares granted scaled by the total number of shares outstanding provides us with an objective, standardized measure of incentive-based compensation. As reported in Table 45.1, we also use an alternative proxy variable (PBONUS), which calculates the percentage of annual cash compensation received in bonus form to provide an alternative standardized measure of the risk-taking incentives of senior management.
²² As noted in Gorton and Rosen (1995) and Spong and Sullivan (1998), among others, the relationship between the level of equity owned by the company’s managers and the degree of risk taken by the firm might be non-monotonic. That is, risk-averse managers might take on less risk than is optimal at the firm level when managerial stock ownership is very low (since there is no real incentive to take risk) or very high (because too much of the manager’s wealth is at risk). At moderate levels of ownership, however, managers may take on more risk since their personal incentives may be better aligned with those of outside shareholders. In this case, we expect an inverted U-shaped relationship between risk-taking and equity ownership similar to the one reported in Gorton and Rosen (1995), where both low and high levels of management equity ownership exhibit lower levels of risk-taking than moderate ownership levels. Empirical tests of this alternative hypothesis showed no evidence in support of this non-monotonic relationship. These results (not reported here in order to conserve space) could be due to the more detailed specification of the factors affecting risk-taking in our model.
²³ For robustness testing, we use an alternative proxy variable, EQBLOCK, which is computed as the percentage of total shares outstanding held by outside equity “blockholders” (those unaffiliated external investors that own 5% or more of the firm’s common equity). We include only the percentage of shares owned by “true” outside blockholders such as mutual fund companies and exclude any shares owned by “quasi-insiders” such as the bank’s Employee Stock Ownership Plan or relatives/associates of senior bank managers. We make a distinction between these types of groups because quasi-insiders might not be as effective monitors of the firm as true outside blockholders.
MGMTOWN (−) and INST (+): We use the percentages of shares outstanding held by directors or officers and unaffiliated institutional investors as proxies for this hypothesis (MGMTOWN and INST, respectively).

H5. Contracting costs hypothesis no. 3: The possibility of economies of scale related to risk-management activities suggests larger firms are more likely to engage in hedging than smaller firms since the marginal costs of such activities are usually lower for larger companies. This hypothesis predicts a negative relationship between firm size and the bank’s risk level.
SIZE (−): The SIZE variable (i.e., total assets of the BHC) can be employed as a proxy for the economies of scale available to a firm related to risk-management activities.

H6. Contracting costs hypothesis no. 4: The potential diversification effects of firm size may provide incentives to larger BHCs to assume greater exposures to systematic risk factors such as interest rate risk. In this case, larger firms are less likely to engage in hedging activities than smaller competitors. This alternative argument regarding firm size is based on empirical studies such as Demsetz and Strahan (1995), which have documented that larger BHCs have less idiosyncratic risk due to the positive effects of diversification on their larger loan portfolios. Thus, these larger institutions are able to take on more interest rate risk (and hedge less) if these firms wish to maintain a pre-determined level of total risk (i.e., systematic plus idiosyncratic risk). This effect would also be reinforced if “too big to fail” (TBTF) policies and mispriced deposit insurance are present within the U.S. banking system during our sample period (see hypothesis H8 below for more details on our control for TBTF effects). Consequently, there may be a positive relationship between firm size and risk.
SIZE (+): This hypothesis employs the same variable described in H5 but predicts a positive relationship between firm size and risk.

H7. Agency cost hypothesis: BHCs that face larger agency costs related to external debt will hedge more than BHCs with less conflict between their debtholders and shareholders. Thus, we would expect firms with greater leverage-related agency costs to hedge more than firms with relatively smaller agency problems. Financial leverage would therefore be negatively related to interest rate risk-taking activity.
LEVERAGE (−): A proxy variable for this debt-related agency cost is financial leverage (defined as the book value of total assets divided by the market value of common equity). Conversely, we expect a positive relation between LEVERAGE and TOTRISK because increased financial leverage creates higher financial risk and, as Hamada (1972) has shown within the CAPM framework, this increased financial risk can lead to greater stock return volatility.

H8. Asymmetric information hypothesis: A BHC that has superior management as well as a more severe degree of
asymmetric information between insiders and external investors will have a greater incentive to hedge than BHCs with less severe information asymmetries. According to DeMarzo and Duffie (1995), we expect well-run BHCs with more severe information asymmetries to hedge more and assume less risk than other BHCs, ceteris paribus, in order to generate a less-noisy signal of the firm’s “quality” or cash flow prospects.
ASYMINFO (+): The number of security analysts publishing earnings forecasts for the company in a given year is employed as a measure of the degree of asymmetric information surrounding a particular company (e.g., a higher number of analysts following a company would imply a less acute asymmetric information problem). This hypothesis predicts that less acute information asymmetries should decrease the firm’s need to hedge.²⁴

H9. Hedge substitutes hypothesis: A firm with a large amount of hedge substitutes is “hedging” against future adverse business conditions because any potential earnings shortfall can be compensated for more easily if, for example, the firm has a large amount of liquid assets relative to the BHC’s short-term liabilities (LIQUIDITY). This is effectively the same as conventional hedging via derivative securities because high liquidity reduces the chances that management will be forced to forgo future investment in order to conserve cash when earnings are unexpectedly low. A BHC that uses alternative financial policies or hedge substitutes, such as a large amount of liquid assets, will therefore have less of an incentive to hedge (and a stronger incentive to take interest rate risk). Conversely, firms with low liquidity are expected to have a greater incentive to reduce risk by hedging more.²⁵ Consequently, the bank’s liquidity level is expected to be positively related to interest rate risk.
LIQUIDITY (+): The BHC’s liquidity level (cash plus marketable securities) relative to its market value of common equity is employed as a measure of hedge substitutes and is expected to have a positive relationship with BHC riskiness. This relationship is based on the notion that a firm with a high level of liquidity will have more resources available

²⁴ It should be noted that ASYMINFO and SIZE are most likely interrelated. In our empirical tests discussed later, we accommodate this potential inter-relationship via a full quadratic model of interest rate risk. In addition, both of these variables may also be correlated with moral hazard incentives associated with “too big to fail” (TBTF) policies and mispriced deposit insurance. In effect, the inclusion of these two variables can also help control for risk-increasing incentives related to very large banks that are deemed “TBTF” by regulators.
²⁵ Financial firms with low levels of liquidity face a greater degree of liquidity risk. This risk can be defined as the BHC’s risk of not having sufficient funds on hand to meet depositors’ withdrawal needs. Typically, this risk leads to higher borrowing costs since the BHC is forced to pay a premium to obtain the additional funds on short-term notice in the federal funds market (or at the Fed’s discount window in extreme cases of liquidity needs).
in the case of an earnings shortfall or sudden drop in common equity. Thus, the firm will have an incentive to hedge less and take on more interest rate risk. Note that for overall risk (TOTRISK) the above effect might be offset by the ability of high levels of LIQUIDITY to act as a “shock absorber” against unexpected declines in cash flow and therefore could reduce overall stock return volatility.
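To make the financial distress proxy in H2 concrete, the short sketch below applies Chebyshev’s inequality exactly as defined in Table 45.1, Panel A; the quarterly figures are hypothetical, chosen only to illustrate the calculation.

```python
# Sketch of the Blair and Heggestad (1978) PBANK proxy (Table 45.1, Panel A).
# Chebyshev's inequality bounds the probability that net income falls far
# enough below its mean to exhaust equity capital:
#   P(NI <= -Equity) <= Var(NI) / (Equity + E[NI])^2
# The inputs below are hypothetical.
import numpy as np

quarterly_ni = np.array([12.0, 9.0, -4.0, 7.0])  # quarterly net income, $ mil.
beginning_equity = 150.0                         # equity at start of year, $ mil.

pbank = quarterly_ni.var() / (beginning_equity + quarterly_ni.mean()) ** 2
print(f"PBANK = {pbank:.4f}")  # PBANKDUM would equal 1 only if this exceeded 0.01
```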
45.2.5 Control Variables

As noted earlier in this section, the corporate finance and banking literature has identified some key factors other than interest rate risk that can influence the observed levels of risk-taking in the banking industry. Presented below is a description of the control variables related to asset quality, operating risk, and portfolio composition.²⁶
As Stein (1998) has noted, adverse selection problems can affect a bank’s asset/liability management choices when there are large information asymmetries related to a bank’s assets (e.g., due to poor credit quality). Therefore, the informational asymmetry engendered by the credit quality of the bank’s assets can affect a bank’s interest rate risk exposure via potential mis-matches between asset and liability maturities, and thus this effect should be controlled for in our tests. We use the ratio of nonperforming assets plus gross chargeoffs relative to the bank’s market value of equity (BADLOANS) as a proxy variable for a bank’s asset quality. Keeton and Morris (1987) recommend a measure such as this because it removes the possibility of distortion created by a bank’s attempt to “manage” the level of its chargeoffs for financial reporting purposes. The use of this variable as a proxy assumes that the quality of a loan portfolio is normally (strongly) positively correlated with the proportion of loans that are nonperforming or have been charged off.
A substantial exposure to a risk factor such as high operating leverage can also lead to a greater level of risk for the BHC, ceteris paribus. As Hamada (1972) showed, operating leverage is positively related to systematic risk. Alternatively, high credit and operating risks (denoted BADLOANS and OPERISK, respectively) might create an incentive to hedge more interest rate risk in order to keep total risk at a lower, more tolerable level (from management’s and regulators’ perspectives).²⁷ In addition, the composition of a bank’s
portfolio affects the firm’s liquidity risk. Thus, the liquidity risk inherent in a bank’s asset portfolio might also influence interest rate risk as well as act as a hedge substitute. Consequently, liquidity, asset quality, and operating risks might be negatively related to interest rate risk. Note, however, that the relationship between the above variables and interest rate risk might be positive if managers are attempting to raise the BHC’s overall risk level via multiple sources of risk. For example, bank managers might try to increase the BHC’s total risk by coordinating several risks such as those related to interest rates, credit quality, liquidity, and operations rather than concentrating on one type of risk.
Other potential control variables such as details of the BHC’s portfolio composition beyond the LIQUIDITY variable are not explicitly included in our cross-sectional empirical model. The omission of these variables is appropriate because portfolio composition variables such as total loans are endogenous to the set of influences already included in our model. That is, portfolio composition is a function of the risk-taking and risk-management incentives related to, for example, the contracting costs and tax effects outlined earlier in Sect. 45.2.4. As demonstrated in the empirical results of Kashyap and Stein (1995) and the theoretical findings of Stein (1998), the bank’s management can coordinate the level of loans relative to its investment in marketable securities to minimize costs associated with information asymmetries. In particular, smaller banks, which are assumed to have greater information asymmetries, reduce their volume of lending when changes in monetary policy create an outflow of deposits. Thus, by not including specific portfolio composition variables in our model, we allow these variables to adjust optimally to the risk-taking and risk-management incentives explicitly included in our model.²⁸
In addition to the issues described above, we must also consider the role of bank regulation since regulatory changes are likely to affect the firm’s risk-taking behavior over our sample period. In particular, the introduction of stricter enforcement policies via the Federal Deposit Insurance Corporation Improvement Act (FDICIA) and the Bank for International Settlements’ (BIS) higher capital standards during 1991–1993 might have influenced BHC riskiness by reducing bank risk-taking incentives. Thus, a dummy variable, REGDUM, is set to 1 for the years of our sample in which FDICIA was fully in effect (1994–2000).

²⁶ Note that when TOTRISK is used as the dependent variable, we must also include our interest rate risk factor, IRR, as a control variable because interest rate risk (along with credit, operating, and liquidity risks) can affect the bank’s overall riskiness.
²⁷ Our definition of OPERISK is noninterest operating expense divided by total revenue and effectively measures operating leverage because OPERISK will be high for banks with a large amount of operating overhead (e.g., due to high fixed operating costs related to an extensive branch network, a large customer support staff, and so forth).
²⁸ It should also be noted that bank-specific fixed effects via a “Fixed Effects” model could be used to incorporate unidentified firm-specific factors that might affect a firm’s interest rate risk exposure. To conserve space, we simply note here that the use of a fixed effects model does not alter the main findings of our basic linear model. That is, allowing for unidentified, firm-specific effects does not improve upon our basic model described later in Equation (45.3).
45.3 Data, Sample Selection, and Empirical Methodology

45.3.1 Data

The data used in this analysis are based on a total of 518 publicly traded, highest-level bank holding companies (BHCs) that reported annual data for at least one of the 10 years during the period of 1991–2000.²⁹ The analysis employed here requires data from four distinct areas: (1) financial statement data from regulatory reports; (2) stock return data; (3) managerial compensation information (e.g., option grants, equity ownership); and (4) institutional stock ownership data. The following section on sample selection provides more detail on these data sources.
Highest-level BHCs were chosen because it is at this level that one can obtain the best description of risk-taking and risk-management behavior across the entire banking organization. For example, an analysis of a two-bank BHC based on studying the effects of risk-taking at the individual subsidiary bank level would most likely give a distorted view of the BHC’s overall riskiness and risk-management activities because this type of analysis would ignore (a) the potential effects of diversification across the two subsidiaries, and (b) the effects of nontraditional bank activities such as securities brokerage services on the BHC’s risk exposure. The above rationales are consistent with those suggested by Hughes et al. (1996, 2006), among others.
Publicly traded BHCs were selected for this sample in order to estimate the interest rate risk parameters and total risk measures outlined earlier. These estimated risk measures are more likely to be a better proxy for the BHC’s exposures to interest rate risk compared to financial statement-based measures such as the BHC’s “maturity gap.” It should be noted that the use of publicly traded BHCs reduces the number of commercial banks we can examine because the vast majority of banks in the U.S. are small, private companies. The BHCs included in our sample are the most economically significant participants in the U.S. commercial banking industry since they held over $2.5 trillion in total industry-wide assets during 1991–2000. However, one must be cautious in drawing conclusions about the risk-management activities of smaller, privately held banks because we do not include these firms in our analysis.

²⁹ A highest-level bank holding company is an entity that is not owned or controlled by any other organization. Its activities are restricted by regulatory agencies as well as by federal and state laws. However, a highest-level BHC can clearly own other lower-level BHCs that, in turn, can own one or more commercial banks. The highest-level BHC can also engage in some nonbank financial activities such as securities brokerage, mutual fund management, and some forms of securities underwriting. By focusing our analysis on highest-level BHCs, we can take into account the impact of these nontraditional bank activities on the BHC’s risk-taking and risk-management behavior.
The time period of 1991–2000 was chosen to ensure our risk estimates were not unduly influenced by any single exogenous factor such as a particular stage of the business cycle.³⁰ The period of analysis covers a full business cycle commencing with a recession in 1991, a recovery period in 1992, a vigorous expansion during 1993–1999, and subsequent deceleration and contraction in 2000. Monetary policy also changed from accommodating to tightening and then back to a looser stance during the period. Since it is likely that the state of the macroeconomy and monetary policy affect a BHC’s decisions, our test results based on the 1991–2000 time period should not be a statistical artifact caused by studying an incomplete business cycle.
45.3.2 Sample Selection

The sample of 518 BHCs was constructed from the joint occurrence of available data from four key areas: (1) financial statement data from the Consolidated Financial Statements for Bank Holding Companies (FR Report Y-9C) filed by the BHCs with their relevant regulatory authorities, (2) stock return data from the Center for Research in Securities Prices (CRSP), (3) management compensation data from SNL Securities’ Executive Compensation Review, and (4) institutional and insider ownership from Compact Disclosure and/or SNL Securities’ Quarterly Bank Digest. The joint occurrence of data from each of these sources during the period of 1991–2000 qualified a highest-level BHC for inclusion in our sample. Based on the limitations of the data (notably the management compensation data), our tests must be done on annual data for highest-level BHCs.
Also, the period of 1991–2000 was a time of substantial consolidation in the banking industry. In order to mitigate the problem of survivorship bias, we perform our tests using an unbalanced panel data set of 518 BHCs that have at least 1 year of complete data. Thus, our tests based on the panel data set include firms that may have merged or gone public at some point during 1991–2000, as well as those BHCs that survived the entire 10-year period.
45.3.3 Our Measures of Bank Risk-Taking

In order to test our integrated risk-management framework, we must develop an empirical measure for interest rate risk that is consistent with shareholder wealth maximization. In particular, we must form a market-based, BHC-specific measure of interest rate risk since this factor cannot be readily obtained from the BHC’s financial statements. This measure is defined as IRR, the absolute value of an interest rate parameter estimated via a two-factor model based on Equation (45.2).³¹ As noted in the previous section, IRR can be justified as an indirect measure of interest rate risk-management activity due to Deshmukh et al.’s (1983) and Diamond’s (1984) theoretical results. Thus, values of IRR estimates close to zero would represent more hedging and lower risk exposures, ceteris paribus.³² In addition to measuring interest rate risk, we examine how corporate risk-management theories affect the bank’s overall risk-management activities by analyzing the monthly standard deviations of bank stock returns (measured annually and defined as TOTRISK).

³⁰ Data limitations, particularly for management compensation data such as stock option grants, prohibit us from extending the analysis to time periods earlier than 1991. Reporting changes related to executive compensation and the concomitant hand-collection of these data precluded us from extending the analysis beyond the 1991–2000 period.
45.3.4 The Empirical Model

The basis of our empirical approach follows Flannery and James (1984), Unal and Kane (1988), Tufano (1996), Saunders et al. (1990), Brewer et al. (1996), and Hirtle (1997).³³ Similar to Saunders et al. (1990) and Hirtle (1997),

³¹ We do not include a measure of foreign exchange rate risk in our analysis because the sample of banks employed here operates primarily domestic-oriented businesses. Less than 15% of the observations in this sample have a non-zero exposure to foreign exchange rate risk (as measured by the gross notional value of the firm’s foreign exchange futures/forwards/swaps). We also estimated Equation (45.2) with a trade-weighted U.S. dollar index as an additional independent variable to act as another check on the relevance of foreign currency risk but found that this variable was not a significant factor affecting the vast majority of banks in our sample. A separate analysis of exchange rate risk for the relatively small subset noted above is therefore beyond the scope of this paper.
³² As Froot and Stein (1998) demonstrate, a zero interest rate parameter is typically sub-optimal when the firm faces convex external financing costs. In addition, firms may want to deviate from a zero interest rate parameter when manager-owner agency problems exist. As discussed in Tufano (1998b), hedging can reduce the number of times the firm enters the capital markets and therefore reduces the amount of monitoring performed by external investors. Utility-maximizing managers may therefore have an incentive to hedge more in order to obscure their consumption of perquisites or creation of other agency costs (to the detriment of the firm’s shareholders). In addition, as noted in Pennacchi (1987), among others, the potential mispricing of FDIC deposit insurance may create incentives to take on more risk (and therefore hedge less). To the extent that these problems are mitigated by the banking industry’s frequent (i.e., daily) entrance into the capital markets, the use of a zero interest rate parameter can be viewed as an appropriate benchmark for gauging the relative degree of the financial institution’s hedging activity vis-à-vis its peers.
³³ Other important empirical papers related to this study are Schrand (1997) on derivatives usage; Amihud and Lev (1981), Gorton and Rosen (1995), Houston and James (1995), Morck et al. (1988), Crawford et al. (1995), and Hubbard and Palia (1995) on management ownership and compensation issues; Galloway et al. (1997) and Cebenoyan et al. (1999) on bank risk-taking; and Angbazo (1997) on bank profitability and off-balance sheet activities.
among others, the dependent variable used in our study is continuous in nature and can capture the full range of BHC risk-taking behavior. As noted earlier, using the theoretical framework of Equation (45.2), the two-factor model is estimated separately for each of the 518 BHCs during each year of the 1991–2000 sample period (where data are available).
Based on the above hypotheses, control variables, and the relations noted in Equations (45.1)–(45.2), we can specify a pooled model of interest rate risk-management for commercial banks using the IRR parameter estimates. The following stylized second-stage regression equation includes independent variables that are proxies for the full set of relevant factors influencing a BHC’s interest rate risk-taking behavior.³⁵
TOTRISK_{N,y} or IRR_{N,y}
  = f₁(Taxes, Financial Distress, Agency Costs, Contracting Costs, Asymmetric Information, Alternative Financial Policies, Control Variables)
  = f₁(TAX_N, PBANK_N, LEVERAGE_N, MGMTOWN_N, OPTGRANT_N, INST_N, SIZE_N, ASYMINFO_N, LIQUIDITY_N, BADLOANS_N, OPERISK_N, REGDUM)   (45.3)

where IRR_{N,y} = an average measure of the N-th BHC’s interest rate risk exposure for the y-th year, defined as the absolute value of the parameter estimate for β_{I,N} from Equation (45.2), and TOTRISK_{N,y} is the standard deviation of the N-th BHC’s monthly stock returns for the y-th year. The first six factors on the immediate right-hand side of the equation represent the main determinants of corporate risk-management from Equation (45.1) and the seventh factor represents control variables related to other firm-specific risk factors such as asset quality, portfolio composition, operating risk, and changes in bank regulation during 1991–2000.³⁴ The second equality in Equation (45.3) identifies the specific variables used as proxies for the seven theoretical factors. All of the right-hand-side variables are annual values based on
data for 1991–2000 so that the model’s parameters represent the average relation between the explanatory variables and IRR in the panel of BHCs. As noted earlier, higher values of the dependent variable can be interpreted as an indication of greater risk-taking or, conversely, less hedging. We refer to Equation (45.3) as our “Primary Model.” As mentioned earlier, we can specify an “Alternative Model” for IRR and TOTRISK in order to ensure that our results are not driven by the specific proxy variables we have chosen to represent the corporate risk-management rationales (as summarized in hypotheses H1–H9). This model has the same form as Equation (45.3) but uses different right-hand-side variables to check the robustness of our “Primary Model” results. This Alternative Model is presented below:
TOTRISK_{N,y} or IRR_{N,y}
  = f₁(TAXDUM_N, PBANKDUM_N, LEVDUM_N, MGMTDUM_N, PBONUS_N, EQBLOCK_N, REVENUE_N, ASYMINFO_N, CASH/BVE_N, NONPERF/TA_N, OPER EXP/TA_N, REGDUM)   (45.4)
where IRR_{N,y} = the same interest rate risk measure noted in Equation (45.3) (i.e., an average measure of the N-th BHC’s interest rate risk exposure for the y-th year, defined as the absolute value of the parameter estimate for β_{I,N} from Equation (45.2)) and TOTRISK_{N,y} is the same total risk measure described by Equation (45.3). Note that we have replaced all of the right-hand-side variables of Equation (45.3), except the ASYMINFO and REGDUM variables, with alternative proxy variables to check the robustness of our results to different formulations of our model’s
³⁴ As noted earlier, when TOTRISK_{N,y} is used as the dependent variable, the IRR_{N,y} variable is included as an independent variable to control for the BHC’s interest rate risk exposure.
³⁵ The stochastic disturbance term and parameters have been omitted from the following equations to streamline notation. In addition, the y-th year subscripts have been dropped from the right-hand-side variables.
independent variables.³⁶ The purpose of Equation (45.4) is to re-estimate Equation (45.3) with different proxies for the various risk-management rationales and see if the results of Equation (45.3) are robust to these alternative definitions. The alternative proxy variables contained in Equation (45.4) represent either alternate definitions or completely different variables that are also reasonable proxies for the risk-management theories and hypotheses presented earlier in Sect. 45.2. For example, we have included four dummy variables that are based on variations of the Primary Model’s variables (i.e., TAXDUM, PBANKDUM, MGMTDUM, LEVDUM). These variables use a binary format rather than a continuous variable to describe hypotheses H1, H2, H4, and H7, respectively. We use these binary variables because there may be a “threshold effect” where the saliency of a particular variable might only affect a bank’s IRR when it exceeds this threshold. For example, we can set TAXDUM to 1 when the bank’s net income is greater than $100,000 because it is above this threshold point that the U.S. tax code is no longer convex. For the other three dummy variables, we use the sample mean as the relevant threshold value or, as in the case of PBANKDUM, a value such as 0.01, which signifies a much larger probability of financial distress than the rest of the sample (and represents a probability level that regulators would most likely be concerned about).
The new variables included in Equation (45.4) are PBONUS, EQBLOCK, and REVENUE. As previously described in earlier footnotes, PBONUS and EQBLOCK represent alternative proxies for some of the managerial contracting cost rationales related to risk-taking and bank monitoring incentives described in Sect. 45.2.4. In particular, the PBONUS variable provides an alternative measure of the risk-taking incentives of senior management since, like stock
³⁶ The ASYMINFO variable is not replaced in Equation (45.4) because we do not have a suitable alternative for this asymmetric information proxy variable. Admittedly, finding a good proxy for the level of asymmetric information within a BHC is the most difficult aspect of estimating a model such as the one described by Equations (45.3) and (45.4) since information asymmetries are naturally difficult to quantify. One obvious alternative would be to use the bank’s SIZE variable but this variable is needed to control for size-related corporate risk-management effects. It is also highly correlated with the REVENUE variable specified in Equation (45.4) and thus the inclusion of both SIZE and REVENUE would introduce a high degree of multi-collinearity into the model. Interestingly, when we simply drop the ASYMINFO variable from Equation (45.4) and allow SIZE to proxy for both size-related and asymmetric information effects, the parameter estimates for the other independent variables remain relatively unchanged. This suggests that our model is robust to using SIZE as an alternative proxy for information asymmetries.
option grants, large annual bonuses have option-like incentive characteristics. REVENUE acts as an alternative proxy for bank size and, as noted in Footnote 24, is most likely to be inter-related with ASYMINFO and might also act as a control for any “too big to fail” effects.
The last set of variables included in Equation (45.4) contains alternative control variables (NONPERF/TA, OPER EXP/TA, CASH/BVE). For these final three variables, we scale the numerators of each of the Primary Model’s counterparts by an alternate scalar (e.g., the book value of total assets for the first two variables noted above and the book value of common equity for the last one). Also, instead of including marketable securities in the numerator, we define a more restricted version of LIQUIDITY when we form the CASH/BVE variable by including solely cash balances in the numerator and the book value of equity in the denominator. These variables are meant to be reasonable alternative definitions of our independent variables in order to test whether or not our Primary Model’s results are sensitive to the choice, and definition, of these explanatory variables.
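As an illustration of how the second stage could be run, the sketch below estimates the Primary Model of Equation (45.3) by OLS with Newey-West (HAC) standard errors, mirroring the covariance adjustment noted under Table 45.3. The input file and its layout are hypothetical, and the GMM errors-in-variables correction mentioned in Footnote 16 is not reproduced here.

```python
# Sketch of the second-stage Primary Model (Equation (45.3)): pool the annual
# IRR estimates across banks and years and regress them on the hypothesized
# risk-management proxies. Standard errors are Newey-West (HAC) adjusted, as
# described in the notes to Table 45.3. 'bhc_panel.csv' is a hypothetical
# bank-year file containing the variables defined in Table 45.1.
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("bhc_panel.csv")

primary = ("IRR ~ TAX + PBANK + LEVERAGE + OPTGRANT + MGMTOWN + INST + SIZE"
           " + ASYMINFO + BADLOANS + OPERISK + LIQUIDITY + REGDUM")
fit = smf.ols(primary, data=panel).fit(cov_type="HAC", cov_kwds={"maxlags": 1})
print(fit.summary())
```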
45.3.5 Accounting for Nonlinearities and Inter-relations Between Independent Variables

Lastly, a review of the explanatory variables described in Sects. 45.2.4 and 45.2.5 raises the question of potential interactions between some of these factors. To account for these potential interactions, as well as the possibility of nonlinear relations between the explanatory variables and IRR_N, we can specify Equation (45.3) in a full quadratic form. That is, we can square as well as interact each variable on the right-hand side of Equation (45.3) and include these transformed variables along with linear forms of these variables. This approach produces 89 right-hand-side variables for Equation (45.3) based on IRR and 103 variables when TOTRISK is used as the dependent variable. This relatively complicated form of Equation (45.3) can also be compared to the conventional linear form of Equation (45.3) that contains only 12 right-hand-side variables for IRR and 13 variables for TOTRISK. By including squared and interaction terms, the quadratic model might provide a much richer description of the factors that affect interest rate risk-taking at a commercial bank. It also allows us to address the possibility that some of the risk-management incentives may be interdependent. Any inter-dependencies between these variables can be captured by the numerous interaction terms included in the quadratic model.
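The variable counts quoted above can be verified mechanically: with 12 regressors (13 for TOTRISK) and one dummy (REGDUM, whose square duplicates itself), the full quadratic form has 12 + 11 squares + 66 pairwise interactions = 89 terms, and 13 + 12 + 78 = 103 for TOTRISK. A small sketch of the expansion, with illustrative term labels only:

```python
# Sketch of the full quadratic expansion: linear terms, squares (dropping
# squared dummies, since REGDUM**2 == REGDUM), and all pairwise interactions.
from itertools import combinations

def quadratic_terms(variables, dummies):
    terms = list(variables)                                       # linear
    terms += [f"{v}^2" for v in variables if v not in dummies]    # squares
    terms += [f"{a}*{b}" for a, b in combinations(variables, 2)]  # interactions
    return terms

irr_vars = ["TAX", "PBANK", "LEVERAGE", "OPTGRANT", "MGMTOWN", "INST",
            "SIZE", "ASYMINFO", "BADLOANS", "OPERISK", "LIQUIDITY", "REGDUM"]
print(len(quadratic_terms(irr_vars, {"REGDUM"})))            # 89, as in the text
print(len(quadratic_terms(irr_vars + ["IRR"], {"REGDUM"})))  # 103 for TOTRISK
```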
45.4 Empirical Results

45.4.1 Descriptive Statistics and Industry Trends

Panel A of Table 45.2 provides descriptive statistics for the key variables employed in our Primary Model’s analysis while Panel B reports relevant statistics for the independent variables used in our Alternative Model’s robustness test. These descriptive statistics are based on 518 BHCs over a 10-year period, thus yielding 2,899 bank-year observations for our interest rate risk measures.³⁷ The results of the two-factor model confirm our intuition that U.S. BHCs take on varying degrees of risk relative to interest rates. The average absolute value of the interest rate risk parameter (IRR) during 1991–2000 for the BHC sample was positive (0.5820) and statistically different from zero. The raw value of this measure, IBETA, averages 0.0215 and possesses a large degree of variation (e.g., the standard deviation of 0.7738 is nearly 40 times the mean). Due to a high level of variability, the average raw measure of interest rate risk is not significantly different from zero. However, this average masks the fact that 32% of the individual, bank-specific estimates of IBETA are significantly positive at the 0.10 confidence level and another 35% of the estimates are significantly negative. Thus, two-thirds of the individual IBETA estimates are significantly different from zero even though the overall sample average is quite close to zero. This wide variation in IBETA and IRR shows that there is considerable time-series and cross-sectional dispersion in banks’ sensitivities to interest rate risk. In addition, TOTRISK has a statistically significant mean of 7.67% and exhibits a large degree of variability with a standard deviation of 6.30%.
45.4.2 Multivariate Analysis for Interest Rate Risk

Table 45.3 summarizes the results of estimating Equations (45.3) and (45.4) via OLS using the dependent variable, IRR. Recall that these parameter estimates are based on annual regressions described by Equation (45.2). To conserve space, these 2,899 intermediate regressions are not included here.³⁸ Table 45.3 shows that the Primary and
³⁷ There are not 5,180 observations (i.e., 518 BHCs × 10 years) because all banks do not have data for all 10 years.
³⁸ As shown in Table 45.3, the actual number of observations used to estimate Equations (45.3) and (45.4) is less than 2,899 because not all of the independent variables are available for each IRR estimate (particularly the option grants and bonus compensation data).
Alternative Models of Equations (45.3) and (45.4) provide substantially the same explanatory power. This finding can be readily seen by comparing the adjusted R² and t-statistics of the two models. As shown in the last column of Table 45.3, the results of Equation (45.4) based on the Alternative Model follow a pattern similar to that described above for our Primary Model. Table 45.3 also reports that the test results for all the independent variables except SIZE are qualitatively similar for both the Primary and Alternative Models. As we will see from the elasticity estimates of Table 45.4 (described below), the real economic differences between the two models are also not that great in terms of testing our hypotheses. Thus, our results are robust to the choice of independent variables employed in our model. In addition, the choice of interest rate factor in Equation (45.2) does not materially affect our results. As noted earlier, our results are qualitatively similar when other definitions of the interest rate factor are used.
It is also helpful to estimate the elasticities of interest rate risk with respect to the model’s primary explanatory variables in order to assess the economic, as well as statistical, significance of our independent variables. That is, we can use the parameter estimates to differentiate Equations (45.3) and (45.4) with respect to a specific explanatory variable. These derivatives can then be used with the mean values of the relevant variables to estimate the elasticity (or sensitivity) of a change in the explanatory variable (e.g., TAX) on our dependent variable (i.e., IRR). A Wald test statistic can then be constructed to test the significance of this elasticity estimate. This approach provides us with a more direct and reliable method of evaluating the importance of a specific variable on the BHC’s interest rate risk-taking activity.
Table 45.4 presents the elasticity estimates for the Primary and Alternative Models with respect to the key explanatory variables identified by our nine hypotheses. For the Primary Model, this table reports statistically significant elasticity estimates with theoretically correct signs for TAX, PBANK, LEVERAGE, INST, BADLOANS, and OPERISK. In addition, ASYMINFO and REGDUM are significant but report theoretically incorrect signs. These estimates indicate that, on average, a higher estimated probability of financial distress and debt-related agency costs reduced interest rate risk at commercial banks during the sample period while a higher effective tax rate and greater institutional investor ownership increased IRR. The Alternative Model confirms the Primary Model’s support for the financial distress and agency cost theories of risk-management (as well as the OPERISK control variable) but not the tax rationale. Of particular note are the greater-than-unity elasticity estimates for the LEVERAGE and OPERISK variables in both the Primary and Alternative Models. These estimates suggest that bank interest rate risk sensitivities are most acutely affected by financial leverage and operating cost structures.
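In a linear specification this elasticity calculation is straightforward: the derivative of IRR with respect to a regressor is simply its coefficient, so the elasticity evaluated at the sample means is β_k · x̄_k / ȳ. A minimal sketch follows, treating the sample means as nonstochastic; the fit and panel objects are the hypothetical ones from the second-stage sketch above.

```python
# Sketch of the elasticity-at-the-means calculation behind Table 45.4 and a
# delta-method (Wald-type) p-value for that elasticity.
from scipy import stats

def elasticity(fit, panel, var, depvar="IRR"):
    # beta_k * mean(x_k) / mean(y): percent change in y for a 1% change in x_k
    return fit.params[var] * panel[var].mean() / panel[depvar].mean()

def elasticity_pvalue(fit, panel, var, depvar="IRR"):
    # The elasticity is a fixed rescaling of beta_k, so its standard error is
    # the coefficient's standard error times the same (absolute) scale factor.
    scale = abs(panel[var].mean() / panel[depvar].mean())
    z = elasticity(fit, panel, var, depvar) / (fit.bse[var] * scale)
    return 2.0 * (1.0 - stats.norm.cdf(abs(z)))
```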
Table 45.2 Descriptive data for the potential factors affecting risk-management decisions

Descriptive statistics for the full sample of 518 Bank Holding Companies (1991–2000)

Panel A. Primary model’s variables

Variable | Description | Mean | Std. dev. | Min. | Max.
IRR | Absolute Value of Interest Rate Risk Parameter | 0.5820 | 0.5101 | 0 | 4.901
IBETA | Raw Value of Interest Rate Risk Parameter | 0.0215 | 0.7738 | −3.2980 | 4.901
RM | CRSP Stock Index Total Return | 0.1798 | 0.1464 | −0.1006 | 0.3574
IR | Ten-year U.S. Treasury yield (%) | 6.0910 | 0.8661 | 4.650 | 7.858
TOTRISK | Monthly Standard Dev. of Stock Returns (%) | 7.6682 | 6.3010 | 0.162 | 82.412
SIZE | Total Assets ($ bil.) | 12.2509 | 47.9559 | 0.0209 | 902.210
TAX | Average Effective Tax Rate (%) | 31.7912 | 7.8573 | 0 | 93.4047
PBANK | Est. Probability of Bankruptcy | 0.0053 | 0.0200 | 0 | 0.3508
LEVERAGE | Financial Leverage (x) | 8.4095 | 5.1785 | 1.2227 | 48.0177
OPTGRANT | Options Granted to Management (%) | 0.3914 | 0.9231 | 0 | 21.9060
MGMTOWN | Management Equity Ownership (%) | 16.0640 | 14.4142 | 0 | 93.9370
INST | Institutional Investors’ Equity Ownership (%) | 20.1839 | 19.0650 | 0 | 90.8620
ASYMINFO | Asymmetric Information Proxy (No. Analysts) | 4.7302 | 7.1157 | 0 | 42.0000
BADLOANS | Non-Performing Assets Ratio | 0.0889 | 1.6625 | −0.2492 | 8.5070
OPERISK | Operating Risk Ratio | 0.5941 | 0.1622 | 0.1707 | 0.9934
LIQUIDITY | Liquidity Measure (x) | 2.8081 | 2.4603 | 0.0299 | 31.1815

Panel B. Alternative model’s variables

Variable | Description | Mean | Std. dev. | Min. | Max.
TAXDUM | Tax Dummy Variable | 0.9000 | 0.3000 | 0 | 1
PBANKDUM | Prob. of Bankruptcy Dummy Variable | 0.0914 | 0.2882 | 0 | 1
LEVDUM | Book Value Fin. Leverage Dummy Variable | 0.4170 | 0.4932 | 0 | 1
PBONUS | Sr. Mgmt. Bonus as % of Mgmt. Comp. | 24.6132 | 18.0054 | 0 | 93.953
MGMTDUM | Mgmt. Ownership Dummy Variable | 0.3356 | 0.4723 | 0 | 1
EQBLOCK | Un-Affiliated Blockholder Ownership (%) | 3.4747 | 7.5096 | 0 | 85.2000
REVENUE | Total Revenue ($ bil.) | 1.1340 | 4.8050 | 0.0005 | 111.826
NONPERF/TA | Alternative Non-Performing Assets Ratio | 0.0074 | 0.0137 | −0.0345 | 0.0755
OPER EXP/TA | Operating Exp. as a % of Total Assets | 0.0523 | 0.0467 | 0.0111 | 0.1333
CASH/BVE | Cash Holdings as a % of BV of Equity | 0.8426 | 0.8037 | 0.0295 | 14.238
Table 45.3 Relationship between interest rate risk and risk-management incentives

Primary Model of IRR (linear model):

Variable | Pred. sign | Estimate (t-statistic)
CONSTANT |  | −3.7992* (−7.92)
TAX | + | 8.9982* (2.60)
PBANK | − | −7.0068* (−2.12)
LEVERAGE | − | −0.1576* (−3.59)
OPTGRANT | + | 0.0438 (0.50)
MGMTOWN | − | 0.0086 (0.96)
INST | + | 0.0150* (1.83)
SIZE | +/− | 0.0006 (0.15)
ASYMINFO | + | −0.0534* (−3.57)
BADLOANS | +/− | 2.8802* (2.29)
OPERISK | +/− | 12.2212* (17.88)
LIQUIDITY | + | 0.0414 (0.40)
REGDUM | − | 0.7777* (4.85)
Adjusted R² |  | 0.2172
Durbin-Watson |  | 1.74
N |  | 1,959

Alternative Model of IRR (linear model, alternate variables):

Variable | Pred. sign | Estimate (t-statistic)
CONSTANT |  | −3.3534* (−4.19)
TAXDUM | + | 0.8354 (1.29)
PBANKDUM | − | −0.9941* (−5.13)
LEVDUM | − | −0.3824* (−1.78)
PBONUS | + | 0.0087 (1.12)
MGMTDUM | − | 0.1301 (0.52)
EQBLOCK | + | 0.0086 (0.57)
REVENUE | +/− | −0.0074 (−0.18)
ASYMINFO | + | −0.0777* (−5.73)
NONPERF/TA | +/− | 0.0847 (0.00)
OPER EXP/TA | +/− | 96.7454* (10.47)
CASH/BVE | + | 0.5347* (3.05)
REGDUM | − | 2.0011* (9.18)
Adjusted R² |  | 0.1916
Durbin-Watson |  | 1.67
N |  | 2,201

The ordinary least squares (OLS) cross-sectional regression results are based on linear forms of Equations (45.3) and (45.4). The dependent variables are estimates of the BHC’s interest rate risk, IRR, obtained from annual, firm-specific regressions for 518 BHCs during 1991–2000 based on Equation (45.2). The expected signs of the parameter estimates are presented in the second column of each panel. Parameter estimates marked with an asterisk are significant at the 0.10 level; t-statistics are in parentheses. The standard errors of the parameter estimates are adjusted for heteroskedasticity and autocorrelation according to Newey and West (1987) using a Generalized Method of Moments technique.
Table 45.4 Estimated elasticities of interest rate risk with respect to the explanatory variables

Primary IRR model:

Variable | Pred. sign | Elasticity | p-value
TAX | + | 1.1663* | 0.0093
PBANK | − | −0.1026* | 0.0341
LEVERAGE | − | −3.2120* | 0.0003
OPTGRANT | + | 0.0383 | 0.6138
MGMTOWN | − | 0.2546 | 0.3361
INST | + | 0.7876* | 0.0675
SIZE | +/− | 0.0183 | 0.8786
ASYMINFO | + | −0.8028* | 0.0004
BADLOANS | +/− | 0.2916* | 0.0222
OPERISK | +/− | 13.1534* | 0.0001
LIQUIDITY | + | 0.2736 | 0.6926
REGDUM | − | 1.1155* | 0.0001

Alternative model:

Variable | Pred. sign | Elasticity | p-value
TAXDUM | + | 2.0120 | 0.1958
PBANKDUM | − | −0.4330* | 0.0001
LEVDUM | − | −0.4663* | 0.0756
PBONUS | + | 0.5351 | 0.2613
MGMTDUM | − | 0.0999 | 0.6022
EQBLOCK | + | 0.0711 | 0.5668
REVENUE | +/− | −0.0191 | 0.8563
ASYMINFO | + | −1.2400* | 0.0001
NONPERF/TA | +/− | 0.0009 | 0.9967
OPER EXP/TA | +/− | 9.1487* | 0.0001
CASH/BVE | + | 1.0264* | 0.0023
REGDUM | − | 2.9780* | 0.0001

This table reports the estimated elasticities of the BHC’s interest rate risk (IRR) with respect to changes in the Primary and Alternative Models’ explanatory variables. The elasticity estimates are calculated based on the OLS parameter estimates of Equations (45.3) and (45.4) for IRR. The expected signs of the elasticity estimates are presented in the second column. Elasticity estimates marked with an asterisk are significant at the 0.10 level.
Interestingly, OPERISK has a very large elasticity estimate (+13.2), which suggests that banks with high operating cost structures also assume relatively large amounts of interest rate risk. To the extent that a bank with a high degree of OPERISK also has high financial leverage (as proxied by LEVERAGE), the overall net increase in a bank’s interest rate risk can be ameliorated. For example, based on the elasticity estimates of our Primary Model, a bank that increases its OPERISK by 1% while also increasing its LEVERAGE by 1% will have a net increase in IRR of approximately 10% (i.e., [13.2 − 3.2] × 1%). This increase is still substantial, but it is somewhat lower than if the bank solely increased OPERISK with no concomitant increase in
LEVERAGE. These elasticity estimates suggest that a BHC might coordinate some of its key risks when deciding on the firm’s net exposure to interest rate risk. In sum, the parameter and elasticity estimates of the linear model for IRR provide strong support for the hypotheses
related to financial distress costs (H2) and debt-related agency costs (H7), with weaker support for the tax, managerial contracting costs, and hedge substitutes hypotheses (H1, H4, and H9). No consistent empirical support was found for the size-related, asymmetric information, and other managerial contracting cost hypotheses (H3, H5, H6, and H8).
Although the details of the quadratic models are not reported here for space reasons, it should be noted that the inclusion of the nonlinear and interaction terms in Equation (45.3) does not significantly affect the model’s parameter estimates. For example, the adjusted R² statistic for a full quadratic version of Equation (45.3) is 0.2199 compared to our linear model’s value of 0.2172. Thus, the explanatory power of our empirical model of risk-management is not greatly improved by formulating a full quadratic specification rather than a conventional linear model. A Hausman (1978) specification test reveals that the parameter estimates for the linear and quadratic forms of Equation (45.3) are not significantly different.³⁹ This suggests that the interaction and nonlinear effects of the full quadratic form are not important determinants of interest rate risk sensitivities for our sample of banks. Thus, the simpler linear model is sufficient for adequately describing interest rate risk-taking and risk-management behavior for our sample. An additional series of Hausman tests confirm our earlier claims that models that contain squared terms or fixed effects do not provide any meaningful improvements over our Primary Model. In sum, our basic linear model is robust not only to alternative independent variables but also to the inclusion of controls for other key bank risks, nonlinearities, interaction terms, and firm-specific fixed effects.
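The Hausman comparison referenced here can be sketched generically: under the null that the added quadratic terms are irrelevant, both specifications consistently estimate the common linear coefficients and the simpler model is efficient, so the statistic below is asymptotically chi-squared. This is a hedged illustration rather than the paper’s code; fit_lin and fit_quad are hypothetical fitted statsmodels results sharing the linear coefficient names.

```python
# Sketch of a Hausman (1978)-style test comparing the linear and quadratic
# specifications on their common (linear) coefficients. The pseudo-inverse
# guards against a near-singular covariance difference.
import numpy as np
from scipy import stats

def hausman(fit_lin, fit_quad, common_names):
    q = (fit_quad.params[common_names] - fit_lin.params[common_names]).values
    V = (fit_quad.cov_params().loc[common_names, common_names].values
         - fit_lin.cov_params().loc[common_names, common_names].values)
    H = float(q @ np.linalg.pinv(V) @ q)       # Wald-type quadratic form
    dof = np.linalg.matrix_rank(V)
    return H, 1.0 - stats.chi2.cdf(H, dof)     # statistic and p-value
```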
45.4.3 Multivariate Analysis for Total Risk Table 45.5 summarizes the results of estimating Equations (45.3) and (45.4) via GMM using the dependent variable, TOTRISK. These parameter estimates are based on annual regressions described by Equation (45.2). Similar to our findings for IRR, Table 45.5 shows that the Primary and Alternative Models of Equations (45.3) and (45.4) substantially provide the same explanatory power. This finding can be seen by comparing the adjusted R2 and t-statistics of the two models. The last column of Table 45.5 shows the results of Equation (45.4), based on the Alternative Model, follow a pattern similar to that described above for our Primary Model. Table 45.5 also reports that the results for all the independent variables except OPTGRANT and LIQUIDITY
M.S. Pagano
are qualitatively similar for both the Primary and Alternative Models. As shown from the elasticity estimates of Table 45.6 (described below), the economic difference between the two models is also not that great in terms of testing our hypotheses. Thus, our TOTRISK results are robust to the choice of independent variables employed in our model. Table 45.6 presents the elasticity estimates for the Primary and Alternative Models with respect to the key explanatory variables identified by our nine hypotheses. For the Primary Model, this table reports statistically significant elasticity estimates with theoretically correct signs for all variables except TAX, MGMTOWN, and ASYMINFO. These estimates indicate that, on average, a higher estimated probability of financial distress, more debt-related agency costs, larger firm size, and greater institutional ownership increased total risk at commercial banks during the sample period while more liquidity and greater bank regulation decreased TOTRISK. The Alternative Model confirms the Primary Model’s support for the financial distress, managerial contracting, and agency cost theories of risk-management (as well as the control variables: BADLOANS, OPERISK, REGDUM, and INTBETA). Of particular note are the relatively large elasticity estimates for the LEVERAGE and OPERISK variables in the Primary Model. These estimates suggest that bank total risk exposure is most acutely affected by the firm’s financial leverage and operating cost structure. In sum, the parameter and elasticity estimates of the linear model for IRR provide strong support for the hypotheses related to financial distress costs (H2), debt-related agency costs (H7), and firm size (H6), with weaker support for the managerial contracting costs and hedge substitutes hypotheses (H4 and H9). No consistent empirical support was found for the other hypotheses related to taxes, size, asymmetric information, and other managerial contracting cost rationales (H1, H3, H5, and H8). Although the details of the quadratic models are not reported here for space reasons, it should be noted that the inclusion of the nonlinear and interaction terms in Equation (45.3) does not significantly affect the model’s parameter estimates. For example, the adjusted R2 statistic for a full quadratic version of Equation (45.3) is 0.2998 versus our linear model’s value of 0.2887. Thus, the explanatory power of our empirical model of risk-management is not greatly improved by formulating a full quadratic specification rather than a conventional linear model. A Hausman (1978) specification test reveals that the parameter estimates for the linear and quadratic forms of Equation (45.3) are not significantly different.40 Consistent with the Hausman test result for IRR, this finding for TOTRISK suggests that the interaction and
40 39
To conserve space, these statistics are not reported in the tables.
To conserve space, these statistics are not reported in the tables noted above.
45 An Empirical Investigation of the Rationales for Integrated Risk-Management Behavior
693
Table 45.5 Relationship between total risk and risk-management incentives Primary model of TOTRISK Alternative model of TOTRISK Variables CONSTANT TAX PBANK LEVERAGE OPTGRANT MGMTOWN INST SIZE ASYMINFO BADLOANS OPERISK LIQUIDITY REGDUM INTBETA Adjusted R2 Durbin-Watson N
Pred. sign C C C C C C= C C= C= C= C=
Linear model
Alternate variables
Pred. sign
3.9298 (8.46) 3:8534 .1:35/ 31.1975 (2.45) 0.2059 (4.31) 0.5770 (1.74) 0.0064 (1.12) 0.0339 (6.10) 0.0063 (3.38) 0.0786 .5.42/ 3.3243 (2.38) 3.5888 (6.03) 0.3397 .3.63/ 1.4172 .5.83/ 0.2206 (13.33) 0.2887 1.85 1,959
CONSTANT TAXDUM PBANKDUM LEVDUM PBONUS MGMTDUM EQBLOCK REVENUE ASYMINFO NONPERF/TA OPER EXP/TA CASH/BVE REGDUM INTBETA
C C C C C C= C C= C= C= C=
Linear model 9.1165 (8.32) 4.2161 .4.13/ 0.9721 (2.88) 0.6185 (4.43) 0:0065 .1:36/ 0:0818 .0:53/ 0.0302 (2.88) 0.0849 (3.74) 0.0421 .3.65/ 47.8477 (2.90) 46.5962 (7.00) 0.1917 (2.55) 1.5987 .7.37/ 0.2031 (13.15) 0.2772 1.87 2,201
The ordinary least squares (OLS) cross-sectional regression results are based on linear forms of Equations (45.3) and (45.4). The dependent variables are estimates of the BHC’s total risk, TOTRISK, obtained from annual, firmspecific regressions for 518 BHCs during 1991–2001 based on Equation (45.2). The expected signs of the parameter estimates are presented in the second and fifth columns. The results for the Primary and Alternative Models of IRR are reported in the third and sixth columns, respectively. A parameter estimate and its t -statistic (in parentheses) are printed in bold face when the estimate is significant at the 0.10. The standard errors of the parameter estimates are adjusted for heteroskedasticity and autocorrelation according to Newey and West (1987) using a Generalized Method of Moments technique Table 45.6 Estimated elasticities of total risk with respect to the explanatory variables
Primary TOTRISK model
Alternative model
Variable
Pred. sign
Elasticity
p-value
TAX PBANK LEVERAGE OPTGRANT MGMTOWN INST SIZE ASYMINFO BADLOANS OPERISK LIQUIDITY REGDUM INTBETA
C C C C C C= C C= C= C= C=
0:0375 0.0181 0.2621 0.0318 0.0151 0.1150 0.0105 0.0624 0.0187 0.3386 0.1395 0.1815 0.1138
0.1771 0.0143 0.0001 0.0818 0.2611 0.0001 0.0007 0.0001 0.0172 0.0001 0.0003 0.0001 0.0001
TAXDUM PBANKDUM LEVDUM PBONUS MGMTDUM EQBLOCK REVENUE ASYMINFO NONPERF/TA OPER EXP/TA CASH/BVE REGDUM INTBETA
Elasticity
p-value
0.6746 0.0138 0.0425 0:0255 0:0046 0.0150 0.0124 0.0331 0.0317 0.3654 0.0246 0.2021 0.1036
0.0001 0.0040 0.0001 0.1733 0.5992 0.0040 0.0002 0.0003 0.0038 0.0001 0.0107 0.0001 0.0001
This table reports the estimated elasticities of the BHC’s total risk (TOTRISK) with respect to changes in the Primary and Alternative Models’ explanatory variables. The elasticity estimates are calculated based on the OLS parameter estimates of Equations (45.3) and (45.4) for TOTRISK. The expected signs of the elasticity estimates are presented in the second column. The elasticity estimates and p-values are printed in bold face when the estimate is significant at the 0.10 level
nonlinear effects of the full quadratic form are also not important determinants of total risk sensitivities for our sample of banks. In sum, our basic linear model is robust not only to
alternative independent variables but also to the inclusion of controls for other key bank risks, nonlinearities, interaction terms, and firm-specific fixed effects.
694
M.S. Pagano
45.5 Conclusion
References
We examine the rationales for risk-taking and riskmanagement behavior for U.S. bank holding companies during 1991–2000. By combining the theoretical insights from the corporate finance and banking literatures related to hedging and risk-taking, we formulate an empirical model based on Flannery and James (1984) to determine which of these theories are best supported by the data. Three main conclusions emerge from the analysis:
Allayannis, G. and J. P. Weston. 2001. “The use of foreign currency derivatives and financial market value.” Review of Financial Studies 14, 243–276. Amihud, Y. and B. Lev. 1981. “Risk reduction as a managerial motive for conglomerate mergers.” Bell Journal of Economics 12, 605–617. Angbazo, L. 1997. “Commercial bank net interest margins, default risk, interest-rate risk, and off-balance-sheet banking.” Journal of Banking and Finance 21, 55–87. Berkman, H. and M. E. Bradbury. 1996. “Empirical evidence on the corporate use of derivatives.” Financial Management 25, 5–13. Blair, R. D. and A. A. Heggestad. 1978. “Bank portfolio regulation and the probability of failure.” Journal of Money, Credit, and Banking 10, 88–93. Bodnar, G. M., G. S. Hayt, and R. C. Marston. 1996. “1995 Wharton Survey of derivatives usage by US non-financial firms.” Financial Management 25, 142. Brewer, E., W. E. Jackson, and J. T. Moser. 1996. “Alligators in the swamp: The impact of derivatives on the financial performance of depository institutions.” Journal of Money, Credit, and Banking 28, 482–497. Campbell, T. S. and W. A. Kracaw. 1990. “Corporate risk-management and the incentive effects of debt.” Journal of Finance 45, 1673– 1686. Campbell, T. S. and W. A. Kracaw. 1990b. “A comment on bank funding risks, risk aversion, and the choice of futures hedging instrument.” Journal of Finance 45, 1705–1707. Cebenoyan, A. S., E. S. Cooperman, and C. A. Register. 1999. “Ownership structure, charter value, and risk-taking behavior for thrifts.” Financial Management 28(1), 43–60. Copeland, T. and M. Copeland. 1999. “Managing corporate FX risk: a value-maximizing approach.” Financial Management 28(3), 68–75. Crawford, A. J., J. R. Ezzell, and J. A. Miles. 1995. “Bank CEO payperformance relations and the effects of deregulation.” Journal of Business 68, 231–256. Culp, C. L., D. Furbush, and B. T. Kavanagh. 1994. “Structured debt and corporate risk-management.” Journal of Applied Corporate Finance Fall 7(3), 73–84. Cummins, J. D., R. D. Phillips, and S. D. Smith. 1998. “The rise of riskmanagement.” Economic Review, Federal Reserve Bank of Atlanta 83, 30–40. DeMarzo, P. and D. Duffie. 1995. “Corporate incentives for hedging and hedge accounting.” Review of Financial Studies 8, 743–772. Demsetz, R. S. and P. E. Strahan. 1995. “Historical patterns and recent changes in the relationship between bank holding company size and risk.” FRBNY Economic Policy Review 1(2), 13–26. Deshmukh, S. D., S. I. Greenbaum, and G. Kanatas. 1983. “Interest rate uncertainty and the financial intermediary’s choice of exposure.” Journal of Finance 38, 141–147. Diamond, D. W. 1984. “Financial intermediation and delegated monitoring.” Review of Economic Studies 51, 393–414. Ferson, W. E. and C. R. Harvey. 1991. “The variation of economic risk premiums.” Journal of Political Economy 99, 385–415. Flannery, M. J. and C. M. James. 1984. “The effect of interest rate changes on the common stock returns of financial institutions.” Journal of Finance 39, 1141–1153. Froot, K. A. and J. C. Stein. 1998. “Risk management, capital budgeting, and capital structure policy for financial institutions: An integrated approach.” Journal of Financial Economics 47, 55–82. Froot, K. A., D. S. Scharfstein, and J. C. Stein. 1993. Risk-management: Coordinating corporate investment and financing policies, Journal of Finance 48, 1629–1658. Galloway, T. M., W. B. Lee, and D. M. Roden. 1997. 
“Banks’ changing incentives and opportunities for risk taking.” Journal of Banking and Finance 21, 509–527.
1. The corporate risk-management theories most consistently supported are those related to financial distress and debtholder-related agency costs (with weaker support for the managerial contracting costs, firm size, and hedge substitutes rationales). Thus, despite facing a different regulatory framework and investment opportunities than nonfinancial companies, the factors influencing riskmanagement in commercial banks are similar to those present in nonfinancial industries. This finding also corroborates our view that corporate risk-management and risk-taking are, in effect, two ways of looking at the same risk-return problem (i.e., how should a firm manage its risky assets to maximize value?). 2. The tax, asymmetric information, and some of the managerial contracting cost and firm size rationales for managing risk (Hypotheses H1, H3, H5, H8) are not wellsupported by our sample. Although these rationales may be operant within some banks, they are of lesser importance in our sample of BHCs during 1991–2000. 3. A conventional linear model of risk-management adequately explains cross-sectional and time-series variation in the sample. That is, a more detailed model containing quadratic and interaction terms between all independent variables does not yield significantly different (or better) results than a simple linear model. Our model’s findings are robust to alternate definitions of the independent variables, major changes in bank regulation, firm-specific fixed effects, nonlinearities and interactions between the independent variables, as well as firm-specific controls for other key risks related to credit quality and operating efficiency. Avenues for future research relate to developing more direct measures of bank risk management activities in order to gauge the effectiveness of alternative hedging strategies. Acknowledgments The author wishes to thank Joe Hughes, Oded Palmon, Bob Patrick, and especially Ivan Brick, for helpful comments that greatly improved this paper. The author has also benefited from comments by Fernando Alvarez, Bob Cangemi, Robert DeYoung, Larry Fisher, Bill Lang, C.F. Lee, Ben Sopranzetti, and Kenneth Spong, as well as from participants of the FMA International Conference, Chicago Risk Management Conference, New England Doctoral Students Conference, and Eastern Finance Association Conference. Scott Williams also provided capable research assistance. This research was based on my dissertation at Rutgers University and was partially supported by the New Jersey Center for Research in Financial Services.
45 An Empirical Investigation of the Rationales for Integrated Risk-Management Behavior Geczy, C., B. M. Minton, and C. Schrand. 1997. “Why firms use currency derivatives.” Journal of Finance 52, 1323–1354. Gorton, G. and R. Rosen. 1995. “Corporate control, portfolio choice, and the decline of banking.” Journal of Finance 50, 1377–1420. Graham, J. R. and D. A. Rogers. 2002. “Do firms hedge in response to tax incentives?” Journal of Finance 57, 815–839. Graham, J. R. and C. W. Smith Jr. 1999. “Tax incentives to hedge.” Journal of Finance 54, 2241–2262. Greene, W. H. 1993. Econometric analysis, Macmillan, New York), pp. 370–381. Hamada, R. S. 1972. “The effect of the firm’s capital structure on the systematic risk of common stocks.” Journal of Finance 27, 435–452. Haushalter, G. D. 2000. “Financing policy, basis risk, and corporate hedging: Evidence from oil and gas producers.” Journal of Finance 55, 107–152. Hausman, J. 1978. “Specification tests in econometrics.” Econometrica 46, 1251–1271. Hirtle, B. 1997. “Derivatives, portfolio composition and bank holding company interest rate risk exposure.” Journal of Financial Services Research 12, 243–276. Houston, J. F. and C. James. 1995. “CEO compensation and bank risk: Is compensation in banking structured to promote risk taking?” Journal of Monetary Economics 36, 405–431. Howton, S. and S. Perfect. 1998. “Currency and interest-rate derivatives use in U.S. firms.” Financial Management 27(4), 111–121. Hubbard, R. G. and D. Palia. 1995. “Executive pay and performance: Evidence from the U.S. banking industry.” Journal of Financial Economics 39, 105–130. Hughes, J., W. Lang, L. J. Mester, and C-G. Moon. 1996. “Efficient banking under interstate branching.” Journal of Money, Credit, and Banking 28, 1045–1071. Hughes, J., W. Lang, G. -G. Moon, and M. S. Pagano. 2006. Managerial incentives and the efficiency of capital structure in U.S. commercial banking, Rutgers University, Working Paper. Kashyap, A. and J. C. Stein. 1995. “The impact of monetary policy on bank balance sheets.” Carnegie-Rochester Conference Series on Public Policy 42, 151–195. Keeley, M. C. 1990. “Deposit insurance, risk, and market power in banking.” American Economic Review 80, 1183–1200. Keeton, W. R. and C. S. Morris. 1987. “‘Why do banks’ loan losses differ?’ Federal Reserve Bank of Kansas City.” Economic Review, May, 3–21. Leland, H. E. and D. H. Pyle. 1977. “Informational asymmetries, financial structure, and financial intermediation.” Journal of Finance 32, 371–387. Lin, C. M. and S. D. Smith. 2007. “Hedging, financing, and investment decisions: A simultaneous equations framework.” Financial Review 42, 191–209. Mian, S. L. 1996. “Evidence on corporate hedging policy.” Journal of Financial and Quantitative Analysis 31, 419–439. Meulbroek, L. K. 2002. “A senior manager’s guide to integrated risk management.” Journal of Applied Corporate Finance 14, 56–70. Modigliani, F. and M. H. Miller. 1958. “The cost of capital, corporation finance, and the theory of investment.” American Economic Review 48, 261–297. Morck, R., A. Shleifer, and R. W. Vishny. 1988. “Management ownership and market valuation: an empirical analysis.” Journal of Financial Economics 20, 293–315. Nance, D. R., C. W. Smith Jr., and C. W. Smithson. 1993. “On the determinants of corporate hedging.” Journal of Finance 48, 267–284. Newey, W. and K. West. 1987. “A simple positive semi-definite, heteroscedasticity and autocorrelation consistent covariance matrix.” Econometrica 55, 703–708. Niehans, J. 1978. 
A theory of money, Johns Hopkins, Baltimore, MD, pp. 166–182.
695
Pagano, M. S. 1999. An empirical investigation of the rationales for risk-taking and risk-management behavior in the U.S. banking industry, Unpublished dissertation, Rutgers University. Pagano, M. S. 2001. “How theories of financial intermediation and corporate risk-management influence bank risk-taking.” Financial Markets, Institutions, and Instruments 10, 277–323. Pennacchi, G. G. 1987. “A reexamination of the over- (or under-) pricing of deposit insurance.” Journal of Money, Credit, and Banking 19, 340–360. Saunders, A. 1997. Financial institutions management: a modern perspective, Irwin/McGraw-Hill, New York, NY, pp. 330–331. Saunders, A., E. Strock, and N. G. Travlos. 1990. “Ownership structure, deregulation, and bank risk taking.” Journal of Finance 45, 643–654. Schrand, C. M. 1997. “The association between stock-price interest rate sensitivity and disclosures about derivatives instruments.” The Accounting Review 72, 87–109. Schrand, C. M. and H. Unal. 1998. “Hedging and coordinated riskmanagement: Evidence from thrift conversions.” Journal of Finance 53, 979–1014. Shapiro, A. C. and S. Titman. 1985. “An integrated approach to corporate risk-management.” Midland Corporate Finance Journal 3, 41–56. Sinkey, J. F., Jr. and D. Carter. 1994. “The derivatives activities of U.S. commercial banks,” in The declining role of banking, Papers and proceedings of the 30th annual conference on bank structure and regulation, Federal Reserve Board of Chicago, pp. 165–185. Smith, C. W., Jr. 1995. “Corporate risk management: Theory and practice.” Journal of Derivatives 2, 21–30. Smith, C. W., Jr. and R. M. Stulz. 1985. “The determinants of firms’ hedging policies.” Journal of Financial and Quantitative Analysis 20, 391–405. Smith, C. W., Jr. and R. L. Watts. 1992. “The investment opportunity set and corporate financing, dividend, and compensation policies.” Journal of Financial Economics 32, 263–292. Spong, K. R. and R. J. Sullivan. 1998. “How does ownership structure and manager wealth influence risk?” Financial Industry Perspectives, Federal Reserve Bank of K.C., pp. 15–40. Stein, J. C. 1998. “An adverse-selection model of bank asset and liability management with implications for the transmission of monetary policy.” RAND Journal of Economics 29, 466–486. Stulz, R. M. 1984. “Optimal hedging policies.” Journal of Financial and Quantitative Analysis 19, 127–140. Stulz, R. M. 1990. “Managerial discretion and optimal financing policies.” Journal of Financial Economics 26, 3–27. Stulz, R. M. 1996. “Rethinking risk-management,” Journal of Applied Corporate Finance 9, 8–24. Tufano, P. 1996. “Who manages risk? An empirical examination of riskmanagement practices in the gold mining industry.” Journal of Finance 51, 1097–1137. Tufano, P. 1996b. “How financial engineering can advance corporate strategy.” Harvard Business Review 74, 136–140. Tufano, P. 1998a. “The determinants of stock price exposure: financial engineering and the gold mining industry.” Journal of Finance 53, 1015–1052. Tufano, P. 1998b. “Agency costs of corporate risk-management.” Financial Management 27, 67–77. Unal, H. and E. J. Kane. 1988. “Two approaches to assessing the interest rate sensitivity of deposit institution equity returns.” Research in Finance 7, 113–137. Wall, L. D. and J. J. Pringle. 1989. “Alternative explanations of interest rate swaps: A theoretical and empirical analysis.” Financial Management 18, 59–73. Wolf, R. 2000. 
“Is there a place for balance-sheet ratios in the management of interest rate risk?” Bank Accounting and Finance 14(1), 21–26.
Chapter 46
Copula, Correlated Defaults, and Credit VaR Jow-Ran Chang and An-Chi Chen
Abstract Almost every financial institution devotes a lot of attention and energy to credit risk. The default correlations of credit assets have a fatal influence on credit risk. How to model default correlation correctly has become a prerequisite for the effective management of credit risk. In this thesis, we provide a new approach to estimating future credit risk on TM target portfolio based on the framework of CreditMetrics by J.P. Morgan. However, we adopt the perspective of factor copula and then bring the principal component analysis concept into factor structure to construct a more appropriate dependence structure among credits. In order to examine the proposed method, we use real market data instead of virtual ones. We also develop a tool for risk analysis that is convenient to use, especially for banking loan businesses. The results indicate that people assume dependence structures are normally distributed, which could lead to underestimated risks. On the other hand, our proposed method captures better features of risks, including conspicuous fat-tail effects, even though the factors appear normally distributed. Keywords Credit risk r Default correlation Principal component analysis r Credit VaR
r
Copula
r
46.1 Introduction Credit risk is a risk that generally refers to counterparty failure to fulfill its contractual obligations. The history of financial institutions has shown that many banking association failures are due to credit risk. For the sake of integrity and regularity, financial institutions attempt to quantify credit risk as well as market risk. Credit risk has great influence on all financial institutions as long as they have
J.-R. Chang () National Tsing Hua University, Hsinchu, Taiwan e-mail: [email protected] A.-C. Chen KGI Securities Co. Ltd., Taipei City, Taiwan e-mail: [email protected]
contractual agreements. The evolution of measuring credit risk has been ongoing for a long time. Many credit risk meaTM sure models have been published, such as CreditMetrics by J.P. Morgan, CreditRisk C by Credit Suisse. On the other side, New Basel Accords (Basel II Accords), which are recommendations concerning on banking laws and regulations, have constructed a standard to promote greater stability in financial systems. Basel II Accords allows banks to estimate credit risk by using either a standardized model or an internal model approach, based on their own risk management system. The standardized model approach is based on external credit ratings provided by external credit assessment institutions. It describes the weights, which fall into five categories for banks and sovereigns and four categories for corporations. The internal model approach allows banks to use their internal estimation of creditworthiness, subject to regulatory. Our thesis focuses on how to build a credit risk measurement model after banking has constructed internal customer credit rating and how to estimate banks’ default probability and default correlations. We attempt to implement a credit risk model tool that will link to an institution’s internal banking database and give relevant reports automatically. The developed model should facilitate banks’ ability to boost their risk management capability. The dispersion of the credit losses, however, critically depends on the correlations between default events. Several factors such as industry sectors and corporation sizes will affect correlations between every two default events. The TM CreditMetrics model (Gupton et al. 1997) issued by J.P. Morgan proposed a binomial normal distribution to describe the correlations (dependence structures). In order to describe the dependence structure between two default events in detail, we adopt a Copula function instead of binomial normal distribution to express the dependence structure. When estimating credit portfolio losses, both the individual default rates of each firm and joint default probabilities across all firms need to be considered. These features are similar to the valuation process of Collateralized Debt Obligation (CDO). A CDO is a way of creating securities with widely different risk characteristics from a portfolio of
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_46,
697
698
J.-R. Chang and A.-C. Chen
debt instrument. The estimating process is almost the same between our goal and CDO pricing. We focus on how to estimate risks. Most CDO pricing literature adopted copula functions to capture the default correlations. David, Li (2000) extended Sklar’s issue (1959) that a copula function can be applied to solve financial problems of default correlation. Li (2000) pointed out that if the dependence structure were assumed to be normally distributed through binomial normal probability density function, the joint transformation probability would be consistent with the result from using a Normal copula function. But this assumption is too strong. It has been discovered that most financial data have skewed or fat-tail phenomenon. Bouye et al. (2000) and Embrechts et al. (1999) pointed out that the estimating VaR would be underestimated if the dependence structure was described by Normal copula versus to actual data. Hull and White (2004) combined factor analysis and copula functions as a factor copula concept to investigate reasonable spread of CDO. Our main objective is to find a suitable correlation to describe the dependence structure between every two default events and to speed up the computational complexity. This paper aims to: 1. Construct an efficient model to describe the dependence structure 2. Use this constructed model to analyze overall credit, marginal, and industrial risks 3. Build an automatic tool for the banking system to analyze its internal credit risks.
46.2 Methodology 46.2.1 CreditMetrics
TM
TM
This paper adopts the main framework of CreditMetrics and calculates credit risks by using real commercial bank loans. The calculating dataset for this paper is derived from
Exposures
a certain commercial bank in Taiwan. Although there may be some conditions that are different from the situations TM proposed by CreditMetrics , the calculating process by TM CreditMetrics can still be appropriately applied to this paTM per. For instance, CreditMetrics adopts S&P’s rating category of seven ratings degree; that is, AAA to C; but in this loan dataset, there are nine degrees. The following is the inTM troduction to CreditMetrics model framework. This model can be roughly divided into three components; that is, value at risk due to credit, exposures, and correlations, respectively, as shown in Fig. 46.1. In this section, these three components and how our model works out credit risk valuation will be briefly examined. For details, please refer to the TM CreditMetrics technique document.
46.2.1.1 Value at Risk Due to Credit The process of valuing value at risk due to credit can be deconstructed into three steps. For simplicity, we assumed there is only one standalone instrument, which is a corporation bond. (The bond property is similar to loan as they both receive certain amounts of cash flow every period and principal at the maturity). This bond has a 5-year maturity and pays an annual coupon at the rate of 5%. These factors can be used to express the calculation process. Some modifications to fit real situations will be considered later. Step 1: TM CreditMetrics assumes all risks of any one portfolio due to credit rating changes regardless of defaulting or rating migrating. It is significant to estimate not only the likelihood of default, but also the possibility of migration toward any possible credit quality state at the risk horizon. Therefore, a standard system that evaluates “rating changing” under a certain horizon of time is necessary. This information is represented more concisely in a transition matrix, which can be calculated by observing the historical pattern of rating change and default. Transition matrixes are published by S&P and Moody’s rating agencies, or they can be calculated by private banking internal rating systems. Transition matrixes should
Correlations
Value at Risk due to Credit
User portfolio
Credit Rating
Seniority
Credit Spread
Recovery Rate In Default
Present Value Revaluation
Market Volatilities
Rating Migration Likelihood
Exposure Distributions
Standard Deviation of value due to credit qualities changes for a single exposure
TM
Fig. 46.1 Structure of CreditMetrics
model
Rating Series
Models (correlations)
Joint Credit Rating Changes
46 Copula, Correlated Defaults, and Credit VaR
Table 46.1 One-year transition matrix
699
Rating at year-end (%) Initial rating
AAA
AA
A
BBB
BB
B
CCC
D
AAA AA A BBB BB B CCC
90:81 0:70 0:09 0:02 0:03 0 0:22
8:33 90:65 2:27 0:33 0:14 0:11 0
0:68 7:79 91:05 5:95 0:67 0:24 0:22
0:06 0:64 5:52 86:93 7:73 0:43 1:30
0:12 0:06 0:74 5:30 80:53 6:48 2:38
0 0:14 0:26 1:17 8:84 83:46 11:24
0 0:02 0:01 0:12 1:00 4:07 64:86
0 0 0:06 0:18 1:06 5:20 19:79
Source: J.P. Morgan’s CreditMetrics Table 46.2 Recovery rates by seniority class Recovery rate of Taiwan debt business research using TEJ data Class Loan Corporation bond
Mean (%)
Standard deviation (%)
Secured Unsecured Secured
55.38 33.27 67.99
35.26 30.29 26.13
Unsecured
36.15
37.17
Source: Da-Bai Shen et al. (2003), Research of Taiwan recovery rate with TEJ Data Bank
be estimated at the same time interval (risk horizon) defined by user demand, usually is a 1-year period. Table 46.1 is an example of a 1-year transition matrix. In the transition matrix table, AAA level is the highest credit rating and D is the lowest; D also represents a predicted default. According to the above transition matrix Table 46.1, a company that stays at the AA level at the beginning of the year has a 0.64% probability declining to BBB level at yearend. In the same way, a company that stays at the CCC level at the beginning of the year has a 2.38% probability of ascending to BB level at year-end. In this paper, the transition matrix is shown as an external data.1 In Step 1, we describe the likelihood of migration to any possible quality states (AAA to CCC) at the risk horizon. Step 2 is valuation. The value at the risk horizon must be determined. According to different states, the valuation falls into two categories. First, in the event of a default, recovery rate of different seniority class is needed. Second, in the event of up (down) grades, the change in credit spread that results from the rating migration must also be estimated. In the default category, Table 46.2 shows the recovery rates by seniority class, which we use to revaluate instruments. For instance, if the holding bond (5-year maturity with an annual coupon paying at the rate of 5%) is unsecured 1
We do not focus on how to model probability of default (PD) but on how to establish the dependence structure. The one-year transition matrix is a necessary input to our model.
TM
– Technical document (Gupton et al. 1997)
Table 46.3 One-year forward zero curves by credit rating category Category Year 1 Year 2 Year 3 Year 4 3:60 3:65 3:72 4:10 5:55 6:05 15:05
AAA AA A BBB BB B CCC
4:17 4:22 4:32 4:67 6:02 7:02 15:02
Source: J.P. Morgan’s CreditMetrics et al. 1997)
TM
4:73 4:78 4:93 5:25 6:78 8:03 14:03
5:12 5:17 5:32 5:63 7:27 8:52 13:52
– technical document (Gupton
and the default occurs, the recovery value will be estimated using its mean value, which is 36.15%. In the rating migration category, revaluation determines the cash flows, which result from holding the instrument (corporation bond position). Assuming a face value of $100, the bond pays $5 (an annual coupon at the rate of 5%) each at the end of the next 4 years. Now, the calculating process to describe the value V of the bond assuming the bond upgrades to level A by the formula below: V D 5C C
5 5 5 C C 2 .1 C 3:72%/ .1 C 4:32%/ .1 C 4:93%/3
105 D 108:66 .1 C 5:32/4
The discount rate in the above formula comes from the forward zero curves shown in Table 46.3, which are derived TM from CreditMetrics technical documentation. This paper does not focus on how to calculate forward zero curves. It is also shown as external input data. In Step 3, we estimate the volatility of value due to credit quality changes for this stand alone exposure (level A, corporation bond). From Steps 1 and 2, the likelihood of all possible outcomes and distribution of values within each outcome TM are known. CreditMetrics uses two measures to calculate the risk estimate: One is standard deviation, and the other is percentile level. Besides these two measures, our paper also embraces marginal VaR, which denotes the increment VaR due to adding one new instrument to the portfolio.
700
46.2.1.2 Exposures As discussed above, the instrument is limited to corporation TM bonds. CreditMetrics allows the following generic exposure types: 1. 2. 3. 4. 5.
Non-interest bearing receivables; Bonds and loans; Commitments to lend Financial letters of credit; and Market-driven instruments (swap, forwards, and so forth).
Here we focus on loan exposure. The credit risks calculation process of loans is similar to bonds as shown in the previous example. The only difference is that loans do not pay coupons. Instead, loans receive interests. But the TM CreditMetrics model can definitely fit our goal of estimating credit risks on the banking loan business.
46.2.1.3 Correlations In most circumstances, there is usually more than one instrument in a target portfolio. We will now take multiple exposures into consideration. In order to extend the methodology to a portfolio of multiple exposures, estimating the contribution to risk brought by the effect of non-zero credit quality correlations is necessary. Thus, the estimation of joint likelihood in the credit quality co-movement is the next problem to be resolved. There are many academic papers that address the difficulties of estimating correlations within a credit portfolio. For example, Gollinger and Morgan (1993) used time series of default likelihood to correlate default likelihood, and Stevenson and Fadil (1995) correlated the default experience across 33 industry groups. On the other hand,
Fig. 46.2 Distribution of asset returns with rating change thresholds
J.-R. Chang and A.-C. Chen TM
CreditMetrics proposes a method to estimate default correlation, including the following assumptions: (a) A firm’s asset value is the process that drives its credit rating changes and default. (b) The asset returns are normally distributed. (c) Two asset returns are correlated and bivariate normally distributed, and multiple asset returns are correlated and multivariate normally distributed. According to Assumption A, individual threshold of one firm can be calculated. For a two-exposure portfolio, with credit ratings of level B and level AA, and standard deviation of returns of ¢ and ¢ 0 , respectively, it only remains to specify the correlation ¡ between two asset returns. The covariance matrix for the bivariate normal distribution: †D
2 0
0 02
Then the joint probability of co-movement that both two firms remain at the same credit rating can be described by the following formula: ˚ Pr ZBB < R1 < Z B ; Z 0 AAA < R2 < Z 0 AA Z ZB Z Z 0 AA D f .r; r 0 I †/.dr0 /dr ZBB
Z 0 AAA
Where ZBB ; ZB ; Z0 AAA ; Z0 AA are the thresholds. Figure 46.2 gives a concept of the probability calculation. These three assumptions estimating the default correlation are too strong, especially assuming the multiple asset returns are multinormally distributed. In the next session, a better way of using copula to examine the default correlation is proposed.
46 Copula, Correlated Defaults, and Credit VaR
701
46.2.2 Copula Function
distribution F1 ; F2 ; : : : ; Fm . There exits an copula C: Œ0; 1 m ! Œ0; 1 such that,
Consider a portfolio consists of m credits. Marginal distribution of each individual credit risks (defaults occur) can be constructed by using either the historical approach or the market implicit approach (derived credit curve from market information). But how do we describe the joint distribution or co-movement between these risks (default correlation)? In a sense, every joint distribution function for a vector of risk factors implicitly contains both a description of the marginal behavior of individual risk factors and a description of their dependence structure. The simplest way is assuming the dependence structure to be mutual independence amount the credit risks. However, the independent assumption of the credit risks is obviously not realistic. Undoubtedly, the default rate for a group of credits tends to be higher when the economy is in a recession and lower when the economy is booming. This implies that each credit is subject to the same factors affecting the macroeconomic environment, and that there exists some form of dependence among these credits. The copula approach provides a way of isolating the description of the dependences structure. That is, the copula provides a solution to specify a joint distribution of risks, with given marginal distributions. Of course, there is no unique solution for this problem. There are many different techniques in statistics that can specify a joint distribution with given marginal distributions and a correlation structure. In the following section, the copula function is briefly introduced.
46.2.2.1 Copula function A m-dimension copula is a distribution function on Œ0; 1 m with standard uniform marginal distributions. C .u/ D C.u1 ; u2 ; : : : ; um /
(46.1)
C is called a copula function. The Copula function C is a mapping of the form C W Œ0; 1 m ! Œ0; 1 ; that is, a mapping of the m-dimensional unit cube Œ0; 1 m such that every marginal distribution is uniform on the interval [0,1]. The following two properties must hold 1. C.u1 ; u2 ; : : : ; um ; †/ is increasing in each component ui 2. C.1; : : : ; 1; ui ; 1; : : : ; 1; †/ D ui for all i 2 f1; : : : ; mg; ui 2 Œ0; 1 .
F .x1 ; x2 ; : : : ; xm / D C.F1 .x1 /; F2 .x2 /; : : : Fm .xm // (46.2) If the margins are continuous, then C is unique. For any x1 ; : : : ; xm in < D Œ1; 1 and X has joint distribution function F , then F .x1 ; x2 ; : : : ; xm / D PrŒF1 .X1 / F1 .x1 /; F2 .X2 / F2 .x2 /; : : : ; Fm .Xm / Fm .xm / (46.3) According to Equation (46.2), the distribution function of (F1 .X1 /, F2 .X2 /; : : : ; Fm .Xm /) is a copula. Let xi D Fi1 .ui /, then C.u1 ; u2 ; : : : ; um / D F F11 .u1 /; F21 .u2 /; : : : ; Fm1 .um / (46.4) This gives an explicit representation of C in terms of F and its margins.
46.2.2.3 Copula of F Li (2000) used the copula function conversely. The copula function links univariate marginals to their full multivariate distribution. For m uniform random variables, U1 ; U2 ; : : : ; Um , the joint distribution function C , defined as C.u1 ; u2 ; : : : ; um ; †/ D PrŒU1 u1 ; U2 u2 ; : : : ; Um um (46.5) where, † is correlation matrix of U1 ; U2 ; : : : ; Um . For given univariate marginal distribution functions F1 .x1 /; F2 .x2 /; : : : ; Fm .xm /. The same as above, let xi D Fi1 .ui /, the joint distribution function F can be describe as following F .x1 ; x2 ; : : : ; xm / D C.F1 .x1 /; F2 .x2 /; : : : ; Fm .xm /; †/ (46.6) The joint distribution function F is defined by using a copula. The property can be easily shown as follows: C.F1 .x1 /; F2 .x2 /; : : : Fm .xm /; †/ D PrŒU1 F1 .x1 /; U2 F2 .x2 /; : : : ; Um Fm .xm /
46.2.2.2 Sklar’s Theorem Sklar (1959) underlined applications of the copula. Let F ./ be a m-dimension joint distribution function with marginal
D PrŒF11 .U1 / x1 ; F21 .U2 / x2 ; : : : ; Fm1 .Um / xm D PrŒX1 x1 ; X2 x2 ; : : : ; Xm xm D F .x1 ; x2 ; : : : ; xm /
702
J.-R. Chang and A.-C. Chen
The marginal distribution of Xi is C.F1 .C1/; F2 .C1/; : : : ; Fi .xi /; : : : ; Fm .C1/; †/ D PrŒX1 C1; X2 C1; : : : ;
correlation matrix. If there are more and more instruments .N > 1;000/ in our portfolio, we need to store N by N correlation matrix, because scalability is a problem. The other advantage is to speed up the computation time because of the lower dimension.
Xi xi ; : : : ; Xm C1 D PrŒXi xi
46.2.3 Factor Copula Model
D Fi .xi /
(46.7)
Li showed that with given marginal functions, we can construct the joint distribution through some copulas accordingly. But what kind of copula should be chosen to correspond with realistic joint distribution of a portfolio? For TM example, CreditMetrics chose the Gaussian copula to construct multivariate distribution. By Equation (46.6), this Gaussian copula is given by: C Ga .u; †/ D Pr.ˆ.X1 / u1 ; ˆ.X2 / u2 ; : : : ; ˆ.Xm / um ; †/ D ˆ† .ˆ1 .u1 /; ˆ1 .u2 /; : : : ; ˆ1 .um // (46.8) Where ˆ denotes the standard univariate normal distribution, ˆ1 denotes the inverse of a univariate normal distribution, and ˆ† denotes multivariate normal distribution. In order to easily describe the construction process, we only discuss two random variables u1 and u2 to demonstrate the Gaussian copula. Z C Ga .u1 ; u2 ; / D
ˆ1 .u1 / 1
Z
ˆ1 .u2 / 1
1
2
p .1 2 /
v21 2v1 v2 C v22 d v2 d v1
exp 2.1 2 /
(46.9) ¡ denotes the correlation of u1 and u2 . Equation (46.9) is also equivalent to the bivariate normal copula, which can be written as follows: C.u1 ; u2 ; / D ˆ2 .ˆ1 .u1 /; ˆ1 .u2 //
(46.10)
Thus, given individual distribution (for example, migration over 1 year’s horizon) of each credit asset within a portfolio, we can obtain the joint distribution and default correlation of this portfolio through copula function. In our methodology, we do not use copula function directly. In the next section, we introduce the concept of factor copula for further improving the default correlation. Using factor copula has two advantages. One is to avoid constructing a high dimension
In this section, we examine factor copula models. A factor copula model describes a dependence structure between random variables, not from the perspective of a certain copula form, such as Gaussian copula, but from the factors model. Factor copula models have been broadly used to assess the price of collateralized debt obligations (CDO) and credit default swaps (CDS). The main concept of the factor copula model is that under a certain macro environment, credit default events are independent to each other and are usually the results of market economic conditions. This model provides another way of avoiding multivariate normal distribution (high dimensional) simulation problem. Continuing on the above example, a portfolio consists of m credits. We consider the simplest example, which contains only one factor. Define Vi is the asset value of i th-credit under single factor copula model. Then this i th-credit asset value can be express by one factor M (mutual factor) chosen from macro economic factors and one error term "i . Vi D r i M C
q 1 ri2 "i
(46.11)
Where ri is weight of M , and the mutual factor M is independent of "i . Let the marginal distribution of V1 ; V2 ; : : : ; Vm are Fi ; i D 1; 2; : : : ; m. Then the m-dimensional copula function can be written as C.u1 ; u2 ; : : : ; um / D F F11 .u1 /; F21 .u2 /; : : : ; Fm1 .um / D Pr V1 F11 .u1 /; V2 F21 .u2 /; : : : ; Vm Fm1 .um / (46.12) F is the joint cumulative distribution function of V1 ; V2 ; : : : ; Vm . It is known that M and "i are independent of each other, according to iterated expectation theorem, and Equation (46.12) can be written as C.u1 ; u2 ; : : : ; um / ˚ D E Pr.V1 F11 .u1 /; V2 F21 .u2 /; : : : ; Vm Fm1 .un //jM
46 Copula, Correlated Defaults, and Credit VaR
( DE
m Y
703
)
q 1 2 Pr ri M C 1 ri "i Fi .ui / jM
i D1
9 8 0 1 > ˆ m = ˆ ; :i D1 1 ri2 0 0 11 Z Y m 1 B B F .ui / ri M CC D @ F";i @ i q AA g.M /dM 1 ri2 i D1
Where the yi are common factors between these m credits, and rij is the weight (factor loading) of each factor. The factors are independent of each other. How are those yi factors and their loading determined? We use pca to derive the factor loading. Factor loadings are based on the listed price of those companies in the portfolio. The experimental results will be shown in next section. We note here that the dependence structure among assets has been absorbed into factor loadings (Fig. 46.3).
(46.13)
46.3 Experimental Results Using above formula, m-dimensional copula function can be derived. Moreover, according to Equation (46.13), the joint cumulative distribution F can also be derived F .t1 ; t2 ; : : : ; tm / 0 0 11 1 Z Y m B B F .FT;i .ti // ri M CC D @ F";i @ i q AA g.M /dM 1 ri2 i D1 (46.14) Let Fi .ti / D Pr.Ti ti / represent i -credit default probability (default occurs before time ti ), where Fi is marginal cumulative distribution. We note here that CDX pricing cares about when the default time Ti occurs. Under the same environment (systematic factor M ) Andersen and Sidenius (2004), the default probability Pr.Ti ti / will equal to Pr.Vi ci /, which represent the probability asset value Vi is below its threshold ci . Then joint default probability of these m credits can be described as follows F .c1 ; c 2 ; : : : ; cm / D Pr.V1 c1 ; V2 c2 ; : : : ; Vm cm / Now, we bring the concept of principal component analysis (pca); pca is used to reduce the high dimensions or multivariable problems. In trying to explain a movement of random variables, one has to gather interpreting variables that relate to those variables movements or their correlation. Once the kinds of interpreting variables become too huge or complicated, it is even harder to explain those random variables and the ensuing problems become more complex. Principal component analysis provides a way to extract approximate interpreting variables to cover maximum variance of variables. Those representative variables may not be “real” variables. Virtual variables are allowed and depend on the explaining meaning. We will not discuss pca calculation processes here, but see Jorion (2000) for further details. Based on factor model, the asset value of m credits with covariance matrix ˙ can be described as follows: Vi D ri1 y1 C ri 2 y2 C : : : C ri m ym C "i
(46.15)
The purpose of this paper is to estimate credit risk by using principal component analysis to construct dependence structure without giving any assumptions to specify formulas of copula. In other words, the data is based on itself to describe the dependence structure.
46.3.1 Data In order to analyze credit VaR empirically through proposed method, this investigation adopts the internal loan account data, loan application data, and customer information data from a commercial bank in Taiwan. For reliability and authenticity, all the data is from the Taiwan market. This also means the portfolio pool contains only the loans of listed companies and does not include contain the loans of unlisted companies. According to the period of these data, we can estimate two future portfolio values for 2003 and 2004, respectively. All requirement data are downloaded automatically from a database system to a workspace for computations. Before going into the detail of the experiments, the relevant data and experimental environment are introduced as follows.
46.3.1.1 Requirements of Data Input 1. Commercial bank internal data: This internal data contains nearly 40,000 entries of customer data, 50,000 entries of loan data, and 3,000 entries of application data. These data contain maturity dates, outstanding amounts, credit ratings, interest rate for lending, market type, and so forth up to December 31, 2004. 2. One-year period transition matrix: The data was extracted from Yang (2005), who used the same commercial bank history data to estimate a transition matrix, which followed the Marcov Chain (Table 46.4).
704
J.-R. Chang and A.-C. Chen
Fig. 46.3 Architecture of proposed model
Table 46.4 One-year transition matrix (commercial data) Rating at year-end (%) Initial rating 1 2 3 4 5 6 7 8 9 D Source: Yang (2005)
1
2
3
4
5
6
7
8
9
D
100 3:53 10:76 1:80 0:14 0:09 0:05 0:01 0 0
0 70:81 0:03 1:36 0:44 0:06 0:05 0 0 0
0 8:90 72:24 5:85 1:58 0:94 0:27 0:03 0:02 0
0 5:75 0:24 57:13 2:39 2:44 3:72 0:45 0:09 0
0 6:29 10:39 18:75 75:47 13:66 3:75 0:21 1:46 0
0 1:39 5:78 11:31 16:97 70:58 14:49 1:34 1:80 0
0 0:19 0:31 2:45 1:49 6:95 66:39 2:00 1:36 0
0 2:74 0:09 0:32 0:61 1:68 8:05 77:10 3:08 0
0 0:06 0:06 0:70 0:49 0:76 0:12 0:44 70:06 0
0 0:34 0:10 0:33 0:42 2:81 3:11 18:42 22:13 100
46 Copula, Correlated Defaults, and Credit VaR
705
Table 46.5 One-year forward zero curves by credit rating category (commercial data) Yield (%) 1 year 2 year 3 year 4 year 5 year 6 year Credit rating
7 year
8 year
9 year
1
1:69
2:08
2:15
2:25
2:41
2:53
2:58
2:62
2:7
2
2:57
2:88
3:19
3:44
3:72
3:94
4:07
4:18
4:33
3
2:02
2:41
2:63
2:85
3:11
3:32
3:45
3:56
3:71
4
2:6
2:93
3:28
3:59
3:91
4:17
4:34
4:48
4:65
5
2:79
3:1
3:48
3:81
4:15
4:42
4:6
4:75
4:93
6
4:61
5:02
5:16
5:31
5:51
5:67
5:76
5:83
5:93
7
6:03
6:16
6:56
6:83
7:07
7:23
7:28
7:31
7:36
8
22:92
23:27
22:54
21:91
21:36
20:78
20:15
19:52
18:94
9
27:51
27:82
26:4
25:17
24:09
23:03
21:97
20:97
20:08
Source: Yang (2005)
3. Zero forward rate: Refer to Yang (2005), based on computed transition matrix to estimate the term structure of credit spreads. Furthermore, they added corresponding risk free interest rate to calculate zero forward rates from discounting zero spot rates (Table 46.5). 4. Listed share prices at exchange and over-counter markets: We collected weekly listed shares prices of all companies at exchange and over-counter market rates in Taiwan from January 1, 2000 to December 31, 2003, in total 3 years data, through Taiwan Economic Journal Data Bank (TEJ).
factors obey some distributions (for example, standard normal distribution) to simulate their future asset value and future possible credit ratings. 5. Discount the outstanding amounts by their own forward zero curves to evaluate future values distributions, according to possible credit ratings. 6. Display the analysis results.
46.3.3 Discussion 46.3.3.1 Tools and Interfaces Preview
46.3.2 Simulation In this section, the simulation procedure of analyzing banking VaR is briefly introduced. There are two main method: A and B. A is the method that this paper proposed, which uses factor analysis to explain the dependence structure and to simulate the distribution of future values. B is contrast set, which is traditionally and popularly used in most applicaTM tions such as CreditMetrics . We call it the multi-normal (Normal/Gaussian copula) simulation method. These two methods need three input data tables: credit transition matrix, forward zero curves, and shares prices of each corporation in a portfolio pool. As we do not describe the details of normal copula method procedure here, readers TM can refer to CreditMetrics technical documentation. The process of factor analysis is shown as follows: 1. Extract the data entries that do not mature under a given date from the database system, including credit ratings, outstanding amounts, interest rates, and so forth. 2. Calculate standardized thresholds for each credit rating, according to the input transition matrix. 3. Use the shares prices of those corporations in the portfolio pool to calculate equities correlations. 4. Use principal component analysis to obtain each factor loadings for all factors. Under the assumption that these
For facility and convenience, this paper uses MATLAB and MySQL to construct an application tool to help tool users analyze future portfolio value more efficiently. This tool’s interactive interfaces is as follows:
Basic Information of Experimental Data: (Pie Chart) Extracts the non computing data and shows as pie charts to give users the basic view of loan information. These charts present the proportion of each of three key elements: loan amount of companies, industry, and credit rating. In order to assist users, we have constructed an overview of a concerned portfolio (Fig. 46.4).
Information According to Experimental Data: (Statistic Numbers) The first part shows the extraction of the company data, which has maturity beyond the given months and the second part shows the essential data of top weighted companies. Part I and II extract data without any computation, the only thing that has been removed is some useless data (Figs. 46.5 and 46.6).
706
J.-R. Chang and A.-C. Chen
Fig. 46.4 Interface of Part I (pie chart of loan amount weight in terms of enterprise, industry and credit rating)
5. This function gives users the option to estimate all or portion of the companies in the portfolio pool. The portion part is sorted according to the loan amount of each enterprise. Users can choose multiple companies. The computational result is written to a text file for further analysis. 6. Distribution of factors. This is also defined for the PCA method. There are two distributions that users can choose: standard normal distribution or student-t distribution. The default freedom of student-t distribution is set as one (Fig. 46.7).
Report of Overall VaR Contributor Fig. 46.5 Interface of Part II
Set Criteria and Derived Fundamental Experimental Result This portion is the core of our proposed tool; it provides various computations. Here are the parameters that users must decide themselves: 1. Estimated year 2. Confidence level 3. Simulation times. Of course, the more simulation time users choose, the more computational time needed. 4. Percentage of explained factors that defined for PCA method. Using the eigenvalues of given normalized assets (equities) values, we can determinate the explained percentage.
Users may be more interested in the detail of risk-profile at various levels. In this part, industries are separated into 19 sections and credits into nine levels. This allows users to see where the risk is concentrated visually (Fig. 46.8).
46.3.3.2 Experimental Result and Discussion Table 46.6 represents the experimental results. For objectivity, all simulations times are set to 100,000 times, which is large enough to obtain stable numerical results2 . Based on the provided data, the 1-year future portfolio value of listed corporations on 2003 and 2004 can be estimated. In other
2
We have examined the simulation times: 100,000 times is enough to have a stable computational result.
46 Copula, Correlated Defaults, and Credit VaR
707
Fig. 46.6 Companies data downloads from Part II interface
Fig. 46.7 Interface of Part III
words, for instance, stands on t0 D January 1, 2002 to estimate the portfolio value at the time tT D December 31, 2003. Or stands on t0 D January 1, 2003 to estimate the portfolio value at the time tT D December 31, 2004. The following Table 46.6 lists the experimental results of factor copula methods of different factor distributions and compares TM the results with multi-normal method by CreditMetrics . Parameter setting and the remaining fields are experimental results. We note here Equation (46.15)
Vi D ri1 y1 C ri 2 y2 C : : : C ri m ym C "i the distribution of factors y1 ; y2 : : : ym listed in following table are standard normally distributed and student-t distributed (assumes freedoms are 2, 5 and 10). There are some messages that can be derived from above table. First, obviously, risk of future portfolio value by multi-normal method is less than by proposed method. Risk amount of proposed method is three to five times greater
708
J.-R. Chang and A.-C. Chen
Fig. 46.8 VaR Contribution of Individual credits and industries
Table 46.6 Experimental results of the estimated portfolio value at the end of 2003
Estimate year: 2003. Parameter setting: simulation runs: 100,000; percentage of explained factors: 100.00%; involved listed enterprises: 40; loan account entries: 119.

Factor distribution      Method         Credit VaR 95%   Credit VaR 99%   Portfolio mean    Portfolio s.d.
Normal, F ~ N(0,1)       Multi-normal   192,113.4943     641,022.0124     3,931,003.1086    136,821.3770
                         PCA            726,778.6308     1,029,766.9285   3,812,565.6170    258,628.5713
Student-t, d.f. = 2      Multi-normal   191,838.2019     620,603.6273     3,930,460.5935    136,405.9177
                         PCA            1,134,175.1655   1,825,884.8901   3,398,906.5097    579,328.2159
Student-t, d.f. = 5      Multi-normal   192,758.7482     610,618.5048     3,930,923.6708    135,089.0618
                         PCA            839,129.6162     1,171,057.2562   3,728,010.5847    337,913.7886
Student-t, d.f. = 10     Multi-normal   192,899.0228     600,121.1074     3,930,525.7612    137,470.3856
                         PCA            773,811.8411     1,080,769.3036   3,779,346.2750    291,769.4291
Table 46.7 CPU Time for factor computation (Simulation time: 100,000 years: 2003)
J.-R. Chang and A.-C. Chen
``` Explained `` ```ratio (s) ``` Method
100%
90–95%
80–85%
70–80%
Below 60%
Multi-normal PCA
2:5470 1:2030
2:6090 0:7810
2:2350 0:7030
2:2500 0:6720
2.3444 0.6090
This result corresponds with most research findings that copula functions can more adequately capture the fat-tail phenomenon that prevails in actual markets. Second, the distribution of the future portfolio value under the proposed method is more dispersed than under the multi-normal method: the multi-normal distribution concentrates near 400,000 with a peak frequency of about 50,000 draws, whereas the proposed method's peak frequency is only about 17,000. Third, it is apparent that the risk simulated with Student-t factors exceeds that with normal factors, and that the risk amounts converge as the degrees of freedom become larger. Fourth, the portfolio mean of the proposed method is smaller than that of the multi-normal method, but its standard deviation is much larger; the overall possible portfolio values under the proposed method can become less valuable and also fluctuate more widely.

The discrepancies between the two methods support several inferences. First, the proposed method, by using market data, provides another way to estimate the actual credit risks in a portfolio of risky credits, and it captures fat-tail events more notably. Second, the computation time of the proposed method is shorter than that of the multi-normal method: as shown in Table 46.7, even when fully explained factors are used, computation by the proposed method is still faster, and the computation time decreases as the required explained ratio is set lower, since fewer factors are needed for the expected explained level. Third, Table 46.8, which reports the individual Credit VaR contributions to the whole portfolio from the 19 industries, shows that the main risk comes from the Electronics Industry. In the commercial data, Electronics Industry customers account for more than half of the loan account entries (63 of 119). The Credit VaR of the Electronics Industry computed by the proposed method is six times that computed by the multi-normal method. This reveals that the multi-normal method lacks the ability to capture concentrated risks; in contrast, the mutual factor loadings extracted from the correlation among companies under the factor structure express the actual risks more fully. Fourth, for finite degrees of freedom, the t-distribution has fatter tails than the Gaussian distribution and is known to generate tail dependence in the joint distribution.

Table 46.9 shows the impact of using different numbers of factors on the risk amount. According to Table 46.9, the risk decreases as the explained level decreases; this is a tradeoff
Table 46.8 Individual credit VaR of the top five industries

Industry                 Multi-normal method   PCA method
(No. 1) Electronics      40,341                252,980
(No. 2) Plastic          42,259                42,049
(No. 3) Transportation   22,752                22,391
(No. 4) Construction     7,011                 7,007
(No. 5) Textile          2,884                 2,765
between computation time and the risk amount one is prepared to accept. Most research reports indicate that an explained level of 80% is large enough to be acceptable.
46.4 Conclusion

Credit risk and default correlation issues have been probed in recent research, and many solutions have been proposed. We take another view in examining credit risks and the associated derivative tasks. From our perspective, although the loan credits in a target portfolio may look quite different in risk characteristics from a portfolio of debt instruments, their properties and behavior are in the main the same. In this paper, we propose a new approach that combines principal component analysis and copula functions to estimate the credit risks of bank loan businesses. The advantage of this approach is that we do not need to specify particular copula functions to describe the dependence structure among credits; instead, we use a factor structure that covers market and idiosyncratic factors, and the computed risks exhibit the heavy-tail phenomenon. Another benefit is a reduction in the difficulties associated with the parameters that copula functions require. This approach provides another route and performs better than the conventional method, which assumes the dependence structures are normally distributed.

In order to describe the risk features and other messages that bank policymakers may want to know, we wrote a tool for risk estimation and results display. It contains a basic data information preview, which downloads information from databases and performs some statistical analyses. It also provides different parameter settings and uses Monte Carlo simulation to calculate credit VaR, and it gives an overview of individual credit VaR contributions. The experimental results are consistent with previous studies: risk will be underestimated relative to real risks if the dependence structure is assumed to be normally distributed.
Table 46.9 Estimated portfolio value at the end of 2004 with different explained levels (Credit VaR at the 95% confidence level, factors F ~ N(0,1))

Method         100%         90–95%       80–85%       70–80%       60–70%
Multi-normal   208,329.40   208,684.38   209,079.72   208,686.22   207,710.63
PCA            699,892.33   237,612.60   200,057.74   187,717.73   183,894.43
In addition, the aforementioned approach and tool still leave room for improvement regarding recovery-rate estimation, the choice of factor distributions, and a more user-friendly interface.
References

Andersen, L. and J. Sidenius. 2004. "Extensions to the Gaussian copula: random recovery and random factor loadings." Journal of Credit Risk 1(1), 29–70.
Bouyé, E., V. Durrleman, A. Nikeghbali, G. Riboulet, and T. Roncalli. 2000. "Copulas for finance – a reading guide and some applications." Groupe de Recherche, working paper.
Embrechts, P., A. McNeil, and D. Straumann. 1999. "Correlation and dependence in risk management: properties and pitfalls." Mimeo, ETH Zentrum, Zürich.
Gollinger, T. L. and J. B. Morgan. 1993. "Calculation of an efficient frontier for a commercial loan portfolio." Journal of Portfolio Management, 39–46.
Gupton, G. M., C. C. Finger, and M. Bhatia. 1997. CreditMetrics™ – Technical Document, Morgan Guaranty Trust Company, New York.
Hull, J. and A. White. 2004. "Valuation of a CDO and an n-th to default CDS without Monte Carlo simulation." Journal of Derivatives 12(2), 8–48.
Jorion, P. 2000. Value at Risk, McGraw-Hill, New York.
Li, D. X. 2000. "On default correlation: a copula function approach." Journal of Fixed Income 9, 43–54.
Sklar, A. 1959. "Fonctions de répartition à n dimensions et leurs marges." Publications de l'Institut de Statistique de l'Université de Paris 8, 229–231.
Stevenson, B. G. and M. W. Fadil. 1995. "Modern portfolio theory: can it work for commercial loans?" Commercial Lending Review 10(2), 4–12.
Yang, Tze-Chen. 2005. The Pricing of Credit Risk of a Portfolio Based on Listed Corporations in the Taiwan Market, Master's thesis, National Tsing Hua University, Taiwan.
Chapter 47
Unspanned Stochastic Volatilities and Interest Rate Derivatives Pricing Feng Zhao
Abstract This paper first reviews the recent literature on the unspanned stochastic volatilities (USV) documented in the interest rate derivatives markets. USV refers to volatility factors implied in interest rate derivative prices that have little correlation with the yield curve factors. We then present the result in Li and Zhao (J Finance 61:341–378, 2006) that a sophisticated DTSM without the USV feature can have serious difficulties in hedging caps and cap straddles, even though it captures bond yields well. Furthermore, at-the-money straddle hedging errors are highly correlated with cap-implied volatilities and can explain a large fraction of the hedging errors of all caps and straddles across moneyness and maturities. These findings strongly suggest that unmodified dynamic term structure models, which assume the same set of state variables for both bonds and derivatives, are seriously challenged in capturing term structure volatilities. We also present a multifactor term structure model with stochastic volatility and jumps that yields a closed-form formula for cap prices, from Jarrow et al. (J Finance 62:345–382, 2007); the three-factor stochastic volatility model with Poisson jumps can price interest rate caps well across moneyness and maturity. Last, we present the nonparametric estimation results from Li and Zhao (Rev Financ Stud 22(11):4335–4376, 2009). Specifically, the forward densities depend significantly on the slope and volatility of LIBOR rates, and mortgage market activities have strong impacts on the shape of the forward densities. These results provide nonparametric evidence of unspanned stochastic volatility and suggest that the unspanned factors could be partly driven by activities in the mortgage markets. These findings reinforce the claim that term structure models need to accommodate unspanned stochastic volatilities in pricing and hedging interest rate derivatives.

Keywords Term structure modeling · Interest rate volatility · Unspanned stochastic volatilities · Dynamic term structure models · Heath–Jarrow–Morton · Nonparametric estimation · Forward measure · Mortgage market

F. Zhao, Rutgers University, Newark, NJ, USA. e-mail: [email protected]
47.1 Introduction

Interest rate caps and swaptions are among the most widely traded interest rate derivatives in the world. According to the Bank for International Settlements, their combined notional value has been more than 10 trillion dollars in recent years, many times bigger than that of exchange-traded options. Because of the size of these markets, accurate and efficient pricing and hedging of caps and swaptions have enormous practical importance. Pricing interest rate derivatives is more demanding than pricing bonds, in that derivatives are more sensitive to the higher-order moments of the distributions of the underlying, and therefore the models need to capture the interest rate volatilities as well as the interest rates themselves. Under the unified framework of the dynamic term structure models (hereafter DTSMs), a benchmark in the term structure literature, the same set of risk factors is used in pricing bonds and derivatives. Consequently, the set of risk factors can be identified from observations of bond yields or swap rates alone; the inclusion of derivative prices can improve the efficiency of the estimation but is not essential. Practitioners, on the other hand, generally apply the Heath–Jarrow–Morton (HJM) type of models in pricing interest rate derivatives, in which the entire yield curve is taken as given, and sometimes factors independent of the yield curve, such as stochastic volatilities and jumps, are added in a piecemeal approach. This divergence foreshadows one of the key issues of the fast-growing literature on LIBOR-based interest rate derivatives, the so-called unspanned stochastic volatility (USV) puzzle.¹ Interest rate caps and swaptions are derivatives written on LIBOR and swap rates, and the traditional view is that their prices are determined by the same set of risk factors that determine LIBOR and swap rates. However, several recent studies have shown that there seem to be risk factors that affect the prices of caps and swaptions but are not spanned by the underlying LIBOR and swap rates.

¹ Another issue is the relative pricing between caps and swaptions. Although both caps and swaptions are derivatives on LIBOR rates, existing models calibrated to one set of prices tend to significantly misprice the other set. For a more detailed review of the literature, see Dai and Singleton (2003).
Heidari and Wu (2003) show that while the three common term structure factors (the level, slope, and curvature of the yield curve) can explain 99.5% of the variation in bond yields, they explain less than 60% of swaption implied volatilities; after including three additional volatility factors, the explanatory power increases to over 97%. Similarly, Collin-Dufresne and Goldstein (2002) show that there is a very weak correlation between changes in swap rates and returns on at-the-money (ATM) cap straddles: the R²s of regressions of straddle returns on changes in swap rates are typically less than 20%. Furthermore, one principal component explains 80% of the regression residuals of straddles with different maturities. As straddles are approximately delta-neutral and mainly exposed to volatility risk, they refer to the factor that drives straddle returns but is not affected by the term structure factors as "unspanned stochastic volatility" (hereafter USV). Jagannathan et al. (2003) find that an affine three-factor model can fit the LIBOR swap curve rather well; however, they identify significant shortcomings when confronting the model with data on caps and swaptions, and thus conclude that derivatives must be used when evaluating term structure models. On the other hand, Fan et al. (2003) provide evidence against the existence of USV, showing that swaptions can be hedged using bonds alone within an HJM model; the difference from the previous studies results from the nonlinear dependence of derivative prices on the yield curve factors. Li and Zhao (2006) show that the yield curve factors extracted using a quadratic term structure model can hedge bonds perfectly, but not interest rate caps, and that including the unhedged components systematically improves hedging performance across moneyness. They argue that the difference is likely due to the fact that interest rate caps are more sensitive to the volatility factors than swaptions, and that DTSMs are better suited to address whether derivatives are redundant, since HJM-type models need both data sets for estimation. Overall, most studies suggest that interest rate derivatives are not redundant securities and cannot be hedged using bonds alone; in other words, bonds do not span interest rate derivatives. In the following table we regress weekly cap straddle returns at different moneyness and maturities on weekly changes in the three yield factors and obtain very similar results (a sketch of this regression follows Table 47.1). In general, the R²s in Table 47.1 are very small for straddles that are close to the money; for deep ITM and OTM straddles, the R²s increase significantly. This is consistent with the fact that ATM straddles are more sensitive to volatility risk than straddles away from the money. The existence of USV has profound implications for term structure modeling, especially for the existing multifactor dynamic term structure models, which have been followed by a huge literature in the last decade. One of the main reasons for the popularity of these models is their tractability: they provide closed-form solutions for the prices not only of zero-coupon bonds but also of a wide range of interest rate derivatives (see, e.g., Duffie et al. 2001; Chacko and Das 2002; Leippold and Wu 2002). The closed-form formulae significantly reduce the computational burden of implementing these models and simplify their applications in practice.
Table 47.1 Regression analysis of USV in the caps market

Moneyness        Maturity (years)
(K/F)    1.5    2      2.5    3      3.5    4      4.5    5      6      7      8      9      10
0.60     –      –      –      –      0.720  0.696  0.285  –      0.672  0.332  0.477  0.238  0.258
0.65     –      –      0.816  0.754  0.725  0.719  0.293  0.245  0.636  0.284  0.210  0.134  0.149
0.70     –      –      0.769  0.690  0.615  0.521  0.258  0.142  0.404  0.173  0.184  0.102  0.080
0.75     –      0.704  0.677  0.654  0.596  0.491  0.257  0.065  0.310  0.198  0.149  0.087  0.080
0.80     –      0.557  0.624  0.577  0.418  0.481  0.174  0.090  0.225  0.128  0.112  0.064  0.073
0.85     0.709  0.400  0.538  0.519  0.329  0.319  0.133  0.093  0.164  0.088  0.068  0.041  0.043
0.90     0.507  0.242  0.364  0.322  0.208  0.223  0.078  0.060  0.094  0.056  0.035  0.027  0.066
0.95     0.268  0.076  0.256  0.223  0.124  0.108  0.041  0.031  0.042  0.026  0.031  0.024  0.053
1.00     0.300  0.061  0.209  0.160  0.083  0.031  0.044  0.021  0.025  0.014  0.046  0.024  0.035
1.05     0.473  0.104  0.249  0.169  0.137  0.063  0.076  0.024  0.044  0.025  0.086  0.036  0.043
1.10     0.567  0.218  0.366  0.256  0.263  0.171  0.147  0.065  0.101  0.057  0.140  0.044  0.024
1.15     0.667  0.374  0.497  0.390  0.423  0.306  0.238  0.100  0.192  0.119  0.207  –      –
1.20     0.751  0.543  0.608  0.510  0.552  0.476  0.302  0.167  0.251  0.209  0.228  –      –
1.25     0.801  0.658  0.691  0.613  0.670  0.593  0.463  0.240  0.307  0.219  –      –      –
1.30     0.842  0.739  0.775  0.708  0.737  0.660  0.502  0.249  0.390  –      –      –      –
1.35     0.872  0.784  0.827  0.785  0.796  0.756  0.553  –      –      –      –      –      –
1.40     0.888  0.821  0.879  0.832  0.845  0.851  –      –      –      –      –      –      –

This table reports the R²s of regressions of weekly returns of cap straddles across moneyness and maturity on weekly changes of the three yield factors. Due to changes in interest rates and strike prices, the number of observations differs across moneyness/maturity groups. In the original table, bold entries represent moneyness/maturity groups with less than 10% of missing values; the rest have 10–50% of missing values.
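As a sketch of the regressions behind Table 47.1, the function below computes the R² of weekly straddle returns regressed on weekly changes of the first three principal components of the yield curve. The inputs are assumed to be available; the synthetic call at the end exists only to make the sketch runnable.

```python
import numpy as np

def straddle_r2(straddle_returns: np.ndarray, yields: np.ndarray) -> float:
    """R^2 of weekly straddle returns on weekly changes of the first three
    yield-curve principal components (level, slope, curvature).

    straddle_returns : (T-1,) returns for one moneyness/maturity group
    yields           : (T, M) panel of yields across M maturities
    """
    dy = np.diff(yields, axis=0)
    dy = dy - dy.mean(axis=0)
    _, _, vt = np.linalg.svd(dy, full_matrices=False)
    factors = dy @ vt[:3].T                     # changes of the three factors
    X = np.column_stack([np.ones(len(factors)), factors])
    beta, *_ = np.linalg.lstsq(X, straddle_returns, rcond=None)
    resid = straddle_returns - X @ beta
    return 1.0 - resid.var() / straddle_returns.var()

# Synthetic illustration (independent random data, so R^2 is near zero)
rng = np.random.default_rng(0)
yields = 0.05 + np.cumsum(rng.normal(0, 1e-4, (200, 8)), axis=0)
print(straddle_r2(rng.normal(0, 0.05, 199), yields))
```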
However, almost all existing DTSMs assume that derivatives are redundant and can be perfectly hedged using bonds alone. Hence, the presence of USV in the derivatives market implies that one fundamental assumption underlying all DTSMs does not hold, and these models need to be substantially extended to incorporate the unspanned factors before they can be applied to derivatives. However, as Collin-Dufresne and Goldstein (2002) show, it is rather difficult to introduce USV into traditional DTSMs: one must impose highly restrictive assumptions on model parameters to guarantee that certain factors that affect derivative prices do not affect bond prices. In other words, ATSMs with USV are restricted versions of the existing ATSMs. Some recent papers, for example Bikbov and Chernov (2004), have tested the USV restrictions by comparing USV models to the nesting unrestricted ATSMs, and have rejected the USV restrictions when both models are fitted to both bond and derivatives data. This approach, however, can give misleading conclusions. USV in term structure models resembles the inclusion of stochastic volatility (SV) in the stock price process, where the natural comparison is between the Black–Scholes model and the SV model. Similarly, the USV model should nest the traditional DTSM without USV for statistical testing. Specifically, if the unrestricted three-factor affine model is a good fit for the term structure of interest rates, one should test whether adding one more factor, i.e., a four-factor affine model with USV, helps capture the derivatives data. Some recent studies have also provided evidence in support of the existence of USV using bond data alone. They show that the yield curve volatilities backed out from a cross section of bond yields do not agree with the time-series filtered volatilities obtained via GARCH or high-frequency estimates from yield data. This challenges the traditional DTSMs even more, since these models cannot be expected to capture option-implied volatilities if they cannot even match the realized yield curve volatilities. Specifically, Collin-Dufresne et al. (2004, CDGJ) show that the LIBOR volatility implied by an affine multifactor specification of the swap rate curve can be negatively correlated with the time series of volatility obtained from a standard GARCH approach. In response, they argue that an affine four-factor USV model delivers both realistic volatility estimates and a good cross-sectional fit. Andersen and Benzoni (2006), through the use of high-frequency data on bond yields, construct a model-free "realized yield volatility" measure by computing the empirical quadratic yield variation for a cross section of fixed maturities. They find that the yield curve fails to span yield volatility, as the systematic volatility factors are largely unrelated to the cross section of yields. They claim that a broad class of affine diffusive, Gaussian-quadratic, and affine
jump-diffusive models is incapable of accommodating the observed yield volatility dynamics. An important implication is that the bond markets per se are incomplete and that yield volatility risk cannot be hedged by taking positions solely in the Treasury bond market. They also advocate using empirical realized yield volatility measures more broadly as a basis for specification testing and (parametric) model selection within the term structure literature. Thompson (2004), using LIBOR swap data, argues that when the affine models are estimated with the time-series filtered yield volatility they can pass his newly proposed specification test, but not when estimated with the cross-sectionally backed-out volatility. From these studies on yield data alone, there may exist an alternative explanation for the failure of DTSMs in effectively pricing derivatives: the small convexity of bonds makes them insufficiently sensitive to identify the volatilities in the presence of measurement errors. Therefore, efficient inference requires derivatives data as well; it can be argued in the same fashion that identification of the unspanned factors is most efficiently accomplished by adding derivatives data to the analysis. Duarte (2008) shows that mortgage-backed security (MBS) hedging activity affects interest rate volatility and proposes a model that takes these effects as a measure of the stochastic volatility of the underlying term structure. However, it is unclear whether the realized volatility is indeed different from the implied volatility due to the MBS effects. Li and Zhao (2009) provide one of the first nonparametric estimates of the probability densities of LIBOR rates under forward martingale measures, using caps with a wide range of strike prices and maturities.² Their nonparametric estimates of the LIBOR forward densities are conditional on the slope and volatility factors of LIBOR rates, while the level factor is automatically incorporated in existing methods.³ They find that the forward densities depend significantly on the slope and volatility of LIBOR rates. For example, the forward densities become more dispersed (compact) when the slope of the term structure (the volatility of LIBOR rates) increases. Further analysis reveals a nonlinear relation between the forward densities and the volatility of LIBOR rates that depends on the slope of the term structure: with a flat (steep) term structure, higher volatility tends to lead to more dispersed (compact) forward densities.
² The nonparametric forward densities estimated using caps, which are among the simplest and most liquid OTC interest rate derivatives, allow consistent pricing of more exotic and/or less liquid OTC interest rate derivatives based on the forward measure approach. The nonparametric forward densities can also reveal potential misspecifications of most existing term structure models, which rely on strong parametric assumptions to obtain closed-form formulae for interest rate derivative prices.
³ Andersen and Benzoni (2006) show that the "curvature" factor is not significantly correlated with yield volatility, and the same is true in this paper; the volatility effect here is therefore not due to the "curvature" factor.
This result suggests that the speed of mean reversion of the volatility process depends on the slope of the term structure, a feature that has not been explicitly accounted for by existing term structure models. Additionally, the paper documents important impacts of mortgage market activities on the LIBOR forward densities even after controlling for both the slope and volatility factors. For example, the forward densities at intermediate maturities (3, 4, and 5 years) are more negatively skewed when refinance activities, measured by the Mortgage Bankers Association of America (MBAA) refinance index, are high. Demand for out-of-the-money (OTM) floors by investors in mortgage-backed securities (MBS), to hedge potential losses from prepayments, could lead to more negatively skewed forward densities. The impacts of refinance activities are most significant at intermediate maturities because the durations of most MBS are around 5 years. The forward density at the 2-year maturity is more right-skewed when ARM origination (measured by the MBAA ARM index) is high. Since every ARM contains an interest rate cap that limits the mortgage rate at a certain level, demand for OTM caps from ARM lenders, to hedge their exposure to rising interest rates, could lead to more right-skewed forward densities. The impact of ARMs is most significant at the 2-year maturity because most ARMs are reset within the first 2 years. These empirical results have important implications for the unspanned stochastic volatility puzzle by providing nonparametric and model-independent evidence of USV. The impacts of mortgage activities on the forward densities further suggest that the unspanned factors could be partially driven by activities in the mortgage markets. The next question, naturally, is how best to incorporate USV into a term structure model so that it can effectively price a wide spectrum of interest rate derivatives. In contrast to the approach of adding USV restrictions to DTSMs, it is relatively easy to introduce USV in the Heath et al. (1992) (hereafter HJM) class of models, which includes the LIBOR models of Brace et al. (1997) and Miltersen et al. (1997), the random field models of Goldstein (2000), and the string models of Santa-Clara and Sornette (2001). Indeed, any HJM model in which the forward rate curve has stochastic volatility, with volatility and yield shocks not perfectly correlated, exhibits USV. Therefore, in addition to the commonly known advantages of HJM models (such as perfectly fitting the initial yield curve), they offer the additional advantage of easily accommodating USV. Of course, the trade-off is that in an HJM model the yield curve is an input rather than a prediction of the model. Recently, several HJM models with USV have been developed and applied to price caps and swaptions. Collin-Dufresne and Goldstein (2003) develop a random field model with stochastic volatility and correlation in forward rates. Applying the transform analysis of Duffie et al. (2000),
they obtain closed-form formulae for a wide variety of interest rate derivatives; however, they do not calibrate their models to market prices of caps and swaptions. Han (2007) extends the model of LSS (2001) by introducing stochastic volatility and correlation in forward rates, and shows that stochastic volatility and correlation are important for reconciling the mispricing between caps and swaptions. Trolle and Schwartz (in press) develop a multifactor term structure model with unspanned stochastic volatility factors and correlation between innovations to forward rates and their volatilities. Jarrow et al. (2007) develop a multifactor HJM model with stochastic volatility and jumps in LIBOR forward rates; the LIBOR rates follow the affine jump diffusions (hereafter AJDs) of Duffie et al. (2000), and a closed-form solution for cap prices is provided. Given that a small number of factors can explain most of the variation in bond yields, they consider low-dimensional model specifications based on the first few (up to three) principal components of historical forward rates. Their model explicitly incorporates jumps in LIBOR rates, making it possible to differentiate between the importance of stochastic volatility and that of jumps for pricing interest rate derivatives. In section two of this review, we discuss how the original DTSMs have difficulty in pricing and hedging interest rate derivatives, as shown in Li and Zhao (2006). In section three, we present the HJM model of Jarrow, Li, and Zhao (2007). Finally, we provide nonparametric evidence from Li and Zhao (2009) showing that neither the realized nor the implied yield volatilities can be spanned by the yield curve factors.
47.2 Term Structure Models with Spanned Stochastic Volatility

We begin with a two-factor spot rate model with stochastic volatility, as in Longstaff and Schwartz (1992). Under the risk-neutral measure $Q$, the short rate $r$ and its volatility $V$ follow a two-dimensional square-root process,

$$dr_t = \kappa_r(\bar{r} - r_t)\,dt + \sqrt{V_t}\,dW^Q_{1t}, \qquad (47.1)$$
$$dV_t = \kappa_V(\bar{V} - V_t)\,dt + \sigma\sqrt{V_t}\,dW^Q_{2t}. \qquad (47.2)$$
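A minimal Euler discretization of the dynamics (47.1) and (47.2) is sketched below. The parameter values are illustrative only, and the max(·, 0) truncation keeps the square roots well defined.

```python
import numpy as np

kappa_r, r_bar = 0.5, 0.05                   # illustrative short-rate dynamics
kappa_v, v_bar, sigma = 1.0, 0.0004, 0.02    # illustrative volatility dynamics
dt, n_steps, n_paths = 1 / 252, 5 * 252, 10_000
rng = np.random.default_rng(0)

r = np.full(n_paths, 0.04)
v = np.full(n_paths, v_bar)
for _ in range(n_steps):
    dw1, dw2 = rng.standard_normal((2, n_paths)) * np.sqrt(dt)
    vol = np.sqrt(np.maximum(v, 0.0))        # full-truncation scheme
    r += kappa_r * (r_bar - r) * dt + vol * dw1
    v += kappa_v * (v_bar - v) * dt + sigma * vol * dw2
print(r.mean(), r.std())   # terminal short-rate mean and dispersion at 5 years
```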
The price of the zero coupon bond P .t; T / with maturity T can be solved through the fundamental PDE for bond pricing, 1 V Prr C 2 PV V C Pr r .r r/ 2 CPV V .V V / C Pt D rP
(47.3)
47 Unspanned Stochastic Volatilities and Interest Rate Derivatives Pricing
for $0 \le t \le T$, with the terminal condition $P(T,T) = 1$. The price $P(t,T)$ and yield $y(t,T)$ are functions of the state variables $\{r_t, V_t\}$:

$$P(t,T) = e^{A(T-t) + B(T-t)\,r_t + C(T-t)\,V_t}, \qquad (47.4)$$
$$y(t,T) = -\frac{\log P(t,T)}{T-t} = -\frac{A(T-t)}{T-t} - \frac{B(T-t)}{T-t}\,r_t - \frac{C(T-t)}{T-t}\,V_t, \qquad (47.5)$$
where $A$, $B$, and $C$ are coefficients depending on maturity. It is clear that the state variables are linear combinations of yield curve factors such as level and slope; in this sense, the stochastic volatility is spanned by the bond yields. Both bonds and bond derivatives can be priced through the fundamental PDE, and the same set of state variables enters their prices. We should note that the volatility process in this model serves two roles. First, it helps to price the cross section of bonds, making the model more flexible than one-factor models in generating various shapes of the yield curve. Second, it is the volatility process of the short rate, so it can be inferred from the time series of the short rate alone. One essential question, therefore, is whether the volatility process inferred cross-sectionally fits the time-series properties stipulated by the model. A failure to fit can be due to model misspecification, or to the fact that the volatility process cannot be identified from the yield curve factors. For illustration, we discuss the example given in Casassus et al. (2005),

$$dr_t = \kappa_r(\theta_t - r_t)\,dt + \sqrt{V_t}\,dW^Q_{1t}, \qquad (47.6)$$
$$d\theta_t = \left(\bar\theta(t) + \frac{V_t}{2\kappa_r}\right)dt, \qquad (47.7)$$
$$dV_t = \mu_V(V_t, t)\,dt + \sigma_V(V_t, t)\,dW^Q_{2t}, \qquad (47.8)$$

where the long-run mean of the short rate, $\theta_t$, has a pure drift process and the short-rate volatility $V_t$ follows a stochastic process with general drift and diffusion functions that depend only on the volatility itself. It can be shown that the zero-coupon bond price $P(t,T)$ depends only on the short rate and its long-run mean, not on the volatility, i.e.,

$$P(t,T) = e^{A(t,T) + B(T-t)\,r_t + C(T-t)\,\theta_t}. \qquad (47.9)$$

It can also be shown, however, that the price of a European call option on the zero-coupon bond does depend on the volatility $V_t$: in this example, the call option cannot be hedged using bonds alone. Therefore, it is an important exercise to test whether a sophisticated DTSM without the USV factor can be used to hedge interest rate derivatives. Dai and Singleton (2002) review many of the current dynamic term structure models; these include the affine term structure models (ATSMs) of Duffie and Kan (1996), the QTSMs of ADG (2002), and many others.⁴ In a typical dynamic term structure model, the economy is represented by the filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \le t \le T}, P)$, where $\{\mathcal{F}_t\}_{0 \le t \le T}$ is the augmented filtration generated by an $N$-dimensional standard Brownian motion $W$ on this probability space. It is usually assumed that $\{\mathcal{F}_t\}_{0 \le t \le T}$ satisfies the usual hypothesis (see Protter 1990).

The ATSMs rely on the following assumptions. The instantaneous interest rate $r_t$ is an affine function of the $N$-dimensional state variables $X_t$,

$$r(X_t) = \beta' X_t + \alpha. \qquad (47.10)$$

The state variables follow a multivariate affine process,

$$dX_t = \mathcal{K}\,[\Theta - X_t]\,dt + \Sigma S_t\,dW_t, \qquad (47.11)$$

where $S_t$ is a diagonal matrix whose elements are the square root of an affine function of $X_t$; hence the conditional means and variances of the state variables are affine functions of the state variables. The market price of risk is a function of the state variables,

$$\lambda(X_t) = \lambda_0 S_t^{-} X_t + \lambda_1 S_t, \qquad (47.12)$$

where $S_t^{-}$ is a diagonal matrix whose elements are the inverse of those in $S_t$ wherever positive, and zero otherwise. The zero-coupon bond with time to maturity $\tau$ can be priced by risk-neutral pricing,

$$D(t,\tau) = E^Q_t\!\left[e^{-\int_t^{t+\tau} r(X_s)\,ds}\right] = e^{A(\tau) - B(\tau)' X_t}. \qquad (47.13)$$

The functions $A(\tau)$ and $B(\tau)$ satisfy a system of ordinary differential equations. The continuously compounded yield $y(t,\tau)$ follows

$$y(t,\tau) = \frac{1}{\tau}\left[-A(\tau) + B(\tau)' X_t\right]. \qquad (47.14)$$

⁴ The affine models include the completely affine models of Dai and Singleton (2000), the essentially affine models of Duffee (2002), and the semi-affine models of Duarte (2004). Other DTSMs include the hybrid models of Ahn et al. (2003), the regime-switching models of Bansal and Zhou (2002) and Dai et al. (2003), and models with macroeconomic jump effects, such as Piazzesi (2005) and many others.
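The statement that $A(\tau)$ and $B(\tau)$ solve a system of ordinary differential equations can be illustrated in the one-factor CIR special case, where the Riccati system can be integrated numerically and checked against the known closed form. The parameter values below are arbitrary.

```python
import numpy as np
from scipy.integrate import solve_ivp

# CIR special case: dr = kappa*(theta - r) dt + sigma*sqrt(r) dW under Q.
kappa, theta, sigma, r0, tau = 0.3, 0.05, 0.1, 0.04, 5.0

def riccati(t, y):
    B, logA = y
    dB = 1.0 - kappa * B - 0.5 * sigma**2 * B**2   # dB/dtau
    dlogA = -kappa * theta * B                     # d(log A)/dtau
    return [dB, dlogA]

sol = solve_ivp(riccati, (0.0, tau), [0.0, 0.0], rtol=1e-10, atol=1e-12)
B, logA = sol.y[0, -1], sol.y[1, -1]
price_ode = np.exp(logA - B * r0)

# Closed-form CIR bond price for comparison
h = np.sqrt(kappa**2 + 2 * sigma**2)
denom = (kappa + h) * (np.exp(h * tau) - 1) + 2 * h
B_cf = 2 * (np.exp(h * tau) - 1) / denom
A_cf = (2 * h * np.exp((kappa + h) * tau / 2) / denom) ** (2 * kappa * theta / sigma**2)
print(price_ode, A_cf * np.exp(-B_cf * r0))   # the two should agree closely
```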
The interest rate derivatives can be priced similarly via risk-neutral pricing. Without any restrictions on the model parameters, the loadings on the state variables, $B(\tau)$, are not zero in general. Hence the state variables can always be backed out given a sufficient number of yields, leaving derivative prices redundant for identifying the state variables.

The QTSMs rely on the following assumptions. The instantaneous interest rate $r_t$ is a quadratic function of the $N$-dimensional state variables $X_t$,

$$r(X_t) = X_t' \Psi X_t + \beta' X_t + \alpha. \qquad (47.15)$$

The state variables follow a multivariate Gaussian process,

$$dX_t = [\mu + \xi X_t]\,dt + \Sigma\,dW_t. \qquad (47.16)$$

The market price of risk is an affine function of the state variables,

$$\lambda(X_t) = \lambda_0 + \lambda_1 X_t. \qquad (47.17)$$

Note that in the above equations $\Psi$, $\xi$, $\Sigma$, and $\lambda_1$ are $N \times N$ matrices, $\beta$, $\mu$, and $\lambda_0$ are vectors of length $N$, and $\alpha$ is a scalar. The quadratic relation between $r_t$ and $X_t$ has the desired property that $r_t$ is guaranteed to be positive if $\Psi$ is positive semidefinite and $\alpha - \frac{1}{4}\beta'\Psi^{-1}\beta \ge 0$. Although $X_t$ follows a Gaussian process in Equation (47.16), the interest rate $r_t$ exhibits conditional heteroskedasticity because of the quadratic relationship between $r_t$ and $X_t$. As a result, the QTSMs are more flexible than the ATSMs in modeling volatility clustering in bond yields and correlations among the state variables. Consequently, the yield-to-maturity $y(t,\tau)$ is a quadratic function of the state variables,

$$y(t,\tau) = \frac{1}{\tau}\left[X_t' A(\tau) X_t + b(\tau)' X_t + c(\tau)\right]. \qquad (47.18)$$

In contrast, in the ATSMs the yields are linear in the state variables, and therefore the correlations among the yields are solely determined by the correlations of the state variables. Although the state variables in the QTSMs follow a multivariate Gaussian process, the quadratic form of the yields helps to model the time-varying volatility and correlation of bond yields. Leippold and Wu (2002) show that a large class of fixed-income securities can be priced in closed form in the QTSMs using the transform analysis of Duffie et al. (2001); the details of the derivation are in the Appendix. The first test for these models is whether they capture both the cross-sectional and time-series properties of bond yields, which has been reviewed in Dai and Singleton (2003). Even though the most sophisticated models can fit the cross section of bond prices very well and can capture the time-series properties of the first moment of the yield curve factors, they do not perform satisfactorily in capturing the second moment. The second test is whether these models can be used to price and hedge a cross section of interest rate derivatives. The second task is made somewhat easier by one major advantage of these DTSMs: they provide closed-form solutions for a wide range of interest rate derivatives. The empirical results shown below are from Li and Zhao (2006), who study the performance of QTSMs in pricing and hedging interest rate caps. Even though the study is based on QTSMs, the empirical findings are common to ATSMs as well.⁵

⁵ In the empirical analysis of Li and Zhao (2006), the QTSMs are chosen for several reasons. First, since the nominal spot interest rate is a quadratic function of the state variables, it is guaranteed to be positive in the QTSMs; in the ATSMs, by contrast, the spot rate, an affine function of the state variables, is guaranteed to be positive only when all the state variables follow square-root processes. Second, the QTSMs do not share the ATSMs' limitations in simultaneously fitting interest rate volatility and correlations among the state variables: in the ATSMs, increasing the number of factors that follow square-root processes improves the modeling of volatility clustering in bond yields but reduces the flexibility in modeling correlations among the state variables. Third, the QTSMs have the potential to capture observed nonlinearity in term structure data (see, e.g., Ahn and Gao 1999). Indeed, ADG (2002) and Brandt and Chapman (2002) show that the QTSMs can capture the conditional volatility of bond yields better than the ATSMs.

To price and hedge caps in the QTSMs, both model parameters and latent state variables need to be estimated. Due to the quadratic relationship between bond yields and the state variables, the state variables are not identified by the observed yields, even in the univariate case. Previous studies, such as ADG (2002), have used the efficient method of moments (EMM) of Gallant and Tauchen (1998) to estimate the QTSMs. Li and Zhao (2006) instead use the extended Kalman filter (EKF) to estimate the model parameters and extract the latent state variables in one step; the details of the EKF implementation are in the Appendix.

The pricing analysis can reveal two sources of potential model misspecification. One concerns the number of factors in the model, as a missing factor usually causes large pricing errors; an analogy is using the Black–Scholes model when the stock price is generated from a stochastic volatility model. The other concerns the assumed innovation process of each factor: if the innovations have a fat-tailed distribution, the convenient Gaussian assumption will deliver large pricing errors as well. From a pricing study alone, then, we cannot tell which of the two (or whether both) causes large pricing errors. Hedging analysis, on the other hand, focuses on changes in prices, so even if the marginal distribution of prices is highly non-Gaussian, the conditional distribution over a small time step can still be
reasonably approximated by a Gaussian distribution. As a result, a deficiency in hedging, especially at high frequency, reveals more about potentially missing factors than about the distributional assumptions of a model. In Li and Zhao (2006), the QTSMs capture yield curve dynamics extremely well. First, given the estimated model parameters and state variables, they compute the one-day-ahead projections of yields based on the estimated model; Figure 47.1 shows that the QTSM1 model-projected yields are almost indistinguishable from the corresponding observed yields. Second, they examine the performance of the QTSMs in hedging zero-coupon bonds, assuming that the filtered state variables are traded and using them as hedging instruments. The delta-neutral hedge is conducted for zero-coupon bonds of six maturities on a daily basis. Hedging performance is measured by the variance ratio, which is defined as
the percentage of the variation of an unhedged position that can be reduced by hedging. The results on hedging performance in Table 47.2 show that in most cases the variance ratios are higher than 95%. This should not be surprising given the excellent fit of bond yields by the QTSMs. If the LIBOR and swap market and the cap market are well integrated, the estimated three-factor QTSMs should be able to hedge caps well. Based on the estimated model parameters, the delta-neutral hedge of weekly changes of difference cap prices is conducted using the filtered state variables as hedging instruments. It is also possible to use LIBOR zero-coupon bonds as hedging instruments by matching the hedge ratios of a difference cap with those of zero-coupon bonds. Daily rebalancing (adjusting the hedge ratios every day as market conditions change) is implemented to improve hedging performance. Therefore, the daily change of a hedged position is the difference between the daily change of the unhedged position and that of the hedging portfolio.
Fig. 47.1 The observed yields (dots) and the QTSM1-projected yields; panels show maturities of 6 months and 1, 2, 5, 7, and 10 years over October 2000 to October 2002
Table 47.2 The performance of the QTSMs in modeling bond yields

Maturity (years)   0.5     1       2       5       7       10
QTSM3              0.717   0.948   0.982   0.980   0.993   0.930
QTSM2              0.990   0.956   0.963   0.975   0.997   0.934
QTSM1              0.994   0.962   0.969   0.976   0.997   0.932

This table reports the performance of the three-factor QTSMs in capturing bond yields. Entries are variance ratios of model-based hedging of zero-coupon bonds, using the filtered state variables as hedging instruments. The variance ratio measures the percentage of the variation of an unhedged position that can be reduced through hedging.
The change in the hedging portfolio equals the sum of the products of a difference cap's hedge ratios with respect to the state variables and the changes in the corresponding state variables; weekly changes are simply accumulated over the daily positions. Hedging effectiveness is again measured by the variance ratio, the percentage of the variation of an unhedged position that can be reduced by hedging; this measure is similar in spirit to the R² of a linear regression (a small computational sketch follows). The variance ratios of the three QTSMs in Table 47.3 show that all models have better hedging performance for ITM, short-term (maturities from 1.5 to 4 years) difference caps⁶ than for OTM, medium- and long-term (maturities longer than 4 years) difference caps.
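The variance ratio used throughout the hedging tables is a one-line computation; below is a minimal sketch with synthetic P&L series.

```python
import numpy as np

def variance_ratio(unhedged: np.ndarray, hedged: np.ndarray) -> float:
    """Share of the unhedged position's variance removed by hedging,
    analogous to an R^2: 1 - Var(hedged P&L) / Var(unhedged P&L)."""
    return 1.0 - np.var(hedged) / np.var(unhedged)

# Synthetic illustration: a hedge that offsets 97% of the exposure
rng = np.random.default_rng(1)
unhedged = rng.normal(0.0, 1.0, 1_000)
hedged = 0.03 * unhedged + rng.normal(0.0, 0.05, 1_000)
print(variance_ratio(unhedged, hedged))   # close to 1 for an effective hedge
```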
Table 47.3 The performance of the QTSMs in hedging interest rate caps

Panel A. Variance ratio of QTSM1

Moneyness        Maturity (years)
(K/F)    1.5    2      2.5    3      3.5    4      4.5    5      6      7      8      9      10
0.60     –      –      –      –      0.917  –      –      –      0.862  0.679  0.665  0.494  0.257
0.65     –      –      0.904  0.919  0.920  –      0.671  0.549  0.861  0.704  0.609  0.431  0.255
0.70     –      –      0.903  0.913  0.916  0.862  0.666  0.487  0.822  0.619  0.565  0.355  0.218
0.75     –      0.865  0.884  0.911  0.902  0.852  0.689  0.447  0.807  0.620  0.544  0.326  0.198
0.80     –      0.831  0.890  0.900  0.876  0.864  0.670  0.504  0.785  0.594  0.537  0.305  0.185
0.85     0.894  0.818  0.880  0.893  0.869  0.833  0.649  0.531  0.773  0.590  0.516  0.296  0.159
0.90     0.890  0.810  0.853  0.872  0.851  0.832  0.631  0.514  0.748  0.577  0.491  0.314  0.171
0.95     0.888  0.779  0.832  0.855  0.847  0.833  0.596  0.481  0.716  0.578  0.481  0.303  0.182
1.00     0.875  0.677  0.803  0.824  0.826  0.815  0.575  0.456  0.695  0.533  0.476  0.287  0.164
1.05     0.856  0.619  0.767  0.799  0.797  0.805  0.536  0.424  0.671  0.512  0.492  0.245  0.138
1.10     0.851  0.575  0.737  0.779  0.763  0.773  0.523  0.411  0.623  0.490  0.415  0.204  –
1.15     0.789  0.529  0.692  0.755  0.724  0.722  0.483  0.422  0.611  0.426  –      –      –
1.20     0.756  0.489  0.645  0.692  0.654  0.673  0.521  0.470  0.533  0.415  –      –      –
1.25     0.733  0.438  0.603  0.645  0.575  0.634  0.587  0.551  0.514  –      –      –      –
1.30     0.724  0.393  0.534  0.591  0.444  0.602  0.540  –      0.334  –      –      –      –
1.35     0.691  0.324  0.449  0.539  0.408  0.515  0.436  –      –      –      –      –      –
1.40     –      0.260  0.373  0.464  0.319  –      –      –      –      –      –      –      –

Panel B. Variance ratio of QTSM1 combined with the changes of the three yield factors

Moneyness        Maturity (years)
(K/F)    1.5    2      2.5    3      3.5    4      4.5    5      6      7      8      9      10
0.60     –      –      –      –      0.921  –      –      –      0.912  0.788  0.815  0.658  0.579
0.65     –      –      0.929  0.928  0.921  –      0.676  0.573  0.904  0.804  0.765  0.611  0.471
0.70     –      –      0.927  0.922  0.919  0.872  0.675  0.507  0.847  0.679  0.666  0.462  0.353
0.75     –      0.914  0.908  0.922  0.903  0.861  0.697  0.450  0.835  0.687  0.646  0.429  0.312
0.80     –      0.886  0.915  0.912  0.882  0.873  0.675  0.510  0.811  0.653  0.639  0.417  0.324
0.85     0.951  0.870  0.905  0.899  0.872  0.839  0.654  0.543  0.802  0.647  0.610  0.397  0.286
0.90     0.942  0.853  0.876  0.882  0.855  0.837  0.633  0.524  0.776  0.632  0.573  0.412  0.310
0.95     0.935  0.825  0.860  0.864  0.853  0.837  0.597  0.488  0.738  0.623  0.554  0.395  0.306
1.00     0.923  0.746  0.841  0.836  0.839  0.820  0.578  0.462  0.709  0.566  0.541  0.361  0.269
1.05     0.906  0.694  0.816  0.816  0.819  0.814  0.539  0.428  0.679  0.537  0.545  0.338  0.210
1.10     0.895  0.659  0.799  0.804  0.794  0.786  0.530  0.421  0.630  0.508  0.480  0.278  –
1.15     0.857  0.624  0.770  0.789  0.773  0.742  0.491  0.439  0.623  0.434  –      –      –
1.20     0.848  0.611  0.741  0.746  0.729  0.701  0.530  0.486  0.572  0.427  –      –      –
1.25     0.824  0.577  0.712  0.716  0.673  0.668  0.612  0.598  0.541  –      –      –      –
1.30     0.796  0.560  0.680  0.687  0.626  0.679  0.559  –      0.378  –      –      –      –
1.35     0.777  0.511  0.634  0.662  0.603  0.636  0.464  –      –      –      –      –      –
1.40     –      0.455  0.582  0.638  0.573  –      –      –      –      –      –      –      –

This table reports the performance of the three QTSMs in hedging difference caps. Hedging effectiveness is measured by the variance ratio, the percentage of the variation of an unhedged position that can be reduced through hedging. In the original table, bold entries represent moneyness/maturity groups with less than 10% of missing values; the rest have 10–50% of missing values.
There is a high percentage of variation in long-term and OTM difference cap prices that cannot be hedged. The maximally flexible model, QTSM1, again has better hedging performance than the other two models. To control for the possibility that the QTSMs are misspecified, in Panel B of Table 47.3 the hedging errors of each moneyness/maturity group are further regressed on the changes of the three yield factors. While the three yield factors can explain some additional hedging errors, their incremental explanatory power is not very significant. Thus, even excluding hedging errors that can be captured by the three yield factors, there remains a large fraction of difference cap price variation that cannot be explained by the QTSMs.

Table 47.4 reports the performance of the QTSMs in hedging cap straddles. Difference floor prices are computed from difference cap prices using put-call parity, and weekly straddle returns are constructed. As straddles are highly sensitive to volatility risk, a hedge that is both delta- and gamma-neutral is needed. The variance ratios of QTSM1 are as low as the R²s of the linear regressions of straddle returns on the yield factors in Table 47.1, suggesting that neither approach can explain much of the variation in straddle returns. Collin-Dufresne and Goldstein (2002) show that 80% of straddle regression residuals can be explained by one additional factor. Principal component analysis of ATM straddle hedging errors in Panel B of Table 47.4 shows that the first factor can explain about 60% of the total variation of hedging errors; the second and third factors each explain about 10%, and two additional factors combined explain roughly another 10%. The correlation matrix of the ATM straddle hedging errors across maturities in Panel C shows that the hedging errors of short-term (2-, 2.5-, 3-, 3.5-, and 4-year), medium-term (4.5- and 5-year), and long-term (8-, 9-, and 10-year) straddles are highly correlated within each group, suggesting that there could be multiple unspanned factors.

To further understand whether the unspanned factors are related to stochastic volatility, we study the relationship between ATM cap implied volatilities and straddle hedging errors. Principal component analysis in Panel A of Table 47.5 shows that the first component explains 85% of the variation of cap implied volatilities. In Panel B, we regress straddle hedging errors on changes of the three yield factors and obtain R²s that are close to zero. However, if we include the weekly changes of the first few principal components of cap implied volatilities, the R²s increase significantly: for some maturities, the R²s are above 90%.

⁶ A difference cap is the difference between two caps with adjacent maturities and the same strike price. Instead of containing caplets starting from maturities as short as six months, a difference cap contains only the caplets within a narrow maturity range.
Although the time series of implied volatilities are very persistent, their differences are not, so we do not suffer from the well-known problem of spurious regression. In the extreme case in which we regress the straddle hedging errors of each maturity on changes of the yield factors and the cap implied volatilities of the same maturity, the R²s are in most cases above 90%. These results show that straddle returns are mainly affected by volatility risk rather than by term structure factors. As ATM straddles are mainly exposed to volatility risk, their hedging errors can serve as a proxy for the USV. Panels A and B of Table 47.6 report the R²s of regressions of the hedging errors of difference caps and cap straddles across moneyness and maturity on changes of the three yield factors and the first five principal components of straddle hedging errors. In contrast to the regressions in Panel D of Table 47.6, which include only the three yield factors, the additional factors from straddle hedging errors significantly improve the R²s of the regressions: for most moneyness/maturity groups, the R²s are above 90%. Interestingly, for long-term caps, the R²s of ATM and OTM caps are actually higher than those of ITM caps. Therefore, a combination of the yield factors and the USV factors can explain cap prices across moneyness and maturity very well.

While the above analysis is mainly based on the QTSMs, the evidence on USV is so compelling that the results should be robust to potential model misspecification. The fact that the QTSMs provide an excellent fit of bond yields but can explain only a small percentage of the variation of ATM straddle returns is a strong indication that the models miss some risk factors that are important for the cap market. While we estimate the QTSMs using only bond prices, we could also include cap prices in the model estimation. We do not choose the second approach for several reasons. First, the current approach is consistent with the main objective of our study: to use risk factors extracted from the swap market to explain cap prices. Second, it is not clear that modifications of model parameters, without changing the fundamental structure of the model, could remedy the poor cross-sectional hedging performance of the QTSMs. In fact, if the QTSMs indeed miss some important factors, then no matter how they are estimated (using bonds or derivatives data), they are unlikely to have good hedging performance. Finally, Jagannathan et al. (2003) do not find significant differences between the parameters of ATSMs estimated using LIBOR/swap rates and those estimated using cap/swaption prices.

The existence of USV strongly suggests that existing DTSMs need to relax their fundamental assumption that derivatives are redundant securities by explicitly incorporating USV factors. It also suggests that it might be more convenient to consider derivative pricing in the forward rate models of HJM (1992) or the random field models of Goldstein (2000) and Santa-Clara and Sornette (2001), because it is generally very difficult to introduce USV in DTSMs.
Table 47.4 Hedging interest rate cap straddles

Panel A. The performance of QTSM1 in hedging difference cap straddles, measured by variance ratio

Moneyness        Maturity (years)
(K/F)    1.5    2      2.5    3      3.5    4      4.5    5      6      7      8      9      10
0.60     –      –      –      –      0.711  –      –      –      0.709  0.329  0.596  0.362  0.250
0.65     –      –      0.794  0.756  0.711  –      0.270  0.215  0.672  0.427  0.495  0.283  0.185
0.70     –      –      0.776  0.723  0.674  0.557  0.250  0.152  0.473  0.206  0.187  0.096  0.074
0.75     –      0.690  0.682  0.683  0.589  0.497  0.254  0.078  0.399  0.211  0.148  0.072  0.063
0.80     –      0.560  0.652  0.615  0.437  0.488  0.179  0.093  0.293  0.126  0.113  0.053  0.070
0.85     0.727  0.438  0.579  0.532  0.366  0.349  0.133  0.095  0.227  0.091  0.068  0.025  0.037
0.90     0.558  0.278  0.405  0.339  0.248  0.265  0.066  0.049  0.138  0.052  0.032  0.016  0.060
0.95     0.416  0.127  0.287  0.236  0.207  0.215  0.022  0.017  0.069  0.018  0.034  0.012  0.043
1.00     0.364  0.081  0.210  0.142  0.149  0.142  0.024  0.006  0.045  0.006  0.047  0.009  0.002
1.05     0.471  0.111  0.237  0.133  0.190  0.187  0.058  0.010  0.078  0.035  0.091  0.022  0.035
1.10     0.622  0.212  0.368  0.226  0.314  0.283  0.146  0.065  0.133  0.054  0.091  0.018  –
1.15     0.727  0.357  0.508  0.378  0.472  0.399  0.235  0.162  0.252  0.107  –      –      –
1.20     0.788  0.527  0.633  0.515  0.593  0.481  0.368  0.201  0.343  0.256  –      –      –
1.25     0.831  0.640  0.721  0.636  0.662  0.600  0.464  0.296  0.408  –      –      –      –
1.30     0.851  0.727  0.808  0.728  0.781  0.729  0.525  –      0.454  –      –      –      –
1.35     0.876  0.778  0.852  0.802  0.833  0.804  0.551  –      –      –      –      –      –
1.40     –      0.817  0.894  0.863  0.880  –      –      –      –      –      –      –      –

Panel B. Percentage of the variance of ATM straddle hedging errors explained by the principal components

Principal component   1       2       3      4      5      6
                      59.3%   12.4%   9.4%   6.7%   4.0%   2.8%

Panel C. Correlation matrix of ATM straddle hedging errors across maturity

Maturity  1.5   2     2.5   3     3.5   4     4.5   5     6     7     8     9     10
1.5       1.00
2         0.38  1.00
2.5       0.28  0.66  1.00
3         0.03  0.33  0.73  1.00
3.5       0.27  0.52  0.63  0.59  1.00
4         0.13  0.44  0.37  0.37  0.77  1.00
4.5       0.20  0.21  0.04  0.08  0.05  0.06  1.00
5         0.10  0.11  0.12  0.13  0.16  0.15  0.96  1.00
6         0.21  0.16  0.19  0.13  0.25  0.05  0.27  0.23  1.00
7         0.30  0.34  0.33  0.35  0.46  0.38  0.28  0.22  0.08  1.00
8         0.10  0.12  0.30  0.30  0.25  0.11  0.36  0.34  0.29  0.29  1.00
9         0.14  0.11  0.25  0.29  0.26  0.12  0.39  0.37  0.32  0.38  0.83  1.00
10        0.08  0.01  0.17  0.14  0.12  0.01  0.32  0.35  0.26  0.28  0.77  0.86  1.00
For example, Collin-Dufresne and Goldstein (2002) show that highly restrictive assumptions on model parameters need to be imposed to guarantee that some state variables that are important for derivative pricing do not affect bond prices. In contrast, they show that it is much easier to introduce USV in the HJM and random field classes of models: any HJM or random field model in which the forward rate has stochastic volatility exhibits USV. While it has always been argued that HJM and random field models are
more appropriate for pricing derivatives than DTSMs, the reasoning given here is quite different: in addition to the commonly known advantages of these models (such as perfectly fitting the initial yield curve, which DTSMs generally cannot), another advantage of HJM and random field models is that they can easily accommodate USV (see Collin-Dufresne and Goldstein (2002) for an illustration). The existence of USV suggests that DTSMs may not be directly applicable to derivatives, because they all rely on the fundamental assumption that bonds and derivatives are driven by the same set of risk factors.
Table 47.5 Straddle hedging errors and cap implied volatilities

Panel A. Percentage of variance of ATM cap implied volatilities explained by the principal components

Principal component   1        2       3       4       5       6
Variance explained    85.73%   7.91%   1.85%   1.54%   0.72%   0.67%

Panel B. R²s of the regressions of ATM straddle hedging errors on changes of the three yield factors (row one); changes of the three yield factors and the first four principal components of the ATM cap implied volatilities (row two); and changes of the three yield factors and the maturity-wise ATM cap implied volatility (row three)

Maturity   1.5   2     2.5   3     3.5   4     4.5   5     6     7     8     9     10
Row one    0.10  0.06  0.02  0.01  0.01  0.04  0.00  0.00  0.01  0.01  0.00  0.01  0.04
Row two    0.29  0.49  0.54  0.43  0.63  0.47  0.95  0.96  0.21  0.70  0.68  0.89  0.96
Row three  0.68  0.70  0.81  0.87  0.85  0.90  0.95  0.98  0.95  0.98  0.97  0.98  0.99

This table reports the relation between straddle hedging errors and ATM cap implied volatilities.
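The regressions behind Panel B can be illustrated with a short sketch. The snippet below, in Python with simulated placeholder data (the series, factor loadings, and sample size are all hypothetical), regresses a hedging-error series on changes in the three yield factors alone and then on the yield factors augmented with the first principal components of ATM cap implied volatilities, mirroring how the incremental explanatory power of the volatility factors is measured.

import numpy as np

rng = np.random.default_rng(0)
T = 150                                   # weekly observations (hypothetical)
dyield = rng.normal(size=(T, 3))          # changes in level/slope/curvature
ivols = rng.normal(size=(T, 13))          # ATM cap implied vols by maturity

# First four principal components of the implied-volatility panel.
ivols_c = ivols - ivols.mean(axis=0)
_, _, vt = np.linalg.svd(ivols_c, full_matrices=False)
pcs = ivols_c @ vt[:4].T

# A hedging-error series that loads on an implied-volatility factor (USV proxy).
errors = 0.1 * dyield @ np.ones(3) + 0.8 * pcs[:, 0] + 0.2 * rng.standard_normal(T)

def r_squared(y, X):
    # R^2 of an OLS regression of y on X (with intercept).
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

print("yield factors only      :", round(r_squared(errors, dyield), 3))
print("yield factors + vol PCs :", round(r_squared(errors, np.column_stack([dyield, pcs])), 3))

In the chapter's results, adding the implied-volatility components raises the R²s dramatically (row two versus row one of Panel B), which is the signature of unspanned stochastic volatility.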
47.3 LIBOR Market Models with Stochastic Volatility and Jumps: Theory and Estimation

In this section, we develop a multifactor HJM model with stochastic volatility and jumps in LIBOR forward rates and discuss model estimation and comparison using a wide cross section of difference caps. Instead of modeling the unobservable instantaneous spot rate or forward rate, we focus on the LIBOR forward rates, which are observable and widely used in the market.
47.3.1 Specification of the LIBOR Market Models
Throughout our analysis, we restrict the cap maturity T to a finite set of dates 0 = T_0 < T_1 < ... < T_K < T_{K+1}, and assume that the intervals T_{k+1} − T_k are equally spaced by δ, a quarter of a year. Let L_k(t) = L(t, T_k) be the LIBOR forward rate for the actual period [T_k, T_{k+1}], and similarly let D_k(t) = D(t, T_k) be the price of a zero-coupon bond maturing on T_k. Thus, we have

L(t, T_k) = (1/δ) [D(t, T_k)/D(t, T_{k+1}) − 1], for k = 1, 2, ..., K. (47.19)

For LIBOR-based instruments, such as caps, floors, and swaptions, it is convenient to consider pricing under the forward measure. Thus, we will focus on the dynamics of the LIBOR forward rates L_k(t) under the forward measure Q^{k+1}, which is essential for pricing caplets maturing at T_{k+1}. Under this measure, the discounted price of any security using D_{k+1}(t) as the numeraire is a martingale. Therefore, the time-t price of a caplet maturing at T_{k+1} with a strike price of X is

Caplet(t, T_{k+1}, X) = δ D_{k+1}(t) E_t^{Q^{k+1}}[(L_k(T_k) − X)^+], (47.20)

where E_t^{Q^{k+1}} is taken with respect to Q^{k+1} given the information set at t. The key to valuation is modeling the evolution of L_k(t) under Q^{k+1} realistically, yet parsimoniously enough to yield a closed-form pricing formula. To achieve this goal, we rely on the flexible AJDs of Duffie et al. (2000) to model the evolution of LIBOR rates. We assume that under the physical measure P, the dynamics of LIBOR rates are given by the following system of SDEs, for t ∈ [0, T_k) and k = 1, ..., K,

dL_k(t)/L_k(t) = α_k(t) dt + σ_k(t) dZ_k(t) + dJ_k(t), (47.21)

where α_k(t) is an unspecified drift term, Z_k(t) is the k-th element of a K-dimensional correlated Brownian motion with covariance matrix Ψ(t), and J_k(t) is the k-th element of a K-dimensional pure jump process assumed independent of Z_k(t) for all k. To introduce stochastic volatility and correlation, we could allow the volatility of each LIBOR rate σ_k(t) and each individual element of Ψ(t) to follow a stochastic process. But such a model is unnecessarily complicated and difficult to implement. Instead, we consider a low-dimensional model based on the first few principal components of historical LIBOR forward rates. We assume that the entire LIBOR forward curve is driven by a small number of factors N < K (N ≤ 3 in our empirical analysis). By focusing on the first N principal components of historical LIBOR rates, we can reduce the dimension of the model from K to N.
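To make the bookkeeping in Equations (47.19)–(47.21) concrete, the following sketch in Python backs LIBOR forwards out of zero-coupon bond prices and then prices a caplet by Monte Carlo under the forward measure. The discount factors, volatility, and strike are purely hypothetical, and driftless lognormal dynamics are used as a stand-in for the AJD specification.

import numpy as np

delta = 0.25                               # quarterly accrual
D = np.array([1.0, 0.988, 0.975, 0.961])   # hypothetical D(t, T_k), k = 0..3

# Equation (47.19): L(t, T_k) = (1/delta) * (D(t,T_k)/D(t,T_{k+1}) - 1)
L = (D[:-1] / D[1:] - 1.0) / delta         # forwards for successive quarters
print("forward rates:", L.round(4))

# Caplet on the last forward, Equation (47.20). Under Q^{k+1}, L_k is a
# martingale; lognormal dynamics stand in for Equation (47.21).
rng = np.random.default_rng(1)
X, sigma, Tk, n = 0.05, 0.20, 0.5, 200_000   # strike, vol, expiry (hypothetical)
z = rng.standard_normal(n)
L_T = L[2] * np.exp(-0.5 * sigma**2 * Tk + sigma * np.sqrt(Tk) * z)
caplet = delta * D[3] * np.maximum(L_T - X, 0.0).mean()
print("caplet price:", round(caplet, 6))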
Table 47.6 ATM straddle hedging error as a proxy of systematic USV

Panel A: R²s of regressions of cap hedging errors

Maturity\Moneyness (K/F)  0.60   0.65   0.70   0.75   0.80   0.85   0.90   0.95   1.00   1.05   1.10   1.15   1.20   1.25   1.30   1.35   1.40
1.5                       –      –      –      –      –      0.958  0.949  0.943  0.932  0.919  0.913  0.879  0.881  0.870  0.861  0.855  –
2                         –      –      –      0.934  0.917  0.909  0.900  0.886  0.859  0.821  0.793  0.763  0.749  0.742  0.702  0.661  0.640
2.5                       –      0.938  0.934  0.926  0.934  0.927  0.908  0.905  0.905  0.897  0.890  0.871  0.844  0.818  0.802  0.764  0.725
3                         –      0.949  0.944  0.945  0.938  0.928  0.922  0.918  0.909  0.902  0.894  0.880  0.848  0.817  0.808  0.774  0.743
3.5                       0.945  0.954  0.943  0.943  0.935  0.928  0.924  0.936  0.939  0.937  0.928  0.915  0.894  0.870  0.836  0.801  0.761
4                         –      –      0.911  0.910  0.909  0.889  0.896  0.906  0.902  0.897  0.880  0.860  0.846  0.819  0.802  0.758  0.536
4.5                       –      0.947  0.934  0.936  0.950  0.956  0.961  0.966  0.988  0.986  0.979  0.970  0.966  0.945  0.920  0.884  –
5                         –      0.952  0.936  0.919  0.946  0.959  0.969  0.976  0.989  0.985  0.976  0.968  0.963  0.943  –      –      –
6                         0.948  0.960  0.940  0.950  0.951  0.959  0.969  0.980  0.984  0.980  0.974  0.966  0.954  0.941  0.908  –      –
7                         0.880  0.928  0.885  0.899  0.898  0.906  0.920  0.967  0.973  0.969  0.967  0.963  0.957  –      –      –      –
8                         0.884  0.871  0.839  0.862  0.862  0.861  0.871  0.889  0.910  0.908  0.913  –      –      –      –      –      –
9                         0.786  0.807  0.791  0.814  0.821  0.818  0.856  0.882  0.894  0.917  0.921  –      –      –      –      –      –
10                        0.880  0.838  0.776  0.791  0.840  0.843  0.871  0.893  0.907  0.885  –      –      –      –      –      –      –

Panel B: R²s of regressions of cap straddle hedging errors

Maturity\Moneyness (K/F)  0.60   0.65   0.70   0.75   0.80   0.85   0.90   0.95   1.00   1.05   1.10   1.15   1.20   1.25   1.30   1.35   1.40
1.5                       –      –      –      –      –      0.915  0.883  0.884  0.901  0.893  0.896  0.908  0.920  0.944  0.945  0.960  –
2                         –      –      –      0.869  0.812  0.772  0.718  0.672  0.745  0.745  0.757  0.785  0.831  0.873  0.890  0.906  0.924
2.5                       –      0.883  0.874  0.844  0.839  0.833  0.801  0.828  0.861  0.871  0.883  0.882  0.883  0.894  0.913  0.927  0.941
3                         –      0.916  0.897  0.845  0.828  0.798  0.791  0.826  0.822  0.789  0.769  0.765  0.785  0.813  0.859  0.885  0.912
3.5                       0.851  0.876  0.825  0.833  0.800  0.772  0.763  0.840  0.851  0.863  0.880  0.886  0.890  0.898  0.921  0.933  0.946
4                         –      –      0.775  0.760  0.734  0.685  0.690  0.747  0.745  0.728  0.735  0.778  0.808  0.843  0.872  0.901  –
4.5                       –      0.891  0.854  0.850  0.872  0.898  0.919  0.936  0.971  0.966  0.953  0.945  0.947  0.918  0.920  0.906  –
5                         –      0.902  0.886  0.853  0.886  0.910  0.930  0.945  0.976  0.972  0.960  0.943  0.935  0.898  –      –      –
6                         0.839  0.860  0.800  0.812  0.751  0.738  0.721  0.746  0.738  0.729  0.758  0.789  0.809  0.823  0.830  –      –
7                         0.589  0.704  0.621  0.655  0.607  0.604  0.622  0.684  0.703  0.758  0.788  0.821  0.843  –      –      –      –
8                         0.724  0.789  0.688  0.740  0.740  0.744  0.777  0.812  0.849  0.874  0.892  –      –      –      –      –      –
9                         0.712  0.768  0.749  0.800  0.805  0.821  0.859  0.908  0.922  0.913  0.931  –      –      –      –      –      –
10                        0.744  0.785  0.730  0.761  0.805  0.807  0.840  0.861  0.894  0.902  –      –      –      –      –      –      –

This table reports the contribution of USV, proxied by the first few principal components of ATM straddle hedging errors, in explaining the hedging errors of caps and cap straddles across moneyness and maturity. It reports the R²s of regressions of hedging errors of caps across moneyness and maturity on changes of the three yield factors and the first five principal components of straddle hedging errors. The bold entries in the original table represent moneyness/maturity groups that have less than 10% of missing values; the rest are the ones with 10–50% of missing values.
Following LSS (2001) and Han (2007), we assume that the instantaneous covariance matrix of changes in LIBOR rates shares the same eigenvectors as the historical covariance matrix. Suppose that the historical covariance matrix can be approximated as H = U Λ_0 U′, where Λ_0 is a diagonal matrix whose diagonal elements are the first N largest eigenvalues in descending order, and the N columns of U are the corresponding eigenvectors.⁷ Our assumption means that the instantaneous covariance matrix of changes in LIBOR rates with fixed time-to-maturity, Ω_t, shares the same eigenvectors as H. That is,

Ω_t = U Λ_t U′, (47.22)

where Λ_t is a diagonal matrix whose i-th diagonal element, denoted by V_i(t), can be interpreted as the instantaneous variance of the i-th common factor driving the yield curve evolution at t. We assume that V(t) follows the square-root process that has been widely used in the literature for modeling stochastic volatility (see, e.g., Heston 1993):

dV_i(t) = κ_i (v̄_i − V_i(t)) dt + ξ_i sqrt(V_i(t)) dW̃_i(t), (47.23)

where W̃_i(t) is the i-th element of an N-dimensional independent Brownian motion assumed independent of Z_k(t) and J_k(t) for all k.⁸

While Equations (47.22) and (47.23) specify the instantaneous covariance matrix of LIBOR rates with fixed time-to-maturity, in applications we need the instantaneous covariance matrix of LIBOR rates with fixed maturities, Σ_t. At t = 0, Σ_t coincides with Ω_t; for t > 0, we obtain Σ_t from Ω_t through interpolation. Specifically, we assume that U_{s,j} is piecewise constant,⁹ i.e., for time to maturity s ∈ (T_k, T_{k+1}),

U_s² = (1/2)(U_k² + U_{k+1}²). (47.24)

We further assume that U_{s,j} is constant for all caplets belonging to the same difference cap. For the family of LIBOR rates with maturities T = T_1, T_2, ..., T_K, we denote by U_{T−t} the time-t matrix that consists of rows of U_{T_k−t}, and therefore we have the time-t covariance matrix of the LIBOR rates with fixed maturities,

Σ_t = U_{T−t} Λ_t U′_{T−t}. (47.25)

To stay within the family of AJDs, we assume that the random jump times arrive with a constant intensity λ_J, and that, conditional on the arrival of a jump, the jump size follows a normal distribution N(μ_J, σ_J²). Intuitively, the conditional probability at time t of another jump within the next small time interval Δt is λ_J Δt, and, conditional on a jump event, the mean relative jump size is μ = exp(μ_J + σ_J²/2) − 1.¹⁰ We also assume that the shocks driving LIBOR rates, volatility, and jumps (both jump time and size) are mutually independent. Given the above assumptions, we have the following dynamics of LIBOR rates under the physical measure P:

dL_k(t)/L_k(t) = α_k(t) dt + Σ_{j=1}^{N} U_{T_k−t,j} sqrt(V_j(t)) dW_j(t) + dJ_k(t), k = 1, 2, ..., K. (47.26)

⁷ We acknowledge that with jumps in LIBOR rates, both the historical and the instantaneous covariance matrix of LIBOR rates contain a component that is due to jumps. Our approach implicitly assumes that the first three principal components from the historical covariance matrix capture the variations in LIBOR rates due to continuous shocks and that the impact of jumps is contained only in the residuals.
⁸ Many empirical studies on interest rate dynamics (see, for example, Andersen and Lund 1997; Ball and Torous 1999; Chen and Scott 2001) have shown that the correlation between stochastic volatility and interest rates is close to zero. That is, there is not a strong "leverage" effect for interest rates as there is for stock prices. The independence assumption between stochastic volatility and LIBOR rates in our model captures this stylized fact.
⁹ Our interpolation scheme is slightly different from that of Han (2007) for the convenience of deriving a closed-form solution for cap prices.
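The covariance structure in Equations (47.22)–(47.26) can be sketched numerically. The snippet below, with simulated placeholder data and made-up CIR parameters, extracts the eigenvectors U from a historical covariance matrix, simulates the square-root variances V_i(t) by Euler discretization, and assembles the instantaneous covariance matrix Σ_t = U Λ_t U′.

import numpy as np

rng = np.random.default_rng(2)
K, N = 13, 3                              # forward rates, retained factors
dL = rng.normal(size=(500, K)) @ rng.normal(size=(K, K)) * 1e-3
H = np.cov(dL, rowvar=False)              # historical covariance H = U L0 U'

eigval, eigvec = np.linalg.eigh(H)
order = np.argsort(eigval)[::-1][:N]      # N largest eigenvalues
U = eigvec[:, order]

# Square-root dynamics, Equation (47.23), with illustrative parameters:
# dV_i = kappa_i (vbar_i - V_i) dt + xi_i sqrt(V_i) dW_i
kappa = np.array([0.02, 0.05, 0.01])
vbar = np.array([2.0, 0.03, 0.13])
xi = np.array([0.85, 0.01, 0.14])
dt, steps = 1.0 / 52.0, 52
V = vbar.copy()
for _ in range(steps):
    dW = rng.normal(scale=np.sqrt(dt), size=N)
    V = np.maximum(V + kappa * (vbar - V) * dt + xi * np.sqrt(V) * dW, 1e-12)

# Equation (47.22): instantaneous covariance of LIBOR changes.
Sigma_t = U @ np.diag(V) @ U.T
print("V(t):", V.round(4), " Sigma_t shape:", Sigma_t.shape)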
To price caps, we need the dynamics of LIBOR rates under the appropriate forward measure. The existence of stochastic volatility and jumps results in an incomplete market and hence the non-uniqueness of forward martingale measures. Our approach for eliminating this non-uniqueness is to specify the market prices of both the volatility and jump risks that govern the change from the physical measure P to the forward measure Q^{k+1}.¹¹ Following the existing literature, we model the volatility risk premium as η_j^{k+1} sqrt(V_j(t)), for j = 1, ..., N. For the jump risk premium, we assume that under the forward measure Q^{k+1} the jump process has the same distribution as under P, except that the jump size follows a normal distribution with mean μ_J^{k+1} and variance σ_J². Thus, the mean relative jump size under Q^{k+1} is μ^{k+1} = exp(μ_J^{k+1} + σ_J²/2) − 1. Our specification of the market prices of jump risks allows the mean relative jump size under Q^{k+1} to be different from that under P, accommodating a premium for jump size uncertainty. This approach, which is also adopted by Pan (2002), artificially absorbs the risk premium associated with the timing of the jump into the jump size risk premium. In our empirical analysis, we make the simplifying assumption that the volatility and jump risk premiums are linear functions of time-to-maturity, i.e., η_j^{k+1} = c_{jv}(T_k − 1) and μ_J^{k+1} = μ_J + c_J(T_k − 1).¹²

¹⁰ For simplicity, we assume that different forward rates follow the same jump process with constant jump intensity. It is not difficult to allow different jump processes for individual LIBOR rates and the jump intensity to depend on the state of the economy within the AJD framework.
¹¹ The market prices of interest rate risks are defined in such a way that the LIBOR rate is a martingale under the forward measure.
¹² In order to estimate the volatility and jump risk premiums, we would need a joint analysis of the dynamics of LIBOR rates under both the physical and the forward measure, as in Chernov and Ghysels (2000), Pan (2002), and Eraker (2003). In our empirical analysis, we focus only on the dynamics under the forward measure. Therefore, we can only identify differences in the risk premiums between forward measures with different maturities. Our specifications of both risk premiums implicitly use the one-year LIBOR rate as a reference point.
Due to the no-arbitrage restriction, the risk premiums of shocks to LIBOR rates for different forward measures are intimately related to each other. If shocks to volatility and jumps were also correlated with shocks to LIBOR rates, then both the volatility and jump risk premiums for different forward measures would likewise be closely related to each other. However, in our model shocks to LIBOR rates are independent of those to volatility and jumps, and as a result the change of measure of the LIBOR shocks does not affect that of the volatility and jump shocks. Due to stochastic volatility and jumps, the underlying LIBOR market is no longer complete and there is no unique forward measure. This gives us the freedom to choose the functional forms of η_j^{k+1} and μ_J^{k+1}. See Andersen and Brotherton-Ratcliffe (2001) for similar discussions. Given the above market prices of risks, we can write down the dynamics of log(L_k(t)) under the forward measure Q^{k+1}:

d log(L_k(t)) = −(λ_J μ^{k+1} + (1/2) Σ_{j=1}^{N} U²_{T_k−t,j} V_j(t)) dt + Σ_{j=1}^{N} U_{T_k−t,j} sqrt(V_j(t)) dW_j^{Q^{k+1}}(t) + dJ_k^{Q^{k+1}}(t). (47.27)

For pricing purposes, the above process can be further simplified to the following one, which has the same distribution:

d log(L_k(t)) = −(λ_J μ^{k+1} + (1/2) Σ_{j=1}^{N} U²_{T_k−t,j} V_j(t)) dt + sqrt(Σ_{j=1}^{N} U²_{T_k−t,j} V_j(t)) dZ_k^{Q^{k+1}}(t) + dJ_k^{Q^{k+1}}(t), (47.28)

where Z_k^{Q^{k+1}}(t) is a standard Brownian motion under Q^{k+1}. The dynamics of V_i(t) under Q^{k+1} become

dV_i(t) = κ_i^{k+1} (v̄_i^{k+1} − V_i(t)) dt + ξ_i sqrt(V_i(t)) dW̃_i^{Q^{k+1}}(t), (47.29)

where W̃^{Q^{k+1}} is independent of Z^{Q^{k+1}}, κ_j^{k+1} = κ_j − ξ_j η_j^{k+1}, and v̄_j^{k+1} = κ_j v̄_j / (κ_j − ξ_j η_j^{k+1}), j = 1, ..., N. The dynamics of L_k(t) under the forward measure Q^{k+1} are completely captured by Equations (47.28) and (47.29).

Given that LIBOR rates follow AJDs under both the physical and the forward measure, we can directly apply the transform analysis of Duffie et al. (2000) to derive a closed-form formula for cap prices. Denote the state variables at t as Y_t = (log(L_k(t)), V_t)′ and the time-t expectation of e^{u·Y_{T_k}} under the forward measure Q^{k+1} as ψ(u, Y_t, t, T_k) ≜ E_t^{Q^{k+1}}[e^{u·Y_{T_k}}]. Let u = (u_0, 0_{1×N})′; then the time-t expectation of the LIBOR rate at T_k equals

E_t^{Q^{k+1}}{exp[u_0 log(L_k(T_k))]} = ψ(u_0, Y_t, t, T_k) = exp[a(s) + u_0 log(L_k(t)) + B(s)′ V_t], (47.30)

where s = T_k − t and closed-form solutions of a(s) and B(s) (an N-by-1 vector) are obtained by solving a system of Riccati equations in the Appendix.

Following Duffie et al. (2000), we define

G_{a,b}(y; Y_t, T_k, Q^{k+1}) = E_t^{Q^{k+1}}[e^{a log(L_k(T_k))} 1_{{b log(L_k(T_k)) ≤ y}}], (47.31)

and its Fourier transform,

𝒢_{a,b}(v; Y_t, T_k, Q^{k+1}) = ∫_R e^{ivy} dG_{a,b}(y) = E_t^{Q^{k+1}}[e^{(a+ivb) log(L_k(T_k))}] = ψ(a + ivb, Y_t, t, T_k). (47.32)

Levy's inversion formula gives

G_{a,b}(y; Y_t, T_k, Q^{k+1}) = ψ(a, Y_t, t, T_k)/2 − (1/π) ∫_0^∞ Im[ψ(a + ivb, Y_t, t, T_k) e^{−ivy}]/v dv. (47.33)

The time-0 price of a caplet that matures at T_{k+1} with a strike price of X equals

Caplet(0, T_{k+1}, X) = δ D_{k+1}(0) E_0^{Q^{k+1}}[(L_k(T_k) − X)^+], (47.34)

where the expectation is given by the inversion formula,

E_0^{Q^{k+1}}[(L_k(T_k) − X)^+] = G_{1,−1}(−ln X; Y_0, T_k, Q^{k+1}) − X G_{0,−1}(−ln X; Y_0, T_k, Q^{k+1}). (47.35)
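The transform pricing of Equations (47.30)–(47.35) can be illustrated with a self-contained sketch. Since the actual ψ requires the Riccati solutions a(s) and B(s) from the Appendix, the demo below substitutes the transform of a driftless lognormal log L_k(T_k) — an assumption made purely so the Fourier-inversion result can be verified against the Black formula.

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

L0, X, sigma, Tk = 0.05, 0.05, 0.25, 2.0
m, s2 = np.log(L0) - 0.5 * sigma**2 * Tk, sigma**2 * Tk

def psi(z):
    # E[exp(z * log L_T)] for log L_T ~ N(m, s2); z may be complex.
    return np.exp(z * m + 0.5 * z * z * s2)

def G(a, b, y):
    # Equation (47.33): G_{a,b}(y) = psi(a)/2
    #   - (1/pi) * integral_0^inf Im[psi(a + i v b) exp(-i v y)] / v dv
    integrand = lambda v: (psi(a + 1j * v * b) * np.exp(-1j * v * y)).imag / v
    val, _ = quad(integrand, 1e-10, 200.0, limit=400)
    return psi(a).real / 2.0 - val / np.pi

# Equation (47.35): E[(L_T - X)^+] = G_{1,-1}(-ln X) - X * G_{0,-1}(-ln X)
expectation = G(1.0, -1.0, -np.log(X)) - X * G(0.0, -1.0, -np.log(X))

# Black-formula check for the lognormal stand-in.
d1 = (np.log(L0 / X) + 0.5 * s2) / np.sqrt(s2)
black = L0 * norm.cdf(d1) - X * norm.cdf(d1 - np.sqrt(s2))
print("transform:", round(expectation, 8), " Black:", round(black, 8))

The two numbers agree to numerical precision, which is the point of the exercise: once ψ is available in closed form, pricing reduces to two one-dimensional integrals.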
The new models developed in this section nest some of the most important models in the literature, such as LSS (2001) (with constant volatility and no jumps) and Han (2007) (with stochastic volatility and no jumps). The closed-form formula for cap prices makes an empirical implementation of our models very convenient and provides some advantages over existing methods. For example, Han (2007) develops approximations of ATM cap and swaption prices using the techniques of Hull and White (1987). However, such an approach might not work well for away-from-the-money options. In contrast, our method works well for all options, which is important for explaining the volatility smile. In addition to introducing stochastic volatility and jumps, our multifactor HJM models also have advantages over the standard LIBOR market models of Brace et al. (1997), Miltersen et al. (1997), and their extensions often applied to caps in practice.¹³ While our models provide a unified multifactor framework to characterize the evolution of the whole yield curve, the LIBOR market models typically make separate specifications of the dynamics of LIBOR rates with different maturities. As suggested by LSS (2001), the standard LIBOR models are "more appropriately viewed as a collection of different univariate models, where the relationship between the underlying factors is left unspecified." In our models, in contrast, the dynamics of LIBOR rates with different maturities under their related forward measures are internally consistent with each other, given their dynamics under the physical measure and the market prices of risks. Once our models are estimated using one set of prices, they can be used to price and hedge other fixed-income securities.
47.3.2 Estimation Method and Results

We estimate our new market model using prices from a wide cross section of difference caps with different strikes and maturities. Every week we observe prices of difference caps with ten moneyness levels and 13 maturities. However, due to changing interest rates, we do not have enough observations in all moneyness/maturity categories throughout the sample. Thus, we focus on the 53 moneyness/maturity categories that have less than ten percent of missing values over the sample estimation period.

¹³ Andersen and Brotherton-Ratcliffe (2001) and Glasserman and Kou (in press) develop LIBOR models with stochastic volatility and jumps, respectively.
The moneyness and maturity of all difference caps belong to the sets {0.7, 0.8, 0.9, 1.0, 1.1} and {1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0} (in years), respectively. Difference caps with time-to-maturity less than or equal to five years represent portfolios of two caplets, while those with time-to-maturity longer than five years represent portfolios of four caplets.

We estimate the model parameters by minimizing the sum of squared percentage pricing errors (SSE) of all relevant difference caps.¹⁴ Consider the time series observations t = 1, ..., T of the prices of 53 difference caps with moneyness m_i and time-to-maturity τ_i, i = 1, ..., M = 53. Let Θ represent the model parameters, which remain constant over the sample period. Let C(t, m_i, τ_i) be the observed price of a difference cap with moneyness m_i and time-to-maturity τ_i, and let Ĉ(t, m_i, τ_i, V_t(Θ), Θ) denote the corresponding theoretical price under a given model, where V_t(Θ) is the model-implied instantaneous volatility at t given model parameters Θ. For each i and t, denote the percentage pricing error as

u_{i,t}(Θ) = [C(t, m_i, τ_i) − Ĉ(t, m_i, τ_i, V_t(Θ), Θ)] / C(t, m_i, τ_i), (47.36)

where V_t(Θ) is defined as

V_t(Θ) = arg min_{V_t} Σ_{i=1}^{M} { [C(t, m_i, τ_i) − Ĉ(t, m_i, τ_i, V_t, Θ)] / C(t, m_i, τ_i) }². (47.37)

We provide empirical evidence on the performance of six different models in capturing the cap volatility smile. The first three models, denoted SV1, SV2, and SV3, allow one, two, and three principal components to drive the forward rate curve, respectively, each with its own stochastic volatility. The next three models, denoted SVJ1, SVJ2, and SVJ3, introduce jumps in LIBOR rates in each of the previous SV models. SVJ3 is the most comprehensive model and nests all the others as special cases. We first examine the separate performance of each of the SV and SVJ models, and then we compare performance across the two classes of models. The estimation of all models is based on the principal components extracted from historical LIBOR forward rates between June 1997 and July 2000.¹⁵

¹⁴ Due to the wide range of moneyness and maturities of the difference caps involved, there could be significant differences in the prices of difference caps. Using percentage pricing errors helps to mitigate this problem.
¹⁵ The LIBOR forward curve is constructed from weekly LIBOR and swap rates from Datastream following the bootstrapping procedure of LSS (2001).
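The nested estimation in Equations (47.36) and (47.37) can be sketched as a two-level optimization: an outer search over the structural parameters Θ and, inside each objective evaluation, a week-by-week fit of the implied volatility state V_t. The pricing function and data below are toy stand-ins (the real objective uses the AJD cap-pricing formula), so only the structure of the procedure is meaningful.

import numpy as np
from scipy.optimize import minimize, minimize_scalar

rng = np.random.default_rng(3)
T, M = 20, 10                             # weeks, caps per week (toy sizes)
moneyness = np.linspace(0.7, 1.1, M)

def model_price(theta, v_t, m):
    # Toy stand-in for C-hat in (47.36): the weekly volatility state v_t
    # sets the level, theta the moneyness profile.
    return 0.02 + v_t * (1.0 + theta[0] * (m - 1.0) + theta[1] * (m - 1.0) ** 2)

true_theta = np.array([-1.5, 4.0])
true_v = 0.03 + 0.02 * rng.random(T)
prices = np.array([model_price(true_theta, v, moneyness) *
                   (1.0 + 0.005 * rng.standard_normal(M)) for v in true_v])

def weekly_sse(theta, t):
    # Inner problem (47.37): profile out V_t week by week.
    def sse(v):
        u = (prices[t] - model_price(theta, v, moneyness)) / prices[t]
        return float(np.sum(u ** 2))
    return minimize_scalar(sse, bounds=(1e-6, 0.5), method="bounded").fun

total_sse = lambda theta: sum(weekly_sse(theta, t) for t in range(T))
fit = minimize(total_sse, x0=np.array([0.0, 1.0]), method="Nelder-Mead")
print("estimated theta:", np.round(fit.x, 2), " total SSE:", round(fit.fun, 6))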
The SV models contribute to cap pricing in four important ways. First, the three principal components capture variations in the levels of LIBOR rates caused by innovations in the "level," "slope," and "curvature" factors. Second, the stochastic volatility factors capture the fluctuations in the volatilities of LIBOR rates reflected in the Black implied volatilities of ATM caps.¹⁶ Third, the stochastic volatility factors also introduce fatter tails in LIBOR rate distributions than implied by the log-normal model, which helps capture the volatility smile. Finally, given our model structure, innovations of the stochastic volatility factors also affect the covariances between LIBOR rates with different maturities. The first three contributions, however, are more important for our applications, because difference caps are much less sensitive to time-varying correlations than swaptions.¹⁷ Our discussion of the performance of the SV models focuses on the estimates of the model parameters and the latent volatility variables, and on the time series and cross-sectional pricing errors of difference caps.

A comparison of the parameter estimates of the three SV models in Table 47.7 shows that the "level" factor has the most volatile stochastic volatility, followed, in decreasing order, by the "curvature" and "slope" factors. The long-run mean (v̄_1) and volatility of volatility (ξ_1) of the first volatility factor are much bigger than those of the other two factors. This suggests that the fluctuations in the volatilities of LIBOR rates are mainly due to the time-varying volatility of the "level" factor. The estimates of the volatility risk premium of the three models are significantly negative, suggesting that the stochastic volatility factors of longer maturity LIBOR rates under the forward measure are less volatile, with lower long-run means and faster speeds of mean reversion. This is consistent with the fact that the Black implied volatilities of longer maturity difference caps are less volatile than those of short-term difference caps.

Our parameter estimates are consistent with the volatility variables inferred from the prices of difference caps. The volatility of the "level" factor is the highest among the three (although at lower absolute levels in the more sophisticated models). It starts at a low level, steadily increases, and stabilizes at a high level in the later part of the sample period. The volatility of the "slope" factor is much lower and relatively stable during the whole sample period. The volatility of the "curvature" factor is generally between those of the first and second factors. The steady increase of the volatility of the "level" factor is consistent with the increase of the Black implied volatilities of ATM difference caps throughout our sample period. In fact, the correlations between the Black implied volatilities of most difference caps and the implied volatility of the "level" factor are higher than 0.8. The correlation between the Black implied volatilities and the other two volatility factors is much weaker. The importance of stochastic volatility is obvious: the fluctuations in Black implied volatilities show that a model with constant volatility simply would not be able to capture even the general level of cap prices.

¹⁶ Throughout our discussion, volatilities of LIBOR rates refer to market implied volatilities from cap prices and are different from volatilities estimated from historical data.
¹⁷ See Han (2007) for more detailed discussions of the impact of time-varying correlations on the pricing of swaptions.

Table 47.7 Parameter estimates of stochastic volatility models
Parameter   SV1 estimate (std. err)   SV2 estimate (std. err)   SV3 estimate (std. err)
κ1          0.0179 (0.0144)           0.0091 (0.0111)           0.0067 (0.0148)
κ2          –                         0.1387 (0.0050)           0.0052 (0.0022)
κ3          –                         –                         0.0072 (0.0104)
v̄1          1.3727 (1.1077)           1.7100 (2.0704)           2.1448 (4.7567)
v̄2          –                         0.0097 (0.0006)           0.0344 (0.0142)
v̄3          –                         –                         0.1305 (0.1895)
ξ1          1.0803 (0.0105)           0.8992 (0.0068)           0.8489 (0.0098)
ξ2          –                         0.0285 (0.0050)           0.0117 (0.0065)
ξ3          –                         –                         0.1365 (0.0059)
c1v         −0.0022 (0.0000)          −0.0031 (0.0000)          −0.0015 (0.0000)
c2v         –                         −0.0057 (0.0010)          −0.0007 (0.0001)
c3v         –                         –                         −0.0095 (0.0003)
Objective   0.0834                    0.0758                    0.0692

This table reports parameter estimates and standard errors of the one-, two-, and three-factor stochastic volatility models. The estimates are obtained by minimizing the sum of squared percentage pricing errors (SSE) of difference caps in 53 moneyness and maturity categories observed at a weekly frequency from August 1, 2000 to September 23, 2003. The objective functions reported in the table are re-scaled SSEs over the entire sample at the estimated model parameters and equal the RMSEs of difference caps. The volatility risk premium of the i-th stochastic volatility factor for forward measure Q^{k+1} is defined as η_i^{k+1} = c_{iv}(T_k − 1).
The other aspects of model performance are the time series and cross-sectional pricing errors of difference caps. The likelihood ratio tests in Panel A of Table 47.8 overwhelmingly reject SV1 and SV2 in favor of SV2 and SV3, respectively. The Diebold–Mariano statistics in Panel A of Table 47.8 also show that SV2 and SV3 have significantly smaller SSEs than SV1 and SV2, respectively, suggesting that the more sophisticated SV models improve the pricing of all caps. The time series of RMSEs of the three SV models over our sample period¹⁸ suggest that, except for two special periods in which all models have extremely large pricing errors, the RMSEs of all models are rather uniform, with the best model (SV3) having RMSEs slightly above 5%. The two special periods with high pricing errors cover the period between the second half of December 2000 and the first half of January 2001, and the first half of October 2001, and coincide with high prepayments in mortgage-backed securities (MBS). Indeed, the MBAA refinancing index and prepayment speed (see Figure 3 of Duarte 2004) show that after a long period of low prepayments between the middle of 1999 and late 2000, prepayments dramatically increased at the end of 2000 and the beginning of 2001. There was also a dramatic increase of prepayments at the beginning of October 2001. As widely recognized in the fixed-income market,¹⁹ excessive hedging demand for prepayment risk using interest rate derivatives may push derivative prices away from their equilibrium values, which could explain the failure of our models during these two special periods.²⁰

In addition to overall model performance as measured by SSEs, we also examine the cross-sectional pricing errors of difference caps with different moneyness and maturities. We first look at the squared percentage pricing errors, which measure both the bias and the variability of the pricing errors. Then we look at the average percentage pricing errors (the difference between market and model prices divided by the market price) to see whether the SV models can on average capture the volatility smile in the cap market.

The Diebold–Mariano statistics of squared percentage pricing errors of individual difference caps between SV2 and SV1 in Panel B of Table 47.8 show that SV2 reduces the pricing errors of SV1 for some but not all difference caps. SV2 has the most significant reductions in pricing errors of SV1 for mid- and short-term around-the-money difference caps. On the other hand, SV2 has larger pricing errors for deep ITM difference caps. The Diebold–Mariano statistics between SV3 and SV2 in Panel C of Table 47.8 show that SV3 significantly reduces the pricing errors of many short- (2–3 years) and mid-term around-the-money, and long-term (6–10 years) ITM difference caps.

Table 47.9 reports the average percentage pricing errors of all difference caps under the three SV models. Panel A of Table 47.9 shows that, on average, SV1 underprices short-term and overprices mid- and long-term ATM difference caps, and underprices ITM and overprices OTM difference caps. This suggests that SV1 cannot generate enough skewness in the implied volatilities to be consistent with the data. Panel B shows that SV2 has some improvements over SV1, mainly for some short-term (less than 3.5 years) ATM and mid-term (3.5–5 years) slightly OTM difference caps. But SV2 has worse performance for most deep ITM (m = 0.7 and 0.8) difference caps: it actually worsens the underpricing of ITM caps. Panel C of Table 47.9 shows that, relative to SV1 and SV2, SV3 has smaller average percentage pricing errors for most long-term (7–10 years) ITM, mid-term (3.5–5 years) OTM, and short-term (2–2.5 years) ATM difference caps, and bigger average percentage pricing errors for mid-term (3.5–6 years) ITM difference caps. There is still significant underpricing of ITM and overpricing of OTM difference caps under SV3.

Overall, the results show that stochastic volatility factors are essential for capturing the time-varying volatilities of LIBOR rates. The Diebold–Mariano statistics in Table 47.8 show that, in general, more sophisticated SV models have smaller pricing errors than simpler models, although the improvements are most important for close-to-the-money difference caps. The average percentage pricing errors in Table 47.9 show, however, that even the most sophisticated SV model cannot generate enough volatility skew to be consistent with the data. While previous studies, such as Han (2007), have shown that a three-factor stochastic volatility model similar to ours performs well in pricing ATM caps and swaptions, our analysis shows that the model fails to completely capture the volatility smile in the cap markets. Our findings highlight the importance of studying the relative pricing of caps with different moneyness to reveal the inadequacies of existing term structure models; the same inadequacies cannot be uncovered by studying ATM options alone.

One important reason for the failure of the SV models is that the stochastic volatility factors are independent of LIBOR rates. As a result, the SV models can only generate a symmetric volatility smile, not the asymmetric smile or skew observed in the data. The pattern of the smile in the cap market is rather similar to that of index options: ITM calls (and OTM puts) are overpriced, and OTM calls (and ITM puts) are underpriced relative to the Black model.

¹⁸ The RMSE of a model at t is calculated as sqrt(û_t(Θ̂)′ û_t(Θ̂) / M). We plot RMSEs instead of SSEs because the former provides a more direct measure of the average percentage pricing errors of difference caps.
¹⁹ We would like to thank Pierre Grellet Aumont from Deutsche Bank for his helpful discussions on the influence of MBS markets on OTC interest rate derivatives.
²⁰ While prepayment rates were also high in the later part of 2002 and for most of 2003, they might not have come as surprises to participants in the MBS markets given the two previous special periods.
Table 47.8 Comparison of the performance of stochastic volatility models

Panel A. Likelihood ratio and Diebold–Mariano statistics for overall model performance based on SSEs

Models    Likelihood ratio statistic χ²(168)   Diebold–Mariano statistic
SV2–SV1   1,624                                −1.931
SV3–SV2   1,557                                −6.351

Panel B. Diebold–Mariano statistics between SV2 and SV1 for individual difference caps based on squared percentage pricing errors

Moneyness  1.5     2       2.5     3       3.5     4       4.5    5       6      7      8       9       10
0.7        –       –       –       –       2.433   2.895   2.385  5.414   4.107  5.867  4.280   −0.057  2.236
0.8        –       –       −0.061  0.928   1.838   1.840   2.169  6.676   3.036  6.948  4.703   1.437   −1.079
0.9        –       −1.553  1.988   2.218   −1.064  −1.222  3.410  −1.497  0.354  6.920  −1.230  2.036   −1.020
1.0        −0.295  5.068   2.693   −1.427  −1.350  −1.676  3.498  3.479   2.120  1.338  0.139   4.170   −0.193
1.1        −1.260  4.347   −1.522  0.086   −1.492  3.134   3.439  3.966   –      –      –       –       –

Panel C. Diebold–Mariano statistics between SV3 and SV2 for individual difference caps based on squared percentage pricing errors

Moneyness  1.5     2       2.5    3       3.5    4      4.5     5       6       7       8       9       10
0.7        –       –       –      –       1.493  1.379  0.229   −0.840  3.284   5.701   2.665   −1.159  −1.299
0.8        –       –       3.135  −1.212  1.599  1.682  −0.052  −0.592  3.204   2.274   −0.135  −1.796  −1.590
0.9        –       2.897   3.771  3.211   1.417  1.196  2.237   −1.570  −1.932  −0.555  −1.320  −1.439  −1.581
1.0        −0.849  3.020   3.115  −0.122  0.328  3.288  3.342   3.103   1.351   −1.734  −1.523  −0.133  2.016
1.1        0.847   2.861   0.675  0.315   3.650  3.523  2.923   2.853   –       –       –       –       –

This table reports model comparisons based on likelihood ratio and Diebold–Mariano statistics. The total number of observations (both cross-sectional and time series), which equals 8,545 over the entire sample, times the difference between the logarithms of the SSEs of two models follows a χ² distribution asymptotically. We treat implied volatility variables as parameters. Thus, the degrees of freedom of the χ² distribution is 168 for the pairs SV2/SV1 and SV3/SV2, because SV2 and SV3 have four more parameters and 164 additional implied volatility variables than SV1 and SV2, respectively. The 1% critical value of χ²(168) is 214. The Diebold–Mariano statistics are calculated according to Equation (47.14) with a lag order q of 40 and follow an asymptotic standard normal distribution under the null hypothesis of equal pricing errors. A negative statistic means that the more sophisticated model has smaller pricing errors. Bold entries in the original table mean that the statistics are significant at the 5% level.
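The Diebold–Mariano comparisons reported in this and the following tables can be reproduced mechanically. The sketch below implements the statistic as the mean loss differential scaled by a Newey–West long-run variance with the lag order q = 40 mentioned in the table notes; the two loss series here are simulated placeholders, not the chapter's pricing errors.

import numpy as np

def diebold_mariano(loss_a, loss_b, q=40):
    # DM statistic for equal expected loss; negative means model B
    # (the more sophisticated one) has the smaller loss.
    d = np.asarray(loss_b) - np.asarray(loss_a)
    T = len(d)
    dbar = d.mean()
    dc = d - dbar
    lrv = dc @ dc / T                       # autocovariance at lag 0
    for j in range(1, q + 1):
        w = 1.0 - j / (q + 1.0)             # Bartlett kernel weight
        lrv += 2.0 * w * (dc[j:] @ dc[:-j]) / T
    return dbar / np.sqrt(lrv / T)

rng = np.random.default_rng(4)
T = 160
loss_simple = (0.06 + 0.01 * rng.standard_normal(T)) ** 2   # squared pct errors
loss_rich = (0.05 + 0.01 * rng.standard_normal(T)) ** 2
print("DM statistic:", round(diebold_mariano(loss_simple, loss_rich), 3))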
Table 47.9 Average percentage pricing errors of stochastic volatility models

Panel A. Average percentage pricing errors of SV1

Moneyness  1.5      2       2.5      3        3.5      4        4.5      5        6        7        8        9        10
0.7        –        –       –        –        0.0489   0.0437   0.0308   0.0494   0.0431   0.0466   0.0310   0.0300   0.0280
0.8        –        –       0.0434   0.0476   0.0462   0.0378   0.0322   0.0506   0.0367   0.0365   0.0226   0.0249   0.0139
0.9        –        0.1092  0.0534   0.0379   0.0398   0.0288   0.0226   0.0377   0.0178   0.0145   −0.0026  0.0068   −0.0109
1.0        0.0293   0.1217  0.0575   0.0194   0.0252   0.0105   −0.0012  0.0110   −0.0129  −0.0221  −0.0299  −0.0192  −0.0432
1.1        −0.1187  0.0604  −0.0029  −0.0336  −0.0212  −0.0397  −0.0438  −0.0292  –        –        –        –        –

Panel B. Average percentage pricing errors of SV2

Moneyness  1.5      2       2.5      3       3.5      4        4.5      5        6        7       8        9        10
0.7        –        –       –        –       0.0482   0.0425   0.0304   0.0524   0.0544   0.0663  0.0456   0.0304   0.0378
0.8        –        –       0.0509   0.0510  0.0443   0.0320   0.0258   0.0486   0.0472   0.0586  0.0344   0.0138   0.0202
0.9        –        0.1059  0.0498   0.0421  0.0333   0.0145   0.0069   0.0284   0.0265   0.0392  0.0054   −0.0184  −0.0080
1.0        −0.0002  0.0985  0.0369   0.0231  0.0134   −0.0123  −0.0261  −0.0050  −0.0042  0.0080  −0.0240  −0.0572  −0.0403
1.1        −0.1056  0.0584  −0.0085  −0.0260 −0.0326  −0.0653  −0.0721  −0.0454  –        –       –        –        –

Panel C. Average percentage pricing errors of SV3

Moneyness  1.5      2       2.5      3        3.5      4        4.5      5        6        7       8        9        10
0.7        –        –       –        –        0.0340   0.0258   0.0122   0.0339   0.0361   0.0503  0.0344   0.0297   0.0402
0.8        –        –       0.0440   0.0412   0.0323   0.0180   0.0106   0.0332   0.0322   0.0468  0.0299   0.0244   0.0325
0.9        –        0.0917  0.0367   0.0433   0.0315   0.0100   0.0003   0.0208   0.0186   0.0348  0.0101   0.0062   0.0158
1.0        −0.0126  0.0782  0.0198   0.0378   0.0227   −0.0081  −0.0259  −0.0073  −0.0079  0.0088  −0.0114  −0.0192  −0.0062
1.1        −0.1184  0.0314  −0.0323  −0.0229  −0.0340  −0.0712  −0.0815  −0.0562  –        –       –        –        –

This table reports average percentage pricing errors of difference caps with different moneyness and maturities (columns, in years) for the three stochastic volatility models. Average percentage pricing errors are defined as the difference between market price and model price divided by the market price.
Similarly, the smile in the cap market could be due to a market expectation of dramatically declining LIBOR rates. In this section, we examine the contribution of jumps in LIBOR rates to capturing the volatility smile. Our discussion of the performance of the SVJ models parallels that of the SV models.

The parameter estimates in Table 47.10 show that the three stochastic volatility factors of the SVJ models closely resemble those of the SV models. The "level" factor still has the most volatile stochastic volatility, followed by the "curvature" and the "slope" factors. With the inclusion of jumps, the stochastic volatility factors in the SVJ models, especially that of the "level" factor, tend to be less volatile than those of the SV models (lower long-run mean and volatility of volatility). Negative estimates of the volatility risk premium show that the volatilities of the longer maturity LIBOR rates under the forward measure have lower long-run means and faster speeds of mean reversion.

Most importantly, we find overwhelming evidence of strong negative jumps in LIBOR rates under the forward measure. To the extent that cap prices reflect market expectations of the future evolution of LIBOR rates, the evidence suggests that the market expected a dramatic decline in LIBOR rates over our sample period. Such an expectation might be justifiable given that the economy was in recession during a major part of our sample period. This is similar to the volatility skew in the index equity option market, which reflects investors' fear of a stock market crash such as that of 1987. Compared to the estimates from index options (see, e.g., Pan 2002), we see lower estimates of the jump intensity (about 1.5% per annum), but much higher estimates of the jump size. The positive estimates of the jump risk premium suggest that the jump magnitudes of longer maturity forward rates tend to be smaller. Under SVJ3, the mean relative jump size, exp(μ_J + c_J(T_k − 1) + σ_J²/2) − 1, for one-, five-, and ten-year LIBOR rates is about −97%, −94%, and −80%, respectively. However, we do not find any incidents of negative moves in LIBOR rates under the physical measure with a size close to that under the forward measure. This large discrepancy between jump sizes under the physical and forward measures resembles that between the physical and risk-neutral measures for index options (see, e.g., Pan 2002). It could be the result of a huge jump risk premium.

The likelihood ratio tests in Panel A of Table 47.11 again overwhelmingly reject SVJ1 and SVJ2 in favor of SVJ2 and SVJ3, respectively. The Diebold–Mariano statistics in Panel A of Table 47.11 show that SVJ2 and SVJ3 have significantly smaller SSEs than SVJ1 and SVJ2, respectively, suggesting that the more sophisticated SVJ models significantly improve the pricing of all difference caps.
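The quoted jump sizes follow directly from the SVJ3 estimates in Table 47.10; the short check below evaluates μ^{k+1} = exp(μ_J + c_J(T_k − 1) + σ_J²/2) − 1 at the one-, five-, and ten-year maturities.

import math

mu_J, c_J, sigma_J = -3.8433, 0.2473, 0.0032      # SVJ3 estimates, Table 47.10
for T_k in (1.0, 5.0, 10.0):
    mu = math.exp(mu_J + c_J * (T_k - 1.0) + 0.5 * sigma_J ** 2) - 1.0
    print(f"T_k = {T_k:4.1f}: mean relative jump size = {mu:.1%}")

The output is approximately −97.9%, −94.2%, and −80.2%, matching the figures quoted above.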
Table 47.10 Parameter estimates of stochastic volatility and jump models

Parameter   SVJ1 estimate (std. err)   SVJ2 estimate (std. err)   SVJ3 estimate (std. err)
κ1          0.1377 (0.0085)            0.0062 (0.0057)            0.0069 (0.0079)
κ2          –                          0.0050 (0.0001)            0.0032 (0.0000)
κ3          –                          –                          0.0049 (0.0073)
v̄1          0.1312 (0.0084)            0.7929 (0.7369)            0.9626 (1.1126)
v̄2          –                          0.3410 (0.0030)            0.2051 (0.0021)
v̄3          –                          –                          0.2628 (0.3973)
ξ1          0.8233 (0.0057)            0.7772 (0.0036)            0.6967 (0.0049)
ξ2          –                          0.0061 (0.0104)            0.0091 (0.0042)
ξ3          –                          –                          0.1517 (0.0035)
c1v         −0.0041 (0.0000)           −0.0049 (0.0000)           −0.0024 (0.0000)
c2v         –                          −0.0270 (0.0464)           −0.0007 (0.0006)
c3v         –                          –                          −0.0103 (0.0002)
λ           0.0134 (0.0001)            0.0159 (0.0001)            0.0132 (0.0001)
μ_J         −3.8736 (0.0038)           −3.8517 (0.0036)           −3.8433 (0.0063)
c_J         0.2632 (0.0012)            0.3253 (0.0010)            0.2473 (0.0017)
σ_J         0.0001 (3.2862)            0.0003 (0.8723)            0.0032 (0.1621)
Objective   0.0748                     0.0670                     0.0622

This table reports parameter estimates and standard errors of the one-, two-, and three-factor stochastic volatility and jump models. The estimates are obtained by minimizing the sum of squared percentage pricing errors (SSE) of difference caps in 53 moneyness and maturity categories observed at a weekly frequency from August 1, 2000 to September 23, 2003. The objective functions reported in the table are re-scaled SSEs over the entire sample at the estimated model parameters and equal the RMSEs of difference caps. The volatility risk premium of the i-th stochastic volatility factor and the jump risk premium for forward measure Q^{k+1} are defined as η_i^{k+1} = c_{iv}(T_k − 1) and μ_J^{k+1} = μ_J + c_J(T_k − 1), respectively.
Table 47.11 Comparison of the performance of stochastic volatility and jump models

Panel A. Likelihood ratio and Diebold–Mariano statistics for overall model performance based on SSEs

Models      Likelihood ratio statistic χ²(168)   Diebold–Mariano statistic
SVJ2–SVJ1   1,886                                −2.240
SVJ3–SVJ2   1,256                                −7.149

Panel B. Diebold–Mariano statistics between SVJ2 and SVJ1 for individual difference caps based on squared percentage pricing errors

Moneyness  1.5     2       2.5     3       3.5     4       4.5     5       6       7      8       9       10
0.7        –       –       –       –       −0.234  −0.308  −1.198  −0.467  −0.188  2.550  −1.469  −0.605  −1.920
0.8        –       –       −1.346  −0.537  −0.684  −1.031  −1.892  −1.372  −0.684  3.191  3.992   2.951   3.778
0.9        –       −1.530  −1.914  −0.934  −1.036  −1.463  3.882   3.253   −0.920  2.784  −1.408  3.411   2.994
1.0        −0.094  3.300   2.472   −1.265  −1.358  −1.647  2.186   2.020   −0.573  0.182  −0.551  −1.542  −1.207
1.1        −1.395  5.341   −0.775  0.156   −0.931  3.141   3.000   2.107   –       –      –       –       –

Panel C. Diebold–Mariano statistics between SVJ3 and SVJ2 for individual difference caps based on squared percentage pricing errors

Moneyness  1.5     2       2.5     3       3.5     4       4.5     5       6       7       8       9       10
0.7        –       –       –       –       0.551   0.690   −1.023  −1.023  −1.133  0.675   −0.240  −0.774  −0.180
0.8        –       –       −1.036  −0.159  1.650   1.609   −1.714  −1.898  −0.778  −0.365  −0.749  −1.837  −1.169
0.9        –       −1.235  2.057   −0.328  2.108   1.183   2.011   −1.361  −0.249  −1.588  2.395   3.287   −0.686
1.0        −1.594  −1.245  2.047   −0.553  −0.289  −0.463  2.488   −1.317  2.780   −1.674  −1.396  2.540   −0.799
1.1        −0.877  −1.583  −0.365  −0.334  −1.088  −2.040  3.302   −1.259  –       –       –       –       –

This table reports model comparisons based on likelihood ratio and Diebold–Mariano statistics. The total number of observations (both cross-sectional and time series), which equals 8,545 over the entire sample, times the difference between the logarithms of the SSEs of two models follows a χ² distribution asymptotically. We treat implied volatility variables as parameters. Thus, the degrees of freedom of the χ² distribution is 168 for the pairs SVJ2/SVJ1 and SVJ3/SVJ2, because SVJ2 and SVJ3 have four more parameters and 164 additional implied volatility variables than SVJ1 and SVJ2, respectively. The 1% critical value of χ²(168) is 214. The Diebold–Mariano statistics are calculated according to Equation (47.14) with a lag order q of 40 and follow an asymptotic standard normal distribution under the null hypothesis of equal pricing errors. A negative statistic means that the more sophisticated model has smaller pricing errors. Bold entries in the original table mean that the statistics are significant at the 5% level.
The Diebold–Mariano statistics of squared percentage pricing errors of individual difference caps in Panel B of Table 47.11 show that SVJ2 significantly improves the performance of SVJ1 for long-, mid-, and short-term around-the-money difference caps. The Diebold–Mariano statistics in Panel C of Table 47.11 show that SVJ3 significantly reduces the pricing errors of SVJ2 for long-term ITM and some mid- and short-term around-the-money difference caps. Table 47.12 shows that the average percentage pricing errors also improve over those of the SV models.

Table 47.13 compares the performance of the SVJ and SV models. During the first 20 weeks of our sample, the SVJ models have much higher RMSEs than the SV models. As a result, the likelihood ratio and Diebold–Mariano statistics between the three pairs of SVJ and SV models over the entire sample are somewhat smaller than those for the sample period without the first 20 weeks. Nonetheless, all the SV models are overwhelmingly rejected in favor of their corresponding SVJ models by both tests. The Diebold–Mariano statistics of individual difference caps in Panels B, C, and D show that the SVJ models significantly improve the performance of the SV models for most difference caps across moneyness and maturity. The most interesting results are in Panel D, which shows that SVJ3 significantly reduces the pricing errors of most ITM difference caps of SV3, strongly suggesting that the negative jumps are essential for capturing the asymmetric smile in the cap market.

Our analysis shows that a low-dimensional model with three principal components driving the forward rate curve, stochastic volatility of each component, and strong negative jumps captures the volatility smile in the cap markets reasonably well. The three yield factors capture the variations of the levels of LIBOR rates, while the stochastic volatility factors are essential for capturing the time-varying volatilities of LIBOR rates. Even though the SV models can price ATM caps reasonably well, they fail to capture the volatility smile in the cap market. Instead, significant negative jumps in LIBOR rates are needed to capture the smile. These results highlight the importance of studying the pricing of caps across moneyness: the importance of negative jumps is revealed only through the pricing of away-from-the-money caps.

Excluding the first 20 weeks and the two special periods, SVJ3 has a reasonably good pricing performance, with an average RMSE of 4.5%. Given that the bid–ask spread is about 2–5% in our sample for ATM caps, and because ITM and OTM caps tend to have even higher percentage spreads,²¹ this can be interpreted as a good performance. Despite its good performance, there are strong indications that SVJ3 is misspecified, and the inadequacies of the model seem to be related to the MBS markets. For example, while SVJ3 works reasonably well for most of the sample period, it has large pricing errors in several special periods coinciding with high prepayment activities in the MBS markets. Moreover, even though we assume that the stochastic volatility factors are independent of LIBOR rates, Table 47.14 shows strong negative correlations between the implied volatility variables of the first factor and the LIBOR rates. This result suggests that when interest rates are low, cap prices become too high for the model to capture, and the implied volatilities have to become abnormally high to fit the observed cap prices. One possible explanation of this "leverage" effect is that higher demand for caps to hedge prepayments from MBS markets in low interest rate environments could artificially push up cap prices and implied volatilities. Therefore, extending our models to incorporate factors from the MBS markets seems to be a promising direction for future research.

²¹ See, for example, Deuskar et al. (2003).
47.4 Nonparametric Estimation of the Forward Density

For LIBOR-based instruments such as caps, floors, and swaptions, it is convenient to consider pricing using the forward measure approach. We will therefore focus on the dynamics of the LIBOR forward rate L_k(t) under the forward measure Q^{k+1}, which is essential for pricing caplets maturing at T_{k+1}. Under this measure, the discounted price of any security using D_{k+1}(t) as the numeraire is a martingale. Thus, the time-t price of a caplet maturing at T_{k+1} with a strike price of X is

C(L_k(t), X, t, T_k) = δ D_{k+1}(t) ∫_X^∞ (y − X) p^{Q^{k+1}}(L_k(T_k) = y | L_k(t)) dy, (47.38)

where p^{Q^{k+1}}(L_k(T_k) = y | L_k(t)) is the conditional density of L_k(T_k) under the forward measure Q^{k+1}. Once we know the forward density, we can price any security whose payoff on T_{k+1} depends only on L_k(t) by discounting its expected payoff under Q^{k+1} using D_{k+1}(t).

Existing term structure models rely on parametric assumptions on the distribution of L_k(t) to obtain closed-form pricing formulae for caplets. For example, the standard LIBOR market models of Brace et al. (1997) and Miltersen et al. (1997) assume that L_k(t) follows a log-normal distribution and price caplets using the Black formula. The models of Jarrow et al. (2007) assume that L_k(t) follows the affine jump-diffusions of Duffie et al. (2000).
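Equation (47.38) reduces caplet pricing to a one-dimensional integration against the forward density. The sketch below carries this out on a grid, using a lognormal density as an illustrative stand-in for the nonparametric estimate; the accrual, discount factor, and strike are hypothetical.

import numpy as np

delta, D_next = 0.25, 0.96                # accrual and D_{k+1}(t), hypothetical
L0, X, sigma, Tk = 0.05, 0.05, 0.25, 2.0

y = np.linspace(1e-4, 0.25, 20_000)       # grid for L_k(T_k)
s = sigma * np.sqrt(Tk)
mu = np.log(L0) - 0.5 * s**2              # martingale mean under Q^{k+1}
density = np.exp(-(np.log(y) - mu)**2 / (2 * s**2)) / (y * s * np.sqrt(2 * np.pi))

f = np.maximum(y - X, 0.0) * density      # discounted-payoff integrand
integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(y))   # trapezoid rule
caplet = delta * D_next * integral
print("caplet price:", round(caplet, 6))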
Table 47.12 Average percentage pricing errors of stochastic volatility and jump models

Panel A. Average percentage pricing errors of SVJ1

Moneyness  1.5      2       2.5      3        3.5      4        4.5      5        6        7        8       9        10
0.7        –        –       –        –        0.0261   0.0176   0.0008   0.0170   0.0085   0.0167   0.0008  −0.0049  −0.0021
0.8        –        –       0.0140   0.0249   0.0223   0.0115   0.0027   0.0185   0.0016   0.0131   0.0040  −0.0008  −0.0063
0.9        –        0.0682  0.0146   0.0155   0.0182   0.0073   −0.0002  0.0129   −0.0108  0.0072   0.0044  0.0048   −0.0092
1.0        −0.0090  0.0839  0.0233   0.0054   0.0142   0.0033   −0.0068  0.0047   −0.0232  −0.0010  0.0190  0.0206   −0.0058
1.1        −0.0980  0.0625  −0.0038  −0.0242  −0.0085  −0.0199  −0.0182  −0.0028  –        –        –       –        –

Panel B. Average percentage pricing errors of SVJ2

Moneyness  1.5      2       2.5      3        3.5     4        4.5      5        6        7       8        9        10
0.7        –        –       –        –        0.0243  0.0148   −0.0008  0.0188   0.0175   0.0279  0.0116   0.0106   0.0256
0.8        –        –       0.0232   0.0271   0.0211  0.0062   −0.0035  0.0172   0.0137   0.0255  0.0081   0.0061   0.0139
0.9        –        0.0698  0.0190   0.0205   0.0172  −0.0012  −0.0119  0.0068   0.0039   0.0198  −0.0041  −0.0047  −0.0020
1.0        −0.0375  0.0668  0.0130   0.0131   0.0150  −0.0058  −0.0214  −0.0047  −0.0054  0.0127  −0.0058  −0.0112  −0.0128
1.1        −0.0890  0.0612  −0.0048  −0.0094  0.0003  −0.0215  −0.0273  −0.0076  –        –       –        –        –

Panel C. Average percentage pricing errors of SVJ3

Moneyness  1.5      2       2.5      3        3.5      4        4.5      5       6       7       8        9        10
0.7        –        –       –        –        0.0164   0.0073   −0.0092  0.0100  0.0102  0.0209  −0.0001  −0.0061  0.0077
0.8        –        –       0.0222   0.0167   0.0116   −0.0014  −0.0091  0.0111  0.0070  0.0207  −0.0009  −0.0076  0.0053
0.9        –        0.0713  0.0140   0.0132   0.0112   −0.0035  −0.0103  0.0104  0.0038  0.0204  −0.0062  −0.0114  0.0042
1.0        −0.0204  0.0657  0.0050   0.0160   0.0158   −0.0004  −0.0105  0.0105  0.0062  0.0194  0.0013   −0.0083  0.0094
1.1        −0.0688  0.0528  −0.0200  −0.0144  −0.0086  −0.0255  −0.0199  0.0094  –       –       –        –        –

This table reports average percentage pricing errors of difference caps with different moneyness and maturities (columns, in years) for the three stochastic volatility and jump models. Average percentage pricing errors are defined as the difference between market price and model price divided by the market price.
Table 47.13 Comparison of the performance of SV and SVJ models

Panel A. Likelihood ratio and Diebold–Mariano statistics for overall model performance based on SSEs

Models     LR statistic χ²(4),   LR statistic χ²(4),        D–M statistic,   D–M statistic,
           whole sample          without first 20 weeks     whole sample     without first 20 weeks
SVJ1–SV1   1,854                 2,437                      −2.972           −3.006
SVJ2–SV2   2,115                 2,688                      −3.580           −4.017
SVJ3–SV3   1,814                 2,497                      −3.078           −3.165

Panel B. Diebold–Mariano statistics between SVJ1 and SV1 for individual difference caps based on squared percentage pricing errors (without first 20 weeks)

Moneyness  1.5     2       2.5     3       3.5     4       4.5     5       6       7       8       9       10
0.7        –       –       –       –       3.050   3.504   −0.504  −1.904  4.950   3.506   3.827   3.433   3.073
0.8        –       –       3.243   12.68   9.171   −1.520  −0.692  −1.195  2.245   1.986   −1.920  2.614   −1.239
0.9        –       7.162   9.488   2.773   −1.948  −1.030  −1.087  −0.923  −0.820  −0.934  −1.176  4.119   −1.305
1.0        0.670   5.478   2.844   −1.294  −1.036  4.001   3.204   −1.812  2.331   −1.099  −1.699  −0.492  2.417
1.1        −0.927  −0.435  −0.111  −0.261  −1.870  2.350   −1.710  −0.892  –       –       –       –       –

Panel C. Diebold–Mariano statistics between SVJ2 and SV2 for individual difference caps based on squared percentage pricing errors (without first 20 weeks)

Moneyness  1.5     2       2.5    3       3.5     4       4.5     5       6       7       8       9      10
0.7        –       –       –      –       9.696   7.714   2.879   3.914   14.01   7.387   7.865   3.248  −1.637
0.8        –       –       7.353  5.908   5.612   2.917   −1.669  2.277   7.591   4.610   4.182   0.397  2.377
0.9        –       7.013   6.271  2.446   2.145   −1.246  −1.047  −1.309  2.856   −1.867  −0.183  0.239  3.098
1.0        1.057   6.025   2.736  −1.159  −0.688  4.478   4.410   3.754   −0.404  −0.416  −0.881  2.504  0.023
1.1        −0.969  −0.308  0.441  −1.284  2.179   3.148   2.874   2.267   –       –       –       –      –

Panel D. Diebold–Mariano statistics between SVJ3 and SV3 for individual difference caps based on squared percentage pricing errors (without first 20 weeks)

Moneyness  1.5     2       2.5     3       3.5     4       4.5     5       6       7       8       9       10
0.7        –       –       –       –       7.040   8.358   7.687   10.49   7.750   5.817   5.140   2.068   2.182
0.8        –       –       4.732   7.373   21.74   7.655   5.145   4.774   6.711   3.030   2.650   −1.353  −1.406
0.9        –       1.980   2.571   2.501   6.715   3.622   1.985   2.384   −1.938  −1.114  −0.768  −1.109  −1.166
1.0        −0.530  −1.124  −1.305  −1.353  −1.909  −0.880  1.023   0.052   0.543   2.110   −0.359  2.151   2.237
1.1        −1.178  1.395   −1.424  2.218   −1.834  2.151   −1.537  −0.337  –       –       –       –       –

This table reports model comparisons based on likelihood ratio and Diebold–Mariano statistics. The total number of observations (both cross-sectional and time series), which equals 8,545 over the entire sample and 7,485 without the first 20 weeks, times the difference between the logarithms of the SSEs of two models follows a χ² distribution asymptotically. We treat implied volatility variables as parameters. Thus, the degrees of freedom of the χ² distribution is 4 for the pairs SVJ1/SV1, SVJ2/SV2, and SVJ3/SV3, because the SVJ models have four more parameters and an equal number of additional implied volatility variables as the corresponding SV models. The 1% critical value of χ²(4) is 13. The Diebold–Mariano statistics are calculated according to Equation (47.14) with a lag order q of 40 and follow an asymptotic standard normal distribution under the null hypothesis of equal pricing errors. A negative statistic means that the more sophisticated model has smaller pricing errors. Bold entries in the original table mean that the statistics are significant at the 5% level.
Table 47.14 Correlations between LIBOR rates and implied volatility variables

         L(t,1)    L(t,3)    L(t,5)    L(t,7)    L(t,9)    V1(t)     V2(t)     V3(t)
V1(t)    −0.8883   −0.8772   −0.8361   −0.7964   −0.7470   1         −0.4163   0.3842
V2(t)    0.1759    0.2350    0.2071    0.1545    0.0828    −0.4163   1         −0.0372
V3(t)    −0.5951   −0.4850   −0.4139   −0.3541   −0.3262   0.3842    −0.0372   1

This table reports the correlations between LIBOR rates and the implied volatility variables from SVJ3. Given the parameter estimates of SVJ3 in Table 47.10, the implied volatility variables are estimated at each t by minimizing the SSEs of all difference caps at t.
47.4.1 Nonparametric Method

We estimate the distribution of L_k(T_k) under Q^{k+1} using the prices of a cross section of caplets that mature at T_{k+1} and have different strike prices. Following Breeden and Litzenberger (1978), we know that the density of L_k(T_k) under Q^{k+1} is proportional to the second derivative of C(L_k(t), t, T_k, X) with respect to X:

p^{Q^{k+1}}(L_k(T_k) | L_k(t)) = (1/(δ D_{k+1}(t))) ∂²C(L_k(t), t, T_k, X)/∂X² |_{X=L_k(T_k)}.   (47.39)

In standard LIBOR market models, it is assumed that the conditional density of L_k(T_k) depends only on the current LIBOR rate. This assumption, however, can be overly restrictive given the multifactor nature of term structure dynamics. For example, while the level factor can explain a large fraction (between 80 and 90%) of the variation of LIBOR rates, the slope factor still has significant explanatory power for interest rate variations. Moreover, there is overwhelming evidence that interest rate volatility is stochastic,22 and it has been suggested that interest rate volatility is unspanned in the sense that it cannot be fully explained by yield curve factors such as the level and slope factors.

One important innovation of our study is that we allow the volatility of L_k(t) to be stochastic and the conditional density of L_k(T_k) to depend not only on the level, but also on the slope and volatility factors of LIBOR rates. Denote the conditioning variables as Z(t) = {s(t), v(t)}, where s(t) (the slope factor) is the difference between the 10- and 2-year LIBOR forward rates and v(t) (the volatility factor) is the first principal component of EGARCH-filtered spot volatilities of LIBOR rates across all maturities. Under this generalization, the conditional density of L_k(T_k) under the forward measure Q^{k+1} is given by

p^{Q^{k+1}}(L_k(T_k) | L_k(t), Z(t)) = (1/(δ D_{k+1}(t))) ∂²C(L_k(t), X, t, T_k, Z(t))/∂X² |_{X=L_k(T_k)}.   (47.40)

Next we discuss how to estimate the SPDs by combining the forward and physical densities of LIBOR rates. Denote an SPD function as φ. In general, φ depends on multiple economic factors, and it is impossible to estimate it using interest rate caps alone. Given the available data, all we can estimate is the projection of φ onto the future spot rate L_k(T_k):

φ_k(L_k(T_k); L_k(t), Z(t)) = E_t^P[φ | L_k(T_k); L_k(t), Z(t)],   (47.41)

where the expectation is taken under the physical measure. Then the price of the caplet can be calculated as

C(L_k(t), X, t, T_k, Z(t)) = δ E_t^P[φ (L_k(T_k) − X)^+] = δ ∫_X^∞ φ_k(y)(y − X) p^P(L_k(T_k) = y | L_k(t), Z(t)) dy,   (47.42)

where the second equality is due to iterated expectations and p^P(L_k(T_k) = y | L_k(t), Z(t)) is the conditional density of L_k(T_k) under the physical measure.

22 See Andersen and Lund (1997), Ball and Torous (1999), Brenner et al. (1996), Chen and Scott (2001), and many others.
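As an illustration of Equation (47.39), the following minimal sketch estimates the forward density from a cross section of caplet prices by taking a finite-difference second derivative with respect to the strike. The inputs (strike grid, price curve, accrual δ, and discount factor D_{k+1}(t)) are hypothetical.

```python
import numpy as np

def forward_density(strikes, caplet_prices, delta, discount):
    """Second strike-derivative by central finite differences,
    scaled by 1/(delta * D_{k+1}(t)) as in Equation (47.39)."""
    h = strikes[1] - strikes[0]                    # uniform grid spacing
    d2C = (caplet_prices[2:] - 2 * caplet_prices[1:-1]
           + caplet_prices[:-2]) / h**2            # d^2 C / dX^2
    return strikes[1:-1], d2C / (delta * discount)

# Usage with made-up inputs: a toy convex price curve on a strike grid.
X = np.linspace(0.01, 0.10, 50)
C = np.maximum(0.05 - X, 0.0) ** 2 + 1e-4
grid, pdf = forward_density(X, C, delta=0.25, discount=0.98)
```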
Comparing Equations (47.2) and (47.6), we have

φ_k(L_k(T_k); L_k(t), Z(t)) = D_{k+1}(t) p^{Q^{k+1}}(L_k(T_k) | L_k(t), Z(t)) / p^P(L_k(T_k) | L_k(t), Z(t)).   (47.43)

Therefore, by combining the densities of L_k(T_k) under Q^{k+1} and P, we can estimate the projection of φ onto L_k(T_k). The SPDs contain rich information on how risks are priced in financial markets. While Aït-Sahalia and Lo (1998, 2000), Jackwerth (2000), Rosenberg and Engle (2002), and others estimate the SPDs using index options (i.e., the projection of φ onto index returns), our analysis based on interest rate caps documents the dependence of the SPDs on term structure factors. Similar to many existing studies, to reduce the dimensionality of the problem, we further assume that the caplet price is homogeneous of degree 1 in the current LIBOR rate:

C(L_k(t), X, t, T_k, Z(t)) = δ D_{k+1}(t) L_k(t) C^M(M_k(t), t, T_k, Z(t)),   (47.44)

where M_k(t) = X/L_k(t) represents the moneyness of the caplet. Hence, for the rest of the paper we estimate the forward density of L_k(T_k)/L_k(t) as the second derivative of the price function C^M with respect to M:

p^{Q^{k+1}}(L_k(T_k)/L_k(t) | Z(t)) = (1/(δ D_{k+1}(t))) ∂²C^M(M_k(t), t, T_k, Z(t))/∂M² |_{M=L_k(T_k)/L_k(t)}.   (47.45)

47.4.2 Empirical Results

In this section, we present nonparametric estimates of the probability densities of LIBOR rates under the physical and forward martingale measures. In particular, we document the dependence of the forward densities on the slope and volatility factors of LIBOR rates. Figure 47.2 presents nonparametric estimates of the forward densities at different levels of the slope and volatility factors at the 2, 3, 4, and 5 year maturities. The two levels of the slope factor correspond to a flat and a steep forward curve, while the two levels of the volatility factor represent low and high volatility of LIBOR rates. The 95% confidence intervals are obtained through simulation. The forward densities should have a zero mean, since LIBOR rates under the appropriate forward measures are martingales. The expected log percentage changes of the LIBOR rates are slightly negative due to a Jensen's inequality adjustment. We normalize the forward densities so that they integrate to one. However, we do not have enough data at the right tail of the distribution at the 4 and 5 year maturities, and we do not extrapolate the data, to avoid potential biases. Figure 47.2 documents three important features of the nonparametric LIBOR forward densities. First, the lognormal assumption underlying the popular LIBOR market models is grossly violated in the data, and the forward densities across all maturities are significantly negatively skewed. Second, all the forward densities depend significantly on the slope of the term structure. For example, moving from a flat to a steep term structure, the forward densities across all maturities become much more dispersed and more negatively skewed. Third, the forward densities also depend on the volatility factor. Under both flat and steep term structures, the forward densities generally become more compact when the volatility factor increases. This is consistent with a mean-reverting volatility process: high volatility right now leads to low volatility in the future and to more compact forward densities. To better illustrate the dependence of the forward densities on the two conditioning variables, we also regress the quantiles of the forward densities on the two factors. We choose quantiles instead of moments of the forward densities in our regressions for two reasons. First, quantiles are much easier to estimate. While quantiles can be obtained from the CDF, which is the first derivative of the price function, moments require integration of the forward density, which is the second derivative of the price function. Second, a wide range of quantiles provides a better characterization of the forward densities than a few moments, especially for the tail behavior of the densities.
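For completeness, here is a one-function sketch of the density-ratio formula in Equation (47.43), assuming the forward and physical densities have already been estimated on a common grid (e.g., the forward density from cap prices as above and the physical density from a kernel estimator of realized rates).

```python
import numpy as np

def spd_projection(forward_pdf, physical_pdf, discount, eps=1e-12):
    # phi_k(y) = D_{k+1}(t) * pQ(y) / pP(y), evaluated pointwise on a grid.
    return discount * forward_pdf / np.maximum(physical_pdf, eps)
```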
Fig. 47.2 Nonparametric estimates of the LIBOR forward densities at different levels of the slope and volatility factors. The slope factor is defined as the difference between the 10- and 2-year 3-month LIBOR forward rates. The volatility factor is defined as the first principal component of EGARCH-filtered spot volatilities and has been normalized to a mean of one. The two levels of the slope factor correspond to flat and steep term structures, while the two levels of the volatility factor correspond to low and high levels of volatility. Panels (a)-(d) plot the forward density against log(spot/forward) at the 2-, 3-, 4-, and 5-year maturities for (Slope, V) = (0.3, 0.9), (2.4, 0.9), (0.3, 1.2), and (2.4, 1.2).
Suppose we consider I and J levels of the transformed slope and volatility factors in our empirical analysis. For a given level of the two conditioning variables (s_i, v_j), we first obtain a nonparametric estimate of the forward density at a given maturity and its quantiles Q_x(s_i, v_j), where x can range from 0 to 100%. Then we consider the following regression model
Q_x(s_i, v_j) = b_0^x + b_1^x s_i + b_2^x v_j + b_3^x s_i v_j + ε^x,   (47.46)

where i = 1, 2, ..., I and j = 1, 2, ..., J. We include the interaction term to capture potential nonlinear dependence of the forward densities on the two conditioning variables.
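A minimal sketch of estimating the coefficients in Equation (47.46) by ordinary least squares over the I x J grid of conditioning levels; the quantile estimates Q are assumed to come from the nonparametric step above.

```python
import numpy as np

def quantile_regression(Q, s_levels, v_levels):
    # Q: I x J array of quantile estimates Q_x(s_i, v_j) for one quantile level x.
    s, v = np.meshgrid(s_levels, v_levels, indexing="ij")
    s, v, q = s.ravel(), v.ravel(), Q.ravel()
    X = np.column_stack([np.ones_like(s), s, v, s * v])   # [1, s, v, s*v]
    b, *_ = np.linalg.lstsq(X, q, rcond=None)
    return b  # (b0, b1, b2, b3) for this quantile level
```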
Figure 47.3 reports regression coefficients of the slope and volatility factors for the most complete range of quantiles at each maturity, i.e., b_1^x and b_2^x as a function of x. While Fig. 47.3 includes only the slope and volatility factors as explanatory variables, Fig. 47.4 contains their interaction term
as well. Though in results not reported here we also include lagged conditioning variables in our regressions, their coefficients are generally not statistically significant.
Fig. 47.3 Impacts of the slope and volatility factors on LIBOR forward densities. This figure reports regression coefficients of different quantiles of the forward densities at 2, 3, 4, and 5 year maturities on the slope and volatility factors of LIBOR rates in Equation (47.46) without the interaction term. Panels (a)-(d) plot the slope and volatility coefficients against the quantile at the 2-, 3-, 4-, and 5-year maturities.
The regression results in Fig. 47.3 are generally consistent with the main findings in Fig. 47.2. The slope coefficients are generally negative (positive) for the left (right) half of the distribution and become more negative or positive at both tails. Consistent with Fig. 47.2, this result suggests that when the term structure steepens, the forward densities become more dispersed, and the effect is more pronounced at both tails. One exception is that the slope coefficients become negative and statistically insignificant at the right tail at the 5 year maturity. The coefficients of the volatility factor are generally positive (negative) for the left (right) half of the distribution. Although the volatility coefficients start to turn positive at the right tail of the distribution, they are not statistically significant. These results suggest that higher volatility leads to more compact forward densities, a result generally consistent with Fig. 47.2. In Fig. 47.4, although the slope coefficients exhibit patterns similar to those in Fig. 47.3, the interaction term changes the volatility coefficients quite significantly. The volatility coefficients become largely insignificant and exhibit quite different patterns from those in Fig. 47.3. For example, the volatility coefficients at the 2 and 3 year maturities are largely constant across quantiles. At the 4 and 5 year maturities, they even become negative (positive) for the left (right) half of the distribution. On the other hand, the coefficients of the interaction term exhibit patterns similar to those of the volatility coefficients in Fig. 47.3. These results suggest that the impact of volatility on the forward densities depends on the slope of the term structure. Figure 47.5 presents the volatility coefficients at different levels of the slope factor (i.e., b̂_2^x + b̂_3^x s_i, where s_i = 0.3 or 2.4). We see clearly that the impact of volatility on the forward densities depends significantly on the slope factor. With a flat term structure, the volatility coefficients generally increase with the quantiles, especially at the 3, 4, and 5 year maturities.
Fig. 47.4 Impacts of the slope and volatility factors (with their interaction term) on LIBOR forward densities. This figure reports regression coefficients of different quantiles of the forward densities at 2, 3, 4, and 5 year maturities on the slope and volatility factors of LIBOR rates and their interaction term in Equation (47.46). Panels (a)-(d) plot the slope, volatility, and interaction coefficients against the quantile at the 2-, 3-, 4-, and 5-year maturities.
The volatility coefficients are generally negative (positive) for the left (right) tail of the distribution, although not all of them are statistically significant. However, with a steep term structure, the volatility coefficients are generally positive (negative) for the left (right) half of the distribution for most maturities. Therefore, if the current volatility is high and the term structure is flat (steep), then volatility is likely to increase (decline) in the future.
Fig. 47.5 Nonlinear dependence of LIBOR forward densities on the volatility factor of LIBOR rates. This figure presents regression coefficients of quantiles of LIBOR forward densities on the volatility factor at different levels of the slope factor. The two levels of the slope factor (Slope = 0.3 and Slope = 2.4) represent flat and steep term structures. Panels (a)-(d) correspond to the 2-, 3-, 4-, and 5-year maturities.
We observed a flat term structure during the early part of our sample, when the Fed raised interest rates to slow down the economy. It could be that the market was more uncertain about the future state of the economy because it felt that a recession was imminent. On the other hand, we observed a steep term structure after the internet bubble burst and the Fed aggressively cut interest rates. It could be that the market felt the worst was over and was thus less uncertain about the future state of the economy. Our nonparametric analysis reveals important nonlinear dependence of the forward densities on both the slope and volatility factors of LIBOR rates. These results have important implications for one of the most important and controversial topics in the current term structure literature, namely the USV puzzle. While existing studies on USV mainly rely on parametric methods, our results provide nonparametric evidence on the importance of USV: even after controlling for important bond market factors, such as level and slope, the volatility factor still significantly affects the forward densities of LIBOR rates. Even though many existing term structure models treat volatility as a mean-reverting process, our results show that the speed of mean reversion of volatility is nonlinear and depends on the slope of the term structure. Some recent studies have documented interactions between activities in the mortgage and interest rate derivatives markets. For example, in an interesting study, Duarte (2008) shows that ATM swaption implied volatilities are highly correlated with prepayment activities in the mortgage markets. Duarte (2008) extends the string model of Longstaff et al. (2001) by allowing the volatility of LIBOR rates to be a function of the prepayment speed in the mortgage markets.
Fig. 47.6 Mortgage Bankers Association of America (MBAA) weekly refinancing and ARMs indexes. This figure reports the logs of the refinance (REFI) and ARMs indexes obtained from weekly surveys conducted by the MBAA; the y-axis is the log of the number of loan applications and the x-axis covers 2001-2004.
He shows that the new model has much smaller pricing errors for ATM swaptions than the original model with a constant volatility or a CEV model. Jarrow et al. (2007) also show that although their LIBOR model with stochastic volatility and jumps can price caps across moneyness reasonably well, the model pricing errors are unusually large during a few episodes with high prepayments in MBS. These findings suggest that if activities in the mortgage markets, notably the hedging activities of government-sponsored enterprises such as Fannie Mae and Freddie Mac, affect the supply/demand of interest rate derivatives, then this source of risk may not be fully spanned by the factors driving the evolution of the term structure.23 In this section, we provide nonparametric evidence on the impact of mortgage activities on LIBOR forward densities. Our analysis extends Duarte (2008) in several important dimensions. First, by considering caps across moneyness, we examine the impacts of mortgage activities on the entire forward densities. Second, by explicitly allowing LIBOR forward densities to depend on the slope and volatility factors of LIBOR rates, we examine whether prepayment still has incremental contributions in explaining interest rate option prices in the presence of these two factors.24 Finally, in addition to prepayment activities, we also examine the impacts of ARMs origination on the forward densities. Implicit in any ARM is an interest rate cap, which caps the mortgage rate at a certain level. Since lenders of ARMs implicitly sell a cap to the borrower, they might have incentives to hedge such exposures.25

23 See Jaffee (2003) and Duarte (2009) for excellent discussions on the use of interest rate derivatives by Fannie Mae and Freddie Mac in hedging interest rate risks.
24 While the slope factor can have a nontrivial impact on prepayment behavior, the volatility factor is crucial for pricing interest rate options.
25 We thank the referee for the suggestion of examining the effects of ARMs origination on the forward densities.
Our measures of prepayment and ARMs activities are the weekly refinancing and ARMs indexes based on the weekly surveys conducted by the MBAA, respectively. The two indexes, as plotted in Fig. 47.6, tend to be positively correlated with each other. There is an upward trend in ARMs activities during our sample period, which is consistent with what happened in the housing market in the past few years. To examine the impacts of mortgage activities on LIBOR forward densities, we repeat the above regressions including two additional explanatory variables that measure refinance and ARMs activities. Specifically, we refer to the top 20% of the observations of the refinance (ARMs) index as the high prepayment (ARMs) group. After obtaining a nonparametric forward density at a particular level of the two conditioning variables, we define two new variables, "Refi" and "ARMs," which measure the percentages of observations used in estimating the forward density that belong to the high prepayment and ARMs groups, respectively. These two variables allow us to test whether the forward densities behave differently when prepayment/ARMs activities are high. To control for potential collinearity among the explanatory variables, we orthogonalize any new explanatory variable with respect to the existing ones. Figure 47.7 contains the new regression results with "Refi" and "ARMs" for the four maturities. The coefficients of the slope, volatility, and interaction terms exhibit patterns similar to those in Fig. 47.4.26 The strongest impacts of ARMs on the forward densities occur at the 2 year maturity, as shown in Panel A of Fig. 47.7: high ARMs origination shifts the median and the right tail of the forward densities at the 2 year maturity toward the right. This finding is consistent with the notion that hedging demands from ARMs lenders for the cap they have
26 In results not reported, we find that the nonlinear dependence of the forward densities on the volatility factor remains the same as well.
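The orthogonalization step described above can be sketched as follows; the residual from regressing the new variable on the existing regressors (a simple least-squares projection) is what enters the quantile regressions.

```python
import numpy as np

def orthogonalize(new_var, existing):
    # Project new_var on [1, existing regressors] and keep the residual.
    X = np.column_stack([np.ones(len(new_var))] + list(existing))
    beta, *_ = np.linalg.lstsq(X, new_var, rcond=None)
    return new_var - X @ beta
```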
Fig. 47.7 Impacts of refinance and ARMs activities on LIBOR forward densities. In this figure, for each quantile of LIBOR forward densities at 2, 3, 4, and 5 year maturities, we report regression coefficients of the quantile on (1) the slope and volatility factors and their interaction term as in Equation (47.46); and (2) refinance and ARMs activities. Panels (a)-(d) plot the slope, volatility, interaction, REFI, and ARM coefficients against the quantile at the 2-, 3-, 4-, and 5-year maturities.
shorted might increase the price of OTM caps. One possible reason that the effects of ARMs are more pronounced at the 2 year maturity than at the 3, 4, and 5 year maturities is that most ARMs get reset within the first 2 years. While high ARMs activities shift the forward density at the 2 year maturity to the right, high refinance activities shift the forward densities at the 3, 4, and 5 year maturities to the left. We see that the coefficients of Refi at the left tail are significantly negative. While the coefficients are also significantly negative for the middle of the distribution (40-70% quantiles), the magnitude of these coefficients is much smaller. This can be seen in Panels B, C, and D of Fig. 47.7. Therefore, high prepayment activities lead to much more negatively skewed forward densities. This result is consistent with the notion that investors in MBS might demand OTM floors to hedge their potential losses from prepayments. The coefficients of Refi are more significant at the 4 and 5 year maturities because the duration of most MBS is close to 5 years. Our results confirm and extend the findings of Duarte (2008) by showing that mortgage activities affect the entire forward density and consequently the pricing of interest rate options across moneyness. While prepayment activities affect the left tail of the forward densities at intermediate maturities, ARMs activities affect the right tail of the forward densities at short maturity. Our findings hold even after controlling for the slope and volatility factors and suggest that part of the USV factors could be driven by activities in the mortgage markets.
47.5 Conclusion

The unspanned stochastic volatility puzzle is one of the most important topics in current term structure modeling. Similar to stochastic volatility in the equity options literature, the existence of USV challenges the benchmark of the current term structure literature, the dynamic term structure models. It also partly explains why practitioners generally apply HJM-type models to interest rate derivatives, even though these models are sometimes applied in an inconsistent manner across securities. Unlike the equity options literature, where the underlying follows a univariate process, it is more difficult to argue that the stochastic volatilities of yields are not spanned by the existing yield curve factors. In this paper we review the current literature, which is mostly in support of USV using either bond data or both bond and derivatives data. We present the results of Li and Zhao (2006), who show that DTSMs have serious difficulty in hedging interest rate caps. We also present the results of Li and Zhao (2009), who show nonparametrically that both the actual volatility of interest rates and the liquidity component of the implied volatility affect derivative prices after controlling for the yield curve factors. This paper also presents the model developed in Jarrow, Li, and Zhao (2007), which is parametrically rich enough to capture a spectrum of derivative prices. We can expect that USV will have an effect on interest rate derivatives similar to that of stochastic volatility in the equity options literature, with many more issues to be addressed in the future.
References Aït-Sahalia, Y. and A. W. Lo. 1998. “Nonparametric estimation of stateprice densities implicit in financial asset prices.” Journal of Finance, American Finance Association 53(2), 499–547. Aït-Sahalia, Y. and A. W. Lo. 2000. “Nonparametric risk management and implied risk aversion.” Journal of Econometrics 94, 9–51. Ahn, D.-H., R. F. Dittmar, A. R. Gallant, and B. Gao. 2003. “Purebred or hybrid?: Reproducing the volatility in term structure dynamics.” Journal of Econometrics 116, 147–180. Ahn, D.-H., R. F. Dittmar, and A. R. Gallant. 2002. “Quadratic term structure models: Theory and evidence.” Review of Financial Studies 15, 243–288. Ahn, D.-H. and B. Gao. 1999. A parametric nonlinear model of term structure dynamics. Review of Financial Studies 12(4), 721–762. Andersen, L. and R. Brotherton-Ratcliffe. 2001. Extended LIBOR market models with stochastic volatility, Working paper, Gen Re Securities. Andersen, T. G. and L. Benzoni. 2006. Do bonds span volatility risk in the U.S. Treasury market? A specification test of affine term structure models, Working paper, Northwestern University. Andersen, T. G. and J. Lund. 1997. “Estimating continuous time stochastic volatility models of the short term interest rate.” Journal of Econometrics 77, 343–378. Bansal, R. and H. Zhou. 2002. “Term structure of interest rates with regime shifts.” Journal of Finance 57, 1997–2043. Ball, C. and W. Torous. 1999. “The stochastic volatility of short-term interest rates: some international evidence.” Journal of Finance 54, 2339–2359. Bikbov, R. and M. Chernov. 2004. Term structure and volatility: lessons from the eurodollar markets, Working Paper, Columbia GSB. Brace, A., D. Gatarek, and M. Musiela. 1997. “The market model of interest rate dynamics.” Mathematical Finance 7, 127–155. Brandt, M. W. and D. Chapman. “Comparing multifactor models of the term structure,” Working paper. Breeden, D. T. and R. H. Litzenberger. 1978. “Prices of state-contingent claims implicit in option prices.” Journal of Business 51, 621–651. Brenner, R., R. Harjes, and K. Kroner. 1996. “Another look at alternative models of short-term interest rate.” Journal of Financial and Quantitative Analysis 31, 85–107. Casassus, J., P. Collin-Dufresne, and R. S. Goldstein. 2005. “Unspanned stochastic volatility and fixed income derivative pricing.” Journal of Banking and Finance 29, 2723–2749. Chacko, G. and S. Das. 2002. “Pricing interest rate derivatives: a general approach.” Review of Financial Studies, 195–241.
47 Unspanned Stochastic Volatilities and Interest Rate Derivatives Pricing Chen, R., and L. Scott. 2001. Stochastic volatility and jumps in interest rates: an empirical analysis, Working paper, Rutgers University. Chernov, M., and E. Ghysels. 2000. “A study towards a unified approach to the joint estimation of objective and risk neutral measures for the purpose of options valuation.” Journal of Financial Economics 56, 407–458. Collin-Dufresne, P. and R. S. Goldstein, 2002. “Do bonds span the fixed income markets? Theory and evidence for unspanned stochastic volatility.” Journal of Finance 57, 1685–1729. Collin-Dufresne, P. and R. S. Goldstein. 2003. Stochastic correlation and the relative pricing of caps and swaptions in a generalized affine framework, Working paper, Carnegie Mellon University. Collin-Dufresne, P., R. S. Goldstein, and J. Hugonnier. 2004. “A general formula for valuing defaultable securities.” Econometrica 72(5), 1377–1407. Dai, Q. and K. Singleton. 2000. “Specification analysis of affine term structure models.” Journal of Finance 50(5), 1943–1978. Dai, Q. and K. Singleton. 2002. “Expecttatio puzzles, time-varying risk premia, and affine models of the term structure.” Journal of Financial Economics 63(3), 415–441. Dai, Q. and K. Singleton. 2003. “Term structure dynamics in theory and reality.” Review of Financial Studies 16, 631–678. Deuskar, P., A. Gupta, and M. Subrahmanyam. 2003. Liquidity effects and volatility smiles in interest rate option markets, Working paper, New York University. Duarte, J. 2004. “Evaluating an alternative risk preference in affine term structure models.” Review of Financial Studies 17, 379–404. Duarte, J. 2008. “Mortgage-backed securities refinancing and the arbitrage in the swaption market.” Review of Financial Studies 21, 1689–1731. Duarte, J. 2009. “The causal effect of mortgage refinancing on interest rate volatility: empirical evidence and theoretical implications.” Review of Financial Studies 21, 1689–1731. Duffee, G. R. 2002. “Term premia and interest rate forecasts in affine models.” Journal of Finance 57, 405–443. Duffie, D. and R. Kan. 1996. “A yield-factor model of interest rates.” Mathematical Finance 6, 379–406. Duffie, D. and D. Lando. 2001. “Term structures of credit spreads with incomplete accounting information.” Econometrica 69(3), 633–664. Duffie, D., J. Pan, and K. Singleton. 2000. “Transform analysis and asset pricing for affine jump-diffusions.” Econometrica 68(6), 1343–1376. Eraker, B. 2003. “Do stock prices and volatility jump? Reconciling evidence from spot and option prices.” Journal of Finance 59, 1367–1404. Fan, R., A. Gupta, and P. Ritchken. 2003. “Hedging in the possible presence of unspanned stochastic volatility: evidence from swaption markets.” Journal of Finance 58, 2219–2248. Gallant, A. R. and G. Tauchen. 1998. “Reprojecting partially observed systems with applications to interest rate diffusions.” Journal of American Statistical Association 93, 10–24. Glasserman, P. and S. G. Kou. 2003. “The term structure of simple forward rates with jump risk.” Mathematical Finance, Blackwell Publishing, 13(3), 383–410. Goldstein, R. S. 2000. The term structure of interest rates as a random field. Review of Financial Studies 13, 365–384. Han, B. 2007. “Stochastic volatilities and correlations of bond yields.” Journal of Finance LXII(3).
Heath, D., R. Jarrow, and A. Morton. 1992. “Bond pricing and the term structure of interest rates: a new methodology.” Econometrica 60, 77–105. Heidari, M. and L. Wu. 2003. “Are interest rate derivatives spanned by the term structure of interest rates?” Journal of Fixed Income 13, 75–86. Heston, S. 1993. “A closed-form solution for options with stochastic volatility with applications to bond and currency options.” Review of Financial Studies 6, 327–343. Hull, J. and A. White. 1987. “The pricing of options on assets with stochastic volatilities.” Journal of Finance 42, 281–300. Jackwerth, J. C. 2000. “Recovering risk aversion from options prices and realized returns.” Review of Financial Studies 13, 433–451. Jaffee, D. 2003. “The interest rate risk of Fannie Mae and Freddie Mac.” Journal of Financial Services Research 24(1), 5–29. Jagannathan, R., A. Kaplin, and S. Sun. 2003. “An evaluation of multifactor CIR models using LIBOR, swap rates, and cap and swaption prices.” Journal of Econometrics 116, 113–146. Jarrow, R., H. Li, and F. Zhao. 2007. “Interest rate caps “Smile” too! but can the LIBOR market models capture it?” Journal of Finance 62, 345–382. Leippold, M. and L. Wu. 2002. “Asset pricing under the quadratic class.” Journal of Financial and Quantitative Analysis 37, 271–295. Li, H. and F. Zhao. 2006. “Unspanned stochastic volatility: evidence from hedging interest rate derivatives.” Journal of Finance 61, 341–378. Li, H. and F. Zhao. 2009. “Nonparametric estimation of state-price densities implicit in interest rate cap prices.” Review of Financial Studies 22(11), 4335–4376. Longstaff, F., P. Santa-Clara, and E. Schwartz. 2001. “The relative valuation of caps and swaptions: theory and evidence.” Journal of Finance 56, 2067–2109. Longstaff, F. and E. Schwartz. 1992. “Interest rate volatility and the term structure: a two-factor general equilibrium model.” Journal of Finance 47, 1259–1282. Miltersen, M., K. Sandmann, and D. Sondermann. 1997. “Closedform solutions for term structure derivatives with lognormal interest rates.” Journal of Finance 52, 409–430. Pan, J. 2002. “The jump-risk premia implicit in options: evidence from an integrated time-series study.” Journal of Financial Economics 63, 3–50. Piazzesi, M. 2005. “Bond yields and the federal reserve.” Journal of Political Economy 113(2), 311–344. Piazzesi, M. 2010. “Affine term structure models.” in Handbook of Financial Econometrics, North-Holland. Protter, P. 1990. “Stochastic integration and differential equations: a new approach.” Springer-Verlag. Rosenberg, J. and R. Engle. 2002. “Empirical pricing kernels.” Journal of Financial Economics 64(3), 341–372. Santa-Clara, P. and D. Sornette. 2001. “The dynamics of the forward interest rate curve with stochastic string shocks.” Review of Financial Studies 14, 2001. Thompson, S. 2004. “Identifying term structure volatility from the LIBOR-swap curve.” Review of Financial Studies 21, 819–854. Trolle A. B., and E. S. Schwartz, 2009. “A general stochastic volatility model for the pricing of interest rate derivatives.” Review of Financial Studies 22(5), 2007–2057.
Appendix 47A The Derivation for QTSMs
To guarantee the stationarity of the state variables, we assume that ξ permits the following eigenvalue decomposition,

ξ = U Λ U^{-1},

where Λ is the diagonal matrix of the eigenvalues, which take negative values, Λ ≡ diag[λ_i]_N, and U is the matrix of the eigenvectors of ξ, U ≡ [u_1 u_2 ... u_N]. The conditional distribution of the state variables X_t is multivariate Gaussian with conditional mean

E[X_{t+Δt} | X_t] = U Λ^{-1} [Φ − I_N] U^{-1} μ + U Φ U^{-1} X_t   (47.47)

and conditional variance

var[X_{t+Δt} | X_t] = U Θ U',   (47.48)

where Φ is a diagonal matrix with elements exp(λ_i Δt) for i = 1, ..., N, and Θ is an N-by-N matrix with elements

Θ_{ij} = v_{ij} (e^{Δt(λ_i + λ_j)} − 1) / (λ_i + λ_j),

where [v_{ij}]_{N×N} = U^{-1} Σ Σ' (U^{-1})'. With the specification of the market price of risk η(X_t), we can relate the risk-neutral measure Q to the physical measure P as follows:

dQ/dP |_{F_t} = exp( −∫_0^t η(X_s)' dW_s − (1/2) ∫_0^t η(X_s)' η(X_s) ds ),  for t ≤ T.

Applying Girsanov's theorem, we obtain the risk-neutral dynamics of the state variables

dX_t = [δ̃ + ξ̃ X_t] dt + Σ dW_t^Q,

where δ̃ and ξ̃ are the risk-neutral drift coefficients implied by the market price of risk (δ̃ = μ − Σ η_0 and ξ̃ = ξ − Σ η_1 for an affine price of risk η(X_t) = η_0 + η_1 X_t), and W_t^Q is an N-dimensional standard Brownian motion under measure Q. Under the above assumptions, a large class of fixed-income securities can be priced in (essentially) closed form (see Leippold and Wu 2002). We discuss the pricing of zero-coupon bonds and caps below. Let V(t, τ) be the time-t value of a zero-coupon bond that pays 1 dollar at time T (τ = T − t). In the absence of arbitrage, the discounted value process exp(−∫_0^t r(X_s) ds) V(t, τ) is a Q-martingale. Thus the value function must satisfy the fundamental PDE, which requires that the bond's instantaneous return equal the risk-free rate:

(1/2) tr[ Σ Σ' ∂²V(t,τ)/∂X_t ∂X_t' ] + (δ̃ + ξ̃ X_t)' ∂V(t,τ)/∂X_t + ∂V(t,τ)/∂t = r_t V(t,τ),   (47.49)

with the terminal condition V(t, 0) = 1. The solution takes the form

V(t, τ) = exp( −X_t' A(τ) X_t − b(τ)' X_t − c(τ) ),   (47.50)

where A(τ), b(τ), and c(τ) satisfy the following system of ordinary differential equations (ODEs):

∂A(τ)/∂τ = Ψ + A(τ) ξ̃ + ξ̃' A(τ) − 2 A(τ) Σ Σ' A(τ),
∂b(τ)/∂τ = β + 2 A(τ) δ̃ + ξ̃' b(τ) − 2 A(τ) Σ Σ' b(τ),
∂c(τ)/∂τ = α + b(τ)' δ̃ − (1/2) b(τ)' Σ Σ' b(τ) + tr[ Σ Σ' A(τ) ],

with A(0) = 0_{N×N}, b(0) = 0_N, and c(0) = 0. Consequently, the yield-to-maturity y(t, τ) is a quadratic function of the state variables:

y(t, τ) = (1/τ) [ X_t' A(τ) X_t + b(τ)' X_t + c(τ) ].

In contrast, in the ATSMs the yields are linear in the state variables, and therefore the correlations among the yields are solely determined by the correlations of the state variables. Although the state variables in the QTSMs follow a multivariate Gaussian process, the quadratic form of the yields helps to model the time-varying volatility and correlation of bond yields.
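Since the ODE system above has no general closed form, it is typically integrated numerically. The following is a minimal sketch using a generic ODE solver; all parameter values (Ψ, β, α, ξ̃, δ̃, ΣΣ') are hypothetical placeholders.

```python
import numpy as np
from scipy.integrate import solve_ivp

N = 2
Psi, beta, alpha = np.eye(N), np.ones(N), 0.01   # r = alpha + beta'X + X'Psi X
xi_t = -0.5 * np.eye(N)                          # xi-tilde (risk-neutral feedback)
delta_t = np.zeros(N)                            # delta-tilde (risk-neutral constant)
SS = (0.1 * np.eye(N)) @ (0.1 * np.eye(N)).T     # Sigma Sigma'

def rhs(tau, y):
    A = y[:N * N].reshape(N, N)
    b = y[N * N:N * N + N]
    dA = Psi + A @ xi_t + xi_t.T @ A - 2 * A @ SS @ A
    db = beta + 2 * A @ delta_t + xi_t.T @ b - 2 * A @ SS @ b
    dc = alpha + b @ delta_t - 0.5 * b @ SS @ b + np.trace(SS @ A)
    return np.concatenate([dA.ravel(), db, [dc]])

def bond_price(X, tau):
    # Integrate from A(0)=0, b(0)=0, c(0)=0 and plug into V(t, tau).
    sol = solve_ivp(rhs, (0.0, tau), np.zeros(N * N + N + 1), rtol=1e-8)
    A = sol.y[:N * N, -1].reshape(N, N)
    b, c = sol.y[N * N:N * N + N, -1], sol.y[-1, -1]
    return np.exp(-X @ A @ X - b @ X - c)

print(bond_price(np.array([0.1, -0.2]), tau=5.0))
```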
Leippold and Wu (2002) show that a large class of fixed-income securities can be priced in closed form in the QTSMs using the transform analysis of Duffie et al. (2000). They show that the time-t value of a contract that has an exponential quadratic payoff structure at terminal time T, i.e., exp(−q(X_T)) = exp(−X_T' Ā X_T − b̄' X_T − c̄), has the following form:

ψ(q, X_t, t, T) = E^Q[ e^{−∫_t^T r(X_s) ds} e^{−q(X_T)} | F_t ] = exp( −X_t' A(T−t) X_t − b(T−t)' X_t − c(T−t) ),   (47.51)

where A(·), b(·), and c(·) satisfy the same ODEs as above with the initial conditions A(0) = Ā, b(0) = b̄, and c(0) = c̄. The time-t price of a call option with payoff (e^{−q(X_T)} − y)^+ at T = t + τ equals

C(q, y, X_t, τ) = E^Q[ e^{−∫_t^T r(X_s) ds} (e^{−q(X_T)} − y)^+ | F_t ]
                = E^Q[ e^{−∫_t^T r(X_s) ds} (e^{−q(X_T)} − y) 1_{q(X_T) ≤ −ln(y)} | F_t ]
                = G_{q,q}(−ln(y), X_t, τ) − y G_{0,q}(−ln(y), X_t, τ),

where G_{q1,q2}(y, X_t, τ) = E^Q[ e^{−∫_t^T r(X_s) ds} e^{−q1(X_T)} 1_{q2(X_T) ≤ y} | F_t ] can be computed by the inversion formula

G_{q1,q2}(y, X_t, τ) = ψ(q1, X_t, t, T)/2 + (1/(2π)) ∫_0^∞ [ e^{ivy} ψ(q1 + iv q2, X_t, t, T) − e^{−ivy} ψ(q1 − iv q2, X_t, t, T) ] / (iv) dv.   (47.52)

Similarly, the price of a put option is

P(q, y, X_t, τ) = y G_{0,−q}(ln(y), X_t, τ) − G_{q,−q}(ln(y), X_t, τ).

We are interested in pricing a cap, which is a portfolio of European call options on future interest rates with a fixed strike price. For simplicity, we assume the face value is 1 and the strike price is r̄. At time 0, let τ, 2τ, ..., nτ be the fixed dates for future interest payments. At each fixed date kτ, the r̄-capped interest payment is given by τ(R((k−1)τ, kτ) − r̄)^+, where R((k−1)τ, kτ) is the τ-year floating interest rate at time (k−1)τ, defined by

1/(1 + τ R((k−1)τ, kτ)) = ρ((k−1)τ, kτ) = E^Q[ exp( −∫_{(k−1)τ}^{kτ} r(X_s) ds ) | F_{(k−1)τ} ].
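The inversion formula (47.52) can be evaluated by one-dimensional numerical integration. The sketch below uses a hypothetical placeholder transform ψ purely to keep the snippet self-contained; in practice ψ would be the exponential-quadratic transform above.

```python
import numpy as np
from scipy.integrate import quad

def psi(z):
    # Placeholder transform; replace with the QTSM transform psi(q1 + i*v*q2).
    return np.exp(-0.5 * z + 0.05 * z**2)

def G(y, q1=1.0, q2=1.0, vmax=200.0):
    # G_{q1,q2}(y) via Equation (47.52), truncating the integral at vmax.
    def integrand(v):
        w = (np.exp(1j * v * y) * psi(q1 + 1j * v * q2)
             - np.exp(-1j * v * y) * psi(q1 - 1j * v * q2)) / (1j * v)
        return w.real
    val, _ = quad(integrand, 1e-8, vmax, limit=400)
    return 0.5 * psi(q1) + val / (2 * np.pi)
```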
The market value at time 0 of the caplet paying at date kτ can be expressed as

Caplet(k) = E^Q[ exp( −∫_0^{kτ} r(X_s) ds ) τ (R((k−1)τ, kτ) − r̄)^+ ]
          = (1 + τ r̄) E^Q[ exp( −∫_0^{(k−1)τ} r(X_s) ds ) ( 1/(1 + τ r̄) − ρ((k−1)τ, kτ) )^+ ].

Hence, the pricing of the kth caplet is equivalent to the pricing of a (k−1)τ-for-τ put struck at K̄ = 1/(1 + τ r̄). Therefore,

Caplet(k) = G_{0,−q}(ln K̄, X_0, (k−1)τ) − (1/K̄) G_{q,−q}(ln K̄, X_0, (k−1)τ).   (47.53)

Similarly, for the kth floorlet,

Floorlet(k) = (1/K̄) G_{q,q}(−ln K̄, X_0, (k−1)τ) − G_{0,q}(−ln K̄, X_0, (k−1)τ).   (47.54)

Appendix 47B The Implementation of the Kalman Filter

To implement the extended Kalman filter, we first recast the QTSMs into a state-space representation. Suppose we have a time series of observations of yields of L zero-coupon bonds with maturities τ = (τ_1, τ_2, ..., τ_L). Let Ξ be the set of parameters of the QTSMs and Y_k = f(X_k, τ; Ξ) be the vector of the L observed yields at the discrete time points kΔt, for k = 1, 2, ..., K, where Δt is the sample interval (one day in our case). After the following change of variable,

Z_k = U^{-1}( ξ^{-1} μ + X_k ),

we have the state equation

Z_k = Φ Z_{k-1} + w_k,   w_k ~ N(0, Θ),

where Φ and Θ are first introduced in Equations (47.47) and (47.48), and the measurement equation

Y_k = d_k + H_k Z_k + v_k,   v_k ~ N(0, R_v),

where the innovations in the state and measurement equations, w_k and v_k, follow serially independent Gaussian processes and are independent of each other. The time-varying coefficients of the measurement equation, d_k and H_k, are determined at the ex ante forecast of the state variables:

H_k = ∂f(U z − ξ^{-1} μ, τ)/∂z |_{z = Z_{k|k-1}},
d_k = f(U Z_{k|k-1} − ξ^{-1} μ, τ) − H_k Z_{k|k-1} + B_k,

where Z_{k|k-1} = Φ Z_{k-1}. In the QTSMs, the transition density of the state variables is multivariate Gaussian under the physical measure. Thus the transition equation in the Kalman filter is exact. The only source of approximation error is the linearization of the quadratic measurement equation. As our estimation uses daily data, the approximation error, which is proportional to the one-day-ahead forecast error, is likely to be minor. Furthermore, we can minimize the approximation error by introducing the correction term B_k.27 The Kalman filter starts with the initial state variable Z_0 = E(Z_0) and the variance-covariance matrix P_0^Z,

P_0^Z = E[ (Z_0 − E(Z_0)) (Z_0 − E(Z_0))' ].

These unconditional mean and variance have closed-form expressions that can be derived using Equations (47.47) and (47.48) by letting Δt go to infinity. Given the set of filtering parameters {Ξ, R_v}, we can write down the log-likelihood of the observations based on the Kalman filter:

27 The differences between parameter estimates with and without the correction term are very small, and we report those estimates with the correction term B_k.
log L(Y; Ξ) = Σ_{k=1}^{K} log f(Y_k; Y_{k-1}, {Ξ, R_v})
            = −(LK/2) log(2π) − (1/2) Σ_{k=1}^{K} log |P^Y_{k|k-1}| − (1/2) Σ_{k=1}^{K} (Y_k − Ŷ_{k|k-1})' (P^Y_{k|k-1})^{-1} (Y_k − Ŷ_{k|k-1}),

where Y_{k-1} is the information set at time (k−1)Δt and P^Y_{k|k-1} is the time-(k−1)Δt conditional variance of Y_k:

P^Y_{k|k-1} = H_k P^Z_{k|k-1} H_k' + R_v,
P^Z_{k|k-1} = Φ P^Z_{k-1} Φ' + Θ.
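A minimal sketch of the filtering recursion and log-likelihood above; f and jac stand for the (hypothetical) measurement function and its Jacobian, and Phi, Theta are the transition coefficients from Equations (47.47) and (47.48).

```python
import numpy as np

def kalman_loglik(Y, f, jac, Phi, Theta, Rv, Z0, P0):
    Z, P, loglik = Z0, P0, 0.0
    for yk in Y:                                   # Y: K x L array of yields
        Zp = Phi @ Z                               # ex ante state forecast
        Pp = Phi @ P @ Phi.T + Theta
        H = jac(Zp)                                # linearized measurement
        yhat = f(Zp)
        PY = H @ Pp @ H.T + Rv                     # forecast variance of Y_k
        err = yk - yhat
        sign, logdet = np.linalg.slogdet(PY)
        loglik += -0.5 * (len(yk) * np.log(2 * np.pi) + logdet
                          + err @ np.linalg.solve(PY, err))
        K = Pp @ H.T @ np.linalg.inv(PY)           # Kalman gain
        Z = Zp + K @ err                           # filtered state
        P = (np.eye(len(Z)) - K @ H) @ Pp
    return loglik
```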
Appendix 47C Derivation of the Characteristic Function

The characteristic function of log(L_k(T_k)),

ψ(u_0, Y_t, t, T_k) = exp( a(s) + u_0 log(L_k(t)) + B(s)' V_t ),

is obtained from a(s) and B(s), 0 ≤ s ≤ T_k, that satisfy the following system of Ricatti equations:

dB_j(s)/ds = −κ_j^{k+1} B_j(s) + (1/2) B_j²(s) σ_j² + (1/2)(u_0² − u_0) U²_{s,j},  1 ≤ j ≤ N,

da(s)/ds = Σ_{j=1}^{N} κ_j^{k+1} θ_j^{k+1} B_j(s) + λ_J [ φ(u_0) − 1 − u_0 (φ(1) − 1) ],

where the function φ is

φ(c) = exp( μ_J^{k+1} c + (1/2) c² σ_J² ).

The initial conditions are B(0) = 0_{N×1} and a(0) = 0, and κ_j^{k+1} and θ_j^{k+1} are the parameters of the V_j(t) process under Q^{k+1}. For any l < k, given that B(T_l) = B_0 and a(T_l) = a_0, we have closed-form solutions for B(T_{l+1}) and a(T_{l+1}). Define the constants p = −(u_0² − u_0) U²_{s,j}, q = √( (κ_j^{k+1})² + p σ_j² ), c = (q − κ_j^{k+1})/σ_j², and d = (q + κ_j^{k+1})/σ_j². Then we have

B_j(T_{l+1}) = −c + (c + d)(B_{j0} + c) / [ (d − B_{j0}) e^{qδ} + B_{j0} + c ],  1 ≤ j ≤ N,

a(T_{l+1}) = a_0 + Σ_{j=1}^{N} κ_j^{k+1} θ_j^{k+1} [ d δ − (2/σ_j²) ln( ((d − B_{j0}) e^{qδ} + B_{j0} + c) / (c + d) ) ] + λ_J δ [ φ(u_0) − 1 − u_0 (φ(1) − 1) ],

if p ≠ 0, and B_j(T_{l+1}) = B_{j0}, a(T_{l+1}) = a_0 otherwise. B(T_k) and a(T_k) can then be computed via iteration.
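The iteration for B(T_k) and a(T_k) can be sketched as follows, under the sign conventions used in the reconstruction above; all parameter arrays (kappa, theta, sigma, Us) and the jump inputs (lamJ, muJ, sigJ) are hypothetical placeholders.

```python
import numpy as np

def riccati_step(B0, a0, u0, kappa, theta, sigma, Us, lamJ, muJ, sigJ, delta):
    phi = lambda c: np.exp(muJ * c + 0.5 * c**2 * sigJ**2)
    p = -(u0**2 - u0) * Us**2
    q = np.sqrt(kappa**2 + p * sigma**2 + 0j)       # complex-safe root
    c = (q - kappa) / sigma**2
    d = (q + kappa) / sigma**2
    num = (d - B0) * np.exp(q * delta) + B0 + c
    B1 = -c + (c + d) * (B0 + c) / num
    a1 = (a0
          + np.sum(kappa * theta * (d * delta
                   - (2 / sigma**2) * np.log(num / (c + d))))
          + lamJ * delta * (phi(u0) - 1 - u0 * (phi(1) - 1)))
    return B1, a1

def char_fn_coeffs(k, u0, params, delta):
    # Iterate the closed-form updates over the reset dates T_1, ..., T_k.
    B = np.zeros_like(params["kappa"], dtype=complex)
    a = 0.0 + 0j
    for _ in range(k):
        B, a = riccati_step(B, a, u0, delta=delta, **params)
    return B, a
```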
Chapter 48
Catastrophic Losses and Alternative Risk Transfer Instruments Jin-Ping Lee and Min-Teh Yu
Abstract This study reviews the valuation models for three types of catastrophe-linked instruments: catastrophe bonds, catastrophe equity puts, and catastrophe futures and options. First, it looks into the pricing of catastrophe bonds under stochastic interest rates and examines how (re)insurers can apply catastrophe bonds to reduce the default risk. Second, it models and values the catastrophe equity puts that give the (re)insurer the right to sell its stocks at a predetermined price if catastrophe losses surpass a trigger level. Third, this study models and prices catastrophe futures and catastrophe options contracts that are based on a catastrophe index.

Keywords Catastrophe risk · Catastrophe bond · Catastrophe equity put · Catastrophe futures options · Contingent claim analysis
48.1 Introduction

Catastrophic events having a low frequency of occurrence but generally high loss severity can easily erode the underwriting capacity of property and casualty insurance and reinsurance companies (P&Cs, hereafter). P&Cs traditionally hedge catastrophe risks by buying catastrophe reinsurance contracts. Because of capacity shortages and constraints in the reinsurance markets, the capital markets have developed alternative risk transfer instruments to provide (re)insurance companies with vehicles for hedging their catastrophe risk. These instruments can be broadly classified into three categories: insurance-linked debt contracts (e.g., catastrophe bonds), contingent capital financing instruments (e.g., catastrophe
J.-P. Lee Department of Finance, Feng Chia University, Taichung, Taiwan, ROC M.-T. Yu () Department of Finance, Providence University and National Taiwan University, Taiwan, ROC e-mail: [email protected]
equity puts), and catastrophe derivatives (e.g., catastrophe futures and catastrophe options). The Chicago Board of Trade (CBOT) launched catastrophe (CAT) futures in 1992 and CAT futures call spreads in 1993, with contract values linked to the loss index compiled by the Insurance Services Office. The CBOT switched to CAT options in 1995 to try to spur growth in the CAT derivatives market, but was unable to generate meaningful activity and ultimately abandoned the contracts in 2000. CAT bonds, however, have been quite successful, with 89 transactions completed, representing $15.53 billion in issuance since the first issue in 1997.1 Since 1996, several CAT equity put deals have been negotiated, usually with obligations to purchase stock of $100 million each. This study looks into the valuation models for these CAT-linked instruments and examines how their values are related to catastrophe risk, terms of the contract, and other key elements of these instruments. The rest of this study is organized into four sections. Section II provides a model to value catastrophe bonds under stochastic interest rates. Section III models and values the catastrophe equity puts with credit risk, and Section IV models and prices catastrophe futures and catastrophe options contracts that are based on specified catastrophe indices. Section V investigates how reinsurers can apply catastrophe bonds to reduce their default risk. This study's valuation approach employs contingent claim analysis, and when a closed-form solution cannot be derived, numerical estimates are computed using the Monte Carlo simulation method.
48.2 Catastrophe Bonds The CAT bond, which is also named as an “Act of God bond,” is a liability-hedging instrument for insurance companies. There are debt-forgiveness triggers in CAT bond provisions, whereby the generic design allows for the payment of interest
1 See MMC Security (2007).
and/or the return of principal to be forgiven, and the extent of forgiveness can be total, partial, or scaled to the size of the loss. Moreover, the debt forgiveness can be triggered by the (re)insurer's actual losses or by a composite index of insurers' losses during a specific period. The advantage of a CAT bond hedge for (re)insurers is that the issuer can avoid credit risk. The CAT bondholders provide the hedge to the (re)insurer by forgiving existing debt. Thus, the value of this hedge is independent of the bondholders' assets, and the issuer bears no risk of non-delivery on the hedge. However, from the bondholder's perspective, the default risk, the potential moral hazard behavior, and the basis risk of the issuing firm are critical in determining the value of CAT bonds.
48.2.1 CAT Bond Valuation Models

Litzenberger et al. (1996) considered a 1-year bond with an embedded binary CAT option. The repayment of principal is indexed to the (re)insurer's catastrophe loss (denoted as C_T). This security may be decomposed into two components: (1) long a bond with a face value of F and (2) short a binary call on the catastrophe loss with a strike price K. Under the assumption that the natural logarithm of the catastrophe loss is normally distributed, the 1-year CAT bond can be priced as follows:

P_CAT = e^{−rT} ( F − (1 − Φ[z_K]) PO_T ),   (48.1)

z_K = (log(K) − μ)/σ.
Here, r is the risk-free interest rate; μ and σ are, respectively, the mean and standard deviation of ln(C_T); Φ(·) denotes the cumulative distribution function of a standard normal random variable; and PO_T refers to the option's payout at the maturity date T. For a CAT call option spread, when the (re)insurer's catastrophe loss (C_T) is less than K_1, the bondholder receives a repayment of the entire principal; when the loss is between K_1 and K_2 (where K_2 > K_1), the fraction of principal lost is (C_T − K_1)/(K_2 − K_1); and for a loss greater than K_2, the entire principal payment is lost. This security may be divided into two components: (1) long a bond with an above-market coupon (c) and (2) a CAT call option spread consisting of a short position on the CAT call with a strike price of K_1 and a long position on the CAT call with a strike price of K_2. Under the assumption that the (re)insurer's catastrophe loss is lognormally distributed with parameters μ and σ, the CAT bond can be priced as follows:
P_CAT = e^{−rT} F { Φ[z_{K_1}] + (1/(K_2 − K_1)) ( K_2 (Φ[z_{K_2}] − Φ[z_{K_1}]) − e^{μ + σ²/2} (Φ[z_{K_2} − σ] − Φ[z_{K_1} − σ]) ) },   (48.2)

z_{K_i} = (log(K_i) − μ)/σ,  i = 1, 2.
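A minimal sketch of evaluating the closed form in Equation (48.2); since the reconstruction of the formula involves judgment about signs lost in the source, checking it against a Monte Carlo estimate of the discounted expected payoff is advisable.

```python
import numpy as np
from scipy.stats import norm

def cat_bond_spread(F, K1, K2, mu, sigma, r, T):
    # Expected fraction of principal repaid under a lognormal loss C_T,
    # for the call-spread payoff described in the text, discounted at r.
    z1 = (np.log(K1) - mu) / sigma
    z2 = (np.log(K2) - mu) / sigma
    partial = np.exp(mu + 0.5 * sigma**2) * (norm.cdf(z2 - sigma)
                                             - norm.cdf(z1 - sigma))
    frac = (K2 * (norm.cdf(z2) - norm.cdf(z1)) - partial) / (K2 - K1)
    return np.exp(-r * T) * F * (norm.cdf(z1) + frac)
```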
Litzenberger et al. (1996) provided a bootstrap approach to price these hypothetical CAT bonds and compared the results with the prices calculated under the assumption of lognormality of the catastrophe loss distribution. Zajdenweber (1998) followed Litzenberger et al. (1996) but changed the CAT loss distribution to the stable-Lévy distribution. Contrary to Litzenberger et al. (1996) and Zajdenweber (1998), a series of later studies relaxed the interest rate assumption to be stochastic. For instance, Loubergé et al. (1999) numerically estimated the CAT bond price by assuming that the interest rate follows a binomial random process and the catastrophe loss a compound Poisson process. Lee and Yu (2002) extended the literature and priced CAT bonds with a formal term structure model of Cox et al. (1985). Under the setting that the aggregate loss is a compound Poisson process, a sum of jumps, the aggregate catastrophe loss facing (re)insurer i can be described as follows²:

C_{i,t} = Σ_{j=1}^{N(t)} X_{i,j},   (48.3)
where the process {N(t)}_{t≥0} is the loss number process, which is assumed to be driven by a Poisson process with intensity λ. The terms X_{i,j} denote the amount of losses caused by the j-th catastrophe during the specific period for the issuing (re)insurance company. Here X_{i,j}, for j = 1, 2, ..., N(T), are assumed to be mutually independent, identically and lognormally distributed variables, which are also independent of the loss number process, and their logarithmic means and variances are μ_i and σ_i², respectively. The payoffs PO_T of a discount bond at maturity (i.e., time T) can be specified as follows:

PO_T = { F       if C_{i,T} ≤ K
       { rp · F  if C_{i,T} > K,   (48.4)

2 The process of aggregate catastrophe losses facing the (re)insurer specified by Lee and Yu (2002) is different from the process of the (re)insurer's total liabilities specified by Duan and Yu (2005).
48 Catastrophic Losses and Alternative Risk Transfer Instruments
where K is the trigger level set in the CAT bond provisions, Ci;T is the aggregate loss at maturity, rp is the portion of principal needed to be paid to bondholders when the forgiveness trigger has been pulled, and F is the face value of the CAT bond. Under the assumption that the term structure of interest rates is independent of the catastrophe risk, the CAT bond can be priced as follows: 2 PCAT D PCIR .0; T / 4
1 X
e T
j D0
.T /j j F .K/ jŠ
3 j .T / F j .K/5 ; (48.5) e T Crp 1 j Š j D0
μ_g = E[C_{i,T}] = λT e^{μ_X + σ_X²/2},   (48.6)

σ_g² = Var[C_{i,T}] = λT e^{2μ_X + 2σ_X²},   (48.7)
where μ_g and σ_g² denote the mean and variance of the approximating distribution g(C_{i,T}), respectively. The price of the approximating analytical CAT bond can be shown to be the following:

P_CAT = P_CIR(0,T) [ ∫_0^K (1/(√(2π) σ_g C_{i,T})) e^{−(ln C_{i,T} − μ_g)²/(2σ_g²)} dC_{i,T} + rp ∫_K^∞ (1/(√(2π) σ_g C_{i,T})) e^{−(ln C_{i,T} − μ_g)²/(2σ_g²)} dC_{i,T} ],   (48.8)

where
F^{*j}(K) = Pr(X_{i,1} + X_{i,2} + ... + X_{i,j} ≤ K) denotes the j-th convolution of F, and P_CIR(0,T) = A_CIR(0,T) e^{−B_CIR(0,T) r(0)}, with

A_CIR(0,T) = [ 2γ e^{(κ+γ)T/2} / ( (κ+γ)(e^{γT} − 1) + 2γ ) ]^{2κm/v²},

B_CIR(0,T) = 2(e^{γT} − 1) / ( (κ+γ)(e^{γT} − 1) + 2γ ),
γ = √(κ² + 2v²). Here, κ is the mean-reverting force and v is the volatility parameter of the interest rate.
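A minimal sketch of the CIR discount bond price used above (risk-neutral parameters assumed, i.e., ignoring the market-price-of-risk adjustment):

```python
import numpy as np

def p_cir(r0, T, kappa, m, v):
    # CIR zero-coupon bond P_CIR(0, T) with gamma = sqrt(kappa^2 + 2 v^2).
    g = np.sqrt(kappa**2 + 2 * v**2)
    denom = (kappa + g) * (np.exp(g * T) - 1) + 2 * g
    A = (2 * g * np.exp((kappa + g) * T / 2) / denom) ** (2 * kappa * m / v**2)
    B = 2 * (np.exp(g * T) - 1) / denom
    return A * np.exp(-B * r0)

print(p_cir(r0=0.05, T=1.0, kappa=0.2, m=0.05, v=0.10))
```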
48.2.1.1 Approximating an Analytical Solution

Under the assumption that the catastrophe loss amounts are independent and identically lognormally distributed, the exact distribution of the aggregate loss at maturity, denoted as f(C_{i,T}), is not known in closed form. Lee and Yu (2002) approximated the exact distribution by a lognormal distribution, denoted as g(C_{i,T}), with specified moments.3 Following this approach, the first two moments of g(C_{i,T}) are set equal to those of f(C_{i,T}), as written in Equations (48.6) and (48.7) above.
3 Jarrow and Rudd (1982), Turnbull and Wakeman (1991), and Nielson and Sandmann (1996) used the same assumption in approximating the values of Asian options and basket options.
In order to look into the practical considerations of default risk, basis risk, and moral hazard relating to CAT bonds, Lee and Yu (2002) developed a structural model in which the insurer's total asset value consists of two risk components: interest rate risk and credit risk. The term credit risk refers to all risks that are orthogonal to the interest rate risk. Specifically, the value of an insurer's assets is governed by the following process:

dV_t / V_t = μ_V dt + φ dr_t + σ_V dW_{V,t},   (48.9)
where V_t is the value of the insurer's total assets at time t; r_t is the instantaneous interest rate at time t; W_{V,t} is the Wiener process that denotes the credit risk; μ_V is the instantaneous drift due to the credit risk; σ_V is the volatility of the credit risk; and φ is the instantaneous interest rate elasticity of the insurer's assets.
Table 48.1 Default-free CAT bond prices: approximating solution vs. numerical estimates (no moral hazard and basis risk)

               Approximating solutions          Numerical estimates
(λ, σ_X)       K=100     K=110     K=120        K=100     K=110     K=120
(0.5, 0.5)     0.95112   0.95117   0.95120      0.95119   0.95119   0.95119
(0.5, 1)       0.94981   0.95009   0.95031      0.94977   0.95029   0.95062
(0.5, 2)       0.92933   0.93128   0.93293      0.92675   0.92903   0.93103
(1, 0.5)       0.95095   0.95106   0.95113      0.95119   0.95119   0.95119
(1, 1)         0.94750   0.94829   0.94887      0.94825   0.97877   0.94977
(1, 2)         0.90559   0.90933   0.91254      0.90273   0.90682   0.91058
(2, 0.5)       0.95038   0.95071   0.95091      0.95110   0.95115   0.95119
(2, 1)         0.94015   0.94259   0.94441      0.93916   0.94263   0.94492
(2, 2)         0.85939   0.86603   0.87183      0.85065   0.85717   0.86378
Notes: All values are calculated assuming bond term T = 1, the market price of interest rate risk λ_r = −0.01, the initial spot interest rate r = 5%, the long-run interest rate m = 5%, the force of mean reversion κ = 0.2, the volatility of the interest rate v = 10%, and the volatility of the asset return caused by the credit risk σ_V = 5%. All estimates are computed using 20,000 simulation runs.
In the case where the CAT bondholders have priority for salvage over the other debtholders, the default-risky payoffs of the CAT bond can be written as follows:

PO_{i,T} = { a·L                        if C_{i,T} ≤ K and C_{i,T} ≤ V_{i,T} − a·L
           { rp·a·L                     if K < C_{i,T} and C_{i,T} ≤ V_{i,T} − rp·a·L
           { Max{V_{i,T} − C_{i,T}, 0}  otherwise,   (48.10)

where PO_{i,T} are the payoffs at maturity for the CAT bond forgiven on the issuing firm's own actual losses; V_{i,T} is the issuing firm's asset value at maturity; C_{i,T} is the issuing firm's aggregate loss at maturity; and a is the ratio of the CAT bond's face amount to total outstanding debts (L). According to the payoff structure in PO_{i,T} and the specified asset and interest rate dynamics, the CAT bond can be valued as follows:

P_i = (1/(aL)) E_0 [ e^{−r̄T} PO_{i,T} ],   (48.11)
where P_i is the default-risky CAT bond price with no basis risk. The term E_0 denotes the expectation taken on the issuing date under the risk-neutral pricing measure; r̄ is the average risk-free interest rate between the issuing date and the maturity date; and the factor 1/(aL) normalizes the CAT bond price to a $1 face amount.

48.2.1.3 Moral Hazard and Basis Risk

Moral hazard results from less loss-control effort by the insurer issuing CAT bonds, since these efforts may increase
the amount of debt that must be repaid at the expense of the bondholders’ coupon (or principal) reduction. Bantwal and Kunreuther (2000) noted the tendency for insurers to write additional policies in the catastrophe-prone area, spending less time and money in their auditing of losses after a disaster. Another important element that needs to be considered in pricing a CAT bond is the basis risk. The CAT bond’s basis risk refers to the gap between the insurer’s actual loss and the composite index of losses that makes the insurer not receive complete risk hedging. The basis risk may cause insurers to default on their debt in the case of high individual loss, but a low index of loss. There is a trade-off between basis risk and moral hazard. If one uses an insurer’s actual loss to define the CAT bond payments, then the insurer’s moral hazard is reduced or eliminated, but basis risk is created. In order to incorporate the basis risk into the CAT bond valuation, aggregate catastrophe losses for a composite index of catastrophe losses (denoted as Cindex;t ) can be specified as follows: N.t / X Cindex;t D Xindex;j ; (48.12) j D1
where the process fN.t/gt 0 is the loss number process, which is assumed to be driven by a Poisson process with intensity . Terms Xindex;j denote the amount of losses caused by the j th catastrophe during the specific period for the issuing insurance company and the composite index of losses, respectively. Terms Xindex;j , for j D 1; 2; :::; N.T /; are assumed to be mutually independent, identical, and lognormally-distributed variables, which are also independent of the loss number process, and their logarithmic means 2 ; respectively. In addition, and variances are index and index the correlation coefficients of the logarithms of Xi;j and Xindex; j , for j D 1; 2; :::; N.T / are equal to X .
48 Catastrophic Losses and Alternative Risk Transfer Instruments
In the case of the CAT bond being forgiven on the composite index of losses, the default-risky payoffs can be written as: POindex;T 8 aL ˆ ˆ ˆ ˆ < D rp a L ˆ ˆ ˆ ˆ : Max fVi;T Ci;T ; 0g
if Cindex;T K and Ci;T Vi;T a L if Cindex;T > K and Ci;T Vi;T rp a L otherwise; (48.13)
where Cindex;T is the value of the composite index at maturity, and a, L, rp, Vi;T , Ci;T , and K are the same as defined in Equation (48.10). In the case where the basis risk is taken into account the CAT bonds can be valued as follows: Pindex D
1 E0 e Nr T POindex;T ; aL
(48.14)
where Pindex is the default-risky CAT bond price with basis 1 are the same as risk at issuing time. Terms E0 , rN , and aL defined in Equation (48.11). The issuing firm might relax its settlement policy once the accumulated losses fall into the range close to the trigger. This would then cause an increase in expected losses for the next catastrophe. This change in the loss process can be described as follows: ( .1 C ˛/i if .1 ˇ/K Ci;j K; 0 (48.15) i D otherwise; i 0
where i is the logarithmic mean of the losses incurred by the .j C 1/th catastrophe when the accumulated loss Ci;j falls in the specified range, .1 ˇ/K Ci;j K. Term ˛ is a positive constant, reflecting the percentage increase in the mean, and ˇ is a positive constant, which specifies the range of moral hazard behavior. We expect that both moral hazard and basis risk will drive down the prices of CAT bonds. The results of the effects of moral hazard and basis risk on CAT bonds can be found in Lee and Yu (2002). The significant price differences indicate that the moral hazard is an important factor and should be taken into account when pricing the CAT bonds.4 A low loss correlation between the firm’s loss and the industry loss index subjects the firm to a substantial discount in its CAT bond prices.
4 Bantwal and Kunreuther (2000) also pointed out that moral hazard may explain the CAT bond premium puzzle .
757
48.3 Catastrophe Equity Puts If a insurer suffers a loss of capital due to a catastrophe, then its stock price is likely to fall, lowering the amount it would receive for newly issued stock. Catastrophe equity puts (CatEPut) give insurers the right to sell a certain amount of its stock to investors at a predetermined price if catastrophe losses surpass a specified trigger.5 Thus, catastrophe equity puts can provide insurers with additional equity capital when they need funds to cover catastrophe losses. A major advantage of catastrophe equity puts is that they make equity funds available at a predetermined price when the insurer needs them the most. However, the insurer that uses catastrophe equity puts faces a credit risk - the risk that the seller of the catastrophe equity puts will not have enough cash available to purchase the insurer’s stock at the predetermined price. For the investors of catastrophe equity puts they also face the risk of owning shares of a insurer that is no longer viable.
48.3.1 Catastrophe Equity Put Valuation Models The CatEPut gives the owner the right to issue shares at a fixed price, but that right is only exercisable if the accumulated catastrophe losses exceed a trigger level during the lifetime of the option. Such a contract is a special “double trigger” put option. Cox et al. (2004) valued a CatEPut by assuming that the price of the insurer’s equity is driven by a geometric Brownian motion with additional downward jumps of a specified size in the event of a catastrophe. The price of the insurer’s equity can be described as:
1 St D S0 exp ANt C Wt C S S2 t ; 2 (48.16) where St denotes the equity price at time t; fW gt 0 is a standard Brownian motion; fN.t/gt 0 is the loss number process, which is assumed to be driven by a Poisson process with intensity S ; A 0 is the factor to measure the impact of catastrophe on the market price of the insurer’s equity; and S and S are respectively the mean and standard deviation of return on the insurer’s equity given that no catastrophe occurs during an interval. The option is exercisable only if the number of catastrophes occurring during the lifetime of the
5 Catastrophe equity puts, or CatEPuts, are underwritten by Centre Re and developed by Aon with Centre Re.
758
J.-P. Lee and M.-T. Yu
contract is larger than a specified number (denoted as n). The payoffs of the CatEPut at maturity can be written as: PO
D
CFP
K ST 0
if ST < K and NT n ; otherwise (48.17)
where K is the exercise price. This CAT put option can be priced as follows: P CFP D
1 X
.S T /j Ke rT ˆ dj jŠ j Dn !
p S0 e Aj CkT ˆ dj S T ;
fL .y/ and mean l; fN.t/gt 0 is a homogeneous Poisson process with intensity . The term 't is used to compensate for the presence of downward jumps in the insurer’s share price and is chosen as: 'D ˛
.1 e ˛y / fL .y/ dy: 0
ı ˛E lj D ı H) ˛ D :: l (48.18)
where
S2 T 2
1
The parameter ˛ represents the percentage drop in the share price per unit of a catastrophe loss and is calibrated such that:
e S T
k D S 1 e A log SK0 rT C Aj kT C dj D p S T
Z
Since the right is exercisable only if the accumulated catastrophe losses exceed a critical coverage limit during the lifetime of the option, the payoffs of the CatEPut at maturity can be specified as:
PO
JW
D
:
8 <
K ST
:0
^
if ST < K and L.T / > L
;
otherwise (48.20) ^
Improving upon the assumption of Cox et al. (2004) that the size of the catastrophe is irrelevant, Jaimungal and Wang (2006) assumed that the drop in the insurer’s share price depends on the level of the catastrophe losses and valued the CatEPut under a stochastic interest rate. Jaimungal and Wang (2006) modeled the process of the insurer’s share price as follows: St D S0 exp .˛ .L .t/ 't/ C X .t// ;
where the parameter L represents the trigger level of catastrophe losses above which the issuer is obligated to purchase unit shares. Under these settings, the price of the CatEPut at the initial date can be described as follows: P
JW
De
T
Z 1 X .T /j j D1
jŠ
1 ^
L
n .n/ fL .y/ KP .0; T /
o ˆ .d .y// S0 e ˛.y'T / ˆ .dC .y// dy;
(48.19)
(48.21)
whereby .n/
L .t/ D
N.t / X
where fL .y/ represents the n-fold convolution of the catastrophe loss density function f .L/;
lj ;
j D1
1 2 dX.t/ D S S dt C S d W S .t/ ; 2
d˙ .y/ D
ln
St
~ 1 2 P .0; T / ˛ .y 'T / ˙ Vasicek K 2 r ~
r .0; T /
~
dr.t/ D . r .t// C r d W .t/ ; d W S ; W r .t/ D S;r dt; r
where W S .t/ and W r .t/ are correlated Wiener processes driving the returns of the insurer’s equity and the short rate, respectively; L .t/ denotes the accumulated catastrophe losses facing the insurer at time t; lj ; for j D 1; 2; :::, are assumed to be mutually independent, identical, and distributed variables representing the size of the j th loss with p.d.f
r2 .0; T / D S2 T C
;
2S;r S r C r2 2
.T BVasicek .0; T //
r2 2 B .0; T / : 2 Vasicek
Here, P .0; T / is a T -maturity zero coupon bond in the Vasicek model: PVasicek .0; T / D exp fAVasicek .0; T / BVasicek .0; T / r .0/g ;
48 Catastrophic Losses and Alternative Risk Transfer Instruments
where
2 A .0; T /Vasicek D r 2 .BVasicek .0; T / T / 2 r2 2 B .0; T / ; 4 Vasicek 1 1 e T : BVasicek .0; T / D
48.3.1.1 Credit Risk and CatEPuts Both Cox et al. (2004) and Jaimungal and Wang (2006) did not consider the effect of credit risk, the vulnerability of the
8 K ST ˆ ˆ < .K ST / m2 .K S T /
ˆ .K ST / m2 C LRe;T ˆ : 0
Vi;t Li;t ; m1
issuer, on the catastrophe equity puts.6 Here, we follow Cox et al. (2004) to assume that the option is exercisable only if the number of catastrophes occurring during the lifetime of the contract is larger than a specified number, and we develop a model to incorporate the effects of credit risk on the valuation of CatEPuts. Consider an insurer with m1 shares outstanding that wants to be protected in the event of catastrophe losses by purchasing m2 units of CatEPuts from a reinsurer. Each CAT put option allows the insurer the right to sell one share of its stock to the reinsurer at a price of K if the number of catastrophes occurring during the lifetime of the contract is larger than the trigger level (denoted as n). The payoffs while incorporating the effect of the reinsurer’s vulnerability, PO LY , can be written as:
if ST < K and PL;T n and VRe;T LRe;T > m2 .K S T / if ST < K and PL;T n and VRe;T LRe;T m2 .K S T / ;
The value dynamics for the reinsurer’s assets (VRe;T ) and liabilities (LRe;T ) are specifically governed by the following processes: dVRe;t D .r C VRe /VRe;t dt C VRe VRe;t d WVRe ;t ;
(48.23)
where Vi;t and Li;t represent the values of the insurer’s assets and liabilities at time t, respectively. The value dynamics for the insurer’s asset and liability are specified as follows: dV i;t D .r C Vi /Vi;t dt C Vi Vi;t dW Vi ;t ;
1 2 dLi;t D r C Li P e yi C 2 yi Li;t dt CLi Li;t dW Li ;t C YPLi ;t Li;t dPL;t ;
(48.22)
otherwise
where PL;t is the loss number process, which is assumed to be driven by a Poisson process with intensity P ; Si;t denotes the insurer’s share price and can be shown as: Si;t D
759
(48.24)
(48.25)
where r is the risk-free interest rate; Vi is the risk premium associated with the insurer’s asset risk; Li denotes the risk premium for small shocks in the insurer’s liabilities; WVi ;t is a Weiner process denoting the asset risk; WLi ;t is a Weiner process summarizing all continuous shocks that are not related to the asset risk of the insurer; and YPLi ;t is a sequence of independent and identically-distributed positive random variables describing the percentage change in liabilities in the event of a jump. We assume that ln.YPLi ;t / has a normal distribution with mean yi and standard deviation yi . The 1 2 term P e yi C 2 yi offsets the drift arising from the compound Poisson component YPLi ;t Li;t dPL;t .
dLRe;t
1 2 D r C LRe P e yRe C 2 yRe LRe;t dt
(48.26)
CLRe LRe;t d WLRe;t C YPLRe ;t LRe;t dPL;t ; (48.27) where VRe;t and LRe;t represent the values of the reinsurer’s assets and liabilities at time t, respectively; r is the riskfree interest rate; VRe is the risk premium associated with the reinsurer’s asset risk; LRe denotes the risk premium for continuous shocks in the insurer’s liabilities; WVRe ;t is a Weiner process denoting the asset risk; WLRe ;t is a Weiner process summarizing all continuous shocks that are not related to the asset risk of the reinsurer; and YPLRe ;t is a sequence of independent and identically-distributed positive random variables describing the percentage change in the reinsurer’s liabilities in the event of a jump. We assume that
6
Though CatPut issuer may adopt alternative credit instruments or derivative market vehicles to transfer its credit risk to the capital market as suggested in Saunders and Allen (2002). Watson (2008) also discussed the impact of transferring the credit risk by securitizing mortgage credit.
760
J.-P. Lee and M.-T. Yu
ln.YPLRe ;t / has a normal distribution with mean yRe and standard deviation yRe . In addition, assume that the correlation coefficient of ln.YPi ;t / and ln.YPLRe ;t / is equal to Y . The 1
2
term P e yRe C 2 yRe offsets the drift arising from the compound Poisson component YPLRe ;t LRe;t dPL;t . According to the payoff structures, the catastrophe loss number process, and the dynamics for the (re)insurer’s assets and liabilities specified above, the CatEPut can be valued as follows: P LY D E e rT PO LY :
(48.28)
Here, E Œ denotes expectations taken on the issuing date under a risk-neutral pricing measure. The CAT put prices are estimated by the Monte Carlo simulation. Table 48.2 presents the numerical results. It shows that the possibility of a reinsurer’s vulnerability (credit risk) drives the put price down dramatically. We also observe that the higher the correlation coefficient of ln.YPi ;t / and
ln.YPLRe ;t / (i.e., Y ) is, the lower the value of the CatEPut will be. This implies that the reinsurer with efficient diversification in providing reinsurance coverage can increase the value of the CatEPut.
48.4 Catastrophe Derivatives Catastrophe risk for (re)insurers can be hedged by buying exchange-traded catastrophe derivatives such as catastrophe futures, catastrophe futures options, and catastrophe options. Exchange-traded catastrophe derivatives are standardized contracts based on specified catastrophe loss indices. The loss indices reflect the entire P&C insurance industry. The contracts entitle (re)insurers (the buyers of catastrophe derivatives) a cash payment from the seller if the catastrophes cause the index to rise above the trigger specified in the contract.
Table 48.2 Catastrophe put option prices with vs. without credit risk With credit risk y P
Without credit risk VRe Panel A: D1 Vi 2 0.12787 1 0.09077 0.5 0.05064 0.33 0.02878 0.1 0.00391
0.3
0.5
0.8
1
0.02241 0.01940 0.00872 0.00451 0.00048
0.02200 0.01913 0.00873 0.00448 0.00048
0.02140 0.01864 0.00861 0.00437 0.00043
0.02071 0.01843 0.00858 0.00433 0.00044
VRe D5 Vi 0.12787 0.09077 0.05064 0.02878 0.00391
0.03008 0.02637 0.01333 0.00693 0.00081
0.02918 0.02616 0.01319 0.00683 0.00081
0.02833 0.02576 0.01300 0.00664 0.00084
0.02778 0.02515 0.01280 0.00670 0.00085
VRe D 10 Vi 0.12787 0.09077 0.05064 0.02878 0.00391
0.03126 0.02730 0.01403 0.00733 0.00087
0.03029 0.02698 0.01389 0.00722 0.00086
0.02935 0.02654 0.01357 0.00707 0.00088
0.02860 0.02611 0.01347 0.00704 0.00091
Panel B: 2 1 0.5 0.33 0.1 Panel C: 2 1 0.5 0.33 0.1
Note. All values are calculated assuming option term T D 2, the number of catastrophe trigger n D 2; risk-free interest rate r D 5%, the mean of logarithmic jump magnitude yi D 2:3075651 (yRe D 2:3075651), the standard deviation of logarithmic jump magnitude yi D 0:5 (yRe D 0:5), the (re)insurer’s initial capital D 1:2), the volatility of (re)insurer’s assets Vi D 10% (VRe D 10%), and the position LVii D 1:2 ( LVRe Re volatility of (re)insurer’s pure liabilities Li D 10% (LRe D 10%). The catastrophe intensity P is set at 2, 1, 0.5, 0.33, and 0.1. All estimates are computed using 20,000 simulation runs
48 Catastrophic Losses and Alternative Risk Transfer Instruments
48.4.1 Catastrophe Derivatives Valuation Models A general formula for the catastrophe futures price can be developed as in Cox and Schwebach (1992) as follows: 1 .ALt C E ŒYt jJt / ; Q
Ft D
(48.29)
where Q is the aggregate premium paid for in the catastrophe insurance portfolio. Here, Yt denotes the losses of the catastrophe insurance portfolio which are reported after the current time t; but included in the settlement value, ALt is the current amount of catastrophe losses announced by the exchange, and Jt denotes the information available at time t. Cox and Schwebach (1992) further derived the catastrophe futures price by assuming Yt follows a compound Poisson distribution with a intensity parameter Y . The aggregate losses of a catastrophe insurance portfolio would be the sum of a random variable of individual catastrophe losses which are independent and identically distributed. In other words, Yt D X1 C X2 C C XN , where X1 ; X2 ; :::; XN are mutually independent individual catastrophe losses. According to these assumptions, the futures price can be described as follows: Ft D
1 ALt C .T t/ Y p1 ; Q
(48.30)
where p1 represents the first moment of the individual catastrophe loss distribution, i.e., p1 D E .Xi /. Assuming that the loss of a catastrophe insurance portfolio at maturity (i.e., YT ) T is lognormally distributed, that is, the logarithm of AL ALt is normally distributed with mean .T t/ and variance 2 .T t/, the futures price can be described as:
ALt e Ft D Q
.T t /C
2 .T t / 2
:
(48.31)
In the case where ALT is set to be lognormally distributed, Cox and Schwebach (1992) presented the value of a catastrophe futures call option with exercise price x; denoted as CCS , as follows: CCS D
e r.T t / Q
y1 D
2 C 2 .T t / ALt e ˆ .y1 / xQˆ .y2 / ;
C .T t/ C p T t p and y2 D y1 T t :
log
ALt xQ
(48.32) 2 .T t / 2
Cummins and Geman (1995) used two different processes to describe the instantaneous claim processes during the
761
event quarter and the run-off quarter. They argued that the reporting claims by policyholders are continuous and take only a positive value, hence specifying the instantaneous claim to be a geometric Brownian motion during the run-off quarter. Moreover, they added a jump process to the process during the event quarter. Consequently, the two instantaneous claim T and run-off processes during the event quarter t 2 0; 2 quarter t 2 T2 ; T can be respectively specified as follows: T ; for t 2 0; 2 T ;T ; for t 2 2
dct D ct .c dt C c d Wc;t / C Jc dNc;t 0
0 dct D ct c dt C c d Wc;t
where ct denotes the instantaneous claim which means that the amount of claims reported during a small length of time 0 dt is equal to ct dt. Terms c and c represent the mean of the continuous part of the instantaneous claims during the event quarter and run-off quarter, respectively, while c and 0 c represent the standard deviation of the continuous part of the instantaneous claims during the event quarter and run-off quarter, respectively. Term Jc is a positive constant representing the severity of loss jump due to a catastrophe, Nc;t is a Poisson process with intensity c , and Wc;t is a standard Brownian motion. Cummins and Geman (1995) derived a formula to value the futures price at time t as follows: ! T exp˛. 2 t / 1 cs ds C ct Ft D ˛ 0 ! T exp˛. 2 t / ˛ T2 t 1 CJc c ˛2 ! 0 ˛ T2 0 1 ˛ . T2 t / exp Cc0 exp ˛0 ! ! 0 T T exp˛. 2 t / 1 exp˛ 2 1 CJc c ; (48.33) ˛ ˛0 Z
t
0
0
0
where ˛ D c c and ˛ D c c . Here, represents the equilibrium market price of claim level risk and is assumed to be constant over period Œ0; T . Cummins and Geman (1995) also considered catastrophe call spreads written on the catastrophe loss ratio. The payoffs of European call spreads at maturity T , denoted as Cspread .S; k1 ; k2 /, can be written as follows: Cspread .c; k1 ; k2 / ( " D Min Max 100
# ) cs ds k1 ; 0 ; k2 k1 ; (48.34) Qc
RT 0
762
J.-P. Lee and M.-T. Yu
Table 48.3 20/40 European catastrophe call spreads prices s Time to maturity
0.2
0.4
0.6
0 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
3.234 3.192 3.155 3.122 3.095 3.071 3.052 3.035 3.022 3.010 3.002
3.842 3.798 3.760 3.727 3.698 3.674 3.653 3.635 3.620 3.608 3.598
4.421 4.376 4.336 4.301 4.270 4.244 4.221 4.202 4.185 4.171 4.160
Note. All values are calculated assuming the contract with an expected loss ratio of 20%, the risk-free rate r D 5%, c D 0:5; Jc D 0:8, and 0 the parameters ˛; ˛ and to be set at 0.1, 0.1, and 0.15, respectively. Strike prices are also in points. Values are quoted in terms of loss ratio percentage points
Since the instantaneous mean and variance of calendartime futures return, Xt dt and X2 t dt, are linear to random information arrival, the information–time proportional factors, X and X2 , are constant. Substituting Equations (48.36) and (48.37) into Equation (48.35), the parent process in information time can be transferred into a lognormal diffusion process: dXn D X dt C X d WXn : Xn
According to the model, the value of the information-type European catastrophe call option with strike price k, denoted as c .X; n; k/, can be written as follows: c .X; n; k/ D
P1 mD0
ln d1 D
where k1 and k2 are the exercise prices of the catastrophe call spread and k2 > k1 , while Qc is the premiums earned for the event quarter. Since no close-form solution can be obtained, the catastrophe call spreads under alternative combinations of exercise prices can be estimated by Monte Carlo simulation. We report the values of 20/40 call spreads estimated by Cummins and Geman (1995) in Table 48.3 to present the effects of parameter values on the value of catastrophe call spreads. Chang et al. (1996) used the randomized operational time approach to transfer a compound Poisson process to a more tractable pure diffusion process and led to the parsimonious pricing formula of catastrophe call options as a riskneutral Poisson sum of Black’s call prices in information time. Chang et al. (1996) assumed catastrophe futures price changes follow jump subordinated processes in calendartime. The parent process is assumed to be a lognormal diffusion directed by a homogenous Poisson process as follows: dXt D Xt dt C Xt d WXt ; Xt
(48.35)
where Xt and Xt are the stochastic calendar-time instantaneous mean and variance, respectively, and: Xt dt D X d n .t/
(48.36)
X2 t dt D X2 d n .t/ ;
(48.37)
where dn.t/ D 1 dn.t/ D 0
if the jump occurs once in dt with probability jX dt; otherwise with probability 1 jX dt:
(48.38)
.m; jX / B .Xˆ .d1 / kˆ .d2 // ; (48.39)
1 X C X2 m p k 2 p ; d2 D d1 X m; X m jX .T t / Œj
/ m
t X where .m; j / D e is the Poisson probamŠ bility mass function with intensity jx . Moreover, T t is the option’s calendar-time maturity, r is the riskless interest rate, B D e r.T t / is the price of a riskless matching bond with maturity T t, and m denotes the information time maturity index. Chang et al. (1996) followed Barone-Adsei and Whaley (1987) for an analytical approximation of the American extension of the Black formula to get the value of the information–time American catastrophe futures call option with strike price k, denoted as C .X; n; k/, as follows:
C .X; n; k/ D
1 X
.T
.m; jX / CB .X; n; k/ ;
(48.40)
mD0
where 8 rm q ŒXˆ .d1 / kˆ .d2 / C A XX ; ˆ <e where X X ; CB .X; n; k/ D ˆ : X k; where X > X AD
X q
(48.41) 1 Bˆ d1 X ;
ln d1 X D
qD
1C
X X
C 12 X2 m p ; X m
p 2r 1 C 4h ; and h D 2 : 2 X .1 B/
48 Catastrophic Losses and Alternative Risk Transfer Instruments
Table 48.4 The Black .B/ and Information–time .I T / American spread values
763
Maturity (Years) Strike/Strike .k1 =k2 /
Model
0.025
0.05
0.1
0.25
0.5
20/40
B .I T1 / I T30 I T15 I T2 B .I T1 / I T30 I T15 I T2
0.756 0.549 0.424 0.154 0 0 0 0.013
1.067 0.943 0.774 0.303 0 0 0 0.024
1.505 1.503 1.322 0.580 0 0 0 0.053
2.231 2.583 2.369 1.288 0.030 0.056 0.064 0.146
3.048 3.706 3.348 2.140 0.228 0.288 0.281 0.321
40/60 y
Note. All option values are calculated assuming annual volatility X D 60%, annual risk-free interest rate r D 5%, and futures price X D 20. I T30 ; I T15 , and I T2 denote information–time values with annual jump arrival intensities jX D 30; 15; and 2, respectively. B denotes the Black value and is identical to I T1 , the information–time value when jump arrival intensity is infinity. The option is capped at 200
Here, CB .X; n; k/ represents the American extension of the Black formula based on MacMillan (1986) quadratic approximation of the American stock options. Moreover, X is the critical futures price above where the American futures option should be exercised immediately and is determined by solving: X k D e rm ŒXˆ .d1 / kˆ .d2 / C A C
X q
X X
q
1 Bˆ d1 X :
Since a diffusion is a limiting case of a jump subordinated process when the jump arrival intensity approaches infinity and the jump size simultaneously approaches zero, the pricing model of Black (1976) is a special case of Equations (48.39) and (48.40). Table 48.4 reports the values of information–time American catastrophe call spreads estimated by Chang et al. (1996). It shows that the Black formula underprices the spread for the 40/60 case. However, for the 20/40 case, the Black formula overprices when the maturity is short. Chang et al. (1996) noted that the Black formula is a limiting case of information–time formula, so that the largest mispricing occurs when the jump intensity j is set at a low value.
48.5 Reinsurance with CAT-Linked Securities P&Cs traditionally diversify and transfer catastrophe risk through reinsurance arrangements. The objective of catastrophe reinsurance is to provide protection for catastrophe losses that exceed a specified trigger level. Dassios and Jang (2003) priced stop-loss catastrophe reinsurance contracts while using the Cox process to model the claim arrival
process for catastrophes. However, in the case of catastrophic events, reinsurers might not have sufficient capital to cover the losses. Recent studies of the catastrophe reinsurance market have found that these catastrophe events see limited availability of catastrophic reinsurance coverage in the market (Froot 1999, 2001; Harrington and Niehaus 2003). P&C reinsurers can strengthen their ability in providing catastrophe coverages by issuing CAT-linked instruments. For example, Lee and Yu (2007) developed a model to value the catastrophe reinsurance while considering the issuance of CAT bonds. The amount that can be forgiven by CAT bondholders when the trigger level has been pulled, ı, can be specified as follows: ı.C / D FCAT PCAT;T ;
(48.42)
where PCAT;T is the payoffs of the CAT bond at maturity and is specified as follows: PCAT;T D
if C KCAT bond FCAT : rp FCAT if C > KCAT bond
(48.43)
Here, FCAT is the face value of CAT bonds, and C can be the actual catastrophe loss facing the reinsurer (denoted as Ci;T , specified by Equation (48.3)) or a composite catastrophe index (denoted as Cindex;T , specified by Equation (48.12)) which depends on the provision set by the CAT bond. When the contingent debt forgiven by the CAT bond depends on the actual losses, there is no basis risk. When the basis risk exists, the payoffs of the reinsurance contract remain the same except that the contingent savings from the CAT bond, ı.C /; depending on the catastrophic-loss index, become ı.Cindex;T /: Since the debt forgiven by the CAT bond does not depend on the actual loss, the realized losses and savings may not match and may therefore affect the insolvency of
764
J.-P. Lee and M.-T. Yu
the reinsurer and the value of the reinsurance contract in a way that differs from that without basis risk. Here, KCAT bond denotes the trigger level set in the CAT bond provision.
Pb;T
8 ˆ ˆM A ˆ ˆ Ci;T A ˆ ˆ ˆ ˆ < .M A/.ARe;T C ı/ D DRe;T C M A ˆ ˆ .C ˆ ˆ i;T A/.ARe;T C ı/ ˆ ˆ ˆ ˆ DRe;T C Ci;T A : 0
if Ci;T M and ARe;T C ı DRe;T C M A if A CiT < M and ARe;T C ı DRe;T C Ci;T A if CiT M and ARe;T C ı < DRe;T C M A
otherwise;
(48.45)
where ARe and ARe denote respectively the mean and standard deviation of the reinsure’s asset return; ARe is the instantaneous interest rate elasticity of the reinsurer’s assets; CRe;T is the catastrophe loss covered by the reinsurance contract; and M and A are respectively the cap and attachment level arranged in the reinsurance contract. In addition to the liability of providing catastrophe reinsurance coverage, the reinsurer also faces a liability that comes from providing reinsurance coverages for other lines. Since the liability represents the present value of future claims related to the noncatastrophic policies, the value of a reinsurer’s liability, denoted as DRe;t , can be modeled as follows: dDRe;t D .rt C DRe /DRe;t dt C DRe DRe;t drt CDRe DRe;t d WDRe ;t ;
(48.46)
where DRe is the instantaneous interest rate elasticity of the reinsurer’s liabilities. The continuous diffusion process reflects the effects of interest rate changes and other day-to-day small shocks. Term DRe denotes the risk premium for the small shock, and WDRe ;t denotes the day-to-day small shocks that pertain to idiosyncratic shocks to the capital market. In order to incorporate the effect of the interest rate risk on the reinsurer’s assets, the asset value of the reinsurance company is assumed to be governed by the same process as defined in Equation (48.10). Under the term structure assumption of Cox et al. (1985) the rate on line (ROL) or the fairly-priced premium rate can be calculated as follows: ROL D
h RT i 1
E0 e 0 rs ds Pb;T ; M A
(48.44)
if M > CiT A and ARe;T C ı < DRe;T C CiT A
where ARe;T denotes the reinsurer’s asset value at time t, which is assumed to be governed by the following process: dARe;t D ARe dt C ARe drt C ARe d WARe ;t ; ARe;t
In the case where the reinsurer i issues a CAT bond to hedge the catastrophe risk, at maturity the payoffs of the reinsurance contract written by the reinsurer, denoted by Pb;T , can be described as follows:
(48.47)
where ROL is the premium rate per dollar covered by the catastrophe reinsurance; and E0 denotes the expectations taken on the issuing date under risk-neutral pricing measure. Table 48.5 reports ROLs with and without basis risk calculated by Lee and Yu (2007). When the coefficient of correlation between the individual reinsurer’s catastrophe loss and the composite loss index, X , equals 1, no basis risk exists. The lower the c is , the higher the basis risk the reinsurer has. The difference of ROLs for a contract with X D 1 and other alternative values is the basis risk premium. We note that the basis risk drives down the value of the reinsurance contract and the impact magnitude increases with the basis risk, catastrophe intensity, and loss volatility. We also note that the basis risk premium decreases with the trigger level and the reinsurer’s capital position, but increases with catastrophe occurrence intensity and loss volatility.
48.6 Conclusion This study investigates the valuation models for three types of CAT-linked securities: CAT bonds, CAT equity puts, and exchange-traded CAT futures and options. These three new types of securities are capital market innovations which securitize the reinsurance premiums into tradable securities and share the (re)insurers’ catastrophe risk with investors. The study demonstrates how prices of CAT-linked securities can by valued by using a contingent-claim framework and numerical methods via risk-neutral pricing techniques. It begins with introducing a structural model of the CAT bond that incorporates stochastic interest rates and allows for endogenous default risk and shows how its price can be estimated. The model can also evaluate the effect of moral hazard and basis risk related to the CAT bonds. This study then extends the literature by setting up a model for valuing CAT equity puts in which the issuer of the puts is vulnerable. The results show how the values of CAT equity puts change
48 Catastrophic Losses and Alternative Risk Transfer Instruments
765
Table 48.5 Values of reinsurance contracts (ROL) with CAT bonds and basis risk KCAT bond 80 100 c
0:3
0:5
1
0:3
0:5
(; c ; ci ndex )
120 1
0:3
0:5
1
ARe =DRe =1.1
(0.5,0.5,0.5) (0.5,1,1) (0.5,2,2) (1,0.5,0.5)
0:00283 0:01696 0:05335 0.01067
0:00283 0:01698 0:05354 0:01067
0:00283 0:01709 0:05424 0:01607
0:00283 0:01694 0:05327 0:01066
0:00283 0:01695 0:05343 0:01066
0:00283 0:01701 0:05403 0:01066
0:00283 0:01694 0:05326 0:01066
0:00283 0:01695 0:05340 0:01066
0:00283 0:01701 0:05392 0:01066
(1,1,1) (1,2,2) (2,0.5,0.5) (2,1,1) (2,2,2)
0:03989 0:10632 0:04331 0:09952 0:21774
0:03995 0:10663 0:04331 0:09964 0:21825
0:04018 0:10798 0:04337 0:10011 0:22703
0:03987 0:10615 0:04331 0:09931 0:21719
0:03990 0:10640 0:04331 0:09938 0:21767
0:04002 0:10749 0:04333 0:09967 0:21981
0:03978 0:10594 0:04331 0:09925 0:21687
0:03978 0:10620 0:04331 0:09930 0:21732
0:03986 0:10712 0:04331 0:09943 0:21910
0:00295 0:01908 0:06259 0:01123 0:04508 0:12488 0:04667 0:11292 0:24732
0:00295 0:01902 0:06175 0:01123 0:04473 0:12300 0:04663 0:11237 0:24506
0:00295 0:01902 0:06189 0:01123 0:04473 0:12326 0:04663 0:11242 0:24540
0:00295 0:01906 0:06246 0:01123 0:04482 0:12428 0:04663 0:11258 0:24677
0:00295 0:01906 0:06246 0:01123 0:04482 0:12428 0:04663 0:11258 0:24677
0:00296 0:02020 0:06887 0:01130 0:04763 0:13768 0:04734 0:12055 0:27038
0:00296 0:02020 0:06899 0:01130 0:04763 0:13793 0:04734 0:12058 0:27081
0:00296 0:02023 0:06962 0:01130 0:04773 0:13907 0:04734 0:12080 0:27268
ARe =DRe =1.3 (0.5,0.5,0.5) (0.5,1,1) (0.5,2,2) (1,0.5,0.5) (1,1,1) (1,2,2) (2,0.5,0.5) (2,1,1) (2,2,2)
0:00296 0:01904 0:06184 0:01123 0:04479 0:12335 0:04669 0:11265 0:24566
0:00296 0:01905 0:06203 0:01123 0:04484 0:12365 0:04669 0:11277 0:24603
0:00296 0:01918 0:06280 0:01123 0:04510 0:12512 0:04670 0:11337 0:24791
0:00295 0:01902 0:06177 0:01123 0:04492 0:12342 0:04666 0:11249 0:24531
0:00295 0:01902 0:06193 0:01123 0:04494 0:12367 0:04666 0:11257 0:24567 ARe =DRe =1.5
(0.5,0.5,0.5) (0.5,1,1) (0.5,2,2) (1,0.5,0.5) (1,1,1) (1,2,2) (2,0.5,0.5) (2,1,1) (2,2,2)
0:00296 0:02016 0:06887 0:01130 0:04767 0:13795 0:04735 0:12066 0:27106
0:00296 0:02018 0:06905 0:01130 0:04771 0:13824 0:04735 0:12076 0:27153
0:00296 0:02030 0:06988 0:01130 0:04800 0:13983 0:04735 0:12147 0:27417
0:00296 0:02017 0:06882 0:01130 0:04764 0:13796 0:04733 0:12051 0:27066
0:00296 0:02017 0:06897 0:01130 0:04766 0:13820 0:04733 0:12057 0:27110
Note. This table presents ROLs with CAT bond issuance and the payoffs to CAT bonds are linked to a catastrophe loss index. ROLs are calculated and report alternative sets of trigger values (KCAT bond ), catastrophe intensities (), catastrophe loss under volatilities (c ; ci ndex ) and the coefficient of correlation between the reinsurer’s catastrophe loss and the composite loss index (X ). ARe =DRe represents the initial asset-liability structure or capital position of the reinsurers. All estimates are computed using 20,000 simulation runs
with the issuer’s vulnerability and the correlation between the (re)insurer’s individual catastrophe risk and the catastrophe index. Both results indicate that the credit risk and the basis risk are important factors in determining CAT bonds and CAT equity puts. This study also compares several models in valuing CAT futures and options. Though differences exist in alternative models, model prices are within reasonable ranges and similar patterns are observed on price relations with the underlying elements. The hedging effect for a reinsurer issuing CAT bonds is also examined. As long as the threat that natural disasters pose to the financial viability of the P&C industry continues to exist, there
is a need for further innovations on better management of catastrophe risk. The analytical framework in this study in fact provides a platform for future research on catastrophic events with more sophisticated products and contracts in the insurance industry as well as other financial industries. For example, our analysis can be applied to the recent massive losses of derivatives associated with subprime mortgage loans and the vulnerability of the firm that has created these structured mortgage products.7
7
See Watson (2008).
766
References Bantwal, V. J. and H. C. Kunreuther. 2000. “A CAT bond premium puzzle?” Journal of Psychology and Financial Markets 1(1), 76–91. Barone-Adesi, G. and R. E. Whaley. 1987. “Efficient analytic approximation of american option values, Journal of Finance 42(2), 301–320. Black, F. 1976. “The pricing of commodity contracts.” Journal of Financial Economics 3, 167–179. Chang, C., Chang, J. S., and M. -T. Yu. 1996. “Pricing catastrophe insurance futures call spreads: a randomized operational time approach.” Journal of Risk and Insurance 63, 599–617. Cox, S. H. and R. G. Schwebach. 1992. “Insurance futures and hedging insurance price risk.” Journal of Risk and Insurance 59, 628–644. Cox, J., J. Ingersoll, and S. Ross. 1985. “The term structure of interest rates.” Econometrica 53, 385–407. Cox, S. H., J. R. Fairchild, and H. W. Pederson. 2004. “Valuation of structured risk management products.” Insurance: Mathematics and Economics 34, 259–272. Cummins, J. D. and H. Geman. 1995. “Pricing catastrophe futures and call spreads: an arbitrage approach.” Journal of Fixed Income March, 46–57. Dassios, A. and J.-W. Jang. 2003. “Pricing of catastrophe reinsurance and derivatives using the cox process with shot noise intensity.” Finance and Stochastics 7(1), 73–95. Duan, J.-C. and M.-T. Yu. 2005. “Fair insurance guaranty premia in the presence of risk-based capital regulations, stochastic interest rat and catastrophe risk.” Journal of Banking and Finance 29(10), 2435–2454. Froot, K. A. (Ed.). 1999, The Financing of Catastrophe Risk, University of Chicago Press, Chicago. Froot, K. A. 2001. “The market for catastrophe risk: a clinical examination.” Journal of Financial Economics 60(2), 529–571. Harrington, S. E. and G. Niehaus. 2003. “Capital, corporate income taxes, and catastrophe insurance.” Journal of Financial Intermediation 12(4), 365–389.
J.-P. Lee and M.-T. Yu Jaimungal, S. and T. Wang. 2006. “Catastrophe options with stochastic interest rates and compound poisson losses.” Insurance: Mathematics and Economics 38, 469–483. Jarrow, R. and A. Rudd. 1982. “Approximate option valuation for arbitrary stochastic processes.” Journal of Financial Economics 10, 347–369. Lee, J.-P. and M.-T. Yu. 2002. “Pricing default-risky CAT bonds with moral hazard and basis risk.” Journal of Risk and Insurance69(1), 25–44. Lee, J.-P. and M.-T. Yu. 2007. “Valuation of catastrophe reinsurance with CAT bonds.” Insurance: Mathematics and Economics 41, 264–278. Litzenberger, R. H., D. R. Beaglehole, and C. E. Reynolds. 1996. “Assessing catastrophe reinsurance-linked securities as a new asset class.” Journal of Portfolio Management Special Issue , 76–86. Loubergé, H., E. Kellezi, and M. Gilli. 1999. “Using catastrophe-linked securities to diversify insurance risk: a financial analysis of cat bonds.” Journal of Insurance Issues 22, 125–146. MacMillan, L. W. 1986. “Analytic approximation for the American put options.” Advances in Futures and Options Research 1, 119–139. MMC Security. 2007. Market update: the catastrophe bond market at year-end 2006, Guy Carpenter & Company, Inc. Nielsen, J. and K. Sandmann. 1996. “The pricing of Asian options under stochastic interest rate.” Applied Mathematical Finance 3, 209–236. Saunders, A. and L. Allen. 2002. Credit risk measurement: new approaches to value at risk and other paradigms, Wiley, New York. Turnbull, S. and L. Wakeman. 1991. “A quick algorithm for pricing European average options.” Journal of Financial and Quantitative Analysis 26, 377–389. Watson, R. D. 2008. “Subprime mortgages, market impact, and safety nets: the good, the bad, and the ugly.” Review of Pacific Basin of Financial Markets and Policies 11, 465–492. Zajdenweber, D. 1998. The valuation of catastrophe-reinsurance-linked securties, American risk and insurance association meeting, Conference Paper.
Chapter 49
A Real Option Approach to the Comprehensive Analysis of Bank Consolidation Values Chuang-Chang Chang, Pei-Fang Hsieh, and Hung-Neng Lai
Abstract This study applies a modification of the Schwartz and Moon (Financial Analysts Journal 56:62–75, 2000) model to the evaluation of bank consolidation. From our examination of a bank merger case study (the first example of such a bank merger in Taiwan), we find that, from an ex-ante viewpoint, the consolidation value is, on average, about 30% of the original total value of the independent banks. We also find that the probability of bankruptcy was considerably lower following the merger than it would have been prior to the merger. Our case study therefore indicates that the merger was indeed a worthwhile venture for both banks involved. Furthermore, on completion of the merger, we are also able to determine that, in terms of the magnitude of the increased consolidation value, the most crucial roles are played by the resultant changes in the growth rates of the integrated loans and integrated deposits, as well as the cost-saving factors within the cost functions. Keywords A bank merger r The probability of bankruptcy r Consolidation values
49.1 Introduction Despite the growth in domestic financial institutions throughout the 1990s, there were significant declines in both return on equity (ROE) and return on assets (ROA) for Taiwanese domestic banks, which is, of course, one of the warning signs of an over-banking problem. In order to solve this problem, the Taiwanese government promulgated the Financial Institutions Merger Law in November 2000, which would subsequently allow banks to diversify the scope of their business, or to acquire other
C.-C. Chang (), P.-F. Hsieh, and H.-N. Lai Department of Finance, National Central University, Chung Li City, Taiwan, Republic of China e-mail: [email protected]
financial institutions. Thereafter, in June 2001, in an effort to accelerate the reformation of the island’s financial institutions, the Financial Holding Company Law was introduced as a means of simplifying the areas of cross-management and cross-selling through both horizontal and vertical integration. Following the passage of this Law, the island experienced a wave of financial institution consolidations, as the search began for capital efficiency and cost savings. Numerous studies have discussed the efficiency effects of bank mergers, with capital efficiency having been recognized as one of the issues of consolidation efficiency. In their review of over 250 references, Berger et al. (1999) found the evidence to be consistent with the increases in market power secured by certain other types of consolidations, through improvements in both profit efficiency and the diversification of risk. Rhoades (1998) summarized nine case studies of additional cost savings stemming from the elimination of redundant facilities, staff, or even departments, and found that in all nine cases the combined firms were able to successfully achieve their cost-cutting objectives. The literature on consolidation efficiency has focused on an examination of the consequences of consolidation for individual bank performance, including event studies of stock price responses and studies of post-merger performance based on income statement and balance sheet information; this essentially involves a kind of post-consolidation analysis. There have also been a few papers that explore the merger gains from an ex-ante viewpoint, and indeed, from a merger perspective, a discussion of the transaction values of participating banks prior to a merger does indeed play an important role. Moreover, an additional key determinant of a merger decision is the means of evaluating the merger gains in advance, since such means will help to determine whether the decision maker will actually decide to implement the project. In their analysis of a bank’s international expansion options, Panayi and Trigeorgis (1998) undertook an examination of multi-stage real option applications, dividing the expansion strategy into two stages. The first stage of this process involved investments that were being considered by the banks to take place in year 1, while the second stage involved the investment options that were being contemplated
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_49,
767
768
for year 10. The traditional NPV method was used to estimate the investment value, including the two stages of NPV,1 whereas the Black and Scholes (1973) formula was used to measure the growth option value. In the Panayi and Trigeorgis (1998) study, the option model was used to apply a valuation, with the possible gains from bank consolidation then measured, ex ante. However, the use of the Black and Scholes formula still involves a number of restrictions, such as the limited number of parameters available for measuring bank consolidation value. This paper adopts a modified version of the Schwartz and Moon (2000) real-option concept, along with the least squared MonteCarlo simulation approach (which represents an easier and more general method of evaluating various types of investments) to the evaluation of an Internet company. We apply this method to the evaluation of financial institutions and formulate a model for ex-ante evaluation. Our model is implemented as a means of evaluating the first example of bank consolidation in Taiwan, involving the merger between Taishin International Bank and Dah An Commercial Bank. We calculate the individual premerger bank value, alongside the consolidated bank value, which might, theoretically, help to determine the conversion ratio in advance. Conversely, we examine the ratio of the increased value, resulting from the bank consolidation, which might also, theoretically, help to estimate the merger decision in advance. The remainder of this paper is presented as follows. In Sect. 49.2, we set up the model in continuous time from a discrete time approximation for the bank evaluation, followed, in Sect. 49.3, by a demonstration of the way in which our model is implemented. A number of simulations are carried out in Sect. 49.4 in an effort to investigate the key factors determining bank consolidation value. Finally, the conclusions drawn from this study are presented in Sect. 49.5.
49.2 The Model In developing a simple model for the evaluation of a bank’s value, we make a general assumption that the interest revenue earned from loans represents the bank’s major revenue source. We identify the credit scale as the bank’s total assets, and its deposits as the total liabilities. We also regard the cost function as a noninterest operating expense, combining both the fixed and variable components. We can determine the net post-tax cash flow rate from the revenue and cost functions, while also determining the cash available during each time 1
Panayi and Trigeorgis (1998) defined the expanded NPV as the sum of the base-case expanded NPV, with the second-stage opportunity NPV being valued as an expansion option.
C.-C. Chang et al.
interval. Thereafter, by discounting all of the cash available prior to the time horizon, we are able to determine the present value of the bank. We initially describe the model in continuous time, in an effort to develop it more precisely; within the process of implementation, we use the quarterly accounting data available from each of the banks and practice the use of the model in discrete time. In the following subsection, we focus on the development of a continuous, individual (premerger) bank model, along with a bank consolidation model.
49.2.1 The Premerger Bank Model The majority of a bank’s revenue comes from the interest spreads between loans and deposits. Let us assume that a bank’s loan at time t, is given by Lt , and that the dynamics of the bank’s loans are given by the stochastic differential process: dLt L D L (49.1) t dt C t d z1 Lt where L t (the drift) is the expected rate of growth in loans, which is assumed to follow a mean-reversion process with a long-term average drift, L I L is the level of volatility in the loan growth rate, and z1 follows a standard Wiener process. We assume that the bank can earn an abnormal return within an observed period, and that it will converge stochastically to the more reasonable rate of growth for the banking industry as a whole. The mean-reversion process is given as: L L dL N L L t Dk t dt C t d z2 ;
(49.2)
where L t is the level of volatility in the expected loans growth rate. The mean-reversion coefficient k L describes the rate at which such growth is expected to converge to its longterm average. The term, z2 , also follows a standard Wiener process. We further assume that the unanticipated changes in loans converge to a normal level, L , and that the unanticipated changes in the drift converge to zero. Similarly, the bank’s deposits at time t are given by Dt , and the dynamics of the bank’s loans are given by the stochastic differential process: dDt D D D t dt C t d z3 ; Dt
(49.3)
where D t is the expected rate of growth in deposits, which is also assumed to follow a mean-reversion process with a longterm average drift, D I D represents the level of volatility in the growth rate of deposits, and z3 is a standard Wiener process.
49 A Real Option Approach to the Comprehensive Analysis of Bank Consolidation Values
The expected growth rate in deposits also converges to the long-term rate, as follows: D D N D D dD t D k t dt C t d z4
(49.4)
where D t is the level of volatility of the expected growth rate in deposits; the mean-reversion coefficient, k D , describes the rate at which the growth is expected to converge to its longterm average, and the term z4 follows a standard Wiener process. The unanticipated changes in deposits are also assumed to converge to D , while the unanticipated changes in the expected growth rate in deposits are assumed to converge to zero. The interest spreads relate to the changes in the interest rates of both loans and deposits. Although there are various kinds of deposit rates, we use the average as a proxy. The loans and deposits rates follow two stochastic processes within which we define loan interest as a spread, St , above the average deposits rate, rt , with both of these following the square-root process of Cox, Ingersoll, and Ross (hereafter known as CIR) (1985): drt D a .b rt / dt C t
p
rt d z5
p dSt D aS b S St dt C tS St d z6 ;
(49.5) (49.6)
where ˛ and ˛ s are the reversion speed parameters; b and b s are the values towards which the interest rates revert over time; t and ts are the standard deviation, and z5 and z6 follow standard Wiener processes. We define the net interest income, Rt , as the total amount of the interest spreads: Rt D Lt .rt C St / Dt rt :
(49.7)
In defining all of the parameters for net interest income, there are six standard Wiener processes, z1 ; z2 ; z3 ; z4 ; z5 , and z6 , each of which is instantaneously and mutually correlated (see Appendix 49A). Furthermore, the noninterest operating expenses, Ot , are defined with both fixed and variable components. The fixed costs are the costs of buildings and operating equipment, which are assumed to remain unchanged within the estimation horizon; the variable costs are defined as the nonperforming loans (NPLs), which represent a proportion of the total amount of all bank loans: dOt D Ft C ˛Lt ; Ot where ˛ is the percentage of loans.
(49.8)
769
Given the total revenue and costs of the bank, we can define the net post-tax cash flow rate of Yt , which is given by: Yt D .Rt Ot / .1 t / ;
(49.9)
where t is the corporate tax rate. Finally, the bank is assumed to have a certain amount of cash available, Xt , which evolves according to: dXt D Yt dt:
(49.10)
Schwartz and Moon (2000) assumed that a company goes into bankruptcy when its available cash level reaches zero; the Schwartz and Moon definition is adopted in this study as the base valuation. If a bank runs out of cash, we do not take into account the possibility of additional financing, even where the bank may be sufficiently attractive as to be able to raise cash. In order to simplify the model, we assume that all cash flow remains within the bank, where it earns a risk-free interest rate, and that it will be available for distribution to the shareholders at the time horizon. Finally, the objective of our model is to determine the total values of the individual banks, and indeed, Kaplan and Ruback (1995) provided empirical evidence to show that discounted cash flows provide a very reliable estimate of market value. Our estimation of the value of a bank is undertaken by calculating the discounted value at the risk-free measure: V0 D EQ ŒXT C M .RT OT / e rT : (49.11) where the term EQ is the equivalent martingale measure, and e rT is the continuously compounded discount factor. We assume, in Equation (49.11), that bank value is similar to a European option. At the time horizon, the bank has two amounts of cash available; that is, the available cash accumulated to time horizon, and the terminal value of the bank. Schwartz and Moon (2000) set the terminal value at the time horizon as a multiple, M , (e.g., ten times) of the net post-tax cash flow rate. According to Brennan and Schwartz (1982), the measured variable should be converted so as to become risk-neutral; given that, in the real world, we use a risk-neutral measure (see Appendix 49B). In this model the value of a bank is a function of the state variables (loans, expected loan growth, deposits, expected deposit growth, interest rates, interest spreads, and cash balances) and time. This function can be written as follows: V V L; L ; D; D ; r; S; X; t :
(49.12)
770
C.-C. Chang et al.
49.2.2 The Bank Consolidation Model Based on the study of Berger et al. (1999), we assume that the primary motive for consolidation is the maximization of shareholder value; all actions taken by the banks would therefore be aimed towards the maximization of the value of all shares owned by the existing shareholders. In this study, we design the consolidation model to focus on increasing efficiency. First of all, we discuss whether increasing the scale of loans and deposits could assist in both achieving economies of scale and improving the value of the existing bank. Thereafter, we go on to discuss the changes in the cost structure brought about as a result of consolidation. Assuming there is an existing bank after consolidation, the bank loans of bank M at time t are given by LM t , while the dynamics of the integrated loans are given by the stochastic differential equation, similar to that discussed above. However, the growth rate of the integrated loans processes needs to be redefined, since the loan sources of each of the banks may well overlap, or have some level of correlation. Let us assume, for example, that bank A merges with bank B; in order to consider the correlation between the two banks, we assume that there will be an instantaneous correlation, L , between the two Wiener processes. This is implemented by transformation to the new variables, tA and tB , as follows: d
A t
M LA LA LA D uLA t t dt C t t d z1
(49.13)
d
B t
M LB LB LB D uLB t t dt C t t d z2
(49.14)
M d zM 1 d z2 D L dt; M
(49.15)
M
where dz1 and dz2 are the standard Wiener processes of the new variables tA and tB , having a correlation, L . We define the loan growth rate of the consolidated bank t LM as the value-weighted average of the new growth rate processes of each of the banks: D ktA d dLM t ktA D
LA 0 B LA 0 C L0
A t
C ktB d
ktB D
B t
LB0 B LA 0 C L0
(49.16) ;
where ktA and ktB are the value-weighted initial loan amounts for each of the banks. The dynamics of the stochastic differential equation of the integrated loans is given as follows: dLM t D LM dt C tLM d zM t 3 ; LM t
(49.17)
where t LM represents the integrated loans growth rate, which is defined above, and tLM is the volatility of the integrated loans growth rate.
We define the integrated deposits as Dt M . We should also similarly define each of the bank’s expected deposit growth rates. The dynamics of the integrated deposits stochastic differential equation is as follows: dDM t D DM dt C tDM d zM t 4 ; DtM
(49.18)
where t DM is the growth rate of the integrated deposits, and tDM is the volatility of the growth rate of the integrated deposits. The interest rate models continue to follow the squareroot process of CIR (1985), and we use the existing bank’s average deposit rate and interest spread for the integrated bank’s average deposit rate, rtM , and interest spread, StM . The net interest income of the consolidated bank, RtM , is the difference between the interest earned from loans, and the interest paid to depositors: M M DtM rtM : RtM D LM t rt C St
(49.19)
Similarly, the six Wiener processes, the integrated bank’s loans, deposits, loan growth, deposit growth, deposit rate, and interest spread, may be mutually relative, as discussed earlier in the premerger model (see Appendix 49A). Rhoades (1998) found that in some cases of bank consolidation, there was some existence of cost efficiency, and in our study, we assume that there are cost reduction opportunities within the cost function of the consolidated bank: MF
FtA C FtB C˛tMO LM OtM D FtM C˛tMO LM t D ˛ t ; (49.20) where ˛tMF are the fixed cost saving factors (or the economies of scale factor), and ˛tMO are the variable cost saving factors. Basically, there is a cost saving, or scale economy, motive for bank consolidation if the magnitude of ˛tMF is less than unity. The variable cost may produce a cost-saving opportunity either through increased loans scale, or a reduction in NPLs. We measure the effect of the variable cost savings by comparing the variable cost-saving factor with the rate of the value-weighted variable cost of each participating bank. The rate of post-tax net cash flow into the integrated bank, M Yt , the amount of cash available, XtM , and the estimation of the integrated bank, V0 M , are all the same as those in the premerger bank model. To reiterate, the integrated bank value is a function of the state variables (integrated loans, loan growth rate, integrated deposits, deposit growth rate, interest rate, interest spread and cash balances) and time. This can be shown as: V M V M LM ; LM ; D M ; DM ; r M ; S M ; X M ; t : (49.21)
49 A Real Option Approach to the Comprehensive Analysis of Bank Consolidation Values
The objectives of our study are to determine whether the effects of the bank merger are positive, and whether or not the merger achieves the goal of maximizing share value. If the consolidation value is greater than the summed-up value of the individual banks, then the bank merger does have synergy. We measure the increased value ratio ”, as: Increased bank value ratio . / D
V0M V0A V0B : V0A C V0B (49.22)
The model developed in the continuous time model, above, is path dependent, and such path dependencies can easily be taken into account by using the Monte Carlo simulation to solve the evaluation. In order to implement the simulation, we use the discrete version of the risk-adjusted process (see Appendix 49C).
49.3 Case Study In order to demonstrate the implementation of our model, we use, as an example, the first Taiwanese bank merger case, Taishin International Bank and Dah An Commercial Bank.
49.3.1 An Introduction to the Case Study The Financial Holding Company (FHC) Law, which was introduced in June 2001, brought about significant merger activity, beginning with the merger of Hua Nan Bank, Taiwan Bank, and First Bank on December 19, 2001. Thereafter, between 2002 and the first quarter of 2003, a total of 14 financial holding companies had been established in Taiwan, most of which comprised commercial banks and securities and insurance companies. No commercial bank mergers had taken place before the first quarter of 2002. This section introduces the first case of a commercial bank merger in Taiwan and discusses the value created by the merger, according to our model. Taishin International Bank and Dah An Bank held an extraordinary shareholder meeting in December 2001, at which they agreed to establish Taishin Financial Holding Company (FHC), which would comprise Dah An Bank, Taiwan Securities Company, and the Taishin Bills Finance Corporation. Within the planned FHC, Taishin International Bank offered shares, essentially for the takeover of Dah An Bank. Following the merger, Taishin International Bank was to be the surviving bank, under a share-exchange agreement of one share in Taishin Bank for two shares in Dah An Bank. As a result of the merger, the assets of Taishin FHC increased from NT$42 billion to NT$30 billion. In estimating
771
the effects of the merger, it was expected that Taishin would be able to provide one-stop financial services to more than 3.5 million customers at its 133 branch offices (up from the previous 88) with 4,500 staff members. The company noted that, from 2002 to 2004, the consolidation was expected to lead to cost reductions estimated at between NT$600 and NT$700 million per year. The new company’s share of the loans market was predicted to reach 2.5%, while its nonperforming loan (NPL) ratio was expected to be reduced to 2.24% in 2002, giving the new company post-tax income of almost NT$5 billion. Taishin International Bank Chairman, Thomas Wu, estimated that the merger with Dah An Commercial Bank would create earnings of NT$1.60 per share by the end of 2002, an increase of 201% on 2001 earnings of NT$0.53 per share. The first stage in the formation of the FHC – the merger between Taishin International Bank and Dah An Commercial Bank – was presented to the Ministry of Finance, under plans to be completed in February 2002. The second stage was the plan to bring Taiwan Securities Company and Taishin Bills Finance Corporation into the FHC in the second quarter of 2002, once the share swap ratio had been negotiated at their June shareholder meeting. This merger, which was to be the first example of consolidation of domestic commercial banks in Taiwan, involved a very complex process of share exchanges, the elimination of the acquired bank, and the offering up of the existing bank. Hence, it was important to be able to accurately estimate the value of each bank in order to be able to handle the various problems. In this study, we are also keen to be able to determine the value created by the merger, from an ex-ante perspective.
49.3.2 Estimating the Parameters The model described above requires a considerable number of parameters to facilitate the calculation of overall bank value, and since Taishin International Bank applied for the merger in January 2001, with the expectation that it would complete it before February 2002, we set out in this study to estimate the bank value based on the third quarter of 2001, using the financial reports from the first quarter of 1997 to the third quarter of 2001 as the historical data.
49.3.2.1 The Premerger Parameters The implementation of the first part of the model requires more than 50 parameters, some of which are easily observed from each bank’s quarterly financial reports. However, some of the parameters, those which require the use
772
C.-C. Chang et al.
Table 49.1 Taishin International Bank, quarterly sales and costs, March 1997–September 2001 (unit NT$ 10 million)
Date
Loans
Deposits
Interest income
March 1997 June 1997 September 1997 December 1997 March 1998 June 1998 September 1998 December 1998 March 1999 June 1999 September 1999 December 1999 March 2000 June 2000 September 2000 December 2000 March 2001 June 2001 September 2001
92.453 103.432 115.105 130.221 145.511 161.354 161.134 163.441 161.889 162.930 165.610 176.137 175.961 176.166 180.174 185.694 183.183 182.630 180.272
95.657 96.726 112.576 148.140 166.797 175.910 186.273 197.166 198.380 200.010 195.101 197.314 210.578 214.840 219.554 224.619 231.224 232.996 229.595
2:508 4:750 8:217 10:841 4:043 7:648 12:996 16:245 4:820 8:756 14:465 17:722 5:008 9:372 15:507 19:226 5:420 9:792 15:941
Interest expenses 1:595 3:214 5:100 7:359 2:703 5:620 8:694 11:799 3:045 5:868 8:545 11:074 2:678 5:396 8:237 11:169 2:907 5:611 8:173
of judgment, can be determined either by the management’s estimated forecasts, or through investigation by various analysts. Finally, we determine those parameters that are difficult to either observe or estimate, by direct consultation with previous references. Tables 49.1 and 49.2 present the basic data on each bank, including quarterly loans, deposits, interest income, and interest expenses for the last 19 quarters. In Table 49.3, we use this basic data to determine the initial loans, deposits, fixed component of expenses, and the cash balance available for each bank. We take the average growth rate in loans and deposits over the last two quarters (March and June 2000) for the initial expected loan and deposit growth rates, while the expectations of Morgan Stanley analysts are used for the growth rates over the next four quarters. An average loans growth rate of 3.3% was expected for the Taiwanese financial industry in 2002. When estimating Taishin Bank’s loans growth rate, we can see that there was a high growth rate in nonmortgage consumer loans (24.5% of total loans), which was expected to continue growing in 2002. We forecast that its loans growth rate would be 7% per quarter. Dah An Bank, on the other hand, did not have any additional information on its loan growth rate, so this was estimated at 3%, based on analysts’ expectations. The growth rate in deposits has a high correlation with the currency rate and foreign interest rate, particularly with regard to the U.S. interest rate; hence, the currency remained under pressure during the continued decline of the U.S. interest rate in 2001. We forecast that the growth rate in deposits
Non-performing loan expenses
Operating expenses
Gross profit
Operating profit before taxes (EBITDA)
0.145 0.400 0.620 0.863 0.146 0.255 0.479 0.816 0.463 1.084 1.766 2.673 1.023 1.584 2.468 2.990 1.014 2.249 4.141
0.710 1.440 2.253 3.081 0.959 2.019 3.175 4.374 1.185 2.428 3.645 4.903 1.271 2.665 4.142 5.687 1.562 3.163 4.867
0.375 0.825 1.150 1.505 0.547 0.910 1.357 1.672 0.467 1.041 1.556 2.175 0.779 1.738 2.201 2.299 0.715 1.169 0.849
0.406 0.887 1.241 1.621 0.588 1.001 1.469 1.810 0.500 1.105 1.633 2.252 0.834 1.853 2.374 2.507 0.820 1.421 1.143
would be unchanged in 2002. Taishin Bank’s deposits growth rate was set at 6%, which was the same as its historical growth rate record, while the rate for Dah An Bank was set at 4%. The initial volatility of loans and deposits are the standard deviations of past changes in loans and deposits, with the initial volatility of the expected growth rate in loans and deposits being inferred from the observed stock price volatility. For the other parameters generated from the stochastic processes of loans and deposits, such as the long-term growth rates in loans and deposits, and the long-term volatility of loans, we use the industry average loans and deposits growth rate (2%) and the standard deviations (3%) as proxies. Finally, the mean-reversion coefficients are 7%.2 The cost parameters are classified as either fixed costs or variable costs, each of which is observed from the income statements of each bank. We find that the two main items of each banks’ expenses were financial activity expenses, and operating expenses. The financial activity expenses include interest expenses, NPL expenses and procedure fees; in our study we focus on a discussion of the net cash flow from the interest spread, so we do not take account of the procedure fees arising from other activities. The interest expense is considered as before. We also define two other items of cost, the first of which is the NPL expenses arising from bad loans, defining NPLs
2
All of the parameters are defined as in Schwartz and Moon (2000).
49 A Real Option Approach to the Comprehensive Analysis of Bank Consolidation Values
773
Table 49.2 Dah An Commercial Bank, quarterly sales and costs, March 1997–September 2001 (unit NT$ 10 million)
Date
Loans
Deposits
Interest income
March 1997 June 1997 September 1997 December 1997 March 1998 June 1998 September 1998 December 1998 March 1999 June 1999 September 1999 December 1999 March 2000 June 2000 September 2000 December 2000 March 2001 June 2001 September 2001
82:961 91:784 101:594 114:352 115:389 126:430 135:219 136:495 138:728 147:936 147:323 146:241 149:138 159:936 166:294 163:439 164:203 168:742 167:280
91:379 89:857 97:145 115:943 115:083 130:748 128:233 142:245 147:075 149:782 149:295 153:316 164:056 164:273 184:955 191:037 180:183 182:264 178:245
1:834 3:753 5:929 8:488 2:640 5:347 8:270 11:311 3:006 5:926 8:824 11:792 2:938 5:954 9:192 12:235 3:157 5:991 8:783
Interest expenses
Non-performing loan expenses
Operating expenses
1.270 2.583 4.012 5.764 1.904 3.877 5.986 8.150 2.159 4.273 6.324 8.270 2.040 4.196 6.565 8.717 2.374 4.562 6.567
0.101 0.116 0.324 0.623 0.065 0.113 0.228 0.909 0.284 0.645 1.001 5.235 0.319 0.398 0.493 0.920 0.593 1.008 1.703
0.434 0.891 1.366 1.917 0.527 1.098 1.685 2.286 0.604 1.218 1.719 2.204 0.536 1.095 1.668 2.261 0.560 1.137 1.703
Gross profit 0:344 0:755 1:169 1:480 0:473 0:889 1:336 1:102 0:324 0:552 0:813 2:572 0:571 1:224 1:732 0:400 0:343 0:677 0:441
Operating profit before taxes (EBITDA) 0:347 0:760 1:174 1:430 0:472 0:888 1:336 1:100 0:329 0:547 0:808 2:572 0:583 1:295 1:825 0:549 0:371 0:714 0:488
Table 49.3 Parameters used in the premerger base evaluation Initial estimations Parameters
Taishin International Bank
Dah An Commercial Bank
Loan parameters Initial loans L0 Initial expected loans growth rate (%) L 0 Initial volatility of loans (%) 0L Initial expected volatility of loans growth rate (%) L 0 Long-term loans growth rate (%) NL Long-term volatility of loans growth rate (%) N L
NT$1,800 million/quarter 0.07/quarter 0.07/quarter 0.2082/quarter 0.02/quarter 0.03/quarter
NT$1,670 million/quarter 0.03/quarter 0.05/quarter 0.1882/quarter 0.02/quarter 0.03/quarter
Deposit parameters Initial Deposits D0 Initial expected deposits growth rate (%) D 0 Initial volatility of deposits (%) 0D Initial expected volatility of deposits growth rate (%) D 0 Long-term deposits growth rate (%) ND Long-term volatility of deposits growth rate (%) N D
NT$2,290 million/quarter 0.06/quarter 0.08/quarter 0.2082/quarter 0.02/quarter 0.03/quarter
NT$1,780 million/quarter 0.04/quarter 0.07/quarter 0.1882/quarter 0.02/quarter 0.03/quarter
Cost parameters ˛ Variable component of expenses (as % of loans) F Fixed component of expenses
0.02/quarter NT$48 million/quarter
0.01/quarter NT$17 million/quarter
Other parameters Initial cash balance available X0
t Time increment for the discrete version of the model T Estimation horizon
NT$160 million/quarter one quarter 10 years
NT$220 million/quarter one quarter 10 years
774
C.-C. Chang et al.
as a variable cost, which is a constant percentage of all loans. Taishin Bank’s NPL ratio was expected to fall to about 2% in 2002, so we assume its variable component of expenses to be 2% of all loans. Since the scale of loans at Dah An Bank was smaller, and the economic conditions looked to be better for the subsequent year, we assume its variable expenses component to be 1% of all loans. The second additional cost is operating expenses, which includes selling, general and administrative (SG&A) expenses, and other expenses, such as R&D expenses, training expenses and rental expenses and define these as a combined fixed expenses component. Net interest income is the difference between interest earned from loans and interest paid to depositors, and clearly, the change in interest rates on loans and deposits is a very important factor in estimating bank value. We use the average deposit rate as the initial interest rate for deposits and take the difference between the average lending rate and the deposit rate as the interest spread. These are acquired from the Taiwan Economic Journal (TEJ). As regards the setting up of the other interest rate parameters, we refer to Chen and Scott (1994). Other parameters that should be identified in the discrete model include the tax rate, which is the average rate of tax from the first quarter of 1997 to the third quarter of 2001;
these figures are available from the TEJ. Ten years was taken as the estimation horizon, and one quarter as the time increment. The correlation of each parameter was taken for the estimation of the parameters of each Wiener process, which was obtained from the historical data covering the period from the first quarter of 1997 to the third quarter of 2001.
49.3.2.2 The Consolidation Parameters A total of 39 parameters were required for the implementation of the second part of the model. As Table 49.4 shows, bank consolidation increases the scale of assets, and if we assume that all other things are equal, the initial integrated loans will be the sum of the two banks’ loans. In similar fashion, the initial deposits, following consolidation, will be equal to the sum of the two banks’ deposits. The drift and volatility terms of the loan and deposit stochastic processes are extremely important factors in the simulation of consolidation value; thus, we take the initial expected growth rate in loans and deposits as the value-weighted average of each bank’s initial expected loans and deposits growth rates. As in the previous section, in order to ensure that our model is more realistic, we consider the correlation between each bank, which involves the expected loans and deposits
Table 49.4 Parameters used in the consolidation base evaluation Parameters
Initial estimation
Loans parameters Initial Integrated Loans LM 0 Initial expected integrated loans growth rate (%) LM 0 Initial integrated loans volatility (%) 0LM Long-term integrated loans growth rate (%) N LM Long-term volatility of integrated loans growth rate (%) N LM Correlation of expected loans growth rate of each bank (%) L
NT$3,470 million/quarter 0.05/quarter 0.07/quarter 0.02/quarter 0.03/quarter 0.58/quarter
Deposits parameters Initial Integrated Deposits D0M Initial expected growth rate in integrated deposits (%) DM 0 Initial volatility of integrated deposits (%) 0DM Long-term integrated deposits growth rate (%) N DM Long-term volatility of integrated deposits growth rate (%) N DM Correlation of each bank’s expected rate of growth in deposits (%) D
NT$407 million/quarter 0.05/quarter 0.08/quarter 0.02/quarter 0.03/quarter 0.62/quarter
Cost parameters Variable expense cost saving factors ˛ MO Fixed expense cost saving factors ˛ MF Integrated fixed component of expenses FM
0.01/quarter 0.09/quarter NT$650 million/quarter
Other parameters Initial cash balance available X0M
t Time increment for the discrete version of the model T Estimation horizon Tax rate (%) c Risk-free interest rate (%) rf
NT$380 million/quarter one quarter 10 years 0.18/year 0.05/year
49 A Real Option Approach to the Comprehensive Analysis of Bank Consolidation Values
Table 49.5 Summary of the simulation results
775
Pre-merger value Taishin Bank (NT$ bn)
Dah An Bank (NT$ bn)
Consolidation value (NT$ bn)
Ratio of value increase (%)
Schwartz and Moon (2000)
369.747
320.760
872.658
26.38
Longstaff and Schwartz (2001)
339.003
297.714
821.418
29.01
Method
growth rates. A correlation exists between each of the banks’ expected loans growth rate, with the initial value being assumed as the correlation between the two bank’s historical loans growth rates, in similar fashion to the correlation between each bank’s expected deposits growth rate. An additional major issue, with regard to the overall gains stemming from a merger, is the overall improvement in efficiency, with such efficiency gains coming from cost reductions. In our study, we assume that there are two distinct opportunities for cost savings: fixed cost saving factors and variable cost saving factors. The fixed cost saving factors are those arising from the reduction in the integrated bank’s fixed costs, vis-à-vis the sum of both banks prior to the merger. Analysts’ expectations, which were used for the initial fixed cost saving factors, estimated that the savings on operating expenses following the merger would be NT$3 billion to NT$4 billion; thus, the initial fixed cost saving factor was set at 0.09 per quarter. As was defined in the previous section, the variable cost saving factor is NPL expenses. Taishin FHC’s NPL ratio was expected to fall to 2.24% from 2.7% in 2002; hence, we assume the initial variable cost saving factor to be 0.01 per quarter. If we do not assume that an integrated bank that increases its market concentration may also increase its market power from setting prices on retail services,3 then we can assume that its initial deposit rate and interest spread would remain the same following the merger. Thus, the amount of cash available to the integrated bank would be NT$380 million, the sum of the initial available cash of the two banks.
49.4 Results After estimating the parameters, we use 100,000 Monte Carlo simulations to value the premerger and consolidated values, with the simulation results as follows:
49.4.1 The Fair Transaction Value of the Banks To simulate the value of each bank, we need to determine the bankruptcy assumption, first by using the Schwartz and Moon (2000) definition that a company will go into bankruptcy when the amount of available cash reaches zero. We then take the Longstaff and Schwartz (2001) “leastsquare method” (LSM) to determine the optimal stop point for each simulation path and to calculate the value of the bank. According to the LSM approach, it is more reasonable to determine the definition of a valuation stopping point as being that point where the conditional expected value is less than the cash available at T- t. We could regard the conditional expected value fitting under LSM as the expected continuing value of the bank at T- t., and then compare this to the actual cash available at the same time. Table 49.5 summarizes the simulation results of each method. Under the Schwartz and Moon base method, Taishin Bank’s value was NT$338.358 billion, while that of Dah An Bank was NT$315.542 billion. Since our intention here is to gain an accurate premerger perspective, our main concern is the increase in the value ratio; under this method, the consolidation value increase was 26.38%. Conversely, the LSM approach shows that Taishin Bank’s value was NT$339.003 billion, versus Dah An Bank’s NT$297.714 billion. Although each bank’s value is much the same under the LSM approach as it is under the base method, the increased value is, nevertheless, more significant. We relax the restriction that the cash available stochastic process may reach zero by simulation; therefore, the bank could not be seen as bankrupt if its expected continuing value was still greater than zero. Table 49.5 also clearly indicates that the ratio of the value increase is much more conspicuous under the LSM approach.
49.4.2 Bankruptcy and Stopping Points 3
Berger et al. (1999) found that mergers and acquisitions among institutions that have significant local market overlap, ex ante, may increase local market concentration and allow the consolidated firm to raise profits by setting less favorable prices to customers. This may affect rates and fees on retail deposits and small business loans.
Table 49.6 shows the base valuation method, following Schwartz and Moon (2000), in which the bankruptcy condition is defined as the point at which the cash available at each
776
C.-C. Chang et al.
Table 49.6 Probability of bankruptcy per year, for base valuation (unit: %) Year Taishin Bank Dah An Bank Consolidation 1 2 3 4 5 6 7 8 9 10 Total
1.17 11.61 9.66 6.48 4.68 3.92 3.05 2.68 2.25 0.00 45.50
0.00 1.01 5.78 6.53 5.33 4.31 3.51 3.04 2.38 0.00 31.89
0.00 0.02 0.11 0.66 1.40 2.13 2.22 2.58 2.41 0.00 11.53
Table 49.7 Probability of stop points per year, by LSM approach (unit: %) Year Taishin Bank Dah An Bank Consolidation 1 2 3 4 5 6 7 8 9 10 Total
2.51 6.35 9.87 7.45 5.91 4.69 5.06 6.52 8.38 0.00 56.74
2.14 5.81 7.85 6.58 5.35 5.72 6.41 8.56 10.25 0.00 58.67
0.45 2.28 4.59 7.46 8.57 8.62 9.32 11.27 10.83 0.00 63.39
discrete time reaches zero. Note that in the beginning, the bankruptcy points per year for Taishin International Bank are greater than those for Dah An Commercial Bank, particularly in Year 2; however, the probability of bankruptcy for Taishin International Bank decreases rapidly after Year 4, when it becomes less than the probability of bankruptcy for Dah An Commercial Bank. The consolidated bankruptcy points start in Year 3, and by Year 8 the probability of bankruptcy for the consolidated bank is much lower than that of each individual bank. Table 49.7 provides details of the simulated distributions of the banks’ stopping points under the LSM approach, which is developed to determine the stop points for each simulation path, as discussed above. Note that the optimal stopping points for each bank, for each year, are closer than those under the base valuation. We also find that the total number of stopping probabilities for Dah An Commercial Bank was higher than that for Taishin International Bank in these simulation results. This result is more likely to accord with the real-world situation.
49.4.3 Sensitivity Analysis In this section we perform a sensitivity analysis to the most critical parameters for the total value of the consolidated bank – obtaining the bank values by use of a perturbation (a 10% higher value) for the indicated parameter, while all other parameters remain the same as in the base valuation of the consolidated bank value. The sensitivity analysis results, by base valuation, are reported in Table 49.8, where it is shown that three sets of parameters have a significant effect on the value of the bank. The first of these is the parameter for the fixed and variable cost components. Although the equations indicate that the two components have the same effect on the cost function, in the simulation results the variable component increases by 1%, while the consolidated bank value decreases by 1.73%, which was more significant than the 1% increase in the fixed component (Table 49.8). Second, an increase in the initial integrated loans and deposits growth rate, from 5% per quarter to 5.5% per quarter (a 10% increase) is also significant in the consolidated bank valuations. We find that the increase in the initial loans growth rate has a positive relationship with the valuation of the consolidated bank. Conversely, a negative relationship exists between the growth rate in deposits and the consolidated bank, because an increase in the deposits growth rate would lead to a rise in interest expenditure if the loans growth rate was unchanged. The distribution of loans and deposits growth rate variance is also important in the valuation because it determines the option value of growth, implying a higher probability of growth opportunities (Table 49.8). We find that such an increase in the standard deviation of both loans and deposits would lead to a rise in the value of the consolidated bank.
Table 49.8 Sensitivity of consolidation bank value meters Total Value of consolidation perturbed bank value parameter .NT$ 1;000/ Parameter (per quarter)
to change para-
Base case LM 0 DM 0 0LM 0DM LM DM LM DM k1 k2 ˛ MO ˛ MF
–
– 0.055 0.055 0.077 0.088 0.022 0.022 0.033 0.033 0.077 0.077 0.011 0.099
868,031,438 913,308,472 846,838,917 875,041,757 868,454,744 873,199,259 865,663,870 868,370,641 868,218,622 867,598,615 867,760,803 853,037,257 866,134,239
Increase/ decrease on base case (%) 5:22 2:44 0:81 0:05 0:60 0:27 0:04 0:02 0:05 0:03 1:73 0:22
49 A Real Option Approach to the Comprehensive Analysis of Bank Consolidation Values
Table 49.9 Sensitivity of consolidation bank value to change parameters, by LSM Approach Total Value of consolidation perturbed Increase/decrease bank value parameter on base case (%) .NT$ 1; 000/ (per quarter) Parameter Base case LM 0 DM 0 0LM 0DM LM DM LM DM k1 k2 ˛ MO ˛ MF
– 0.055 0.055 0.077 0.088 0.022 0.022 0.033 0.033 0.077 0.077 0.011 0.099
817,088,184 858,654,309 798,235,081 823,142,437 817,259,208 821,538,806 815,200,412 817,412,390 817,321,305 817,060,177 816,829,042 802,662,521 815,223,433
– 5:09 2:31 0:74 0:02 0:54 0:23 0:04 0:03 0:00 0:03 1:77 0:23
The third parameter involves an issue that is not so obvious, since the set of parameters that are generated from the mean-reversion process have an effect on the value of the consolidated bank. An increase in the mean-reversion coefficient in integrated loans and deposits, k1 and k2 (as in Table 49.8), would lead to a decline in the value of the consolidated bank, while an increase in the long-term growth rates of the integrated loans would lead to an increase in the overall value of the consolidated bank, which runs contrary to the increase in long-term growth rates of the integrated deposits. Table 49.9 shows the sensitivity of the total value of the consolidated bank under the LSM approach. We find that the results are similar to those in the earlier discussion.
49.5 Conclusions In this study, we apply a modification of the Schwartz and Moon (2000) model to the evaluation of bank consolidation. From our examination of a bank merger case study (the first example of such a bank merger in Taiwan), we find that, from an ex-ante viewpoint, the consolidation value is, on average, about 30% of the original total value of the independent
777
banks. We also find that the probability of bankruptcy was considerably lower following the merger than it would have been prior to the merger. Our case study therefore indicates that the merger was indeed a worthwhile venture for both banks involved. Furthermore, on completion of the merger, we also find that, in terms of the magnitude of the increased consolidation value, the most critical roles are played by the resultant changes in the growth rates of the integrated loans and the integrated deposits, as well as the cost-saving factors within the cost functions. Acknowledgments The earlier version of this paper was presented at National Central University, National Taiwan University. We are especially grateful for Professor Pin-Hung Chou’s and San-Lin Chung’s helpful comments.
References Berger, A. N., R. S. Demsetz, and P. E. Strahan. 1999. “The consolidation of the financial services industry: causes, consequences and implications for the future.” Journal of Banking and Finance 23(2–4), 135–194. Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–659. Boyle, P. 1997. “Options: a monte carlo approach.” Journal of Financial Economics 4, 323–338. Brennan, M. J. and E. S. Schwartz. 1982. “Consistent regulatory policy under uncertainty.” Bell Journal of Economics 13(2), 507–521. Cox, J. C., J. E. Ingersoll, and S. A. Ross. 1985. “A theory of term structure of interest rate.” Econometrica 53, 385–407. Longstaff, F. A. and E. Schwartz. 2001. “Valuing American options by simulation: a simple least-squares approach.” Review of Financial Studies 14(1), 113–147. Kaplan, S. and R. Ruback. 1995. “The valuation of cash flow forecasts: an empirical analysis.” Journal of Finance 50(4), 1059–1093. Panayi, S. and L. Trigeorgis. 1998. “Multi-stage real options: the cases of information technology infrastructure and international bank expansion.” Quarterly Review of Economics and Finance 38 (Special Issue), 675–692. Rhoades, S. A. 1993. “Efficiency effects of horizontal (in-market) bank mergers.” Journal of Banking and Finance 17, 411–422. Rhoades, S. A. 1998. “The efficiency effects of bank mergers: an overview of case studies of nine mergers.” Journal of Banking and Finance 22, 273–291. Ryngaert, M. D. and J. F. Houston. 1994. “The overall gains from large bank mergers.” Journal of Banking and Finance 18, 1155–1176. Schwartz, E. and M. Moon. 2000. “Rational pricing of internet companies.” Financial Analysts Journal 56, 62–75. Schwartz, E. and M. Moon. 2001. “Rational pricing of internet companies revisited.” University of California at Los Angeles, (Revised April 2001).
778
C.-C. Chang et al.
Appendix 49A The Correlations Between the Standard Wiener Process Generated from a Bank’s Net Interest Income The relationships are as follows: dz1 dz2 dz1 dz5 dz2 dz4 dz3 dz4 dz4 dz5
D Lg Ldt D L dt D g Lg Ddt D Dg D dt D g D dt
dz1 dz3 dz1 dz6 dz2 dz5 dz3 dz5 dz4 dz6
D LD dt D LS dt D g L dt D D dt D g DS dt
dz1 dz4 dz2 dz3 dz2 dz6 dz3 dz6
D Lg Ddt D Dg Ldt D g LS dt D DS dt
where LgL is the correlation between loans and the loan growth rate; LD is the correlation between loans and deposits; LgD is the correlation between the loan growth rate and deposits; L is the correlation between loans and the deposit rate; LS is the correlation between loans and the interest spread; DgL is the correlation between deposits and the loan growth rate; gL gD is the correlation between the loan growth rate and the deposit growth rate; gL is the correlation between the loan growth rate and the deposit rate; gL S is the correlation between the loan growth rate and the interest spread; DgD is the correlation between deposits and the deposit growth rate; D is the correlation between deposits and the deposit rate; DS is the correlation between deposits and the interest spread; gD is the correlation between the deposit growth rate and the deposit rate; and gD S is the correlation between the deposit growth rate and the interest spread.
Appendix 49B The Risk-Adjusted Processes The model has six sources of uncertainty: (1) the uncertainty surrounding the changes in loans; (2) the uncertainty with regard to the expected loans growth rate; (3) the uncertainty surrounding the changes in deposits; (4) the uncertainty with regard to the expected deposit growth rate; (5) the uncertainty surrounding the changes in the average deposit rate; and (6) the uncertainty surrounding the changes in the interest spread. Following Brennan and Schwartz (1982), we have some simplifying assumptions: the risk-adjusted processes for the state variables can be obtained from the true processes, and the loans and deposits of the risk-adjusted processes are as follows:
L L dLt D L t 1 t Lt dt C t Lt d z1 L L L N L L d—L t D k t 3 t dt C t d z2 D Dt dt C D t Dt d z3 dDt D D t 2 t D D N D D 4 D d—D t D k t t dt C t d z4 ; where the market prices of the factor risks, 1 ; 2 ; 3 and 4 are constants. Similarly, we have to adjust the interest rate to be riskneutral, as follows: p p drt D a.b rt / C 5 t rt dt C t rt d z5 h p p i dSt D aS .b S St / C 6 tS S t dt C tS St d z6 ; where 5 and 6 are the market prices of the factor risks.
Appendix 49C The Discrete Version of the Risk-Adjusted Process In order to implement the model developed in the previous section, we have to rewrite the model as a discrete-time version. For example, we use the loans and loan growth rate of each bank under the risk-adjusted process, as follows: ("
Lt C t D Lt e L t C t
De
k L t
L t 1
L t
s
.tL / 2
2
#
p
t CtL t"1
C 1e
k L t
)
3 L t N L L k
1 e 2k L t L p t t "2 2k L
L L tL D 0L e k1 t CN L 1 e k1 t C
L
L k2 t L : t D 0 e
where 0L and L 0 are the initial variance values in the loans and growth rate, and "1 and "2 are drawn from a standard normal distribution with correlation between the loans and growth rate. Under the risk-adjusted process, the deposits and deposit growth rate of each bank are similar to the loans process.
Chapter 50
Dynamic Econometric Loss Model: A Default Study of US Subprime Markets C.H. Ted Hong
Abstract The meltdown of the US subprime mortgage market in 2007 triggered a series of global credit events. Major financial institutions have written down approximately $120 billion of their assets to date and yet there does not seem to be an end to this credit crunch. With traditional mortgage research methods for estimating subprime losses clearly not working, revised modeling techniques and a fresh look at other macroeconomic variables are needed to help explain the crisis. During the subprime market rise/fall era, the levels of the house price index (HPI) and its annual house price appreciation (HPA) had been deemed the main blessing/curse by researchers. Unlike traditional models, our Dynamic Econometric Loss (DEL) model applies not only static loan and borrower variables, such as loan term, combined-loan-to-value ratio (CLTV), and Fair Isaac Credit Score (FICO), as well as dynamic macroeconomic variables such as HPA to project defaults and prepayments, but also includes the spectrum of delinquencies as an error correction term to add an additional 15% accuracy to our model projections. In addition to our delinquency attribute finding, we determine that cumulative HPA and the change of HPA contribute various dimensions that greatly influence defaults. Another interesting finding is a significant long-term correlation between HPI and disposable income level (DPI). Since DPI is more stable and easier to model for future projections, it suggests that HPI will eventually adjust to coincide with the DPI growth rate trend and that HPI could potentially experience as much as an additional 14% decline by the end of 2009. Keywords Dynamic econometric loss model r US subprime markets
50.1 Introduction Subprime mortgages are made to borrowers with impaired or limited credit histories. The market grew rapidly when C.H. Ted Hong () Beyondbond, Inc., USA e-mail: [email protected]
loan originators adopted a credit scoring technique like FICO to underwrite their mortgages. A subprime loan is typically characterized by a FICO score between 640 and 680 or less vs. the maximum rating of 850. In the first half of the decade, the real estate market boom and well-received securitization market for deals including subprime mortgages pushed the origination volume to a series of new highs. In addition, fierce competition among originators created various new mortgage products and a relentless easing of loan underwriting standards. Borrowers were attracted by new products such as “NO-DOC, ARM 2/28, IO” that provided a low initial teaser rate and flexible interest-only payments during the first 2 years, without documenting their income history. As the mortgage rates began to increase during the summer of 2005 and housing activity revealed some signs of a slowdown in 2006, the subprime market started to experience some cracks as delinquencies began to rise sharply. The distress in the securitization market backed by subprime mortgages and the resulting credit crisis had a ripple effect initiating a series of additional credit crunches. All this pushed the US economy to the edge of recession and is jeopardizing global financial markets. The rise and fall of the subprime mortgage market and its ripple effects raise a fundamental question. How can something as simple as subprime mortgages, which accounts for only 6–7% of all US mortgage loans, be so detrimental to the broader economy as well as to the global financial system? Before formulating an answer to such a large question, we need to understand the fundamental risks of subprime mortgages. Traditional valuation methods for subprime mortgages are obviously insufficient to measure the associated risks that triggered the current market turmoil. What is the missing link between traditional default models and reality? Since a mortgage’s value is highly dependent on its future cash flows, the projection of a borrower’s embedded options becomes essential to simulate its cash flows. Studying consumer behavior to help project prepayments and defaults (call/put options) of a mortgage is obviously the first link to understanding the current market conditions. This paper focuses on modeling the borrower’s behavior and resultant prepayment or default decision. A Dynamic
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_50,
779
780
Econometric Loss (DEL) model is built to study subprime borrower behavior and project prepayment and default probabilities based on historical data from Loan Performance’s subprime database (over 17 million loans) and prevailing market conditions from 2000 to 2007. The paper is organized in the following manner. We start by constructing a general model framework in a robust functional form that is able to not only capture the impact of individual model determinants, but is also flexible enough to be changed to reflect any new macroeconomic variables. We then modeled default behavior through an individual factor fitting process. Prepayment modeling follows a similar process with consideration of the dynamic decision given prior prepayment and default history. The delinquency study builds the causality between default and delinquencies and the relationship within the spectrum of different delinquencies. We then utilized the delinquencies as a leading indicator and error correction term to enhance the predictability of the forecasted defaults by 15%. Our findings and forthcoming research are then drawn in the conclusion section (Fig. 50.1).
50.2 Model Framework When a lender issues a mortgage loan to its borrower, the loan is essentially written with two embedded American options with an expiration co terminus with the life of the loan. The lender will then receive payments as compensation for underwriting the loan. The payments will include interest, amortized principal and voluntary/involuntary prepayments along with any applicable associated penalties. The risk for lenders is that they might not receive the contractual payments and will need to go after the associated collateral to collect the salvage value of the loan. Additionally, the foreclosure procedure could be costly and time consuming. Unscheduled payments come in two forms. A voluntary prepayment is usually referred to simply as “prepayment” and an involuntary prepayment is known as “default” (with lags to potentially recover some portion of interest and principal proceeds). Prepayment is nothing but a call option on some or all of the loan balance plus any penalties at a strike price that a borrower has the right to exercise if the option is in-the-money. By the same token, default is a put option with the property’s market value as the strike price to the borrower. Understanding the essence of both options, we need to find the determining factors that trigger a borrower to prepay/default through filtering the performance history of the loan. A list of determinant factors regarding consumer behavior theory for modeling default and prepayment will be discussed in the next two sections. In order to construct a meaningful statistical model framework for empirical work, the availability of data and the data
C.H. Ted Hong
structure are essential. In other words, our model framework is designed to take full advantage of Loan Performance’s subprime mortgage historical information and market information. The model empirically fits to the historical default and prepayment information of US subprime loan performance from 2000 to 2007 (more than 17 million loans) (Fig. 50.2). Mathematically, our general framework constructs the default and prepayment rates as two separate functions of multiple-factors where the factors are categorized into two types – static and dynamic.1 The static factors are initially observable when a mortgage is originated such as borrower, characteristics and loan terms. Borrower characteristics include CLTV, FICO, and debt-to-income ratio (DTI). Loan terms include loan maturity, loan seasoning, original loan size, initial coupon reset period, interest only (IO) period, index margin, credit spread, lien position, documentation, occupancy, and loan purpose. The impact to the performance of a loan from the static factors provides the initial causality relationship, yet their influence may diminish or decay as the information ceases to be up to date. Dynamic factors include several macroeconomic variables such as HPA, prevailing mortgage interest rates, consumer confidence, gross disposable income, employment rate, and unemployment rate. These dynamic factors supply up-to-date market information and thus play an important role in dynamically capturing the market impact. The accuracy of capturing causality relationship due to the static factors and the predictability of the dynamic factors presents a constant challenge during the formulation of this model. For each individual factor, a non-linear function is formulated according to its own characteristics. For example, a “CLTV” factor for modeling default is formulated as the function of default rate over CLTV ratio. However, a DOC factor is formulated as the function of multiplier over discrete variables of “FULL” vs. “LIMITED” with percentages of respective groups. A general linear function of combined multifactor functions is then constructed as a basic model framework to fit the empirical data and to project forecasts for prepayments and defaults.2 In the following sections, we will discuss each factor in detail.
1
There is no industry standard measure for default rate, thus a different definition of default rate will give a very different number. As there is no set standard, we define our default rate to be based on the analysis in this paper, “Loss Severity Measurement and Analysis,” The MarketPulse, LoanPerformance, 2006, Issue 1, 2–19. Please refer to Appendix I for definition of default used throughout this paper. 2 See Appendix II for the details of model specification.
50 Dynamic Econometric Loss Model: A Default Study of US Subprime Markets 07/17 News Corp. reaches a tentative agreem ent to acquire Dow Jones at its original offer price of $60 a share.
01/18 A real-estate consortium unveils a $21.6 billion offer for Equity Office properties Trust.
07/20 The Dow industrials cross the 14000 m ilestone for the first time.
02/10 Fortress Investment Group LLC’s shares surge 68% in their debut to finish at $31.
14,000
06/13 U.S. bond yields hit a five-year high as inventors continue to sell Treasurys, with the yield on the benchm ark 10-year not rising to 5.249% .
03/09 New Century Financial Corp.’s creditors force the subprim e-m ortagage lender to stop m aking loans am id rising defaults.
10/25 Merrill posts a $2.24 billion loss as a larger-than-expected $8.4 billion write-down on m ortgage-related securities leaves the firm with its first quarterly deficit since 2001.
07/25 Countrywide Financial Corp. says profit slips 33%, dragged down by losses on certain types of prime mortgage loans.
06/22 Blackstone Group LP’s IPO is priced at $31 a share, raising as much as $4.6 billion.
04/26 ABN Am ro Holding NV receives a $98.58 billion takeover approach from a group led by Royal Bank of Scotland Group PLC.
14,500
781
11/27 Citigroup, seeking to restore investor confidence am id massive losses in the credit markets and a lack of perm anent leadership, receives a $7.5 billion capital infusion; 11/27 HSBC’s SIV bailout will move 2 SIV’s of $45 billion to its balance sheet.
08/11 Central banks pum p m oney into the financial system for a second day to ease liquidity strains.
13,500 10/27 Countrywide Financial Corp. posts its first quarterly loss in 25 years on about $1 billion in writedowns.
07
07 10
9/ /2
/3 09
/3
0/
0/
07
07 08
/3
1/ /0 08
/0 07
1/
07
07 2/
07 /0 06
/0 05
/0 04
2/
07 3/
07 3/
07 4/ /0 03
/0 02
01
/0
3/
2/
07
07
12,000
12/21 Bear Stearns posts a loss of $854 million, the first in its 84-year history. The firm takes a $1.9 billion write down.
12
08/14 Goldm an Sachs Group Inc. says three of its hedge funds have seen the net value of their assets fall about $4.7 billion this year.
06/23 Bear Stearns Cos. Agrees to lend as m uch as $3.2 billion to one of its own troubled hedge funds.
12/14 Citigroup bails out seven affiliated structuredinvestm ent vehicles, or SIVs, bring $49 billion in assets onto its balance sheet and further denting its capital base.
07
12,500
08/17 Blue chips fall more than 300 points at 845.78 after foreign m arkets tum ble on certain that U.S. credit problems could trigger a global slowdown; 08/17 Countrywide taps an $11.5 billion credit line in a bid to shore up its finances, but its stock falls 11% .
9/
13,000
/2
07/18 Bear Stearns Cos. Says two hedge funds it runs are worth nearly nothing.
11
03/23 Blackstone files for an IPO to raise about $4 billion
What happened in 2008 7/31 The Senate pass the $300 billion Housing and Economic Recovery Act of 2008, which is designed to help struggling homeowners retain ownership. Signed by President Bush, the law goes into effect Oct. 1.
9/8 US rescue fires up markets. EUPHORIC 13000 investors are using the US Government’s historic bail-out 12500 of mortgage lenders Fannie Mae and Freddie Mac to 12000 justify a buying frenzy on share markets worldwide.
11500
11000
10500 Jan 2008 1/22 The largest rate cut since October 1984 by the U.S. Federal Reserve sends Wall Street lower today as Fed Chairman Ben Bernanke and other central bankers show they’re deeply worried about the immediate future of the U.S. economy.
Feb 2008
Mar 2008
3/14 Bear Stearns liquidity problem emerges, stock price goes down from $60 to $30 per share.
3/16 Shotgun wedding of Bear Stearns to J.P. Morgan for $2 a share, a bargain-basement $236.2 million. Federal Reserve bank to provide financing for the deal.
Apr 2008 7/11 FDIC takes over IndyMac Bank, the Second-largest bank failure in U.S. history. Iran missile launches send oil to $147 per barrel record.
May 2008
8/27 FDIC says “Problem bank list” in 2nd quarter grows to 117 with $78 billion in assets - up from 90 banks, $26 billion in assets in 1st quarter.
Fig. 50.1 What happened in 2007; What happened in 2008
Jun 2008
7/14 Fannie Mae & Freddie Mac post losses, stock prices both drop approximately 80%.
Jul 2008
9/3 Ospraie, the largest commodity hedge fund firm, closes to its biggest hedge fund after slumping 38% this year because of bad bets on commodity stocks. Oil plunges to 5month low at $105 per barrel.
9/8 South Korea’s financial regulator urges state-owned Korea Development Bank (KDB) to be cautious over any investment in Lehman Brothers.
Aug 2008
Sep 20
9/17 Fed agrees to take over the insurance giant AIG, an unprecedented $US85 billion bail-out. 9/15 Lehman expects to file for bankruptcy protection, the largest failure of an investment bank since the collapse of Drexel Burnham Lambert 18 years ago. Merrill agrees to sell itself to Bank of America Corp for $44 billion after more than $40 billion credit losses over the last year, Dow Jones Sinks 504 Points for Worst One Day Loss Since 9/11.
782
C.H. Ted Hong Type / Orig. Year ARM OTHER 2000 11,452 2001 11,389 2002 33,776 2003 51,548 2004 221,818 2005 496,697 2006 490,975 2007 99,946 Grand Total 1,417,601
ARM2/28 187,232 261,316 434,732 697,073 1,239,522 1,577,003 1,137,345 161,480 5,695,703
ARM3/27 68,430 67,018 100,939 164,228 413,366 393,020 234,344 36,795 1,478,140
ARM5/25 4,059 10,449 25,827 71,839 213,572 301,829 349,460 160,549 1,137,584
FIXED Grand Total 390,671 661,844 477,718 827,890 605,233 1,200,507 958,170 1,942,858 1,172,413 3,260,691 1,619,257 4,387,806 1,754,382 3,966,506 404,278 863,048 7,382,122 17,111,150
Fig. 50.2 Number of securitized Alt-A and subprime mortgage origination
50.3 Default Modeling
20 2003ACT 2003Prj 2004ACT 2004Prj 2005ACT 2005Prj
18
Default Modeling Factor Components
Refinance Cashout
16
Occupancy Owner Second home Investor Property Type Single-Family Multi-Family Condo Loan Documentation Full Limited House Price Appreciation (HPA) State Level CBSA Level
14 CDR (%)
Seasoning Combined Loan-to-Value (CLTV) Credit Score (FICO) Debt-to-Income Ratio (DTI) Payment Shock (IO) Relative Coupon Spread Loan Size Lien First Second and Others Loan Purpose Purchase
12 10 8 6 4 2 0 02/03 08/03 02/04 08/04 02/05 08/05 02/06 08/06 02/07 08/07 Date
Fig. 50.3 Seasoning: CDRs by date and vintages of ARM 2/28 20 18
2003ACT 2003Prj 2004ACT 2004Prj 2005ACT 2005Prj
16
50.3.1 Seasoning
CDR (%)
14 12 10 8
Loan information regarding borrower’s affordability is usually determined at origination. As a loan seasons, its original information decays, and its default probability starts to surge. A seasoning baseline curve with annualized Constant Default Rate (CDR) against its seasoning age would post a positive slope curve for the first 3 years. Figure 50.3 shows actual CDR curves and their fitted result for different vintages of ARM 2/28 mortgage pools. They roughly follow a shape similar to the Standard Default Assumption (SDA) curve.3 However, as shown in Fig. 50.4, the ramp-up curve can be very different for different vintages.
3
SDA is based on Federal Housing Administration (FHA)’s historical default rate and was developed by Bond Market Association (BMA), now known as Securities Industry and Financial Markets Association (SIFMA).
6 4 2 0 0
6
12
18 24 Age (month)
30
36
Fig. 50.4 Seasoning: CDRs by age and vintages of ARM 2/28 (Source: Beyondbond Inc, LoanPerformance)
50.3.1.1 Why Is the 2005 Seasoning Pattern Faster Than Prior Vintages? Since the seasoning baseline curve is not independent of dynamic factors, a dynamic factor such as HPA could tune vintage seasoning curves up and down. In Fig. 50.4, the 2005 seasoning pattern is significantly steeper than its prior
50 Dynamic Econometric Loss Model: A Default Study of US Subprime Markets 25
vintages. Looser underwriting standards and deteriorating credit fundamentals can be important reasons. Negative HPA obviously starts to adversely impact all vintages after 2005.
20 CDR (%)
50.3.2 Payment Shock – Interest Only (IO)
783
ACT_NIO Prj_NIO ACT_IO Prj_IO
15 10
The boom in the subprime market introduced new features to the traditional mortgage market. An ARM 2/28 loan with a 2-year interest-only feature has a low fixed initial mortgage rate and pays no principal for the first 2 years prior to the coupon reset.4 When the IO period ends, the borrower typically faces a much higher payment based on its amortized principal plus the fully indexed interest amounts. This sudden rise in payments can produce a “Payment Shock” and test the affordability to borrowers. Without the ability to refinance, borrowers who are either under a negative equity situation or not able to afford the new rising payment will have a higher propensity to default. Consequently, we see a rapid surge of default rates after the IO period. The ending of the IO period triggers payment shock and will manifest itself with a spike in delinquency.5 Delinquent loans eventually work themselves into the defaulted category within a few months after the IO period ends. Figure 50.5 shows the different patterns and the default lagging between IO and Non-IO ARM 2/28 pools.
50.3.3 Combined Loan-to-Value (CLTV) LTV measures the ratio of mortgage indebtedness to the property’s value. When multiple loans have liens added to the indebtedness of the property, the resulting ratio of CLTV becomes a more meaningful measure of the borrower’s true equity position. However, the property value might not be available if a “market” property transaction does not exist. A refinanced mortgage will refer to an “appraisal value” as its property value. Note that “appraisal value” could be manipulated during ferocious competition among lenders in a housing boom market and undermine the accuracy of CLTV. 4
The reset is periodical, and the interest rate is set as Index C Margin. The delinquency rate is measured by OTS (Office of Thrift Supervision) or MBA (Mortgage Bankers Association) convention. The difference between these two measures is how they count missed payments. MBA delinquency rate counts the missed payment at the end of the missing payment month while OTS delinquency rate counts the missed payment at the beginning of the following month after missing payment. This difference will pose a 1–30 days delay of record. OTS delinquency rate is the prevailing delinquency measure in subprime market.
5
5 0 02/04 08/04 02/05 08/05 02/06 08/06 02/07 08/07 Date
Fig. 50.5 IO payment shock: CDRs by date of ARM 2/28
As we know, default is essentially a put option embedded in the mortgage for a borrower. In a risk neutral world, a borrower should exercise the put if the option is in-the-money. In other words, a rational borrower should default if the CLTV is greater than one or if the borrower has negative equity. At higher CLTVs, it becomes easier to reach a negative equity level as the loan seasons and its default probability increases. Figure 50.6 provides the actual stratification result of CDR over various CLTV ranges. Obviously, CDR and CLTV are positively correlated. In addition, lower CDR values are observed for higher subprime tiered FICO ranges. This shows that the FICO tier granularity is another important factor in modeling. However, since CLTV is obtained at the loan’s origination date, it does not dynamically reflect housing market momentum. We introduce a dynamic CLTV that includes housing price appreciation from loan origination in order to estimate more precisely the actual CLTV. This dynamic CLTV allows us to better capture the relationship between CLTV and default. Figure 50.7 clearly illustrates that different CLTV groups show a different layer of risk level.
50.3.4 FICO FICO score is an indicator of a borrower’s credit history. Borrowers with high FICO scores maintain a good track record of paying their debts on time with a sufficiently long credit history.6 6 According to Fair Isaac Corporation’s (The Corporation issued FICO score measurement model) disclosure to consumers, 35% of this score is made up of punctuality of payment in the past (only includes payments later than 30 days past due), 30% is made up of the amount of debt, expressed as the ratio of current revolving debt (credit card balances,
[Figure 50.6 shows two panels: "CDR vs. CLTV of ARM 2/28 Non-IO with age > 24" and "CDR vs. CLTV of ARM 2/28 Non-IO with age > 24 and FICO between 641 and 680"; x-axis: CLTV range, y-axis: CDR%.]
Fig. 50.6 Stratified seasoned CDR over CLTV ranges (Source: Beyondbond Inc, LoanPerformance)
Fig. 50.7 CDRs by date and CLTVs of ARM 2/28 (Source: Beyondbond Inc, LoanPerformance)
In recent years, people have come to believe that FICO is no longer an accurate indicator due to the boom in hybrid ARM loans and fraudulent reporting to the credit bureaus. With refinancing becoming much easier to obtain, issuers have been giving out tender offers to borrowers in order to survive the severe competition among lenders. CLTV and FICO scores are two common indicators that the industry uses to predict default behavior.7

7 The debt-to-income ratio is also an important borrower characteristic, but in recent years more Limited-Doc and/or No-Doc loans have been issued. Many of these loans do not report a DTI ratio, so we consider DTI separately for each DOC type.
We examine the combined CLTV and FICO effects on CDR as shown in Fig. 50.8. The figure presents a 3-D surface of stratified CDR rates over CLTV and FICO ranges, from two different angles, for seasoned ARM 2/28 pools. The relationship between CLTV and CDR is positively correlated across various FICO ranges. On the other hand, the relationship between FICO and CDR is somewhat negatively correlated across various CLTV ranges, although the effect is not as significant; FICO's impact is not as important as we originally expected. In our analysis, CLTV = 75 and FICO = 640 serve as the base case, and we then adjust the CDR according to movements of the other default factors. Figure 50.9 gives an example of fitting results based on ARM 2/28 2004 vintage pools. The difference between the 600-640 and 680-700 FICO ranges amounts to only about 1% in CDR for a seasoned pool.
50.3.5 Debt-to-Income Ratio (DTI) and Loan Documentation (DOC)

The DTI in this paper is defined as the back-end DTI, which means the debt portion for calculating the DTI ratio includes not only PITI (Principal + Interest + Tax + Insurance) but also other monthly debts such as credit card payments, auto loan payments, and other personal obligations.8
8 There are two major measures of DTI in the industry: Front-End DTI ratio = PITI / Gross Monthly Income, and Back-End DTI ratio = (PITI + Monthly Debt) / Gross Monthly Income, where PITI = Principal + Interest + Tax + Insurance.
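A minimal sketch of the two DTI conventions defined in footnote 8, with hypothetical borrower numbers:

# Front-end vs. back-end DTI, following footnote 8 (hypothetical inputs).
def front_end_dti(piti, gross_monthly_income):
    return piti / gross_monthly_income

def back_end_dti(piti, monthly_debt, gross_monthly_income):
    # The back-end measure adds other monthly obligations (credit cards, auto loans, ...)
    return (piti + monthly_debt) / gross_monthly_income

piti = 1_700.0       # principal + interest + tax + insurance
other_debt = 650.0   # credit card, auto loan, and other personal obligations
income = 5_500.0     # gross monthly income

print(f"front-end DTI: {front_end_dti(piti, income):.1%}")             # ~30.9%
print(f"back-end DTI:  {back_end_dti(piti, other_debt, income):.1%}")  # ~42.7%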
Fig. 50.8 Stratified CDR by CLTV and FICO of ARM 2/28 (Source: LoanPerformance)
Fig. 50.9 FICO: CDRs by date and FICOs of ARM 2/28 (Source: Beyondbond Inc)
The DTI ratio shows the affordability of a loan to a borrower and provides a clearer picture of a borrower with an exceptionally high DTI. Across different regions of the country, the same DTI ratio can imply a different financial condition of the borrower because of different living standards and costs of living. DTI is captured and reported as part of the loan documentation process. Loan documentation, also referred to as DOC, consists of three major groups: "FULL DOC," "LOW DOC," and "NO DOC." Lenders usually require a borrower to provide sufficient "FULL" documentation to prove their income and assets when taking out loans. People who are self-employed, wealthy, and/or have a lumpy income stream are considered borrowers with "LIMITED" (LOW or NO) documentation. In recent years, fierce competition has pushed lenders to relax their underwriting standards and originate more LIMITED DOC loans with questionable
Fig. 50.10 Stratified CDR by DTI ranges, seasoned pools 2000-2007, by documentation type (FULL and LIMITED) (Source: Beyondbond Inc, LoanPerformance)
incomes. This uncertainty regarding income poses a problem in determining the real DTI. The stratification report shows two very different default patterns between the FULL and LIMITED documentation categories when analyzing the DTI effect. For FULL DOC loans, default probability is strongly positively correlated with DTI, and CDR increases as DTI increases. Since FULL DOC loans have documented income and assets, the CDR-to-DTI relationship is clearest, as shown in Fig. 50.10. LIMITED DOC shows a weaker relationship than FULL DOC. Figure 50.11 shows the two different time series patterns of CDR curves and their fitted values for FULL and LIMITED DOCs. Since income is one of the main elements in determining the DTI ratio, a macroeconomic variable, the unemployment rate, becomes an important factor affecting an individual's income level. We find an interesting result when
Fig. 50.11 DOC: actual vs. fitted for 2004 (Source: Beyondbond Inc, LoanPerformance)
Fig. 50.12 Unemployment rate 1979-2007 (Source: US Bureau of Census)
Fig. 50.13 Unemployment vs. T-bill 3 months (Source: US Bureau of Census)
Fig. 50.14 Loan size stratification (Source: LoanPerformance)
we plot the unemployment rate against the 3-month US Treasury bill. The two have been strongly negatively correlated for the last 7 years. Whether this is a coincidence or not, it suggests that monetary policy has been mainly driven by the unemployment numbers (Figs. 50.12 and 50.13).
50.3.6 Loan Size

Is bigger better? The conventional argument is that a larger loan size implies a better financial condition and a lower likelihood of default. According to the stratification results based on original loan size in Fig. 50.14, CDR forms a smile curve
across original loan balances. Loans larger than $350,000 tend to be a bit riskier, although the increment is marginal. Loans smaller than $100,000 also seem riskier. Larger loans do not appear to indicate better credits. The original loan size is usually harder to interpret, as it can be affected by other factors such as lien, property type, and geographical area. For example, a $300,000 loan in a rural area may indicate a borrower with growing financial strength, while the same amount in a prosperous large city may indicate a borrower with weak purchasing power. Out of the context of property type and geographic location, loan size can be misleading. This may explain why we do not see a clear shape forming in Fig. 50.14. Figure 50.15 shows the three different time series patterns of CDR curves and their fitted values based on loan size ranges. Since the loan size ranges mix all property types, the pattern and fitted results for each category are distorted, and the fit is not as good as for other factors.
Fig. 50.15 Size: actual vs. fitted CDR for 2004 (Source: Beyondbond Inc, LoanPerformance)
Fig. 50.16 Lien stratification (Source: LoanPerformance)
50.3.7 Lien

We know that a second mortgage/lien has a lower priority claim on the collateral asset than a first mortgage/lien in the event of a default. Thus, the second lien is riskier than the first lien. Second lien borrowers usually maintain higher credit scores, typically with a FICO greater than 640, so we often see a very mixed effect if this layered risk is not considered. In Fig. 50.16, second lien loans are significantly riskier than first lien loans when measured against comparable FICO ranges. Figure 50.17 shows the two different time series patterns of CDR curves and their fitted values based on liens.
50.3.8 Occupancy

Occupancy consists of three groups: "OWNER," "INVESTOR," and "SECOND HOME." The "OWNER" group treats the property as a primary home rather than as an alternative form of housing or an investment. This group will face emotional and financial distress if the property goes into foreclosure or REO, and thus has a lower propensity to default than the others if all other factors remain the same. On the other hand, the "INVESTOR" and "SECOND HOME" groups would be more risk neutral and more willing to exercise their options rationally. In other words, they should have a higher default risk.
Fig. 50.17 Lien: actual vs. fitted CDR for 2004 (Source: Beyondbond Inc, LoanPerformance)
Figure 50.18 supports risk neutrality with respect to the "INVESTOR" group, which does show the highest default risk among the three groups. The "OWNER" group, however, is not the lowest default risk group; instead, the "SECOND HOME" group is. This observation is interesting but not intuitive. It indicates that when a borrower faces financial stress, a "SECOND HOME" will be sold first, even at a loss, to support his/her primary home. Thus, the default risk of a "SECOND HOME" is actually reduced by incorporating the borrower's primary home situation and cannot simply be assumed from risk neutral option exercise. Figure 50.19 shows the two CDR time series and their fitted values for "OWNER" and "INVESTOR."
Fig. 50.18 Occupancy stratification (Source: LoanPerformance)
Fig. 50.19 Occupancy: actual vs. fitted CDR for 2004 (Source: Beyondbond Inc, LoanPerformance)
Fig. 50.20 Purpose stratification (Source: LoanPerformance)
50.3.9 Purpose

Loan Purpose classifies three key reasons for obtaining a loan: "PURCHASE," "CASHOUT," and "REFI."9 "PURCHASE" means the borrower is buying the property. "REFI" means the borrower is refinancing the outstanding balance without any additional funds drawn from the equity in the property. "CASHOUT" refers to a refinance loan with extra cash inflow to the borrower due to the positive difference between the new, increased loan amount and the existing loan balance (Fig. 50.20).
9 For simplicity's sake, we categorize refinance, second mortgage, and other miscellaneous types as "REFI."
"CASHOUT" and "REFI" usually reflect an intention to roll over the IO period or an attempt to benefit from a lower mortgage rate. They can only be afforded by borrowers in good financial condition. "REFI" is a group of borrowers with a higher FICO relative to LTV compared to the other two categories, so we expect "REFI" loans to have a lower default rate than "PURCHASE" loans. This argument seems to be correct for fixed rate mortgages: "REFI" borrowers have a much lower default probability than "PURCHASE" borrowers. Beginning in 2007, however, the credit crunch hit the market and most lenders tightened their credit standards. Hybrid ARM loans, such as the ARM 2/28, faced new resets, and borrowers who no longer qualified for refinancing were in danger of defaulting. If these borrowers can no longer afford the payment after the IO period and/or reset, they will eventually enter default. ARM 2/28 loans show a significant increase in defaults for the "REFI" purpose as compared with FIXED rate loans. Figure 50.21 shows the three different time series patterns of CDR curves and their fitted values among the various purpose types.
50.3.10 Dynamic Factors: Macroeconomic Variables

As we mentioned in the model framework, macroeconomic variables such as HPI, the interest rate term structure, the unemployment rate, and others that supply up-to-date market information can dynamically capture market impact. In theory, an economy maintains its long-term equilibrium as the "Norm" in the long run, and a handful of macroeconomic
Fig. 50.21 Purpose: actual vs. fitted CDR for 2004 (Source: Beyondbond Inc, LoanPerformance)
variables are usually used to describe the condition of the economy. While the economy is in its "Norm" growing stage, these macroeconomic variables usually move or grow very steadily, and the risk/return profile of an investment instrument can differ depending on its unique characteristics. Because of that, a diversified investment portfolio can be constructed based on a correlation matrix, and macroeconomic variables are usually ignored during the "Norm" period. However, when an economy is under stress and approaches a "bust" stage, many seemingly uncorrelated investments sync together. The same macro variables become the main driving forces that crucially and negatively impact investment results. The current credit crunch is creating mark-to-market distress for investments not only across various market sectors but also across credit ratings, which clearly describes our view regarding these macroeconomic variables. Since the severe impact from these variables typically occurs in economic downturns, cross correlation can provide a preliminary result in understanding the causality and the magnitude of their relationship. The dynamic interaction between these variables and consumer behavior would then provide a better sense of prediction and perhaps either forecast the next downturn or efficiently spot an investment opportunity in the next market recovery.

50.3.11 House Price Appreciation (HPA)

The house price index (HPI) has been the most quoted macroeconomic variable that measures or determines high delinquency and default rates since the beginning of the subprime crisis.10 Thus, house price appreciation (HPA), which measures the housing appreciation rate year-over-year, has become the most important indicator for the US housing market. By comparing the 30-day delinquency across vintages, we see that delinquency rates increase after the 2005 vintage (Fig. 50.22). When we look at seasoning patterns across 2000-2005 vintages, we find that the 2005 seasoning pattern starts to surge after 18 months of age, or the third quarter of 2006. Coincidentally, HPA starts to decline in the second quarter of 2006. Although a similar HPA pattern appears in the third quarter of 2003, the main difference is that the former is on an up-trend of HPA, while the latter is on a down-trend. Defaults in 2003 are obviously lower than in 2006 with comparable loan features and seasoning/age. In order to capture this subtle trend difference, we study HPI and its various dimensions in addition to HPA levels (Fig. 50.23).

50.3.11.1 Multi-Dimension HPI Impacts
To systematically identify the impact of HPA, we measure HPA in three aspects for each loan:
1. Cumulative HPI, an accumulated HPA since origination, is calculated from HPI levels to capture the equity gain for borrowers.
2. HPA, the rate of change of HPI, captures the pulse of the housing market.
3. HPA2D, the change of HPA, is used to capture the trend/expectations of the housing market.
The HPA factors form a multi-dimensional impact that reflects a loan's up-to-date capital structure, current housing market conditions, and future housing market prospects. We embed the "Cumulative HPI" into CLTV to build a dynamic CLTV that reflects the dynamic equity value of the property. In a risk neutral analysis, an option model can then be easily applied to project the default probability. HPA is already a leading market indicator in explaining defaults. HPA2D basically serves as the second derivative of HPI; it allows us to capture the general expectation of home price movements and market sentiment. The negative impact due to HPA2D in the third quarter of 2006 is apparently different from the third quarter of 2003, even though the HPA numbers are at a similar level.11
10 The Housing Price Index (HPI) used in this paper is published by the Office of Federal Housing Enterprise Oversight (OFHEO) as a measure of the movement of single-family house prices. According to OFHEO, the HPI is "a weighted, repeat-sales index, meaning that it measures average price changes in repeat sales or refinancing on the same properties." See the OFHEO website, www.ofheo.gov, for details.
Fig. 50.22 30-Day delinquency of ARM2/28 by vintages (Source: OFHEO, LoanPerformance)
Fig. 50.23 HPA vs. CDR of ARM2/28 2005 vintage (Source: OFHEO, LoanPerformance)
HPA2D undoubtedly offers another dimension that reflects consumer expectations about the general housing market. When HPA2D is negative, the probability of borrowers holding negative equity increases. The deterioration of the housing market is producing record-low HPI levels. While HPA continues decreasing, HPA2D plunges even faster. Our multi-dimensional HPA empirical fitting relies on only a very limited range of in-sample HPA data. Extrapolating HPA and HPA2D requires numerous possible market simulations to induce a better intuitive sense of the numbers. The shaded area in Fig. 50.24 shows a simulated extreme downturn in the housing market that assumes a 30% drop in HPI levels from the fourth quarter of 2007 and then a leveling-off. Based on the simulation results, HPA2D starts to pick up at least one quarter earlier than HPA and 1 year earlier than HPI. While a 2-year HPI downturn is assumed, the consumers' positive housing market expectation reflected in HPA2D effectively reduces their incentive to walk away from their negative equity loans. This case example clearly shows how the forecasted HPA and HPA2D numbers may provide a better intuitive market sense to the model.

11 We have smoothed the HPA and HPA2D series to create better trend lines. A linearly weighted distributed lag of the last four quarters is adopted for smoothing the series.
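The sketch below shows one way to compute HPA (the year-over-year change of HPI), HPA2D (the change of HPA), and the four-quarter smoothing mentioned in footnote 11. The HPI series is a hypothetical stand-in, and the exact weighting scheme is our assumption of what "linear weighted distributed lags" means.

import numpy as np

def linear_weighted_smooth(x, lags=4):
    """Smooth a series with linearly decaying weights over the last `lags`
    points, mirroring the distributed-lag smoothing in footnote 11."""
    w = np.arange(lags, 0, -1, dtype=float)  # weights 4,3,2,1 for t, t-1, t-2, t-3
    w /= w.sum()
    out = np.full_like(x, np.nan, dtype=float)
    for t in range(lags - 1, len(x)):
        out[t] = np.dot(w, x[t - lags + 1 : t + 1][::-1])
    return out

# Quarterly HPI levels (hypothetical numbers, for illustration only)
hpi = np.array([180.0, 184.0, 189.0, 195.0, 202.0, 207.0, 209.0, 208.0, 204.0, 198.0])

hpa = 100 * (hpi[4:] / hpi[:-4] - 1)  # year-over-year appreciation, in percent
hpa2d = np.diff(hpa)                  # change of HPA: the "second derivative" of HPI
print(linear_weighted_smooth(hpa))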
Fig. 50.24 HPA vs. HPA2D, actual and extreme simulation (Source: OFHEO, LoanPerformance)
The relationship between HPI and consumer behavior, which forms the HPI impact on default and prepayment, is then modeled. We illustrate the multi-dimensional HPI impact through the example shown below:
1. HPCUM ↓ (below 5%) ⇒ CLTV ↑ ⇒ MDR ↑, SMM ↓
2. HPA ↓ (below 2%) ⇒ MDR ↑, SMM ↓
3. HPA2D ↓ (below 5%) ⇒ MDR ↑, SMM ↓

Fig. 50.25 HPI & HPA QoQ 1975-2007 (Source: US Bureau of Census)

50.3.11.2 HPI and DPI
When the economy is experiencing a potentially serious downturn, generating HPI predictions going out 3 years is a much better approach than random simulations. Since HPI increased so rapidly after 2000, the current decline can be viewed as an adjustment to the previously overheated market. The magnitude and ramp-up period of the adjustment nevertheless determine consumers' behavior in exercising their embedded mortgage options. Finding a long-term growth pattern of HPI thus becomes vital for predicting and simulating future HPI numbers (Fig. 50.25). Based on Fig. 50.26, HPI maintains a constant relationship with disposable personal income (DPI) in the long run. Since DPI is a more stable process, a long-term HPI prediction based on the observed relationship between DPI and HPI provides a better downturn average number. Based on our long-term HPI prediction, HPI could potentially drop as much as 14% by the end of 2009.12
12 A 5% decrease by the end of 2009 on average, plus another 9% based on two standard errors of the regression of HPI on DPI.
Fig. 50.26 HPI vs. DPI (Source: US Bureau of Census)
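A hedged sketch of the long-term HPI projection from DPI: regress log HPI on log DPI over history, then combine the fitted value with a two-standard-error band, echoing footnote 12's construction. The data points below are short hypothetical stand-ins for the quarterly 1975Q1-2007Q3 series used in the chapter.

import numpy as np

dpi = np.array([5200., 5600., 6100., 6700., 7400., 8200., 9100., 10100.])
hpi = np.array([155., 166., 180., 198., 224., 262., 321., 378.])

slope, intercept = np.polyfit(np.log(dpi), np.log(hpi), 1)
resid = np.log(hpi) - (intercept + slope * np.log(dpi))
se = resid.std(ddof=2)  # regression standard error

dpi_future = 10600.0    # assumed DPI forecast (hypothetical)
hpi_fit = np.exp(intercept + slope * np.log(dpi_future))
hpi_low = np.exp(intercept + slope * np.log(dpi_future) - 2 * se)  # two standard errors down
print(hpi_fit, hpi_low)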
50.3.11.3 Geographical Location and Local HPI

In housing markets, geographic location ("location, location, location," or L3) is undoubtedly the most important price determinant, as each location is unique. While we have been pointing out HPI impacts in general, HPI at the national level does not reflect the actual local situation and distorts the default impact, ignoring the granularity of detailed local housing market information. The consequence of ignoring this granularity can be very severe when a geographically diversified mortgage pool's CLTV distribution has a fat tail at the high CLTV end. For loans with negative equity, the local HPI can have greater relevance than the national HPI. Fortunately, we are able to differentiate the HPI impact by drilling down to the state as well as the CBSA level. Figure 50.27 shows the actual levels of HPA in December 2007 and our projection of HPA for June 2008, detailed by CBSA. We start with a national level HPI model to obtain the long-term relationship between HPI and DPI. We then build a dynamic correlation matrix between national and state as well as national and CBSA levels that estimates parameters and generates forecasts on the fly. The CBSA level HPI is especially important for calculating dynamic CLTV. Since the cumulative HPI (HPCUM) is calculated as the cumulative HPA since origination, it captures the wealth effect for generating dynamic CLTVs. This more detailed information helps predict whether a mortgage has crossed into the negative equity zone.

Fig. 50.27 Geographic components of HPA by CBSA

50.4 Prepayment Modeling

The prepayment model comprises the following factor components: housing turnover and age, refinancing, teaser effect, interest only (IO) effect, burnout effect, seasonality, loan-to-value effect, FICO credit effect, prepayment penalty, and house price wealth effect.
50.4.1 Housing Turnover and Seasoning

The housing turnover rate is the ratio of total existing single-family house sales to the existing housing stock.13 With the exception of the early 1980s, the housing turnover rate rose steadily for the last 15 years until 2005. A rising housing turnover rate indicates that home owners are able to move around more than in the past. In the housing market boom era, it also indicates the height
13 We use a five-year moving average of "Total Occupied Housing Inventory" from the US Census Bureau, multiplied by 0.67, to estimate the total single-family housing stock.
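Footnote 13's turnover estimate can be reproduced mechanically, as in the sketch below with hypothetical sales and inventory figures:

import numpy as np

# Annual existing single-family home sales and total occupied housing inventory (units, hypothetical)
sales = 6_500_000.0
occupied_inventory = np.array([108e6, 110e6, 112e6, 114e6, 116e6])  # last 5 years

single_family_stock = occupied_inventory.mean() * 0.67  # 5-year MA times 0.67, per footnote 13
turnover_rate = sales / single_family_stock
print(f"housing turnover: {turnover_rate:.1%}")  # ~8.7% with these inputs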
Fig. 50.28 US housing turnover 1977-2007 (Sources: National Association of Realtors and Beyondbond Inc)
Fig. 50.29 CPR over various vintages of discount fixed rate, coupon = 6% (Sources: Beyondbond Inc, LoanPerformance)
of speculation. When a housing boom comes to an end, the housing turnover rate starts to decrease; the movement of housing turnover after 2005 shows exactly this. Since the housing turnover rate is used as the base prepayment speed and can generate significant tail risk of principal loss given the same default probability, it is especially crucial in a high default and slow prepayment environment like the current one (Fig. 50.28).
50.4.2 Seasoning

The initial origination fee and the loan closing expenses usually take a few years to be amortized, and this discourages new mortgagors from prepaying their mortgages early in the mortgage term. This ramping-up effect is the seasoning
factor. Figure 50.29 shows the age pattern observed for fixed rate loans. The ramping-up period lasts for the first few months, and then prepayments start to level off or decrease due to other prepayment factors. Hybrid ARMs exhibit similar patterns during the first 12 months. For hybrids like the ARM 2/28, the prepayment level climbs from 0% to about 20-50% CPR within the first 12 months. After that, the acceleration of prepayment levels starts to slow until right before the teaser period ends. The difference in prepayment levels can be readily observed after the 12th month, when shorter hybrids begin to show higher prepayment rates. The reason ARM 2/28 borrowers show higher prepayment levels may be the faster housing turnover of the hybrid group. After the first 12 months, prepayments generally stay around the same level, with a wave-like trend peaking every 12 months. The seasoning pattern is illustrated in Fig. 50.30.14
50.4.3 Teaser Effect

The teaser effect is the most distinctive feature of hybrid ARM products. We define the term as the behavior that tends to persist right around the first reset, where borrowers seek
14 For data pooling in terms of vintage year, we usually use the loan distribution data for grouping information. This helps maintain the relationship while examining macroeconomic variables in time series analysis. It, however, distorts the age pattern, since loans within the same vintage year could be underwritten in different months. The seasoning graph is therefore grouped by the loan's seasoning age to better capture the age pattern.
Fig. 50.30 CPR over various vintages of ARM2/28 (Sources: Beyondbond Inc, LoanPerformance)
Fig. 50.31 CPR over various vintages of ARM2/28 (Sources: Beyondbond Inc, LoanPerformance)
alternatives to refinancing their mortgages or simply prepay them to avoid higher interest rates. In the following, we describe the empirical statistics gathered to support the teaser effect. Approximately 1 to 2 months before the end of the teaser period, a sharp rise in prepayments occurs. The effect is apparently larger for shorter hybrids like the ARM 2/28, since shorter hybrids are exposed less to other prepayment factors, such as refinancing and burnout, during the teaser period. The peak level is reached about 2 months after the teaser period ends. The teaser impact is usually observed as a sudden jump in prepayment levels. This spike happens whenever borrowers are able to refinance with a lower cost alternative.
50.4.4 Interest Only (IO) Effect

During the teaser period, the prepayments of ARM 2/28 loans with or without IO track each other fairly well. Before the end of the teaser period, loans with IO exhibit higher prepayment levels than the regular ones. IO borrowers are even more sensitive to the payment level, since they are paying only the interest portion before the reset. Once the teaser period ends, they will start to pay not only higher interest but also an additional amount of amortized principal. Their incentive to refinance is definitely higher than that of regular ARM 2/28 borrowers. Even worse, if they cannot find a refinancing alternative, they could face affordability issues and increased default risk (Fig. 50.31). We address this further in the section on the interaction between prepayment and default.

50.4.5 Refinance

The prepayment incentive is measured as the difference between the existing mortgage rate and the prevailing refinancing rate, which is commonly referred to as the refinance factor. As the refinancing factor increases, the financial incentive to refinance increases and thus changes prepayment behavior. When loans are grouped by their coupon rates during the teaser period, the differences in prepayment levels are quite apparent. They behave in similar patterns, but loans with higher coupons tend to season faster due to the financial incentive to refinance, while loans with lower rates tend to be locked in, as the borrowers have secured the lower rates (Fig. 50.32).

50.4.6 Burnout Effect

The heterogeneity of the refinancing population causes mortgagors to respond differently to the same prepayment incentive and market refinancing rate. This phenomenon can be filtered out as burnout. The prepayment level usually goes up steadily, with occasional exceptions across the high financial incentive region. The major reason for this is the burnout phenomenon, in which borrowers who have already refinanced previously and taken advantage of the lower rates are less likely to refinance again without additional financial incentives. To capture such a path-dependent attribute, our prepayment model utilizes the remaining principal factor to capture the burnout effect and reduce the chances of overestimating overall prepayment levels.
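The chapter states only that the remaining principal factor is used to damp refinance-driven prepayments; the functional form below is our hypothetical illustration of such a burnout multiplier, not the model's actual specification.

# A hedged sketch of a burnout adjustment driven by the remaining principal
# factor. The damping form and the floor value are assumptions for exposition.
def burnout_multiplier(pool_factor, floor=0.4):
    """pool_factor = remaining principal / original principal (1.0 = unseasoned).
    The smaller the factor, the more of the rate-sensitive borrowers have
    already left the pool, so the refinance response is damped toward `floor`."""
    return floor + (1.0 - floor) * pool_factor

base_refi_cpr = 40.0  # CPR% implied by the rate incentive alone (assumed)
for factor in (1.0, 0.7, 0.4, 0.2):
    print(factor, base_refi_cpr * burnout_multiplier(factor))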
Fig. 50.32 Refinance: stratification by coupon, fixed rate, 2003 vintage (Sources: Beyondbond Inc, LoanPerformance)

50.4.7 CLTV Wealth Effect

As a property's price appreciates, the LTV of a loan gradually decreases. Borrowers with a low LTV may be able to refinance at a lower interest rate. Some borrowers may even find themselves in an in-the-money situation where they can sell their property for an immediate profit. Historically, home prices continue to increase with age, and more and more loans fall into this "low LTV" category, which has an increasing likelihood of prepayment. We use a combination of CLTV, HPA, and age to model this effect (Figs. 50.33 and 50.34).

Fig. 50.33 Wealth: CPR over various CLTV of ARM2/28 (Sources: Beyondbond Inc, LoanPerformance)

50.4.8 FICO Credit Effect

The subprime market consists of people with limited credit history and/or an impaired credit score. The high FICO score group is usually offered more alternatives to refinance and thus has the flexibility to choose between different products. Those who are on the threshold of the subprime and prime markets could be upgraded to the prime market during the course of the loan life. Thus, prepayment is a monotonically increasing function of FICO (Fig. 50.35).
We can see a combined effect of FICO and CLTV on CPR. Those who have a low CLTV and a high FICO score can easily refinance and will have the highest prepayment rate, while those who have a high CLTV and a low FICO score will be at the other end of the pendulum with the lowest prepayment rate. Figure 50.36 gives a sample CPR fitting result based on ARM 2/28, 2004 vintage pools.
Fig. 50.34 Fitted CPR over CLTV 81-90 of ARM2/28, 2004 vintage (Sources: Beyondbond Inc, LoanPerformance)
Fig. 50.35 Credit: CPR by FICO of ARM2/28 (Sources: Beyondbond Inc, LoanPerformance)
Fig. 50.36 Credit: fitted CPR (Sources: Beyondbond Inc, LoanPerformance)
Fig. 50.37 CPR over various vintages of ARM2/28 (Sources: Beyondbond Inc, LoanPerformance)
50.4.9 Prepayment Penalty

A prepayment penalty fee in the loan structure is a negative incentive to refinance and deters prepayment. Prepayment is essentially an embedded call option with the remaining balance as its strike. The penalty simply adds to that strike price as an additional cost when borrowers exercise the option. That additional cost drops to zero when the penalty period ends. Figure 50.37 shows the prepayment difference when a penalty clause is in place. Before the 2-year penalty period ends, prepayment is consistently slower than for no-penalty loans. As soon as the penalty period ends, prepayments surge dramatically, surpass the no-penalty loans within 3 months, and consistently maintain a faster prepayment speed thereafter.
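A minimal sketch of the penalty-as-added-strike logic: the borrower refinances only when the present value of payment savings clears the fixed costs plus any remaining penalty. All loan numbers below are hypothetical.

# Penalty raises the effective strike of the embedded call (hypothetical inputs).
def pv_annuity(monthly_saving, annual_rate, n_months):
    r = annual_rate / 12.0
    return monthly_saving * (1 - (1 + r) ** -n_months) / r

balance = 200_000.0
penalty = 0.03 * balance   # e.g., a 3% penalty while the clause is in force (assumed)
fixed_cost = 3_000.0       # closing costs of the new loan (assumed)
monthly_saving = 100.0     # payment reduction from refinancing (assumed)

gain = pv_annuity(monthly_saving, 0.08, 120) - fixed_cost
print(gain > penalty)  # inside the penalty period: strike is raised, refi deterred (False here)
print(gain > 0)        # after the penalty expires: only closing costs remain (True here)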
50.4.10 Interaction Between Prepayment and Default

As we stated at the beginning of the model framework, prepayment and default can be viewed as embedded call and put options, respectively, on the mortgage. A borrower will continuously find incentives to exercise the option if it is in-the-money (Fig. 50.38). When we estimate prepayment and default for a pool of mortgages, the remaining principal factor encompasses the entire history of the pool's prepayment and default rates. Since estimating losses is a main focus of modeling default and prepayment, this is of particular importance in a slow prepayment environment: given the same default probability, the tail risk to the loss curve will still increase substantially.
Fig. 50.38 CDR and CPR of ARM2/28, 2004 vintage (Sources: Beyondbond Inc, LoanPerformance)
Fig. 50.39 Loss projection of ARM2/28, 2004 vintage (Sources: Beyondbond Inc, LoanPerformance)
Figure 50.39 presents a tail risk example. When the prepayment speeds double, the total loss decreases from 29% to 21% given the same default speeds. Because the history of prepayment and default rates can seriously affect the remaining principal factor for any given pool of loans, tracking and rolling the principal factor of a loan pool is one of the most important elements in the model projections and future forecasts. Prepaid balances are removed from the outstanding balance and, as a result, are not available to default in the future.
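The factor-rolling logic can be sketched as below. The CDR/CPR levels and the 50% loss severity are hypothetical, so the resulting losses only illustrate the direction of the tail-risk effect, not the chapter's 29% and 21% figures.

# Roll a pool's principal factor monthly: prepaid balances are removed and
# cannot default later, so doubling CPR cuts the cumulative loss.
def cumulative_loss(cdr_pct, cpr_pct, months=240, severity=0.5):
    mdr = 1 - (1 - cdr_pct / 100.0) ** (1 / 12)  # monthly default rate
    smm = 1 - (1 - cpr_pct / 100.0) ** (1 / 12)  # monthly prepayment rate
    factor, loss = 1.0, 0.0
    for _ in range(months):
        loss += factor * mdr * severity          # defaulted balance times loss severity
        factor *= (1 - mdr) * (1 - smm)          # roll the remaining principal factor
    return loss

print(f"base CPR:    {cumulative_loss(6.0, 15.0):.1%} total loss")
print(f"doubled CPR: {cumulative_loss(6.0, 30.0):.1%} total loss")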
50.5 Delinquency Study

50.5.1 Delinquency, the Leading Indicator

Is delinquency a good leading indicator for default? When a borrower is more than 30 days late on a payment, a 30-day delinquency is reported. If the payment is late by more than 2 months, a 60-day delinquency is reported. After a 90-day delinquency, the loan is considered to be in default, and the bank holding the mortgage will likely initiate its foreclosure process, depending on the judicial status of each state. Since a default is a consequence of delinquency, delinquencies should be leading indicators of future defaults. We should be able to simply roll delinquency numbers month by month into actual defaults. The question is whether there is a constant relationship that can be parameterized. The time series plots of defaults and the spectrum of delinquencies for the 2003 vintage are shown in Fig. 50.40. The cross correlations indicate an approximately 6-month period for a 30-day delinquency to manifest into default, as shown in Fig. 50.41.

50.5.2 Analysis Among Delinquency Spectrum

The results among delinquency spectrums show a very significant cross correlation between delinquency and its lagged earlier tenor (Fig. 50.42).
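A minimal sketch of the lead-lag check: shift the 30-day delinquency series against the monthly default rate and scan the cross correlation over lags. The synthetic series below is constructed to echo, not reproduce, the roughly 6-month lead found in the data.

import numpy as np

rng = np.random.default_rng(0)
dlq30 = np.abs(np.cumsum(rng.normal(0.05, 0.3, 60)))    # synthetic 30-day delinquency series
mdr = 0.4 * np.roll(dlq30, 6) + rng.normal(0, 0.1, 60)  # defaults echo dlq30 ~6 months later
mdr[:6] = mdr[6]                                        # discard the wrapped-around edge

def xcorr(x, y, lag):
    """Correlation of x at time t with y at time t - lag."""
    return np.corrcoef(x[lag:], y[:len(y) - lag])[0, 1]

for lag in range(0, 9):
    print(lag, round(xcorr(mdr, dlq30, lag), 3))  # should peak near lag = 6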
50.5.3 A Delinquency Error Correction Default Model

Based on the results shown previously, the spectrum of various delinquencies can be parameterized for near-term projections. The benefit of including delinquency to project
Fig. 50.40 Default and delinquency over time for 2003 vintage
Fig. 50.41 Cross correlations of default and delinquency for 2000 vintages
Correlations between each delinquency series and the one-month lags of the series:

            mba30(-1)   dlq30(-1)   dlq60(-1)   dlq90(-1)
mba30       0.974228    0.892006    0.842606    0.819900
dlq30       0.896283    0.994760    0.980814    0.937639
dlq60       0.849914    0.989324    0.993112    0.934675
dlq90       0.819303    0.931421    0.915923    0.898144

Fig. 50.42 Correlations between various delinquencies
defaults is that it does not require a specific consumer behavior theory to be applied: by simply looking at a delinquency report, we are able to project the likelihood of defaults. It suffers, however, from the long-term view that a loan that fundamentally carries less credit-worthy characteristics, such as a high CLTV, has a greater propensity to default. Because we are impressed with both the short-term forecast ability provided by delinquency and the econometric model based on consumer behavior theory, we have integrated the two and created a delinquency error correction model.
Fig. 50.43 Delinquency error model: actual vs. fitted by vintage (MDR: actual MDR; PRJMDR: projected MDR; MDRF1: projected MDR with error correction)
The fundamental idea is that not only can the long-term view and various scenarios based on changing views of macroeconomic variables be adopted, but also the immediate/early warning signs from delinquency can be observed and utilized (Fig. 50.43). In our error correction model, we start by projecting default rates using the default function with fitted parameters.
We then layer on a 6-month lagged 30-day delinquency as an additional exogenous variable to regress the fitted errors. The process is then repeated sequentially by adding 5-month lagged 60-day and then 4-month lagged 90-day delinquency rates as new regressors. The results are very encouraging when compared to the base model without error correction: the additional R2 pickup is about 15% (Fig. 50.44).
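The error correction step can be sketched as a regression of the base model's fitted errors on the lagged delinquency spectrum, as below. The data are synthetic stand-ins; only the three lag lengths follow the text.

import numpy as np

rng = np.random.default_rng(1)
n = 60
base_proj = rng.uniform(0.2, 1.2, n)                 # base-model projected MDR (stand-in)
dlq30_l6 = rng.uniform(1.0, 6.0, n)                  # 30-day delinquency, lagged 6 months
dlq60_l5 = 0.5 * dlq30_l6 + rng.normal(0, 0.2, n)    # 60-day, lagged 5 months
dlq90_l4 = 0.3 * dlq30_l6 + rng.normal(0, 0.2, n)    # 90-day, lagged 4 months
actual = base_proj + 0.08 * dlq30_l6 + 0.05 * dlq60_l5 + 0.1 * dlq90_l4 + rng.normal(0, 0.05, n)

# Regress the base-model errors on the lagged delinquency spectrum
X = np.column_stack([np.ones(n), dlq30_l6, dlq60_l5, dlq90_l4])
errors = actual - base_proj
coef, *_ = np.linalg.lstsq(X, errors, rcond=None)

corrected = base_proj + X @ coef
print("R^2 pickup:",
      1 - ((actual - corrected) ** 2).sum() / ((actual - base_proj) ** 2).sum())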
Fig. 50.44 Comparison of model explanation power for 2000-2007 vintages (Sources: Beyondbond Inc, LoanPerformance)

50.6 Conclusion

50.6.1 Traditional Models

As a result of the credit crisis, we now know we must have missed something in the traditional models. This requires us to take a hard look at the models and methodologies employed previously and see what is needed to provide a better interpretation of current market data and conditions. Traditionally, practitioners have observed consumer behavior through historical defaults and prepayments while building an econometric model with several quantifiable factors. These factors include seasoning patterns; underlying loan characteristics such as mortgage coupon, FICO score, loan-to-value, and debt-to-income ratio; and macroeconomic variables such as the prevailing mortgage rate and housing price appreciation. In order to fit the historical data, non-linear functions are usually constructed with parameters around the factors to explain default and/or prepayment probabilities. During the process of fitting the historical sample to the econometric model, traditional modelers usually miss the following:
1. Traditional models focus on fitting in-sample data with a unique parameter set by vintage. Although in-sample data fitting provides a much easier fit of the parameter set, it assumes that borrower behavior varies given the same loan characteristics and loan age. It creates a disconnection among vintages and cannot be applied to new loans.
2. Borrower behaviors underlying LTV, FICO, and DTI were implicit but not fully quantified in a dynamic form by traditional models. Since borrower and loan information such as LTV, FICO, and DTI levels are not periodically updated after the loan origination date, the accuracy of the projected performance of seasoned loans diminishes as the original data becomes aged and less relevant. Out-of-sample projections may produce counterintuitive results.
3. Macroeconomic variables, such as HPA, unemployment level, personal gross income, and so on, can be very important factors for in-sample fitting. However, they do not provide insight for new scenarios. If a new scenario has not occurred historically, a stress test for the new scenario should be thoroughly pre-examined.
4. Traditional models focus on the national level rather than the local housing markets. Since house prices are highly dependent on location, a model with more detailed housing information can make a dramatic difference in the accuracy of its forecasts.
5. Traditional models treat prepayments and defaults independently and ignore the complexity and interaction between these embedded call and put options.
6. Traditional models do not dynamically quantify feedback from other leading indicators such as delinquency rates.
50.6.2 Innovation

Having addressed the pitfalls of the traditional models, we have built a Dynamic Econometric Loss (DEL) model framework with the following innovations: consistent parameter sets for all vintages via the addition of consumer behavior factors.
1. Dynamic consumer behavior factors
(a) CLTV ratio (via cumulative HPA since origination) that reflects housing market wealth effects during housing boom/bust periods.
(b) DTI ratio (via unemployment rate forecasts) that addresses housing affordability.
2. Complete study of the HPA index prior to model-fitting
(a) HPCUM, the cumulative HPA since origination, to capture the wealth effect.
(b) HPA to capture the pulse of the housing market.
(c) HPA2D, the change of HPA, to capture the trend of the housing market. HPA2D successfully captures the timing of defaults for 2005 to 2006 vintages.
(d) In-sample and out-of-sample HPA fit testing to ensure the model's robustness.
3. A detailed CBSA-level HPA model that allows us to understand local housing markets better and to generate more precise projections.
4. Recursive calculations along seasoning paths while estimating/projecting prepayments and defaults.
5. An error correction model that systematically builds the linkage between delinquency and default to enhance default forecast accuracy.
50.6.3 Advantages

The implementation based on our model framework will capture the default and loss patterns exhibited during the recent period and use the information contained in them to forecast future prepayments, defaults, and losses based on various macroeconomic market scenarios. The implementation advantages are as follows:
1. Multiplicative and additive factors for each nonlinear function (boot-strapping Maximum Likelihood Estimation).
2. Comprehensive consumer behavioral economic theory applied in practice:
(a) Develop a consumer behavior-based economic theory.
(b) Estimate consumer behavior via an econometric model.
(c) Apply the econometric model to prepayment and default.
3. Full utilization of HPA time-series information: a built-in time-series fitting model that dynamically estimates parameters and generates forecasts on the fly. For example,
HPCUM ↓ (below 5%) ⇒ CLTV ↑ ⇒ MDR ↑, SMM ↓
HPA ↓ (below 2%) ⇒ MDR ↑, SMM ↓
HPA2D ↓ (below 5%) ⇒ MDR ↑, SMM ↓
4. Multiple built-in time-series fitting models at the national, state, and CBSA level that dynamically estimate parameters and generate forecasts on the fly.
5. A built-in recursive calculator along seasoning paths for projecting prepayments and defaults.
6. A set of error correction fitting models that estimate parameters within the spectrum of delinquencies and defaults.
50.6.4 Findings

In order to understand how a loan prepays or defaults, we have investigated consumer behavior via loan characteristics, utilizing static factors and relevant macroeconomic variables as dynamic factors. For each factor, we have constructed a non-linear function with respect to the magnitude of the factor. We build the default/prepayment function as a linear combination of these factors to justify the impact of each
factor accordingly. Since a loan can either prepay or default over time, we continue to ensure that the principal factors are rolled properly for prepayment and default forecasts. While the level of HPA is considered the main blessing/curse for the rise and fall of the subprime market, we find that cumulative HPA and the change of HPA also contribute to prepayments and defaults.
1. HPI is significantly correlated with DPI over the long term. Since DPI is a more stable time series, this suggests that HPI will eventually adjust to coincide with the DPI growth rate.
2. Default is strongly correlated with the spectrum of delinquency rates. By applying the fitted parameters between default and delinquency rates to an error correction model, we are able to effectively improve default predictability.
50.6.5 Future Improvements

Modeling mortgage defaults and prepayments as embedded options is an ongoing learning process. While we are encouraged by our findings, there is a myriad of new questions for us to address with an aim to continuously improve and fine-tune the model. Some areas for further investigation are briefly described below.
50.6.5.1 Business Cycle - Low Frequency of Credit Spread

While studying the dynamic factors in the default modeling section, we focused mainly on the HPI impact on consumer behavior and introduced DPI as another macroeconomic variable to determine the long-term growth of the economy. At the beginning of this paper, we wondered how a relatively small volume of loans could result in a subprime crisis so detrimental to the entire US financial market and the global financial system. We believe that the subprime crisis was merely the tipping point of the unprecedented credit market easing that has existed since early this century. During this era of extremely easy credit, yield hungry investors sought to enhance their returns through investment in either highly leveraged securities or traditionally highly risky assets such as subprime loans. Through the rapid growth of credit default swaps in the derivative markets and RMBS, ABS, and CDOs in the securitization markets, subprime mortgage origination volume reached record highs after 2003. The credit ease impacted not just the subprime market: all credit-based lending, from credit cards to auto loans and leveraged buy-out loans, enjoyed a borrower friendly environment as lenders went on a lending spree. While credit default rates reached their historical lows over the
last decade and resulted in extremely tight spreads among credit products, a longer view of the history of business cycles started to reveal warning signs of the potential downside risk. For example, the TED spread widened dramatically after August 2007, a recurrence of the late-eighties market environment (Fig. 50.45). Over the past 20 years, traditional calibration models that focused only on shorter time frames missed the downside "fat tail." The improbable is indeed plausible. Is there a better method to mix the long-term low frequency data with the short-term high frequency data and provide a better valuation model?

Fig. 50.45 Historical TED spread and histogram (Source: Beyondbond Inc.)
50.6.5.2 Dynamic Loss Severity

When using prepayment and default rates to forecast the performance of mortgages and mortgage-derived securities, it is usual practice to treat the lagged timing of loan loss/recovery and the loan loss/recovery level as given assumptions. The detailed HPA information provided at the CBSA level and better detailed information from the loan servicers in recent years have allowed us to begin to model these variables to create dynamic loss severity percentages. Greater cooperation with the servicers will lead to more robust estimations.
Appendix 50A Default and Prepayment Definition

We consider a loan to be in default if it meets both of the following criteria:
1. The loan is not able to generate any future investor cash flow.
2. The loan has been in foreclosure, REO, or reporting a loss in the prior reporting period.

The Monthly Default Rate (MDR) is defined as the defaulted amount, the sum of all defaulted loan balances, as a percentage of the aggregate loan balance of that period. SMM (Single Month Mortality) is calculated by the formula

$$\mathrm{SMM} = \frac{\text{Scheduled Balance} - \text{Current Balance}}{\text{Scheduled Balance}}$$

Given MDR and SMM, we can derive CDR and CPR using the formulas

$$\mathrm{CDR} = 1 - (1 - \mathrm{MDR})^{12}, \qquad \mathrm{CPR} = 1 - (1 - \mathrm{SMM})^{12}$$
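The conversions can be checked with a few lines of code; the balances below are hypothetical.

# Appendix 50A's annualization formulas, mechanically.
def annualize(monthly_rate):
    """MDR -> CDR or SMM -> CPR: compound the monthly decrement 12 times."""
    return 1 - (1 - monthly_rate) ** 12

def deannualize(annual_rate):
    """CDR -> MDR or CPR -> SMM (inverse of the above)."""
    return 1 - (1 - annual_rate) ** (1 / 12)

smm = (1000.0 - 990.0) / 1000.0  # scheduled 1000, current 990 -> SMM = 1%
print(f"CPR = {annualize(smm):.2%}")       # ~11.36%
print(f"SMM = {deannualize(0.1136):.3%}")  # back to ~1.0%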
Appendix 50B General Model Framework

$$
y_t^{(s)} = \sum_{k=0}^{K} \varphi_k\!\left(X_t^{(k)} \,\big|\, \alpha_m^{(k)}, \beta_m^{(k)};\ m \in [0, M^{(k)}]\right)
\cdot \prod_{i=0}^{I} \psi_i\!\left(X_t^{(i)} \,\big|\, \alpha_m^{(i)}, \beta_m^{(i)};\ m \in [0, M^{(i)}]\right)
\cdot \prod_{j=0}^{J} \lambda_j\!\left(X_{t,m}^{(j)} \,\big|\, \alpha_m^{(j)}, \beta_m^{(j)};\ m \in [0, M^{(j)}]\right)
$$

$$
\phantom{y_t^{(s)}} = \sum_{k=0}^{K} \varphi_k(\cdot)
\cdot \prod_{i=0}^{I} \psi_i(\cdot)
\cdot \prod_{j=0}^{J} \left(1 + \sum_{m}^{M^{(j)}} X_{t,m}^{(j)} \beta_m^{(j)}\right)
$$

where
y_t^{(s)} is an observable value at time t for dependent variable type s;
φ_k is a spline interpolation function with pair-wise knots (α_m^{(k)}, β_m^{(k)});
X_t^{(k)} is an observable value of factor k at time t;
K is the number of additive spline functions;
ψ_i is a spline interpolation function with pair-wise knots (α_m^{(i)}, β_m^{(i)});
X_t^{(i)} is an observable value of factor i at time t;
I is the number of multiplicative spline functions;
λ_j is equal to 1 + Σ_m X_{t,m}^{(j)} β_m^{(j)} and is a linear combination function with multiplier β_m^{(j)} of X_{t,m}^{(j)}, where X_{t,m}^{(j)} is an observable value of the type-m factor at time t and β_m^{(j)} is the composition ratio of the distinct factor j of type m;
J is the number of linear functions.

Appendix 50C Default Specification

A whole loan mortgage starts at t_0 and matures by t_n; its MDR by time t can be driven by two types of variables - static and dynamic. Collateral characteristics such as mortgage rate, loan size, IO period, teaser period, loan structure, term to maturity, geographic location, FICO, and CLTV are static factors, since their impact diminishes over time as the loan seasons. Macroeconomic variables over time, such as the Housing Price Index, mortgage interest rates, unemployment rates, Gross Disposable Income, and inflation rates, are dynamic. They are publicly observable and will adjust the default rate forecasts based on the scenario assumption. We formulate our default function MDR as follows:

$$
D_t = \big[\varphi_{LTV}(v_t \mid LTV_j, h_t) + \varphi_{FICO}(c_j) + \varphi_{rate}(r_t \mid WAC_t) + \varphi_{age}(a_i \mid a_0) + \varphi_{DTI}(d_j \mid DTI_j, DOC_j) + \varphi_{IO}(g_t \mid IO_j, a_i) + \varphi_{size}(s)\big] \cdot \lambda_{HPA}(HPA) \cdot \lambda_{H2D}(H2D) \cdot \lambda_{DOC}(Doc_m) \cdot \lambda_{LIEN}(LIEN_m) \cdot \lambda_{PURPOSE}(PURPOSE_m)
$$

where the φ's are spline functions in MDR % and are additive, forming a base value, and the λ's are spline functions acting as multipliers for the MDR adjustments:
v_t: CLTV by time t, where the initial CLTV is assumed at time t_0
r_t: ratio spread of WAC_t over the original WAC rate
c_j: FICO score of loan j
a_i: age of loan j
d_t: DTI
g_t: remaining IO period, if IO exists and is positive
l_j: size of loan j

φ_LTV: original LTV level and HPA_t, with v_t = v_t(v_0, h_t, z_j)
804
C.H. Ted Hong
H_{t_i}: HPI at time t_i since origination date t_0;
z_j: geographic zip-code multiplier, e.g., z_1 = z(CA) = 1.3, z_2 = z(OH) = 1.1, z_3 = z(MI) = 1.01, z_0 = z(Other) = 1.

RATE Effect: r_t = (WAC_t − MTG_t), and φ_rate(r_t) is a spline function of r_t. WAC_t is the gross coupon, which is either observable or can be simulated from index rates and loan characteristics. Index rate forecasting will be a spread over the corresponding index rate (LIBOR6M, 1Y-CMT, COFI, 5Y-CMT, etc.):

$$y_t = \beta_0 + \beta_1 y_{t-1} + \beta_2\,\mathrm{Swp2Y}_t + \beta_3\,\mathrm{Swp5Y}_t + \beta_4\,\mathrm{Swp10Y}_t + \beta_5\,\mathrm{LIBOR1M}_t + \varepsilon_t$$

v_t: the functional form of v_t is

$$v_t = v_0\,\frac{H_{t(i-\mathrm{lag})}}{H_{t(0-\mathrm{lag})}}\, z_j$$

h_t: the functional form of h_t is a simple AR(2) model, h_t = β_{h0} + β_{h1} h_{t−1} + β_{h2} h_{t−2} + ε_t, where all the parameters can be regressed independently from h_t's time series data.
z_j: the functional form of z_j is set up as dummy variables, z_j = β_{zj} z(j) if j = "CA", and the parameter β_{zj} can be calibrated to default data by bootstrapping the value.
f_t: the actual principal factor, which will be either observed for in-sample filtering or simulated for out-of-sample forecasts.

FICO: checks whether (original) credit scores are a good measure of default. The functional form of c_j will be a spline (natural, linear, or tension) function with fixed FICO locators (suggested only) [250, 350, 450, 500, 525, 550, 580, 600, 625, 650, 680, 700, 720, 750, 800, 820], and the parameters can be calibrated to the default database and fine-tuned.

AGE: default probability increases as the loan gets seasoned but eventually reaches a plateau, other things held constant. For a_t we sample a linear spline function from 0 to 1 with age locators [0, 1, 5, 10, 15, 20, 30, 45, 60, 120].

DTI Effect: income level will affect default, under the assumption that documentation is fully available:

$$u_t = u_0\,\frac{GDP_t}{GDP_0}\left(\frac{UM_t}{UM_0}\right)^{\beta(UM)}$$

The functional form λ_u(u_t) is a linear spline function of u_t, and λ_DTI(u_t, w_j) = (λ_u(u_t))^{λ_w(w_j)}, where λ_w(w_0) = 1 → Full = w_0; λ_w(w_1) = 0.1 → Low = w_1; λ_w(w_2) = 0 → No = w_2.

IO-Payment-Shock: increased payments at the end of the IO period will increase defaults. g_t = IO_0 − a_t, and λ_IO(g_t) is a linear spline function with locators [−30, −20, −10, −5, −2, 0, 2, 5, 10, 20].

Crowding Out: measures whether the underwriting standard has deteriorated. λ_volume is a spline function, and vm_t is the whole-loan issue amount ratio (FICO ≤ 580 versus 580 < FICO ≤ 700). Note: the 30-day delinquency rate for the (12-month) ratio can be used if a delinquency report is available.

λ_size is a simple step-spline function of loan size after default, with locators [50k, 100k, 150k, 250k, 500k, 800k, 1 million].
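As an illustration of the DTI adjustment just described, the sketch below implements λ_DTI(u_t, w_j) = λ_u(u_t)^{λ_w(w_j)} with the stated documentation weights; the λ_u knots and their decreasing shape are hypothetical placeholders, not calibrated values.

```python
import numpy as np

def lambda_u(u):
    # Linear spline in the affordability index u_t (assumed: higher u, lower default).
    knots = [0.5, 0.8, 1.0, 1.2, 1.5]
    values = [2.0, 1.4, 1.0, 0.8, 0.6]
    return float(np.interp(u, knots, values))

DOC_WEIGHT = {"full": 1.0, "low": 0.1, "no": 0.0}  # lambda_w from the spec above

def lambda_dti(u, doc):
    return lambda_u(u) ** DOC_WEIGHT[doc]

print(lambda_dti(0.9, "full"), lambda_dti(0.9, "no"))  # 'no doc' gives multiplier 1
```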
Occupancy (ocp) takes three values (Owner, Second Home, Investor).
Loan Purpose (prs) takes three values (Purchase, Refi, Cash Out).
Lien (lien) takes two lien positions (First lien, Second lien).
Loan Document (doc) takes three documentation types (Full, Limited, and No Document).
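Putting the pieces together, the default function composes an additive base in MDR % with the multiplicative adjustments; a schematic sketch follows, with purely hypothetical factor values rather than calibrated ones.

```python
def mdr(base_terms, multipliers):
    """Compose MDR: additive phi terms form the base (in %), lambdas scale it."""
    out = sum(base_terms)      # e.g. phi_LTV + phi_FICO
    for m in multipliers:      # e.g. lambda_rate, lambda_age, lambda_IO, ...
        out *= m
    return out

# Illustrative only: base of 0.35% MDR, scaled by three adjustment multipliers.
print(mdr(base_terms=[0.20, 0.15], multipliers=[1.3, 0.9, 1.1]))
```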
Appendix 50D Prepayment Specification

Single Monthly Mortality (SMM) Rate Function

$$
S_t = \varphi_{rate}(r_t)\,\lambda_{turnover}(\cdot)\,\lambda_{teaser}(ts_t)\,\lambda_{seasonality}(\cdot)\,\lambda_{cashout}(\cdot)\,\lambda_{age}(a_t)\,\lambda_{burnout}(f_t)\,\lambda_{yieldcurve}(\cdot)\,\lambda_{equity}(\cdot)\,\lambda_{credit}(V_t)\,\lambda_{IO}(g_t)\,\lambda_{issuer}(IY_j)\,\lambda_{size}(l_j)\,\lambda_{penalty}(N_{yes/no})
$$
Rate Factor: φ_rate(r_t) captures the REFI incentive. φ_rate is a natural spline function with 20 locators [−10, −5, −2, −1, 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 5, 6, 7, 9, 10, 15, 20], where

$$r_t = \mathrm{WAC} - m_t \ (\text{Fixed}); \qquad r_t = \mathrm{WAC}_t - m_t \ (\text{ARM/Hybrid})$$

m_t: the FH 30-yr/10-day commitment rate (FHR3010), used as the prevailing mortgage rate to measure the SATO effect.

Age Factor: prepayment initially has less incentive owing to the initial financing sunk cost, but the probability increases as the three-year costs average out over time. Mortgages generally display an age pattern.

Housing Turnover Rate: prepayment is based on the long-term housing turnover rate, composed of existing sales over the single-family owner housing stock.

Seasonality: monthly seasonality is generally believed to affect prepayments. The belief stems from the mobility of mortgagors, the timing of housing construction, the school year, and weather considerations. For a specific month of the year, ceteris paribus, prepayment rates are directly affected by the related month-of-year coefficient. Usually, the seasonality pattern shows greater activity in the spring, rises to a peak in the summer, decreases through the fall, and slows down even more in the winter. The pattern may differ geographically and demographically.

Cash-Out: cash-out prepayment is driven by general housing price appreciation.

Burnout Effect: borrowers do not behave homogeneously when they encounter the same refinancing opportunities; some are more sensitive than others. If borrowers are heterogeneous with respect to refinancing incentives, those who are more interest-rate sensitive will refinance sooner. The remainder will be composed of less interest-sensitive borrowers.
Chapter 51
The Effect of Default Risk on Equity Liquidity: Evidence Based on the Panel Threshold Model Huimin Chung, Wei-Peng Chen, and Yu-Dan Chen
Abstract This research sets out to investigate the relationship between credit risk and equity liquidity. We posit that as the firm's default risk increases, informed trading increases in the firm's stock and uninformed traders exit the market. Market-makers widen spreads in response to the increased probability of trades against informed investors. Using the default likelihood measure calculated by Merton's method, this paper investigates whether financially ailing firms do indeed have higher bid-ask spreads. The panel threshold regression model is employed to examine the possible non-linear relationship between credit risk and equity liquidity. Since high default probability and worse economic prospects lead to greater expropriation by managers, and thus greater asymmetric information costs, liquidity providers will incur relatively higher costs and will therefore offer higher bid-ask spreads. This issue is further analyzed by investigating whether there is any evidence of increased vulnerability in the equity liquidity of firms with high credit risk. Our results show that the effects caused by increased default likelihood might precipitate investors' loss of confidence in equities trading, and thus a decrease in liquidity, as evident during the Enron crisis period.

Keywords Default risk · Equity liquidity · Financial distress costs · Panel data · Threshold regression
W.-P. Chen () Department of Finance, Shih Hsin University, Taiwan e-mail: [email protected] H. Chung Graduate Institute of Finance, National Chiao Tung University, Hsinchu 30050, Taiwan Y.-D. Chen Graduate Institute of Finance, National Chiao Tung University, Taiwan
51.1 Introduction

Many researchers have studied the costs of financial distress, including direct costs and indirect costs.¹ Beyond direct and indirect costs, however, recent research has revealed a new notion of financial distress costs. Previous studies that examine the effect of default risk on equities focus on the ability of default spreads to explain or predict returns. Vassalou and Xing (2004) use Merton's (1974) option pricing model to compute default measures for individual firms and assess the effect of default risk on equity returns. Although considerable research effort has been put toward modeling default risk for the purpose of valuing corporate debt and derivative products written on it, little attention has been paid to the effects of default risk on equity liquidity.

This paper considers the effects of default risk on equity liquidity. Agrawal et al. (2004) demonstrate that firms in financial distress suffer from reduced stock liquidity through increased bid-ask spreads. In other words, as a firm's performance deteriorates and the likelihood of financial distress increases, the stock liquidity of the firm worsens.

In the market microstructure literature, the bid-ask spread is modeled as arising from three sources: adverse selection, order processing costs, and inventory holding costs. The adverse-selection component compensates the market-maker for losses incurred on trades against informed traders. Market makers normally expect to profit in their transactions with noise traders, who do not possess any private information. Therefore, market makers set spreads wide enough to guarantee that their profits on trades with uninformed traders cover the expected losses from trades with informed traders. In periods of expected high financial distress costs, default risk might affect equity liquidity more severely. This is because the managerial agency costs are

¹ The direct costs comprise legal and administrative expenses of bankruptcy proceedings (Warner 1977; Weiss 1990); the indirect costs consist of management resources devoted to resolving financial distress, loss of suppliers and customers, and constraints on the firm's financing and investment opportunities (Altman 1984; Titman 1984; Wruck 1990).
particularly severe in stressful market conditions, when managers' incentive to pursue their private benefits increases. Market makers might have to widen the bid-ask spread in response to the perceived increase in information asymmetry costs. There are also several other reasons why uninformed investors are less likely to trade in the securities of financially troubled firms. Information asymmetry costs increase with a firm's financial distress. When a firm's performance and financial condition deteriorate, trading in the firm's security may become increasingly dominated by informed investors. If the liquidity provider believes that the probability of some traders possessing superior information has increased for equities with high default risk, the liquidity provider can protect himself by quoting a higher bid-ask spread.

This paper explores the impact of default risk on liquidity during the Enron crisis period, when a pronounced loss of confidence among stock investors was observed and expected financial distress costs were extreme.² The bankruptcy of Enron Corp. was contagious: investors became more and more sensitive to the financial statements and credit risk of companies. In this paper, we investigate the cross-sectional relationship between firms' default risk and their equity liquidity, and whether the results are more pronounced in the period of high financial distress costs. We posit that the recent corporate accounting scandal at Enron and the bankruptcies of other listed companies were precipitating factors in investors' loss of confidence in equity trading, particularly for stocks with high default risk. To account for the potential non-linear relationship between default risk and equity liquidity, the analysis is conducted using the panel threshold regression model of Hansen (1999). We apply Merton's (1974) option pricing model to estimate default risk for individual firms; the default risk measure is called the "Default Likelihood Indicator (DLI)."

Our empirical results provide interesting findings on the influence of default risk on liquidity. The results show that the perceived default risk at the beginning of the month has predictive content for the liquidity (average bid-ask spread) of that month. DLI reveals an extremely significant positive relation to the percentage bid-ask spread, showing that DLI is one of the key determinants of liquidity cost and indicating that financially troubled firms suffer the cost of reduced liquidity. More importantly, in the period of high expected

² During the fourth quarter of 2001, most global financial markets were sluggish, especially the U.S. stock market, owing to the explosion of a series of corporate scandals. Most famously, on December 2, 2001, the energy giant Enron Corp. announced that it and 14 of its subsidiaries had filed voluntary petitions for Chapter 11 reorganization with the U.S. Bankruptcy Court for the Southern District of New York. Enron listed $49.8 billion in assets and $31.2 billion in debts, making it the biggest corporate bankruptcy in U.S. history.
financial distress costs, the economic costs of liquidity appear to be higher for firms with high default probabilities.

The remainder of this paper is organized as follows. The next section discusses the data and methodologies, including default likelihood indicators, the panel data regression model, and the threshold regression model. Empirical tests and results are presented in Sect. 51.3. Section 51.4 outlines our conclusions.
51.2 Data and Methodology

51.2.1 Data

The sample data include the component stocks of the S&P 500. To avoid the effect of different market mechanisms on equity liquidity, only the component stocks listed on the New York Stock Exchange (NYSE) are included, as Stoll (2000) concludes that market structure has a clear effect on bid-ask spreads. The sample period is from February 1, 2001 to May 31, 2002.³ Intraday trade and quote data are taken from the NYSE Trade and Quote (TAQ) database. Following Stoll (2000), the averages of each of the underlying variables are taken across all of the days in each month in order to reduce the errors associated with a single day. Firms that have missing values for the variables used in this study are deleted from our sample. Consequently, our sample size reduces to 276 firms, and the number of firm-months is 4,416.

We delete all trades and quotes that were out of time sequence, as well as those that involved any errors. Following Huang and Stoll (1996) and many previous studies, we attempt to further minimize data errors by eliminating quotes with the following characteristics: (1) where either the bid or the ask price is equal to, or less than, zero; (2) where either the bid or the ask depth is equal to, or less than, zero; (3) where either the price or the volume is equal to, or less than, zero; (4) all quotes with a negative bid-ask spread, or a bid-ask spread greater than US$4; (5) all trades and quotes that are either "before-the-open" or "after-the-close"; (6) all trade prices P_t where |(P_t − P_{t−1})/P_{t−1}| > 0.1; and (7) all ask quotes where |(a_t − a_{t−1})/a_{t−1}| > 0.1 and all bid quotes where |(b_t − b_{t−1})/b_{t−1}| > 0.1 (a rough sketch of these screens appears at the end of this subsection).

In our cross-sectional model, the dependent variable is the percentage bid-ask spread (PSP), and the explanatory variables are the closing price (P), the number of trades (NT), volatility (SIG), market value (MV), and the default likelihood indicator (DLI). PSP is calculated as the average percentage spread in the final 5 min of each trading day. The averages of the daily observations of PSP, closing price, and number of trades are
³ NYSE switched to the decimal pricing system on January 29, 2001. Thus the period of our research starts from February 1, 2001.
calculated for each stock in each month. We compute the standard deviation of daily stock returns over a given month (SIG) as the volatility of these daily returns over that 1-month period. MV is the stock's market value. The book value of total debt includes both current and long-term debt. DLIs are nonlinear functions of the default probabilities of the individual firms. We estimate DLI using the contingent claims methodology of Black and Scholes (1973) and Merton (1974). This paper employs the COMPUSTAT database to retrieve each firm's "Market Value," "Debt in One Year," "Long-Term Debt," and "Total Liabilities" series. We acquire information on daily stock returns from the Center for Research in Security Prices (CRSP) daily return files. To calculate DLI, we observe the monthly 1-year T-bill rate r from the Federal Reserve's website.
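For illustration, the quote screens listed above might be coded as follows; the TAQ-style column names are assumptions, and only rules (1), (2), (4), and (7) are shown, with the remaining screens applied analogously.

```python
import pandas as pd

def clean_quotes(q: pd.DataFrame) -> pd.DataFrame:
    """Apply screens (1), (2), (4), and (7) to a quote DataFrame sorted by time."""
    spread = q["ask"] - q["bid"]
    keep = (
        (q["bid"] > 0) & (q["ask"] > 0)                  # (1) non-positive prices
        & (q["bid_depth"] > 0) & (q["ask_depth"] > 0)    # (2) non-positive depth
        & (spread >= 0) & (spread <= 4.0)                # (4) spread in [0, $4]
    )
    q = q[keep].copy()
    for col in ("bid", "ask"):                           # (7) >10% quote moves
        pct_move = (q[col] - q[col].shift()).abs() / q[col].shift()
        q = q[pct_move.fillna(0) <= 0.1]
    return q
```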
51.2.2 Methodology

51.2.2.1 Bid-Ask Spread

The percentage spread is used as a measure of trading cost. Spreads are averaged for each security for each month of the sample. The nominal spread of security i at time t, Traded Spread_it, is calculated as Ask_it − Bid_it, where Ask_it and Bid_it are the respective average intraday ask and bid prices at time t for security i. The percentage spread is calculated as:

$$\text{Percentage Spread}_{it} = \frac{Ask_{it} - Bid_{it}}{(Ask_{it} + Bid_{it})/2} \tag{51.1}$$

51.2.2.2 Measure of Default Risk

Vassalou and Xing (2004) is the first research that uses Merton's (1974) option pricing model to calculate default measures for individual firms and appraise the effect of default risk on equity returns. The basic concept of Merton (1974) is that the equity of a firm can be thought of as a call option on the firm's assets, so that one can estimate the value of equity by using the Black and Scholes (1973) model. Our procedure for calculating default risk measures using Merton's option pricing model is similar to the one used by KMV. We assume the capital structure of the firm includes equity and debt. The market value of a firm's underlying assets follows a Geometric Brownian Motion (GBM) of the form:

$$dV_A = \mu V_A\, dt + \sigma_A V_A\, dW \tag{51.2}$$

where V_A is the firm's asset value, with an instantaneous drift μ and an instantaneous volatility σ_A, and W is a standard Wiener process.

Let X_t be the book value of the debt at time t, with maturity equal to T. X_t is used as the strike price of an option, since the market value of equity, V_E, can be regarded as a call option on V_A with time to expiration equal to T. Using the Black and Scholes formula for call options, the market value of equity is estimated as:

$$V_E = V_A N(d_1) - X e^{-rT} N(d_2), \tag{51.3}$$

and

$$d_1 = \frac{\log(V_A/X) + \left(r + \tfrac{1}{2}\sigma_A^2\right)T}{\sigma_A \sqrt{T}}, \qquad d_2 = d_1 - \sigma_A\sqrt{T}, \tag{51.4}$$

where r is the risk-free rate and N(·) is the cumulative density function of the standard normal distribution. An iterative procedure is adopted to calculate σ_A and then to estimate daily values of V_A for each month, in the following steps:

Step 1. Daily data from the past 12 months are used to estimate the volatility of equity (σ_E), which is then used as an initial value for the estimation of σ_A.
Step 2. Using the Black and Scholes formula given in Equation (51.3), we obtain V_A using V_E as the market value of equity of that day, for each trading day of the past 12 months. In this way, we capture daily values of V_A for the past 12 months.
Step 3. We then compute the standard deviation of those V_A from the previous step, which is used as the value of σ_A for the next iteration.
Step 4. Steps 2 and 3 are repeated until the values of σ_A from two consecutive iterations converge. Our tolerance level for convergence is 10⁻⁴. Once the converged value of σ_A is obtained, we use it to back out V_A through Equation (51.3).
Step 5. The above procedure is repeated at the beginning of every month, resulting in the estimation of monthly values of σ_A. The estimation window is always kept equal to 12 months. The risk-free rate is the 1-year T-bill rate.

Once daily values of V_A for month t are estimated, we can compute the drift μ by calculating the annual compound return rate on the asset value, log(V_{A,t}/V_{A,t−12}). The default probability is the probability that the firm's assets will be less than the book value of the firm's liabilities at maturity T.⁴ In other words,

$$P_{def,t} = \Pr\left(V_{A,t+T} \le X_t \mid V_{A,t}\right) = \Pr\left(\log(V_{A,t+T}) \le \log(X_t) \mid V_{A,t}\right) \tag{51.5}$$

⁴ We use the "Debt in One Year" plus half the "Long-Term Debt" as the book value of debt.
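Steps 1-5 can be sketched compactly in Python. The bisection inversion of Equation (51.3), the √252 annualization, and the input names are our own assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import norm

def solve_va(ve, x, r, sigma_a, T=1.0):
    """Invert Eq. (51.3) for V_A by bisection, given V_E, debt point X, and sigma_A."""
    lo, hi = ve, ve + x * np.exp(-r * T) + 1e-9   # V_E <= V_A <= V_E + X e^{-rT}
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        d1 = (np.log(mid / x) + (r + 0.5 * sigma_a**2) * T) / (sigma_a * np.sqrt(T))
        d2 = d1 - sigma_a * np.sqrt(T)
        price = mid * norm.cdf(d1) - x * np.exp(-r * T) * norm.cdf(d2)
        lo, hi = (mid, hi) if price < ve else (lo, mid)
    return 0.5 * (lo + hi)

def iterate_sigma_a(ve_series, x, r, tol=1e-4, T=1.0):
    """Steps 1-4: iterate on sigma_A until two consecutive values converge."""
    sigma_a = np.std(np.diff(np.log(ve_series))) * np.sqrt(252)   # Step 1 proxy
    for _ in range(50):
        va = np.array([solve_va(ve, x, r, sigma_a, T) for ve in ve_series])  # Step 2
        new_sigma = np.std(np.diff(np.log(va))) * np.sqrt(252)               # Step 3
        if abs(new_sigma - sigma_a) < tol:                                   # Step 4
            return new_sigma, va
        sigma_a = new_sigma
    return sigma_a, va
```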
Because the value of the assets follows the GBM of Equation (51.2), the value of the assets at any time t is given by:

$$\log(V_{A,t+T}) = \log(V_{A,t}) + \left(\mu - \frac{\sigma_A^2}{2}\right)T + \sigma_A\sqrt{T}\,\varepsilon_{t+T}, \quad \varepsilon_{t+T} = \frac{W(t+T) - W(t)}{\sqrt{T}}, \quad \varepsilon_{t+T} \sim N(0,1) \tag{51.6}$$

Hence, we can get the default probability as follows:

$$P_{def,t} = \Pr\!\left(\log(V_{A,t}) - \log(X_t) + \left(\mu - 0.5\sigma_A^2\right)T + \sigma_A\sqrt{T}\,\varepsilon_{t+T} < 0\right)$$
$$P_{def,t} = \Pr\!\left(\varepsilon_{t+T} \le -\frac{\log(V_{A,t}/X_t) + (\mu - 0.5\sigma_A^2)T}{\sigma_A\sqrt{T}}\right) \tag{51.7}$$
Distance to default (DD) can be defined as follows:

$$DD_t = \frac{\log(V_{A,t}/X_t) + (\mu - 0.5\sigma_A^2)T}{\sigma_A\sqrt{T}} \tag{51.8}$$

When the ratio of the value of assets to debt is less than 1, or its log is negative, default occurs. The DD tells us by how many standard deviations the log of this ratio needs to deviate from its mean in order for default to occur. Although the value of the call option in Equation (51.3) does not depend on μ, DD does, as it depends on the future value of assets given in Equation (51.6). Using the normal distribution implied by Merton's model, the theoretical probability of default, called the Default Likelihood Indicator (DLI), is given by⁵:

$$P_{def} = N(-DD) = N\!\left(-\frac{\log(V_{A,t}/X_t) + (\mu - 0.5\sigma_A^2)T}{\sigma_A\sqrt{T}}\right) \tag{51.9}$$

DLI increases as (1) the value of debt rises, (2) the market value of equity, and hence of assets, goes down, and (3) the assets' volatility increases. Compared with approaches that use accounting information to measure default risk, DLI has a range of strengths: (1) it has strong theoretical underpinnings; (2) it takes into account the volatility of a firm's assets; and (3) it is forward looking, being based on stock market data rather than historical book-value accounting data.

51.2.2.3 Panel Data Analysis of Default Risk on Equity Liquidity

The important determinants of percentage bid-ask spreads have been established by numerous earlier papers; see Benston and Hagerman (1974), Tinic and West (1972), and Stoll (1978a, 1978b). Besides the stock price, the spread is influenced by factors such as trading volume, the variance of stock returns, the market value of equity, and even the structure of the exchange market. A word of caution is necessary here, because some of these empirical factors will themselves be affected when a firm's performance deteriorates, in turn affecting spreads. For example, the declining stock prices of poorly performing firms will make percentage bid-ask spreads rise. In particular, the variance of stock returns, which proxies for informed trading (Black 1986), may rise, increasing spreads. It is also possible that although trading by informed investors rises, trading volume in general will decline because of fewer investors in the market; hence higher spreads will ensue. To examine the influence of the firm's default risk on bid-ask spreads clearly, we therefore control for these determinants. Using control variables in our regressions, we find that the increase in spreads is still directly linked to the firm's financial condition.

Following Stoll (2000), the averages of each of the underlying variables are taken across all of the days in each month in order to reduce the errors associated with a single day. Every variable is a monthly average except for DLI. DLI is defined as the default probability on the first trading day of each month, which implies that DLI forecasts PSP on an ex ante basis. The following regression model is analyzed:

$$PSP_{it} = a + b_1 \log P_{it} + b_2 \log NT_{it} + b_3 SIG_{it} + b_4 \log MV_{it} + b_5 \log PDLI_{it} + \varepsilon_{it} \tag{51.10}$$

Based on the results of the past literature, we expect that b₁, b₂, and b₄ should be negative, and b₃ should be positive. Because higher default risk implies a worsening financial condition, b₅ is expected to be positive. To account for potential heteroscedasticity, we use generalized least squares (GLS) estimation to estimate the coefficients of the explanatory variables in Equation (51.10). An important motivation for using panel data is to address the omitted variables problem: if an omitted variable is correlated with the explanatory variables, we cannot consistently estimate the parameters without additional information. The use of fixed-effect panel regression might mitigate the omitted variable problem.

⁵ Strictly speaking, P_def is not a default probability because it does not correspond to the true probability of default in large samples.
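Equations (51.8) and (51.9) translate directly into code; the inputs below are illustrative and would come from the iteration sketched earlier.

```python
import numpy as np
from scipy.stats import norm

def distance_to_default(va, x, mu, sigma_a, T=1.0):
    """Eq. (51.8): standardized distance between asset value and debt point."""
    return (np.log(va / x) + (mu - 0.5 * sigma_a**2) * T) / (sigma_a * np.sqrt(T))

def dli(va, x, mu, sigma_a, T=1.0):
    """Eq. (51.9): Default Likelihood Indicator, N(-DD)."""
    return norm.cdf(-distance_to_default(va, x, mu, sigma_a, T))

print(dli(va=120.0, x=80.0, mu=0.05, sigma_a=0.25))  # illustrative inputs
```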
51.2.2.4 Panel Threshold Regression Analysis of Default Risk on Liquidity

The previous model considered in this paper assumes a linear relationship between credit risk and equity liquidity. The relationship could be non-linear, as perceived high default risk might signal increasing information asymmetry costs. This issue may be addressed by using threshold regression techniques. Threshold regression models posit that individual observations can be divided into classes by the value of an observed variable. This paper treats DLI as the threshold variable, and our single threshold regression equation has the following form:

$$PSP_{it} = a + b_1 \log P_{it} + b_2 \log NT_{it} + b_3 SIG_{it} + b_4 \log MV_{it} + b_5 \log PDLI_{it}\, I(DLI_{it} \le \gamma_1) + b_6 \log PDLI_{it}\, I(DLI_{it} > \gamma_1) + \varepsilon_{it} \tag{51.11}$$

where I(·) is the indicator function and γ₁ is the threshold. PDLI is equal to DLI times 100. Alternatively, the double threshold regression model for credit risk and liquidity costs can be expressed as:

$$PSP_{it} = a + b_1 \log P_{it} + b_2 \log NT_{it} + b_3 SIG_{it} + b_4 \log MV_{it} + b_5 \log PDLI_{it}\, I(DLI_{it} \le \gamma_1) + b_6 \log PDLI_{it}\, I(\gamma_1 < DLI_{it} \le \gamma_2) + b_7 \log PDLI_{it}\, I(DLI_{it} > \gamma_2) + \varepsilon_{it} \tag{51.12}$$

where γ₁ and γ₂ are the threshold values for DLI, dividing the observations into two and three regimes, respectively. In both models only the slope coefficient on log PDLI switches between regimes, so that attention can be focused on this key variable of interest. Based on the results of the past literature, b₁, b₂, and b₄ should be negative, while b₃, b₅, b₆, and b₇ should be positive. In contrast to deciding threshold levels arbitrarily, the above panel threshold regression model chooses the threshold points statistically. Hansen's (1999) threshold regression methods are suitable for non-dynamic panels with individual-specific fixed effects, and he shows that the model is rather straightforward to estimate using a fixed-effects transformation. An asymptotic distribution theory is derived, which is used to construct confidence intervals for the parameters, and a bootstrap method to assess the statistical significance of the threshold effect is also described. Our sample is designed as a balanced panel, so we take the subset of 276 stocks observed over the period 2001/02–2002/05. GAUSS programs are modified to fit our sample and regression model.

Let y_it be the percentage spread for stock i at month t. The threshold model in Equation (51.11) can be rewritten as:

$$y_{it} = x_{it}(\gamma)'\beta + c_i + u_{it} \tag{51.13}$$

where x_it(γ) and β correspond to the explanatory variables and parameters in Equation (51.11), respectively. Similar to the fixed-effects transformation introduced in the previous section, a transformation is used to eliminate the individual effect c_i; let Y*, X*, and u* denote the transformed data stacked over all individuals. The threshold equation with the fixed-effects transformation can then be written as:

$$Y^* = X^*(\gamma)\beta + u^* \tag{51.14}$$

Hansen (1999) suggests that the parameter β can be estimated by OLS for any given γ, i.e., $\hat{\beta}(\gamma) = (X^*(\gamma)'X^*(\gamma))^{-1}X^*(\gamma)'Y^*$. The regression residuals are

$$\hat{u}^*(\gamma) = Y^* - X^*(\gamma)\hat{\beta}(\gamma) \tag{51.15}$$

Then the sum of squared errors (SSE) is

$$S_1(\gamma) = \hat{u}^*(\gamma)'\hat{u}^*(\gamma) = Y^{*\prime}\left(I - X^*(\gamma)\left(X^*(\gamma)'X^*(\gamma)\right)^{-1}X^*(\gamma)'\right)Y^* \tag{51.16}$$

Estimating γ by least squares is suggested by Chan (1993) and Hansen (1999); it is easiest to achieve by minimizing the sum of squared errors. Thus, the least squares estimator of γ is

$$\hat{\gamma} = \arg\min_{\gamma} S_1(\gamma) \tag{51.17}$$

Once the threshold $\hat{\gamma}$ is available, the coefficient of the regressors is $\hat{\beta} = \hat{\beta}(\hat{\gamma})$, the residual vector is $\hat{u}^* = \hat{u}^*(\hat{\gamma})$, and the residual variance is $\hat{\sigma}^2 = S_1(\hat{\gamma})/\left(n(T-1)\right)$.

It is important to identify whether the threshold effect has an observable influence on the coefficient estimates; in other words, it is necessary to test the hypothesis of no threshold effect in Equation (51.11). The null hypothesis, H₀: b₅ = b₆, can be tested using a likelihood ratio test. If the null hypothesis holds, the model reduces to Equation (51.10). Let S₀ be the sum of squared errors of Equation (51.10). The likelihood ratio test of no threshold effect is

$$F_1 = \left(S_0 - S_1(\hat{\gamma})\right)/\hat{\sigma}^2 \tag{51.18}$$

Because the asymptotic distribution of F₁ is non-standard, Hansen (1999) suggests a bootstrap procedure to simulate the asymptotic distribution of the likelihood ratio test. The bootstrap method estimates the asymptotic p-value for F₁ under H₀; if the p-value is less than the desired critical value, the null hypothesis of no threshold effect is rejected. For the double threshold panel model in Equation (51.12), the sequential OLS method described in Hansen (1999) can be applied directly to estimate the threshold parameters.
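A compact sketch of the single-threshold search follows: demean within firms (the fixed-effects transformation) and grid-search the γ that minimizes S₁(γ) of Equation (51.16). The balanced-panel array layout and the quantile grid are assumptions; Hansen's GAUSS code, which the authors adapt, differs in detail.

```python
import numpy as np

def demean_within(a, firm_ids):
    """Fixed-effects transformation: subtract each firm's time-series mean."""
    out = np.asarray(a, dtype=float).copy()
    for f in np.unique(firm_ids):
        m = firm_ids == f
        out[m] -= out[m].mean(axis=0)
    return out

def estimate_threshold(y, X, q, logpdli, firm_ids, grid):
    """q: threshold variable (DLI); logpdli: the regressor whose slope switches."""
    best_sse, best_gamma = np.inf, None
    for g in grid:
        Z = np.column_stack([X, logpdli * (q <= g), logpdli * (q > g)])
        Zs, ys = demean_within(Z, firm_ids), demean_within(y, firm_ids)
        beta, *_ = np.linalg.lstsq(Zs, ys, rcond=None)
        resid = ys - Zs @ beta
        sse = float(resid @ resid)
        if sse < best_sse:
            best_sse, best_gamma = sse, g
    return best_gamma, best_sse  # gamma_hat and S1(gamma_hat)

# Typical usage: grid = np.quantile(dli, np.linspace(0.05, 0.95, 100))
```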
51.3 Empirical Results

51.3.1 Descriptive Statistics

Our sample period extends over the 351 trading days from February 2001 to June 2002. The sample is divided into two subperiods: the first from February 2001 to September 2001, and the second from October 2001 to June 2002. In this paper, Period 2, from October 2001 to June 2002, is selected as the period of high financial distress costs. Table 51.1 presents descriptive statistics, such as the mean, median, standard deviation, minimum, and maximum of the selected variables for three groups of pooled data. Pearson correlation coefficients of the selected variables for the three groups of pooled data are given in Table 51.2. Price (P), the number of trades (NT), and market value (MV) are negatively correlated with the percentage bid-ask spread (PSP); on the other hand, volatility (SIG) and DLI are positively related to the percentage bid-ask spread (PSP).

We first provide some preliminary analysis of the relationship between the percentage spread and trading characteristic variables, such as price (P), the number of trades (NT), volatility (SIG), and market value (MV). The monthly regression results using ordinary least squares estimation are shown in Appendix Table 51A.1. For most months, the coefficients of log P and SIG are significant at the 1% level. As expected, the signs of the parameters of these two explanatory variables are both consistent with our predictions: there is a negative cross-sectional relationship between price and the percentage spread, while volatility is positively related to the percentage bid-ask spread in every month. The coefficients of log NT and log MV are significant in fewer months.
51.3.2 Results of Panel Data Regression

If the structural model for the cross-sectional determinants of the percentage spread has a potential omitted variable problem, the pooled ordinary least squares estimator is not consistent. An unobserved-effects panel data model can be used to address this problem. We thus use the fixed effect model to estimate the panel data regression. The data are also
Table 51.1 Descriptive statistics of selected variables for three groups of pooled data

Panel A: Whole period (February 2001-May 2002)
Variable        N      Mean     Median   S.D.     Minimum  Maximum
PSP (%)         4,833  0.1285   0.1054   0.0837   0.0195   0.8140
P               4,833  40.0159  38.1913  19.7966  2.8175   159.9965
NT              4,833  1,068    859      681      162      5,005
SIG             4,833  2.0782   1.8841   0.8603   0.4489   6.0690
MV (million's)  4,833  19,612   7,855    42,885   506      486,720
DLI (%)         4,833  4.3084   0        14.8112  0        100

Panel B: First sub-period (February 2001-September 2001)
PSP (%)         2,267  0.1455   0.1202   0.0880   0.0316   0.8140
P               2,267  40.5592  38.9771  19.3324  5.8313   133.1581
NT              2,267  938      741      632      162      4,098
SIG             2,267  2.1030   1.9298   0.8370   0.6046   6.0690
MV (million's)  2,267  20,476   7,865    45,095   506      486,720
DLI (%)         2,267  4.7494   0        15.8597  0        100

Panel C: Second sub-period (October 2001-May 2002)
PSP (%)         2,566  0.1134   0.0913   0.0767   0.0195   0.7745
P               2,566  39.5359  37.3880  20.1895  2.8175   159.9965
NT              2,566  1,182    982      701      196      5,004
SIG             2,566  2.0563   1.8439   0.8799   0.4489   6.0177
MV (million's)  2,566  18,849   7,838    40,826   510      398,105
DLI (%)         2,566  3.9188   0        13.8103  0        100

This table presents descriptive statistics of all selected variables during our sample period. Since firms with missing values of the variables are deleted, our sample reduces to 276 firms listed on the NYSE.
Notes: PSP = the monthly average percentage bid-ask spread for company i; P = the monthly average closing price for company i; NT = the monthly average number of trades for company i; SIG = the standard deviation of daily stock returns over a month for company i; MV = the market value of company i; DLI = the default likelihood indicator of company i; N = the number of observations
Table 51.2 Pearson correlation coefficients of selected variables for three groups of pooled data

Panel A: Whole period (February 2001-May 2002)
      P                 NT                SIG               MV                DLI
PSP   -0.5991 (<.0001)  -0.0665 (<.0001)  0.4142 (<.0001)   -0.0991 (<.0001)  0.3041 (<.0001)
P                       0.1838 (<.0001)   -0.2895 (<.0001)  0.2110 (<.0001)   -0.0156 (0.2793)
NT                                        0.2064 (<.0001)   0.6439 (<.0001)   0.0125 (0.3836)
SIG                                                         -0.0601 (<.0001)  0.2338 (<.0001)
MV                                                                            -0.0616 (<.0001)

Panel B: First period (February 2001-September 2001)
PSP   -0.6110 (<.0001)  -0.0534 (0.0109)  0.3411 (<.0001)   -0.1172 (<.0001)  0.2553 (<.0001)
P                       0.2311 (<.0001)   -0.2075 (<.0001)  0.2526 (<.0001)   0.0170 (0.4183)
NT                                        0.2998 (<.0001)   0.6338 (<.0001)   0.0091 (0.6640)
SIG                                                         -0.0128 (0.5419)  0.2037 (<.0001)
MV                                                                            -0.0556 (0.0081)

Panel C: Second period (October 2001-June 2002)
PSP   -0.6248 (<.0001)  -0.0156 (0.4280)  0.4934 (<.0001)   -0.0907 (<.0001)  0.3617 (<.0001)
P                       0.1615 (<.0001)   -0.3570 (<.0001)  0.1723 (<.0001)   -0.0487 (0.0134)
NT                                        0.1504 (<.0001)   0.6856 (<.0001)   0.0260 (0.1868)
SIG                                                         -0.1053 (<.0001)  0.2636 (<.0001)
MV                                                                            -0.0695 (0.0004)
This table presents Pearson correlation coefficients between the selected variables during our sample period. PSP is the monthly average percentage bid-ask spread. P is the monthly average closing price. NT is the monthly average number of trades. SIG is the standard deviation of daily stock returns. MV is the market value. DLI is the default likelihood indicator.
Notes: Numbers in parentheses are p-values

Table 51.3 Panel regression results of percentage spread on firm characteristics and probability of default

$$PSP_{it} = a + b_1 \log P_{it} + b_2 \log NT_{it} + b_3 SIG_{it} + b_4 \log MV_{it} + b_5 \log PDLI_{it} + \varepsilon_{it}$$

             Whole period       Period 1           Period 2
Intercept    0.3839 (20.10)     0.5735 (18.27)     0.3755 (16.23)
log P        -0.1091 (-58.66)   -0.1225 (-39.84)   -0.1038 (-47.91)
log NT       -0.0207 (-8.53)    0.0076 (1.80)      0.0008 (0.26)
SIG          0.0193 (17.33)     0.0146 (8.07)      0.0155 (11.68)
log MV       0.0103 (7.74)      -0.0031 (-1.35)    0.0029 (1.77)
log PDLI     0.0001 (1.80)      0.00002 (0.55)     -0.00003 (-1.41)
No. of obs   4,692              2,208              2,484

This table presents the fixed effect estimation results of the panel data regression of PSP on control variables and DLI. PSP is the average percentage spread. log P is the logarithm of the closing price. log NT is the logarithm of the average daily number of trades. SIG is the standard deviation of daily stock returns. log MV is the logarithm of market value. log PDLI is the logarithm of the Default Likelihood Indicator times 100; that is, log PDLI = log(100 × DLI).
Notes: Numbers in parentheses are t-values. "*" represents statistical significance at the 1% level
split up into two subperiods to investigate whether default risk has an observably stronger effect on equity liquidity in the period of expected high financial distress costs. Table 51.3 presents the fixed effect estimation results of the panel data regression of the percentage spread on four control variables and DLI. For the whole sample period, the estimated coefficients of log P and log NT are negative, while the coefficients of SIG, log MV, and log PDLI are positive. While the price and volatility variables have similar effects in the two subperiods to those reported for the whole period, the effects of the number-of-trades and market-value variables on percentage spreads are not pervasive across the two sub-periods. Although the percentage spread is significantly affected by DLI at the 10% level, the results from the two subperiods show that the relationship between DLI and the percentage spread is not consistent over different sample periods.
51.3.3 Results of Panel Threshold Regression

The insignificant results for the effect of DLI in Table 51.3 might be due to a non-linear relationship between these two variables. In particular, the model in Table 51.3 might be misspecified if a significant threshold effect in DLI exists. Table 51.4 presents the results of the panel threshold regression for a single threshold effect, using the estimation method proposed by Hansen (1999). The number of bootstrap replications is 300 for the test of the threshold effect; the results are the same when a higher number of replications is used. The likelihood ratio test statistics for the null hypothesis of no threshold effect in DLI show that the null hypothesis is rejected. The estimated threshold value (γ̂1) for the whole period is 0.1086. The likelihood ratio test results show significant threshold effects in DLI for the two subperiods, with bootstrap p-values lower than 5%. The estimated threshold values for subperiods 1 and 2 are 0.1627 and 0.0843, respectively. The estimated coefficients of DLI in the various regimes show significant differences. In particular, for the firms with high default risk during the Enron crisis period, the coefficient on DLI is 0.012, which is higher than the 0.0075 reported for subperiod 1. It is interesting to note that MV appears to have consistent effects on the percentage spread when the panel threshold model is considered, indicating that the results in Table 51.3 might be due to misspecification. Table 51.5 presents the results for the panel data model with a double threshold. Since the test results do not support double threshold effects in subperiod 1, we report only
Table 51.4 Results of panel threshold regression model of percentage spread on firm characteristics and default probability

$$PSP_{it} = a + b_1 \log P_{it} + b_2 \log NT_{it} + b_3 SIG_{it} + b_4 \log MV_{it} + b_5 \log PDLI_{it}\, I(DLI_{it} \le \gamma_1) + b_6 \log PDLI_{it}\, I(DLI_{it} > \gamma_1) + \varepsilon_{it}$$

                              Whole period                   2001/2-2001/9                 2001/10-2002/5
log P                         -0.616 (-8.6307) [-6.8217]     -0.0869 (-7.7514) [-6.1481]   -0.1215 (-12.4683) [-11.5022]
log NT                        -0.0442 (-15.3209) [-15.1624]  0.0187 (3.3134) [3.2283]      0.0005 (0.1233) [0.1134]
SIG                           0.0064 (5.9428) [5.3495]       0.0052 (2.9558) [3.1391]      0.0058 (5.0341) [4.3202]
log MV                        -0.0619 (-8.5713) [-7.1198]    -0.0575 (-5.1571) [-4.1380]   -0.0567 (-6.0873) [-5.7419]
log PDLI (DLI ≤ γ̂1)           -0.0001 (-1.7607) [-2.4861]    0.0001 (0.9556) [1.4969]      -0.0001 (-2.2172) [-2.4121]
log PDLI (DLI > γ̂1)           0.0078 (7.8528) [5.5007]       0.0075 (5.3416) [3.7640]      0.0120 (8.3002) [5.8584]
γ̂1                            0.1086                         0.1627                        0.0843
Test for no threshold effect  66.821                         31.568                        79.269
No. of obs                    4,416                          2,208                         2,208

This table presents the results of the panel threshold regression for a single threshold effect (γ̂1), using the estimation method proposed by Hansen (1999). The fixed effect estimation method is applied to this model. PSP is the average percentage spread. log P is the logarithm of the closing price. log NT is the logarithm of the average daily number of trades. SIG is the standard deviation of daily stock returns. log MV is the logarithm of market value. log PDLI is the logarithm of the Default Likelihood Indicator times 100; that is, log PDLI = log(100 × DLI). I(·) is the indicator function.
Notes: "*" and "**" represent statistical significance at the 5 and 1% level, respectively. Numbers in parentheses and brackets are the t-values calculated by OLS and White standard errors, respectively

Table 51.5 Results of double threshold panel data model for Period 2 (October 2001-May 2002)

$$PSP_{it} = a + b_1 \log P_{it} + b_2 \log NT_{it} + b_3 SIG_{it} + b_4 \log MV_{it} + b_5 \log PDLI_{it}\, I(DLI_{it} \le \gamma_1) + b_6 \log PDLI_{it}\, I(\gamma_1 < DLI_{it} \le \gamma_2) + b_7 \log PDLI_{it}\, I(DLI_{it} > \gamma_2) + \varepsilon_{it}$$

                              Coefficient
log P                         -0.1299 (-13.2635) [-11.5794]
log NT                        0.0008 (0.1848) [0.1684]
SIG                           0.0058 (5.0472) [4.3921]
log MV                        -0.0506 (-5.4268) [-4.9296]
log PDLI (DLI ≤ γ̂1)           -0.0001 (-2.5125) [-2.6018]
log PDLI (γ̂1 < DLI ≤ γ̂2)      0.0086 (5.5106) [3.5961]
log PDLI (DLI > γ̂2)           0.0208 (9.7521) [6.2750]
γ̂1                            0.0843
γ̂2                            0.6375
Test of no threshold effect   35.282

This table presents the results of the panel data model with a double threshold (γ̂1, γ̂2) in subperiod 2. The fixed effect estimation method is applied to this model. PSP is the average percentage spread. log P is the logarithm of the closing price. log NT is the logarithm of the average daily number of trades. SIG is the standard deviation of daily stock returns. log MV is the logarithm of market value. log PDLI is the logarithm of the Default Likelihood Indicator times 100; that is, log PDLI = log(100 × DLI). I(·) is the indicator function.
Notes: "*" and "**" represent statistical significance at the 5% and 1% level, respectively. Numbers in parentheses and brackets are the t-values calculated by OLS and White standard errors, respectively
the results for subperiod 2. There is overwhelming evidence of two thresholds in the regression relationship in Period 2. The estimated threshold values are 0.0843 (γ̂1) and 0.6375 (γ̂2). The percentage of firms falling into each of the three regimes of the double threshold model is shown on a monthly basis in Appendix Table 51A.3.
51.4 Conclusion

This research investigates the relationship between credit risk and equity liquidity. We explore the hypothesis that as a firm's default likelihood increases, information asymmetry costs increase, and market-makers widen spreads in response to the increased probability of trading against informed investors. Using the default likelihood measure calculated by Merton's method, this paper investigates whether financially ailing firms do indeed have higher bid-ask spreads. The results based on the panel threshold regression model show evidence of a non-linear relationship between credit risk and equity liquidity. Since high default probability and worse economic prospects might lead to greater expropriation by managers, and thus greater asymmetric information costs, liquidity providers will incur relatively higher costs and will therefore quote wider bid-ask spreads. The empirical results support this hypothesis. The effect whereby increased credit risk triggers investors' loss of confidence in equities trading, and thus a decrease in liquidity, is particularly strong during periods of extreme financial distress costs.

Acknowledgment The financial support of the National Science Council of Taiwan is gratefully acknowledged.
References

Agrawal, V., M. Kothare, R. K. S. Rao, and P. Wadhwa. 2004. "Bid-ask spreads, informed investors, and firm's financial condition." The Quarterly Review of Economics and Finance 44, 58–76.
Altman, E. I. 1984. "A further empirical investigation of the bankruptcy cost question." Journal of Finance 39, 1067–1089.
Benston, G. and R. Hagerman. 1974. "Determinants of bid-asked spreads in the over-the-counter market." Journal of Financial Economics 1, 353–364.
Black, F. 1986. "Noise." Journal of Finance 41, 529–544.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–659.
Chan, K. S. 1993. "Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model." The Annals of Statistics 21, 520–533.
Hansen, B. E. 1999. "Threshold effects in non-dynamic panels: estimation, testing, and inference." Journal of Econometrics 93, 345–368.
Huang, R. D. and H. R. Stoll. 1996. "Dealer versus auction markets: a paired comparison of execution costs on NASDAQ and the NYSE." Journal of Financial Economics 41, 313–357.
Merton, R. C. 1974. "On the pricing of corporate debt: the risk structure of interest rates." Journal of Finance 29, 449–470.
Stoll, H. R. 1978a. "The supply of dealer services in securities markets." Journal of Finance 33, 1133–1151.
Stoll, H. R. 1978b. "The pricing of security dealer services: an empirical study of NASDAQ stocks." Journal of Finance 33, 1153–1172.
Stoll, H. R. 2000. "Friction." Journal of Finance 55, 1479–1514.
Tinic, S. and R. West. 1972. "Competition and the pricing of dealer services in the over-the-counter stock market." Journal of Financial and Quantitative Analysis 7, 1707–1727.
Titman, S. 1984. "The effect of capital structure on a firm's liquidation decision." Journal of Financial Economics 13, 137–151.
Vassalou, M. and Y. Xing. 2004. "Default risk in equity returns." Journal of Finance 59, 831–868.
Warner, J. B. 1977. "Bankruptcy costs: some evidence." Journal of Finance 32, 337–347.
Weiss, L. A. 1990. "Bankruptcy resolution: direct costs and violation of priority of claims." Journal of Financial Economics 27, 285–314.
Wruck, K. H. 1990. "Financial distress, reorganization, and organizational efficiency." Journal of Financial Economics 27, 419–444.
Appendix 51A
Fig. 51A.1 (a) Mean percentage spread (PSP) versus default likelihood indicator (DLI), plotted monthly from 200103 to 200205. (b) Median percentage spread (PSP) versus default likelihood indicator (DLI), plotted monthly from 200103 to 200204.
Table 51A.1 Cross-sectional regression results of percentage bid-ask spreads on trading characteristic variables

$$PSP_i = a + b_1 \log P_i + b_2 \log NT_i + b_3 SIG_i + b_4 \log MV_i + \varepsilon_i$$

Month      Intercept        log P                log NT              SIG              log MV
2001 Feb   0.4979 (6.6458)  -0.1115 (-16.0020)   -0.0097 (-0.7912)   0.0330 (7.0277)  0.0022 (0.3668)
2001 Mar   0.5940 (7.6549)  -0.1143 (-15.1291)   0.0153 (1.1917)     0.0217 (4.5383)  -0.0078 (-1.2712)
2001 Apr   0.7147 (8.5485)  -0.1470 (-18.1160)   0.0088 (0.6408)     0.0201 (3.9517)  -0.0063 (-0.9565)
2001 May   0.5116 (7.4691)  -0.1053 (-17.7110)   -0.0096 (-0.8837)   0.0174 (4.6135)  0.0001 (0.1716)
2001 June  0.3743 (5.9497)  -0.1050 (-18.4007)   -0.0262 (-2.6487)   0.0317 (7.0428)  0.0112 (2.3060)
2001 July  0.4947 (8.5884)  -0.0978 (-16.9153)   -0.0205 (-2.2836)   0.0234 (5.7721)  0.0037 (0.8506)
2001 Aug   0.4654 (7.3804)  -0.1075 (-19.5980)   -0.0214 (-2.2882)   0.0183 (4.1804)  0.0065 (1.3758)
2001 Sep   0.4776 (6.3904)  -0.1287 (-16.5431)   -0.0023 (-0.1735)   0.0170 (2.5332)  0.0055 (0.9002)
2001 Oct   0.6774 (9.2199)  -0.1418 (-22.0472)   0.0151 (1.4437)     0.0074 (1.9339)  -0.0074 (-1.3609)
2001 Nov   0.5225 (6.9575)  -0.1184 (-19.1499)   0.0140 (1.2782)     0.0080 (1.7624)  -0.0043 (-0.7644)
2001 Dec   0.3444 (5.1187)  -0.1058 (-18.4517)   -0.0019 (-0.1777)   0.0165 (3.4588)  0.0057 (1.0915)
2002 Jan   0.4353 (6.1974)  -0.1004 (-15.6370)   -0.0007 (-0.0576)   0.0205 (3.7316)  0.000 (0.0084)
2002 Feb   0.4287 (7.4489)  -0.1067 (-21.1913)   0.0111 (1.1477)     0.0202 (4.7323)  -0.0024 (-0.5288)
2002 Mar   0.3237 (7.1471)  -0.0910 (-20.3429)   0.0014 (0.1793)     0.0207 (6.5863)  0.0023 (0.6416)
2002 Apr   0.2030 (4.2384)  -0.0904 (-21.9409)   -0.0110 (-1.3474)   0.0219 (6.1253)  0.0110 (2.9369)
2002 May   0.2772 (5.4187)  -0.0863 (-17.9085)   -0.0157 (-1.7900)   0.0289 (6.4736)  0.0084 (2.0979)
2002 June  0.3794 (6.7333)  -0.1179 (-21.5443)   0.0010 (0.0943)     0.0243 (5.3666)  0.0049 (1.0286)

This table shows the monthly regression results using ordinary least squares estimation. PSP is the average percentage spread. log P is the logarithm of the closing price. log NT is the logarithm of the average daily number of trades. SIG is the standard deviation of daily stock returns. log MV is the logarithm of market value.
Notes: "*" and "**" represent statistical significance at the 5 and 1% level, respectively. Numbers in parentheses are t-values
Table 51A.2 Descriptive statistics and t-test results of DLI

Panel A: Descriptive statistics of DLI
DLI       N    Mean      Median    S.D.       Minimum     Maximum
Period 1  276  4.7616%   0.0829%   13.5339%   0.0000%     87.5001%
Period 2  276  3.9043%   0.0006%   12.3610%   0.0000%     88.8889%
Diff      276  -0.8573%  -0.0006%  10.8036%   -74.0280%   80.5989%

Panel B: Results of t-test of the difference in default probability between the two sub-periods
Variable    N    t value   Pr > |t|
DLI (Diff)  276  -1.34     0.1807

Notes: N is the number of sample firms. Period 1 corresponds to February 2001–September 2001. Period 2 corresponds to October 2001–June 2002. "Diff" is the difference in average DLI between Period 2 and Period 1 for each firm. H₀: the difference in DLI for company i between Period 1 and Period 2 equals zero
Table 51A.3 Percentages of firms in each regime by month: double threshold model

Month      DLI ≤ 0.0843      0.0843 < DLI ≤ 0.6735   DLI > 0.6735
           (APSP = 0.1188)   (APSP = 0.2036)         (APSP = 0.2281)
2001 Feb   74.6%             18.5%                   6.9%
2001 Mar   91.7%             8.0%                    0.4%
2001 Apr   88.0%             9.8%                    2.2%
2001 May   90.6%             6.5%                    2.9%
2001 June  92.8%             4.7%                    2.5%
2001 July  91.3%             6.2%                    2.5%
2001 Aug   91.3%             6.5%                    2.2%
2001 Sep   89.5%             8.0%                    2.5%
2001 Oct   84.4%             13.0%                   2.5%
2001 Nov   86.6%             11.2%                   2.2%
2001 Dec   89.9%             8.0%                    2.2%
2002 Jan   89.9%             8.7%                    1.4%
2002 Feb   90.2%             8.0%                    1.8%
2002 Mar   89.9%             8.7%                    1.4%
2002 Apr   93.1%             6.2%                    0.7%
2002 May   92.4%             6.5%                    1.1%
2002 June  91.3%             6.9%                    1.8%

We find that the percentage of companies in the "very high default risk" category ranges from 0.4 to 6.9% of the sample in each month. The "low default risk" firms range from 74.6 to 93.1% of the sample over the months. The averages of the percentage bid-ask spreads for the three regimes are 0.1188, 0.2036, and 0.2281, in that order. The "very high default risk" class has the highest average percentage bid-ask spread, 0.2281. This confirms that firms with higher credit risk suffer higher liquidity costs.
Note: APSP = the average percentage bid-ask spread for each regime
Chapter 52
Put Option Approach to Determine Bank Risk Premium Dar Yeh Hwang, Fu-Shuen Shie, and Wei-Hsiung Wu
Abstract In this paper, we briefly review the pricing of deposit insurance under the option approach. First, we review the theoretical work on deposit insurance pricing models; second, we summarize the applications of these models, focusing on the issue of whether the fees charged by the insuring agency are excessive.

Keywords Deposit insurance · Capital standard · Options pricing
52.1 Introduction

A main function of a bank is to lend money to borrowers and receive funds from depositors. Because of information asymmetry and transaction costs, depositors lend funds to banks rather than directly to borrowers. For banks to perform this function well, deposits should be risk-free. Hence, there is a need for a third-party guarantee on the deposits. But the guarantee imposes costs on the insurer, and how these costs are priced is an important issue.

There is an isomorphic relationship between deposit insurance and put options. Merton (1977) first presents this insightful relationship. In short, with deposit insurance, the insurer pays nothing as long as the bank is solvent, but pays the deposits that cannot be met by the bank's assets when insolvency occurs. Therefore, the deposit insurance contract corresponds to a put option written on the bank's assets with a strike price equal to the insured deposits.

Many studies follow the work of Merton (1977). First, Merton (1978) considers the auditing costs of insurers and finds that deposit insurance costs per dollar
D.Y. Hwang () and W.-H. Wu Department of Finance, College of Management, National Taiwan University, 85, Roosevelt Road, Section 4, Taipei, Taiwan, R.O.C. e-mail: [email protected] F.-S. Shie Department of Finance, College of Management, Asia University, 500, Lioufeng Road, Wufeng, Taichung County, Taiwan, R.O.C.
are no longer monotonically decreasing in the asset-to-deposit ratio. Mullins and Pyle (1994) show that the risk-based capital ratio increases with auditing costs. Allen and Saunders (1993) take capital forbearance into account; their work is modified by Hwang et al. (2009), who introduce the variable of bankruptcy costs. Lee et al. (2005) derive a formula for the fair deposit insurance premium that explicitly incorporates forbearance and the capital requirement of the Basel Accord. To relax the assumptions of a constant risk-free rate and constant asset volatility, Kerfriden and Rochet (1993) discuss how deposit insurance is priced in a stochastic interest rate economy, and Duan and Yu (1999) use the GARCH option pricing technique to fairly price deposit insurance.

One of the main applications of Merton (1977) is to examine whether the deposit insurance provided by the FDIC is excessive. Marcus and Shaked (1984) first proposed a method that can be implemented to price the risk-based deposit insurance premium of each bank. Their empirical results show that current fees charged by the FDIC are overpriced. Ronn and Verma (1986) present an empirical approach as well, but they point out that, considering forbearance, deposit insurance premia are underpriced. Pennacchi (1987) considers the power of the insurer, and he indicates that whether deposit insurance premia are under- (or over-) priced depends on the level of the insurer's control over the banks. His empirical results show that the current premium lies within the upper and lower bounds he presents, which suggests that the FDIC has mild control over the banks. Flannery (1991) states that the insurer can observe the financial data of banks only with errors, which tends to underestimate the true value of deposit insurance. Since measurement errors exist, it is socially optimal both to adopt a risk-based deposit insurance scheme and to impose risk-based capital requirements.

This paper is organized as follows. Section 52.2 reviews the pioneering work on deposit insurance pricing in Merton (1977). Section 52.3 discusses some of the major extensions of Merton (1977). Section 52.4 presents the applications of option-based deposit insurance pricing; the major issue is whether the deposit insurance premium is over- or undercharged. Section 52.5 concludes.
52.2 Evaluating the Insurer's Liability by the Option Pricing Model: Merton (1977)

One of the main functions of a bank is to lend money to borrowers and receive deposits from firms and individuals. People deposit funds in a bank because of lower transaction costs, liquidity, and convenience. But a necessary condition for a depositor to lend funds to a bank is that the deposits are risk-free. Otherwise, depositors would have to collect information on the banking industry, analyze the balance sheets of banks, and diversify their funds across banks, all of which is costly to them. Risky deposits would jeopardize the main function of a bank, which is serving as an intermediary. Introducing a third-party guarantee on the deposits that a bank takes is a solution. However, considering the scale of the banking industry, the third-party guarantor can almost only be the government or one of its agencies, such as the Federal Deposit Insurance Corporation (FDIC). It is widely believed that even in countries where explicit deposit insurance systems are not adopted, the government guarantees deposits implicitly. Hence, deposit insurance imposes costs on the guarantor.

There is an isomorphic relationship between deposit insurance and common stock put options. First we discuss the payoff of depositors without deposit insurance, and then we consider the effects of deposit insurance provided by the insurer. Let V denote the value of a bank's assets at the maturity date, and D the total value of deposits at the maturity date. If the value of the bank's assets exceeds total deposits, the payoff of the deposits is D. But if V is so low (V < D) that the bank cannot meet its obligations, the deposits can only receive V in return. In sum, at the maturity date, the depositors receive Min(V, D). Similarly, for the equityholders of the bank, the payoff is Max(0, V − D).

Second, consider the impact of a third-party guarantee on the obligations that the bank should meet at the maturity date. If the bank asset value V is greater than the insured deposits D, then the deposit insurance has no value and the bank continues to operate. But if, at the end of the contract period, V is smaller than D, then the insurer pays the deficit D − V and closes the bank. The cost to an insurer of providing the deposit insurance contract is thus Max(0, D − V). Put another way, without deposit insurance, the original payoff of a depositor is Min(V, D). But Min(V, D) = D + Min(0, V − D) = D − Max(0, D − V). Since Max(0, D − V) is nonnegative, it is the guarantor's task to make the payment Max(0, D − V) to depositors to ensure that the deposits are risk-free. Hence, pricing the deposit insurance corresponds to valuing a put option written on the bank's assets with a strike price equal to the insured deposits.

If a guarantor (for example, the FDIC) guarantees that a bank can meet the obligations of its deposits when they are
due, it is essentially the same as if the guarantor sold the bank a put written on its total assets. Hence the deposit insurance premium can be valued by the put option model. The deposit insurance premium that Merton (1977) originally derived is of the form

g(d, \tau) = \Phi(h_2) - \frac{1}{d}\,\Phi(h_1), \qquad (52.1)

where

h_1 \equiv \frac{\log(d) - \tau/2}{\sqrt{\tau}}, \qquad h_2 \equiv h_1 + \sqrt{\tau},

g is the cost of the deposit guarantee per dollar of insured deposits, d ≡ D/V is the current deposit-to-asset value ratio, τ ≡ σ²T is the variance of the logarithmic change in the value of the assets during the term of the deposits, and Φ(·) denotes the CDF of the standard Normal distribution. The proof of Equation (52.1) is shown in Appendix 52A.
From Equation (52.1), the change in the cost with respect to an increase in the deposit-to-asset value ratio d is positive; that is,

\frac{\partial g}{\partial d} = \frac{\Phi(h_1)}{d^2} > 0. \qquad (52.2)

The change in the cost with respect to an increase in the variance of the logarithmic change in the value of the assets, τ, is also positive; that is,

\frac{\partial g}{\partial \tau} = \frac{\Phi'(h_1)}{2d\sqrt{\tau}} > 0. \qquad (52.3)

Equations (52.2) and (52.3) are easily understood because increases in d and τ both raise the probability of insolvency. The proofs of Equations (52.2) and (52.3) are shown in Appendix 52B.
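For readers who want to experiment with these comparative statics, the following is a minimal numerical sketch of Equations (52.1)–(52.3), assuming SciPy is available; the function names are ours, not Merton's.

```python
from math import log, sqrt
from scipy.stats import norm

def merton_premium(d, tau):
    """Equation (52.1): guarantee cost per dollar of insured deposits."""
    h1 = (log(d) - tau / 2.0) / sqrt(tau)
    h2 = h1 + sqrt(tau)
    return norm.cdf(h2) - norm.cdf(h1) / d

def dg_dd(d, tau):
    """Equation (52.2): dg/dd = Phi(h1) / d^2, always positive."""
    h1 = (log(d) - tau / 2.0) / sqrt(tau)
    return norm.cdf(h1) / d**2

def dg_dtau(d, tau):
    """Equation (52.3): dg/dtau = Phi'(h1) / (2 d sqrt(tau)), always positive."""
    h1 = (log(d) - tau / 2.0) / sqrt(tau)
    return norm.pdf(h1) / (2.0 * d * sqrt(tau))

# Illustrative bank: deposits worth 95% of assets, tau = sigma^2 * T = 0.04
print(merton_premium(0.95, 0.04))          # premium per dollar of deposits
print(dg_dd(0.95, 0.04), dg_dtau(0.95, 0.04))  # both positive, as claimed
```

Both derivatives come out positive for any admissible (d, τ), consistent with the insolvency intuition above.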
52.3 Extensions of Merton (1977)

Many researchers have followed the work of Merton (1977). Merton (1978) considers the surveillance costs of the insurer and finds that deposit insurance costs are no longer monotonically decreasing in the asset-to-deposit ratio. Mullins and Pyle (1994) evaluate the impact of auditing costs on the risk-based capital ratio and the deposit rate spread; they show that both increase with auditing costs. Allen and Saunders (1993) take capital forbearance into account.
Hwang et al. (2009) introduce bankruptcy costs to modify the work of Allen and Saunders (1993). Lee et al. (2005) derive a formula for the fair deposit insurance premium that explicitly incorporates forbearance and the capital requirement of the Basel Accord.
52.3.1 The Consideration of Surveillance Costs

Merton (1978) derives the explicit form of the deposit insurance premium when the surveillance costs of the guarantor are taken into consideration. In this model, random auditing times are also allowed. Under this setting, Merton shows that the cost of deposit insurance per dollar is no longer monotonically decreasing in the asset-to-deposit ratio, contrary to the finding in Merton (1977). The reason is that although the cost of the deposit guarantee is monotonically decreasing, the audit costs are not: a bank is audited more frequently if it is likely to be insolvent. Another finding is that, in equilibrium, the market interest rate is higher than the total rate of return on deposits, the difference being exactly the expected auditing costs per dollar per unit time. So the depositors pay for the auditing costs, and the bank's shareholders pay for the put option component.
Mullins and Pyle (1994) find that the deposit rate spread (the risk-free rate minus the deposit rate) increases with the volatility of bank asset value. They assume that the deposit insurance guarantor can measure the risks of bank assets without error and that auditing times are known beforehand. They use the isomorphic relationships between the deposit insurance premium and a put option, and between shareholder value and a call option, to determine the risk-based capital ratio under the current deposit insurance scheme. That is,

p_0 = e^{-\delta} N(-y + \sigma_V) - (1 + k_0 - p_0) N(-y) + a(1 + k_0 - p_0), \qquad (52.4)

k_0 = (1 + k_0 - p_0) N(y) - e^{-\delta} N(y - \sigma_V), \qquad (52.5)

where p₀ ≡ P₀/D₀ and k₀ ≡ K₀/D₀ denote the insurance premium per dollar of deposits and the capital-to-deposit ratio at the beginning of the period, respectively; a denotes the auditing cost per dollar of asset value; δ ≡ r_f − r_D is the deposit rate spread; N(·) is the CDF of the standard Normal distribution; and y = [ln(1 + k₀ − p₀) + δ]/σ_V + σ_V/2. Note that the guarantor incurs the auditing costs, and the auditing costs per dollar of deposits, a(1 + k₀ − p₀), are added to the Black-Scholes put term in Equation (52.4). Since there are two equations, the two unknowns k₀(σ_V) and δ(σ_V), which represent the risk-based capital-to-deposit ratio and the deposit rate spread, can be solved for. The premium p₀ is set to 0.0023, the deposit insurance premium charged by the FDIC before 1993. The results show that both the risk-based capital requirement and the deposit rate spread increase with the asset volatility and with the auditing costs per dollar of deposits. The use of two isomorphic correspondences, between options and the deposit insurance premium or equityholders' value, can also be seen in Ronn and Verma (1986); see Sect. 52.4.2.
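A sketch of how the two-equation system (52.4)–(52.5) can be solved numerically is given below; the parameter values for a and σ_V and the starting guesses are purely illustrative and are not those of the original study.

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import norm

def mullins_pyle(p0=0.0023, a=0.001, sigma_v=0.05):
    """Solve Equations (52.4)-(52.5) for the risk-based capital ratio k0
    and the deposit rate spread delta, taking the premium p0 as given."""
    def equations(z):
        k0, delta = z
        s = 1.0 + k0 - p0                       # asset value per dollar of deposits
        y = (np.log(s) + delta) / sigma_v + sigma_v / 2.0
        eq_put = (np.exp(-delta) * norm.cdf(-y + sigma_v)
                  - s * norm.cdf(-y) + a * s - p0)                   # Eq. (52.4)
        eq_call = s * norm.cdf(y) - np.exp(-delta) * norm.cdf(y - sigma_v) - k0  # Eq. (52.5)
        return [eq_put, eq_call]
    return fsolve(equations, x0=[0.05, 0.005])

k0, delta = mullins_pyle()
print(f"capital-to-deposit ratio k0 = {k0:.4f}, deposit spread delta = {delta:.5f}")
```

A useful sanity check is that, in this system, the solved spread δ is approximately the auditing cost per dollar of assets times the asset-to-deposit ratio, which matches the observation above that depositors end up paying the expected auditing costs.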
52.3.2 Incorporating Forbearance

52.3.2.1 Allen and Saunders (1993)

Allen and Saunders (1993) evaluate the deposit insurance with the forbearance policy exercised by the insurer. Forbearance is the delay in implementing or enforcing specific regulations. With regard to closure policy, forbearance means granting the institution time to return to solvency before it is closed. Previous research ignores the call provision feature of deposit insurance; that is, the insurer (FDIC) has the right to exercise the option before the insolvent bank chooses to exercise it. The call provision has value if the insurer-induced closure policy is more stringent than the self-closure policy. Allen and Saunders (1993) hence utilize the callable American put option to price the deposit insurance.
Allen and Saunders (1993) show that the value of the deposit insurance can be calculated as the value of the noncallable American put option minus the value of the call provision. The value of the noncallable perpetual American put option is

p(a, 1; \infty) = \frac{1}{1+\gamma}\left[\frac{(1+\gamma)a}{\gamma}\right]^{-\gamma}, \qquad (52.6)

where a is the asset-to-deposit ratio and γ ≡ 2r/σ², with r the risk-free rate. The value of the call provision is

c(a, 1; \infty) = a^{-\gamma}\left[\bar{a}^{\,\gamma+1} - \bar{a}^{\,\gamma} + \frac{\gamma^{\gamma}}{(\gamma+1)^{\gamma+1}}\right], \qquad (52.7)

where ā denotes the asset-to-deposit ratio at which the FDIC will call the put option. So the net value of the deposit insurance, evaluated as a callable perpetual American put option, is Equation (52.6) minus Equation (52.7):

i(a, 1; \infty) = (1 - \bar{a})\left(\frac{a}{\bar{a}}\right)^{-\gamma}. \qquad (52.8)
Allen and Saunders (1993) indicate that, after deducting the call provision value, the deposit insurance premium has three
features. First, the simulations show that the looser the closure policy, the more valuable the deposit insurance. Second, the value of the deposit insurance subsidy from the insurer increases monotonically with the volatility of bank assets. And third, if the insurer rewards safer banks with a looser closure policy, then the value of deposit insurance attains a maximum at some level of asset volatility rather than being a monotonic function of it. Let ā ≡ f(σ), where f′(σ) > 0; then

\frac{\partial i}{\partial \sigma} = i\left[-\frac{f'(\sigma)}{1 - f(\sigma)} + \frac{4r}{\sigma^3}\ln\!\left(\frac{a}{f(\sigma)}\right) + \frac{2r}{\sigma^2}\,\frac{f'(\sigma)}{f(\sigma)}\right]. \qquad (52.9)

Note that the sign of Equation (52.9) is undetermined: although the value of deposit insurance increases with asset risk, the likelihood of closure increases as well, which reduces the value of deposit insurance.

52.3.2.2 Hwang et al. (2009)

Based on the framework of Allen and Saunders (1993), Hwang et al. (2009) introduce bankruptcy costs into the model and derive a closed-form solution for the fair deposit insurance premium, in which the bankruptcy cost is a function of asset return volatility. The comparative static analysis and simulation results show that the value of deposit insurance increases with asset volatility.
Let k_x denote the discount rate at the self-closure asset-to-deposit ratio x, and k_ā the discount rate at the regulatory closure point ā, where 0 ≤ k_x, k_ā ≤ 1. The bankruptcy cost is then 1 − k_x x at the self-closure point and 1 − k_ā ā at the regulatory closure point. Under this setting, the value of the noncallable perpetual put with bankruptcy cost is

p_{bc}(a, 1; \infty) = \frac{1}{1+\gamma}\left[\frac{(1+\gamma)k_x a}{\gamma}\right]^{-\gamma}, \qquad (52.10)

and the call provision of the perpetual put with bankruptcy cost becomes

c_{bc}(a, 1; \infty) = a^{-\gamma}\left[k_{\bar{a}}\,\bar{a}^{\,\gamma+1} - \bar{a}^{\,\gamma} + \frac{\gamma^{\gamma}}{k_x^{\gamma}(\gamma+1)^{\gamma+1}}\right]. \qquad (52.11)

Similarly, subtracting the value of the call provision from the value of the noncallable perpetual put yields the value of the callable deposit insurance with bankruptcy costs; that is,

i_{bc}(a, 1; \infty) = (1 - k_{\bar{a}}\bar{a})\left(\frac{a}{\bar{a}}\right)^{-\gamma}. \qquad (52.12)

Define k_ā ≡ k_ā(σ), where k_ā′(σ) < 0; that is, the bankruptcy cost is a function of bank asset risk and increases with asset risk. By differentiating Equation (52.12) with respect to σ, one can obtain

\frac{\partial i_{bc}}{\partial \sigma} = \left(-k_{\bar{a}}'(\sigma)\,\bar{a}\right)\left(\frac{a}{\bar{a}}\right)^{-\gamma} + (1 - k_{\bar{a}}\bar{a})\,\frac{4r}{\sigma^3}\ln\!\left(\frac{a}{\bar{a}}\right)\left(\frac{a}{\bar{a}}\right)^{-\gamma} \ge 0. \qquad (52.13)

That is, the value of deposit insurance with bankruptcy cost increases with the bank asset risk, which is a desirable result.
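The closed forms (52.8) and (52.12) are straightforward to evaluate. The following sketch compares the two insurance values for a few volatility levels; all parameter values, including the recovery rate k_ā, are assumed for illustration and are not taken from the chapter.

```python
def insurance_callable(a, a_bar, r, sigma):
    """Equation (52.8): callable perpetual put value, no bankruptcy costs."""
    g = 2.0 * r / sigma**2          # gamma = 2r / sigma^2
    return (1.0 - a_bar) * (a / a_bar) ** (-g)

def insurance_callable_bc(a, a_bar, k_abar, r, sigma):
    """Equation (52.12): callable perpetual put with bankruptcy costs,
    where k_abar is the recovery (discount) rate at the closure point."""
    g = 2.0 * r / sigma**2
    return (1.0 - k_abar * a_bar) * (a / a_bar) ** (-g)

# Illustrative numbers: asset-to-deposit ratio 1.05, regulatory closure
# point 0.97, risk-free rate 4%; sweep the asset volatility.
for sigma in (0.05, 0.10, 0.20):
    i0 = insurance_callable(1.05, 0.97, 0.04, sigma)
    i1 = insurance_callable_bc(1.05, 0.97, k_abar=0.9, r=0.04, sigma=sigma)
    print(f"sigma={sigma:.2f}:  i={i0:.4f}   i_bc={i1:.4f}")
```

With the closure point held fixed, both values rise with σ, and the bankruptcy-cost version is uniformly larger, in line with Equation (52.13).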
52.3.2.3 Lee et al. (2005)

With the implementation of the Basel Accord, the pricing of the fair deposit insurance premium becomes more complicated, since the threshold of the minimum capital requirement must be considered together with the forbearance policy. Explicitly considering the forbearance policy taken by the regulator and the capital requirement of the Basel Accord, Lee et al. (2005) derive the formula for the fair deposit insurance price under this setting. They assume that banks' auditing times are known in advance. At audit time, if the required capital ratio is maintained, the value of deposit insurance is zero. If the required capital ratio is slightly violated but the violation is still forborne by the regulator, the bank is allowed to operate and will be audited later. But if the required capital ratio is severely violated, the bank is closed and the regulator assumes the capital deficit. The derived formula can be decomposed into two parts: a capital forbearance component and a time forbearance component. They show that although a lower forbearance threshold increases the value of the capital forbearance component (because fewer insolvent banks are closed immediately), the value of the time forbearance component decreases accordingly (because more banks are closed later). In sum, lowering forbearance decreases the fair value of deposit insurance.
52.3.3 The Consideration of Interest Rate Risk

Previous studies (e.g., Marcus and Shaked 1984; Ronn and Verma 1986) assume that interest rates and the volatility of bank assets are constant, which is unrealistic. Using the option pricing formula on a coupon bond, Kerfriden and Rochet (1993) derive a formula for the fair deposit insurance premium. They assume that the interest rate follows a diffusion process and, at the same time, consider that the cash flows of the assets come from multiple periods. The formula that they derive is not explicit, but can be computed
numerically. The sensitivity analysis shows that the deposit insurance premium decreases with the capital-to-asset ratio and increases with both short- and long-term interest rates.
52.3.4 Utilization of GARCH Option Pricing

As noted in Sect. 52.3.3, previous studies assume that interest rates and the volatility of bank assets are constant. Empirical studies often find that financial asset returns exhibit features such as fat-tailed distributions, volatility clustering, and leverage effects. Duan and Yu (1999) employ the GARCH option pricing technique to value the deposit insurance fairly. Their approach considers the capital standard and the forbearance that the regulator exercises, and uses the NGARCH model to describe the bank asset return process. In their framework, the fair deposit insurance premium can be priced numerically. The remarkable feature of the GARCH option pricing method is that the fair deposit insurance premium is a function of the asset risk premium.
Consider an n-period model in which banks are audited at the end of each period. The asset value adjustment mechanism at auditing time t_i is

A(t_i) = \begin{cases} q_u D e^{r t_i}, & \text{if } A^-(t_i) \ge q_u D e^{r t_i} \\ A^-(t_i), & \text{if } \rho q_l D e^{r t_i} \le A^-(t_i) < q_u D e^{r t_i} \\ q_l D e^{r t_i}, & \text{otherwise,} \end{cases} \qquad (52.14)

where A⁻(t_i) represents the bank asset value prior to adjustment, and q_u is a threshold level of the asset-to-deposit ratio above which the bank pays out cash dividends. D denotes the face value of the insured deposits. The parameter ρ captures the level of capital forbearance, where 0 < ρ < 1: the bank is still allowed to operate even though the capital requirement is violated, provided the violation is forborne by the regulator; but if the capital requirement is severely violated, the insurer infuses funds to meet the capital requirement. The parameter q_l = 1.087 corresponds to the 8% capital requirement of the Basel Accord. The deposit insurance payoff at auditing time t_i, denoted by P(t_i), is

P(t_i) = \begin{cases} 0, & \text{if } A^-(t_i) \ge \min(\rho q_l, 1) D e^{r t_i} \\ D e^{r t_i} - A^-(t_i), & \text{otherwise.} \end{cases} \qquad (52.15)

The NGARCH model of the bank asset return R_{t+1} = ln(A_{t+1}/A_t) under the risk-neutral probability measure is

R_{t+1} = r - \frac{1}{2}\sigma_{t+1}^2 + \sigma_{t+1}\,\xi_{t+1}, \qquad (52.16)

\sigma_{t+1}^2 = \beta_0 + \beta_1 \sigma_t^2 + \beta_2 \sigma_t^2 (\xi_t - \theta - \lambda)^2, \qquad (52.17)

where ξ_{t+1} ≡ ε_{t+1} + λ is a standard Normal random variable with respect to the risk-neutral probability measure, λ being the unit asset risk premium. Finally, taking expectations at time 0 with respect to the dynamic process in Equations (52.16) and (52.17), one can obtain the fair deposit insurance premium per dollar of deposits in each period as

\delta_n = \frac{1}{nD}\sum_{i=1}^{n} e^{-r t_i}\, E_0\!\left[P(t_i)\right]. \qquad (52.18)
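Because Equation (52.18) has no closed form, it is evaluated by simulation. The following Monte Carlo sketch implements the mechanism of Equations (52.14)–(52.18) under assumed (not estimated) NGARCH parameters; it illustrates the simulation logic rather than replicating Duan and Yu's results.

```python
import numpy as np

rng = np.random.default_rng(0)

def deposit_insurance_premium(A0=105.0, D=100.0, r=0.02, n=10,
                              q_u=1.20, q_l=1.087, rho=0.90,
                              b0=1e-5, b1=0.85, b2=0.10, theta=0.3, lam=0.2,
                              n_paths=100_000):
    """Monte Carlo sketch of Eqs. (52.14)-(52.18): per-period fair premium
    per dollar of deposits under NGARCH asset returns with annual audits."""
    A = np.full(n_paths, A0)
    # start volatility at its risk-neutral stationary level
    sig2 = np.full(n_paths, b0 / (1.0 - b1 - b2 * (1 + (theta + lam) ** 2)))
    payoff_pv = np.zeros(n_paths)
    trigger = min(rho * q_l, 1.0)
    for t in range(1, n + 1):
        xi = rng.standard_normal(n_paths)
        A *= np.exp(r - 0.5 * sig2 + np.sqrt(sig2) * xi)              # Eq. (52.16)
        sig2 = b0 + b1 * sig2 + b2 * sig2 * (xi - theta - lam) ** 2    # Eq. (52.17)
        Dt = D * np.exp(r * t)
        shortfall = A < trigger * Dt                                   # severe violation
        payoff_pv += np.exp(-r * t) * np.where(shortfall, Dt - A, 0.0)  # Eq. (52.15)
        A = np.where(shortfall, q_l * Dt, np.minimum(A, q_u * Dt))      # Eq. (52.14)
    return payoff_pv.mean() / (n * D)                                  # Eq. (52.18)

print(f"fair premium per dollar of deposits per period: {deposit_insurance_premium():.5f}")
```

Raising λ or the β parameters (and hence the risk-neutral volatility path) increases the premium, which is the sense in which the fair premium is a function of the asset risk premium.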
52.4 Applications of Merton (1977)

One of the main applications of Merton (1977) is to examine whether the deposit insurance provided by the FDIC is overpriced, underpriced, or fairly priced. Marcus and Shaked (1984) propose a method that can be implemented to price the risk-based deposit insurance premium of each bank; their empirical results show that the fees charged by the FDIC are excessive. Ronn and Verma (1986) also present an empirical approach, but they point out that after taking the forbearance policy into consideration, the deposit insurance premia are underpriced rather than overpriced. Pennacchi (1987) indicates that whether the deposit insurance premia are under- (or over-) priced depends on the level of the insurer's control over the banks; his empirical results show that the current premium lies within the upper and lower bounds he presents, which suggests that the FDIC has mild control over the banks. Flannery (1991) argues that the insurer can observe the financial data of banks only with measurement errors, which tends to cause the true value of the deposit insurance they provide to be underestimated. Without measurement errors, adopting a risk-based deposit insurance scheme or risk-based capital requirements would be equivalent. But since measurement errors exist, it is socially optimal to adopt both a risk-based deposit insurance scheme and risk-based capital requirements.
52.4.1 Marcus and Shaked (1984)

Utilizing the analogy between deposit insurance and a put option proposed by Merton (1977), Marcus and Shaked (1984) provide a way to evaluate the risk-based deposit insurance premium of publicly traded banks. If the value of bank assets follows a diffusion process, then the fair value of deposit insurance I at time 0 is

I = D e^{-rT}\left[1 - \Phi(x_2)\right] - e^{-\delta T} V_0 \left[1 - \Phi(x_1)\right], \qquad (52.19)

where δ denotes the dividend rate per dollar of bank assets, V₀ denotes the value of bank assets at time 0, x₁ = [log(V₀/D) + (r + σ²/2 − δ)T]/(σ√T), and x₂ = x₁ − σ√T. To calculate the volatility of bank asset value, the following equation is used:

\sigma = \sigma_E\left[1 - \frac{D e^{-rT}\,\Phi(x_2)}{e^{-\delta T}\, V_0\, \Phi(x_1)}\right], \qquad (52.20)

where σ_E denotes the standard deviation of the rate of return on bank equity. Using recursive substitutions, σ and I can be derived from Equations (52.19) and (52.20). Two main findings emerge from Marcus and Shaked (1984): the deposit insurance premium charged by the FDIC is excessive, even after taking account of administrative costs; and the estimated risk-based premia are skewed across banks, indicating that many relatively safe banks subsidize the fewer but much riskier banks.
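The recursive substitution can be implemented in a few lines; the inputs below (asset value, deposits due, equity volatility) are hypothetical, chosen only to show that the iteration converges.

```python
from math import log, sqrt, exp
from scipy.stats import norm

def marcus_shaked(V0, D, sigma_E, r=0.05, delta=0.0, T=1.0, n_iter=100):
    """Recursive substitution for Eqs. (52.19)-(52.20): back out asset
    volatility sigma from observed equity volatility sigma_E, then price
    the deposit insurance I."""
    sigma = sigma_E  # starting guess
    for _ in range(n_iter):
        x1 = (log(V0 / D) + (r + sigma**2 / 2.0 - delta) * T) / (sigma * sqrt(T))
        x2 = x1 - sigma * sqrt(T)
        sigma = sigma_E * (1.0 - D * exp(-r * T) * norm.cdf(x2)
                           / (exp(-delta * T) * V0 * norm.cdf(x1)))  # Eq. (52.20)
    I = (D * exp(-r * T) * (1.0 - norm.cdf(x2))
         - exp(-delta * T) * V0 * (1.0 - norm.cdf(x1)))              # Eq. (52.19)
    return sigma, I / D  # asset volatility, premium per dollar of deposits

sigma, prem = marcus_shaked(V0=107.0, D=100.0, sigma_E=0.25)
print(f"implied asset volatility = {sigma:.4f}, fair premium per dollar = {prem:.6f}")
```

For a well-capitalized, low-asset-volatility bank the implied fair premium is close to zero, which is exactly the mechanism behind Marcus and Shaked's finding that a flat FDIC fee looks excessive for most banks.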
52.4.2 Ronn and Verma (1986)

Contrary to Marcus and Shaked (1984), Ronn and Verma (1986) point out that the deposit insurance premium charged by the FDIC is underpriced rather than overpriced. The main difference is that Ronn and Verma (1986) explicitly consider the direct assistance from the insuring agency in their model, while Marcus and Shaked (1984) do not. That is, at the time of audit, if the market value of bank assets V is below the deposit liability D, the bank is not liquidated immediately. Instead, the insurer infuses funds into the bank when ρD ≤ V ≤ D, the infusion D − V being at most D(1 − ρ); the bank is closed only when V < ρD. The insurer thus exercises a kind of forbearance policy, and it is this forbearance that increases the value of deposit insurance.
To calculate the current market value V₀ and the return volatility σ of bank assets, Equation (52.20) is not used in Ronn and Verma (1986). They use the isomorphic relationship between the equity value of a bank and a call option:

E = V_0\,\Phi(d_1) - D e^{-rT}\,\Phi(d_2), \qquad (52.21)

\sigma = \frac{\sigma_E\, E}{V_0\,\Phi(d_1)}, \qquad (52.22)

where E denotes the equity market value of the bank, d₁ = [log(V₀/D) + σ²T/2]/(σ√T), and d₂ = d₁ − σ√T. Implementing recursive substitutions in Equations (52.21) and (52.22), one can obtain V₀ and σ. Inserting V₀ and σ into Equation (52.19), the risk-based deposit insurance premium of each bank is obtained.
One caution is that, taking the forbearance policy into consideration, Ronn and Verma (1986) indicate that the current deposit insurance premium is underestimated, but Allen and
Saunders (1993) point out that the subsidies from the FDIC tend to be overestimated. The reason is that Ronn and Verma (1986) consider the direct assistance from the insurer, so the extra costs must be added to the premium, while Allen and Saunders (1993) regard the forbearance policy as the FDIC's right to exercise the put option before the bank chooses to self-close, so the value of the call provision has to be deducted from the premium.
One can compare the method implemented in Ronn and Verma (1986) with that of Mullins and Pyle (1994). Although both use the isomorphic correspondences between options and deposit insurance or equity value, the purposes differ. Ronn and Verma (1986) try to solve for the fair risk-adjusted deposit insurance premium of listed banks, so they use the two kinds of isomorphic relationships in Equations (52.19) and (52.21), relying on publicly traded data (stock prices of listed banks) to see whether the current premium is fairly priced. Mullins and Pyle (1994) use the similar isomorphic relationships in Equations (52.4) and (52.5), but their purpose is to calculate the risk-based capital ratio of each bank, taking the current deposit insurance premium as given. Simply speaking, in Ronn and Verma (1986) the bank's equity value is used as the input and the deposit insurance premium is the output; in Mullins and Pyle (1994) the current deposit insurance premium is used as the input and the equity value of the bank (and hence the risk-based capital ratio) is derived as the output.
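A sketch of the Ronn-Verma inversion follows: the unknowns are V₀ and σ, recovered from the observed equity value and equity volatility by iterating Equations (52.21) and (52.22). All input values are illustrative.

```python
from math import log, sqrt, exp
from scipy.stats import norm

def ronn_verma_assets(E, D, sigma_E, r=0.05, T=1.0, n_iter=200):
    """Recursive substitution for Eqs. (52.21)-(52.22): recover the
    unobserved asset value V0 and asset volatility sigma from the
    observed equity value E and equity volatility sigma_E."""
    V0, sigma = E + D, sigma_E * E / (E + D)   # starting guesses
    for _ in range(n_iter):
        d1 = (log(V0 / D) + sigma**2 * T / 2.0) / (sigma * sqrt(T))
        d2 = d1 - sigma * sqrt(T)
        V0 = (E + D * exp(-r * T) * norm.cdf(d2)) / norm.cdf(d1)  # invert (52.21)
        sigma = sigma_E * E / (V0 * norm.cdf(d1))                  # Eq. (52.22)
    return V0, sigma

V0, sigma = ronn_verma_assets(E=10.0, D=100.0, sigma_E=0.30)
print(f"V0 = {V0:.3f}, asset volatility = {sigma:.4f}")
# V0 and sigma can then be inserted into Equation (52.19) to obtain the premium.
```

The design mirrors the text: equity data are the inputs, and the bank-specific premium is the output, the reverse of the Mullins-Pyle calculation sketched in Sect. 52.3.1.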
52.4.3 Pennacchi (1987)

Marcus and Shaked (1984) reach the conclusion that the fees charged by the FDIC are overpriced under the current deposit insurance scheme. Pennacchi (1987) uses a different argument to explain this phenomenon. He states that the judgment of whether most banks are currently under- or overcharged by the FDIC must hinge on the degree of the FDIC's regulatory control over banks. If the FDIC has full control, it can force a bank to infuse funds immediately when its minimum capital requirements are not met; in this case the deposit insurance premium appears overcharged, because the insurer's liability is lower. On the contrary, if the FDIC has no control over the banks, that is, no power to force a bank to increase its capital ratio, then the premium appears undercharged, because the liability the FDIC assumes increases. Marcus and Shaked (1984) assume that a bank is closed by the insurer as soon as it is insolvent, which implicitly assumes that the FDIC has full control over the banks; the model-implied fair premium therefore tends to be understated under this assumption. Pennacchi (1987) assumes that the market value of bank assets and the deposit liability both follow diffusion processes. Following Merton (1978), he also assumes that the insuring agency incurs surveillance costs and that the audit times are
Poisson-distributed. Under this setting, he derives the fair deposit insurance premium for two scenarios: the FDIC has full control, and the FDIC has no control over the banks. Because in fact the FDIC has some power over the banks, these two premia constitute the upper and lower bounds of the fair premium, and the empirical results show that the current premium lies between them.
52.4.4 Flannery (1991)

To price the deposit insurance premium fairly, one necessary condition is that the asset volatility σ_V and the deposit level D entering formulas such as Equation (52.4) are accurately estimated. That is, the true financial data of a bank, namely the risk of the bank's asset value and the bank's liability level, must be known to the insurer without error. Flannery (1991) indicates that there are reasons why the regulator measures the financial data of banks with errors. First, bank assets are often opaque and hard to monitor, so evaluating the value of bank assets is costly. Second, even if the insurer knows the true values in advance, bankers have strong incentives to change the financial condition ex post, because increasing asset volatility and the liability level raises the value of the deposit insurance that they receive. If there were no measurement errors, adopting risk-based capital requirements or risk-based premia would work equally well. But since measurement errors exist, it is socially optimal to maintain risk-based capital requirements while charging risk-based deposit insurance premia. The argument is as follows. The value of deposit insurance increases with the leverage level of a bank. Although measurement errors increase with the leverage level and are socially costly, leverage is also socially beneficial; hence, for the FDIC to maximize social welfare, there exists an optimal level of leverage. So, to maximize social welfare, both the risk-based capital ratio and the risk-based deposit insurance scheme should be maintained.
52.5 Conclusion

A remarkable feature of the banking industry is its role as an intermediary: lending money to borrowers and receiving funds from depositors. To function well, banks should be risk-free from the perspective of depositors; otherwise, depositors would have to collect information on banks or diversify their deposits across banks, which is costly. One way to solve the problem is to introduce a third-party liability guarantee. Considering the size of the whole banking industry, only the government or its agencies can act as the guarantor, and such a guarantee imposes costs on the insurer. To measure the costs to the guarantor, Merton (1977)
first proposes an isomorphic correspondence between a put option and the deposit insurance guarantee, and then uses the Black-Scholes formula to price the fair deposit insurance premium. The reason behind the isomorphic correspondence is that at the end of the period, if the bank is solvent, the deposit insurance is worthless; but if the bank is insolvent, the insurer must infuse funds to meet the liabilities. Hence the payoff of the deposit insurance corresponds to that of a put option written on the bank assets with strike price equal to the insured deposits. The simplest version of deposit insurance pricing shows that the premium decreases with the asset-to-deposit ratio and increases with the volatility of bank asset value.
Since the pioneering work of Merton (1977), many studies have followed. Merton (1978) considers surveillance costs with random auditing times and shows that the fair deposit insurance premium does not decrease monotonically in the asset-to-deposit ratio. Mullins and Pyle (1994) also consider auditing costs, but they calculate the risk-based capital ratio given the deposit insurance premium set by the regulator; they show that both the risk-based capital ratio and the deposit rate spread increase with auditing costs. Allen and Saunders (1993), Hwang et al. (2009), and Lee et al. (2005) all value the deposit insurance premium with the forbearance policy explicitly taken into account. Allen and Saunders (1993) show that if the regulator rewards safer banks with a looser closure policy, the value of deposit insurance does not increase monotonically with bank asset risk. Inspired by Allen and Saunders (1993), Hwang et al. (2009) show how bankruptcy costs can be incorporated into the deposit insurance pricing model. In addition, Lee et al. (2005) derive a formula for the fair deposit insurance premium that explicitly considers the capital requirements of the Basel Accord.
The early studies follow the Black-Scholes framework and assume that the risk-free rate and the volatility of bank asset value are constant. Kerfriden and Rochet (1993) provide a way of pricing the deposit insurance premium under stochastic interest rates, and Duan and Yu (1999) show how the GARCH option pricing model can relax the constant volatility assumption.
An important application of Merton (1977) is to test whether deposit insurance is fairly priced. Marcus and Shaked (1984) point out that the deposit insurance premia charged at the time were excessive even after considering the administrative costs of the FDIC. However, Ronn and Verma (1986) claim that the deposit insurance is underpriced; the main reason is that the forbearance policy exercised by the regulator is neglected in Marcus and Shaked (1984). Pennacchi (1987) argues that whether the deposit insurance is under- (over-) charged depends on the level of the insurer's control over the banking industry. Flannery (1991) indicates that the deposit insurance premium tends to be underpriced because the
insurer can only measure bank asset risks with errors (underestimating the risks ex ante).
Regarding the deposit insurance scheme, there is still much room for improvement. For example, a mechanism is needed that minimizes the effects of moral hazard, or a modification of the deposit insurance scheme that is consistent with the supervision of capital requirements.
References

Allen, L. and A. Saunders. 1993. "Forbearance and valuation of deposit insurance as a callable put." Journal of Banking and Finance 17, 629–643.
Bjork, T. 1998. Arbitrage theory in continuous time, Oxford University Press, Oxford.
Cox, J. C. and S. Ross. 1976. "The valuation of options for alternative stochastic processes." Journal of Financial Economics 3, 145–166.
Duan, J. C. and M. T. Yu. 1999. "Capital standard, forbearance and deposit insurance pricing under GARCH." Journal of Banking and Finance 23, 1691–1706.
Epps, T. W. 2000. Pricing derivative securities, World Scientific, Singapore.
Flannery, M. J. 1991. "Pricing deposit insurance when the insurer measures bank risk with an error." Journal of Banking and Finance 15, 975–998.
Harrison, M. and D. Kreps. 1979. "Martingales and multiperiod securities markets." Journal of Economic Theory 20, 381–408.
Hwang, D.-Y., F.-S. Shie, K. Wang and J.-C. Lin. 2009. "The pricing of deposit insurance considering bankruptcy costs and closure policies." Journal of Banking and Finance 33(10), 1909–1919.
Kerfriden, C. and J. C. Rochet. 1993. "Actuarial pricing of deposit insurance." Geneva Papers on Risk and Insurance Theory 18, 111–130.
Lee, S. C., J. P. Lee, and M. T. Yu. 2005. "Bank capital forbearance and valuation of deposit insurance." Canadian Journal of Administrative Sciences 22, 220–229.
Marcus, A. J. and I. Shaked. 1984. "The valuation of FDIC deposit insurance using option-pricing estimates." Journal of Money, Credit and Banking 16, 446–460.
Merton, R. C. 1977. "An analytic derivation of the cost of deposit insurance and loan guarantees." Journal of Banking and Finance 1, 3–11.
Merton, R. C. 1978. "On the cost of deposit insurance when there are surveillance costs." Journal of Business 51, 439–452.
Mullins, H. M. and D. H. Pyle. 1994. "Liquidation costs and risk-based bank capital." Journal of Banking and Finance 18, 113–138.
Pennacchi, G. G. 1987. "A reexamination of the over- (or under-) pricing of deposit insurance." Journal of Money, Credit and Banking 19, 340–360.
Ronn, E. I. and A. K. Verma. 1986. "Pricing risk-adjusted deposit insurance: an option-based model." Journal of Finance 41, 871–895.

Appendix 52A

Proof of Equation (52.1)

In this appendix, we first derive the European put option pricing formula and then apply this formula to the deposit insurance premium in Merton (1977).
(1) Cox and Ross (1976) and Harrison and Kreps (1979) propose the martingale pricing method to price the value of an option. This method is briefly introduced as follows. Assume that the price of the underlying asset of a put option follows a diffusion process:

\frac{dS}{S} = \mu\, dt + \sigma\, dW^P, \qquad (52.23)

where W^P denotes a Wiener process under the probability measure (P-measure). Let

dW^P = dW^Q - \frac{\mu - r}{\sigma}\, dt, \qquad (52.24)

where W^Q is the Wiener process under the equivalent martingale measure (Q-measure) of the original measure (P-measure); the asset return process under the Q-measure is then

\frac{dS}{S} = r\, dt + \sigma\, dW^Q. \qquad (52.25)

By Ito's Lemma,

d\ln S_t = \left(r - \frac{\sigma^2}{2}\right) dt + \sigma\, dW_t^Q, \qquad (52.26)

and

S_T = S \exp\left[\left(r - \frac{\sigma^2}{2}\right) T + \sigma W_T^Q\right], \qquad (52.27)

where W_T^Q ~ N^Q(0, T). Furthermore, let dW^Q = dW^R + σ dt; then under the R-measure

S_T = S \exp\left[\left(r + \frac{\sigma^2}{2}\right) T + \sigma W_T^R\right], \qquad (52.28)

where W_T^R ~ N^R(0, T). Let \eta_t \equiv \exp\!\left(-\frac{\sigma^2}{2} t + \sigma W_t^Q\right) = \exp\!\left(-\frac{1}{2}\int_0^t \sigma^2\, ds + \int_0^t \sigma\, dW_s^Q\right); since E\left\{\exp\!\left(\frac{1}{2}\int_0^t \sigma^2\, ds\right)\right\} = \exp\!\left(\frac{1}{2}\sigma^2 t\right) < \infty, by the Girsanov Theorem η_t is a martingale process adapted to a filtration F_t and

E^Q(\eta_t I_A) = E^R(I_A), \qquad (52.29)
where I_A is an indicator function; that is, I_A = 1 if event A occurs and I_A = 0 otherwise. Readers not familiar with these tools may refer to Pricing Derivative Securities (Epps 2000) and Arbitrage Theory in Continuous Time (Bjork 1998) for more details.
(2) The payoff to a European put option holder at time T is Max(K − S_T, 0), where K is the strike price and S_T denotes the price of the underlying asset at T. The fair price of the put P at the beginning is thus

P = e^{-rT} E^Q\!\left[\max(K - S_T, 0)\right] = e^{-rT} E^Q\!\left[(K - S_T) I_A\right], \qquad (52.30)

where A = {S_T | S_T < K}. Hence

P = K e^{-rT} E^Q(I_A) - e^{-rT} E^Q(S_T I_A). \qquad (52.31)

We deal with the term E^Q(I_A) first and then calculate E^Q(S_T I_A). By Equation (52.27),

E^Q(I_A) = \Pr^Q(S_T < K) = \Pr^Q(\ln S_T < \ln K)
= \Pr^Q\!\left[\ln S + \left(r - \sigma^2/2\right) T + \sigma W_T^Q < \ln K\right]
= \Pr^Q\!\left[\frac{W_T^Q}{\sqrt{T}} < \frac{\ln(K/S) - (r - \sigma^2/2) T}{\sigma\sqrt{T}}\right]
= \Phi(x), \qquad (52.32)

where W_T^Q ~ N^Q(0, T), Φ(·) denotes the CDF of the standard Normal distribution, and x = [ln(K/S) − (r − σ²/2)T]/(σ√T). Under the Q-measure,

S_T = S \exp\!\left[\left(r - \sigma^2/2\right) T + \sigma W_T^Q\right]
= S e^{rT} \exp\!\left(-\frac{\sigma^2}{2} T + \sigma W_T^Q\right) = S e^{rT}\, \eta_T. \qquad (52.33)

By Equations (52.28) and (52.29),

E^Q(S_T I_A) = S e^{rT} E^Q(\eta_T I_A) = S e^{rT} E^R(I_A)
= S e^{rT} \Pr^R(S_T < K) = S e^{rT} \Pr^R(\ln S_T < \ln K)
= S e^{rT} \Pr^R\!\left[\ln S + \left(r + \sigma^2/2\right) T + \sigma W_T^R < \ln K\right]
= S e^{rT} \Pr^R\!\left[\frac{W_T^R}{\sqrt{T}} < \frac{\ln(K/S) - (r + \sigma^2/2) T}{\sigma\sqrt{T}}\right]
= S e^{rT}\, \Phi\!\left(x - \sigma\sqrt{T}\right). \qquad (52.34)

Substituting Equations (52.32) and (52.34) into Equation (52.31), one may obtain the premium of a European put:

P = K e^{-rT}\, \Phi(x) - S\, \Phi\!\left(x - \sigma\sqrt{T}\right). \qquad (52.35)

(3) Return to Merton (1977). To derive Equation (52.1), one may simply substitute S = V, K e^{−rT} = D (the value of the insured deposits at the beginning), and σ²T = τ into Equation (52.35). Letting d ≡ D/V and dividing both sides of (52.35) by D, Equation (52.1) is easily obtained. Note that

x = \frac{\ln(K/S) - (r - \sigma^2/2) T}{\sigma\sqrt{T}}
= \frac{\ln(K/S) + \ln e^{-rT} + \sigma^2 T/2}{\sigma\sqrt{T}}
= \frac{\ln(K e^{-rT}/S) + \tau/2}{\sqrt{\tau}} = h_2,

and h₁ = x − √τ.

Appendix 52B

Proof of Equations (52.2) and (52.3)

(1) Proof of Equation (52.2).
Pf: Differentiating Equation (52.1) with respect to d,

\frac{\partial g}{\partial d} = \Phi'(h_2)\,\frac{\partial h_2}{\partial d} + \frac{\Phi(h_1)}{d^2} - \frac{\Phi'(h_1)}{d}\,\frac{\partial h_1}{\partial d}.

Since \Phi'(h_2) = \frac{\Phi'(h_1)}{d} and \frac{\partial h_2}{\partial d} = \frac{\partial h_1}{\partial d}, the first and third terms cancel, and

\frac{\partial g}{\partial d} = \frac{\Phi(h_1)}{d^2}.

(2) Proof of Equation (52.3).
Pf: Differentiating Equation (52.1) with respect to τ,

\frac{\partial g}{\partial \tau} = \Phi'(h_2)\,\frac{\partial h_2}{\partial \tau} - \frac{\Phi'(h_1)}{d}\,\frac{\partial h_1}{\partial \tau}.

Since \Phi'(h_2) = \frac{\Phi'(h_1)}{d}, then

\frac{\partial g}{\partial \tau} = \frac{\Phi'(h_1)}{d}\left[\frac{\partial h_2}{\partial \tau} - \frac{\partial h_1}{\partial \tau}\right] = \frac{\Phi'(h_1)}{d}\,\frac{\partial (h_2 - h_1)}{\partial \tau}.

We know that \frac{\partial (h_2 - h_1)}{\partial \tau} = \frac{\partial \sqrt{\tau}}{\partial \tau} = \frac{1}{2\sqrt{\tau}}; we have

\frac{\partial g}{\partial \tau} = \frac{\Phi'(h_1)}{2 d \sqrt{\tau}}.
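The closed forms derived in this appendix can be checked against finite differences of Equation (52.1); a quick numerical sketch, assuming SciPy:

```python
from math import log, sqrt
from scipy.stats import norm

def g(d, tau):
    """Equation (52.1)."""
    h1 = (log(d) - tau / 2.0) / sqrt(tau)
    return norm.cdf(h1 + sqrt(tau)) - norm.cdf(h1) / d

d, tau, eps = 0.95, 0.04, 1e-6
fd_d = (g(d + eps, tau) - g(d - eps, tau)) / (2 * eps)    # central difference in d
fd_tau = (g(d, tau + eps) - g(d, tau - eps)) / (2 * eps)  # central difference in tau
h1 = (log(d) - tau / 2.0) / sqrt(tau)
print(fd_d, norm.cdf(h1) / d**2)                   # agrees with Equation (52.2)
print(fd_tau, norm.pdf(h1) / (2 * d * sqrt(tau)))  # agrees with Equation (52.3)
```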
Chapter 53
Keiretsu Style Main Bank Relationships, R&D Investment, Leverage, and Firm Value: Quantile Regression Approach Hai-Chin Yu, Chih-Sean Chen, and Der-Tzon Hsieh
Abstract Using quantile regression, our results provide explanations for the inconsistent findings obtained with conventional OLS regression in the extant literature. While the direct effects of R&D investment, leverage, and main bank relationships on Tobin's Q are insignificant in OLS regression, these effects do show significance in quantile regression. We find that the advantage of high R&D investment over low R&D investment increases monotonically with firm value, but only for high Q firms, while the advantage of low R&D over high R&D increases monotonically with firm value for low Q firms. Tobin's Q is monotonically increasing in leverage for low Q firms, whereas it is decreasing in leverage for high Q firms. Main banks add value for low to median Q firms, while value is destroyed for high Q firms. Meanwhile, we find that the interaction effect of main bank relationships and R&D investment, which increases with firm value, appears only in the medium quantiles rather than the low or high quantiles. These results provide relevant implications for policy makers. Finally, we document that the industry quantile effect is larger than the industry effect itself, given that most firms in higher quantiles gain from industry effects while lower quantile firms suffer negative effects. We also find that the OLS results are seriously influenced by outliers; in stark contrast, the quantile regression results are impervious to either inclusion or exclusion of outliers.
Keywords R&D investment · Quantile regression · Keiretsu · Main bank · Leverage
H.-C. Yu (corresponding author) Department of International Business, Chung Yuan University & Rutgers University, Chung-Li, 32023, Taiwan e-mail: [email protected]; [email protected]
C.-S. Chen Institute of Management, Chung Yuan University, Chung-Li, 32023, Taiwan e-mail: [email protected]
D.-T. Hsieh Department of Economics, National Taiwan University, Taipei, 100, Taiwan e-mail: [email protected]
53.1 Introduction

Research and development (R&D) investment is considered a crucial input for firms' growth and performance, especially in technology- and science-intensive industries. Among developed countries, Japan is thought to be one of the most successful in manufacturing. Why is manufacturing so successful in Japan but not in the U.S. or other countries? Our data show that more than 50% of Japanese listed firms belong to technology-based industries, underscoring the importance of R&D in boosting long-term economic growth. Because R&D investment is intangible, the asymmetric information problem with respect to R&D investment is expected to be more serious than for physical investment, since managers may be inclined to keep information confidential to maintain competitive advantage. This, in turn, makes it difficult for investors to measure the effect of R&D investment on firm value.
The extant literature shows divergent findings regarding the impact of R&D on firm value, rendering the results inconclusive. For example, Woolridge (1988) finds a significant market response to increases in R&D expenditure; however, the reaction differs across industries. Chan et al. (1990) and Zantout and Tsetsekos (1994) report a positive market response to R&D increases for firms in high-tech industries versus a negative market response for low-tech industries. Acs and Isberg (1996) suggest that the final impact of financial leverage and R&D investment on Tobin's Q depends on firm size. Ho et al. (2006) also indicate that a firm's growth opportunity from R&D depends on its size, leverage, and industry concentration. Conversely, some researchers, such as Doukas and Switzer (1992) and Sundaram et al. (1996), find an insignificant effect. Taken together, this evidence suggests that, due to asymmetric information, outside investors may not completely understand the value-adding characteristics of R&D.
More to the point, if a significant asymmetric information problem exists in intangible R&D investment, then a main bank relationship may help reduce this uncertainty and convey a positive signal. If this logic is correct, then firms with a main bank relationship facing high R&D
investment should provide more credible information, allowing investors to evaluate firm value more accurately than for firms without such a relationship. In turn, this superior level of information should be reflected in firm value. Additionally, this signal is expected to be stronger when the asymmetric information problem between managers and investors is more serious.
Another point that makes this issue valuable is the impact of R&D investment on long-term performance. This characteristic conforms with the Japanese sample, where main banks hold almost 71% of outstanding shares (Sheard 1994). This concentrated and stable ownership structure promotes efficient monitoring and long-term performance without regard to short-term pressures. Given that the main bank helps reduce the conflict between managers and creditors and positively aligns their relationship, using the main bank as a moderator variable is expected to produce different findings compared with countries characterized by highly individual ownership (Aoki 1990; McGuire and Dow 2002).
In Japan, banks traditionally maintain multidimensional ties with their clients, providing security, insurance, or banking services. This approach has come to be commonly known as the Keiretsu (or main bank) style of banking. From a positive viewpoint, a main bank gathers proprietary information through the financing process. This inside information (and hence the quality of the firm) is then signaled to the market by way of the main bank's financing decisions. Specifically, if the main bank is willing to continue financing, a signal of firm quality is conveyed to outside investors. It is through this mechanism that affiliation with main banks creates value. Moreover, due to the lower agency costs induced by the significant equity holdings of main banks, a higher level of performance is expected. From a negative point of view, some high-tech-intensive firms may prefer not to interact with a main bank, because activity on the part of the main bank may inadvertently reveal the firm's prospects to its competitors. Additionally, in the case of financial distress, managers may expect the main bank to solve their financial shortcomings, leading to moral hazard problems and an increased propensity for bankruptcy. Therefore, main bank relationships may exacerbate a firm's problems and reduce its performance. The extant literature has not reached a consistent finding on whether main banks increase or decrease firm value. Given that main banks serve simultaneously as internal governance (board membership) and external monitoring (principal creditor), more research designed to clarify the influence of main banks on firm value is warranted.
Our paper differs from the existing literature in a number of ways. First, we use quantile regression instead of conventional OLS to add evidence that helps explain the empirical puzzle concerning the effects of R&D expenditure,
leverage, and main bank relationships on firm value (a schematic sketch of the estimator is given at the end of this section). Since there were a great number of bubble Internet stocks in the Japanese high-tech industry during the 1990s, the average Tobin's Q would be overestimated using conventional OLS. To avoid this upward bias resulting from including firms with extremely high Tobin's Q, a more realistic view may be gained by using lower quantiles as the target point. Our data from 1994 to 2004 cover the whole Internet evolution period, illustrating again the advantage of the quantile approach. Second, we include the presence of main bank relationships in the model of firm value in order to capture the impact of this relationship and its ability to reduce asymmetric information and enhance firm value. Third, besides testing the independent effects of R&D and main bank influence on firm value, the interaction effects of R&D with leverage and with the main bank relationship are also included in the model; prior research treats these characteristics as independent effects, thus failing to test for possible interactions. Fourth, industry dummies are controlled for in order to avoid biases resulting from industry characteristics. Fifth, our data set is unique in that we are able to merge a number of data sources (such as the COMPUSTAT Global and Emerging Markets database and the PACAP database), allowing for a thorough examination, by way of quantile estimators, of the impact of Keiretsu-style banking on firm value.
We find that R&D investment, financial leverage, and Keiretsu-style main banking all have quantile effects on firm value. The impact of R&D investment, financial leverage, and Keiretsu-style banking on firm value varies with the Tobin's Q quantile of the firm rather than with firm size as suggested by Acs and Isberg (1996). These results are robust after controlling for industry types. Additional findings include, but are not limited to, the following: (a) the advantage of high (low) R&D investment over low (high) R&D investment increases monotonically with firm value for high (low) Q firms; (b) Tobin's Q is monotonically increasing (decreasing) in leverage for low (high) Q firms; (c) for low Q firms, R&D investment and leverage complement each other in creating value, while for high Q firms they substitute for each other; (d) main bank relationships play a significant value-adding role only for low to median Q firms, whereas the main bank effect turns negative for firms at extremely high Q; and (e) finally, R&D efforts by the industry itself do not have a systematic positive or negative impact on firm value; instead, most firms in higher quantiles gain from industry effects while lower quantile firms suffer negative effects.
The paper is organized as follows. A review of the extant literature is provided in Sect. 53.2. Section 53.3 presents the data and sample. Section 53.4 reports the methodology and results. Section 53.5 concludes.
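As the schematic illustration promised above, the following sketch fits conditional quantile regressions of a Tobin's Q-like variable on simulated data with a heavy right tail. The variable names and the simplified specification are ours, not the exact model of Sect. 53.4, and statsmodels' QuantReg is used as the estimator.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical firm-year panel with the paper's key regressors.
rng = np.random.default_rng(42)
n = 2000
df = pd.DataFrame({
    "rd_ta": rng.exponential(0.02, n),   # R&D / total assets
    "lev": rng.uniform(0.2, 0.9, n),     # leverage
    "mb_d": rng.integers(0, 2, n),       # main bank dummy
})
df["tobin_q"] = (1.2 + 8.0 * df.rd_ta - 0.5 * df.lev + 0.1 * df.mb_d
                 + rng.lognormal(0.0, 1.0, n))  # heavy right tail, as in the data

model = smf.quantreg("tobin_q ~ rd_ta + lev + mb_d", df)
for q in (0.10, 0.25, 0.50, 0.75, 0.90):
    res = model.fit(q=q)
    print(f"q={q:.2f}  rd_ta={res.params['rd_ta']:+.3f}  "
          f"lev={res.params['lev']:+.3f}  mb_d={res.params['mb_d']:+.3f}")
```

The point of the exercise is that the slope estimates vary across quantiles of the outcome while an OLS fit, pulled by the right tail, reports a single averaged effect.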
53.2 Literature Review

The effect of R&D investment on Tobin's Q and the relationship between R&D investment and debt financing are explored in numerous studies (Szewczyk et al. 1996; Zantout 1997; Vincente-Lorente 2001; O'Brien 2003). However, their findings are inconsistent: some document that debt has a positive impact on R&D investment, others document a negative effect, while still others find no effect at all. These inconsistencies are not surprising. From one point of view, debt can act as a positive influence on managerial behavior by motivating managers to invest in projects with positive net present value. Thus, R&D investments undertaken in a high debt ratio regime are expected to enhance value, which helps explain why the relationship between stock returns and R&D investment announcements is positive only in firms with relatively high debt ratios (Zantout and Tsetsekos 1994). From another point of view, R&D investments are viewed as risky and intangible, so managers of firms with higher debt ratios may shy away from them. In such cases, underinvestment (or asset substitution) may occur more easily than with other, physical investments, leading to negative impacts reflected in debt ratios. Hence, Bhagat and Welch (1995) suggest a negative relationship between R&D investments (witnessed through debt ratios) and firm value; Bradley et al. (1984) and Long and Malitz (1985) originally link leverage and R&D expenditures. Thereby, highly leveraged firms are expected to invest less in R&D than in other, physical assets. Acs and Isberg (1996) address the additional issue of growth opportunities, finding that the influence of financial leverage and R&D investment on growth opportunities depends on firm size and industry structure.
As we know, agency costs and information asymmetry problems are likely to reduce the benefits a highly leveraged firm derives from its R&D investments. Daniel and Titman (2006) suggest that asymmetric information reduces the attractiveness of R&D projects to outside investors (debt holders), with managers keeping "know-how" confidential for competitive reasons. This problem renders debt financing of R&D projects more expensive and less preferred. Daniel and Titman (2006) document that stock returns are positively related to price-scaled variables such as the book-to-market ratio (BM) and suggest that the book-to-market ratio can forecast stock returns because it is a good proxy for the intangible return. They provide an argument for why high book-to-market firms realize high future returns, and find that future returns are unrelated to accounting measures of past performance (tangible information) but are negatively related to future performance (intangible information). Among the different types of intangible information, R&D is the most crucial variable when it comes to assessing future performance. We measure firms' future performance
by the Tobin's Q ratio, as in the extant studies (Lang et al. 1989, 1991; Doukas 1995; Szewczyk et al. 1996, etc.).1 Chan et al. (1990) and Zantout and Tsetsekos (1994) suggest that R&D investments by firms with promising growth opportunities are worthy of inclusion in any model of firm value. Due to the potential uncertainty caused by the asymmetry of information between firms and investors and the unobservable nature of management, R&D is likely to be discounted in high-leverage firms. Said another way, asymmetric information problems and agency costs may render debt financing of R&D investments more expensive than other options. Support for this argument is found in Bhagat and Welch (1995), who find a negative relationship between debt ratios and R&D investments on firm value, and in Kamien and Schwartz (1982), who find that investors value a firm's investments in R&D when the firm uses its own funds. In addition to asymmetric information and agency costs, transaction cost economics may explain part of the evidence about the types of investments made by managers. Williamson (1988) points out that the choice of project financing depends on the characteristics of the assets. Titman and Wessels (1988) also postulate that "debt level is negatively related to the uniqueness of a firm's line of business." Therefore, firms with higher degrees of firm-specific assets are found to have less debt (Williamson 1988; Balakrishnan and Fox 1993; Vincente-Lorente 2001; O'Brien 2003). Our results address the gap in the literature concerning the relationships between R&D and firm value, R&D and leverage, and main banks and R&D, and they remain robust and consistent across the related tests.
53.3 Data and Sample

53.3.1 Data Source and Sample Description

To construct the sample for this research, two different databanks are employed: (a) the COMPUSTAT Global and Emerging Markets database and (b) the PACAP (Pacific-Basin Capital Markets) database. The financial data for Japanese listed companies are obtained from the PACAP database. Starting with an original sample of 3,499 non-financial firms listed on the Tokyo Stock Exchange and eliminating firms whose reported data are not credible,2 an effective sample
1 The theoretical Tobin's Q is the ratio of the market value to the replacement cost of the assets. Because some data are not available, it has often been estimated by a simple measure of Q (the "pseudo Q"): market value to book value of total assets.
2 For example, some firms are characterized by negative debt or negative sales without proper explanations in footnotes.
of 40,575 firm-year observations is derived from an original sample of 41,470 observations.3 Additional data, such as firm R&D expenditure, are taken from Toyo Keizai's Tokei Geppo (Statistics Monthly). The sample is next classified into eight industry types based on the industry classifications proposed by Chan et al. (2001). Specifically, our sample comprises transportation equipment (SIC = 37), communications (SIC = 48), electronic equipment (SIC = 36), measuring instruments (SIC = 38), computers and office equipment (SIC = 357), drugs and pharmaceuticals (SIC = 283), and computer programming and software (SIC = 737), plus a residual category of all other industries, for eight industry types in total.
Panel A of Table 53.1 reports the summary statistics of the full sample, while Panels B, C, and D present the statistics for main bank clients, unaffiliated firms, and selected industries, respectively. Note that, to avoid biases resulting from extreme values in the descriptive statistics, we drop outliers whose Tobin's Q lies below the 1st or above the 99th percentile.4 Panel A reveals that the mean (median) of RD_TA in Japan is 2.0% (1.2%), which is low compared with American firms, whose averages run from 10.2 to 16.0% (Ho et al. 2006; Cui and Mak 2002). The median RD_TA of 1.2%, smaller than the mean of 2.0%, implies that the distribution of RD_TA is dominated by firms with higher RD_TA. Figure 53.1a shows that average RD_TA rose from 1.915% in 1994 to 2.15% in 1998, dropped to 1.85% in 1999, and then increased to 2.003% in 2004, an increase of only 0.105 percentage points over these 10 years. This trend reveals no significant change in R&D investment in Japan during our sample period. Figure 53.1a also shows that main bank clients do not always have significantly higher or lower R&D expenditure than their unaffiliated counterparts. Besides, main bank clients manifest a trend in R&D expenditure similar to the full sample, with the highest peak during 1997–1998. This peak may have been caused by the shrinkage of total assets during the Asian financial crisis.
The leverage ratio (LEV) is 0.591, which is extremely high compared to American firms, which average only 0.158–0.25 (Ho et al. 2006; Johnson 1997). This figure implies that Japanese firms use higher financial leverage in their capital structure, which may be related to their main bank system. Furthermore, among these leverage components, the long-
3 Financial firms were excluded from the overall sample because they exhibit balance sheet items different from those of non-financial firms.
4 Our extreme value definition differs from Ho et al. (2006), who deleted values below the first quartile less 1.5 times the interquartile range or above the third quartile plus 1.5 times the interquartile range. We need to delete outliers because some firms show Tobin's Q values so extreme as to defy interpretation, possibly a result of the collapse of bubble Internet stocks around the Asian financial crisis.
term debt ratio is only 11.6%, which is lower than that of U.S. firms, which average 18% (Aivazian et al. 2005), revealing another unhealthy aspect of the debt structure, since most of the leverage consists of short-term debt. Figure 53.1b shows the time trend of average leverage: it falls from 61.4% in 1994 to 53.8% in 2004. Figure 53.1b also shows that main bank clients had significantly higher leverage (an average of 0.652) than independent firms (an average of 0.576) during 1994–2004.
The average Tobin's Q is 8.012 (with a median of 1.466), which is high compared to U.S. listed firms, which average 2.89 (Cui and Mak 2002). This high Tobin's Q may be attributable to the Asian financial crisis of 1997–1998, when the economy was bubbling, and to the dramatic volatility of high-tech stocks in 1999. Figure 53.1c displays the trend of Tobin's Q over 1994–2004: the average rose from 10.0 in 1994 to 66.0 in 2004, more than six times the 1994 level.5 It also shows that main bank clients had a significantly lower Tobin's Q (5.137) vis-à-vis independent firms (8.708) over 1994–2004, indicating that main bank clients do not perform better.
The Keiretsu-style main bank ratio (MB_D) is 19.5%, meaning that only one fifth of listed firms have an affiliated relationship with Keiretsu-style banking. The average log size (LOG_TA) is 4.3, which is small in comparison to American firms, which average 17.89 (Cui and Mak 2002), implying that a size effect may exist. The average (median) capital expenditure rate (CAPEX_TA) is −2.35% (0.00%); the negative mean together with the median of zero reveals that half of the Japanese firms had no capital expenditures during the sample period. The net cash flow ratio averages −0.724% with a negative skewness, implying a high likelihood of negative cash flows among the listed firms. Earnings volatility (EBIT_VOLA) is 5.316%, similar to the 4.79% of American firms (Johnson 1998). The dividend payout ratio (DIV_TA) on average (median) is only 0.5% (0.4%), which is low compared to U.S. firms (Johnson 1998).
Panels B and C compare the characteristics of Keiretsu-style main bank clients and unaffiliated firms. Not surprisingly, main bank clients show higher leverage (0.65 vs. 0.58), larger size (4.47 vs. 4.25), and lower earnings volatility (3.21 vs. 5.84) than unaffiliated firms. However, they show a lower mean capital expenditure ratio (−3.89 vs. −1.97) and cash flow ratio (−1.82 vs. −0.46) in comparison with independent firms. This implies that main bank clients are larger and have higher leverage and lower earnings volatility, but pay lower dividends and have lower cash flow ratios and capital expenditures.
5 The time trend utilizes the original sample without deleting outliers. The summary statistics in Table 53.1 use the sample after deleting observations whose Tobin's Q lies below the 1st or above the 99th percentile of the whole sample.
Surprisingly, main bank clients show lower R&D expenditure at the mean (0.019 vs. 0.020) but higher R&D expenditure at the median (0.012 vs. 0.011), suggesting no significant difference in R&D investment. Furthermore, main bank clients show a lower Tobin's Q at the mean (5.14 vs. 8.71) but a higher Tobin's Q at the median (1.73 vs. 1.36), suggesting that main bank clients perform poorly and that their sample is dominated by lower Q firms.
Panel D reports selected statistics for the different industries. The drug and pharmaceutical industry shows the highest R&D expenditure, 6%, associated with the lowest leverage, 0.38, followed by the computers and office equipment industry, with R&D expenditure of 4% and leverage of 0.51, and the measuring instruments industry, with R&D inputs of 3.6% and leverage of 0.54. The communications industry shows the highest Tobin's Q, 58.72, followed by computer programming and software (26.79) and computers and office equipment (10.60). These extremely high Tobin's Q values reflect the high-tech stock bubble during the sample period and underscore the reason for employing quantile regression in this work. The transportation equipment industry shows the lowest Tobin's Q, 3.20, with the second lowest R&D inputs, 2.4%, and the highest leverage, 0.61, reflecting the typical traditional-industry profile: low R&D inputs, high leverage, and low Q.
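The construction of Table 53.1 amounts to percentile trimming followed by groupwise descriptive statistics. A short pandas sketch follows; the column names are assumptions about how the merged panel is organized, not the authors' actual code.

```python
import pandas as pd

def table_531_stats(df: pd.DataFrame) -> pd.DataFrame:
    """Descriptive statistics in the style of Table 53.1, after dropping
    observations with Tobin's Q below the 1st or above the 99th percentile."""
    lo, hi = df["tobin_q"].quantile([0.01, 0.99])
    trimmed = df[df["tobin_q"].between(lo, hi)]
    cols = ["tobin_q", "rd_ta", "lev", "mb_d", "log_ta",
            "capex_ta", "cf_ta", "ebit_vola", "div_ta"]
    return pd.DataFrame({
        "Obs": trimmed[cols].count(),
        "Mean": trimmed[cols].mean(),
        "25%": trimmed[cols].quantile(0.25),
        "Median": trimmed[cols].median(),
        "75%": trimmed[cols].quantile(0.75),
        "Skewness": trimmed[cols].skew(),
        # pandas reports excess kurtosis; add 3 to compare with raw kurtosis
        "Kurtosis": trimmed[cols].kurt() + 3,
    }).round(3)

# Usage: Panel B, for example, would be table_531_stats(panel[panel["mb_d"] == 1])
```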
53.3.2 Keiretsu and Main Bank Sample

The Keiretsu data come from Industrial Groupings in Japan. This handbook provides, for each company, the specific Keiretsu it belongs to and the strength of its relationship with the
Table 53.1 Descriptive statistics and industry distribution

Panel A: Full sample statistics

Variables    Obs     Mean     25%      Median   75%      Skewness   Kurtosis
TobinQ       35990   8.012    0.768    1.466    3.546    12.420     171.485
RD_TA        16694   0.020    0.004    0.012    0.026    2.707      14.896
LEV          36725   0.591    0.429    0.602    0.758    5.774      220.649
MB_D         36725   0.195    0.000    0.000    0.000    1.564      3.447
LOG_TA       36725   4.293    3.935    4.342    4.752    0.657      3.945
CAPEX_TA     33700   -2.335   -0.008   0.000    0.015    19.437     495.726
CF_TA        36080   -0.724   0.000    0.026    0.052    94.142     11695.460
EBIT_VOLA    25055   5.316    0.278    0.735    1.952    104.394    11331.750
DIV_TA       36725   0.005    0.000    0.004    0.007    8.201      186.577

Panel B: Main bank clients statistics

Variables    Obs     Mean     25%      Median   75%      Skewness   Kurtosis
TobinQ       7018    5.137    1.100    1.726    3.114    18.995     403.701
RD_TA        4132    0.019    0.005    0.012    0.025    1.979      8.346
LEV          7049    0.652    0.521    0.668    0.800    4.004      135.686
MB_D         7049    1.000    1.000    1.000    1.000
LOG_TA       7049    4.467    4.064    4.577    5.023    0.743      3.165
CAPEX_TA     6386    -3.891   -0.011   0.000    0.009    12.425     172.491
CF_TA        6964    -1.817   0.000    0.019    0.041    35.304     1982.061
EBIT_VOLA    5000    3.219    0.305    0.913    2.296    12.173     221.677
DIV_TA       7049    0.004    0.000    0.004    0.007    5.415      94.367

Panel C: Unaffiliated firms statistics

Variables    Obs     Mean     25%      Median   75%      Skewness   Kurtosis
TobinQ       28972   8.708    0.700    1.362    3.674    11.629     150.366
RD_TA        12562   0.020    0.004    0.011    0.027    2.776      15.159
LEV          29676   0.576    0.410    0.583    0.745    6.170      234.627
MB_D         29676   0.000    0.000    0.000    0.000
LOG_TA       29676   4.252    3.913    4.301    4.680    0.711      4.321
CAPEX_TA     27314   -1.971   -0.008   0.000    0.018    22.455     667.718
CF_TA        29116   -0.462   0.001    0.027    0.055    101.212    12523.120
EBIT_VOLA    20055   5.838    0.272    0.693    1.869    93.482     9081.347
DIV_TA       29676   0.005    0.000    0.004    0.008    8.173      179.844

Panel D: Selected industries statistics

Industries                                       SIC   Obs     TobinQ   RD_TA   LEV
IND_D1 (Transportation equipment)                37    1146    3.203    0.024   0.609
IND_D2 (Communications)                          48    150     58.720   0.007   0.532
IND_D3 (Electrical equipment excl. computers)    36    2336    9.280    0.032   0.558
IND_D4 (Measuring instruments)                   38    754     7.442    0.036   0.537
IND_D5 (Computers and office equipment)          357   335     10.599   0.039   0.513
IND_D6 (Drugs and pharmaceuticals)               283   498     8.358    0.060   0.385
IND_D7 (Computer programming and software)       737   1199    26.789   0.022   0.419
IND_D8 (Others)                                        29572   7.058    0.014   0.607

Notes: For the sample period from 1994 to 2004, Panel A presents the basic statistics for the full sample; Panels B, C, and D present the statistics for main bank clients, unaffiliated firms, and selected industries, respectively. To avoid bias from extreme values, we drop outliers whose Tobin's Q lies below the 1st or above the 99th percentile. TOBINQ is the sum of the market value of the firm's equity and the book value of total liabilities, divided by total assets; RD_TA is R&D expenditure divided by total assets; LEV is total debt divided by total assets; MB_D is the Keiretsu-style main bank relationship dummy (1 or 0); LOG_TA is the log of the firm's total assets; CAPEX_TA is capital expenditure divided by total assets; CF_TA is the ratio of net cash flow to total assets; EBIT_VOLA, the volatility of earnings before interest and taxes, is the standard deviation of first differences of EBIT divided by total assets; DIV_TA is the cash dividend payout ratio. IND_D1–IND_D8 are industry dummies: IND_D1 transportation equipment, IND_D2 communications, IND_D3 electrical equipment excluding computers, IND_D4 measuring instruments, IND_D5 computers and office equipment, IND_D6 drugs and pharmaceuticals, IND_D7 computer programming, software, and services, and IND_D8 all other industries.
Keiretsu. There are eight commonly recognized horizontal keiretsu: Mitsubishi, Mitsui, Sumitomo, Fuyo, DKB, Sanwa, Tokai, and IBJ. In our sample these eight groups have 269, 209, 247, 222, 207, 191, 60, and 47 affiliated firms, respectively, of which only 127, 85, 109, 113, 91, 94, 37, and 26 are publicly listed. Hence, our data show that only 19.5% of Japanese listed firms have strong ties with one of the eight keiretsu groups; this ratio is lower than the 89 out of 200 reported by Hoshi et al. (1991). We operationally define a firm's main bank as the top bank on its trading-bank list in Kaisya Shikiho (the Japan Company Handbook). Thus, a firm's main bank is its largest lender as well as the top shareholder among the banks holding shares in the firm. These data are collected from Toyo Keizai's Kigyou Keiretsu Soran (the Japanese Keiretsu Handbook). Industrial Groupings in Japan provides information about the amount of loans provided by each bank to Keiretsu members. Finally, data on lenders and their names are obtained from the Japan Company Handbook, which discloses the major sources of funding for each Japanese listed company. The names of principal banks were obtained from the reference list.
53.3.3 Model Specification

Our empirical quantile model is specified as Equation (53.1):

TobinQ_{it} = β_0 + β_1 RD_TA_{it} + β_2 LEV_{it} + β_3 MB_D_{it} + β_4 LOG_TA_{it} + β_5 CAPEX_TA_{it} + β_6 CF_TA_{it} + β_7 EBIT_VOLA_{it} + β_8 DIV_TA_{it} + β_9 LEV_{it} × RD_TA_{it} + β_10 MB_D_{it} × RD_TA_{it} + Σ_{k=1}^{7} β_{10+k} IND_D_{k,it} + ε_{it}   (53.1)

We measure firm value by Tobin's Q, as in prior studies (Lang et al. 1989; Doukas 1995; Szewczyk et al. 1996; Claessens et al. 2002). Tobin's Q is measured by the market-to-book ratio, where market value is defined as the sum of the market value of common stock and the book value of debt and preferred stock. Following the approach of Chan et al. (1990), the R&D expenditure ratio (RD_TA) is measured by R&D expenditure divided by total assets. R&D expenditure has been consistently found to be significantly positively related to a firm's Tobin's Q in at least two studies (Lang and Stulz 1994; Chen and Steiner 2000).
[Figure 53.1 appears here. Panel (a) plots the R&D expenditure ratio (RD_TA) of Japanese firms by year over 1994–2004; panel (b) plots the leverage ratio; panel (c) plots Tobin's Q. Each panel shows separate yearly series for the full sample, main bank clients, and unaffiliated firms.]

Fig. 53.1 (a) The R&D expenditure ratio (RD_TA) of Japanese firms, 1994–2004. (b) The leverage ratio of Japanese firms, 1994–2004. (c) The Tobin's Q of Japanese firms, 1994–2004
The main bank dummy (MB_D) is used to measure the impact of bank monitoring on performance (Jensen and Meckling 1976) and of reduced information asymmetry and enhanced information signaling (Leland and Pyle 1977). If firms with Keiretsu-style main bank relationships are expected to have fewer problems with asymmetric information, or to enjoy more efficient monitoring, then lower costs and higher performance may appear. MB_D is given a one if the firm is affiliated with any of the Keiretsu main bank groups and a zero otherwise. Firm size (LOG_TA) is included because large firms may be more successful in developing new technology (Chauvin and Hirschey 1993), resulting in different Q scores. The capital expenditure ratio (CAPEX_TA) is the ratio of capital investment expenditure to total assets. Houston and James (1996) document that market value (Tobin's Q) relies heavily on future growth opportunities, which are affected by capital expenditure. We also control for the free cash flow ratio (CF_TA) to capture cross-sectional differences among firms, since R&D investment by low-free-cash-flow firms may increase the probability of their seeking external financing. If main bank clients are expected to be less sensitive to cash flow problems, then their R&D expenditure should face less adverse credit rationing from their main banks as well. The leverage ratio (LEV) and the dividend payout ratio (DIV_TA) are included as alternative measures of free cash flow (Jensen 1986) and investment opportunity (Smith and Watts 1992), respectively. Gugler (2003) finds an inverse relationship between payout ratios and R&D investment. Although a substantial extant literature demonstrates a negative relationship between leverage and Tobin's Q, the impact of leverage on firm value arising from R&D investment is ambiguous in our research. Bradley et al. (1984) and Long and Malitz (1985) find that the use of leverage decreases with R&D expenditure. However, these authors do not provide evidence on the impact of leverage on firm value when R&D investment is high, or on the impact of R&D expenditure on firm value when leverage is high (or low). To clarify this point, the interaction of leverage and R&D expenditure (LEV*RD_TA) is included. Moreover, the interaction of the main bank dummy and R&D expenditure (MB*RD_TA) is also included to further elucidate our findings and supplement the extant literature. Earnings volatility (EBIT_VOLA) is included in the model as a proxy controlling for observable credit risk (Johnson 1997). Finally, seven industry dummies (IND_D1–IND_D7) are included as controls.
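As a concrete illustration of how Equation (53.1) can be estimated at several quantiles, the following minimal sketch uses the QuantReg estimator from Python's statsmodels. The DataFrame df and its column names are hypothetical stand-ins for the panel described above, not the authors' actual code or data.

```python
# A minimal sketch of estimating Equation (53.1) at several target quantiles.
# `df` is an assumed firm-year panel whose column names mirror the variable
# definitions in Table 53.1; nothing here is the authors' implementation.
import pandas as pd
import statsmodels.api as sm

def fit_quantiles(df: pd.DataFrame, quantiles=(0.05, 0.25, 0.50, 0.75, 0.95)):
    df = df.copy()
    df["LEV_RD"] = df["LEV"] * df["RD_TA"]    # leverage x R&D interaction
    df["MB_RD"] = df["MB_D"] * df["RD_TA"]    # main bank x R&D interaction
    regressors = ["RD_TA", "LEV", "MB_D", "LOG_TA", "CAPEX_TA", "CF_TA",
                  "EBIT_VOLA", "DIV_TA", "LEV_RD", "MB_RD"] + \
                 [f"IND_D{k}" for k in range(1, 8)]
    X = sm.add_constant(df[regressors])
    y = df["TOBINQ"]
    # One quantile-regression fit per target quantile, plus OLS for comparison.
    fits = {q: sm.QuantReg(y, X).fit(q=q) for q in quantiles}
    fits["ols"] = sm.OLS(y, X).fit()
    return pd.DataFrame({name: res.params for name, res in fits.items()})
```

Reading the returned coefficient matrix column by column reproduces the kind of quantile-by-quantile comparison reported in Table 53.2.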
53.4 Empirical Results and Analysis

53.4.1 Quantile Regression and Bootstrapping Analysis

Quantile regression was originally proposed by Koenker and Bassett (1978). The general form of the equation is y_i = β_q′ x_i + u_{qi} (Koenker and Hallock 2001), which implies that the coefficients differ by quantile. Mean-based procedures such as ordinary least squares are more sensitive to outliers than median-based quantile estimators. The quantile estimator places less weight on outliers than the OLS estimator does; therefore, the bias should be smaller using a median-based quantile estimator.
The target for quantile regression estimates is a parameter specified before estimation. Let e_{it} denote the residuals and q the target quantile of the residual distribution. The quantile parameter estimates are the coefficients that minimize the following objective function, given in Equation (53.2):

Σ_{e_{it} > 0} 2q |e_{it}| + Σ_{e_{it} ≤ 0} 2(1 − q) |e_{it}|   (53.2)

If q = 0.5, equal weighting is given to positive and negative residuals, and the parameters are estimated to minimize the sum of absolute errors. Depending on the quantile, varying weights are assigned to the residuals to lessen the impact of outliers. This differs from ordinary least squares, where the only constraint on the residuals is that their sum equals zero. The methodology is therefore often employed to investigate whether relationships differ, if at all, across quantiles. More specifically, we estimate our model's coefficients at the 5, 25, 50, 75, and 95% quantiles. The results are presented in the next section.
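The asymmetric weighting in Equation (53.2) is the familiar quantile "check" loss. A small sketch makes the weighting explicit (the constant factor of 2 is dropped, which does not affect the minimizer); the residual values are purely illustrative.

```python
# The asymmetric absolute-error objective of Equation (53.2), up to a
# constant factor of 2. Positive residuals get weight q, negative residuals
# get weight (1 - q); q = 0.5 reduces to (half) the sum of absolute errors.
import numpy as np

def quantile_loss(residuals, q: float) -> float:
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(np.where(r > 0, q * np.abs(r), (1 - q) * np.abs(r))))

print(quantile_loss([2.0, -1.0], 0.50))   # 1.50: symmetric weighting
print(quantile_loss([2.0, -1.0], 0.95))   # 1.95: heavily penalizes underprediction
```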
53.4.2 Analysis of Results of Quantile Regression and OLS

Our quantile regression estimates are based on 20 replications of a bootstrapping algorithm at target quantiles of q = .05, .25, .50, .75, and .95. Panels A and B of Table 53.2 report the results for the original sample with outliers and for the sample without outliers, respectively.6 Both panels present the results of OLS and quantile regressions. Although there are no significant differences between the quantile regressions with and without outliers, the OLS results differ significantly: some coefficients in the OLS regression turn out to be significant after deleting the outliers, while they are insignificant when the outliers are included. Furthermore, the R-squared improves substantially after deleting the outliers in both the OLS and the quantile regressions. Finally, the values of the coefficients in both the OLS and the quantile regressions become more clearly interpretable after deleting the outliers.
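A minimal sketch of the bootstrap scheme just described, assuming X and y are NumPy arrays holding the design matrix and the Tobin's Q vector; the function name and resampling details are illustrative assumptions, not the authors' implementation.

```python
# Bootstrapped standard errors for a quantile-regression fit, mirroring the
# 20-replication scheme described in the text; `X` and `y` are assumed arrays.
import numpy as np
import statsmodels.api as sm

def bootstrap_se(y, X, q=0.5, reps=20, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(reps):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        draws.append(sm.QuantReg(y[idx], X[idx]).fit(q=q).params)
    return np.std(np.asarray(draws), axis=0, ddof=1)
```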
53.4.2.1 The Original Sample

We discuss the results for the original sample first. Although OLS Models (1) and (2) in Panel A show that neither R&D investment (RD_TA), nor leverage (LEV), nor main bank relationships (MB_D) have significant impacts on a firm's Tobin's Q,

6 Panel A uses the original sample, while Panel B uses the sample in which observations with Tobin's Q below the 1st percentile or above the 99th percentile have been deleted.
Table 53.2 Results of OLS and quantile regression

[Table 53.2 appears here in two panels. Panel A (the original sample) and Panel B (the sample after deleting outliers with Tobin's Q below the 1st or above the 99th percentile) each report OLS estimates, Models (1)–(2), and quantile regression estimates, Models (3)–(7), at the target quantiles q = .05, .25, .50, .75, and .95. The rows are RD_TA, LEV, MB_D, LOG_TA, CAPEX_TA, CF_TA, EBIT_VOLA, DIV_TA, LEV*RDTA, MB*RDTA, the industry dummies IND_D1–IND_D7, and the constant, with standard errors in parentheses. Each model uses 13,674 observations in Panel A and 13,401 in Panel B. The R-squared/pseudo R-squared values lie below 0.03 in Panel A and range from roughly 0.01 to 0.08 in Panel B.]

This table presents the OLS and quantile regression results from 1994 to 2004. The dependent variable is Tobin's Q, measured as the sum of the market value of the firm's equity and the book value of total liabilities, divided by total assets; RD_TA is R&D expenditure divided by total assets; LEV represents leverage, measured by total debt to total assets; MB_D is the main bank dummy, which is given a 1 or 0; LOG_TA is size, measured by the log of the firm's total assets; CAPEX_TA is capital expenditure divided by total assets; CF_TA is the ratio of net cash flow to total assets; EBIT_VOLA is the volatility of earnings before interest and taxes, measured by the standard deviation of first differences of earnings before interest and taxes divided by total assets; DIV_TA is the cash dividend payout ratio; LEV*RDTA is the interaction of LEV and RD_TA; MB*RDTA is the interaction of MB_D and RD_TA; IND_D1–IND_D7 are industry dummies. Standard errors in parentheses; ***p < 0.01, **p < 0.05, *p < 0.1
they all show significant quantile effects on Tobin's Q in Models (3)–(7). We find that for firms in higher quantiles (q = .50–.95), RD_TA is monotonically increasing with Tobin's Q (with coefficients of 20.53, 110.3, and 579.4 at q = .50, .75, and .95, respectively). However, for lower-quantile firms (q = .05–.25), R&D investment is monotonically decreasing with Tobin's Q (with negative coefficients of −.000003 and −.712 at q = .05 and .25, respectively). This implies that R&D expenditure adds value for high-quantile firms while destroying value for low-quantile firms. Leverage (LEV) exhibits the opposite pattern across the various quantiles. LEV is monotonically decreasing with firm value for high-q firms (with coefficients of −2.507, −5.874, and −12.30 at q = .50, .75, and .95, respectively); however, it is increasing with firm value for low-q (q = .05–.25) firms. This result may be explained by optimal capital structure theory: leverage may increase value for low-Q firms and destroy value for high-Q firms. The interaction effect of R&D investment and leverage (LEV*RDTA) is significantly negative for high-quantile firms and positive for low-quantile firms. This finding demonstrates that leverage and R&D substitute for each other in high-Q firms but complement each other in low-Q firms. The main bank dummy (MB_D) significantly adds value for lower-quantile firms (q = .05–.25), positively but insignificantly influences firm value at q = .50, and then turns to destroying value for higher-quantile firms (q = .75). Hence, main banks help reduce asymmetric information and provide efficient monitoring only for firms in relatively lower quantiles. The interaction effect of R&D investment and the main bank dummy (MB*RDTA) is increasing with firm value for firms in the middle range of quantiles (q = .25–.50), implying that main bank relationships may add value to increasing R&D investment for firms with median Q. However, at extremely low or high quantiles (q = .05 or q = .95), the main bank effect on firm value is destroyed by increasing R&D investment. This finding sounds reasonable: main bank clients face a potential information leakage cost after reporting their financial statements to their largest banks. For firms with an extremely high Q, this cost of R&D information leakage is expected to be high and may even outweigh the benefit of the main bank in reducing information asymmetry. Although LOG_TA shows a significantly positive impact on Tobin's Q for lower-quantile firms (q = .05–.50), the impact is significantly negative for higher-quantile firms (q = .75–.95). This demonstrates that the size effect is nonlinear rather than linear: it is positive for low-Q firms and negative for high-Q firms. This finding differs from most of the extant literature, which documents a negative relationship using traditional OLS regression. Capital expenditure (CAPEX_TA) positively influences Tobin's Q
in higher quantiles and has no significant effect in lower quantiles. A higher dividend payout (DIV_TA) adds value for high-quantile firms (q = .50–.95) but destroys value for lower-quantile firms (q = .25). The coefficients on the industry dummies (IND_D1–IND_D7) in Models (3)–(7) show that the industry dummies do have quantile effects on Tobin's Q. Although the coefficients on the industry dummies in OLS Model (2) are not highly significant, the coefficients in quantile Models (3)–(7) turn from negative to positive as firms' quantiles increase. This implies that the impact of industry on firm value depends on the firm's quantile membership within its industry rather than on the industry to which the firm belongs. We find that as the quantile of Q increases, the industry impact tends to affect Tobin's Q positively regardless of industry type. This finding differs from Acs and Isberg (1996), who demonstrate that the impact of R&D on growth opportunity depends on industry type. We formally document that the industry quantile effect is larger than the industry effect itself: when a firm attains high-quantile membership, most of the industry impacts gradually turn out to be significantly positive no matter what the industry is. Finally, Models (3)–(7) of Panel A show that the true intercepts are higher at higher quantiles, demonstrating that errors are positive on average at q = 0.5, 0.75, and 0.95. Figure 53.2 exhibits the quantile effects of all variables on Tobin's Q. It shows that most of the relationships with Q are nonlinear. For example, RD_TA shows a convex relation with Q, LEV is concavely related to Q, and MB_D is nonlinearly related to Q without a specific pattern.
53.4.2.2 Sample Without Outliers

We now discuss the Panel B sample after deleting the outliers. Interestingly, the Model (2) coefficients on leverage, size, dividend payout, and the interaction of the main bank dummy with R&D expenditure turn out to be significantly negatively related to Tobin's Q after deleting the outliers (with coefficients of −10.53, −3.95, −185.4, and −114.6, respectively), while showing insignificant results when the outliers are included (with coefficients of 66.4, 22.6, 9151, and 2656, respectively). The main bank dummy turns out to have a significantly positive impact on Tobin's Q (with a coefficient of 2.64) at the 0.05 significance level. These findings document that highly levered, large, and high-payout firms tend to have lower Tobin's Q. Moreover, main bank effects on Tobin's Q are expected to be worse when a firm's R&D expenditure is high. Although the impact of the outliers is negligible across the various quantiles, it is significant in the OLS regression. This finding suggests that OLS may lead to misleading conclusions when outliers are included, while quantile regression is largely impervious to the inclusion or exclusion of outliers.
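The sensitivity of OLS, and the robustness of the median regression, to a handful of extreme observations can be illustrated with a toy simulation; all numbers below are invented for illustration and are unrelated to the chapter's sample.

```python
# A toy simulation of the point made above: a few extreme values move the
# OLS slope substantially but barely move the median (q = 0.5) regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
x = rng.normal(size=500)
y = 1.0 + 2.0 * x + rng.normal(size=500)
y_out = y.copy()
y_out[:5] += 80.0                      # inject five extreme "bubble" outliers

X = sm.add_constant(x)
for label, target in [("clean", y), ("with outliers", y_out)]:
    ols_slope = sm.OLS(target, X).fit().params[1]
    lad_slope = sm.QuantReg(target, X).fit(q=0.5).params[1]
    print(f"{label}: OLS slope={ols_slope:.2f}, median slope={lad_slope:.2f}")
```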
[Figure 53.2 appears here: for each regressor in Equation (53.1) (RD_TA, leverage, log_TA, CAPEX/TA, CF/TA, EBIT volatility, DIV/TA, the leverage and main bank interactions with R&D, the main bank dummy, and the industry dummies IND_D1–IND_D7), the estimated coefficient is plotted against the quantile of Tobin's Q from 0 to 1.]

Fig. 53.2 The quantile effects of all variables on firms' Tobin's Q
53.5 Conclusions and Discussion

The quantile approach has advantages over conventional mean-based approaches in estimating the effects of R&D, leverage, and main bank relationships on Tobin's Q. Targeting quantiles from the middle of the error distribution can reduce outlier bias and improve the description of the sample at the targeted quantiles. Using quantile regression, our results provide explanations for the inconsistent findings derived using conventional OLS regression in the extant literature. We also find that the results of OLS are seriously influenced by outliers. In stark contrast, quantile regression estimates are impervious to the inclusion or exclusion of outliers. Thus, it appears quantile regression is superior to OLS when analyzing data garnered from a volatile period. Taken as a whole, our results also have implications for understanding the effects of Keiretsu-style banking and industry characteristics on borrowers within the economy. First, recent studies find that the impact of R&D investment on firm performance depends on industry characteristics (see,
for example, Ho et al. 2006) and size (Acs and Isberg 1996). We also note from our findings that systematic (or consistent) industry effects do not always exist, and that most of the firms in our study gain from industry effects when in a higher quantile of Q, thus providing policy guidance for governments in their attempts to stimulate R&D. Second, our finding that the presence of main bank relationships does not always lead to higher Q scores suggests a segmented role for Keiretsu-style banking and helps explain why only 19.5% of the sample firms actually participate in this type of banking. Additionally, the impact of the main bank on firm performance depends on the firm's performance itself rather than on its main bank: for firms entering into high-quantile membership, the main bank impact turns out to be positive.

Acknowledgments Hai-Chin Yu is the contact author and is currently serving as a visiting professor at Rutgers University. This paper was presented at the 2008 Quantitative Finance and Risk Management Conference held at National Chiao Tung University in Hsinchu, Taiwan. We are grateful to C. F. Lee, Ken Johnson, Joe Wright, and the participants for their helpful comments.
References

Acs, Z. J. and S. C. Isberg. 1996. "Capital structure, asset specificity and firm size: a transaction cost analysis," in Behavior norms, technological progress, and economic dynamics: studies in Schumpeterian economics, E. Helmstadter and M. Perlman (Eds.). University of Michigan Press, Ann Arbor, MI.
Aoki, M. 1990. "Toward an economic model of the Japanese firm." Journal of Economic Literature 8, 1–27.
Balakrishnan, S. and I. Fox. 1993. "Asset specificity, firm heterogeneity and capital structure." Strategic Management Journal 14(1), 3–16.
Bhagat, S. and I. Welch. 1995. "Corporate research and development investment: international comparisons." Journal of Accounting and Economics 19(2–3), 443–470.
Bradley, M., G. A. Jarrell, and E. H. Kim. 1984. "On the existence of an optimal capital structure: theory and evidence." Journal of Finance 39(3), 857–878.
Chan, S. H., J. Martin, and J. Kensinger. 1990. "Corporate research and development expenditures and share value." Journal of Financial Economics 26(2), 255–276.
Chan, L. K. C., J. Lakonishok, and T. Sougiannis. 2001. "The stock market valuation of research and development expenditures." Journal of Finance 56(6), 2431–2456.
Chauvin, K. W. and M. Hirschey. 1993. "Advertising, R&D expenditures and the market value of the firm." Financial Management 22(4), 128–140.
Daniel, K. and S. Titman. 2006. "Market reactions to tangible and intangible information." Journal of Finance 61(4), 1605–1643.
Doukas, J. 1995. "Overinvestment, Tobin's Q and gains from foreign acquisitions." Journal of Banking and Finance 19(7), 1285–1303.
Doukas, J. and L. Switzer. 1992. "The stock market's valuation of R&D spending and market concentration." Journal of Economics and Business 44(2), 95–114.
Gugler, K. 2003. "Corporate governance, dividend payout policy, and the interrelation between dividends, R&D, and capital investment." Journal of Banking & Finance 27, 1297–1321.
Ho, Y. K., M. Tjahjapranata, and C. M. Yap. 2006. "Size, leverage, concentration, and R&D investment in generating growth opportunities." Journal of Business 79(2), 851–876.
Hoshi, T., A. Kashyap, and D. Scharfstein. 1991. "The investment, liquidity and ownership – the evidence from the Japanese groups." Quarterly Journal of Economics 106, 33–60.
Houston, J. F. and C. James. 1996. "Banking relationships, financial constraints and investment: are bank dependent borrowers more financially constrained?" Journal of Financial Intermediation 5(2), 211–221.
Jensen, M. and W. Meckling. 1976. "Theory of the firm: managerial behavior, agency costs and ownership structure." Journal of Financial Economics 3(4), 305–360.
Jensen, M. 1986. "Agency costs of free cash flow, corporate finance and takeovers." American Economic Review 76, 323–329.
Johnson, S. A. 1997. "An empirical analysis of the determinants of corporate debt ownership structure." Journal of Financial and Quantitative Analysis 32(1), 47–69.
Johnson, S. A. 1998. "The effect of bank debt on optimal capital structure." Financial Management 27(1), 47–56.
Kamien, M. I. and N. L. Schwartz. 1982. Market structure and innovation, Cambridge University Press, New York.
Koenker, R. and G. Bassett. 1978. "Regression quantiles." Econometrica 46, 33–50.
Koenker, R. and K. Hallock. 2001. "Quantile regression." Journal of Economic Perspectives 15, 143–156.
Lang, L., R. Stulz, and R. Walkling. 1989. "Managerial performance, Tobin's Q, and the gains from successful tender offers." Journal of Financial Economics 24, 137–154.
Lang, L., R. Stulz, and R. Walkling. 1991. "A test of the free cash flow hypothesis: the case of bidder returns." Journal of Financial Economics 29, 315–335.
Leland, H. E. and D. H. Pyle. 1977. "Information asymmetries, financial structure, and financial intermediation." Journal of Finance 32(2), 371–387.
Long, M. and L. Malitz. 1985. "The investment-financing nexus: some empirical evidence from UK." Midland Corporate Finance Journal, 234–257.
McGuire, J. and S. Dow. 2002. "The Japanese Keiretsu system: an empirical analysis." Journal of Business Research 55, 33–40.
O'Brien, J. 2003. "The capital structure implications of pursuing a strategy of innovation." Strategic Management Journal 24(5), 415–431.
Sheard, P. 1994. "Interlocking shareholdings and corporate governance," in The Japanese firm: the sources of competitive strength, M. Aoki and M. Dore (Eds.). Oxford University Press, Oxford, 310–349.
Smith, C. W. and R. Watts. 1992. "The investment opportunity set and corporate financing, dividend and compensation policies." Journal of Financial Economics 32(3), 263–292.
Sundaram, A. K., T. A. John, and K. John. 1996. "An empirical analysis of strategic competition and firm values – the case of R&D competition." Journal of Financial Economics 40(3), 459–486.
Szewczyk, S. H., G. P. Tsetsekos, and Z. Zantout. 1996. "The valuation of corporate R&D expenditures: evidence from investment opportunities and free cash flow." Financial Management 25(1), 105–110.
Titman, S. and R. Wessels. 1988. "The determinants of capital structure choice." Journal of Finance 43(1), 1–19.
Vicente-Lorente, J. D. 2001. "Specificity and opacity as resource-based determinants of capital structure: evidence for Spanish manufacturing firms." Strategic Management Journal 22(2), 57–77.
Williamson, O. 1988. "Corporate finance and corporate governance." Journal of Finance 43(3), 567–591.
Woolridge, J. R. 1988. "Competitive decline and corporate restructuring: is a myopic stock market to blame?" Journal of Applied Corporate Finance 1, 26–36.
Zantout, Z. Z. 1997. "A test of the debt-monitoring hypothesis: the case of corporate R&D expenditures." Financial Review 32(1), 21–48.
Zantout, Z. and G. Tsetsekos. 1994. "The wealth effects of announcements of R&D expenditure increase." Journal of Financial Research 17, 205–216.
Chapter 54
On the Feasibility of Laddering

Joshua Ronen and Bharat Sarath
Abstract In this paper we investigate whether laddering is feasible as an equilibrium phenomenon. In initial public offerings (IPOs), laddering is described as the commitment to and the actual purchase of shares in the aftermarket in addition to what the purchasers would have purchased in the absence of laddering, where the motivation to purchase such additional shares is a pre-agreement (tie-in agreement) with the lead underwriter. Under such tie-in agreements, the underwriter allocates more shares to the counterparties of the agreement (ladderers) in return for the commitment to purchase additional, possibly pre-specified, quantities of shares (in excess of what they otherwise would have purchased) so as to boost the price of the stock in aftermarket trading of the IPO. Our model features three distinct sets of traders: insiders who ladder; rational traders who are generally suspicious of price manipulation and willing to invest effort and resources to acquire information about fundamental values; and, finally, momentum traders whose beliefs are conditioned on past prices. We show that laddering is not a sustainable activity unless (1) underwriters act as a cartel, (2) there is a ready supply of momentum traders, and (3) there is a lack of short-sellers or other skeptics in the aftermarket. If there are traders that behave in a way consistent with rational expectations and remove price inflation, laddering cannot be a profitable strategy.

Keywords C70 · D82 · G24 · K22
54.1 Introduction

J. Ronen (corresponding author), New York University, New York, NY, USA; e-mail: [email protected]
B. Sarath, Baruch College, New York, NY, USA

In this paper we address the question of whether laddering is feasible as an equilibrium phenomenon. In initial public offerings (IPOs), laddering is described as the commitment to and the actual purchase of shares in the aftermarket in
addition to what the purchasers would have purchased in the absence of laddering, where the motivation to purchase such additional shares is a pre-agreement (tie-in agreement) with the lead underwriter. Under such tie-in agreements, the underwriter allocates more shares to the counterparties of the agreement (ladderers) in return for the commitment to purchase additional, possibly pre-specified, quantities of shares (in excess of what they otherwise would have purchased) so as to boost the price of the stock in aftermarket trading of the IPO. Ladderers could stand to profit from such tie-in agreements by selling their large allocation of IPO shares, as well as their after-IPO purchases, at inflated prices resulting from the laddering activities. In turn, the underwriter stands to profit by receiving higher-than-normal commissions from the ladderers or by sharing in the profits of the ladderers through other means.1 The goal of our paper is to analyze the profitability of laddering in different market settings. Specifically, we show that laddering is not a sustainable activity unless (1) underwriters act as a cartel, (2) there is a ready supply of momentum traders, and (3) there is a lack of short-sellers or other skeptics in the aftermarket. The main question of interest is whether laddering as defined above can be sustainable in equilibrium, that is, whether investors and underwriters operating in a competitive environment find it in their own self-interest to enter into tie-in agreements with the objective of inflating the price in the aftermarket. Our analysis shows that this typically is not the case. Specifically, we can show that unless underwriters form a cartel that is able to maintain a monopoly consistently, they will compete to eliminate incentives for inducing laddering to inflate prices: competition to gain market share in the IPO market would lead underwriters to increase offer prices and thus gain more IPO commissions. But even if underwriters form a cartel such that ladderers earn abnormal profits, stock prices need not be inflated; if they ever climb above the

1 One possibility is that ladderers in one IPO are promised extra allotments in other IPOs such that both ladderers and underwriters are better off. See generally Aggarwal (2000); Aggarwal and Conroy (2003); Aggarwal et al. (2006); Choi and Pritchard (2004); Hanley and Wilhelm (1995); Loughran and Ritter (2002, 2004); Smith and Sidel (2006); and Zhang (2004).
equilibrium value, sellers, including short-sellers, are likely to act fast to bring the price down to its equilibrium value. In a competitive environment, underpricing would provide normal returns for the risk faced by underwriters in estimating demand, manifested in the cost of aftermarket support (Derrien 2005; Hao 2007). Alternatively, underpricing is the cost paid for inducing truth-telling from investors about market demand (Benveniste and Spindt 1989). And while allocants (investors who are allocated shares at the offering price in the IPO) may make profits by flipping, such profits could be seen as compensation for bearing the risk that the IPO could turn cold. Also, underwriters may "overallocate" to some allocants in return for a promise not to flip and/or to purchase additional shares, for the purpose of ensuring that the issued stock is held over the long term so as to provide a stable source of funding for the issuers. A recent paper providing an economic model of laddering is Qing Hao (2007). Our paper shares some important features with the model constructed by Qing Hao (2007) but differs in motivation and in some very critical assumptions. Both papers examine a similar market-trading model with downward-sloping demand curves; along with Hao (2007), we view this assumption as consistent with investors valuing new issues heterogeneously as well as with empirical evidence of negatively sloping demand (e.g., Table 4 of Wurgler and Zhuravskaya (2002) and the references therein). In such a market, ladderers can boost the stock price by "restricting" the supply of shares available for trading; they do this by holding onto their allocations and by making additional aftermarket purchases. Qing Hao assumes that laddering is feasible economic behavior and examines how the initial underpricing changes with the decision to ladder. In contrast, we seek to examine whether laddering can be optimal economic behavior in the first place. The critical differences between our framework and Qing Hao's concern (1) the determination of the initial IPO price and (2) market conjectures about whether laddering has taken place. In Qing Hao's framework, underwriters act like monopolists and set the initial offer price in such a way as to profit subsequently from laddering activities. This rather implausible assumption, combined with the ability to overallocate shares, makes laddering economically desirable for underwriters and their favored clients (ladderers). But again, even under Hao's assumptions, that is, even if underwriters form a cartel, laddering becomes feasible only in the sense that the profits made through the initial overallocation at low issue prices outweigh the losses made in the aftermarket; it does not mean that prices are inflated for any significant length of time. Depending on the trading intensity of momentum traders and information traders (who include short-sellers), any inflation induced by laddering activities can dissipate quickly.
The interest in Qing Hao's model stems from its ability to make predictions about the strategic level of underpricing as a function of laddering activities. In contrast, our paper assumes that the underwriting industry is competitive and that issuers will agree to leave only the minimum amount of money on the table. Therefore, in our setting, it is not the initial underpricing but the subsequent manipulation of the market that could lead to potential profits for ladderers that is the focus of interest. To summarize, the issue analyzed in Qing Hao (2007) is that if underwriters act as monopolists (or as a monopolistic cartel along with ladderers), it becomes feasible for ladderers to make abnormal profits from their initial (low-price) allotments, but this need not inflate prices beyond equilibrium values; any such inflation could be quickly undone by information-seeking traders. In contrast, our paper analyzes whether the attempted aftermarket price-boosting behavior of ladderers can allow them to cash in on aftermarket purchases and their competitively priced allotments at the time of issue. A second feature that differentiates our model from Qing Hao (2007) concerns market perceptions about laddering. Qing Hao assumes that the market is not aware of laddering and behaves as if the market price (subsequent to tie-in purchases) is a result of market dynamics rather than illegal behavior. We assume that there is some proportion of traders in the market who suspect that laddering might have taken place and assign a positive probability to laddering. These traders would consequently short the stock or, equivalently, revise their fundamental expectations downward, thereby reducing their optimal holding levels, i.e., sell the stock.2 That is, we assume there are two types of traders in the market: momentum traders, who react to the upward movement of prices by revising their beliefs about the stock upwards, and rational information seekers, who react by trying to acquire information about the fundamentals of the stock. The ability to profit from laddering in the aftermarket depends on the existence of traders of each type. In particular, within our framework, traders who suspect that laddering has taken place can put downward pressure on the stock price, which negates the effects of laddering both directly and through the way it affects the behavior of momentum traders, dissipating any inflation in the stock price. Griffin et al. (2007) (hereafter, GHT) provide an in-depth empirical analysis of IPOs conducted on NASDAQ in the years 1997–2002. They show that the clients of lead underwriters are net purchasers of about 9% of the IPO in the

2 Edwards and Hanley (2007) observe that "Contrary to popular belief, we find that short selling occurs in 99.5% of IPOs in our sample on the offer day and the majority of first day short sales occur at the open of trading." They go on to add that "Greater short selling is observed in IPOs with positive changes in the offer price, high initial returns and large trading volume."
aftermarket and that such purchasing behavior extends to cold IPOs.3 They argue that such purchases lead to considerable increases in the closing price at the end of the first day. However, the analysis does not extend to the period when these clients then sell the stocks. While GHT show that early buyers in the aftermarket sell their additional holdings by the end of the first year, they do not investigate whether the process of buying in the aftermarket followed by subsequent selling was actually profitable for the ladderer. In general, we rely on the empirical features discussed in GHT (2007) in constructing our model. We assume that ladderers will buy or hold on the first day of aftermarket trade and sell at some much later point in time. The interval between the initial buying point and the later selling point leaves enough time for other traders to gather information as to whether laddering may have taken place and whether prices are artificially inflated. If these information-seeking traders react quickly, prices will stay inflated for only a very short duration.

Our model considers multiple rounds of trade starting from the IPO. The IPO price is set competitively and, absent laddering, will not lead to any abnormal profits. The first round of trade takes place with ladderers simply holding on to their initial allotments. The next round of trade allows ladderers to increase the price through aftermarket purchases. The final round of trade takes place when ladderers cash out. We show that this laddering process itself cannot be profitable if: (1) some traders investigate whether fundamental values justify the observed price in the time period between when ladderers buy and sell; or (2) non-ladderers react instantaneously to selling pressure by reducing their holdings as well. That is, laddering cannot move the price away from its equilibrium value conditional on available information on market fundamentals (future cash flows and their risk) unless the market is inefficient in a specially defined sense. Specifically, ladderers can cash out at a profit if momentum traders, who revise their expectations about the stock upwards based on price increases, form a sufficiently large proportion of the market. More generally, laddering only works by manipulating expectations rather than through the manipulation of demand and supply. If the only forces at work were the equilibration of demand and supply, laddering would always fail. Only if the process of restricting supply leads to changes in expectations of a sufficiently broad class of traders that do not revert back when the ladderers cash out or others sell or short the stock can laddering be sustained. If there is a group of traders who can spot (or suspect) laddering and start selling out their positions (or shorting the stock), then prices will cease to be inflated except for very short periods and laddering will fail to make any money. So in general, the phenomenon of laddering is sustainable in equilibrium only if there is a large group of traders whose expectations can be manipulated and insignificant trading by those who suspect that laddering is taking place. Needless to say, neither of these restrictions will hold in an efficient market, and consequently, laddering will fail to be a sustainable phenomenon within an efficient market.

54.2 The Model

The model we develop is based on a market where beliefs about fundamental value may diverge from price. Three sets of traders (insiders, outside momentum traders, and outside private information seekers) interact in the aftermarket trading of IPOs. The true value of the security depends on an underlying parameter x, which is unknown. For example, f(·|x) may represent the distribution of cash flows from the security, and the value depends on these cash flows. Traders form inferences about x based on both information signals and prices. Each trader has a demand function Q(P, x), which depends both on their beliefs x and the observed price P. In equilibrium, the aggregate demand of traders should equal the aggregate supply. However, we assume that insiders manipulate the supply by buying or holding shares in a strategic fashion, while traders who gather private information act to increase supply if they feel that the IPO price is inflated (by being laddered or otherwise manipulated). The clearing price then devolves to one where the shares supplied by outsiders equal the demand by outsiders. We begin by outlining the different types of investors, their belief processes, and their demand functions.

Assumption 1 (Investor types). We posit three distinct types of investors who participate both in the initial offering and in aftermarket trading:

1. Insiders who are party to tie-in agreements (see Qing Hao (2007)) and participate in aftermarket trading patterns that boost the price of the stock. For the purposes of our paper, all insiders are assumed to be participants in the laddering scheme.4
2. Outsiders who are momentum traders; these investors revise their beliefs about the fundamental value of the security based on price changes.
3 It is worth noting that the empirical evidence provided in GHT differs in one notable way from the theoretical predictions in Qing Hao (2007): GHT show that net purchasing by clients of lead underwriters is most frequent in cold IPOs, whereas Qing Hao's model predicts that such net purchasing, to the extent that it represents laddering, is most profitable in hot IPOs.
4 This is an oversimplification. In reality, only some insiders may be participants in the scheme, in which case the non-participating insiders, who are not party to tie-in agreements, would join the ranks of the informed investors who act immediately to remove price inflation.
3. Outsiders who have rational expectations; these investors suspect that laddering might be taking place and attempt to gather information on whether price movements are due to laddering.

We assume that there are I insider traders, M outside momentum traders, and N outside rational expectations traders. Each trader has a demand function Q(P, x); we use i as an index for insiders and m, n respectively as the indices for the two groups of outsiders. So we have I + M + N demand functions Q_i(P, x), i = 1, ..., I; Q_m(P, x), m = 1, ..., M; and Q_n(P, x), n = 1, ..., N. For each sequence of prices P_0, ..., P_t, we associate an inferred value of the parameter x(P_0, ..., P_t); we then write Q(P, x(P_0, ..., P_t)) = Q(P | P_t, ..., P_0) for the associated demand function. With this notation, the general properties of the demand functions can be formalized.

Assumption 2 (Properties of the demand functions and inferences). The conditions on the demand function Q are summarized below:

1. The demand functions Q(P, ·) are decreasing in P.
2. The demand functions Q(·, x) are increasing in x. The higher the inferred value, the larger the demand.
3. Q(P | P_t, ..., P_0) is decreasing in P. In other words, the inferences are consistent in that higher current prices lead to lower demand irrespective of the past price sequence.
4. Q(P, x) is convex decreasing in P for fixed x.
5. We assume in addition a "substitutability" between belief revisions and price increases (decreases), as stated in the following equation: for any two prices P, P',

Q(P, x(P)) = Q(P', x(P'))   (54.1)

Properties (1), (2), and (3) in the assumption above follow directly from the definition of a demand function. Property (4) can be derived through some restrictions on the utility function, but we state it as a direct assumption.5 Property (5) states the following: if the price increases (decreases) and beliefs about fundamental value increase (decrease) to a level consistent with the price, then demand does not change. Consider first a price increase. A price increase will result in a reduction of demand. Now suppose that, simultaneously with the price increase, there is also an increase in the beliefs about fundamental value. Then demand will increase. We need an assumption on how these two simultaneous effects offset each other. Our assumption states that if the belief rises to the level consistent with the price increase, then these two factors exactly offset each other.

If markets are commonly believed to be efficient, the last price is sufficient for all previous prices and x(P_0, ..., P_t) = x(P_t). In contrast, some traders may believe that prices will sustain their momentum. In the context of our framework, we model momentum traders as those who revise their beliefs about fundamental value based on momentum; that is, they tend to believe that the fundamental value is likely to be higher than the current price provided that prices have been rising in the past. In addition, these traders use price information observed only up to time t − 1; this formulation describes more accurately a market where observed prices (i.e., dealer quotes) differ from the actual price at which the order will clear. In contrast, rational expectations traders condition on all prices, including the time t price. This describes sophisticated traders who not only observe dealer quotes but also have information on market trends such that they are able to adjust their demand schedules on a rapid basis. As our analysis will show, laddering is only profitable in a market where (1) belief revisions are based on past momentum and (2) there is little or no private information acquisition about whether laddering has taken place or whether prices are inflated to levels unjustified by fundamentals. We next describe the belief process of each of our three groups of investors.

Assumption 3 (Different types of investors and belief revisions). We denote insiders by the subscript i, outsiders who are momentum traders by the subscript m, and outsiders who suspect laddering by the subscript n. We write Q_t^j(P | P_t, ..., P_0) for trader j's demand (j can be i, m, or n) at time t.

1. For insiders, x_i(P_0, ..., P_t) = x(P_0) = x_0. Insiders know the true valuation parameter x_0 and never update their beliefs about it. So their demand functions are given as Q_t^i(P, x_0) = Q_0^i(P, x_0) and are time invariant.
2. Outsiders who are momentum traders have demand functions that depend on price changes. Given a sequence of prices P_0, ..., P_t,

Q_t^m(P, x(P_0, ..., P_t)) = Q_m(P − (P_{t−1} − P_{t−2}), x(P_0))   for t ≥ 2
                          = Q_m(P, x(P_0))                        for t = 1   (54.2)

3. Outsiders who are rational expectations traders have demand functions

Q_t^n(P, x(P_0, ..., P_t)) = Q_n(P, x(P_0))   with probability p
                          = Q_n(P, x(P_t))   with probability (1 − p)

That is, with probability p, these traders discover the true fundamental value, and with probability (1 − p) they adjust their beliefs to those consistent with the observed price and adjust their demands accordingly.

5 Suppose U denotes a trader's utility function and w̃ their random wealth from all other investments besides the IPO. For any sequence of observed prices P_t, ..., P_0, let f(w, x | P_t, ..., P_0) represent the expected joint distribution of w and x (x is the wealth derived from the IPO). Then the demand for the IPO security, Q, is chosen to maximize E[U | Q] = ∫_{x,w} U(w + (x − P)Q) f(w, x | P_t, ..., P_0), that is, ∂E[U | Q]/∂Q = 0. Differentiating this first-order condition totally in P gives the required condition on the utility function to ensure Q(P | P_t, ..., P_0) is convex in P.

We assume that the insiders (including investment bankers) have some private beliefs about the fundamental value parameter, denoted by x_0. This is the parameter that will define the long-term price once the initial trading restrictions and the underwriter's duty to supply price support end. Consequently, they base their demands (absent laddering considerations) on the true underlying value parameter x_0. The momentum traders move their demand curves up and down based on past price momentum. This can be modeled in several different ways. Our choice represents momentum trading as follows: at time t, the momentum trader demands the same amount at a higher price P as they would have demanded at time 0 at the lower price P − ΔP, where ΔP is the momentum P_{t−1} − P_{t−2}. Clearly, this choice is consistent with the notion of momentum trading and is technically convenient because it parsimoniously represents a family of momentum-driven demand curves. The rational expectations traders have two features: (1) they suspect that laddering takes place (or that the price is inflated) and discover it with probability p, and (2) even if they fail to detect laddering (with probability 1 − p), they respond instantaneously to selling pressure by revising their beliefs downwards. The two features together ensure that rational expectations traders cannot be exploited through laddering schemes.
S1 D
Qi .P0 ; x0 /I
i D1
S2 D
M X
Qm .P0 ; x0 / C
mD1
N X
Qn .P0 ; x0 /
(54.4)
nD1
At the time of the IPO (t D 0), shares are overalloted to insiders, that is, the shares allotted to insiders and outsiders are S1 C K0 , S2 K0 . Thus K0 denotes the level of overallotment to insiders. Insiders hold on to these extra shares and do not trade them in the initial aftermarket (t D 1) (see Qing Hao (2007)). This, together with the initial underpricing based on the necessity of price support, result in a price rise in the aftermarket. To summarize, the IPO process modeled here reflects underpricing to compensate underwriters for the cost of price support. In addition, overallotments to chosen clients who agree not to flip the shares minimizes price support costs for the underwriter and also leads to an additional upward pressure in the after market. Let P denote the value of P that will maximize gross proceeds absent the cost (to the underwriter) of ensuring that the IPO succeeds. Then P P0 denotes the underpricing that is inherent in the IPO and r D
P 1 P0
(54.5)
denotes the normal returns to issuing IPO’s.6 The question that we shall address is whether laddering will yield a greater return than r . That is, we assume from the outset that there are some competitive returns to participants of an IPO based on institutional and informational factors and examine whether laddering can create abnormal profits.
54.2.2 Laddering
Absent other strategic considerations, the IPO price will be set to maximize the long term market value P Q.P; x0 /. However, there is a cost to issuing the IPO that is borne by the underwriters which we denote by C.P; x0 /. C.P; x0 / is the cost of providing price support if the IPO price is set at P (in this formulation, we are following Qing Hao (2007) and Derrien (2005)). In consequence, the IPO price P0 maximizes the amount
As documented in GHT (2007), the insiders step in and buy further shares so as to push up the price and convince some traders that the shares are likely to rise further in the future. In our model, this event takes place at time (t D 2). Keeping to the empirical structure documented in GHT (2007), we assume that the ladderers have to hold their shares for
6
P Q.P; x0 / C.P; x0 /:
I X
(54.3)
If C.P; x0 / is an increasing positive function of P , it is easy to show that P > P0 .
848
J. Ronen and B. Sarath
some period of time (perhaps until price support is no longer needed). This is part of the initial tie-in agreement with the underwriter. In the interim period, traders may update their beliefs as to whether momentum has been caused by fundamentals or because of laddering. Traders who decide that laddering is present revise their beliefs downwards and sell (or short) the security (at t D 3). In turn, this affects the beliefs of other traders about whether the price momentum is upwards or downwards. Finally, the insiders dump their shares at t D 4. The time line is summarized below.
At time t D 3, the outsiders trade with each other where some might have acquired inside information about the possibility of laddering. So P3 solves: S2 K0 K1 D
M X
Qm .P3 .P2 P1 /; x.P0 //
mD1
C
N X
Œ.1 p/Qn .P3 ; x.P3 // C pQn .P3 ; x.P0 //
nD1
(54.9)
Time Line tD0 tD1 tD2 tD3
tD4
IPO issued consisting of S shares at a price P0 . K0 shares overallotted to insiders. The outside investors trade among themselves to reach a market equilibrium price P1 . Insiders step in and buy K1 additional shares driving the price up to P2 . Outside investors update beliefs and trade with each other at a price P3 . If there are no traders that suspect laddering, no trade takes place at time t D 3. The price support period ends and Insiders sell their shares at a market clearing price P4 .
At time 1, the insiders do not trade. The momentum traders base their beliefs about fundamental value based on past price P0 whereas rational expectations traders use the current price P1 . As a consequence, the market equilibrium price P1 is given as the price that solves: S2 K0 D
M X
Qm .P1 ; x.P0 // C
mD1
N X
Qn .P1 ; x.P1 //
nD1
(54.6) Note that we are fixing the momentum traders beliefs about fundamental values to be consistent with the IPO price. For simplicity, we shall assume that: x.P0 / D x0
(54.7)
that is, momentum traders start with the correct initial beliefs but subsequently adjust their demand based on momentum. At time t D 2, insiders step in and buy an additional K1 shares. Thus the equilibrium price P2 solves: S2 K0 K1 D
M X
Qm .P2 .P1 P0 /; x.P0 //
mD1
C
N X nD1
Qn .P2 ; x.P2 /
(54.8)
Finally, at time t D 4, the insiders sell all their excess shares and revert to the (unmanipulated) optimal holdings – that is S2 D
M X
Qm .P4 .P3 P2 /; x.P0 //
mD1
C
N X
Œ.1 p/Qn .P4 ; x.P4 // C pQn .P4 ; x.P0 //
nD1
(54.10) The question we address is whether this process could possibly lead to excess profits for the insiders. The excess profits (to all insiders) are determined as follows: K0 .P4 P0 / C K1 .P4 P2 / On a per-share basis, this becomes: K0 .P4 P0 / C K1 .P4 P2 / K1 C K0 K0 K1 D .P4 P0 / C .P4 P2 / K1 C K0 K1 C K0 (54.11) The basic question we shall address is whether the quantity in Equation (54.11) can be positive, for then, every insider gains more than their normal IPO return by becoming a part of the laddering cabal. The excess profit measure in Equation (54.11) is structured to be consistent with our assumptions that the IPO offer price is set competitively and the issue is whether manipulation is possible in the after market. If IPO prices are set competitively, there is no reason to add the returns on the optimal (unmanipulated) holdings to the left-hand-side of Equation (54.11). Note that we are not assuming a normal return on all the “excess” shares allocated and are only assuming a normal return with respect to their unmanipulated holdings. Similarly, we are not examining potential costs that may be
54 On the Feasibility of Laddering
imposed on the ladderers as a result of refusing tie-in agreements (such as an exclusion from future IPO’s conducted by the underwriter). If issuers can be forced to a low initial offer price, ladderers benefit from the initial overallocation; under such a circumstance, they would be willing to enter into laddering schemes even though they may lose money on their aftermarket purchases. But again, this need not give rise to inflated prices; the existence of information seeking traders will bring back prices to their equilibrium value levels. Again, these types of strategic collusion between the underwriter and ladderers lie outside our model. Our focus is to examine whether ladderers, acting on their own or in concert with the underwriter, can make abnormal profits under a maintained assumption that there is no price manipulation at the offer.
849
function Q.P C P2 P1 ; x.P0 // at time t D 4. So the period t D 4 market clearing price (with N D 0) solves: S2 D
M X
Qm .P4 .P2 P1 /; x.P0 //
mD1
H) P4 .P2 P1 / D P0
(54.12)
where the last implication follows from Equation (54.4). Let 0 1 ˛ D K1KCK and 1 ˛ D K1KCK . Then (54.11) transforms to 0 0 showing that: ˛.P4 P0 / C .1 ˛/.P4 P2 / 0
(54.13)
Equivalently, substituting for .P4 P0 / and .P4 P2 / from Equation (54.12), it suffices to show: ˛.P2 P1 / C .1 ˛/.P0 P1 / D ˛P2 C .1 ˛/P0 0 (54.14)
54.3 Results The key theme in our formal analysis is to examine conditions when the excess profit determined in (54.11) has a determinate sign; that is, we examine conditions under which excess profits are positive or have to be negative. If ladderers can successfully manipulate the after market prices, then laddering is a rational equilibrium phenomenon. On the other hand, if the conditions are such that the excess profits have to be negative, then laddering is unsustainable. The main result that we establish (Proposition 3) is that the presence of traders who indulge in private information search can result in laddering becoming unsustainable. The first result that we examine considers the benchmark situation where no traders suspect that laddering is taking place and assume instead that the manipulated prices reflect information about fundamental values. We do not believe this situation is realistic – in a Bayesian-rational market with laddering, there will be some prior beliefs that laddering is taking place. However, we start with this benchmark of no priors on laddering both because it is the situation analyzed in prior literature (Qing Hao 2007) and because it provides a contrast with rational and efficient market behavior. Under this restrictive hypothesis, it turns out that laddering is a positive payoff strategy. Proposition 1 (Momentum traders and the profitability of laddering). Suppose that N D 0, that is, that there are no outsider traders who suspect that laddering is taking place. Then ladderers earn abnormal positive profits by purchasing shares in the aftermarket. Proof. In this case, there is no trade at t D 3 so we only have an observed price sequence P0 ; P1 ; P2 ; P4 , (or alternatively, set P3 D P2 ). Therefore, momentum traders have a demand
Adding ˛ times the demand at t D 2 (Equation (54.8)) to .1 ˛/ times the demand in (54.4), we get: ˛
M X
Qm .P2 .P1 P0 /; x.P0 //
mD1
C .1 ˛/
M X
Qm .P0 ; x.P0 //
mD1
D ˛.S2 .K0 C K1 // C .1 ˛/S2 D S2 K0 D
M X
Qm .P1 ; x.P0 //
(54.15)
mD1
where the last equality follows from Equation (54.6). From Equation (54.15), the fact that P1 > P0 and the convexity assumption on Q, we get the following chain of relations: M X
Qm .P1 ; x.P0 //
mD1
D˛
M X
Qm .P2 .P1 P0 /; x.P0 //
mD1
C .1 ˛/
M X
Qm .P0 ; x.P0 //
mD1
˛
M X
Qm .P2 ; x.P0 // C .1 ˛/
mD1
M X
M X
Qm .P0 ; x.P0 //
mD1
Qm .˛P2 C .1 ˛/P0 ; x.P0 //
mD1
H) P1 ˛P2 C .1 ˛/P0 Rearranging, we obtain Equation (54.14).
(54.16) t u
850
J. Ronen and B. Sarath
Proposition 1 shows that with momentum traders, laddering can be a viable economic strategy. Before clarifying the intuition, we note that the price at which ladderers exit their excess holdings would, in general, lie between the inflated price P2 and the IPO price P0 . For this reason, the sign of the expression in Equation (54.11) is indeterminate. However, momentum traders intensify their demand when the price rises (through laddering). Given the greater demand intensity, the price falls less when ladderers sell than it rose at the time they bought the shares. It is precisely this asymmetry, caused by a manipulation of expectations, that results in abnormal profits to laddering. In contrast, the next result shows that if demand intensity cannot be manipulated, then laddering ceases to be profitable. The next proposition deals with the opposite benchmark to Proposition 1 in that we assume there are no momentum traders and that all traders are rational expectations traders. The beliefs of these traders is such that they assign some probability to the fact that ladderers are present and they condition their demands on the spot price rather than past prices. Obviously, if they discover that laddering has taken place, they reduce their demands driving the price downwards before the ladderers can exit the market resulting in losses. Less obviously, the fact that they respond instantaneously to selling pressure (observed through the spot price) makes them immune to laddering schemes. We summarize this basic economic point in the next lemma where we show that the demand of rational expectations traders is determined by the initial (fair) offer price. Lemma 1 (Rational expectation traders demands). For a rational expectations trader n, the demand at time t, Qtn .Pt jPt ; ; P0 / is either Q0n .Pt ; x.P0 // (if the true value parameter x0 D x.P0 / is discovered) or Q0n .P0 ; x.P0 // D Qtn .Pt ; x.Pt // (if the parameter is not discovered). Proof. The assertion is obvious if trader n discovers that laddering has taken place. In contrast, if laddering is not detected, the inferred parameter x.Pt / is such that the demand at Pt with this belief x.Pt / is the same as the demand at P0 with belief x0 (see Assumption 2 Equation 54.1). The lemma follows. As might be expected based on Lemma 1, if all traders are rational expectations traders, laddering schemes fail. Proposition 2 (Rational expectations and the unprofitability of laddering). Suppose that M D 0, that is, all outside traders are rational expectations traders. Then Laddering is unprofitable. Proof. We follow the same notation as Proposition 1 and will show that the following holds: ˛.P4 P0 / C .1 ˛/.P4 P2 / < 0
0 1 and 1 ˛ D K1KCK ). When M D 0, the (where ˛ D K1KCK 0 0 t D 0 and t D 4 equilibrium conditions become:
S2 D
N X
Qn .P0 ; x0 /
nD1
S2 D
N X
Œ.1 p/Qn .P4 ; x.P4 // C pQn .P4 ; x.P0 //
nD1
(54.17) It follows that P4 P0 ; for if P4 > P0 , x.P4 / > x.P0 / and: Qn .P4 ; x.P4 // D Qn .P0 ; x.P0 //I Qn .P4 ; x.P0 // < Qn .P0 ; x.P0 // where the first equality follows from Equation (54.1); this contradicts Equation (54.17). In fact, the same argument shows that P4 P0 , that is, P4 D P0 . As P2 > P1 > P0 D P4 (ladderers buy shares at time t D 2), we conclude t u that ˛.P4 P0 / C .1 ˛/.P4 P2 / < 0 As we had observed earlier, the key issue is whether the price falls by the same level when ladderers sell as it rises when they buy. In the absence of momentum traders, this is precisely what happens, that is, when the ladderers sell their excess holdings, the price reverts to the IPO level. Hence, Laddering is not a profitable equilibrium strategy. Our last result combines the two benchmark cases and examines what happens when there are both momentum traders and rational expectations traders. The interest in the proposition stems from the fact that the presence of some information traders is enough to create a spiraling effect that ends in laddering being unprofitable. Proposition 3 (Laddering is unsustainable in equilibrium). Let N , p be such as to ensure that P3 P2 , that is, rational information traders who suspect that laddering has taken place, investigate the fundamental value of the security, and sell their holdings driving the t D 3 price below the inflated price at t D 2. Then the ladderers lose money in equilibrium and laddering is unsustainable. Proof. Under the hypothesis of this proposition, some proportion pN of traders act on their rational suspicions of laddering and discover the true fundamental value of the security, x0 . This is below the inferred value based on the inflated price, P2 . Under these circumstances, this subset of traders reduce their holdings creating a downward pressure on prices; if this pressure is sufficient, P3 is less than P2 . Then the market clearing price at time t D 4 is also lower both due to the downgrading by informed traders and due to the reduced demand of momentum traders. Specifically,
54 On the Feasibility of Laddering
if P3 < P2 , then it follows that P4 D P0 C .P3 P2 / < P0 . The proof is now similar to Proposition 2. Because P4 < P0 < P2 , the net payoff in Equation (54.13) will be negative, resulting in a loss to ladderers. t u The intuition behind Proposition 3 is a development of the ideas in Propositions 1 and 2. Specifically, the t D 3 price fall forces the t D 4 price to be lower than the IPO price. The reason is that momentum traders now shift their demand curves downwards resulting in lower demand intensity than at the time of the IPO. We have modeled the price fall at t D 3 to be a consequence of Bayesian rationality where at least some traders suspect that laddering might have taken place and therefore investigate the fundamentals underlying the valuation. The same result will hold as long as some traders (rational or otherwise) spend resources to gather information about the security. If the price of the security has been inflated through laddering, private information search leads to a downward trend in prices. This trend is magnified by momentum traders and the price drops significantly if the ladderers try to dispose of their excess holdings.
54.4 Conclusion We investigate the feasibility of laddering in an IPO market. Earlier studies have assumed that laddering is sustainable in equilibrium and have investigated its effects on market prices. In contrast, we study whether laddering itself is economically rational. Our model features three distinct sets of traders: insiders who ladder, rational traders who are generally suspicious of price manipulation and willing to invest effort and resources to acquire information about fundamental values, and, finally, momentum traders whose beliefs are conditioned on past prices. Our findings are that laddering cannot be sustained in equilibrium unless there is a significant proportion of traders whose beliefs about fundamental value can be manipulated through strategic purchases – the momentum traders – and there do not exist information traders who have a significant effect on prices. In contrast, if there are traders that behave in a way consistent with rational expectations and remove price inflation, laddering cannot be a profitable strategy. Our conclusions are dependent on an assumption that offer prices at IPO time are set rationally and that monopolistic (or oligopolistic) setting of issue prices is not feasible. If offer prices can be artificially lowered, then laddering becomes feasible in the limited sense that the profits made through the initial overallocation outweigh the losses made in the aftermarket. With the existence of information traders that include short-sellers, any inflation induced by laddering activities can dissipate quickly and ladderers are saddled with
851
losses in the aftermarket. In particular, if the market for underwriting is competitive with regard to issuers, there are no abnormal returns associated with participating in the IPO. In such a setting, attempts by ladderers to profit from their ability to exploit market sentiment will fail as long as some traders in the market function in a manner consistent with the economics of rational expectations. These results contribute to a better understanding of markets and shed some light on the recent controversy as to whether laddering takes place, and whether such laddering can be successful in yielding profits to the ladderers and underwriters such as to make it sustainable in equilibrium. Our conclusion is that if underwriters do not act as monopolists, laddering is not a sustainable activity in a semistrong efficient market in which price – but not the history of prices – is seen as reflecting public information unbiasedly and instantaneously.
References Aggarwal, R. 2000. “Stabilization activities of underwriters after initial public offerings.” Journal of Finance 55, 1075–1103. Aggarwal, R. and P. Conroy. 2003. “Price discovery in initial public offerings.” Journal of Finance 55, 2903–2922. Aggarwal, R., K. Purnanandam, and G. Wu. 2006. “Underwriter manipulation in initial public offerings.” Working Paper, University of Minnesota. Benveniste, L. M. and P. A. Spindt. 1989. “How investment bankers determine the offer price and allocation of new issues.” Journal of Finance Economics 25, 343–361. Choi, S. and A. C. Pritchard. 2004. “Should issuers be on the hook for laddering?.” Working Paper, Law and Economics, Michigan University, No. 04-009/CLEO USC No. C04-3. Derrien, F. 2005. “IPO pricing in ‘hot’ market conditions: who leaves money on the table?” Journal of Finance 60, 487–521. Edwards, A. K. and K. W. Hanley. 2007. “Short selling and initial public offerings.” Working Paper, Office of Economic Analysis, U.S. Securities and Exchange Commission, April 2007; http://ssrn.com/abstract=981242. Griffin, J. M., J. H. Harris, and S. Topaloglu. 2007. “Why are investors net buyers through lead underwriters?” Journal of Finance Economics 85, 518–551. Hanley K. W. and W. J. Wilhelm. 1995. “Evidence on the strategic allocation of initial public offerings.” Journal of Finance Economics 37, 239–257. Hao, Q. 2007. “Laddering in initial public offerings.” Journal of Finance Economics 85, 102–122. Loughran, T. and J. R. Ritter. 2002. “Why don’t issuers get upset about leaving money on the table.” Review of Financial Studies 15, 413–443. Loughran, T. and J. R. Ritter. 2004. “Why has IPO underpricing changed over time?” Financial Management, Autumn 33, 5–37. Smith R. and R. Sidel 2006. “J.P. Morgan agrees to settle IPO cases for $425 million.” Wall Street Journal April 21, C4. Wurgler, J. and E. Zhuravskaya. 2002. “Does arbitrage flatten demand curves for stock?” The Journal of Business 75, 583–609. Zhang, D. 2004. “Why do IPO underwriters allocate extra shares when they expect to buy them back?” Journal of Financial and Quantitative Analysis 39, 571–594.
Chapter 55
Stock Returns, Extreme Values, and Conditional Skewed Distribution Thomas C. Chiang and Jiandong Li
Abstract This paper investigates stock returns presenting fat tails, peakedness (leptokurtosis), skewness, clustered conditional variance, and leverage effects. We apply the exponential generalized beta distribution of the second kind (EGB2) to model stock returns as measured by six AMEX industry indices. The evidence suggests that the error assumption based on the EGB2 distribution is capable of accounting for skewness and kurtosis and therefore of making good predictions about extreme values. The goodness-of-fit statistic provides supporting evidence in favor of an EGB2 distribution in modeling stock returns. Keywords EGB2 distribution r Stock return r Fat tails r Risk management r VaR
55.1 Introduction In analyzing financial time series, the conventional approach usually assumes a Gaussian distribution. The advantage of using a Gaussian assumption is that the statistical analysis of asset returns can be simplified: the main focus is on the first two moments. This simplification is appealing to practitioners, since it can directly link statistical tools to the meanvariance framework in making daily portfolio decisions. This advantage, however, entails a cost. Recent financial disturbances suggest that significant daily loss occurs more frequently and the behavior of volatility cannot reasonably be predicted based on a normal distribution. To address this issue, practitioners have developed the Value at Risk (VaR) and Expected Tail Loss methods to T.C. Chiang () Department of Finance, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104, USA e-mail: [email protected] J. Li Chinese Academy of Finance and Development (CAFD), Central University of Finance and Economics (CUFE), 39 South College Road, Beijing 100081, China e-mail: [email protected]
deal with extreme values of asset price movements. Yet, using VaR to measure a portfolio’s maximum loss over a target horizon within a given confidence interval is essentially based on a normal distribution. This scheme may work well at the 95% confidence interval. However, using the normal model, VaR estimates in general lead to under-forecasts of losses at the 99% level. The 1987 market crash with its 17¢ attests to the failure of the VaR method in estimating variance by employing the normal distribution. Recognizing the frequent occurrences of market crashes, major defaults, financial turmoil, and the collapse of asset prices, financial institutions have begun to treat extreme values as a kind of common risk when managing their portfolios. From a practical perspective, if portfolio distributions depend on more than two parameters, optimal choice cannot be satisfied by the mean-variance approach. To account for the distributional anomalies of stock returns, we must incorporate skewness and kurtosis into portfolio decisions and risk management. From an academic point of view, although it is generally recognized that the GARCH type model is capable of dealing with the volatility cluster phenomenon, it nonetheless fails to account for the fat tails and skewness. In fact, as argued by Brooks et al. (2005), “If asset returns are fat-tailed, it will lead to a systematic underestimate of the true riskiness of a portfolio, where risk is measured as the likelihood of achieving a loss greater than some threshold” (p. 400). From an econometric point of view, the exclusion of fat-tailed and skewed information in the asset return model is bound to result in missing variables and misspecification problems. Several non-Gaussian distributions have been proposed in the finance literature. For example, Bollerslev (1987) suggests a Student’s t-distribution; Nelson (1991) proposes a general error distribution (GED, also known as an exponential power distribution); and Fama (1963), Liu and Brorsen (1995), Rachev and Mittnik (2000), and Khindanova et al. (2001) use Paretian distributions (or stable distributions). Despite the large volume of research into the distribution of stock returns, attention has been directed to a particular feature of asset return behavior. The research lacks a more general treatment of stock return characteristics such as fat tails, peakedness (leptokurtosis), skewness, clustered
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_55,
853
854
conditional variance, and leverage effects. This paper addresses these non-normality issues and provides empirical evidence based on six industry indices from the American Stock Exchange (AMEX): biotechnology, computers, natural gas, telecommunications, oil, and pharmaceuticals. In this study, we employ a distribution called the exponential generalized beta distribution of the second kind (EGB2).1 The distribution proposed by McDonald and Xu (1995) and Wang et al. (2001) has several advantages compared with alternative distributions. First, as a four-parameter distribution, it allows diverse levels of skewness and kurtosis. Its parameters are estimated simultaneously with the model structure estimation process so that it is able to accommodate a wider range of data characteristics. The skewness and excess kurtosis of EGB2 are in the range of (2, 2) and (0, 6), respectively. Second, a number of distributions used in statistical analysis, such as logistics distribution and log-normal distribution, are nested in the EGB2 distribution.2 Third, unlike a Hansen-type (1994) skewed generalized t-distribution requiring the imposition of several restrictions on the parameter values to permit estimation (Brooks et al. 2005),3 the EGB2 model is simple and has a closed-form density function for the distribution; its higher order moments are finite and explicitly expressed by its parameters. Although the EGB2 is capable of accounting for fat tails and the higher order moments of returns, previous studies using EGB2 distribution fail to incorporate a recent advancement in asset return behavior: the asymmetric response of asset return volatility to news (McDonald and Xu 1995). In addition, their analysis has been focused on aggregate stock market returns (Hueng and McDonald 2005). In this study, we introduce this asymmetric behavior into an EGB2 distribution and examine stock indices for six industries. Thus, our model is characterized as an asymmetric GARCH(1,1) cum EGB2 distribution model and provides evidence for industrial stock return analysis.4 1 There are other names for the EGB2 distribution used in non-financial fields or in non-American journals; for example, generalized logistic distribution is used in Wu et al. (2000), z-distribution in BarndorffNielsen et al. (1982), the Burr type distribution in actuarial science (Hogg and Klugman 1983), and four-parameter Kappa distribution in geology (Hosking 1994). 2 Wang et al. (2001) show that the EGB2 distribution is very powerful in modeling exchange rates that have fat tails and leptokurtic features. 3 In following Hansen (1994), Theodossiou (1998) also develops the skewed generalized t distribution (SGT) to examine stock market indices. 4 It is not our intention to exhaust all the non-Gaussian models in modeling stock returns, which is infeasible. Rather, our strategy is to adopt a distribution that is rich enough to accommodate the features of financial data. To our knowledge, there are different types of flexible parametric distributions parallel to the EGB2 distribution, such as a skewed generalized t -distribution (SGT) (Hueng and Brooks 2002) and an inverse hyperbolic sine distribution (IHS) (Johnson et al. 1994), among others.
T.C. Chiang and J. Li
The remainder of the paper is organized as follows. Section 2 describes the methodology of the AGARCH-EGB2 model. Section 3 discusses the data used in this study. Section 4 presents the empirical results for U.S. industrial stock returns. Section 5 reports the goodness-of-fit tests to assess alternative distribution. Section 6 contains the probability evaluation via non-Gaussian distributions. Section 7 contains concluding remarks and a summary.
55.2 The AGARCH Model Based on the EGB2 Distribution The AGARCH(1,1)-EGB2 stock return model can be represented by: 8 Rt D t C "t ˆ ˆ m n ˆ X X ˆ ˆ ˆ ˆ D
C
R 'j "t j t 0 i t i ˆ ˆ < i D1 j D1 p "t D ht zt ˆ ˆ ˆ ˆ ht D vc C va "2t1 C vb ht 1 C vd "2t1 I."t 1 < 0/ ˆ ˆ ˆ ˆ ˆ : "t jˆt 1 EGB2.0; ht ; p; q/ (55.1) where Rt is the portfolio’s return at time t; and t is the conditional mean that follows an ARMA(m; n) process.5 The error term "t is assumed to be independent. The conditional variance, ht , is assumed to follow GARCH(1,1); c ; a , and b > 0 ensure a strictly positive conditional variance; I is an indicator variable that takes the value of unity only when the error term is negative; the negative error term component in the variance equation captures the asymmetric effect of an extraordinary shock to the variance: bad news usually has a more profound effect than good news (Glosten et al. 1993); zt is assumed to have a zero mean and unit variance and is independent and identically distributed .i:i:d / if the model is correctly specified. The conditional distribution is assumed to be the exponential generalized beta distribution of the second kind (EGB2). The EGB2 distribution has the probability density function (pdf) given by: ip xı e. / EGB2.xI ı; ; p; q/ D ipCq h xı jjB.p; q/ 1 C e . / (55.2) where x is a random variable; ı is a location parameter that affects the mean of the distribution; reflects the scale of h
5
The GARCH-M specification was explored. The conditional variance variable is insignificant for all six indices. Other macro variables and industry factors may be added to the mean equation or variance equation. Here, we focus on the time series property.
55 Stock Returns, Extreme Values, and Conditional Skewed Distribution
the density function; p and q (p > 0 and q > 0) are shape parameters that together determine the skewness and kurtosis of the distribution of the asset return series; and B.p; q/ is the beta function.6 As suggested by McDonald (1991), the EGB2 is suitable for coefficient of skewness values between 2 and 2 and coefficient of excess kurtosis values of up to 6. Thus, it is capable of accommodating fat-tailed and skewed error distributions pertinent to stock return modeling. Different from the Student’s t-distribution, which has a fat-tail feature, the EGB2 distribution allows both fat fail and peakedness features. In other words, the pdf curve of Student’s t-distribution is flat-topped; the pdf curve of the EGB2 distribution has a peak around the mean. This feature makes the EGB2 distribution superior to the Student’s t-distribution, since a histogram of actual equity returns shows fat tails and peakedness. The disadvantage is that the EGB2 distribution has four parameters to be estimated, while the Student’s t-distribution has only three parameters. Compared with other four-parameter distributions, such as stable distributions and skewed t-distributions, the EGB2 distribution is plausible in that it has a closed-form pdf and contains higher-order moments. It follows that the parameters can be estimated by using a maximum likelihood function and the model can be evaluated based on skewness and kurtosis coefficients. The first four central moments are: mean D ı C . .p/ .q// Var D 2 0 .p/ C 0 .q/ The 3rd Moment D 3 00 .p/ The 4th Moment D 4 000 .p/ C
00
.q/
000
.q/
(55.3)
0 00 000 ./; ./ and ./ denote digamma, where ./; trigamma, tetragamma, and pentagamma functions, respectively.7 Since the error term "t is standardized in the model, we can express ı and by using the above polygamma functions as: q q 1 1 D 0 .p/C 0 .q/ D q ı D .. .p/ .q// D 1 (55.4)
where
6
D D
.p/ .q/ 0 .p/ C 0 .q/
It should be noted that beta function here has nothing to do with the stock’s beta. 7 The digamma function is the logarithmic derivative of the gamma function; the trigamma function is the derivative of the digamma function.
855
With some algebraic manipulation, we can write the univariate GARCH-EGB2 log-likelihood function as follows8 : i h p log L D T log. / log.B.p; q// C p
" p X "t p. p / 0:5 log.ht / .p C q/ C ht # p "t C // (55.5) log.1 C exp. p ht where T is the number of periods, B.p; q/ is the beta function, ht is the conditional variance, and "t is the standardized error term. Various analysis packages can be used to conduct the maximum likelihood estimation. This paper uses the R BFGS algorithm in a RATS program. The skewness and excess kurtosis for the EGB2 distribution are given, respectively, by: Skewness D Kurtosis D
.p/ 00 .q/ . 0 .p/ C 0 .q//1:5 00
.p/ C 0 . .p/ C 000
000
.q/
0 .q//2
(55.6)
The standard deviation of skewness and kurtosis coefficients can be drawn by using a standard delta method.9 This model is appealing, since it contains richer features pertinent to asset return behavior. First, the asset return in the mean equation embodies time series patterns, which can facilitate forecasting. Second, the variance equation allows for the evolution of volatility and treats the impacts of bad news and goods news on variance asymmetrically. Third, the distribution is flexible enough to nest alterative distributions. For instance, the EGB2 converges to normal distribution as p D q approaches infinity, to log-normal distribution when only p approaches infinity, to the Weibull distribution when p D 1 and q approaches infinity, and to the standard logistic distribution when p D q D 1. It is symmetric if p D q. The EGB2 is positively (negatively) skewed as p > q .p < q/ for > 0.
55.3 Data The use of industrial indices by institutional and individual investors in making portfolio decisions has become a common strategy in pursuing higher returns, liquidity, investment 8 9
See the appendix in Wang et al. (2001). The derivation is available upon request.
856
T.C. Chiang and J. Li
style, and diversification. For this reason, in this paper we explore indices for six industries traded on the American Stock Exchange (AMEX): biotechnology, computers, natural gas, telecommunications, oil, and pharmaceuticals. The sample is non-balanced weekly data covering the period from 8/26/1983 through 9/17/2006 The earliest data start on 8/26/1983 for the computer and oil indices; the latest begin on 12/9/1994 for biotechnology. The data source is Trade Tool, Inc. We use weekly data in order to be consistent with industry practices. For example, Value Line, Bloomberg, and Baseline all use weekly stock data to calculate a stock’s beta. In addition, weekly data are free of the Monday effect and other calendar effects. Figure 55.1 shows the time series plots for these six indices. We can easily see the boom period in 2001 for the computer, telecom, and biotech industries and the rise in oil and natural gas in recent years. The weekly returns along with other statistics for these six indices are reported in Table 55.1. Looking at the skewness coefficient, we find that all six indices have negative values and four of them are significant at the 1% level. A negative skewness coefficient means that there are more negative extreme values than positive extreme values in the sample period. With respect to the excess kurtosis (the kurtosis coefficient minus 3), all of the estimated values are statistically significant at the 1% level, suggesting that the presence of fat tails is confirmed. The range of the excess kurtosis coefficient is between 1.06 and 4.85. If we check the range of peakedness, the values are between 1.08 and 1.20.10 This range is much lower than the referenced figure, 1.35, indicating the presence of a high peak in the probability density func10
tion for all of the indices under investigation. Further, by inspecting the Jarque–Bera statistics, the normality assumption is uniformly rejected for all six indices. The Ljung–Box Q statistics show that we cannot reject the null hypothesis of the absence of serial correlation at the 5% level for all of the series, indicating no evidence of serial correlation for the stock return series under investigation. However, the Ljung– Box Q2 tests indicate that all six statistics of return squares are significant at the 1% level, suggesting volatility clustering and that this finding would be consistent with a GARCH-type specification. In sum, the above statistics clearly indicate that the popular normality assumption does not match up well with the weekly returns under investigation. Stock index returns often show positive excess kurtosis (fat tails), accompanied by negative skewness. The peakedness does not conform to that of the normal distribution either. Besides the nonGaussian features, some weekly returns show autocorrelation, and all of the return squared series display significant volatility clustering.
55.4 Empirical Evidence 55.4.1 GARCH(1,1) Model: The Normal Distribution It is convenient to start with a GARCH(1,1) model based on the normal distribution, which sets a basis for comparison. With a general model in hand, which is given by the set
See the notes to Table 55.1.
6 index level 1800
AMEXBI OT
1700
AMEXCPT
AMEXNGAS
AMEXNTEL
AMEXPHAR
AMEXOIL
1600 1500 1400 1300 1200 1100 1000 900 800 700 600 500 400 300 200 100
date
JAN07
JAN06
JAN05
JAN04
JAN03
JAN02
JAN01
JAN00
JAN99
JAN98
JAN97
JAN96
JAN95
JAN94
JAN93
JAN92
JAN90
JAN91
JAN89
JAN87
JAN88
JAN85
JAN86
JAN84
0 JAN83
Fig. 55.1 Level of six indices Six industries traded on the American Stock Exchange (AMEX) are: biotechnology (AMEXBIOT), computers (AMEXCPT), natural gas (AMEXNGAS), telecommunications (AMEXNTEL), oil (AMEXOIL), and pharmaceuticals (AMEXPHAR). Data source: Trade Tool, Inc.
55 Stock Returns, Extreme Values, and Conditional Skewed Distribution
857
Table 55.1 Descriptive statistics for six industry indices’ weekly returns Jarque Ticker AMEXBIOT AMEXCPT AMEXNGAS AMEXNTEL AMEXOIL AMEXPHAR
Nobs 614 1; 203 652 671 1; 203 655
Mean
Variance
Skew
Kurtosis
Peakedness
–Bera
Q(13)
Q.13/2
0:0034
0.0025
0:205
3.187
1.087
264:328
21:354
412:762
Œ2:08
[16.12]
0:000
0:070
0:000
0:581
3.390
643:996
19:555
210:064
Œ8:24
[24.00]
0:000
0:110
0:000
0:309
1.060
40:927
14:0289
86:444
Œ3:22
[5.53]
0:000
0:370
0:000
0:760
4.858
724:532
18:083
197:119
Œ8:04
[25.69]
0:000
0:150
0:000
0:335
1.267
103:157
18:577
74:132
Œ4:75
[8.98]
0:000
0:140
0:000
0:060
1.494
61:357
11:957
83:497
Œ0:63
[7.81]
0:000
0:530
0:000
0:0016 0:0017 0:0014 0:0019 0:0022
0.0012 0.0012 9.93E-04 6.72E-04 6.67E-04
1.108 1.191 1.083 1.202 1.187
Notes: The six indices are biotechnology, computers, natural gas, telecommunications, oil, and pharmaceuticals. The numbers in brackets below the estimated values are t-values. The critical t-value at the 1% level is 2.57. The standard deviations of skewness and excess kurtosis coefficients are approximately given by .6=T/0:5 and .24=T/0:5 , respectively. Peakedness is measured by f0:75 f0:25 , the distance between the value of the standardized variable at which the cumulative distribution function equals 0.75 and the value at which the cumulative distribution function equals 0.25. The reference value of the standard normal distribution is 1.35. Peakedness of less than 1.35 means there is a high peak in the probability density function. A normality test is conducted using the Jarque–Bera statistics. Q and Q2 are Ljung–Box tests for examining independence up to the order of 13. The numbers below these statistics are p-values
of equations in (52.1), we proceed with our estimation by setting the error distribution to be normal; that is, "t jˆt 1 N.0; ht /, where N stands for the normal distribution. Before estimating the AGARCH-EGB2 model, we apply the Box–Jenkins method to filter out the time series pattern.11 The residuals from the mean equation are normalized by dividing by the estimated standard deviation obtained from the conditional variance equation. Using standardized residuals to fit the model allows us to remove the heteroscedasticity due to stochastic variance. The statistics are reported in Table 55.2. Checking the variance equation, we find that most of the coefficients in the GARCH(1,1) equations are statistically significant, indicating that stock-return volatilities are characterized by a clustering phenomenon. Further, the average variances for various stock returns are rather high, indicating that variances display a higher degree of persistence. The information from the asymmetric coefficient suggests that only half of the six cases show a significant asymmetric effect, which seems to be less apparent than those found in the daily data in the literature. This may be the result of using weekly data, which moderates the effect. 11 According to the correlogram of each stock index, the following ARMA processes are revealed. For biotechnology, AR(12) is detected; AR(11) for computers; AR(10) for natural gas; AR(11,15) for telecommunications; AR(1) for oil; and AR(25) for pharmaceuticals.
Looking at the statistics for skewness, the evidence shows that the null hypothesis of the absence of skewness is rejected at the 1% level in three out of six cases, while the null hypothesis of the absence of kurtosis is rejected in five cases. The significant excess kurtosis coefficients are in the range of 0.52–1.26. A joint test of the Jarque–Bera statistics shows that all of the return residuals are rejected by assuming a Gaussian distribution. The rejection occurs because the stock-index changes are either not independent, not normally distributed, or both. Further looking into the measure of peakedness, the estimate values are around 1.25, lower than the reference point of the standard normal distribution of peakedness, 1.35, indicating that all of the returns are leptokurtic. It is apparent that the normality assumption of the residuals is problematic. We then examine the independence of stock returns up to the 13th order, which is one-quarter for weekly data. On the basis of the Ljung–Box statistics, none of Q(13) is significant, indicating the absence of autocorrelation in the weekly data. Further checking the Q.13/2 statistics for examining the null hypothesis of dependency on the squared returns also finds no case that is significant, suggesting that the volatility clustering phenomenon has been reduced significantly as evidence of implementing an AGARCH(1,1) specification. From the test results then, it can be argued that stock returns departing from normality may be mainly attributed to leptokurtosis.
1; 191
641
655
1201
629
AMEXCPT
AMEXNGAS
AMEXNTEL
AMEXOIL
AMEXPHAR
0:0247 Œ0:25 0:3649 Œ5:14 0:3113 Œ3:22 0:2189 Œ2:29 0:3199 Œ4:53 0:1645 Œ1:68 0.6825 [3.42] 1.2623 [8.89] 0.3276 [1.69] 0.5206 [2.72] 0.9927 [7.02] 0.5661 [2.90] 1.2777
1.2568
1.2975
1.281
1.2203
1.2479
11:72666 0:00 105:4992 0:00 13:21887 0:00 12:62975 0:00 69:80144 0:00 11:23468 0:00
6:8348 0:91 7:4283 0:88 9:2334 0:76 5:2553 0:97 8:9519 0:78 11:3585 0:58
19:0509 0:12 17:189 0:19 13:0559 0:44 5:5316 0:96 9:156 0:76 19:2286 0:12 0.00002 [1.09] 0.000021 [2.08] 0.000034 [2.26] 0.00001 [1.94] 0.000007 [1.21] 0.000007 [1.28]
vc 0.0629 [3.35] 0.0727 [3.84] 0.0089 [0.48] 0.0537 [2.81] 0.0424 [3.51] 0.0248 [1.65]
va
0.9214 [37.00] 0.9052 [40.06] 0.9218 [38.95] 0.9080 [52.25] 0.9479 [52.09] 0.9424 [64.87]
vb
0.0175 [0.53] 0.0127 [0.53] 0.0859 [2.77] 0.0612 [1.99] 0.0014 [0.07] 0.0461 [1.75]
vd
2; 012:15
3; 833:33
2; 037:75
1; 847:77
3; 485:60
1; 573:1
ln L
Notes: The six indices are weekly data for biotechnology, computers, natural gas, telecommunications, oil, and pharmaceuticals. The numbers in brackets below the estimated values are t-values. The critical t-value at the 1% level is 2.57. The standard deviations of skewness and excess kurtosis coefficients are approximately given by .6=T/0:5 and .24=T/0:5 , respectively. Peakedness is measured by f0:75 f0:25 , the distance between the value of the standardized variable at which the cumulative distribution function equals 0.75 and the value at which the cumulative distribution function equals 0.25. The reference value of the standard normal distribution is 1.35. Peakedness of less than 1.35 means there is a high peak in the probability density function. A normality test is conducted using the Jarque–Bera statistics. Q and Q2 are Ljung–Box tests for examining independence up to the order of 30. The numbers below these statistics are p-values. vc ; va ; vb ; vd are the coefficients of the variance equation. lnL is the log-likelihood function value. The numbers in brackets below the estimated statistics are the t-values
601
AMEXBIOT
Table 55.2 Statistics for higher moments of the standardized errors on the GARCH(1,1)-normal distribution Ticker Nobs Skew Kurtosis Peakedness Jarque–Bera Q(13) Q.13/2
858 T.C. Chiang and J. Li
55 Stock Returns, Extreme Values, and Conditional Skewed Distribution
55.4.2 AGARCH(1,1): The EGB2 Distribution Model In this section, we estimate the parameters represented by (55.1) that assumes an EGB2 distribution. Table 55.3 reports the comparable statistics based on the standardized return residuals from an AGARCH(1,1) cum EGB2 distribution. The statistics, including the estimated coefficients and Q(13) and Q.13/2 , turn out to have results similar to those reported in Table 55.2, which is based on a normal distribution. However, by using the EGB2 distribution, the skewness problem has been removed for all six indices, and no coefficient is rejected at the conventional significance levels. Turning to the statistics of kurtosis, we find that none of the cases are significant at the 1% level. This suggests that the EGB2 distribution works well on the kurtosis. By inspecting the estimates of p and q, we find that p ranges from 1.25 to 2.28 and q ranges from 1.60 to 4.53. These pair-wise figures are nowhere near being equal and are far from the normal distribution that requires that both p and q approach infinity. Finally, we check the peakedness, which ranges from 1.22 to 1.28, conforming to the existence of a high peak. In sum, the testing results suggest that the EGB2 distribution produces a much more satisfactory result for modeling skewness and tailed information than can be achieved by assuming a normal distribution.
55.5 Distributional Fit Test While highlighting the model’s ability to account for the skewness and kurtosis of the residuals may not be adequate to justify the performance of the distribution, in this section we shall conduct the log-likelihood ratio test and goodness-offit test for different distributions. The log-likelihood function values in Tables 55.2 and 55.3 already clearly show that the EGB2 distribution has outperformed the normal distribution. We conduct a log-likelihood ratio test, which is expressed as: LR D 2.In L0 In L1 /
859
standardized residuals and theoretical distributions based on estimated shape parameters (Snedecor and Cochran 1989).12 The null hypothesis tested by the GoF statistic is that the observed and predicted distribution functions are identical. The test statistic is given by: GoF D
k X .fi Fi /2 Fi i D1
(55.8)
where fi is the observed count of actual standardized residuals in the i th data class (interval), Fi is the predicted count derived from the estimated values for the distribution parameters, and k is the number of data intervals used in distributional comparisons. GoF has an asymptotic 2 distribution with degrees of freedom equal to the number of intervals minus the number of estimated distribution parameters minus one. For the EGB2 distribution, two parameters are estimated; for the normal distribution, no parameter is required, since the error term has been standardized. Table 55.4 reports the results of the 2 test for the standardized residuals generated by the AGARCH(1,1) – the EGB2 distribution model. The test power is maximized by choosing a data class equiprobably (equal probability). The rule of thumb of a Chi-squared test is to choose the number of groups starting at 2 T 0:4 . Our sample contains either 660 or 1,200 observations with 40 intervals being used. The degree of freedom is 37 for the EGB2 distribution and 39 for the normal distribution. The critical values for the chi-squared distribution are provided in the notes to the table. The testing results show that the null is rejected for the computer index on the normal distribution at the 5% level; however, none is rejected on the EGB2 distribution. Although both distributions in general are acceptable for the sample, the EGB2 distribution yields lower values of the 2 statistics. Putting the goodness-of-fit test, LR test, and previous skewness and kurtosis coefficient tests together, plus the peakedness analysis, it is clear that the EGB2 distribution is superior to the normal distribution in our empirical analysis.
(55.7)
55.6 The Implication of the EGB2 Distribution where ln L1 is the maximum log-likelihood of the unrestricted distribution and ln L0 is the maximum log-likelihood of the restricted distribution. The likelihood ratio statistic, LR, follows an asymptotic 2 .k/ distribution with k degrees of freedom under the null hypothesis that the restrictions are valid, where k is the number of restrictions imposed. We find that in Table 55.4 all of the LR statistics are highly significant at the 1% level. The evidence is firmly in favor of the EGB2 distribution. We further use the 2 goodness-of-fit (GoF) statistic to compare differences between observed distributions of
In addition to the superior performance justified by statistical criteria, the EGB2 distribution offers a useful tool for evaluating the probability of those extreme values. According 12 The chi-square test is an alternative to the Anderson-Darling and Kolmogorov-Smirnov goodness-of-fit tests. The chi-square test and Anderson-Darling test make use of the specific distribution in calculating critical values. This has the advantage of allowing a more sensitive test and the disadvantage that critical values must be calculated for each distribution.
va 0.0664 [2.73] 0.0617 [3.46] 0.0088 [0.42] 0.0541 [2.71] 0.0423 [2.85] 0.0182 [0.95]
vb 0.9099 [27.91] 0.9317 [50.52] 0.9263 [43.29] 0.9026 [46.42] 0.9385 [48.42] 0.9306 [39.90]
vd 0.0299 [0.70] 0:006 Œ0:30 0.0826 [2.45] 0.0607 [1.98] 0.0095 [0.38] 0.0642 [1.65]
q 1.604 2.017 4.536 3.721 2.324 2.389
p 1.365 1.253 2.080 2.281 1.593 2.056
1,437.16
2,742.79
1,439.59
1,264.41
2,409.75
ln L 1,024.92
Notes: The six indices are weekly data for biotechnology, computers, natural gas, telecommunications, oil, and pharmaceuticals. The numbers in brackets below the estimated values are t-values. The critical t-value at the 1% level is 2.57. Peakedness is measured by f0:75 f0:25 , the distance between the value of the standardized variable at which the cumulative distribution function equals 0.75 and the value at which the cumulative distribution function equals 0.25. The reference value of the standard normal distribution is 1.35. Peakedness of less than 1.35 means there is a high peak in the probability density function. A normality test is omitted in the table, since the assumed distribution is the EGB2 distribution. An independence test is conducted using a Ljung–Box Q test up to the order of 13. The Q2 is the Ljung–Box statistics for testing returns squared up to the order of 13. The numbers below these statistics are p-values. vc ; va ; vb ; vd are the coefficients of the variance equation. The numbers in brackets below the estimated statistics are t-values. New columns show two distribution parameters: p and q The lnL is the log-likelihood function value
Table 55.3 Statistics for higher moments of the standardized errors on the AGARCH(1,1)-EGB2 distribution Ticker Nobs Skew Kurtosis Peakedness Q(13) Q.13/2 vc AMEXBIOT 601 0:0268 0.7078 1:2376 6:888 18:416 0.000029 [0.69] Œ0:31 0:91 0:14 [1.16] AMEXCPT 1; 191 0:4084 1.5679 1:2198 7:186 19:408 0.000013 Œ0:34 [1.92] 0:89 0:11 [1.75] AMEXNGAS 641 0:3138 0.3286 1:2634 9:431 13:056 0.000032 [0.51] Œ0:68 0:74 0:44 [2.28] AMEXNTEL 655 0:238 0.5981 1:2834 5:313 5:567 0.000014 [0.12] [0.30] 0:97 0:96 [2.12] AMEXOIL 1; 201 0:3333 1.0343 1:2672 8:709 8:396 0.00001 Œ0:71 [1.11] 0:79 0:82 [1.59] AMEXPHAR 629 0:1879 0.6952 1:2739 10:96 19:463 0.000013 Œ0:76 [0.47] 0:61 0:11 [1.39]
860 T.C. Chiang and J. Li
55 Stock Returns, Extreme Values, and Conditional Skewed Distribution
Table 55.4 Goodness-of-fit (Chi square statistics) and log-likelihood ratio test between Gaussian distribution and the EGB2 distribution
861
Ticker
Nobs
Normal
EGB2
LR
AMEXBIOT AMEXCPT AMEXNGAS AMEXNTEL AMEXOIL AMEXPHAR
601 1; 191 641 655 1; 201 629
37.867 54:83 40.875 29.000 35.300 35.313
37.533 39.767 27.312 24.412 18.567 28.312
1; 096:372 2; 151:709 1; 166:723 1; 196:314 2; 181:069 1; 149:995
Notes: Normal is the standardized residual from GARCH(1,1)-normal distribution; EGB2 is the standardized residual from GARCH(1,1)-EGB2; The two columns are based on the same model structure. The quantiles are computed via 40 intervals. For the EGB2 distribution, the degree of freedom (d.f.) is 37, and the critical values at the 1%, 5%, and 10% levels are 59.89, 52.19 and 48.36, respectively. For the normal distribution, the degree of freedom is 39, and the critical values at the 1%, 5% and 10% levels are 62.43, 54.57 and 50.66, respectively. LR is log-likelihood ratio. and indicate significance at the 1% and 5% levels, respectively Table 55.5 The probability of negative extreme shocks in the error term between EGB2 and normal distributions Shocks Index 9¢ 8¢ 7¢ 6¢ 5¢ 4¢ 3¢ AMEXBIOT AMEXCPT AMEXNGAS AMEXNTEL AMEXOIL AMEXPHAR
2¢
1¢
EGB2
4.58E-08 1.91E-07 4.82E-08 1.40E-08 5.34E-08 6.49E-09
3.49E-07 1.23E-06 3.80E-07 1.29E-07 4.05E-07 6.49E-08
2.36E-06 6.92E-06 2.66E-06 1.08E-06 2.72E-06 5.89E-07
1.56E-05 3.79E-05 1.83E-05 8.86E-06 1.80E-05 5.29E-06
0.000103 0.000207 0.000125 7.22E-05 0.000118 4.73E-05
0.000681 0.001125 0.000837 0.000575 0.00077 0.000416
0:004435 0:006063 0:005404 0:004362 0:004927 0:003505
0:02758 0:031589 0:031757 0:029306 0:029597 0:026360
0:145553 0:146058 0:150973 0:151308 0:148128 0:149579
Normal
1.13E-19
6.22E-16
1.28E-12
9.87E-10
2.87E-07
3.17E-05
0:00135
0:02275
0:158655
Notes: The probability values are calculated based on estimated parameter values of the EGB2 distribution. The probability values based on the normal distribution are the same for all six indices
to the normal distribution, the 1987 market crash, which is beyond 17¢ (daily data), would have never happened. However, recent market crashes indicate that big market swings or significant declines in asset prices happen more frequently than we expect. Although VaR is one of the most prevalent risk measures under normal conditions, it cannot deal with the involvement of those extreme values, since extreme values are not in the state of normal. From this perspective, the EGB2 distribution provides a management tool for calculating risk. Table 55.5 reports the probability of the semivolatility of shocks. We find that the predicted probabilities for extreme values for the EGB2 distribution become greater than that of the normal distribution, especially in the horizon beyond 2¢. For instance, the probability of a 5¢ and 7¢ shock in the biotechnology index for the EGB2 distribution is 1.03E-4 and 2.36E-6. These values are much greater than the 2.87E-7 and 1.28E-12 values, respectively, based on the normal distribution. Yet, the probabilities for the EGB2 distribution under a moderate regime (within ˙2¢) are less than that of the normal distribution. This is another way to tell the peakedness and fat tails of portfolio returns. Notice that the dividing point between the EGB2 distribution and the normal distribution is
in the neighborhood of ˙2¢, where the probabilities of both distributions are equivalent. This feature implies that VaR at the 95% confidence level based on the normal distribution is by chance consistent with reality. However, beyond this critical level, the VaR method based on the normal distribution leads to overly optimistic forecasts of losses.
55.7 Conclusion This study provides a time series model that incorporates the asymmetric GARCH feature into a non-Gaussian distribution: an exponential generalized beta distribution of the second kind (EGB2). When we apply data for six stock indices to the EGB2 distribution, the evidence consistently shows that the model is capable of dealing with skewness and kurtosis of stock returns. Testing the model by using goodness-of-fit and likelihood ratio statistics, the EBG2 distribution outperforms the model by assuming normal distribution. Because of its capacity for modeling skewness and fat tails, the EGB2 model provides a useful tool for forecasting variances involving extreme values. As a result, this model can be practically used for risk management.
862 Acknowledgements We are grateful to Michael J. Gombola, Daniel Dorn, Hazem Maragah, and C.F. Lee for their helpful comments in an earlier draft of this paper. We alone are responsible for any errors. Thomas C. Chiang would like to acknowledge the research support received from the Marshall M. Austin Endowed Chair, LeBow Business College, Drexel University.
Chapter 56
Capital Structure in Asia and CEO Entrenchment Kin Wai Lee and Gillian Hian Heng Yeo
Abstract We examine the association between CEO entrenchment and the capital structure decisions of Asian firms. We find that firms with higher CEO entrenchment have a lower level of leverage. Specifically, firms with CEOs who chair the board, with higher CEO tenure, and with a lower proportion of outside directors have lower leverage. The negative association between CEO entrenchment and corporate leverage is more pronounced in firms with higher agency costs of free cash flow. In addition, for firms with entrenched CEOs, those firms with greater institutional investor equity ownership have higher leverage. This result suggests that active monitoring by large shareholders mitigates entrenched CEOs' incentives to avoid debt.

Keywords Capital structure · Agency costs · Managerial entrenchment
56.1 Introduction

A considerable body of research has focused on managers deviating from the optimal capital structure because of conflicts of interest between managers and shareholders. One stream of research suggests that leverage reduces managerial discretion over corporate resources because higher debt financing increases the commitment and pressure to distribute surplus cash as repayment of debt obligations (Jensen 1986). Thus, entrenched managers prefer capital structures with low leverage. Another stream of research suggests that entrenched managers have greater incentives to increase leverage beyond the optimal level to reduce the probability of successful takeovers by increasing the concentration of their shareholdings, which enables them to have greater control of their firms (Harris and Raviv 1988; Stulz 1988).

K.W. Lee (), S3-B2A-19 Nanyang Business School, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore; e-mail: [email protected]
G.H.H. Yeo, S3-1A-21 Nanyang Business School, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore

Prior studies on U.S. listed firms provide some evidence that entrenched managers prefer low corporate leverage. Friend and Lang (1988) and Mehran (1992) find that firms with high agency costs of managerial discretion have low leverage levels. Using a sample of large U.S. industrial firms, Berger et al. (1997) find that leverage levels are lower when CEOs do not face pressure from ownership and compensation incentives or active monitoring. They also document that leverage increases significantly in the aftermath of events that represent entrenchment-reducing shocks to managerial security, including unsuccessful tender offers, involuntary CEO replacements, and the addition to the board of major shareholders. However, given the opposing theories on the association between capital structure decisions and managerial entrenchment, the empirical evidence supporting these views is hard to generalize, especially in other economies. For example, the market for corporate control in Asia is relatively less hostile, and involuntary CEO turnover is quite infrequent. In addition, most firms in Asia are closely held and controlled by a large ultimate shareholder (Claessens et al. 2000). These ownership structures allow controlling owners to commit low equity investment while maintaining tight control of the firm, creating a separation of cash flow rights and voting rights. As the separation of cash flow rights from voting rights increases, the controlling owner becomes more entrenched, and therefore entrenched managers in Asia may have less incentive to increase corporate leverage as a means of increasing their voting rights.

This paper examines the association between CEO entrenchment and the capital structure decisions of Asian firms. We find that firms with higher CEO entrenchment have lower levels of leverage. Specifically, our results indicate that firms with CEOs who chair the board, with longer CEO tenure, and with a lower proportion of outside directors tend to have lower leverage. We then examine factors that may affect the association between leverage and CEO entrenchment. We find that the
negative association between CEO entrenchment and corporate leverage is more pronounced in firms with higher agency costs of free cash flow. Thus, in firms with entrenched CEOs, those with greater managerial discretion associated with free cash flow have lower leverage. We interpret our result as suggesting that high agency costs of free cash flow exacerbate the agency costs associated with CEO entrenchment, resulting in lower levels of leverage. We also find that for firms with entrenched CEOs, those with greater institutional equity ownership have higher leverage. This result suggests that active monitoring by large institutional shareholders mitigates entrenched CEOs' incentives to avoid debt. Our result contributes to the stream of research on the corporate governance role of institutional shareholders in curtailing managerial opportunism and self-serving behavior (Gillan and Starks 2000; Hartzell and Starks 2003; Parrino et al. 2003). Debt-to-asset ratios represent the cumulative result of years of separate decisions. Thus, cross-sectional tests based on a single aggregate of different decisions are likely to have low power (Jung et al. 1996). To increase the power of our test, we also examine the decisions to change leverage. Further analysis indicates that the year-to-year change in leverage is negatively associated with CEO entrenchment. Specifically, net debt issued during the year is lower when a CEO also chairs the board and when the CEO has longer tenure. We also find significant leverage decreases in firms with low board independence. As a robustness test, we also examine how leverage changes as a function of the firm's financing deficit at the start of the year. The financing deficit variable is computed as cash dividends paid plus investments plus capital expenditure plus the change in working capital less cash flow from operations. We find that firms with entrenched CEOs tend to fund their financing deficit by issuing less debt. We interpret our result to show that entrenched CEOs avoid increases in debt in order to escape the disciplinary role of debt. Moreover, among firms with entrenched CEOs, those with high free cash flow are even less likely to fund their financing deficit with debt. In other words, the entrenchment effect of CEOs on net debt issued is more pronounced in firms with high agency costs of free cash flow. In firms with entrenched CEOs, those firms with high institutional equity ownership are more likely to fund their financing deficit with debt. Hence, the entrenchment effect of CEOs on net debt issued is mitigated by large institutional equity ownership, suggesting that institutional shareholders play an effective monitoring role. The rest of the paper proceeds as follows. Section 56.2 develops the hypotheses and places our paper in the context of related research. Section 56.3 describes the sample. Section 56.4 presents our results. We conclude the paper in Sect. 56.5.
56.2 Prior Research and Hypothesis

56.2.1 CEO Entrenchment and Leverage

Conflicts of interest over capital structure decisions arise between managers and shareholders because managers may deviate from value-maximizing levels of debt. Managers may prefer less leverage than optimal because of their preference for lower firm risk to protect their undiversified human capital (Fama 1980) and their preference to avoid the performance pressures associated with fixed debt payments (Jensen 1986).¹ Jung et al. (1996) provide evidence that entrenchment affects capital structure choices. They find that equity issuers are low-leverage firms with limited investment opportunities. These firms also invest more than similar firms issuing debt. These results suggest that agency costs of managerial discretion lead certain firms to issue equity when debt issuance would be the firm-value-enhancing alternative. Using a sample of 452 industrial firms in the United States, Berger et al. (1997) find that firm leverage is negatively associated with the degree of entrenchment of managers. Specifically, they find that leverage is lower when the CEO has a long tenure in office, has weak compensation incentives, and does not face strong monitoring from the board of directors. In further analysis of leverage changes, they find that leverage increases significantly in the aftermath of events that represent entrenchment-reducing shocks to managerial security, including unsuccessful tender offers, involuntary CEO replacements, and the addition to the board of major shareholders. Another stream of research suggests that entrenched managers have greater incentives to increase leverage beyond the optimal level so that they can increase the voting power of their equity ownership and reduce the probability of successful takeovers (Harris and Raviv 1988; Stulz 1988). In a similar vein, Israel (1992) argues that by issuing debt, the management of the target firm transfers some of the value from equity holders to debt-holders in exchange for private benefits of control, which lowers the acquirer's premium. Unlike in the U.S., the market for corporate control in Asia is less hostile and proxy fights are rare. In addition, involuntary CEO turnover is relatively infrequent in Asia. Most firms in Asia are closely held and controlled by a large ultimate shareholder (Claessens et al. 2000). Corporate ownership in Asia is complicated by pyramidal and cross-holding structures. Specifically, these ownership structures allow controlling owners to commit low equity investment while maintaining
¹ See Chen et al. (2010) for a review of important and representative capital structure models.
tight control of the firm, creating a separation of cash flow rights and voting rights. As the separation of cash flow rights from voting rights increases, the controlling owner becomes more entrenched, while the low cash-flow stake provides a low degree of alignment of interests between the controlling owner and minority shareholders. Given the relatively inactive market for corporate control and the high concentration of voting rights in the controlling shareholder, we conjecture that entrenched managers in Asia have relatively weak incentives to increase corporate leverage with the aim of increasing their voting rights. Thus, on balance, entrenched CEOs of Asian firms are likely to prefer lower corporate leverage to avoid the monitoring associated with debt financing. If CEOs in Asian firms face fewer entrenchment-reducing shocks to managerial security, such as unsuccessful tender offers and involuntary CEO replacements, the degree of CEO entrenchment will be higher. Our first hypothesis is as follows:

H1: In Asian firms, leverage is negatively associated with CEO entrenchment.
56.2.2 Free Cash Flow, CEO Entrenchment, and Leverage

Firms with high free cash flow may overinvest in negative net present value projects (Jensen 1986; Stulz 1988; Zwiebel 1996). Leverage reduces management's discretion and mitigates the agency costs of managerial discretion. Since debt financing, without retention of the issue proceeds, commits free cash flow to repaying creditors, management has less control over the firm's cash flow. Management will be monitored by creditors who want to ensure that they will be repaid. We posit that entrenched CEOs have incentives to avoid the monitoring associated with higher leverage so that they have more discretion over corporate resources. We extend this notion to suggest that entrenched CEOs' propensity to avoid leverage is exacerbated in firms with high free cash flow. Thus, the combination of higher free cash flow and higher CEO entrenchment amplifies the agency costs of managerial discretion and results in lower leverage.

H2: The negative association between CEO entrenchment and corporate leverage is more pronounced in firms with higher free cash flow.
56.2.3 Institutional Ownership, CEO Entrenchment, and Leverage

Past studies document the role of large shareholders in corporate governance (Shleifer and Vishny 1986). Due to the
high cost of monitoring, only large shareholders such as institutional investors can capture benefits sufficient to give them an incentive to monitor (Grossman and Hart 1980). Specifically, large institutional investors have the opportunity, resources, and ability to monitor, discipline, and influence managers. McConnell and Servaes (1990), Nesbitt (1994), Smith (1996), and Del Guercio and Hawkins (1999) find that firm performance (as measured by Tobin's Q) is positively related to institutional investor ownership. These findings are consistent with the hypothesis that corporate monitoring by institutional investors can lead managers to focus more on corporate performance and less on opportunistic or self-serving behavior. Other studies find that institutional investors have increasingly used their ownership rights to pressure managers to act in the interest of shareholders. Gillan and Starks (2000) find that corporate governance proposals sponsored by institutional investors receive more favorable votes than those sponsored by independent individuals or religious organizations. Hartzell and Starks (2003) find that institutional ownership is negatively associated with the level of executive compensation and positively associated with pay-for-performance sensitivity. Finally, Parrino et al. (2003) show that institutional selling is associated with forced CEO turnover and that these CEOs are more likely to be replaced with an outsider. In this study, agency costs arise because entrenched CEOs underlever their firms to avoid the tighter monitoring associated with higher debt financing. If institutional shareholders have the incentives and ability to perform an effective monitoring role, we predict that large institutional shareholders mitigate entrenched CEOs' incentives to reduce corporate leverage. Thus, for firms with entrenched CEOs, those firms with higher institutional ownership have higher leverage. Our third hypothesis is as follows:

H3: The negative association between CEO entrenchment and corporate leverage is mitigated in firms with higher institutional ownership.
56.3 Data and Method

56.3.1 Sample Construction

We begin with the Worldscope database to identify listed firms in eight East Asian countries comprising Hong Kong, Indonesia, Japan, South Korea, Malaysia, the Philippines, Singapore, and Thailand for the period 2000–2005. We exclude financial institutions because of their unique financial structure and regulatory requirements. We eliminate observations with extreme values of financial statement variables such as leverage and return-on-assets (discussed in
Sect. 56.3.2 below). This procedure yielded an initial sample of 4,720 firms. In view of the costs of manually collecting CEO entrenchment and ownership variables from the annual reports, we randomly selected 834 firms to obtain 18% of the firms in the initial sample. We obtained annual reports for fiscal year 2000–2005 from the Global Report database and company websites. Our final sample consisted of 834 firms for 4,301 firm-year observations during the period 2000–2005 in eight East Asian countries (Hong Kong, Indonesia, Japan, South Korea, Malaysia, Philippines, Singapore, and Thailand). On average, our final sample accounts for 62% of the market capitalization of all the listed firms in these countries.
56.3.2 Empirical Model

We use the following regression model to test the association between corporate leverage and CEO entrenchment:

$$\begin{aligned}\mathrm{LTD} ={}& \beta_0 + \beta_1\,\mathrm{CEODUAL} + \beta_2\,\mathrm{TENURE} + \beta_3\,\mathrm{OUTDIR} + \beta_4\,\mathrm{INSTIOWN} + \beta_5\,\mathrm{ROA} + \beta_6\,\mathrm{SIZE} \\ &+ \beta_7\,\mathrm{MB} + \beta_8\,\mathrm{TANGIBLE} + \beta_9\,\mathrm{FCF} + \text{Country Controls} + \text{Year Controls} + \text{Industry Controls} + e\end{aligned} \qquad (56.1)$$
where:
LTD = long-term debt divided by total assets;
CEODUAL = a dummy variable that equals 1 if the CEO is the chairman of the board of directors, and 0 otherwise;
TENURE = the number of years the CEO is in office;
OUTDIR = the number of outside directors divided by board size;
INSTIOWN = the percentage of common stock outstanding owned by institutional shareholders;
ROA = net profit after tax divided by total assets;
SIZE = natural logarithm of total assets;
MB = market value of equity divided by book value of equity;
TANGIBLE = net property, plant, and equipment divided by total assets;
FCF = (cash flow from operations less capital expenditure less common dividends paid) divided by total assets;
Country Controls = a set of country dummy variables;
Year Controls = a set of year dummy variables;
Industry Controls = a set of industry dummy variables;
e = the error term.
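For readers who wish to replicate a specification like (56.1), the sketch below shows one way to estimate it with firm-clustered standard errors in Python using statsmodels. The synthetic data frame and its column names are hypothetical stand-ins, not the authors' sample or estimation code.

```python
# A minimal sketch of estimating Equation (56.1) with dummy controls and
# firm-clustered standard errors; synthetic data stand in for the sample.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "LTD": rng.uniform(0, 0.4, n),
    "CEODUAL": rng.integers(0, 2, n),
    "TENURE": rng.uniform(1, 15, n),
    "OUTDIR": rng.uniform(0.1, 0.8, n),
    "INSTIOWN": rng.uniform(0, 0.5, n),
    "ROA": rng.normal(0.04, 0.1, n),
    "SIZE": rng.normal(12.4, 1.4, n),
    "MB": rng.lognormal(0, 0.5, n),
    "TANGIBLE": rng.uniform(0, 1, n),
    "FCF": rng.normal(0, 0.08, n),
    "country": rng.choice(list("ABCD"), n),
    "year": rng.choice(range(2000, 2006), n),
    "industry": rng.choice(list("XYZ"), n),
    "firm_id": rng.integers(0, 100, n),   # clustering variable (hypothetical)
})

formula = ("LTD ~ CEODUAL + TENURE + OUTDIR + INSTIOWN + ROA + SIZE + MB"
           " + TANGIBLE + FCF + C(country) + C(year) + C(industry)")
# Cluster the covariance matrix by firm, mirroring the clustered
# t-statistics reported in Tables 56.2-56.4.
res = smf.ols(formula, data=df).fit(cov_type="cluster",
                                    cov_kwds={"groups": df["firm_id"]})
print(res.summary())
```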
56.3.2.1 Test Variables

We have three proxies for CEO entrenchment: (1) CEO duality, (2) CEO tenure, and (3) board independence. The board of directors cannot objectively monitor a CEO who also chairs the board. Consistent with this hypothesis, Dahya et al. (2002) find higher CEO turnover following poor firm performance for firms that separate the CEO and board chair positions. They also find that operating performance improves for these firms. We measure CEO duality with a dummy variable (CEODUAL) that equals 1 if the CEO is the chairman of the board of directors, and 0 otherwise. Following Hermalin and Weisbach (1998), we argue that independent boards are more willing to monitor the CEO, whose ability to impose costs on them declines with their independence. We measure board independence as the proportion of outside directors on the board (OUTDIR). Outside directors are defined as those who are not current or former employees of the company and who do not have any related-party transactions with the company. The CEO's control over the internal monitoring mechanisms increases as his tenure increases. An entrenched CEO who is insulated from the threat of disciplinary action from the managerial labor market and the market for corporate control is likely to have a larger number of years in office. We compute CEO tenure as the number of years in office until the start of the current year (TENURE). Hypothesis H1 predicts that corporate leverage is negatively associated with CEO entrenchment. Thus, we expect firms with CEOs who chair the board, with higher CEO tenure, and with a lower proportion of outside directors to have lower leverage. Hence, we predict the coefficients β1 and β2 to be negative and the coefficient β3 to be positive. Next, we construct a composite entrenchment variable using principal component analysis, which summarizes the information contained in the individual entrenchment variables by detecting linear relationships among them. The advantage of this method is that it reduces the dimension of the explanatory variables and the potential multicollinearity problem. To test hypotheses H2 and H3, we modify model (56.1) as follows:

$$\begin{aligned}\mathrm{LTD} ={}& \beta_0 + \beta_1\,\mathrm{ENTRENCH} + \beta_2\,\mathrm{ENTRENCH}\times\mathrm{FCF} + \beta_3\,\mathrm{ENTRENCH}\times\mathrm{INSTIOWN} + \beta_4\,\mathrm{INSTIOWN} \\ &+ \beta_5\,\mathrm{ROA} + \beta_6\,\mathrm{SIZE} + \beta_7\,\mathrm{MB} + \beta_8\,\mathrm{TANGIBLE} + \beta_9\,\mathrm{FCF} + \text{Country Controls} \\ &+ \text{Year Controls} + \text{Industry Controls} + e\end{aligned} \qquad (56.2)$$
where ENTRENCH is a composite CEO entrenchment index based on the principal component analysis of CEODUAL, TENURE, and OUTDIR, with higher values of ENTRENCH denoting higher CEO entrenchment. The other variables are as previously defined. From hypothesis H1, we predict the coefficient on ENTRENCH to be negative. Hypothesis H2 predicts that the negative association between leverage and CEO entrenchment is more pronounced in firms with higher agency costs of free cash flow. We measure free cash flow (FCF) as cash flow from operations less capital expenditure less common dividends paid, divided by total assets. Thus, we expect the coefficient on the interaction term, ENTRENCH × FCF, to be negative. Hypothesis H3 predicts that the negative association between leverage and CEO entrenchment is mitigated in firms with higher institutional equity ownership. Thus, we expect the coefficient on the interaction term, ENTRENCH × INSTIOWN, to be positive.
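A composite index of this kind can be built, for example, as a first principal component; the sketch below standardizes the three proxies, extracts the first component, and orients its sign so that larger values mean more entrenchment. The sign convention and the use of scikit-learn are our assumptions for illustration, not details reported by the authors.

```python
# Sketch: composite ENTRENCH index as the first principal component of
# CEODUAL, TENURE, and OUTDIR. The sign is oriented so that higher values
# denote higher entrenchment (an assumption consistent with H1).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def entrench_index(ceodual, tenure, outdir):
    X = StandardScaler().fit_transform(
        np.column_stack([ceodual, tenure, outdir]))
    score = PCA(n_components=1).fit_transform(X)[:, 0]
    # Orient the component: entrenchment should rise with CEO duality.
    if np.corrcoef(score, ceodual)[0, 1] < 0:
        score = -score
    return score

# Example with toy data:
rng = np.random.default_rng(1)
idx = entrench_index(rng.integers(0, 2, 200),
                     rng.uniform(1, 15, 200),
                     rng.uniform(0.1, 0.8, 200))
```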
56.3.2.2 Control Variables

We also include in our model standard control variables for other firm characteristics expected to influence leverage. These variables were considered in prior studies such as Titman and Wessels (1988) and Rajan and Zingales (1995). To control for firm size, we use the natural logarithm of the book value of total assets (SIZE). To control for profitability, we include a return-on-assets variable (ROA), defined as net income after tax divided by total assets. We also control for the collateral value of tangible assets by including net property, plant, and equipment divided by total assets (TANGIBLE). To control for growth options in the firm's investment opportunity set, we include the ratio of the market value of equity to the book value of equity (MB). We also include year dummies to control for time-series effects and country dummies to control for country-specific effects. We include industry dummies to control for industry-level determinants of capital structure.

56.4 Results

56.4.1 Descriptive Statistics

Table 56.1 reports the descriptive statistics. Mean leverage, computed as long-term debt divided by total assets, is 0.1178. On average, the firms are profitable (mean ROA = 0.037), are relatively low-growth firms (mean MB = 1.096), and have a high proportion of tangible assets (mean TANGIBLE = 0.406). About 26% of the firms have a CEO who also chairs the board. The mean CEO tenure is 6.1 years. The mean proportion of outside directors on the board is 0.4002. The mean of INSTIOWN is 0.19, suggesting that institutional shareholders own, on average, 19% of the firms' outstanding common stock.
56.4.2 Leverage Levels

Table 56.2 presents the results of regressions of corporate leverage levels. The dependent variable is the level of leverage defined as long-term debt divided by total assets. In column (1), the results indicate that firms with entrenched CEOs have lower leverage. Specifically, the estimated coefficient on CEODUAL is negative and significant at the 1% level, suggesting that firms with CEOs who chair the board have lower leverage. Furthermore, the estimated coefficient on TENURE is negative and significant at the 1% level, suggesting that the longer the CEO tenure, the lower the corporate leverage. This result is consistent with entrenched CEOs
Table 56.1 Descriptive statistics

Variable     Mean      25th percentile   Median    75th percentile   Standard deviation
LTD          0.1178    0.0023            0.0734    0.1972            0.1299
ROA          0.0373    0.0038            0.0384    0.0849            0.1451
SIZE         12.4491   11.4710           12.3871   13.406            1.3999
MB           1.0969    0.4673            0.7771    1.3248            1.0778
TANGIBLE     0.4067    0.2464            0.4030    0.5558            0.2048
FCF          0.0026    −0.0331           0.0062    0.0643            0.0781
CEODUAL      0.2600    0                 0         1                 0.4382
TENURE       6.149     4.662             5.223     6.0379            2.4584
OUTDIR       0.4002    0.2864            0.3247    0.5565            0.1514
INSTIOWN     0.1908    0.1129            0.1912    0.2653            0.1037

The sample consists of 4,301 firm-year observations in the period 2000–2005. All variables are defined in Appendix 56A.
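A table in this layout is straightforward to reproduce from a firm-year panel; the sketch below does so with pandas, reusing the synthetic `df` from the sketch in Sect. 56.3.2 (the column names remain hypothetical).

```python
# Sketch: reproducing the layout of Table 56.1 from a firm-year panel.
cols = ["LTD", "ROA", "SIZE", "MB", "TANGIBLE", "FCF",
        "CEODUAL", "TENURE", "OUTDIR", "INSTIOWN"]
stats = df[cols].describe(percentiles=[0.25, 0.5, 0.75]).T
table = stats[["mean", "25%", "50%", "75%", "std"]]
table.columns = ["Mean", "25th percentile", "Median",
                 "75th percentile", "Standard deviation"]
print(table.round(4))
```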
Table 56.2 Regressions of capital structure levels

                               (1)                 (2)                 (3)
CEODUAL                 −0.0185 (−4.26)
TENURE                  −0.0026 (−3.17)
OUTDIR                   0.0413 (3.08)
INSTIOWN                 0.0853 (5.23)      0.0653 (3.41)
ENTRENCH                                   −0.0249 (−6.84)    −0.01708 (−3.95)
ENTRENCH × FCF                                                −0.0147 (−2.23)
ENTRENCH × INSTIOWN                                            0.0341 (4.51)
ROA                     −0.1052 (−8.34)    −0.2172 (−9.02)    −0.1467 (−6.53)
SIZE                     0.0322 (14.16)     0.0324 (16.62)     0.0317 (14.28)
MB                       0.0045 (2.11)      0.0035 (2.15)      0.0071 (1.09)
TANGIBLE                 0.1523 (7.62)      0.1516 (7.49)      0.1495 (7.31)
FCF                     −0.0147 (−2.10)    −0.0285 (−2.21)    −0.0382 (−1.71)
Year indicator variables        Yes                Yes                Yes
Country indicator variables     Yes                Yes                Yes
Industry indicator variables    Yes                Yes                Yes
N                             4,301              4,301              4,301
Adjusted R²                   29.1%              28.6%              30.7%
The sample consists of 4,301 firm-year observations in the period 2000–2005. The dependent variable is long-term debt divided by total assets (LTD). ENTRENCH is a composite entrenchment index based on the principal component analysis of CEODUAL, TENURE, and OUTDIR, with higher values of ENTRENCH denoting higher CEO entrenchment. Clustered t-statistics, adjusted for heteroscedasticity and serial correlation, are reported in parentheses. All other variables are defined in Appendix 56A. ∗, ∗∗, and ∗∗∗ indicate two-tailed statistical significance at the 10%, 5%, and 1% levels, respectively.
taking on less debt, under the assumption that entrenched CEOs have longer tenure in office. Variables associated with stronger monitoring have positive associations with leverage. When the CEO's power over the board decreases, corporate leverage increases. Specifically, the positive and significant coefficient on OUTDIR suggests that firms with a higher proportion of outside directors on the board have higher leverage. Thus, independent boards have greater influence over entrenched CEOs' incentives to avoid leverage. In Table 56.2 column (2), we include a composite CEO entrenchment variable (ENTRENCH) based on the principal component analysis of CEODUAL, TENURE, and OUTDIR. The composite CEO entrenchment proxy is an increasing function of managerial entrenchment given the relative importance of the overall factor loadings of each individual entrenchment proxy. Thus, higher values of ENTRENCH denote higher CEO entrenchment. The estimated coefficient on ENTRENCH is negative and significant at the 1% level. This result supports the hypothesis that entrenched CEOs prefer less debt. In column (3), the coefficient on the interaction term between CEO entrenchment and free cash flow (ENTRENCH × FCF) is negative and significant at the 5% level. This finding suggests that the negative association between leverage and CEO entrenchment is more pronounced in firms with higher agency costs of free cash flow. Thus, consistent with hypothesis H2, in firms with entrenched CEOs, those with greater managerial discretion associated with free cash flow have lower leverage. In other words, higher free cash flow exacerbates the agency costs associated with CEO entrenchment, resulting in lower levels of leverage. Furthermore, our results indicate that leverage increases with the proportion of equity owned by institutional investors (INSTIOWN). This finding suggests CEOs are more
likely to take on more debt when there is active monitoring by large external shareholders. More generally, large shareholders appear to act as monitoring complements to debt-holders. We also document that the coefficient on the interaction term between CEO entrenchment and institutional ownership (ENTRENCH × INSTIOWN) is positive and significant at the 1% level. Thus, consistent with hypothesis H3, this finding suggests that the negative association between leverage and CEO entrenchment is mitigated in firms with higher institutional equity ownership. By demonstrating that, among firms with entrenched CEOs, those with larger institutional equity ownership have higher leverage, our result provides evidence supporting the notion that large institutional shareholders play an effective monitoring role in reducing managerial discretion. In general, the control variables take their predicted direction. Firms with higher profitability and those with higher free cash flow have lower leverage. Larger firms and firms with a higher proportion of tangible assets have higher leverage.
As robustness tests, we repeat our analysis by country. Regression results by country (not tabulated) yield qualitatively similar results. Similarly, when we repeat our tests using annual regressions, our inferences are unchanged. Thus, the results are not driven by any single economy or year.

56.4.3 Changes in Leverage

Prior studies (Jung et al. 1996; Berger et al. 1997) suggest that debt-to-asset ratios represent the cumulative result of years of separate decisions. Thus, cross-sectional tests based on a single aggregate of different decisions are likely to have low power. Following Berger et al. (1997), as an additional test, we analyze the decisions to change leverage rather than use the cross-sectional variation in debt-to-asset ratios. To examine the net change in leverage during the year, we compute the net debt issued variable (DEBTISSUE) as

$$\mathrm{DEBTISSUE} = \frac{(\text{debt issued} - \text{debt retired}) - (\text{equity issued} - \text{equity retired})}{\text{total assets at the start of the year}}.$$
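Both constructed variables are simple arithmetic on financial-statement items; the sketch below illustrates them with hypothetical column names (the DEFICIT definition follows Sect. 56.4.4 and the note to Table 56.4).

```python
# Sketch: constructing DEBTISSUE (Sect. 56.4.3) and DEFICIT (Sect. 56.4.4)
# from statement items. Column names are hypothetical; works on a pandas
# DataFrame `d` with one row per firm-year.
def debt_issue(d):
    net_debt = d["debt_issued"] - d["debt_retired"]
    net_equity = d["equity_issued"] - d["equity_retired"]
    return (net_debt - net_equity) / d["assets_start_of_year"]

def financing_deficit(d):
    # cash dividends + investments + capex + change in working capital
    # - cash flow from operations, scaled by total assets
    deficit = (d["dividends"] + d["investments"] + d["capex"]
               + d["chg_working_capital"] - d["cfo"])
    return deficit / d["total_assets"]
```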
Table 56.3 presents the regression results for our model of the year-to-year net change in leverage. The dependent variable is net debt issued during the year. All independent variables are measured at the start of the year. In column (1), we find that firms with a CEO who also chairs the board have a significant decrease in leverage. The results also indicate significant decreases in leverage in firms with higher CEO tenure. We also document that firms with a higher proportion of outside directors on the board have larger leverage increases. Furthermore, net debt issued by firms is positively associated with the proportion of institutional equity ownership. Collectively, these results suggest that entrenched CEOs are less likely to issue net debt. In column (2), we replace the three proxies for CEO entrenchment (CEODUAL, TENURE, and OUTDIR) with a composite CEO entrenchment variable (ENTRENCH). Recall that the variable ENTRENCH is computed based on the principal component analysis of CEODUAL, TENURE, and OUTDIR, with higher values of ENTRENCH denoting greater CEO entrenchment. The estimated coefficient on ENTRENCH is negative and significant at the 1% level. This finding implies that entrenched CEOs issue less net debt. In column (3), we test whether agency costs of free cash flow exacerbate entrenched CEOs' incentives to issue less debt. The coefficient on the interaction term between CEO entrenchment and free cash flow (ENTRENCH × FCF) is negative and significant at the 5% level. This finding suggests
that in firms with higher agency costs of free cash flow, those with more entrenched CEOs have a lower propensity to issue net debt. Hence, the combination of higher agency costs of free cash flow and greater CEO entrenchment is associated with a greater decline in leverage. We also document that the coefficient on the interaction term between CEO entrenchment and institutional ownership (ENTRENCH × INSTIOWN) is positive and significant at the 1% level. Consistent with hypothesis H3, this finding suggests that in firms with entrenched CEOs, those with higher institutional equity ownership have larger increases in leverage.
56.4.4 Financing Deficit

Our main results indicate that less entrenched CEOs issue more debt. However, as Berger et al. (1997) note, more leverage may not represent a value-increasing strategy, and it is possible that CEOs increase their firms' leverage beyond optimal levels as a defensive measure when their security is threatened. For example, increasing corporate leverage increases the concentration of equity, which may reduce the likelihood of takeover. To address this issue, we examine how leverage changes as a function of the firm's financing deficit at the start of the year. Table 56.4 presents the results of regressions of
Table 56.3 Regressions of net debt issued

                               (1)                 (2)                 (3)
CEODUAL                 −0.0542 (−2.75)
TENURE                  −0.0021 (−3.18)
OUTDIR                   0.0487 (2.03)
INSTIOWN                 0.0457 (2.17)      0.0283 (2.09)
ENTRENCH                                   −0.0611 (−4.86)    −0.0484 (−5.17)
ENTRENCH × FCF                                                −0.0122 (−2.14)
ENTRENCH × INSTIOWN                                            0.0155 (1.99)
ROA                      0.1819 (3.95)      0.1832 (3.98)      0.1622 (3.49)
SIZE                     0.0178 (5.65)      0.0177 (1.89)      0.0058 (0.94)
MB                      −0.0141 (−1.87)    −0.0139 (−1.86)    −0.0146 (−1.96)
TANGIBLE                −0.0135 (−0.65)    −0.0123 (−0.60)    −0.0041 (−0.19)
LTD                     −0.1838 (−5.85)    −0.1838 (−5.63)    −0.1792 (−5.72)
FCF                     −0.2193 (−5.52)    −0.2154 (−5.44)    −0.2250 (−5.64)
Year indicator variables        Yes                Yes                Yes
Country indicator variables     Yes                Yes                Yes
Industry indicator variables    Yes                Yes                Yes
N                             1,198              1,198              1,198
Adjusted R²                   15.8%              15.7%              16.3%
The sample consists of 1,198 firm-year observations in the period 2000–2005. The dependent variable is net debt issued divided by total assets at the start of the year (DEBTISSUE). Net debt issued is (debt issued less debt repurchased) minus (equity issued less equity repurchased) during the year. All independent variables are measured at the start of the year. ENTRENCH is a composite entrenchment index based on the principal component analysis of CEODUAL, TENURE, and OUTDIR, with higher values of ENTRENCH denoting higher CEO entrenchment. Clustered t-statistics, adjusted for heteroscedasticity and serial correlation, are reported in parentheses. All other variables are defined in Appendix 56A. ∗, ∗∗, and ∗∗∗ indicate two-tailed statistical significance at the 10%, 5%, and 1% levels, respectively.
changes in leverage on the financing deficit and CEO entrenchment variables. The dependent variable is net debt issued during the year (defined in Sect. 56.4.3). The financing deficit variable (DEFICIT) is computed as cash dividends paid plus investments plus capital expenditure plus the change in working capital less cash flow from operations. A positive coefficient on DEFICIT indicates that the firm funds its financing deficit with debt. In column (1), the financing deficit variable is positive and significant, indicating that firms fund their financing deficit mainly with debt issuance. The interaction term between the financing deficit and the composite entrenchment index (ENTRENCH × DEFICIT) is negative and significant at the 5% level. This result suggests that firms with entrenched CEOs
tend to fund their financing deficit by issuing less debt. This appears to indicate that entrenched CEOs avoid funding their financing deficit with debt so as to escape the discipline that debt imposes on managerial discretion. Hence, this result casts doubt on whether these changes in leverage are value-enhancing. In column (2), we examine how agency costs of free cash flow affect the association between changes in leverage and the financing deficit in firms with entrenched CEOs. The three-way interaction term (ENTRENCH × DEFICIT × FCF) is negative and significant at the 5% level. This result suggests that in firms with entrenched CEOs, those firms with high free cash flow are even less likely to fund their financing
Table 56.4 Regressions of net debt issued on financing deficit

                                    (1)                 (2)
DEFICIT                         0.0745 (2.84)       0.8503 (2.19)
ENTRENCH                       −0.0197 (−2.25)     −0.0211 (−1.81)
ENTRENCH × DEFICIT             −0.0116 (−2.29)     −0.0134 (−2.01)
ENTRENCH × DEFICIT × FCF                           −0.1006 (−6.41)
ENTRENCH × DEFICIT × INSTIOWN                       0.0345 (1.93)
CHGROA                         −0.1877 (−4.06)     −0.1828 (−4.02)
CHGSIZE                         0.1159 (8.12)       0.1041 (7.36)
CHGMB                           0.0026 (0.61)       0.0041 (0.98)
CHGTANGIBLE                     0.0745 (1.67)       0.0545 (1.24)
Year indicator variables              Yes                 Yes
Country indicator variables           Yes                 Yes
Industry indicator variables          Yes                 Yes
N                                   1,162               1,162
Adjusted R²                         10.5%               13.7%
The sample consists of 1,162 firm-year observations in the period 2000–2005. The dependent variable is net debt issued divided by total assets at the start of the year (DEBTISSUE). Net debt issued is (debt issued less debt repurchased) minus (equity issued less equity repurchased) during the year. All independent variables are measured at the start of the year. ENTRENCH is a composite CEO entrenchment index based on the principal component analysis of CEODUAL, TENURE, and OUTDIR, with higher values of ENTRENCH denoting higher CEO entrenchment. DEFICIT is the financing deficit for the year, computed as the sum of cash dividends plus capital expenditure plus investments plus the change in working capital less cash flow from operations, divided by total assets. CHGROA is the change in return on assets. CHGMB is the change in market-to-book equity. CHGSIZE is the change in firm size as measured by the logarithm of total assets. CHGTANGIBLE is the change in tangible assets. Clustered t-statistics, adjusted for heteroscedasticity and serial correlation, are reported in parentheses. ∗, ∗∗, and ∗∗∗ indicate two-tailed statistical significance at the 10%, 5%, and 1% levels, respectively.
deficit with debt. In other words, the entrenchment effect of CEOs on net debt issued is concentrated in firms with high agency costs of free cash flow. Column (2) also shows how institutional equity ownership affects the relation between changes in leverage and the financing deficit in firms with entrenched CEOs. The three-way interaction term (ENTRENCH × DEFICIT × INSTIOWN) is positive and significant at the 10% level. Thus, there is some evidence supporting the notion that in firms with entrenched CEOs, those with high institutional equity ownership are more likely to fund their financing deficit with debt. In other words, the entrenchment effect of CEOs on net debt issued is reduced in the presence of large institutional equity ownership.
56.5 Conclusion

Entrenched CEOs have discretion over their firms' leverage choices. Following Jensen (1986), leverage reduces the agency costs of managerial discretion. Since debt issuance commits free cash flow to paying creditors, management has less control over the firm's cash flow. Thus, we posit that entrenched CEOs have incentives to avoid the monitoring associated with higher leverage so that they have more discretion over corporate resources. We find that firms with higher CEO entrenchment have a lower level of leverage. Specifically, firms with CEOs who chair the board, with higher CEO tenure, and with a lower proportion of outside directors have lower leverage. We also predict that the combination of higher free cash flow and higher CEO entrenchment exacerbates the agency costs of managerial discretion and results in lower leverage. Consistent with our hypothesis, we find that the negative association between CEO entrenchment and corporate leverage is more pronounced in firms with higher agency costs of free cash flow. Finally, we document that for the sub-sample of firms with entrenched CEOs, those firms with greater institutional equity ownership have higher leverage. This result suggests large institutional shareholders play an important corporate governance role in mitigating entrenched CEOs' incentives to avoid debt.
References

Berger, P. G., E. Ofek, and D. L. Yermack. 1997. "Managerial entrenchment and capital structure decisions." Journal of Finance 52(4), 1411–1438.
Chen, S. S., C. F. Lee, and H. H. Lee. 2010. "Alternative methods to determine optimal capital structure: theory and application." Handbook of Quantitative Finance and Risk Management, forthcoming.
Claessens, S., S. Djankov, and L. Lang. 2000. "The separation of ownership and control in East Asian corporations." Journal of Financial Economics 58, 81–112.
Dahya, J., J. J. McConnell, and N. G. Travlos. 2002. "The Cadbury committee, corporate performance, and top management turnover." Journal of Finance 57(1), 461–483.
Del Guercio, D. and J. Hawkins. 1999. "The motivation and impact of pension fund activism." Journal of Financial Economics 52, 293–340.
Fama, E. 1980. "Agency problems and the theory of the firm." Journal of Political Economy 88, 288–307.
Friend, I. and L. H. P. Lang. 1988. "An empirical test of the impact of managerial self-interest on corporate capital structure." Journal of Finance 47, 271–281.
Gillan, S. and L. Starks. 2000. "Corporate governance proposals and shareholder activism: the role of institutional investors." Journal of Financial Economics 57, 275–305.
Grossman, S. and O. Hart. 1980. "Takeover bids, the free rider problem, and the theory of the corporation." Bell Journal of Economics 11, 42–64.
Harris, M. and A. Raviv. 1988. "Corporate control contests and capital structure." Journal of Financial Economics 20, 55–86.
Hartzell, J. C. and L. T. Starks. 2003. "Institutional investors and executive compensation." Journal of Finance 58(6), 2351–2375.
Hermalin, B. E. and M. S. Weisbach. 1998. "Endogenously chosen boards of directors and their monitoring of the CEO." American Economic Review 88(1), 96–118.
Israel, R. 1992. "Capital and ownership structures, and the market for corporate control." Review of Financial Studies 5(2), 181–198.
Jensen, M. C. 1986. "Agency costs of free cash flow, corporate finance and takeovers." American Economic Review 76, 323–329.
Jung, K., Y. C. Kim, and R. M. Stulz. 1996. "Timing, investment opportunities, managerial discretion, and the security issue decision." Journal of Financial Economics 42, 159–185.
McConnell, J. J. and H. Servaes. 1990. "Additional evidence on equity ownership and corporate value." Journal of Financial Economics 27(2), 595–612.
Mehran, H. 1992. "Executive incentive plans, corporate control, and capital structure." Journal of Financial and Quantitative Analysis 27, 539–560.
Nesbitt, S. L. 1994. "Long-term rewards from shareholder activism: a study of the CalPERS effect." Journal of Applied Corporate Finance 6, 75–80.
Parrino, R., R. W. Sias, and L. T. Starks. 2003. "Voting with their feet: institutional ownership changes around forced CEO turnover." Journal of Financial Economics 68, 3–46.
Rajan, R. G. and L. Zingales. 1995. "What do we know about capital structure? Some evidence from international data." Journal of Finance 50, 1421–1460.
Shleifer, A. and R. Vishny. 1986. "Large shareholders and corporate control." Journal of Political Economy 94, 448–461.
Smith, M. 1996. "Shareholder activism by institutional investors: evidence from CalPERS." Journal of Finance 51, 227–252.
Stulz, R. 1988. "Managerial control of voting rights: financing policies and the market for corporate control." Journal of Financial Economics 20, 25–54.
Titman, S. and R. Wessels. 1988. "The determinants of capital structure choice." Journal of Finance 43(1), 1–19.
Zwiebel, J. 1996. "Dynamic capital structure under managerial entrenchment." American Economic Review 86(5), 1197–1215.
Appendix 56A Variable Definitions

LTD: Long-term debt divided by total assets.
ROA: Net income after tax divided by total assets.
SIZE: Natural logarithm of the book value of total assets.
MB: Market value of equity divided by book value of equity.
TANGIBLE: Net property, plant, and equipment divided by total assets.
FCF: (Cash flow from operations less capital expenditure less common dividends paid) divided by total assets.
CEODUAL: A dummy variable that equals 1 if the CEO is the chairman of the board of directors, and 0 otherwise.
TENURE: The number of years the CEO is in office.
OUTDIR: The number of outside directors divided by board size.
INSTIOWN: The proportion of common equity held by institutional investors.
Chapter 57
A Generalized Model for Optimum Futures Hedge Ratio Cheng-Few Lee, Jang-Yi Lee, Kehluh Wang, and Yuan-Chung Sheu
Abstract Under martingale and joint-normality assumptions, various optimal hedge ratios are identical to the minimum variance hedge ratio. As empirical studies usually reject the joint-normality assumption, we propose the generalized hyperbolic distribution as the joint log-return distribution of the spot and futures. Using the parameters in this distribution, we derive several widely used optimal hedge ratios: minimum variance, maximum Sharpe measure, and minimum generalized semivariance. Under mild assumptions on the parameters, we find that these hedge ratios are identical. Regarding the equivalence of these optimal hedge ratios, our analysis suggests that the martingale property plays a much more important role than the joint distribution assumption.

Keywords Optimal hedge ratio · Generalized hyperbolic distribution · Martingale property
57.1 Introduction

Because of their low transaction cost, high liquidity, high leverage, and ease of short position, stock index futures are among the most successful innovations in the financial markets. Besides speculative trading, they are widely used to hedge against the market risk of the spot position. One of the most important issues for investors and portfolio managers is to calculate the optimal futures hedge ratio, the proportion of the position taken in futures to the size of the spot, so that the risk exposure can be minimized. The optimal hedge ratios typically depend on the objective functions under consideration.

K. Wang () and Y.-C. Sheu, National Chiao-Tung University, Hsinchu City, Taiwan, ROC; e-mail: [email protected]; [email protected]
C.-F. Lee, Rutgers University, New Brunswick, NJ, USA; e-mail: [email protected]
J.-Y. Lee, Tunghai University, Taichung, Taiwan, ROC; e-mail: [email protected]

In the literature on futures hedging, there are two different types of objective functions: the risk function to be minimized and the utility function to be maximized. Johnson (1960) obtains the minimum variance hedge ratio by minimizing the variance of the change in the value of the hedged portfolio. On the other hand, as Adams and Montesi (1995) indicate, corporate managers are more concerned with downside risk than with upside variation. A measure of the downside risk is the generalized semivariance (GSV), where the risk is computed from the expectation of a power function of shortfalls from the target return (Bawa 1975, 1978; Fishburn 1977). De Jong et al. (1997) and Lien and Tse (1998, 2000, 2001) have calculated several GSV-minimizing hedge ratios. Regarding the utility function approach, we consider the Sharpe measure (SM) criterion; that is, the ratio of the portfolio's excess return to its volatility. Howard and D'Antonio (1984) formulate the optimal hedge ratio by maximizing the Sharpe measure. Normally, these optimal hedge ratios under different approaches are not the same. However, under the joint-normality and martingale assumptions, they are identical to the minimum variance hedge ratio. Unfortunately, many empirical studies indicate that major markets typically reject the joint-normality assumption (Fig. 57.1) (Chen, Lee and Shrestha 2001; Lien and Tse 1998). In particular, the fat-tail property of the return distribution affects hedging effectiveness substantially (Fig. 57.2). It would be useful to find out the nature of the optimal hedge ratios under more realistic assumptions.

Fig. 57.1 Normal density and Gaussian kernel density estimators
Fig. 57.2 Log-densities of daily log returns of major indices and futures (2000–2004)

In this paper we introduce the bivariate generalized hyperbolic distributions as alternative joint distributions for returns in the spot and futures markets (Fig. 57.3). Barndorff-Nielsen (1977, 1978) develops the generalized hyperbolic (GH) distributions as a mixture of the normal distribution and the generalized inverse Gaussian (GIG) distribution, first proposed in 1946 by Etienne Halphen. The class of the generalized hyperbolic distributions includes the hyperbolic distributions, the normal inverse Gaussian distributions, and the variance-gamma distributions, while the normal distribution is a limiting case of the generalized hyperbolic distributions. Use of the generalized hyperbolic distributions has been increasing in the finance literature. To model the log returns of some financial assets, Eberlein and Keller (1995) consider the hyperbolic distribution and Barndorff-Nielsen (1995) proposes the normal inverse Gaussian distribution. For more recent applications of the generalized hyperbolic distributions in finance, see Bibby and Sørensen (2003), Eberlein, Keller, and Prause (1998), Rydberg (1997, 1999), Küchler et al. (1999), and Bingham and Kiesel (2001).
that these hedge ratios are the same as the minimum variance hedge ratio. Based on the maximum likelihood estimation of the parameters and the numerical methods, we calculate and compare the different hedge ratios for TAIEX futures and S&P 500 futures (Fig. 57.4). The paper is divided into five sections. Section 57.2 first introduces the definitions and some basic properties for GIG and GH distributions. In Sect. 57.3, we study the optimal hedge ratios under different approaches and estimate these ratios in terms of the parameters for GH distributions. In Sect. 57.4, we discuss the kernel density estimators and MLE method for parameter estimation problem. The last section provides the concluding remarks. Proofs are relegated to the Appendix.
Fig. 57.3 Estimated symmetric H2 distributions
Fig. 57.4 Estimated hedge ratios
57.2 GIG and GH Distributions

57.2.1 The Generalized Hyperbolic Distributions

To introduce the generalized hyperbolic distribution, we first recall some basic properties of generalized inverse Gaussian (GIG) distributions. Note that for any $\delta,\gamma>0$ and $\lambda\in\mathbb{R}$, the function

$$d_{GIG(\lambda,\delta,\gamma)}(x) = \frac{(\gamma/\delta)^{\lambda}}{2K_{\lambda}(\delta\gamma)}\, x^{\lambda-1} e^{-\frac{1}{2}\left(\delta^{2}x^{-1}+\gamma^{2}x\right)}, \qquad x>0, \qquad (57.1)$$

is a probability density function on $(0,\infty)$. Here the function

$$K_{\lambda}(x) = \frac{1}{2}\int_{0}^{\infty} u^{\lambda-1} e^{-\frac{1}{2}x\left(u^{-1}+u\right)}\,du, \qquad x>0, \qquad (57.2)$$

is the Bessel function of the third kind with index $\lambda$. The distribution with the density function $d_{GIG(\lambda,\delta,\gamma)}(x)$ on the positive half-line is called a generalized inverse Gaussian (GIG) distribution with parameters $\lambda,\delta,\gamma$ and denoted by $GIG(\lambda,\delta,\gamma)$. The moment generating function of the generalized inverse Gaussian distribution is given by

$$M_{GIG(\lambda,\delta,\gamma)}(u) = \int_{0}^{\infty} e^{ux}\, d_{GIG(\lambda,\delta,\gamma)}(x)\,dx = \left(\frac{\gamma^{2}}{\gamma^{2}-2u}\right)^{\lambda/2} \frac{K_{\lambda}\!\left(\delta\sqrt{\gamma^{2}-2u}\right)}{K_{\lambda}(\delta\gamma)}$$

with the restriction $2u<\gamma^{2}$. From this, we obtain

$$E[GIG] = \frac{\delta\,K_{\lambda+1}(\delta\gamma)}{\gamma\,K_{\lambda}(\delta\gamma)}, \qquad Var[GIG] = \frac{\delta^{2}}{\gamma^{2}}\left[\frac{K_{\lambda+2}(\delta\gamma)}{K_{\lambda}(\delta\gamma)} - \left(\frac{K_{\lambda+1}(\delta\gamma)}{K_{\lambda}(\delta\gamma)}\right)^{2}\right].$$

Barndorff-Nielsen (1977) introduced the class of generalized hyperbolic (GH) distributions as mean-variance mixtures of normal distributions. More precisely, one says that a random variable Z has the generalized hyperbolic distribution $GH(\lambda,\alpha,\beta,\delta,\mu)$ if

$$Z \mid Y=y \sim N(\mu+\beta y,\, y),$$

where Y is a random variable with distribution $GIG(\lambda,\delta,\sqrt{\alpha^{2}-\beta^{2}})$ and $N(\mu+\beta y,\,y)$ denotes the normal distribution with mean $\mu+\beta y$ and variance y. From this, one can easily verify that the density function for $GH(\lambda,\alpha,\beta,\delta,\mu)$ is given by the formula

$$d_{GH(\lambda,\alpha,\beta,\delta,\mu)}(x) = \int_{0}^{\infty} d_{N(\mu+\beta y,\,y)}(x)\, d_{GIG(\lambda,\delta,\sqrt{\alpha^{2}-\beta^{2}})}(y)\,dy = \frac{(\gamma/\delta)^{\lambda}}{\sqrt{2\pi}\,K_{\lambda}(\delta\gamma)} \left(\frac{\sqrt{\delta^{2}+(x-\mu)^{2}}}{\alpha}\right)^{\lambda-\frac{1}{2}} K_{\lambda-\frac{1}{2}}\!\left(\alpha\sqrt{\delta^{2}+(x-\mu)^{2}}\right) e^{\beta(x-\mu)} \qquad (57.3)$$

where $\gamma=\sqrt{\alpha^{2}-\beta^{2}}$. The class of hyperbolic distributions is the subclass of GH distributions obtained when $\lambda$ is equal to 1. We write $H(\alpha,\beta,\delta,\mu)$ instead of $GH(1,\alpha,\beta,\delta,\mu)$. Using the fact that $K_{1/2}(z)=(\pi/2z)^{1/2}e^{-z}$, one obtains that the density for $H(\alpha,\beta,\delta,\mu)$ is

$$d_{H(\alpha,\beta,\delta,\mu)}(x) = \frac{\sqrt{\alpha^{2}-\beta^{2}}}{2\alpha\delta\,K_{1}\!\left(\delta\sqrt{\alpha^{2}-\beta^{2}}\right)}\, e^{-\alpha\sqrt{\delta^{2}+(x-\mu)^{2}}+\beta(x-\mu)}. \qquad (57.4)$$

The normal inverse Gaussian (NIG) distributions were introduced to finance in Barndorff-Nielsen (1995). It is the subclass of the generalized hyperbolic distributions obtained for $\lambda$ equal to $-1/2$. The density of the NIG distribution is given by

$$d_{NIG(\alpha,\beta,\delta,\mu)}(x) = \frac{\alpha\delta}{\pi}\, e^{\delta\sqrt{\alpha^{2}-\beta^{2}}+\beta(x-\mu)}\, \frac{K_{1}\!\left(\alpha\sqrt{\delta^{2}+(x-\mu)^{2}}\right)}{\sqrt{\delta^{2}+(x-\mu)^{2}}}.$$
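Since everything that follows is driven by Bessel-function ratios, formulas (57.1)–(57.2) and the GIG moment expressions are easy to check numerically; the sketch below (Python with SciPy, our illustrative choice, with arbitrary test parameters) evaluates the GIG density with scipy.special.kv and confirms that it integrates to one and that its numerical mean matches the Bessel-ratio formula for E[GIG].

```python
# Numerical check of the GIG density (57.1) and the moment formulas
# E[GIG] and Var[GIG] derived from the moment generating function.
import numpy as np
from scipy.special import kv          # Bessel function K_lambda, cf. (57.2)
from scipy.integrate import quad

lam, delta, gamma = 1.0, 0.8, 1.5      # arbitrary test parameters

def dgig(x):
    c = (gamma / delta) ** lam / (2.0 * kv(lam, delta * gamma))
    return c * x ** (lam - 1) * np.exp(-0.5 * (delta**2 / x + gamma**2 * x))

total, _ = quad(dgig, 0, np.inf)                      # should be ~1
mean_num, _ = quad(lambda x: x * dgig(x), 0, np.inf)  # numerical E[GIG]

r = delta * gamma
mean_bessel = (delta / gamma) * kv(lam + 1, r) / kv(lam, r)
var_bessel = (delta / gamma) ** 2 * (kv(lam + 2, r) / kv(lam, r)
                                     - (kv(lam + 1, r) / kv(lam, r)) ** 2)
print(total, mean_num, mean_bessel, var_bessel)
```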
57.2.2 Multivariate Modeling
tor Z has the multivariate generalized hyperbolic distribution MGH.; ˛; ˇ; ı; ; / with parameters .; ˛; ˇ; ı; ; / if:
In finance one does not look at a single asset, but at a bunch of assets. Since the assets in the market are typically highly correlated, it is natural to use multivariate distributions. A straightforward way for introducing multivariate generalized hyperbolic (MGH) distributions is via the mixtures of multivariate Normal distributions with the generalized inverse Gaussian distributions. In fact the multivariate generalized hyperbolic distributions were introduced and investigated in Barndorff-Nielsen (1978). Let be a symmetric positive-definite d x d-matrix with determinant j j D 1. Assume that 2 R; ˇ; 2 Rd ; ı > 0, and ˛ 2 > ˇ 0 ˇ. We say that a d-dimensional random vec-
dM GH.x/
where Cd D
Œ.˛ 2 ˇ 0 ˇ/=ı 2 =2
p
ZjY D y Nd . y ˇ; y /; where Nd.A; B/ denotes the d-dimensional Normal distribution with mean vector p A and covariance matrix B, and Y distribution as GIG.; ı; ˛ 2 ˇ 0 ˇ/. Here we notice that the generalized hyperbolic distributions are symmetric if and only if ˇ D .0; : : : ; 0/0 . For D .d C 1/=2 we obtain the multivariate hyperbolic distributions. For D 1=2 we obtain the multivariate normal inverse Gaussian distribution. The density function of the distribution MGH.; ˛; ˇ; ı; ; / is given by the formula
p
Kd=2 ˛ ı 2 C .x /0 1 .x / .ˇ 0 .x// D cd
d=2 e p ˛ 1 ı 2 C .x /0 1 .x /
(57.5)
where rQp D .P1 P0 /=P0 and rQf D ..F1 F0 /=F0 are oneperiod returns on the spot and futures positions, respectively. of MGH are given by h D X=Q is the hedge ratio and D h.F0 =P0 /. (Note that is so-called the adjusted hedge ratio.) E ŒMGH .; ˛; ˇ; ı:; / D C ˇEŒGIG.; ı; / ; The main objective of hedging is to choose the optimal (57.6) hedge ratio . However the optimal hedge ratio will depend on a particular objective function to be optimized. We recall VarŒMGH.; ˛; ˇ; ı; ; / D EŒGIG .; ı; / some most widely used theoretical approaches to the optimal C ˇˇ 0 Var ŒGIG.; ı; / futures hedge ratios and compute explicitly these optimal ra(57.7) tios in terms of the parameters for MGH distributions. For a comprehensive review of futures hedge ratios, see Chen et al. p 2 0 where D ˛ ˇ ˇ. (For details, see, e.g., Blæsid (2003). 1981.) .2 /d=2 K .ı
˛ 2 ˇ 0 ˇ/
. The mean and covariance
57.3.1 Minimum Variance Hedge Ratio
57.3 Futures Hedge Ratios We consider a decision maker. At the decision date .t D 0/, the agent engages in the production of Q.Q > 0/ commodity units for sale at the terminal date .t D 1/ at the random cash price P1. In addition, at the decision date the agent can sell X commodity units in the futures market at the price F0 , but must repurchase them back at the terminal date at the random futures price F1 . Let the initial wealth be V0 D P0 Q and the end-of-period wealth be V1 D P1 Q C .F0 F1 /X . Then we consider the wealth return that is V1 V0 P1 Q C F0 X F1 X P0 Q D V0 P0 Q
P1 P0 F1 F0 F0 X D rQp rQf D P0 F0 P0 Q
rQ D
(57.8)
The most widely used hedge ratio is the minimum variance hedge ratio, which is known as the MV hedge ratio. The objective function to be minimized is the variance of rQ . Clearly; we have
VarŒQr D r2p C 2 r2f 2rp rf ;
Where rp and rf are standard deviations of rQp and rQ;f respectively, and is the correlation coefficient between rQp and rQf . The MV hedge ratio is obtained by minimizing VarŒQr . Simple calculation shows that the MV hedge ratio is given by rp 0 M : (57.9) V D rf
878
C.-F. Lee et al.
Theorem 1. Assume $(\tilde r_f, \tilde r_p)'$ is distributed as $MGH(\lambda, \alpha, \beta, \delta, \mu, \Delta)$, where $\beta = (\beta_1, \beta_2)'$, $\mu = (\mu_1, \mu_2)'$, and

$$\Delta = \begin{pmatrix} \Delta_{11} & \Delta_{12} \\ \Delta_{21} & \Delta_{22} \end{pmatrix}$$

is symmetric. Then we have

$$\theta_{MV} = \frac{\Delta_{12}\, E[GIG] + \delta_{fp}}{\Delta_{11}\, E[GIG] + \delta_{ff}}, \qquad (57.10)$$

where $GIG = GIG(\lambda, \delta, \sqrt{\alpha^2 - \beta'\Delta\beta})$ and

$$\delta_{ff} = \left(\beta_1^2\Delta_{11}^2 + 2\beta_1\beta_2\Delta_{11}\Delta_{12} + \beta_2^2\Delta_{12}^2\right)\mathrm{Var}[GIG],$$
$$\delta_{fp} = \left[\beta_1^2\Delta_{11}\Delta_{12} + \beta_1\beta_2\left(\Delta_{11}\Delta_{22} + \Delta_{12}^2\right) + \beta_2^2\Delta_{12}\Delta_{22}\right]\mathrm{Var}[GIG].$$

In particular, if $\beta = (0, \ldots, 0)'$, then $\theta_{MV} = \Delta_{12}/\Delta_{11}$.
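A sketch of Equation (57.10) in code (again an added illustration; the parameter values are assumptions):

```python
import numpy as np
from scipy.special import kv

def theta_mv_mgh(lam, alpha, beta, delta, Delta):
    """Minimum variance hedge ratio of Theorem 1 for (r_f, r_p)."""
    beta, Delta = np.asarray(beta, float), np.asarray(Delta, float)
    gamma = np.sqrt(alpha**2 - beta @ Delta @ beta)
    b = delta * gamma
    k0, k1, k2 = (kv(lam + j, b) for j in range(3))
    e_gig = (delta / gamma) * k1 / k0
    v_gig = (delta / gamma) ** 2 * (k2 / k0 - (k1 / k0) ** 2)
    Db = Delta @ beta                     # the vector (Delta beta)
    d_ff = Db[0] ** 2 * v_gig             # delta_ff
    d_fp = Db[0] * Db[1] * v_gig          # delta_fp
    return (Delta[0, 1] * e_gig + d_fp) / (Delta[0, 0] * e_gig + d_ff)

theta = theta_mv_mgh(lam=1.0, alpha=3.0, beta=[0.2, 0.1], delta=1.0,
                     Delta=[[1.0, 0.4], [0.4, 1.0]])
```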
57.3.2 Sharpe Hedge Ratio

We consider the optimal hedge ratio that incorporates both risk and expected return. Howard and D'Antonio (1984) considered the optimal level of futures contracts by maximizing the ratio of the portfolio's excess return to its volatility; that is,

$$\max_{\theta}\; \frac{\mu_{r_p} - \theta\mu_{r_f} - r_L}{\sigma_\theta}, \qquad (57.11)$$

where $\sigma_\theta$ is the standard deviation of $\tilde r$, $\mu_{r_p}$ and $\mu_{r_f}$ are the expected values of $\tilde r_p$ and $\tilde r_f$, respectively, and $r_L$ is the risk-free interest rate. Consider the function

$$r(\theta) = \frac{\mu_{r_p} - \theta\mu_{r_f} - r_L}{\sigma_\theta}.$$

Then we have

$$r'(\theta) = -\frac{\bigl[\sigma_{r_f}^2(\mu_{r_p} - r_L) - \mu_{r_f}\sigma_{r_f r_p}\bigr]\theta + \bigl[\mu_{r_f}\sigma_{r_p}^2 - (\mu_{r_p} - r_L)\sigma_{r_f r_p}\bigr]}{\sigma_\theta^3}, \qquad (57.12)$$

where $\sigma_{r_f r_p} = \mathrm{Cov}(\tilde r_p, \tilde r_f)$ and, hence, the critical point of $r(\theta)$ is the value $\theta_s$ given below in Equation (57.16). It follows from Equation (57.12) that if $\mu_{r_p} - r_L > (\mu_{r_f}/\sigma_{r_f}^2)\,\sigma_{r_f r_p}$, then $r'(\theta) > 0$ for $\theta < \theta_s$ and $r'(\theta) < 0$ for $\theta > \theta_s$. Hence $\theta_s$ is the optimal hedge ratio (the Sharpe hedge ratio) for Equation (57.11). Similarly, if $\mu_{r_p} - r_L < (\mu_{r_f}/\sigma_{r_f}^2)\,\sigma_{r_f r_p}$, then $r(\theta)$ has a minimum at $\theta_s$. (Note that if $\mu_{r_p} - r_L = (\mu_{r_f}/\sigma_{r_f}^2)\,\sigma_{r_f r_p}$, then $r(\theta)$ is strictly monotonic in $\theta$.)

The measure of hedging effectiveness (abbreviated HE) is given in Howard and D'Antonio (1984) by

$$HE = \frac{r(\theta_s)}{(\mu_{r_p} - r_L)/\sigma_{r_p}}. \qquad (57.13)$$

Write

$$\lambda = \frac{\mu_{r_f}/\sigma_{r_f}}{(\mu_{r_p} - r_L)/\sigma_{r_p}} \qquad (57.14)$$

($\lambda$ is the so-called risk-return relative). Then we have

$$\theta_s = \frac{\sigma_{r_p}}{\sigma_{r_f}}\cdot\frac{\rho - \lambda}{1 - \rho\lambda} \qquad (57.15)$$

and

$$HE = \sqrt{\frac{(\lambda - \rho)^2}{1 - \rho^2} + 1}.$$

Clearly, the last equality implies that

$$HE > 1 \quad \text{when } \lambda \neq \rho, \qquad HE = 1 \quad \text{when } \lambda = \rho.$$

Moreover, without any distributional assumption, we have the following relationship between $\theta_s$ and $\theta_{MV}$. In particular, if the expected return on the futures contract is zero and $\mu_{r_p} > r_L$, then the Sharpe hedge ratio reduces to the minimum variance hedge ratio.

Proposition 1. Assume $\mu_{r_p} > r_L$ and $1 > \rho\lambda$. Then we have

$$\theta_s > \theta_{MV} \quad \text{when } \mu_{r_f} < 0, \qquad \theta_s = \theta_{MV} \quad \text{when } \mu_{r_f} = 0, \qquad \theta_s < \theta_{MV} \quad \text{when } 0 < \mu_{r_f}.$$

Recall that $\sigma_{r_f r_p} = \mathrm{Cov}(\tilde r_p, \tilde r_f)$. Then we have

$$\theta_s = \frac{\sigma_{r_p}^2\,\mu_{r_f} - \sigma_{r_f r_p}(\mu_{r_p} - r_L)}{\sigma_{r_f r_p}\,\mu_{r_f} - \sigma_{r_f}^2(\mu_{r_p} - r_L)}. \qquad (57.16)$$
From this and by Equations (57.6) and (57.7), we obtain the following result.

Theorem 2. Assume $(\tilde r_f, \tilde r_p)'$ is distributed as in Theorem 1. Assume that

$$\sigma_{fp}\bigl[\mu_1 + (\beta_1\Delta_{11} + \beta_2\Delta_{12})E[GIG]\bigr] < \sigma_{ff}\bigl[\mu_2 + (\beta_1\Delta_{21} + \beta_2\Delta_{22})E[GIG] - r_L\bigr].$$

Then we have

$$\theta_s = \frac{\sigma_{pp}\bigl[\mu_1 + (\beta_1\Delta_{11} + \beta_2\Delta_{12})E[GIG]\bigr] - \sigma_{fp}\bigl[\mu_2 + (\beta_1\Delta_{21} + \beta_2\Delta_{22})E[GIG] - r_L\bigr]}{\sigma_{fp}\bigl[\mu_1 + (\beta_1\Delta_{11} + \beta_2\Delta_{12})E[GIG]\bigr] - \sigma_{ff}\bigl[\mu_2 + (\beta_1\Delta_{21} + \beta_2\Delta_{22})E[GIG] - r_L\bigr]}, \qquad (57.17)$$

where $\delta_{ff}$, $\delta_{fp}$, and $GIG$ are the same as in Theorem 1 and

$$\delta_{pp} = \bigl(\beta_1^2\Delta_{21}^2 + 2\beta_1\beta_2\Delta_{21}\Delta_{22} + \beta_2^2\Delta_{22}^2\bigr)\,\mathrm{Var}[GIG],$$
$$\sigma_{ff} = \Delta_{11}E[GIG] + \delta_{ff}, \qquad \sigma_{fp} = \Delta_{12}E[GIG] + \delta_{fp}, \qquad \sigma_{pp} = \Delta_{22}E[GIG] + \delta_{pp}.$$
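As an added sketch of Theorem 2, the ratio (57.17) can equivalently be obtained by computing the MGH mean and covariance via Equations (57.6) and (57.7) and plugging them into (57.16); $r_L$ and the MGH parameters below are illustrative assumptions.

```python
import numpy as np
from scipy.special import kv

def mgh_mean_cov(lam, alpha, beta, delta, mu, Delta):
    beta, Delta, mu = (np.asarray(a, float) for a in (beta, Delta, mu))
    gamma = np.sqrt(alpha**2 - beta @ Delta @ beta)
    b = delta * gamma
    k0, k1, k2 = (kv(lam + j, b) for j in range(3))
    m = (delta / gamma) * k1 / k0
    v = (delta / gamma) ** 2 * (k2 / k0 - (k1 / k0) ** 2)
    Db = Delta @ beta
    return mu + m * Db, m * Delta + v * np.outer(Db, Db)

def theta_sharpe(mean, cov, r_l):
    """Equation (57.16) for the vector (r_f, r_p)."""
    mu_f, mu_p = mean
    var_f, var_p, c_fp = cov[0, 0], cov[1, 1], cov[0, 1]
    return ((var_p * mu_f - c_fp * (mu_p - r_l))
            / (c_fp * mu_f - var_f * (mu_p - r_l)))

mean, cov = mgh_mean_cov(1.0, 3.0, [0.0, 0.2], 1.0, [0.0, 0.001],
                         [[1.0, 0.4], [0.4, 1.0]])
theta_s = theta_sharpe(mean, cov, r_l=0.0005)
```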
57.3.3 Minimum Generalized Semivariance Hedge Ratio

In this case the optimal hedge ratio is obtained by minimizing the generalized semivariance (GSV) given below:

$$L_n(c, X) = \int_{-\infty}^{c} (c - x)^n\, dF(x), \qquad n > 0, \qquad (57.18)$$

where $F(\cdot)$ is the probability distribution function of the return $X$. The GSV is specified by two parameters: the target return $c$ and the power of the shortfall $n$. (Note that if the density function of $X$ is symmetric at $c$, then we obtain $L_2(c, X) = \mathrm{Var}(X)/2$. Hence in this case the GSV approach is the same as that of the minimum variance.) The GSV, due to its emphasis on the returns below the target return, is consistent with the risk perceived by managers (see Lien and Tse 2001). For the futures hedge, we consider $L_n(c, \theta) = L_n(c, \tilde r_p - \theta\tilde r_f)$. Under some conditions on the joint distribution, we obtain that the minimum GSV hedge ratio is the same as the minimum variance hedge ratio.

Theorem 3. Assume $(\tilde r_f, \tilde r_p)$ is the same as in Theorem 1. If $\beta = 0$ and $\mu_1 = \mu_{r_f} = 0$, then the minimum GSV hedge ratio is the same as the minimum variance hedge ratio (i.e., $\theta_{GSV} = \theta_{MV} = \Delta_{12}/\Delta_{11}$).

In empirical studies, the true distribution is unknown or complicated. Then $\theta_{GSV}$ can be estimated from the sample by using the so-called empirical distribution method adopted in, for example, Price et al. (1982) and Harlow (1991). Suppose we have $m$ observations of $(\tilde r_f, \tilde r_p)$, say, $(r_f(i), r_p(i))$, $i = 1, 2, \ldots, m$. From these, the GSV can be estimated by the formula

$$L_n^{obs}(c, \theta) = \frac{1}{m}\sum_{i=1}^{m} (c - r_{i,\theta})^n\, I_{\{r_{i,\theta} \le c\}}, \qquad (57.19)$$

where $r_{i,\theta} = r_p(i) - \theta r_f(i)$. Given $c$ and $n$, numerical methods can be used to search for the hedge ratio $\theta$ that minimizes the sample GSV, $L_n^{obs}(c, \theta)$.
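The estimator (57.19) and the grid search are straightforward to code; the sketch below (simulated data, with the target $c$, power $n$, and grid all chosen as assumptions) illustrates the procedure.

```python
import numpy as np

def gsv(c, n, theta, r_f, r_p):
    """Sample generalized semivariance L_n^obs(c, theta) of (57.19)."""
    r = r_p - theta * r_f
    return np.where(r <= c, (c - r) ** n, 0.0).mean()

rng = np.random.default_rng(1)
r_f = 0.01 * rng.standard_t(df=5, size=2000)   # fat-tailed futures returns
r_p = 0.8 * r_f + 0.004 * rng.standard_t(df=5, size=2000)

grid = np.linspace(0.0, 2.0, 401)
values = [gsv(c=0.0, n=2, theta=t, r_f=r_f, r_p=r_p) for t in grid]
theta_gsv = grid[int(np.argmin(values))]
```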
57.4 Estimation and Simulation

57.4.1 Kernel Density Estimators

Assume that we have $n$ independent observations $x_1, \ldots, x_n$ from the random variable $X$ with the unknown density function $f$. The kernel density estimator for the estimation of $f$ is given by

$$\hat f_h(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right), \qquad x \in \mathbb{R}, \qquad (57.20)$$

where $K$ is a so-called kernel function and $h$ is the bandwidth. In this paper we work with the Gaussian kernel $K(x) = (1/\sqrt{2\pi})\exp\{-x^2/2\}$ and the bandwidth $h = (4/3)^{1/5}\,\hat\sigma\, n^{-1/5}$, where $\hat\sigma$ is the sample standard deviation. (For more details, see Scott 1979.) Meanwhile it is worth noting that Lien and Tse (2000) proposed the kernel density estimation method to estimate the probability distribution of the portfolio return for every $\theta$; a grid search method was then adopted to find the optimum GSV hedge ratio.
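A self-contained sketch of the estimator (57.20) with this kernel and bandwidth (the sample is simulated for illustration):

```python
import numpy as np

def kde_gaussian(x_grid, data):
    """Gaussian-kernel density estimate; bandwidth (4/3)^(1/5) * s * n^(-1/5)."""
    data = np.asarray(data, float)
    n = data.size
    h = (4.0 / 3.0) ** 0.2 * data.std(ddof=1) * n ** (-0.2)
    u = (x_grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(2)
sample = rng.normal(size=500)
grid = np.linspace(-4.0, 4.0, 201)
density = kde_gaussian(grid, sample)
```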
57.4.2 Maximum Likelihood Estimation

We focus on how to estimate the parameters of a density function $f(x; \Theta)$, where $\Theta$ is the set of parameters to be estimated. Suppose that we have $n$ independent observations $x_1, \ldots, x_n$ of a random variable $X$ with the density function $f(x; \Theta)$. The maximum likelihood estimator $\hat\Theta_{MLE}$ is the parameter set that maximizes the likelihood function

$$L(\Theta) = \prod_{i=1}^{n} f(x_i; \Theta).$$
Clearly, this is equivalent to maximizing the logarithm of the likelihood function:

$$\log L(\Theta) = \sum_{i=1}^{n} \log f(x_i; \Theta).$$

The log-likelihood function for the hyperbolic distribution $H(\alpha, \beta, \delta, \mu)$ is given by

$$\ell_{H(\alpha,\beta,\delta,\mu)}(\Theta) = n\left[\tfrac{1}{2}\log(\alpha^2 - \beta^2) - \log 2 - \log\alpha - \log\delta - \log K_1\!\left(\delta\sqrt{\alpha^2 - \beta^2}\right)\right] + \sum_{i=1}^{n}\left[-\alpha\sqrt{\delta^2 + (x_i - \mu)^2} + \beta(x_i - \mu)\right].$$

The symmetric MGH density function is given by the formula

$$d_{MGH}(x) = \frac{(\alpha/\delta)^{\lambda}}{(2\pi)^{d/2}\, K_{\lambda}(\alpha\delta)}\cdot \frac{K_{\lambda - d/2}\!\left(\alpha\sqrt{\delta^2 + (x-\mu)'\Delta^{-1}(x-\mu)}\right)}{\left(\alpha^{-1}\sqrt{\delta^2 + (x-\mu)'\Delta^{-1}(x-\mu)}\right)^{d/2-\lambda}}$$

and, in particular, the two-dimensional symmetric hyperbolic distribution (i.e., $\beta = 0$ and $\lambda = 3/2$) has the density (see Fig. 57.3)

$$h_2(x) = \frac{(\alpha/\delta)^{3/2}}{2^{3/2}\sqrt{\pi}\,\alpha\, K_{3/2}(\alpha\delta)}\; e^{-\alpha\sqrt{\delta^2 + (x-\mu)'\Delta^{-1}(x-\mu)}}.$$

From this, we obtain the log-likelihood function for two-dimensional symmetric hyperbolic distributions:

$$\ell_{H_2} = n\left[\tfrac{3}{2}\log\frac{\alpha}{\delta} - \tfrac{3}{2}\log 2 - \tfrac{1}{2}\log\pi - \log\alpha - \log K_{3/2}(\alpha\delta)\right] - \alpha\sum_{i=1}^{n}\sqrt{\delta^2 + (x_i - \mu)'\Delta^{-1}(x_i - \mu)}.$$
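As an added sketch, the one-dimensional hyperbolic log-likelihood written above can be maximized numerically; the optimizer, its starting values, and the stand-in data are all assumptions of this illustration rather than the estimation settings used in the chapter.

```python
import numpy as np
from scipy.special import kv
from scipy.optimize import minimize

def neg_loglik(params, x):
    alpha, beta, delta, mu = params
    if delta <= 0 or alpha <= abs(beta):
        return np.inf                      # outside the parameter space
    g = np.sqrt(alpha**2 - beta**2)
    const = (0.5 * np.log(alpha**2 - beta**2) - np.log(2.0) - np.log(alpha)
             - np.log(delta) - np.log(kv(1, delta * g)))
    ll = x.size * const + np.sum(-alpha * np.sqrt(delta**2 + (x - mu) ** 2)
                                 + beta * (x - mu))
    return -ll

rng = np.random.default_rng(3)
x = 0.01 * rng.standard_t(df=6, size=1000)   # stand-in daily returns
res = minimize(neg_loglik, x0=[150.0, 0.0, 0.01, 0.0], args=(x,),
               method="Nelder-Mead")
alpha_hat, beta_hat, delta_hat, mu_hat = res.x
```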
57.4.3 Simulation of Generalized Hyperbolic Random Variables

From the representation of the GH distribution as a conditional normal distribution mixed with the generalized inverse Gaussian, a schematic representation of the algorithm reads as follows.

1. Sample $Y$ from the $GIG(\lambda, \delta, \gamma)$ distribution.
2. Sample $\varepsilon$ from $N(0, 1)$.
3. Return $X = \mu + \beta Y + \sqrt{Y}\,\varepsilon$.

Similarly, for simulating an MGH distributed random vector, we have:

1. Set $\Delta = L^{T}L$ via Cholesky decomposition.
2. Sample $Y$ from the $GIG(\lambda, \delta, \gamma)$ distribution.
3. Sample $Z$ from $N(0, I)$, where $I$ is the $d \times d$ identity matrix.
4. Return $X = \mu + Y\Delta\beta + \sqrt{Y}\,L^{T}Z$.

The efficiency of the above algorithms depends on the method of sampling the generalized inverse Gaussian distributions. Atkinson (1982) applied a rejection algorithm to sampling GIG. We adopt that method for simulation of estimated hyperbolic random variables.
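The chapter relies on Atkinson's rejection method for the GIG draws; as an added alternative sketch, SciPy's geninvgauss can play the same role, since geninvgauss(p, b) with scale $\delta/\gamma$ and $b = \delta\gamma$ coincides with $GIG(\lambda, \delta, \gamma)$. All parameters below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import geninvgauss

def sample_mgh(n, lam, alpha, beta, delta, mu, Delta, rng):
    """Mixture algorithm above: X = mu + Y*Delta*beta + sqrt(Y)*L'Z."""
    beta, Delta, mu = (np.asarray(a, float) for a in (beta, Delta, mu))
    gamma = np.sqrt(alpha**2 - beta @ Delta @ beta)
    y = geninvgauss.rvs(lam, delta * gamma, scale=delta / gamma,
                        size=n, random_state=rng)      # GIG(lam, delta, gamma)
    L = np.linalg.cholesky(Delta)                      # Delta = L L'
    z = rng.standard_normal((n, len(mu)))              # N(0, I) rows
    return mu + y[:, None] * (Delta @ beta) + np.sqrt(y)[:, None] * (z @ L.T)

rng = np.random.default_rng(4)
draws = sample_mgh(10_000, lam=1.0, alpha=2.0, beta=[0.1, 0.0],
                   delta=1.0, mu=[0.0, 0.0], Delta=np.eye(2), rng=rng)
```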
57.5 Conclusion

Although there are many different theoretical approaches to the optimal futures hedge ratio, under the martingale and joint-normality assumptions the various optimal hedge ratios are identical to the minimum variance hedge ratio. However, empirical studies show that major market data reject the joint-normality assumption. In this paper we propose the generalized hyperbolic distribution as the joint log-return distribution of the spot and futures. In terms of the parameters of generalized hyperbolic distributions, we obtain several widely used optimal hedge ratios: minimum variance, maximum Sharpe measure, and minimum generalized semivariance. In particular, under mild assumptions on the parameters, we show that these theoretical approaches are equivalent.

To estimate these optimal hedge ratios, we first write down the log-likelihood functions for symmetric hyperbolic distributions. Then we calculate the parameters by maximizing the log-likelihood functions. Using these MLE parameters for the GH distributions, we obtain the MV hedge ratio and the optimal Sharpe hedge ratio. Also, based on the MLE parameters and numerical methods, we calculate the minimum generalized semivariance hedge ratio. Regarding the equivalence of these three optimal hedge ratios, our results suggest that the martingale property plays a much more important role than the joint distribution assumption.

However, conditional heteroskedasticity and stochastic volatility are observed in many spot and futures price series. This implies that the optimal hedge strategy should be time-dependent. To account for this dynamic property, parametric specifications of the joint distribution are required. Based on our work here, it would be interesting to extend the results to time-varying hedge ratios.
References

Adams, J. and C. J. Montesi. 1995. Major issues related to hedge accounting, Financial Accounting Standards Board, Norwalk, CT.
Atkinson, A. C. 1982. "The simulation of generalized inverse Gaussian and hyperbolic random variables." SIAM Journal on Scientific and Statistical Computing 3, 502–515.
Barndorff-Nielsen, O. E. 1977. "Exponentially decreasing distributions for the logarithm of particle size." Proceedings of the Royal Society London A 353, 401–419.
Barndorff-Nielsen, O. E. 1978. "Hyperbolic distributions and distributions on hyperbolae." Scandinavian Journal of Statistics 5, 151–157.
Barndorff-Nielsen, O. E. 1995. "Normal inverse Gaussian distributions and the modeling of stock returns." Research Report no. 300, Department of Theoretical Statistics, Aarhus University.
Bawa, V. S. 1975. "Optimal rules for ordering uncertain prospects." Journal of Financial Economics 2, 95–121.
Bawa, V. S. 1978. "Safety-first, stochastic dominance, and optimal portfolio choice." Journal of Financial and Quantitative Analysis 13, 255–271.
Bibby, B. M. and M. Sørensen. 2003. "Hyperbolic processes in finance," in Handbook of Heavy Tailed Distributions in Finance, S. T. Rachev (Ed.). Elsevier Science, Amsterdam, pp. 211–248.
Bingham, N. H. and R. Kiesel. 2001. "Modelling asset returns with hyperbolic distributions," in Return Distributions in Finance, J. Knight and S. Satchell (Eds.). Butterworth-Heinemann, Oxford, pp. 1–20.
Blæsild, P. 1981. "The two-dimensional hyperbolic distribution and related distributions, with an application to Johannsen's bean data." Biometrika 68, 251–263.
Chen, S. S., C. F. Lee, and K. Shrestha. 2001. "On a mean-generalized semivariance approach to determining the hedge ratio." Journal of Futures Markets 21, 581–598.
Chen, S. S., C. F. Lee, and K. Shrestha. 2003. "Futures hedge ratios: a review." The Quarterly Review of Economics and Finance 43, 433–465.
De Jong, A., F. De Roon, and C. Veld. 1997. "Out-of-sample hedging effectiveness of currency futures for alternative models and hedging strategies." Journal of Futures Markets 17, 817–837.
Eberlein, E. and U. Keller. 1995. "Hyperbolic distributions in finance." Bernoulli 1, 281–299.
Eberlein, E., U. Keller, and K. Prause. 1998. "New insights into smile, mispricing and value at risk: the hyperbolic model." Journal of Business 71, 371–406.
Fishburn, P. C. 1977. "Mean-risk analysis with risk associated with below-target returns." American Economic Review 67, 116–126.
Harlow, W. V. 1991. "Asset allocation in a downside-risk framework." Financial Analysts Journal 47, 28–40.
Howard, C. T. and L. J. D'Antonio. 1984. "A risk-return measure of hedging effectiveness." Journal of Financial and Quantitative Analysis 19, 101–112.
Johnson, L. L. 1960. "The theory of hedging and speculation in commodity futures." Review of Economic Studies 27, 139–151.
Küchler, U., K. Neumann, M. Sørensen, and A. Streller. 1999. "Stock returns and hyperbolic distributions." Mathematical and Computer Modelling 29, 1–15.
Lien, D. and Y. K. Tse. 1998. "Hedging time-varying downside risk." Journal of Futures Markets 18, 705–722.
Lien, D. and Y. K. Tse. 2000. "Hedging downside risk with futures contracts." Applied Financial Economics 10, 163–170.
Lien, D. and Y. K. Tse. 2001. "Hedging downside risk: futures vs. options." International Review of Economics and Finance 10, 159–169.
Price, K., B. Price, and T. J. Nantell. 1982. "Variance and lower partial moment measures of systematic risk: some analytical and empirical results." Journal of Finance 37, 843–855.
Rydberg, T. H. 1997. "The normal inverse Gaussian Lévy process: simulation and approximation." Communications in Statistics: Stochastic Models 13, 887–910.
Rydberg, T. H. 1999. "Generalized hyperbolic diffusion processes with applications in finance." Mathematical Finance 9, 183–201.
Scott, D. W. 1979. "On optimal and data-based histograms." Biometrika 66, 605–610.
Appendix 57A

Proof of Theorem 1. The second statement follows from the first one. We prove the first statement. By Equation (57.7), we obtain

$$\mathrm{Cov}(\tilde r_f, \tilde r_p) = \Delta_{12}E[GIG] + \delta_{fp} \quad \text{and} \quad \sigma_{r_f}^2 = \Delta_{11}E[GIG] + \delta_{ff}.$$

Then our result follows by plugging these into the formula $\theta_{MV} = \mathrm{Cov}(\tilde r_f, \tilde r_p)/\sigma_{r_f}^2$.

Proof of Proposition 1. Assume that $\mu_{r_p} > r_L$. Then $1 > \rho\lambda$ if and only if $\sigma_{r_f}^2(\mu_{r_p} - r_L) > \mu_{r_f}\sigma_{r_f r_p}$. Also, $\lambda$ has the same sign as that of $\mu_{r_f}$. From this, we observe that

$$\frac{\rho - \lambda}{1 - \rho\lambda} \;\;\begin{cases} > \rho & \text{when } \mu_{r_f} < 0 \\ = \rho & \text{when } \mu_{r_f} = 0 \\ < \rho & \text{when } 0 < \mu_{r_f}. \end{cases}$$

Therefore, if $\mu_{r_f} > 0$, then

$$\theta_s = \frac{\sigma_{r_p}}{\sigma_{r_f}}\cdot\frac{\rho - \lambda}{1 - \rho\lambda} < \frac{\sigma_{r_p}}{\sigma_{r_f}}\,\rho = \theta_{MV}.$$

The other cases follow similarly.

Proof of Theorem 3. Write $L_n(c, \theta) = L_n(c, \tilde r_p - \theta\tilde r_f)$. Assume that $n > 1$ and that $h(r_f, r_p)$ is a joint density function of $\tilde r_f$ and $\tilde r_p$. Then we have

$$L_n(c, \theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{c + \theta r_f} (c - r_p + \theta r_f)^n\, h(r_f, r_p)\, dr_p\, dr_f.$$

Simple calculation gives

$$\frac{\partial}{\partial\theta} L_n(c, \theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{c + \theta r_f} n r_f\, (c - r_p + \theta r_f)^{n-1} h(r_f, r_p)\, dr_p\, dr_f$$

and

$$\frac{\partial^2}{\partial\theta^2} L_n(c, \theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{c + \theta r_f} n(n-1)\, r_f^2\, (c - r_p + \theta r_f)^{n-2} h(r_f, r_p)\, dr_p\, dr_f.$$

Since $\partial^2 L_n(c, \theta)/\partial\theta^2 > 0$, the minimum of $L_n(c, \theta)$ occurs at the unique critical point (if it exists). If the futures and spot returns are jointly normally distributed and if the futures price is unbiased (i.e., the expected futures price change is zero), Lien and Tse (1998) showed that the minimum GSV hedge ratio is the same as the minimum variance hedge ratio. For later application, we summarize their arguments here. Suppose that $(\tilde r_f, \tilde r_p)$ is bivariate normally distributed. The joint density $h(r_f, r_p)$ is characterized by the means $\mu_{r_i} = E(\tilde r_i)$, $i = p, f$, and by the covariances $\sigma_{r_i r_j} = \mathrm{Cov}(\tilde r_i, \tilde r_j)$, $i, j = p, f$. Write $x = c + \theta r_f - r_p$ and $y = r_f$. Then we have

$$\frac{\partial}{\partial\theta} L_n(c, \theta) = \int_0^{\infty}\int_{-\infty}^{\infty} n x^{n-1}\, y\, h(y, c - x + \theta y)\, dy\, dx. \qquad (57.21)$$

The function $h(y, c - x + \theta y)$ can be decomposed as $\frac{1}{2\pi\Lambda}\, e^{-\frac{1}{2}(A + B)}$, where $\Lambda^2 = \sigma_{r_p}^2\sigma_{r_f}^2 - \sigma_{r_p r_f}^2$, $\sigma_\theta^2 = \sigma_{r_f}^2\theta^2 - 2\theta\sigma_{r_p r_f} + \sigma_{r_p}^2$, and

$$A = \frac{\sigma_\theta^2}{\Lambda^2}\left[y - \mu_{r_f} - \frac{(c - x - \mu_{r_p} + \theta\mu_{r_f})(\sigma_{r_p r_f} - \theta\sigma_{r_f}^2)}{\sigma_\theta^2}\right]^2,$$
$$B = \frac{(c - x - \mu_{r_p} + \theta\mu_{r_f})^2}{\sigma_\theta^2}.$$

Plugging this into Equation (57.21) and then integrating with respect to $y$ gives

$$\frac{\partial}{\partial\theta} L_n(c, \theta) = \int_0^{\infty} \frac{1}{\sqrt{2\pi}}\, n x^{n-1}\, \sigma_\theta^{-3}\, k_c(x, \theta)\, e^{-B/2}\, dx, \qquad (57.22)$$

where $k_c(x, \theta) = (c - x - \mu_{r_p})(\sigma_{r_p r_f} - \theta\sigma_{r_f}^2) + \mu_{r_f}(\sigma_{r_p}^2 - \theta\sigma_{r_p r_f})$. In the case of an unbiased futures market (i.e., $\mu_{r_f} = 0$), the above equation implies that $\sigma_{r_p r_f}/\sigma_{r_f}^2$ is a critical point of $L_n(c, \theta)$. Hence, by the remark above, the minimum generalized semivariance hedge ratio is established at

$$\theta_{GSV} = \frac{\sigma_{r_p r_f}}{\sigma_{r_f}^2} = \theta_{MV}, \qquad \text{for all } n > 1.$$

Next, we consider the case in which $\tilde r_f$ and $\tilde r_p$ are distributed as a symmetric bivariate generalized hyperbolic distribution with parameters $(\lambda, \alpha, 0, \delta, \mu, \Delta)$. Recall that, given $Y = y$, $\tilde r_f$ and $\tilde r_p$ are jointly distributed as a bivariate normal distribution with mean vector $(\mu_1, \mu_2)'$ and covariance matrix $y\Delta$. Write $h_{(\mu, y\Delta)}(r_f, r_p)$ for the joint density of this distribution. Then the generalized semivariance is given by the formula

$$L_n(c, \theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{c + \theta r_f}\int_0^{\infty} (c - r_p + \theta r_f)^n\, h_{(\mu, y\Delta)}(r_f, r_p)\, d_{GIG}(y)\, dy\, dr_p\, dr_f,$$

where $d_{GIG}(y)$ is the density function of the $GIG(\lambda, \delta, \alpha)$ distribution. Similar arguments as above, using the formula (57.22), give

$$\frac{\partial}{\partial\theta} L_n(c, \theta) = \int_0^{\infty}\int_0^{\infty} \frac{1}{\sqrt{2\pi}}\, n x^{n-1}\, \sigma_\theta^{-3}\, y\, k_c(x, \theta)\, e^{-B/2}\, d_{GIG}(y)\, dy\, dx,$$

where now $\sigma_\theta^2 = y(\Delta_{22} + \theta^2\Delta_{11} - 2\theta\Delta_{12})$ and $k_c(x, \theta) = (c - x - \mu_2)(\Delta_{12} - \theta\Delta_{11}) + \mu_1(\Delta_{22} - \theta\Delta_{12})$. Clearly, if $\mu_1 = 0$ and $\theta = \Delta_{12}/\Delta_{11}$, then $k_c(x, \theta) = 0$. Hence $L_n(c, \theta)$ has a critical point at $\Delta_{12}/\Delta_{11}$. The proof is complete.
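A Monte Carlo sanity check of Theorem 3 (added here for illustration only) is immediate: simulate a symmetric GH sample with $\mu = 0$ and verify that the GSV-minimizing hedge ratio lands near $\Delta_{12}/\Delta_{11}$.

```python
import numpy as np
from scipy.stats import geninvgauss

rng = np.random.default_rng(5)
lam, alpha, delta = 1.0, 2.0, 1.0
Delta = np.array([[1.0, 0.6], [0.6, 1.0]])     # so Delta_12/Delta_11 = 0.6

# beta = 0, hence gamma = alpha and the mixing law is GIG(lam, delta, alpha)
y = geninvgauss.rvs(lam, delta * alpha, scale=delta / alpha,
                    size=100_000, random_state=rng)
z = rng.standard_normal((100_000, 2)) @ np.linalg.cholesky(Delta).T
r = np.sqrt(y)[:, None] * z                    # columns: (r_f, r_p)

grid = np.linspace(0.0, 1.5, 301)
gsv = [np.mean(np.clip(-(r[:, 1] - t * r[:, 0]), 0.0, None) ** 2) for t in grid]
print(grid[int(np.argmin(gsv))])               # close to 0.6
```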
Chapter 58
The Sensitivity of Corporate Bond Volatility to Macroeconomic Announcements Nikolay Kosturov and Duane Stock
Abstract We examine volatility of returns for corporate bonds around the days when macroeconomic announcements are made. We find that all corporate investment grade (CIG) bonds earn positive announcement-day excess returns, which increase monotonically with maturity. CIG bond excess returns exhibit strong GARCH effects with highly persistent nonannouncement shocks. Volatility is twice as high on announcement days for CIGs, where the announcement effect decreases with maturity. Unlike general (nonannouncement) shocks, announcement-day shocks do not persist and only affect announcement-day conditional variance. Furthermore, the dissipation process for CIGs is different from the one for Treasuries, suggesting that hedging CIGs with Treasuries will not be effective on days immediately after an announcement. High yield (HY) bonds behave quite differently around macroeconomic announcements than do CIG bonds and Treasuries of corresponding maturity. In addition, we find quite strong evidence of asymmetric volatility for CIG and HY bonds, where the asymmetric effect is negative for CIG but positive for HY. These asymmetry results are in contrast to the weak asymmetry found for Treasury bonds.

Keywords Volatility · GARCH · Corporate bonds · High yield
58.1 Introduction

Many studies have analyzed the impact of macroeconomic announcements on the value of financial instruments, including equities, derivative instruments, and Treasury bonds.1 Few, however, have analyzed the reaction of corporate bond prices of various credit qualities and maturities to regularly scheduled monthly macroeconomic announcements as done here. The corporate bond market is a vital part of the
N. Kosturov and D. Stock () Michael F. Price College of Business, University of Oklahoma, 307 West Brooks, Adams Hall 205A, Norman, OK 73019, USA e-mail: [email protected]
financial system, where approximately $4.2 trillion of long-term corporate debt was outstanding in 2003. The value of corporate bonds outstanding exceeds that of U.S. Treasury, federal agency, and municipal debt.2 In recent decades, the importance of the corporate bond market has grown in that many companies that once borrowed from banks have been able to tap the high yield bond market instead. Our purpose is to answer important sets of questions related to corporate bond pricing on announcement days. These questions have largely been answered for Treasury bonds but, for reasons we describe in detail, the same results may or may not apply to corporate bonds. The first set of questions concerns whether corporate bond returns are more volatile on announcement days. Does the answer depend on credit quality? If corporate bond returns are more volatile on announcement days, important subsequent questions within this set include how much more volatile returns are on announcement days compared to other days, and whether greater volatility is rewarded with greater return on announcement days. A related important question is whether corporate bond volatility is rewarded with greater return on nonannouncement days. The second set of questions concerns whether corporate bond volatility persists for a number of days after the announcement day. If not, it suggests that the market digests the information very quickly and efficiently within one day. Reasons will be given for expecting the volatility of corporate bonds to persist, as will reasons for expecting the volatility not to persist. Related to these questions, if the higher volatility persists beyond the announcement day, is volatility rewarded with greater returns beyond the announcement day? Also, for completeness, does volatility of nonannouncement days persist? A third set of questions concerns whether announcement-day volatility is related to credit quality and maturity. Are corporate bond returns more or less volatile than equal maturity U.S. Treasury bonds? Reasons to expect both greater
For examples see McQueen and Roley (1993), Ederington and Lee (1993), and Huberman and Schwert (1985). For a more recent study see Ederington and Lee (2001). 2 Estimated by Bond Market Association (2003).
and lesser volatility for corporate bonds will be briefly given below. Related to this, does the answer depend on the credit quality and maturity of the corporate bonds? For example, does a Treasury bond or a top grade "AA" bond exhibit more or less announcement-day volatility than a high yield bond? One hypothesis is that investment grade bonds will behave very similarly to Treasury bonds but high yield will behave differently. For completeness, the same series of questions also applies to nonannouncement days. The fourth set of questions concerns the potentially diverse reactions to the six different types of macroeconomic announcements we analyze. Do some announcements evoke little reaction in terms of corporate bond return and volatility while others evoke a strong reaction? Do some announcements result in increased returns but not increased volatility for some types of corporate bonds? If so, investing in these bonds prior to an announcement would be a superior investment. The answers to this fourth set obviously have important investment strategy implications. The answers to these sets of questions are quite important given the size of the corporate bond market and the obvious fact that investors need to know how risky (volatile) corporate bonds are. Bond market professionals who hedge volatility of corporate bond positions need to be aware of the special challenge of hedging their bonds on announcement days. Collin-Dufresne et al. (2001) note that hedge funds are exposed to considerable credit risk when they use Treasury futures to hedge corporate bond portfolios. Large hedging errors could occur if volatility is not modeled correctly. Pedrosa and Roll (1998) maintain that hedging corporate bond portfolios is very difficult as much of the volatility is systematic risk, which is largely attributable to macroeconomic announcements. Growth in credit derivatives, which are used to hedge corporate bonds (and other debt), has been strong in recent years. The San Francisco Federal Reserve Bank (2001) estimates that the volume of credit derivatives traded grew from $600 billion in 1999 to $800 billion in 2000, while the Wall Street Journal estimated 2001 volume at $1 trillion (Lopez 2001). Furthermore, the results will be useful to those attempting to value options embedded in corporate bonds and those attempting to incorporate GARCH time series results into GARCH option pricing of debt. See Ritchken and Trevor (1999) and Heston and Nandi (2000) for examples of GARCH option pricing. The next section describes the theory of corporate bond market reaction to macroeconomic announcements. Then we describe the data sources and compute mean returns and volatility for bonds of various credit qualities and maturities for announcement and nonannouncement dates. Next, we utilize simple OLS regressions to calculate expected returns and variance and follow up with more sophisticated GARCH models, which recognize the autocorrelation of
returns and volatility. We discuss the important implications of our research for those modeling corporate bond valuation and those hedging corporate bonds. Finally, we summarize the research in the last section.
58.2 Theory and Hypotheses

The first set of questions revolves around measuring corporate bond volatility on an announcement day and comparing it to volatility on other days. Why would one expect greater volatility on announcement days? One reason is related to the results of Elton et al. (2001), where they attempt to explain the rate spread for corporate bonds. They find that expected default explains relatively little of the spread, but that more of the spread is explained by the systematic risk factors that we commonly accept as explaining risk premiums for common stocks. Furthermore, Collin-Dufresne et al. (2001) find that firm specific factors explain little of the spread while macroeconomic factors are much more important in explaining spreads. We maintain that macroeconomic announcements represent a good deal of the systematic risk of corporate bonds (and other financial instruments). For example, if a dramatic change in retail sales is reported, the value of all corporate bonds is likely significantly affected. Numerous studies have examined volatility in the U.S. Treasury bond market. Jones et al. (1998) note that the source of autocorrelated volatility commonly found in financial instruments is elusive, and they then examine the impact of macroeconomic announcements on volatility in the U.S. Treasury bond market. Given that macroeconomic announcements are not autocorrelated, such announcements enable one to test whether characteristics of the trading process give rise to autocorrelation. They find that Treasury bond volatility is considerably greater on announcement days. Fleming and Remolona (1997) find that the 25 largest price shocks in Treasury bonds were attributable to macroeconomic announcements. In a later study (1999), Fleming and Remolona analyze minute-by-minute Treasury bond price changes and find that prices adjust sharply to announcements in the first few minutes after an announcement. Given these studies, one might think that corporate bond prices should also exhibit high volatility on announcement days. However, this expectation is moderated by the relatively weak evidence that macroeconomic announcements have a clear impact on equity prices. See, for example, McQueen and Roley (1993). Flannery and Protopapadakis (2002) find some announcements have an impact while others do not. Corporate bonds may be described as a mix of risk-free debt and equity, where the equity component rises as credit quality declines. Blume et al. (1991) maintain that
low grade bonds exhibit characteristics of both high grade bonds and equity, and Weinstein (1983, 1985) maintains that high yield bonds have a strong equity component. As discussed below, it may be that corporate bonds of certain credit qualities are more volatile on announcement days than nonannouncement days but less volatile than Treasury bonds on announcement days. Also, it may be that lower grade bonds, with a greater equity component that may not be responsive to announcements, have little or no reaction to announcements. An obviously important related question is whether volatility is rewarded with greater returns. If not, the motivation for holding corporate bonds on announcement days is very weak. Jones et al. (1998) find that greater announcement-day Treasury bond volatility is rewarded on announcement days. However, Li and Engle (1998) find that Treasury futures volatility is not rewarded. The second set of questions revolves around the persistence of announcement-day volatility. The evidence on Treasury markets is mixed. Jones, Lamont, and Lumsdaine (JLL) find little or no evidence that announcement-day volatility persists for the cash market in Treasury bonds, but Li and Engle, taking into account asymmetry for positive and negative news, find persistence in Treasury futures markets. Corporate bond volatility could be more persistent than for Treasury bonds due to the greater complexity of corporate bond valuation. Diebold and Nerlove (1989) maintain that volatility should reflect the time it takes for market participants to process information fully. Certain types of new information may involve more disagreement and lack of clarity concerning their relevance. Related to this, Kandel and Pearson (1995) and Harris and Raviv (1993) maintain that not all market participants interpret public information in the same way. Learning models as suggested by Brock and LeBaron (1996) maintain that the more precise the information, the less the likelihood of profitable trading (due to private information). Furthermore, a new equilibrium is reached more quickly the more precise the information. In our study, although macroeconomic news is received with high precision, the implications at the time of disclosure are not immediately clear. As Li and Engle (1998) suggest, the news impact may not dominate all beliefs. For example, if a substantial change in retail sales is announced, the meaning for the corporate bond market is complex. Although we elaborate more on this immediately below, suffice it to say that a decrease in retail sales may raise the systematic risk of corporate bonds and perhaps raise doubts about the ability of all firms to meet debt service requirements in a timely fashion, and thus raise default risk premia. On the other hand, a decrease in retail sales may also reduce inflation expectations, thus reducing bond yields. Thus, the net effect is complex and unclear.
Related to this last point, the third set of questions concerns whether announcement-day volatility is related to credit quality and maturity. With respect to variation due to credit quality, Fleming and Remolona (1997) and others such as Boyd et al. (2005) have noted that the impact of macroeconomic announcements on equities is complex, in that an announcement causes revisions in both the estimates of cash flows generated by the firm and the appropriate discount rate to apply to these expected cash flows. Of course the same applies to corporate bonds, and especially to high yield bonds, as corporate bonds have an equity component. The price of a corporate bond responds to an announcement in at least two interrelated ways. To explain this, consider that the required yield of a corporate bond consists of the risk-free yield for the given maturity plus a risk premium to compensate the purchaser for potential default and other things such as the systematic risk stressed by Elton et al. (2001). That is,

$$i_b = i_f + i_p, \qquad (58.1)$$

where $i_b$ is the corporate bond required yield, $i_f$ is the risk-free yield, and $i_p$ is the premium. An announcement could have a complex impact on the sum of these components. Consider the impact of an announcement that retail sales have declined. Here $i_f$ may well decline as inflationary expectations decline with a weaker economy represented by lower retail sales. Also, note that the Federal Reserve may be expected to take actions to lower interest rates, which may also reduce $i_f$. However, $i_p$ may increase as corporate bonds become more risky as firms' debt servicing prospects diminish with the weakening economy. Alternatively, consider the impact of an announcement that retail sales have dramatically increased when the economy and earnings are growing rapidly. Here, $i_f$ may increase as inflationary expectations increase, but $i_p$ may decline due to a decline in perceived default risk. Volatility in $i_b$ depends on the volatility in each component as well as the correlation between the components. A negative correlation between components reduces volatility, everything else constant, as the sketch following this paragraph illustrates. Consistent with the above scenarios, researchers such as Collin-Dufresne and Goldstein (2001), Collin-Dufresne et al. (2001), and Longstaff and Schwartz (1995) have found a negative correlation between risk premia (spreads) and risk-free rates, which could moderate corporate bond volatility. Collin-Dufresne et al. (2001) find that this negative relation grows stronger with greater leverage and lower ratings; hence the moderating effect may be stronger for high yield bonds. Similarly, Duffee (1999) finds a negative correlation between the likelihood of default and risk-free rates, where the negative correlation is stronger for lower bond ratings. Longstaff and Schwartz (1995) suggest the negative correlation between risk premia (spreads) and risk-free rates is consistent with their theory of corporate bond valuation, in which higher risk-free interest rates increase the growth rate of firm asset values.
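As a minimal numeric illustration of this moderating effect (the component volatilities and correlations below are assumed values, not estimates from this study):

```python
# Var(i_b) = Var(i_f) + Var(i_p) + 2 * rho * sd(i_f) * sd(i_p)
sd_f, sd_p = 0.10, 0.06            # assumed daily-change volatilities (percent)
for rho in (0.0, -0.5):
    var_b = sd_f**2 + sd_p**2 + 2.0 * rho * sd_f * sd_p
    print(rho, var_b ** 0.5)       # rho = -0.5 yields the smaller sd(i_b)
```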
In this context, Blume et al. (1991) and Stock (1992) find lower grade bonds are less volatile than higher grade bonds, although they did not have the theory of Longstaff and Schwartz (1995) and others to help explain such behavior. Thus, there are solid reasons to expect that lower grade bonds may have less volatility than higher grade bonds, including the possibility that Treasury bonds may be more volatile on announcement days than corporate bonds of varying credit quality. It is quite possible that investment grade bonds with ratings as low as BBB will roughly mimic U.S. Treasury volatility and that differential volatility will only occur when HY bonds are included. The market for any investment grade bond may be thought of as the high quality segment, in that institutions are permitted to purchase such bonds. Bonds in the lower quality segment sometimes cannot be purchased by institutions due to regulations. Van Horne (1998) says that the lower the grade of the bond, the more it takes on firm specific characteristics and the less related it is to overall market interest rates. Kwan (1996) finds that lower grade bond yields correlate with the firm's stock returns. Of course this does not prove that lower credit quality bonds are necessarily less volatile on announcement days; lower grade bonds could be more volatile than high grade bonds if, for example, the risk premium component $i_p$ is very large and volatile and the negative correlation is small or positive. Another possible result is that the above relation between changes in $i_f$ and $i_p$ is weak, making corporate and Treasury volatility approximately equal. We maintain that this is an important empirical issue for us to resolve.

The theory of volatility as related to credit quality is potentially affected by the greater tendency for lower credit quality bonds to have an embedded call option. Crabbe and Helwege (1994) document this tendency, where one explanation is that lower credit quality firms have relatively more potential to experience a credit upgrade. If the bonds experience an improvement in credit quality, the firm can more easily replace higher cost debt with now lower cost debt by exercising the call. Specifically, the value of a bond with a call option can be expressed as

$$V_{CB} = V_{NCB} - O_{CB}, \qquad (58.2)$$

where $V_{CB}$ is the value of the callable bond, $V_{NCB}$ is the value of a noncallable bond, and $O_{CB}$ is the value of the call option. We will later model conditional volatility in a GARCH context, where conditional volatility may be expected to rise on announcement days. However, investors holding callable bonds may not be rewarded for the increased volatility, because increased volatility increases the option value, thus offsetting any positive returns in the noncallable component of $V_{CB}$.
58.3 Data and Return Computations

We gathered macroeconomic announcement days for six macroeconomic announcements: CPI, PPI, unemployment, employment cost, durable goods, and retail sales, for the period December 30, 1994 to February 11, 2000. We also collected daily Salomon Brothers corporate bond index values for ratings AA, A, BBB, and a pooled all-rating series. These are compiled for maturities 1–3 years, 1–5 years, 3–7 years, 1–10 years, 7–10 years, and 10 plus years. Similarly, we collected Goldman Sachs indices for U.S. Treasuries of maturities 1–3, 3–5, 5–7, 7–10, and 10 plus years so that we could compare corporate bond volatility to U.S. Treasuries. In addition to the investment grade corporate bond Salomon Brothers indices, we collected net asset values for seven high-yield (HY) funds described in Appendix A. We used the indices and the HY NAVs, as in Cornell and Green (1991), to compute the daily corporate and Treasury bond returns. We adjusted the HY corporate bond NAVs for coupon distributions. Excess return is computed as the realized daily return minus the daily return on 30-day T-bills; a schematic version of this computation is sketched below. We thus obtain daily excess return series for three credit qualities of bond indices: Treasuries, corporate investment grade (CIG), and corporate HY for the time span of our macroeconomic announcement sample of December 30, 1994 to February 11, 2000. Our sample time span is due to the unavailability of the CIG bond index series after February 2000.3
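```python
# A sketch of the return computation described above, using hypothetical
# pandas series: NAVs adjusted for coupon distributions and a T-bill yield
# series. Column names and the day-count convention are assumptions, not
# a description of the exact data handling in this study.
import pandas as pd

def daily_excess_returns(nav: pd.Series, distributions: pd.Series,
                         tbill_annual_yield_pct: pd.Series) -> pd.Series:
    """Percent daily total return on a distribution-adjusted NAV
    minus an approximate daily T-bill return."""
    total_return = (nav + distributions.fillna(0.0)) / nav.shift(1) - 1.0
    daily_tbill = tbill_annual_yield_pct / 100.0 / 360.0
    return 100.0 * (total_return - daily_tbill)
```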
58.4 Descriptive Statistics of Daily Excess Returns

In Table 58.1 we provide descriptive statistics for the daily excess returns of our bond index series. First, we discuss corporate investment grade (CIG) results and then compare
The approach of using mutual fund NAVs to calculate returns could potentially be plagued by the presence of stale bond prices in NAVs. If so, the resulting HY return series might exhibit spurious autocorrelation as described by Lo and MacKinlay (1988). We argue that, at least to a certain extent, our GARCH(1,1) process cleanses the data by modeling AR(1) effects; thus, the residuals influenced by spurious autocorrelation will not permeate our conditional variance and thus bias higher moment relationships. We note that using an AR(1) process produces Q statistics that suggest residuals and squared residuals have been purged of autocorrelation potentially caused by stale pricing. In addition, Cornell and Green (1991) and Zitzewitz (2000) would claim that stale prices and the resulting arbitrage opportunities would ultimately force managers to switch to market-based valuation, so the problem would eventually disappear. Our HY mutual fund results remain an empirical fact of the behavior of mutual fund shareholder wealth. Finally, a large proportion (25% according to Cornell and Green) of HY bonds outstanding are owned by mutual funds, so NAV behavior represents the behavior of value for many billions of wealth (25% of $600 billion outstanding) even if the value may be flawed.
Table 58.1 Summary statistics of daily excess returns of corporate investment grade bonds, Treasuries, and high yield bond funds
XR is the daily excess return over the three-month T-bill. Returns are in percent. Sample period is from December 30, 1994 to February 11, 2000.

[Table 58.1 spans several pages in the original and reports, for each series, the mean, median, maximum, minimum, standard deviation, skewness, kurtosis, and first-order autocorrelation (t to t + 1) of XR, XR^2, and |XR|. Its panels cover the full sample, all announcement dates (290), non-announcement dates (1,041), and the unemployment report (62), CPI report (61), retail sales report (61), and employment cost report (21) announcement dates. The corporate investment grade panels report ratings AA, A, and BBB at maturities of 1–3 years and 10-plus years; the Treasury panels report maturities of 1–3, 3–5, 5–7, 7–10, and 10-plus years; and the high yield panels report the funds VWEHX, SPHIX, FAGIX, FAHYX, FHYTX, MAHIX, and PHBBX. The numeric entries are not recoverable from the extracted layout.]
3:109
0:020
1:429 10:646
0:623
0:107
0:000
0:107
0:006
0:011
XR
2
0:773 0:194 0:059
26:836
4:870
0:185
4:680 0:081
0:006
0:327
0:001
0:000 0:282
1:258
0:057
2:592 0:298
0:049
0:000
0:027
0:003
0:005
XR
2
0:945
XR
MAHIX
0:103 0:008
jXRj
7:714 0:498
0:122
0:001 0:165
0:634
0:067 0:059
0:107 0:050
jXRj
FHYTX
0:089 0:132 0:178
19:174
4:245
5:057
3:716 0:360 17:947
0:068
0:000
0:402
0:163
0:287
0:002 0:634
1:858
0:026 0:005
XR
0:215 0:006
XR
2
0:137 0:003
jXRj
FAHYX
6:799
0:210
0:000
1:582
0:003
0:044
XR
2
23:820 49:151
4:584
0:211
0:001 0:224
1:995
0:080 0:050
Retail Sales Report Announcement Dates (61)
0:069
XR2
XR
0:154 0:015
jXRj
FHYTX
0:091 0:114 0:769
51:374
6:993
0:523
0:000
3:980
0:006
0:107
XR
2
CPI Report Announcement Dates (61)
0:289
38:316
5:659 3:051
0:256
0:002 1:995
1:922
0:084 0:014
0:155 0:017
jXRj
FAHYX
Unemp. Report Announcement Dates (62)
0:074 0:092 0:176
47:674
6:632
0:459
0:000
3:451
0:019
0:127
XR
2
0:232
60:345
7:724
0:468
0:000
3:693
0:088 0:007
XR
0:160 0:043
XR
2
0:105 0:004
jXRj
FAGIX
High Yield
1:910
0:000
0:062
XR
XR
0:242 0:255
0:072
3:280 18:226
3:897
0:182
0:000
1:116
0:006
5:138
2:266
0:215
0:001
1:056
0:077
0:163
jXRj
0:160
33:016
5:369
0:268
0:000
1:910
0:070
0:130
jXRj
XR
0:363 0:196
0:464
6:246
0:132
2:665
2:204 1:092
0:116
0:001 0:610
0:594
0:086 0:001
3:674
1:949
0:147
0:001
0:610
0:076
0:130
jXRj
(continued)
0:040 0:051
9:133
3:117
0:083
0:000
0:373
0:006
0:038
XR2
PHBBX
0:104 0:019
jXRj
0:080 0:206 0:056 0:040
6:293
1:897 1:811
0:091
0:001 1:056
0:036
XR2
PHBBX
0:123
0:093 0:001 0:512
7:334
0:472
0:000
3:649
0:005
0:088
XR2
27:141 55:697
4:713
0:292
0:093 0:089
jXRj
0:008
1:990
1:352
0:073
0:001 0:197
0:327
0:078
0:078
jXRj
PHBBX
894 N. Kosturov and D. Stock
Mean Median Maximum Minimum Std. Dev. Skewness Kurtosis Autocor. t to t C 1
1:585
0:350
0:515
0:098
0:040
XR
0:200
0:667
0:289
0:000
0:179
2:751 0:340 0:008 0:605
1:684 0:142
0:143
0:002 0:371
0:515
0:101
6:231
2:660
0:076
0:000
XR
0:143 0:006
jXRj
FAGIX
0:365 0:071
2:890
1:067 0:544
0:152
0:266
0:010
0:040
XR2
VWEHX
0:000 0:503
0:725
0:065
0:199
1:362 0:364
0:000
0:401
0:463
0:192
0:214
0:463
0:079
0:135
jXRj
1:544
0:040
0:006
0:051
XR2
0:013
XR
SPHIX
Table 58.1 (continued)
0:339
0:172 0:099
0:364 0:488
0:346
0:464 0:210
0:114
0:000 0:240
0:371
0:918
7:176
2:487
0:014
0:000
0:058
0:009 0:004
XR2
0:135 0:006
XR
0:128 0:003
jXRj
1:742 0:823
1:472
0:038
0:000
0:137
0:016
0:030
XR2
FAHYX XR
0:054
0:000
0:046
0:790 0:228
0:734 1:294
0:900 0:393
0:064
0:003 0:101
0:240
0:061
0:074 0:018
jXRj
0:125
0:220
0:015
0:009
XR
0:102
0:218
0:146
0:221
0:525 0:111
0:031
0:424
0:317
0:001
0:000
XR
0:158
0:000
7:625
2:679
0:042
0:330 0:356 0:068
1:654
0:739 0:369
0:068
0:171
0:005
0:024
XR2
PHBBX
0:001 0:414
0:220
0:067
0:074
jXRj
2:109 0:494
1:681
0:014
0:000
0:048
0:004
0:010
XR2
MAHIX
0:000 0:191
0:101
0:019
0:038
jXRj
1:203 0:981
1:328
0:003
0:000
0:010
0:000
0:002
XR2
FHYTX
Emp. Cost Report Announcement Dates (21)
0:011
1:577
1:335
0:112
0:001
0:414
0:073
0:108
jXRj
58 The Sensitivity of Corporate Bond Volatility to Macroeconomic Announcements 895
896
to Treasury and HY bonds. We then compare the characteristics of returns on announcement versus nonannouncement days. As in JLL, we also report the squared and absolute excess returns, since these approximately represent the volatility of the index returns. For brevity, we do not list results for CIGs with maturities of 1–5, 3–7, 1–10, and 7–10 years.
58.4.1 Full Sample of Announcement and Nonannouncement Days

Mean daily excess returns of the CIG indices range from 0.006 to 0.015% per trading day, which translates into annualized excess returns of 2.2 and 5.6%, respectively. The maximum daily excess return is 1.66%, and the minimum is −3.3%. Mean excess returns increase with maturity and increase slightly as credit quality declines. First-order autocorrelation of excess returns is positive, typically about 0.05. With the exception of the shortest maturity, which has positively skewed excess returns across all ratings, excess returns are negatively skewed and significantly thin-tailed. Jarque-Bera statistics soundly reject the null hypothesis of normally distributed excess returns. The positive, significant first-order autocorrelation of the squared and absolute CIG excess returns (from 0.02 to 0.18) motivates our later use of GARCH models.

Treasury mean daily excess returns also increase with maturity, ranging from 0.005 to 0.017%, which suggests annualized excess returns of 1.8 and 6.4%, respectively. The maximum daily excess return is 1.77% and the minimum is −2.5%, both in the longest maturity series. Excess returns are slightly negatively skewed and platykurtic. The result is dissimilar to JLL's sample of Treasuries, which displayed positively skewed excess returns. Jarque-Bera tests reject the null hypothesis of normally distributed excess returns. First-order autocorrelation is significant and ranges between 0.012 and 0.1. With the exception of intermediate maturities, squared excess returns and absolute excess returns also exhibit positive first-order correlation, strongly suggesting the use of GARCH techniques.

Daily corporate HY mean excess returns range from 0.0169% for the Vanguard (VWEHX) fund to −0.0079% for Fidelity's FAHYX fund. Maximum daily excess returns are about 2.1%, and the minimum is −2.08%. Three HY funds have a positive excess return while four have a negative excess return. Skewness is generally negative for the positive mean excess return funds and positive for the negative mean excess return funds. The tails are thinner than those of investment grade returns. Excess return first-order autocorrelation is generally greater for high yield bonds than for their CIG and Treasury counterparts, even though it is negative for a couple of the funds. Similarly, squared and absolute returns are also positively first-order correlated, with the correlations being higher than for CIGs and Treasuries. Thus, our investigation of the descriptive statistics of excess daily returns over the whole sample finds that mean excess returns increase with maturity and generally increase as credit quality declines. In addition, first-order autocorrelation in both returns and variances is positive and generally significant.
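The panels of Table 58.1 are ordinary sample moments. As a minimal sketch (not the authors' code), they could be reproduced for any one series with pandas, assuming a hypothetical Series xr of daily excess returns in percent:

```python
# Hypothetical sketch: summary statistics in the style of Table 58.1 for one
# return series. `xr` is an assumed pandas Series of daily excess returns (%).
import pandas as pd

def summary_panel(xr: pd.Series) -> pd.DataFrame:
    panels = {"XR": xr, "|XR|": xr.abs(), "XR^2": xr ** 2}
    rows = {}
    for name, s in panels.items():
        rows[name] = {
            "Mean": s.mean(), "Median": s.median(),
            "Maximum": s.max(), "Minimum": s.min(),
            "Std. Dev.": s.std(), "Skewness": s.skew(),
            "Kurtosis": s.kurt(),              # pandas reports excess kurtosis
            "Autocor. t to t+1": s.autocorr(lag=1),
        }
    return pd.DataFrame(rows).T
```

Applying the function to announcement-day and nonannouncement-day subsets of xr would reproduce the corresponding subsample panels.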
58.4.2 Announcement Versus Nonannouncement Days

For CIGs, excess returns on announcement days are quite different from those on nonannouncement days. Mean excess returns are about five to six times higher than on a normal day, and squared excess returns are slightly higher than on a normal day. Announcement-day excess returns are not normal, and squared and absolute excess returns exhibit a negative autocorrelation, again hinting that announcement-day volatility is not perpetuated. On nonannouncement days, the mean excess return is much lower and often even negative. Autocorrelation (day t to t + 1) is generally positive for both announcement and nonannouncement excess returns, just as for full-sample returns.

Announcement-day Treasury mean excess returns are positive and also about five to six times higher than on an average day for the full sample. Volatility is also higher on announcement days. Announcement-day excess returns are negatively skewed but with fatter tails than full-sample returns. Mean excess returns of Treasuries on nonannouncement days are not only lower than their announcement-day counterparts, but are often negative, especially for the longer maturities. The result is not totally unexpected, since both Campbell (1995) and JLL claim that ex ante excess returns are not necessarily positive for Treasuries. Nonannouncement-day returns are also negatively skewed and thinner-tailed. Autocorrelation (day t to t + 1) is generally positive for both announcement and nonannouncement excess returns, just as for full-sample returns. Squared and absolute announcement excess returns exhibit a negative autocorrelation (day t to t + 1), while the autocorrelation is positive for nonannouncement days.

HY bonds with a positive full-sample mean excess return exhibit higher mean excess returns on announcement days, of up to only twice those on a normal, full-sample day. Those HY funds with a negative full-sample mean excess return exhibit much lower mean excess returns on announcement days than on a normal, full-sample day. Excess return volatility on announcement days is virtually identical to that on full-sample days in terms of standard deviation and range. Announcement-day first-order autocorrelation (day t to t + 1) of squared and absolute excess returns is generally positive, hinting that announcement-day volatility might persist into the days after announcements. For HY bonds with positive full-sample mean excess returns, mean nonannouncement excess returns are almost identical to the full-sample results, but mean nonannouncement excess returns are slightly higher than the full sample for HY bonds with negative full-sample mean excess returns.

To summarize, with the exception of some HY funds, all bonds in our sample earn significantly higher excess returns on announcement days, and this effect appears to increase monotonically with maturity. With the exception of HY bonds, volatility on announcement days is higher, and negatively correlated with volatility on the day after announcement. Of the six announcements we use, employment cost, unemployment, retail sales, PPI, and CPI have a larger impact than durable goods announcements. These results are not shown in tables for brevity.
58.5 OLS Regressions of Volatility and Excess Returns

58.5.1 Volatility Measure Regressions

Continuing our preliminary inspection of excess return variance, we run simple OLS regressions of corporate and Treasury absolute excess returns, |R_t|, and squared excess returns, R_t^2, on a full set of weekday dummies, an announcement-day indicator (equal to one on days with any type of macroeconomic announcement), and the one-day lag and lead of the announcement (the day after and the day before announcement, respectively). The adjustment for the day after and the day before announcement is necessary because the announcement-day variable is a dummy equal to 1 on days with a macro announcement and 0 otherwise: lagging this dummy by one day, which is equivalent to shifting the series down by one observation, yields the day-after-announcement dummy variable, and similar logic applies to the day-before-announcement variable. The regression equation is thus

|R_t| = \kappa_M \mathrm{Mon}_t + \kappa_{Tu} \mathrm{Tues}_t + \kappa_W \mathrm{Wed}_t + \kappa_{Th} \mathrm{Thur}_t + \kappa_F \mathrm{Fri}_t + \theta_{-1} I_{t+1}^A + \theta_0 I_t^A + \theta_{+1} I_{t-1}^A + \varepsilon_t \quad (58.3)

Due to suspected heteroskedasticity, we report White's (1980) heteroskedasticity-consistent standard errors.
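As a minimal sketch (not the authors' code) of how Equation (58.3) might be estimated, assuming a hypothetical pandas Series xr of daily excess returns on a trading-day DatetimeIndex and an aligned 0/1 Series ann marking announcement days:

```python
# Hypothetical implementation of the volatility OLS in Equation (58.3):
# |excess return| on Monday-Friday dummies plus the lead, contemporaneous,
# and lagged announcement dummies, with White (1980) robust standard errors.
import pandas as pd
import statsmodels.api as sm

def volatility_ols(xr: pd.Series, ann: pd.Series):
    X = pd.DataFrame(index=xr.index)
    for i, day in enumerate(["Mon", "Tues", "Wed", "Thur", "Fri"]):
        X[day] = (xr.index.dayofweek == i).astype(float)
    X["theta_m1"] = ann.shift(-1).fillna(0)  # I^A_{t+1}: day before announcement
    X["theta_0"] = ann.astype(float)
    X["theta_p1"] = ann.shift(1).fillna(0)   # I^A_{t-1}: day after announcement
    # No intercept: the five weekday dummies span the constant term.
    return sm.OLS(xr.abs(), X).fit(cov_type="HC0")
```

The squared-return specification would replace xr.abs() with xr**2, and the Table 58.3 excess return regressions use xr itself as the dependent variable.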
This OLS regression does not attempt to model conditional volatility, but merely analyzes potential day-of-week effects and announcement volatility patterns. We first report results for CIGs and then compare those results to Treasury and HY bonds.

Table 58.2 presents the results for absolute excess returns. For CIG bonds, we find evidence of strong weekday effects. Volatility of corporate bond excess returns is low on Mondays, then rises on Tuesday, falls on Wednesday, and rises through Friday. This is somewhat similar to the U-shaped pattern detected by JLL in Treasury bond data. Weekday volatility coefficients increase sharply with maturity. There does not seem to be a significant difference among investment grade credit ratings. Even though most of our announcements occur on Fridays, we control for weekdays, thereby separating the Friday effect from the announcement effect. The positive and highly significant coefficient of the announcement dummies confirms that, controlling for day-of-week effects, announcement days have significantly higher volatility. Volatility (absolute excess returns) increases by a little more than one-third on announcement days, a result perfectly consistent with the Treasury bond data in JLL. The increase is clearly statistically significant. The coefficient of the day-before-announcement indicator (θ_{-1}) allows us to test the "calm before the storm" effect reported by JLL. We find the coefficient to be generally negative but statistically insignificant. Therefore, days before announcement do not clearly exhibit lower volatility. Day-after-announcement coefficients are also negative and statistically insignificant. The only exceptions are the coefficients for the longest maturities, which are negative and significant, hinting at a "calm after the storm" effect. The result suggests that the higher volatility on announcement days does not generally persist, and that volatility seems to go back to normal nonannouncement levels on the day after announcement. Announcement shocks do not seem to generate persistent volatility. Furthermore, the finding supports the claim that the trading process itself does not generate autocorrelated volatility, and that new information is impounded quickly into bond prices.

For the Treasury sample, the regression results are quite similar. All weekday dummies are significant, increasing through Tuesday, then falling on Wednesday, and rising through Friday, which is the highest volatility day. This is precisely the excess return variance pattern JLL found. Variance is significantly higher (by about 30–50%) on announcement days, regardless of maturity. The announcement-day coefficient θ_0 is slightly higher for Treasuries than for CIGs. Day-before (θ_{-1}) and day-after (θ_{+1}) coefficients are insignificant. For HY corporate bonds, we observe almost the same weekday pattern, with the exception that volatility starts somewhat higher on Monday and falls on Tuesday before it starts increasing throughout the rest of the week.
[Table 58.2: Bond return volatility by day of week and event day. An OLS regression of the volatility of the daily mean excess return on weekday dummy variables and announcement-day dummies, as in Equation (58.3); p-values computed using heteroskedasticity-consistent standard errors. Panels report the coefficients κ_M through κ_F, θ_{-1}, θ_0, and θ_{+1}, each with its p-value, for |excess returns| of Corporate Investment Grade indices (ratings AA, A, and BBB at maturities of 1–3, 3–7, and 10-plus years), Treasuries (1–3, 3–5, 5–7, 7–10, and 10-plus years), and the high-yield funds SPHIX, VWEHX, FAGIX, FAHYX, FHYTX, MAHIX, and PHBBX. The coefficient columns are omitted here because the tabular layout did not survive extraction.]
Announcement-day dummies have positive coefficients but are significant for only four of the seven funds. Both day-before and day-after coefficients are most often negative and insignificant, with the exception of the Putnam PHBBX fund, for which they are both positive and significant. We thus detect only weak "calm before the storm" effects and weak announcement-day volatility persistence. There is only limited evidence of increased volatility on announcement days. The claim that information is quickly impounded into prices, and that the trading process is not to blame for generating autocorrelated volatility, is once again supported. The results (not shown) for Treasuries, CIGs, and HY corporates are virtually identical to the above when using the alternative specification of squared excess returns to represent volatility. To summarize, volatility of all CIGs and Treasuries displays a similar pattern across days of the week. For CIGs and Treasuries, excess return volatility on announcement days is typically about 30% higher than volatility on a normal day, but volatility does not clearly increase for HYs. Evidence of calm before and after the storm is weak for all types of bonds.
58.5.2 Excess Return Regressions

Next, in Table 58.3, we estimate an OLS regression with the same set of independent variables, but this time using excess returns as the dependent variable. We again report CIG results and then compare them to Treasury and HY bonds. For CIGs, the first notable result is that Mondays, Wednesdays, and often Fridays exhibit negative coefficients; however, Monday is the only significant one. The observation for Mondays and Wednesdays is consistent with the lower volatility (Table 58.2) on these two days. We thus find some evidence that excess returns and volatility are positively related across days of the week. We will explicitly test for the excess return–volatility relation in our GARCH models section. The coefficient for the announcement-day dummy is positive and monotonically increasing with maturity. As an example, the longest maturity A rated index excess return is 0.0782% higher on announcement days than on nonannouncement days. The higher return is marginally significant across the indices, with p-values between 6 and 7%. The coefficients for the days before and after announcement are consistently positive but insignificant. Therefore, after accounting for day-of-week effects, we find no evidence of excess returns on the days before and after announcement.

We have reasons to believe, however, that the data exhibit heteroskedasticity. Ljung-Box Portmanteau (Q) statistics for the Table 58.3 regressions fail to reject the null hypothesis of no autocorrelation in the residuals. In addition, the descriptive statistics in Table 58.1 confirm significant first-order autocorrelation of volatility. With heteroskedasticity present, our OLS estimates are inefficient.

Treasuries exhibit results very similar to those for CIGs. Low volatility days tend to translate into lower excess returns, but the coefficients are not significant. Announcements tend to increase excess returns. Day-before and day-after effects are indistinguishable from zero.

The regressions for HY bonds exhibit the following differences from CIGs and Treasuries. Among the day-of-week coefficients, Tuesday is the largest and most often the only significant one. In light of our finding that Tuesdays were the lowest volatility days, the conventional volatility–excess return relation is reversed (negative) for HY bonds. The announcement-day coefficient was positive and significant only for the Fidelity HY (SPHIX) fund, whereas the day-after-announcement coefficient was positive and significant for all the funds except Vanguard's VWEHX.

Our results support the hypothesis that CIG and Treasury bonds earn positive excess returns on announcement days due to their higher exposure to macroeconomic risk. This exposure to macroeconomic risk also translates into higher volatility on announcement days. The announcement-induced volatility does not persist into the following day, consistent with the JLL result that the trading process itself does not generate autocorrelated volatility. Longer maturity return and volatility are more sensitive to macroeconomic news, probably due to their greater exposure to interest rate risk. The HY bonds, however, display a different behavior. Of the seven HY bond funds, only Fidelity seems to earn significant positive excess returns on announcement days. Virtually all HY funds earn positive post-announcement excess returns. Contrary to basic finance theory, the only significant weekday coefficient (Tuesday) for any HY fund indicates that HY bonds tend to earn a positive excess return only on the lowest volatility day.
58.6 Conditional Variance Models

Since Ljung-Box tests confirm the presence of heteroskedasticity, we devote the remainder of our examination of announcement-day volatility to modeling the conditional variance of the excess return process and examining the persistence of different shock types. The starting point in our analysis of conditional variance is the standard GARCH(1,1) model developed by Bollerslev (1986). In particular, we use quasi-maximum likelihood estimation and report Bollerslev-Wooldridge (1992) robust standard errors. The model has been widely used in the finance literature and provides a good approximation of conditional variance for a wide variety of volatility processes (Nelson 1990).
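For a baseline without announcement dummies, a standard GARCH(1,1) with an AR(1) mean can be fit in a few lines; this is a hedged illustration using the Python arch package (an assumed tool choice, not the authors'), with the same hypothetical xr series as before:

```python
# Hypothetical baseline: AR(1)-GARCH(1,1) fit by quasi-maximum likelihood
# with Bollerslev-Wooldridge robust standard errors ("robust" covariance).
from arch import arch_model

am = arch_model(xr, mean="AR", lags=1, vol="GARCH", p=1, q=1)
res = am.fit(cov_type="robust", disp="off")
print(res.summary())
```

Because the announcement dummies of the models below enter the variance equation itself, the reported specifications instead require a custom likelihood, sketched after Equations (58.4)-(58.5).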
[Table 58.3: Bond excess return by day of week and event day. An OLS regression of the daily mean excess return on weekday dummy variables and announcement-day dummies, i.e., Equation (58.3) with R_t replacing |R_t| as the dependent variable; p-values computed using heteroskedasticity-consistent standard errors. Panels report κ_M through κ_F, θ_{-1}, θ_0, and θ_{+1}, each with its p-value, for Corporate Investment Grade indices (ratings and maturities as in Table 58.2), Treasuries (1–3 through 10-plus year maturities), and the seven high-yield funds. The coefficient columns are omitted here because the tabular layout did not survive extraction.]
We begin with a GARCH model of daily CIG excess returns with an intercept, an AR(1) term, and an announcement dummy in the mean (excess return) equation, as in JLL. In the variance equation, we include the exogenous announcement dummy as well as its lead and lag. Thus we simultaneously model the contemporaneous impact of announcements on both the conditional mean excess return and the conditional volatility. In addition, we allow days before and after announcement to have a different conditional variance intercept. The model is described below, where h_t is the conditional variance and ε_t is the residual:

R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + \varepsilon_t \quad (58.4)

h_t = \omega + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1} + \rho_{-1} I_{t+1}^A + \rho_0 I_t^A + \rho_{+1} I_{t-1}^A \quad (58.5)

In the model above, R_t is the realized percentage excess return on day t, and I_t^A is the announcement dummy indicator. The residual ε_t | Φ_{t-1} is assumed to be distributed N(0, h_t). Note that in the conditional variance equation, the lag of the announcement dummy, I_{t-1}^A, indicates the day after announcement and the lead, I_{t+1}^A, indicates the day before announcement, just as in JLL. In the simple GARCH framework above, we also tested for GARCH-M and asymmetric effects; however, no such effects were found. The coefficient θ_0 measures changes in mean excess returns on announcement days. Because of the positive autocorrelation found in excess returns, we allow for the AR(1) term in the first moment. JLL claim such autocorrelation could be due to microstructure effects in measuring prices or to equilibrium partial adjustment. Our announcement dummy incorporates all announcement types, and results are reported in Table 58.4. Individual announcement effects have also been computed and will be briefly discussed as well. We first report CIG results and then compare them to U.S. Treasury and HY bonds.
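As a minimal sketch (not the authors' estimation code), the quasi-maximum likelihood objective for Equations (58.4)-(58.5) can be written out directly; xr and ann are the same hypothetical inputs as before, and the Bollerslev-Wooldridge (1992) robust covariance step is omitted for brevity.

```python
# Hypothetical Gaussian quasi-log-likelihood for the GARCH(1,1) model of
# Eqs. (58.4)-(58.5), with announcement dummies in mean and variance.
import numpy as np
from scipy.optimize import minimize

def garch_ann_negll(params, xr, ann):
    mu, phi, theta0, omega, alpha, beta, rho_m1, rho_0, rho_p1 = params
    T = len(xr)
    lead = np.append(ann[1:], 0)   # I^A_{t+1}: day before announcement
    lag = np.append(0, ann[:-1])   # I^A_{t-1}: day after announcement
    eps = np.zeros(T)
    h = np.full(T, np.var(xr))     # initialize at the unconditional variance
    ll = 0.0
    for t in range(1, T):
        eps[t] = xr[t] - mu - phi * xr[t-1] - theta0 * ann[t]           # Eq. (58.4)
        h[t] = (omega + alpha * eps[t-1]**2 + beta * h[t-1]
                + rho_m1 * lead[t] + rho_0 * ann[t] + rho_p1 * lag[t])  # Eq. (58.5)
        ll += -0.5 * (np.log(2 * np.pi) + np.log(h[t]) + eps[t]**2 / h[t])
    return -ll

# Illustrative use (positivity constraints on h_t are left implicit):
# start = [0.0, 0.05, 0.02, 0.01, 0.05, 0.90, 0.0, 0.0, 0.0]
# fit = minimize(garch_ann_negll, start, args=(xr, ann), method="Nelder-Mead")
```

The persistence statements in the text follow from the fitted α + β: a variance shock decays by half after ln 0.5 / ln(α + β) trading days, which is about 17 days for α + β ≈ 0.96 and about 1.2 days for α + β ≈ 0.56.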
[Table 58.4: GARCH(1,1) model of daily corporate bond excess returns with an intercept, an AR(1) term, and an announcement dummy in the mean equation, Equations (58.4)-(58.5); the exogenous announcement dummy and its lead and lag are included in the variance equation. Robust errors used for p-values. Panels report φ, θ_0, ω, α, β, ρ_{-1}, ρ_0, and ρ_{+1}, each with its p-value, for Corporate Investment Grade indices, Treasuries, and the seven high-yield funds. The coefficient columns are omitted here because the tabular layout did not survive extraction.]

For CIG bonds, the coefficient for the AR(1) term in the mean equation is positive and consistently significant, except for the lower maturities. It also appears to be humped with regard to maturity, peaking in the intermediate 3–7 year maturity range. The announcement-day dummy coefficient θ_0 is positive, significant, and increasing with maturity, confirming the hypothesis that investment grade corporate bonds earn significant excess returns on announcement days. It does not appear to differ significantly among investment grade ratings. In the conditional variance equation, α and β are both significant. Their sum is about 0.96, providing evidence that the effect of shocks persists and yielding a half-life of about 17 trading days. The day-before-announcement coefficient ρ_{-1} is small, negative, and usually insignificant, thus lending only weak support to the "calm before the storm" effect.
The most notable results are for the ρ_0 and ρ_{+1} coefficients in Equation (58.5), which represent the day of the announcement and the day after in the conditional variance equation. The announcement-day coefficient ρ_0 is positive, significant, and increases with maturity. Therefore, CIG conditional volatility rises on announcement days. On the day after announcement, however, volatility is back to normal. The coefficient for the day after announcement is negative, significant, and of the same absolute value as the announcement coefficient. For example, in the A rated 3–7 year index, ρ_0 is 0.045 and ρ_{+1} is −0.0454. In fact, the absolute values of ρ_0 and ρ_{+1} are remarkably similar across many regressions. The result is not to be interpreted as lower volatility after announcements; rather, it means that volatility reverts back to normal. We cannot reject the null that the sum of the two coefficients is zero (p-values of 0.95). The finding is very similar to Li and Engle (1998) and the JLL conclusion: general shocks to variance persist, but shocks from announcements do not, even though they significantly raise announcement-day volatility and excess return. These results do not differ significantly among CIG credit qualities. The model performs well in terms of purging the residuals of autocorrelation, as reflected by low Q-statistics.

Using the model on the Treasury return series, the results are remarkably similar to those for CIGs. In the mean equation, the intercept is insignificant, and the AR(1) coefficient is positive, significant, and humped with respect to maturity, peaking at the 7–10 year maturity. The announcement dummy is positive, significant, and monotonically increasing with maturity, indicating that Treasuries earn a positive excess return on announcement days. In the variance equation, there is strong shock persistence, as α and β again sum to about 0.96. Announcements increase conditional variance, but the increase does not persist, since it is almost exactly offset on the day after; for example, in the 5–7 year index, ρ_0 is 0.054 and ρ_{+1} is −0.059.

With respect to HY bonds, the intercept in the mean equation is most often positive and significant, providing some support that HY bonds earn a consistently positive average excess return (μ is around 0.01) after controlling for AR(1) and announcement effects, perhaps due to their higher risk. The coefficient for the AR(1) term in the mean equation is also consistently positive (around 0.4) and strongly significant. The announcement coefficient θ_0 is positive and at least marginally significant for three of the funds and insignificant for the rest. We therefore find only limited evidence that junk bonds earn positive excess returns on announcement days.

The coefficients in our conditional variance estimation for HY bonds display interestingly different behavior compared to the other types of bonds. The α coefficients are positive (ranging from about 0.05 to 0.5) and clearly significant, indicating that lagged shocks have quite an impact on conditional variance. The β coefficients are also positive (ranging from 0.03 to 0.93) and clearly significant. The sum of α and β is about the same (roughly 0.56) for three of the HY funds (SPHIX, VWEHX, and PHBBX), which translates into a half-life of only 1.19 days. For the rest, the sum is most often around 0.8, corresponding to a half-life of 3.1 days. Thus, general shocks to HY excess returns do not seem to persist very long. The coefficients for the announcement and pre-announcement dummies in the variance equation are generally positive but insignificant. However, the coefficient of the post-announcement dummy is negative and significant for five of the seven HY funds. Therefore, we find very little evidence that macroeconomic announcements affect the contemporaneous conditional variance of HY excess returns. There are no announcement-day or "calm before the storm" effects. There are, however, "calm after" effects, indicating lower conditional variance on days after announcements.

We find that the six types of macro announcements have quite different effects on excess returns and conditional variance:
Announcement type     Effect on excess return     Effect on conditional variance
PPI                   + on announcement day       No effect
CPI                   No effect                   + on announcement day
Unemployment          No effect                   + on announcement day
Employment cost       + on announcement day       No effect
Retail sales          + on announcement day       + on announcement day
Durable goods         No effect                   No effect
Producer Price Index (PPI) announcements seem only to increase excess returns on announcement day and have no effect on the conditional variance of Treasuries and CIGs. CPI announcements tend to increase conditional volatility on announcement days and are associated with both calm-before and calm-after effects for CIG bonds and Treasuries. The mean equation coefficient is, however, insignificant for CIGs and Treasuries, suggesting that the CPI effect is complementary to the PPI effect: the former affects only the variance equation, and the latter affects only the mean equation. Results of price index announcements are mixed for the HY bonds.

There is a remarkable similarity to the effects of employment cost and unemployment announcements on CIG and Treasury bond excess returns. Just like the CPI and PPI effects, one announcement type affects only the excess return, and the other affects only the conditional variance. Our results suggest unemployment announcements only positively affect conditional variance, with some limited calm-after effect, but have no effect on excess returns. Employment cost announcements, on the other hand, generate consistently positive and significant excess returns but have no effect on conditional variance. For HY bonds, neither employment cost nor unemployment announcements seem to affect excess returns or variance.

Retail sales is the only announcement that affects both excess returns and conditional variance for both CIGs and Treasuries. Both CIG and Treasury bonds realize a significant positive excess return on days with retail sales announcements. Furthermore, conditional variance is significantly higher on retail sales announcement days and goes back to normal on the day after. Retail sales announcements seem to have no effect on HY excess returns and variances. Durable goods announcements seem to have no effect on either excess returns or conditional variance across all credit ratings.

To summarize, the effects of the joint macroeconomic announcement dummy are most likely due to the effect of retail sales, the complementary forces exerted by PPI and CPI, and the complementary forces exerted by unemployment and employment cost. The patterns of these individual announcement effects remain invariant across the GARCH specifications presented next and will not be reported, for the sake of brevity.
58.7 Alternative GARCH Models

58.7.1 Component GARCH

The above conclusion, that general shocks to variance persist but announcement-day shocks do not for CIG and Treasury bonds, fits Engle and Lee's (1993) component GARCH model, which separates the transitory and permanent components of variance. Specifically, the above results suggest a specification that includes the announcement dummies (ρ_{-1} through ρ_{+1}) in the transitory equation and no exogenous variable in the permanent equation:

R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + \varepsilon_t \quad (58.6)

h_t - q_t = \alpha (\varepsilon_{t-1}^2 - q_{t-1}) + \beta (h_{t-1} - q_{t-1}) + \rho_{-1} I_{t+1}^A + \rho_0 I_t^A + \rho_{+1} I_{t-1}^A \quad (58.7)

q_t = \omega + \tau (q_{t-1} - \omega) + \nu (\varepsilon_{t-1}^2 - h_{t-1}) \quad (58.8)
Here h_t is still the conditional volatility, while q_t takes the place of ω and is the time-varying long-run volatility. Equation (58.7) describes the transitory component h_t − q_t, which converges to zero with powers of (α + β). Equation (58.8) describes the long-run component q_t, which converges to ω with powers of τ. Typically, τ is between 0.99 and 1, so that q_t approaches ω very slowly. The prior results suggest that the announcement dummies in the transitory equation will have an impact on the short-run movements in volatility, while the variables in the permanent equation will affect the long-run levels of volatility. Output from the model is presented in Table 58.5.

Results for CIGs tend to confirm that announcement and post-announcement-day shocks have coefficients of the same magnitude and different signs, both having a transitory effect on conditional volatility in Equation (58.7). However, the coefficients are only marginally significant. The result is most pronounced for the longest (ten-plus year) maturity bonds, and the coefficients in the transitory equation are marginally significant at best. Also, the permanent-equation (58.8) τ is significant and close to unity, meaning that variance reverts slowly to a long-run mean. In terms of the value of the maximized log-likelihood function, the model slightly improves on the simpler GARCH model with exogenous variables in the variance equation.

The model's results for Treasury bond excess returns are remarkably similar to those for CIGs. The permanent variance equation converges slowly to a long-run mean of ω, as evidenced by the τ coefficients of about 0.96. The transitory-equation announcement-day dummies are significant and positive. Only the 1–3 year maturity series displays an offset in conditional variance on the day after.

In the mean equation for HY bonds, the intercept and the AR(1) term are positive and consistently significant, but the mean-equation announcement dummy is positive and significant for only two of the funds. Therefore, we again find only limited support that HY bonds earn excess returns on announcement days. In the variance equation, the τ coefficient is quite close to 1, just as with investment grades, suggesting that variance takes some time to revert to its long-run mean. Announcement dummy coefficients in the transitory equation are generally insignificant. Day-after-announcement coefficients are, however, mostly significant. The α and β coefficients are generally insignificant, with the exception of three of the funds (MAHIX, FHYTX, and FAGIX), where they sum to 0.6, suggesting the transitory component converges to zero relatively fast. The model leads to large improvements in the log-likelihood function for HY bonds, suggesting that quickly dissipating transitory variance components and strong after-announcement effects best characterize HY funds' excess returns. The relatively weak results of the component GARCH model for the CIG bonds, however, lead us to search for alternative volatility models.
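As a minimal sketch (not the authors' code), the decomposition in Equations (58.7)-(58.8) can be traced with a short recursion; every input and parameter value below is a hypothetical placeholder.

```python
# Hypothetical recursion for the component GARCH of Eqs. (58.7)-(58.8):
# q_t is the slowly mean-reverting permanent component, and h_t - q_t is the
# transitory component that absorbs the announcement dummies.
import numpy as np

def component_garch_path(eps, ann, omega, tau, nu, alpha, beta,
                         rho_m1, rho_0, rho_p1):
    T = len(eps)
    lead = np.append(ann[1:], 0)   # day before announcement
    lag = np.append(0, ann[:-1])   # day after announcement
    h = np.full(T, omega)          # start both components at the long-run level
    q = np.full(T, omega)
    for t in range(1, T):
        q[t] = omega + tau * (q[t-1] - omega) \
               + nu * (eps[t-1]**2 - h[t-1])                          # Eq. (58.8)
        h[t] = (q[t] + alpha * (eps[t-1]**2 - q[t-1])
                + beta * (h[t-1] - q[t-1])
                + rho_m1 * lead[t] + rho_0 * ann[t] + rho_p1 * lag[t])  # Eq. (58.7)
    return h, q
```

With τ near unity, q_t drifts only slowly toward ω, while the transitory gap h_t − q_t dies out with powers of α + β, which is why announcement dummies placed in the transitory equation cannot generate persistent volatility.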
58.7.2 Filter GARCHs

Volatility seasonal filters allow volatility to be different on announcement days while retaining the underlying volatility process. Compared to the simpler GARCH with exogenous variables discussed in the previous section, which in effect allowed for a simple shift in the intercept of the variance equation on announcement days, the new model allows for a regime switch on announcement days. Namely, it models a multiplicative seasonal, (1 + δ_0 I_t^A), which adjusts the magnitude of conditional volatility on announcement days rather than just the intercept term ω. In a way, the model retains the overall decay structure in the conditional variance equation but allows for a seasonal regime shift of conditional variance. The model is outlined by Andersen and Bollerslev (1997) and is presented as

R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + s_t^{1/2} \varepsilon_t \quad (58.9)

s_t = (1 + \delta_{-1} I_{t+1}^A)(1 + \delta_0 I_t^A)(1 + \delta_{+1} I_{t-1}^A) \quad (58.10)

h_t = \omega + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1} \quad (58.11)
where I_t^A is an announcement indicator dummy variable, s_t is the volatility seasonal, and ε_t is a random variable with conditional mean zero and conditional variance h_t, independent of s_t. Even though the volatility seasonal is defined multiplicatively, the announcement and pre- and post-announcement dummies can never take a common realization. In other words, the volatility seasonal allows us to differentiate between announcement and pre- and post-announcement volatility, while on all other days it is equal to unity. We use an announcement dummy for all six types of macroeconomic announcements. In summary, the above specification allows announcement days and pre- and post-announcement days to have different volatility than nonannouncement days. Parameter estimates are obtained using quasi-maximum likelihood estimation with a normal likelihood function. Results are reported in Table 58.6. Starting values for h_t are set to the unconditional variance obtained from a basic GARCH model, as suggested by JLL. Robust Bollerslev-Wooldridge (1992) standard errors are reported.

The models perform well in general, ridding the residuals of autocorrelation and slightly improving the log-likelihood function compared to the simple GARCH(1,1) model with announcement dummies in the variance equation. The Akaike and Schwarz information criteria also improve. First, we examine the model on the CIG bond indices. The results confirm that volatility on announcement days is from 90 to 135% greater than volatility on a normal day. The δ_0 coefficient is strongly significant and, interestingly, decreases with maturity across corporate bonds. Thus, bonds with more time to maturity tend to exhibit a smaller increase in conditional volatility on days with a scheduled macroeconomic announcement. There seems to be no significant difference among the coefficients for the different ratings.
[Table 58.5: Component GARCH(1,1) model of daily corporate bond excess returns with an intercept, an AR(1) term, and an announcement dummy in the mean equation, Equations (58.6)-(58.8); p-values computed using Bollerslev-Wooldridge standard errors. Panels report φ, θ_0, ω, τ, ν, α, β, ρ_{-1}, ρ_0, and ρ_{+1}, each with its p-value, for Corporate Investment Grade indices, Treasuries, and the seven high-yield funds. The coefficient columns are omitted here because the tabular layout did not survive extraction.]
Table 58.6 Filter GARCH(1,1) model of daily corporate bond excess returns with an intercept, an AR(1) term, and an announcement dummy in the mean equation. P-values computed using Bollerslev-Wooldridge standard errors.

Model: $R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + s_t^{1/2}\varepsilon_t$, where $s_t = (1 + \delta_{-1} I_{t+1}^A)(1 + \delta_0 I_t^A)(1 + \delta_{+1} I_{t-1}^A)$

$h_t = \omega + \alpha\varepsilon_{t-1}^2 + \beta h_{t-1}$

[Panels report coefficient estimates and p-values for corporate investment grade bonds (by rating and maturity), Treasuries (by maturity), and high-yield funds.]
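As a rough sketch of how the filter enters, the code below evaluates the variance recursion and the multiplicative announcement scale s_t of Table 58.6 for a given residual series (Python; the dummy series and parameters are placeholders, and the quasi-maximum likelihood estimation itself is not shown):

import numpy as np

def filter_garch_scale(eps, ann, omega, alpha, beta, d_m1, d_0, d_p1):
    # eps: residuals from the mean equation; ann: 0/1 announcement dummies I_t^A.
    # Returns the GARCH(1,1) variance h_t and the announcement filter s_t,
    # so the conditional variance of day t's return is s_t * h_t.
    n = len(eps)
    h = np.full(n, omega / max(1.0 - alpha - beta, 1e-6))  # start at unconditional variance
    s = np.ones(n)
    for t in range(1, n):
        h[t] = omega + alpha * eps[t-1]**2 + beta * h[t-1]
        nxt = ann[t+1] if t + 1 < n else 0     # I_{t+1}^A, the day before an announcement
        s[t] = (1 + d_m1 * nxt) * (1 + d_0 * ann[t]) * (1 + d_p1 * ann[t-1])
    return h, s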
than the BBB rated. Variance is consistently about 12% lower on days before announcements than on normal days. The coefficients are, however, only marginally significant at best; the evidence of a "calm before the storm" effect is therefore limited. On the other hand, there is a clearly significant decrease of about 15% on days after announcements, representing a "calm after the storm" effect. The effect tends to be negatively related to maturity. All variance coefficients, except for δ−1 in some cases, are significant, and the sum of α and β is about 0.97, yielding a half-life of approximately 22.7 days. General volatility shocks therefore persist. There is a slight decrease in the sum of α and β as maturity increases, hinting at slightly lower volatility persistence for longer-term bonds. The coefficient for the AR(1) term in the mean equation is positive and significant, and so is the one for the announcement dummy in the mean equation, again confirming the result that CIGs earn higher returns on announcement days. This announcement excess return is again monotonically increasing with maturity. The results are similar for Treasury bonds. Volatility on announcement days increases by as much as 123% for the lowest maturity Treasuries (1–3 years), and the increase declines with maturity. There are no clear "calm before" or "calm after" effects, whereas with CIGs volatility on the day after an announcement was clearly lower. Treasury bonds earn positive excess returns on announcement days, which increase with maturity, supporting the "exposure to systematic risk" hypothesis. This excess return is commensurate with that of CIGs of similar maturity. Both α and β are significant, and their sum is close to its CIG counterpart; thus general, nonannouncement shocks to excess returns persist. For corporate HY bonds, our results are not as clear. The announcement-day volatility coefficient δ0 is positive and significant in every case except one, suggesting that volatility is 25–160% higher on announcement days. In contrast to the GARCH model of Equation (58.5), announcement-day volatility is now much larger than normal-day volatility for HY funds. The day-before and day-after coefficients are also significant and positive for almost all of the funds. The variance coefficients α and β are significant, but their sum is now quite low, about 0.6, corresponding to a half-life of 1.35 days. The evidence suggests that both announcement and nonannouncement shocks to junk bonds' excess returns do not persist in the filter model. Mean equation coefficients for four of the seven HY funds support the claim that HY bonds earn a positive excess return on announcement days.
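The half-lives quoted above follow directly from the persistence parameter α + β; a quick check of the arithmetic (small discrepancies with the text are rounding):

import math

def garch_half_life(persistence):
    # Days for a variance shock to decay to half its size: ln(1/2) / ln(alpha + beta).
    return math.log(0.5) / math.log(persistence)

print(round(garch_half_life(0.97), 1))   # ~22.8 days for CIGs and Treasuries
print(round(garch_half_life(0.60), 2))   # ~1.36 days for the HY funds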
58.7.3 A Modified Filter GARCH

We generally find support that shocks to full-sample volatility of excess returns persist for CIG and Treasury bonds. However, we are interested in the persistence of announcement
shocks and announcement volatility in particular. We would like to test the hypothesis that macroeconomic news is quickly incorporated into CIG prices and that news effects do not linger. Thus we employ an alternative specification of the filter GARCH(1,1) that more specifically accommodates such a test. The model is

$R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + s_t^{1/2}\varepsilon_t$  (58.12)

$s_t = (1 + \delta_{-1} I_{t+1}^A)(1 + \delta_0 I_t^A)(1 + \delta_{+1} I_{t-1}^A)$  (58.13)

$h_t = \omega + (\alpha_0 + \alpha_A I_{t-1}^A)\,\varepsilon_{t-1}^2 + (\beta_0 + \beta_A I_{t-1}^A)\, h_{t-1}$  (58.14)

where announcement-day shocks are allowed to feed differently into conditional volatility via αA. Also, on days after announcements (t−1 in this system), the importance of lagged conditional volatility varies through βA. Thus the specification allows announcement and nonannouncement shocks and volatility to have different effects on conditional volatility. It is noteworthy that if αA = βA = 0, the model reduces to the simple filter model above; this regime-switching model therefore nests the benchmark model. The model also nests the specification in which announcement shocks do not persist at all, that is, αA = −α0 and βA = δ+1 = 0, in which case volatility evolves by completely disregarding announcement-day shocks. First, for the CIGs and the Treasuries, we compute Wald coefficient statistics to test the benchmark filter model against the modified filter. Coefficient values and results are reported in Table 58.7. We are able to reject the null of αA = βA = 0 in favor of the modified filter model. The model also leads to improvements in the log-likelihood function, improves both the Akaike and Schwarz information criteria, and purges the residuals of autocorrelation. Our CIG results confirm the finding of increased conditional volatility on announcement days (δ0 is positive and significant) of, for example, about 115% for the shortest maturity CIGs and about 83% for long-term BBB CIGs. All δ0 values are clearly significant. Again, δ0 decreases with maturity and is not distinguishably different among ratings. The "calm before the storm" effect (δ−1) is now significant, and the "calm after" effect (δ+1) is not. CIGs still earn a positive excess return on announcement days, which increases with maturity. The αA coefficient is consistently negative and very close in magnitude to α0. A Wald coefficient test fails to reject the hypothesis that announcement shocks do not persist at all, αA = −α0. However, αA on its own is often insignificant or marginal. The estimates of βA are all negative and statistically significant. The results support the hypothesis that on days after announcements, conditional volatility decays at a much faster rate than on regular days. In particular, announcement-day volatility feeds much less into forecasts of future variance, and thus the above-normal announcement volatility does not persist much.
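A minimal sketch of the regime-switching variance recursion (58.14), together with a generic Wald statistic for a restriction such as αA = βA = 0 (Python; theta and cov are assumed to come from a quasi-maximum likelihood fit, which is not shown):

import numpy as np

def modified_filter_variance(eps, ann, omega, a0, aA, b0, bA):
    # Eq. (58.14): announcement-day shocks and lagged volatility carry
    # their own loadings; aA = bA = 0 recovers the simple filter model.
    n = len(eps)
    h = np.full(n, omega)
    for t in range(1, n):
        h[t] = (omega
                + (a0 + aA * ann[t-1]) * eps[t-1]**2
                + (b0 + bA * ann[t-1]) * h[t-1])
    return h

def wald_stat(theta, cov, idx):
    # Wald statistic for H0: theta[idx] = 0; chi-squared with len(idx) dof.
    r = theta[idx]
    v = cov[np.ix_(idx, idx)]
    return float(r @ np.linalg.solve(v, r))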
Table 58.7 Modified filter GARCH(1,1) model of daily corporate bond excess returns with an intercept, an AR(1) term, and an announcement dummy in the mean equation. P-values computed using Bollerslev-Wooldridge standard errors.

Model: $R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + s_t^{1/2}\varepsilon_t$, $s_t = (1 + \delta_{-1} I_{t+1}^A)(1 + \delta_0 I_t^A)(1 + \delta_{+1} I_{t-1}^A)$

$h_t = \omega + (\alpha_0 + \alpha_A I_{t-1}^A)\,\varepsilon_{t-1}^2 + (\beta_0 + \beta_A I_{t-1}^A)\, h_{t-1}$

[Panels report coefficient estimates and p-values for corporate investment grade bonds (by rating and maturity), Treasuries (by maturity), and high-yield funds.]
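The log-likelihood comparisons used to judge these nested models rely on standard information criteria; for reference, a minimal sketch of the common definitions (smaller is better in both):

import math

def aic(loglik, k):
    # Akaike information criterion for k estimated parameters.
    return 2 * k - 2 * loglik

def sic(loglik, k, n):
    # Schwarz (Bayesian) information criterion for n observations.
    return k * math.log(n) - 2 * loglik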
JLL found their Treasury βA coefficient insignificant, thus claiming that shocks decay at their normal rate. In contrast, our βA coefficients are significantly negative, supporting the claim that on days after announcements conditional volatility decays at a much faster rate. The modified beta (the sum of the two beta coefficients) is about 16% lower than the regular filter GARCH beta, dramatically decreasing the half-life from 23 to about 3 days. We suggest that announcement-day shocks to CIG bond variance still do not persist; however, their dissipation process differs from the one JLL found for Treasury bonds. While JLL's Treasury test results did not find the higher announcement shock to feed into forecasts of future variance, our CIG tests do not find the higher announcement-day volatility to permeate into future variance forecasts. The result still confirms the finding from the GARCH model with exogenous variables in the variance equation, where volatility reverted back to its normal level on days after announcements. Our contrasting results for CIGs and Treasury bonds show that forecasting post-announcement conditional volatility for CIGs would have to take their different evolutionary pattern into consideration. Our results are important to researchers and practitioners concerned with the differential dynamics of bond excess return volatility and differential announcement shock formation and effect. This could have important implications for hedging CIGs with Treasuries. We confirm the findings with our own Treasury data, noting that for Treasuries, αA is significant and a Wald coefficient restriction test fails to reject that it equals −α0. βA is also negative and significant, except for the shortest maturity. The results mark a slight departure from JLL's form of nonpersistent announcement volatility for Treasury bonds. Namely, we find that for Treasuries, neither the announcement-day shocks nor the higher announcement-day volatility is permitted to permeate into conditional variance forecasts – whereas JLL only found evidence of the former. Our CIG returns assured the dissipation of announcement-day shocks only by preventing announcement-day conditional volatility from affecting future variance. In addition, the announcement-day volatility coefficient (δ0) declines with maturity and is much smaller now (the maximum is 0.97) than in the simple filter model. There are now no calm-after and calm-before effects. The excess announcement return is once again positive and increases monotonically with maturity, just as it did for CIGs. For HY bonds, we found that shocks generally do not persist, so it comes as no surprise that the regime-switching filter model does not fare well. For three of the seven HY funds, we cannot reject the hypothesis that αA = βA = 0, thus ruling in favor of the simple nested filter model. The results are no surprise given the findings for CIGs and Treasuries: we simply find that for HY excess returns, announcement-day shocks' persistence characteristics are
identical to those of general shocks; namely, they do not persist and dissipate quickly. For Vanguard's VWEHX fund, the persistence filter model is not even well defined, resulting in a singular covariance matrix. In addition, the HY announcement-day return is significant in only two cases. For the remaining three HY funds, however, we reject the hypothesis that αA = βA = 0 in favor of the regime-switching model. For these same three funds, this translates into an improvement in the log-likelihood function. Generally, when significant, the αA coefficient is negative, suggesting that announcement shocks not only do not permeate into future conditional variance but might actually decrease it. When significant, the βA coefficient is usually negative, indicating that announcement-day conditional volatility decays at a much faster rate than nonannouncement-day volatility. Thus, general shocks, as evidenced by the sum of α and β, do not persist for HY funds, and announcement-day shocks persist even less. The model detects higher announcement-day conditional variance and significant day-after effects of mixed signs. In summary, even though we found different dissipation processes for announcement-day volatility in our CIG and Treasury bonds, both ensure that higher announcement-day volatility and shocks do not perpetuate an increased conditional variance forecast. This lack of persistence of announcement shocks is consistent with the hypothesis that public news reduces information asymmetry after disclosure. As in Li and Engle (1998), macro announcements could be claimed to make traders' learning process shorter than on nonannouncement days, which are characterized by private or poor-quality information. High-quality, unambiguous macroeconomic information on preset announcement days gets quickly impounded into CIG and Treasury bond prices, in contrast to the usual setting of imperfect news and high informational asymmetry that gives rise to autoregressive conditional heteroskedasticity. For HY bonds, we find that neither general nor announcement-day shocks tend to persist. The finding suggests that lower-grade bonds exhibit a different price formation process than CIGs and Treasuries. The effect macroeconomic announcements exert on conditional forecasts of excess return variance is quite similar to the one nonannouncement shocks have. The finding suggests that low-grade corporate bond prices assimilate information through a more complicated process, possibly reflecting a reaction to a multitude of market forces and factors that may have offsetting effects.
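To see the reported speed-up in numbers: scaling the filter-model persistence of about 0.97 down by roughly 16% on post-announcement days gives, reusing the half-life formula from earlier (an illustrative back-of-the-envelope computation):

import math

beta_filter = 0.97                       # approximate persistence in the simple filter model
beta_post = beta_filter * (1 - 0.16)     # ~16% lower after announcements
print(round(math.log(0.5) / math.log(beta_post), 1))   # ~3.4 days, versus ~23 on normal days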
58.7.4 Tests for Asymmetric Volatility

Numerous researchers have found that the impact of news on conditional volatility is often asymmetric. The nature of our data permits an opportunity to test alternative hypotheses of
asymmetry. Early observations of asymmetry in equity markets were that bad news (negative residuals) increased conditional volatility more than good news (positive residuals). Black (1976) explained that this was potentially due to the fact that firms were more levered if bad news reduced equity values. Another explanation of the asymmetric effect is that asset values decline at time t−1 if forecasted conditional variance at time t rises. Thus at t−1 a negative residual occurs, and the effect is called "volatility feedback," as in Campbell and Hentschel (1992). Here the drop in prices is assumed to be a result of a forecasted increase in volatility. (In the leverage effect, a decline in price precedes the increase in volatility.) Corporate bond data provide an opportunity to test alternative explanations of asymmetric volatility. The leverage effect predicts that bad news (negative ε_{t−1}) generates a negative effect upon CIG volatility, as bad news (higher yields) reduces the value of debt and the firm is thus less levered. Another explanation of the leverage effect for CIGs is that yields have risen (causing negative ε_{t−1}) and, to the advantage of the firm, the firm's debt pays less coupon interest than current market rates. On the other hand, the volatility feedback effect predicts that bad news should be positively related to volatility, as higher forecasted volatility for time t reduces time t−1 value and causes negative ε_{t−1}. For HY bonds, the theory for CIGs applies, but because HY bonds include a large equity component, bad news may raise volatility (a positive impact) as it would for equity instruments. The results may suggest one effect dominates. Our data also include Treasury bonds, where the leverage effect does not apply but volatility feedback does. Thus, only the volatility feedback effect applies for Treasuries, and the asymmetric volatility effect should be positive. We test the asymmetry effect with the following model:

$R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + s_t^{1/2}\varepsilon_t$  (58.15)

$s_t = (1 + \delta_{-1} I_{t+1}^A)(1 + \delta I_t^A)(1 + \delta_{+1} I_{t-1}^A)$  (58.16)

$h_t = \omega + \big(\alpha + (\gamma + \gamma_A I_{t-1}^A)\, I_{t-1}^-\big)\,\varepsilon_{t-1}^2 + \beta h_{t-1}$  (58.17)

where $I_{t-1}^-$ is an indicator variable equal to 1 if ε_{t−1} < 0, and 0 otherwise. Significant values for γ indicate an asymmetric effect. This specification allows nonannouncement bad news to feed differently (α + γ) into conditional volatility than announcement bad news (α + γ + γA). Table 58.8 describes the model's results. For CIGs, we find evidence of significant negative asymmetric effects for the short and intermediate maturities. In the cases where we find asymmetry, we slightly improve the log-likelihood function and, in some cases, the Akaike and Schwarz criteria. The negative gamma suggests the leverage effect dominates the volatility feedback effect. γA is significant in only one case, suggesting that negative announcement-day shocks usually have no different effect on conditional variance than general negative shocks. The α and β coefficients and the mean equation coefficients are almost identical to their simple filter model counterparts. For Treasury bonds, we fail to find any asymmetry evidence at the 5% significance level, except for the 7–10 year maturities, where γ is negative. The model detects marginal negative asymmetry in the short and intermediate maturities, thus improving the log-likelihoods and the information criteria. For HY bonds, we find that four of the seven funds display a significant positive γ. Only Vanguard's VWEHX fund displays a significant, negative γA. The γ results are noteworthy in light of HY bonds' dual debt-equity nature. Since HY bonds could be expected to behave similarly to equity, our positive γ coefficient is in agreement with the positive leverage effect explaining equity asymmetry. We next estimated an array of alternative specifications (results not reported), allowing for a variance filter with asymmetry, GARCH-in-mean effects of announcement and nonannouncement conditional variance, and different persistence specifications. No additional insight was gained from any of these. In summary, CIGs tend to exhibit negative asymmetric volatility while HYs tend to exhibit positive asymmetric volatility. The difference may be attributed to the greater equity component of HY bonds. Treasuries tend to show negative asymmetry, but the evidence is weak.
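A minimal sketch of the asymmetric variance recursion (58.17), in which negative residuals carry the extra loading γ (and γ + γA on announcement days), in the spirit of a GJR-type specification (Python; parameters are placeholders, not the Table 58.8 estimates):

import numpy as np

def asymmetric_filter_variance(eps, ann, omega, alpha, beta, gamma, gamma_A):
    # Eq. (58.17): I^-_{t-1} flags bad news (eps_{t-1} < 0); its impact on
    # h_t is alpha + gamma on normal days and alpha + gamma + gamma_A
    # following announcement days.
    n = len(eps)
    h = np.full(n, omega)
    for t in range(1, n):
        neg = 1.0 if eps[t-1] < 0 else 0.0
        h[t] = (omega
                + (alpha + (gamma + gamma_A * ann[t-1]) * neg) * eps[t-1]**2
                + beta * h[t-1])
    return h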
58.8 Conclusion

We find that CIG bond excess return volatility on announcement days is almost twice that of normal days. Since nonautocorrelated macroeconomic announcements result in nonpersistent volatility increases, we reject the hypothesis that the particular market design or trading process gives rise to autocorrelated volatility. Rather, the phenomenon is ascribed to the news-generating process, which gives rise to autocorrelated news arrivals. Our modified seasonal filter GARCH models detect that Treasuries' announcement shock itself is prohibited from perpetuating an increase in future conditional variance, while for CIGs it is the high announcement-day conditional variance that gets dissipated faster. This short-run difference in volatility dynamics is significant for hedging purposes, as we explained above. Consistent with theory suggested by Longstaff and Schwartz (1995) and Collin-Dufresne and Goldstein (2001), we find that 1–3 year maturity CIG excess returns exhibit
Table 58.8 Asymmetric filter GARCH(1,1) model of daily corporate bond excess returns with an intercept, an AR(1) term, and an announcement dummy in the mean equation. P-values computed using Bollerslev-Wooldridge standard errors.

Model: $R_t = \mu + \phi R_{t-1} + \theta_0 I_t^A + s_t^{1/2}\varepsilon_t$, $s_t = (1 + \delta_{-1} I_{t+1}^A)(1 + \delta I_t^A)(1 + \delta_{+1} I_{t-1}^A)$

$h_t = \omega + \big(\alpha + (\gamma + \gamma_A I_{t-1}^A)\, I_{t-1}^-\big)\,\varepsilon_{t-1}^2 + \beta h_{t-1}$

[Panels report coefficient estimates and p-values for corporate investment grade bonds (by rating and maturity), Treasuries (by maturity), and high-yield funds.]
a slightly lower volatility increase on days with macroeconomic announcements than their Treasury counterparts. For the longer maturities, announcement-day volatility of Treasuries and CIGs is similar, with CIG volatility only slightly higher. Excess return announcement-day volatility decreases with maturity for both corporate and Treasury bonds, a result inconsistent with Fleming and Remolona (1999a, b) and Dai and Singleton (2000), who suggest a hump-shaped pattern. All CIG and Treasury bonds earn positive and significant excess returns on announcement days across models. These returns increase monotonically with maturity. We also find that the previous day's return possesses some explanatory power for excess returns. CIG and Treasury bond excess returns exhibit strong GARCH effects, with general shocks lasting for some time, unlike announcement-day shocks. HY bond excess returns behave differently from their CIG and Treasury bond counterparts. Only one HY fund earns consistently positive and significant excess returns on announcement days. Nonannouncement shocks to HY bond excess returns do not persist and are not a factor in forecasting future conditional variance. Evidence that HY volatility increases on announcement days is comparatively weak; only a sophisticated regime-switching model detects that announcement-day volatility is greater. As with nonannouncement-day volatility, announcement-day shocks also do not persist. There is slight evidence that high announcement-day volatility gets reversed on the day after the announcement. We find that CIG and HY bonds display differing asymmetric effects on conditional variance: negative residuals tend to reduce CIG conditional variance but increase HY conditional variance, whereas Treasury bonds display little evidence of asymmetry. Of the six types of macroeconomic announcements examined, only retail sales announcements have a significant positive impact on both announcement-day excess returns and the conditional variance of Treasury and CIG bonds. Unemployment and CPI announcements seem to have only a significant positive effect on announcement-day conditional variance, while employment cost and PPI announcements are only associated with a significant positive announcement-day excess return.
References

Andersen, T. G. and T. Bollerslev. 1997. "Intraday periodicity and volatility persistence in financial markets." Journal of Empirical Finance 4(2–3), 115–157.
Black, F. 1976. "Studies in stock price volatility changes," Proceedings of the 1976 business meeting of the business and economic statistics section, American Statistical Association, pp. 177–181.
Blume, M. E., D. B. Keim and S. A. Patel. 1991. "Returns and volatility of low-grade bonds, 1977–1989." Journal of Finance 46(1), 49–74.
Bollerslev, T. 1986. "Generalized autoregressive conditional heteroskedasticity." Journal of Econometrics 31(3), 307–328.
Bollerslev, T. and J. Wooldridge. 1992. "Quasi-maximum likelihood estimation and inference in dynamic models with time varying covariances." Econometric Reviews 11, 143–172.
Boyd, J. H., J. N. Hu and R. Jagannathan. 2005. "The stock market's reaction to unemployment news: 'why bad news is usually good for stocks.'" Journal of Finance 60(2), 649–672.
Brock, W. A. and B. D. LeBaron. 1996. "A dynamic structural model for stock return volatility and trading volume." Review of Economics and Statistics 78(1), 94–110.
Campbell, J. Y. 1995. "Some lessons from the yield curve." Journal of Economic Perspectives 9(3), 129–152.
Campbell, J. Y. and L. Hentschel. 1992. "No news is good news: an asymmetric model of changing volatility in stock returns." Journal of Financial Economics 31(3), 281–318.
Collin-Dufresne, P. and R. S. Goldstein. 2001. "Do credit spreads reflect stationary leverage ratios?" The Journal of Finance 56(5), 1929–1958.
Collin-Dufresne, P., R. S. Goldstein and S. J. Martin. 2001. "The determinants of credit spreads." The Journal of Finance 56(6), 2177–2207.
Cornell, B. and K. Green. 1991. "The investment performance of low-grade bond funds." The Journal of Finance 46(1), 29–48.
Crabbe, L. and J. Helwege. 1994. "Alternative tests of agency theories of callable corporate bonds." Financial Management 23(4), 3–20.
Dai, Q. and K. J. Singleton. 2000. "Specification analysis of affine term structure models." Journal of Finance 55(5), 1943–1978.
Daouk, H. 2001. "Explaining asymmetric volatility: a new approach," Western Finance Association Meetings, Tucson, Arizona.
Diebold, F. X. and M. Nerlove. 1989. "The dynamics of exchange rate volatility: a multivariate latent factor ARCH model." Journal of Applied Econometrics 4(1), 1–22.
Duffee, G. R. 1999. "Estimating the price of default risk." Review of Financial Studies 12(1), 197–226.
Ederington, L. and J. Lee. 1993. "How markets process information: news releases and volatility." Journal of Finance 48, 1161–1191.
Ederington, L. and J. Lee. 2001. "Intraday volatility in interest rate and foreign exchange markets: ARCH, announcement, and seasonality effects." Journal of Futures Markets 21, 517–522.
Elton, E. J., M. J. Gruber, D. Agrawal and C. Mann. 2001. "Explaining the rate spread on corporate bonds." Journal of Finance 56(1), 247–277.
Engle, R. F. and G. G. J. Lee. 1993. "A permanent and transitory component model of stock return volatility." Discussion Paper, University of California, San Diego.
Engle, R. F. and G. G. J. Lee. 1999. "A permanent and transitory component model of stock return volatility," in Cointegration, causality and forecasting: a festschrift in honour of Clive W.J. Granger, R. F. Engle and H. White (Eds.). Oxford University Press, Oxford.
Fabozzi, F. 2000. Bond markets, analysis and strategies, 4th edn. Prentice Hall, Upper Saddle River, NJ.
Flannery, M. and A. Protopapadakis. 2002. "Macroeconomic factors do influence aggregate stock returns." Review of Financial Studies 15(3), 751–782.
Fleming, M. J. and E. M. Remolona. 1997. "What moves the bond market?" Federal Reserve Bank of New York Economic Policy Review 3(4), 31–50.
Fleming, M. J. and E. M. Remolona. 1999a. "What moves bond prices?" Journal of Portfolio Management 25(4), 28–38.
Fleming, M. J. and E. M. Remolona. 1999b. "Price formation and liquidity in the U.S. Treasury market: the response to public information." Journal of Finance 54(5), 1901–1915 (working paper version in Federal Reserve Bank of New York Staff Reports No. 27, July 1997).
Grunbichler, A., F. A. Longstaff and E. S. Schwartz. 1994. "Electronic screen trading and the transmission of information: an empirical examination." Journal of Financial Intermediation 3(2), 166–187.
Harris, M. and A. Raviv. 1993. "Differences of opinion make a horse race." Review of Financial Studies 6(3), 473–506.
Heston, S. L. and S. Nandi. 2000. "A closed-form GARCH option valuation model." Review of Financial Studies 13(3), 585–625.
Huberman, G. and G. W. Schwert. 1985. "Information aggregation, inflation, and the pricing of indexed bonds." Journal of Political Economy 93, 92–114.
Jones, C. M., O. Lamont and R. L. Lumsdaine. 1998. "Macroeconomic news and bond market volatility." Journal of Financial Economics 47(3), 315–337.
Kandel, E. and M. Pearson. 1995. "Differential interpretation of information and trade in speculative markets." Journal of Political Economy 103(4), 831–872.
Kish, R. J. and M. Livingston. 1992. "Determinants of the call option on corporate bonds." Journal of Banking and Finance 16(4), 687–704.
Kwan, S. H. 1996. "Firm-specific information and the correlation between individual stocks and bonds." Journal of Financial Economics 40(1), 63–80.
Li, L. and R. F. Engle. 1998. "Macroeconomic announcements and volatility of treasury futures," EFMA Meetings, Helsinki.
Lo, A. W. and A. C. MacKinlay. 1988. "Stock market prices do not follow random walks: evidence from a simple specification test." Review of Financial Studies 1(1), 41–66.
Longstaff, F. A. and E. S. Schwartz. 1995. "A simple approach to valuing risky fixed and floating rate debt." Journal of Finance 50(3), 789–819.
Lopez, J. 2001. "Financial instruments for mitigating credit risk," Economic Letter, Federal Reserve Bank of San Francisco, 23 Nov 2001.
McQueen, G. and V. V. Roley. 1993. "Stock prices, news, and business conditions." Review of Financial Studies 6(3), 683–707.
Nelson, D. 1990. "ARCH models as diffusion approximations." Journal of Econometrics 45, 7–38.
Pedrosa, M. and R. Roll. 1998. "Systematic risk in corporate bond credit spreads." Journal of Fixed Income 8(3), 7–26.
Ritchken, P. H. and R. Trevor. 1999. "Pricing options under generalized GARCH and stochastic volatility processes." Journal of Finance 54, 377–402.
Stock, D. 1992. "The analytics of relative holding-period risk for bonds." Journal of Financial Research 15(3), 253–264.
Van Horne, J. C. 1998. Financial markets rates and flows, Prentice-Hall, Upper Saddle River, NJ.
Weinstein, M. I. 1983. "Bond systematic risk and the option pricing model." Journal of Finance 38(5), 1415–1429.
Weinstein, M. 1985. "The equity component of corporate bonds." Journal of Portfolio Management 11(3), 37–41.
White, H. 1980. "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity." Econometrica 48(4), 817–838.
Zitzewitz, E. 2003. "Who cares about shareholders? Arbitrage-proofing mutual funds." The Journal of Law, Economics, and Organization 19(2), 245–280.
Appendix 58A
Family          Ticker   Average maturity   Average quality   Net assets (in billions)
Vanguard        VWEHX    6.8 years          BB                $7
Fidelity        SPHIX    5.3 years          B                 $2.51
Fidelity        FAGIX    NA                 NA                $4.11
Fidelity        FAHYX    NA                 B                 $1.39
Federated       FHYTX    4.91 years         B                 $0.621
Merrill Lynch   MAHIX    6.38 years         B                 $0.456
Putnam          PHBBX    6.22 years         BB                $0.942
Chapter 59
Raw Material Convenience Yields and Business Cycle

Chang-Wen Duan and William T. Lin
Abstract This paper extends the methodology of Milonas and Thomadakis (1997) to estimate raw material convenience yields with futures prices during the period 1996 to 2005. We define the business cycle of a seasonal commodity with demand/supply shocks and find that the convenience yields for crude oil and agricultural commodities exhibit seasonal behavior. The convenience yield for crude oil is highest in the winter, while those for agricultural commodities are highest in the initial stage of the harvest period. The empirical results show that WTI crude oil is more sensitive to high winter demand and that Brent crude oil is more sensitive to shortages in winter supply. The theory of storage holds that the marginal convenience yield on inventory falls at a decreasing rate as inventory increases; we verify this for the products affected by seasonality, but cannot observe it for the products affected by demand/supply shocks. Convenience yields are negatively related to interest rates. This negative relationship implies that an increase in the carrying cost of a commodity – namely, the interest rate – causes the yield from holding spot to decline. We also show that convenience yields may explain the price spread between WTI and Brent crude oil as well as the price ratio between soybeans and corn. Our estimated convenience yields are consistent with Fama and French (1988) in that commodity prices are more volatile than futures prices at low inventory levels, verifying the Samuelson (1965) hypothesis that futures prices are less variable than spot prices at lower inventory levels.

Keywords Business cycle · Convenience yield · Crop year · Demand/supply shock · Theory of storage · Two-period model
C.-W. Duan and W.T. Lin Department of Banking and Finance, Tamkang University, 151 Ying-Chuan Road, Tamsui, Taipei County 25137, Taiwan e-mail: [email protected]
59.1 Introduction

This paper applies a call option model to estimate convenience yields for four storable commodities. The empirical results are derived from the analysis of price and stock data covering the period 1996 to 2005.1 Our estimated convenience yields extend the Fisher (1978) model, which utilizes a non-traded asset as the strike price. We instead set the price of a traded asset to be the strike price of the option in our model. In addition, our analysis differs from the Milonas and Thomadakis (1997) model in that we not only use the price of a traded asset, such as a commodity futures price, as the strike price but also add in storage cost when estimating the convenience yield. Our purpose is to examine convenience yields while taking into account inventory levels, volatility, and interest rates, evaluating the consistency between our empirical study and the theory of storage. Keynes (1930) points out in the theory of liquid stocks that the risk associated with holding spot goods is much higher than that of holding forward contracts. When the inventory level is lower than planned due to an unexpected demand/supply shock, the spot price will be higher than the forward price. This price difference is regarded as a risk premium, and such price behavior is referred to as "backwardation." Convenience yield plays a key role in the theory of storage and serves as an incentive to hold spot commodities. In the theory of storage, Kaldor (1939), Working (1948, 1949), Brennan (1958), and Telser (1958) use convenience yields to explain this inverted market phenomenon where spot prices are higher than futures prices, and convenience yields can be viewed as a benefit of holding storable consumption goods. According to the theory of storage, a high inventory level implies a lower probability of a stockout in the future. Open futures contracts help lower excess demand in the spot market, hence reducing the benefit of holding the commodity. Consequently, convenience yields are negatively related to inventory levels. Such a relationship is stronger for a commodity that is more
1 The sample period for futures is from 1995 to 2006.
sensitive to seasonality or supply/demand effects. Therefore, convenience yield plays a central role in explaining the benefits of holding inventory during periods of unexpected demand/supply shocks. The prices of agricultural commodities are subject to the seasonal effect of crop growth. Crude oil, though unlike crops, also has a seasonal cycle of supply and demand; when crude oil supply and demand are in disequilibrium, the market finds equilibrium by adjusting spot prices. The process of market adjustment from disequilibrium to equilibrium may be treated as a business cycle. Given that crude oil is a strategic resource whose price and output fluctuations have a significant impact on the world economy, crude oil users usually hold crude oil to shield themselves from the volatility of crude oil prices. Thus, our empirical samples are WTI crude oil, Brent crude oil, soybeans, and corn. Consistent with the theory of storage, Samuelson (1965) predicts that spot and futures price variations will be similar when a supply/demand shock occurs at higher inventory levels, but spot prices will be more variable than futures prices at lower inventory levels. Fama and French (1987) provide an empirical study of the theory of storage by showing that convenience yields do vary seasonally for most agricultural and animal products but not for metals. They also find that seasonalities are significant in explaining agricultural commodities' futures basis. Heinkel et al. (1990) derive a model in which convenience yields behave like options. Their two-period call option model also supports an inverse relationship between inventory levels and convenience yields. In particular, convenience yields arise from unexpected supply/demand in the cycle of spot markets, which needs to be considered when estimating convenience yields. In their empirical study of metal commodities, Fama and French (1988) suggest that metal inventories and prices are affected not by seasonality but by general business conditions. Their evidence indicates that metal production does not adjust quickly to positive demand shocks during business-cycle peaks. Previous studies estimate convenience yields using the cost-of-carry model, where the convenience yield is treated as an exogenous variable. Brennan (1986) tests several empirical convenience yield models and finds that convenience yields follow a mean-reverting process. Gibson and Schwartz (1990) model the convenience yield as a stochastic mean-reverting variable connected to the time to maturity of a futures contract, but they assume an exogenously specified convenience yield measure. Casassus and Collin-Dufresne (2005) use a three-factor Gaussian model to capture commodity futures prices and integrate all three variables analyzed by Schwartz (1997). Their model allows convenience yields to depend on spot prices and interest rates, which leads to mean-reverting convenience
yields as seen in Gibson and Schwartz (1990). Milonas and Thomadakis (1997) find that mean-reverting behavior does not necessarily occur for commodities with seasonal business cycles, so it is inappropriate to assume convenience yields follow an exogenous stochastic mean-reverting process. This is because the business cycle affects supply, inventory, and demand in a systematic manner, which is not necessarily consistent with mean reversion. Most theoretical models of convenience yields assume that storage costs are zero, as in Fama and French (1988). They use the interest-adjusted basis as a proxy, which avoids the difficulty of estimating storage cost, and develop a relationship between convenience yield and inventory level without directly estimating the storage costs. To the extent that the studies above are subject to empirical verification, they do not offer explicit measures of the determining variables in the context of an option model. Milonas and Thomadakis (1997) extend the option approach of Heinkel et al. (1990) with the Black–Scholes model to estimate convenience yields. Although they model convenience yields as call options, the approach is problematic in that convenience yields reenter the call option equations when spot prices are used as underlying variables, which requires estimating an unknown variable. In addition, storage cost is ignored, which could result in a negative convenience yield in their estimation with a cost-of-carry model. When estimating convenience yields in this paper, we consider unexpected demand/supply shocks, the business cycle, and the crop cycle, and we specify the beginning month, the intermediate month, and the final month of these commodities' business cycles. To deal with the two issues, we first choose, under an option pricing framework, the price of a futures contract maturing in the intermediate month as the underlying variable, hereafter termed the nearby contract. Second, unlike Fisher (1978), which utilizes a non-traded asset as the strike price, we instead set the price of a traded asset to be the strike price of the option in our model. For the traded asset, we use a futures contract maturing in the final month, hereafter termed the distant contract. This resolves the problem of the unknown variable encountered in Milonas and Thomadakis (1997). Convenience yields of crude oils and agricultural commodities are then estimated assuming that the price of the underlying asset and the strike price are both stochastic variables. Storage costs are also incorporated to avoid potential problems in their study. Fama and French (1988) use a simple proxy for the level of inventory to test the theory of storage, with results consistent with the Samuelson (1965) proposition that futures prices are less variable than spot prices at lower inventory levels. Therefore, we adopt real inventory to examine the theory of storage in our estimation and use the approach of Fama and French to test its implications about the variation of spot and futures prices.
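For contrast with the option-based estimator developed below, the cost-of-carry relation mentioned above pins down an implied convenience yield once spot, futures, interest rate, and storage cost are known; a minimal continuous-compounding sketch (Python; the futures quote in the example is hypothetical, while the spot is the WTI sample mean from Table 59.1):

import math

def implied_convenience_yield(spot, futures, r, w, tau):
    # Invert the cost-of-carry relation F = S * exp((r + w - c) * tau)
    # for the convenience yield c; r (interest) and w (storage cost) are
    # annualized continuously compounded rates, tau is years to maturity.
    return r + w - math.log(futures / spot) / tau

# Example: spot 28.67 US$/barrel, a (hypothetical) 3-month futures at 29.10,
# 5% interest, 2% storage cost:
print(round(implied_convenience_yield(28.67, 29.10, 0.05, 0.02, 0.25), 4))  # ~0.0104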
Few studies examine how interest rates affect convenience yields. The theory of storage suggests that holding inventory becomes more costly in periods of high interest rates, so we may expect a negative correlation between interest rates and inventory. However, to the extent that inventory and interest rates are both relative measures, it seems consistent with the theory to find a relationship between interest rates and convenience yields. Casassus and Collin-Dufresne (2005) show that the sensitivity of convenience yields to interest rates is positive for crude oil, copper, gold, and silver, which is consistent with the theory of storage. Furthermore, as interest rates are related to economic activity, interest rates in turn affect the convenience yields of various commodities. We therefore also examine the relationship between convenience yields and interest rates with a cross-sectional regression model. In the sections that follow, Sect. 59.2 discusses the characteristics of the crude oil and agricultural commodities, Sect. 59.3 presents the model, Sect. 59.4 describes the data, and Sect. 59.5 explains the empirical analysis. Finally, Sect. 59.6 provides the concluding remarks.
59.2 Characteristics of Study Commodities

59.2.1 Agricultural Commodities

The harvests of agricultural commodities are mainly affected by the weather during their planting period, and thus the prices of agricultural commodities adjust seasonally. A period starting from the initial stage of crop harvest and running to the next sowing is a so-called crop year. From the first day of a crop year, agricultural commodity futures traded on exchanges can be delivered, so the delivery months of agricultural commodity futures usually match the harvest period. In the U.S., soybeans and corn are mostly produced in Iowa and Illinois and are sowed before May and harvested between July and September, so that the first day of their crop year is September 1 and the last day is August 31. Therefore, the first expiration contract month of the crop year for the CBOT soybean futures contract is September, which is followed by six further expiration months: November, January, March, May, July, and August. However, most of the corn harvest cannot catch up with a September contract, so the first expiration month of the crop year for CBOT corn futures is December, followed by the expiration months of March, May, July, and August – five contracts in all. On November 19, 2007, the turnovers of corn and soybean futures on the CBOT were 167,266 and 122,894 lots respectively, showing that corn has the relatively higher turnover. During its life cycle, corn must absorb nitrogen from the soil to grow well, whereas soybeans gradually release nitrogen into the soil during their growing period. Therefore, in the U.S. soybeans and corn2 are viewed as rotation crops because they balance the nitrogen in the soil and thus produce a higher harvest. Generally, farmers start to prepare the soil for sowing both corn and soybeans around April and May, and thus the price ratio of the two before sowing is the basis on which farmers rotate crops. According to USDA data, every acre of tillage produces an average of 31 bushels of soybeans or 108 bushels of corn, so that the soybean-corn yield ratio (S/C) is about 3.5; soybeans are therefore naturally more expensive than corn, which we can observe from Table 59.1.
Brent spot price (US$/barrel)
2
The biggest producer for corn and soybean is the U.S.
Corn spot price (US cents/bushel)
Soybean/corn
Soybean spot price (US cents/bushel)
Year
N
Mean
¢
N
Mean
¢
N
Mean
¢
Sp =Cp
N
Mean
¢
1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 Total Skewness Kurtosis
255 257 259 256 260 254 252 251 249 251 2544 1.2273 1.1242
21.95 20.61 14.44 19.33 30.16 25.90 26.15 31.10 41.49 56.61 28.67
2.24 1.82 1.57 4.56 2.98 3.56 3.22 2.62 5.75 6.27 12.25
257 254 257 254 253 254 254 254 256 254 2547 1.3441 1.4247
20.56 19.16 12.79 17.24 26.30 23.21 25.00 28.78 38.31 55.10 26.63
2.23 1.86 1.56 4.49 3.03 2.68 2.93 2.51 5.82 6.24 12.14
254 253 252 251 252 250 252 251 251 252 2518 2.1765 5.3134
396.59 282.04 237.38 206.48 204.05 206.66 231.13 241.85 257.10 208.21 247.33
81.55 14.86 28.87 14.25 22.08 9.06 25.41 10.37 43.19 13.45 64.95
1.99 2.63 2.42 2.35 2.25 2.18 2.04 2.30 3.04 2.58 2.38
254 253 252 251 252 250 252 252 251 251 2518 0.7725 0:0925
745.08 753.32 597.13 456.75 480.39 447.49 505.29 626.76 744.74 593.62 595.35
44.70 66.50 56.61 26.95 25.80 27.72 54.46 72.82 189.08 56.55 138.52
Notes: N and ¢ are business day and standard deviation, respectively.
Fig. 59.1 Plot of monthly means of commodity spot prices, 1996–2005 (WTI and Brent crude oil in US$/barrel on the left axis; corn and soybeans in US cents/bushel on the right axis; horizontal axis: January through December)
Observing the historical spot prices of the two crops, the turning value of S/C is 2.5; that is, when S/C is greater than 2.5, farmers tend to plant more soybeans because soybeans have a better expected price and the expected yield from holding them is higher. Otherwise, they turn to planting corn. Table 59.1 displays the means of daily S/C computed from spot prices that have been adjusted by the consumer price index (CPI). From this table, we can see that the highest S/C is 3.04 in 2004, above 2.5, indicating that farmers preferred to plant soybeans; the lowest S/C is 1.99 in 1996, below 2.5, indicating a preference for planting corn. Owing to the growing characteristics of agricultural products, supply rises gradually at the beginning of the crop year as the crops are reaped, and spot prices are expected to fall. In Fig. 59.1, which displays the monthly averages of spot prices (adjusted by the CPI/PPI to the level of January 1996, over 1996 to 2005), we see that the bottom of the corn price occurs in September, whereas that of soybeans occurs in October. When the stock has been gradually exhausted, spot prices rise until another harvest period approaches, and thus a cycle is formed; it is called a crop year. The harvest of agricultural products in this cycle depends on the production decision of the previous term, and the supply of crops during a crop year depends on the harvest of that term as well as the storage decision of the previous term. Consequently, the convenience yields of agricultural products are also affected by these factors.
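The planting rule of thumb can be checked directly against the Table 59.1 figures; note that the ratio of annual mean prices differs slightly from the table's S/C column (3.04 and 1.99), which averages the daily ratios instead (a small sketch using the table's values):

soybean_mean = {2004: 744.74, 1996: 745.08}   # US cents/bushel, Table 59.1
corn_mean = {2004: 257.10, 1996: 396.59}

for year in (2004, 1996):
    sc = soybean_mean[year] / corn_mean[year]
    crop = "soybeans" if sc > 2.5 else "corn"
    print(year, round(sc, 2), "-> favors planting", crop)
# 2004: 2.9 -> favors planting soybeans; 1996: 1.88 -> favors planting corn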
59.2.2 Crude Oil

The crude oil market is subject to a seasonal cycle of supply and demand, with prices adjusting to supply/demand disequilibrium. This process of market adjustment from disequilibrium to equilibrium may be regarded as a business cycle,
and crude oil convenience yields should exist during periods of unexpected demand/supply shocks within a business cycle. Crude oil is traditionally traded by means of futures contracts. Most crude oil futures trades take place on the New York Mercantile Exchange (NYMEX) and London's International Petroleum Exchange3 (IPE). NYMEX trades crude oil futures based on West Texas Intermediate (WTI) crude oil, while IPE contracts are based on North Sea Brent Blend crude oil. Since contracts traded in these two markets are based on crude oil from different production areas, observing their convenience yields involves issues different from dealing with different commodities within a single market. The asked spot prices of crude oils from the North American and West African oil fields are quoted based on the value of WTI crude oil. On the other hand, the asked prices in the European market and from oil fields in the North Sea, Russia, northern Africa, and the Middle East generally use Brent crude oil as their benchmark. Most of the crude oil spot markets around the world give quotes based either on WTI or Brent crude oil due to their stable supplies. Table 59.1 illustrates the annual average spot prices, rates of return, and volatilities of WTI and Brent crude oils. We find that the annual average spot price of WTI is always higher than that of Brent. The price advantage of WTI crude oil can be examined on the basis of demand and supply, as its futures price is influenced by the economy, weather, and consumer behavior. The oil fields and trading markets for WTI and Brent are located in the same climate zone. WTI crude oil supplies the vast North American and global consumer markets, while Brent supplies the relatively smaller European consumer market. Moreover, the delivery point for
3 IPE has changed its name to Intercontinental Exchange (ICE) Futures. In June of 2001, ICE expanded its business into futures trading by acquiring the IPE.
the WTI spot is closer to the refineries, and there is a standard settlement contract for WTI, while Brent's delivery point is located far away from the refineries, and there is no standard settlement contract. Tanker shipments of Brent oil to the refineries at times face a port freezing problem. Crude oil is traded mostly in the form of forward or futures contracts. On November 23, 2007, the turnover and total open interest of NYMEX WTI Crude futures were 107,376 lots and 1,404,767 lots respectively, while those of ICE Brent Crude futures reached 103,125 lots and 577,002 lots respectively. WTI crude oil is produced in North America and shipped by pipelines to the delivery point in Cushing, Oklahoma, and then by trucks to the refineries of U.S. oil companies. Due to the close proximity of its delivery locations to the refineries, the delivery cost of WTI crude oil is relatively low. Brent crude oil is settled and delivered to Sullom Voe in the Shetland Islands, and then shipped to refineries by tankers, which face a port freezing problem in the wintertime. Relative to WTI crude oil, the production of Brent crude oil is more susceptible to the influence of climate. To the extent that there are price spreads among crude oils produced in different regions, WTI crude oil apparently possesses certain price advantages. Whether such advantages increase the convenience yield of holding crude oil is a topic that we examine in this study.
59.3 The Model

The model we adopt to analyze oil convenience yields is a call option pricing model, with a particular focus on the demand/supply shocks of crude oils. Keynes (1930) assumes that an unexpected demand shock would cause the spot prices of commodities to exceed their futures prices and that a convenience yield from holding inventories would arise during a stockout. The relationship between the convenience yield and the business cycle of a commodity is very close. Figure 59.1 presents the monthly average spot prices of crude oils and agricultural commodities from 1996 to 2005, adjusted respectively based on the PPI and CPI for January 1996. The seasonal patterns are indicated by the mean spot prices for the different months in Fig. 59.1. We find that WTI and Brent crude oil prices start to rise before the summer season and start to fall after peaking in September, with the lowest prices occurring in the winter, such as in February. For corn and soybean, the mean spot prices have a peak in April/May and a bottom in September/October. Therefore, the behavior of crude oil and agricultural commodity prices is affected by seasonality and business cycles. For crude oil, the shocks result primarily from demand and supply rather than from weather or technology, and both supply and demand effects influence convenience yields. As for agricultural commodities, the source of shocks is mainly weather and seasonal factors.

As spot prices change to restore the equilibrium of the commodity market, a commodity business cycle is defined as running from the start of the disequilibrium to the restoration of equilibrium. The crude oil business cycle is then from January to December, with the beginning, date 0, set in January, and the end, date T, in December. In particular, January, with the lowest average spot price, is also the observation month, since it is when market equilibrium is restored. The month with the highest average spot price is defined as the intermediate month, which is also the event or shock month in which the convenience yield from holding the commodity arises. The intermediate month divides the business cycle into two periods. A nearby futures contract written in March is considered the starting contract for estimating the convenience yield, and September, when spot prices peak, is set as the shock month for estimating convenience yields. Although agricultural commodities have crop years, we could not use long-term contracts spanning years to estimate convenience yields, because the crop year crosses into the next calendar year and the maturity of agricultural commodity futures is less than 1 year. Therefore, to keep the observation periods consistent, we set the business cycles of corn and soybean to be the same as that of crude oil.

Based on the nature of the production, we assume that the demand (D) in different periods of time is given. The supply at date 0 is a function of the production (Q_0) determined in the previous cycle, while storage (S_0) is determined at date 0 in the current cycle. The supply at any date t during the cycle is determined by the variation in the stock level, as is the supply available at the end of the cycle. Production (Q_T) in the current cycle is determined at the beginning of the cycle, and the spot prices (P) on the following days depend on the demand and the supply available. Available supplies are defined in each period after subtracting inventory:

P_0 = f(D_0; Q_0 - S_0),
P_t = f(D_t; S_0 - S_t),
P_T = f(D_T; S_t - S_T + Q_T).

In a perfect market, the expectations model shows that the futures price (F) today equals the spot price that traders expect to prevail for the underlying asset on the delivery date of the futures contract:

F_{0,t} = E(P_t).

Storage rises when demand is low or supply is high. The futures price that expires at date T, observed at date t, is a function of three variables: storage as decided at date t, production at date T as decided at the beginning of the cycle, and demand at date T, as expressed below:

F_{t,T} = E{f(D_T; S_t + Q_T)}.   (59.1)
Equation (59.1) shows that the current futures price at date t is determined by the current storage and by the demand and production levels at date T. When the market faces higher demand or lower supply, storage gradually falls to zero. If the production at date T is known, there is a negative relationship between the futures price and storage, and the futures price will reach the upper bound F^U_{t,T}:

F^U_{t,T} = E{f(D_T; Q_T)},   F_{t,T} < F^U_{t,T}.   (59.2)
According to the theory of storage, the net cost of holding a futures contract under an arbitrage-free framework is the spot price plus the storage cost (SC):

F_{t,T} = P_t + SC_{t,T}.   (59.3)

Thus, when supply/demand is in equilibrium, we obtain S_t > 0, and then

f(D_t; S_0 - S_t) + SC_{t,T} < F^U_{t,T}.   (59.4)

When there is excess demand, S_t = 0 and F_{t,T} = F^U_{t,T}; then

f(D_t; S_0) + SC_{t,T} > F^U_{t,T}.

Given that the futures price cannot completely explain the cost-of-carry model, the spot price will be higher than the futures price when the market faces excess demand, and the convenience yield (CY) from holding the commodity over the period from t to T may be expressed as:

CY_{t,T} = P_t + SC_{t,T} - F_{t,T}.   (59.5)

When Equation (59.3) holds, we have CY_{t,T} = 0. When Equation (59.5) holds, we obtain CY_{t,T} > 0. Therefore, a temporary shock in demand/supply during the cycle will cause the storage to drop and the spot price to rise, which gives rise to a risk premium from holding the commodity and results in a positive convenience yield. Given that the storage cost in Equation (59.5) is difficult to estimate, observing the convenience yield directly is infeasible. Fama and French (1988) consider the behavior of the convenience yield on an interest-adjusted basis. Such an approach avoids directly estimating the convenience yield but is unable to give the whole picture of the convenience yield from holding the commodity. Milonas and Thomadakis (1997) treat the convenience yield as a call option. With the assumption of zero storage cost and the spot price as the underlying variable, the payoff function of the convenience yield (OCY) at maturity, or the boundary condition of the option pricing formula, can be written as

OCY_{t,T} = Max(P_t - F_{t,T}, 0).   (59.6)

Instead of using the spot price as the underlying variable, as in Equation (59.6), we set the price of a nearby futures contract as the underlying variable, and the price of a distant contract as the exercise price. Under a cost-of-carry framework with zero storage cost, the convenience yield is the difference between the net cost of carrying a nearby and a distant futures contract observed at time 0:

OCY^0_{t,T} = Max(F_{0,t} - F_{t,T}, 0).   (59.7)

In the above equation the convenience yield from t to T is observed at date 0, in the beginning month of the business cycle. However, it ignores the storage cost, which might result in a negative convenience yield. The storage cost (SC_{t,T}) is incurred from date t, when the nearby contract matures, to date T, when the commodity is delivered and the distant contract matures. Equation (59.7) is then revised as:

OCY^0_{t,T} = Max(F_{0,t} - F*_{t,T}, 0),   (59.8)

where F*_{t,T} = F_{t,T} + SC_{t,T}.
Both the underlying variable and the exercise price in Equation (59.8) are uncertain variables, which follow standard diffusion processes. We assume that the nearby futures contract price (F^n_{0,t}), with a maturity date at time t, and the distant futures contract price (F^d_{t,T}), with a maturity date at time T, both follow geometric Brownian motion (GBM):

dF^n_{0,t} = μ_n F^n_{0,t} dt + σ_n F^n_{0,t} dz_n,   (59.9)
dF^d_{t,T} = μ_d F^d_{t,T} dt + σ_d F^d_{t,T} dz_d.   (59.10)
Based on the deduction of Fisher (1978), which utilizes a non-traded asset as the strike price, the expected rate of return on the hypothetical security may be solved by the capital asset pricing model (CAPM). Unlike Fisher, where the hedge against the uncertainty of the strike price is through a non-traded asset, the strike price in our model is the price of a traded asset, namely crude oil futures, and the expected rate of return will be zero. Simplifying F^n_{0,t} as F^n and F^d_{t,T} as F^d, and referring to Equation (59.8), the boundary condition can be rewritten as:

F^n Max(X_T - 1, 0),   (59.11)
where X_T = F^n/F^d. Applying Itô's Lemma to the transformed variables, we obtain a closed-form solution for the call option (OCY):

OCY^0_{t,T} = F^n N(d_1) - F^d N(d_2),   (59.12)

where

d_1 = [ln(X_t) + (σ_p²/2)τ] / (σ_p √τ),   d_2 = [ln(X_t) - (σ_p²/2)τ] / (σ_p √τ),

and

σ_p² = σ²_{F^n} - 2 ρ_{F^n F^d} σ_{F^n} σ_{F^d} + σ²_{F^d}.   (59.13)
σ, ρ, and τ are the volatility, the correlation coefficient, and the period between the nearby and distant contracts, respectively. We use the variance of the daily logarithmic futures price changes over all trading days during the beginning (first) month of the business cycle as an estimate of the volatility of a futures contract in the model. The correlation coefficient is estimated using the nearby and distant daily logarithmic futures price changes from all trading days during the beginning month.
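To make the estimation concrete, the following minimal Python sketch implements Equations (59.12) and (59.13). It is an illustration rather than the authors' code: the function name, the 252-day annualization, and the use of the last observed prices of the beginning month are our own assumptions.

```python
import numpy as np
from scipy.stats import norm

def option_convenience_yield(nearby, distant, tau):
    """Option-based convenience yield (OCY) from Eqs. (59.12)-(59.13).

    nearby, distant: daily prices of the nearby (F^n) and distant (F^d)
    futures contracts over the beginning month of the business cycle.
    tau: time between the two maturities, in years.
    """
    rn = np.diff(np.log(nearby))                   # daily log changes, nearby
    rd = np.diff(np.log(distant))                  # daily log changes, distant
    var_n, var_d = rn.var() * 252, rd.var() * 252  # annualized variances (assumed)
    rho = np.corrcoef(rn, rd)[0, 1]                # correlation of the two series
    var_p = var_n - 2 * rho * np.sqrt(var_n * var_d) + var_d   # Eq. (59.13)
    sp = np.sqrt(var_p)
    fn, fd = nearby[-1], distant[-1]               # current (discounted) prices
    d1 = (np.log(fn / fd) + 0.5 * var_p * tau) / (sp * np.sqrt(tau))
    d2 = d1 - sp * np.sqrt(tau)
    return fn * norm.cdf(d1) - fd * norm.cdf(d2)   # Eq. (59.12)
```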
59.4 Data

We now turn to the estimation of the convenience yield, as well as the design of the business cycle, for four different raw materials. The dataset consists of daily observations of futures prices on WTI crude oil, Brent crude oil, corn, and soybean over the period January 1995 to July 2006. The crude oil futures have 12 expiration months per year. CBOT corn futures have five expiration months: March, May, July, September, and December. Soybean futures have seven expiration months: January, March, May, July, August, September, and November. For each commodity in our sample, we estimate the convenience yield and then determine how these values relate to inventory and the interest rate by cross-sectional regression. Our analyses test the consistency and validity of the theory of storage. Finally, we also test the Samuelson (1965) hypothesis for our estimates of the convenience yield with the same approach as that in Fama and French (1988).

The issue of whether commodity prices and convenience yields have seasonal cycles is closely related to whether they are subject to supply/demand shocks. We employ a sample of WTI and Brent crude oil prices that are subject to the effect of supply/demand shocks, as well as a sample of corn and soybean prices that are affected by seasonality. The samples contain daily data for the years 1995 to 2006. The commodity price data are from the Commodity Research Bureau database, the Wall Street Journal, the NYMEX, the CBOT, and the ICE. Inventory data are obtained from the U.S. Energy Information Administration (EIA), the U.S. Department of Agriculture (USDA), and the Economagic database. Because the study involves many years of data with varying levels of inflation, crude oil prices are deflated using the producer price index (PPI) with the base adjusted to January 1996, and agricultural prices are deflated using the consumer price index (CPI). We take the 3-month U.S. Treasury bill rate as the risk-free rate and discount all commodity futures prices at the risk-free rate to the beginning month of the business cycle as follows:

F_{t,T} = f_{t,T} e^{-r_f (T-t)/365},   (59.14)

where f is the pre-discounted futures price. In this way, all futures prices become free of carrying charges. Brennan (1986, 1991) and Milonas and Thomadakis (1997) adopted the same discounting procedure.

According to the EIA report,4 the storage cost of crude oil was approximately 30 cents per barrel per month during this period, which accounts for about 1% of the crude oil spot price and is a fixed cost for holders of storable commodities. Moreover, according to a report of the research center of the College of Agricultural, Consumer and Environmental Sciences, University of Illinois, the 7-month commercial storage costs5 of corn from harvest to May in the spring of each year are between US$0.25 and US$0.45 per bushel, accounting for about 15 to 30% of the spot price; those of soybean are close to this figure. Thus, we set the monthly storage cost for crude oil at a fixed percentage of 1% of the spot price and those for corn and soybean at an average percentage of 3.2%6 of spot prices to estimate convenience yields. Finally, to estimate the means and volatilities of these variables, we apply the observed spot and futures prices for all trading days during the beginning month. The convenience yield is calculated on a daily basis throughout the observation month and then averaged over that month. In addition to calculating the monthly convenience yields from March to November, we also use September – when crude oil spot prices peak – as the shock month in order to estimate convenience yields. For corn we observe the convenience yields of May to December, and for soybean those of May to November.

4 Energy Information Administration, Petroleum Supply Monthly, DOE/EIA-0109, Washington DC, March 1996.
5 Drying and shrinking costs are included.
6 3.2% = [(15% + 30%)/2] ÷ 7 ≈ 3.2% per month.
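As a small illustration of this data preparation, the sketch below applies the discounting in Equation (59.14) and builds the storage-cost-adjusted exercise price of Equation (59.8); the function names and the fixed monthly storage-cost rates (1% for crude oil, 3.2% for grains) follow the assumptions stated above.

```python
import numpy as np

def discount_futures(f, rf, days_to_maturity):
    # Eq. (59.14): strip carrying charges by discounting the quoted futures
    # price back to the beginning month of the business cycle.
    return f * np.exp(-rf * days_to_maturity / 365.0)

def adjusted_strike(distant_discounted, spot, months_between, monthly_sc_rate):
    # Eq. (59.8): F* = F + SC, with the storage cost SC approximated as a
    # fixed fraction of the spot price per month between the two maturities.
    return distant_discounted + spot * monthly_sc_rate * months_between
```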
Table 59.2 Estimated sample means for crude oil convenience yields (US$/spot price)

Period  | WTI OCY (1)    | WTI CCY (2)   | (1)–(2)       | Brent OCY (3)  | Brent CCY (4) | (3)–(4)
Mar–Apr | 0.0707 (12.98) | 0.0242 (4.42) | 0.0464 (5.98) | 0.0715 (12.17) | 0.0236 (4.50) | 0.0479 (6.38)
Mar–May | 0.1062 (13.45) | 0.0476 (4.68) | 0.0586 (5.08) | 0.1071 (12.67) | 0.0455 (4.60) | 0.0615 (4.60)
Mar–Jun | 0.1350 (13.47) | 0.0698 (4.81) | 0.0652 (4.56) | 0.1355 (12.61) | 0.0663 (4.69) | 0.0692 (4.96)
Mar–Jul | 0.1619 (13.33) | 0.0918 (5.00) | 0.0700 (4.27) | 0.1624 (12.73) | 0.0757 (4.80) | 0.0866 (4.69)
Mar–Aug | 0.1970 (13.31) | 0.1126 (5.19) | 0.0844 (4.59) | 0.1948 (12.21) | 0.1055 (4.91) | 0.0892 (4.88)
Mar–Sep | 0.2242 (13.26) | 0.1316 (5.37) | 0.0926 (4.82) | 0.2226 (11.98) | 0.1234 (5.00) | 0.0991 (4.86)
Mar–Oct | 0.2529 (12.82) | 0.1502 (5.56) | 0.1027 (5.23) | 0.2539 (11.69) | 0.1408 (5.13) | 0.1130 (5.42)
Mar–Nov | 0.2783 (13.79) | 0.1675 (5.71) | 0.1108 (4.86) | 0.2772 (12.91) | 0.1573 (5.24) | 0.1198 (5.26)
Mar–Dec | 0.3088 (13.73) | 0.1844 (5.86) | 0.1244 (4.85) | 0.3007 (11.80) | 0.1740 (5.35) | 0.1267 (5.26)
Apr–May | 0.0677 (12.08) | 0.0232 (4.93) | 0.0445 (6.08) | 0.0684 (11.93) | 0.0217 (4.60) | 0.0466 (6.36)
May–Jun | 0.0633 (11.67) | 0.0219 (5.09) | 0.0414 (6.03) | 0.0644 (11.26) | 0.0205 (4.84) | 0.0438 (6.34)
Jun–Jul | 0.0616 (11.31) | 0.0216 (5.64) | 0.0400 (6.08) | 0.0630 (11.08) | 0.0199 (5.16) | 0.0430 (6.37)
Jul–Aug | 0.0637 (11.43) | 0.0202 (6.32) | 0.0435 (6.72) | 0.0644 (10.58) | 0.0184 (5.49) | 0.0459 (6.74)
Aug–Sep | 0.0671 (11.93) | 0.0183 (6.77) | 0.0487 (7.83) | 0.0671 (9.98)  | 0.0173 (5.72) | 0.0498 (6.98)
Sep–Oct | 0.0697 (12.66) | 0.0178 (7.58) | 0.0518 (8.94) | 0.0720 (10.49) | 0.0167 (6.36) | 0.0552 (7.79)
Oct–Nov | 0.0699 (12.42) | 0.0165 (7.55) | 0.0533 (8.79) | 0.0728 (11.04) | 0.0157 (6.55) | 0.0570 (8.40)
Nov–Dec | 0.0740 (11.35) | 0.0159 (8.22) | 0.0581 (8.14) | 0.0744 (10.86) | 0.0158 (6.84) | 0.0585 (8.35)

Notes: "OCY" indicates estimation of the convenience yield by the option model; "CCY" by the cost-of-carry model. All estimates are significant at the 0.01 level; t-statistics are in parentheses.
59.5 Empirical Results

Table 59.2 presents the convenience yields calculated based on the call option and cost-of-carry models. The results show that the values of the convenience yields estimated from the options model are higher than those from the cost-of-carry model, reflecting the strategic and managerial flexibility captured by the options approach. Moreover, the cost-of-carry model can produce results that contradict options theory and yield negative estimates. Figures 59.2 and 59.3 draw the term structures of the convenience yields.

Fig. 59.2 Term structure of convenience yields for crude oil (WTI and Brent; convenience yield/spot price over holding periods Mar–Apr through Nov–Dec)

Fig. 59.3 Term structure of convenience yields for agricultural commodities (corn over holding periods Mar–May through Sep–Dec; soybean over holding periods Mar–May through Sep–Nov)

We find that the behavior of crude oil and agricultural commodity convenience yields exhibits seasonality (Table 59.3). The convenience yields of crude oils are highest during the November–December holding period, and the yields then gradually fall to reach their lowest point in June–July. Within this seasonal pattern, the convenience yields of Brent are higher than those of WTI in the winter, and the same is the case in the summer, suggesting that greater benefits accrue to the holder of Brent crude in the winter when it is subject to supply shocks. Moreover, the convenience yields for Brent crude are more volatile than those for WTI. The convenience yields of corn are at their highest during the September–December holding period, and they then decline gradually to reach their lowest point in March–May. The convenience yields of soybean are at their highest during the September–November holding period, and they then decline gradually to reach their lowest point in July–August. This indicates that corn and soybean have their highest convenience yields in the initial stage of the crop year, showing that convenience yields are affected by seasonality. We find that the inventory of a commodity tends to decrease as the end of the cycle year draws near and the probability of a stockout increases, which means that the benefit of holding inventory also rises, resulting in higher convenience yields. Thus the term structure of the convenience yields is upward sloping. Figures 59.4 and 59.5 support this conclusion.

The empirical results above indicate that WTI possesses a price advantage over Brent in the spot market, but its convenience yields are not necessarily higher than those of Brent. Similarly, according to our estimates, although the price of soybean is naturally higher than that of corn, the convenience yield of soybean is not greater but rather less than that of corn.

The theory of storage also suggests that holding inventory becomes more costly in periods of high interest rates. Furthermore, the convenience yield can be treated as a call option; in theory it is positively correlated with the covariance (σ_p²) of the two futures contracts and the risk-free rate (r_f). We use a multiple regression to test the relationship between the convenience yield estimated on the basis of the option model (OCY) and these independent variables:

OCY_{t,T} = β_0 + β_1 log(I_{t-1}) + β_2 σ_{pt}² + β_3 r_{ft} + ε_t,   (59.15)
where β and ε denote the regression coefficients and the error term, respectively, and the independent variable log(I_{t-1}) is the log of inventories at term t-1. The regression results in Tables 59.4–59.7 suggest a negative relationship between convenience yields and inventories. In Tables 59.4 and 59.5 the sign of the inventory coefficient is consistent with the theory; β_1 is, however, statistically insignificant for crude oils. Tables 59.6 and 59.7 report the regression results for corn and soybean, respectively. The β_1 estimates are mostly negative and statistically significant, strongly suggesting a negative relationship between the estimated convenience yield and inventory.
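The following Python fragment sketches how Equation (59.15) can be estimated; the column names are hypothetical, and the White (HC0) covariance is our stand-in for the heteroscedasticity correction mentioned in the table notes.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def storage_regression(df):
    # Eq. (59.15): OCY = b0 + b1*log(lagged inventory) + b2*var_p + b3*rf + e
    X = sm.add_constant(pd.DataFrame({
        'log_inv': np.log(df['inventory_lag']),   # log(I_{t-1})
        'var_p':   df['var_p'],                   # sigma_p^2 of the two contracts
        'rf':      df['rf'],                      # risk-free rate
    }))
    return sm.OLS(df['ocy'], X).fit(cov_type='HC0')  # robust (White) std. errors
```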
Table 59.3 Estimated means for agricultural commodity convenience yields (US cents/spot price)

Period^a | Corn OCY (1)   | Corn CCY (2)   | (1)–(2)        | Soybean OCY (3) | Soybean CCY (4) | (3)–(4)
Mar–May  | 0.0909 (22.91) | 0.0483 (11.63) | 0.0426 (10.40) | 0.1047 (27.54)  | 0.0672 (36.36)  | 0.0375 (8.33)
Mar–Jul  | 0.1621 (19.40) | 0.0999 (11.34) | 0.0621 (8.56)  | 0.1830 (22.84)  | 0.1343 (33.25)  | 0.0487 (5.57)
Mar–Aug  | –              | –              | –              | 0.2217 (31.47)  | 0.1772 (26.06)  | 0.0445 (5.42)
Mar–Sep  | 0.2409 (17.63) | 0.1735 (7.17)  | 0.0673 (5.95)  | 0.2727 (26.12)  | 0.2309 (16.84)  | 0.0417 (4.34)
Mar–Nov  | –              | –              | –              | 0.3642 (19.80)  | 0.3136 (14.05)  | 0.0505 (4.73)
Mar–Dec  | 0.3483 (15.87) | 0.2654 (8.08)  | 0.0829 (6.44)  | –               | –               | –
May–Jul  | 0.1003 (19.90) | 0.0527 (11.57) | 0.0475 (8.75)  | 0.1075 (18.50)  | 0.0671 (28.64)  | 0.0404 (6.46)
Jul–Aug  | –              | –              | –              | 0.0749 (16.02)  | 0.0427 (11.92)  | 0.0321 (5.99)
Jul–Sep  | 0.1218 (18.08) | 0.0755 (4.69)  | 0.0463 (4.30)  | –               | –               | –
Aug–Sep  | –              | –              | –              | 0.0778 (19.23)  | 0.0534 (7.84)   | 0.0244 (4.38)
Sep–Nov  | –              | –              | –              | 0.1181 (19.31)  | 0.0806 (10.30)  | 0.0375 (5.51)
Sep–Dec  | 0.1551 (36.39) | 0.0938 (14.43) | 0.0612 (6.58)  | –               | –               | –

Notes: "OCY" indicates estimation of the convenience yield by the option model; "CCY" by the cost-of-carry model. All estimates are significant at the 0.01 level; t-statistics are in parentheses.
^a The five expiration months for CBOT corn futures contracts are March, May, July, September, and December. CBOT soybean futures have seven expiration months: January, March, May, July, August, September, and November.
Fig. 59.4 Convenience yields for crude oil (WTI and Brent) with March as the shock month, plotted over the event months April through December (convenience yield/spot price)

Fig. 59.5 Convenience yields for agricultural commodities (corn and soybean) with March as the shock month, plotted over the event months (convenience yield/spot price)

Table 59.4 WTI crude oil convenience yields regressed on inventory, volatility and risk-free rate
Period  | β0            | β1            | β2              | β3            | R²   | F
Mar–Apr | 0.6206 (1.04) | 0.0841 (0.96) | 20.7896 (4.09)  | 0.1092 (0.51) | 0.78 | 7.43
Mar–May | 1.1620 (1.07) | 0.1589 (1.00) | 27.4722 (2.87)  | 0.2476 (0.64) | 0.66 | 4.00
Mar–Jun | 1.6936 (1.08) | 0.2327 (1.01) | 31.4704 (2.18)  | 0.4033 (0.72) | 0.56 | 2.60
Mar–Jul | 2.2397 (1.10) | 0.3088 (1.03) | 34.6799 (1.79)  | 0.5691 (0.78) | 0.49 | 1.96
Mar–Aug | 2.6897 (1.08) | 0.3714 (1.02) | 40.9165 (1.87)  | 0.6518 (0.75) | 0.49 | 1.99
Mar–Sep | 3.0435 (1.04) | 0.4204 (0.98) | 45.8778 (1.73)  | 0.7170 (0.69) | 0.47 | 1.81
Mar–Oct | 3.1141 (0.92) | 0.4292 (0.87) | 55.3662 (1.95)  | 0.6886 (0.61) | 0.51 | 2.15
Mar–Nov | 3.7660 (1.06) | 0.5196 (1.00) | 48.4808 (1.65)  | 0.8587 (0.70) | 0.46 | 1.74
Mar–Dec | 4.1770 (1.08) | 0.5758 (1.02) | 46.3762 (1.76)  | 0.954 (0.72)  | 0.51 | 2.17
Apr–May | 0.5072 (1.03) | 0.0678 (0.94) | 24.3732 (5.35)  | 0.1449 (0.82) | 0.86 | 12.56
May–Jun | 0.4237 (1.03) | 0.0561 (0.93) | 26.2683 (6.40)  | 0.1463 (1.00) | 0.89 | 17.51
Jun–Jul | 0.3708 (1.16) | 0.0488 (1.04) | 29.2983 (8.60)  | 0.1350 (1.20) | 0.93 | 30.46
Jul–Aug | 0.2323 (1.01) | 0.0288 (0.86) | 30.0085 (12.98) | 0.1186 (1.50) | 0.96 | 63.85
Aug–Sep | 0.0701 (0.43) | 0.0052 (0.22) | 27.9118 (18.80) | 0.0931 (1.66) | 0.98 | 131.24
Sep–Oct | 0.1632 (0.79) | 0.0292 (0.97) | 26.7628 (15.12) | 0.0626 (0.92) | 0.97 | 84.76
Oct–Nov | 0.2759 (0.96) | 0.0458 (1.10) | 24.8389 (11.30) | 0.0638 (0.70) | 0.95 | 46.80
Nov–Dec | 0.4398 (1.33) | 0.0696 (1.44) | 24.9073 (11.32) | 0.0110 (0.10) | 0.95 | 47.82

Notes: Model: OCY_{t,T} = β_0 + β_1 log(I_{t-1}) + β_2 σ_{pt}² + β_3 r_t + ε_t. Standard errors are computed with a correction for heteroscedasticity; results differ only marginally. ***, **, * denote significance at the 0.01, 0.05, and 0.1 levels, respectively; t-statistics are in parentheses.
Table 59.5 Brent crude oil convenience yields regressed on inventory, volatility and risk-free rate

Period  | β0            | β1            | β2              | β3             | R²   | F
Mar–Apr | 0.9170 (1.45) | 0.1292 (1.39) | 24.6410 (4.37)  | 0.0559 (0.25)  | 0.79 | 7.70
Mar–May | 1.5182 (1.34) | 0.2137 (1.28) | 32.4569 (3.12)  | 0.0306 (0.08)  | 0.67 | 4.21
Mar–Jun | 2.0842 (1.28) | 0.2934 (1.23) | 38.2187 (2.49)  | 0.0167 (0.03)  | 0.58 | 2.87
Mar–Jul | 2.6232 (1.25) | 0.3692 (1.20) | 41.8079 (2.06)  | 0.0601 (0.08)  | 0.51 | 2.12
Mar–Aug | 3.1276 (1.21) | 0.4406 (1.16) | 48.5656 (2.13)  | 0.0853 (0.10)  | 0.54 | 2.35
Mar–Sep | 3.5752 (1.16) | 0.5034 (1.12) | 52.5074 (2.06)  | 0.1289 (0.12)  | 0.52 | 2.25
Mar–Oct | 3.6817 (1.08) | 0.5177 (1.04) | 63.0646 (2.40)  | 0.1441 (0.13)  | 0.57 | 2.68
Mar–Nov | 4.3115 (1.15) | 0.6056 (1.10) | 57.1504 (1.75)  | 0.1910 (0.15)  | 0.47 | 1.79
Mar–Dec | 4.3029 (1.02) | 0.6038 (0.98) | 67.8829 (2.12)  | 0.0730 (2.12)  | 0.56 | 2.57
Apr–May | 0.7084 (1.23) | 0.0985 (1.17) | 26.6232 (4.85)  | 0.0050 (0.03)  | 0.82 | 9.15
May–Jun | 0.6273 (1.29) | 0.0871 (1.22) | 28.7516 (5.91)  | 28.7516 (0.13) | 0.86 | 13.35
Jun–Jul | 0.5241 (1.24) | 0.0723 (1.17) | 31.0707 (7.00)  | 0.0070 (0.05)  | 0.90 | 18.14
Jul–Aug | 0.3817 (1.08) | 0.0515 (1.00) | 31.6910 (9.31)  | 0.0264 (0.22)  | 0.94 | 31.80
Aug–Sep | 0.2134 (0.75) | 0.0269 (0.64) | 30.2982 (13.19) | 0.0416 (0.44)  | 0.96 | 64.15
Sep–Oct | 0.1551 (0.36) | 0.0177 (0.29) | 27.4162 (8.92)  | 0.0722 (0.52)  | 0.93 | 28.74
Oct–Nov | 0.1495 (0.28) | 0.0165 (0.21) | 23.8895 (6.72)  | 0.0635 (0.36)  | 0.88 | 16.07
Nov–Dec | 0.1926 (0.55) | 0.0324 (0.55) | 28.3505 (11.20) | 0.0519 (0.46)  | 0.95 | 47.38

Notes: Model: OCY_{t,T} = β_0 + β_1 log(I_{t-1}) + β_2 σ_{pt}² + β_3 r_t + ε_t. Standard errors are computed with a correction for heteroscedasticity; results differ only marginally. ***, **, * denote significance at the 0.01, 0.05, and 0.1 levels, respectively; t-statistics are in parentheses.
The theory of storage holds that the convenience yield has a negative relation with inventory; this relation can be verified for products affected by seasonality but is not clearly observed for products affected by demand/supply shocks. Furthermore, the β_2 estimates are consistently positive and mostly statistically significant, suggesting that the more uncertain the prices, the higher the convenience yields, which reflects the managerial flexibility embedded in the value of holding the spot commodity. Following the theory of storage, we may expect a significantly non-zero coefficient β_3. In fact, to the extent that holding inventory becomes more costly in periods of high interest rates, we may expect a negative correlation between interest rates and inventory, a positive correlation between interest rates and convenience yields, and hence a positive β_3. In the tests, however, the risk-free rate coefficients β_3 for crude oil and corn are negative and statistically insignificant, while that for soybean is negative and statistically significant for some holding periods. This negative value implies that an increase in the carrying cost of the commodity – namely the interest rate – would cause the yield from holding spot to decline.

To observe the effect of a variable explaining the convenience yield, Lin and Duan (2007) use the spot price spread and the futures price spreads in the two crude oil markets as explanatory variables and run the following regression:

OCY_{t,T} = α_0 + α_1 SPF^n_{t,T} + α_2 SPF^d_{t,T} + α_3 SPS^{WTI-Brent}_t + ε_t,   (59.16)
Table 59.6 Corn convenience yields regressed on inventory, volatility and risk-free rate

Period^a | β0            | β1            | β2              | β3            | R²   | F
Mar–May  | 0.7675 (3.73) | 0.0445 (3.46) | 51.5441 (7.31)  | 0.0443 (0.51) | 0.92 | 23.28
Mar–Jul  | 1.7897 (7.89) | 0.1058 (7.45) | 69.2155 (13.87) | 0.0968 (1.04) | 0.97 | 89.63
Mar–Sep  | 5.1443 (6.09) | 0.3073 (5.73) | 34.5940 (1.27)  | 0.0037 (0.01) | 0.91 | 22.04
Mar–Dec  | 9.2369 (5.69) | 0.5602 (5.46) | 18.8752 (0.35)  | 0.4290 (0.67) | 0.85 | 11.40
May–Jul  | 0.6959 (5.32) | 0.0396 (4.84) | 45.9741 (15.83) | 0.0381 (0.70) | 0.98 | 98.01
Jul–Sep  | 2.2816 (3.94) | 0.1364 (3.74) | 5.2640 (0.44)   | 0.2676 (1.09) | 0.80 | 8.41
Sep–Nov  | 0.5170 (1.83) | 0.0259 (1.44) | 52.9910 (7.17)  | 0.0451 (0.46) | 0.93 | 27.43

Notes: Model: OCY_{t,T} = β_0 + β_1 log(I_{t-1}) + β_2 σ_{pt}² + β_3 r_t + ε_t. Standard errors are computed with a correction for heteroscedasticity; results differ only marginally. Significant at the 0.01 level; t-statistics are in parentheses.
^a The five expiration months for CBOT corn futures contracts are March, May, July, September, and December.
Table 59.7 Soybean convenience yields regressed on inventory, volatility and risk-free rate

Period^a | β0            | β1            | β2              | β3            | R²   | F
Mar–May  | 0.2828 (2.48) | 0.0141 (1.80) | 43.6339 (14.06) | 0.0600 (1.34) | 0.97 | 68.11
Mar–Jul  | 0.9490 (3.38) | 0.0556 (2.87) | 58.1059 (11.93) | 0.1203 (1.08) | 0.96 | 49.21
Mar–Aug  | 1.6586 (4.87) | 0.1016 (4.31) | 67.5908 (8.36)  | 0.2589 (1.98) | 0.92 | 26.18
Mar–Sep  | 3.6699 (7.82) | 0.2359 (7.28) | 67.5973 (6.09)  | 0.5214 (2.80) | 0.93 | 28.63
Mar–Nov  | 6.0820 (8.25) | 0.3961 (7.82) | 76.4794 (4.65)  | 0.6861 (2.12) | 0.94 | 35.50
May–Jul  | 0.5410 (2.70) | 0.0319 (2.31) | 44.0575 (12.20) | 0.0629 (0.80) | 0.96 | 50.83
Jul–Aug  | 0.5664 (1.83) | 0.0352 (1.65) | 28.0626 (6.07)  | 0.0818 (0.68) | 0.86 | 12.74
Aug–Sep  | 1.3306 (9.19) | 0.0873 (8.72) | 29.3109 (9.41)  | 0.1890 (3.34) | 0.95 | 47.89
Sep–Nov  | 1.0458 (5.26) | 0.0663 (4.85) | 43.6297 (10.60) | 0.0752 (0.89) | 0.96 | 55.06

Notes: Model: OCY_{t,T} = β_0 + β_1 log(I_{t-1}) + β_2 σ_{pt}² + β_3 r_t + ε_t. Standard errors are computed with a correction for heteroscedasticity; results differ only marginally. ***, **, * denote significance at the 0.01, 0.05, and 0.1 levels, respectively; t-statistics are in parentheses.
^a The seven expiration months for CBOT soybean futures contracts are January, March, May, July, August, September, and November.
where α and ε are the regression coefficients and the error term, respectively, and SPF^n, SPF^d, and SPS^{WTI-Brent} are the price spread of a nearby contract, the price spread of a distant contract, and the spot price spread, respectively. However, their test results for crude oil exhibit inconsistent coefficient signs.
Table 59.8 Regression of spot price spread on WTI and Brent convenience yields

Period  | γ0             | γ1               | γ2               | R²   | F
Mar–Apr | 0.3330 (0.91)  | 126.0478 (7.52)  | 150.1493 (9.72)  | 0.39 | 64.16
Mar–May | 0.0929 (0.24)  | 70.5103 (5.95)   | 90.9732 (8.23)   | 0.34 | 53.34
Mar–Jun | 0.2381 (0.59)  | 43.4686 (4.65)   | 60.9317 (0.99)   | 0.30 | 44.46
Mar–Jul | 0.4485 (1.10)  | 30.4990 (4.12)   | 46.4001 (6.59)   | 0.29 | 42.01
Mar–Aug | 0.6147 (1.55)  | 19.7444 (3.31)   | 34.1160 (6.15)   | 0.32 | 48.00
Mar–Sep | 0.0876 (0.25)  | 32.4723 (8.11)   | 42.7813 (11.70)  | 0.49 | 95.71
Mar–Oct | 0.9046 (2.18)  | 13.4769 (3.04)   | 1.4364 (0.36)    | 0.22 | 28.66
Mar–Nov | 0.2234 (0.62)  | 44.1757 (8.75)   | 52.9324 (11.11)  | 0.48 | 94.11
Mar–Dec | 0.1707 (0.47)  | 27.2599 (7.96)   | 34.5456 (11.38)  | 0.50 | 102.52
Apr–May | 0.6735 (1.95)  | 187.2975 (9.31)  | 207.2445 (10.56) | 0.38 | 62.78
May–Jun | 1.3067 (3.80)  | 195.4551 (8.98)  | 205.5767 (9.97)  | 0.35 | 53.89
Jun–Jul | 1.2211 (3.83)  | 191.1563 (10.39) | 201.8633 (11.49) | 0.40 | 69.12
Jul–Aug | 1.4472 (4.51)  | 178.1770 (9.98)  | 187.0148 (11.47) | 0.42 | 73.69
Aug–Sep | 2.5587 (10.07) | 187.1712 (17.72) | 180.7370 (20.58) | 0.69 | 225.08
Sep–Oct | 1.8793 (4.52)  | 105.3250 (6.34)  | 105.8573 (7.94)  | 0.27 | 37.78
Oct–Nov | 0.8636 (1.98)  | 20.4706 (1.23)   | 37.3382 (2.62)   | 0.08 | 9.20
Nov–Dec | 1.1570 (6.21)  | 171.8386 (24.62) | 184.1415 (27.82) | 0.79 | 395.87

Note: Model: SPS^{WTI-Brent}_t = γ_0 + γ_1 OCY^{WTI}_{t,T} + γ_2 OCY^{Brent}_{t,T} + ε_t. ** and * denote significance at the 0.01 and 0.05 levels; t-statistics are in parentheses.
Table 59.9 Regression of spot price ratio between soybean and corn on corn and soybean convenience yields

Period  | γ0            | γ1            | γ2             | R²   | F
Mar–May | 0.9449 (5.10) | 8.1467 (4.97) | 6.5683 (3.79)  | 0.23 | 31.69
Mar–Jul | 0.7951 (4.70) | 3.1352 (4.62) | 5.8483 (8.08)  | 0.30 | 45.15
Mar–Sep | 0.8665 (7.45) | 2.5103 (8.53) | 7.7554 (19.50) | 0.66 | 199.35
May–Jul | 1.1356 (7.38) | 4.8641 (4.13) | 6.9854 (6.70)  | 0.25 | 34.72

Note: Model: Ratio^{P_soybean/P_corn}_{t,T} = γ_0 + γ_1 OCY^{Corn}_{t,T} + γ_2 OCY^{Soybean}_{t,T} + ε_t. The five expiration months for CBOT corn futures contracts are March, May, July, September, and December; CBOT soybean futures have seven expiration months: January, March, May, July, August, September, and November. ***, **, * denote significance at the 0.01, 0.05, and 0.1 levels, respectively; t-statistics are in parentheses.
Since Equation (59.16) could lead to multicollinearity among the explanatory variables, which in turn might produce insignificant coefficient tests or inconsistency in the signs of the coefficients, we instead use the convenience yields of WTI, Brent, corn, and soybean as explanatory variables and run the following regressions:

SPS^{WTI-Brent}_t = γ_0 + γ_1 OCY^{WTI}_{t,T} + γ_2 OCY^{Brent}_{t,T} + ε_t,   (59.17)

Ratio^{P_soybean/P_corn}_{t,T} = γ_0 + γ_1 OCY^{Corn}_{t,T} + γ_2 OCY^{Soybean}_{t,T} + ε_t,   (59.18)

where γ and ε are the regression coefficients and the error term, respectively, and Ratio is the ratio of the spot price of soybean to that of corn. The results presented in Table 59.8 show that γ_1 is negative and statistically significant, excluding the March–October holding period, while γ_2 is significantly positive, indicating that the lower the convenience yield for WTI and the higher the convenience yield for Brent, the greater the spot price spread between WTI and Brent. The results for agricultural commodities illustrated in Table 59.9 show that γ_1 and γ_2 are positive and statistically significant, excluding the March–September holding period, indicating that the higher the convenience yields for corn and soybean, the greater the ratio of the spot price of soybean to that of corn.

Table 59.10 reports the convenience yields from the shock month to the final month of the cycle, where the shock month is the futures contract month in which the peak of spot prices for the underlying occurs during the cycle year, and the final month is the last futures contract month of the business cycle.
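Before turning to Table 59.10, here is a minimal sketch of the spread regression in Equation (59.17), again with hypothetical column names and ordinary least squares:

```python
import pandas as pd
import statsmodels.api as sm

def spread_on_convenience_yields(spread, ocy_wti, ocy_brent):
    # Eq. (59.17): WTI-Brent spot spread regressed on the two option-based
    # convenience yields; Eq. (59.18) is analogous with corn and soybean.
    X = sm.add_constant(pd.DataFrame({'ocy_wti': ocy_wti,
                                      'ocy_brent': ocy_brent}))
    return sm.OLS(spread, X).fit()
```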
Table 59.10 Convenience yields regression results: holding period from the shock month to the final month of the business cycle

Commodity/Period  | β0            | β1            | β2            | β3            | R²   | F
WTI / Sep–Dec     | 0.1965 (0.64) | 0.0394 (0.87) | 44.09 (19.33) | 0.0586 (0.57) | 0.99 | 143.55
Brent / Sep–Dec   | 0.0802 (0.17) | 0.0027 (0.04) | 51.12 (14.82) | 0.0041 (0.03) | 0.98 | 84.30
Corn / May–Dec    | 7.6373 (4.42) | 0.4638 (4.24) | 29.03 (0.49)  | 0.3410 (0.50) | 0.77 | 6.77
Soybean / May–Nov | 5.3587 (8.04) | 0.3518 (7.67) | 76.64 (4.89)  | 0.5760 (1.99) | 0.95 | 34.54

Notes: Model: OCY_{t,T} = β_0 + β_1 log(I_{t-1}) + β_2 σ_{pt}² + β_3 r_t + ε_t. Standard errors are computed with a correction for heteroscedasticity; results differ only marginally. ***, **, * denote significance at the 0.01, 0.05, and 0.1 levels, respectively; t-statistics are in parentheses.
Table 59.11 Testing the hypothesis of Samuelson for crude oil

Sample                     | OCY level | WTI: No | WTI: mean OCY  | WTI: θ1        | Brent: No | Brent: mean OCY | Brent: θ1
Full                       | High      | 67      | 0.2559 (30.37) | 0.1709 (4.58)  | 60        | 0.2397 (26.38)  | 0.3043 (10.49)
Full                       | Low       | 103     | 0.0772 (29.11) | 0.2390 (6.03)  | 110       | 0.0758 (32.02)  | 0.6262 (13.40)
March as shock             | High      | 43      | 0.2972 (39.89) | 0.1526 (3.30)  | 40        | 0.2763 (30.65)  | 0.3027 (9.22)
March as shock             | Low       | 47      | 0.1479 (24.48) | 0.1974 (4.80)  | 50        | 0.1241 (20.96)  | 0.3619 (8.29)
March to December as shock | High      | 13      | 0.2606 (8.50)  | 0.1340 (2.64)  | 50        | 0.0830 (67.92)  | 0.3912 (4.66)
March to December as shock | Low       | 77      | 0.0659 (34.81) | 0.3858 (4.95)  | 40        | 0.0508 (24.95)  | 0.6818 (11.83)
Total sample WTI and Brent | High      | 130     | 0.2459 (39.60) | 0.2420 (10.22) |           |                 |
Total sample WTI and Brent | Low       | 210     | 0.0755 (44.33) | 0.4293 (12.77) |           |                 |

Notes: The estimated model is ln(F_{t,T}/F_{t-1,T-1}) = θ_0 + θ_1 ln(S_t/S_{t-1}) + ε_t. Significant at the 0.01 level; t-statistics are in parentheses.
Table 59.12 Testing the hypothesis of Samuelson for agricultural commodities

Sample                     | OCY level | Corn: No | Corn: mean OCY | Corn: θ1       | Soybean: No | Soybean: mean OCY | Soybean: θ1
Full                       | High      | 24       | 0.2773 (16.57) | 0.7292 (16.89) | 36          | 0.2717 (22.23)    | 0.8087 (28.32)
Full                       | Low       | 46       | 0.1204 (28.45) | 0.7829 (24.37) | 54          | 0.1012 (27.01)    | 0.8855 (42.16)
March as shock             | High      | 18       | 0.3058 (17.33) | 0.7150 (14.31) | 25          | 0.3029 (22.72)    | 0.7664 (21.57)
March as shock             | Low       | 22       | 0.1326 (14.08) | 0.7853 (20.81) | 25          | 0.1556 (16.74)    | 0.9092 (44.64)
March to December as shock | High      | 17       | 0.1457 (31.16) | 0.6775 (10.46) | 24          | 0.1153 (36.45)    | 0.8680 (24.44)
March to December as shock | Low       | 23       | 0.0958 (36.29) | 0.8714 (29.27) | 26          | 0.0793 (31.43)    | 0.8971 (31.55)
Corn and soybean           | High      | 61       | 0.2723 (27.71) | 0.7770 (32.08) |             |                   |
Corn and soybean           | Low       | 99       | 0.1094 (37.53) | 0.8388 (43.21) |             |                   |

Notes: The estimated model is ln(F_{t,T}/F_{t-1,T-1}) = θ_0 + θ_1 ln(S_t/S_{t-1}) + ε_t. Significant at the 0.01 level; t-statistics are in parentheses.
The negative correlation between these commodities' convenience yields and the inventory level suggests that the convenience yield is closely linked to the business cycle. In the case of agricultural commodities, the regression analysis shows a strong negative correlation between the convenience yield and the inventory level, consistent with the earlier results of this article.

Samuelson (1965) argues that futures prices are less variable than spot prices. The theory of storage also predicts that, at a low inventory level, futures prices vary less than spot prices, while at a high inventory level, spot prices and futures prices have similar variability. Fama and French (1988) supported Samuelson's hypothesis by examining the interest-adjusted basis of industrial metals. Thus it is believed that the convenience yield declines at higher inventory levels, at which spot and futures prices have roughly the same variability; conversely, the convenience yield rises at a low inventory level, at which spot prices are more variable than futures prices. To test the Samuelson (1965) hypothesis, we adopt the same approach as Fama and French (1988) and perform the following regression:

ln(F_{t,T}/F_{t-1,T-1}) = θ_0 + θ_1 ln(S_t/S_{t-1}) + ε_t.   (59.19)

We apply the regression in the event months and divide the estimated coefficients θ_1 into high- and low-convenience-yield groups based on the means of the convenience yields. We find that the high-convenience-yield group has smaller average coefficient values, while the low-convenience-yield group has average coefficient values closer to one. This implies that at a low inventory level, the spot prices of crude oil vary more than the futures prices and the convenience yields derived from the options model are higher; at a high inventory level, the spot and futures prices of these commodities have similar variability, while the convenience yields are lower. These results are consistent with the hypothesis of Samuelson (1965) and the findings of Fama and French (1988) (Tables 59.11 and 59.12).
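A sketch of this test in Python; the splitting rule and column names are illustrative assumptions:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def samuelson_test(futures, spot, ocy):
    # Eq. (59.19), estimated separately for observations with convenience
    # yields above and below the sample mean; returns theta_1 by group.
    df = pd.DataFrame({
        'dlnF': np.log(futures).diff(),   # log change in futures price
        'dlnS': np.log(spot).diff(),      # log change in spot price
        'high': ocy > ocy.mean(),         # high vs. low convenience yield
    }).dropna()
    theta1 = {}
    for level, grp in df.groupby('high'):
        fit = sm.OLS(grp['dlnF'], sm.add_constant(grp['dlnS'])).fit()
        theta1['high' if level else 'low'] = fit.params['dlnS']
    return theta1
```

Under the Samuelson hypothesis, one would expect the 'low' coefficient to be closer to one than the 'high' coefficient, as in Tables 59.11 and 59.12.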
59.6 Conclusion

The commodity convenience yield calculated with the methodology of Milonas and Thomadakis (1997) exhibits seasonality in the presence of the business cycle. This paper has provided a framework for estimating convenience yields by an option approach. The empirical results show that the values of the convenience yields estimated from the options model are higher than those from the cost-of-carry model, reflecting the strategic and managerial flexibility captured by the options approach. Our results show that the negative correlation between the convenience yields for crude oil and the inventory level is weak, while that for agricultural commodities is strong. This demonstrates that, when using our option model, the choice of the timing of the business cycle is critical to the calculation of the commodity convenience yield.
We find that the convenience yields of crude oil can explain the spot price spreads between WTI and Brent: the lower the convenience yield of WTI crude oil and the higher the convenience yield of Brent crude oil, the greater the spot price spread between WTI and Brent crude oils. In the evidence for agricultural commodities, convenience yields can explain the ratio of the soybean spot price to the corn spot price. Samuelson (1965) proposed that spot and futures price variations will be similar when a supply/demand shock occurs at higher inventory levels, but that spot prices will be more variable than futures prices at lower inventory levels. Using the regression of futures price changes on spot price changes, as in Fama and French (1988), to verify the Samuelson hypothesis, we find that at a low inventory level spot prices vary more than futures prices; accordingly, our estimated convenience yields are higher at a lower inventory level than at a higher one. At a high inventory level, spot and futures prices share roughly the same variability, which leads to a lower convenience yield, verifying the Samuelson hypothesis.
References

Brennan, M. J. 1958. "The supply of storage." American Economic Review 48, 50–72.
Brennan, M. J. 1986. The cost of convenience and the pricing of commodity contingent claims, Working paper, University of British Columbia.
Brennan, M. J. 1991. "The price of convenience and the valuation of commodity contingent claims," in Stochastic models and option values, D. Lund and B. Oksendal (Eds.). Elsevier, North Holland.
Casassus, J. and P. Collin-Dufresne. 2005. "Stochastic convenience yield implied from commodity futures and interest rates." Journal of Finance 60, 2283–2331.
Fama, E. F. and K. R. French. 1987. "Commodity futures prices: some evidence on forecast power, premiums, and the theory of storage." Journal of Business 60, 55–73.
Fama, E. F. and K. R. French. 1988. "Business cycles and the behavior of metals prices." Journal of Finance 43, 1075–1093.
Fisher, S. 1978. "Call option pricing when the exercise price is uncertain, and the valuation of index bonds." Journal of Finance 33, 169–176.
Gibson, R. and E. S. Schwartz. 1990. "Stochastic convenience yield and the pricing of oil contingent claims." Journal of Finance 45, 959–976.
Heinkel, R., M. Howe, and J. S. Hughes. 1990. "Commodity convenience yields as an option profit." Journal of Futures Markets 10, 519–533.
Kaldor, N. 1939. "Speculation and economic stability." Review of Economic Studies 7, 1–27.
Keynes, J. M. 1930. A treatise on money, Harcourt Brace, New York.
Lin, W. T. and C. W. Duan. 2007. "Oil convenience yields estimated under demand/supply shock." Review of Quantitative Finance and Accounting 28, 203–225.
Milonas, N. T. and S. B. Thomadakis. 1997. "Convenience yields as call options: an empirical analysis." Journal of Futures Markets 17, 1–15.
Samuelson, P. A. 1965. "Proof that properly anticipated prices fluctuate randomly." Industrial Management Review 6, 41–49.
Schwartz, E. S. 1997. "The stochastic behavior of commodity prices: implications for valuation and hedging." Journal of Finance 52, 923–973.
Telser, L. G. 1958. "Futures trading and the storage of cotton and wheat." Journal of Political Economy 66, 233–255.
Working, H. 1948. "Theory of the inverse carrying charge in futures markets." Journal of Farm Economics 30, 1–28.
Working, H. 1949. "The theory of the price of storage." American Economic Review 39, 1254–1262.
Chapter 60
Alternative Methods to Determine Optimal Capital Structure: Theory and Application Sheng-Syan Chen, Cheng-Few Lee, and Han-Hsing Lee
Abstract In this paper, we review the most important and representative capital structure models. The capital structure models incorporate contingent claim valuation theory to quantitatively analyze the prevailing determinants of capital structure in the corporate finance literature. In capital structure models, the valuation of corporate securities and financial decisions are jointly determined. Most of the capital structure models provide closed-form expressions for corporate debt as well as the endogenously determined bankruptcy level, which are explicitly linked to taxes, firm risk, bankruptcy costs, the risk-free interest rate, payout rates, and other important variables. How debt values (and therefore yield spreads) and optimal leverage ratios change with these variables can thus be investigated in detail.

Keywords Optimal capital structure · Capital structure model · Credit risk · Structural model
60.1 Introduction

The traditional capital structure theories investigate how the capital structure of a firm can be affected by some crucial determinants, such as financial distress costs and agency costs. However, it is a difficult task to measure their effects on capital structure decisions and security pricing. It is of great interest for corporate managers and researchers to quantitatively examine how these effects influence a company and its financial decisions under various conditions. The capital structure models employ contingent claim
valuation (option pricing) theory for analyzing classical corporate finance issues and can deepen the understanding of the firm's financial decisions. Furthermore, contingent claim valuation enables risky debt pricing using marketable securities of the firm, such as common stock, bond, or credit default swap (CDS) prices, to estimate the firm value parameters and, in turn, the default risk. This approach avoids the use of bond ratings and risk premiums (markups), which are difficult to pin down and often are stale in contrast to the trading data of marketable securities.1

1 Another advantage of the structural approach is that macro- and micro-conditions have been implicitly incorporated into the evolution (change) of firm values and, in turn, security prices, and need not be modeled separately.

Structural credit risk modeling, or defaultable claim modeling, was pioneered by the seminal paper of Black and Scholes (1973), in which corporate liabilities can be viewed as a covered call – owning the asset but shorting a call option. Black and Scholes (1973) assume that a company issues only one zero-coupon bond and equity. At maturity, the debt holders either receive the face value of the debt if the company is solvent, or take control of the company otherwise. This option analogy allows closed-form expressions for the corporate securities. Later on, Merton (1974) rigorously elaborated the pricing of corporate debt. In addition, the valuation of any corporate security can be modeled as a contingent claim on the underlying value of the firm. This Black-Scholes-Merton approach to modeling default claims is named the structural approach, since the model explicitly ties default risk to the firm value process and its capital structure.

In modeling the default-trigger mechanism, the Black-Scholes-Merton approach implicitly assumes that a company can only default at the certain time point when its bond principal is due. As this default trigger mechanism is apparently unrealistic, Black and Cox (1976) first propose a model in which default can occur during the life of the bond – a company defaults on its debt if the asset value of the company falls below a certain exogenous default boundary set by a bond covenant. In the same paper, Black and Cox (1976) also point out that the default decision can be endogenized by
shareholders' discretion to maximize equity value. These two types of default mechanisms, exogenously specified by bond covenants and endogenously determined by management, have been widely adopted in most of the succeeding structural models.2

Subsequent researchers extend the structural approach further and incorporate more realistic assumptions. Based on the key objectives of these models, one can classify them as corporate bond pricing models and capital structure models. Corporate bond pricing models, typically assuming stochastic interest rates, are primarily developed to price defaultable corporate debt. In order to accommodate a more realistic and complex debt structure, these models typically assume an exogenous default barrier, and bondholders receive a recovery rate times the face value of debt at maturity if the firm defaults.3 For example, the Longstaff and Schwartz (1995) model is the most well-known corporate bond pricing model. By contrast, the capital structure models incorporate corporate finance theory and put the emphasis on the default event itself. The valuation of corporate securities and financial decisions are jointly determined, and the analyses focus on the complex relations between credit risk, risk premiums, and the firm's financing decisions rather than on the valuation of complex credit securities. In contrast to corporate bond pricing models, a simple stationary debt structure – a debt structure with time-independent payouts – is widely adopted in order to obtain closed-form expressions for debt, equity, and firm value. In this paper, we restrict ourselves to capital structure models and exclude those models focusing on risky debt pricing.4

In the capital structure models, the most notable developments are the works of Leland (1994) and Leland and Toft (1996). They extend the endogenous default approach of Black and Cox (1976) to include the tax shield of debt and bankruptcy costs, and explicitly analyze the static tradeoff theory of capital structure. Various important issues, including the endogenous default boundary, leverage ratio, debt value, debt capacity, yield spread, and even potential agency costs, are discussed in detail. Note that, as Leland and Toft (1996) point out, unlike in the early structural models, equity in a capital structure model is not precisely analogous to an ordinary call option or a vanilla barrier option: the bankruptcy boundary varies with the risk of the firm's activities. Furthermore, the tax benefits and the potential loss in
2 Geske (1977) proposes another way of modeling corporate debts, as compound options.
3 There are some exceptions. For example, Briys and de Varenne (1997) assume recovery of firm value when a firm defaults. However, the tradeoff is a simpler debt structure, a zero-coupon bond.
4 Readers who are interested in other structural models can refer to the survey paper by Elizalde (2005).
bankruptcy imply that debt and equity holders do not split a claim depending only on the underlying asset value. While the Leland (1994) and Leland and Toft (1996) models provide some insights into capital structure issues, their predicted optimal leverage and yield spreads seem not to be consistent with historical averages. Consequently, more recent studies introduce additional realistic features in order to improve the model-predicted leverage and yield spreads. We will also review some of these extensions. This paper is organized as follows. In Sect. 60.2, we briefly review traditional capital structure theory. In Sect. 60.3, we present in detail the two most representative capital structure models – the Leland (1994) and Leland and Toft (1996) models – and their implications. In Sect. 60.4, more recent studies, including debt renegotiation and dynamic capital structure, as well as other developments, are presented. In Sect. 60.5, the applications and empirical performance of capital structure models are discussed. Section 60.6 concludes.
60.2 The Traditional Theory of Optimal Capital Structure5

Durand (1952) proposes an approach to explain why there is an optimal capital structure for a firm. Called the traditional approach, it suggests that "moderate" amounts of debt do not noticeably increase the risks to either the debtholders or the equityholders. Hence, both the cost of debt (i) and the cost of equity (r_E) are relatively constant up to some point. However, beyond some critical amount of debt used, both i and r_E increase. These increases begin to offset the decrease in the average cost of capital obtained from cheaper debt. Therefore, the average cost of capital (r_A) initially falls and then increases, which allows for a minimum of the average cost of capital function. If the average cost of capital is at its minimum, then the firm's value (V) is maximized:

V = (1 - T_C) EBIT / r_A,   (60.1)

where T_C = corporate tax rate and EBIT = earnings before interest and taxes. This implies that there is an optimal capital structure for the firm.
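As a quick numerical illustration of Equation (60.1), with assumed values T_C = 0.35, EBIT = 100, and r_A = 0.10, the firm value is V = (1 - 0.35) × 100 / 0.10 = 650; lowering r_A to 0.09 at the optimal capital structure would raise V to about 722.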
5 This chapter is an excerpt from the book by Lee et al. (2009).

60.2.1 Bankruptcy Costs

The theoretical analysis of Modigliani and Miller (1958) indicates that in the presence of corporate taxes (T_C), the
value of the levered firm (V_L) exceeds the value of the otherwise comparable unlevered firm (V_U). The difference can be viewed as an increment to present value derived from the tax shelter (T_C B). Therefore, we can write

V_L = V_U + T_C B,   (60.2)
where B = total amount of debt. One issue not considered in deriving this result is the possibility of corporate bankruptcy. Obligations to bondholders constitute a first claim on a company's assets. However, under US law, stockholders, who are the owners of public companies, have only limited liability. Their personal assets, other than holdings in the firm, are protected from the claims of bondholders. If operating performance is sufficiently poor, stockholders can avoid obligations to bondholders through the firm's declaring bankruptcy. Essentially, bankruptcy turns the firm over to its debtholders. Bankruptcy proceedings can be very costly. Heavy legal fees must be paid, and assets must often be sold at "bargain basement" prices, frequently following lengthy and expensive disputes among competing claimants. Taken together, these bankruptcy costs may be far from negligible, and the possibility of their being incurred must be considered in assessing a firm's market value. After bankruptcy, the firm belongs to its creditors, who must therefore bear these costs. However, purchasers of corporate bonds will be aware of this and will charge a higher price – that is, demand a higher interest rate – for debt. This, in turn, reduces the amount of earnings available for distribution to stockholders, reducing the value of shares. Hence, before bankruptcy, these costs are carried by the stockholders as reductions in the value of their equity holdings. How much does the possibility of bankruptcy lower the firm's value?

Fig. 60.1 Possibility of optimal debt ratio when allowing for present value of bankruptcy costs

Two factors must be considered: (1) the costs
incurred in the event of bankruptcy and (2) the likelihood, or probability, of bankruptcy occurring. The greater the amount of debt a firm has, the more likely it will fail to meet its obligations and be driven to bankruptcy. Therefore, seeing this risk, bondholders will demand higher rates of interest the larger the debt. Thus, the expected bankruptcy cost is an increasing function of the proportion of debt in a firm's capital structure. Taking into account the possibility of bankruptcy and its effect on firm value, we can augment Equation (60.2) as follows:

V_L = V_U + T_C B - present value of bankruptcy costs.   (60.3)

The higher the level of debt, the greater the present value of the tax shield. As we saw in Equation (60.2), the effect is linear, in the sense that the value of the tax shield is a constant multiple T_C (the corporate tax rate) of the debt value. However, as Equation (60.3) indicates, one must now subtract the present value of bankruptcy costs. This, too, is an increasing function of the debt level, reflecting the increase in the probability of bankruptcy that attaches to increasing debt levels. If these factors operate in conjunction, as illustrated in Fig. 60.1, then an optimal capital structure results. One sees, for the unlevered firm, zero values for the tax shield and bankruptcy costs. Each increases with the level of debt issued, the net effect being a relationship between firm value and debt level that attains its maximum at a debt level well below 100% of total value. Myers (1984) called the optimal capital structure discussed in this section the static tradeoff theory.6
6 Myers (1984) also proposes a pecking-order theory in contrast to the static tradeoff theory.
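To make the tradeoff in Equation (60.3) concrete, the toy calculation below finds the value-maximizing debt level for one assumed parameterization; the quadratic form of the expected bankruptcy cost is purely illustrative.

```python
import numpy as np

def levered_value(B, Vu=650.0, tc=0.35, k=0.002):
    # Eq. (60.3): V_L = V_U + T_C*B - PV(bankruptcy costs), with the
    # expected bankruptcy cost assumed (for illustration) to be k*B^2.
    return Vu + tc * B - k * B**2

debt = np.linspace(0.0, 500.0, 2001)   # candidate debt levels
vals = levered_value(debt)
print(debt[np.argmax(vals)])           # the analytic optimum is tc/(2k) = 87.5
```

With these assumed numbers the optimum is an interior debt level, illustrating why firm value peaks well below 100% debt, as in Fig. 60.1.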
Fig. 60.2 Possibility of optimal debt ratio when allowing for present value of bankruptcy and agency costs
60.2.2 Agency Costs

Agency costs are the legal and contract costs that protect the rights of stockholders, bondholders, and management, and they arise when the interests of these parties come into conflict. Such organizational conflict may lead to capital structure decisions that affect the firm's value. The asset substitution problem arises when a firm invests in assets that are riskier than those that the debtholders expect. For example, having sold debt, stockholders may gain if the corporation abandons its low-risk projects in favor of more risky projects. If the higher-risk projects are successful, the stockholders will benefit from the high returns. On the other hand, this exposes bondholders to a higher degree of risk than they anticipated when the bonds were purchased, and thus they face a higher-than-expected chance of corporate bankruptcy. Bondholders anticipate such possibilities and require that bonds be protected by restrictive covenants. Such covenants involve two types of costs to the firm. First, there are costs in monitoring the covenants; second, these covenants restrict management flexibility, possibly precluding courses of action that would raise the firm's market value. These costs are borne by stockholders, resulting in a lower market value for equity. Jensen and Meckling (1976) use the agency costs of debt and equity to suggest that there is an optimal combination of outside debt and equity that will be chosen to minimize total agency costs. Therefore, an optimal capital structure can exist even in the absence of taxes and bankruptcy costs.

In general, one might expect that agency costs of this sort will be greater the higher the debt. If there is little debt, bondholders will consider their investments relatively secure and demand only modest, fairly inexpensive protection of their interests. At higher debt levels, more stringent assurances will be required. Thus, agency costs affect firm value in much the same way as bankruptcy costs. The greater the debt, the greater the agency cost charged against firm value. This is illustrated in Fig. 60.2, which extends Fig. 60.1 by allowing for agency costs. At any level of debt above zero, agency costs cause a decrease in market value; the greater the debt level, the larger the agency cost factor. Hence, agency costs yield a lower optimal amount of debt in the capital structure than would otherwise prevail. With agency costs, Equation (60.3) can be rewritten as

V_L = V_U + T_C B - present value of bankruptcy costs - agency costs.   (60.3a)
In a review paper, Harris and Raviv (1991) survey capital structure theories based on agency costs, asymmetric information, product/input market interactions, and corporate control considerations (but excluding tax-based theories). Their paper provides further detail on the results discussed in this section.
60.3 Optimal Capital Structure in the Contingent Claims Framework

The traditional capital structure theory identifies some prime determinants of optimal capital structure. Nevertheless, the theory is less useful in practice since it provides only qualitative guidelines. The structural approach enables quantitative analysis and has therefore become very popular in the corporate finance literature. In this section we review in detail how the determinants mentioned in Sect. 60.2 affect capital structure decisions under the contingent claims framework.
In the analysis of optimal capital structure, Brennan and Schwartz (1978) are the first to provide a quantitative examination of optimal leverage using the contingent claims approach. However, their analysis employs numerical techniques and lacks a closed-form solution. As a result, their numerical study cannot provide general comparative statics for the financial variables involved. In this section, we review in detail the two most representative capital structure models – the Leland (1994) and Leland and Toft (1996) models – and their implications. These two studies provide closed-form expressions for corporate securities as well as comparative statics. Their methodology and stationary debt structures are widely adopted in the literature, and the two models have served as benchmarks in the optimal capital structure literature.
60.3.1 The Leland (1994) Model

Leland (1994) extends the endogenous default approach of Black and Cox (1976) to include the tax shield of debt and bankruptcy costs; the model is also termed the static tradeoff model in corporate finance theory. In the Leland (1994) model, financial distress is triggered when shareholders no longer find it profitable, given the revenue produced by the assets, to run the company and continue servicing the debt. Bankruptcy is determined endogenously rather than by the imposition of a positive net worth condition or by a cash flow constraint. The model rests on two major assumptions: (1) the activities of the firm are unchanged by financial structure, and capital structure decisions, once made, are not subsequently changed; (2) the face value of debt, once issued, remains static through time. Consider a firm with asset value V, which follows a diffusion process with constant volatility of the rate of return:

dV/V = μ(V, t) dt + σ dW   (60.4)
where W is a standard Brownian motion. The stochastic process of V is assumed to be unaffected by the financial structure of the firm. Therefore, any net cash outflows associated with the choice of leverage, such as coupons after tax benefits, must be financed by selling additional equity. In addition, there exists a riskless asset that pays a constant rate of interest r. Denote by F(V, t) any claim on the firm that continuously pays a nonnegative coupon, C, per instant of time when the firm is solvent. When the firm finances the net cost of the coupon by issuing additional equity, any such asset's value must satisfy the PDE

(1/2)σ²V²F_VV(V, t) + rVF_V(V, t) − rF(V, t) + F_t(V, t) + C = 0
with boundary conditions determined by payments at maturity and by payments in bankruptcy, should bankruptcy happen prior to maturity. When securities have no explicit time dependence, as in this case, the term F_t(V, t) = 0, and the above equation becomes an ordinary differential equation with F(V) satisfying

(1/2)σ²V²F_VV(V) + rVF_V(V) − rF(V) + C = 0.   (60.5)

The equation has the general solution

F(V) = A_0 + A_1 V + A_2 V^(−X),   (60.6)
where X = 2r/σ² and the constants A_0, A_1, and A_2 are determined by boundary conditions. Any time-independent claim with an equity-financed constant payout C must have this functional form. First, Leland (1994) provides the solution for perpetual debt. Debt promises a perpetual coupon payment, C, whose level remains constant unless the firm declares bankruptcy. Let V_B denote the constant level of asset value at which bankruptcy is declared. If bankruptcy occurs, a fraction α, 0 ≤ α ≤ 1, of asset value will be lost to bankruptcy costs. Accordingly, debtholders receive (1 − α)V_B and shareholders receive nothing.7 The boundary conditions are:

At V = V_B: D(V) = (1 − α)V_B
As V → ∞: D(V) = C/r

The solution is

D(V) = C/r + [(1 − α)V_B − C/r](V/V_B)^(−X)   (60.7)
The above equation can be rewritten as D(V) = [1 − p_B](C/r) + p_B[(1 − α)V_B], where p_B ≡ (V/V_B)^(−X) has the interpretation of the present value of $1 contingent on future bankruptcy; that is, on V falling to V_B. Next, Leland (1994) derives the total value of the firm, v(V), which reflects three terms: the firm's asset value, plus the value of the tax deduction of coupon payments, less the value of bankruptcy costs. Consider a security that pays no coupon but has a value equal to the bankruptcy costs, BC(V). Because its returns are time independent, it must satisfy Equation (60.6) with boundary conditions

At V = V_B: BC(V) = αV_B
As V → ∞: BC(V) = 0.
7 Leland (1994) also has a discussion (in Sect. VI.C) under the condition of deviation from absolute priority, in which bondholders do not receive all of the remaining asset value and stockholders also receive a fraction of the remaining asset value.
The solution in this case is

BC(V) = αV_B (V/V_B)^(−X),   (60.8)
and BC(V) is a decreasing convex function of V. Next consider the value of tax benefits associated with debt financing. Denote the corporate tax rate by T_C. These benefits resemble a security that pays a constant coupon equal to the tax-sheltering value of interest payments, T_C C, as long as the firm is solvent. Denote by TB(V) the value of the tax benefit of debt. Since it is also time independent, it must satisfy Equation (60.6) with boundary conditions

At V = V_B: TB(V) = 0
As V → ∞: TB(V) = T_C C/r.

The solution with these boundary conditions is

TB(V) = T_C C/r − (T_C C/r)(V/V_B)^(−X)   (60.9)

TB(V) is an increasing, strictly concave function of V. Note that the value of tax benefits above presumes that the firm always benefits fully, in the amount T_C C, from the tax deductibility of coupon payments when it is solvent. However, under U.S. tax codes, to benefit fully the firm must have earnings before interest and taxes (EBIT) at least as large as the coupon payment C. Leland (1994) also provides a solution in which EBIT is related to asset value, V, and tax benefits may be partially lost while the firm is still solvent.8

In sum, the total value of the firm, v(V), consisting of the three terms above – the firm's asset value, the value of the tax deduction of coupon payments, and the value of bankruptcy costs – can be expressed as:

v(V) = V + TB(V) − BC(V) = V + (T_C C/r)[1 − (V/V_B)^(−X)] − αV_B (V/V_B)^(−X)   (60.10)

v(V) is strictly concave in asset value, V, when C > 0 and either α > 0 or T_C > 0. Note that if α > 0 or T_C > 0, then v(V) < V as V → V_B, and v(V) > V as V → ∞. This, coupled with concavity, implies that v(V) is proportionally more volatile than V at low values of V, and less volatile at high values of V.

Note that Equation (60.10) is similar to Equation (60.3). If (V/V_B)^(−X) approaches zero, then Equation (60.10) reduces to the original M&M Proposition I as defined in Equation (60.2). For this condition to hold, either (i) (V/V_B) → ∞ or (ii) σ² → 0 and V > V_B. In both cases, the firm is unlikely to go bankrupt, and therefore the tax saving per period on the interest payments becomes a sure stream. As a result, the tax benefit claim in Equation (60.9) will be T_C C/r, and Equation (60.10) will coincide with the original M&M Proposition I in Equation (60.2), even with the existence of bankruptcy costs. On the other hand, without bankruptcy costs (α = 0), the tax benefit claim under Leland's specification still differs from that of M&M. The difference, a negative term −(T_C C/r)(V/V_B)^(−X), comes from the possibility of future bankruptcy, and thus lowers the tax shield of debt.

The value of equity is the total value of the firm less the value of debt:

E(V) = v(V) − D(V) = V − (1 − T_C)(C/r) + [(1 − T_C)(C/r) − V_B](V/V_B)^(−X)   (60.11)

8 Note that this equation presumes that the firm always benefits fully from the tax deductibility of coupon payments when it is solvent. However, to benefit fully the firm must have EBIT above the coupon payment, C. Leland (1994) provides an alternative approach in Appendix A in which tax benefits may be lost while the firm is still solvent.

Leland next derives the bankruptcy-triggering asset level V_B chosen by the firm. To maximize the value of equity at any level of V, the equilibrium bankruptcy-triggering asset value V_B is determined endogenously by the smooth-pasting condition ∂E(V; V_B)/∂V |_{V=V_B} = 0:
V_B = [(1 − T_C)C/r][X/(1 + X)] = (1 − T_C)C/(r + 0.5σ²)   (60.12)

Since V_B < (1 − T_C)C/r, equity is indeed convex in V. Several observations about the asset value V_B at which bankruptcy occurs follow (a numerical sketch of these closed-form expressions appears after the list):

(1) It is proportional to the coupon, C.
(2) It is independent of the current asset value, V.
(3) It decreases as the corporate tax rate, T_C, increases.
(4) It decreases as the risk-free interest rate, r, rises.
(5) It decreases with increases in the riskiness of the firm, σ².
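As a numerical illustration, the sketch below evaluates the closed-form expressions (60.7)–(60.12) at one set of parameter values; the parameter values themselves are our own illustrative assumptions.

```python
# Closed-form Leland (1994) valuation, following Equations (60.7)-(60.12).
# Parameter values are illustrative assumptions, not taken from the chapter.

V = 100.0      # current asset value
C = 5.0        # perpetual coupon
r = 0.06       # risk-free rate
sigma = 0.20   # asset volatility
alpha = 0.50   # fraction of asset value lost in bankruptcy
T_C = 0.35     # corporate tax rate

X = 2.0 * r / sigma**2                              # exponent in Eq. (60.6)
V_B = (1.0 - T_C) * C / (r + 0.5 * sigma**2)        # endogenous default level, Eq. (60.12)
p_B = (V / V_B) ** (-X)                             # PV of $1 received at bankruptcy

D = C / r + ((1.0 - alpha) * V_B - C / r) * p_B     # debt value, Eq. (60.7)
TB = (T_C * C / r) * (1.0 - p_B)                    # tax benefit, Eq. (60.9)
BC = alpha * V_B * p_B                              # bankruptcy cost claim, Eq. (60.8)
v = V + TB - BC                                     # total firm value, Eq. (60.10)
E = v - D                                           # equity value, Eq. (60.11)

print(f"V_B = {V_B:.2f}, debt = {D:.2f}, equity = {E:.2f}, firm = {v:.2f}")
```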
Finally, V_B in Equation (60.12) is independent of time, which confirms the assumption of a constant bankruptcy-triggering asset level V_B. In addition to the determinants of the bankruptcy level, the closed-form expressions explicitly link corporate debt values and optimal leverage to taxes, firm risk, bankruptcy costs, the risk-free interest rate, and bond covenants. How debt values (and therefore yield spreads) and optimal leverage ratios change with these variables is discussed in detail. In addition, Leland shows that a
positive net worth covenant9 can eliminate the firm's incentive to increase risk and thus lower the agency problem. However, optimal capital structure refers not only to the leverage ratio but also to the maturity of debt. The consol bond assumption of the Leland model confines the analysis in this dimension. As a result, Leland and Toft (1996)10 propose a finite-maturity debt structure for further discussion of this issue.
60.3.2 The Leland and Toft (1996) Model

In the Leland and Toft (1996) model, financial distress is triggered, as assumed in Leland (1994), when shareholders no longer find that running the company is profitable. In this model, they further examine the optimal capital structure of a firm that can choose both the amount and the maturity of its debt. They again set up a stationary debt structure by issuing finite-maturity debt with a continuous coupon while simultaneously retiring the same constant amount (principal) of debt issued in the past. As a result, at any given time the firm has many overlapping debt issues outstanding, while the total cash flow to debt holders is constant. The results extend Leland's (1994) closed-form results to a much richer class of possible debt structures and dividend payments and permit study of the optimal maturity of debt as well as the optimal amount of debt. The firm has productive assets whose unleveraged value V follows a continuous diffusion with constant proportional volatility:

dV/V = [μ(V, t) − δ] dt + σ dZ   (60.13)

where μ(V, t) is the total expected rate of return on asset value V; δ is the constant fraction of value paid out to security holders; and dZ is the increment of a standard Brownian motion. The process continues without time limit unless V falls to a default-triggering value V_B. V_B will be determined endogenously and shown to be constant in a rational expectations equilibrium. Finally, a default-free asset exists and pays a continuous constant interest rate r.

Consider a bond issue with maturity t periods from the present, which continuously pays a constant coupon flow c(t) and has principal p(t). Let ρ(t) be the fraction of asset value V_B that debt of maturity t receives in the event of bankruptcy. Let f(s; V, V_B) denote the density of the first passage time s from V to V_B, and F(s; V, V_B) the cumulative distribution function of the first passage time to bankruptcy; that is, the default probability by time s. Under risk-neutral valuation with drift rate (r − δ), the bond with maturity t has the value

d(V; V_B, t) = ∫₀ᵗ e^(−rs) c(t)[1 − F(s; V, V_B)] ds + e^(−rt) p(t)[1 − F(t; V, V_B)] + ∫₀ᵗ e^(−rs) ρ(t)V_B f(s; V, V_B) ds

The first term represents the discounted expected value of the coupon flow; the second term represents the expected discounted value of the repayment of principal; and the third term represents the expected discounted value of the fraction of the assets that will go to debt of maturity t if bankruptcy occurs. Integrating the first term by parts yields

d(V; V_B, t) = c(t)/r + e^(−rt)[p(t) − c(t)/r][1 − F(t)] + [ρ(t)V_B − c(t)/r]G(t)   (60.14)

where

G(t) = ∫₀ᵗ e^(−rs) f(s; V, V_B) ds

Using the result for F(t) from Harrison (1985) and for G(t) from Rubinstein and Reiner (1991):

F(t) = N[h₁(t)] + (V/V_B)^(−2a) N[h₂(t)]   (60.15)

G(t) = (V/V_B)^(−a+z) N[q₁(t)] + (V/V_B)^(−a−z) N[q₂(t)]   (60.16)

where

q₁(t) = (−b − zσ²t)/(σ√t);  q₂(t) = (−b + zσ²t)/(σ√t)

h₁(t) = (−b − aσ²t)/(σ√t);  h₂(t) = (−b + aσ²t)/(σ√t)

a = (r − δ − σ²/2)/σ²;  b = ln(V/V_B);  z = [(aσ²)² + 2rσ²]^(1/2)/σ²

and N(·) is the cumulative standard normal distribution.

9 The positive net worth covenant is modeled by setting the default boundary equal to the principal value of debt.

10 Leland (1994b) proposes an exponential model, similar to a sinking fund, in which firms retire debt at a proportional rate m through time. In the absence of default, the average maturity of debt is finite, given by the time-weighted average T̄ = ∫₀^∞ t(me^(−mt)) dt = 1/m.
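As a check on these first-passage formulas, the sketch below evaluates F(t), G(t), and the single-issue bond value d(V; V_B, t) of Equations (60.14)–(60.16); all parameter values are our own illustrative assumptions.

```python
import math

# First-passage quantities and single-issue bond value in the Leland-Toft
# model, Equations (60.14)-(60.16). Parameters are illustrative assumptions.

V, V_B = 100.0, 40.0
r, delta, sigma = 0.06, 0.03, 0.20
alpha = 0.30

N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF

a = (r - delta - 0.5 * sigma**2) / sigma**2
b = math.log(V / V_B)
z = math.sqrt((a * sigma**2) ** 2 + 2.0 * r * sigma**2) / sigma**2

def F(t):   # cumulative first-passage (default) probability, Eq. (60.15)
    h1 = (-b - a * sigma**2 * t) / (sigma * math.sqrt(t))
    h2 = (-b + a * sigma**2 * t) / (sigma * math.sqrt(t))
    return N(h1) + (V / V_B) ** (-2.0 * a) * N(h2)

def G(t):   # discounted first-passage integral, Eq. (60.16)
    q1 = (-b - z * sigma**2 * t) / (sigma * math.sqrt(t))
    q2 = (-b + z * sigma**2 * t) / (sigma * math.sqrt(t))
    return (V / V_B) ** (-a + z) * N(q1) + (V / V_B) ** (-a - z) * N(q2)

def d(t, c, p, rho):  # value of a single bond of maturity t, Eq. (60.14)
    return (c / r + math.exp(-r * t) * (p - c / r) * (1.0 - F(t))
            + (rho * V_B - c / r) * G(t))

# A 10-year bond with principal 50, coupon 3.5, recovering (1-alpha)*V_B:
print(f"F(10) = {F(10):.4f}, G(10) = {G(10):.4f}, d = {d(10, 3.5, 50.0, 1.0 - alpha):.2f}")
```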
A stationary debt structure with time-independent debt service payments is assumed by Leland and Toft (1996) as follows. Consider an environment in which the firm continuously sells a constant (principal) amount of new debt with maturity T years from issuance, which (if solvent) it will redeem at par upon maturity. New bond principal is issued at a rate p = P/T per year, where P is the total principal value of all outstanding bonds. The same amount of principal is retired as previously issued bonds mature. Therefore, as long as the firm remains solvent, at any time s the total outstanding debt principal will be P, with principal uniformly distributed over maturities in the interval (s, s + T). Bonds with principal p pay a constant coupon rate c = C/T per year, implying that the total coupon paid by all outstanding bonds is C per year. Total debt service payments are therefore time-independent and equal to (C + P/T) per year. Accordingly, the environment assumed by Leland and Toft (1996) avoids time-dependent debt service requirements such as the single issue of debt outstanding assumed by Merton (1974). Assuming the fraction of firm asset value lost in bankruptcy is α and ρ(t) = ρ/T per year for all t, it follows that ρ = (1 − α). The writedown for debt is therefore the fixed fraction of the principal value of debt lost in bankruptcy, namely (1 − (1 − α)V_B/P).

Leland and Toft derive the closed-form solution for the value of all outstanding bonds, with debt of maturity T years, by integrating over all debt outstanding during this period:

D(V; V_B, T) = ∫₀ᵀ d(V; V_B, t) dt
             = C/r + (P − C/r)[(1 − e^(−rT))/(rT) − I(T)] + [(1 − α)V_B − C/r]J(T)   (60.17)

where

I(T) = (1/T)∫₀ᵀ e^(−rt)F(t) dt = (1/(rT))[G(T) − e^(−rT)F(T)]

J(T) = (1/T)∫₀ᵀ e^(−rt)G(t) dt = (1/(zσ√T))[−(V/V_B)^(−a+z) N[q₁(T)]q₁(T) + (V/V_B)^(−a−z) N[q₂(T)]q₂(T)]

Following Equation (60.10) of Leland (1994), total firm value is

v(V; V_B) = V + (T_C C/r)[1 − (V/V_B)^(−a−z)] − αV_B(V/V_B)^(−a−z)   (60.18)

where a and z have been defined in Equation (60.16).
The equilibrium bankruptcy-triggering asset value V_B is determined endogenously by the smooth-pasting condition ∂E(V; V_B, T)/∂V |_{V=V_B} = 0:

V_B = [(C/r)(A/(rT) − B) − AP/(rT) − T_C Cx/r] / [1 + αx − (1 − α)B]   (60.19)

where

A = 2a e^(−rT) N(aσ√T) − 2z N(zσ√T) − (2/(σ√T)) n(zσ√T) + (2e^(−rT)/(σ√T)) n(aσ√T) + (z − a)

B = −(2z + 2/(zσ²T)) N(zσ√T) − (2/(σ√T)) n(zσ√T) + (z − a) + 1/(zσ²T)
x = a + z,11 and n(·) denotes the standard normal density function. The solution is independent of time, indicating that the earlier assumption of a constant V_B is warranted: it represents a rational expectations equilibrium. As T → ∞, it can be shown that V_B → (1 − T_C)(Cx/r)/(1 + x), as in Equation (60.12). Note that V_B will depend on the maturity T of the debt chosen, for any given values of total bond principal P and coupon rate C. This is an important difference between the Leland and Toft model and flow-based bankruptcy models (e.g., Kim et al. (1993)) or models with a positive net worth covenant (e.g., Longstaff and Schwartz (1995)), which imply that V_B is independent of debt maturity. Finally, the value of equity is given by

E(V; V_B, T) = v(V; V_B) − D(V; V_B, T)   (60.20)
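The fragment below assembles the aggregate debt value (60.17), firm value (60.18), and equity value (60.20) in one self-contained sketch; it takes V_B as given rather than solving Equation (60.19), and all parameter values are again our own illustrative assumptions.

```python
import math

# Aggregate Leland-Toft debt, firm, and equity values, Eqs. (60.17)-(60.20).
# F(T) and G(T) are restated from Eqs. (60.15)-(60.16); V_B is taken as
# given instead of solving Eq. (60.19). Parameters are illustrative.

V, V_B, T = 100.0, 40.0, 10.0
r, delta, sigma, alpha, T_C = 0.06, 0.03, 0.20, 0.30, 0.35
P, C = 50.0, 3.5    # total principal and total coupon of all outstanding debt

N = lambda u: 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
a = (r - delta - 0.5 * sigma**2) / sigma**2
b = math.log(V / V_B)
z = math.sqrt((a * sigma**2) ** 2 + 2.0 * r * sigma**2) / sigma**2
q1 = (-b - z * sigma**2 * T) / (sigma * math.sqrt(T))
q2 = (-b + z * sigma**2 * T) / (sigma * math.sqrt(T))
h1 = (-b - a * sigma**2 * T) / (sigma * math.sqrt(T))
h2 = (-b + a * sigma**2 * T) / (sigma * math.sqrt(T))

F_T = N(h1) + (V / V_B) ** (-2.0 * a) * N(h2)                        # Eq. (60.15)
G_T = (V / V_B) ** (-a + z) * N(q1) + (V / V_B) ** (-a - z) * N(q2)  # Eq. (60.16)
I_T = (G_T - math.exp(-r * T) * F_T) / (r * T)
J_T = (-(V / V_B) ** (-a + z) * N(q1) * q1
       + (V / V_B) ** (-a - z) * N(q2) * q2) / (z * sigma * math.sqrt(T))

D_val = (C / r + (P - C / r) * ((1.0 - math.exp(-r * T)) / (r * T) - I_T)
         + ((1.0 - alpha) * V_B - C / r) * J_T)                      # Eq. (60.17)
v_val = (V + (T_C * C / r) * (1.0 - (V / V_B) ** (-a - z))
         - alpha * V_B * (V / V_B) ** (-a - z))                      # Eq. (60.18)
print(f"debt = {D_val:.2f}, firm = {v_val:.2f}, equity = {v_val - D_val:.2f}")
```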
In contrast to the Leland (1994) model, the Leland and Toft (1996) model has some different implications. First, by permitting a firm to choose its debt maturity, optimal leverage is lower when the firm is financed by shorter-term debt. They also explain why a firm issues short-term debt even though longer-term debt generates higher firm value: short-term debt reduces the asset substitution agency problem. While short-term debt does not fully exploit the tax benefit of debt, it can be balanced against bankruptcy and agency costs in determining the optimal maturity of the capital structure. This is a consequence of the fact that short-term debt

11 If the payout rate (δ) is equal to 0, then it can be shown that a + z = (r − σ²/2)/σ² + [(aσ²)² + 2rσ²]^(1/2)/σ² = 2r/σ². This is exactly the X defined in Equation (60.6) by Leland (1994). However, we should note that the introduction of the payout rate relaxes the restrictive assumption in Leland (1994) that net cash outflows associated with the choice of leverage, such as coupons after tax benefits, must be financed by selling additional equity.
holders need not protect themselves by demanding higher coupon rates, and thus equity holders also benefit. Second, for bond portfolio management, they find that Macaulay duration overstates the true duration of risky debt, especially for junk bonds, which may even have negative duration. Furthermore, convexity can become concavity.
60.4 Recent Development of Capital Structure Models

In Sect. 60.3, we reviewed in detail two of the most well-known capital structure models – the Leland (1994) and Leland and Toft (1996) models. These two models have served as benchmarks in the optimal capital structure literature, and their stationary debt structures and time-independent settings have been widely adopted. However, compared with the historical average, the optimal leverage implied by the Leland (1994) model is unreasonably high, while the model's predicted yield spread is too low given the suggested high leverage, especially for unprotected debt. Furthermore, unreasonably large bankruptcy costs, as high as 50%, are required to lower the optimal leverage to the observed level. In addition to these issues, several implications of the Leland (1994) model are also not in accordance with empirical observations. In the Leland (1994) model, the firm issues equity to finance its contractual debt obligations until equity value is driven to zero, in the absence of a reorganization option. Hence, the bankruptcy boundary coincides with the liquidation barrier. In practice, however, it is normal for management to file for bankruptcy when equity value is still positive. Some researchers argue that a more realistic bankruptcy procedure may affect the valuation of corporate securities and capital structure decisions. Therefore, subsequent studies attempt to extend Leland's (1994) model to a more realistic setting, hoping to make the model-implied yield spread and leverage ratio more consistent with the observed averages in practice.

The first branch of these extensions permits deviations from the absolute priority rule (APR) in Chapter 11 bankruptcy proceedings. The importance of violations of the APR has been addressed in the literature. Eberhart and Weiss (1998) clearly point out that the APR is explicitly or implicitly assumed in many seminal finance models, such as Merton (1974) and Myers (1977). The APR is adhered to when shareholders receive nothing until debt holders have been fully paid. The APR is also assumed in most of the early structural models. However, empirical studies indicate that APR violations are a common phenomenon (see Franks and Torous 1989; Eberhart et al. 1990; Weiss 1990). Therefore, it is of great interest to researchers to analyze the causes of APR deviations
and their impact on capital structure decisions under a rational expectations framework. Structural models can accommodate violations of the APR by incorporating different default boundaries. Leland (1994) includes a discussion (in Sect. VI.C) of deviations from absolute priority, in which bondholders do not receive all of the remaining asset value and stockholders also receive a fraction of it. Nevertheless, simply changing the boundary condition cannot explain debt holders' willingness to make concessions, since their claims would be reduced. Therefore, several recent models permit debt renegotiation through private workouts or through court-supervised renegotiation under Chapter 11 bankruptcy proceedings. For example, Fan and Sundaresan (2000) propose a game-theoretic setting that incorporates equity's ability to force concessions and varying bargaining powers of the debt holders and equity holders. Later, Francois and Morellec (2004) extend this bargaining game to explicitly model Chapter 11 protection and court-supervised renegotiation. By introducing the possibility of renegotiating the debt contract, default can occur at positive equity value. This is in contrast to Leland's (1994) model, in which default occurs as the equity value reaches zero, issuing new equity is costless, and the APR is respected.

Another branch of extensions concerns dynamic capital structure. Early capital structure models assume that debt issuance is a static decision. In practice, however, firms adjust outstanding debt levels in response to changes in firm value. Early studies of dynamic capital structure include Kane et al. (1984, 1985) and Fischer et al. (1989). Recent works include the Goldstein et al. (2001) and Ju and Ou-Yang (2006) models, among many others. In general, the dynamic feature of these models predicts a lower leverage ratio than static capital structure models, more in line with the leverage observed in practice.

There are still other researchers who explore the capital structure problem with respect to investment risks and agency costs, the timing of default under more realistic bankruptcy proceedings, alternative processes, and other issues. We briefly categorize these studies as follows (the list below is by no means exhaustive):

Dynamic Capital Structure:12 Fischer et al. (1989); Goldstein et al. (2001); Ju et al. (2004); Ju and Ou-Yang (2006).
Strategic Default and Renegotiation: Anderson and Sundaresan (1996); Mella-Barral and Perraudin (1997); Mella-Barral (1999); Fan and Sundaresan (2000); Francois and Morellec (2004).
Risk Management for Firms: Ross (1996); Leland (1998).
12 Taurén (1999) and Collin-Dufresne and Goldstein (2001) approach this issue by proposing alternative models that reflect firms' tendency to maintain a stationary leverage ratio. However, their studies focus on risky corporate debt pricing and the implications for credit spreads.
Bankruptcy Procedures (Chapter 7 versus Chapter 11): Broadie et al. (2007).
Optimal Investment: Mauer and Triantis (1994); Childs et al. (2005); Hackbarth (2006).
Stochastic Interest Rates: Kim et al. (1993); Acharya and Carpenter (2002); Huang, Ju and Ou-Yang (2003); Ju and Ou-Yang (2006).
Jumps: Dao and Jeanblanc (2006).

The Fan and Sundaresan (2000) model and the Goldstein et al. (2001) model are presented in detail in Sects. 60.4.1 and 60.4.2, respectively. We chose these two papers because, in our opinion, they are the most important or representative of their streams of research. The remaining research is presented in less detail: we briefly outline the key assumptions and results, and readers may refer to the original papers for details.
60.4.1 The Fan and Sundaresan (2000) Model

Fan and Sundaresan (2000) propose a game-theoretic setting that incorporates equity's ability to force concessions and various bargaining powers of the debt and equity holders. Two cases are discussed in their study for when the firm's asset value falls below a certain boundary: a debt-equity swap and strategic debt service. In the case of a debt-equity swap, the debtholders exchange their claims for equity. In strategic debt service, borrowers stop making the contractual coupon payments and start servicing debt strategically until the firm's asset value returns to its original level. Some key assumptions of their framework are as follows: (1) During the default period, the tax benefits are lost. (2) Asset sales for dividend payments are prohibited. (3) The firm can be liquidated only at a cost: the proportional cost is α (0 ≤ α ≤ 1) and the fixed cost is K (K ≥ 0). Note that debt holders have strict absolute priority upon liquidation. (4) The asset value of the firm follows the lognormal diffusion process

dV = (μ − β)V dt + σV dB_t   (60.21)
where μ is the instantaneous expected rate of return on the firm, gross of all payouts, and β is the firm's cash payout ratio. Here we present only the results for strategic debt service. If the equity and debt holders can reach an agreement on a temporary coupon reduction when the firm is under financial distress, the firm will not lose its potential future tax benefits. At the trigger point for strategic debt service, Ṽ_S, both parties bargain over the total value of the firm, v(V). The driving force behind strategic behavior is the presence of proportional and fixed costs of liquidation: the bargaining power of equity holders clearly depends on the liquidation cost αṼ_S + K. Fan and Sundaresan (2000) endogenize both the reorganization boundary and the optimal sharing rule between equity and debt holders upon default.
The total value of the firm is

v(V) = V + (T_C C/r)[1 − (λ₊/(λ₊ − λ₋))(V/Ṽ_S)^(λ₋)],  when V > Ṽ_S
v(V) = V − (T_C C/r)(λ₋/(λ₊ − λ₋))(V/Ṽ_S)^(λ₊),  when V ≤ Ṽ_S   (60.22)

where

λ± = [0.5 − (r − β)/σ²] ± {[0.5 − (r − β)/σ²]² + 2r/σ²}^(1/2)
For any V ≤ Ṽ_S, Fan and Sundaresan set the optimal sharing rule between shareholders and debtholders as

Ẽ(V) = θ̃ v(V) and D̃(V) = (1 − θ̃)v(V).   (60.23)

The Nash solution to the bargaining game can be characterized as

θ̃ = arg max_θ {θ v(V) − 0}^η {(1 − θ)v(V) − max[(1 − α)V − K, 0]}^(1−η)
  = min{η [v(V) − max((1 − α)V − K, 0)] / v(V), 1},   (60.24)

where η denotes the bargaining power of shareholders. At equilibrium, Fan and Sundaresan (2000) show that debt holders accept less than the contractual coupon and still permit shareholders to run the company. This results in deviations from the APR, in which the shareholders get a bigger share and the debtholders a smaller proportion of the firm, but both parties are better off. Therefore, the concession of debtholders is well explained under Fan and Sundaresan's framework. They then solve for the trigger point of strategic debt service (in the K = 0 case) as

Ṽ_S = [λ₋/(λ₋ − 1)] [c(1 − T_C + ηT_C)/r] [1/(1 − α)]   (60.25)

and the strategic debt service amount paid to debt holders is S(V) = (1 − ηα)βV. It can be shown that the equity value satisfies the following differential equations:13

(1/2)σ²V²Ẽ_VV(V) + (r − β)VẼ_V(V) − rẼ(V) + βV − c(1 − T_C) = 0, when V > Ṽ_S;
(1/2)σ²V²Ẽ_VV(V) + (r − β)VẼ_V(V) − rẼ(V) + βV − S(V) = 0, when V ≤ Ṽ_S;   (60.26)
13 See Dixit and Pindyck (1994).
with the boundary conditions:

lim_{V→∞} Ẽ(V) = V − c(1 − T_C)/r
lim_{V↓Ṽ_S} Ẽ(V) = η(αṼ_S + cT_C/r)
lim_{V↓Ṽ_S} Ẽ_V(V) = η(α + cT_C/(rṼ_S))
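To make the strategic debt service mechanics concrete, the sketch below computes the roots λ±, the trigger Ṽ_S of Equation (60.25), and the strategic service flow S(V) = (1 − ηα)βV, using the expressions exactly as stated above; the parameter values are our own illustrative assumptions.

```python
import math

# Strategic debt service in the Fan-Sundaresan (2000) model: roots of the
# pricing ODE, the trigger level (60.25), and the strategic service flow.
# Parameter values are illustrative assumptions.

r, beta, sigma = 0.06, 0.04, 0.25
c, T_C = 5.0, 0.35
alpha, K = 0.30, 0.0     # proportional and fixed liquidation costs (K = 0 case)
eta = 0.5                # bargaining power of shareholders

m = 0.5 - (r - beta) / sigma**2
lam_plus = m + math.sqrt(m**2 + 2.0 * r / sigma**2)
lam_minus = m - math.sqrt(m**2 + 2.0 * r / sigma**2)

# Trigger point for strategic debt service, Eq. (60.25), K = 0:
V_S = (lam_minus / (lam_minus - 1.0)) \
      * (c * (1.0 - T_C + eta * T_C) / r) / (1.0 - alpha)

def S(V):
    # Strategic debt service: debt holders concede a fraction eta*alpha
    # of the cash flow beta*V while the firm is in distress.
    return (1.0 - eta * alpha) * beta * V

print(f"lambda+ = {lam_plus:.3f}, lambda- = {lam_minus:.3f}")
print(f"trigger V_S = {V_S:.2f}, service at trigger = {S(V_S):.3f}")
```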
Note that when V > Ṽ_S, the contractual coupon is paid and the tax shield is in effect. By contrast, when V ≤ Ṽ_S, the tax shield is lost because the equity holders are strategically servicing the debt according to the cash flow generated by the firm. Both the strategic debt service amount S(V) and the trigger level Ṽ_S are determined endogenously. The equity value is

Ẽ(V) = V − (1 − T_C)c/r + [η(αṼ_S + T_C c/r) − Ṽ_S + (1 − T_C)c/r](V/Ṽ_S)^(λ₋),  when V > Ṽ_S
Ẽ(V) = η[v(V) − (1 − α)V],  when V ≤ Ṽ_S   (60.27)

The Fan and Sundaresan model shows that debt renegotiation encourages early default and increases credit spreads on corporate debt, given that shareholders can renegotiate in distress to avoid inefficient and costly liquidation. It may be in the interest of debt holders to forgive part of the debt service payments if doing so avoids wasteful liquidation costs, the savings from which can be shared by the two claimants. If shareholders have no bargaining power, no strategic debt service takes place and the model converges to the Leland model. Furthermore, by introducing the possibility of renegotiating the debt contract, default can occur at positive equity value. This is in contrast to Leland's (1994) model, in which default occurs when the equity value reaches zero, as a consequence of the fact that issuing new equity is costless and the APR is respected.

60.4.2 The Goldstein et al. (2001) Model

Goldstein, Ju, and Leland propose an EBIT-based model in which the state variable is earnings before interest and taxes (EBIT) instead of the commonly used unlevered firm value.14 This approach is intuitively appealing: the EBIT-generating machine runs independently of how the EBIT flow is distributed among its claimants. The advantage of taking the claim on future EBIT is that all contingent claims on future EBIT flows, including the government's claim, are treated in a consistent fashion. More precisely, the government claim is no longer treated as a cash inflow in the form of a tax benefit, but as a cash outflow on EBIT in the form of a tax. In other words, each dollar paid out, whether to interest payments, dividends, or taxes, affects the firm in the same way. Another advantage is that unlevered and levered equity never exist simultaneously. In contrast to the unlevered firm value, the claim to future EBIT flows is more robust to changes in capital structure. Furthermore, compared with previous models, this characterization rests on assumptions that are more in accordance with empirical findings.15

The single technology of a pure exchange economy produces a payout flow δ that is specified by the (EBIT) process

dδ/δ = μ^P dt + σ dz   (60.28)

The value of the claim to the entire payout stream is the discounted expected cash flow (the unlevered firm value) under the risk-neutral measure Q:

V(t) = E_t^Q [∫_t^∞ δ_s e^(−r(s−t)) ds] = δ_t/(r − μ)   (60.29)

where μ = μ(μ^P) is the risk-neutral drift of the payout flow rate:

dδ/δ = μ dt + σ dz̃   (60.30)

Since r and μ are constants, V and δ share the same dynamics:

dV/V = μ dt + σ dz̃   (60.31)

14 Although unlevered firm value may not be traded, trading of equity or other contingent claims allows the use of unlevered firm value as the state variable (Ericsson and Reneby 2004a).

15 Fischer et al. (1989) assume the following: (i) The value of an optimally levered firm can only exceed its unlevered value by the amount of transaction costs incurred; this eliminates arbitrage opportunities. (ii) A firm that follows an optimal financing policy offers a fair risk-adjusted return; thus, unlevered firms offer a below-fair expected rate of return. Goldstein et al. (2001) argue that these key assumptions of Fischer et al. (1989) are not realistic. First, the arbitrage strategy is not feasible because a bidder must pay a large premium in order to gain control of management. They also argue that for publicly traded assets, rational investors will push down security prices so that a fair expected return is obtained on any traded asset regardless of the policies employed by management.
Accordingly, the total return on V satisfies

(dV + δ dt)/V = r dt + σ dz̃   (60.32)

60.4.2.1 Optimal Static Capital Structure

Assume that a firm issues a consol bond with a constant coupon payment C as long as the firm remains solvent. As in Leland (1994), any claim F(V) with payout flow P must satisfy the ordinary differential equation

μVF_V + (σ²/2)V²F_VV + P − rF = 0   (60.33)

where P is the payout flow. The solution to the above equation can be written as the sum of a particular solution to the full inhomogeneous equation and the general solution to the corresponding homogeneous equation. The general solution of μVF_V + (σ²/2)V²F_VV − rF = 0 (no intertemporal cash flows) is

F = A₁V^(−y) + A₂V^(−x)   (60.34)

where

x = [(μ − σ²/2) + √((μ − σ²/2)² + 2rσ²)]/σ²
y = [(μ − σ²/2) − √((μ − σ²/2)² + 2rσ²)]/σ²

Note that x is positive and y is negative. Since A₁V^(−y) goes to infinity as V becomes large, A₁ equals zero for all claims of interest.

(1.1) Define p_B(V) as the present value of a claim that pays $1 contingent on firm value reaching V_B:

p_B(V) = A₁V^(−y) + A₂V^(−x)

with boundary conditions

lim_{V→∞} p_B(V) = 0 and p_B(V = V_B) = 1.

Thus,

p_B(V) = (V/V_B)^(−x)   (60.35)

(1.2) Define V_solv(V) as a claim entitled to the entire payout δ as long as the firm remains solvent; that is, as long as firm value remains above V_B:

V_solv = V + A₁V^(−y) + A₂V^(−x)

with boundary conditions

lim_{V→∞} V_solv = V and V_solv = 0 as V = V_B.

Thus,

V_solv = V − V_B p_B(V)   (60.36)

(1.3) Define V_int as the value of the claim to interest payments:

V_int = C/r + A₁V^(−y) + A₂V^(−x).

Since lim_{V→∞} V_int = C/r, A₁ = 0. This claim vanishes at V = V_B, so one can obtain

V_int = (C/r)[1 − p_B(V)]   (60.37)

(1.4) Finally, separating the value of continuing operations among equity, debt, and government, we have

E_solv(V) = (1 − T_eff)(V_solv − V_int)   (60.38)

D_solv(V) = (1 − T_i)V_int   (60.39)

G_solv(V) = T_eff(V_solv − V_int) + T_i V_int   (60.40)

where T_i is the personal tax rate on interest payments, T_d is the effective tax rate for dividends, T_C is the tax rate for corporate profits, and the effective rate T_eff satisfies

(1 − T_eff) = (1 − T_C)(1 − T_d).

(1.5) Define V_def(V) as the present value of the default claim:

V_def(V) = A₁V^(−y) + A₂V^(−x)

with boundary conditions

lim_{V→∞} V_def(V) = 0 and V_def(V = V_B) = V_B.

Thus,

V_def(V) = V_B p_B(V)   (60.41)
Note that V_solv(V) + V_def(V) = V, the present value of the claim to EBIT. Therefore, value is neither created nor destroyed by changes in the capital structure. Similar to the claims to funds during solvency, the default claim can be divided among bankruptcy costs, debt, and government:

BC_def(V) = αV_def(V)   (60.42)
D_def(V) = (1 − α)(1 − T_eff)V_def(V)   (60.43)

G_def(V) = (1 − α)T_eff V_def(V)   (60.44)
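The sketch below evaluates these EBIT-based claims – the positive exponent x, the default claim p_B, and the solvency and default splits of Equations (60.35)–(60.44) – for one illustrative parameter set; the parameter values and the choice of V_B are our own assumptions (the endogenous V_B is derived below in step (1.7)).

```python
import math

# EBIT-based claims in the Goldstein-Ju-Leland (2001) static model,
# Equations (60.35)-(60.44). Parameters and V_B are illustrative assumptions;
# the endogenous default boundary is derived later via smooth pasting.

V, V_B = 100.0, 35.0
r, mu, sigma = 0.06, 0.01, 0.25    # mu is the risk-neutral drift of EBIT
C = 4.0                            # consol coupon
alpha = 0.30                       # fraction of default claim lost in bankruptcy
T_C, T_d, T_i = 0.35, 0.20, 0.35   # corporate, dividend, and interest tax rates
T_eff = 1.0 - (1.0 - T_C) * (1.0 - T_d)

m = mu - 0.5 * sigma**2
x = (m + math.sqrt(m**2 + 2.0 * r * sigma**2)) / sigma**2   # positive exponent

p_B = (V / V_B) ** (-x)                  # Eq. (60.35)
V_solv = V - V_B * p_B                   # Eq. (60.36)
V_int = (C / r) * (1.0 - p_B)            # Eq. (60.37)
E_solv = (1.0 - T_eff) * (V_solv - V_int)            # Eq. (60.38)
D_solv = (1.0 - T_i) * V_int                         # Eq. (60.39)
G_solv = T_eff * (V_solv - V_int) + T_i * V_int      # Eq. (60.40)
V_def = V_B * p_B                        # Eq. (60.41)
D_def = (1.0 - alpha) * (1.0 - T_eff) * V_def        # Eq. (60.43)

print(f"x = {x:.3f}, p_B = {p_B:.4f}")
print(f"equity = {E_solv:.2f}, debt = {D_solv + D_def:.2f}, gov = {G_solv:.2f}")
```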
Note that Goldstein et al. (2001) assume that the portion of the EBIT-generating machine not lost to bankruptcy costs can be sold to competitors in the event of bankruptcy. Thus, the government has a claim even in bankruptcy, due to the taxes on the future EBIT earned by competitors.

(1.6) The restructuring costs are assumed to be proportional to the initial value of the debt issuance:

RC(V₀) = q[D_solv(V₀) + D_def(V₀)]   (60.45)

The restructuring costs are deducted from the proceeds of the debt issuance before distribution to equity holders.

(1.7) Solving the smooth-pasting condition ∂E/∂V |_{V=V_B} = 0, one can obtain

V_B = (C/r)[x/(1 + x)]   (60.46)

Note that while most comparative statics are similar to those in Leland (1994), equity is a decreasing function of the effective tax rate. This is because an increase in T_eff increases the government claim at the expense of equity. In contrast, previous capital structure models use unlevered firm value as the state variable and treat the government claim as a cash inflow. As a result, the tax benefits increase with the tax rate and in turn increase equity value. However, empirical evidence indicates that equity prices are a decreasing function of the tax rate. Under the characterization of Goldstein et al. (2001), even though the tax benefit is an increasing function of T_C, equity is a decreasing function of T_C. This feature solves the problem of the previous capital structure models and is supported by empirical studies.

60.4.2.2 Optimal Upward Dynamic Capital Structure Strategy16

Goldstein et al. (2001) assume that there is a threshold V_B at which the firm optimally chooses bankruptcy, and a threshold V_U at which management calls the outstanding debt issue at par and sells a larger new issue.17 They then use a backward induction scheme to show that it is optimal for each period's coupon payments, restructuring threshold, and bankruptcy threshold to increase by the same factor γ = V_U/V₀ by which the initial EBIT value increases each period. All claims also scale up by the same factor. Because of this scaling feature, the firm looks identical at every future restructuring date, except that all factors are scaled up by γ.

16 They mention that downward restructuring can be incorporated into their model. However, they argue that transaction costs can discourage debt reduction outside Chapter 11 (see Gilson 1997). In addition, because of neglected issues such as asset substitution, asymmetric information, equity's ability to force concessions, and Chapter 11 protection, they are skeptical about the results for downward restructuring.

17 Goldstein et al. (2001) claim that, given management's strategy, the value of the debt issue would be the same even if it were not callable, as long as all debt issues receive the same recovery rate in the event of bankruptcy.

(2.1) Define P_U(V) as the present value of a claim that pays $1 contingent on firm value reaching V_U. This claim receives no dividend and therefore has the form

P_U(V) = A₁V^(−y) + A₂V^(−x)

with boundary conditions P_U(V_B) = 0 and P_U(V_U) = 1.
Thus,

P_U(V) = [−V_B^(−x)V^(−y) + V_B^(−y)V^(−x)]/Σ   (60.47)

where Σ ≡ V_B^(−y)V_U^(−x) − V_U^(−y)V_B^(−x).

(2.2) Similarly, define p_B(V) as the present value of a claim that pays $1 contingent on firm value reaching V_B:

p_B(V) = [V_U^(−x)V^(−y) − V_U^(−y)V^(−x)]/Σ   (60.48)

with the same Σ ≡ V_B^(−y)V_U^(−x) − V_U^(−y)V_B^(−x).

(2.3) From (2.1) and (2.2), all period-0 claims can be written in terms of P_U(V) and p_B(V):

V⁰_solv(V) = V − P_U(V)V_U − p_B(V)V_B   (60.49)

(current firm value remains between V_B and V_U)

V⁰_int(V) = (C₀/r)[1 − P_U(V) − p_B(V)]   (60.50)

V⁰_def(V) = p_B(V)V_B   (60.51)

V⁰_res(V) = P_U(V)V_U   (60.52)

Note that the sum V⁰_solv(V) + V⁰_def(V) + V⁰_res(V) = V.

(2.4) Assuming full-loss offset, the after-tax values of the period-0 claims are:

d⁰(V) = (1 − T_i)V⁰_int(V) + (1 − α)(1 − T_eff)V⁰_def(V)   (60.53)

e⁰(V) = (1 − T_eff)[V⁰_solv(V) − V⁰_int(V)]   (60.54)

g⁰(V) = T_eff[V⁰_solv(V) − V⁰_int(V)] + T_i V⁰_int(V) + (1 − α)T_eff V⁰_def(V)   (60.55)

bc⁰(V) = αV⁰_def(V)   (60.56)

(2.5) Define e(V) as the equity claim to the intertemporal EBIT flows of all periods. Note that e(V) is not the total equity claim: since the claims to future debt issues are currently owned by equity and the restructuring-cost claim, future debt holders will have to pay a fair price for these claims in the future. As a result, there are also cash transfers between the claimants at each restructuring date. The scaling feature inherent in the model implies

e(V₀)/e⁰(V₀) = d(V₀)/d⁰(V₀) = g(V₀)/g⁰(V₀) = 1/[1 − γP_U(V₀)] = V₀/[V₀ − V_U P_U(V₀)]   (60.57)

The equity claim to the intertemporal cash flows is therefore

e(V₀) = e⁰(V₀)/[1 − γP_U(V₀)]
      = e⁰(V₀){1 + γP_U(V₀) + [γP_U(V₀)]² + ...}
      = e⁰(V₀) Σ_{j=0}^∞ [γP_U(V₀)]^j   (60.58)

(2.6) Next, they determine the portion of the future debt claims that currently belongs to equity and to restructuring costs. Management will call the outstanding debt issue at par at the restructuring date. Therefore, the initial value of the debt claim equals the claim to period-0 cash flows plus the present value of the call value:

D⁰(V₀) = d⁰(V₀) + P_U(V₀)D⁰(V₀)   (60.59)

Hence,

D⁰(V₀) = d⁰(V₀)/[1 − P_U(V₀)]   (60.60)

D⁰(V₀) is the fair value debt holders must pay at the time of issuance. The proceeds are then divided between equity and the restructuring cost. By the scaling argument, the total restructuring claim is

RC(V₀) = qD⁰(V₀){1 + γP_U(V₀) + [γP_U(V₀)]² + ...} = qD⁰(V₀)/[1 − γP_U(V₀)]   (60.61)

The total equity claim, E(V), just before the initial debt issuance is the sum of the intertemporal claims of debt and equity, less the claim of restructuring costs:

E(V₀) = [e⁰(V₀) + d⁰(V₀) − qD⁰(V₀)]/[1 − γP_U(V₀)]   (60.62)

After the initial debt issuance during period 0,

E(V) = γP_U(V)E(V₀) + e⁰(V) − P_U(V)D⁰(V₀)   (60.63)

This reduces to

E(V₀⁺) = E(V₀) − (1 − q)D⁰(V₀)   (60.64)

just after the initial issuance, and to

E(V_U) = γE(V₀) − D⁰(V₀)   (60.65)

just before the next issuance. Finally, managers choose C, γ = V_U/V₀, and V_B/V₀ in order to maximize the wealth of equity holders, subject to limited liability.

In the Goldstein, Ju, and Leland model, the optimal initial leverage ratio is substantially lower than that obtained in the static model and is more in line with the historical average. This is a result of the fact that the firm has the option to increase its leverage in the future; it will wait until it becomes optimal to exercise that option, so the possibility of increasing leverage in the future decreases the initial optimal leverage. In addition, the bankruptcy-triggering level V_B also drops significantly, since a firm that has an option to adjust its capital structure is more valuable than one without this option. Moreover, the model implies a reasonable and lower bankruptcy cost at the optimal leverage, and the tax benefits of debt increase significantly over the prediction of static models.
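The two-barrier claims (60.47)–(60.48) and the geometric scaling sum in (60.58) are straightforward to evaluate numerically; the sketch below does so for one illustrative parameter set, with all values (including the candidate thresholds V_B and V_U) chosen by us for illustration rather than optimized as the model prescribes.

```python
import math

# Two-barrier claims and scaling sums in the Goldstein-Ju-Leland (2001)
# dynamic model, Equations (60.47)-(60.48) and (60.57)-(60.58).
# All parameter values, including V_B and V_U, are illustrative assumptions;
# in the model they would be chosen to maximize equity value.

V0, V_B, V_U = 100.0, 35.0, 180.0
r, mu, sigma = 0.06, 0.01, 0.25

m = mu - 0.5 * sigma**2
root = math.sqrt(m**2 + 2.0 * r * sigma**2)
x = (m + root) / sigma**2          # positive exponent
y = (m - root) / sigma**2          # negative exponent

Sigma = V_B ** (-y) * V_U ** (-x) - V_U ** (-y) * V_B ** (-x)

def P_U(V):   # $1 paid when V first reaches the upper threshold, Eq. (60.47)
    return (-V_B ** (-x) * V ** (-y) + V_B ** (-y) * V ** (-x)) / Sigma

def p_B(V):   # $1 paid when V first reaches the default threshold, Eq. (60.48)
    return (V_U ** (-x) * V ** (-y) - V_U ** (-y) * V ** (-x)) / Sigma

# Boundary checks: P_U(V_U) = 1, P_U(V_B) = 0, p_B(V_B) = 1, p_B(V_U) = 0.
assert abs(P_U(V_U) - 1.0) < 1e-9 and abs(p_B(V_B) - 1.0) < 1e-9

gamma = V_U / V0                        # scaling factor per restructuring
scale = 1.0 / (1.0 - gamma * P_U(V0))   # geometric sum in Eq. (60.58)
print(f"P_U(V0) = {P_U(V0):.4f}, p_B(V0) = {p_B(V0):.4f}, scaling = {scale:.3f}")
```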
60.4.3 Other Important Extensions

In this section, we briefly present some other extensions that introduce investment risk, alternative bankruptcy procedures, and alternative stochastic processes, and we outline the key results of these models.
60.4.3.1 Investment Risk – Leland (1998)

One possible agency problem, asset substitution, was discussed in Leland (1994) and Leland and Toft (1996). Leland (1998) explicitly models this issue by allowing the firm to choose its risk strategy and examines the interaction between capital structure and risk choice. To capture the
agency cost, Leland (1998) assumes that risk choices are made ex post, that is, after debt is in place; the risk choices cannot be precontracted in the debt covenants or credibly precommitted. The environment with ex post risk choice can be contrasted with the hypothetical ex ante environment, in which the firm simultaneously chooses its risk strategy and its debt structure to maximize initial firm value. The measure of agency costs is therefore the difference in maximal values between the ex ante and ex post cases. This reflects the loss in value that follows from risk strategies chosen to maximize equity value rather than firm value. In this paper, optimal capital structure reflects both the tax advantages of debt less default costs and the agency costs resulting from asset substitution (Jensen and Meckling 1976). The joint determination of capital structure and investment risk is examined. Leland (1998) finds that agency costs restrict leverage and debt maturity and increase yield spreads.
60.4.3.2 Chapter 11 Protection – Francois and Morellec (2004)

Francois and Morellec (2004) extend the Fan and Sundaresan (2000) model to incorporate the possibility of Chapter 11 filings. Under their setup, shareholders hold a Parisian down-and-out option on the firm's assets; that is, shareholders have a residual claim on the cash flows generated by the firm unless the value of these assets reaches the default threshold and remains below that threshold for the length of the exclusivity period. It is generally acknowledged that there are two types of defaulting firms: first, firms that are economically sound but default due to temporary financial distress, and recover under Chapter 11; second, firms that are economically unsound, keep losing value under Chapter 11, and are eventually liquidated under Chapter 7. The modeling philosophy comes from empirical studies showing that most firms emerge from Chapter 11; only a few are eventually liquidated under Chapter 7 after filing for Chapter 11. Following the Nash bargaining game of Fan and Sundaresan's approach,18 Francois and Morellec presume the firm renegotiates its debt obligations whenever the asset value falls below a constant threshold V_B. However, a major
18 Note that the specification of the bargaining game within Francois and Morellec's framework is slightly different from that of Fan and Sundaresan (2000). Francois and Morellec focus on Chapter 11 filings – court-supervised debt renegotiation – in contrast to the private workouts of Fan and Sundaresan. The automatic stay of assets prevents shareholders from liquidating the firm's assets. Nevertheless, the renegotiation plan requires each participant to receive a payoff that exceeds the liquidating value of its claims. As a result, bondholders' payoff has to exceed (1 − α)V_B, and this accounts for the difference in the specification of the bargaining game.
difference from Fan and Sundaresan’s model is that Francois and Morellec assume that a proportional costs ' are borne by the company during the renegotiation process. The costs of financial distress incurred by the firm filing Chap. 14 are ignored in the prior research. Even though the costs implied by private workouts are generally low, Chap. 14 filings are associated with large costs of financial distress that may affect shareholder’s default decision. Note that Leland (1994) only allows liquidation while Fan and Sundaresan (2000) only permit private workouts. Francois and Morellec solve the endogenous default barrier by maximizing equity value and providing closed-form solutions for corporate debt and equity values. They find that the possibility to renegotiate the debt contract has ambiguous impact on leverage choices and increases credit spreads on corporate debt. The sharing rule of cash flows during bankruptcy has a large impact on optimal leverage, while credit spread on corporate debt shows little sensitivity to the varying bargaining power.
60.4.3.3 Chapter 11 vs. Chapter 7 – Broadie et al. (2007)

Broadie et al. (2007) propose a framework in which optimal debt and equity values are determined in the presence of Chapters 7 and 11 of the U.S. bankruptcy code. They explicitly consider two separate barriers, one for default and one for liquidation, to capture more realistic bankruptcy procedures. They model the key features of bankruptcy proceedings through the automatic stay provision – suspension of dividends upon entering bankruptcy (Chapter 11) – and the imposition of the APR upon liquidation (Chapter 7). To facilitate the analysis, they follow the EBIT process of Goldstein et al. (2001)19 and assume the firm issues a single consol bond. They model the critical characteristics of U.S. bankruptcy proceedings: the grace period, automatic stay, debt forgiveness, absolute priority, and the transfer of control rights from equity to debt holders. Some prior models, such as the Francois and Morellec (2004) model, also make a distinction between default and liquidation. However, Broadie et al. (2007) argue that in that model the firm cannot be liquidated during the grace period even if the unlevered firm value falls very low; as a result, limited liability could be violated. Therefore, under the setup of Broadie, Chernov, and Sundaresan, liquidation can happen when the firm value either
19 Note that they assume the same process for EBIT before and after default. Therefore, their framework models financial rather than economic distress, since bankruptcy by itself does not cause poor performance. The work of Goldstein et al. (2001) does not need this assumption, since there is no difference between default (bankruptcy) and liquidation under their setup.
reaches the liquidation barrier or stays under the bankruptcy barrier for longer than the grace period. In addition, they explicitly model the automatic stay provision by recording the accumulated unpaid coupons plus interest in arrears, denoted A_t. When a firm recovers from Chapter 11, the debt holders forgive a fraction 1 − θ, 0 ≤ θ ≤ 1, of the arrears and receive an amount θA_t. Moreover, the firm does not pay dividends to shareholders while it is in the default state; the entire EBIT while in default is accumulated in a separate account S_t. If S_t is not sufficient to repay θA_t, equity holders must raise the remainder at the cost of diluting equity. If S_t > θA_t, the leftover is distributed to shareholders.20 In this paper, they focus on the issues of bankruptcy proceedings and the optimal choice of these two boundaries driven by different objectives.21 They show that the first-best outcome – maximization of total firm value ex ante upon filing for Chapter 11 – differs from the equity-value-maximization outcome. They also show that the first-best outcome can be restored in large measure by giving creditors either the control right to declare Chapter 11 or the right to liquidate the firm once it is taken into Chapter 11 by the equity holders. This serves as a threat from debtholders that prevents equity holders from filing for Chapter 11 too soon to obtain debt relief. Finally, they find that, on average, firms are more likely to default and less likely to liquidate relative to the benchmark model of Leland (1994).
60.4.3.4 Optimal Debt Maturity – Ju and Ou-Yang (2006) In the work by Leland and Toft (1996), debt maturity is shown to be crucial to the leverage ratio and credit spreads. In practice, the optimal capital structure and optimal maturity structure are interdependent decisions. However, in Leland and Toft (1996), debt maturity is exogenously specified. In addition, Goldstein et al. (2001) show that the optimal capital structure is very sensitive to the input level of the interest rate. Therefore, Ju and Ou-Yang (2006) propose a dynamic model in which the optimal structure and an optimal debt maturity are jointly determined in a Vasicek (1977) meanreverting interest rate process.22 20
Note that in order to accommodate the complex feature of bankruptcy proceedings, numerical approach is necessary to solve the model. The detailed binomial lattice methodology is developed by Broadie and Kaya (2007). 21 To determine if a distressed firm would choose the reorganization option under nearly perfect market assumption, they first ignore taxes to emphasize the impact of bankruptcy and return to this issue later in their paper. 22 We should note that the default boundary is exogenously specified in their model for tractability. The design of recapitalization of the firm’s
In their model, the optimal debt maturity is determined by the tradeoff between the gains from dynamically adjusting the debt amount and the restructuring costs of adjusting it. In other words, the tax rate and the transaction costs are the two most important parameters in determining the optimal maturity. They find that a higher transaction cost yields a longer debt maturity, since it is more expensive to rebalance a firm's capital structure. On the other hand, a higher tax rate yields a shorter optimal debt maturity, because it is more valuable to recapitalize the capital structure. They also find that the initial level and the long-run mean of the interest rate process are key variables in determining both the optimal capital structure and the optimal maturity structure. When the interest rate is higher (lower), coupon and principal are higher (lower). The coupon and principal are determined in such a way that the market value of debt is independent of the spot interest rate level. This is in sharp contrast to the fact that the level of the spot interest rate is a key parameter in constant-interest-rate models. Finally, the volatility of the interest rate process and the correlation between the interest rate process and the firm asset value process play important roles in determining the debt maturity structure.
60.5 Application and Empirical Evidence of Capital Structure Models

Capital structure models are intended to examine the capital structure theories quantitatively. In capital structure models, the corporate debt value and the capital structure are interlinked variables; therefore, the valuation of corporate securities and financial decisions are jointly determined. Through the structural approach, researchers can conduct a detailed analysis of the behavior of bond prices and optimal leverage ratios as financial variables (such as taxes, interest rates, payout rates, or bankruptcy costs) change. While theoretically elegant, capital structure models must be able to explain observed capital structures and bond prices in practice. As a result, the most commonly used measures of a capital structure model's performance are its predicted leverage ratio and yield spreads. Most capital structure models illustrate their performance with numerical examples using historical averages as model input parameters. In general, the early capital structure models predicted optimal leverage ratios that were too high. The recent models, such as those incorporating dynamic capital structure settings, have predicted optimal leverage more in line with the
observed average under reasonable bankruptcy costs. Nevertheless, capital structure models (as well as corporate bond pricing models using the structural approach) still predict credit spreads that are too low, especially for short-term debt. There are only a few empirical studies of models employing the structural approach. Among them, a large portion of the empirical tests of structural models focus on the risky debt pricing performance of bond pricing models, such as Longstaff and Schwartz (1995). The empirical performance in pricing risky debt is generally unsatisfactory (see Wei and Guo 1997; Anderson and Sundaresan 2000; Delianedis and Geske 2001; Huang and Huang 2003, among others). Empirical investigations of capital structure models are even rarer due to their complexity. The most well-known study including capital structure models is by Eom et al. (2004), who carry out an empirical analysis of five structural models: Merton (1974), Geske (1977), Longstaff and Schwartz (1995), Leland and Toft (1996), and Collin-Dufresne and Goldstein (2001). They test these models with bond data for firms with simple capital structures on the last trading day of each December from 1986 to 1997. They calibrate the models using the book value of total liabilities on the balance sheet as the default boundary and calculate the corresponding asset value as the sum of the market value of equity and the book value of total debt. They then estimate the asset return volatility using bond-implied volatility as well as six equity return volatilities measured over different time horizons before and after the bond price observation. In contrast to previous studies suggesting that structural models generally predict yield spreads that are too low, the results of Eom et al. (2004) show a more complicated picture, although all of these models make significant errors in predicting credit spreads. The Merton (1974) and Geske (1977) models underestimate the spreads, while the Longstaff and Schwartz (1995), Leland and Toft (1996), and Collin-Dufresne and Goldstein (2001) models all overestimate the spreads.

Huang and Huang (2003) use several structural models to predict yield spreads, including the Longstaff and Schwartz (1995) model, the strategic model, the endogenous-default model, and the stationary leverage model, as well as two models they propose: one with a time-varying asset risk premium and one with a double exponential jump-diffusion firm value process. They calibrate the inputs, including asset volatility, for each model so that target variables – including leverage, the equity premium, the recovery rate, and the cumulative default probability at a single time horizon – are matched. They show that the models make quite similar predictions for yield spreads. In addition, the observed yield spreads relative to Treasury bonds are considerably greater than the predicted spreads, especially for highly rated debt. Hence, they conclude that additional factors such as liquidity and taxes must be important in explaining market yield spreads.
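The balance-sheet inputs described above for the Eom et al. (2004) calibration amount to simple arithmetic, sketched below with made-up numbers for one hypothetical firm.

```python
# Sketch of the asset-value and leverage inputs used in calibrations like
# Eom et al. (2004): asset value = market value of equity + book value of
# debt, default boundary = book value of total liabilities. The firm's
# numbers below are made up for illustration.

market_equity = 420.0        # market capitalization
book_debt = 310.0            # book value of total debt
total_liabilities = 350.0    # book value of total liabilities (default boundary)

asset_value = market_equity + book_debt
leverage = total_liabilities / asset_value

print(f"asset value = {asset_value:.1f}, leverage = {leverage:.1%}")
```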
949
By adopting a different estimation method, Ericsson et al. (2006) use the MLE approach proposed by Duan (1994) to perform an empirical test on both CDS spreads and bond spreads, including three structural models – Leland (1994), Leland and Toft (1996), and Fan and Sundaresan (2000). In contrast to previous evidence from corporate bond data, CDS premia are not systematically underestimated. Also, as expected, bond spreads are systematically underestimated, which is consistent with the fact they are driven by significant non-default factors. In addition, they also conduct regression analysis on residuals against default and non-default proxies. Little evidence is found for any default risk components in either CDS or bond residuals, while strong evidence, in particular, an illiquidity premium, is related to the bond residuals. They conclude that structural models are able to capture the credit risk price in the markets but they fail to price corporate bonds adequately due to omitted risks. Also using the MLE approach, Ericsson and Reneby (2004b) estimate yield spreads between 1 and 50 months out of sample by an extended version of the Leland (1994) model, which allows for violation of the absolute priority rule and future debt issues. In addition to the stock price series, they also include the bond price and dividend information in estimating their model. The bond samples consist of 141 U.S. corporate issues and a total of 5,594 dealer quotes. Their empirical results show that, for the 1 month-ahead prediction, a mean error is merely 2 basis points. This is similar to the fitting error of those reduced-form models in Duffee (1999). Therefore, they conclude that the inferior performance of structural models may result from the estimation approaches used in the existing empirical studies. Finally, another potentially important application of capital structure model is to predict the credit quality of a corporate security. For example, Leland (2004) examines the default probabilities predicted by the Longstaff and Schwartz (1995) model with the exogenous default boundary, and the Leland and Toft (1996) model with endogenous default boundary. As Leland stated We focus on default probabilities rather than credit spreads because (i) they are not affected by additional factors such as liquidity, tax differences, and recovery rates; and (ii) prediction of the relative likelihood of default is often stated as the objective of bond ratings.
Leland uses Moody’s corporate bond default data from 1970 to 2000 in his study and follows a similar calibration approach by Huang and Huang (2003). Rather than matching the observed default frequencies, Leland instead chooses common inputs across models to observe how well they match observed default statistics. The empirical results show that when costs and recovery rates are matched, the exogenous and the endogenous default boundary models fit observed default frequencies equally. The models predict longer-term default frequencies quite accurately, while
950
shorter-term default frequencies tend to be underestimated. Thus, he suggests that a jump component should be included in asset value dynamics.23 In summary, the empirical performance of capital structure models in pricing risky debt is not satisfactory. Even with the incorporation of jump process, the illiquid corporate bond market still hinders structural models from accurately pricing risky debt. However, predicting the credit quality of a corporate security could be a good application of capital structure models because they are less affected by the micro structure issues. Since recent capital structure models put numerous efforts on the event of bankruptcy, prediction of default probabilities or default events shall be potentially important applications.24
60.6 Conclusion In this paper, we review the most important and representative capital structure models. The capital structure models incorporate contingent claim valuation theory to quantitatively analyze prevailing determinants of capital structure in corporate finance literature. In capital structure models, the valuation of corporate securities and financial decisions are jointly determined. Most of the capital structure models provide closed-form expressions of corporate debt as well as the endogenously determined bankruptcy level, which are explicitly linked to taxes, firm risk, bankruptcy costs, risk-free interest rate, payout rates, and other important variables. The behavior of how debt values (and therefore yield spreads) and optimal leverage ratios change with these variables can thus be investigated in detail. While theoretically elegant, capital structure models do not perform well empirically in risky corporate bond pricing. Researchers have been attempting to resolve the yield spread underestimates by introducing jumps and liquidity premium. On the other hand, since recent capital structure models put numerous efforts on the event of bankruptcy, we suggest that prediction of default probabilities or default events shall be potentially important applications. Furthermore, some researchers argue that the past poor performance of capital structure models may come from the estimation approaches historically used in the empirical studies (Ericsson and Reneby 2005, Ericsson, Reneby, and Wang 2006). And we have seen some innovative estimation methods for solving
23 We should note that Zhou (2001) show that jumps can explain why yield spreads remain strictly positive even as maturity approaches zero. 24 In default prediction, some researchers have conducted empirical investigation on a company basis analysis such as Chen et al. (2006) and Chen et al. (2008), though the models they tested are mainly bond pricing models.
S.-S. Chen et al.
the estimation problem in models employing structural approach (see Bruche 2005) and the reference therein). Thus, we hope to see more rigorous empirical studies for capital structure models in the future.
References Acharya, V. and J. Carpenter. 2002. “Corporate bond valuation and hedging with stochastic interest rates and endogenous bankruptcy.” Review of Financial Studies 15, 1355–1383. Anderson, R. and S. Sundaresan. 1996. “Design and valuation of debt contract.” Review of Financial Studies 9, 37–68. Anderson, R. and S. Sundaresan. 2000. “A comparative study of structural models of corporate bond yields: an exploratory investigation.” Journal of Banking and Finance 24, 255–269. Black, F. and J. C. Cox. 1976. “Valuing corporate securities: some effects of bond indenture provisions.” Journal of Finance 31, 351–367. Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–654. Brennan, M. and E. Schwartz. 1978. “Corporate income taxes, valuation, and the problem of optimal capital structure.” Journal of Business 54, 103–114. Briys, E. and F. de Varenne. 1997. “Valuing risky fixed rate debt: an extension.” Journal of Financial and Quantitative Analysis 32, 239–248. Broadie, M. and O. Kaya. 2007. “A binomial lattice method for pricing corporate debt and modeling chapter 11 proceedings.” Journal of Financial and Quantitative Analysis 42, 279–312. Broadie, M., M. Chernov, and S. Sundaresan. 2007. “Optimal debt and equity values in the presence of chapter 10 and chapter 14.” Journal of Finance 62, 1341–1377. Bruche, M. 2005. Estimating structural bond pricing models via simulated maximum likelihood, Working paper, London School of Economics. Chen, R., S. Hu, and G. Pan. 2006. Default prediction of various structural models, Working paper, Rutgers University, National Taiwan University, and National Ping-Tung University of Sciences and Technologies. Chen, R., C. Lee, and H. Lee. 2008. Default prediction of alternative structural credit risk models and implications of default barriers, Working paper, Rutgers University. Childs, P., D. Mauer and S. Ott. 2005. “Interaction of corporate financing and investment Decisions: the effects of agency conflict.” Journal of Financial Economics 76, 667–690. Collin-Dufresne, P. and R. S. Goldstein. 2001. “Do credit spreads reflect stationary leverage ratios?” Journal of Finance 56, 1929–1957. Dao, B. and M. Jeanblanc. 2006. Double exponential jump diffusion process: a structural model of endogenous default barrier with rollover debt structure, Working paper, The Université Paris Dauphine and the Université d’Évry. Delianedis, G. and R. Geske. 2001. The components of corporate credit spreads: default, recovery, tax, jumps, liquidity, and market factors, Working paper, UCLA. Duan, J. C. 1994. “Maximum likelihood estimation using pricing data of the derivative contract.” Mathematical Finance 4, 155–167. Duffee, G. 1999. “Estimating the price of default risks.” Review of Financial Studies 12, 197–226. Durand, D. 1952. “Costs of debt and equity funds for business trends and problems of measurement,” Conference on Research in Business Finance, National Bureau of Economic Research, New York, pp. 215–247.
60 Alternative Methods to Determine Optimal Capital Structure: Theory and Application Eberhart, A., W. Moore, and R. Roenfeldt. 1990. “Security pricing and deviations from the absolute priority rule in bankruptcy proceedings.” Journal of Finance, December, 1457–1469. Eberhart, A. C. and L. A. Weiss. 1998. “The importance of deviations from the absolute priority rule in chapter 11 bankruptcy proceedings.” Financial Management 27, 106–110. Elizalde, A. 2005. Credit risk models II: structural models, Working paper, CEMFI and UPNA. Eom, Y. H., J. Helwege, and J. Huang. 2004. “Structural models of corporate bond pricing: an empirical analysis.” Review of Financial Studies 17, 499–544. Ericsson, J. and J. Reneby. 2004a. “A note on contingent claims pricing with non-traded assets.” Finance Letters 2(3). Ericsson, J. and J. Reneby. 2004b. “An empirical study of structural credit risk models using stock and bond prices.” Journal of Fixed Income 13, 38–49. Ericsson, J. and J. Reneby. 2005. “Estimating structural bond pricing models.” Journal of Business 78, 707–735. Ericsson, J., J. Reneby, and H. Wang. 2006. Can structural models price default risk? Evidence from bond and credit derivative markets, Working paper, McGill University and Stockholm School of Economics. Fan, H. and S. Sundaresan. 2000. “Debt valuation, renegotiations and optimal dividend policy.” Review of Financial Studies 13, 1057–1099. Fischer, E., R. Heinkel, and J. Zechner. 1989. “Dynamic capital structure choice: theory and tests.” Journal of Finance 44, 19–40. Francois, P. and E. Morellec. 2004. “Capital structure and asset prices: some effects of bankruptcy procedures.” Journal of Business 77, 387–411. Franks, J. and W. Torous. 1989. “An empirical investigation of U.S. firms in reorganization.” Journal of Finance 44, 747–767. Geske, R. 1977. “The valuation of corporate liabilities as compound options.” Journal of Financial and Quantitative Analysis 12, 541–552. Hackbarth, D. 2006. A real options model of debt, default, and investment, Working paper, Olin School of Business, Washington University. Harris, M. and A. Raviv. 1991. “The theory of capital structure.” The Journal of Finance 46(1), 297–355. Harrison, J. M. 1985. Brownian motion and stochastic flow systems, Wiley, New York. Huang, J. and M. Huang. 2003. How much the corporate-treasury yield spread is due to credit risk? Working paper, Penn State University and Stanford University. Huang, J., N. Ju, and H. Ou-Yang. 2003. A model of optimal capital structure with stochastic interest rates, Working paper, New York University. Jensen, M. and W. Meckling. 1976. “Theory of the firm: managerial behavior, agency costs and capital structure.” Journal of Financial Economics 3, 305–360. Ju, N. and H. Ou-Yang. 2006. “Capital structure, debt maturity, and stochastic interest rates.” Journal of Business 79, 2469–2502. Ju, N., R. Parrino, A. Poteshman, and M. Weisbach. 2004. “Horses and rabbits? Trade-off theory and optimal capital structure.” Journal of Financial and Quantitative Analysis 40, 259–281.
951
Kane, A., A. Marcus, and R. MacDonald. 1984. “How big is the tax advantage to debt.” Journal of Finance 39, 841–852. Kane, A., A. Marcus, and R. MacDonald. 1985. “Debt policy and the rate of return premium to leverage.” Journal of Financial and Quantitative Analysis 20, 479–499. Kim, I., K. Ramaswamy, and S. Sundaresan. 1993. “Does default risk in coupons affect the valuation of corporate bonds? A contingent claims model.” Financial Management 22, 117–131. Lee, C. F., John C. Lee, and Alice C. Lee. 2009, Financial analysis, planning and forecasting, 2nd ed. Hackensack, NJ, World Scientific Publishers. Leland, H. E. 1994. “Corporate debt value, bond covenants, and optimal capital structure.” Journal of Finance 49, 1213–1252. Leland, H. E. 1998. “Agency cost, risk management, and capital structure.” Journal of Finance 53, 1213–1243. Leland, H. E., 2004. “Prediction of default probabilities in structural models of debt.” Journal of Investment Management 2(2), 5–20. Leland, H. E. and K. B. Toft. 1996. “Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads.” Journal of Finance 51, 987–1019. Longstaff, F. and E. Schwartz. 1995. “A simple approach to valuing risky fixed and floating rate debt and determining swaps spread.” Journal of Finance 50, 789–819. Mauer, D. and A. Triantis. 1994. “Interactions of corporate financing and investment decisions: a dynamic framework.” Journal of Finance 49, 1253–1277. Mella-Barral, P. 1999. “The dynamics of default and debt reorganization.” Review of Financial Studies 12, 535–578. Mella-Barral, P. and W. Perraudin. 1997. “Strategic debt service.” Journal of Finance 52, 531–566. Merton, R. C. 1974. “On the pricing of corporate debt: the risk structure of interest rates.” Journal of Finance 28, 449–470. Modigliani, F. and M. Miller. 1958. “Corporate income taxes and the cost of capital.” American Economic Review 53, 433–443. Myers, S. C. 1977. “Determinants of corporate borrowing.” Journal of Financial Economics 5, 147–175. Myers, S. C. 1984. “The capital structure puzzle.” Journal of Finance 39, 575–592. Rubinstein, M. and E. Reiner. 1991. “Breaking down the barriers.” Risk Magazine 4, 28–35. Taurén, M. 1999. A model of corporate bond prices with dynamic capital structure, Working paper, Indiana University. Vasicek, O. 1977. “An equilibrium characterization of the term structure.” Journal of Financial Economics 5, 177–188. Wei, D. and D. Guo. 1997. “Pricing risky debt: an empirical comparison of the Longstaff and Schwartz and Merton models.” Journal of Fixed Income 7, 8–28. Weiss, L. 1990. “Bankruptcy costs and violation of claims priority.” Journal of Financial Economics 27, 285–314. Zhou, C. 2001. “The term structure of credit spreads with jump risk.” Journal of Banking and Finance 25, 2015–2040.
Chapter 61
Actuarial Mathematics and Its Applications in Quantitative Finance Cho-Jieh Chen
Abstract We introduce actuarial mathematics and its applications in quantitative finance. First, we introduce the traditional actuarial interest functions and use them to price different types of insurance contracts and annuities. Based on the equivalence principle, risk premiums for different payment schemes are calculated. After premium payments and the promised benefits are determined, actuarial reserves are calculated and set aside to cushion the expected losses. Using the similar method, we use actuarial mathematics to price the risky bond, the credit default swap, and the default digital swap, while the interest structure is not flat and the first passage time is replaced by the time of default instead of the future life time. Keywords Actuarial risks r Interest r Life insurance r Annuity r Risk premium rActuarial reserve rRisky bond price r Credit spread r Credit default swap r Default digital swap
61.1 Introduction In this article, we introduce actuarial mathematics and its applications in quantitative finance. Actuarial risks are the risks that insurance companies face. The famous actuary C.L. Trowbridge coined the term “C-1” risk to denote the risk of asset defaults and decreases in market values of investments (see Panjer et al. 1998, Sect. 4.3). We can generally use the term “C-1” risk to denote the market risk, credit risk, and liquidation risk of assets. Trowbridge used the term “C-2” risk to denote the mortality and morbidity risks. We can use the term “C-2” risk to denote the risk of pricing insufficiency to cover underlying liabilities. Trowbridge used the term “C-3” risk to denote the risk of interest rate fluctuation. Subsequently, the term “C-4” risk is used to denote
C.-J. Chen () University of Alberta, Edmonton, Alberta, Canada e-mail: [email protected]
the operational risk, such as the risk of loss resulting from inadequate or failed internal processes, people and systems, or from external events. The accounting, managerial, social, and regulatory risks are C-4 risks.
61.2 Actuarial Discount and Accumulation Functions Let the bank account process be denoted by fB.t/I t 0g, where B.t/ represents the time-t accumulated value of one dollar deposited at time zero in a riskless account with B.0/ D 1. When B.t/ is a linear increasing function of the time t, then B.t/ D 1 C it; t 0: The interest accrued of this pattern is called simple interest. When lnŒB.t/ is proportional to the time t, then B.t/ can be written as B.t/ D .1 C i /t : The interest accrued of this pattern is called compound interest. The effective rate of interest during the n-th period is defined as the ratio of the capital increment to the investment at the beginning of the period as in D
B.n/ B.n 1/ ; for n 1; n 2 N: B.n 1/
The effective rate of discount can be similarly defined as the ratio of the capital increment to the capital at the end of the period as dn D
B.n/ B.n 1/ ; for n 1; n 2 N: B.n/
When the interest rate is a fixed constant, the subscripts of the interest rate and discount rate can be dropped. The discount factor is defined by 1 vD : 1Ci
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_61,
953
954
C.-J. Chen
The relationship among i , v, and d can be expressed by the following equations. The first,
The relationship between the interest functions can be given by
iv D d
1 C i D e ı D v1 D .1 d /1
m
n i .m/ d .n/ D 1C D 1 : m n
(61.1)
Equation (61.1) can be interpreted as the interest i of one dollar for one term, if payable at the beginning of a term, equals i times the discount factor v. This amount is called the discount rate d . The second equation,
As m approaches infinity, lim
m!1
1C
i .m/ m
m
D lim e i
.m/
m!1
D
ı
1 i D1 D 1 v: d D iv D 1Ci 1Ci
e . It can be shown that (61.2)
lim i .m/ D ı;
m!1
Equation (61.2) can be interpreted as the discount rate d is the difference between one dollar payable at time zero and time-zero value of one dollar payable at time one. When we borrow money from a bank, we are usually given the annual percentage rate (APR) instead of the effective rate. This interest is usually paid more frequently than once per period. This interest rate is called the nominal rate of interest and is denoted by i .m/ if the interest is payable m times per period. The nominal rate i .m/ is divided by m then is compounded m times for a period as
m i .m/ 1Ci D 1C : m When a credit card company charges an APR of 18% convertible monthly, the effective rate of interest been charged 12 1 D 19:56%, which is higher than the is i D 1 C 0:18 12 rate that we are told. The effective rate of discount can be defined similarly as
m d .m/ 1d D 1 : m The rate of the instantaneous interest movement at a specific time t is defined by ıt D
B 0 .t/ d ln B .t/ D ; t 0; B .t/ dt
where ı, is called the force of interest in actuarial science and is called the short rate in quantitative finance. The bank account process can be written as Z
t
B.t/ D exp
ıs ds :
0
If the force of interest is a constant, then the bank account process equals B.t/ D e ıt :
and lim d .m/ D ı:
m!1
An annuity is consisted of a series of payments made at equal time intervals. The present value of an annuity is denoted by a. The present value of an annuity paying $1 at the beginning of each year for n years with interest rate i has a standardized actuarial notation as 2 n1 aR nji D N D 1CvCv C Cv
1 vn 1 vn D : 1v d
(61.3)
This annuity is called an n-year annuity-due in actuarial science. An annuity-due has payments due at the beginning of each term. The accumulated value of the n-year annuitydue is n n1 C .1 C i /n2 C C .1 C i /1 sRnji N D .1 C i / C .1 C i /
D
.1 C i /n 1 D .1 C i /n aR nji N : d
It equals the present value of the annuity in Equation (61.3) times the accumulation factor .1 C i /n . The present value of an annuity paying $1 at the end of each year for n years with interest rate i has a standardized actuarial notation as 2 3 n anji N D vCv Cv CCv D
1 vn v.1 vn / D : 1v i (61.4)
This annuity is called an n-year annuity-immediate. The accumulated value of the annuity-immediate is n1 snji C .1 C i /n2 C .1 C i /n3 C C 1 N D .1 C i /
D
.1 C i /n 1 D .1 C i /n anji N : i
Comparing Equation (61.3) with Equation (61.4), we can find that both present values share the same numerator but
61 Actuarial Mathematics and Its Applications in Quantitative Finance
have different denominators. The n-year annuity-due has d in the denominator while the n-year annuity-immediate has i in the denominator. An annuity sometimes pays more frequently than one time per year. Let a $1 annual nominal payment be payable at the beginning of each m-th of the year with each payment equal 1/m, where m 1 and m is an integer (m D 12, 4, and 2 representing monthly, quarterly, and semi-annually payments, respectively). The number of payment periods in years is n. The present value of this annuity-due is .m/
D aR nji N
1 1 2 1 1 C v m C v m C C vn m m
955
In general, if an annuity provides payment of g.t/ at time t for a period of n with non-constant force of interest, the present value of the annuity is Zn aD
g.t/ dt D B.t/
0
Zn
g.t/e
Rt
ıs ds
0
dt:
0
For more complicated actuarial interest functions, please consult Kellison (1991).
61.3 Actuarial Mathematics of Insurance
1
D
1 v m nm 1 vn 1 1 vn h i
D D : 1 1 m d .m/ 1 vm m 1 .1 C i / m (61.5)
This m-th payment n-year annuity-due in Equation (61.5) is similar to the n-year annuity-due in Equation (61.3) except the denominator of the present value is replaced by d .m/ . The accumulated value of this annuity-due is .m/
.m/
D .1 C i /n aR nji D sRnji N N
.1 C i / 1 : d .m/ n
If this annuity is payable at the end of each m-th of a year, its actuarial present value is
.m/
anji N
1 1
v m 1 v m nm 1 2 1 1 v m C v m C C vn D
D 1 m m 1 vm
A life insurance policy provides financial security to offset the financial impact caused by unexpected death. Payments of a life insurance policy are triggered by the termination of the life of the insured. Let a newborn’s age-at-death be denoted by X . The cumulative distribution function (c.d.f.) is FX .x/ D PrfX xg; x 0; which is the probability that the newborn will die before age x. The probability that the newborn will survive age x is called the survival function of X and is denoted by SX .x/ D PrfX > xg D 1 FX .x/; x 0: If a newborn survives age x, we use (x) to denote a lifeage-x. The future life time of (x) is denoted by T .x/. The probability that (x) will die by age x C t is t qx
1
D
v m .1 vn / 1 vn 1 vn iD h D .m/ : 1 1 i d .m/ =v m m 1 .1 C i / m
D PrfT .x/ tg D Prfx < X tjX > xg D
SX .x/ SX .t/ SX .x/
D
FX .t/ FX .x/ : 1 FX .x/
The accumulated value of this annuity-immediate is .m/ snji N
D .1 C
.m/ i /n anji N
.1 C i /n 1 D : i .m/
When m approaches infinity, the annuity provides continuous payments. The present value of a continuous annuity is .m/
anj D lim anji D lim N m!1
1v 1v : D .m/ d ı n
m!1
vm vnCm : ı
t px
D PrfT .x/ > tg D 1 t qx :
n
If an annuity does not provide any payments for the first m years then provides payments for a period of n years after the first m years, this m-year deferred n-year continuous annuity has a present value of mjaN njN D vm aN nj D
The probability that (x) will survive age x C t is
Theprobabilitythat(x)willdiebetweenagexCt andxCt Cuis t ju px
D Prft < T .x/ t C ug D t Cu qx t qx D t px t Cu px D t px u qxCt :
(61.6)
Equation (61.6) can be interpreted as the probability that (x) will die between age x Ct and x Ct Cu equal the probability that (x) will survive age xCt times the probability that (xCt) will die before or equal to age x C t C u.
956
C.-J. Chen
The force of mortality is defined by Prfx < X x C xjX > xg
x!0
x fX .x/ x=SX .x/ fX .x/ d ln SX .x/ D lim D D :
x!0
x SX .x/ dx
.x/ D lim
The benefit of an insurance contract is a function of the future life of an insured. Let the benefit of an insurance contract be denoted by bt when T .x/ D t; t 0, for an insured (x). Let the present value of the benefit be denoted by Z and Z D vt bt if the interest is assumed to be constant. The expected present value of the insurance is Z1
Therefore,
EŒZ D ZxCt
.s/ds D ln x
SX .x C t/ D ln t px ; SX .x/
and t px
De
xCt R
.s/ds
x
:
The c.d.f. of T .x/ is
0
d @1 e d t qx D fT .x/ .t/ D dt D e
xCt R
xCt R
1
x
.s/ds
Zn e
A
2
h i A1 xWnj D EŒZ 2 D E v2T .x/ IfT .x/ ng Zn
Z1 t t px .x C t/dt
0
where the second moment of the present value equals the first moment calculated using the force of interest of 2ı. The variance of the present value of the insurance is
0
D
2 VarŒZ D 2 A1 xWnj A1 xWnj :
Z1 ST .x/ .t/dt D
0
t px dt: 0
It is also called the complete-expectation-of-life. Let bT .x/c be the integer part of T .x/ [or the largest integer less than or equal to T .x/] so T .x/bT .x/c represents the fractional part of T .x/. The expected value of bT .x/c is called the curtateexpectation-of-life and equals ex D E fbT .x/cg D
1 X
k k px qxCk D
kD0
D k.k px / j1 0
1 X kD0
1 X
For a term insurance contract, the “1” on the top of x represents that the $1 benefit is payable only on the death of (x). If the “1” is on top of nj, it means that the policy provides survival benefit of $1 at the insured’s age x C n if the insured (x/ survives age x C n. This insurance is called an n-year pure endowment insurance and is denoted by Z1
k .k px /
1 AxWnj
1 X kD1
e n fT .x/ .t/dt
D EŒv IfT .x/ > ng D n
n
kD0
.kC1 px / D
e .2ı/t fT .x/ .t/dt D A1 xWnj@2ı0
D
ı
Z1
vt t p x .x C t/dt:
The second moment of the present value of the payment is
Œ.x C t/ D t px .x C t/:
ex D E ŒT .x/ D
fT .x/ .t/dt D 0
The expected value of T .x/ is denoted by e x and equals ı
ıt
0
dt
x
0
Zn D
.s/ds
zt px u.x C t/dt:
An n-year term life insurance pays death benefit of $1 at the moment of death if the death is at or before the age x C n of (x). The present value of the payment is Z D vT .x/ IfT .x/ ng where I fT .x/ ng is an indicator function taking value of 1 if T .x/ n and taking value of 0 otherwise. The expected present value of the payment is
The probability density function (p.d.f.) is
zt fT .x/ .t/dt D
A1 xWnj D EŒZ D E vT .x/ IfT .x/ ng
FT .x/ .t/ D PrfT .x/ tg D t qx :
0
Z1
Z1 k px :
D
vn t p x .x C t/dt D vn n px : n
61 Actuarial Mathematics and Its Applications in Quantitative Finance
The variance of the present value of the payment is Z1 v2n t p x .x C t/dt Œvn n px 2
VarŒZ D n
2 1 1 D v2n n px v2n n px2 D 2 AxWnj AxWnj :
957
When the benefit is payable at the end of the m-th of the year of death, the expected present value of a whole life insurance is dmT.x/e .m/ Ax D EŒZ D E v m
D
An n-year endowment insurance provides both death and survival benefit of $1. The payment of the insurance is bt D 1. The present value of the insurance is Z D vT .x/^n . The expected present value of an n-year endowment insurance is
kC 1 m1 X X Z
j C1 m
vkC
AxWnj D
1 C AxWnj :
The whole-life insurance provides death benefit over the whole life span. The expected present value of the whole life insurance is Ax D
1 AxW1j
Z1 D EŒv
T .x/
e ıt fT .x/ .t/dt
D 0
Z1 D
vt t p x .x C t/dt:
D
1 m1 X X
vkC
j C1 m
If the insurance benefit is payable at the end of the year of death instead of the moment of death, the actuarial symbol is different. The expected present value of the benefit for an n-year term insurance providing death benefit of $1 at the end of the year death is h i A1xWnj D EŒZ D E vdT .x/e IfT .x/ ng
kD0
where the ceiling of T .x/ is defined as the smallest integer that is greater than or equal to T .x/. For a whole life insurance with benefit payable the end of the year of death, the expected actuarial present value is kC1 1 Z i X h dT .x/e D vkC1 fT .x/ .t/dt Ax D E v
D
1 X kD0
vkC1 k px qxCk :
px 1 qxCkC j : m
m
If we assume a uniform distribution of deaths (U.D.D.) in each year of age, the survival function follows SX .n C s/ D .1 s/SX .n/ C sSX .n C 1/; where 0 s < 1 and n is an integer. Let the fractional part of T .x/ be denoted by S D T .x/ bT .x/c. Under the U.D.D. assumption, S follows a uniform Œ0; 1/ distribution. The life expectancies follow
Under the U.D.D. assumption, the expected actuarial present values of the whole life insurance follow i h Ax D E vT .x/ D E vdT .x/e.1S / i h D E vdT .x/e E .1 C i /1S Z1 D Ax
kC1 n1 Z n1 X X vkC1 fT .x/ .t/dt D vkC1 k px qxCk ; D
kD0t Dk
j
kC m
1 ı e x D EŒT D E ŒbT c C S D E ŒbT c C E ŒS D ex C : 2
0
kD0 k
fT .x/ .t/dt
kD0 j D0 j t DkC m
kD0 j D0 1 AxWnj
j C1 m
e ıu du D Ax
i eı 1 D Ax ; ı ı
0
where U D 1 S also has a uniform distribution over the unit interval and dmT.x/e dmT.x/e .m/ dT .x/edT .x/eC m m Ax D E v DE v h i dmT.x/e D E vdT .x/e E .1 C i /dT .x/e m Z1 .1 C i /dt e
D Ax
dmte m
dt
0
D Ax
m1 X j D0
.j C1/=m Z
.1 C i /dt e j=m
dmte m
dt
958
C.-J. Chen
8 ˆ Z2=m < Z1=m dmte dt e dmte m dt C D Ax .1 C i / .1 C i /dt e m dt ˆ : 0
1=m
m=m Z
.1 C i /dt e
CC
dmte m
dt
.m1/=m
1=m
m=m Z
CC
m
.1 C i /1 m dt
.m1/=m
9 > = > ;
D
i .m/
at j ST .t/j1 0
C
Z1 d a
tj
dt
ST .t/dt
0
Z1
> ;
1 2 1 n D Ax .1 C i /1 m C .1 C i /1 m m o m C C .1 C i /1 m .1 C i /m=m 1 D Ax m .1 C i /1=m 1
i
ax D
9 > =
8 ˆ Z2=m < Z1=m 1 2 1 m D Ax .1 C i / dt C .1 C i /1 m dt ˆ : 0
Using integration by parts, we can simply it as
D
Z1 vt ST .t/dt D
0
Z1 vt t px dt D
0
0
where t Ex D vt t px . If an annuity provides a maximum payment period of n years, it is called an n-year temporarily annuity. The present value of the payment is Z D aT .x/^nj ; where T .x/^n D min.T .x/; n/. The expected present value of the payment is
axWnj
h i Zn D E aT .x/^nj D at j t px .x C t/dt C anjn px : 0
Applying integration by parts, we obtain Zn axWnj D
a t j t px jn0
C
Ax :
Zn v t px dt C anj n px D
Z1 Ax D
e ıt e t dt D
: Cı
0
0
If an annuity provides payments as long as the insured is still alive with guaranteed payments for n years, it is called an n-year certain and life annuity. The present value of the payment is Z D aT .x/_nj , where T .x/ _ n D max.T .x/; n/. The expected present value of the payment is Z D aT .x/_nj , where i
h
For more complicated insurance products, please consult Bowers et al. (1997).
vt t px dt:
t
0
If we assume the force of mortality is constant, then the expected present value of a whole life insurance is
t Ex dt;
Zn
axWnj D E aT .x/_nj D
a nj t px .x C t/dt 0
Z1 C
61.4 Actuarial Mathematics of Annuity A whole life annuity provides benefits while the insured is alive. The present value of the payment is aT .x/j . The expected present value of the annuity is Z1 i Z1 h ax D E aT .x/j D a t j fT .t/dt D at j t px .x C t/dt: 0
0
at j t px .x C t/dt n
Z1 D anj n qx at j
1 t px j n
C
vt t px dt n
Z1 D anj n qx C anj n px C
vt t px dt n
Z1 D anj C
vt t px dt: n
(61.7)
61 Actuarial Mathematics and Its Applications in Quantitative Finance
959 n i h X D aR nj aR nC1j n px C 1 C vk k px
An n-year deferred whole life annuity provides benefits if the insured is alive but without any payments for the first n years. The present value of the payment is Z D vn aT .x/nj IfT .x/ > ng. The expected present value is
kD1
D vn n px C v0 0 px C
Z1 nj ax
D
D
n
n1 X
vn asj nCs px .x C n C s/ds
If an annuity provides survival benefits at the end each year with a maximum payment period of n years, it is called an n-year temporarily life annuity-immediate. The present value of the benefits is Z D abT .x/c^nj . The expected present value is i h axWnj D E abT .x/c^nj
0
Z1 vn asj n px s pxCn .x C n C s/ds 0
Z1 D v n px
asj s pxCn .x C n C s/ds
n
0
D
D n Ex axCn :
(61.8)
The relationship between the n-year deferred whole life annuity in Equation (61.8) and the n-year certain and life annuity in Equation (61.7) is i h h i axWnj D E aT .x/_nj D E anj C aT .x/nj IfT .x/ > ng D anj C nj ax :
i h aR xWnj D E aR dT .x/e^nj D
n1 X
aR kC1j k px qxCk C aR nj n px :
kD0
By applying summation by parts ( n P
n P
g.x/ f .x/ D f .x/
f .x C 1/ g.x/, where f .x/ D f .x C
xD0
n1 X
akj k px qxCk C anj n px D
n X
vk k px :
kD1
If an annuity of $1 per year provides benefits of 1=m at each m-th of a year, it is called an m-thly payment n-year temporarily life annuity-due. The present value is Z D .m/ aR mT.x/ ˇˇ . The expected present value is d
e
m
^nˇˇ
D E aR dmT.x/e m
^nj
n1 m1 X X 1 j vkC m kC j px : D m m j D0 kD0
The relationship between an n-year temporarily life annuitydue and an n-year endowment insurance is aR xWnj
h
i
"
1 vdT .x/e^n D E aR dT .x/e^nj D E d 1 AxWnj 1 E vdT .x/e^n D : D d d
#
xD0
1/ f .x//, we obtain aR xWnj D
n1 X kD0
.m/ aR xWnj
If an annuity provides survival benefits at the beginning each year with a maximum payment period of n years, it is called an n-year temporarily life annuity-due. The present value of the benefits is Z D aR dT .x/e^nj . The expected present value is
g.x /jnC1 0
vk k px :
kD0
Z1
D
vk k px
kD1
vn at nj t px .x C t/dt
D
n X
aR kC1j .k px / C aR nj n px
61.5 Actuarial Premiums and Actuarial Reserves 61.5.1 Actuarial Premiums
kD0 n1 ˇn X ˇ D aR kC1j .k px /ˇ .kC1 px / aR kC1j C aR nj n px 0
D aR nC1j n px C 1 C
kD0 n1 X kD0
kC1 px v
kC1
C aR nj n px
In previous sections, we obtain expected present values of insurances and annuities. These are expected costs for an insurer. In return, an insurer receives risk premiums to cover these losses. In this section, we discuss how to use the equivalence principle to obtain premiums.
960
C.-J. Chen
Because an insurer receives premiums and pays claims, the present value of the loss of an insurer is L D present value of claims present value of premium incomes: By letting EŒL D 0, the expected claim is equal to the expected gain. This is called the equivalence principle. The premium obtained is called the pure premium or the benefit premium and is denoted by P . In practice, the insurer charges more than P in order to cover expenses, commissions, fees, and losses due to uncertainties. The extra charge is called the security loading. The gross premium, G, usually follows
Consider that an insured purchases an n-year term insurance whose $1 death benefit is payable at death, by paying premiums at the beginning of each year when the insured is still alive with a maximum payment period of h, h n. The premium is AN N N N / D xWnj : h P .AxWnj aR xWhjN For an n-year term insurance whose death benefit is payable at death purchased by continuous premium payments when the insured is still alive with a maximum payment period of n, the premium is PN .ANxWnjN / D
G D .1 C /P; where is called the relative security loading and P is called the security loading. Consider that an insured purchases a whole life insurance of $1 payable at the end of the year of death by paying premiums at the beginning of each year when the insured is still alive. The present value of the loss of the insurance company is L D vdT .x/e P aR dT .x/ej : Based on the equivalence principle, we have
0 D E.vdT .x/e / P E aR dT .x/ej D Ax P aR x :
Px D
L D vT .x/^n
:
P
maR dT .x/e^hj : m
The annual premium is
Ax : aR x
hP
Consider that an insured purchases an n-year term insurance whose death benefit of $1 is payable at the end of the year of death, by paying premiums at the beginning of each year when the insured is still alive with a maximum payment period of h, h n. The present value of the loss of the insurance company is
aN xWnjN
When the payment period and insurance period are the same, the subscript h in front of P can be neglected. In practice, premiums are very often paid by monthly payments. As men.m/ represents the expected tioned in the previous section, aR xWnj present value of an annuity of $1 per year providing benefits of 1/m at each m-th of a year for n years. Consider an n-year term insurance of $1 payable at death with continuous m-thly payments of Pm when the insured is still alive with a maximum payment period of h, h n. The present value of the loss of the insurance company is
(61.9)
After rearrangement of Equation (61.9), we denote this specific premium by Px as
ANxWnjN
.m/
.ANxWnjN / D
ANxWnjN .m/ N xWhj
aR
:
At each m-th of the year, the insured pays periodic level premium of
hP
.m/ .A / xWnj
m
.
61.5.2 Actuarial Reserve
L D vdT .x/e^n P aR dT .x/e^hj : Based on the equivalence principle, we have
0 D E vdT .x/e^n P E aR dT .x/e^hj D AxWnj P aR xWhj : This specific premium is denoted by h PxWnj and equals h PxWnj
D
AxWnj aR xWhj
:
The equivalence principle holds at time 0. As time moves on, the equivalence principle does not necessarily hold. Let us consider the loss of the insurer at time t > 0 for selling a whole life insurance with continuous premium payments. If the individual died at time t , with 0 < t < t, the loss at time t is the benefit payment of $1 minus the premium income as t L
D 1 PN .ANx /Nst j :
61 Actuarial Mathematics and Its Applications in Quantitative Finance
The time-t value of the loss is tL
D .1 C i /t t PN .ANx / sNt N j .1 C i /t t :
The loss or gain is realized at time t . There is a loss at time t if 1 > PN .ANx /Nst j and a gain if 1 < PN .ANx /s t j . Let us consider the loss for an insured .x/ who survives time t. At time t, the insured attains age x C t. The insurer is responsible to pay for the future death benefit and will receive premiums from time t to the death of the insured. The loss evaluated at time t is tL
.Ax / D EŒ t Lj T .x/ > t D AxCt P .Ax /axCt : (61.10)
If a n-year term insurance is purchased by h-year payments, for 0 < t < h < n, the time-t loss is tL
D vT .xCt /^.nt / h P .AxWnj /aT .xCt /^.ht /j:
The reserve is h t
If the insurance AxWnj is purchased by h-year payments at the beginning of each year, the reserve becomes h t
V .ANxWnjN / D ANxCt Wnt j h P .AxWnj /aR xCt Wht j :
If the insurance AxWnj is purchased by h-year payments at the beginning of each year, the reserve becomes h t
VxWnj D AxCt Wnt j h PxWnjN aR xCt Wht j :
61.6 Applications in Quantitative Finance
D vT .xCt / P .Ax /aT .xCt /j :
To cushion the possible loss in the future, the insurer has to put aside capital of an amount of EŒt Lj T .x/ > t . This expected loss is called the benefit reserve and is tV
961
V .AxWnj / D EŒ t Lj T .x/ > t D AxCt Wnt j h P .AxWnj /axCt Wht j :
Actuarial mathematics can be used not only for insurance valuation, but also for some applications in quantitative finance. Li (1998) use actuarial mathematics to obtain the term structure of credit risk. Consider a risky zero coupon bond paying $1 at maturity T . We assume that there exists a risk neutral probability measure Q. Let the time-t price of this risky bond be denoted by V .t; T /. Let time-t price of the riskless zero-coupon bond be denoted by P .t; T /. Let the time of default be denoted by . Assume that the bondholder will receive a fractional amount ı at maturity if default occurs. This recovery scheme is referred to as the recovery-of-treasury-value scheme. Under the risk neutral measure and the RTV scheme, the time-t price of this risky bond can be written by the Jarrow and Turnbull (1995) pricing formula as
(61.11) V .t; T / D
In Equations (61.10) and (61.11), reserves are obtained by looking prospectively into the future on the loss minus the income. This method is called a prospective formula. Alternatively, we can look retrospectively into past on the contributions that the insured has made minus the insurance protection that the insured has used. The reserve based on the retrospective formula is V .AxWnj / D h P .AxWnj / t h
axWt j t Ex
1
AxCt Wnt j t Ex
D h P .AxWnj /s xWt j
t Ex
; (61.12)
where h PN .ANxWnj /s xWt j is the accumulated premium contribu1
A
B.t/ .ı If T g C 1 If > T g/ ; B.T /
Q
where Et represents time-t expected value under the Q-measure based on information available up to time t. If we assume the bank account process is independent of the default process, we have Q
V .t; T / D P .t; T /Et Œ.ı If T g C 1 If T g/ : (61.13) After rearrangement of Equation (61.13), the risky bond price becomes
1
AxCt Wnt j
Q Et
Wnt j is the insurance protions by the survivorship and xCt t Ex tection that has been used by the survivorship. It is not difficult to find out that Equation (61.11) equals Equation (61.12).
V .t; T / Q D Et Œ.1 .1 ı/If T g/ D 1.1ı/.T t / qt : P .t; T / The probability of default becomes .T t / qt
D
P .t; T / V .t; T / : .1 ı/P .t; T /
(61.14)
962
C.-J. Chen
If we know ı; P .t; T1 /; P .t; T2 /; P .t; Tn / and V .t; T1 /; V .t; T2 /; V .t; Tn / with that t T1 T2 Tn , we can obtain .T1 t / qt ; .T2 t / qt ; ; .Tn t / qt . Consider that .T2 t / qt
D .T1 t / qt C .T1 t / pt .T2 t T1 / qt CT1 :
(61.15)
Equation (61.15) shows that the probability of default .T2 t / qt for the time period .t; T2 equals the probability of default for the time period .t; T1 plus the probability of survival for the time period .t; T1 times the probability of default for the time period .T1 ; T2 . Recursively using this method, we can obtain the whole term structure of default for the whole time period .t; Tn . For example, assume that a firm is estimated to have a recovery rate of 60% if default were to occur. The riskless zero-coupon bond prices are given as P .2008; 2009/ D 0:9792; P .2008; 2010/ D 0:9550, P .2008; 2011/ D 0:9221; and P .2008; 2012/ D 0:8905. The zero-coupon risky bond prices of a public firm are quoted as V .2008; 2009/ D 0:9654; V .2008; 2010/ D 0:9268; V .2008; 2011/ D 0:8823; and V .2008; 2012/ D 0:8390: Based on Equation (61.14), the probability of default can be obtained as 1 q2008
D 0:03523; 2q2008 D 0:0738;
3 q2008
D 0:1079 and 4 q2008 D 0:1446:
Based on Equation (61.15), we can obtain the 1-year default probability for the time period (2008, 2012] as 1 q2008
D 0:03523; 2q2009 D 0:03999;
3 q2010
D 0:03680 and 4 q2011 D 0:41111:
.m/ R bm.T t1 /c : .t1 t /j a tW m
Let us consider the put-call parity here. Let a European call and a European put on a non-dividend-paying stock have the same strike price K with maturity T . Let the time-t price of the European call and the European put be denoted by ct and pt . The relationship between ct and pt is ct pt D
B.t/.ST K/C B.t/.K ST /C B.T / B.T /
(61.16)
D St K P .t; T /; where AC D A if A 0 and AC D 0 if A < 0. If the stock provides a dividend yield of q, the put-call parity in Equation (61.16) becomes ct pt D St e q.T t / K P .t; T /:
(61.17)
Assume that the stock provides periodic dividends m times per year with each dividend equal d . Assume the current time is t and the next coupon payment will be made at t1 . The putcall parity in Equation (61.16) becomes ct pt D St d m
.m/
R .t1 t /j a
.m/
ˇ
bm.T t1 /c ˇ ˇ m
K P .t; T /; (61.18)
h
B.t / / B.t / D m1 B.t C B.t1B.t C1=m/ C B.t1 C2=m/ 1/ i / C C B.bm.TB.t . t1 /c=m/ We can extend actuarial equivalence principle to pricing credit default swap. Let the credit default swap issuer provide the loss given default of the reference party to the insured for a period of n. The insured pays periodic payments for a maximum period of h for this protection. If default occurs, the issuer pays the insured an amount of 1-random recovery rate (R) and the insured no longer has to pay the payment. We assume that the random recovery rate R is independent of the bank account process. Therefore, the loss of the issuer is
where
R .t1 t /j a
ˇ
bm.T t1 /c ˇ ˇ m
B.t/ I. t C n/ P aR ^hj : B./
By applying equivalence principle, we have Q
Q
1
0 D Et ŒL D .1 Et ŒR /At Wnj P aR t Whj :
(61.19)
Rearranging Equation (61.19), we can obtain the periodic payment 1 Q .1 Et ŒR /At Wnj : P D aR t Whj
V .t; T / D P .t; T /Œ1 .1 ı/.T t / qt
Q Et
B.t/.ST K/ B.T /
L D .1 R/
Assume that the bond provides periodic coupons m times per year with each coupon equal c. Assume the current time is t and the next coupon payment will be made at t1 . The pricing formula of Equation (61.14) is then modified to
Cc m
Q
D Et
If the issuer pays the insured at the end of the period if default occurs, then the premium is Q
P D
.1 Et ŒR /A1
t Wnj
aR t Whj
:
(61.20)
For example, we assume the insured pays two annual premiums to obtain a 4-year protection. Based on the default
61 Actuarial Mathematics and Its Applications in Quantitative Finance
963
probabilities of the previous example, the periodic premium can be obtained using Equation (61.20) as
Q
P D
.1 E2008 ŒR /A1
2008W4j
aR 2008W2j
B.2008/ B.2008/ B.2008/ B.2008/ q2008 C p2008 q2009 C 0:4
2 p2008 q2010 C 3 p2008 q2011 B.2009/ B.2010/ B.2011/ B.2012/ D B.2008/ 1C p2008 B.2009/ D
0:4 Œ0:9792 0:03523 C 0:9550 .1 0:03523/ 0:03999 1 C 0:9792 .1 0:03523/ C
D
0:4 ŒC0:9221 .1 0:07382/ 0:03680 C 0:8905 .1 0:10790/ 0:04111 1 C 0:9792 .1 0:03523/
0:4 0:135441 D 0:027859: 1:9447
Using actuarial mathematics, we can obtain clean and nice closed-form solutions for some problems in quantitative finance.
method. In Equations (61.10) and (61.11), reserves are calculated prospectively as the expected current value of future liabilities. In Equation (61.12), the reserve is calculated retrospectively as the expected current value of the unused contributions. Both methods reach the same result. Actuarial mathematics can be used not only for insurance valuation, but also for some applications in quantitative finance. When we replace the time of death by the time of default and assume that interest rate is not flat, we can use actuarial mathematics to obtain the term structure of credit risk from zero-coupon bonds (like Li 1998). The bond price for coupon-paying bond is also obtained in a similar manner in this chapter. In the last part of this chapter, we used actuarial mathematics to price the credit default swap and default digital swap. In general, actuarial mathematics and quantitative finance could be integrated in more areas.
61.7 Conclusion
References
In this chapter, we introduce actuarial mathematics and actuarial interest functions, insurance, annuities, premiums, and reserves. The pure premium is calculated based on the equivalence principle. The real premium that the insurance company charges is called the loaded premium, which is the pure premium plus a loaded amount. After premiums are collected by the insurer, the reserve is set aside to cushion the expected loss. We introduce two methods of finding the actuarial reserve: the prospective method and the retrospective
Bowers, N. L. et al. 1997. Actuarial mathematics, 2nd Edition, Society of Actuaries, Itasca, IL. Jarrow, R. A. and S. M. Turnbull. 1995. “Pricing derivatives on financial securities subject to credit risk.” Journal of Finance 50, 53–85. Kellison, S. G. 1991. The theory of interest, 2nd Edition, IRWIN, Homewood, IL. Li, D. 1998. Constructing a credit curve, Credit Risk: Risk Special Report, November, 40–44. Panjer, H. H. et al. 1998. Financial economics, The Actuarial Foundation, Schaumburg, IL. Schonbucher, P. J. 2003. Credit derivatives pricing models, Wiley, New York.
For a default digit swaps, (see Schonbucher 2003, Sect. 5.8.1) the insured will receive a predetermined fixed amount s if default occurs. The periodic payment becomes P D
sA1
t Wnj
aR t Whj
:
If a default digit swap pays $0.50 for per dollar exposure, then the periodic payment is P D
0:5 0:135441 D 0:034823: 1:9447
Chapter 62
The Prediction of Default with Outliers: Robust Logistic Regression Chung-Hua Shen, Yi-Kai Chen, and Bor-Yi Huang
Abstract This paper suggests a Robust Logit method, which extends the conventional logit model by taking outliers into account, to implement forecast of defaulted firms. We employ five validation tests to assess the in-sample and out-of-sample forecast performances, respectively. With respect to in-sample forecasts, our Robust Logit method is substantially superior to the logit method when employing all validation tools. With respect to the out-of-sample forecasts, the superiority of Robust Logit is less pronounced. Keywords Logit r Robust Logit r Forecast r Validation test
62.1 Introduction In recent years a large number of researchers and practitioners have worked on the prediction of business defaults. This prediction is important because not only can it reduce nonperforming loans but it can help to determine capital allocation. As required by the Basel Committee on Banking Supervision, the prediction of business default is the first step to fulfill the requirement of the internal rating-based (IRB) of Basel II. Large banks are therefore eager to develop their default prediction system to monitor credit risk. One of the issues regarding credit risk assessment is the model or method of default prediction used. Altman and Saunders (1998) have traced development of the literatures in risk measurement for 20 years. Currently, the common model used includes discriminant analysis (Altman 1968), logit and probit models (Ohlson 1980; Westgaard and Wijst 2001), multiC.-H. Shen () Department of Finance, National Taiwan University, Taipei, Taiwan e-mail: [email protected] Y.-K. Chen Department of Finance, National University of Kaohsiung, Kaohsiung, Taiwan e-mail: [email protected] B.-Y. Huang Department of Business Management, Shih Chien University, Taipei, Taiwan e-mail: [email protected]
group hierarchy model (Doumpos et al. 2002; Doumpos and Zopounidis 2002), neural network (Atiya 2001; Piramuthu 1999; Wu and Wang 2000) and option type models such as KMV to name a few. The works of Lennox (1999) for the comparison of the first three models and Dimitras et al. (1996) for the discussion of the first five models have become the standard reference for the financial research. While there are many methods to estimate the probability of default, none of them have taken the outliers into account when there is a discrete dependent variable. Outliers that can seriously distort the estimated results have been well documented in the conventional regression model. For example, Levine and Zervos (1998) employ 47 countries data and confirm that the liquidity trading is positively related to economic growth. Zhu et al. (2002), however, reject this positive relation when they employ the econometric methods to minimize the outlier distortion effects caused by Taiwan and South Korea. Although methods and applications that take outliers into account are well known when the dependent variables are continuous (Rousseeuw 1984; Rousseeuw and Yohai 1984), few have conducted empirical studies when the dependent variable is binary. Atkinson and Riani (2001), Rousseeuw and Christmann (2003), and Flores and Garrido (2001) have developed the theoretical foundations as well as the algorithm to obtain consistent estimator in logit models with outliers, but they do not provide applied studies. If outliers indeed exist when the dependent variable is binary, the conventional logit model might be biased. The aim of this paper is to predict default probability with the consideration of outliers. This is a direct extension of the logit estimation method and is referred to as the Robust Logit model hereafter. We apply the forward search method of Atkinson and Cheng (2000) and Atkinson and Riani (2001) to Taiwan data. To the best of our knowledge, our paper is the first to use the Robust Logit model for actual data. Once estimated coefficients are obtained, we assess the performances of the logit and the Robust Logit methods by using five validation tools; that is, contingency table (cross-classifications table), cumulative accuracy profile (CAP), relative or receive operation characteristics (ROC), Kolmogorov-Smirnov (KS), and Brier score.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_62,
965
966
The paper proceeds as follows. In addition to the first section, the next section provides literatures of the logit and the Roubst Logit regression model. Section 62.3 introduces the five validation models. Section 62.4 discusses the empirical model and estimated results. Section 62.5 provides the conclusion.
62.2 Literature Review of Outliers in Conventional and in Logit Regression Conventional linear regression analysis taking outliers into account has been utilized since 1960. The methods, such as Least Median of Squares (LMS), Least Trimmed Squares (LTS) (Rousseeuw 1983, 1984), which exclude the effect of outliers on linear regression, are now standard options in many econometric softwares. Until 1990, however, the literature was slow in the consideration of outliers when the logit model is involved. Furthermore, most development tends to center on the theoretical derivations of outliers in logit method in the fields of statistics and actuaries.
C.-H. Shen et al.
62.2.1 Outliers in Conventional Regression

Cook and Weisberg (1982) suggest a backward procedure to exclude outliers in the conventional linear model. Using the whole sample as the first step, they detect one outlier and remove it; they then go on to detect and remove a second outlier, followed by a third, and so on. Repeating this step, they remove all outliers. While the backward approach appears straightforward, it suffers from a masking effect: the statistical properties of the detected outliers are affected by the outliers remaining in the sample. Barrett and Gray (1997) and Haslett (1999) suggest multiple-outlier methods to overcome this problem, but detection is slow. Atkinson (1985, 1994) proposes a forward search algorithm (forward approach), which starts from a small subset without outliers. Another observation is then added and examined at each subsequent step. Continuing this process, he argues that the forward approach can remove outliers without the masking effect. When the number of outliers is not known, Atkinson and Riani (2006) present a method that makes efficient use of individual simulations to derive the simultaneous properties of the series of tests occurring in the practical data-analytical case. Riani and Atkinson (2007) provide easily calculated bounds for the test statistic during the forward search and illustrate the importance of these bounds in inference about outliers.

62.2.2 Outliers in Logit Regression: Robust Logistic Regression

Our Robust Logistic (RL) regression is based on the forward approach of Atkinson and Riani (2001), which includes five steps.

Step 1: Choose the initial subset of observations. Randomly choose k+1 observations, where k+1 equals one-third of the total number of observations, as the starting sample.¹ The corresponding estimated coefficient vector of the logit method is denoted β̂_(k+1), and the predicted value of observed company i is ŷ_i = F(x_i β̂_(k+1)), i = 1, ..., N.

Step 2: Obtain the median of the errors. Calculate the probability that the default prediction is accurate, p_(k+1),i:

p_(k+1),i = ŷ_i       if y_i = 1,
p_(k+1),i = 1 − ŷ_i   if y_i = 0.

Corresponding to this accuracy, the probability of an inaccurate prediction is e_(k+1),i = 1 − p_(k+1),i for all observations. Then sort all e_(k+1),i in ascending order, e_(k+1),1 < e_(k+1),2 < ... < e_(k+1),N, and obtain the median of all e_(k+1),i, denoted e_(k+1),med.

Step 3: Proceed with the forward search algorithm. Add one observation to the subset and use the k+2 observations to estimate β̂_(k+2). These k+2 observations are those corresponding to the smallest k+2 errors in Step 2, that is, to e_(k+1),1, e_(k+1),2, ..., e_(k+1),k+2; this is equivalent to removing the outliers. Then, repeating Step 2, we obtain the median of e_(k+2),i, which is e_(k+2),med.

Step 4: Obtain all estimated coefficients and the corresponding error medians. Add one more observation; that is, repeat Step 3 using the k+3 observations corresponding to the smallest k+3 errors of e_(k+2),i from Step 3, and similarly obtain β̂_(k+3) and the median e_(k+3),med. Repeat the above steps, adding one observation per estimation, until the whole sample is used. We thus obtain the estimated coefficients β̂_(k+4), β̂_(k+5), ..., β̂_N and the corresponding medians e_(k+4),med, e_(k+5),med, ..., e_N,med.

Step 5: The outlier-robust estimator is found. Calculate e*,med = min(e_(k+1),med, e_(k+2),med, ..., e_N,med); the corresponding β̂* is the estimator of the RL method.²

Although this forward search method is intuitively appealing, it encounters three problems in actual application. First, the random sampling may pick all zeros or all ones as the dependent variable, in which case the estimation fails. Second, the selected initial set of observations affects the estimation results, so repeated sampling of the initial set becomes necessary. Third, companies identified as outliers may be the extremely good and extremely bad companies; they are statistical outliers but not financial outliers.

¹ Atkinson uses k+1, the number of parameters plus one, as the starting sample size. We do not adopt his suggestion because such a small sample is often all zeros without a single one, invalidating the logit model.
² We could further repeat Step 1 to start from different sets of observations.
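To make Steps 1 through 5 concrete, the following is a minimal Python sketch of the forward search, assuming the statsmodels package for the logit fit. The function name and the guard against an all-zero or all-one initial draw are our own choices; a production version would also need to handle non-convergence and perfect separation on small subsets.

import numpy as np
import statsmodels.api as sm

def robust_logit_forward(X, y, seed=0):
    """Forward-search robust logit (sketch of Steps 1-5).
    X : (N, k) regressors without a constant; y : (N,) array of 0/1.
    Returns the coefficient vector whose subset attains the smallest
    median prediction error over the forward search."""
    rng = np.random.default_rng(seed)
    N = len(y)
    Xc = sm.add_constant(X)              # add the intercept
    m = N // 3                           # starting subset: one-third of N

    # Step 1: random initial subset; redraw if y is constant on it
    subset = rng.choice(N, size=m, replace=False)
    while y[subset].min() == y[subset].max():
        subset = rng.choice(N, size=m, replace=False)

    best_med, best_beta = np.inf, None
    while True:
        beta = sm.Logit(y[subset], Xc[subset]).fit(disp=0).params
        y_hat = 1.0 / (1.0 + np.exp(-Xc @ beta))   # predicted P(default)
        p = np.where(y == 1, y_hat, 1.0 - y_hat)   # accuracy of prediction
        e = 1.0 - p                                # Step 2: prediction errors
        med = np.median(e)
        if med < best_med:                         # Step 5: keep the minimum median
            best_med, best_beta = med, beta
        if len(subset) == N:
            break
        # Steps 3-4: grow the subset to the observations with smallest errors
        subset = np.argsort(e)[: len(subset) + 1]
    return best_beta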
62.3 Five Validation Tests

Once we obtain the estimated results from the two methods, we compare their forecasting ability using the following five validation tests of discriminatory power. The validation methods introduced here are mostly based on Sobehart and Keenan (2004), Sobehart et al. (2000), and Stein (2002).

62.3.1 Contingency Table (Cross-Classification Table)

The Contingency Table, also referred to as the Cross-Classification Table, is the most frequently used validation tool for comparing predictive power. Let TP% and TN% be the ratios of success in predicting default and non-default firms, and let FP% and FN% be the ratios of failure in predicting non-default and default firms (see Table 62.1). In conventional terms, the sum of TP% and TN% is referred to as the hit rate, whereas FP% and FN% are referred to as the Type II and Type I errors, respectively. Furthermore, TP% + FN% = 1 and FP% + TN% = 1. The weakness of this table is that only one cutoff is chosen to determine these ratios. Typically, the selection of the cutoff is based on the average rule (cutoff of 0.5) or the sample proportion rule (cutoff equal to the number of default firms divided by the total number of firms). More cutoffs may be needed, which motivates the development of the following validation tests.

Table 62.1 Contingency table (cross-classification table)

Predicted \ Actual | Default companies (y_i = 1) | Non-default companies (y_i = 0)
Default companies (ŷ_i = F(x_i β̂) ≥ cutoff) | TP% × D | FP% × N
Non-default companies (ŷ_i = F(x_i β̂) < cutoff) | FN% × D | TN% × N

Notations are taken from Sobehart et al. (2000). TP (true positive) means that companies are default and are accurately predicted; FN (false negative) means that companies are default but predicted as non-default; FP (false positive) means that companies are non-default but predicted as default; TN (true negative) means that companies are non-default and correctly predicted. D is the number of default companies and N the number of non-default companies.

62.3.2 Cumulative Accuracy Profile (CAP)

The CAP curve is a visual tool whose graph can easily be drawn if two representative samples of scores for defaulted and non-defaulted borrowers are available. Concavity of the CAP is equivalent to the conditional probabilities of default, given the underlying scores, forming a decreasing function of the scores (default probability); non-concavity indicates suboptimal use of information in the specification of the score function. Researchers typically calculate the accuracy ratio, the area under the rating model divided by the area under the perfect model, to examine the performance of a model; this is equivalent to A/B as graphed in Fig. 62.1. Figure 62.1 plots the CAPs of a perfect rating model and a random model. A perfect rating model assigns the lowest estimated scores to the defaulters; in this case the CAP increases linearly and then stays at one. For a random model without any discriminative power, the fraction x of all debtors with the lowest scores contains x percent of all defaults. Applied rating systems lie somewhere between these two extremes. Statistically, the comparison ratio is defined as the

Fig. 62.1 CAP curve (fraction of defaulted companies against fraction of all companies; the perfect model turns at the fraction D/(N + D), the random model is the 45° line, and the rating model lies between them, with A the area between the rating and random models and B the area between the perfect and random models)
ratio of A/B, where A is the area between the CAP of the rating model being validated and the CAP of the random model, and B is the area between the CAP of the perfect rating model and the CAP of the random model. The area under the rating model is calculated as follows. First, the estimated default rates are ranked in descending order. Then the top s% of firms with the highest estimated default rates are taken, so that their number equals G = s% × (N + D), where N and D are the numbers of non-defaulting and defaulting companies in the data set. Within these G firms, the number of firms that actually default is counted and divided by the total number of default companies D to yield y%. Repeating the above process for varying s%, we obtain a sequence of (s%, y%) pairs. Plotting y% (y-axis) against s% (x-axis) yields the CAP. The shape of the rating model's CAP depends on the proportion of solvent and insolvent borrowers in the sample.
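The accuracy ratio can be computed directly from scores and outcomes; the following is a minimal sketch (the function name and trapezoidal integration are our own choices, not part of the original methodology):

import numpy as np

def cap_accuracy_ratio(scores, y):
    """Accuracy ratio A/B from a CAP curve.
    scores : estimated default probabilities; y : 0/1 default indicators."""
    order = np.argsort(-np.asarray(scores))       # descending by score
    y = np.asarray(y)[order]
    n, d = len(y), y.sum()
    frac_firms = np.arange(1, n + 1) / n          # x-axis: share of firms screened
    frac_defaults = np.cumsum(y) / d              # y-axis: share of defaults captured
    area_model = np.trapz(frac_defaults, frac_firms)
    area_random = 0.5                             # CAP of a model with no power
    area_perfect = 1.0 - d / (2.0 * n)            # perfect model turns at x = d/n
    return (area_model - area_random) / (area_perfect - area_random)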
Fig. 62.2 ROC curve (hit rate TP% against false alarm rate FP%; A is the area between the rating model's curve and the 45° line, and B is the area under the 45° line)
62.3.3 Receiver Operating Characteristic (ROC)

The ROC curve uses the same information as the CAP to answer the question: what percentage of non-defaulters would a model have to exclude? (Stein 2002). It generalizes Contingency Table analysis by providing information on the performance of a model at any cutoff that might be chosen. It plots the FP% rate against the TP% rate for all credits in a portfolio. In particular, ROCs are constructed by scoring all credits, ordering the non-defaults from worst to best on the x-axis, and plotting the percentage of defaults excluded at each level on the y-axis. The area under the rating model is

A + B = Σ_{k=0}^{n−1} (X_{k+1} − X_k)(Y_{k+1} + Y_k)/2,

where A and B are the areas under the rating model and the 45° line, respectively, and n is the number of intervals. The ROC curve is demonstrated in Fig. 62.2. For example, assume that the number of default firms is D = 50 and the number of non-default firms is N = 450. Then, as in the CAP method, the estimated default probabilities are ranked in descending order. Next, fixing a Type II error FP% and finding the corresponding cutoff c%, we can calculate the corresponding TP%. To illustrate, if FP% is first chosen to be 5%, then about 23 non-default firms are misjudged as default (450 × 5% = 22.5), and the cutoff c% is thereby determined. Based on this c%, if we successfully predict four defaulted firms, then TP% = 8% (4/50 = 8%). Thus we obtain the first pair (FP%, TP%) = (5%, 8%). Continuing this process, we can generate many pairs of FP% and TP%, which trace out the ROC.

62.3.4 Kolmogorov-Smirnov (KS)

The KS-test tries to determine whether two datasets differ significantly. It has the advantage of making no assumption about the distribution of the data, and it also enables us to view the data graphically. KS plots the cumulative distributions of default and non-default firms, denoted F1 and F2 respectively, and then calculates the maximum distance between the two curves,

KS = max(F1 − F2).

A large KS suggests rejection of the null hypothesis that the two distributions are equal.
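A short sketch of both quantities, using the trapezoid formula above; the cutoff grid over unique scores is our own implementation choice:

import numpy as np

def roc_area_and_ks(scores, y):
    """Trapezoidal ROC area (A + B) and KS statistic.
    scores : estimated default probabilities; y : 0/1 default indicators."""
    s, y = np.asarray(scores), np.asarray(y)
    cutoffs = np.sort(np.unique(s))[::-1]
    # For each cutoff, FP% (x-axis) and TP% (y-axis), as in the chapter's example
    fp = np.array([np.mean(s[y == 0] >= c) for c in cutoffs])
    tp = np.array([np.mean(s[y == 1] >= c) for c in cutoffs])
    area = np.sum((fp[1:] - fp[:-1]) * (tp[1:] + tp[:-1]) / 2.0)  # A + B
    ks = np.max(np.abs(tp - fp))   # max distance between the two distributions
    return area, ks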
62.3.5 Brier Score

The Brier score computes the mean squared error between the estimated and actual default rates. It is estimated as

B = (1/n) Σ_{i=1}^{n} (P̂_i − I_i)²,

where P̂_i is the predicted value and I_i is the actual 0 or 1. From the above definition, it follows that the Brier score is always between zero and one; the closer the Brier score is to zero, the better the forecast of default probabilities.
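For completeness, the corresponding one-line computation:

import numpy as np

def brier_score(p_hat, y):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    p_hat, y = np.asarray(p_hat, dtype=float), np.asarray(y, dtype=float)
    return np.mean((p_hat - y) ** 2)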
62.4 Source of Data and Empirical Model

62.4.1 Source of Data

To ensure the reliability of the financial statements, our sample consists of companies listed on the Taiwan Stock Exchange. Default firms are defined as those stocks that require full delivery, that is, transactions settled fully in cash on the Taiwan Stock Exchange. These firms include (1) those whose CEOs, board directors, or supervisors bounced checks, and (2) firms that requested financial aid from the government owing to restructuring, bankruptcy, liquidation, going-concern uncertainty, acquisitions, tunneling, trading halts, or a credit crunch by banks. In our sample there are 52 default companies over the period 1999-2004. For each default company, we search for three additional companies with a similar asset size in the same industry, resulting in 156 non-default companies; hence there are 208 companies in total. The names and codes of all the companies, as well as the reasons for their defaults, are reported in Table 62.2. We also reserve 20% of our sample for out-of-sample forecasting; that is, there are 42 reserved firms.
62.4.2 Empirical Model

Our empirical model is based on Altman's z-score, which contains five variables: four financial accounting variables and one market variable (X4). Because of multicollinearity, we choose only four of them;³ that is, our model is

Y_t = f(X_{1,t−1}, X_{3,t−1}, X_{4,t−1}, X_{5,t−1}),

where Y is the binary variable equal to one for defaulted firms and zero otherwise, X1 = working capital/total assets (working capital = current assets − current liabilities), X3 = earnings before interest and tax (EBIT)/total assets, X4 = stock (market equity) value/total liabilities, and X5 = net sales revenue/total assets (net sales revenue = sales − returns and discounts). All signs are expected to be negative.
62.5 Empirical Results

The left and right parts of Table 62.3 report the estimated results using the logit and the Robust Logit models, respectively. When the logit model is used, all coefficients show the expected negative sign, and all are significant except the coefficient of X5. By contrast, when the Robust Logit model
³ We omit X2 = retained earnings/total assets.
is employed, all coefficients not only show the expected signs but also are significantly different from zero. In addition, the pseudo-R² is 0.3918 for the logit model but rises to 0.9359 for the Robust Logit model, suggesting that in-sample fit is much better for the Robust Logit model than for the logit model. Figure 62.3 plots the CAP, ROC, and KS curves using in-sample forecasts. The curves generated by the Robust Logit method are more concave toward the southeast than those of the logit method in the CAP and ROC panels. With respect to the KS method, the maximum distance between non-default and default is also larger for the Robust Logit method than for the logit method. Figure 62.4 is similar to Fig. 62.3 but uses out-of-sample forecasts. The CAP and ROC curves generated by the two methods are intertwined with each other to some extent, and the areas under the curves can hardly be distinguished by eye. With respect to the KS method, the maximum distance between non-defaults and defaults clearly shows that the Robust Logit method is superior to the logit method. Table 62.4 reports the five validation tests using the in-sample forecast. With respect to the Contingency Table, when the logit method is used, TP% and TN% are about 77%, but they rise to 97.67% and 94.37%, respectively, under the Robust Logit method. Thus, the Robust Logit method defeats the logit method when validation is based on the Contingency Table. The KS is 5.288 and 6.410 for the two methods, respectively, again supporting the superiority of the Robust Logit method. The CAP ratio reaches a similar conclusion, at 0.7040 and 0.8308 for the two methods, respectively. Not surprisingly, the ROC ratios support the same conclusion, at 0.8447 and 0.9867, respectively. Finally, the Brier score, whose definition runs opposite to the previous validation tests, is smaller when a method performs better; the scores for the two models are 0.1207 and 0.0226, respectively. Accordingly, all validation tests suggest that the Robust Logit method is superior to the logit method in in-sample prediction. Table 62.5 reports the validation tests using the out-of-sample forecast. The superior performance of the Robust Logit method in the in-sample forecast becomes less pronounced here. When the Contingency Table is employed, the TP% and TN% yielded by the logit model are about 75%, similar to their in-sample counterparts reported in Table 62.4. The values, however, change dramatically when the Robust Logit is used: TP% becomes 100.0% but TN% is only about 48%. This implies that the Robust Logit method is aggressive in the sense that it has a greater tendency to assign companies to default. The KS test still supports the conclusion reached in the in-sample case, that is, the logit method performs worse than the Robust Logit method. The difference between the two methods in the CAP test becomes trivial, as the logit method yields 0.6566 and the Robust Logit is
Table 62.2 All sample companies

Code | Name of failing company | Default date | Type of default | Matching samples of non-default companies | Data year
9913 | MHF | 1999/1/18 | G | 9911(SAKURA), 9915(NienMade), 9914(Merida) | 1998
2005 | U-Lead | 1999/1/24 | G | 2002(CSC), 2006(Tung Ho Steel), 2007(YH) | 1998
2539 | SAKURAD | 1999/3/22 | G | 2501(CATHAY RED), 2504(GDC), 2509(CHAINQUI) | 1998
2322 | GVC | 1999/4/1 | O | 2323(CMC), 2324(Compal), 2325(SPIL) | 1998
2522 | CCC | 1999/4/18 | C | 2520(KINDOM), 2523(DP), 2524(KTC) | 1998
1431 | SYT | 1999/5/21 | H | 1432(TAROKO), 1434(F.T.C.), 1435(Chung Fu) | 1998
1808 | KOBIN | 1999/5/24 | H | 1806(CHAMPION), 1807(ROMA), 1809(China Glaze) | 1998
9922 | UB | 1999/10/5 | G | 9918(SCNG), 9919(KNH), 9921(Giant) | 1998
1206 | TP | 1999/11/2 | H | 1216(Uni-President), 1217(AGV), 1218(TAISUN) | 1998
1209 | EHC | 2000/3/23 | N | 1201(Wei-Chuan), 1203(Ve Wong), 1207(CH) | 1999
2528 | CROWELL | 2000/4/28 | G | 2526(CEC), 2527(Hung Ching), 2530(DELPHA) | 1999
1462 | TDC | 2000/7/11 | G | 1458(CHLC), 1459(LAN FA), 1460(EVEREST) | 1999
2703 | Imperial | 2000/9/5 | H | 2702(HHG), 2704(Ambassador), 2705(Leo Foo) | 1999
1422 | MICDT | 2000/9/6 | C | 1417(CARNIVAL), 1418(TONG-HWA), 1419(SHINKO.SPIN.) | 1999
1505 | YIW | 2000/9/6 | G | 1503(Shihlin), 1504(TECO), 1506(Right Way) | 1999
2334 | KFC | 2000/9/6 | C | 2333(PICVUE), 2335(CWI), 2336(Primax) | 1999
2518 | EF | 2000/9/6 | G | 2514(LONG BON), 2515(BES), 2516(New Asia) | 1999
2521 | HCC | 2000/9/8 | G | 2505(ky), 2509(CHAINQUI), 2511(PHD) | 1999
2019 | Kuei Hung | 2000/9/16 | G | 2020(MAYER PIPE), 2022(TYCOONS), 2023(YP) | 1999
2011 | Ornatube | 2000/10/13 | C | 2012(CHUN YU), 2013(CSSC), 2014(CHUNG HUNG) | 1999
9906 | Corner | 2000/10/27 | C | 9905(GCM), 9907(Ton Yi), 9908(TGTG) | 1999
1222 | Yuan Yi | 2000/11/2 | C | 1224(HSAFC), 1225(FOPCO), 1227(QUAKER) | 1999
2902 | Choung Hsim | 2000/11/29 | H | 2903(FEDS), 2910(CHUN YUAN STEEL), 2912(7-ELEVEN) | 1999
2517 | CKA-LT | 2000/11/30 | G | 2520(KINDOM), 2523(DP), 2524(KTC) | 1999
2537 | Ezplace | 2001/1/12 | G | 2534(HSC), 2535(DA CIN), 2536(Hung Poo) | 2000
1408 | CST | 2001/4/2 | G | 1410(NYDF), 1413(H.C.), 1414(TUNG HO) | 2000
1407 | Hualon | 2001/5/22 | G | 1402(FETL), 1416(KFIC), 1409(SSFC) | 2000
2540 | JSCD | 2001/5/25 | G | 2533(YUH CHEN UNITED), 2538(KeeTai), 2530(DELPHA) | 2000
2304 | A.D.I. | 2001/7/28 | C | 2301(LTC), 2302(RECTRON), 2303(UMC) | 2000
1438 | YU FOONG | 2001/8/10 | E | 1435(Chung Fu), 1436(FUI), 1437(GTM) | 2000
1450 | SYFI | 2001/8/24 | G | 1451(NIEN HSING), 1452(HONG YI), 1453(PREMIER) | 2000
2318 | Megamedia | 2001/9/28 | G | 2315(MIC), 2316(WUS), 2317(HON HAI) | 2000
2506 | PCC | 2001/10/16 | G | 2514(LONG BON), 2515(BES), 2516(New Asia) | 2000
1613 | Tai-I | 2001/10/22 | E | 1614(SANYO), 1615(DAH SAN), 1616(WECC) | 2000
2512 | Bao-Chen | 2002/4/16 | C | 2504(GDC), 2509(CHAINQUI), 2511(PHD) | 2001
1805 | KPT | 2002/6/2 | G | 1802(TG), 1806(CHAMPION), 1807(ROMA) | 2001
1602 | PEW | 2002/9/6 | G | 1601(TEL), 1603(HwaCom), 1604(SAMPO) | 2001
1221 | CCI | 2003/3/6 | C | 1217(AGV), 1234(HEYSONG), 1231(Lian Hwa Foods) | 2002
2342 | MVI | 2003/4/18 | G | 2344(WEC), 2345(ACCTON), 2347(Synnex) | 2002
3053 | DING ING | 2003/4/26 | E | 3045(TWN), 3046(AOpen), 3052(APEX) | 2002
2329 | OSE | 2003/6/30 | G | 2330(TSMC), 2331(Elitegroup), 2332(D-LINK) | 2002
1212 | SJI | 2003/9/30 | G | 1204(jingjing), 1216(UniPresident), 1218(TAISUN) | 2002
3001 | KIM | 2004/3/5 | C | 3010(WAN LEE), 3011(JH), 3018(TUNG KAI) | 2003
2525 | Pao Chiang | 2004/3/20 | G | 2520(KINDOM), 2523(DP), 2524(KTC) | 2003
2494 | Turbocomm | 2004/4/15 | E | 2489(AMTRAN), 2488(HANPIN), 2492(WTC) | 2003
2398 | Procomp | 2004/6/15 | H | 2382(QCI), 2388(VIA), 2409(AUO) | 2003
3021 | Cradle | 2004/7/26 | C | 3020(USTC), 3022(ICP), 3023(Sinbon) | 2003
2491 | Infodisc | 2004/8/23 | G | 2308(DELTA), 2311(ASE), 2312(KINPO) | 2003
2490 | Summit | 2004/9/15 | C | 2349(RITEK), 2350(USI), 2351(SDI) | 2003
3004 | NAFCO | 2004/9/23 | H | 2356(INVENTEC), 2357(ASUSTEK), 2358(MAG) | 2003
1534 | Tecnew | 2004/9/24 | C | 1531(SIRUBA), 1532(CMP), 1533(ME) | 2003
9936 | Compex | 2004/10/20 | E | 9933(CTCI), 9934(GUIC), 9935(Ching Feng) | 2003

The code is the number listed on the Taiwan Stock Exchange. Type of default: C check bounce; E concern of continuing operation; G financial aid from government; O substantial loss (low net worth).
Table 62.3 Estimated results: logit vs. Robust Logit

Variable | Logit coefficient | Logit t-value | Robust Logit coefficient | Robust Logit t-value
Constant | 0.3600 | 0.7506 | 17.0487 | 2.1627
X1 | −1.6195 | −1.1766 | −10.2357 | −1.7913
X3 | −13.1535 | −4.1651 | −234.2707 | −2.2311
X4 | −0.5519 | −2.0683 | −2.1146 | −2.3319
X5 | −0.4227 | −0.5858 | −9.510 | −2.0225

Statistic | Logit | Robust Logit
Log likelihood | −61.4865 | −12.3312
Average likelihood | 0.6905 | 0.9200
Pseudo-R-square | 0.3918 | 0.9359
Number of sample | 166 | 114
Number of default companies | 43 | 43
Number of non-default companies | 123 | 71
Median of residuals | - | 6.42811e-04

*, **, and *** denote significance at the 10, 5, and 1% levels, respectively.
only 0.6812. Similar results occur in the ROC test, where the former is 0.6717 and the latter 0.6883. Thus, based on CAP and ROC, the two methods are in a tie. Last, to our surprise, the Robust Logit method is defeated by the logit method on the Brier score, as its score is higher than that of the logit method. Thus, when the out-of-sample forecast is implemented, the differences between the two methods are hard to distinguish.
62.6 Conclusion

We compare the forecasting ability of the logit and Robust Logit methods, where the latter takes possible outliers into account. Six validation tests are employed when the in-sample forecasts are compared, namely pseudo-R², Contingency Table, CAP, ROC, KS, and Brier score, whereas the latter five validation tests are undertaken for the out-of-sample forecast. With respect to the in-sample forecasts, the Robust Logit method is substantially superior to the logit method on all validation tests used here. With respect to the out-of-sample forecasts, the Robust Logit method yields smaller Type II but larger Type I errors than the logit method when the Contingency Table is used, suggesting that Robust Logit is more aggressive in assigning firms to default. Robust Logit is marginally better than the logit method when CAP, ROC, and KS are adopted, but worse under the Brier score. Thus, the superiority of Robust Logit is less pronounced, or even disappears, in the out-of-sample forecasts.
Fig. 62.3 In-sample forecast: logit and Robust Logit (a CAP curve, b ROC curve, c Kolmogorov-Smirnov)

Fig. 62.4 Out-of-sample forecast: logit and Robust Logit (a CAP curve, b ROC curve, c Kolmogorov-Smirnov)
Table 62.4 Validation tests (in-sample forecast)

Test | Logit method | Robust Logit method
TP% | 76.74% | 97.67%
TN% | 78.05% | 94.37%
KS | 5.288 | 6.410
CAP | 0.7040 | 0.8308
ROC | 0.8447 | 0.9867
Brier score | 0.1207 | 0.0226

CAP: cumulative accuracy profile; ROC: receiver operating curve; KS: Kolmogorov-Smirnov
Table 62.5 Estimated results of cross-classification, KS and Brier (out-of-sample forecast)

Test | Logit method | Robust Logit method
TP% | 77.78% | 100.00%
TN% | 72.73% | 48.48%
KS | 1.45 | 1.558
CAP | 0.6566 | 0.6812
ROC | 0.6717 | 0.6883
Brier score | 0.1319 | 0.3756

CAP: cumulative accuracy profile; ROC: receiver operating curve; KS: Kolmogorov-Smirnov
References

Altman, E. I. 1968. "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy." Journal of Finance 23(4), 589-609.
Altman, E. I. and A. Saunders. 1998. "Credit risk measurement: developments over the last 20 years." Journal of Banking & Finance 21(11-12), 1721-1742.
Atiya, A. 2001. "Bankruptcy prediction for credit risk using neural networks: a survey and new results." IEEE Transactions on Neural Networks 12(4), 929-935.
Atkinson, A. C. 1985. Plots, transformations, and regression: an introduction to graphical methods of diagnostic regression analysis, Oxford University Press, New York.
Atkinson, A. C. 1994. "Fast very robust methods for the detection of multiple outliers." Journal of the American Statistical Association 89(428), 1329-1339.
Atkinson, A. C. and T.-C. Cheng. 2000. "On robust linear regression with incomplete data." Computational Statistics & Data Analysis 33(4), 361-380.
Atkinson, A. C. and M. Riani. 2001. "Regression diagnostics for binomial data from the forward search." The Statistician 50(1), 63-78.
Atkinson, A. C. and M. Riani. 2006. "Distribution theory and simulations for tests of outliers in regression." Journal of Computational and Graphical Statistics 15(2), 460-476.
Barrett, B. E. and J. B. Gray. 1997. "Leverage, residual, and interaction diagnostics for subsets of cases in least squares regression." Computational Statistics and Data Analysis 26(1), 39-52.
Cook, R. D. and S. Weisberg. 1982. Residuals and influence in regression, Chapman & Hall/CRC, London.
Dimitras, A. I., S. H. Zanakis and C. Zopoundis. 1996. "A survey of business failures with an emphasis on prediction methods and industrial application." European Journal of Operational Research 90(3), 487-513.
Doumpos, M. and C. Zopounidis. 2002. "Multi-criteria classification and sorting methods: a literature review." European Journal of Operational Research 138(2), 229-246.
Doumpos, M., K. Kosmidou, G. Baourakis and C. Zopounidis. 2002. "Credit risk assessment using a multicriteria hierarchical discrimination approach: a comparative analysis." European Journal of Operational Research 138(2), 392-412.
Flores, E. and J. Garrido. 2001. "Robust logistic regression for insurance risk classification." Universidad Carlos III, Departamento de Economía de la Empresa, Business Economics Working Paper wb016413.
Haslett, J. 1999. "A simple derivation of deletion diagnostic results for the general linear model with correlated errors." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61(3), 603-609.
Lennox, C. 1999. "Identifying failing companies: a reevaluation of the logit, probit and DA approaches." Journal of Economics and Business 51(4), 347-364.
Levine, R. and S. Zervos. 1998. "Stock markets, banks, and growth." American Economic Review 88(3), 537-558.
Ohlson, J. T. 1980. "Financial ratios and the probabilistic prediction of bankruptcy." Journal of Accounting Research 18(1), 109-131.
Piramuthu, S. 1999. "Financial credit-risk evaluation with neural and neurofuzzy systems." European Journal of Operational Research 112(2), 310-321.
Riani, M. and A. Atkinson. 2007. "Fast calibrations of the forward search for testing multiple outliers in regression." Advances in Data Analysis and Classification 1(2), 123-141.
Rousseeuw, P. J. 1983. "Regression techniques with high breakdown point." The Institute of Mathematical Statistics Bulletin 12, 155.
Rousseeuw, P. J. 1984. "Least median of squares regression." Journal of the American Statistical Association 79(388), 871-880.
Rousseeuw, P. J. and A. Christmann. 2003. "Robustness against separation and outliers in logistic regression." Computational Statistics & Data Analysis 43(3), 315-332.
Rousseeuw, P. J. and V. J. Yohai. 1984. "Robust regression by means of S-estimators," in Robust and nonlinear time series analysis, Vol. 26, W. H. Franke and R. D. Martin (Eds.). Springer, New York, pp. 256-272.
Sobehart, J. R. and S. C. Keenan. 2004. "Performance evaluation for credit spread and default risk models," in Credit risk: models and management, Second Edition, D. Shimko (Ed.). Risk Books, London, pp. 275-305.
Sobehart, J. R., S. C. Keenan and R. M. Stein. 2000. "Rating methodology: benchmarking quantitative default risk models: a validation methodology." Moody's Investors Service, Global Credit Research, New York.
Stein, R. M. 2002. "Benchmarking default prediction models: pitfalls and remedies in model validation." Moody's KMV, Technical Report No. 030124.
Westgaard, S. and N. Wijst. 2001. "Default probabilities in a corporate bank portfolio: a logistic model approach." European Journal of Operational Research 135(2), 338-349.
Wu, C. and X. M. Wang. 2000. "A neural network approach for analyzing small business lending decisions." Review of Quantitative Finance and Accounting 15(3), 259-276.
Zhu, A., M. Ash and R. Pollin. 2002. "Stock market liquidity and economic growth: a critical appraisal of the Levine/Zervos model." International Review of Applied Economics 18(1), 1-8.
Chapter 63
Term Structure of Default-Free and Defaultable Securities: Theory and Empirical Evidence Hai Lin and Chunchi Wu
Abstract In this chapter, we survey modern term structure models for pricing fixed income securities and their derivatives. We first introduce bond pricing theory within the dynamic term structure model (DTSM) framework. This framework provides a general modeling structure in which most popular term structure models are nested, including affine, quadratic, regime-switching, jump-diffusion, and stochastic volatility models. We then review major studies on default-free bonds, defaultable bonds, interest rate swaps, and credit default swaps. We outline the key features of these models and evaluate their empirical performance. Finally, we conclude this chapter by summarizing important findings and suggesting directions for future research.

Keywords Risk-neutral and physical measures · Latent factors · Default intensity · Recovery · Liquidity · Counterparty risk · Default correlation · Yield spreads · Structural and reduced-form models
63.1 Introduction

This article provides a survey of term structure models designed for pricing fixed income securities and their derivatives.¹ The past several decades have witnessed a rapid development in the fixed income markets. A number of new fixed-income instruments have been introduced successfully into the financial market, including, to mention just a few, strips, debt warrants, put bonds, commercial mortgage-backed securities, payment-in-kind debentures, zero-coupon convertibles, interest rate futures and options, credit default swaps, and swaptions. The size of the fixed income market has greatly expanded; the total value of fixed income assets is about two-thirds of the market value of all outstanding securities.²

From the investment perspective, it is important to understand how fixed income securities are priced. The term structure of interest rates plays a key role in pricing fixed income securities. Not surprisingly, a vast literature has been devoted to understanding the stochastic behavior of the term structure of interest rates, the pricing mechanism of fixed income markets, and the spreads between different fixed income securities. Past research generally focuses on (1) modeling the term structure of interest rates and yield spreads; (2) providing empirical evidence; and (3) applying the theory to the pricing of fixed income instruments and risk management. As such, our review centers on alternative models of the term structure of interest rates, their tractability, empirical performance, and applications.

We begin with the basic definitions and notations in Sect. 63.2, where we clarify concepts of the term structure of interest rates that are easily misunderstood. Section 63.3 introduces bond pricing theory within the dynamic term structure model (DTSM) framework. This framework provides a general modeling structure in which most of the popular term structure models are nested; the discussion thus helps identify the primary ingredients that categorize different DTSMs, namely the risk-neutral distribution of the state variables and the mapping function between these state variables and the instantaneous interest rate. Section 63.4 provides a literature review of the studies on default-free bonds. Several widely used continuous-time DTSMs are reviewed there, including affine, quadratic, regime-switching, jump-diffusion, and stochastic volatility (SV) models. We conclude that section with a discussion of the empirical performance of these DTSMs, where we address several open issues, including the expectation puzzle, the linearity of state variables, the advantages of multifactor and nonlinear models, and their implications for pricing and risk management.

¹ For a survey on term structure models, see Dai and Singleton (2002b), Dai and Singleton (2003), and Maes (2004).
² See The 2008 Statistical Abstract, U.S. Census Bureau.

H. Lin, Department of Finance, School of Economics & Wang Yanan Institute of Studies in Economics, Xiamen University, Xiamen, 361005, China
C. Wu, Department of Finance, Robert J. Trulaske, Sr. College of Business, University of Missouri-Columbia, Columbia, MO 65211, USA; e-mail: [email protected]
The studies of defaultable bonds are explored in Sect. 63.5. We review both structural and reduced-form models, with particular attention given to the latter. Several important issues in reduced-form models are addressed, including the specification of recovery rates, default intensity, coupon payments, other factors such as liquidity and taxes, and correlated defaults. Since it is convenient to have a closed-form pricing formula, it is important to evaluate the tradeoff between analytical tractability and model complexity. Major empirical issues relate to uncovering the components of yield spreads and answering the question of whether the factors are latent or observable. Section 63.6 reviews the studies on two popular interest rate derivatives: interest rate swaps and credit default swaps. Here we present the pricing formulas for interest rate swaps and credit default swaps based on risk-neutral pricing theory. Other risk factors, such as counterparty risk and liquidity risk, are then introduced into the pricing formula. Following this, we review important empirical work on the determinants of interest rate swap spreads and credit default swap spreads. Section 63.7 concludes the chapter by providing a summary of the literature and directions for future research, including (1) the economic significance of DTSM specification for pricing and risk management; (2) the difference between interest rate dynamics in the risk-neutral and physical measures; (3) the decomposition of yield spreads; and (4) the pricing of credit risk with correlated factors.
63.2 Definitions and Notations

63.2.1 Zero-Coupon Bonds

A default-free zero-coupon bond (or discount bond) with maturity date T and face value 1 is a claim that has a non-random payoff of 1 for sure at time T and no other payoff before maturity. The price of a zero-coupon bond with maturity date T at time 0 ≤ t ≤ T is denoted by D(t,T).

63.2.2 Term Structure of Interest Rates

Consider a zero-coupon bond with a fixed maturity date T. The continuously compounded yield on this bond is

r(t,T) = −ln D(t,T)/(T − t)   (63.1)

The zero-coupon yield curve, or term structure of interest rates, at time t is the function

τ ↦ r(t, t + τ): [0, ∞) → ℝ,   (63.2)

which maps time to maturity into the yield of the zero-coupon bond with that maturity at time t. The price of the zero-coupon bond can be calculated from its yield by

D(t,T) = exp[−(T − t) r(t,T)]   (63.3)

63.2.3 Instantaneous Interest Rate

The instantaneous interest rate at time t, r_t, is defined as

r_t = lim_{T→t} −ln D(t,T)/(T − t)   (63.4)

63.2.4 Forward Rate

The forward rate at time t, f_t^{T1→T2}, is the interest rate between two future time points T1 and T2 that is settled at time t. Specifically,

f_t^{T1→T2} = [ln D(t,T1) − ln D(t,T2)]/(T2 − T1)   (63.5)

Remark: if T1 = t, then f_t^{t→T2} = r(t,T2).

63.2.5 Instantaneous Forward Rate

The instantaneous forward rate at time t with effective date T, f(t,T), is defined as

f(t,T) = lim_{T2→T} f_t^{T→T2} = lim_{T2→T} [ln D(t,T) − ln D(t,T2)]/(T2 − T) = −∂ln D(t,T)/∂T   (63.6)
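As a small numerical illustration of Equations (63.1) through (63.6), the following Python sketch computes the zero yield and the instantaneous forward rate from a discount function; the toy discount curve is a hypothetical example, not taken from the chapter.

import numpy as np

def zero_yield(D, t, T):
    """Continuously compounded yield r(t,T) = -ln D(t,T)/(T - t), Equation (63.1)."""
    return -np.log(D(t, T)) / (T - t)

def inst_forward(D, t, T, h=1e-5):
    """Instantaneous forward f(t,T) = -d ln D(t,T)/dT, Equation (63.6),
    approximated by a central finite difference."""
    return -(np.log(D(t, T + h)) - np.log(D(t, T - h))) / (2 * h)

# Hypothetical discount function with a linearly rising yield curve
D = lambda t, T: np.exp(-(0.03 + 0.01 * (T - t)) * (T - t))

print(zero_yield(D, 0.0, 5.0))    # about 0.08
print(inst_forward(D, 0.0, 5.0))  # about 0.13, steeper than the yield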
63.3 Bond Pricing in the Dynamic Term Structure Model Framework

63.3.1 Spot Rate Approach

Let the instantaneous interest rate r_t be a deterministic function of state variables Y_t and time t, where Y_t is a K×1 vector:

r_t = r(Y_t, t)   (63.7)
and assume the risk-neutral dynamics of Y_t follow a diffusion process,

dY_t = μ(Y_t, t) dt + σ(Y_t, t) dW_t^Q   (63.8)

where W_t^Q is a K×1 vector of standard, independent Brownian motions under the risk-neutral measure Q, and μ(Y,t) (a K×1 vector) and σ(Y,t) (a K×K matrix) are deterministic functions of Y and t. By risk-neutral pricing theory,³ the price of a zero-coupon bond with maturity date T and face value 1 is given by

D(t,T) = E_t^Q[exp(−∫_t^T r_s ds)]   (63.9)

where E_t^Q represents the conditional expectation under the risk-neutral measure Q. D(t,T) is functionally related to the K stochastic factors Y_t:

D(t,T) = D(t, T, Y_t)   (63.10)

By applying the discounted Feynman-Kac theorem⁴ to Equation (63.9), it can be shown that D(t,T) must satisfy the partial differential equation (PDE)⁵

∂D/∂t + μ(Y,t)^T ∂D/∂Y + ½ Trace[σ(Y,t)σ(Y,t)^T ∂²D/∂Y∂Y^T] = rD   (63.11)

with the boundary condition D(T,T) = 1 for all r_T. The superscript T represents the transpose of a vector (or matrix). In order to solve the PDE in Equation (63.11) with its boundary condition, we need to specify the diffusion process of the state variables Y_t in the risk-neutral measure and the functional form r(Y_t, t), which together determine the diffusion process of the instantaneous interest rate in the risk-neutral measure. Models involving such diffusion processes are called dynamic term structure models (DTSMs).

63.3.2 Forward Rate Approach

If we know the initial forward rate curve f(t,T) for all values of 0 ≤ t ≤ T ≤ T̄, we can recover D(t,T) by

D(t,T) = exp(−∫_t^T f(t,v) dv)   (63.12)

Heath et al. (1992) propose the following forward rate process⁶

df(t,T) = α(t,T) dt + σ(t,T) dW_t   (63.13)

and find that under the no-arbitrage condition:

(i) the forward rate evolves according to

df(t,T) = σ(t,T) σ*(t,T) dt + σ(t,T) dW_t^Q   (63.14)

where σ*(t,T) = ∫_t^T σ(t,v) dv, and

(ii) the zero-coupon bond price evolves according to

dD(t,T) = r_t D(t,T) dt − σ*(t,T) D(t,T) dW_t^Q   (63.15)

³ The Fundamental Theorem of Finance states that under the no-arbitrage condition, there exists an equivalent martingale (risk-neutral) measure Q under which any security price scaled by the money market account is a martingale. Such a measure is unique if the market is both arbitrage-free and complete. See Harrison and Kreps (1979), Duffie (1996), and Cochrane (2001).
⁴ For a detailed description of the discounted Feynman-Kac theorem, please refer to Shreve (2004).
⁵ See also Dai and Singleton (2003).
⁶ Shreve (2004) shows that every DTSM driven by Brownian motion is an HJM model.
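Equation (63.9) can always be evaluated by Monte Carlo simulation, even when no closed form exists. The following sketch does so for a short rate with Vasicek dynamics; the dynamics and parameter values are illustrative assumptions, not part of the general framework.

import numpy as np

def mc_zero_coupon(r0, kappa, theta, sigma, T, n_paths=20000, n_steps=200, seed=1):
    """Monte Carlo estimate of D(0,T) = E^Q[exp(-int_0^T r_s ds)], Equation (63.9),
    with the short rate simulated from Vasicek dynamics as an example."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    r = np.full(n_paths, r0)
    integral = np.zeros(n_paths)
    for _ in range(n_steps):
        integral += r * dt                       # accumulate int r_s ds
        dW = rng.standard_normal(n_paths) * np.sqrt(dt)
        r = r + kappa * (theta - r) * dt + sigma * dW
    return np.exp(-integral).mean()

print(mc_zero_coupon(r0=0.03, kappa=0.5, theta=0.05, sigma=0.01, T=5.0))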
63.4 Dynamic Term Structure Models

In this section, we review the DTSMs commonly used in the pricing of default-free bonds. We begin with the affine DTSMs and then turn to the nonlinear DTSMs.

63.4.1 Affine DTSMs

Affine DTSMs are characterized by the condition that the yield of the zero-coupon bond is an affine (linear plus constant) function of the state variables; that is,

r(t,T) = A(T − t) + B(T − t)^T Y_t   (63.16)
63.4.1.1 One-Factor Affine DTSMs

If K = 1, the diffusion process for r_t is given by⁷

dr_t = μ(r_t, t) dt + σ(r_t, t) dW_t^Q,   (63.17)

which is the one-factor DTSM. Some of the popular one-factor affine models are summarized below.

1. Vasicek (1977) Model. In this model r_t follows the diffusion process with μ(r_t, t) = κ(θ − r_t) and σ(r_t, t) = σ. In this case, D(t,T) is given by

D(t,T) = exp(A(T − t) − B(T − t) r_t),  T ≥ t,   (63.18)

where A(τ) = (θ − σ²/(2κ²))[B(τ) − τ] − σ²B(τ)²/(4κ) and B(τ) = (1 − e^{−κτ})/κ. The instantaneous forward rate is given by

f(t,T) = e^{−κ(T−t)} r_t + θ(1 − e^{−κ(T−t)}) − (σ²/(2κ²))(1 − e^{−κ(T−t)})²,   (63.19)

and v(t,T) = (σ²/(2κ))(1 − e^{−2κ(T−t)}) is the conditional variance of r_T.

2. CIR (Cox et al. (1985)) Model. In this model, r_t follows the diffusion process with μ(r_t, t) = κ(θ − r_t) and σ(r_t, t) = σ√r_t. The zero-coupon bond price D(t,T) is given by

D(t,T) = A(T − t) exp(−B(T − t) r_t),   (63.20)

where A(τ) = [2γe^{(κ+γ)τ/2} / ((κ + γ)(e^{γτ} − 1) + 2γ)]^{2κθ/σ²}, B(τ) = 2(e^{γτ} − 1) / ((κ + γ)(e^{γτ} − 1) + 2γ), and γ = √(κ² + 2σ²).

3. Hull and White (1993) Model. The Hull and White (1993) model is a generalization of the Vasicek (1977) model that allows the parameters to vary over time. In this model, r_t follows the diffusion process with μ(r_t, t) = a(t) − b(t) r_t and σ(r_t, t) = σ(t), where a(t), b(t), and σ(t) are non-random functions of t. D(t,T) is given by

D(t,T) = exp(A(t,T) − B(t,T) r_t),   (63.21)

where A(t,T) = ∫_t^T [½σ²(s)B²(s,T) − a(s)B(s,T)] ds and

B(t,T) = ∫_t^T exp(−∫_t^s b(u) du) ds.   (63.22)

63.4.1.2 Multi-Factor Affine DTSMs

One can develop multi-factor affine DTSMs from the above one-factor examples by simply assuming that r_t = δ_0 + Σ_{i=1}^K δ_i Y_{it}, with each Y_i following one of the preceding one-factor affine DTSMs.⁸ In the following, we first illustrate this type of model with two-factor models and then generalize to the multi-factor case.

1. Canonical two-factor Vasicek model⁹

dY_{1t} = −κ_1 Y_{1t} dt + dW_{1t}^Q
dY_{2t} = −(κ_{21} Y_{1t} + κ_2 Y_{2t}) dt + dW_{2t}^Q
r_t = δ_0 + δ_1 Y_{1t} + δ_2 Y_{2t}   (63.23)

where W_{1t}^Q and W_{2t}^Q are independent standard Brownian motions under the risk-neutral measure. Given the stochastic processes of the factors, the price of the zero-coupon bond D(t,T) is given by

D(t,T) = exp(−A(T − t) − B_1(T − t)Y_{1t} − B_2(T − t)Y_{2t}),   (63.24)

where B_1(τ), B_2(τ), and A(τ) satisfy the following ordinary differential equations (ODEs),

B_1′(τ) = −κ_1 B_1(τ) − κ_{21} B_2(τ) + δ_1
B_2′(τ) = −κ_2 B_2(τ) + δ_2
A′(τ) = −½B_1²(τ) − ½B_2²(τ) + δ_0   (63.25)

with the boundary conditions B_1(0) = B_2(0) = A(0) = 0.

2. Canonical two-factor CIR model

dY_{1t} = (θ_1 − κ_{11}Y_{1t} − κ_{12}Y_{2t}) dt + √Y_{1t} dW_{1t}^Q
dY_{2t} = (θ_2 − κ_{21}Y_{1t} − κ_{22}Y_{2t}) dt + √Y_{2t} dW_{2t}^Q
r_t = δ_0 + δ_1 Y_{1t} + δ_2 Y_{2t}   (63.26)

Under some regularity conditions,

D(t,T) = exp(−A(T − t) − B_1(T − t)Y_{1t} − B_2(T − t)Y_{2t}),   (63.27)

where A(τ), B_1(τ), and B_2(τ) satisfy the following ODEs,

B_1′(τ) = −κ_{11}B_1(τ) − κ_{21}B_2(τ) − ½B_1²(τ) + δ_1
B_2′(τ) = −κ_{12}B_1(τ) − κ_{22}B_2(τ) − ½B_2²(τ) + δ_2
A′(τ) = θ_1 B_1(τ) + θ_2 B_2(τ) + δ_0   (63.28)

and the boundary conditions A(0) = 0, B_1(0) = 0, and B_2(0) = 0. The superscript ′ denotes the first-order derivative.

3. Two-factor mixed model

dY_{1t} = (θ − κ_1 Y_{1t}) dt + √Y_{1t} dW_{1t}^Q
dY_{2t} = −κ_2 Y_{2t} dt + σ_{21}√Y_{1t} dW_{1t}^Q + √(α + βY_{1t}) dW_{2t}^Q
r_t = δ_0 + δ_1 Y_{1t} + δ_2 Y_{2t}   (63.29)

Then,

D(t,T) = exp(−A(T − t) − B_1(T − t)Y_{1t} − B_2(T − t)Y_{2t}),   (63.30)

where A(τ), B_1(τ), and B_2(τ) satisfy the following ODEs,

B_1′(τ) = −κ_1 B_1(τ) − ½B_1(τ)² − σ_{21}B_1(τ)B_2(τ) − ½(σ_{21}² + β)B_2(τ)² + δ_1
B_2′(τ) = −κ_2 B_2(τ) + δ_2
A′(τ) = θB_1(τ) − ½αB_2(τ)² + δ_0   (63.31)

and the boundary conditions A(0) = 0, B_1(0) = 0, and B_2(0) = 0.

4. Dai and Singleton (2000) examine the multi-factor affine DTSMs with the following structure:

dY_t = K(Θ − Y_t) dt + Σ√(S_t) dW_t^Q,  r(Y_t, t) = δ_0 + δ^T Y_t,   (63.32)

where S_t is a diagonal matrix with [S_t]_{ii} = α_i + Y_t^T β_i. Let B be the K×K matrix whose i-th column is β_i. By restricting the parameter vector (δ_0, δ, K, Θ, Σ, α, B), they construct admissible affine models that give a unique, well-defined solution for D(t,T), equal to exp(−A(T − t) − B(T − t)^T Y_t), through the following system of ODEs:

B′(τ) = −K^T B(τ) − ½ Σ_{i=1}^K [Σ^T B(τ)]_i² β_i + δ
A′(τ) = Θ^T K^T B(τ) − ½ Σ_{i=1}^K [Σ^T B(τ)]_i² α_i + δ_0   (63.33)

with the initial conditions A(0) = 0 and B(0) = 0_{K×1}.

5. Duffie and Kan (1996) provide sufficient conditions for affine DTSMs that can handle general correlated affine diffusions:¹⁰ (i) μ(Y,t) is an affine function of Y_t, μ(Y,t) = a + bY_t, where a is a K×1 vector and b is a K×K matrix; and (ii) σ(Y,t)σ(Y,t)^T is an affine function of Y_t,

σ(Y,t)σ(Y,t)^T = h_0 + Σ_{j=1}^K h_1^j Y_{jt},   (63.34)

where h_0 and h_1^j, j = 1, 2, ..., K, are K×K matrices.

⁷ Throughout this paper, W_t^Q represents the standard Brownian motion under the risk-neutral measure Q and W_t stands for the standard Brownian motion under the physical measure.
⁸ See Stambaugh (1988), Longstaff and Schwartz (1992), Chen and Scott (1993), Pearson and Sun (1994), and Duffie and Singleton (1997) for the multi-factor version of the CIR model.
⁹ Please refer to Shreve (2004) for a description of the generalized two-factor Vasicek model.
¹⁰ See Duffie et al. (2003) for sufficient and necessary conditions.

63.4.2 Quadratic DTSMs

Ahn et al. (2002) provide a general specification of quadratic DTSMs. The state variables Y_t are assumed to follow multivariate Gaussian processes with mean-reverting properties in the risk-neutral measure:

dY_t = [μ + ξY_t] dt + Σ dW_t^Q,   (63.35)

where μ is a K×1 vector, ξ and Σ are K×K matrices, and W_t^Q is a K-dimensional vector of standard Brownian motions that are mutually independent under the risk-neutral measure Q. The instantaneous interest rate r_t is a quadratic function of the state variables,

r(Y,t) = δ_0 + δ^T Y_t + Y_t^T Ψ Y_t,   (63.36)

where δ_0 is a constant, δ is a K×1 vector, Ψ is a K×K positive semi-definite matrix, and δ_0 − ¼δ^T Ψ^{−1} δ ≥ 0. Applying the discounted Feynman-Kac theorem to bond pricing, we obtain

D(t,T) = exp(A(T − t) + B(T − t)^T Y_t + Y_t^T C(T − t) Y_t),   (63.37)

where A(τ), B(τ), and C(τ) satisfy the ODEs

C′(τ) = 2C(τ)ΣΣ^T C(τ) + C(τ)ξ + ξ^T C(τ) − Ψ
B′(τ) = 2C(τ)ΣΣ^T B(τ) + ξ^T B(τ) + 2C(τ)μ − δ
A′(τ) = μ^T B(τ) + ½B(τ)^T ΣΣ^T B(τ) + Trace[ΣΣ^T C(τ)] − δ_0   (63.38)

with the initial conditions A(0) = 0, B(0) = 0_{K×1}, and C(0) = 0_{K×K}.
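As a concrete check on the closed-form expressions above, the following sketch prices a Vasicek zero-coupon bond via Equation (63.18); the parameter values are hypothetical, and the result can be compared against the Monte Carlo sketch given after Equation (63.15).

import numpy as np

def vasicek_bond_price(r, tau, kappa, theta, sigma):
    """Closed-form D(t,T) = exp(A(tau) - B(tau) r) for the Vasicek model, Equation (63.18)."""
    B = (1.0 - np.exp(-kappa * tau)) / kappa
    A = (theta - sigma**2 / (2 * kappa**2)) * (B - tau) - sigma**2 * B**2 / (4 * kappa)
    return np.exp(A - B * r)

print(vasicek_bond_price(r=0.03, tau=5.0, kappa=0.5, theta=0.05, sigma=0.01))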
The yield of the zero-coupon bond is a quadratic function of the state variables,

r(t,T) = −[A(T − t) + B(T − t)^T Y_t + Y_t^T C(T − t) Y_t]/(T − t)   (63.39)

The above is an example of a nonlinear model. Other examples of quadratic DTSMs include:

1. Beaglehole and Tenney (1991) Model: δ_0 = 0, δ = 0_{K×1}, and Ψ, ξ, and Σ are diagonal matrices.
2. Longstaff (1989) Model: δ_0 = 0, δ = 0_{K×1}, Ψ and Σ are diagonal matrices, ξ = 0_{K×K}, and μ ≠ 0_{K×1}. The key feature of this model is that the state variables are not mean reverting.
3. Constantinides (1992) Model: δ = 0_{K×1}, Ψ = I_{K×K}, and ξ and Σ are diagonal matrices.
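Unlike the affine case, the quadratic ODEs generally require numerical integration. The following one-factor sketch integrates the scalar version of the system (63.38) with SciPy; the sign conventions follow the reconstructed ODEs above, and all parameter values are hypothetical.

import numpy as np
from scipy.integrate import solve_ivp

def quadratic_bond_price(y0, tau, mu, xi, sigma, delta0, delta, psi):
    """One-factor quadratic DTSM: D = exp(A + B*y + C*y^2), scalar form of
    the ODE system (63.38), integrated numerically."""
    def odes(t, x):
        A, B, C = x
        dC = 2 * sigma**2 * C**2 + 2 * xi * C - psi
        dB = 2 * sigma**2 * C * B + xi * B + 2 * mu * C - delta
        dA = mu * B + 0.5 * sigma**2 * B**2 + sigma**2 * C - delta0
        return [dA, dB, dC]
    sol = solve_ivp(odes, (0.0, tau), [0.0, 0.0, 0.0], rtol=1e-9, atol=1e-11)
    A, B, C = sol.y[:, -1]
    return np.exp(A + B * y0 + C * y0**2)

print(quadratic_bond_price(y0=0.1, tau=5.0, mu=0.02, xi=-0.3, sigma=0.05,
                           delta0=0.01, delta=0.0, psi=1.0))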
63.4.3 DTSMs with Jumps

The public announcement of important economic news and sudden changes of monetary policy typically have a jump impact on interest rates. A number of researchers (see, for example, Das (2002) and Johannes (2004)) find that most classical diffusion processes fail to explain the leptokurtosis of interest rates and suggest the use of jumps in DTSMs. Suppose that r_t = r(Y_t, t) is a function of a jump-diffusion process Y with the risk-neutral dynamics

dY_t = μ(Y_t, t) dt + σ(Y_t, t) dW_t^Q + J_t dZ_t   (63.40)

where Z_t is a Poisson process with risk-neutral intensity λ_t, and the jump size J_t follows the distribution ν_t(x) ≡ ν(x; Y_t, t). Baz and Das (1996) extend the Vasicek (1977) model to consider the jump behavior of interest rates. Ahn and Thompson (1988) extend the CIR model to the case of state variables following a square-root process with jumps. Duffie et al. (2000) obtain analytic expressions for D(t,T) with affine jump-diffusion processes. Piazzesi (2001) develops a class of affine-quadratic jump-diffusion models and links the jumps to the resetting of target interest rates by the Federal Reserve Bank.
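A path of a jump-augmented short rate in the spirit of Equation (63.40) can be simulated with an Euler scheme; here the Vasicek diffusion and normally distributed jump sizes are illustrative assumptions, and multiple jumps within one step are approximated by a single size draw.

import numpy as np

def simulate_jump_vasicek(r0, kappa, theta, sigma, lam, jump_mu, jump_sd,
                          T=1.0, n_steps=250, seed=7):
    """Euler scheme for dr = kappa(theta - r)dt + sigma dW + J dZ,
    with Poisson arrival intensity lam and normal jump sizes (assumptions)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    r = np.empty(n_steps + 1)
    r[0] = r0
    for i in range(n_steps):
        jump = rng.poisson(lam * dt) * rng.normal(jump_mu, jump_sd)
        r[i + 1] = (r[i] + kappa * (theta - r[i]) * dt
                    + sigma * np.sqrt(dt) * rng.standard_normal() + jump)
    return r

path = simulate_jump_vasicek(0.03, 0.5, 0.05, 0.01, lam=5.0, jump_mu=0.0, jump_sd=0.002)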
63.4.4 DTSMs with Regime Switching

The processes that govern DTSMs are very likely to change over economic cycles, and an extensive empirical literature suggests regime-switching models for DTSMs (see, for example, Sanders and Unal 1988; Gray 1996; Garcia and Perron 1996; Ang and Bekaert 2002). Suppose that there are S + 1 possible states (regimes) evolving according to a conditional Markov chain s_t: Ω → {0, 1, ..., S} with an (S+1)×(S+1) transition probability matrix P_t whose rows each sum to one; P_t^{ij} dt is the probability of moving from regime i to regime j over the next interval dt. The state variables Y_t in the risk-neutral measure follow the process

dY_t = μ^j(Y_t, t) dt + σ^j(Y_t, t) dW_t^Q   (63.41)

where j indexes regime j. Let z_t^j = 1_{s_t = j}, j = 0, 1, ..., S, be the regime indicator functions. Then μ(s_t; Y_t, t) = Σ_{j=0}^S z_t^j μ^j(Y_t, t), σ(s_t; Y_t, t) = Σ_{j=0}^S z_t^j σ^j(Y_t, t), and D(t,T) = Σ_{j=0}^S z_t^j D^j(t,T), where D^j(t,T) = D(s_t = j; Y_t, t, T). Bansal and Zhou (2002) develop a discrete-time regime-switching model in which the short interest rate and the market prices of risk are subject to discrete regime shifts. Dai and Singleton (2003) propose a DTSM with regime switching that has a closed-form solution for the zero-coupon bond price, in which the dynamics within each regime i in the risk-neutral measure are affine with regime-dependent coefficients.
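To illustrate the structure of Equation (63.41), the following sketch simulates a short rate whose mean-reversion parameters switch with a two-state Markov chain; the regime parameters and transition matrix are hypothetical.

import numpy as np

def simulate_regime_switching_rate(T=5.0, n_steps=1250, seed=11):
    """Short-rate path with two regimes: each regime j has its own
    (kappa, theta, sigma), and the regime evolves as a Markov chain."""
    rng = np.random.default_rng(seed)
    params = {0: (0.3, 0.03, 0.008), 1: (1.0, 0.07, 0.020)}  # calm vs. turbulent
    P = np.array([[0.98, 0.02],     # per-step transition probabilities
                  [0.05, 0.95]])
    dt = T / n_steps
    s, r = 0, 0.03
    path = [r]
    for _ in range(n_steps):
        kappa, theta, sigma = params[s]
        r += kappa * (theta - r) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        s = rng.choice(2, p=P[s])   # draw next regime
        path.append(r)
    return np.array(path)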
63.4.5 DTSMs with Stochastic Volatility

The stochastic volatility (SV) model introduces an additional factor, the volatility of r_t, in an attempt to explain the instantaneous interest rate dynamics. Examples in this category are:

1. Longstaff and Schwartz (1992) SV model:

dr_t = [αγ + βη − ((βδ − αν)/(β − α)) r_t − ((ν − δ)/(β − α)) V_t] dt + α√((βr_t − V_t)/(α(β − α))) dW_{1t} + β√((V_t − αr_t)/(β(β − α))) dW_{2t}
dV_t = [α²γ + β²η − (αβ(δ − ν)/(β − α)) r_t − ((βν − αδ)/(β − α)) V_t] dt + α²√((βr_t − V_t)/(α(β − α))) dW_{1t} + β²√((V_t − αr_t)/(β(β − α))) dW_{2t}   (63.44)

where α, β, γ, η, δ, and ν are positive constants and V_t is the instantaneous variance of changes in r_t.

2. Andersen and Lund (1997) and Ball and Torous (1999) SV models:

dr_t = κ_1(μ − r_t) dt + σ_t r_t^γ dW_{1t},  γ > 0
d log σ_t² = κ_2(α − log σ_t²) dt + ξ dW_{2t}   (63.45)

3. Bali (2000) SV model (Equation (63.46)), which augments a one-factor short-rate diffusion with a GARCH-type stochastic volatility process.

4. Collin-Dufresne and Goldstein (2002) SV model:¹¹

dv_t = k_v(v̄ − v_t) dt + σ_v√v_t dW_{1t}^Q
dθ_t = [k_θ(θ̄ − θ_t) + k_{θr}(r̄ − r_t) + k_{θv}(v̄ − v_t)] dt + σ_{θv}√v_t dW_{1t}^Q + σ_{θr}√(α_r + v_t) dW_{2t}^Q + σ_θ√(ζ² + βv_t) dW_{3t}^Q
dr_t = [k_r(r̄ − r_t) + k_{rθ}(θ̄ − θ_t) + k_{rv}(v̄ − v_t)] dt + σ_{rv}√v_t dW_{1t}^Q + √(α_r + v_t) dW_{2t}^Q + σ_{rθ}√(ζ² + βv_t) dW_{3t}^Q   (63.47)

¹¹ It should be noted that the Longstaff and Schwartz (1992) and Collin-Dufresne and Goldstein (2002) SV models are also nested within the affine DTSMs, since their zero-coupon bond yields are also affine in the state variables.

63.4.6 Other Non-affine DTSMs

Besides the quadratic DTSMs and the DTSMs with jumps, regime switching, and stochastic volatility, there are other non-affine DTSMs in which μ(Y_t, t) or σ(Y_t, t) does not satisfy the conditions for affine DTSMs given by Duffie and Kan (1996) and Duffie et al. (2003). Examples include:

1. Ahn and Gao (1999) nonlinear model:

dr_t = (α_1 + α_2 r_t + α_3 r_t²) dt + √(α_4 + α_5 r_t + α_6 r_t³) dW_t   (63.48)

2. Ait-Sahalia (1996) nonlinear model:

dr_t = (α_{−1} r_t^{−1} + α_0 + α_1 r_t + α_2 r_t²) dt + σ r_t^ρ dW_t   (63.49)

3. Black et al. (1990) and Black and Karasinski (1991) lognormal model:¹²

d log r_t = (θ_t − φ_t log r_t) dt + σ_t dW_t^Q   (63.50)

¹² Peterson et al. (1998) extend the lognormal model to the two-factor case.

The other related but distinct classes of DTSMs fall into two categories. The first class comprises DTSMs built on a selection of macroeconomic variables. As Maes (2004) points out, there is a great incentive to investigate the relationship between the dynamics of macroeconomic variables and the term structure, because there is strong evidence that the term structure predicts movements in macroeconomic activity (see, for example, Estrella and Hardouvelis (1991), Estrella and Mishkin (1996, 1997, 1998)). One important issue in these models is to interpret the economic meaning underlying the latent factors in terms of observed and unobserved macroeconomic variables such as inflation and output gaps. Dewachter et al. (2006) study a continuous-time joint model of the macroeconomy and the term structure of interest rates. Ang and Piazzesi (2003) and Dewachter and Lyrio (2006) find that macroeconomic factors clearly affect the short end of the term structure. Kozicki and Tinsley (2001, 2002) find that missing factors may be related to agents' long-run inflation expectations. Dewachter and Lyrio (2006) provide a macroeconomic interpretation of the latent factors in DTSMs: the "level" factor represents agents' long-run inflation expectation; the "slope" factor captures temporary business cycle conditions; and the "curvature" factor represents a clear independent monetary policy factor. Moreover, Wu (2006) develops a general equilibrium model of the term structure with macroeconomic factors.

The second class views the whole term structure as the state variables and models its dynamics accordingly. Such high-dimensional models are developed by Kennedy (1994) as "Brownian sheets," Goldstein (2000) as "random fields," and Santa-Clara and Sornette (2001) as "stochastic string shocks."
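The SV dynamics in Equation (63.45) are straightforward to simulate; the following Euler-scheme sketch uses hypothetical parameter values and a crude positivity guard on the rate.

import numpy as np

def simulate_andersen_lund(r0=0.05, kappa1=0.3, mu=0.05, gamma=0.5,
                           kappa2=1.0, alpha=np.log(0.01**2), xi=0.3,
                           T=5.0, n_steps=1250, seed=3):
    """Euler scheme for the Andersen-Lund SV model, Equation (63.45):
    dr = kappa1(mu - r)dt + sigma_t r^gamma dW1,
    d log sigma^2 = kappa2(alpha - log sigma^2)dt + xi dW2."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    r, logv = r0, alpha
    path = np.empty(n_steps + 1)
    path[0] = r0
    for i in range(n_steps):
        dW1, dW2 = rng.standard_normal(2) * np.sqrt(dt)
        sigma = np.exp(0.5 * logv)
        r = max(r + kappa1 * (mu - r) * dt + sigma * r**gamma * dW1, 1e-8)
        logv = logv + kappa2 * (alpha - logv) * dt + xi * dW2
        path[i + 1] = r
    return path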
63.4.7 Empirical Performance

Because bond pricing is an important issue and there are so many DTSMs, a large number of studies have evaluated different DTSMs and compared their empirical performance in search of the best model. In the following, we summarize the major results of empirical term structure studies.
63.4.7.1 Explanation of the Expectation Puzzle

The expectation puzzle was documented by Fama (1984), Fama and Bliss (1987), Froot (1989), Campbell and Shiller (1991), and Bekaert et al. (1997), and has long posed a challenge for DTSMs.¹³ Campbell (1986) introduces a constant risk premium hypothesis to explain the expectations hypothesis. Campbell and Shiller (1991) attribute the expectation puzzle to a time-varying liquidity premium. Backus et al. (1989) show that a model assuming power utility preferences and time-varying expected consumption growth cannot account for this puzzle. Longstaff (2000) tests the expectations hypothesis at the extreme short end of the term structure and finds evidence supporting the hypothesis. Dai and Singleton (2002a) show that a statistical model of the stochastic discount factor (SDF) can explain the puzzle: by altering the dependence of risk premia on the factors, one can find parameter values for three-factor affine DTSMs that are consistent with the risk premium regressions in Fama and Bliss (1987) and Campbell and Shiller (1991). Using the framework of Campbell and Cochrane (1999), Wachter (2006) proposes a consumption-based model that accounts for many features of the nominal term structure of interest rates. Bekaert and Hodrick (2001) argue that the past use of large-sample critical regions instead of their small-sample counterparts may have overstated the evidence against the expectations hypothesis. Backus et al. (2001) find that the expectation puzzle is unlikely to be explained by a one-factor affine DTSM.
63.4.7.2 Linear or Nonlinear Drift of State Variables?

Most DTSMs assume that the drifts of the state variables are linear (mean reverting). However, empirical findings are inconclusive. Ait-Sahalia (1996) constructs a specification test of DTSMs and rejects the linear drift specification. Stanton (1997) obtains results similar to Ait-Sahalia (1996). Conley et al. (1997) examine the linearity of the drift and find that mean reversion is stronger only for large values of the interest rate. Ahn and Gao (1999) find nonlinearity in term structure dynamics. Chapman and Pearson (2000) conduct a Monte Carlo simulation of DTSMs with a linear drift and then apply the estimators of Ait-Sahalia (1996) and Stanton (1997) to the simulated data; they find strong mean reversion when the interest rate is high. Elerian et al. (2001) and Jones (2003) use a Bayesian approach to show that stronger mean reversion at extreme levels of the interest rate depends critically on the prior distribution. In a survey paper, Chapman and Pearson (2001) examine the interest rate data and find that mean reversion is weak over a wide range of interest rates; the short rate series seems to be a "persistent" time series, that is, it lingers over long consecutive periods above and below the unconditional long-run mean. Boudoukh et al. (1998) and Balduzzi and Eom (2000) use nonparametric analysis and find that the drifts in both two- and three-factor DTSMs are nonlinear. Dai and Singleton (2000) empirically test the affine DTSMs and find relatively promising performance. But as Ahn et al. (2002) point out, the results also suggest that there may be some omitted nonlinearity in the affine DTSMs, since their pricing errors are sensitive to the magnitude of the slope of the yield curve and are highly persistent. However, other empirical studies support the linear drift. Durham (2003) applies the simulated maximum-likelihood estimator of Durham and Gallant (2002) to compare different DTSMs; the results suggest that simpler drift specifications are preferable to more flexible forms and that the drift function appears to be constant. Li et al. (2001) implement a moment-based estimator that accounts for the bias described by Chapman and Pearson (2000) and find no evidence of a nonlinear drift. In summary, the exact nature of the drift of the instantaneous interest rate remains inconclusive: some evidence suggests that mean reversion is stronger at extreme levels of interest rates, while other studies fail to find strong evidence of nonlinearity.
63.4.7.3 One Factor or Multiple Factors?

While many studies employ single-factor models to describe interest rate behavior, others suggest using multi-factor models. In a single-factor DTSM, the whole term structure may be inferred from the level of one factor, which is historically taken to be the instantaneous interest rate. There are intuitive reasons to criticize single-factor DTSMs.¹⁴ First, they imply that changes in the term structure, and hence bond returns, are perfectly correlated across maturities, which is easily rejected by empirical evidence. Second, they can only accommodate term structures that are monotonically increasing, monotonically decreasing, or normally humped; an inversely humped or any other shape cannot be generated by single-factor DTSMs. Third, Dewachter and Maes (2000) compare single-factor and multi-factor DTSMs and find that one-factor time-varying-parameter models provide a relatively poor fit to the actual term structure observed in the market.

Empirically, Brown and Dybvig (1986) find that single-factor DTSMs understate the volatility of long-term yields. Brown and Schaefer (1994) show that the mean reversion coefficient required to explain cross-maturity patterns at one point in time is inconsistent with the best-fit coefficient. Empirical research on term structure models generally suggests that multi-factor DTSMs perform much better than single-factor DTSMs. Dai and Singleton (2000) show a substantial improvement in data fit offered by multi-factor DTSMs. Specifically, changes in the instantaneous interest rate may depend not only on its current level but also on other factors that may be observable or unobservable. Dai and Singleton (2003) compare different DTSMs and find the following: (1) the conditional volatilities of one-factor affine and quadratic DTSMs are affine, and these models fail to capture the change of volatility, which suggests the need for multi-factor DTSMs; (2) in multi-factor DTSMs, humped and inverted-humped volatility patterns can be generated by negative correlation between state variables or between state variables and the interest rate; and (3) the two-factor model performs best. A number of studies have attempted to provide economic meaning for the factors included in multi-factor models; these include Brennan and Schwartz (1979), Richard (1978), Longstaff and Schwartz (1992), Schaefer and Schwartz (1984), Andersen and Lund (1997), Balduzzi et al. (1996), and Dewachter et al. (2006).

¹³ Campbell and Shiller (1991) run the regressions r(t+1, n−1) − r(t,n) = α + (1/(n−1)) β_n (r(t,n) − r(t,1)) + e_t, where r(t,n) is the yield of the zero-coupon bond with maturity n at time t. Under the expectations hypothesis, β_n = 1 for all n. However, Campbell and Shiller (1991) show that β_n is negative, with magnitude increasing in n.
¹⁴ See also Maes (2004).
is historically taken to be the instantaneous interest rate. There are some intuitive reasons to criticize the single-factor DTSMs.14 First, it implies that changes in term structure and hence bond returns are perfectly correlated across maturities, which can be easily rejected by empirical evidence. Second, it can only accommodate term structures that are monotonically increasing, decreasing or normally humped. An inversely humped or any other shape cannot be generated by single-factor DTSMs. Third, Dewachter and Maes (2000) compare single-factor versus multi-factor DTSMs and find that one factor time-variant parameter models provide a relatively poor fit to the actual term structure observed in the market. Empirically, Brown and Dybvig (1986) find that singlefactor DTSMs understate the volatility of long-term yields. Brown and Schaefer (1994) show that the mean reversion coefficient required to explain the cross-maturity patterns at one time is inconsistent with the best fit coefficient. Empirical research of the term structure models generally suggests that multi-factor DTSMs perform much better than single-factor DTSMs. Dai and Singleton (2000) show a substantial improvement in data fit offered by multi-factor DTSMs. Specifically, the changes in instantaneous interest rate may not only depend on the current level, but also on other factors that may be unobservable or observable. Dai and Singleton (2003) compare different DTSMs and find that the following: (1) The conditional volatilities of one-factor affine and quadratic DTSMs are affine and these models fail to capture the change of volatility, which suggest the need to use multi-factor DTSMs; (2) In multi-factor DTSMs, the hump and inverted-hump of volatility could be realized by the negative correlation between state variables or the negative correlation between state variables and interest rate; and (3) the two-factor model performs the best. A number of studies have attempted to provide economic meanings for the factors included in the multifactor models. These include Brennan and Schwartz (1979), Richard (1978), Longstaff and Schwartz (1992), Schaefer and Schwartz (1984), Andersen and Lund (1997), Balduzzi et al. (1996), and Dewachter et al. (2006).
14 See also Maes (2004).

63.4.7.4 Affine or Nonlinear DTSMs?

The academic literature has focused on affine DTSMs, mainly because this class of models yields closed-form bond pricing formulas and can easily handle multiple factors. However, as Dai and Singleton (2000) and Maes (2004) point out, affine DTSMs are not able to ensure the positivity of interest rates without imposing parameter restrictions and without losing flexibility in the unconditional correlation structure among the state variables. Moreover, affine DTSMs fail to capture the nonlinearities in the dynamics of interest rates documented by Ait-Sahalia (1999), Boudoukh et al. (1998), and Balduzzi and Eom (2000). Chan et al. (1992) compare different DTSMs using GMM. The results show that the models in which volatility depends on the level of interest rate risk perform best. Johannes (2004) examines some classical DTSMs and finds that these models fail to produce distributions consistent with historical data. He then introduces a jump factor into the DTSM. Hong and Li (2005) propose a nonparametric specification method to test DTSMs. The results show that although significant improvements are achieved by introducing jumps and regime switching into the term structure of interest rates, all models are still rejected, implying that specification errors remain in these DTSMs. Duffee (2002) tests affine DTSMs and finds that they forecast future yield changes poorly. Affine DTSMs cannot simultaneously match term structure movements and bond return premiums without modifying the dependence of the market price of interest rate risk on interest rate volatility.

63.4.7.5 What Do We Really Care About?

Perhaps there are two questions more fundamental than "how much do we know about the dynamics of short rates?" The first is: do we really care about the differences among these models? The answer depends on whether different DTSMs have significantly different implications for their applications, such as pricing and risk management (for example, the calculation of value-at-risk). Although more complicated models can capture some specific characteristics of the underlying variables, the improvement for pricing and hedging may be limited.15 Second, we should bear in mind that only the DTSMs in the risk-neutral measure matter for pricing. The dynamics of the instantaneous interest rate in the risk-neutral measure differ from those in the physical measure. For example, nonlinearity in the drift under the physical measure need not imply nonlinearity under the risk-neutral measure. It is the market price of interest rate risk that connects the DTSMs in different measures (see, for example, Dai and Singleton (2000), Duffee (2002), Duarte (2004), Ahn and Gao (1999), and Cheridito et al. (2007) for different specifications of the market price of interest rate risk). Therefore, we should be cautious when we attempt to infer risk-neutral parameter values from variables in the physical measure. As Dai and Singleton (2003) argue, it seems that having multiple factors in linear models is more important than introducing nonlinearity into models with a smaller number of factors. Moreover, because of the computational demands of pricing in the presence of nonlinear drifts, attention continues to focus primarily on DTSMs with a linear drift for the state variables.

15 Bakshi et al. (1997) compare different option pricing models and find that the improvement offered by some complicated models may be limited.
63.5 Models of Defaultable Bonds

Defaultable bonds are bonds whose payoff depends on the occurrence of a default event (credit risk). Therefore, modeling the default probability is the key issue in pricing defaultable bonds. Basically, there are two approaches to modeling default: the structural approach and the reduced-form approach.

63.5.1 Structural Models

Structural models were pioneered by Black and Scholes (1973) and Merton (1974), who regard the corporate bond as a derivative on firm value. In these models, default occurs at the maturity date of the debt provided that the asset value is less than the face value of the maturing debt. Default before maturity is not considered in either study. Black and Cox (1976) propose the first passage time model, which defines the default time as the first time the asset value falls below a boundary. Within this framework, default can occur before the maturity of the debt. Geske (1977) introduces coupon payments into the structural model and treats the coupon bond as a compound option. On each coupon date, if shareholders decide to pay the coupon by selling new equity, the firm stays alive; otherwise, default occurs and bondholders seize the firm's assets. Leland and Toft (1996) consider the case in which the firm continuously issues a constant amount of debt with a fixed maturity that pays continuous coupons. Similar to Geske (1977), the default boundary is endogenous, since equity holders can decide whether or not to issue new equity to pay for the debt when the firm's payout is not large enough to cover the debt service requirements. Longstaff and Schwartz (1995) introduce interest rate risk described by the Vasicek (1977) model and provide a pricing formula for fixed-coupon and floating-coupon bonds. Collin-Dufresne and Goldstein (2001) extend the Longstaff and Schwartz (1995) model to account for a stationary leverage ratio. Zhou (1997) extends Merton's approach by modeling the firm's value process as a geometric jump-diffusion process. Anderson and Sundaresan (1996) and Mella-Barral and Perraudin (1997) introduce simplified bargaining games to obtain analytical expressions for the default boundaries in structural models. Duffie and Lando (2001) consider incomplete accounting information in structural models. Related structural models are also studied by Ho and Singer (1982), Titman and Torous (1989), Kim et al. (1993), Leland (1994), Fama and French (1996), Briys and de Varenne (1997), Cathcart and El-Jahel (1998), Goldstein et al. (2001), Nielsen et al. (2001), Acharya and Carpenter (2002), and Vassalou and Xing (2004).

63.5.2 Reduced-Form Models

Reduced-form models treat the default time as the arrival time of a counting process with an associated intensity process. Jarrow and Turnbull (1995) model the default time as a Poisson process with constant intensity $\lambda$; that is, the number of events occurring in any time interval $\Delta t$ follows the Poisson distribution with intensity $\lambda \Delta t$,

$$P\{N(t+\Delta t) - N(t) = n\} = \frac{(\lambda \Delta t)^n}{n!} e^{-\lambda \Delta t}, \quad n = 0, 1, \ldots; \; t, \Delta t \ge 0 \qquad (63.51)$$
where $N(t)$ is the number of events up to time t. Duffie and Huang (1996), Jarrow et al. (1997), Lando (1998), and Madan and Unal (1998) introduce doubly stochastic and state-dependent default intensities into the Jarrow-Turnbull model, which then becomes the benchmark specification for reduced-form models. The model is formalized as follows. Define the default time $\tau$ as a random variable on $[0, T]$, and the probability of no default until time t (the survival probability), $0 \le t \le T$, as $S_t = P(\tau > t)$. The unconditional default probability between t and $t+\Delta t$ is $P(t < \tau \le t+\Delta t) = S_t - S_{t+\Delta t}$. The conditional default probability between t and $t+\Delta t$, given no default until t, is

$$P(t < \tau \le t+\Delta t \mid \tau > t) = \frac{S_t - S_{t+\Delta t}}{S_t} \qquad (63.52)$$

The conditional default probability per unit time is then

$$\frac{P(t < \tau \le t+\Delta t \mid \tau > t)}{\Delta t} = \frac{S_t - S_{t+\Delta t}}{\Delta t \, S_t} \qquad (63.53)$$

The instantaneous conditional default probability (the default intensity) $\lambda_t$ is defined as

$$\lambda_t = \lim_{\Delta t \to 0} \frac{P(t < \tau \le t+\Delta t \mid \tau > t)}{\Delta t} = -\frac{S_t'}{S_t} \qquad (63.54)$$

With the initial condition $S_0 = 1$,

$$S_t = \exp\left(-\int_0^t \lambda_s \, ds\right) \qquad (63.55)$$
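Equations (63.52)-(63.55) translate directly into code. The following Python sketch (ours; the increasing hazard function is an arbitrary illustration) computes the survival curve by numerically integrating the intensity and simulates default times by the inverse-transform method, $\tau = \inf\{t : \int_0^t \lambda_s \, ds \ge E\}$ with $E \sim \text{Exp}(1)$.

```python
import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 10.0, 1001)
lam = 0.02 + 0.01 * grid                     # deterministic hazard (illustrative)
# Cumulative hazard int_0^t lambda_s ds by the trapezoidal rule
cum_hazard = np.concatenate(
    ([0.0], np.cumsum(0.5 * (lam[1:] + lam[:-1]) * np.diff(grid))))
survival = np.exp(-cum_hazard)               # S_t from (63.55)

# Default-time simulation: tau = first t with cumulative hazard >= Exp(1) draw
e = rng.exponential(size=100_000)
tau = np.interp(e, cum_hazard, grid, right=np.inf)  # inf = no default by t=10

print("model S_5      =", np.exp(-np.interp(5.0, grid, cum_hazard)))
print("simulated P(tau>5) =", np.mean(tau > 5.0))   # should agree closely
```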
Then the price of a zero-coupon defaultable bond with face value 1 is given by

$$B(t,T) = E_t^Q\left[\exp\left(-\int_t^{\tau} r_u \, du\right)\bar{\omega}\, 1_{\{\tau \le T\}}\right] + E_t^Q\left[\exp\left(-\int_t^{T} r_u \, du\right) 1_{\{\tau > T\}}\right] \qquad (63.56)$$

where $E_t^Q\left[\exp\left(-\int_t^{\tau} r_u \, du\right)\bar{\omega}\, 1_{\{\tau \le T\}}\right]$ is the present value of the recovery upon default $\bar{\omega} = \bar{\omega}(Y_\tau, \tau)$ at the default arrival time $\tau$ whenever $\tau \le T$, and $E_t^Q\left[\exp\left(-\int_t^T r_u \, du\right) 1_{\{\tau > T\}}\right]$ is the present value of the face value conditional on no default before maturity. Simplifying the pricing formula (see, e.g., Lando (1998)) gives the following result

$$B(t,T) = E_t^Q\left[\int_t^T \lambda_s \bar{\omega}_s \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right] + E_t^Q\left[\exp\left(-\int_t^T (r_s + \lambda_s) \, ds\right)\right] \qquad (63.57)$$

63.5.2.1 Recovery Rate

The pricing formula of defaultable bonds depends on three variables: the recovery rate, default-free interest rates, and the default intensity. Differences in the treatment of these three factors differentiate the reduced-form models.

63.5.2.1.1 Fractional Recovery of Par, Payable at Maturity

This recovery formulation refers to the case in which, in the event of default, a fraction $\omega$ of the face value is recovered but the payment is postponed until the maturity of the defaultable bond. This in fact results in the following specification:

$$\bar{\omega} = \omega D(\tau, T) \qquad (63.58)$$

If $\omega$ is constant and state-independent,

$$B(t,T) = \omega E_t^Q\left[\int_t^T \lambda_s \exp\left(-\int_t^s \lambda_u \, du - \int_t^T r_u \, du\right) ds\right] + E_t^Q\left[\exp\left(-\int_t^T (r_s + \lambda_s) \, ds\right)\right] \qquad (63.59)$$

which becomes

$$B(t,T) = \omega \int_t^T E_t^Q\left[\lambda_s \exp\left(-\int_t^s \lambda_u \, du\right)\right] E_t^Q\left[\exp\left(-\int_t^T r_u \, du\right)\right] ds + E_t^Q\left[\exp\left(-\int_t^T r_s \, ds\right)\right] E_t^Q\left[\exp\left(-\int_t^T \lambda_s \, ds\right)\right] \qquad (63.60)$$

if $r_t$ and $\lambda_t$ are independent. The Jarrow et al. (1997) model is a special case of this class.

63.5.2.1.2 Fractional Recovery of Par, Payment at Default

If a constant and state-independent fraction $\omega$ of the face value is recovered and paid at the time of default,

$$B(t,T) = \omega E_t^Q\left[\int_t^T \lambda_s \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right] + E_t^Q\left[\exp\left(-\int_t^T (r_s + \lambda_s) \, ds\right)\right] \qquad (63.61)$$

which becomes

$$B(t,T) = \omega \int_t^T E_t^Q\left[\lambda_s \exp\left(-\int_t^s \lambda_u \, du\right)\right] E_t^Q\left[\exp\left(-\int_t^s r_u \, du\right)\right] ds + E_t^Q\left[\exp\left(-\int_t^T r_s \, ds\right)\right] E_t^Q\left[\exp\left(-\int_t^T \lambda_s \, ds\right)\right] \qquad (63.62)$$

if $r_t$ and $\lambda_t$ are independent. Examples of this class of models are Duffie (1998a), Duffie and Singleton (1999a), Longstaff et al. (2005), and Liu et al. (2007).16 Due to the identification problem in obtaining separate estimates of the recovery rate and the default intensity, most empirical studies estimate the default intensity process from defaultable bond data using an exogenously given recovery rate. Houweling and Vorst (2005) find that under some specifications of $\lambda$ and r, the value of the recovery rate does not substantially affect the results as long as it lies within a reasonable interval.
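As a concrete illustration of (63.57) and (63.61), the following Monte Carlo sketch prices a defaultable zero under fractional recovery of par paid at default, with independent square-root processes for $r_t$ and $\lambda_t$. The parameter values, the Euler scheme, and the reflection used to keep the processes nonnegative are all our illustrative assumptions, not a published implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_steps, n_paths = 5.0, 500, 20_000
dt = T / n_steps
kr, tr, sr, r0 = 0.3, 0.05, 0.05, 0.04     # CIR parameters for r (assumed)
kl, tl, sl, l0 = 0.5, 0.02, 0.08, 0.02     # CIR parameters for lambda (assumed)
omega = 0.4                                 # recovery of par, paid at default

r = np.full(n_paths, r0)
lam = np.full(n_paths, l0)
disc = np.zeros(n_paths)                    # running int_0^s (r+lam) du
rec_leg = np.zeros(n_paths)                 # int omega * lam_s * exp(-disc) ds
for _ in range(n_steps):
    rec_leg += omega * lam * np.exp(-disc) * dt   # left-endpoint quadrature
    disc += (r + lam) * dt
    # Euler steps with reflection (a crude way to preserve nonnegativity)
    r = np.abs(r + kr * (tr - r) * dt
               + sr * np.sqrt(r * dt) * rng.standard_normal(n_paths))
    lam = np.abs(lam + kl * (tl - lam) * dt
                 + sl * np.sqrt(lam * dt) * rng.standard_normal(n_paths))

face_leg = np.exp(-disc)                    # survival-discounted face value
print("defaultable zero price:", (rec_leg + face_leg).mean())
```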
63.5.2.2 Dynamics of Interest Rate and Default Intensity

Although many complicated DTSMs (for example, DTSMs with jumps and regime switching) can be used to model the dynamics of the interest rate and the default intensity in the pricing of defaultable bonds, affine models are still the most favored framework due to their analytical tractability. Consider the state vector Y that follows an affine jump-diffusion process

$$dY_{jt} = \kappa_j(\theta_j - Y_{jt}) \, dt + \sigma_j \sqrt{Y_{jt}} \, dW_{jt}^Q + J_{jt} \, dZ_{jt}, \quad j = 1, \ldots, K \qquad (63.63)$$

where $W_{jt}^Q$, j = 1, ..., K, are independent standard Brownian motions under the risk-neutral measure. $r_t$ and $\lambda_t$ are typically modeled by making them dependent on a set of common stochastic factors Y, which introduces stochasticity and correlation into the processes of $r_t$ and $\lambda_t$. For example,

$$\begin{cases} r_t = a_{r0}(t) + a_{r1}(t) Y_{1t} + \ldots + a_{rK}(t) Y_{Kt} \\ \lambda_t = a_{\lambda 0}(t) + a_{\lambda 1}(t) Y_{1t} + \ldots + a_{\lambda K}(t) Y_{Kt} \end{cases} \qquad (63.64)$$

Duffie and Singleton (2003) formulate an intensity process as a mean-reverting process with jumps. The default intensity between jump events is given by:

$$\frac{d\lambda_t}{dt} = \kappa(\gamma - \lambda_t) \qquad (63.65)$$

Thus, at any time t between two jumps,

$$\lambda_t = \gamma + e^{-\kappa(t-T)}(\lambda_T - \gamma) \qquad (63.66)$$

where T is the time of the last jump and $\lambda_T$ is the jump intensity at time T. Suppose that jumps occur at Poisson arrival times with an intensity c and that jump sizes are exponentially distributed with mean J. Duffie and Singleton (2003) show that the conditional survival probability from t to s is:

$$P(\tau > s \mid \tau > t) = e^{\alpha(s-t) + \beta(s-t)\lambda_t} \qquad (63.67)$$

where

$$\begin{cases} \beta(\tau) = -\dfrac{1 - e^{-\kappa\tau}}{\kappa} \\[2mm] \alpha(\tau) = -\gamma\left(\tau - \dfrac{1 - e^{-\kappa\tau}}{\kappa}\right) - \dfrac{c}{J + \kappa}\left[J\tau - \ln\left(1 + \dfrac{J(1 - e^{-\kappa\tau})}{\kappa}\right)\right] \end{cases} \qquad (63.68)$$

Duffie and Singleton (1999a) model $r_t$ and $\lambda_t$ as

$$\begin{cases} r_t = \rho_0 Y_{0t} + Y_{1t} + Y_{2t} \\ \lambda_t = b Y_{0t} + Y_{3t} \end{cases} \qquad (63.69)$$

where $Y_{0t}$, $Y_{1t}$, $Y_{2t}$ and $Y_{3t}$ are independent CIR (square-root) processes under the risk-neutral measure, and $\rho_0$ and b are constants. The degree of negative correlation between $r_t$ and $\lambda_t$ is controlled by the choice of b.17 Liu et al. (2007) model $r_t$ and $\lambda_t$ as:

$$\begin{cases} \lambda_t = \gamma_t + \beta(r_t - \bar{r}) \\ dr_t = \kappa_r(\bar{r} - r_t) \, dt + \sigma_r \sqrt{r_t} \, dW_{1t}^Q \\ d\gamma_t = \kappa_\gamma(\bar{\gamma} - \gamma_t) \, dt + \sigma_\gamma \sqrt{\gamma_t} \, dW_{2t}^Q \end{cases} \qquad (63.70)$$

where $W_{1t}^Q$ and $W_{2t}^Q$ are two independent standard Brownian motions under the risk-neutral measure. The degree of negative correlation between $r_t$ and $\lambda_t$ is controlled by the choice of $\beta$. On the other hand, Duffie (1998b) and Bielecki and Rutkowski (2000, 2004) work with the spread forward rate and price the zero-coupon defaultable bond as

$$B(t,T) = \exp\left(-\int_t^T (f(t,v) + s(t,v)) \, dv\right) \qquad (63.71)$$

where f(t,v) is the default-free forward rate and s(t,v) is the spread forward rate.

16 There is another class of models that specify the recovery as a fraction of market value; see Lando (1998) and Li (2000b).
17 Chen et al. (2006) also propose a pricing model with correlated factors.
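The survival probability (63.67)-(63.68) is straightforward to evaluate numerically. The following Python function implements our reconstruction of the formula; the parameter values are arbitrary illustrations.

```python
import numpy as np

def survival(h, lam_t, kappa=0.5, gamma=0.02, c=0.2, J=0.05):
    """P(tau > t+h | tau > t) = exp(alpha(h) + beta(h)*lam_t), per (63.67)-(63.68).

    Mean-reverting intensity with exponential jumps (mean J) at Poisson rate c;
    all default parameter values here are illustrative assumptions.
    """
    decay = (1.0 - np.exp(-kappa * h)) / kappa
    beta = -decay
    alpha = (-gamma * (h - decay)
             - (c / (J + kappa)) * (J * h - np.log(1.0 + J * decay)))
    return np.exp(alpha + beta * lam_t)

for h in (1.0, 5.0, 10.0):
    print(f"horizon {h:>4}y  survival = {survival(h, lam_t=0.03):.4f}")
```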
63.5.2.3 Coupon

It is quite natural to extend the pricing of zero-coupon defaultable bonds to coupon bonds. Assuming that the coupon C is paid continuously and the recovery is a fraction of the par value, paid at the default time, the price of the coupon defaultable bond is given by
$$B(C,t,T) = C \, E_t^Q\left[\int_t^T \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right] + E_t^Q\left[\exp\left(-\int_t^T (r_s + \lambda_s) \, ds\right)\right] + \omega \, E_t^Q\left[\int_t^T \lambda_s \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right] \qquad (63.72)$$

If $r_t$ and $\lambda_t$ are assumed to be independent, this can be further simplified to

$$B(C,t,T) = C \int_t^T E_t^Q\left[\exp\left(-\int_t^s r_u \, du\right)\right] E_t^Q\left[\exp\left(-\int_t^s \lambda_u \, du\right)\right] ds + E_t^Q\left[\exp\left(-\int_t^T r_s \, ds\right)\right] E_t^Q\left[\exp\left(-\int_t^T \lambda_s \, ds\right)\right] + \omega \int_t^T E_t^Q\left[\exp\left(-\int_t^s r_u \, du\right)\right] E_t^Q\left[\lambda_s \exp\left(-\int_t^s \lambda_u \, du\right)\right] ds \qquad (63.73)$$

The case of discrete coupon payments and fractional recovery of par value paid at maturity can be derived in a similar way.
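When r, $\lambda$, and $\omega$ are constant, the expectations in (63.72) collapse to elementary integrals, which provides a quick numerical sanity check (the inputs below are illustrative):

```python
import numpy as np

# Constant-parameter special case of (63.72):
#   B = C * A + exp(-a*T) + omega * lam * A,  with a = r + lam and
#   A = int_0^T exp(-a*s) ds = (1 - exp(-a*T)) / a.
r, lam, omega, C, T = 0.04, 0.02, 0.4, 0.06, 10.0
a = r + lam
annuity = (1.0 - np.exp(-a * T)) / a
price = C * annuity + np.exp(-a * T) + omega * lam * annuity
print("coupon defaultable bond price:", price)
```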
63.5.2.4 Liquidity and Taxes

Standard term structure models of default risk assume that yield spreads between corporate (defaultable) bonds and government (default-free) bonds are determined by two factors: default risk ($\lambda_t$) and the expected loss ($1-\omega$) in the event of default. However, recent studies have shown that other factors, such as liquidity and taxes, can significantly affect corporate bond yield spreads.18

18 See Elton et al. (2001), Longstaff et al. (2005), and Liu et al. (2007).

Using a reduced-form approach, Longstaff et al. (2005) introduce a liquidity factor into the defaultable bond pricing formula and obtain a closed-form solution for the corporate bond price with default and liquidity,

$$B(C,t,T) = E_t^Q\left[C\int_t^T \exp\left(-\int_t^s (r_u + \lambda_u + l_u) \, du\right) ds\right] + E_t^Q\left[\exp\left(-\int_t^T (r_s + \lambda_s + l_s) \, ds\right)\right] + E_t^Q\left[\omega \int_t^T \lambda_s \exp\left(-\int_t^s (r_u + \lambda_u + l_u) \, du\right) ds\right] \qquad (63.74)$$

where $r_t$, $\lambda_t$, and $l_t$ denote the default-free interest rate, the default intensity, and the liquidity intensity at time t, respectively. Their dynamics follow

$$\begin{cases} d\lambda_t = \kappa(\theta - \lambda_t) \, dt + \sigma\sqrt{\lambda_t} \, dW_{1t}^Q \\ dl_t = \sigma_l \, dW_{2t}^Q \end{cases} \qquad (63.75)$$

where $W_{1t}^Q$ and $W_{2t}^Q$ are independent standard Brownian motions under the risk-neutral measure.

The tax effect, on the other hand, is much more complicated due to changes in tax rates and the differential tax treatment of capital gains (losses) on discount (premium) bonds. For example, for discount bonds, when there is no default, the difference between the face value and the price is regarded as a capital gain and is taxed at the capital gain tax rate. When there is a default before maturity, the investor expects a capital loss and receives a tax rebate from the government. Moreover, the premium and discount of corporate bonds must be amortized, which makes the pricing more complicated. Liu et al. (2007) deal with these issues by assuming a buy-and-hold strategy and obtain the pricing formula for corporate bonds with taxes. The price of a discount defaultable coupon bond without amortization is given by the following:19

$$B(t,t_M) = E_t^Q\left[C(1-\tau_i)\sum_{m=1}^M \exp\left(-\int_t^{t_m}(r_s+\lambda_s) \, ds\right)\right] + E_t^Q\left[\left(1 - \tau_g(1 - B(t,t_M))\right)\exp\left(-\int_t^{t_M}(r_s+\lambda_s) \, ds\right)\right] + E_t^Q\left[\int_t^{t_M}\left(\omega + \tau_g(B(t,t_M) - \omega)\right)\lambda_s \exp\left(-\int_t^s (r_u+\lambda_u) \, du\right) ds\right] \qquad (63.76)$$

where $\tau_i$ is the income tax rate, $\tau_g$ is the capital gain tax rate, and $t_i$, i = 1, ..., M, are the discrete coupon payment dates.

19 For the case of premium bonds and amortization, see Liu et al. (2007).

After a simple transformation,

$$B(t,t_M) = \frac{1}{Z}\left\{E_t^Q\left[C(1-\tau_i)\sum_{m=1}^M \exp\left(-\int_t^{t_m}(r_s+\lambda_s) \, ds\right)\right] + E_t^Q\left[(1-\tau_g)\exp\left(-\int_t^{t_M}(r_s+\lambda_s) \, ds\right)\right] + E_t^Q\left[\int_t^{t_M}(1-\tau_g)\,\omega\,\lambda_s \exp\left(-\int_t^s (r_u+\lambda_u) \, du\right) ds\right]\right\} \qquad (63.77)$$

and

$$Z = 1 - \tau_g E_t^Q\left[\exp\left(-\int_t^{t_M}(r_s+\lambda_s) \, ds\right) + \int_t^{t_M} \lambda_s \exp\left(-\int_t^s (r_u+\lambda_u) \, du\right) ds\right]$$

In a recent paper, Lin et al. (2007) propose a generalized defaultable bond pricing model with default, liquidity, and taxes. The price of a discount defaultable coupon bond without amortization is given by

$$B(t,t_M) = \frac{1}{Z}\left\{E_t^Q\left[C(1-\tau_i)\sum_{m=1}^M \exp\left(-\int_t^{t_m}(r_s+\lambda_s+l_s) \, ds\right)\right] + E_t^Q\left[(1-\tau_g)\exp\left(-\int_t^{t_M}(r_s+\lambda_s+l_s) \, ds\right)\right] + E_t^Q\left[\int_t^{t_M}(1-\tau_g)\,\omega\,\lambda_s \exp\left(-\int_t^s (r_u+\lambda_u+l_u) \, du\right) ds\right]\right\} \qquad (63.78)$$

and

$$Z = 1 - \tau_g E_t^Q\left[\exp\left(-\int_t^{t_M}(r_s+\lambda_s+l_s) \, ds\right) + \int_t^{t_M} \lambda_s \exp\left(-\int_t^s (r_u+\lambda_u+l_u) \, du\right) ds\right]$$

Under similar assumptions on $r_t$, $\lambda_t$, and $l_t$ as in Longstaff et al. (2005), Lin et al. (2007) derive a closed-form solution for the defaultable coupon bond with default, liquidity, and taxes.
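As an illustration of (63.78), the following sketch prices a discount coupon bond under the strong, purely illustrative assumption that r, $\lambda$, and l are constant, so that every expectation becomes an elementary exponential integral:

```python
import numpy as np

# Constant-parameter special case of (63.78); all inputs are illustrative.
r, lam, l, omega = 0.04, 0.02, 0.005, 0.4
tau_i, tau_g = 0.3, 0.15                    # income and capital gain tax rates
C, M = 0.06, 10                             # annual coupon, 10 annual payments
a = r + lam + l
coupons = sum(np.exp(-a * m) for m in range(1, M + 1))
annuity = (1.0 - np.exp(-a * M)) / a        # int_0^M exp(-a*s) ds
Z = 1.0 - tau_g * (np.exp(-a * M) + lam * annuity)
B = (C * (1.0 - tau_i) * coupons
     + (1.0 - tau_g) * np.exp(-a * M)
     + (1.0 - tau_g) * omega * lam * annuity) / Z
print("after-tax discount bond price:", B)
```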
63.5.2.5 Correlated Default

Default correlation is an important issue when dealing with the default or survival probabilities of more than one firm. Schonbucher (2003) lays out some basic properties of a good model of correlated defaults. First, the model must be able to produce default correlations of a realistic magnitude. Second, it must keep the number of parameters used to describe the dependence structure under control, without growing dramatically with the number of firms. Third, it should be a dynamic model, capable of modeling the number of defaults as well as the timing of defaults. Fourth, it should be able to reproduce periods of default clustering. Finally, the easier the calibration and implementation of the model, the better.

Consider two firms A and B that have not defaulted before time t ($0 \le t \le T$), whose default probabilities before T are given by $p_A$ and $p_B$. Then the linear correlation coefficient20 between the default indicator random variables $1_A = 1_{\{\tau_A \le T\}}$ and $1_B = 1_{\{\tau_B \le T\}}$ is given by21

$$\rho(1_A, 1_B) = \frac{p_{AB} - p_A p_B}{\sqrt{p_A(1-p_A)\,p_B(1-p_B)}} \qquad (63.79)$$

There are three different approaches to modeling default correlation in the literature.22 We describe these approaches below.

63.5.2.5.1 Conditionally Independent Default Models

The conditionally independent default (CID) approach introduces correlation by making the default intensities dependent on a set of common variables Y and on a firm-specific factor. Suppose there are I firms; the default intensity for firm i, i = 1, ..., I, is

$$\lambda_{it} = a_{0i} + a_{1i} Y_{1t} + \ldots + a_{Ki} Y_{Kt} + \epsilon_{it} \qquad (63.80)$$

where the firm-specific default factor $\epsilon_{it}$ is independent across firms. For example, Duffee (1999) considers a CID model as follows:

$$\begin{cases} r_t = a_{r0} + Y_{1t} + Y_{2t} \\ \lambda_{it} = a_{0i} + a_{1i}(Y_{1t} - \bar{Y}_1) + a_{2i}(Y_{2t} - \bar{Y}_2) + \epsilon_{it} \\ d\epsilon_{it} = \kappa_i(\theta_i - \epsilon_{it}) \, dt + \sigma_i \sqrt{\epsilon_{it}} \, dW_{it} \end{cases} \qquad (63.81)$$

In this model, $\epsilon_{it}$ captures the stochasticity of the intensities, and the coefficients $a_{1i}$ and $a_{2i}$, i = 1, ..., I, capture the correlation between the intensities themselves and between the intensities and interest rates. The main drawback of CID models is that they fail to generate high default correlations. However, Yu (2003) argues that this is not a problem of the CID approach itself but of the choice of state variables. Driessen (2005) introduces two more common factors into Duffee's (1999) model and finds that they elevate the default correlation.23 The risk-neutral default density of firm i, i = 1, 2, ..., I, is a function of K common factors, a firm-specific factor, and two more factors that determine the risk-free rates,

$$\begin{cases} \lambda_{it} = a_{0i} + \displaystyle\sum_{j=1}^K a_{ij}(Y_{jt} - \bar{Y}_j) + (\epsilon_{it} - \bar{\epsilon}_i) + b_{i1}(X_{1t} - \bar{X}_1) + b_{i2}X_{2t}, \quad i = 1, 2, \ldots, N \\ r_t = a_{r0} + a_{r1}X_{1t} + a_{r2}X_{2t} \end{cases} \qquad (63.82)$$

where $Y_{jt}$, j = 1, 2, ..., K, and $\epsilon_{it}$ follow independent square-root processes, and the two additional factors $X_{1t}$ and $X_{2t}$ follow a bivariate process,

$$\begin{pmatrix} dX_{1t} \\ dX_{2t} \end{pmatrix} = -\begin{pmatrix} \kappa_{11} & 0 \\ \kappa_{21} & \kappa_{22} \end{pmatrix} \begin{pmatrix} X_{1t} - \theta_1 \\ X_{2t} \end{pmatrix} dt + \begin{pmatrix} \sqrt{X_{1t}} & 0 \\ 0 & \sqrt{1 + \beta X_{1t}} \end{pmatrix} \begin{pmatrix} dW_{1t} \\ dW_{2t} \end{pmatrix} \qquad (63.83)$$

where $W_{1t}$ and $W_{2t}$ are independent standard Brownian motions under the physical measure. The introduction of the two additional factors allows for correlation between spreads and risk-free rates.

20 This correlation is based on the risk-neutral measure, which differs from the physical measure used to compute the correlation from empirical default events. Jarrow et al. (2005) and Yu (2002, 2003) provide a procedure for calculating physical default correlations through risk-neutral densities.
21 Another measure of default dependence between firms is the linear correlation between default times, $\rho(\tau_A, \tau_B)$.
22 See also Elizalde (2003).
23 There are two possible ways to deal with the high default correlation issue. One way is to introduce joint jumps in the default intensities (Duffie and Singleton 1999b). The other way is to consider default-event triggers that cause joint defaults (Duffie and Singleton 1999b; Kijima 2000; Kijima and Muromachi 2000).
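A small simulation makes the CID mechanism, and its well-known difficulty in generating large default correlations, concrete. The one-common-factor setup below is entirely our illustrative assumption; the correlation is computed according to (63.79).

```python
import numpy as np

# CID sketch: two firms load on a common intensity factor Y; conditional on Y,
# defaults are independent. All parameter choices are illustrative assumptions.
rng = np.random.default_rng(3)
T, n = 5.0, 200_000
Y = rng.gamma(shape=2.0, scale=0.01, size=n)       # common factor, one draw per path
lam_A = 0.01 + 1.0 * Y                             # firm A intensity
lam_B = 0.01 + 1.5 * Y                             # firm B intensity
pA_cond = 1.0 - np.exp(-lam_A * T)                 # P(tau_A <= T | Y)
pB_cond = 1.0 - np.exp(-lam_B * T)
dA = rng.random(n) < pA_cond                       # conditionally independent draws
dB = rng.random(n) < pB_cond

pA, pB, pAB = dA.mean(), dB.mean(), (dA & dB).mean()
rho = (pAB - pA * pB) / np.sqrt(pA * (1 - pA) * pB * (1 - pB))   # (63.79)
print("default correlation:", rho)   # modest, even with a shared factor
```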
63.5.2.5.2 Contagion Model

Contagion models account for two further empirical facts: (1) the default of one firm can trigger the default of other related firms; and (2) default times tend to concentrate in certain periods of time. This class includes the propensity model proposed by Jarrow and Yu (2001) and the infectious default model of Davis and Lo (2001). Jarrow and Yu (2001) extend the CID model to account for counterparty risk; that is, the risk that the default of one firm may increase the default probabilities of other related firms. They divide the I firms into primary firms (1, ..., K) and secondary firms (K+1, ..., I). The default intensities of the primary firms are modeled using CID and do not depend on the default status of any other firm. The default of a primary firm increases the default intensities of the secondary firms, but not the converse (asymmetric dependence). Thus, the secondary firms' default intensities are given by

$$\lambda_{it} = \tilde{\lambda}_{it} + \sum_{j=1}^K a_{i,t}^j \, 1_{\{\tau_j \le t\}} \qquad (63.84)$$

for i = K+1, ..., I and j = 1, ..., K. Davis and Lo (2001) assume that each firm has an initial intensity $\lambda_{it}$ for i = 1, ..., I. When a default occurs, the default intensity of all remaining firms is increased by a factor a > 1, called the enhancement factor, to $a\lambda_{it}$.

63.5.2.5.3 Copula

The copula approach takes the marginal probabilities as input and introduces the dependence structure to generate joint probabilities. Specifically, if we want to model the default times, the joint default probabilities are given by

$$F(t_1, \ldots, t_I) = P[\tau_1 \le t_1, \ldots, \tau_I \le t_I] = C^d(F_1(t_1), \ldots, F_I(t_I)) \qquad (63.85)$$

If we want to model the survival times,

$$S(t_1, \ldots, t_I) = P[\tau_1 > t_1, \ldots, \tau_I > t_I] = C^s(s_1(t_1), \ldots, s_I(t_I)) \qquad (63.86)$$

where $C^d$ and $C^s$ are two different copulas.24 Examples of copulas are:

(i) Independent Copula. The I-dimensional independent copula is given by

$$C(u_1, \ldots, u_I) = \prod_{i=1}^I u_i \qquad (63.87)$$

(ii) Perfect Correlation Copula. The I-dimensional perfect correlation copula is given by

$$C(u_1, \ldots, u_I) = \min(u_1, \ldots, u_I) \qquad (63.88)$$

(iii) Normal Copula. The I-dimensional normal copula with correlation matrix $\Sigma$ is given by

$$C(u_1, \ldots, u_I) = \Phi_\Sigma^I\left(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_I)\right) \qquad (63.89)$$

where $\Phi_\Sigma^I$ represents an I-dimensional normal distribution function with covariance matrix $\Sigma$, and $\Phi^{-1}$ denotes the inverse of the univariate standard normal distribution function.

(iv) t Copula. Let X be a random vector distributed as an I-dimensional multivariate t-student with v degrees of freedom, mean vector $\mu$ (for v > 1), and covariance matrix $\frac{v}{v-2}\Sigma$ (for v > 2). We can then express X as

$$X = \mu + \frac{\sqrt{v}}{\sqrt{S}} Z \qquad (63.90)$$

where S is a random variable distributed as a $\chi^2$ with v degrees of freedom and Z is an I-dimensional normal random vector that is independent of S with zero mean and linear correlation matrix $\Sigma$. The I-dimensional t-copula of X can be expressed as

$$C(u_1, \ldots, u_I) = t_{v,R}^I\left(t_v^{-1}(u_1), \ldots, t_v^{-1}(u_I)\right) \qquad (63.91)$$

where $t_{v,R}^I$ represents the distribution function of $\frac{\sqrt{v}}{\sqrt{S}}Z$, Z is an I-dimensional normal random vector independent of S with mean zero and covariance matrix R, $t_v^{-1}$ denotes the inverse of the univariate t-student distribution function with v degrees of freedom, and $R_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}\Sigma_{jj}}}$.

(v) Archimedean Copulas. An I-dimensional Archimedean copula function is represented by

$$C(u_1, \ldots, u_I) = \phi^{-1}\left(\phi(u_1) + \ldots + \phi(u_I)\right) \qquad (63.92)$$

where the function $\phi: [0,1] \to \mathbb{R}_+$, called the generator of the copula, is invertible and satisfies the conditions $\phi'(u) < 0$, $\phi''(u) > 0$, $\phi(1) = 0$, and $\phi(0) = \infty$. Examples of generator functions are:

Clayton: $\phi(u) = \dfrac{u^{-\theta} - 1}{\theta}$, for $\theta \ge 0$

Frank: $\phi(u) = -\ln\dfrac{e^{-\theta u} - 1}{e^{-\theta} - 1}$, for $\theta \in \mathbb{R}\setminus\{0\}$

Gumbel: $\phi(u) = (-\ln u)^\theta$, for $\theta \ge 1$

Product: $\phi(u) = -\ln u$

Studies incorporating copulas into the reduced-form approach to account for default dependence include Li (2000a), Schonbucher and Schubert (2001), and Frey and McNeil (2001).

24 For a more detailed description of copula theory, please refer to Joe (1997), Frees and Valdez (1998), Costinot et al. (2000), and Embrechts et al. (2001).
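Sampling correlated default times from the normal copula (63.89) is a short recipe: draw correlated normals, map them to uniforms, and invert the marginal survival functions. The following sketch (ours; it assumes exponential marginals and an illustrative correlation matrix, and uses SciPy only for the normal distribution function) demonstrates the approach.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
lam = np.array([0.02, 0.03, 0.05])         # marginal hazard rates (assumed)
Sigma = np.array([[1.0, 0.4, 0.4],
                  [0.4, 1.0, 0.4],
                  [0.4, 0.4, 1.0]])        # copula correlation matrix (assumed)

L = np.linalg.cholesky(Sigma)
Z = rng.standard_normal((100_000, 3)) @ L.T   # correlated normals
U = norm.cdf(Z)                               # uniforms with Gaussian dependence
tau = -np.log(U) / lam                        # S_i^{-1}(u) = -ln(u)/lam_i

print("P(all three default within 10y):", np.mean((tau <= 10.0).all(axis=1)))
```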
63.5.3 Empirical Issues

63.5.3.1 The Components of Yield Spread

Understanding the determinants of corporate bond spreads is important for both academics and practitioners. For academics, the valuation of corporate bonds requires a pricing model that incorporates all relevant factors. From the investment perspective, investors need to know the required premia for default, liquidity, and taxes in order to be compensated properly for the risk and tax burden of holding corporate bonds. Furthermore, from the corporate finance perspective, understanding the components of the corporate yield spread aids in capital structure decisions as well as in the determination of the timing and maturity of debt and equity issuances.

A vast literature has been devoted to the determinants of corporate bond spreads. This includes, among others, Jones et al. (1984), Duffee (1999), Duffie and Singleton (1999a), Elton et al. (2001), Collin-Dufresne et al. (2001), Huang and Huang (2003), Eom et al. (2004), De Jong and Driessen (2004), and Ericsson and Renault (2006). These studies report mixed results for the components of corporate bond yield spreads. Jones et al. (1984) apply the Merton model to a sample of firms with simple capital structures and secondary market prices during the 1977-1981 period and find that the predicted prices (yields) from the model are too high (low). Ogden (1987) finds that the Merton model underpredicts spreads by 104 basis points on average. Lyden and Saraniti (2000) compare the Merton model and the Longstaff and Schwartz (1995) model and find that both models underestimate yield spreads. Huang and Huang (2003) evaluate the performance of the most popular term structure models and report estimates of default risk premia substantially below the corporate bond spread. Collin-Dufresne et al. (2001) analyze the effect of key financial variables suggested by structural models on corporate bond spreads and find that these variables explain only a small portion of the variation in spreads. Sarig and Warga (1989), and
Helwege and Turner (1999) examine the shape of the credit term structure, while Duffee (1999) and Brown (2001) test the correlation between interest rates and spreads. Most of these empirical studies conclude that term structure models underpredict yield spreads. On the other hand, Eom et al. (2004) empirically test five structural models, namely the Merton model and the Geske (1977), Longstaff and Schwartz (1995), Leland and Toft (1996), and Collin-Dufresne et al. (2001) models. The results show that term structure models can over- or underestimate corporate bond spreads and that prediction errors are high.

Besides default risk, liquidity and taxes are two additional factors that affect corporate bond yield spreads. One of the biggest challenges in term structure modeling is to estimate the liquidity premium in corporate bond spreads. Empirical estimation of the liquidity premium is difficult because liquidity is unobservable and bond prices reflect only the combined effects of liquidity and default risk. Thus, the liquidity and default premia cannot be separately identified from data on the term structure of corporate bond prices alone. Longstaff et al. (2005) overcome this identification problem by using additional information from the credit default swap market. They find that the majority of the corporate bond spread is due to default risk and that the non-default component is time varying and strongly related to measures of bond-specific illiquidity as well as macroeconomic measures of bond market liquidity. Specifically, when the Treasury curve is used as the default-free discount function, the average size of the default component is 51% for AAA/AA bonds, 56% for A bonds, 71% for BBB bonds, and 83% for BB bonds.

Yawitz et al. (1985) estimate nonlinear models of corporate and municipal bonds with default risk and taxes. Elton et al. (2001) and Liu et al. (2007) show that the tax premium accounts for a significant portion of corporate bond spreads. Elton et al. (2001) assume a tax rate equal to the typical statutory state income tax. Liu et al. (2007) propose a pricing model that accounts for a stochastic default probability and differential tax treatments of discount and premium bonds. By estimating the parameters directly from bond data, they obtain a significantly positive income tax rate for the marginal investor after 1986. The empirical evidence shows that taxes explain a substantial portion of observed spreads. Taxes on average account for 60, 50, and 37% of the observed corporate-Treasury yield spreads for AA, A, and BBB bonds, respectively. Lin et al. (2007) further account for stochastic default probability, liquidity, and differential tax treatments of discount and premium bonds in the pricing model. The model provides more precise estimates of the tax and liquidity components of spreads. They find that a substantial portion of the corporate yield spread is due to taxes and liquidity. The liquidity component of the spread is highly correlated with bond-specific and market-wide liquidity measures, whereas
the tax component is insensitive to these liquidity measures. On average, 51% of the corporate yield spread is attributable to the default component, 32% to the tax component, and 17% to the liquidity component. The default component represents 39% of the spread for AAA/AA bonds, 46% for A bonds, 60% for BBB bonds, and 73% for BB bonds. The tax component explains 39% of the spread for AAA/AA bonds, 36% for A bonds, 25% for BBB bonds, and 16% for BB bonds. The liquidity component accounts for 21% of the spread for AAA/AA bonds, 18% for A bonds, 15% for BBB bonds, and 11% for BB bonds. Berndt et al. (2006) investigate the source of common variation in U.S. credit markets that is not related to changes in risk-free rates or expected default losses. They extract a latent common component from firm-specific changes in default risk premia, which they name the "default risk premium (DRP) factor," and find that its changes are priced in the corporate bond market. The DRP factor can explain up to 35% of credit market returns. Moreover, the DRP factor also captures the jump-to-default risk associated with market-wide credit events.
63.5.3.2 State Variables: Latent or Observable

Standard term structure models specify the default intensity as a function of latent variables that are unobservable and follow some diffusion processes. By contrast, some empirical studies specify the default intensity as a function of observable state variables in reduced-form models. For example, Bakshi et al. (2006) model the aggregate defaultable discount rate $R_t = r_t + (1-\omega)\lambda_t$ as:

$$R_t = \Lambda_0 + \Lambda_r r_t + \Lambda_Y Y_t \qquad (63.93)$$

where $Y_t$ denotes the firm-specific distress index. Bakshi et al. (2006) consider leverage, book-to-market, profitability, equity volatility, and distance-to-default as the firm-specific distress variables and show that interest rate risk is of first-order importance for explaining variations in single-name defaultable bond yields. When applied to low-grade bonds, a credit risk model that takes leverage into consideration reduces absolute yield mispricing by as much as 30%. Janosi et al. (2002) assume that

$$R_t = a_0 + a_1 r_t + a_2 Z_t \qquad (63.94)$$
where $Z_t$ is a standard Brownian motion driving the S&P 500 index. Chava and Jarrow (2004) estimate a reduced-form model with accounting and market variables using historical bankruptcy data. In their model, the default correlation can be computed directly from the physical intensity rather than from intensities
transformed from the risk-neutral intensity estimated from credit spreads. Using this model, one can also estimate an affine model with latent variables from bankruptcy, which better captures the common variations in default rates and may lead to more accurate default correlation estimates.
63.6 Interest Rate and Credit Default Swaps

63.6.1 Valuation of Interest Rate Swap

An interest rate swap is an agreement between two parties (known as counterparties) in which one stream of future interest payments is exchanged for another based on a specified principal amount. Interest rate swaps often exchange a fixed payment for a floating payment that is linked to an interest rate (most often the LIBOR). With an interest rate swap, a company agrees to pay cash flows equal to interest at a predetermined fixed rate on a notional principal for a number of years. In return, it receives interest at a floating rate on the same notional principal for the same period of time. Interest rate swaps are simply the exchange of one set of cash flows (based on interest rate specifications) for another. Swaps are contracts set up between two or more parties and can be customized in many different ways. When an interest rate swap is first initiated, it is generally a plain vanilla contract and its value is zero. However, its value can become positive or negative as time goes on. Similar to the pricing of corporate bonds, there are two approaches to valuing an interest rate swap: the structural approach and the reduced-form approach.25 Structural models such as Cooper and Mello (1991) and Li (1998) use Merton's (1974) approach to price the interest rate swap. Models developed more recently adopt the reduced-form approach, which regards the swap as the difference between two bonds and focuses on the rate used to discount the future cash flows of the interest rate swap. Studies using reduced-form models of interest rate swaps include, among others, Duffie and Huang (1996), Duffie and Singleton (1997), Gupta and Subrahmanyam (2000), Collin-Dufresne and Solnik (2001), Grinblatt (2001), Liu et al. (2004), and Li (2006). In what follows, we focus on the literature on the reduced-form approach.

25 In practice, there is another approach that values the interest rate swap as a series of forward rate agreements (FRAs); see Hull (2006).

Consider a plain vanilla fixed-for-floating swap with maturity $\tau$ and nominal principal equal to 1. The floating side is reset semi-annually to the 6-month LIBOR rate from 6 months prior. The fixed side pays a coupon rate c at the reset dates. Let $r_t^L$ be the LIBOR rate set at date t for loans
maturing 6 months later. From the standpoint of the floating-rate payer, an interest rate swap can be regarded as a long position in a fixed-rate bond and a short position in a floating-rate bond. The fixed-side coupon rate c is set at the initial date t so that the present value of the expected net cash flows from the long and short positions is zero at the initial date; that is,

$$0 = \sum_{j=1}^{2\tau} E_t^Q\left[\exp\left(-\int_t^{t+0.5j} R_s \, ds\right)\left(c - r_{t+0.5(j-1)}^L\right)\right] \qquad (63.95)$$

where $R_s$ are the discount rates of the cash flows. Then

$$c = \frac{\displaystyle\sum_{j=1}^{2\tau} E_t^Q\left[\exp\left(-\int_t^{t+0.5j} R_s \, ds\right) r_{t+0.5(j-1)}^L\right]}{\displaystyle\sum_{j=1}^{2\tau} E_t^Q\left[\exp\left(-\int_t^{t+0.5j} R_s \, ds\right)\right]} \qquad (63.96)$$

With the assumption that risky zero-coupon bonds are priced at the appropriate LIBOR rate in the interbank lending market, Duffie and Singleton (1997) show that

$$c = \frac{1 - B_t^{\tau}}{\sum_{j=1}^{2\tau} B_t^{0.5j}} \qquad (63.97)$$

with

$$B_t^m = E_t^Q\left[\exp\left(-\int_t^{t+m} R_s \, ds\right)\right] \qquad (63.98)$$

Here $B_t^m$ is the present value of 1 dollar payable at time t+m. For the floating-rate payer, the floating-rate swap with a face value of 1 at time $t+\tau$ is equivalent to a floating-rate bond, which is valued at par at the initial date. Thus, the value of the floating side of the swap is 1 minus the present value of 1 dollar payable at time $t+\tau$, $B_t^{\tau}$. Duffie and Singleton (1997) suggest that $R_t$ can also be interpreted as a default-adjusted discount rate. If the recovery rate is $\omega_t$ and the default intensity is $\lambda_t$, then

$$R_t = r_t + (1 - \omega_t)\lambda_t \qquad (63.99)$$

If the relative liquidities of the interest rate swap and Treasury markets are also considered,

$$R_t = r_t + (1 - \omega_t)\lambda_t - l_t \qquad (63.100)$$

where $l_t$ is a convenience yield that accounts for the effect of differences in liquidity and repo specialness between the Treasury and the swap market.26

26 See, for example, Grinblatt (2001).

Duffie and Huang (1996) modify $R_t$ to account for the counterparties' asymmetric default risks. The economic intuition is clear. Suppose that at any given time t the current market value of the swap with no default is $V_t$ for party A, which can be positive or negative; the corresponding value for party B is $-V_t$. If $V_t > 0$, party A is at risk to the default of party B between t and t+1. Thus, under the risk-neutral measure, $V_t$ equals the default probability of party B between t and t+1 multiplied by the recovery value, plus the survival probability of party B between t and t+1 multiplied by the market value given no default, which is the risk-neutral expected present value of receiving $V_{t+1}$ at t+1, plus any interest paid to A by B between t and t+1. If $V_t < 0$, the recursion is the same, except that now B is at risk to the default of A, so the default probability and recovery rate are those of A. Mathematically, the discount rate is

$$R_{v,t} = r_t + s_t^A 1_{\{v<0\}} + s_t^B 1_{\{v \ge 0\}} \qquad (63.101)$$

with

$$s_t^i = (1 - \omega_t^i)\lambda_t^i \qquad (63.102)$$

Liu et al. (2004) use a five-factor affine framework to model the swap spread, specifying the short rate and the spread as affine functions of the five state variables (63.103), where $\delta_0$, $\delta_1$, $\delta_2$, and $\gamma$ are constants and $[Y_1, Y_2, Y_3, Y_4, Y_5]$ are state variables whose dynamics under the risk-neutral measure follow

$$dY_t = -\beta Y_t \, dt + \Sigma \, dW_t^Q \qquad (63.104)$$

Li (2006) proposes a similar reduced-form model of interest rate swaps:

$$\begin{cases} r_t = \delta_0 + Y_{1t} + Y_{2t} + Y_{3t} \\ \lambda_t = \delta_1 r_t + Y_{4t} \\ l_t = \delta_2 r_t + Y_{5t} \\ dY_{it} = \kappa_i(\theta_i - Y_{it}) \, dt + \sigma_i \sqrt{Y_{it}} \, dW_{it}^Q, \quad i = 1, 2, 3, 4 \\ dY_{5t} = \kappa_5(\theta_5 - Y_{5t}) \, dt + \sigma_5 \, dW_{5t}^Q \end{cases} \qquad (63.105)$$

where $W_{it}^Q$ are independent standard Brownian motions under the risk-neutral measure.
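A quick numerical illustration of (63.97)-(63.98): given discount factors $B_t^m$ from a default-adjusted curve (here, an assumed flat curve R), the fixed swap rate follows in a few lines.

```python
import numpy as np

R, tau = 0.045, 5.0                        # flat discount rate and 5-year maturity (assumed)
m = np.arange(1, int(2 * tau) + 1) * 0.5   # semiannual payment times
B = np.exp(-R * m)                         # discount factors B_t^m from (63.98)
c_semi = (1.0 - B[-1]) / B.sum()           # per-period fixed coupon from (63.97)
print("annualized swap rate:", 2 * c_semi)
```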
63.6.2 Valuation of Credit Default Swaps

Credit derivatives have emerged as a remarkable and rapidly growing area in global derivatives and risk management practice and have been perhaps the most significant and successful
financial innovation of the last decade. The growth of the global credit derivatives market continues to exceed expectations. According to BBA (British Bankers’ Association) Credit Derivatives Report 2006, the outstanding notional amount of the market will reach $33 trillion at the end of 2008. Single-name credit default swaps (CDS) represent a substantial proportion of the market. CDS are the most liquid products among the credit derivatives currently traded, and make up the bulk of trading volume in credit derivatives markets. Moreover, CDS along with total return swaps and credit spread options are the basic building blocks for more complex structured credit products. The CDS market has supplanted the bond market as the industry gauge for a borrower’s credit quality. Credit default swaps are structured as instruments that make an agreed payoff upon the occurrence of a credit event. That is, in a CDS, the protection seller and the protection buyer enter a contract, which requires that the protection seller compensates the protection buyer if a default event occurs before maturity of the contract. If there is no default event before maturity, the protection seller pays nothing. In return, the protection buyer typically pays a constant quarterly fee to the protection seller until default or maturity, whichever comes first. This quarterly payment, usually expressed as a percentage of its notional principal value, is the CDS spread or premium.
63.6.2.1 Credit Event in CDS

As with all other financial markets, the liquidity and efficiency of aligning buyers and sellers depend on consistent, reliable, and understandable legal documentation. The International Swaps and Derivatives Association (ISDA) has been a strong force in maintaining the uniformity of documentation of CDS products through the assistance and support of its members, primarily the dealer community. The credit definitions by ISDA allow specification of the following credit events:

1. Bankruptcy,
2. Failure to pay above a nominated threshold (say, in excess of US$1 million) after expiration of a specific grace period (say, two to five business days),
3. Obligation default or obligation acceleration,
4. Repudiation or moratorium (for sovereign entities), and
5. Restructuring.

There are significant issues in defining the credit events, reflecting the heterogeneous nature of credit obligations. In general, items 1, 2, and 5 are commonly used as credit events in CDS for firms. Four types of restructuring have been set by ISDA: full restructuring; modified restructuring (only bonds with maturity shorter than 30 months can be delivered); modified-modified restructuring (restructured obligations with maturity shorter than 60 months and other obligations with maturity shorter than 30 months can be delivered); and no restructuring.

The payment following the occurrence of a credit event is either repayment at par against physical delivery of a reference obligation (physical settlement) or the notional principal minus the post-default market value of the reference obligation (cash settlement). In practice, physical settlement is the dominant settlement mechanism, though its share has dropped considerably (according to the BBA Credit Derivatives Report 2006). The delivery of obligations in the case of physical settlement can be restricted to a specific instrument, though usually the buyer may choose from a list of qualifying obligations, irrespective of currency and maturity, as long as they rank pari passu with (have the same seniority as) the reference obligation. This latter feature is commonly referred to as the cheapest-to-deliver option. Theoretically, all deliverable obligations should have the same price at default and the delivery option would be worthless. However, in some credit events, for example a restructuring, this option is favorable to the buyer, since he/she can deliver the cheapest bonds to the seller. Counterparties can limit the value of the cheapest-to-deliver option by restricting the range of deliverable obligations, for example, to non-contingent, interest-paying bonds.
63.6.2.2 Valuation of Credit Default Swap Without Liquidity Effect

Let c be the premium paid by the buyer of default protection. Assuming that the premium is paid continuously, the present value of the premium leg of a credit default swap can be written as

$$c \, E_t^Q\left[\int_t^T \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right] \qquad (63.106)$$

If the bond defaults, a bondholder recovers a fraction $\omega$ of the par value and the seller of default protection pays $1-\omega$ of the par value to the buyer. The value of the protection leg of the credit default swap is given by

$$E_t^Q\left[\int_t^T (1-\omega)\lambda_s \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right] \qquad (63.107)$$

Equating the premium leg to the protection leg, we can solve for the CDS premium:

$$c = \frac{E_t^Q\left[(1-\omega)\displaystyle\int_t^T \lambda_s \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right]}{E_t^Q\left[\displaystyle\int_t^T \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right]} \qquad (63.108)$$

An analytical solution can be obtained if we assume that $r_t$ and $\lambda_t$ follow the affine class of diffusions in the risk-neutral measure. For example, Longstaff et al. (2005) assume a CIR process for $\lambda_t$:

$$d\lambda_t = \kappa(\theta - \lambda_t) \, dt + \sigma\sqrt{\lambda_t} \, dW_t^Q \qquad (63.109)$$

where $W_t^Q$ is a standard Brownian motion under the risk-neutral measure. Other related studies include Duffie (1999), Hull and White (2000, 2001), and Houweling and Vorst (2005).

63.6.2.3 Valuation of Credit Default Swap with Liquidity

The liquidity effect on the CDS is asymmetric, since the premium leg and the protection leg are subject to different liquidity risks. The premium leg should be discounted with a CDS-specific liquidity factor, since it depends on the liquidity of the CDS market, while the protection leg has no liquidity problem.27 The CDS premium is then given by

$$c = \frac{E_t^Q\left[(1-\omega)\displaystyle\int_t^T \lambda_s \exp\left(-\int_t^s (r_u + \lambda_u) \, du\right) ds\right]}{E_t^Q\left[\displaystyle\int_t^T \exp\left(-\int_t^s (r_u + \lambda_u + l_u) \, du\right) ds\right]} \qquad (63.110)$$

27 Bühler and Trapp (2007) argue that the protection leg should be discounted with a corporate-bond-specific liquidity factor if the protection is paid in physical settlement. However, we argue that this is unnecessary, since it can be introduced easily by adjusting $\omega$.
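With constant r, $\lambda$, and $\omega$, both legs of (63.108) contain the same annuity factor, so the premium reduces to the well-known "credit triangle" $c = (1-\omega)\lambda$. The sketch below verifies this numerically with illustrative inputs.

```python
import numpy as np

# Constant-parameter special case of (63.108); all inputs are illustrative.
r, lam, omega, T = 0.04, 0.02, 0.4, 5.0
a = r + lam
annuity = (1.0 - np.exp(-a * T)) / a              # value of premium leg per unit premium
protection = (1.0 - omega) * lam * annuity        # value of protection leg
premium = protection / annuity                    # (63.108)
print(premium, (1.0 - omega) * lam)               # identical: 0.012
```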
63.6.3 Empirical Issues

Due to the rapidly growing swap markets, a number of empirical studies have attempted to identify the possible determinants of swap spreads and related issues concerning the pricing mechanism, market efficiency, and information spillovers among different markets.

63.6.3.1 Determinants of Interest Rate Swap Spread

One of the stylized facts we observe for interest rate swaps is that there is a positive spread between the swap rate and the government default-free interest rate, termed the swap spread. Most empirical studies of interest rate swaps focus on the determinants of swap spreads, which basically include two components: a default component and a liquidity component. The default component generally involves two types of default risk. First, the counterparties may default on their future obligations; this is called counterparty default risk. Second, the underlying floating rate in a swap contract is usually set at the LIBOR rate, which is a default-risky interest rate.28

Sun et al. (1993) provide the earliest empirical investigation of default risk in the swap spread and find evidence of a default risk premium in the swap spread.29 Brown et al. (1994) study U.S. dollar swaps from 1985 to 1991 and find a positive relationship between the LIBOR spread and the swap spread, while Eom et al. (2000) find similar evidence in Japan. Minton (1997) uses two proxies for the default risk premium (the corporate quality spread and the aggregate default spread)30 and finds that default risk is important for interest rate swap spreads. In general, a 100 basis-point increase in the bond spread of BBB bonds results in a 12-15 basis-point increase in the swap spread. Lang et al. (1998) argue that the sharing of the surplus created by swaps affects swap spreads. Fehle (2003) runs VAR regressions of swap spreads on default risk and liquidity proxies using international data. The difference between the LIBOR and the Treasury-bill rate is employed as a proxy for liquidity, while the level, slope, and volatility of the term structure and the difference between the yields on a portfolio of corporate bonds and a corresponding Treasury bond act as proxies for default risk. The results show that swap spreads are sensitive to bond spreads in most currencies and maturities. However, there are no clear patterns across bond spreads from different ratings and across swap maturities.

Most of these studies use a linear regression approach and do not apply a dynamic reduced-form model. Within the reduced-form framework, Duffie and Singleton (1997) find that both liquidity and default risks are necessary to explain the variation in swap spreads. However, the effect of the liquidity factor does not last long, while default risk becomes more important over longer time horizons. He (2000) finds that the liquidity component can explain most of the variation in swap spreads. Grinblatt (2001) attributes the swap spread to liquidity differences between government securities and short-term Eurodollar borrowing and finds that his model can generate a wide variety of swap spread curves and explain about 35-40% of the variation in U.S. swap spreads across time. Li (2006) attributes the liquidity component of swap spreads to the liquidity difference between the Treasury and swap markets and decomposes swap spreads into default and liquidity components. The parameter estimates show that the default and liquidity components of swap spreads are both negatively related to riskless interest rates. A further analysis reveals that default risk accounts for the levels of swap spreads, while the liquidity component explains most of the volatility of swap spreads.

28 Li (2006) summarizes six reasons to suggest that counterparty default risk is not important for swap spreads.
29 See also Sundaresan (2003).
30 The corporate quality spread is defined as the difference between the yields on portfolios of Moody's Baa-rated corporate debt and portfolios of Moody's Aaa-rated corporate debt, while the aggregate default spread is defined as the difference between the yields on portfolios of Moody's Baa-rated corporate debt and the average of 10- and 30-year US Treasury yields that match the maturities of the Baa-rated corporate debt.
63.6.3.2 Determinants of Credit Default Swap Spread

Given the short history of the credit derivatives market and the limited data availability, there has been little empirical work in this arena. Most studies focus on the determinants of CDS spreads, spillovers between CDS and other financial markets, and the role of CDS in the price discovery of credit conditions.
63.6.3.2.1 Determinants of CDS Spreads

Houweling and Vorst (2005) implement a set of reduced-form models on market CDS quotes and corporate bond quotes and find that financial markets may not regard Treasury bonds as the default-free benchmark. Zhu (2006) examines the long-term pricing accuracy of the CDS market relative to the bond market. His study looks into the underlying factors that explain the price differentials and explores the short-term dynamic linkages between the two markets in a time series framework. The panel data study and the VECM analysis both suggest that short-term deviations between the two markets are largely due to the higher responsiveness of CDS spreads to changes in credit conditions.

Zhang et al. (2005) introduce the jump risks of individual firms to explain credit default swap spreads. Using both historical and realized measures as proxies for various aspects of jump risk, they find evidence that long-run historical volatility, short-run realized volatility, and various jump-risk measures all have statistically significant and economically meaningful effects on credit spreads. More important, the sensitivities of credit spreads to volatility and jump risk depend on the credit grade of the entities, and these relationships are nonlinear. Negative jumps tend to have larger effects.

Blanco et al. (2005) test the theoretical equivalence of credit default swap spreads and credit spreads derived by
Duffie (1999). Their empirical evidence strongly supports the parity relation as an equilibrium condition. Moreover, CDS spreads lead credit spreads in the price discovery process.
63.6.3.2.2 Spillover Between CDS and Other Financial Markets

Longstaff et al. (2003) examine weekly lead-lag relationships between CDS spread changes, corporate bond spread changes, and stock returns of U.S. firms in a VAR framework. They find that both the stock and CDS markets lead the corporate bond market, which supports the hypothesis that information flows first into the credit derivatives and stock markets and then into the corporate bond markets. However, there is no clear lead pattern of the stock market to the CDS market or vice versa.

Jorion and Zhang (2007) examine the information transfer effect of credit events across industries and document an intra-industry credit contagion effect in the CDS market. The empirical evidence strongly supports the dominance of contagion effects over competition effects for Chapter 11 bankruptcies and of competition effects over contagion effects for Chapter 7 bankruptcies.

Acharya and Johnson (2007) quantify insider trading in the CDS market and show that the information flow from the CDS market to the stock market is greater for negative credit news and for entities that subsequently experience adverse shocks. The degree of information flow increases with the number of banks that have ongoing lending (and hence monitoring) relationships with a given entity. The results suggest that the CDS market leads the stock market in information transmission.
63.6.3.2.3 Price Discovery of Credit Condition

Norden and Weber (2004) analyze the response of the stock and CDS markets to rating announcements made by the three major rating agencies during the period 2000-2002. The results show that the CDS market reacts earlier than the stock market to reviews for downgrade by S&P and Moody's. Hull et al. (2004) examine the relationship among CDS spreads, bond yields, and the benchmark risk-free rates used by participants in the derivatives market. They show that using swap rates as the risk-free benchmark produces a better goodness-of-fit than using other risk-free rate proxies such as Treasury rates. Their empirical evidence also suggests that the CDS market anticipates credit rating announcements, especially negative rating events. Tang and Yan (2006) study the effects of liquidity in the CDS market and liquidity spillover from other markets to the
CDS market using a large data set. They find substantial liquidity spillover from bond, stock, and option markets to the CDS market.
63.7 Concluding Remarks

This paper provides a comprehensive survey of term structure models, pricing applications, and empirical evidence. Historically, two major considerations shape the development of DTSMs: (1) explaining the stylized facts of the term structure; and (2) the tradeoff between mathematical complexity and analytical tractability. We begin with a generalized pricing framework in which most DTSMs are nested. Based on different specifications of the risk-neutral dynamics of the state variables and of the mapping function between the state variables and the short-term interest rate, we categorize DTSMs as affine, quadratic, regime-switching, jump, or stochastic volatility models. We compare the empirical performance of these DTSMs in fitting interest rate behavior in the physical measure and the prices of default-free government bonds. Empirical findings on DTSMs are not conclusive. Multi-factor models seem to perform better than single-factor models. However, there remain serious concerns about the applicability of nonlinear DTSMs. Moreover, the economic value of some DTSMs in bond pricing and risk management, and the relationship between the dynamics of term structures under the risk-neutral and physical measures, remain open questions.

We also evaluate the usefulness of DTSMs in the pricing of defaultable bonds. In standard term structure models, the yield spread is determined by two factors: the risk of default (modeled by the default intensity) and the expected loss in the event of default (modeled by the recovery rate). However, most of the empirical evidence has shown that default risk can explain only a portion of credit spreads, and that non-default components, such as liquidity and taxes, are also important for the credit spread. Lin et al. (2007) propose a corporate bond pricing model that incorporates the default probability, liquidity, and taxes to decompose the corporate bond yield spread into three components. They find that default, liquidity, and taxes are all important factors for explaining corporate yield spreads. However, there are two caveats. First, Lin et al. (2007) empirically follow the approach of Longstaff et al. (2005) by assuming that the CDS premium contains no liquidity component. This assumption has been questioned by several studies, such as Acharya and Johnson (2007) and Tang and Yan (2006). Second, Lin et al. (2007) assume that the liquidity intensity and the default intensity are independent, while in reality they are more likely to be correlated. To accommodate this correlation, we need to obtain a closed-form solution for the corporate bond pricing formula with correlated factors.
Finally, research on the components of swap spreads is inconclusive. Most studies assume that the CDS premium contains no liquidity component, while several recent studies show the existence of a liquidity premium in CDS. Other potentially interesting research subjects in this area include the significance of fixed-income derivative markets in affecting information transmission, price discovery, and liquidity in the spot markets. For example, CDS trading can affect corporate bond trading in two opposite ways. First, it provides an easier way to trade credit risk, which makes investors more reluctant to trade corporate bonds and hence decreases corporate bond liquidity. Second, CDS trading provides a way to hedge credit risk, which complements corporate bond trading and increases the liquidity of corporate bonds. Fixed income research will continue to be an exciting field. The recent literature on pricing derivatives using DTSMs shows an enormous potential for new insights from using derivatives data in model estimation. It is expected that the fixed income derivative market will provide important information for researchers to better understand the credit risk and liquidity of the underlying market and to develop more sophisticated term structure models to address the unresolved issues.
References

Acharya, V. V. and J. N. Carpenter. 2002. “Corporate bond valuation and hedging with stochastic interest rates and endogenous bankruptcy.” Review of Financial Studies 15, 1355–1383.
Acharya, V. V. and T. C. Johnson. 2007. “Insider trading in credit derivatives.” Journal of Financial Economics 84, 110–141.
Ahn, D. H. and B. Gao. 1999. “A parametric nonlinear model of term structure dynamics.” Review of Financial Studies 12, 721–762.
Ahn, C. M. and H. E. Thompson. 1988. “Jump-diffusion processes and the term structure of interest rates.” Journal of Finance 43, 155–174.
Ahn, D. H., R. F. Dittmar, and A. R. Gallant. 2002. “Quadratic term structure models: theory and evidence.” Review of Financial Studies 15, 243–288.
Aït-Sahalia, Y. 1996. “Testing continuous-time models of the spot interest rate.” Review of Financial Studies 9, 385–426.
Aït-Sahalia, Y. 1999. “Transition densities for interest rate and other nonlinear diffusions.” Journal of Finance 54, 1361–1395.
Andersen, T. G. and J. Lund. 1997. “Estimating continuous-time stochastic volatility models of the short-term interest rate.” Journal of Econometrics 77, 343–377.
Anderson, D. W. and S. Sundaresan. 1996. “Design and valuation of debt contracts.” Review of Financial Studies 9, 37–68.
Ang, A. and G. Bekaert. 2002. “Regime switches in interest rates.” Journal of Business and Economic Statistics 20, 163–182.
Ang, A. and M. Piazzesi. 2003. “A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables.” Journal of Monetary Economics 50, 745–787.
Backus, D., A. W. Gregory, and S. E. Zin. 1989. “Risk premiums in the term structure: evidence from artificial economies.” Journal of Monetary Economics 24, 371–399.
Backus, D., S. Foresi, A. Mozumdar, and L. Wu. 2001. “Predictable changes in yields and forward rates.” Journal of Financial Economics 59, 281–311.
Bakshi, G., C. Cao, and Z. Chen. 1997. “Empirical performance of alternative option pricing models.” Journal of Finance 52, 2003–2049.
Bakshi, G., D. Madan, and F. Zhang. 2006. “Investigating the role of systematic and firm-specific factors in default risk: lessons from empirically evaluating credit risk models.” Journal of Business 79, 1955–1987.
Balduzzi, P. and Y. Eom. 2000. Non-linearities in U.S. Treasury rates: a semi-nonparametric approach, Working paper, Boston College.
Balduzzi, P., S. R. Das, S. Foresi, and R. Sundaram. 1996. “A simple approach to three factor affine term structure models.” Journal of Fixed Income 6, 43–53.
Bali, T. G. 2000. “Testing the empirical performance of stochastic volatility models of the short-term interest rate.” Journal of Financial and Quantitative Analysis 35, 191–215.
Ball, C. A. and W. N. Torous. 1999. “The stochastic volatility of short-term interest rates: some international evidence.” Journal of Finance 54, 2339–2359.
Bansal, R. and H. Zhou. 2002. “Term structure of interest rates with regime shifts.” Journal of Finance 57, 1997–2043.
Baz, J. and S. R. Das. 1996. “Analytical approximation of the term structure for jump-diffusion process: a numerical analysis.” Journal of Fixed Income 6, 78–86.
Beaglehole, D. and M. Tenney. 1991. “General solutions of some interest rate-contingent claim pricing equations.” Journal of Fixed Income 1, 69–83.
Bekaert, G. and R. J. Hodrick. 2001. “Expectations hypotheses tests.” Journal of Finance 56, 1357–1394.
Bekaert, G., R. J. Hodrick, and D. A. Marshall. 1997. “On biases in tests of the expectations hypothesis of the term structure of interest rates.” Journal of Financial Economics 44, 309–348.
Berndt, A., A. A. Lookman, and I. Obreja. 2006. “Default risk premia and asset returns.” AFA 2007 Chicago Meetings Paper.
Bielecki, T. R. and M. Rutkowski. 2000. “Multiple ratings model of defaultable term structure.” Mathematical Finance 10, 125–139.
Bielecki, T. R. and M. Rutkowski. 2004. “Modeling of the defaultable term structure: conditionally Markov approach.” IEEE Transactions on Automatic Control 49, 361–373.
Black, F. and J. C. Cox. 1976. “Valuing corporate securities: some effects of bond indenture provisions.” Journal of Finance 31, 351–367.
Black, F. and P. Karasinski. 1991. “Bond and option pricing when short rates are lognormal.” Financial Analysts Journal 47, 52–59.
Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81, 637–654.
Black, F., E. Derman, and W. Toy. 1990. “A one-factor model of interest rates and its application to treasury bond options.” Financial Analysts Journal 46, 33–39.
Blanco, R., S. Brennan, and I. Marsh. 2005. “An empirical analysis of the dynamic relation between investment-grade bonds and credit default swaps.” Journal of Finance 60, 2255–2281.
Boudoukh, J., M. Richardson, R. Stanton, and R. F. Whitelaw. 1998. The stochastic behavior of interest rates: implications from a multifactor, nonlinear continuous-time model, Working paper, NBER.
Brennan, M. J. and E. S. Schwartz. 1979. “A continuous time approach to the pricing of bonds.” Journal of Banking and Finance 3, 133–155.
Briys, E. and F. de Varenne. 1997. “Valuing risky fixed rate debt: an extension.” Journal of Financial and Quantitative Analysis 32, 239–248.
Brown, D. 2001. “An empirical analysis of credit spread innovations.” Journal of Fixed Income 11, 9–27.
Brown, S. J. and P. H. Dybvig. 1986. “The empirical implications of the Cox, Ingersoll, Ross theory of the term structure of interest rates.” Journal of Finance 41, 617–630.
Brown, R. H. and S. M. Schaefer. 1994. “The term structure of real interest rates and the Cox, Ingersoll and Ross model.” Journal of Financial Economics 35, 3–42.
Brown, K. C., W. V. Harlow, and D. J. Smith. 1994. “An empirical analysis of interest rate swap spread.” Journal of Fixed Income 4, 61–78.
Bühler, W. and M. Trapp. 2007. Credit and liquidity risk in bond and CDS markets, Working paper, Universität Mannheim.
Campbell, J. Y. 1986. “A defense of traditional hypotheses about the term structure of interest rates.” Journal of Finance 41, 183–193.
Campbell, J. Y. and J. H. Cochrane. 1999. “By force of habit: a consumption-based explanation of aggregate stock market behavior.” Journal of Political Economy 107, 205–251.
Campbell, J. Y. and R. J. Shiller. 1991. “Yield spreads and interest rate movements: a bird’s eye view.” Review of Economic Studies 58, 495–514.
Cathcart, L. and L. El-Jahel. 1998. “Valuation of defaultable bonds.” Journal of Fixed Income 8, 65–78.
Chan, K. C., G. A. Karolyi, F. A. Longstaff, and A. B. Sanders. 1992. “An empirical comparison of alternative models of the short-term interest rate.” Journal of Finance 47, 1209–1227.
Chapman, D. A. and N. D. Pearson. 2000. “Is the short rate drift actually nonlinear?” Journal of Finance 55, 355–388.
Chapman, D. A. and N. D. Pearson. 2001. “Recent advances in estimating term-structure models.” Financial Analysts Journal 57, 77–95.
Chava, S. and R. A. Jarrow. 2004. “Bankruptcy prediction with industry effects.” Review of Finance 8, 537–569.
Chen, R. and L. Scott. 1993. “Maximum likelihood estimation for a multifactor equilibrium model of the term structure of interest rates.” Journal of Fixed Income 3, 14–32.
Chen, R. R., X. Cheng, F. J. Fabozzi, and B. Liu. 2008. “An explicit, multi-factor credit default swap pricing model with correlated factors.” Journal of Financial and Quantitative Analysis 43(1), 123–160.
Cheridito, P., D. Filipović, and R. L. Kimmel. 2007. “Market price of risk specifications for affine models: theory and evidence.” Journal of Financial Economics 83, 123–170.
Cochrane, J. H. 2001. Asset pricing, Princeton University Press, Princeton, NJ.
Collin-Dufresne, P. and R. S. Goldstein. 2001. “Do credit spreads reflect stationary leverage ratios?” Journal of Finance 56, 1929–1957.
Collin-Dufresne, P. and R. S. Goldstein. 2002. “Do bonds span the fixed income markets? Theory and evidence for unspanned stochastic volatility.” Journal of Finance 57, 1685–1730.
Collin-Dufresne, P. and B. Solnik. 2001. “On the term structure of default premia in the swap and LIBOR markets.” Journal of Finance 56, 1095–1115.
Collin-Dufresne, P., R. S. Goldstein, and J. S. Martin. 2001. “The determinants of credit spread changes.” Journal of Finance 56, 2177–2207.
Conley, T. G., L. P. Hansen, E. G. J. Luttmer, and J. A. Scheinkman. 1997. “Short-term interest rates as subordinated diffusions.” Review of Financial Studies 10, 525–577.
Constantinides, G. M. 1992. “A theory of the nominal term structure of interest rates.” Review of Financial Studies 5, 531–552.
Cooper, I. A. and A. S. Mello. 1991. “The default risk of swaps.” Journal of Finance 46, 597–620.
Costinot, A., T. Roncalli, and J. Teiletche. 2000. Revisiting the dependence between financial markets with copulas, Working paper, Groupe de Recherche Operationnelle, Credit Lyonnais.
Cox, J. C., J. E. Ingersoll, and S. A. Ross. 1985. “A theory of the term structure of interest rates.” Econometrica 53, 385–407.
Dai, Q. and K. J. Singleton. 2000. “Specification analysis of affine term structure models.” Journal of Finance 55, 1943–1978.
Dai, Q. and K. J. Singleton. 2002a. “Expectation puzzles, time-varying risk premia, and affine models of the term structure.” Journal of Financial Economics 63, 415–441.
Dai, Q. and K. J. Singleton. 2002b. “Fixed income pricing,” in Handbook of the economics of finance: financial markets and asset pricing, G. Constantinides, M. Harris, and R. M. Stulz (Eds.). Elsevier, Amsterdam.
Dai, Q. and K. J. Singleton. 2003. “Term structure dynamics in theory and reality.” Review of Financial Studies 16, 631–678.
Dai, Q., K. J. Singleton, and W. Yang. 2007. “Regime shifts in a dynamic term structure model of US Treasury bond yields.” Review of Financial Studies 20, 1669–1706.
Das, S. R. 2002. “The surprise element: jumps in interest rates.” Journal of Econometrics 106, 27–65.
Davis, M. and V. Lo. 2001. “Infectious defaults.” Quantitative Finance 1, 382–387.
De Jong, F. and J. Driessen. 2004. Liquidity risk premia in corporate bond and equity markets, Working paper, University of Amsterdam.
Dewachter, H. and M. Lyrio. 2006. “Macro factors and the term structure of interest rates.” Journal of Money, Credit, and Banking 38, 119–140.
Dewachter, H. and K. Maes. 2000. Fitting correlations within and between bond markets, Working paper, KU Leuven CES.
Dewachter, H., M. Lyrio, and K. Maes. 2006. “A joint model for the term structure of interest rates and the macroeconomy.” Journal of Applied Econometrics 21, 439–462.
Driessen, J. 2005. “Is default event risk priced in corporate bonds?” Review of Financial Studies 18, 165–195.
Duarte, J. 2004. “Evaluating an alternative risk preference in affine term structure models.” Review of Financial Studies 17, 379–404.
Duffee, G. R. 1999. “Estimating the price of default risk.” Review of Financial Studies 12, 197–226.
Duffee, G. R. 2002. “Term premia and interest rate forecasts in affine models.” Journal of Finance 57, 405–443.
Duffie, D. 1996. Dynamic asset pricing theory, Princeton University Press, Princeton, NJ.
Duffie, D. 1998a. Defaultable term structure models with fractional recovery of par, Working paper, Graduate School of Business, Stanford University.
Duffie, D. 1998b. First to default valuation, Working paper, Graduate School of Business, Stanford University.
Duffie, D. 1999. “Credit swap valuation.” Financial Analysts Journal 55, 73–87.
Duffie, D. and M. Huang. 1996. “Swap rates and credit quality.” Journal of Finance 51, 921–949.
Duffie, D. and R. Kan. 1996. “A yield-factor model of interest rates.” Mathematical Finance 6, 379–406.
Duffie, D. and D. Lando. 2001. “Term structures of credit spreads with incomplete accounting information.” Econometrica 69, 633–664.
Duffie, D. and K. J. Singleton. 1997. “An econometric model of the term structure of interest-rate swap yields.” Journal of Finance 52, 1287–1321.
Duffie, D. and K. J. Singleton. 1999a. “Modeling term structures of defaultable bonds.” Review of Financial Studies 12, 687–720.
Duffie, D. and K. J. Singleton. 1999b. Simulating correlated defaults, Working paper, Graduate School of Business, Stanford University.
Duffie, D. and K. J. Singleton. 2003. Credit risk: pricing, measurement, and management, Princeton University Press, Princeton, NJ.
Duffie, D., J. Pan, and K. Singleton. 2000. “Transform analysis and option pricing for affine jump-diffusions.” Econometrica 68, 1343–1376.
Duffie, D., D. Filipović, and W. Schachermayer. 2003. “Affine processes and applications in finance.” The Annals of Applied Probability 13, 984–1053.
1003
Durham, G. B. 2003. “Likelihood-based specification analysis of continuous-time models of the short-term interest rate.” Journal of Financial Economics 70, 463–487.
Durham, G. B. and A. R. Gallant. 2002. “Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes.” Journal of Business and Economic Statistics 20, 297–338.
Elerian, O., S. Chib, and N. Shephard. 2001. “Likelihood inference for discretely observed nonlinear diffusions.” Econometrica 69, 959–993.
Elizalde, A. 2003. Credit risk models I: default correlation in intensity models, MSc Thesis, King’s College, London.
Elton, E. J., M. J. Gruber, D. Agrawal, and C. Mann. 2001. “Explaining the rate spread on corporate bonds.” Journal of Finance 56, 247–277.
Embrechts, P., F. Lindskog, and A. McNeil. 2001. Modelling dependence with copulas, Working paper, Department of Mathematics, ETHZ.
Eom, Y. H., M. G. Subrahmanyam, and J. Uno. 2000. Credit risk and the Yen interest rate swap market, Working paper, Stern Business School, New York.
Eom, Y. H., J. Helwege, and J. Z. Huang. 2004. “Structural models of corporate bond pricing: an empirical analysis.” Review of Financial Studies 17, 499–544.
Ericsson, J. and O. Renault. 2006. “Liquidity and credit risk.” Journal of Finance 61, 2219–2250.
Estrella, A. and G. A. Hardouvelis. 1991. “The term structure as a predictor of real economic activity.” Journal of Finance 46, 555–576.
Estrella, A. and F. S. Mishkin. 1996. “The yield curve as a predictor of US recessions.” Current Issues in Economics and Finance 2, 1–6.
Estrella, A. and F. S. Mishkin. 1997. “The predictive power of the term structure of interest rates in Europe and the United States: implications for the European central bank.” European Economic Review 41, 1375–1401.
Estrella, A. and F. S. Mishkin. 1998. “Financial variables as leading indicators predicting U.S. recessions.” Review of Economics and Statistics 80, 45–61.
Fama, E. F. 1984. “The information in the term structure.” Journal of Financial Economics 13, 509–528.
Fama, E. F. and R. R. Bliss. 1987. “The information in long-maturity forward rates.” The American Economic Review 77, 680–692.
Fama, E. F. and K. R. French. 1996. “Multifactor explanations of asset pricing anomalies.” Journal of Finance 51, 55–84.
Fehle, F. 2003. “The components of interest rate swap spreads: theory and international evidence.” Journal of Futures Markets 23, 347–387.
Frees, E. W. and E. A. Valdez. 1998. “Understanding relationships using copulas.” North American Actuarial Journal 2, 1–25.
Frey, R. and A. McNeil. 2001. Modelling dependent defaults, Working paper, ETH Zurich.
Froot, K. A. 1989. “New hope for the expectations hypothesis of the term structure of interest rates.” Journal of Finance 44, 283–305.
Garcia, R. and P. Perron. 1996. “An analysis of the real interest rate under regime shifts.” Review of Economics and Statistics 78, 111–125.
Geske, R. 1977. “The valuation of corporate liabilities as compound options.” Journal of Financial and Quantitative Analysis 12, 541–552.
Goldstein, R. S. 2000. “The term structure of interest rates as a random field.” Review of Financial Studies 13, 365–384.
Goldstein, R., N. Ju, and H. Leland. 2001. “An EBIT-based model of dynamic capital structure.” Journal of Business 74, 483–512.
Gray, S. F. 1996. “Modeling the conditional distribution of interest rates as a regime-switching process.” Journal of Financial Economics 42, 27–62.
Grinblatt, M. 2001. “An analytic solution for interest rate swap spreads.” International Review of Finance 2, 113–149.
Gupta, A. and M. G. Subrahmanyam. 2000. “An empirical examination of the convexity bias in the pricing of interest rate swaps.” Journal of Financial Economics 55, 239–279.
Harrison, M. and D. Kreps. 1979. “Martingales and arbitrage in multiperiod security markets.” Journal of Economic Theory 20, 381–408.
He, H. 2000. Modeling term structures of swap spreads, Working paper, Yale University.
Heath, D., R. Jarrow, and A. Morton. 1992. “Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation.” Econometrica 60, 77–105.
Helwege, J. and C. M. Turner. 1999. “The slope of the credit yield curve for speculative-grade issuers.” Journal of Finance 54, 1869–1884.
Ho, T. S. Y. and R. F. Singer. 1982. “Bond indenture provisions and the risk of corporate debt.” Journal of Financial Economics 10, 375–406.
Hong, Y. and H. Li. 2005. “Nonparametric specification testing for continuous-time models with applications to term structure of interest rates.” Review of Financial Studies 18, 37–84.
Houweling, P. and T. Vorst. 2005. “Pricing default swaps: empirical evidence.” Journal of International Money and Finance 24, 1200–1225.
Huang, J. and M. Huang. 2003. How much of the corporate-treasury yield spread is due to credit risk?, Working paper, Graduate School of Business, Stanford University.
Hull, J. 2006. Options, futures, and other derivatives, Prentice Hall, Englewood Cliffs, NJ.
Hull, J. and A. White. 1993. “One-factor interest-rate models and the valuation of interest-rate derivative securities.” Journal of Financial and Quantitative Analysis 28, 235–254.
Hull, J. and A. White. 2000. “Valuing credit default swaps I: no counterparty default risk.” Journal of Derivatives 8, 29–40.
Hull, J. and A. White. 2001. “Valuing credit default swaps II: modeling default correlations.” Journal of Derivatives 8, 12–22.
Hull, J., M. Predescu, and A. White. 2004. “The relationship between credit default swap spreads, bond yields, and credit rating announcements.” Journal of Banking and Finance 28, 2789–2811.
Janosi, T., R. Jarrow, and Y. Yildirim. 2002. “Estimating expected losses and liquidity discounts implicit in debt prices.” Journal of Risk 5, 1–38.
Jarrow, R. A. and S. M. Turnbull. 1995. “Pricing derivatives on financial securities subject to credit risk.” Journal of Finance 50, 53–85.
Jarrow, R. A. and F. Yu. 2001. “Counterparty risk and the pricing of defaultable securities.” Journal of Finance 56, 1765–1799.
Jarrow, R. A., D. Lando, and S. M. Turnbull. 1997. “A Markov model for the term structure of credit risk spreads.” Review of Financial Studies 10, 481–523.
Jarrow, R. A., D. Lando, and F. Yu. 2005. “Default risk and diversification: theory and empirical implications.” Mathematical Finance 15, 1–26.
Joe, H. 1997. Multivariate models and dependence concepts, Chapman & Hall, London.
Johannes, M. 2004. “The statistical and economic role of jumps in continuous-time interest rate models.” Journal of Finance 59, 227–260.
Jones, C. S. 2003. “Nonlinear mean reversion in the short-term interest rate.” Review of Financial Studies 16, 793–843.
Jones, E. P., S. P. Mason, and E. Rosenfeld. 1984. “Contingent claims analysis of corporate capital structures: an empirical investigation.” Journal of Finance 39, 611–625.
Jorion, P. and G. Zhang. 2007. “Good and bad credit contagion: evidence from credit default swaps.” Journal of Financial Economics 84, 860–883.
Kennedy, D. P. 1994. “The term structure of interest rates as a Gaussian random field.” Mathematical Finance 4, 247–258.
Kijima, M. 2000. “Valuation of a credit swap of the basket type.” Review of Derivatives Research 4, 81–97.
Kijima, M. and Y. Muromachi. 2000. “Credit events and the valuation of credit derivatives of basket type.” Review of Derivatives Research 4, 55–79.
Kim, I. J., K. Ramaswamy, and S. Sundaresan. 1993. “Does default risk in coupons affect the valuation of corporate bonds?: a contingent claims model.” Financial Management 22, 117–131.
Kozicki, S. and P. A. Tinsley. 2001. “Shifting endpoints in the term structure of interest rates.” Journal of Monetary Economics 47, 613–652.
Kozicki, S. and P. A. Tinsley. 2002. “Dynamic specifications in optimizing trend-deviation macro models.” Journal of Economic Dynamics and Control 26, 1585–1611.
Lando, D. 1998. “On Cox processes and credit risky securities.” Review of Derivatives Research 2, 99–120.
Lang, L. H. P., R. H. Litzenberger, and A. L. Liu. 1998. “Determinants of interest rate swap spreads.” Journal of Banking and Finance 22, 1507–1532.
Leland, H. E. 1994. “Corporate debt value, bond covenants, and optimal capital structure.” Journal of Finance 49, 1213–1252.
Leland, H. E. and K. B. Toft. 1996. “Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads.” Journal of Finance 51, 987–1019.
Li, H. 1998. “Pricing of swaps with default risk.” Review of Derivatives Research 2, 231–250.
Li, D. X. 2000a. “On default correlation: a copula function approach.” Journal of Fixed Income 9, 43–54.
Li, T. 2000b. A model of pricing defaultable bonds and credit ratings, Working paper, Olin School of Business, Washington University at St. Louis.
Li, X. 2006. An empirical study of the default risk and liquidity components of interest rate swap spreads, Working paper, York University.
Li, M., N. D. Pearson, and A. M. Poteshman. 2001. Facing up to conditioned diffusions, Working paper, University of Illinois at Urbana-Champaign.
Lin, H., S. Liu, and C. Wu. 2007. Nondefault components of corporate yield spreads: taxes or liquidity?, Working paper, University of Missouri-Columbia.
Liu, J., F. A. Longstaff, and R. E. Mandell. 2004. “The market price of risk in interest rate swaps: the roles of default and liquidity risks.” Journal of Business 79, 2337–2369.
Liu, S., J. Shi, J. Wang, and C. Wu. 2007. “How much of the corporate bond spread is due to personal taxes?” Journal of Financial Economics 85, 599–636.
Longstaff, F. A. 1989. “A non-linear general equilibrium model of the term structure of interest rates.” Journal of Financial Economics 2, 195–224.
Longstaff, F. A. 2000. “The term structure of very short-term rates: new evidence for the expectations hypothesis.” Journal of Financial Economics 58, 397–415.
Longstaff, F. A. and E. S. Schwartz. 1992. “Interest rate volatility and the term structure: a two-factor general equilibrium model.” Journal of Finance 47, 1259–1282.
Longstaff, F. A. and E. S. Schwartz. 1995. “A simple approach to valuing risky fixed and floating rate debt.” Journal of Finance 50, 789–819.
Longstaff, F. A., S. Mithal, and E. Neis. 2003. The credit default swap market: is credit protection priced correctly?, Working paper, UCLA.
Longstaff, F. A., S. Mithal, and E. Neis. 2005. “Corporate yield spreads: default risk or liquidity? New evidence from the credit default swap market.” Journal of Finance 60, 2213–2253.
Lyden, S. and D. Saraniti. 2000. An empirical examination of the classical theory of corporate security valuation, Working paper, Barclays Global Investors.
Madan, D. B. and H. Unal. 1998. “Pricing the risks of default.” Review of Derivatives Research 2, 121–160.
Maes, K. 2004. Modeling the term structure of interest rates: where do we stand?, Working paper, National Bank of Belgium.
Mella-Barral, P. and W. Perraudin. 1997. “Strategic debt service.” Journal of Finance 52, 531–556.
Merton, R. C. 1974. “On the pricing of corporate debt: the risk structure of interest rates.” Journal of Finance 29, 449–470.
Minton, B. A. 1997. “An empirical examination of basic valuation models for plain vanilla US interest rate swaps.” Journal of Financial Economics 44, 251–277.
Nielsen, L. T., J. Saá-Requejo, and P. Santa-Clara. 2001. Default risk and interest rate risk: the term structure of default spreads, Working paper, INSEAD.
Norden, L. and M. Weber. 2004. “Informational efficiency of credit default swap and stock markets: the impact of credit rating announcements.” Journal of Banking and Finance 28, 2813–2843.
Ogden, J. P. 1987. “Determinants of the ratings and yields on corporate bonds: tests of the contingent claims model.” Journal of Financial Research 10, 329–339.
Pearson, N. D. and T. S. Sun. 1994. “Exploiting the conditional density in estimating the term structure: an application to the Cox, Ingersoll, and Ross model.” Journal of Finance 49, 1279–1304.
Peterson, S., R. C. Stapleton, and M. G. Subrahmanyam. 1998. A two factor lognormal model of the term structure and the valuation of American-style options on bonds, Working paper, New York University.
Piazzesi, M. 2001. An econometric model of the yield curve with macroeconomic jump effects, Working paper, UCLA.
Richard, S. F. 1978. “An arbitrage model of the term structure of interest rates.” Journal of Financial Economics 6, 33–57.
Sanders, A. B. and H. Unal. 1988. “On the intertemporal behavior of the short-term rate of interest.” Journal of Financial and Quantitative Analysis 23, 417–423.
Santa-Clara, P. and D. Sornette. 2001. “The dynamics of the forward interest rate curve with stochastic string shocks.” Review of Financial Studies 14, 149–185.
Sarig, O. and A. Warga. 1989. “Some empirical estimates of the risk structure of interest rates.” Journal of Finance 44, 1351–1360.
Schaefer, S. M. and E. S. Schwartz. 1984. “A two-factor model of the term structure: an approximate analytical solution.” Journal of Financial and Quantitative Analysis 19, 413–424.
Schönbucher, P. J. 2003. Credit derivatives pricing models: models, pricing and implementation, Wiley, UK.
Schönbucher, P. and D. Schubert. 2001. Copula-dependent default risk in intensity models, Working paper, Bonn University.
Shreve, S. E. 2004. Stochastic calculus for finance II: continuous-time models, Springer, New York.
Stambaugh, R. F. 1988. “The information in forward rates: implications for models of the term structure.” Journal of Financial Economics 21, 41–70.
Stanton, R. 1997. “A nonparametric model of term structure dynamics and the market price of interest rate risk.” Journal of Finance 52, 1973–2002.
Sun, T. S., S. Sundaresan, and C. Wang. 1993. “Interest rate swaps: an empirical investigation.” Journal of Financial Economics 34, 77–99.
Sundaresan, S. 2003. Fixed income markets and their derivatives, 2nd Edition, South-Western Thomson, Cincinnati, OH.
Tang, D. Y. and H. Yan. 2006. “Liquidity, liquidity spillover, and credit default swap spreads.” 2007 AFA Chicago Meetings Paper.
Titman, S. and W. Torous. 1989. “Valuing commercial mortgages: an empirical investigation of the contingent-claims approach to pricing risky debt.” Journal of Finance 44, 345–373.
Vasicek, O. 1977. “An equilibrium characterization of the term structure.” Journal of Financial Economics 5, 177–188.
Vassalou, M. and Y. Xing. 2004. “Default risk in equity returns.” Journal of Finance 59, 831–868.
Wachter, J. A. 2006. “A consumption-based model of the term structure of interest rates.” Journal of Financial Economics 79, 365–399.
Wu, T. 2006. “Macro factors and the affine term structure of interest rates.” Journal of Money, Credit, and Banking 38, 1847–1875.
Yawitz, J. B., K. J. Maloney, and L. H. Ederington. 1985. “Taxes, default risk, and yield spreads.” Journal of Finance 40, 1127–1140.
Yu, F. 2002. “Modeling expected return on defaultable bonds.” Journal of Fixed Income 12, 69–81.
Yu, F. 2003. Correlated defaults in reduced-form models, Working paper, University of California Irvine.
Zhang, Y., H. Zhou, and H. Zhu. 2005. Explaining credit default swap spreads with equity volatility and jump risks of individual firms, Working paper, Bank for International Settlements.
Zhou, C. 1997. A jump-diffusion approach to modeling credit risk and valuing defaultable securities, Working paper, Federal Reserve Board.
Zhu, H. 2006. “An empirical comparison of credit spreads between the bond market and the credit default swap market.” Journal of Financial Services Research 29, 211–235.
Chapter 64
Liquidity Risk and Arbitrage Pricing Theory Umut Çetin, Robert A. Jarrow, and Philip Protter
Abstract Classical theories of financial markets assume an infinitely liquid market and that all traders act as price takers. This theory is a good approximation for highly liquid stocks, although even there it does not apply well for large traders or for modeling transaction costs. We extend the classical approach by formulating a new model that takes into account illiquidities. Our approach hypothesizes a stochastic supply curve for a security’s price as a function of trade size. This leads to a new definition of a self-financing trading strategy, additional restrictions on hedging strategies, and some interesting mathematical issues.

Keywords Illiquid markets · Fundamental theorems of asset pricing · Approximately complete markets · Approximation of stochastic integrals
This paper is reprinted from Finance and Stochastics, 8 (2004), No. 3, pp. 311–341.

U. Çetin
Technische Universität Wien, Institut für Finanz- und Versicherungsmathematik, Wiedner Hauptstr. 8-10, 1040 Vienna, Austria
e-mail: [email protected]

R. A. Jarrow and P. Protter
Cornell University, Ithaca, NY 14853, USA
e-mail: [email protected]

64.1 Introduction

Classical arbitrage pricing theory is formulated under two primary assumptions: frictionless and competitive markets. A frictionless market is one that has no transaction costs (including taxes) and no restrictions on trade (e.g., short sale constraints). A competitive market is one where any trader can buy or sell unlimited quantities of the relevant security without changing the security’s price. There is a large literature studying arbitrage pricing theory with transaction costs or trading restrictions (see Barles and Soner 1998; Constantinides and Zariphopoulou 1999; Cvitanic and Karatzas 1996; Cvitanic et al. 1999; Jou 2000; Jouini and Kallal 1995; Jouini et al. 2001; Soner et al. 1995). In general, the standard theory applies, but instead of a single arbitrage-free price for a security there exists an interval of arbitrage-free prices. Hedging strategies are modified, but the basic intuition remains unchanged. In contrast, the literature studying the relaxation of the competitive market hypothesis is much less extensive (see Beck 1993; Cvitanic and Ma 1996; Jarrow 1992, 1994; Bank and Baum 2004). When the competitive market assumption is relaxed, however, the standard theory can completely change. Market manipulation may become an issue, and option pricing becomes trader and market structure dependent.

The relaxation of the frictionless and competitive market hypotheses introduces the notion of liquidity risk. Roughly speaking, liquidity risk is the additional risk due to the timing and size of a trade. From a financial engineering perspective, the need is paramount for a simple yet robust method that incorporates liquidity risk into arbitrage pricing theory. The market microstructure literature (see Kyle 1985; Glosten and Milgrom 1985; Grossman and Miller 1988), although conceptually useful, is lacking in this regard. As a first solution to this problem, liquidity risk has recently been incorporated into arbitrage pricing theory as a convenience yield (see Jarrow and Turnbull 1997; Jarrow 2001). Convenience yields have a long history in the context of commodity pricing. This solution to the problem successfully captures that component of liquidity risk due to inventory considerations. And, more importantly, it retains the price taking condition so that classical arbitrage pricing theory can still be applied. Nonetheless, this convenience yield approach to the inclusion of liquidity risk has an important omission. It does not explicitly capture the impact of different trade sizes on the price. Consequently, there is no notion of a bid/ask spread for the traded securities in this model structure. This is a significant omission because all markets experience price inelasticities (quantity impacts) and bid/ask spreads. A simple yet robust method for including a quantity impact on the traded security prices in arbitrage pricing theory is currently unknown.

The purpose of this paper is to develop a model for the inclusion of liquidity risk into arbitrage pricing theory that incorporates the impact of differing trade sizes on the price. As such, our approach is consistent with price inelasticities.
We do this by hypothesizing the existence of a stochastic supply curve for a security’s price as a function of trade size, for which traders act as price takers. This condition implies that the investor’s trading strategy has no lasting impact on the price process. Following Heath et al. (1992), we study conditions on the supply curve such that there are no arbitrage opportunities in the economy (appropriately defined). Given an arbitrage-free evolution, we then discuss complete markets and the pricing of derivatives. This structure can also be viewed as an extension of the model in Jou (2000) where the traded securities have two distinct price processes – a selling price (the bid) and a buying price (the ask). Instead of two prices as in Jou (2000), we have a continuum of stochastic processes indexed by trade size.

This paper provides a new perspective on classical arbitrage pricing theory based on the insight that after its initiation, and before its liquidation, there is no unique value for a portfolio. Indeed, any price on the supply curve is a plausible price to be used in valuing a portfolio. Among these, at least three economically meaningful values can be identified: the marked-to-market value, the accumulated cost of the portfolio, and the liquidation value (for a related discussion, see Jarrow (1992)). This nonuniqueness is important because it changes the mathematics used in tracking a portfolio’s performance. The existing portfolio theory literature and portfolio performance evaluation literature need to be modified accordingly. This is a subject for future research.

The first fundamental theorem of asset pricing, appropriately generalized, holds in our new setting, while the second fundamental theorem fails. Herein, a martingale measure (appropriately defined) can be unique while markets are still incomplete. However, a weakening of the second fundamental theorem holds. Markets will be approximately complete in our setting if a martingale measure is unique. Here, approximately complete is roughly defined as follows: given any random variable (appropriately integrable), there exists a self-financing trading strategy whose liquidation value is arbitrarily close (in an L2 sense) to the given random variable. In an approximately complete market, we show that by using trading strategies that are continuous and of finite variation, all liquidity costs can be avoided. This occurs because such trading strategies induce no path dependency in the evolution of the price process. Hence, a large trade can be subdivided into infinitely many infinitesimal trades to avoid liquidity costs. Consequently, the arbitrage-free price of any derivative is shown to be equal to the expected value of its payoff under the risk neutral measure. This is the same price as in the classical economy with no liquidity costs. However, in a world with illiquidities, the classical hedge will not be the hedge used to replicate the option. Both of these observations are consistent with industry usage of the classical arbitrage-free pricing methodology. Our approach to liquidity risk can be implemented in practice and the stochastic supply curve estimated from
market data that contains the price of a trade, the size of the trade, and whether it is a buy or a sell (a toy illustration of such an estimation appears at the end of this introduction). Our approach to liquidity risk, appropriately generalized, can also be used to study transaction costs. Transaction costs effectively make the traded security’s price dependent on the trade size. However, transaction costs induce a supply curve for the security’s price distinct from that assumed herein. In particular, our supply curve is $C^2$ in the quantity purchased. Transaction costs violate this condition at the origin. As such, our solution to the problem is quite different from that used earlier in the literature to handle transaction costs, and it provides a new perspective on the existing results. We do not pursue this line of study in this paper, but instead refer the interested reader to Çetin (2003).

Independently generated, the mathematics and theorems contained in Bank and Baum (2004) are remarkably similar to those contained herein.1 Bank and Baum study a continuous time economy with a single large trader whose trades affect the price, generalizing Jarrow’s (1992, 1994) discrete time model. In Bank and Baum, the large trader faces a price process represented by a family of continuous semimartingales. The family is indexed by the aggregate position of the large investor’s trading strategy. They concentrate on the large trader’s real wealth, and characterize its evolution using the Itô-Wentzell formula. The economic issue studied is the large trader’s ability or lack thereof to manipulate the market, and in a manipulative-free economy, the valuation of contingent claims. Akin to our setting, continuous and finite variation trading strategies play a key role in the analysis because they enable the large trader to avoid market impact costs.

As just illustrated via the Bank and Baum paper, our approach is related to the market manipulation literature. The difference is that under market manipulation, the security’s price process can depend not only on the investor’s current trade, but also on the entire history of their past trades. The elimination of this path dependency condition in our structure excludes market manipulation and allows us to use classical arbitrage pricing theory, appropriately modified (see Cvitanic and Ma 1996; Jarrow 1992; Bank and Baum 2004). The extension of our model to the study of market manipulation is a subject for future research.

An outline for this paper is as follows. Section 64.2 describes the basic economy. Sections 64.3 and 64.4 study the first and second fundamental theorems of asset pricing, respectively. Section 64.5 provides an example – the extended Black–Scholes economy. Section 64.6 extends the analysis to discontinuous sample path evolutions for the supply curve, and Section 64.7 concludes the paper.
1 One difference is that in Bank and Baum a manipulative-free economy requires an equivalent measure such that every point on the supply curve evolves as a martingale. In our model, an arbitrage-free economy requires only an equivalent measure such that the supply curve evaluated at the zero net trade evolves as a martingale.
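As promised above, here is a toy illustration of estimating a supply curve from trade records (price, signed size). A simple starting point is a linear specification $S(t, x) = S(t, 0) + \alpha x$ fitted by least squares; the sketch below applies that idea to synthetic data. It is only a minimal sketch under our own assumptions (the linear form, the variable names, the parameter values), not the estimation procedure of any cited study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic trade records: signed size x (buys > 0, sells < 0) and the
# executed price S(t, x). Assumed data-generating supply curve:
# S(t, x) = S(t, 0) + alpha * x + noise, with alpha the impact slope.
n, s0, alpha = 500, 100.0, 0.002
x = rng.normal(0.0, 1000.0, n)                      # trade sizes in shares
price = s0 + alpha * x + rng.normal(0.0, 0.05, n)   # observed execution prices

# Least-squares fit of [S(t,0), alpha] from the (x, price) pairs.
design = np.column_stack([np.ones(n), x])
(s0_hat, alpha_hat), *_ = np.linalg.lstsq(design, price, rcond=None)
print(f"estimated marginal price S(t,0) = {s0_hat:.3f}")
print(f"estimated impact slope alpha   = {alpha_hat:.5f}")
```

A positive fitted slope is the empirical signature of the upward-sloping supply curves formalized in the next section.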
64.2 The Model
This section presents the model. We are given a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ satisfying the usual conditions, where $T$ is a fixed time. $\mathbb{P}$ represents the statistical or empirical probability measure. We also assume that $\mathcal{F}_0$ is trivial, i.e., $\mathcal{F}_0 = \{\emptyset, \Omega\}$.

We consider a market for a security that we will call a stock, although the subsequent model applies equally well to bonds, commodities, foreign currencies, etc. We will initially assume that ownership of the security has no cash flows associated with it. Also traded is a money market account that accumulates value at the spot rate of interest. Without loss of generality, we assume that the spot rate of interest is zero, so that the money market account has unit value for all times.2

64.2.1 Supply Curve

We consider an arbitrary trader who acts as a price taker with respect to an exogenously given supply curve for shares bought or sold of this stock within the trading interval. More formally, let $S(t, x, \omega)$ represent the stock price, per share, at time $t \in [0, T]$ that the trader pays/receives for an order of size $x \in \mathbb{R}$ given the state $\omega \in \Omega$. A positive order ($x > 0$) represents a buy, a negative order ($x < 0$) represents a sale, and the order zero ($x = 0$) corresponds to the marginal trade.

By construction, rather than the trader facing a horizontal supply curve as in the classical theory (the same price for any order size), the trader now faces a supply curve that depends on his order size.3 Note that the supply curve is otherwise independent of the trader’s past actions, endowments, risk aversion, or beliefs. This implies that an investor’s trading strategy has no lasting impact on the price process. This restriction distinguishes our economy from the situation where the supply curve also depends on the entire history of the trader’s trades.

We now impose some structure on the supply curve.

Assumption 1 (Supply Curve).

1. $S(t, x, \cdot)$ is $\mathcal{F}_t$-measurable and nonnegative.
2. $x \mapsto S(t, x, \omega)$ is a.e. $t$ nondecreasing in $x$, a.s. (i.e., $x \le y$ implies $S(t, x, \omega) \le S(t, y, \omega)$ a.s. $\mathbb{P}$, a.e. $t$).
3. $S$ is $C^2$ in its second argument, $\partial S(t, x)/\partial x$ is continuous in $t$, and $\partial^2 S(t, x)/\partial x^2$ is continuous in $t$.
4. $S(\cdot, 0)$ is a semimartingale.
5. $S(\cdot, x)$ has continuous sample paths (including time 0) for all $x$.

Except for the second condition, these restrictions are self-explanatory. Condition 2 is the situation where the larger the purchase (or sale), the larger the price impact that occurs on the share price. This is the usual situation faced in asset pricing markets, where the quantity impact on the price is due to either information effects or supply/demand imbalances (see Kyle 1985; Glosten and Milgrom 1985; Grossman and Miller 1988). This excludes the more familiar situation in consumer products where there are quantity discounts for large orders. It includes, as a special case, horizontal supply curves. This structure can also be viewed as a generalization of the model in Jou (2000) where the traded securities have distinct selling and buying prices following separate stochastic processes. Here, instead of two prices as in Jou (2000), we have a continuum of stochastic processes indexed by trade size.

Example 1 (Supply Curve). To present a concrete example of a supply curve, let $S(t, x) \equiv f(t, D_t, x)$ where $D_t$ is an $n$-dimensional, $\mathcal{F}_t$-measurable semimartingale, and $f : \mathbb{R}^{n+2} \to \mathbb{R}_+$ is Borel measurable, $C^1$ in $t$, and $C^2$ in all its other arguments. This nonnegative function $f$ can be viewed as a reduced form supply curve generated by a market equilibrium process in a complex and dynamic economy. Under this interpretation, the vector stochastic process $D_t$ represents the state variables generating the uncertainty in the economy, often assumed to be diffusion processes or at least Markov processes (e.g., a solution to a stochastic differential equation driven by a Lévy process). This supply curve is time nonstationary. From a practical perspective, the stationary supply curve formulation, $f(D_t, x)$, may be more useful for empirical estimation.

If the supply curve in this example assumes the form
$$S(t, x) = e^{g(t, D_t, x)} S(t, 0)$$
where $g : \mathbb{R}^{n+2} \to \mathbb{R}$ is Borel measurable, $C^1$ in $t$, $C^2$ in all its other arguments, nondecreasing in $x$, with $g(t, D_t, 0) = 0$, $g(t, D_t, x) < 0$ for $x < 0$, and $g(t, D_t, x) > 0$ for $x > 0$, then it is similar to the liquidity premium model used in Jarrow and Turnbull (1997) and Jarrow (2001). In Jarrow (2001), the liquidity premium is due to temporary restrictions on trading strategies imposed during extreme market conditions; in particular, purchases/short sales are not allowed when the market has a shortage/surplus. In the absence of these extreme conditions, there is no liquidity premium and the supply curve is identically $S(t, 0)$. More formally, the function $g(t, D_t, x)$ is piecewise constant in $x$: $g(t, D_t) > 0$ for $x < 0$ when there are shortages, $g(t, D_t) < 0$ for $x > 0$
when there are surpluses, and $g(t, D_t) = 0$ otherwise. This setting does not lead to an arbitrage opportunity, however, because one cannot short or purchase the stock at the times when these extreme conditions arise. In Jarrow’s (2001) model, the liquidity premium has the interpretation of being a convenience yield.

2 A numéraire invariance theorem is proved in the Appendix for the subsequent economy.
3 In contrast, the trader is assumed to have no quantity impact due to his trades in the money market account.
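As a numerical illustration of Example 1, the sketch below implements a stationary exponential supply curve $S(t, x) = S(t, 0)\,e^{\alpha x}$, a special case of the form above with $g(t, D_t, x) = \alpha x$. The linear choice of $g$ and the parameter values are our own illustrative assumptions, chosen only so that conditions 1–3 of Assumption 1 are easy to verify numerically.

```python
import numpy as np

def supply_curve(s0: float, x, alpha: float = 1e-4):
    """Exponential supply curve S(t, x) = S(t, 0) * exp(alpha * x).

    s0    : marginal (zero-trade-size) price S(t, 0)
    x     : order size; x > 0 is a buy, x < 0 is a sale
    alpha : illustrative impact parameter (alpha >= 0)
    """
    return s0 * np.exp(alpha * np.asarray(x, dtype=float))

s0 = 100.0
sizes = np.array([-10_000, -1_000, 0, 1_000, 10_000])
prices = supply_curve(s0, sizes)
for q, p in zip(sizes, prices):
    print(f"order size {q:>7d}: execution price {p:8.4f}")

# Nondecreasing in x (Assumption 1, condition 2) and smooth in x
# (condition 3) hold by construction whenever alpha >= 0.
assert np.all(np.diff(prices) >= 0)
```

Sales execute below the marginal price and purchases above it, so the curve embeds an endogenous, size-dependent bid/ask spread around $S(t, 0)$.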
64.2.2 Trading Strategies

We start by defining the investor’s trading strategy.

Definition 1. A trading strategy is a triplet $((X_t, Y_t : t \in [0, T]), \tau)$ where $X_t$ represents the trader’s aggregate stock holding at time $t$ (units of the stock), $Y_t$ represents the trader’s aggregate money market account position at time $t$ (units of the money market account), and $\tau$ represents the liquidation time of the stock position, subject to the following restrictions: (a) $X_t$ and $Y_t$ are predictable and optional processes, respectively, with $X_{0-} \equiv Y_{0-} \equiv 0$, and (b) $X_T = 0$ and $\tau$ is a predictable $(\mathcal{F}_t : 0 \le t \le T)$ stopping time with $\tau \le T$ and $X = H 1_{[0, \tau)}$ for some predictable process $H(t, \omega)$.

We are interested in a particular type of trading strategy – those that are self-financing. By construction, a self-financing trading strategy generates no cash flows for all times $t \in [0, T)$. That is, purchase/sales of the stock must be obtained via borrowing/investing in the money market account. This implies that $Y_t$ is uniquely determined by $(X_t, \tau)$. The goal is to define this self-financing condition for $Y_t$ given an arbitrary stock holding $(X_t, \tau)$.

Definition 2. A self-financing trading strategy (s.f.t.s.) is a trading strategy $((X_t, Y_t : t \in [0, T]), \tau)$ where (a) $X_t$ is càdlàg if $\partial S(t, 0)/\partial x \equiv 0$ for all $t$, and $X_t$ is càdlàg with finite quadratic variation ($[X, X]_T < \infty$) otherwise, (b) $Y_0 = -X_0 S(0, X_0)$, and (c) for $0 < t \le T$,

$$Y_t = Y_0 + X_0 S(0, X_0) + \int_0^t X_u \, dS(u, 0) - X_t S(t, 0) - \sum_{0 \le u \le t} \Delta X_u \left[ S(u, \Delta X_u) - S(u, 0) \right] - \int_0^t \frac{\partial S}{\partial x}(u, 0) \, d[X, X]_u^c. \qquad (64.1)$$

Condition (a) imposes restrictions on the class of acceptable trading strategies. Under the hypotheses that $X_t$ is càdlàg and of finite quadratic variation, the right side of expression (64.1) is always well-defined although the last two terms (always being nonpositive) may be negative infinity. The classical theory, under frictionless and competitive markets, does not need these restrictions. An example of a trading strategy that is allowed in the classical theory, but disallowed here, is $X_t = 1_{\{S(t, 0) > K\}}$ for some constant $K > 0$ where $S(t, 0)$ follows a Brownian motion. Under the Brownian motion hypothesis this is a discontinuous trading strategy that jumps infinitely often immediately after $S(t, 0) = K$ (the jumps are not square summable), and hence $Y_t$ is undefined.

Condition (b) implies the strategy requires zero initial investment at time 0. When studying complete markets in a subsequent section, condition (b) of the s.f.t.s. is removed so that $Y_0 + X_0 S(0, X_0) \ne 0$.

Condition (c) is the self-financing condition at time $t$. The money market account equals its value at time 0, plus the accumulated trading gains (evaluated at the marginal trade), less the cost of attaining this position, less the price impact costs of discrete changes in share holdings, and less the price impact costs of continuous changes in the share holdings. This expression is an extension of the classical self-financing condition when the supply curve is horizontal. To see this, note that using condition (b) with expression (64.1) yields the following simplified form of the self-financing condition:

$$Y_t + X_t S(t, 0) = \int_0^t X_u \, dS(u, 0) - \sum_{0 \le u \le t} \Delta X_u \left[ S(u, \Delta X_u) - S(u, 0) \right] - \int_0^t \frac{\partial S}{\partial x}(u, 0) \, d[X, X]_u^c \quad \text{for } 0 \le t \le T. \qquad (64.2)$$

The left side of expression (64.2) represents the classical “value” of the portfolio at time $t$. The right side gives its decomposition into various components. The first term on the right side is the classical “accumulated gains/losses” to the portfolio’s value. The last two terms on the right side capture the impact of illiquidity, both entering with a negative sign.

To understand the origin of the liquidity terms, consider the following heuristic derivation. Assume that $X$ is a semimartingale. Intuitively, the self-financing condition is

$$dY_t = -S(t + dt, dX_t) \, dX_t = -S(t, 0) \, dX_t - \left[ S(t + dt, dX_t) - S(t + dt, 0) \right] dX_t - \left[ S(t + dt, 0) - S(t, 0) \right] dX_t.$$

Because $S$ is continuous, we can rewrite this last expression as

$$dY_t = -S(t, 0) \, dX_t - \left[ S(t + dt, dX_t) - S(t + dt, 0) \right] dX_t - d[X^c, S]_t.$$
Decomposing $X$ into its continuous and discontinuous parts, we rewrite this expression one more time:

$$dY_t = -S(t, 0) \, dX_t - \left[ S(t + dt, dX_t^c) - S(t + dt, 0) \right] dX_t^c - \left[ S(t, \Delta X_t) - S(t, 0) \right] \Delta X_t - d[X^c, S]_t.$$

Because $\partial S(\cdot, x)/\partial x$ is continuous and ignoring the higher order terms,

$$\left[ S(t + dt, dX_t^c) - S(t + dt, 0) \right] dX_t^c = \frac{\partial S}{\partial x}(t, 0) \, d[X, X]_t^c.$$

Integration by parts gives

$$S(t, 0) \, dX_t = d(X_t S(t, 0)) - X_t \, dS(t, 0) - d[X^c, S]_t.$$

Using these two expressions yields the desired result:

$$dY_t = X_t \, dS(t, 0) - d(X_t S(t, 0)) - \Delta X_t \left[ S(t, \Delta X_t) - S(t, 0) \right] - \frac{\partial S}{\partial x}(t, 0) \, d[X, X]_t^c.$$

This heuristic derivation can be justified by a limiting argument based on simple trading strategies (see Appendix 64A).

64.2.3 The Marked-to-Market Value of a s.f.t.s. and Its Liquidity Cost

This section defines the marked-to-market value of a trading strategy and its liquidity cost. At any time prior to liquidation, there is no unique value of a trading strategy or portfolio. Indeed, any price on the supply curve is a plausible price to be used in valuing the portfolio. At least three economically meaningful possibilities can be identified: (1) the immediate liquidation value (assuming that $X_t > 0$ gives $Y_t + X_t S(t, -X_t)$), (2) the accumulated cost of forming the portfolio ($Y_t$), and (3) the portfolio evaluated at the marginal trade ($Y_t + X_t S(t, 0)$).4 This last possibility is defined to be the marked-to-market value of the self-financing trading strategy $(X, Y, \tau)$. It represents the value of the portfolio under the classical price taking condition.

Motivated by expression (64.2), we define the liquidity cost to be the difference between the accumulated gains/losses to the portfolio, computed as if all trades are executed at the marginal trade price $S(t, 0)$, and the marked-to-market value of the portfolio.

Definition 3. The liquidity cost of a s.f.t.s. $(X, Y, \tau)$ is

$$L_t \equiv \int_0^t X_u \, dS(u, 0) - \left[ Y_t + X_t S(t, 0) \right].$$

The following lemma follows from the preceding definition.

Lemma 1 (Equivalent characterization of the liquidity costs).

$$L_t = \sum_{0 \le u \le t} \Delta X_u \left[ S(u, \Delta X_u) - S(u, 0) \right] + \int_0^t \frac{\partial S}{\partial x}(u, 0) \, d[X, X]_u^c \ge 0$$

where $L_{0-} = 0$, $L_0 = X_0 \left[ S(0, X_0) - S(0, 0) \right]$, and $L_t$ is nondecreasing in $t$.

Proof. The first equality follows directly from the definitions. The second inequality and the subsequent observation follow from the fact that $S(u, x)$ is increasing in $x$. □

We see here that the liquidity cost is nonnegative and nondecreasing in $t$. It consists of two components. The first is due to discontinuous changes in the share holdings. The second is due to the continuous component. This expression is quite intuitive. Note that because $X_{0-} = Y_{0-} = 0$, $\Delta L_0 = L_0 - L_{0-} = L_0 > 0$ is possible.

Remark 1.

1. When $X$ is of bounded variation, $[X, X]_u^c$ is zero.
2. When $X$ is continuous (that is, $\Delta X_u = 0$ for $u > 0$), then the first term in the liquidity costs equals its value at zero, i.e., $L_0$.
3. When $X$ is both continuous and of bounded variation, the liquidity costs of the trading strategy equal $L_0$.

4 These three valuations are (in general) distinct except at one date, the liquidation date. At the liquidation time $\tau$, the values of the portfolio under each of these three cases are equal because $X_\tau = 0$.
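The decomposition in Lemma 1 and Remark 1 can be illustrated numerically. The sketch below executes the same total purchase either as a single block or as many small slices against the exponential supply curve used earlier, holding $S(t, 0)$ fixed, and accumulates the jump component $\sum_u \Delta X_u [S(u, \Delta X_u) - S(u, 0)]$ of the liquidity cost. The supply curve and the trade schedule are our own illustrative assumptions; the point is only that splitting the order shrinks the liquidity cost toward zero, as Remark 1 suggests for continuous, finite variation strategies.

```python
import numpy as np

def liquidity_cost_of_purchase(total_shares: float, n_slices: int,
                               s0: float = 100.0, alpha: float = 1e-4) -> float:
    """Jump component of L_t for buying `total_shares` in `n_slices` equal
    trades against S(t, x) = s0 * exp(alpha * x), with S(t, 0) held fixed.

    Each discrete trade of size dx costs dx * (S(t, dx) - S(t, 0)) in excess
    of the marginal price, which is exactly its contribution to L_t.
    """
    dx = total_shares / n_slices
    per_trade_cost = dx * (s0 * np.exp(alpha * dx) - s0)
    return n_slices * per_trade_cost

for n in (1, 10, 100, 10_000):
    cost = liquidity_cost_of_purchase(10_000.0, n)
    print(f"{n:>6d} slice(s): liquidity cost = {cost:12.2f}")
# The cost decays like O(1/n): in the limit of infinitely fine slicing
# (a continuous, finite variation strategy) the liquidity cost vanishes.
```

This is the discrete shadow of the subdivision argument used later for approximately complete markets.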
64.3 The Extended First Fundamental Theorem

This section studies the characterization of an arbitrage-free market and generalizes the first fundamental theorem of asset pricing to an economy with liquidity risk. To evaluate a self-financing trading strategy, it is essential to consider its value after liquidation. This is equivalent to studying the portfolio’s real wealth, as contrasted with its marked-to-market value or paper wealth (see Jarrow 1992). Using this insight, an arbitrage opportunity can now be defined.
Definition 4. An arbitrage opportunity is a s.f.t.s. $(X, Y, \tau)$ such that $\mathbb{P}\{Y_T \ge 0\} = 1$ and $\mathbb{P}\{Y_T > 0\} > 0$.

Before we begin, let us clarify the behavior of the stochastic integral at time $t = 0$. If $S$ is a semimartingale with decomposition $S_t = S_0 + M_t + A_t$, $M_0 = A_0 = 0$, and $X$ is a càdlàg and adapted process, then $\int_0^t X_{s-} \, dS_s = \int_0^t X_{s-} \, dM_s + \int_0^t X_{s-} \, dA_s$. The initial term is $X_{0-} \Delta S_0$, which of course is 0, since $X_{0-} = 0$. On the other hand, if $X$ itself is predictable, then $\int_0^t X_s \, dS_s = X_0 \Delta S_0 + \int_0^t X_s \, dM_s + \int_0^t X_s \, dA_s$. Some authors write $\int_{0+}^t X_s \, dS_s$ to denote the stochastic integral without its jump at 0. With this interpretation, $\int_0^t X_s \, dS_s = X_0 \Delta S_0 + \int_{0+}^t X_s \, dS_s$. Note that we assume $S(\cdot, 0)$ is continuous at $t = 0$; thus the integral $\int_0^t X_u \, dS(u, 0)$ has no jump at 0 for all predictable $X$ that are $S(\cdot, 0)$-integrable.

We first need to define some mathematical objects. Let $s_t \equiv S(t, 0)$, $(X \cdot s)_t \equiv \int_0^t X_u \, dS(u, 0)$, and for $\alpha \ge 0$, let $\Theta_\alpha \equiv \{ \text{s.f.t.s. } (X, Y, \tau) \mid (X \cdot s)_t \ge -\alpha \text{ for all } t \text{ almost surely} \}$.

Definition 5. Given an $\alpha \ge 0$, a s.f.t.s. $(X, Y, \tau)$ is said to be $\alpha$-admissible if $(X, Y, \tau) \in \Theta_\alpha$. A s.f.t.s. is admissible if it is $\alpha$-admissible for some $\alpha$.

Lemma 2 ($Y_t + X_t S(t, 0)$ is a supermartingale). If there exists a probability measure $\mathbb{Q} \sim \mathbb{P}$ such that $S(\cdot, 0)$ is a $\mathbb{Q}$-local martingale, and if $(X, Y, \tau) \in \Theta_\alpha$ for some $\alpha$, then $Y_t + X_t S(t, 0)$ is a $\mathbb{Q}$-supermartingale.

Proof. From Definition 3 we have that $Y_t + X_t S(t, 0) = (X \cdot s)_t - L_t$. Under the $\mathbb{Q}$ measure, $(X \cdot s)_t$ is a local $\mathbb{Q}$-martingale. Since $(X, Y, \tau) \in \Theta_\alpha$ for some $\alpha$, it is a supermartingale (Duffie 1996). But, by Lemma 1, $L_t$ is nonnegative and nondecreasing. Therefore, $Y_t + X_t S(t, 0)$ is a supermartingale too. □

Theorem 1 (A Sufficient Condition for No Arbitrage). If there exists a probability measure $\mathbb{Q} \sim \mathbb{P}$ such that $S(\cdot, 0)$ is a $\mathbb{Q}$-local martingale, then there is no arbitrage for $(X, Y, \tau) \in \Theta_\alpha$ for any $\alpha$.

Proof. Under this hypothesis, by Lemma 2, $Y_t + X_t S(t, 0)$ is a supermartingale. Note that $Y_T + X_T S(T, 0) = Y_T$ by the definition of the liquidation time. Thus, for this s.f.t.s., $E^{\mathbb{Q}}[Y_T] = E^{\mathbb{Q}}[Y_T + X_T S(T, 0)] \le 0$. But, by the definition of an arbitrage opportunity, $E^{\mathbb{Q}}[Y_T] > 0$. Hence, there exist no arbitrage opportunities in this economy. □

The intuition behind this theorem is straightforward. The marked-to-market portfolio is a hypothetical portfolio that contains zero liquidity costs (see Definition 3). If $S(\cdot, 0)$ has an equivalent martingale measure, then these hypothetical portfolios admit no arbitrage. But, since the actual portfolios differ from these hypothetical portfolios only by the subtraction of nonnegative liquidity costs (Lemma 1), the actual portfolios cannot admit arbitrage either.

In order to get a sufficient condition for the existence of an equivalent local martingale measure, we need to define the notion of a free lunch with vanishing risk as in Delbaen and Schachermayer (1994). This will require a preliminary definition.

Definition 6. A free lunch with vanishing risk (FLVR) is either: (i) an admissible s.f.t.s. that is an arbitrage opportunity, or (ii) a sequence of $\delta_n$-admissible s.f.t.s. $(X^n, Y^n, \tau^n)_{n \ge 1}$ and a nonnegative $\mathcal{F}_T$-measurable random variable $f_0$, not identically 0, such that $\delta_n \to 0$ and $Y_T^n \to f_0$ in probability.5

To state the theorem, we need to introduce a related, but fictitious economy. Consider the economy introduced previously, but suppose instead that $S(t, x) \equiv S(t, 0)$. When there is no confusion, we denote $S(t, 0)$ by the simpler notation $s_t$. In this fictitious economy, a s.f.t.s. $(X, Y^0, \tau)$ satisfies the classical condition with $X_{0-} \equiv 0$, the value of the portfolio is given by $Z_t^0 \equiv Y_t^0 + X_t s_t$ with $Y_t^0 = (X \cdot s)_t - X_t s_t$ for all $0 \le t \le T$, and $X$ is allowed to be a general $S(\cdot, 0) = s$ integrable predictable process (see the remark following expression (64.1)). So, in this fictitious economy, our definitions of an arbitrage opportunity, an admissible trading strategy, and no free lunch with vanishing risk (NFLVR) collapse to those in Delbaen and Schachermayer (1994).

Theorem 2 (First Fundamental Theorem). Suppose there are no arbitrage opportunities in the fictitious economy. Then, there is no free lunch with vanishing risk (NFLVR) if and only if there exists a probability measure $\mathbb{Q} \sim \mathbb{P}$ such that $S(\cdot, 0) = s$ is a $\mathbb{Q}$-local martingale.

The proof is in the Appendix.
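Lemma 2 says that marked-to-market wealth $Y_t + X_t S(t, 0)$ equals a martingale gains process minus a nondecreasing liquidity cost, hence is a Q-supermartingale. This can be sanity-checked by simulation. The sketch below uses a driftless random walk for $S(t, 0)$ (a martingale under the simulation measure), the exponential supply curve from earlier, and a simple buy-then-liquidate strategy; all of these choices are illustrative assumptions of ours, not part of the theorem.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 100_000, 10
s0, alpha, qty = 100.0, 1e-4, 1_000.0

# S(t, 0): driftless random walk, a martingale under the simulation measure.
ds = rng.normal(0.0, 1.0, (n_paths, n_steps))
s = s0 + np.cumsum(ds, axis=1)

# Strategy: buy `qty` shares at t = 0, liquidate in one block at the end.
# Marked-to-market wealth: Z_t = Y_t + X_t S(t,0) = gains - liquidity costs.
trade_cost = qty * s0 * (np.exp(alpha * qty) - 1.0)        # L_0 (block buy)
liq_cost = qty * s[:, -1] * (1.0 - np.exp(-alpha * qty))   # jump cost at sale
gains = qty * (s - s0)                                      # (X . s)_t

z_mid = gains[:, n_steps // 2] - trade_cost                 # before liquidation
z_end = gains[:, -1] - trade_cost - liq_cost                # after liquidation
print(f"E[Z_mid] = {z_mid.mean():10.2f}")   # ~ -L_0 < 0
print(f"E[Z_end] = {z_end.mean():10.2f}")   # lower still: supermartingale
```

The sample means are nonincreasing in time, consistent with the supermartingale property, and both are negative because nonnegative liquidity costs are subtracted from a zero-mean gains process.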
5 Delbaen and Schachermayer (1994), Proposition 3.6, page 477, shows that this definition is equivalent to FLVR in the classical economy.

64.4 The Extended Second Fundamental Theorem

This section studies the meaning and characterization of a complete market and generalizes the second fundamental theorem of asset pricing to an economy with liquidity risk. For this section we assume that there exists an equivalent local martingale measure $\mathbb{Q}$ so that the economy is arbitrage free and there is NFLVR. Also for this section, we generalize the definition of a s.f.t.s. $(X, Y, \tau)$ slightly to allow for nonzero investments at time 0. In particular, a s.f.t.s. $(X, Y, \tau)$ in this section will
satisfy Definition 2 with the exception that condition (b) is removed. That is, a s.f.t.s. need not have zero initial value ($Y_0 + X_0 S(0, X_0) \ne 0$).⁶

To proceed, we need to define the space $\mathcal{H}^2_Q$ of semimartingales with respect to the equivalent local martingale measure $Q$. Let $Z$ be a special semimartingale with canonical decomposition $Z = N + A$, where $N$ is a local martingale under $Q$ and $A$ is a predictable finite variation process. The $\mathcal{H}^2$ norm of $Z$ is defined to be

$$\|Z\|_{\mathcal{H}^2} = \left\| [N,N]_\infty^{1/2} \right\|_{L^2} + \left\| \int_0^\infty |dA_s| \right\|_{L^2}$$

where the $L^2$-norms are with respect to the equivalent local martingale measure $Q$.

Throughout this section we make the assumption that $s(\cdot) = S(\cdot,0) \in \mathcal{H}^2_Q$. Since we are assuming $s \in \mathcal{H}^2_Q$, it is no longer necessary to require that $X \cdot s$ is uniformly bounded from below.

Definition 7. A contingent claim is any $F_T$-measurable random variable $C$ with $E^Q(C^2) < \infty$.

Note that the contingent claim is considered at a time $T$, prior to which the trader's stock position is liquidated. If the contingent claim's payoff depends on the stock price at time $T$, then the dependence of the contingent claim's payoff on the shares purchased/sold at time $T$ must be made explicit. Otherwise, the contingent claim's payoff is not well defined. An example helps to clarify this necessity.

Consider a European call option on the stock with a strike price⁷ of $K$ and maturity $T_0 \le T$.⁸ To write the modified boundary condition for this option incorporating the supply curve for the stock, we must consider two cases: cash delivery and physical delivery.

1. If the option has cash delivery, the long position in the option receives cash at maturity if the option ends up in-the-money. To match the cash settlement, the synthetic option position must be liquidated prior to time $T_0$. When the synthetic option position is liquidated, the underlying stock position is also liquidated. The position in the stock at time $T_0$ is, thus, zero. If we sell the stock at time $T_0$ to achieve this position, then the boundary condition is $C \equiv \max[S(T_0,-1) - K, 0]$ where $\Delta X_{T_0} = -1$, since the option is for one share of the stock. However, as we show below, one could also liquidate this stock position just prior to time $T_0$ using a continuous and finite variation process, so that $\Delta X_{T_0} = 0$. This alternative liquidation strategy might be chosen in an attempt to avoid liquidity costs at time $T_0$. In this case, the boundary condition is $C \equiv \max[S(T_0,0) - K, 0]$. Note that using this latter liquidation strategy, the option's payoff is only approximately obtained (to a given level of accuracy), because liquidation occurs just before $T_0$.

2. If the option has physical delivery, then the synthetic option position should match the physical delivery of the stock in the option contract. With physical delivery, the option contract obligates the short position to deliver the stock shares. To match the physical delivery, the stock position in the synthetic option is not sold. Unfortunately, our model requires the stock position to be liquidated at time $T_0$. Formally, physical delivery is not possible in our construct. However, to approximate physical delivery in our setting, we can impose the boundary condition $C \equiv \max[S(T_0,0) - K, 0]$ where $\Delta X_{T_0} = 0$. This boundary condition is consistent with no liquidity costs being incurred at time $T_0$, which would be the case with physical delivery of the stock.⁹

[Footnote 6] In this section we could also relax condition (b) of a trading strategy, Definition 2, to remove the requirement that $X_T = 0$. However, as seen below, it is always possible to approximate any random variable with such a trading strategy. Consequently, this restriction is without loss of generality in the context of our model. This condition was imposed in the previous section to make the definition of an arbitrage opportunity meaningful in a world with liquidity costs.
[Footnote 7] To be consistent with the previous construct, one should interpret $K$ as the strike price normalized by the value of the money market account at time $T_0$.
[Footnote 8] Recall that interest rates are zero, so that the value of the liquidated position at time $T_0$ is the same as the position's value at time $T$.
[Footnote 9] We are studying an economy with trading only in the stock and money market account. Expanding this economy to include trading in an option expands the liquidation possibilities prior to time $T$. Included in this set of expanded possibilities is the delivery of the physical stock to offset the position in an option, thereby avoiding any liquidity costs at time $T$. Case 2 is the method for capturing no liquidity costs in our restricted setting.

Definition 8. The market is complete if given any contingent claim $C$, there exists a s.f.t.s. $(X,Y,\tau)$ with $E^Q\left\{\int_0^T X_u^2\, d[s,s]_u\right\} < \infty$ such that $Y_T = C$.

To understand the issues involved in replicating contingent claims, let us momentarily consider a contingent claim $C$ in $L^2(dQ)$ where there exists a s.f.t.s. $(X,Y,\tau)$ such that $C = c + \int_0^T X_u\, ds_u$ where $c \in \mathbb{R}$ and $E^Q\left\{\int_0^T X_u^2\, d[s,s]_u\right\} < \infty$. Note that $E^Q(C) = c$ since $\int_0^0 X_u\, ds_u = X_0\, \Delta s_0 = 0$ by the continuity of $s$ at time 0. This is the situation where, in the absence of liquidity costs, a long position in the contingent claim $C$ is redundant. In this case, $Y_0$ is chosen so that $Y_0 + X_0 s_0 = c$. But, the liquidity costs in trading this stock position are (by Lemma 1):

$$L_t = \sum_{0 \le u \le t} \Delta X_u\, \left[ S(u, \Delta X_u) - S(u,0) \right] + \int_0^t \frac{\partial S}{\partial x}(u,0)\, d[X,X]_u^c \;\ge\; 0.$$
We have from Definition 2 that

$$Y_T = Y_0 + X_0 s_0 + \int_0^T X_u\, ds_u - X_T s_T - L_T + L_0$$

and¹⁰ $\int_0^T X_u\, ds_u = \int_0^T X_{u-}\, ds_u$, so that

$$Y_T = C - X_T s_T - (L_T - L_0).$$

By assumption, we have liquidated by time $T$, giving $X_T = 0$. Thus, we have

$$Y_T = C - (L_T - L_0) \le C.$$

That is, considering liquidity costs, this trading strategy sub-replicates a long position in this contingent claim's payoffs. Analogously, if we use $-X$ to hedge a short position in this contingent claim, the payoff is generated by

$$\bar{Y}_T = -C - (\bar{L}_T - \bar{L}_0) \le -C,$$

where $\bar{Y}$ is the value in the money market account and $\bar{L}$ is the liquidity cost associated with $-X$. The liquidation values of the trading strategies (long and short) provide a lower and an upper bound on attaining the contingent claim's payoffs.

[Footnote 10] $\int_0^T X_u\, ds_u = \int_0^T X_{u-}\, ds_u + \sum_{0 \le u \le T} \Delta X_u\, \Delta s_u$, and $\Delta X_u\, \Delta s_u = 0$ for all $u$ since $\Delta s_u = 0$ for all $u$ by the continuity of $s$.

Remark 2.
1. If $\partial S/\partial x(\cdot,0) \equiv 0$, then $L \equiv L_0$ if $X$ is a continuous trading strategy. So, under this hypothesis, all claims $C$ for which there exists a s.f.t.s. $(X,Y,\tau)$ such that $C = c + \int_0^T X_u\, ds_u$ with $X$ continuous can be replicated. For example, if $S(\cdot,0)$ is a geometric Brownian motion (an extended Black–Scholes economy), a call option can be replicated since the Black–Scholes hedge is a continuous s.f.t.s.
2. If $\partial S/\partial x(\cdot,0) \ge 0$ (the general case), then $L \equiv L_0$ if $X$ is a finite variation and continuous trading strategy. So, under this hypothesis, all claims $C$ for which there exists a s.f.t.s. $(X,Y,\tau)$ such that $C = c + \int_0^T X_u\, ds_u$ with $X$ of finite variation and continuous can be replicated.

The remark above shows that if we can approximate $X$ using a finite variation and continuous trading strategy, in a limiting sense, we may be able to avoid all the liquidity costs in the replication strategy. In this regard, the following lemma is relevant.

Lemma 3 (Approximating continuous and finite variation s.f.t.s.). Let $C \in L^2(dQ)$. Suppose there exists a predictable $X$ with $E^Q\left\{\int_0^T X_u^2\, d[s,s]_u\right\} < \infty$ so that $C = c + \int_0^T X_u\, ds_u$ for some $c \in \mathbb{R}$. Then, there exists a sequence of s.f.t.s. $(X^n, Y^n, \tau^n)_{n\ge1}$ with $X^n$ bounded, continuous, and of finite variation such that $E^Q\left\{\int_0^T (X_u^n)^2\, d[s,s]_u\right\} < \infty$, $X_0^n = 0$, $X_T^n = 0$, $Y_0^n = E^Q(C)$ for all $n$, and

$$Y_T^n \equiv Y_0^n + X_0^n S(0, X_0^n) + \int_0^T X_u^n\, ds_u - X_T^n S(T,0) - L_T^n \;\longrightarrow\; c + \int_0^T X_u\, ds_u = C \qquad (64.3)$$

in $L^2(dQ)$.

Proof. Note that for any predictable $X$ that is integrable with respect to $s$, $\int_0^T X_u\, ds_u = \int_0^T X_u 1_{(0,T]}(u)\, ds_u$ since $\int_0^T 1_{(0,T]} X_u\, ds_u = \int_0^T X_u\, ds_u - X_0\, \Delta s_0$ and $\Delta s_0 = 0$. Therefore, we can without loss of generality assume that $X_0 = 0$. Given any $H \in \mathbb{L}$ (the set of adapted processes that have left continuous paths with right limits a.s.) with $H_0 = 0$, we define $H^n$ by

$$H_t^n(\omega) = n \int_{t - \frac{1}{n}}^{t} H_u(\omega)\, du,$$

for all $t \ge 0$, letting $H_u$ equal $0$ for $u < 0$. Then $H$ is the a.s. pointwise limit of the sequence of adapted processes $H^n$ that are continuous and of finite variation. Note that $H_0^n = 0$ for all $n$. Theorem 2 in Chapter IV of Protter (2004) remains valid if $b\mathbb{L}$ is replaced by the set of bounded, continuous processes with paths of finite variation on compact time sets. Let $X$ with $X_0 = 0$ be predictable with $E^Q\left\{\int_0^T X_u^2\, d[s,s]_u\right\} < \infty$. Since $X \cdot s$ is defined to be $\lim_{k \to \infty} X^k \cdot s$, where the convergence is in $\mathcal{H}^2$ and $X^k = X 1_{\{|X| \le k\}}$, and using the above observation, there exists a sequence of continuous and bounded processes of finite variation, $(X^n)_{n\ge1}$, such that $E^Q\left\{\int_0^T (X_u^n)^2\, d[s,s]_u\right\} < \infty$, $X_0^n = 0$ for all $n$, and

$$\int_0^T X_u^n\, ds_u \;\longrightarrow\; \int_0^T X_u\, ds_u$$

in $L^2(dQ)$ (see Theorems 2, 4, 5, and 14 in Chapter IV of Protter (2004) in this respect). Furthermore, Theorem 9 and Corollary 3 in the Appendix allow us to choose $X_T^n = 0$ for all $n$. Now, choose $Y_0^n = E^Q(C)$ for all $n$ and define $Y_t^n$ for $t > 0$ by Equation (64.1). Let $\tau^n = T$ for all $n$. Then, the sequence $(X^n, Y^n, \tau^n)_{n\ge1}$ will satisfy Equation (64.3). Note that $L^n \equiv 0$ for all $n$ and $\int_0^T X_u^n\, ds_u = \int_0^T X_{u-}^n\, ds_u$. □

This lemma motivates the following definition and extension of the second fundamental theorem of asset pricing.

Definition 9. The market is approximately complete if given any contingent claim $C$, there exists a sequence of s.f.t.s. $(X^n, Y^n, \tau^n)$ with $E^Q\left\{\int_0^T (X_u^n)^2\, d[s,s]_u\right\} < \infty$ for all $n$ such that $Y_T^n \to C$ as $n \to \infty$ in $L^2(dQ)$.

Theorem 3 (Second Fundamental Theorem). Suppose there exists a unique probability measure $Q \sim P$ such that $S(\cdot,0) = s$ is a $Q$-local martingale. Then, the market is approximately complete.

Proof. The proof proceeds in two steps. Step 1 shows that the hypothesis guarantees that a fictitious economy with no liquidity costs is complete. Step 2 shows that this result implies approximate completeness for an economy with liquidity costs.

Step 1. Consider the economy introduced in this paper, but suppose that $S(\cdot,x) \equiv S(\cdot,0)$. In this fictitious economy, a s.f.t.s. $(X, Y^0, \tau)$ satisfies the classical condition with $Y_t^0 \equiv Y_0 + X_0 S(0,0) + \int_0^t X_u\, ds_u - X_t s_t$. The classical second fundamental theorem (see Harrison and Pliska (1981)) applies: the fictitious market is complete if and only if $Q$ is unique.

Step 2. By Step 1, given $Q$ is unique, the fictitious economy is complete and, moreover, $s$ has the martingale representation property. Hence, there exists a predictable $X$ such that $C = c + \int_0^T X_u\, ds_u$ with $E^Q\left\{\int_0^T X_u^2\, d[s,s]_u\right\} < \infty$ (see Section 3 of Chapter IV of Protter (2004) in this respect). Then, by applying the lemma above, the market is approximately complete. □
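The smoothing device in the proof of Lemma 3 is easy to visualize numerically. The following minimal Python sketch is our illustration, not from the chapter; the helper `mollify` is a hypothetical name. It builds the trailing averages $H_t^n = n\int_{t-1/n}^t H_u\,du$ for a càdlàg step strategy and shows that they are continuous, of finite variation, and converge pointwise to $H$.

```python
import numpy as np

# Sketch (ours) of the mollification in Lemma 3: approximate a cadlag
# strategy H by the continuous, finite variation trailing averages
# H^n_t = n * integral over (t - 1/n, t] of H_u du.

def mollify(H, t_grid, n):
    """Trailing moving average of H over windows of length 1/n."""
    dt = t_grid[1] - t_grid[0]
    window = max(int(round(1.0 / (n * dt))), 1)
    H_padded = np.concatenate([np.zeros(window), H])   # H_u = 0 for u < 0
    kernel = np.ones(window) / window
    # full convolution; slice picks the average over (t - 1/n, t] for each t
    return np.convolve(H_padded, kernel, mode="full")[window:window + len(H)]

t = np.linspace(0.0, 1.0, 1001)
H = (t >= 0.5).astype(float)        # jump from 0 to 1 at t = 0.5 (a cadlag step)

for n in (5, 20, 100):
    Hn = mollify(H, t, n)
    # H^n is continuous with H^n_0 = 0; it converges pointwise to H as n grows
    print(n, float(np.max(np.abs(Hn - H)[t <= 0.4])), float(Hn[500]))
```

As $n$ grows, the ramp through the jump steepens, reproducing $H$ in the limit while each $H^n$ remains continuous and of finite variation, exactly the class of strategies that, by Remark 2, incurs no liquidity cost.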
Suppose the martingale measure is unique. Then, by the theorem we know that given any contingent claim $C$, there exists a sequence of s.f.t.s. $(X^n, Y^n, \tau^n)_{n\ge1}$ with $E^Q\left\{\int_0^T (X_u^n)^2\, d[s,s]_u\right\} < \infty$ for all $n$ so that $Y_T^n = Y_0^n + X_0^n S(0, X_0^n) - L_T^n + \int_0^T X_u^n\, dS(u,0) \to C$ in $L^2(dQ)$. We call any such sequence of s.f.t.s., $(X^n, Y^n, \tau^n)_{n\ge1}$, an approximating sequence for $C$.

Definition 10. Let $C$ be a contingent claim and $\Psi^C$ be the set of approximating sequences for $C$. The time 0 value of the contingent claim $C$ is given by

$$\inf\left\{ \liminf_{n\to\infty} \left( Y_0^n + X_0^n S(0, X_0^n) \right) : (X^n, Y^n, \tau^n)_{n\ge1} \in \Psi^C \right\}.$$

Corollary 1 (Contingent Claim Valuation). Suppose there exists a unique probability measure $Q \sim P$ such that $S(\cdot,0) = s$ is a $Q$-local martingale. Then, the time 0 value of any contingent claim $C$ is equal to $E^Q(C)$.

Proof. Let $(X^n, Y^n, \tau^n)_{n\ge1}$ be an approximating sequence for $C$. Then, $E^Q\left[(Y_T^n - C)^2\right] \to 0$, and thus, $E^Q(Y_T^n - C) \to 0$. However, since $E^Q\left\{\int_0^T (X_u^n)^2\, d[s,s]_u\right\} < \infty$ for all $n$, $\int_0^\cdot X_u^n\, ds_u$ is a $Q$-martingale for each $n$. This yields $E^Q(Y_T^n) = Y_0^n + X_0^n S(0, X_0^n) - E^Q(L_T^n)$. Combining this with the fact that $L^n \ge 0$ for each $n$ and $E^Q(Y_T^n - C) \to 0$ gives $\liminf_{n\to\infty} \left( Y_0^n + X_0^n S(0, X_0^n) \right) \ge E^Q(C)$ for all approximating sequences. However, as proven in Lemma 3, there exists some approximating sequence $(\bar{X}^n, \bar{Y}^n, \bar{\tau}^n)_{n\ge1}$ with $\bar{L}^n = 0$ for all $n$. For this sequence, $\liminf_{n\to\infty} \left( \bar{Y}_0^n + \bar{X}_0^n S(0, \bar{X}_0^n) \right) = E^Q(C)$. □

Remark 3.
1. The above value is consistent with no arbitrage. Indeed, suppose the contingent claim is sold at a price $p > E^Q(C)$. Then, one can short the contingent claim at $p$ and construct a sequence of continuous and of finite variation s.f.t.s., $(X^n, Y^n, \tau^n)_{n\ge1}$, with $Y_0^n = E^Q(C)$, $X_0^n = 0$, and $\lim_{n\to\infty} Y_T^n = C$ in $L^2$, hence in probability, creating a FLVR. However, this is not allowed since $Q$ is an equivalent martingale measure for $s$. Similarly, one can show that the price of the contingent claim cannot be less than $E^Q(C)$.
2. Given our supply curve formulation, this corollary implies that continuous trading strategies of finite variation can be constructed to both (i) approximately replicate any contingent claim, and (ii) avoid all liquidity costs. This elucidates the special nature of continuous trading strategies in a continuous time setting.

64.5 Example (Extended Black–Scholes Economy)

To illustrate the previous theory, we consider an extension of the Black–Scholes economy that incorporates liquidity risk. A detailed discussion of this example along with some empirical evidence regarding the pricing of traded options in the extended Black–Scholes economy can be found in Çetin et al. (2002).

64.5.1 The Economy

Let

$$S(t,x) = e^{\alpha x} S(t,0) \quad \text{with } \alpha > 0, \qquad (64.4)$$

$$S(t,0) \equiv s_0\, e^{\mu t + \sigma W_t} e^{-rt} \qquad (64.5)$$

where $\mu, \sigma$ are constants and $W$ is a standard Brownian motion initialized at zero. For this section, let the spot rate of interest be constant and equal to $r$ per unit time. The marginal stock price follows a geometric Brownian motion. The normalization by
the money market account's value is made explicit in expression (64.5). Expressions (64.4) and (64.5) characterize an extended Black–Scholes economy. It is easy to check that this supply curve satisfies Assumption 1. Under these assumptions, there exists a unique martingale measure for $S(\cdot,0) = s$; see Duffie (1996). Hence, we know that the market is arbitrage free and approximately complete.
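Before valuing options in this economy, it is worth seeing what the supply curve (64.4) implies for execution prices. The short sketch below is ours, with hypothetical parameter values: the per-share price paid rises exponentially with the (signed) order size $x$, so a block trade costs more than $x$ times the marginal price $S(t,0)$, and a block sale fetches less.

```python
import numpy as np

# Sketch (ours) of the supply curve (64.4): S(t,x) = exp(alpha * x) * S(t,0).
alpha = 1e-4        # assumed price-impact parameter
s_t0 = 100.0        # marginal price S(t,0) at a fixed time t

def quoted_price(x):
    """Per-share price for an order of signed size x (x < 0 is a sale)."""
    return np.exp(alpha * x) * s_t0

for x in (1, 100, 10_000, -10_000):
    print(f"order {x:>7}: per-share {quoted_price(x):9.4f}, "
          f"total {x * quoted_price(x):14.2f}")
```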
64.5.2 Call Option Valuation

Consider a European call option with strike price $K$ and maturity date $T$ on this stock with cash delivery. Given cash delivery, in order to avoid liquidity costs at time $T$, the payoff¹¹ to the option at time $T$ is selected to be $C_T = \max[S(T,0) - K, 0]$. Under this structure, by the corollary to the second fundamental theorem of asset pricing, the value of a long position in the option is

$$C_0 = e^{-rT} E^Q\left( \max[S(T,0) - K, 0] \right).$$

It is well known that the expectation in this expression gives the Black–Scholes–Merton formula:

$$C_0 = s_0\, N(h(0)) - K e^{-rT} N\!\left( h(0) - \sigma\sqrt{T} \right)$$

where $N(\cdot)$ is the standard cumulative normal distribution function and

$$h(t) \equiv \frac{\log s_t - \log K + r(T-t)}{\sigma\sqrt{T-t}} + \frac{\sigma\sqrt{T-t}}{2}.$$

Applying Itô's formula, the classical replicating strategy, $X = (X_t)_{t \in [0,T]}$, implied by the classical Black–Scholes–Merton formula is given by

$$X_t = N(h(t)). \qquad (64.6)$$

This hedging strategy is continuous, but not of finite variation. In this economy, we have $\frac{\partial S}{\partial x}(t,0) = \alpha e^{0} s_t = \alpha s_t$. Hence, although the call's value is the Black–Scholes formula, the standard hedging strategy will not attain this value. Indeed, using this strategy, it can be shown that the classical Black–Scholes hedge leads to the following nonzero liquidity costs (from the liquidity cost expression of Lemma 1):¹²

$$L_T = X_0 \left( S(0, X_0) - S(0,0) \right) + \int_0^T \frac{\alpha\, \left( N'(h(u)) \right)^2 s_u}{T - u}\, du. \qquad (64.7)$$

In contrast, an approximate hedging strategy that is continuous and of finite variation, having zero liquidity costs, is the sequence of s.f.t.s. $(X^n, Y^n, \tau^n)_{n\ge1}$ with

$$X_t^n = \begin{cases} \displaystyle 1_{\left(\frac{1}{n},\, T - \frac{1}{n}\right]}(t)\; n \int_{t - \frac{1}{n}}^{t} N(h(u))\, du, & 0 \le t \le T - \frac{1}{n}, \\[8pt] \displaystyle n\, X^n_{T - \frac{1}{n}}\, (T - t), & T - \frac{1}{n} \le t \le T, \end{cases} \qquad (64.8)$$

and $Y_0^n = E^Q(C_T)$. In the limit, this trading strategy attains the call's time $T$ value, i.e., $Y_T^n = Y_0^n + \int_0^T X_u^n\, ds_u \to C_T = \max[S(T,0) - K, 0]$ in $L^2(dQ)$.

[Footnote 11] The strike price needs to be normalized by the value of the money market account.
[Footnote 12] Note that both $L_T$ and $Y_T^n$ are normalized by the value of the money market account.

64.6 Discontinuous Supply Curve Evolutions

This section extends the previous economy to allow stock price supply curves with discontinuous sample paths.

64.6.1 The Supply Curve and s.f.t.s.'s

For this extension we replace Assumption 1 with:

Assumption 2 (Supply Curve).
1. $S(t,x,\cdot)$ is $F_t$-measurable and nonnegative.
2. $x \mapsto S(t,x,\omega)$ is a.e. $t$ nondecreasing in $x$, a.s. (i.e., $x \le y$ implies $S(t,x,\omega) \le S(t,y,\omega)$ a.s. $P$, a.e. $t$).
3. $S$ is $C^2$ in its second argument.
4. $S(\cdot,0)$ is a semimartingale and $\partial S(\cdot,0)/\partial x$ has finite quadratic variation.

Remark 4. Sample path continuity of $S(t,0)$ is replaced by the requirement that $\partial S(\cdot,0)/\partial x$ have finite quadratic variation. In order for $\left[ \partial S(\cdot,0)/\partial x,\, [X,X] \right]$ to be well defined we need this restriction. It will be satisfied, for example, if $S(t,x) = f(D_t, x)$ where $f$ is $C^3$ in its first argument, $C^2$ in its second argument, and where $D$ is a semimartingale.

The definition of a trading strategy remains unchanged under Assumption 2. The definition of a self-financing trading strategy is changed only marginally to include possible jumps in the first partial derivative of the supply curve:

Definition 11. A self-financing trading strategy (s.f.t.s.) is a trading strategy $((X_t, Y_t : t \in [0,T]), \tau)$ where (a) $X$ is càdlàg if $\partial S(t,0)/\partial x \equiv 0$ for all $t$, and $X$ is càdlàg with finite quadratic variation ($[X,X]_T < \infty$) otherwise, (b) $Y_0 = -X_0\, S(0, X_0)$, and (c) for $0 \le t \le T$,

$$Y_t = Y_0 + X_0 S(0, X_0) + \int_0^t X_u\, dS(u,0) - X_t S(t,0) - \sum_{0 \le u \le t} \Delta X_u \left[ S(u, \Delta X_u) - S(u,0) \right] - \int_0^t \frac{\partial S}{\partial x}(u,0)\, d[X,X]_u^c.$$

The justification of this definition, based on a limiting argument, is contained in the Appendix.
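The size of the liquidity cost (64.7) incurred by the naive Black–Scholes hedge can be checked by simulation. The sketch below is our own illustration with assumed parameter values, not from the chapter; prices are taken as already normalized by the money market account, so we simulate the driftless $Q$-dynamics of $s_t$ directly and approximate the integral in (64.7) by a Riemann sum.

```python
import numpy as np
from math import erf

# Simulation sketch (ours) of the liquidity cost (64.7) of the classical
# Black-Scholes delta hedge X_t = N(h(t)) when S(t,x) = exp(alpha*x) S(t,0).

rng = np.random.default_rng(0)
s0, K, sigma, T, alpha = 100.0, 100.0, 0.2, 1.0, 1e-4   # assumed values
n = 20_000
dt = T / n
t = np.linspace(0.0, T, n + 1)

def N(x):                       # standard normal cdf
    return 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

def nprime(x):                  # standard normal density N'
    return np.exp(-0.5 * x * x) / np.sqrt(2.0 * np.pi)

def h(s, u):                    # h(t) of the BSM formula with discounted prices
    tau = T - u
    return np.log(s / K) / (sigma * np.sqrt(tau)) + 0.5 * sigma * np.sqrt(tau)

# one path of the discounted marginal price under Q (driftless GBM)
dW = rng.normal(0.0, np.sqrt(dt), n)
s = np.empty(n + 1)
s[0] = s0
s[1:] = s0 * np.exp(np.cumsum(-0.5 * sigma**2 * dt + sigma * dW))

# L_T = X_0 (S(0,X_0)-S(0,0)) + int_0^T alpha N'(h(u))^2 s_u / (T-u) du
X0 = N(h(s0, 0.0))
L = X0 * (s0 * np.exp(alpha * X0) - s0)
for i in range(n - 1):          # Riemann sum; stop short of the endpoint u = T
    hu = h(s[i], t[i])
    L += alpha * nprime(hu)**2 * s[i] / (T - t[i]) * dt

print(f"liquidity cost of the Black-Scholes hedge on this path: {L:.6f}")
```

Because $N'(h(u))^2$ decays rapidly as $h(u) \to \pm\infty$, the $1/(T-u)$ factor in (64.7) is typically tamed away from the money, which is why the Riemann sum behaves well on most simulated paths.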
64.6.2 The Extended First Fundamental Theorem

Under Assumption 2, Theorem 1 extends. A similar proof applies.

Theorem 4 (A Sufficient Condition for No Arbitrage). Given Assumption 2, if there exists a probability measure $Q \sim P$ such that $S(\cdot,0)$ is a $Q$-local martingale, then there is no arbitrage for $(X,Y,\tau) \in \Theta_\alpha$ for any $\alpha$.
64.6.3 The Extended Second Fundamental Theorem

Suppose there exists an equivalent local martingale measure $Q$ for $s \in \mathcal{H}^2_Q$, so that the market is arbitrage free. Under Assumption 2, the definitions of a contingent claim, a complete market, and an approximately complete market remain unchanged. The extended second fundamental theorem of asset pricing still holds, and the same proof applies as in the original theorem. The original proof did not use sample path continuity, but only the condition that the intercept of the supply curve has totally inaccessible jumps.

Theorem 5 (Second Fundamental Theorem). Assume Assumption 2, and let $S(\cdot,0) = s$ have totally inaccessible jumps. Suppose there exists a unique probability measure $Q \sim P$ such that $S(\cdot,0) = s$ is a $Q$-local martingale. Then, the market is approximately complete.
64.7 Conclusion

This paper extends classical arbitrage pricing theory to include liquidity risk by studying an economy where the security's price depends on the trade size. Extended first and second fundamental theorems of asset pricing are investigated. The economy is shown to be arbitrage free if and only if the stochastic process for the price of a marginal trade has an equivalent martingale probability measure. The second fundamental theorem of asset pricing, however, fails to hold in our setting: a martingale measure (appropriately defined) can be unique while markets are still incomplete. A weakened form of the second fundamental theorem does hold: markets will be approximately complete in our setting if the martingale measure is unique. In an approximately complete market, derivative prices are shown to be equal to the classical arbitrage-free price of the derivative.

Acknowledgment The authors wish to thank M. Warachka and Kiseop Lee for helpful comments, as well as the anonymous referee and Associate Editor for numerous helpful suggestions, which have made this a much improved paper. Supported in part by NSF grant DMS-0202958 and NSA grant MDA-904-03-1-0092.
References

Back, K. 1993. "Asymmetric information and options." The Review of Financial Studies 6(3), 435–472.
Bank, P. and D. Baum. 2004. "Hedging and portfolio optimization in illiquid financial markets with a large trader." Mathematical Finance 14(1), 1–18.
Barles, G. and H. Soner. 1998. "Option pricing with transaction costs and a nonlinear Black–Scholes equation." Finance and Stochastics 2, 369–397.
Çetin, U. 2003. Default and liquidity risk modeling, PhD thesis, Cornell University.
Çetin, U., R. Jarrow, P. Protter, and M. Warachka. 2002. Option pricing with liquidity risk, Working paper, Cornell University.
Constantinides, G. and T. Zariphopoulou. 1999. "Bounds on prices of contingent claims in an intertemporal economy with proportional transaction costs and general preferences." Finance and Stochastics 3, 345–369.
Cvitanic, J. and I. Karatzas. 1996. "Hedging and portfolio optimization under transaction costs: a martingale approach." Mathematical Finance 6, 133–165.
Cvitanic, J. and J. Ma. 1996. "Hedging options for a large investor and forward-backward SDEs." Annals of Applied Probability 6, 370–398.
Cvitanic, J., H. Pham, and N. Touzi. 1999. "A closed-form solution to the problem of super-replication under transaction costs." Finance and Stochastics 3, 35–54.
Delbaen, F. 2003. Personal communication.
Delbaen, F. and W. Schachermayer. 1994. "A general version of the fundamental theorem of asset pricing." Mathematische Annalen 300, 463–520.
Duffie, D. 1996. Dynamic asset pricing theory, 2nd edition, Princeton University Press, Princeton, NJ.
Geman, H., N. El Karoui, and J.-C. Rochet. 1995. "Changes of numéraire, changes of probability measure and option pricing." Journal of Applied Probability 32, 443–458.
Glosten, L. and P. Milgrom. 1985. "Bid, ask and transaction prices in a specialist market with heterogeneously informed traders." Journal of Financial Economics 14, 71–100.
Grossman, S. and M. Miller. 1988. "Liquidity and market structure." Journal of Finance 43(3), 617–637.
Harrison, J. M. and S. Pliska. 1981. "Martingales and stochastic integrals in the theory of continuous trading." Stochastic Processes and Their Applications 11, 215–260.
Heath, D., R. Jarrow, and A. Morton. 1992. "Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation." Econometrica 60(1), 77–105.
Jarrow, R. 1992. "Market manipulation, bubbles, corners and short squeezes." Journal of Financial and Quantitative Analysis, September, 311–336.
Jarrow, R. 1994. "Derivative security markets, market manipulation and option pricing." Journal of Financial and Quantitative Analysis 29(2), 241–261.
Jarrow, R. 2001. "Default parameter estimation using market prices." Financial Analysts Journal, September/October, 75–92.
Jarrow, R. and S. Turnbull. 1997. "An integrated approach to the hedging and pricing of Eurodollar derivatives." Journal of Risk and Insurance 64(2), 271–299.
Jouini, E. 2000. "Price functionals with bid-ask spreads: an axiomatic approach." Journal of Mathematical Economics 34(4), 547–558.
Jouini, E. and H. Kallal. 1995. "Martingales and arbitrage in securities markets with transaction costs." Journal of Economic Theory 66(1), 178–197.
Jouini, E., H. Kallal, and C. Napp. 2001. "Arbitrage in financial markets with fixed costs." Journal of Mathematical Economics 35(2), 197–221.
Kyle, A. 1985. "Continuous auctions and insider trading." Econometrica 53, 1315–1335.
Protter, P. 2001. "A partial introduction to financial asset pricing theory." Stochastic Processes and their Applications 91, 169–203.
Protter, P. 2004. Stochastic integration and differential equations, 2nd edition, Springer, Heidelberg.
Soner, H. M., S. Shreve, and J. Cvitanic. 1995. "There is no nontrivial hedging portfolio for option pricing with transaction costs." Annals of Applied Probability 5, 327–355.
Appendix 64A Proof of the First Fundamental Theorem

This theorem uses Assumption 1, sample path continuity of $S(t,x)$. The proof proceeds in two steps. Step 1 constructs a fictitious economy where all trades are executed at the marginal stock price. The theorem is true in this fictitious economy by the classical result. Step 2 then shows the theorem in this fictitious economy is sufficient to obtain the result in our economy.

Prior to this proof, we need to make the following observation in order to utilize the classical theory. The classical theory [see Delbaen and Schachermayer (1994), or alternatively Protter (2001) for an expository version] has trading strategies starting with $X_0 = 0$, while we have trading strategies with $X_{0-} = 0$ but not $X_0 = 0$. Without loss of generality, in the subsequent proof, we can restrict ourselves to predictable processes with $X_0 = 0$. Here is the argument. Recall $s_u = S(u,0)$. In our setup, choose $Y^0$ so that $X_0 S(0,0) + Y_0^0 = 0$ and $X_t S(t,0) + Y_t^0 = X_0 S(0,0) + Y_0^0 + \int_{0+}^t X_u\, ds_u = \int_{0+}^t X_u\, ds_u$. Define $\hat{X} = 1_{(0,T]} X$. $\hat{X}$ is predictable, $\hat{X}_0 = 0$, and $\int_{0+}^T X_u\, ds_u = \int_0^T \hat{X}_u\, ds_u$. The analysis can be done for $\hat{X}$.

Step 1. The Fictitious Economy

Consider the fictitious economy introduced in Sect. 64.3. Delbaen and Schachermayer prove the following in Section 4 of Delbaen and Schachermayer (1994):

Theorem 6. Given Assumption 1 and no arbitrage, there is NFLVR in the fictitious economy if and only if there exists a measure $Q \sim P$ such that $S(\cdot,0)$ is a $Q$-local martingale.

Since the stochastic integral of a predictable process can be written as a limit (uniformly on compacts in probability) of stochastic integrals with continuous and finite variation integrands (see section "Approximating Stochastic Integrals with Continuous and of FV Integrands" in the Appendix), we have the following corollary.¹³

Corollary 2. Suppose there is no arbitrage opportunity in the fictitious economy. Given Assumption 1, if there is an FLVR in the fictitious economy, there exists a sequence of $\epsilon_n$-admissible trading strategies $X^n$, continuous and of FV, and a nonnegative $F_T$-measurable random variable $f_0$, not identically zero, such that $\epsilon_n \to 0$ and $(X^n \cdot S)_T \to f_0$ in probability.

The proof of this corollary is straightforward, and hence we omit it.

[Footnote 13] In the original paper Delbaen and Schachermayer (1994), there is a missing hypothesis in the statement of their theorem related to this corollary. We include here, and in other results as needed, the missing hypothesis of no arbitrage. We are grateful to Professor Delbaen for providing us with a counterexample that shows one does in fact need this hypothesis (Delbaen 2003).

Step 2. The Illiquid Economy

In the economy with liquidity risk, restricting consideration to s.f.t.s. $(X,Y,\tau)$ with $X$ a finite variation and continuous process, by Lemma 1, we have that $Y_t = (X \cdot s)_t - X_t S(t,0)$. At time $T$, we have $Y_T = (X \cdot s)_T$. This is the value of the same s.f.t.s. in the fictitious economy. We use this insight below.

Lemma 4. Given Assumption 1, let $X$ be an $\alpha$-admissible trading strategy which is continuous and of FV in the fictitious economy. Then there exists a sequence of $(\alpha + \epsilon_n)$-admissible trading strategies in the illiquid economy, $(H^n, Y^n, \tau^n)_{n\ge1}$, of FV and continuous on $[0, \tau^n)$, such that $Y^n_{\tau^n}$ tends to $(X \cdot S)_T$ in probability, and $\epsilon_n \to 0$.
Proof. Let $T_n = T - \frac{1}{n}$. Define

$$f_n(t) = 1_{[T_n \le t \le T_{n+1}]}\, \frac{X_{T_n}\,(t - T_{n+1})}{T_n - T_{n+1}} \qquad (64.9)$$

so that $f_n(T_n) = X_{T_n}$ and $f_n(T_{n+1}) = 0$. Note that $f_n(t) \to 0$, a.s., for all $t$. Define

$$X_t^n = X_t\, 1_{[0,T_n]}(t) + f_n(t)\, 1_{(T_n,T]}(t). \qquad (64.10)$$

By this definition, $X^n$ is continuous and of FV. Note that $T$ is a fixed time and not a stopping time, so $X^n$ is predictable. Moreover,

$$(X^n \cdot S)_t = (X \cdot S)_{t \wedge T_n} + \int_0^t f_n(s)\, dS(s,0). \qquad (64.11)$$

Notice that $|f_n(\omega)| \le \sup_t |X_t(\omega)| \equiv K(\omega) \in \mathbb{R}$ since $X$ is continuous on $[0,T]$. Thus, $f_n$ is bounded by an $S(\cdot,0)$-integrable function. Therefore, by the dominated convergence theorem for stochastic integrals [see Protter (2004), p. 145], $\int f_n(s)\, dS(s,0)$ tends to 0 in u.c.p. on the compact time interval $[0,T]$, and therefore $X^n \cdot S \to X \cdot S$ in u.c.p. on $[0,T]$.¹⁴

[Footnote 14] One can also show this using integration by parts.

Now, let $(\epsilon_n)_{n\ge1}$ be a sequence of positive real numbers converging to 0 such that $\sum_n \epsilon_n < \infty$. Define $\tau^n = \inf\{t > 0 : (X^n \cdot S)_t < -\alpha - \epsilon_n\} \wedge T$. $\tau^n$ is a predictable stopping time by the continuity of $S(\cdot,x)$. Due to the u.c.p. convergence of $X^n \cdot S$ to $X \cdot S$, passing to a subsequence if necessary, we have

$$P\left( \sup_{0 \le t \le T} \left| (X^n \cdot S)_t - (X \cdot S)_t \right| \ge \epsilon_n \right) \le \epsilon_n. \qquad (64.12)$$

Notice that $P(\tau^n < T) \le \epsilon_n$, i.e., $\tau^n \to T$ in probability. Moreover, $\tau^n \ge T_n$ because $X^n = X$ up to time $T_n$. Choose $H^n = X^n 1_{[0,\tau^n)}$. Consider the sequence of trading strategies $(H^n, \tau^n)_{n\ge1}$. Note that $(H^n \cdot S)_t \ge -\alpha - \epsilon_n$ for all $t \in [0,\tau^n]$ since $H^n_{\tau^n} = 0$ for all $n$. Therefore, $(H^n, \tau^n)_{n\ge1}$ is a sequence of $(\alpha + \epsilon_n)$-admissible trading strategies. The value of the portfolio at liquidation for each trading strategy is given by

$$Y^n_{\tau^n} = -X^n(\tau^n)\, \left[ S(\tau^n, -X^n(\tau^n)) - S(\tau^n, 0) \right] + (X^n \cdot S)_{\tau^n} \qquad (64.13)$$

since $H^n$ is of FV and jumps only at $\tau^n$ for each $n$, by the continuity of $X^n$. Therefore, it remains to show $X^n(\tau^n) \to 0$ in probability, since this, together with $\tau^n \to T$ in probability, will prove the theorem. Indeed, $\sum_n P(\tau^n < T) \le \sum_n \epsilon_n < \infty$. Therefore, by the first Borel–Cantelli lemma, $P[\tau^n < T \text{ i.o.}] = 0$, which implies $X^n(\tau^n) = X^n(T) = 0$, with probability 1, for all but at most finitely many $n$. □

Lemma 5. Suppose there is no arbitrage opportunity in the fictitious economy. Given Assumption 1, there is NFLVR in the fictitious economy if and only if there is NFLVR in the illiquid economy.

Proof. Suppose there is NFLVR in the fictitious economy. Since, given any s.f.t.s. $(X,Y,\tau)$ in the illiquid economy, $Y_\tau \le (X \cdot S)_\tau$, it follows that there is NFLVR in the illiquid economy. Conversely, suppose there is FLVR in the fictitious economy. In view of Corollary 2, there is a sequence $(X^n)_{n\ge1}$, with each $X^n$ continuous, of FV, and an $\epsilon_n$-admissible trading strategy, such that $(X^n \cdot S)_T \to f_0$ in probability, where $f_0$ is as before and $\epsilon_n \to 0$. However, by the previous lemma, there exists a sequence of $\alpha_n$-admissible trading strategies, $(H^n, Y^n, \tau^n)_{n\ge1}$, where $\alpha_n \to 0$, in the illiquid economy such that $Y^n_{\tau^n} \to f_0$ in probability, which gives an FLVR in the illiquid economy. □

Theorem 7 (First Fundamental Theorem). Suppose there is no arbitrage opportunity in the fictitious economy. Given Assumption 1, there is no free lunch with vanishing risk (NFLVR) in the illiquid economy if and only if there exists a measure $Q \sim P$ such that $S(\cdot,0)$ is a $Q$-local martingale.

Proof. By the previous lemma, NFLVR in the illiquid economy is equivalent to NFLVR in the fictitious economy, which is equivalent to the existence of a martingale measure by Theorem 6. □

Construction of the Self-Financing Condition for a Class of Trading Strategies
The purpose of this section is to provide justification for Definitions 2 and 11 in the text. This proof uses only the weaker hypotheses of Assumption 2.

Let $t$ be a fixed time and let $\pi_n$ be a sequence of random partitions of $[0,t]$ tending to identity in the following form:

$$\pi_n : 0 = T_0^n \le T_1^n \le \cdots \le T_{k_n}^n = t,$$

where the $T_k^n$ are stopping times. For successive trading times $t_1$ and $t_2$, the self-financing condition can be written as

$$Y_{t_2} - Y_{t_1} = -(X_{t_2} - X_{t_1})\, S(t_2, X_{t_2} - X_{t_1}).$$

Note that $Y_t = Y_0 + \sum_{k \ge 1} \left( Y_{T_k^n} - Y_{T_{k-1}^n} \right)$ for all $n$. Therefore, we'll define $Y_t$ to be the following limit whenever it exists:

$$Y_t \equiv Y_0 - \lim_{n \to \infty} \sum_{k \ge 1} \left( X_{T_k^n} - X_{T_{k-1}^n} \right) S\!\left( T_k^n,\, X_{T_k^n} - X_{T_{k-1}^n} \right). \qquad (64.14)$$
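The limit (64.14) can be checked numerically for simple strategies. The sketch below is our own sanity check, with assumed parameters, for a piecewise-constant (finite variation) strategy: the discrete self-financing sums reduce to the jump terms, and they match the closed form derived in Theorem 8 below with $[X,X]^c = 0$.

```python
import numpy as np

# Sanity check (ours) of (64.14) for a piecewise-constant strategy X:
# the discrete sums equal Y_t = -X_t S(t,0) + int X_{u-} dS(u,0)
#                              - sum_u dX_u [S(u, dX_u) - S(u,0)],
# i.e. Theorem 8 with no continuous quadratic variation. S(t,x)=exp(a*x)S(t,0).

rng = np.random.default_rng(1)
a, T, n = 1e-3, 1.0, 200_000
S0 = 100.0 + np.cumsum(rng.normal(0.0, 0.01, n + 1))   # marginal price path

def S(i, x):                     # quoted price at grid index i for order size x
    return np.exp(a * x) * S0[i]

# buy 10 shares at t = 0.25T, sell them all at t = 0.75T
jumps = {n // 4: 10.0, 3 * n // 4: -10.0}
X = np.zeros(n + 1)
for i, dx in jumps.items():
    X[i:] += dx

# right-hand side of (64.14): discrete self-financing sums over the partition
Y_disc = 0.0
for k in range(1, n + 1):
    dX = X[k] - X[k - 1]
    if dX != 0.0:
        Y_disc -= dX * S(k, dX)

# closed form: -X_T S(T,0) + int X_{u-} dS - liquidity costs at the jumps
Y_closed = -X[-1] * S0[-1] + np.sum(X[:-1] * np.diff(S0))
for i, dx in jumps.items():
    Y_closed -= dx * (S(i, dx) - S0[i])

print(Y_disc, Y_closed)   # the two values agree for this finite variation X
```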
Example 2. In the classical case, $S(t,x) = S(t,0)$ for all $x \in \mathbb{R}$. Thus, the self-financing condition becomes

$$Y_{t_2} - Y_{t_1} = -(X_{t_2} - X_{t_1})\, S(t_2, 0),$$

and initial trades must satisfy $Y(0) = -X(0)\, S(0,0)$ instead. Therefore,

$$\begin{aligned}
Y_t &= Y_0 - \lim_{n\to\infty} \sum_{k\ge1} \left( X_{T_k^n} - X_{T_{k-1}^n} \right) S(T_k^n, 0) \\
&= Y_0 - \lim_{n\to\infty} \left[ \sum_{k\ge1} X_{T_k^n} S(T_k^n, 0) - \sum_{k\ge1} X_{T_{k-1}^n} S(T_{k-1}^n, 0) - \sum_{k\ge1} X_{T_{k-1}^n} \left( S(T_k^n, 0) - S(T_{k-1}^n, 0) \right) \right] \\
&= Y_0 - X_t S(t,0) + X_0 S(0,0) + \lim_{n\to\infty} \sum_{k\ge1} X_{T_{k-1}^n} \left( S(T_k^n, 0) - S(T_{k-1}^n, 0) \right) \\
&= -X_t S(t,0) + \int_0^t X_{u-}\, dS(u,0).
\end{aligned}$$

Notice that the limit agrees with the value of $Y(t)$ in the classical case. So, we have a framework that contains the existing theory.

Theorem 8. For $X$ càdlàg and of finite quadratic variation (QV), the value in the money market account is given by

$$Y_t = -X_t S(t,0) + \int_0^t X_{u-}\, dS(u,0) - \sum_{0 \le u \le t} \left[ S(u, \Delta X_u) - S(u,0) \right] \Delta X_u - \int_0^t S_x^{(1)}(u,0)\, d[X,X]_u^c \qquad (64.15)$$

where $S_x^{(n)}$ is the $n$th partial derivative of $S$ with respect to $x$.

Proof. Expression (64.14) is

$$Y_t = Y_0 - \lim_{n\to\infty} \sum_{k\ge1} \left( X_{T_k^n} - X_{T_{k-1}^n} \right) \left[ S\!\left( T_k^n, X_{T_k^n} - X_{T_{k-1}^n} \right) - S(T_k^n, 0) \right] - \lim_{n\to\infty} \sum_{k\ge1} \left( X_{T_k^n} - X_{T_{k-1}^n} \right) S(T_k^n, 0).$$

We know from Example 2 that the last sum converges to $X_0 S(0,0) - X_t S(t,0) + \int_0^t X_{u-}\, dS(u,0)$. Let $A = A(\epsilon, t)$ be a set of jumps of $X$ that has a.s. a finite number of times $s$, and let $B = B(\epsilon, t)$ be such that $\sum_{s \in B} (\Delta X_s)^2 \le \epsilon^2$, where $A$ and $B$ are disjoint and $A \cup B$ exhaust the jumps of $X$ on $(0,t]$; see the proof of Itô's formula in Protter (2004). Thus,

$$\lim_{n\to\infty} \sum_{k\ge1} \left( X_{T_k^n} - X_{T_{k-1}^n} \right) \left[ S\!\left( T_k^n, X_{T_k^n} - X_{T_{k-1}^n} \right) - S(T_k^n, 0) \right] = \lim_{n\to\infty} \left( \sum_{k,A} + \sum_{k,B} \right) \left( X_{T_k^n} - X_{T_{k-1}^n} \right) \left[ S\!\left( T_k^n, X_{T_k^n} - X_{T_{k-1}^n} \right) - S(T_k^n, 0) \right],$$

where $\sum_{k,A}$ denotes $\sum_{k\ge1} 1_{\{A \cap (T_{k-1}^n, T_k^n] \ne \emptyset\}}$ and $\sum_{k,B}$ denotes $\sum_{k\ge1} 1_{\{A \cap (T_{k-1}^n, T_k^n] = \emptyset\}}$. Since $A$ has only finitely many elements, $\omega$ by $\omega$, the first limit equals

$$\sum_{u \in A} \left[ S(u, \Delta X_u) - S(u,0) \right] \Delta X_u. \qquad (64.16)$$

Applying Taylor's formula to each $S(T_k^n, \cdot)$, the second limit becomes
$$\lim_{n\to\infty} \sum_{k,B} S_x^{(1)}(T_k^n, 0) \left( X_{T_k^n} - X_{T_{k-1}^n} \right)^2 + \lim_{n\to\infty} \sum_{k,B} \left( X_{T_k^n} - X_{T_{k-1}^n} \right) R\!\left( T_k^n, \left| X_{T_k^n} - X_{T_{k-1}^n} \right| \right), \qquad (64.17)$$

where $R$ is the remainder term in Taylor's formula. Writing the first term of Equation (64.17) as

$$\lim_{n\to\infty} \sum_{k\ge1} S_x^{(1)}(T_{k-1}^n, 0) \left( X_{T_k^n} - X_{T_{k-1}^n} \right)^2 + \lim_{n\to\infty} \sum_{k\ge1} \left( S_x^{(1)}(T_k^n, 0) - S_x^{(1)}(T_{k-1}^n, 0) \right) \left( X_{T_k^n} - X_{T_{k-1}^n} \right)^2 - \lim_{n\to\infty} \sum_{k,A} S_x^{(1)}(T_k^n, 0) \left( X_{T_k^n} - X_{T_{k-1}^n} \right)^2,$$

the sum of these first three limits converges to¹⁵

$$\int_0^t S_x^{(1)}(u-,0)\, d[X,X]_u + \left[ S_x^{(1)}(\cdot,0),\, [X,X] \right]_t - \sum_{u \in A} S_x^{(1)}(u,0)\, (\Delta X_u)^2. \qquad (64.18)$$

[Footnote 15] Note that the assumption that $S_x^{(1)}(\cdot,0)$ has a finite QV is not needed when $S_x^{(1)}(\cdot,0)$ is continuous. In this case, the second limit is zero. This follows from the fact that $X$ has a finite QV and $S_x^{(1)}(\cdot,0)$ is uniformly continuous, $\omega$ by $\omega$, over the compact domain $[0,T]$.

Now we will show that, as $\epsilon$ tends to 0, the last term in Equation (64.17) vanishes. Assuming temporarily that $|S_x^{(2)}| \le K < \infty$ uniformly in $x$ and $t$,

$$\left| R\!\left( T_k^n, \left| X_{T_k^n} - X_{T_{k-1}^n} \right| \right) \right| \le \sup_{0 \le |x| \le \left| X_{T_k^n} - X_{T_{k-1}^n} \right|} \left| S_x^{(1)}(T_k^n, x) - S_x^{(1)}(T_k^n, 0) \right| \left| X_{T_k^n} - X_{T_{k-1}^n} \right| \le \sup_{0 \le |y| \le |x| \le \left| X_{T_k^n} - X_{T_{k-1}^n} \right|} \left| S_x^{(2)}(T_k^n, y)\, x \right| \left| X_{T_k^n} - X_{T_{k-1}^n} \right| \le K \left( X_{T_k^n} - X_{T_{k-1}^n} \right)^2,$$

where the second inequality follows from the Mean Value Theorem. Therefore, the last sum in Equation (64.17) is less than or equal to, in absolute value,

$$K \lim_{n\to\infty} \sum_{k,B} \left| X_{T_k^n} - X_{T_{k-1}^n} \right|^3 \le K \lim_{n\to\infty} \sup_{k,B} \left| X_{T_k^n} - X_{T_{k-1}^n} \right| \sum_k \left| X_{T_k^n} - X_{T_{k-1}^n} \right|^2 \le K\,\epsilon\,[X,X]_t.$$

Note that $\epsilon$ can be made arbitrarily small and $X$ has a finite QV. Furthermore, since all summands are positive, as $\epsilon \to 0$, Equation (64.16) converges to

$$\sum_{0 \le u \le t} \left[ S(u, \Delta X_u) - S(u,0) \right] \Delta X_u,$$

and Equation (64.18) converges to

$$\int_0^t S_x^{(1)}(u-,0)\, d[X,X]_u + \left[ S_x^{(1)}(\cdot,0),\, [X,X] \right]_t - \sum_{0 \le u \le t} S_x^{(1)}(u,0)\, (\Delta X_u)^2 = \int_0^t S_x^{(1)}(u,0)\, d[X,X]_u - \sum_{0 \le u \le t} S_x^{(1)}(u,0)\, (\Delta X_u)^2 = \int_0^t S_x^{(1)}(u,0)\, d[X,X]_u^c.$$

Combining these limits yields Equation (64.15) when $S_x^{(2)}$ is uniformly bounded. For the general case, let $V_k^x = \inf\{t > 0 : S^{(2)}(t,x) > k\}$. Define $\tilde{S}(t,x) := S(t,x)\, 1_{[0,V_k^x)}$. Therefore, Equation (64.15) holds for $\tilde{S}$, for each $k$. Now, a standard argument using set unions, as in the proof of Itô's formula in Protter (2004), establishes Equation (64.15) for $S$. □
Approximating Stochastic Integrals with Continuous and of FV Integrands

Lemma 6 is well known and can be found in Protter (2004).

Lemma 6. Let $X$ be a special semimartingale with the canonical decomposition $X = N + A$, where $N$ is a local martingale and $A$ is predictable. Suppose $X$ has totally inaccessible jumps. Then $A$ is continuous.

We make the following assumption. [Note that this assumption is satisfied in all classical market models studied, since (for example) a Lévy process has only totally inaccessible jumps, and indeed, by a classic theorem of P. A. Meyer, all "reasonable" strong Markov processes have only totally inaccessible jumps.]

Assumption 3. $S(\cdot,0)$ has only totally inaccessible jumps.

We recall a few definitions.

Definition 12. Let $X$ be a special semimartingale with canonical decomposition $X = \bar{N} + \bar{A}$. The $\mathcal{H}^2$ norm of $X$ is defined to be

$$\|X\|_{\mathcal{H}^2} = \left\| [\bar{N}, \bar{N}]_\infty^{1/2} \right\|_{L^2} + \left\| \int_0^\infty |d\bar{A}_u| \right\|_{L^2}.$$
The space $\mathcal{H}^2$ of semimartingales consists of all special semimartingales with finite $\mathcal{H}^2$ norm.

Definition 13. The predictable $\sigma$-algebra $\mathcal{P}$ on $\mathbb{R}_+ \times \Omega$ is the smallest $\sigma$-algebra making all processes in $\mathbb{L}$ measurable, where $\mathbb{L}$ is the set of processes that have paths that are left continuous with right limits. We let $b\mathcal{P}$ denote the bounded processes that are $\mathcal{P}$-measurable.

Definition 14. Let $X \in \mathcal{H}^2$ with $X = \bar{N} + \bar{A}$ its canonical decomposition, and let $H, J \in b\mathcal{P}$. We define $d_X(H,J)$ by

$$d_X(H,J) \equiv \left\| \left( \int_0^T (H_u - J_u)^2\, d[\bar{N},\bar{N}]_u \right)^{1/2} \right\|_{L^2} + \left\| \int_0^T |H_u - J_u|\, |d\bar{A}_u| \right\|_{L^2}.$$

From here on, we suppose $s \in \mathcal{H}^2$ with the canonical decomposition $s = \bar{N} + \bar{A}$.

Theorem 9. Let $\epsilon > 0$. For any $H$ bounded, continuous, and of FV, there exists $H^\epsilon$, bounded, continuous, and of FV, with $H_T^\epsilon = 0$ such that $d_s(H, H^\epsilon) < \epsilon$.

Proof. Define

$$H_t^m = H_t\, 1_{[0,T_m]} + H_{T_m} \frac{T - t}{T - T_m}\, 1_{(T_m,T]},$$

where $T_m = T - \frac{1}{m}$. We'll first show $d_s(H, H 1_{[0,T_m]}) \to 0$ as $m \to \infty$.

To show $\left\| \left( \int_0^T \left( H_u(\omega) - H_u(\omega) 1_{[0,T_m]} \right)^2 d[\bar{N},\bar{N}]_u(\omega) \right)^{1/2} \right\|_{L^2} \to 0$, first observe that $[\bar{N},\bar{N}] = \langle \bar{N},\bar{N} \rangle + M$, where $\langle \bar{N},\bar{N} \rangle$ is the compensator, hence predictable, of $[\bar{N},\bar{N}]$ and $M$ is a local martingale. Since $M$ is a local martingale, there exists a sequence $(T_n)_{n\ge1}$ of stopping times increasing to $\infty$ such that $M^{T_n}$ is a martingale for each $n$. Thus, given a bounded $G$, $G \cdot M^{T_n}$ is a martingale, implying $E[(G \cdot M^{T_n})_t] = 0$ for all $t$. Moreover,

$$\left| (G \cdot M^{T_n})_t \right| \le |G| \cdot [\bar{N},\bar{N}]_t^{T_n} + |G| \cdot \langle \bar{N},\bar{N} \rangle_t^{T_n} \le |G| \cdot [\bar{N},\bar{N}]_t + |G| \cdot \langle \bar{N},\bar{N} \rangle_t, \qquad (64.19)$$

where the first inequality is the triangle inequality and the second follows from $[\bar{N},\bar{N}]$ and $\langle \bar{N},\bar{N} \rangle$ being increasing. Furthermore, $G 1_{[0,T_n]}$ converges to $G$; hence, by the Dominated Convergence Theorem for stochastic integrals, $G \cdot M^{T_n}$ converges to $G \cdot M$ in ucp. Moreover, by Equation (64.19), since $G$ is bounded and $[\bar{N},\bar{N}]$ and $\langle \bar{N},\bar{N} \rangle$ are integrable, $E[(G \cdot M^{T_n})_t]$ converges to $E[(G \cdot M)_t]$ by the ordinary Dominated Convergence Theorem. Therefore, $E[(G \cdot M)_t] = 0$ for all $t$. Hence, we have

$$E\left[ G \cdot [\bar{N},\bar{N}]_t \right] = E\left[ G \cdot \langle \bar{N},\bar{N} \rangle_t \right].$$

Jump times of $[\bar{N},\bar{N}]$ are those of $\bar{N}$, which are totally inaccessible as a corollary to the previous lemma. Therefore, by the same lemma, $\langle \bar{N},\bar{N} \rangle$ is continuous. Now,

$$\int_0^T \left( H_u(\omega) - H_u(\omega) 1_{[0,T_m]} \right)^2 d\langle \bar{N},\bar{N} \rangle_u(\omega) \le \int_0^T (H_u(\omega))^2\, d\langle \bar{N},\bar{N} \rangle_u(\omega) < \infty,$$

for all $m$, for almost all $\omega$. Thus, by Lebesgue's Dominated Convergence Theorem,

$$\int_0^T \left( H_u(\omega) - H_u(\omega) 1_{[0,T_m]} \right)^2 d\langle \bar{N},\bar{N} \rangle_u(\omega) \to 0, \quad \text{a.s.},$$

since $\langle \bar{N},\bar{N} \rangle$ is continuous. Moreover,

$$\left\| \left( \left( H - H 1_{[0,T_m]} \right)^2 \cdot \langle \bar{N},\bar{N} \rangle_T \right)^{1/2} \right\|_{L^2} \le 2 \left\| \left( H^2 \cdot \langle \bar{N},\bar{N} \rangle_T \right)^{1/2} \right\|_{L^2} < \infty,$$

since $H \cdot s \in \mathcal{H}^2$. A second application of the Dominated Convergence Theorem yields

$$\left\| \left( \int_0^T \left( H_u(\omega) - H_u(\omega) 1_{[0,T_m]} \right)^2 d\langle \bar{N},\bar{N} \rangle_u(\omega) \right)^{1/2} \right\|_{L^2} \to 0.$$

Since, for any bounded $|G|$, $E\left[ G \cdot [\bar{N},\bar{N}]_t \right] = E\left[ G \cdot \langle \bar{N},\bar{N} \rangle_t \right]$ for all $t$,

$$\left\| \left( \int_0^T \left( H_u(\omega) - H_u(\omega) 1_{[0,T_m]} \right)^2 d[\bar{N},\bar{N}]_u(\omega) \right)^{1/2} \right\|_{L^2} \to 0,$$

too. By the previous lemma, $\bar{A}$ is continuous as well, so $\left\| \int_0^T \left| H_u - H_u 1_{[0,T_m]} \right| |d\bar{A}_u| \right\|_{L^2} \to 0$ by a similar argument. Hence, $d_s(H, H 1_{[0,T_m]}) \to 0$ as $m \to \infty$.

It remains to show $d_s\!\left( H_{T_m} \frac{T - t}{T - T_m} 1_{(T_m,T]},\, 0 \right) \to 0$ as $m \to \infty$. First note that

$$\int_0^T H_{T_m}^2(\omega) \left( \frac{T - u}{T - T_m} \right)^2 1_{(T_m,T]}\, d\langle \bar{N},\bar{N} \rangle_u(\omega) \le \int_0^T K\, d\langle \bar{N},\bar{N} \rangle_u(\omega) < \infty,$$

where $K = \max_{0 \le t \le T} H_t^2(\omega) < \infty$ since $H$ is bounded. Thus, by the Dominated Convergence Theorem,
$$\int_0^T H_{T_m}^2(\omega) \left( \frac{T - u}{T - T_m} \right)^2 1_{(T_m,T]}\, d\langle \bar{N},\bar{N} \rangle_u(\omega) \to 0, \quad \text{a.s.}$$

Moreover, another application of the Dominated Convergence Theorem yields

$$\lim_{m\to\infty} E\left[ \int_0^T H_{T_m}^2 \left( \frac{T - u}{T - T_m} \right)^2 1_{(T_m,T]}\, d\langle \bar{N},\bar{N} \rangle_u \right] = 0.$$

A similar argument shows

$$\left\| \int_0^T \left| H_{T_m} \frac{T - u}{T - T_m} 1_{(T_m,T]} \right| |d\bar{A}_u| \right\|_{L^2} \to 0,$$

which completes the proof. □

Corollary 3. Let $\epsilon > 0$. For any $H$, bounded, continuous, and of FV, there exists $H^\epsilon$, bounded, continuous, and of FV, with $H_T^\epsilon = 0$ such that $\| H \cdot s - H^\epsilon \cdot s \|_{L^2} < \epsilon$.

Proof. This follows from a combination of Theorems 9 and 5 of Chapter IV in Protter (2004). □
A Numéraire Invariance Theorem

This proof only uses the weaker supply curve Assumption 2. Let $(S(\cdot,x)_{x\in\mathbb{R}}, B)$ represent an economy where $S(\cdot,x)$ is the stock price process per share for an order of size $x$ and $B$ is a strictly positive semimartingale representing the bond price. We define the discounted stock price process $\bar{S}(\cdot,x) = S(\cdot,x)/B$. We have shown that for a trading strategy $(X, \bar{Y}, \tau)$ in the economy $(\bar{S}(\cdot,x)_{x\in\mathbb{R}}, 1)$, the self-financing condition is

$$\bar{Y}_t = (X \cdot \bar{s})_t - X_t \bar{s}_t - \bar{L}_t^1 - \bar{L}_t^2, \quad \text{for } 0 \le t \le \tau, \qquad (64.20)$$

where $\bar{s} := \bar{S}(\cdot,0)$, $\bar{L}_t^1 = \sum_{0 \le u \le t} \Delta X_u \left[ \bar{S}(u, \Delta X_u) - \bar{S}(u,0) \right]$, and $\bar{L}_t^2 = \int_0^t \frac{\partial \bar{S}}{\partial x}(u,0)\, d[X,X]_u^c$. Using similar arguments, one can show that for a trading strategy $(X, Y, \tau)$ in the economy $(S(\cdot,x)_{x\in\mathbb{R}}, B)$, the self-financing condition is

$$B_t Y_t = (X \cdot s)_t - X_t s_t - L_t^1 - L_t^2 + (Y \cdot B)_t, \quad \text{for } 0 \le t \le \tau, \qquad (64.21)$$

where $s$, $L^1$, and $L^2$ are defined similarly. $L^1$ and $\bar{L}^1$ are submartingales, being increasing processes; thus their càdlàg versions exist. We'll use these càdlàg versions, and therefore they are semimartingales of finite variation (FV); thus $Y$ being càdlàg is justified. The proof below is similar to the theorem on numéraire invariance for the classical case, which can be found in Geman et al. (1995), or alternatively in Protter (2001) for the most general classical setting.

Theorem 10. Let $X$ be a predictable càdlàg process with finite QV and $\tau$ be a predictable stopping time. $(X, Y, \tau)$ is a s.f.t.s. in the economy $(S(\cdot,x)_{x\in\mathbb{R}}, B)$ if and only if it is a s.f.t.s. in $(\bar{S}(\cdot,x)_{x\in\mathbb{R}}, 1)$.

Proof. Given $X$ and $\tau$, $(X, Y, \tau)$ is a s.f.t.s. in $(S(\cdot,x)_{x\in\mathbb{R}}, B)$ if $Y$ satisfies Equation (64.21). Similarly, $(X, \bar{Y}, \tau)$ is a s.f.t.s. in $(\bar{S}(\cdot,x)_{x\in\mathbb{R}}, 1)$ if $\bar{Y}$ satisfies Equation (64.20). Moreover, $Y$ and $\bar{Y}$ are uniquely determined given $X$ and $\tau$. Therefore, it remains to show $Y = \bar{Y}$. Note that $L^1$ and $L^2$ are of FV. Thus,

$$\frac{1}{B} L^1 = L^1_- \cdot \frac{1}{B} + \sum_{0 \le u \le \cdot} \frac{1}{B_u}\, \Delta X_u \left[ S(u, \Delta X_u) - S(u,0) \right] = L^1_- \cdot \frac{1}{B} + \bar{L}^1$$

and, since $L^2$ is continuous,

$$\frac{1}{B} L^2 = L^2 \cdot \frac{1}{B} + \frac{1}{B} \frac{\partial S}{\partial x}(\cdot,0) \cdot [X,X]^c = L^2 \cdot \frac{1}{B} + \bar{L}^2.$$

Integration by parts gives analogous expressions for $\frac{1}{B}(X \cdot s)$, expressed through $X \cdot \bar{s}$, and for $\frac{1}{B}(YB)$, expressed through $Y$. Dividing both sides of expression (64.21) by $B$, using the above expressions, and rearranging the terms yields

$$Y = (X \cdot \bar{s}) - X \bar{s} - \bar{L}^1 - \bar{L}^2 = \bar{Y}. \qquad \square$$
Chapter 65
An Integrated Model of Debt Issuance, Refunding, and Maturity Manak C. Gupta and Alice C. Lee
Abstract We integrate previous work in this area and develop a multiperiod model that simultaneously determines bond refunding, bond issuance, maturity structure, cash holdings, and bank borrowing policies. The focus here is on providing the required debt funds in the most cost-efficient fashion. A strength of the model is that it allows time-varying interest costs, transaction costs, issuance costs, and refunding costs to be firm specific. The output of the model lays out the optimal financing decisions for each time interval that minimize the total discounted cost of providing the funds that match the requisite funds. By limiting the surplus funds available, the model minimizes the management incentive to overinvest and thereby reduces the agency costs. The model has economic implications for the firm's financing decisions: its default risk, growth opportunities, riskiness of cash flows, and firm size.

Keywords Debt financing · Refunding · Maturity · Simultaneous · Multiperiod
65.1 Introduction

Several authors have investigated the theoretical aspects of bond refunding, bond issuance, and maturity. Among them, Goswami et al. (1995), Mitchell (1991), and Barnea et al. (1980) examine terms of bond issuance; King and Mauer (2000) and Chiang and Narayanan (1991), bond refunding; and Emery (2001), Barclay and Smith (1995), Houston and Venkataraman (1994), Stohs and Mauer (1996), and Brick
This paper is reprinted from Review of Quantitative Finance and Accounting, 26(3), 2006, pp. 177–199.

M.C. Gupta (corresponding author)
Temple University, 205B Speakman Hall, Fox School of Business, Philadelphia, PA 19122, USA
e-mail: [email protected]

A.C. Lee
San Francisco State University, 1600 Holloway Avenue, San Francisco, CA 94025, USA
and Ravid (1991), debt maturity structure. Integrating these developments on the theoretical front has been a fundamental problem in finance. We attempt to accomplish this and extend the work of these and others who have made significant contributions in these areas by developing an integrated model that simultaneously, and in a multiperiod framework, takes into account all these choices.¹ It will hopefully help decision makers make better financing decisions with enhanced understanding.

The model presented here develops a unified framework that simultaneously determines the optimal debt refunding, debt issuance (long term and short term), and debt maturity structure policies in a multiperiod framework. Moreover, the model integrates these policies with other debt financing strategies that might be available to the firm, such as the alternatives of borrowing long term and investing temporarily surplus funds in marketable securities. Also, it takes into account differential interest costs, transaction costs, issuance costs, bond refunding costs, and scale-related considerations in cost structures. Also, the model examines the time variance in these cost elements and their uncertainty.

The objective of the model developed here is to derive what Lewis (1990) calls a state-preference sequence of financing decisions to be optimally undertaken by the firm in each period to meet its time-varying fund requirements at minimum discounted total cost. In the process, the model simultaneously determines the optimal debt issuance, debt refunding, bank borrowings, marketable securities investments, and debt maturity structure policies. The model is amenable to a variety of constraints on the firm's short- or long-term borrowing ability, whether imposed by management or by outside lending institutions. The model allows these constraints to vary or be fixed over time. The constraints may include the need to maintain a certain predetermined short- to long-term borrowing ratio. The desired level of this ratio may be influenced by the firm-specific volatility of cash flows (Heine and Harbus 2002), by growth

[Footnote 1] Carleton (1970) and Lee (1985) develop an optimal, linear, dynamic programming model for long-range financial planning that allows for a dynamic capital structure.
opportunities (see Lang et al. 1996), by the riskiness of cash flows (see Opler et al. 1999), or by the desired debt maturity structure, which itself may be influenced by term premiums (Brick and Ravid 1991). A strength of the model is that it allows the constraints and the costs to be firm specific, as these and other authors have shown they indeed are. These cost parameters and constraints vary, among other factors, according to default risk, financial distress costs, firm size, growth, and volatility of cash flows.

This paper makes the following contributions to the literature. First, it proposes a unified approach to the debt financing policy, extending and integrating the important contributions made by several authors in this area over a period of time. The model is developed in a multiperiod framework. Second, it takes into account the different costs associated with long-term financing, bank borrowings, call premium, bond refunding costs, and bond issuance costs. It allows these costs to vary over time. Third, the model allows these costs to be firm specific, as is likely to be the case in the real world, affected by growth opportunities, firm size, riskiness of cash flows, and default risk. Fourth, the model allows for several possible constraints that may be imposed by management and/or outside lending institutions, such as limits on short- or long-term debt or the ratio of short- to long-term financing, and so forth. Fifth, the model lays out a specific course of action for the management to follow in each time interval in the form of a sequence of specific state-contingent financing decisions that minimizes the discounted total cost of providing the requisite funds. Sixth, the model can be easily updated as more information becomes available or the firm's situation changes, and a new sequence of state-contingent financing decisions is derived.

The rest of the paper is organized as follows. In Sect. 65.2, we present the model and discuss various cost factors. In Sect. 65.3, we operationalize the model and discuss the information required and the output of the model. In Sect. 65.4, we present a numerical illustration, the methodology used, the results obtained, and the economic implications. Section 65.5 concludes the paper.
65.2 The Model

We start by assuming a deterministic process and then incorporate into the model uncertainty of information about future interest rates and financing requirements. In the deterministic situation, the problem can be stated as follows. Given a finite length of time and projected money requirements in different intervals of this time, determine an optimal policy to issue or call bonds in each interval that minimizes some cost function or the objective function. By applying bonus (effective
return on surplus funds invested in marketable securities) and penalty costs (effective costs of bank borrowings), the model determines the optimal sequence of short-term borrowing as well as the investment of temporarily surplus funds in short-term securities. The debt financing model is general, and the firm may or may not have any bonds outstanding with which to begin. Also, contrary to some other models, this model has no requirement that at the end of the planning horizon the company should be free of debt in the form of outstanding bonds. This assumption is important, because in real life the company need not be free of debt at any particular time. Other logical assumptions made in the model are that (1) if a decision is made to issue a bond in any time interval in the planning horizon, only a single bond is issued, and (2) any time a bond is called, the entire issue is called.
65.2.1 Cost Factors

The objective function is defined as the discounted total cost over the entire planning horizon. All the costs in this paper are firm specific and could vary from firm to firm, depending on their size, industry, and the strength of their balance sheets. More specifically, these factors could vary, among other things, according to uncertainty of cash flows, growth of the company, and default risk. We discuss this later in Sect. 65.3. These costs could also vary over time for the same firm. In general, the costs include:

1. Bond flotation cost for any new bond issued.
2. Bonus (return on investment of excess borrowings over funding requirements in marketable securities).
3. Penalty in case of not meeting the funding requirements in any interval.
4. Coupon paid on outstanding bonds.
5. Call premium for any bond called.

The cost of floating a bond can generally be represented as $\alpha + \beta x$, where $\alpha$ is a constant and $\beta$ is dollars per bond. This expression means that any time a bond is issued, a fixed expense ($\alpha$) is incurred, no matter how many bonds are issued. In addition, the cost is proportional to $x$, the number of bonds issued. The model is general enough that any other bond flotation cost expression can be incorporated.

The bonus rate ($b_i$) is the short-term interest rate in the market in interval $i$ that the firm can earn on its temporarily surplus funds, minus the transaction costs of investment. This is the effective rate the firm can earn on its surplus funds, such as through commercial paper or other marketable securities. We call it the bonus rate ($b_i$), and it is lower than the effective short-term borrowing rate that the firm has to
pay for borrowing from banks ($p_i$). The condition $b_i < p_i$ rules out the possibility that the firm will indulge in excessive short-term borrowing only to lend short term.

The penalty rate ($p_i$) in interval $i$ is charged when long-term borrowings fall short of the desired level. This is the effective rate the firm has to pay for short-term borrowing. It is assumed to be higher than the market rate ($r_i$), ruling out the possibility that the firm will maximize its short-term borrowing or indulge in short-term borrowing to lend long term. Short-term borrowing exposes a firm to the risk of temporary changes in interest rates and consequent fluctuations in earnings. Also, if interest rates should rise when the firm's earnings are declining, the firm might not be able to renew a loan at any interest rate, resulting in possible bankruptcy (see Gupta 1972). The acceleration of tax benefits, an uncertainty-induced capacity factor, and the volatility of assets favor longer debt maturity structures (see Brick and Ravid 1991; Wiggins 1990; Boyce and Kalotay 1979). These arguments further add to the real cost of short-term debt. Therefore, the nominal short-term interest rate might be low, but the effective rate ($p_i$) is usually high, based on the considerations listed above and also given the requirements of compensating balances, prior interest deductions for loans, the installment payment schedule, and so forth. This justifies setting $p_i \ge r_i$.

The coupon cost is the total coupon interest that must be paid on the outstanding bonds. The coupon rate $c$ (dollars/bond) varies from bond to bond, depending on the time the bond was issued, the original maturity of the bond, its callability, and whether the bond is issued at a discount or at a premium.

The call premium cost is the premium that must be paid for a bond that is called before maturity. The call premium rate on bond $j$, $e_j$ (dollars/bond), is a declining function of the age of the bond from the time of issue, the original term to maturity, and the original coupon rate. A precise analytical expression is not necessary. For a given bond, a call
premium schedule in the form of a table representing the age of the bond from the time it is issued and the corresponding premium rate would suffice.
65.2.2 Development of the Model

Let the planning horizon be divided into $N$ discrete intervals that are assumed to coincide with the conventional interest payments, although this assumption is not necessary for operation of the model. The time intervals are shown schematically in the diagram below.

[Timeline diagram: the planning horizon runs from $t_0$ to $t_N$, with interval $i$ spanning $(t_{i-1}, t_i]$.]
$t_0$ is the initial time under consideration. For consistency, the decision regarding the issue or call of a bond in any interval is made at the beginning of that interval. For example, any decision for interval 3 is made at time $t_2$. For each time interval $i$, there is a state vector $X_i$. This vector gives all the information about the different bonds outstanding in that interval. We can write $X_i$ as follows:

$$X_i = \left[ x_{1,i},\; x_{2,i},\; x_{3,i},\; \ldots,\; x_{N_0,i},\; x_{N_0+1,i},\; \ldots,\; x_{N',i} \right].$$

The dimension of the state vector $X_i$ is $N' = N_0 + N$. Each element $x_{j,i}$ of the state vector $X_i$ characterizes bond $j$ (number of bonds) outstanding in interval $i$. The first $N_0$ terms of the state vector $X_i$ are the $N_0$ bonds outstanding at time $t = t_0$. The last $N$ terms of the state vector $X_i$ represent the bonds issued in the $N$ time intervals, one in each interval; for example, $x_{N_0+1,i}$ corresponds to the bond issued in the first interval. In the first interval, if the bond is not issued, then $x_{N_0+1,i}$ is zero. Also, if this particular bond is called in an interval prior to interval $i$, then $x_{N_0+1,i}$ likewise becomes zero. From this, it should be clear that in general:
$$X_0 = \left[ x_{1,0},\; x_{2,0},\; x_{3,0},\; \ldots,\; x_{N_0,0},\; 0,\; 0,\; \ldots,\; 0 \right],$$
$$X_1 = \left[ x_{1,1},\; x_{2,1},\; x_{3,1},\; \ldots,\; x_{N_0,1},\; x_{N_0+1,1},\; 0,\; 0,\; \ldots,\; 0 \right],$$
$$X_i = \left[ x_{1,i},\; x_{2,i},\; \ldots,\; x_{N_0,i},\; x_{N_0+1,i},\; x_{N_0+2,i},\; \ldots,\; x_{N_0+i,i},\; 0,\; 0,\; \ldots,\; 0 \right],$$
$$X_N = \left[ x_{1,N},\; x_{2,N},\; \ldots,\; x_{N_0,N},\; x_{N_0+1,N},\; \ldots,\; x_{N_0+N,N} \right].$$
With this definition of $X_i$, the problem can be looked upon as the determination of $X_i$, $i = 1, \ldots, N$, which gives the optimal policy with respect to a particular objective, given
an initial state $X_0 = [x_{1,0}, x_{2,0}, x_{3,0}, \ldots, x_{N_0,0}, 0, 0, \ldots, 0]$. The first $N_0$ terms of $X_0$ are the bonds that are outstanding at time $t = t_0$.
The relationship between the state vectors $X_i$ and $X_{i-1}$ is as follows:

$x_{j,i} = x_{j,i-1}$ if bond $j$ is not issued, called, or matured in interval $i$;
$x_{j,i} = 0$ if the bond is called in interval $i$, or $j > N_0 + i$, or the bond matures or was never issued;
$x_{j,i} = x_{j,i}$ if $j = N_0 + i$ and a bond is issued in interval $i$ (in this case $x_{j,i-1} = 0$).

Let us now find a total cost function, or objective function, in terms of the elements of the state vectors $X_i$. Consider a time interval $i$. In this interval, the funds available through all the outstanding bonds are

$$B_i = V \sum_{j=1}^{N'} x_{j,i}, \qquad (65.1)$$
where $V$ = the face value of one bond = \$1,000. We define the cumulative fund requirement, $D_i$, in interval $i$, as follows. If the firm's total fund requirements are $R_i$, and if it hopes to provide $E_i$ from internal and external equity sources, then $A_i$ is the amount that has to be acquired from debt financing, where $A_i = R_i - E_i$. The cumulative fund requirement to be financed by borrowing is $D_i$, where

$$D_i = V \sum_{k=1}^{i} A_k. \qquad (65.2)$$
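As a concrete illustration of Equations (65.1) and (65.2), the sketch below computes $B_i$ from a state vector and accumulates the requirements $D_i$. It is a minimal sketch with names of our choosing; we take the amounts $A_k$ directly in dollars, so the face-value scaling of Eq. (65.2) is assumed to be absorbed into the inputs.

```python
# Minimal sketch of Eqs. (65.1)-(65.2); names and sample numbers are ours.
V = 1_000  # face value of one bond, in dollars

def bond_funds(state_vector):
    """B_i = V * sum_j x_{j,i}: dollars available from outstanding bonds."""
    return V * sum(state_vector)

def cumulative_requirements(A):
    """D_i = sum over k <= i of A_k, with A_k = R_k - E_k given in dollars."""
    D, running = [], 0.0
    for a in A:
        running += a
        D.append(running)
    return D

# Example: 80,000 bonds outstanding, no new issues yet.
assert bond_funds([80_000, 0, 0, 0]) == 80_000_000
assert cumulative_requirements([80e6, 20e6]) == [80e6, 100e6]
```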
As mentioned earlier, the optimality of capital structure is a matter of debate, but the model developed here would work with any capital structure that the management decides to maintain. This debate about the optimality of $D_i/E_i$ can be traced back to Modigliani and Miller (1958); its resolution is not the focus of this paper. The optimum debt-equity ratio is affected by managerial incentives (Douglas 2002); by Tobin's Q (Lang et al. 1996); by firm growth and equity ownership structure (McConnell and Servaes 1995; Fluck 1998); by entry-deterring debt capacity (McAndrews and Nakamura 1992); and by firm-specific behavior and possibly mean-reverting earnings (Raymar 1991). It is also influenced by taxation in a multiperiod framework (Lewis 1990); by deviations from the target debt/equity ratio (Hovakimian et al. 2001); by the liquidity needs resulting from sudden and severe volatility of earnings and the business effects of credit downgrades (Heine and Harbus 2002); and it is possibly tied to capital budgeting decisions, acquisitions, and investment opportunity sets (Graham and Harvey 2002; Barclay et al. 2003; Maksimovic et al. 1998). $D_i/E_i$ is also affected possibly by a multitude
of other factors such as the presence of high net present value projects, information asymmetries, access to outside capital markets, firm size, and managerial opportunism.² Among other factors influencing the decision on $D_i/E_i$ is the need to provide for unforeseen situations that could diminish companies' net worth and raise their debt-equity ratios. We have recently seen scores of companies from General Motors to AOL Time Warner confess that they are worth far less than their books show. General Motors, for example, had to slash equity by one-third because of underfunded pension liabilities (see Henry and Welch 2002). The need to write down acquisitions because of declining values and to mark down investments can bring off-balance-sheet activities (most often debt) onto company books, and revelations of shrinking shareholder equity or failure to follow GAAP rules can make matters worse. This could easily put companies in violation of their borrowing covenants, hurt their debt ratings, and increase their borrowing costs. All this could happen when earnings are shrinking or perceived to be shrinking, which could drive a company into bankruptcy, as in the cases of WorldCom and Enron. Debate about optimal capital structure continues unabated. However, once the desired capital structure is determined, a company has to address the important task of providing the required debt funds over time at minimum discounted total cost. This, then, is the focus of this model. As mentioned before, all costs are firm-specific, depending upon the firm's credit ratings, size, growth, volatility of cash flows, default risk, and so forth, and could vary over time as well. We will discuss these firm-specific cost factors in greater detail later on. To operationalize the costs associated with different sources of financing, we proceed as follows:

1. Bond Flotation Cost. The bond that could be issued in interval $i$ is represented by the term $x_{N_0+i,i}$ of the state vector $X_i$ according to our notation. Therefore, the bond flotation cost, $F_i$, in interval $i$ is given by

$$F_i = \alpha + \beta x_{N_0+i,i} \ \text{ if } x_{N_0+i,i} \neq 0; \qquad F_i = 0 \ \text{ if } x_{N_0+i,i} = 0, \text{ i.e., no bond is issued in interval } i. \qquad (65.3)$$

2. Bonus/Penalty Cost. A penalty cost at the rate $p_i$ is incurred in interval $i$ if the cumulative funds available from borrowings ($B_i$) fall short of the fund requirements ($D_i$). A bonus at the rate $b_i$ is obtained in interval $i$ if the funds available exceed the amount required to be financed, since these surplus funds can be invested in marketable
² Chang, Lee, and Lee (2005) use a structural equation model to investigate the determinants of capital structure.
securities. If $p_i > b_i$, the penalty or bonus in interval $i$ is given by

$$L_i = (D_i - B_i)\lambda_i, \quad \text{where } \lambda_i = \begin{cases} b_i & \text{if } B_i > D_i \\ p_i & \text{if } D_i > B_i \\ 0 & \text{if } D_i = B_i \end{cases} \qquad (65.4)$$

$L_i$ in interval $i$ represents a bonus if $B_i > D_i$ and a penalty if $D_i > B_i$. Note that when there is a bonus, there is no penalty, and vice versa.

3. Coupon Cost. A coupon has to be paid on all the outstanding bonds in interval $i$. Coupon rates differ from one bond to another, depending on the original term to maturity of the bond and the time the bond was issued. If $c_j$ is the coupon rate for bond $j$, the total coupon payments in interval $i$, $C_i$, are given by

$$C_i = \sum_{j=1}^{N'} c_j x_{j,i} \qquad (65.5)$$

4. Call Premium Cost. If any bond $j$ is called in interval $i$, the corresponding $x_{j,i}$ becomes zero. This means we do not pay a coupon on that bond any more, but we do pay the call premium for that bond. If the call premium rate for a bond $j$ in interval $i$ is $e_{j,i}$, the total call premium, $S_i$, in interval $i$ is

$$S_i = \sum_{j=1}^{N'} x_{j,i-1}\, e_{j,i} \quad \text{for all } x_{j,i} = 0 \text{ and } M_{j,i} \neq 0, \qquad (65.6)$$

where $M_{j,i}$ is the remaining time-to-maturity for the $j$-th bond outstanding at time $i$. The constraint $M_{j,i} \neq 0$ rules out the case when bonds mature at time $i$.

Given interval $i$, we can now write the total cost $T_i'$ in interval $i$ as follows:

$$T_i' = F_i + C_i + S_i + L_i = \delta_i(\alpha + \beta x_{N_0+i,i}) + \sum_{j=1}^{N'} c_j x_{j,i} + \sum_{j=1}^{N'} x_{j,i-1}\, e_{j,i}\, \varepsilon_{j,i} + (D_i - B_i)\lambda_i, \qquad (65.7)$$

where $\delta_i = 1$ if $x_{N_0+i,i} \neq 0$, and $\delta_i = 0$ otherwise; $\varepsilon_{j,i} = 1$ if $x_{j,i} = 0$, and $\varepsilon_{j,i} = 0$ otherwise.

If we define $d$ as the appropriate discount rate, the discounted total cost, $T_i$, will be

$$T_i = T_i'/(1 + d)^i. \qquad (65.8a)$$

If the discount rate, $d$, is non-stationary over time, then

$$T_i = T_i'\Big/\prod_{k=1}^{i}(1 + d_k). \qquad (65.8b)$$

Our objective in finding the optimal policy is to minimize the discounted cost over the planning horizon, where the total cost is the sum of all the $T_i$:

$$\text{Minimize } \sum_{i=1}^{N} T_i$$

Managers can put some logical constraints on the model in the optimization. One possible constraint would be to limit the total debt financing from bonds. Other constraints could take the form of a ratio of short- to long-term financing or limits on short-term borrowing in any time interval.
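The per-interval cost (65.7) and its discounting (65.8a) translate directly into code. The following is a hedged sketch with names of our choosing; for brevity it treats the call premium as a fixed per-bond rate and ignores the maturity exclusion $M_{j,i} \neq 0$ of Eq. (65.6).

```python
# Sketch of Eqs. (65.7)-(65.8a); simplified and illustrative only.
def interval_cost(x_now, x_prev, i, N0, alpha, beta, c, e, D_i, B_i, b_i, p_i):
    """T'_i = F_i + C_i + S_i + L_i for interval i (1-based)."""
    new_issue = x_now[N0 + i - 1]                       # element x_{N0+i,i}
    F = alpha + beta * new_issue if new_issue else 0.0  # flotation, Eq. (65.3)
    C = sum(cj * xj for cj, xj in zip(c, x_now))        # coupons, Eq. (65.5)
    S = sum(xp * ej for xp, ej, xn in zip(x_prev, e, x_now)
            if xn == 0 and xp > 0)                      # call premia, Eq. (65.6)
    gap = D_i - B_i                                     # shortfall or surplus
    lam = p_i if gap > 0 else (b_i if gap < 0 else 0.0) # rate switch, Eq. (65.4)
    return F + C + S + gap * lam

def discounted(T_prime, i, d=0.10):
    """T_i = T'_i / (1 + d)^i, Eq. (65.8a)."""
    return T_prime / (1.0 + d) ** i
```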
65.3 Operationalizing the Model

Before we demonstrate the operation of the model, we note the information required and how it can be obtained, and explain the likely output of the model. Then, we discuss optimization of the model and work towards the solution.

65.3.1 Information Required

The initial information required is as follows:
1. Initial state $X_0$ (i.e., the number of bonds outstanding at time $t = t_0$ and their maturities). This is obviously known to begin with.
2. Penalty rate (the effective cost of bank borrowing if the available funds fall short of the required funds).
3. Bonus rate (the rate that can be earned on temporarily surplus funds by investing in marketable securities) for each interval $i$.
4. Call premium schedule for the callable bonds.
5. Bond flotation cost parameters $\alpha$ and $\beta$. The parameters can be constant for all the intervals, or can be made a function of time or of other variables such as the number of bonds issued or the coupon rate.
6. Coupon rates for bonds issued in different intervals. Coupon rates are a function of the interest rate in that interval and the maturity of the bond at the time of issue. If $M_k$ is the original maturity of the bond issued in interval $k$, then

$$c_j = \text{function of } (M_k, r_k), \quad \text{where } k = j - N_0.$$

Yield curves can be used to determine $c_j(M_k, r_k)$; see Green (1993). Interest rate estimates in each time interval, however, need not be deterministic. Pye (1966) uses a Markov process to estimate the probability distribution of interest rates in each interval as a function of the interest rate in the previous interval. Thus, for each interval $i$, from the probability distribution given by the Markov process, we can estimate the expected value of the interest rate $\bar{r}_i$. Inclusion of the expected interest rate $\bar{r}_i$ in the model changes the objective function from minimizing the discounted cost to minimizing the expected discounted cost of borrowing over the planning horizon. Similarly, one can estimate the fund requirements $D_i$ using some kind of probability distribution. In this case, for each time interval, the expected value $\bar{D}_i$ of $D_i$ can be used as an information input in the model; if $D_i$ is only negligibly related to $r_i$, no further complications are introduced in the model. It has been argued that interest rates have historically varied within a certain range. When interest rates in interval $i$ are at the low extreme, there is a greater probability that future interest rates will be higher rather than lower; the converse is also true when interest rates in interval $i$ are at the high extreme. Behavior like this would suggest some kind of lognormal distribution for interest rates in interval $i$. In this case, the mean of interest rates, $e^{\mu_i + \frac{1}{2}\sigma^2}$, can be used for an estimate of $r_i$ in the model, where $\mu$ and $\sigma$ are the mean and the standard deviation of $\ln r_i$, respectively. Under these conditions, it is necessary that $b_i$ and $p_i$ have a multiplicative relationship with $r_i$, such as $b_i = \gamma r_i$ and $p_i = \eta r_i$, where $\gamma$ and $\eta$ are positive constants ($\gamma$ less than and $\eta$ greater than unity, respectively). If $D_i$ is not lognormally distributed, further complications would arise, but there is little theory to argue for or against a lognormal distribution of interest rates or funding requirements.

65.3.2 Outputs

Information obtained from the model, once the optimal policy minimizing the total discounted cost over the planning horizon is determined, includes:

1. The state vector $X_i$ ($i = 1, 2, \ldots, N$). The $X_i$ vector is the optimal bond portfolio for each time interval $i$ based on the overall optimization. In each time interval, the maturity characteristics of each bond outstanding in the firm's bond portfolio can be determined as follows. Let the maturity of bond $j$ in interval $i$ be $M_{j,i}$. Bond $j$ is issued in the interval $(j - N_0) = k$. Since $M_k$ is the maturity of the bond issued in interval $k$ at the time of issue, we can write

$$M_{j,i} = M_k - (i - k) \quad \text{for all } j > N_0; \qquad M_{j,i} = M_{j,0} - i \quad \text{for all } j \le N_0.$$

2. The total discounted cost of borrowing for the firm over the entire planning horizon if the firm were to follow the optimal policy.

3. The net bond financing ($Z_i$) to be optimally undertaken by the firm in each interval of time. This is given by the change in the vector $X_i$ over $X_{i-1}$. If the summation of the elements of vector $X_i$ is $\sum_j x_{j,i}$, then

$$Z_i = \sum_j x_{j,i} - \sum_j x_{j,i-1}.$$

In the optimal policy, it is possible for $Z_i$ to be negative, which indicates that in that time interval more bonds are redeemed than issued.

4. The cost-minimizing optimal short-term borrowing and lending to be undertaken by the firm in each interval of the planning horizon. A positive value of $(D_i - B_i)$ indicates the optimal short-term borrowing, and a negative value the optimal short-term investment to be undertaken by the firm in each interval.

5. For each time interval, the new bonds (amounts and maturities) to be issued in interval $i$; the bonds that are to be called; and the bonds that mature in that time interval. These are indicated by changes in the elements of the state vector $X_i$ from $X_{i-1}$.
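These outputs are mechanical once the optimal states are known. A small sketch (with names of our choosing) that recovers outputs 3 and 4 from a state-vector sequence and the requirements vector:

```python
# Net bond financing Z_i and short-term positions (D_i - B_i); illustrative.
def net_bond_financing(states):
    """Z_i = sum_j x_{j,i} - sum_j x_{j,i-1}, for consecutive state vectors."""
    totals = [sum(x) for x in states]
    return [after - before for before, after in zip(totals, totals[1:])]

def short_term_positions(D, B):
    """Positive entries: optimal short-term borrowing; negative: lending."""
    return [d - b for d, b in zip(D, B)]

states = [(80_000, 0, 0, 0), (80_000, 0, 0, 0), (80_000, 0, 35_000, 0)]
print(net_bond_financing(states))   # [0, 35000]
```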
65.3.3 Optimization

A simple illustration with only three time intervals demonstrates how the optimization can be carried out and what the output will be. The time scale with corresponding state vectors is shown below.
Assume for demonstration purposes the following:
1. Initially, at $t = t_0$, only one bond is outstanding (i.e., $N_0 = 1$). This bond cannot be called within the first two intervals, and it has coupon $c_1$.
2. Estimates of $D_i$, $r_i$, $p_i$, and $b_i$ (where $i = 1, \ldots, 3$) are available.
3. The coupon for the bond issued in interval $i$, $c_i$, is a known function of $r_i$ (the interest rate in interval $i$), the maturity of the bond, and its callability. Also, any time a bond is issued, it is of fixed maturity, $M_0$.
4. The call premium schedule for all the bonds is as follows:

| Year from the time of issue | Call premium |
|---|---|
| First | Cannot be called |
| Second | Cannot be called |
| Third | Known function of $(M_0 - 3)$ |
| Fourth | Known function of $(M_0 - 4)$ |

$N'$, the dimension of the state vector for this problem, is $N_0 + N = 1 + 3 = 4$. The state vectors in general are as follows:

$X_0 = [x_{1,0}, 0, 0, 0]$,
$X_1 = [x_{1,1}, x_{2,1}, 0, 0]$,
$X_2 = [x_{1,2}, x_{2,2}, x_{3,2}, 0]$,
$X_3 = [x_{1,3}, x_{2,3}, x_{3,3}, x_{4,3}]$,

where $x_{1,i}$ characterizes the bond outstanding at time $t = t_0$; $x_{2,i}$ characterizes the bond issued in the first interval; $x_{3,i}$ characterizes the bond issued in the second interval; and $x_{4,i}$ characterizes the bond issued in the third interval. Note that at this stage we do not know the characteristics of the $x_{j,i}$ in the state vectors. The only thing known is $x_{1,0}$ (i.e., the bond outstanding at time $t = 0$). All other values will depend on the decisions taken in each interval. Our objective is to determine all these values so as to obtain an optimal policy.

65.3.4 Solution

The decision tree constructed for this problem, starting from the first state and checking each possible alternative for passage from one state to another, is presented in Fig. 65.1. There are two possible ways to go from state $X_0$ to state $X_1$: (1) issue a bond in interval 1, or (2) do not issue a bond in interval 1. There is no option to call a bond in this time interval. At each node, there are again two possible ways to go from state $X_1$ to state $X_2$: (1) issue a bond in interval 2, or (2) do not issue a bond. Once again, there is no option to call any existing bonds at the end of interval 1. This results in four nodes at the end of interval 1/beginning of interval 2. To go from state $X_2$ to state $X_3$, there are now four possible paths at each node. Two possibilities arise here that did not arise in previous stages because now we have an option to call or not to call the first bond. And for each of these primary decisions there are two possibilities: one to issue a bond, and the other not to issue a bond in the third interval.

At each node of the decision tree figure, the value of the state vector is shown. The decision involved in going from one state to another can easily be interpreted. Now, in order to find an optimal policy, we have to find an objective function for each possible path between state $X_0$ and state $X_3$. There is an additional computation involved here because, for any path followed in the decision tree, the objective along that path depends not only on the parameters of the system (e.g., $r_i$, $p_i$, $b_i$, $D_i$) but also on the values of the state vector on that path. For example, consider path (1). The objective along that path will depend upon $x_{1,0}$ (which is given), $x_{2,1}$, $x_{3,2}$, and $x_{4,3}$. We have to determine the three variables that give the minimum values on that path, for which we use linear programming, once the objective is written down for that path. Having done this for all possible paths, we will be able to determine the optimal policy. The optimal path is the one that minimizes the objective function. The corresponding state vectors at the nodes provide the necessary decisions to be taken in each interval.

Optimization procedure for path (1). On path (1) of the Fig. 65.1 decision tree, the state vectors are

$X_0 = [x_{1,0}, 0, 0, 0]$,
$X_1 = [x_{1,1}, x_{2,1}, 0, 0]$,
$X_2 = [x_{1,2}, x_{2,2}, x_{3,2}, 0]$,
$X_3 = [0, x_{2,3}, x_{3,3}, x_{4,3}]$.

Each element $x_{j,i}$ of the state vector $X_i$ refers to the $j$-th bond outstanding in time interval $i$. For example, the elements $x_{1,0} = x_{1,1} = x_{1,2}$ of the state vectors $X_0$, $X_1$, and $X_2$ refer to the bond that was initially outstanding and remains outstanding in time intervals 1 and 2. In state vector $X_3$, this element becomes zero, indicating that this bond is called in that time interval. The amounts of funds from the bonds available in each time interval are given by:

$B_1 = x_{1,1} + x_{2,1}$,
$B_2 = x_{1,2} + x_{2,2} + x_{3,2}$,
$B_3 = x_{2,3} + x_{3,3} + x_{4,3}$.

The total cost in each interval after appropriate discounting is [using Equations (65.7) and (65.8)]

$T_1 = [\delta(\alpha + \beta x_{2,1}) + (D_1 - x_{1,1} - x_{2,1})\lambda_1 + c_1 x_{1,1} + c_2 x_{2,1}]/(1 + d_1)$,
$T_2 = [\delta(\alpha + \beta x_{3,2}) + (D_2 - x_{1,2} - x_{2,2} - x_{3,2})\lambda_2 + (c_1 x_{1,2} + c_2 x_{2,2} + c_3 x_{3,2})]/[(1 + d_1)(1 + d_2)]$,
$T_3 = [\delta(\alpha + \beta x_{4,3}) + (D_3 - x_{2,3} - x_{3,3} - x_{4,3})\lambda_3 + (c_2 x_{2,3} + c_3 x_{3,3} + c_4 x_{4,3}) + x_{1,2}\, e_{1,3}]/[(1 + d_1)(1 + d_2)(1 + d_3)]$.

Total discounted cost: $T = T_1 + T_2 + T_3$,
Fig. 65.1 Decision tree
where $\delta$, $\alpha$, $\beta$, $D_1$, $D_2$, $D_3$, $c_1$, $c_2$, $c_3$, $\lambda_1$, $\lambda_2$, $\lambda_3$, $d_1$, $d_2$, $d_3$, and $e_{1,3}$ are known constants. This makes the total cost $T$ a linear function of the variables $x_{2,1}$, $x_{3,2}$, and $x_{4,3}$ only. We can find the values that minimize $T$ subject to constraints on the variables as follows:

$x_{2,1} \ge 0$, $x_{3,2} \ge 0$, $x_{4,3} \ge 0$;
$x_{2,1}$, $x_{3,2}$, and $x_{4,3}$ are integers;
$\lambda_i = b_i$ if $B_i > D_i$; $\lambda_i = p_i$ if $D_i > B_i$.

Variable original maturity. Instead of issuing a bond of fixed maturity, we can determine the optimal maturity for each bond, provided we know how the coupon on that bond varies with the original term to maturity. The coupon rate, instead of being constant, is now a function of initial maturity. This information may be obtained from interest rate yield curves. If we have an analytical expression for the coupon rate as a function of initial maturity, we want to find, in minimizing the total cost $T$, the optimal maturity along with the $x_{j,i}$. In this case, the objective function along any path is a function not only of the $X_i$ but also of the original term to maturity for the bond issued in each time interval $i$. For example, along path (1), the objective function is a function not only of $x_{2,1}$, $x_{3,2}$, and $x_{4,3}$ but also of $M_1$, $M_2$, and $M_3$. All six parameters need to be established to provide the optimum along that path. This makes the problem non-linear, unlike the earlier case with a fixed and known original term to maturity, because now $c_{j,i}$, the coupon rate, is multiplied by $x_{j,i}$ in the expression for $T$, and the expression for $T$ becomes non-linear.

Another variation is that when a bond is issued, we have a choice of its original time to maturity, and these original terms are discrete and finite. For example, at any time we can issue a bond of either 10, 15, or 20 years of maturity. There is a coupon rate corresponding to each maturity. In this case, the problem of finding an optimum along each path of the main decision tree is in itself a process of optimization by dynamic programming. Objectives on each path of this sub-decision tree are then a linear function of the $x_{j,i}$.
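Because $T$ along a fixed path is linear in the integer issue sizes, a small instance can even be solved by direct enumeration rather than by linear programming. The sketch below minimizes the path-(1) objective over a coarse grid; every parameter value is an illustrative stand-in of our choosing, not data from the chapter.

```python
# Brute-force minimization of T on path (1); illustrative parameters only.
from itertools import product

V, X10 = 1_000, 80_000                      # face value; initial bonds
C = [0.062, 0.072, 0.065, 0.075]            # coupon rates c_1..c_4
B_R = [0.051, 0.062, 0.052]                 # bonus rates b_i
P_R = [0.074, 0.084, 0.089]                 # penalty rates p_i
D = [80e6, 100e6, 115e6]                    # cumulative requirements, dollars
ALPHA, BETA, E13, DISC = 50_000.0, 15.0, 20.0, 0.10

def lam(gap, i):
    """Penalty rate on a shortfall, bonus rate on a surplus (Eq. 65.4)."""
    return P_R[i] if gap > 0 else B_R[i]

def path1_cost(x21, x32, x43):
    B = [V * (X10 + x21), V * (X10 + x21 + x32), V * (x21 + x32 + x43)]
    gaps = [D[i] - B[i] for i in range(3)]
    F = [ALPHA + BETA * x if x else 0.0 for x in (x21, x32, x43)]
    coup = [V * (C[0] * X10 + C[1] * x21),
            V * (C[0] * X10 + C[1] * x21 + C[2] * x32),
            V * (C[1] * x21 + C[2] * x32 + C[3] * x43)]
    call = [0.0, 0.0, E13 * X10]            # first bond is called in interval 3
    return sum((F[i] + gaps[i] * lam(gaps[i], i) + coup[i] + call[i])
               / (1 + DISC) ** (i + 1) for i in range(3))

best = min(product(range(0, 40_001, 5_000), repeat=3),
           key=lambda x: path1_cost(*x))
```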
65.4 Numerical Illustration

A simple numerical example illustrates the operation of the model. All the cost parameters and constraints used in this example are for illustrative purposes only. These cost factors and constraints are firm-specific and are influenced by several economic factors: default risk, size of the firm, accessibility to financial markets, riskiness of cash flows, growth of the firm, and so forth, discussed later. The amount of funds required by a firm is the net result of changes in its working and fixed capital requirements. Changes in working capital requirements result from changes in liquidity, trade credit, and inventory requirements, and changes in fixed capital requirements result from changes in a firm's fixed assets, influenced to a great extent by its growth rate. To some extent, the fund requirements and their mix are also influenced by the type of industry in which the firm operates (see Gupta 1972). Insofar as no equity is issued and no earnings are retained, borrowing needs are greater. Let the expected required funds vector for our illustration be given as follows (in millions of dollars):

$$\bar{D} = [80.0, 100.0, 115.0, 128.0].$$
We assume that the fixed and the variable components of bond flotation costs are \$50,000 and \$15 per bond, respectively. The discount rate, $d$, is 10%, and the cost vectors (in percent) are as follows:

$c = [6.20, 7.20, 6.50, 7.50]$,
$\bar{p} = [7.40, 8.40, 7.50, 8.90]$,
$\bar{b} = [5.10, 6.20, 5.20, 6.60]$.

The dimension of the state vector, $N'$, for this problem is $N_0 + N = 1 + 3 = 4$. The state vectors in general are

$X_0 = (x_{1,0}, 0, 0, 0)$,
$X_1 = (x_{1,1}, x_{2,1}, 0, 0)$,
$X_2 = (x_{1,2}, x_{2,2}, x_{3,2}, 0)$,
$X_3 = (x_{1,3}, x_{2,3}, x_{3,3}, x_{4,3})$,
where $x_{1,i}$ characterizes the bond outstanding at time $t = t_0$ and still outstanding in interval $i$; $x_{2,i}$ characterizes the bond issued in the first interval and outstanding in interval $i$; $x_{3,i}$ characterizes the bond issued in the second interval and outstanding in interval $i$; and $x_{4,i}$ characterizes the bond issued in the third interval and outstanding in interval $i$. Note that at this stage we do not know the values of $x_{j,i}$ in the state vectors. That is, from the information given, the optimal financing policy is not evident. It is not obvious, for example, whether or when the firm should optimally borrow from a bank, float a new bond, or move into and out of marketable securities, giving systematic attention to penalty costs, differential costs of funds, and cost tradeoffs across time periods. The optimal decision depends on all the inter-period and intra-period relationships among costs, their lumpiness and proportionate cost structures, and the firm's borrowing requirements. Also, consideration of these factors has to be simultaneous rather than piecemeal in determining all the values in the state vectors.
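For reference, the example's inputs can be collected in one place (a convenience container with key names of our choosing; rates in percent per interval, funds in millions of dollars):

```python
# Inputs of the numerical illustration, gathered for convenience.
inputs = {
    "D_bar": [80.0, 100.0, 115.0, 128.0],  # expected required funds, $MM
    "c":     [6.20, 7.20, 6.50, 7.50],     # coupon rates, %
    "p_bar": [7.40, 8.40, 7.50, 8.90],     # penalty (bank borrowing) rates, %
    "b_bar": [5.10, 6.20, 5.20, 6.60],     # bonus (marketable sec.) rates, %
    "alpha": 50_000,                       # fixed flotation cost, $
    "beta":  15,                           # variable flotation cost, $/bond
    "d":     0.10,                         # discount rate
}
```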
65.4.1 Methodology

Our objective is to determine all the $x_{j,i}$ so as to derive an optimal policy. To determine optimal states for discrete systems such as this, there are two basic approaches. In the first approach, the optimal policy is determined by evaluating different possible paths along a decision tree for the given system (Fig. 65.1). Once a policy is found this way, it will automatically satisfy the principle of optimality. This approach is useful when the decision to be made in going from one state to another involves a finite number of possibilities. An example might be an equipment replacement decision.
In the second approach, the principle of optimality is used from the start. Optimization proceeds in a backward direction at each stage, satisfying the principle of optimality. The simple numerical example presented here lends itself to solution by either method, but the backward method is more powerful and versatile. Also, it is more suitable when the transition from one stage to another is governed by an analytical expression or when the decision involves numerous possibilities. A popular method for such optimization problems is the conjugate gradient method. Briefly stated, the general optimization algorithm is:³

1. Choose a vector $\bar{X}_0$ arbitrarily to serve as an initial guess for the vector $\bar{X}$ that optimizes the function.
2. Find $\bar{g}_0 = g(\bar{X}_0)$, the gradient vector of the objective function at $\bar{X}_0$. Let $\bar{p}_0 = -\bar{g}_0$ be the initial search direction.
3. Let $\bar{X}_{i+1}$ be the position of the minimum of the objective function $f(x)$ on the line through $\bar{X}_i$ in the direction $\bar{p}_i$.
4. If $\|\bar{X}_{i+1} - \bar{X}_i\|$ is less than $\varepsilon$, a small number, terminate; the optimum is at $\bar{X}_{i+1}$. If $\|\bar{X}_{i+1} - \bar{X}_i\| > \varepsilon$, go to the next step.
5. Find $\bar{g}_{i+1} = g(\bar{X}_{i+1})$, the gradient vector at $\bar{X}_{i+1}$. Calculate $\beta_i = g_{i+1}^{T} g_{i+1} / (g_i^{T} g_i)$, where $T$ denotes the transpose of the vector.
6. Now, $\bar{p}_{i+1} = -\bar{g}_{i+1} + \beta_i \bar{p}_i$ is the new search direction. Go to step (3) and repeat through step (6).

Note that step (3) is a simple one-dimensional optimization problem that can be solved by methods such as golden section or quadratic interpolation. This method is favored for its simplicity, its modest demands on storage, and its location of the minimum of any quadratic function of $n$ arguments in at most $n$ iterations (apart from rounding errors). Thus, the method is particularly suitable when some of the cost functions associated with different financing alternatives are non-linear. Bond issue costs, for example, may be non-linear in the number of bonds issued. In general, methods that calculate each new direction of search as part of the iteration cycle are inherently more powerful than methods that assign the directions in advance, in that the accumulated knowledge of the local behavior of the function can be taken into account. And, as we have said, such a method is better when there are many possibilities of passage from one state to another, as may be the case in the multiperiod financing policy we are trying to determine.
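A compact implementation of the algorithm above, with a golden-section line search for step (3), might look as follows; this is our own sketch of the textbook Fletcher-Reeves method, not code from the chapter, and the bounded step interval in the line search is a simplifying assumption.

```python
# Fletcher-Reeves conjugate gradient with golden-section line search; sketch.
import numpy as np

def golden(f, a=0.0, b=1.0, tol=1e-8):
    """One-dimensional minimizer of f on [a, b] by golden-section search."""
    g = (5 ** 0.5 - 1) / 2
    c, d = b - g * (b - a), a + g * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return (a + b) / 2

def conjugate_gradient(f, grad, x0, eps=1e-6, max_iter=200):
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    p = -g                                     # step (2): initial direction
    for _ in range(max_iter):
        step = golden(lambda t: f(x + t * p)) # step (3): line minimization
        x_new = x + step * p
        if np.linalg.norm(x_new - x) < eps:    # step (4): convergence test
            return x_new
        g_new = grad(x_new)
        beta = (g_new @ g_new) / (g @ g)       # step (5): Fletcher-Reeves
        p = -g_new + beta * p                  # step (6): new direction
        x, g = x_new, g_new
    return x
```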
³ The conjugate gradient method is an algorithm for finding the nearest local minimum of a function of $n$ variables; it presupposes that the gradient of the function can be computed (see Bulirsch and Stoer 1991). For a discussion of the computation of the conjugate gradient method on various processors, see Dongarra, Duff, Sorensen, and van der Vorst (1991), Demmel, Heath, and van der Vorst (1993), and Ortega (1988).
Table 65.1 Results of optimization: state vectors (optimum state vector along the path)

| Path No. | $X_1$ | $X_2$ | $X_3$ | Optimum total cost $T$ (\$) |
|---|---|---|---|---|
| 1 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (0, 0, 35,000, 0) | 22,822,990.23 |
| 2 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (0, 0, 35,000, 0) | 22,822,990.23 |
| 3 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (80,000, 0, 35,000, 0) | 18,795,942.90* |
| 4 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (80,000, 0, 35,000, 0) | 18,795,942.90* |
| 5 | (80,000, 20,000, 0, 0) | (80,000, 20,000, 0, 0) | (0, 20,000, 0, 0) | 23,063,110.44 |
| 6 | (80,000, 20,000, 0, 0) | (80,000, 20,000, 0, 0) | (0, 20,000, 0, 0) | 23,063,110.44 |
| 7 | (80,000, 20,000, 0, 0) | (80,000, 20,000, 0, 0) | (80,000, 20,000, 0, 0) | 19,036,063.11 |
| 8 | (80,000, 20,000, 0, 0) | (80,000, 20,000, 0, 0) | (80,000, 20,000, 0, 0) | 19,036,063.11 |
| 9 | (80,000, 0, 0, 0) | (80,000, 0, 0, 0) | (80,000, 0, 0, 0) | 19,241,096.92 |
| 10 | (80,000, 0, 0, 0) | (80,000, 0, 0, 0) | (80,000, 0, 0, 0) | 19,241,096.92 |
| 11 | (80,000, 0, 0, 0) | (80,000, 0, 0, 0) | (0, 0, 0, 0) | 23,268,144.25 |
| 12 | (80,000, 0, 0, 0) | (80,000, 0, 0, 0) | (0, 0, 0, 0) | 23,268,144.25 |
| 13 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (80,000, 0, 35,000, 0) | 18,795,942.90* |
| 14 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (80,000, 0, 35,000, 0) | 18,795,942.90* |
| 15 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (0, 0, 35,000, 0) | 22,822,990.23 |
| 16 | (80,000, 0, 0, 0) | (80,000, 0, 35,000, 0) | (0, 0, 35,000, 0) | 22,822,990.23 |

*Minimum discounted total financing cost $T$.
65.4.2 Results

Results for the numerical problem produced by a simple computer program implementing the algorithm are presented in Table 65.1. Optimization along some of the paths yields identical optimal state vectors, which are not shown. The last column of Table 65.1 shows that the minimum discounted total financing cost, $T$, is \$18,795,942.90 (path numbers 3 and 13). The corresponding optimal state vectors are as follows:

$X_1 = (80,000, 0, 0, 0)$,
$X_2 = (80,000, 0, 35,000, 0)$,
$X_3 = (80,000, 0, 35,000, 0)$.
65.4.3 Optimal Policy Decisions

The optimal state vectors listed above also determine the optimal policy for the firm, which is presented in Table 65.2. Table 65.2 delineates the short-term borrowing, the bonds issued, the bank interest cost, the coupon cost, and the total discounted financing costs for each time interval and for all the time intervals of the planning horizon corresponding to the optimal policy vectors. Table 65.2 lays out the sequence of optimal decisions to be undertaken by the firm over successive periods of time. These decisions are as follows:

1. In the first interval, the firm does not issue any new bonds. Whatever funds are required are financed by short-term borrowing in the sum of $20 million.
2. In the second interval, the firm issues 35,000 bonds ($35 million), enough to pay off the outstanding bank loan and provide for the increased funding requirement, which rises to $115 million from $100 million.
3. In the last interval, the firm meets its additional funding requirements by short-term borrowing and does not issue new bonds. Bank borrowing in this interval is $13 million.
65.4.4 Cost Information

The optimal state vectors also determine the sequence of financial costs incurred in each time interval. The optimal financing cost breakdown for each time interval for this company is also presented in Table 65.2.

1. In the first time interval, the firm pays $1,680,000 in short-term borrowing costs and $4,960,000 in coupon costs on long-term bonds. The total cost for this period is $6,640,000.
2. In the second time interval, there are no short-term borrowing costs because the short-term loans are optimally paid off with the proceeds of the new bond issued in this period, although the firm optimally pays a much higher coupon cost of $7,235,000. The firm also incurs bond flotation costs of $575,000. The total cost for this period is $7,810,000.
3. In the third interval, the firm again pays short-term borrowing costs of $1,157,000 and also coupon costs of $7,235,000, for a total cost of $8,392,000 in this period.
Table 65.2 Decision matrix for optimal policy (in thousands of dollars)

| Time interval | (1) Funds from borrowing | (2) Funds from bonds | (3) Interest on borrowing | (4) Coupon cost on bonds | (5) Bond flotation cost | (6) Total funds (1) + (2) | (7) Total cost (3) + (4) + (5) | (8) Total discounted cost (7)/(1 + d)^t |
|---|---|---|---|---|---|---|---|---|
| 1 | 20,000.00 | 80,000.00 | 1,680.00 | 4,960.00 | | 100,000.00 | 6,640.00 | 6,305.03 |
| 2 | | 115,000.00 | | 7,235.00 | 575.00 | 115,000.00 | 7,810.00 | 6,454.55 |
| 3 | 13,000.00 | 115,000.00 | 1,157.00 | 7,235.00 | | 128,000.00 | 8,392.00 | 6,305.03 |
| Total | | | | | | | | 18,795.94 |
4. The minimum total discounted cost of providing the required sequence of funds for this firm is $18,795,940.

In conclusion, it should be noted that without the simultaneous consideration of all the financing alternatives, the sequence of financing operations would be different. Also, the total cost of providing the firm with this time vector of funds would be much higher than $18,795,940. That is, any deviation from the sequence of financing actions derived from this model will result in a higher cost of providing the funds for this firm. Thus, a simultaneous model such as this becomes increasingly important as the size of firms keeps increasing and their financing operations become increasingly complex. It is conceivable, for example, that the size of the firm will play a role in the shape of the financing vectors. The smaller firm is likely to have less access to bond markets and higher fixed financing costs; it is therefore likely to depend more on bank borrowings, to float larger bond issues, and to follow the in-and-out marketable securities investment strategy.
65.4.5 Moving Horizon

So far we have limited the problem to a finite horizon, with an objective function designed to minimize total costs for a finite period. Any decision taken at the beginning of an interval considers all the information available about subsequent intervals. Under the optimal policy, then, the decision taken in the first interval is truly optimal with respect to the information the firm has. If the finite horizon is long enough, lengthening the horizon by one interval or shortening it by one interval will not appreciably affect the decision to be taken in the first interval. So, for the first interval, we follow the policy determined by the finite horizon. Since a firm is an ongoing entity and does not go out of business at the end of the planning horizon, managers would like to update the optimal policy as new information arrives. This new information could concern changes in macroeconomic factors such as the yield curve or in microeconomic
factors such as a change in the firm's default risk or growth opportunities. With the arrival of this new information, the cost parameters and/or constraints could change. Also, the fund requirements could change. However, the financing actions taken on the basis of the information then available are truly optimal. Updating of the optimal policy can proceed as follows:

1. Determine the optimal policy for the finite horizon.
2. Follow this policy for a few intervals.
3. As information arrives that was not available when the optimal policy was originally determined, reformulate the optimal policy using this new information and starting at the current state.
4. Repeat the procedure of step 1.
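Schematically, this procedure is a rolling loop; in the sketch below the two callbacks and the policy object's `apply` method are hypothetical placeholders for the optimization and information-gathering steps described above.

```python
# Schematic rolling-horizon update loop; all names are placeholders.
def moving_horizon(state, horizon, n_periods, solve_finite_horizon,
                   observe_new_information, follow_for=1):
    for t in range(0, n_periods, follow_for):
        params = observe_new_information(t)                     # step 3
        policy = solve_finite_horizon(state, horizon, params)   # steps 1 and 4
        state = policy.apply(state, periods=follow_for)         # step 2
    return state
```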
65.4.6 Economic Implications and Future Research

The cost parameters and constraints used in the numerical example are for illustrative purposes only. In general, the cost parameters are firm-specific. These cost parameters vary, among other things, according to default risk, financial distress costs, riskiness of cash flows, and size of the firm. Ceteris paribus, a small firm is likely to have less clout with the investment banks and will probably face a higher bond financing cost, in terms of higher interest cost or higher bond flotation cost or both, than a larger firm. This could cause the small firm to issue bonds less frequently but in larger amounts and to put the surplus in marketable securities for future use. That is, the "transactions costs motive" is likely to produce a different state-contingent sequence of financing decisions that will be cost efficient for a smaller firm than for a larger firm. For firms of the same size, the firm with greater volatility of cash flows is likely to have a different set of cost parameters, and again a different sequence of financing decisions will be more cost efficient for it than for a firm with lower volatility of cash flows. However, these are theoretical constructs and need to be empirically tested.
The model allows a firm to impose a variety of constraints, such as the requirement that marketable securities not fall below a certain level. This could happen, as Lang et al. (1996) point out, if the firm has a high Tobin's Q and good growth opportunities. Also, as Baskin (1987) points out, this could happen for competitive reasons: the readily available cash from marketable securities could at times show competitors the firm's resolve to retaliate against encroachments and enable it to quickly preempt new opportunities. Also, as John (1993) argues, firms sometimes desire to hold greater amounts of cash (and cash equivalents like marketable securities) when they are subject to higher financial distress costs. In summary, not only are the cost parameters likely to be firm-specific, but the constraints that the management and outside lending institutions impose are likely to be firm-specific as well. These cost parameters and constraints depend on default risk, financial distress costs, volatility of cash flows (see Heine and Harbus 2002), growth opportunities of the firm (see Lang et al. 1996), and riskiness of cash flows (see Guedes and Opler 1996). With different cost parameters and different constraints, the cost-efficient financing vectors will differ for firms of different sizes, with different growth opportunities and/or different volatility of cash flows and various default risks. In essence, each firm is likely to face a unique set of cost parameters and constraints and, therefore, is likely to have a unique cost-efficient sequence of financing decisions. In addition, these parameters or constraints may change over time as a result of macroeconomic factors such as the state of the economy or the interest rate environment. Also, these cost parameters and constraints may change due to microeconomic factors such as a decrease or increase in default risk, the volatility of cash flows, the growth rate, and so forth. With such changes, the model will produce a different set of financing decisions that will be cost efficient for the firm. However, the financial decisions previously taken by the firm are optimal and fully reflect the information then available. The financing decisions are thus dynamic in nature, and future research is likely to validate these conclusions and shed light on the specific relationships among the optimal financing vectors and firm size, growth opportunities, default risk, financial distress costs, and the volatility and riskiness of cash flows.
65.5 Conclusions

In recent years, the growth in the size of corporations has accelerated the need for a simultaneous debt financing model. The model developed here provides, given any capital structure, the required funds from bond refunding, bond issuance,
bank borrowings, and marketable securities holdings, in each time interval and across all time intervals, in a cost-minimizing fashion. Without the simultaneous consideration of all the alternatives, the cost of providing these funds will be much higher. The model optimally matches the debt funds supplied with the debt funds required. This eliminates the acquisition of surplus funds that might result from an ad hoc issuance of long- and short-term debt financing or debt refunding. As a consequence, it constrains the management's opportunity to overinvest, thereby reducing agency costs. The model advances the state of the art in scheduling and sequencing the financing operations of the firm and makes an important contribution over the extant literature in this area. This model integrates the important works of several authors, including the work on debt refunding by Mauer (1993), Ingersoll (1977), Chiang and Narayanan (1991), and Yawitz and Anderson (1977), and the work on bond issuance by Goswami et al. (1995), Mitchell (1991), and Barnea et al. (1980). The model determines the optimal bond refunding, bond issuance, bank borrowings, marketable securities holdings, and debt maturity structure policies. The model is multiperiod. A strength of the model is that it allows the cost structures to be firm-specific. These authors and others have shown that these cost parameters are indeed firm-specific, affected by default risk, growth opportunities, volatility of cash flows, and so forth. Also, the model allows these cost structures to vary over time, as they often do. The model is designed to allow for the various constraints that may be imposed by the management or by outside lending institutions. In reality, these constraints may also vary over time, and the model allows for this. The output from the model optimally meets the firm's time-varying and uncertain debt financing requirements, resulting from fluctuating levels of fixed and working capital needs, in a cost-efficient manner, linking what Mauer and Triantis (1994) call the financing and the investment decisions. The model is dynamic and is shown to be easily updated as the cost parameters or constraints change over time. The end result is that the model substitutes a coherent and well-thought-out policy for uncoordinated and ad hoc actions in the areas of bond refunding, bond issuance, marketable securities holdings, and bank borrowing decisions, and it yields a sequence of specific state-contingent actions to be optimally taken by the firm in each time interval. An important implication of the analysis here is that the optimal financing decisions are very firm-specific and dynamic in nature. This is so because the cost parameters and also the constraints imposed by management and/or outside institutions are very firm-specific. In addition, these cost parameters and constraints change over time, and hence the optimal financing decisions are dynamic in nature. Without the simultaneous consideration of all the financing alternatives
and in a multiperiod framework, the sequence of financing operations will be different and the cost of providing the funds will be much higher than that derived in this model. The growth in the size of firms and the increasing complexity of financing make it imperative for managers to use a simultaneous and multiperiod model that provides them a framework for cost-minimizing decision-making, taking into account the different alternatives, their different costs, and the different time patterns of these cost elements and fund requirements. This model is a modest attempt to provide the necessary framework and fill an important gap in the available literature. It is hoped that it will aid decision-making with enhanced understanding and added strategic value for corporate profitability. It is also hoped that it will be a precursor to many more attempts by researchers at integrating the financing operations of firms.

Acknowledgment The authors are grateful for comments by Andy Chen, C.F. Lee, and Michael Rebello. The financial support to computerize the model was provided by the Faculty Senate of Temple University.
References

Barclay, M. J., L. M. Marx, and C. W. Smith 2003. "The joint determination of leverage and maturity." Journal of Corporate Finance 9(2), 1–18.
Barclay, M. J. and C. W. Smith Jr. 1995. "The maturity structure of corporate debt." Journal of Finance 50, 609–631.
Barnea, A., R. A. Haugen, and L. W. Senbet 1980. "A rationale for debt maturity structure and call provisions in the agency theoretic framework." Journal of Finance 35, 1223–1234.
Baskin, J. 1987. "Corporate liquidity in games of monopoly power." Review of Economics and Statistics 69, 312–319.
Boyce, W. M. and A. J. Kalotay 1979. "Tax differentials and callable bonds." Journal of Finance 34, 825–838.
Brick, I. E. and S. A. Ravid 1991. "Interest rate uncertainty and optimal debt maturity structure." Journal of Financial and Quantitative Analysis 28(1), 63–81.
Bulirsch, R. and J. Stoer 1991. "The conjugate-gradient method of Hestenes and Stiefel," §8.7 in Introduction to numerical analysis, Springer-Verlag, New York.
Carleton, W. T. 1970. "An analytical model for long-range financial planning." Journal of Finance 25(2), 291–315.
Chang, C., A. C. Lee, and C. F. Lee 2005. Reexamining the determinants of capital structure using structural equation modeling approach, unpublished paper, National Chengchi University, Taiwan.
Chiang, R. C. and M. P. Narayanan 1991. "Bond refunding in efficient markets: a dynamic analysis with tax effects." Journal of Financial Research 14(4), 287–302.
Demmel, J., M. Heath, and H. van der Vorst 1993. "Parallel numerical linear algebra," in Acta numerica, Vol. 2, Cambridge University Press, Cambridge, England.
Dongarra, J., I. Duff, D. Sorensen, and H. van der Vorst 1991. Solving linear systems on vector and shared memory computers, SIAM, Philadelphia, PA.
Douglas, A. V. S. 2002. "Capital structure and the control of managerial incentives." Journal of Corporate Finance 8(4), 287–311.
Emery, W. G. 2001. "Cyclical demand and the choice of debt maturity." Journal of Business 74(4), 557–590.
Fluck, Z. 1998. "Optimal financial contracting: debt versus outside equity." Review of Financial Studies 11(2), 383–418.
Goswami, G., T. Noe, and M. Rebello 1995. "Debt financing under asymmetric information." Journal of Finance 50, 633–660.
Graham, J. and C. Harvey 2002. "How do CFOs make capital budgeting and capital structure decisions?" Journal of Applied Corporate Finance 15(1), 8–24.
Green, R. C. 1993. "A simple model of the taxable and tax-exempt yield curves." Review of Financial Studies 6(2), 233–264.
Guedes, J. and T. Opler 1996. "The determinants of the maturity of corporate debt issues." Journal of Finance 51(5), 1809–1833.
Gupta, M. C. 1972. "Differential effects of tight money: an economic rationale." Journal of Finance 27(3), 825–838.
Gupta, M. C. and R. Huefner 1972. "A cluster analysis study of financial ratios and industry characteristics." Journal of Accounting Research 10(1), 77–95.
Heine, R. and F. Harbus 2002. "Toward a more complete model of capital structure." Journal of Applied Corporate Finance 15(1), 31–46.
Henry, D. and D. Welch 2002. "$100 billion goes poof." Business Week 10, 131–133.
Houston, J. F. and S. Venkataraman 1994. "Optimal maturity structure with multiple debt claims." Journal of Financial and Quantitative Analysis 29, 179–201.
Hovakimian, A., T. Opler, and S. Titman 2001. "The debt-equity choice." Journal of Financial and Quantitative Analysis 36(1), 1–25.
Ingersoll, J. 1977. "An examination of corporate call policies on convertible securities." Journal of Finance 32(2), 463–478.
John, T. A. 1993. "Accounting measures of corporate liquidity, leverage and costs of financial distress." Financial Management 22, 91–100.
King, D. and D. C. Mauer 2000. "Corporate call policy for nonconvertible bonds." Journal of Business 73, 403–444.
Lang, L., E. Ofek, and R. M. Stulz 1996. "Leverage, investment, and firm growth." Journal of Financial Economics 40(1), 3–29.
Lee, C. F. 1985. Financial analysis and planning, Addison-Wesley Publishing Company, Reading, MA.
Lewis, C. M. 1990. "A multiperiod theory of corporate financial policy under taxation." Journal of Financial and Quantitative Analysis 25(1), 25–43.
Maksimovic, V., A. Stomper, and J. Zechner 1998. "Capital structure, information acquisition and investment decisions in industry equilibrium." European Finance Review 2, 251–271.
Mauer, D. C. 1993. "Optimal bond call policies under transactions costs." Journal of Financial Research 16(1), 23–37.
Mauer, D. C. and A. J. Triantis 1994. "Interactions of corporate financing and investment decisions: a dynamic framework." Journal of Finance 49, 1253–1277.
McAndrews, J. J. and L. I. Nakamura 1992. "Entry-deterring debt." Journal of Financial and Quantitative Analysis 24(1), 98–110.
McConnell, J. and H. Servaes 1995. "Equity ownership and the two faces of debt." Journal of Financial Economics 39(1), 37–151.
Mitchell, K. 1991. "The call, sinking fund, and term-to-maturity features of corporate bonds: an empirical investigation." Journal of Financial and Quantitative Analysis 26(2), 201–222.
Modigliani, F. and M. Miller 1958. "The cost of capital, corporation finance and the theory of investments." American Economic Review 48, 162–197.
Opler, T., L. Pinkowitz, R. Stulz, and R. Williamson 1999. "The determinants and implications of corporate cash holdings." Journal of Financial Economics 52(1), 3–46.
Ortega, J. M. 1988. Introduction to parallel and vector solution of linear systems, Plenum Press, New York.
Pye, G. 1966. "A Markov model of the term structure." Quarterly Journal of Economics 25, 60–72.
Raymar, S. 1991. "A model of capital structure when earnings are mean-reverting." Journal of Financial and Quantitative Analysis 26(3), 327–344.
Stohs, M. H. and D. C. Mauer 1996. "The determinants of corporate debt maturity structure." Journal of Business 69(3), 279–312.
Wiggins, J. 1990. "The relation between risk and optimal debt maturity and the value of leverage." Journal of Financial and Quantitative Analysis 25, 377–386.
Yawitz, J. and J. Anderson 1977. "The effect of bond refunding on shareholder wealth." Journal of Finance 32(5), 1738–1746.
Chapter 66
Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution Thomas S. Y. Ho and Sang Bin Lee
Abstract This paper describes a business model in a contingent claim modeling framework. The model defines a "primitive firm" as the underlying risky asset of a firm. The firm's revenue is generated from a fixed capital asset, and the firm incurs both fixed operating costs and variable costs. In this context, the shareholders hold a retention option (paying the fixed operating costs) on the core capital asset with a series of growth options on capital investments. In this framework of two interacting options, we derive the firm value. The paper then provides three applications of the business model. First, the paper determines the optimal capital budgeting decision in the presence of fixed operating costs, and shows how the fixed operating cost should be accounted for in an NPV calculation. Second, the paper determines the equity value, with the growth option and the retention option as the building blocks of the primitive firm value. Using a sample of firms, the paper illustrates a method that compares the equity values of firms in the same business sector. Third, the paper relates the change in revenue to the change in equity value, showing how the combined operating and financial leverage may affect the firm's valuation and risks.

Keywords Business model · Capital budgeting · Real option · Fixed operating cost · Capital structure arbitrage
T.S.Y. Ho, Thomas Ho Company, Ltd, 55 Liberty Street, 4 B, New York, NY 10005–1003, USA; e-mail: [email protected]
S.B. Lee, Hanyang University, 17 Haeng-Dang-Dong, Seong-Dong-Ku, Seoul, 133–791, Korea

66.1 Introduction

A "business model" often simply describes "ways that a firm makes money." It is a general description of the business environment, forecasts of earnings, and the proposed business strategies, and it often lacks the rigorous specification of a financial model. Despite the ambiguity, business models are important to corporate finance, investments, portfolio management, and many aspects of financial businesses. They provide a frame in which to determine a firm's value, evaluate corporate strategies, and distinguish one firm from another, or one business sector from another. The prevalent use of business models cannot be overstated. Yet, despite the tremendous growth in applications of financial modeling in capital markets, the use of financial principles in developing business models is largely unexplored. An early example of financial modeling of a business was pioneered by Stoll (1976). He presents a business model of a market maker. Demsetz (1968) suggests that market makers are in the business of providing liquidity to a market and that they are compensated by the market via their bid-ask spreads. Stoll then develops the optimal dealer's bid-ask prices within Demsetz's business environment and shows precisely how traders should set their bid-ask quotes and their trading strategies relating to their inventory positions, and finally derives the profits to the traders under competition. In short, Stoll provides the business model of a trading firm, leading to the subsequent growth of microstructure theory. The successful use of the "dealer's business model" in microstructure theory demonstrates the importance of the insights gained in modeling a business. For example, Ho and Marcis (1984) extend the business model to incorporate the fixed operating costs of a market-making firm to determine the equilibrium number of market makers in the AMEX market. However, to date, few rigorous business models have been proposed in the corporate finance literature. There are some examples. Trigeorgis (1993) values projects as multiple real options on the underlying asset value. Botteron et al. (2003) use barrier options to model the flexibility in production and sales of multinational enterprises under exchange rate uncertainties. Brennan and Schwartz (1985) determine the growth model of a mining firm. Cortazar et al. (2001) develop a real option model for valuing natural resource exploration investments such as oil and copper when there is joint price and geological-technical uncertainty. Gilroy and Lukas (2006) formalize the choice of market entry strategy for an individual multinational enterprise from a real option perspective. Fontes (2008) addresses investment decisions in
production systems by using real options. Sodal et al. (2008) value the option to switch between the dry bulk market and the wet bulk market for a combination carrier. Villani (2008) combines the real option approach with game theory to examine an interaction between two firms that invest in R&D. Wirl (2008) investigates optimal maintenance of equipment under uncertainty and the options of scrapping versus keeping the equipment as a backup while paying the keeping cost. These models explore the use of contingent claim models in various corporate financial decisions, ranging from abandoning or increasing mining capabilities to scrapping versus maintaining equipment. In practice, the real option approach serves three different corporate uses: as a strategic way of thinking, as an analytical valuation tool, and as an organization-wide process for evaluating, monitoring, and managing capital investment (Triantis and Borison 2001). This paper extends the real option literature to describe business models in a more general context. The purpose of the paper is twofold: first, we propose the use of a real option approach to describe a business, and second, we show how such a business model can be used in some applications. Our model is a discrete time, multiperiod, contingent claim model. We assume that a firm is subjected to a business risk. The revenues are generated from a capital asset. The firm has to incur a fixed operating cost, make a fixed payment continually (a perpetual payment) to stay in business, and has the options to invest in future projects. That is, the firm must pay an exercise price continually to retain the option to stay in business and at the same time maintain the growth options of Myers (1984). Such retention and growth options cannot be separated. The model is applicable to many business sectors, including retail chain companies, airlines, software companies, and other businesses whose revenues are generated from a core capital investment and whose expense structure consists of both fixed operating and variable costs. The inputs to the business model can be drawn from published financial statements and market data, and therefore the model is empirically testable. We then show how the business model can be used in three important areas in corporate finance: (1) to determine the optimal capital budgeting decision, given a fixed operating cost; (2) to derive the relative value of firms in the same business sector with different business models; and (3) to relate the change in the firm's revenue to the change in its equity value. The first application deals with what Myers (1984) describes as the two approaches to capital budgeting, the discounted cash flow and strategic planning approaches, as "two cultures and one problem" in the valuation of a firm. The bottom-up method (the discounted cash flow approach) determines the net present value of a project, and the manager
accepts the project if the net present value is positive, and rejects it otherwise. The top-down method (the strategic planning approach) considers all future state-dependent investments simultaneously and determines the optimal investment strategies that maximize the firm value. While the two approaches are related by the valuation of a firm, the corporate finance literature has not described explicitly how they are related to each other. Specifically, in the presence of fixed operating costs, how should the "expenses" of a project, at the margin, be incorporated in the NPV calculation? We show that the standard one-period model cannot describe the relationship of the project's cost to the project's future inflows and outflows, as well as to the firm's fixed operating costs. In this paper, we show how the NPV method is related to the top-down method via the implied fixed cost measure. Relating to this issue, McDonald (2006) argues that the discounted cash flow and the real option valuation should provide the same answer when the methods are used correctly. However, he further argues that to the extent that the managers who use the real option valuation have effectively adopted a different business model, there is a real and important difference between the discounted cash flow and the real option valuation. The second application of a business model deals with measuring the impact of the growth option, the debt, and the fixed operating cost on the observed equity value. These results are illustrated by applying the model to a sample of retail chain companies. The third application focuses on the relationship between the firm's revenues and the stock valuation. We show how the operating leverage and the financial leverage together affect the change in the equity value with a change in the revenue. The model provides a return attribution of the equity returns based on the firm's business model. This approach enables corporate managers to evaluate the impact of the firm's operating leverage and financial leverage on the risk of the firm's earnings. The paper proceeds as follows. Section 66.2 provides the business model of a firm. Section 66.3 provides the numerical simulations of the capital budgeting problem, comparing the optimal capital budgeting decisions based on the bottom-up and top-down approaches. Section 66.4 provides the application of the business model to the equity value decomposition. Section 66.5 describes return attribution results using the business model. Finally, Sect. 66.6 contains the conclusions.
66.2 The Model Assumptions Many retail chain stores must incur significant fixed operating costs in setting up the distribution system, producing or buying the products, and managing the core business
66 Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution
processes. At the same time, the retail chain store invests in new distribution centers, and each investment is a capital budgeting decision. Each product development is also a capital budgeting decision, which should increase the firm value at the margin. These are some of many examples where capital budgeting decision on each project is part of the business model of the firm, which must include the management of a significant fixed operating cost. The model uses the standard basic assumptions in real option literature. We assume a multiperiod discrete time model where all agents make their decisions at specific regular intervals; the one period interest rate is RF , a constant; the firm seeks to maximize the shareholders’ wealth; and the market is efficient. We assume a frictionless market with no corporate taxes and personal taxes, and therefore the capital structure is irrelevant to the maximization of shareholders’ value. The following assumptions describe the model of the firm in this paper. Assumption 1. The Business Risk of the Firm (GRI) In this model, unlike many standard real option models, we use the sales (or revenue) of the firm as the risk driver, and not the operating profits, as commonly used. The sales represent the business risk of the firm, while the operating profits are affected by the business model of the firm. We assume that the firm is endowed with a capital asset (CA). For example, the capital asset can be a factory that produces goods and services resulting in sales. The sales generated by one unit of the capital asset are called the gross return on investment (GRI/: GRI is the risk driver of the model, and the risk represents the uncertain demand for the products. Therefore, the sales are stochastic, given by the following equation. Sales D GRI CA
(66.1)
When the GRI increases, and the firm would increase its sales for the same capital asset. When there is a downturn in GRI, the sales would fall. Extending the model to multiple risk sources should provide a more realistic model but may obscure the basic insights that the model provides. We assume that GRI follows a binomial lattice process that is lognormal with no drift. The upstate and downstate are given by a proportional increase of exp./ with probability q and a proportional decrease of exp./ with probability (1 q). is the volatility assumed to be constant. The market probability q is chosen so that the expected value of GRI over one period is the observed GRI at the beginning of each step. That is, the risk follows a martingale process. While GRI follows a recombining binomial process, we will use a nonrecombining tree notation. Specifically, we let n D 0; 1; 2; : : :, and for each time n, we let the index i denote the state variable i D 0; 1; 2; : : : ; 2n 1. Then at any node of the tree (n, i), the binomial upstate and downstate nodes of the following step are denoted by (n, 2i C 1) and
1043
(n, 2i), respectively. Then the martingale process is specified by the following equation: GRI.n; i / D q GRI.n C 1; 2i C 1/ C .1 q/
GRI.n C 1; 2i /; qD
1e ; where is the volatility of GRI e e
(66.2)
for n D 0; 1; 2; : : : and i D 0; 1; 2; : : : ; 2n 1. Let be the cost of capital for the business risk, which is the required rate of return for that business risk, GRI. Note that the standard cost of capital of a firm reflects the risk of the firm’s free cash flows taking the operating leverage into account, not the firm’s sales risk as we do here. Since the firm risk is the same at each node, the cost of capital is constant in all states and time periods on the lattice. Assumption 2. The Primitive Firm (Vp ) To apply the contingent claim valuation approach to value the firm, we begin with the definition of the primitive firm as the “underlying security.” The primitive firm is a simple corporate entity that has no debt or claims other than the common shares, which are publicly traded. The firm has one unit capital asset. The capital asset does not depreciate, and the value does not change. We can think of the capital assets as the distribution centers of a retail chain store. Let m be the gross profit margin. For simplicity, in this section, we assume that there are no costs associated in generating the sales, and that m equals unity. And therefore the firm’s sales are the profits, which are distributed to all the shareholders. Equation (66.2) presents the sales risk, and the GRI.n; i / and ¡ are the sales and cost of capital of the primitive firm at each node .n; i / on the lattice. Note that the sales (and hence the profits) are always positive, because GRI follows a multiplicative process. By the definition of the cost of capital and the primitive firm value at each node point on the binomial lattice is: VP .n; i / D
GRI.n; i / m CA Where m D 1
(66.3)
for n D 0; 1; 2; : : : and i D 0; 1; 2; : : : 2n 1. Given the binomial process of the primitive firm, which we will use as the “underlying security,” we can derive the risk-neutral probabilities, p.n; i / at time n and state i. The derivation is given in Appendix 66A. pD
A e 1 C RF : Where A D e e 1C
(66.4)
When the cost of capital equals the risk free rate, and when the volatility ¢ is sufficiently small, the risk neutral probability is approximately 0.5. That is, when ¢ is small, the upward movement is approximately the same as the
1044
T.S.Y. Ho and S.B. Lee
downward movement, and therefore, the expected value with the binomial probability of 0.5 shows that the GRI must follow a martingale process. When the cost of capital is high relative to the risk free rate, the risk neutral probability would assign a lower weight to the upward movement, according to Equation (66.4), to balance the use of the risk-free rate, a lower rate than the cost of capital, for discounting the future value. The use of the risk neutral probability ensures the valuation method is consistent with that of the market valuation of Equation (66.3). Note that as long as the volatility and the cost of capital are independent of the time n and state i, the risk neutral probability is also independent of the state and time, and is the same at each node point on the binomial lattice. We will value our firm relative to the primitive firm. Therefore using the standard relative valuation argument, we may assume that the primitive firm follows a drift at the risk free rate. The market probability q is relevant only to the extent of determining the cost of capital , but q is not used explicitly in the model. The use of the risk neutral probabilities enables us to discount all cash flows of our firm by the risk free rate in all states of the world. The primitive firm value Vp specifies the stochastic process of the “underlying security,” and Equation (66.4) is the standard assumption made in the contingent claim valuation model. Assumption 3. The Firm’s Cash Flows (CF) and Value (V) V is the value of a firm that has fixed operating costs, fixed expenditures for operating purposes. The fixed operating cost (FC) is independent of the units of the goods sold and is paid at the end of each period. Payments to the vendors and suppliers, the employees’ salaries and benefits are some
examples of the fixed operating costs and they may constitute a significant part of the firm’s cash outflow. The net profit of the firm, using Equation (66.1), is given by CF.n; i / D .GRI.n; i / CA .n; i / FC/
for n D 0; 1; 2; : : : and i D 0; 1; 2; : : : 2n 1. Note that GRI is the only source of risk to the firm’s net income. The firm pays all the net income as dividends. In the case of negative income, the firm issues equity to finance the short fall of cash for simplicity. Therefore, the firm’s net income is the free cash flow and the present value of which is the firm value V. Assumption 4. The Planning Horizon (T) and the Terminal Conditions We assume that there is a strategic planning time horizon T. We will value the firm at each node at planning horizon T. Conditional on the firm not defaulting before reaching the horizon T, we can determine the firm value at time T. Without the loss of generality, we make some simplifying assumptions at the terminal date. In this model, we assume that all future fixed operating cost is capitalized at time T to be a constant FC(T). After the horizon date T, the firm may default on the fixed operating cost. And therefore, the capitalized value of the fixed operating cost should depend of the primitive firm value at time T. The value of this capitalized value is also a contingent claim. The value, based on Merton (1973), is provided in Appendix 66B and the result will be used later. For clarity of the presentation at this point, we keep the model simple without affecting the main results. Therefore, at the terminal date T, the firm value is given by
V FC.T / C .GRI.T; i / CA.T; i / FC/; V .T; i / D Max p Vp FC.T / C .GRI.T; i / .CA.T; i / C I / FC I /; 0
/ where Vp D GRI.T;i /CA.T;i for i D 0; 1; 2; : : : ; 2T 1. That is, the firm value at time T is the primitive firm value with the capital asset CA, plus the cash flow of the firm over the final period, net of the capitalized fixed costs. The limited liability of a corporation is assumed in this model and therefore the firm value is bounded from being negative.
Assumption 5. Investment Decisions At each node, .n; i /, the firm has an option to make capital investments at $1 each year, every year over the planning horizon. For simplicity, we assume that the decisions are not reversible in that the firm cannot undo the investments in any future state of the world.
(66.5)
(66.6)
The increase of the capital investment leads to a direct increase in the firm’s capital assets. And, we have CA.n C 1; 2i / D CA.n C 1; 2i C 1/ D CA.n; i / C I.n; i / (66.7) The GRI of the business is not affected by the increase in the firm size, and hence the business risk is independent of the capital investment. However, the sales are affected by the capital budgeting decisions. The marginal increase in sales to the firm with the investment at the node .n; i / is given by Sales.n; i / D I.n; i / GRI.n; i /
(66.8)
66 Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution
Table 66.1 The numerical example for scenario Time 0 1 2 GRI 0:05 0:1 0:15 Investment 1 0 CA 30 31 31 Sales FC Investment Free cashflows
3:1 3 1 0:9
4:65 3 0 1:65
Finally, the investment decisions are made at all the nodes such that the firm value is maximized. It is important to note that since the firm can decide on the investment at particular, CA.n; i / at each node depends on the path to that node. Therefore, the model is a path dependent model. These assumptions complete the description of the model. Assumption (1) describes the risk class of the business. Assumption (2) introduces the primitive firm enabling us to relate the cost of capital ¡ to the risk neutral valuation framework. Assumption (3) specifies the business model of the firm, identifying the firm’s free cash flow as the residual of all the claims, like the fixed costs, on the firm’s sales. We use the simplest business model in this paper, but this assumption can be generalized to study different business models, which can be specified by different cost and sales structure. Assumption (4) specifies the terminal condition, following the standard assumptions made on the horizon in strategic planning. Assumption (5) specifies the marginal investment I.n; i / that increases the capital asset CA.n; i /: The marginal returns of the investments can be generalized, although we choose the simplest relationship here. Given the above assumptions, we can now determine the maximum value of the firm based on the optimal capital investment decisions. Let us use the following numerical example to illustrate the model in Table 66.1. Consider a particular scenario over two periods, where n D 0; 1; 2. The stochastic variable is GRI. Given the investment schedule on line 3, the capital asset over time is given by line 4. Sales are determined by GRI and CA. The fixed cost is constant over time. The free cash flow is then determined following the standard income statements.
1045
66.3 Simulation Results of the Capital Budgeting Decisions The firm seeks to maximize the firm value by using the control variables, which are the capital investments, at all the node points along the scenario paths in the binomial latT P 2n capital investment tice to the horizon date. There are nD1
decisions. We use the backward substitution method based on the nonrecombining tree. We first assume a set of investment decisions, I.n; i /, at each node along all paths. Note that I.n; i / can equal to 0 in some of the nodes. At the horizon date, we can determine the firm value at each node. Then we use the risk neutral probability and determine the firm value at time T 1, such that the firm value at that node point has a risk free return based on the risk neutral probability, V . Specifically, V .T 1; i / D
.p V .T; 2i C 1/ C .1 p/V .T; 2i // 1 C RF (66.9)
for i D 0; 1; : : :2T 1 1. Note that we are rolling back a nonrecombining tree and not a recombining lattice. Therefore, the state i here denotes a state along a scenario path of a tree, and the states (2i C 1) and 2i refer to the binary states of the subsequent period. If the firm value is less than the value FC C I, which is the cash outflow, the firm declares bankruptcy and has value zero, otherwise the firm value is V FCI . That is, V .T 1; i / .V .T 1; i /C D max GRI CA.T 1; i / FC I.T 1; i /; 0 (66.10) Note that we have assumed that the firm has decided on all the investment decisions at the beginning of the period. Therefore, at each node, the firm is obligated to invest I(n,i), a nonnegative value. We continue with this process recursively, rolling back one period at a time. We then determine the firm value. That is, we recursively apply the following Equation (66.11) until n D 0.
2
3 .p V .n; 2i C 1/ C .1 p/V .n; 2i // C 5 V .n 1; i / D max 4 1 C RF GRI.n 1; i / CA.n 1; i / FC I.n 1; i /; 0
for n D 0; 1; : : :T 1 and i D 0; 1; : : :; 2n 1.
(66.11)
1046
T.S.Y. Ho and S.B. Lee
We now seek a set of investment decisions I.n; i / along all the paths to determine the highest value of the firm. This search can be accomplished by a nonlinear optimization procedure. For clarity of the exposition, we have chosen to use a nonrecombining tree and a nonlinear optimization to determine the firm value. However, the model can be specified using a recombining binomial lattice, and the firm value can be determined using the standard roll back method, without the use of any numerical nonlinear optimization method. The model is presented in Appendix 66C. We have shown that the two approaches are equivalent. We simulate the model with the following inputs: the risk free rate of 10%; the cost of capital of 10% with volatility of 30%; a risk neutral probability p of 0.425557; an initial capital asset CA of $30 million; the capitalized fixed cost of FC/0.1, where the capitalized fixed cost is assumed to present value of the perpetual fixed cost payment discounted at the risk-free rate. We consider the problem over 5 years where the firm can invest $1 million on a new distribution center at each node on the binomial lattice. The optimal investment decisions (top-down capital budgeting decisions) are determined by a nonlinear optimal search algorithm1 where the investment decisions are the choice variables with the objective function being the firm value. The optimal decision can be related to the capital budgeting decisions. When the capital investment is made, the free cash-flow (CF) is given by CF D GRI CA FC
cepts or rejects a project by focusing on the project’s cash flow. If we assume that there is no fixed operating cost, the bottom-up approach can be shown to be the same as the topdown approach. In sum, the net present value maximization at the local level should lead to the global optimization for the firm that has no fixed operating costs. Figure 66.1 shows the capital budgeting decisions using the bottom-up approach, where 1 and 0 denote the acceptance and rejection decision, respectively. For example, 0 at the top node represents that the firm rejects the project at time 0 and state 0. Since we assume the nonrecombining tree, we have 2n states at period n. However, if we optimize the capital budgeting decisions using the top-down method, we have different optimal decisions shown in Fig. 66.2, where the shaded nodes represent the states where the capital budgeting decisions differ between the bottom-up and top-down methods.
(66.12)
Investment decisions should be made at the margin and therefore one may argue that the fixed operating cost is not needed to be considered. Given that the cost of capital is , then the net present value of the project is2 : NPV D
GRI I C GRI I I
(66.13)
The capital investment is made when NPV > 0. This capital budgeting decision can be called a bottom-up approach. In this approach, line managers deal with the capital budgeting decisions, maximizing the net present value of each project that they manage. The capital budgeting decisions are made from the line manager’s point of view rather than the headquarters’ overall view, in the sense that the line manager ac1
We use an optimization subroutine, GlobalSearch, written in Mathematica. The description of the procedure is provided at www.loehleenterprises.com. 2 For clarity of the exposition, let the NPV be defined by Equation (66.13). To be precise, the expected cash flow may not be perpetual in the presence of default. We will explain the implication of default on the free cash flow later in this section.
Fig. 66.1 Bottom-up capital budgeting decisions
66 Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution
1047
increase in the firm value in accepting a project at node .n; i / based on the top-down optimized solution. It is computed MPV.n; i / D FV .n; i / FV.n; i /
(66.14)
where FV .n; i / and FV.n; i / are the firm values at the node .n; i / with the investment and without the investment, respectively, while holding all the investment decisions in other nodes constant. This definition of the marginal change in the firm value isolates the effect of the investment at a specific node from the growth options at the other nodes. Let NPV.n; i / be the net present value of the project at node .n; i / such that NPV D PV I
(66.15)
and let the present value of the fixed cost as a function of the primitive firm value at node .n; i / be F .n; i / D F .Vp /
(66.16)
Then the “wealth transfer,” WT, the loss of the shareholders value in taking the project in the presence of the fixed cost, is given by WT.n; i / D
dF PV.n; i / dV p
(66.17)
It follows that the marginal change in the firm value is the NPV net of the wealth transfer effect. MPV.n; i / D NPV.n; i / WT.n; i /
(66.18)
Fig. 66.2 Top-down capital budgeting decisions
When we compare Figs. 66.1 and 66.2, the results show that the firm accepts projects using the bottom-up approach that are rejected by the top-down approach. That means, many NPV positive projects may have negative impact to firm value. Note that our model is consistent with that of Myers (1977). By viewing the fixed operating cost as claims to the value of the primitive firms, positive net present value project may not be accepted by the global optimization to maximize the firm value. Our model interprets the result to suggest that portion of the fixed operating costs should be incorporated in the calculation of the net present value of the project. Therefore, our valuation framework provides a model to adjust for the presence of fixed operating costs in capital budgeting in a multiperiod context, something that the Myers model does not cover. Specifically, we define the marginal present value MPV.n; i / at any node point .n; i / to be the marginal
Rearranging Equations (66.15), (66.17), and (66.18), we have MPV.n; i / D PV.n; i / D I (66.19) D D1
dF dV p
(66.20)
Note that D as a function of the firm value is determined by the fixed cost structure. Since we assume that the fixed cost is a fixed cashflow at any time and state, the function is the same at any node. Equation (66.19) can be interpreted intuitively. In the presence of a fixed cost, the capital budgeting decision depends on the fixed cost factor. Portion of the present value of the incoming cash flow should first be adjusted by the fixed cost factor and the project is taken (rejected) if MPV > . 0. D can be derived in our model, as the fixed cost can be valued. The plot of the fixed cost factor as a function of the percentage change in the firm value is provided in Fig. 66.3 below.
1048
T.S.Y. Ho and S.B. Lee
Fig. 66.3 Fixed cost factor Fixed Cost Factor
discount factor
1.05 0.95 0.85 0.75 0.65 0.55 −70
−20
30
80
130
180
230
280
330
%change in Firm Value
As expected, the decline in the firm value would lead to a lower discount factor, resulting in more positive NPV projects be rejected. When the firm value is significantly high, the fixed cost factor is one, and then the NPV and the top-down approach are the same. In general, given a firm’s business model, the fixed cost factor function can be derived. And this function provides the link between the bottom-up and top-down capital budgeting problem. The model assumes that all the projects are independent of each other in the sense that the capital budgeting decision of one project is independent of the other projects. Using the bottom-up method, the independence of projects would lead to independence in the capital budgeting decisions across the projects. Yet in the presence of the fixed operating cost, it is straightforward to show that optimal decisions of the projects are related. For example, referring to Fig. 66.2 in the top-down capital budgeting decision, we would optimally invest in the upstate for period 1 and would again optimally invest in period 2 in both up and down states. However, if we do not invest in period 1, then the top-down optimal solution would lead to no investment in the subsequent downstate in the second period. Therefore, the model shows that these projects are not independent in the capital budgeting decision, contrary to the bottom-up capital budgeting decision rule.
66.4 Relative Valuation of Equity In this section, we decompose the value of the primitive firm into its components. We recognize that the firm’s capitalization is a compound option on the underlying business risk. These embedded options are options on the financial leverage, operating leverage, and the strategic value. We estimate these option values using a sample of retail chain stores, and we show that such decomposition can provide us useful insights into the valuation the firms’ equities.
In deriving the value of the firm, we assume that the firm pays out all the free cash flows and we construct a recombining lattice from the tree described in the previous section, such that the firm value is derived by the rolling back procedure. The recombining lattice is described in Appendix 66C. We consider the following retail chain stores: Wal-Mart, Target, Lowe’s, and Darden. We will describe these firms in brief. Wal-Mart (WMT) is now the largest retailer in North America. The company is operating approximately 4,000 stores worldwide. WMT is also a leader in developing and implementing retail information technology. Target (TGT) is the fourth-largest U.S. general merchandise retailer. It has approximately 1,000 stores, with much of its revenue derived from its discount stores. Lowe’s Companies (LOW) is the second largest U.S. home improvement retailer, selling retail building materials and supplies through more than 600 stores. Darden Restaurants, Inc (DRI) has over 600 restaurants and is a leader in the casual dining sector. The details of implementing the model using the observed data are provided in Appendix 66D. This sample of firms, all share the basic business model of retail chains. They focus on core production of their products and they sell the products through their distribution networks. The turnover, which is the sales to the total asset, depends on consumer spending. In times of recession, consumers may lower their spending on merchandizing, dining, and expenditures. As such, we consider these firms as belonging to a similar risk class with similar cost of capital for the business. We have shown that the inputs to the business model are profit margin m, fixed operating costs FC, turnover x, capital investment rate I, and leverage l. All these inputs can be derived or observed from financial statements. The business model then derives the market value of equity, which is also observed in the market. We can calibrate the cost of capital of the business and the business volatility such that the equity value and the model inputs best fit the observed values.
66 Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution
1049
Table 66.2 Inputs to the business model GRI Gross profit margin (m) Fixed cost/total asset (FC/CA) Capital investment (I/CA) Leverage (CA/E)
Target 2.9769 0.3169 0.6564 0.1563 1.7218
Lowe’s 2.5807 0.2880 0.4684 0.2381 1.2965
Wal-Mart 4.8082 0.2274 0.7907 0.1530 1.3033
Darden 2.2823 0.2222 0.2293 0.1130 1.7190
GRI D sales/capital assets Gross profit margin D (sales cost of goods sold)/sales I/A D capital investment/capital assets Leverage D capital assets/equity Table 66.3 Reported and estimated market performance measures Target Lowe’s
S/E S/V Cost of capital Volatility
Wal-Mart
Darden
Reported
Estimated
Reported
Estimated
Reported
Estimated
Reported
Estimated
5.1009 0.8321 0.0860 0.2776
5.3231 0.8500
5.3543 0.9054 0.0821 0.3908
5.4717 0.9089
7.6893 0.9351 0.0702 0.3149
7.8420 0.9363
3.1623 0.8634 0.1036 0.3772
3.2699 0.8779
The cost of capital and volatility are used to calibrate the model such that the sum of squares of the difference between reported and the estimated S/E, P/E and S/V is minimized
Specifically, we use the data in Table 66.2, based on January 31, 2002 financial statements, as input to the business model. We then determine the implied volatility of the business risk driver (volatility) and the implied cost of capital of the business by minimizing the sum of squares of the observed market performance measures and the corresponding model value. The results are presented in Table 66.3. The results show that the cost of capital implied from the firm’s equity value for Wal-Mart is particularly low when compared with the other retail chain stores. Darden has the highest cost of capital, which is 10.36%. Target and Lowe’s has similar cost of capital of about 9%. We can now decompose the primitive firm value of each firm into its building blocks of value. We have shown that the market equity value is a compound option of three options. Equity is an option on the firm value, which has an embedded real option net of the “perpetual risky coupon debt” of the fixed costs. Or the market equity value can be built from the underlying firm value. Starting from the underlying firm value, we can add the real option and net of the perpetual debt. Then all equity firms with a real option (which is an all equity growth firm) are the underlying risky assets, whose European call option is the market value of equity. Let Vp be the value of the firm without debt, growth or fixed costs, which we called the primitive firm. It can be calculated by using the valuation model assuming that the fixed cost and capital investment rate is zero. Let Vfc be the value of the firm without debt and growth, but has the fixed costs, which we call the fixed-cost firm. F is the market value of the fixed costs, which is defined as
F D Vp Vfc
(66.21)
Let V be the value of the firm without debt, but with optimal capital investment strategy and fixed cost. Then G is the value of the growth option, which can be calculated as the difference between the firm value V and the firm without growth, Vfc . (66.22) G D V Vfc : D is the market value of the debt, relatively valued to the firm value V. Then the market capitalization of the firm (market value of the equity) is the underlying firm with the growth option net of the fixed costs and the debt. S D V D
(66.23)
Or the equity value can be reexpressed as S D Vp F C G D
(66.24)
The decomposition of the value is summarized in Table 66.4. To compare the results across the firms, we can normalize the equity value by the firm’s book equity value, by considering the market to book multiples (S/E), as reported in the last row of Table 66.4. The results show that Wal-Mart has the highest multiple of 7.8420. Now, Table 66.5 provides insights into the determinants of the market multiples of the firms. We can use the above results and derive the values in proportions as reported below. Note that the multiple (S/E) is the product of all the ratios presented in the rows above. And therefore, the table
1050
T.S.Y. Ho and S.B. Lee
Table 66.4 The value decomposition of the capitalization value S
Table 66.5 Determinants of the market multiples
Mkt equity S Primitive firm V* Mkt value fixed cost F Growth option G Mkt value of debt D Book equity E Estimated S/E
Vp/CA Vfc/Vp V/Vfc CA/E S/V S/E Book value (E/shares)
provides a decomposition of the equity multiples. The result shows that the firms have significant fixed operating costs. For example, Target’s fixed operating cost is 80.44% of the primitive firm value. However, the market assigns a significant growth value to Target. In fact, the firm with growth option is a multiple of 1.5602 to the firm without growth option. The results also show that Wal-Mart attains the high multiple because of its high value of the primitive firm value to its total asset. As we have shown above, the high multiple value is mainly the result of a market low cost of capital to the firm business. The business model provides a systematic approach to determine the building blocks of value to the market observed equity value. And therefore this approach provides us insight into the determinants of the market value of equity.
66.5 Equity Return Attribution The business model can also provide insights into the relationship between the equity value and the firm’s revenue. In this section, we use the business model to determine the impact of a 1% increase in the gross investment return on the stock returns. And in the process, we determine impact of the operating leverage, financial leverage, and the growth option on the equity returns. First, note that the stock price multiple to the book value can be expressed by adding numbering system to the equations. Vp Vfc CA V S S D
E CA Vp Vfc V E
(66.25)
Target 41;840 161;313 129;766 17;674 7;382 7;860 5.3231
Target 11:9200 0:1956 1:5602 1:7218 0:8500 5:3231 8:7062
Lowe’s 36;520 84;728 60;269 15;722 3;661 6;674 5.4717
Lowe’s 9.7913 0.2887 1.6428 1.2965 0.9089 5.4717 8.6044
Wal-Mart 275;270 762;427 568;304 99;878 18;732 35;102 7.8420
Wal-Mart 16:6651 0:2546 1:5145 1:3033 0:9363 7:8420 7:8004
Darden 3;385 9;612 6;438 682 471 1;035 3.2699
Darden 5.4017 0.3302 1.2148 1.7190 0.8779 3.2699 5.8818
It follows that, ln
Vp Vfc CA V S S D ln Cln Cln Cln Cln E CA Vp Vfc V E
(66.26)
Given in proportional increase of GRI by 1%, the change of the equity to book multiple is given by,
ln
Vp Vfc S CA V S D ln C ln C ln C ln C ln E CA Vp Vfc V E (66.27)
This equation provides an attribution of the proportional change in the stock multiple. The changes of the components are simulated and are provided in Table 66.6. Note that the sum of the rows equal to the stock price change (the last row). For example, consider Wal-Mart, 1% increase in the gross return on investment, and hence 1% increase in sales, would lead to 1.97% increase in the equity value. The return attribution shows that a significant portion of this return comes from the increase in the primitive firm value (1%) and the effect of the operating leverage (1.07%). The percent increase of the primitive firm value is directly proportional to the percent increase in revenue, by definition. The increase in the growth option value is impacted less by the revenue change, resulting in a negative contribution of the equity returns (0:22%). The financial leverage has an insignificant impact (0.13%) because of the relatively low financial leverage of Wal-Mart measured in market value. The capital asset and book equity value are not affected by the change in revenues. This result seems to apply approximately to other stores in this sample of retail chain stores, showing that these stores are quite similar in the sense that they are
66 Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution
Table 66.6 Stock return decomposition by firm values
ln(Vp /CA) ln(Vfc =Vp ) ln(V/Vfc ) ln(S/V) ln(CA/E) ln(Book value(E/shares)) ln(Stock price)
industry leaders in their specific businesses. However, for retail chain stores with higher operating leverage relative to the firm’s value, then the relationships are more complex. The analysis shows that the business model enables us to identify how the operating and financial leverage affect the equity returns and thus provides useful insights into the relationship between the market valuation and the profitability of the business. The use of the contingent claim approach to formulate the business risk enables us to incorporate the risk of the business (the volatility of the gross return on investment) to the debt structure and the operating leverage of the firm, something that the traditional financial ratio approach cannot capture.
66.6 Conclusion This paper provides a parsimonious model of a firm. The model enables us to value the firm as a contingent claim on the business risks. Using a contingent claim valuation framework, we can then relate the firm maximization to the capital budgeting rule, the fixed operating costs, and the cost of capital of the project as well as that of the firm. The model enables us to determine the impact of the fixed costs on the NPV valuation of a project. The business model also enables us to gain insight into the building blocks of value for the firm’s equity and the relationship of the equity returns to its revenues. Specifically, we have shown that the top-down and bottom-up decisions are related by the fixed cost factor, which is a function of the firm value. This function can be specified given the business model of the firm. The lower the firm value, the deeper the discount on the present value of the project, and that may lead to a rejection of a positive NPV project. This result has several implications in corporate finance. For some firms with high operating leverage, for example, communication companies, seeking to acquire other firms, the model suggests that the acquisition analysis should focus not only on the synergic effect in the capital budgeting decision but the fixed cost factor opposing effect. For a startup company, the extensive use of the operating cost
Target 0:0100 0:0138 0:0022 0:0027 0:0000 0:0000 0:0244
Lowe’s 0:0100 0:0085 0:0014 0:0010 0:0000 0:0000 0:0180
1051
Wal-Mart 0:0100 0:0107 0:0022 0:0013 0:0000 0:0000 0:0197
Darden 0.0100 0.0070 0.0003 0.0016 0.0000 0.0000 0.0189
substituting the variable costs would adversely affect its capital budgeting decisions. While we use retail chain stores to describe the business model, other businesses share a similar model. Also, the model can be generalized to incorporate multiple risk sources or perpetual fixed operating costs with more complex fixed costs schedules. These and other extensions of the model are not expected to change the key insights provided in this chapter. Furthermore, the model assumptions can be relaxed to further investigate other corporate finance issues. For example, the model can analyze the impact of the fixed operating costs on the debt structure. Debts can be viewed as junior debt to the “perpetual debt,” the fixed operating cost. The underlying security in this contingent claim valuation is the primitive firm. The impact of the fixed operating costs on the firm’s debt may explain the bond behavior observed in the market, as the bond would behave like a junior debt (Ho and Lee 2004b).
References Botteron, P., M. Chesney, and R. Gibson-Anser. 2003. “Analyzing firms strategic investment decisions in a real options framework.” Journal of International Financial Markets, Institutions and Money 13, 451–479. Brennan, M. J. and E. S. Schwartz. 1985. “Evaluating natural resource investments.” Journal of Business 58, 135–157. Cortazar, G., E. S. Schwartz, and J. Casassus. 2001. “Optimal exploration investments under price and geological-technical uncertainty: a real options model.” R&D Management 31(2), 181–190. Demsetz, H. 1968. “The cost of transacting.” Quarterly Journal of Economics 82(1), 33–53. Fontes, D. B. M. M. 2008. “Fixed versus flexible production systems: a real option analysis.” European Journal of Operational Research 188(1), 169–184. Gilroy, B. M. and E. Lukas. 2006. “The choice between greenfield investment and cross-border acquisition: a real option approach.” The Quarterly Review of Economics and Finance 46(3), 447–465. Ho, T. S. Y. and S. B. Lee. 2004a. The oxford guide to financial modeling, Oxford University Press, New York. Ho, T. S. Y. and S. B. Lee. 2004b. “Valuing high yield bonds: a business modeling approach.” Journal of Investment Management 2(2), 1–12. Ho, T. S. Y. and R. Marcis. 1984. “Dealer bid-ask quotes and transaction prices: an empirical study of some AMAX options.” Journal of Finance 39(1), 23–45.
1052
T.S.Y. Ho and S.B. Lee
McDonald, R. L. 2006. “The role of real options in capital budgeting: theory and practice.” Journal of Applied Corporate Finance 18(2), 28–39. Merton, R. 1973. “On the pricing of corporate debt: the risk structure of interest rates.” Journal of Finance 29(2), 449–470. Myers, S. C. 1977. “Determinants of corporate borrowing.” Journal of Financial Economics 5, 147–175. Myers, S. C. 1984. “Finance theory and financial strategy.” Interface 14, 126–137, reprinted in 1987. Midland Corporate Finance Journal 5(1), 6–13. Sodal, S., S. Koekebakker, and R. Aadland. 2008. “Market switching in shipping: a real option model applied to the valuation of combination carriers.” Review of Financial Economics 17(3), 183–203. Stoll, H. R. 1976. “Dealer inventory behavior: an empirical investigation of NASDAQ stocks.” Journal of Financial and Quantitative Analysis 11(3), 356–380. Triantis, A. and A. Borison. 2001. “Real options: state of the practice.” Journal of Applied Corporate Finance 14(2), 8–24. Trigeorgis, L. 1993. “The nature of option interactions and the valuation of investments with multiple real options.” Journal of Financial and Quantitative Analysis 28(1), 1–20. Villani, G. 2008. “An R&D investment game under uncertainty in real option analysis.” Computational Economics 32(1/2), 199–219. Wirl, F. 2008. “Optimal maintenance and scrapping versus the value of back ups.” Computational Management Science 5(4), 379–392.
Then the risk neutral probability p is defined as the probability that ensures the expected total return is the risk-free return. p Vpu C .1 p/ Vpd D .1 C RF / Vp :
Substituting Vp ; Vpu ; Vpd into equation above and solve for p, we have A e pD (66A.4) e e where A D
1CRF 1C
Appendix 66B The Model for the Fixed Operating Cost at Time T When the firm may default on the fixed operating cost, the value of the fixed operating cost can be viewed as a perpetual debt of a risk bond. The valuation formula of the perpetual debt is given by Merton (1973). 8 2F C 2R2F FC < 2 ˆ.V; 1/ D 1 V 2RF RF : 2 C 2
Appendix 66A Derivation of the Risk Neutral Probability
M
The risk-neutral probabilities p.n; i / can be calculated from the binomial tree of Vp . Let Vp .n; i / be the firm value at node (n, i). In the upstate, the firm value is Vp .n; i / D
where V D the primitive firm value
RF D risk free rate
By the definition of the binomial process of the gross return on investment, Vp .n C 1; 2i C 1/ D Vp .n; i /e : Further, the firm pays a cash dividend of Cu D Vp .n; i /
e . Therefore, the total value of the firm Vpu , an instant before the dividend payment in the upstate, is (66A.1)
Similarly, the total value of the firm Vpd , an instant before the dividend payment in the downstate, is Vpd D Vp .1 C / e :
9
2RF 2RF 2F C = ;2 C 2 ; 2 ; 2 V
FC D fixed cost per year
GRI.n; i / CA
Vpu D Vp .1 C / e :
(66A.3)
(66A.2)
D the Gamma function (defined in the below) D the standard deviation of GRI M./ D the confluent hypergeometric function (defined in the below)
a 2FC 1 b b M a; 2 C a; 2 D e V .1 C a/bFC V bRF V
b b V Ce FC aV.2 C a/C.1 C a/.b aV/ 1 C a; V where 2RF 2FC ;bD 2 ; 2 Z 1 Z .x/ D t x1 e t dt; and .a; x/ D aD
0
1
t a1 e t dt x
66 Business Models: Applications to Capital Budgeting, Equity Value, and Return Attribution
Appendix 66C The Valuation Model Using the Recombining Lattice In this model specification, we assume that the GRI stochastic process follows a recombining binomial lattice:
1053
At time T, the horizon date, consider the node (T, j), j is the state on a recombining lattice. Suppose that the firm has made k investments in the period T, where 0 k T 1. The firm value is given by Equation (66C.1):
GRI.n; j / D q GRI.n C 1; j C 1/ C.1 q/ GRI.n C 1; j /; qD
1 e ; where is the volatility of GRI, e e
where n D 0; 1; : : : T and j D 0; : : :; n. 2 GRI.T; j / .CA C .k C 1/I /
FC.T /C
3
7 6 7 6 6 .GRI.T; j / .CA C .k C 1/I / FC I /; 7 7 6 7 V .T; j I k/ D Max 6 7 6 7 6 GRI.T; j / .CA C kI/ 7 6 FC.T /C 5 4
(66C.1)
.GRI.T; j / .CA C kI / FC/; 0 for k D 0; : : : ; T and j D 0; : : : ; T
and CA is the initial capital asset. Now we roll back one period. We then compare the firm value with or without making an investment I. Given that the
firm at the end of the period T 1 has already invested k times and would not invest at time T 1, the firm value is
p V .T; j C 1I k/ C .1 p/ V .T; j I k/ C GRI.T 1; j / .CA C kI / FC 1 C RF
(66C.2)
If the firm at that time invests in the capital asset, then the firm value is p V .T; j C 1I k C 1/ C .1 p/ V .T; j I k C 1/ C GRI.T 1; j / .CA C .k C 1/I / FC I 1 C RF
Optimal decision is to maximize the values of the firm under three possible scenarios: taking the investment, not taking the
(66C.3)
investment, or defaulting. Therefore the value of the firm at the node (T 1, j) with k investments is
2 p V .T; j C 1I k C 1/ C .1 p/ V .T; j I k C 1/ 3 C 6 7 1 C RF 6 7 6 GRI.T 1; j / .CA C .k C 1/I / FC I; 7 6 7 7; V .T 1; j I k/ D Max 6 6 7 6 p V .T; j C 1I k/ C .1 p/ V .T; j I k/ 7 6 7 C 4 5 1 C RF GRI.T 1; j / .CA C kI/ FC; 0 for k D 0; 1; : : : ; T 1; and j D 0; 1; : : : ; T 1:
(66C.4)
1054
T.S.Y. Ho and S.B. Lee
Now we can determine the firm value recursively for each n, n D T 1; T 2; : : :1.
At the initial period,
2
3 p V .T; j C 1I k/ C .1 p/ V .T; j I k/ C 5; V .T; j I k/ D Max 4 1 C RF GRI.T; j / .CA C kI/ FC; 0
(66C.5)
for T D j D k D 0:
The firm value at the initial time can then be derived by recursively rolling back the firm value to the initial point, where n D 0. We follow the method of the fiber bundle modeling approach in Ho and Lee (2004a). To illustrate, we use a simple numerical example. Following the previous numerical example, we assume that the GRI is 0.1; the Capital Asset CA is 30; the risk free rate and the cost of capital are both 10%; the risk neutral probability is 0.425557; the volatility 30%; the fixed cost FC is 3 and finally the investment is 1. Given the above assumption, the binomial process is presented below The binomial lattice of GRI Time J 3 2 1 0
0 GRI
0.1
1
2
3
0.134985881 0.074081822
0.18221188 0.1 0.054881164
0.245960311 0.134985881 0.074081822 0.040656966
Given the GRI binomial lattice, we can now derive the firm value lattices. The values are derived by backward substitution. The firm value depends on the capital asset level CA, the state j and the time n. Firm value V(n,j,CA) State j 3 2 1 0 j 3 2 1 0 j 3 2 1 14:9802 0 6:329610 1:0230 Time n 0 1
value at time 0 does not involve any investment decision and therefore it is derived by rolling back from the firm values where the CA level is 30.
Appendix 66D Input Data of the Model The input data of the model are derived from the balance sheets and income statements of the firms. IS
Target
Lowe’s
Wal-Mart
Darden
Revenue Costs of sales Gross profit Gross profit margin (m)a Fixed cost Depreciation Interest cost Other incomes Pretax incomes Tax Effective tax ratio ( )b
39,888 27,246 12,642 0.3169 8,883 1,079 464 0 2,216 842 0.3800
22,111.1 15,744.2 6,366.9 0.2880 4,053.2 534.1 180 24.7 1,624.3 601 0.3700
217,799 168,272 49,527 0.2274 36,173 3,290 1,326 2,013 10,751 3,897 0.3625
4,021.2 3,127.7 893.5 0.2222 407.7 153.9 31.5 0.9 301.3 104.2 0.3458
a b
55:2836 14:9999 0:0000 0:0000
CA 32 32 32 32
31:0516 5:3286 0:0000
52:5780 13:5150 0:0000 0:0000
31 31 31 31
29:0473 4:6541 0:0000 2
49:8725 12:0302 0:0000 0:0000 3
30 30 30 30
Gross profit/revenue Tax/pretax incomes
Balance sheet
Target
Lowe’s Wal-Mart Darden
Capital assets Gross return on invest (GRI)a LTDb Book equity
13,533 2.9475 8,088 7,860
8,653.4 2.5552 3,734 6,674.4
1,779.5 2.2597 517.9 1,035.2
a
Initial GRI without the investment = Revenue/Capital assets We assume that the firms have only one bond. This assumption can be relaxed using the information of the debt structure of a firm
b
Market information
At time 3, the firm values are derived by Equation (66C.1) for each level of outstanding capital asset level at time 3, an instant before the investment decision. Then the firm values for time 2 are derived by Equation (66C.3). Once again, the firm value depends on the outstanding CA level. The firm
45,750 4.7606 18,732 35,102
a
Shares Stock pricea Market capitalization (equity)a Risk free rate (Rf)b Coupon rateb Max investc a
Target
Lowe’s Wal-Mart Darden
902.8 44.41 40,093 0.06 0.06 2,115
775.7 46.07 35,736 0.06 0.06 2,060.5
4,500 59.98 269,910 0.06 0.06 7,000
176 18.6 3,274 0.06 0.06 201
Market data We assume that the risk free rate and coupon rate are 6% c We use the capital expenditure in cash flow statements as the Max invest b
Chapter 67
Dividends Versus Reinvestments in Continuous Time: A More General Model Ren-Raw Chen, Ben Logan, Oded Palmon, and Larry Shepp
Abstract We present a continuous-time model of asset valuation in which the generated income follows a stochastic process, and the asset-owner allocates this income between reinvestment and payout. The income generating process is identical to the processes in the multiplicative (e.g., Black-Scholes-Samuelson-Merten) and the additive (e.g., Bachelier-Radner-Shepp) models for two alternative extreme values of the capital marginal productivity parameter. For all other values of this parameter, our process exhibits diminishing marginal productivity of capital. Thus, the model is more appropriate than both the multiplicative and the additive models for modeling the valuation of physical assets. Similar to the additive model, and in contrast to the multiplicative model, our model permits bankruptcy and does not allow unbounded growth of asset values. We demonstrate the existence of, and solve for, the optimal threshold company size level for paying dividends. When company size is above this level income is paid out as dividends, and when it is below this level all income is reinvested in the company. We also find the expected time until bankruptcy. Keywords Additive model r Multiplicative model r Continuous time r Confluent Hypergeometric function
67.1 Introduction The literature depicts two primary types of continuous-time, asset valuation models: the multiplicative (e.g., Samuelson 1965; Black and Scholes 1973; and Merton 1973) and additive R.-R. Chen () Graduate School of Business Administration, Fordham University, New York, NY 10019, USA e-mail: [email protected] O. Palmon Rutgers Business School, Rutgers University, Piscataway, NJ 08854, USA e-mail: [email protected] B. Logan Bell Labs, USA L. Shepp Statistics Department, Rutgers University, Piscataway, USA e-mail: [email protected]
(e.g., Bachelier 1900 and Radner and Shepp 1996) models. The generated income in the multiplicative models is proportional to asset value. Consequently, the asset value is always strictly positive. In the absence of debt, the equity value equals the asset value and thus is also always positive. In contrast, the generated income in the additive models is contingent on the firm’s survival, but is independent of asset value. We present a continuous-time model of asset valuation in which the asset-owner allocates the income between reinvestment and payout to maximize the present value of the payout. The income that is generated by the asset follows a stochastic process that, for two alternative extreme values of capital productivity parameter, is identical to the processes in the multiplicative and the additive models. For all other values of this parameter, our model exhibits positive and diminishing marginal product of capital. In contrast, under the multiplicative model, the rate of return is independent of asset size while under the additive model, the dollar income is independent of asset size. Thus, our model is more appropriate than the multiplicative and additive models for modeling the valuation of physical assets. We demonstrate that, similarly to the additive model, our model yields bankruptcy (i.e., asset value reaches zero in a finite time), and does not allow an unbounded growth of asset values (which is possible under the multiplicative model). Our model also indicates the existence of a threshold firm size for dividend distribution. When firm size is below that threshold, income is invested in the company. Funds are extracted from the company so that firm size does not exceed this threshold level. The next section describes our model, and the following sections present the solutions for the optimal payout/reinvestment policy and for the resulting expected bankruptcy time. Further remarks and the conclusion follow and conclude the manuscript.
67.2 The Model Radner and Shepp (1996) assume that a firm’s random income follows an additive Brownian motion. In contrast to firms whose income follows a multiplicative Brownian
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_67,
1055
1056
R.-R. Chen et al.
motion, firms in the Radner-Shepp model eventually go bankrupt with probability 1.1 We consider a general class of models with diminishing marginal productivity of capital, represented by a parameter , which takes values between 0 and 1. The model degenerates to the additive model when D 0 and to the multiplicative model when D 1. We derive the complete explicit optimal policy for the Radner-Shepp problem (with one pair of < ; >) which incorporates any value of between 0 and 1. The capital X.t/ of the firm at each time t is assumed to be governed by the following stochastic differential equation: dX.t/ D X .t/.dtCdW.t//dZ.t/; for t 0; X.0/ D x (67.1) where W .t/ is the standard Wiener process; the exponent 0 < 1 is fixed; is a measure of the rate of income; is a measure of risk; r is the fixed discount rate; and Z.t/ is the total dividend up to time t. Z.t/, which is to be found, is assumed to subtract from the firm’s capital value. We determine the optimal policy Z.t/, which the company chooses to maximize its value, V .x/ D V .x; ; ; r; /, which equals the expected present value of its dividends. Thus, V .x/ D sup V .x; Z/;
(67.2)
Z
where Z V .x; Z/ D Ex
0
e
rt
dZ.t/ :
(67.3)
0
We restrict our model to the case where there is only one < ; > pair and the firm needs only to decide whether to distribute dividends or reinvest.2 In this case (and when D 0) Radner and Shepp solve for V0 .x/ D supZ V0 .x; Z/, where Z
0
V0 .x; Z/ D Ex
e rt dZ.t/ :
(67.4)
0
They show that, when > 0, V0 .x/ D A.e C x e x /
x 0
V0 .x/ D V0 .0 / C x 0
0 x < 1
(67.5)
2 1 0 D ln C C AD
p ˙ 2 C 2r 2 ˙ D 2 1 However, a multiplicative model yields realistic hiring and firing decisions in Shiryaev and Shepp (1996). 2 Radner and Shepp (1996) allow several < ; > pairs and let the firm optimize over the < ; > pairs; we leave this for another paper.
1 e 0
(67.6)
Radner and Shepp further show that in this case the firm should reinvest all its income when X.t/ 0 and distribute dividends otherwise. As a consequence, the firm goes bankrupt in a finite time (i.e. 0 < 1) with probability 1 When 0 it is optimal to take all the money at time zero, and thus V0 .x/ D x:3 Radner and Shepp also demonstrate (see their Sect. 67.5) that when D 1 the value V1 .x/ is either x or unbounded. The present paper solves the Radner-Shepp problem as given in Equation (67.4) for the more general case X .t/ D X.t; / as given above. The income generation process in our model exhibits a diminishing marginal productivity of capital (or decreasing returns to scale when other factors are considered suppressed). For D 1, the Black-Scholes-SamuelsonMerton case, without dividend distribution, the well-known solution is X.t/ D xe
2 W .t /C 2 t
; t 0;
(67.7)
and it follows that the firm never goes bankrupt In the early days of the Black-Scholes-Samuelson-Merton, modeling a stock price by a multiplicative model was seen as an advantage over the earlier Bachelier additive model, X.t/ D x C t C W .t/
(67.8)
partly because the additive model takes negative values, 2 eventually, with the positive probability equal to e 2x= , even though > 0. Under additive Brownian motion, it would be reasonable to simply regard reaching zero level as bankruptcy, so negative values cannot occur. We demonstrate that when dividends are distributed optimally, the firm goes bankrupt with probability 1 in a finite time when 0 < < 1 (just as it does in the case where D 0). If dividends are not distributed, then again (as in the case where D 0), the probability, Q.x/, of the firm reaching zero starting from x (i.e., going bankrupt in finite time) is positive and is given by: Z
where
C
e C 0
1
e Q.x/ D Zx 1 e
2 u1 2 .1 /
2 u1 2 .1 /
du :
(67.9)
du
0
3 A more realistic variant where profit can only be taken in discrete lumps, or dividends, was treated by Liu (2003). Guo (2002) corrects and improves the presentation of Shiryaev and Shepp (1996).
67 Dividends Versus Reinvestments in Continuous Time: A More General Model
which reduces to the stated value in Radner-Shepp when D 0. Thus, our new -model allows bankruptcy while incorporating diminishing marginal productivity of capital which is absent in both Radner-Shepp and Black-Scholes models. We show that it is optimal to distribute dividends when the firm’s value is greater than a threshold level . This effectively introduces a reflecting barrier at . Our solution for reduces to 0 in Radner-Shepp, as given in Equation (67.6), when D 0. We also calculate the expected time until bankruptcy.
1057
67.3 The Solution

For any 0 < θ < 1, x ≥ 0, σ > 0, μ ≥ 0, and r > 0, let

V(x; μ, σ, r) = sup_Z E_x ∫_0^∞ e^{−rt} dZ(t)   (67.10)

where the supremum is over all non-decreasing, non-anticipating processes Z(t), and the process X(t) for any such Z(t) is defined by X(0) = x and

dX(t) = X^θ(t)(μ dt + σ dW(t)) − dZ(t),  t ≥ 0   (67.11)

We will guess that the optimal policy Z(t) is as in the case θ = 0; namely, that there exists a constant x_θ = x(θ, μ, σ, r) such that no dividends are distributed when X(t) ≤ x_θ, but all positive income is paid out as dividends when X(t) > x_θ, for all t. Returning to the stochastic optimization problem of choosing an optimal dividend policy, we will prove that if we can find a number x_θ and a function V̂(x), x ≥ 0, satisfying the following four conditions:

1. V̂(x) ≥ x
2. (σ²/2) x^{2θ} V̂''(x) + μx^θ V̂'(x) − rV̂(x) = 0, for x ≤ x_θ
3. (σ²/2) x^{2θ} V̂''(x) + μx^θ V̂'(x) − rV̂(x) ≤ 0, for all x > 0
4. V̂'(x) > 1, for x < x_θ, and V̂'(x) = 1, for x ≥ x_θ

then V(x) = V̂(x). To prove this, note that then, for any Z(t), the process

Y(t) = V̂(X(t))e^{−rt} + ∫_0^t e^{−rs} dZ(s)   (67.12)

is a super-martingale, E[Y(t)|F(s)] ≤ Y(s) for s < t, as in Radner-Shepp, so that for τ ≤ ∞, the first hitting time of zero, we have

V(x, Z) ≤ E[Y(τ)] ≤ Y(0) = V̂(x)   (67.13)

and since this holds for every Z(t), we also have V(x) ≤ V̂(x). To prove the reverse inequality, we have from the first two conditions that if one uses the right Z(t), then equality holds throughout the last line, and since V(x) ≥ V(x, Z) = V̂(x), we see that V = V̂.

Next, we find a V̂ that satisfies the above four conditions. Observe that if we set

f(x) = Σ_{n=0}^∞ α_n x^{1+n(1−θ)}   (67.14)

where the α_n, n ≥ 0, are given by the recursion

0 = (σ²/2)(n+2)(1−θ)(1 + (n+2)(1−θ)) α_{n+2} + μ(1 + (n+1)(1−θ)) α_{n+1} − rα_n   (67.15)

with the initial value α_0 = 1 (and α_{−1} ≡ 0), for n ≥ −1, then

(σ²/2) x^{2θ} f''(x) + μx^θ f'(x) − rf(x) = 0   (67.16)

Thus f solves the differential equation for V̂, and this suggests a transformation from f to an analytic function h, a function with a power series in integral powers. We first transform f to an analytic function g defined by:

f(x) = x g(x^{1−θ}/(1−θ))   (67.17)

Setting μ̂ = 2μ/σ² and r̂ = 2r/σ², we obtain the differential equation for g as follows:

g''(x) + g'(x)[μ̂ + ((2−θ)/(1−θ))(1/x)] + g(x)[(μ̂/(1−θ))(1/x) − r̂] = 0   (67.18)

In order to have the confluent hypergeometric function as a solution, we make the transformation

g(x) = h(x)e^{λx}   (67.19)

where λ is equal to either of λ₊ or λ₋, exactly the same values as in the case θ = 0, and the differential equation for h is

h''(x) + h'(x)[(2λ + μ̂) + ((2−θ)/(1−θ))(1/x)] + [(λ(2−θ) + μ̂)/(x(1−θ))] h(x) = 0   (67.20)

The solution for h is the standard confluent hypergeometric function (see Abramowitz and Stegun 1964, p. 504):

h(z) = M(a; b; z) = 1 + ((a)_1/(b)_1) z + ... + ((a)_n/(b)_n)(z^n/n!) + ...   (67.21)
where (a)_n = a(a+1)···(a+n−1) and (b)_n = b(b+1)···(b+n−1), with (a)_0 = 1 and (b)_0 = 1. M is a polynomial with a finite number of terms if a and b are integers, a < 0, and either b > 0 or b < a. If b is a negative integer, then M is undefined. Substituting Equations (67.19) and (67.21) into (67.17), we obtain

f(x) = x e^{λx^{1−θ}/(1−θ)} M(a; b; cx^{1−θ}/(1−θ))   (67.22)

where

c = −(2λ + μ̂),  b = (2−θ)/(1−θ),  a = −(μ̂ + λ(2−θ))/(c(1−θ))   (67.23)

Note that Equation (67.22) is independent of the value (λ₊ or λ₋) chosen for λ. Note also that a, b, and c are functions of θ. We can now construct V̂ in terms of f as follows. First, with f as above, let x_θ be the first zero of f''(x), so that f''(x_θ) = 0, and set

V̂(x) = Af(x), 0 ≤ x ≤ x_θ;  V̂(x) = V̂(x_θ) + x − x_θ, x_θ < x < ∞   (67.24)

where A = 1/f'(x_θ). If we put θ = 0, an easy calculation shows that a = 1, b = 2, c = −(2λ + μ̂), M(1; 2; x) = (e^x − 1)/x, and V̂(x, 0) agrees with the earlier case in Radner-Shepp. For μ = 0 the solution degenerates, as was also seen in Radner-Shepp. For general θ ∈ (0, 1), since V̂ satisfies the differential equation in condition 2, and since V̂'(x_θ) = 1 and V̂''(x_θ) = 0, we have μx_θ^θ = rV̂(x_θ), and we see that

x_θ = [z(1−θ)/c]^{1/(1−θ)}   (67.25)

where z is the largest negative root of

zM'(z) = M(z)[(z/c)(r/μ − λ) − 1/(1−θ)]   (67.26)
Again, we may take either one of λ = λ± and get the same answer; it is remarkable that x_θ does not depend on λ. To compare a general case where θ = 0.5 with the Radner-Shepp case θ = 0, we plot the V(x) function as follows:
[Figure: V(x) versus x. Left panel: θ = 0, x_0 = 1.64433. Right panel: θ = 0.5, x_0.5 = 3.35456.]
The relationship between x_θ and θ is depicted in the following graph:

[Figure: x_θ versus θ, for θ ranging from 0 to 0.8.]

Note that lim_{θ→1} x_θ = ∞.

67.4 Expected Bankruptcy Time

Let τ be the bankruptcy time, that is, the first time the process X(t) hits zero, where X(t) starts at x and satisfies dX(t) = X^θ(t)(μ dt + σ dW(t)) − dZ(t), with dZ(t) enforcing a reflecting barrier at x_θ. We call the expected time to hit zero

T(x) = E_x[τ],  0 ≤ x ≤ x_θ   (67.27)
It is clear that T(0) = 0 and T'(x_θ) = 0 are boundary conditions for T, and that T satisfies the local equation

T(x) = dt + E[T(x + dx)]   (67.28)

Using Itô calculus, this leads directly to the equation

−1 = μx^θ T'(x) + (σ²/2) x^{2θ} T''(x)   (67.29)

The solution via the Green function, for any 0 ≤ θ < 1, is:

T(x) = ∫_0^x g(u)h(u) du + g(x) ∫_x^{x_θ} h(u) du   (67.30)

where

g(x) = ∫_0^x e^{−(2μ/σ²) u^{1−θ}/(1−θ)} du   (67.31)

and

h(x) = 2/(σ² x^{2θ} g'(x))   (67.32)

where g'(x) = e^{−(2μ/σ²) x^{1−θ}/(1−θ)}.
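Equations (67.30)-(67.32) are straightforward to evaluate by nested numerical quadrature once the barrier is known. A sketch of ours follows; μ, σ, θ, and the barrier value passed in are assumed inputs (for instance, the output of the earlier root-finding sketch), not values quoted in the chapter.

```python
# Expected bankruptcy time T(x), Eqs. (67.30)-(67.32), by nested quadrature.
# A sketch; mu, sigma, theta and the barrier x_bar are illustrative inputs.
import numpy as np
from scipy.integrate import quad

def expected_ruin_time(x, x_bar, mu=0.5, sigma=1.0, theta=0.5):
    gp = lambda u: np.exp(-(2*mu/sigma**2) * u**(1-theta) / (1-theta))  # g'(u)
    g  = lambda u: quad(gp, 0.0, u)[0]                                  # Eq. (67.31)
    h  = lambda u: 2.0 / (sigma**2 * u**(2*theta) * gp(u))              # Eq. (67.32)
    inner = quad(lambda u: g(u)*h(u), 0.0, x)[0]
    tail  = quad(h, x, x_bar)[0]
    return inner + g(x)*tail                                            # Eq. (67.30)

print(expected_ruin_time(1.0, x_bar=3.0))
```

Near u = 0 the integrand g(u)h(u) stays bounded for θ ≤ 0.5 (g grows linearly while h grows like u^{−2θ}), so the quadrature is well behaved on the whole interval.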
To compare the expected optimal bankruptcy time for θ = 0 and θ = 0.5, we plot T(x) as follows:

[Figure: T(x) versus x. Left panel: θ = 0, x_0 = 1.64433. Right panel: θ = 0.5, x_0.5 = 3.35456.]

67.5 Further Remarks

It is easy to see that conditions 1-4 hold for the V̂ defined above, so this completes the proof. It is easy to construct special choices of the parameters μ, σ, r for which V̂ becomes an elementary function, and for which the solution V̂ is no more complicated than the one in Radner-Shepp. Indeed, if we can make a = ℓ + 1 and b = k + 1, where ℓ < k are positive integers, then

M(ℓ+1; k+1; z) = (k!/ℓ!) Σ_{n=0}^∞ [1/((n+ℓ+1)(n+ℓ+2)···(n+k))] (z^n/n!)   (67.33)

which is an elementary function, expressible in terms of repeated integrals of e^z, since M(1; 2; z) = (e^z − 1)/z and there is a simple recurrence linking M(a; b; z) with M(a−1; b; z) and M(a; b−1; z) (see Abramowitz and Stegun 1964). To make a and b integers, we may take, for example, θ = (k−1)/k, so that b = k + 1, and then choose μ, r, and σ so that a = −(μ̂ + λ(2−θ))/(c(1−θ)) = ℓ + 1, which is possible so long as (k+1)/2 < ℓ < k − 1.
67.6 Conclusion

We present a continuous-time model of asset valuation in which the generated income follows a stochastic process and the owner allocates this income between reinvestment and payout. The model includes the multiplicative (i.e., Black-Scholes-Samuelson-Merton) and the additive (i.e., Bachelier-Radner-Shepp) models as two extreme special cases. In all other cases, the income is positively related to asset size, but the marginal product is decreasing in asset value. Under the multiplicative model, the ratio of income to asset value (i.e., the rate of return) is independent of asset size. This assumption is more appropriate for modeling a financial asset, where the number of shares that a small investor holds has no effect on the asset's returns, than for modeling investment in physical assets. The assumptions of the multiplicative model also imply that the asset value is always strictly positive. Under the additive model, income is contingent on survival; however, the dollar value of the income generated by an asset does not depend on the value of the asset. Our model is more appropriate than the multiplicative and additive models for the valuation of physical assets, because the generated income is positively related to asset size while the marginal productivity of capital is declining in asset size. Similarly to the additive model, our model does not preclude bankruptcy, and it does not allow unbounded growth of asset values (which is possible under the multiplicative model). It also indicates the existence of a threshold firm size for dividend distribution: when actual size is below this threshold, income is reinvested in the company, and funds are extracted from the company so that actual size never exceeds this threshold level.
References

Abramowitz, M. and I. A. Stegun. 1964. Handbook of mathematical functions, National Bureau of Standards, Washington, DC.
Bachelier, L. 1900. "Théorie de la spéculation." Annales Scientifiques de l'École Normale Supérieure 3, Paris: Gauthier-Villars. (English translation in Cootner, P. H. (ed.). 1964. The random character of stock market prices.)
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637-659.
Guo, X. 2002. "Some risk management problems for firms with internal competition and debt." Journal of Applied Probability 39, 55-69.
Liu, J. 2003. Profit taking. Dissertation thesis, Rutgers University.
Merton, R. C. 1973. "Theory of rational option pricing." Bell Journal of Economics and Management Science 4, 141-183.
Radner, R. and L. Shepp. 1996. "Risk vs. profit potential: a model for corporate strategy." Journal of Economic Dynamics and Control 20, 1373-1393.
Samuelson, P. A. 1965. "Rational theory of warrant pricing." Industrial Management Review 6, 13-39.
Shiryaev, A. and L. Shepp. 1996. "Hiring and firing optimally in a large corporation." Journal of Economic Dynamics and Control 20, 1523-1539.
Chapter 68
Segmenting Financial Services Market: An Empirical Study of Statistical and Non-parametric Methods Kenneth Lawrence, Dinesh Pai, Ronald Klimberg, Stephen Kudbya, and Sheila Lawrence
Abstract In this paper, we analyze segmentation of financial markets based on general segmentation bases. In particular, we identify potentially attractive market segments for financial services using a customer dataset. We develop a multi-group discriminant model to classify customers into three ordinal classes, prime customers, highly valued customers, and price shoppers, based on their income, loan activity, and demographics (age). The multi-group classification of customer segments uses both classical statistical techniques and a mathematical programming formulation. For this study we use the characteristics of a real dataset to simulate multiple datasets of customer characteristics. The results of our experiments show that the mathematical programming model in many cases consistently outperforms the standard statistical approaches in attaining lower Apparent Error Rates (APER) over 100 replications, in both the high and low correlation cases.
68.1 Introduction

Much research has been done on the application of discrimination and classification techniques to a priori predictive as well as descriptive segmentation. In a priori predictive approaches, the type and number of segments are determined in advance based on a set of criteria and, subsequently, predictive models are used to describe the relationship between segment membership and a set of independent variables (Wedel and Kamakura 1998). The main approaches for a priori descriptive segmentation are based on statistical or operational research methods of classification. The statistical methods include discriminant analysis, logistic regression, regression, and cross-tabulation, while the operational research techniques include mathematical programming (MP) methods such as linear programming and its variants, goal programming, and fuzzy goal programming (Thomas et al. 2006). As in statistical regression, the objective of a classification technique is to identify a functional relationship between a response (dependent) variable Y and a vector of explanatory (independent) variables or attributes X from a given set of observations (Y, X). However, in classification methods the response variable is discrete (dichotomous or polytomous), whereas in statistical regression it is a real-valued variable. The response classes are denoted by C_1, C_2, ..., C_q, where q is the number of pre-specified groups or classes (Doumpos and Zopounidis 2002). The objective of the classification methods is first to analyze the training data set and develop a model for each class or group using the attributes available in the data. Once the model developed using the training set performs satisfactorily, it can be used to classify future independent test data. In general, classification models assign observations of unknown class membership to a number of specified classes or groups using a set of explanatory variables associated with the group. These models have found myriad business applications, such as credit evaluation systems (Myers and Forgy 1963), differentiating bank charge-card holders (Awh 1974), screening credit applicants (Capon 1978), assessing project implementation risk (Anderson and Narasimhan 1979), predicting consumer innovators for new product diffusion (Robertson and Kennedy 1968), predicting corporate bankruptcy (Altman 1968), investigating new product success or failure (Dillon et al. 1979), predicting bank failures (Tam and Kiang 1992), and approving loan applications (Gallant 1988). These models have been particularly useful in market segmentation based on observable and product-specific bases. Advances in computers and information technology have further increased the efficacy of such approaches, whereby vast amounts of historical customer data can be processed to understand customer needs and wants.

K. Lawrence and S. Kudbya, New Jersey Institute of Technology, Newark, NJ, USA, e-mail: [email protected]
Dinesh Pai and S. Lawrence, Rutgers University, New Brunswick, NJ, USA
R. Klimberg, St. Joseph's University, Philadelphia, PA, USA
This has resulted in more focused marketing strategies, with lower costs, higher response rates and, consequently, higher profits (Zahavi and Levin 1997). Since Fisher's (1936) seminal work on linear discriminant analysis, numerous methods have been developed for classification purposes. Discriminant analysis has been successfully applied in many business applications, including building credit scoring models for predicting credit risk and investigating product failures (Dillon et al. 1979; Myers and Forgy 1963). Logistic regression is a related statistical method that is now widely used; Westin (1973) was one of the first to apply it in a binary choice situation. Mangasarian (1965) was the first to use a linear programming method in classification problems, for distinguishing between the elements of two disjoint sets of patterns. Freed and Glover (1981) extended this work for predicting the performance of job applicants based on a set of explanatory variables. Tam and Kiang (1992) were among the first to use neural networks in business research, for predicting bank failures.

This study focuses on segmenting the financial services market in order to effectively target customers who offer higher expected growth in the value of future business. More specifically, we develop a discriminant model to classify customers based on their income, loan activity, and demographics (age). We segment the customers into three ordinal classes: prime customers, highly valued customers, and price shoppers. The prime customers are the ones who have higher income levels and a loan activity commensurate with their income; they form the most desirable targets for companies offering financial services. The highly valued customer class has income levels and loan activity relatively lower than the prime customers, but profitable enough in the long run, though with associated risks. Lastly, the price shoppers are short-term customers with lower long-term attractiveness, but they provide a sufficient volume base. They are also the ones who cost more to service, due to their tendency to base decisions on short-term benefits and price sensitivity. We follow an a priori predictive approach to describe the relationship between the customer classes (response variable) and a set of independent variables based on general segmentation bases (predictor variables). We use classical statistical techniques, namely discriminant analysis (the Mahalanobis approach) and logistic regression, as well as a non-parametric mathematical programming method. We then compare the results from a simulation experiment to determine the efficacy of these methods in correctly identifying the target customer segments.
68.2 Methodology

68.2.1 Linear Discrimination by the Mahalanobis Method

The objective of discriminant analysis is to use the information from the independent variables to achieve the clearest possible separation or discrimination between or among groups. In this respect, two-group discriminant analysis is no different from multiple regression analysis: we use the independent variables to account for as much of the variation as possible in the dependent variable. For the discriminant analysis problem, we have two populations and sets of n_1 and n_2 individuals selected from each population. Moreover, for each individual, there are p corresponding random variables X_1, X_2, ..., X_p. The basic strategy is to form a linear combination of these variables,

L = B_1X_1 + B_2X_2 + ... + B_pX_p

and to assign a new individual to either group 1 or group 2 on the basis of the value of L. The values of B_1, B_2, ..., B_p are chosen to provide maximum discrimination between the two populations: the variation in the values of L should be greater between the two groups than within the groups (Wiginton 1980). This model is easily extended to three-group classification, as the sketch below illustrates.
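As a concrete illustration of the method just described, the sketch below (ours; the sample data are simulated stand-ins for income, loan activity, and age) forms the pooled-covariance Mahalanobis distance to each group mean and assigns each observation to the nearest group, which for equal priors is equivalent to the linear rule above.

```python
# Mahalanobis-distance classification with a pooled covariance matrix.
# A minimal sketch on simulated three-group data.
import numpy as np

def fit_pooled(X, y):
    groups = np.unique(y)
    means = {g: X[y == g].mean(axis=0) for g in groups}
    n = len(X)
    S = sum((X[y == g] - means[g]).T @ (X[y == g] - means[g]) for g in groups)
    S_pooled = S / (n - len(groups))
    return means, np.linalg.inv(S_pooled)

def classify(X, means, S_inv):
    # row-wise quadratic forms (x - m)' S^{-1} (x - m) for each group mean
    d2 = np.column_stack([
        np.einsum('ij,jk,ik->i', X - m, S_inv, X - m) for m in means.values()
    ])
    labels = np.array(list(means.keys()))
    return labels[d2.argmin(axis=1)]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 1.0, size=(50, 3)) for loc in (0.0, 1.5, 3.0)])
y = np.repeat([0, 1, 2], 50)
means, S_inv = fit_pooled(X, y)
print("training APER:", (classify(X, means, S_inv) != y).mean())
```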
68.2.2 Linear Discrimination by Logit Regression

Logistic regression has been extensively used in family studies and the social sciences. It has found two broad applications in applied research: classification (predicting group membership) and profiling (differentiating between two groups based on certain factors) (Tansey et al. 1996; Shmueli et al. 2006). In general, the logistic regression model has the form

log(p/(1−p)) = β_0 + β_1x_1 + β_2x_2 + ... + β_nx_n = xβ   (68.1)

where p is the probability of the outcome of interest, β_0 is an intercept term, β_i is the coefficient associated with the corresponding independent (explanatory) variable x_i, x = (1, x_1, x_2, ..., x_n), and β = (β_0, β_1, ..., β_n)′. The probability of the outcome of interest, p, is expressed as a non-linear function of the predictors in the form

p = 1/(1 + e^{−(β_0 + β_1x_1 + β_2x_2 + ... + β_nx_n)})   (68.2)

Equation (68.2) ensures that the right-hand side will always lead to values within the interval [0, 1]. This is called the logistic response function. In order to fit this model, the method of maximum likelihood is used to estimate the parameters of the multiple logistic response function. In this paper, we use multinomial logistic regression with the cumulative logit method. For example, in our financial services segmentation we have three classes: prime, highly valued, and price shoppers. We denote them by 0 = price shoppers, 1 = highly valued, and 2 = prime, and we look at the cumulative probabilities of class membership for segmentation. The probabilities estimated by the model are P(X ≤ 0), i.e., the probability of a customer being a price shopper, and P(X ≤ 1), i.e., the probability of a price shopper or highly valued customer. The three non-cumulative probabilities of class membership can be easily derived from the two cumulative probabilities:

P(X = 0) = P(X ≤ 0)
P(X = 1) = P(X ≤ 1) − P(X ≤ 0)
P(X = 2) = 1 − P(X ≤ 1)

Hence, for the case of three classes and three independent variables, we would have

P(X = 0) = P(X ≤ 0) = 1/(1 + e^{−(β_0 + β_1x_1 + β_2x_2 + β_3x_3)})

P(X = 1) = P(X ≤ 1) − P(X ≤ 0) = 1/(1 + e^{−(α_0 + α_1x_1 + α_2x_2 + α_3x_3)}) − 1/(1 + e^{−(β_0 + β_1x_1 + β_2x_2 + β_3x_3)})

P(X = 2) = 1 − P(X ≤ 1) = 1 − 1/(1 + e^{−(α_0 + α_1x_1 + α_2x_2 + α_3x_3)})
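A compact way to fit a cumulative logit model is to maximize the implied multinomial likelihood directly. The sketch below is ours, on simulated data; for brevity it uses the proportional-odds simplification, in which the two cumulative logits share slopes and differ only in their cutpoints, rather than the fully separate α and β vectors written above.

```python
# Cumulative (ordered) logit fit by maximum likelihood.
# A sketch using the proportional-odds form; data are simulated.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit   # logistic function

def neg_loglik(params, X, y):
    c0, dc, w = params[0], params[1], params[2:]
    c1 = c0 + np.exp(dc)                 # enforce c1 > c0
    eta = X @ w
    P0 = expit(c0 - eta)                 # P(class <= 0)
    P1 = expit(c1 - eta)                 # P(class <= 1)
    probs = np.column_stack([P0, P1 - P0, 1.0 - P1])
    return -np.log(probs[np.arange(len(y)), y] + 1e-12).sum()

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))            # simulated income, loan activity, age
latent = X @ np.array([1.0, 0.8, -0.5]) + rng.logistic(size=300)
y = np.digitize(latent, [-0.5, 1.0])     # 0 = price shopper, 1 = valued, 2 = prime
res = minimize(neg_loglik, x0=np.zeros(2 + X.shape[1]), args=(X, y), method="BFGS")
print("estimated cutpoints and slopes:", res.x)
```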
68.2.3 Mathematical Programming

In this paper, we use a mathematical programming (MP) method to classify the customers into three groups (Lam et al. 1996). This method does not make the rigid assumptions that some of the statistical methods do. It utilizes constrained weights, which are generated using the standard evolutionary solver; the weights are used to develop a cut-off discriminant score for each observation, which classifies the observations into three groups. An advantage of the MP method in classification is its ability to weight individual observations, which is not possible with statistical methods (Freed and Glover 1981). Unlike the statistical techniques, this method is also devoid of assumptions such as multivariate normality, absence of multicollinearity, and absence of specification errors (Meyers et al. 2006).

Let the mean of the jth variable for group k, k = 1, 2, ..., m, be

x̄_j(k) = (Σ_{i∈G_k} x_ij)/n_k

where n_k is the number of observations in G_k and m is the number of designated groups. Let n be the total number of observations, n = n_1 + n_2 + ... + n_m. We consider a classification problem with q variables and n total observations in the sample. For each pair (u, v), where u = 1, ..., m−1 and v = u+1, ..., m, the Minimize the Sum of Deviations (MSD) formulation is:

Minimize Σ_{i∈G_u, i∈G_v} d_i   (68.3)

s.t.

Σ_{j=1}^q w_j(x_ij − x̄_j(u)) + d_i ≥ 0, for all i ∈ G_u   (68.4)

Σ_{j=1}^q w_j(x_ij − x̄_j(v)) − d_i ≤ 0, for all i ∈ G_v   (68.5)

Σ_{j=1}^q w_j(x̄_j(u) − x̄_j(v)) = 1   (68.6)

where
w_j = weights, for j = 1, ..., q, unrestricted in sign;
x_ij = value of the jth variable for the ith observation in the sample;
k = group index, k = 1, 2, ..., m;
(u, v) = any pair of groups, u = 1, ..., m−1 and v = u+1, ..., m (that is, we solve an LP for each pair of groups to generate weights);
d_i ≥ 0, for i ∈ G_u and i ∈ G_v, is the deviation of an individual observation from the cut-off score.

The objective function (68.3) minimizes the sum of all the deviations (MSD). Constraints (68.4) and (68.5) force the classification scores of the objects in G_k to be as close to the mean classification score of group k as possible, by minimizing d_i for i ∈ G_k. Constraint (68.6) is a normalization constraint that avoids trivial values for the discriminant weights. For each pair (u, v), we use the w_j values obtained from the LP solution for that pair to compute the classification scores, S_i, of the observations in G_u and G_v. Then all the cut-off values c_uv, where u = 1, ..., m−1 and v = u+1, ..., m, are determined by solving the following LP problem:

Minimize Σ_{u=1}^{m−1} Σ_{v=u+1}^{m} Σ_{i∈G_u} d_i^{uv} + Σ_{u=1}^{m−1} Σ_{v=u+1}^{m} Σ_{i∈G_v} d_i^{uv}   (68.7)

s.t.

S_i^{uv} + d_i^{uv} ≥ c_uv, for u = 1, ..., m−1; v = u+1, ..., m; i ∈ G_u   (68.8)

S_i^{uv} − d_i^{uv} ≤ c_uv, for u = 1, ..., m−1; v = u+1, ..., m; i ∈ G_v   (68.9)

where all c_uv are unrestricted in sign, all d_i^{uv} ≥ 0, and S_i = Σ_{j=1}^q w_j x_ij for i ∈ G_k. We have used the Standard Evolutionary Solver to solve this problem. The Standard Evolutionary Solver has several advantages over the Standard Solver, such as its ability to handle complex constraints and objective functions and its ability to eventually find a globally optimal solution. However, the Evolutionary Solver takes much longer than the Standard Solver to find a final solution.
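For one pair of groups (u, v), the MSD program in Equations (68.3)-(68.6) is a small linear program. The sketch below is ours, on simulated groups, and it uses an off-the-shelf LP solver in place of the chapter's Evolutionary Solver; the sign conventions and the normalization constraint follow the formulation as reconstructed above.

```python
# MSD linear program for one pair of groups (u, v), Eqs. (68.3)-(68.6).
# A sketch; scipy's LP solver stands in for the Evolutionary Solver.
import numpy as np
from scipy.optimize import linprog

def msd_weights(Xu, Xv):
    q = Xu.shape[1]
    nu, nv = len(Xu), len(Xv)
    mu_u, mu_v = Xu.mean(axis=0), Xv.mean(axis=0)
    n = nu + nv
    # variable vector: [w_1..w_q, d_1..d_n]
    c = np.concatenate([np.zeros(q), np.ones(n)])       # minimize sum of d_i
    A1 = np.hstack([-(Xu - mu_u), -np.eye(n)[:nu]])     # Eq. (68.4) rearranged
    A2 = np.hstack([(Xv - mu_v), -np.eye(n)[nu:]])      # Eq. (68.5)
    A_ub = np.vstack([A1, A2])
    b_ub = np.zeros(n)
    A_eq = np.concatenate([mu_u - mu_v, np.zeros(n)])[None, :]  # Eq. (68.6)
    b_eq = np.array([1.0])
    bounds = [(None, None)]*q + [(0, None)]*n           # w free, d >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:q]

rng = np.random.default_rng(3)
Xu = rng.normal(0.0, 1.0, size=(30, 3))
Xv = rng.normal(1.5, 1.0, size=(30, 3))
print("discriminant weights:", msd_weights(Xu, Xv))
```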
68.3 Evaluating the Classification Function

One important way of judging the performance of any classification procedure is to calculate its error rates, or misclassification probabilities. The performance of a sample classification function can be evaluated by calculating the Actual Error Rate (AER). The AER indicates how the sample classification function will perform in future samples. Like the optimal error rate, it cannot be calculated, because it depends on an unknown density function; however, an estimate of a quantity related to the AER can be calculated.

There is a measure of performance that does not depend on the form of the parent population and can be calculated for any classification procedure: the Apparent Error Rate (APER). It is defined as the fraction of observations in the training sample that are misclassified by the sample classification function. The APER can be easily calculated from the confusion matrix, which shows actual versus predicted group membership. For n_1 observations from Π_1, n_2 observations from Π_2, and n_3 observations from Π_3, the confusion matrix is given in Table 68.1 (Morrison 1969). The apparent error rate is thus

APER = 1 − (n_11 + n_22 + n_33)/n

or, in other words, the proportion of items in the training set that are misclassified, where n = n_1 + n_2 + n_3. The APER is intuitively appealing and easy to calculate. Unfortunately, it tends to underestimate the AER, and the problem does not disappear unless the sample sizes n_1, n_2, and n_3 are very large. This overly optimistic estimate occurs because the data used to build the classification function are also used to evaluate it. Error rate estimates can be constructed so that they are better than the apparent error rate, are easy to calculate, and do not require distributional assumptions. One such evaluation procedure is to split the total sample into a training sample and a validation sample: the training sample is used to construct the classification function, and the validation sample is used to evaluate it. The error rate is determined by the proportion misclassified in the validation sample. This method overcomes the bias problem by not using the same data to both build and judge the classification function. There are two main problems with this method: (1) it requires large samples, and (2) the function evaluated is not the function of interest, because some data are lost.
Table 68.1 Calculating misclassification rates

                      Predicted membership
Actual membership     Π_1     Π_2     Π_3     Total
Π_1                   n_11    n_12    n_13    n_1
Π_2                   n_21    n_22    n_23    n_2
Π_3                   n_31    n_32    n_33    n_3

Here n_11 is the number of Π_1 items correctly classified as Π_1; n_12 the number of Π_1 items misclassified as Π_2; n_13 the number of Π_1 items misclassified as Π_3; n_21 the number of Π_2 items misclassified as Π_1; n_22 the number of Π_2 items correctly classified; n_23 the number of Π_2 items misclassified as Π_3; n_31 the number of Π_3 items misclassified as Π_1; n_32 the number of Π_3 items misclassified as Π_2; and n_33 the number of Π_3 items correctly classified.
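In code, the APER is one line once the confusion matrix has been tallied. A small sketch of ours:

```python
# Apparent error rate from a confusion matrix (Table 68.1).
import numpy as np

def aper(actual, predicted, k=3):
    cm = np.zeros((k, k), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1                     # rows: actual, cols: predicted
    return 1.0 - np.trace(cm) / cm.sum()  # APER = 1 - (n11+n22+n33)/n

actual    = np.array([0, 0, 1, 1, 2, 2, 2, 1])
predicted = np.array([0, 1, 1, 1, 2, 0, 2, 2])
print(f"APER = {aper(actual, predicted):.3f}")   # 3 of 8 misclassified -> 0.375
```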
68.4 Experimental Design

We have used an individual's income, loan activity, and age as explanatory variables. The response variable in our model is a multi-group variable indicating whether customers are prime customers, highly valued customers, or price shoppers. To evaluate how the model would perform, a simulation experiment was conducted based on the characteristics of a real dataset. We chose discriminant analysis by the Mahalanobis method (DA), logistic regression (LR), and mathematical programming (MP) as the three discriminant models (West et al. 1997). The discriminant analysis model and the logistic regression model were developed using Minitab and STATPRO, respectively, and we use the Evolutionary solver to solve the MP model, with the objective of minimizing the sum of deviations. The data were generated from a multivariate normal distribution using two different 3 × 3 covariance matrices (correlations), calibrated to the characteristics of a real dataset. The two covariance matrices correspond to high and low correlations between the three explanatory variables, i.e., income, loan activity, and age (Lam and Moy 2003). One hundred replications of 100 observations were generated for both cases: for each replication, sixty observations were generated as the training sample and another forty as the validation sample. All three approaches were used in all replications of the two cases, as sketched below. The APER (percentage of incorrectly classified observations) in both the training and validation samples, and their standard deviations, are reported in Table 68.2.
68.5 Results

We used paired t-tests to test the differences between the average APERs of the three approaches, and F-tests to test the ratios of the population variances of the average APERs. In general, the classification performance of the MP and LR models is similar, but both were shown to be superior to discriminant analysis (Mahalanobis), as seen in Table 68.2. For both correlation cases, the performance of the MP and LR models on both the training and validation sets was clearly superior to the DA model. The MP and LR models are also comparatively more robust than the DA model in this experiment, as is evident in their lower APER standard deviations in most cases for the training and validation sets. In such a comparison study, the performance on the validation set is considered the true indicator of a model's performance. Again, the performance of the MP and LR methods on the validation set is superior to the DA statistical technique; moreover, the MP model shows more robustness compared to the LR model, thereby reinforcing the strength of the MP model on this dataset.

The results of the MP model vis-à-vis the DA model are encouraging for several reasons. First, the methodology achieved lower APERs than the others on the training sets in both cases. This was corroborated by the paired t-tests, which indicated significant differences between MP and DA, and between MP and LR, for most cases in the validation test. Since lower APERs on the validation set are deemed a good check on the external validity of the classification function, we feel that the MP model's performance was comparable to the LR model on this count.
Table 68.2 Model results: apparent error rates (APER)

                                  DA                    MP                    LR
Case               Sample        Mean (%)  Stdev (%)   Mean (%)  Stdev (%)   Mean (%)  Stdev (%)
High correlation   Training      4.0       1.0         0.0       0.0         3.0       2.0
                   Validation    10.0      9.0         9.0       11.0        6.0       4.0
Low correlation    Training      5.0       2.0         0.0       0.0         3.0       2.0
                   Validation    8.0       7.0         3.0       5.0         5.0       4.0

Paired t-tests were used to compare the average APER of MP against each of the other two approaches, with H_0: MP = i, where i = 1 (discriminant analysis) or 2 (logistic regression), and H_a: MP < i. F-tests were used to test the ratio of the two population variances of the average APER between MP and approach i, with H_0: σ²_MP/σ²_i = 1 and H_a: σ²_MP/σ²_i < 1. The significance level used in both tests: α = 0.01. Reject H_0 at α = 0.1.
Secondly, these results were achieved with relatively small samples. Finally, the model makes no rigid assumptions about the functional form and did not require large datasets.
68.6 Conclusions

This paper examines the properties of MP, discriminant analysis (the Mahalanobis method), and logistic regression for classifying customers into three groups: prime customers, highly valued customers, and price shoppers. We have presented a general framework for understanding the role of the three methods for this problem. While traditional statistical methods work well in some situations, they may not be robust in all situations; mathematical programming models are an alternative tool for solving such problems. We compared the simulation results for these three approaches and found that MP provides significantly and consistently lower APERs for both the training and validation sets of data. In this paper, we used a three-group classification problem with three explanatory variables; future research could extend the problem to include more explanatory variables. Furthermore, we have assumed that the variables follow a normal distribution. This assumption might be relaxed to judge the performance of the MP and other statistical methods.
References

Altman, E. 1968. "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy." Journal of Finance 23(4), 589-609.
Anderson, J. and R. Narasimhan. 1979. "Assessing project implementation risk: a methodological approach." Management Science 25(6), 512-521.
Awh, R. V. and D. Waters. 1974. "A discriminant analysis of economic, demographic and attitudinal characteristics of bank charge-card holders: a case study." Journal of Finance 29, 973-980.
Capon, N. 1978. "Discrimination in screening of credit applicants." Harvard Business Review 56(3), 8-12.
Dillon, W., R. Calantone, and P. Worthing. 1979. "The new product problem: an approach for investigating product failures." Management Science 25(12), 1184-1196.
Doumpos, M. and C. Zopounidis. 2002. Multicriteria decision aid classification methods, Kluwer Academic Publishers, Boston, MA.
Fisher, R. 1936. "The use of multiple measurements in taxonomic problems." Annals of Eugenics 7, 179-188.
Freed, N. and F. Glover. 1981. "Simple but powerful goal programming models for discriminant problems." European Journal of Operational Research 7, 44-66.
Gallant, S. 1988. "Connectionist expert systems." Communications of the ACM 31(2), 152-169.
Lam, K. F. and J. W. Moy. 2003. "A simple weighting scheme for classification in two-group discriminant problems." Computers and Operations Research 30, 155-164.
Lam, K., E. Choo, and J. Moy. 1996. "Improved linear programming formulations for the multi-group discriminant problem." Journal of the Operational Research Society 47(12), 1526-1529.
Mangasarian, O. 1965. "Linear and nonlinear separation of patterns by linear programming." Operations Research 13, 444-452.
Meyers, L., G. Gamst, and A. Guarino. 2006. Applied multivariate research, Sage Publications, Thousand Oaks, CA.
Morrison, D. G. 1969. "On the interpretation of discriminant analysis." Journal of Marketing Research 6, 156-163.
Myers, J. H. and E. W. Forgy. 1963. "The development of numerical credit evaluation systems." Journal of the American Statistical Association 58, 799-806.
Robertson, T. and J. Kennedy. 1968. "Prediction of consumer innovators: multiple discriminant analysis." Journal of Marketing Research 5, 64-69.
Shmueli, G., N. Patel, and P. Bruce. 2006. Data mining for business intelligence: concepts, techniques, and applications in Microsoft Office Excel with XLMiner, Wiley, Hoboken, NJ.
Tam, K. and M. Kiang. 1992. "Managerial applications of neural networks: the case of bank failure predictions." Management Science 38(7), 926-947.
Tansey, R., M. White, and R. Long. 1996. "A comparison of loglinear modeling and logistic regression in management research." Journal of Management 22(2), 339-358.
Thomas, L., K. Jung, S. Thomas, and Y. Wu. 2006. "Modeling consumer acceptance probabilities." Expert Systems with Applications 30, 499-506.
Wedel, M. and W. Kamakura. 1998. Market segmentation: conceptual and methodological foundations, Kluwer Academic Publishers, Boston, MA.
West, P. M., P. L. Brockett, and L. L. Golden. 1997. "A comparative analysis of neural networks and statistical methods for predicting consumer choice." Marketing Science 16(4), 370-391.
Westin, R. 1973. "Predictions from binary choice models." Discussion Paper No. 37, Northwestern University, Evanston, IL.
Wiginton, J. C. 1980. "A note on the comparison of logit and discriminant models of consumer credit behavior." The Journal of Financial and Quantitative Analysis 15(3), 757-770.
Zahavi, J. and N. Levin. 1997. "Applying neural computing to target marketing." Journal of Direct Marketing 11(4), 76-93.
Chapter 69
Spurious Regression and Data Mining in Conditional Asset Pricing Models Wayne Ferson, Sergei Sarkissian, and Timothy Simin
Abstract Stock returns are not highly autocorrelated, but there exists a spurious regression bias in predictive regressions for stock returns, similar to the classic studies of Yule (Journal of the Royal Statistical Society 89, 1-64, 1926) and Granger and Newbold (Journal of Econometrics 2, 111-120, 1974). Data mining for predictor variables reinforces spurious regression bias because more highly persistent series are more likely to be selected as predictors. In asset pricing regression models with conditional alphas and betas, the biases occur only in the estimates of conditional alphas. However, when time-varying alphas are suppressed and only time-varying betas are considered, the betas become biased. The analysis shows that the significance of many standard predictors of stock returns and time-varying alphas in the literature is often overstated.

Keywords Asset allocation · Monte Carlo simulations · Persistent instruments · Predicting stock returns · Time-varying returns
69.1 Introduction

Predictive models for common stock returns have long been a staple of financial economics. As reviewed by Fama (1970), early studies used such models to examine market efficiency. Stock returns are assumed to be predictable, based on lagged instrumental variables, in the current conditional asset pricing literature. The simplest predictive model is a regression for the future stock return, r_{t+1}, on a lagged predictor variable:

r_{t+1} = a + bZ_t + v_{t+1}   (69.1)

Standard lagged variables include the levels of short-term interest rates, payout-to-price ratios for stock market indexes, and yield spreads between low-grade and high-grade bonds or between long- and short-term bonds. Table 69.1 surveys major studies that propose predictor variables. Many of these variables behave as persistent, or highly autocorrelated, time series. We study the finite sample properties of stock return predictive regressions with persistent lagged regressors.

Regression models for stock or portfolio returns on contemporaneously measured market-wide factors have also long been a staple of financial economics. Such factor models are used in event studies (e.g., Fama et al. 1969), in tests of asset pricing theories such as the Capital Asset Pricing Model (CAPM, Sharpe 1964), and in other applications. For example, when the market return r_m is the factor, the regression model for the return r_{t+1} is:

r_{t+1} = α + βr_{m,t+1} + u_{t+1}   (69.2)

where E(u_{t+1}) = E(u_{t+1}r_{m,t+1}) = 0. The slope coefficients are the "betas," which measure the market-factor risk. When the returns are measured in excess of a reference asset like a risk-free Treasury bill return, the intercepts are the "alphas," which measure the expected abnormal return. For example, when r_m is the market portfolio excess return, the CAPM implies that α = 0, and the model is evaluated by testing that null hypothesis. Recent work in conditional asset pricing allows for time-varying betas, modeled as linear functions of lagged predictor variables, following Maddala (1977). Prominent examples include Shanken (1990), Cochrane (1996), Ferson and Schadt (1996), Jagannathan and Wang (1996), and Lettau and Ludvigson (2001).

W. Ferson, University of Southern California, Los Angeles, CA, USA, e-mail: [email protected]
S. Sarkissian, Associate Professor of Finance, Area Coordinator and Desmarais Faculty Scholar, McGill University, Faculty of Management, Montreal, QC H3A1G5, Canada, e-mail: [email protected]
T. Simin, The Pennsylvania State University, University Park, PA, USA, e-mail: [email protected]

This chapter is based in part on two published papers by the authors: (i) "Spurious Regressions in Financial Economics?" (Journal of Finance, 2003), and (ii) "Asset Pricing Models with Conditional Betas and Alphas: The Effects of Data Snooping and Spurious Regression" (Journal of Financial and Quantitative Analysis, 2007). In addition, we are grateful to Raymond Kan for many helpful suggestions.
Table 69.1 Common instrumental variables: sources, summary statistics, and OLS regression results

Reference                     Predictor        Period       Obs   ρ_Z    σ_Z      β        t        R²      HAC
Breen et al. (1989)           TB1y             5404-8612    393   0.97   0.0026   -2.49    -3.58    0.023   NW(5)
Campbell (1987)               Two-one          5906-7908    264   0.32   0.0006   11.87    2.38     0.025   NW(0)
                              Six-one          5906-7908    264   0.15   0.0020   2.88     2.13     0.025   NW(0)
                              Lag(two)-one     5906-7908    264   0.08   0.0010   9.88     2.67     0.063   NW(6)
Fama (1990)                   ALLy-AAAy        5301-8712    420   0.97   0.0040   0.88     1.46     0.005   MA(0)
Fama and French (1988a)       Dividend yield   2701-8612    720   0.97   0.0013   0.40     1.36     0.007   MA(9)
Fama and French (1989)        AAAy-TB1y        2601-8612    732   0.92   0.0011   0.51     2.16     0.007   MA(9)
Keim and Stambaugh (1986)     UBAAy            2802-7812    611   0.95   0.0230   1.50     0.75     0.002   MA(9)
                              UBAAy-TB1y       2802-7812    611   0.97   0.0320   1.57     1.48     0.007   MA(9)
Kothari and Shanken (1997)    DJBM             1927-1992    66    0.66   0.2270   0.28     2.63     0.078   MA(0)
Lettau and Ludvigson (2001)   "Cay"            52Q4-98Q4    184   0.79   0.0110   1.57     2.58     0.057   MA(7)
Pontiff and Schall (1998)     DJBM             2602-9409    824   0.97   0.2300   2.96     2.16     0.012   MA(9)
                              SPBM             5104-9409    552   0.98   0.0230   9.32     1.03     0.001   MA(5)

This table summarizes variables used in the literature to predict stock returns. The first column indicates the published study. The second column denotes the lagged instrument. The next two columns give the sample period (Period) and the number of observations (Obs) on the stock returns. Columns five and six report the autocorrelation (ρ_Z) and the standard deviation (σ_Z) of the instrument, respectively. The next three columns report regression results for the Standard & Poor's 500 excess return on a lagged instrument: the slope coefficient is β, the t-statistic is t, and the coefficient of determination is R². The last column (HAC) reports the method used in computing the standard errors of the slopes; the method of Newey and West (1987) is used with the number of lags given in parentheses. The abbreviations are as follows: TB1y is the yield on the 1-month Treasury bill. Two-one, Six-one, and Lag(two)-one are computed as the spreads on the returns of the 2- and 1-month bills, the 6- and 1-month bills, and the lagged value of the 2-month and current 1-month bill. The yield on all corporate bonds is denoted ALLy. The yield on AAA-rated corporate bonds is AAAy, and UBAAy is the yield on corporate bonds rated below BAA. The variable "Cay" is a linear function of consumption, asset wealth, and labor income. The book-to-market ratios for the Dow Jones Industrial Average and the S&P 500 are DJBM and SPBM, respectively.
The time-varying beta coefficient is β_t = b_0 + b_1Z_t, where Z_t is a lagged predictor variable. In some cases, the intercept, or conditional alpha, is also time-varying, as α_t = α_0 + α_1Z_t (e.g., Christopherson et al. 1998). This results in the following regression model:

r_{t+1} = α_0 + α_1Z_t + b_0r_{m,t+1} + b_1r_{m,t+1}Z_t + u_{t+1}   (69.3)

where E(u_{t+1}) = E(u_{t+1}[Z_t, r_{m,t+1}]) = 0. The conditional CAPM implies that α_0 = 0 and α_1 = 0. This chapter also studies the finite-sample properties of asset pricing model regressions like (69.3) when there are persistent lagged regressors.
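In estimation terms, Equation (69.3) is an OLS regression of the excess return on a constant, the lagged instrument, the market excess return, and their product. A minimal sketch of ours follows, on simulated data in which the true conditional alpha is zero and the true beta is 1 + 0.5Z_t; the parameter values are illustrative assumptions.

```python
# Conditional CAPM regression, Eq. (69.3): alpha and beta both linear in the
# lagged instrument Z_t. A sketch on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
T = 600
Z = np.zeros(T)
for t in range(1, T):                        # persistent instrument
    Z[t] = 0.97*Z[t-1] + rng.normal(scale=0.1)
rm = rng.normal(0.006, 0.04, size=T-1)       # market excess return
r = (1.0 + 0.5*Z[:-1])*rm + rng.normal(scale=0.03, size=T-1)

X = sm.add_constant(np.column_stack([Z[:-1], rm, rm*Z[:-1]]))
fit = sm.OLS(r, X).fit()
for name, est, tv in zip(["alpha0", "alpha1", "b0", "b1"], fit.params, fit.tvalues):
    print(f"{name:6s} {est:8.4f}  (t = {tv:5.2f})")
```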
The rest of the chapter is organized as follows. Section 69.2 discusses the issues of data mining and spurious regression in the simple predictive regression (69.1). Section 69.3 discusses the impact of spurious regression and data mining on conditional asset pricing. Section 69.4 describes the data. Section 69.5 presents the models used in the simulation experiments. Section 69.6 presents the simulation results for predictive regressions. Section 69.7 presents the simulation results for various forms of conditional asset pricing models. Section 69.8 discusses and evaluates solutions to the problems of spurious regression and data mining. Section 69.9 examines the robustness of the results. Section 69.10 concludes.
69.2 Spurious Regression and Data Mining in Predictive Regressions

In our analysis of regressions like Equation (69.1) that attempt to predict stock returns, we focus on two issues. The first is spurious regression, analogous to Yule (1926) and Granger and Newbold (1974). These studies warned that spurious relations may be found between the levels of trending time series that are actually independent. For example, given two independent random walks, it is likely that a regression of one on the other will produce a "significant" slope coefficient, evaluated by the usual t-statistics. Because stock returns are not highly autocorrelated, one might think that spurious regression problems are unlikely for stock returns. However, the returns may be considered as the sum of an unobserved expected return plus unpredictable noise. If the underlying expected returns are persistent time series, there is still a risk of spurious regression. Because the unpredictable noise represents a substantial portion of the variance of stock returns, the spurious regression will differ from the classical setting.

The second issue is "naïve data mining," as studied for stock returns by Lo and MacKinlay (1990), Foster et al. (1997), and others. If the standard instruments employed in
the literature arise as the result of a collective search through the data, they may have no predictive power in the future. Stylized "facts" about the dynamic behavior of stock returns based on these instruments (e.g., Cochrane 1999) could be artifacts of the sample. Such concerns are natural, given the widespread interest in predicting stock returns. Not all data mining is naïve; in fact, increasing computing power and data availability have allowed the development of some very sophisticated data mining (for statistical foundations, see Hastie et al. 2001). We focus on spurious regression and the interaction between data mining and spurious regression bias. If the underlying expected return is not predictable over time, there is no spurious regression bias, even if the chosen regressor is highly autocorrelated. The reason is that, under the null hypothesis that there is no predictability, the autocorrelation of the regression errors is the same as that of the left-hand-side asset returns. In this case, our analysis reduces to pure data mining as studied by Foster et al. (1997).

The spurious regression and data mining effects reinforce each other. If researchers have mined the data for regressors that produce high t-statistics in predictive regressions, then mining is more likely to uncover the spurious, persistent regressors. The standard regressors in the literature tend to be highly autocorrelated, as expected if the regressors result from this kind of "spurious mining" process. For reasonable parameter values, all the regressions that we review from the literature are consistent with a spurious mining process, even when only a small number of instruments is considered in the mining. While data mining amplifies the problem of spurious regression, persistent lagged variables and spurious regression also magnify the impact of data mining. As a consequence, we show that standard corrections for data mining are inadequate in the presence of persistent lagged variables. These results have profound potential implications for asset pricing regressions, because the conditional asset pricing literature has, for the most part, used variables that were discovered based on predictive regressions like Equation (69.1). It is therefore important to examine how data mining and spurious regression biases influence asset pricing regressions.
69.3 Spurious Regression, Data Mining, and Conditional Asset Pricing

The conditional asset pricing literature using regressions like Equation (69.3) has evolved from the literature on pure predictive regressions. First, studies identified lagged variables that appear to predict stock returns. Later studies, beginning
with Gibbons and Ferson (1985), used the same variables to study asset pricing models. Thus, it is reasonable to presume that data mining is directed at the simpler predictive regressions. The question is now: how does this affect the validity of the subsequent asset pricing research that uses these variables in regressions like Equation (69.3)?

Table 69.2 summarizes representative studies that use the regression model (69.3). It lists the sample period, number of observations, and the lagged instruments employed. It also indicates whether the study uses the full model (69.3), with both time-varying betas and alphas, or restricted versions of the model in which either the time-varying betas or time-varying alphas are suppressed. Finally, the table summarizes the largest t-statistics for the coefficients α_1 and b_1 reported in each study. If we find that the largest t-statistics are insignificant in view of the joint effects of spurious regression and data mining, then none of the coefficients are significant. We return to this table later and revisit the evidence.

Using regression models like Equation (69.3), the literature has produced a number of "stylized facts." First, studies typically find that the intercept is smaller in the "conditional" model (69.3) than in the "unconditional" model (69.2): |α| > |α_0|. The interpretation of these studies is that the conditional CAPM does a better job of "explaining" average returns than the unconditional CAPM. Examples with this finding include Cochrane (1996), Ferson and Schadt (1996), Ferson and Harvey (1997, 1999), Lettau and Ludvigson (2001), and Petkova and Zhang (2005). Second, studies typically find evidence of time-varying betas: the coefficient estimate for b_1 is statistically significant. Third, studies typically find that the conditional models fail to completely explain the dynamic properties of returns: the coefficient estimate for α_1 is significant, indicating a time-varying alpha. Our objective is to study the reliability of such inferences in the presence of persistent lagged instruments and data mining.
69.4 The Data

Table 69.1 surveys nine of the major studies that propose instruments for predicting stock returns. The table reports summary statistics for monthly data, covering various subperiods of 1926 through 1998. The sample size and period depend on the study and the variable; the table provides the details. We attempt to replicate the data series that were used in these studies as closely as possible, and the summary statistics are from our data. Note that the first-order autocorrelations of the predictor variables frequently suggest a high degree of persistence. For example, short-term Treasury-bill
Table 69.2 Representative studies on conditional asset pricing models

Reference                      Predictor(s)              Period      Obs   α_t   β_t   |α_c/α_u|   α_1     t(α_1)   b_1     t(b_1)
Shanken (1990)                 TB1y                      5301-8212   360   Yes   Yes   NA          0.48    1.17     1.42    5.92
                               TB1vol                                                              5.70    3.56     8.40    4.42
Cochrane (1996)                DY                        47Q1-93Q4   188   No    Yes   NA          None    None     0.53    4.74
                               Term                                                                None    None     0.31    1.76
Ferson and Schadt (1996)       TB1y, DY, Term, Default   6801-9012   276   No    Yes   0.72        None    None     NA      NA
Jagannathan and Wang (1996)    Default                   6307-9012   330   Yes   No    1.53        3.10    2.01     None    None
Christopherson et al. (1998)   TB1y, DY, Term            7901-9012   144   Yes   Yes   0.77        3.72    1.85     NA      NA
Lettau and Ludvigson (2001)    "Cay"                     63Q3-98Q3   144   Yes   Yes   0.84        65.7    0.21     1.22    0.21
Petkova and Zhang (2005)       TB1y, DY, Term, Default   2701-0112   900   No    Yes   0.97        None    None     NA      NA

This table summarizes representative conditional asset pricing and performance evaluation studies. The first column indicates the published study. The second column specifies the lagged instruments used. The next two columns give the sample period (Period) and the number of observations (Obs) on the stock returns. Columns five and six indicate whether the conditional model includes a time-varying alpha (α_t) and a time-varying beta (β_t), respectively. The last five columns summarize the regression results. Column seven shows the ratio of the intercept (a pricing error) in the conditional model to that of the unconditional model. Columns 8 and 9 report the point estimates of the time-varying alpha coefficient, α_1, and the corresponding t-statistics, t(α_1). Columns 10 and 11 report the point estimates of the time-varying beta coefficient, b_1, and the corresponding t-statistics, t(b_1). For each predictor, the table reports the regression estimates corresponding to the largest t-statistics in absolute value. The abbreviations are as follows: TB1y and TB1vol are the yield and volatility of the 1-month Treasury bill, respectively. Three-one is the difference between the lagged returns of a 3-month and a 1-month T-bill. The variable "Cay" is a linear function of consumption, asset wealth, and labor income. DY is the dividend yield of the CRSP index. Term is a spread between long-term and short-term bonds. Default is a spread between low-grade and high-grade corporate bonds. "None" indicates parameter values not used in the study; "NA" indicates results not reported.
yields, monthly book-to-market ratios, the dividend yield of the S&P 500, and some of the yield spreads have sample first-order autocorrelations of 0.97 or higher.

Table 69.1 also summarizes regressions for the monthly return of the S&P 500 stock index, measured in excess of the 1-month Treasury-bill return from Ibbotson Associates, on the lagged instruments. These are OLS regressions using one instrument at a time. We report the slope coefficients, their t-ratios, and the adjusted R-squares. The R-squares range from less than 1% to more than 7%, and eight of the 13 t-ratios are larger than 2.0. The t-ratios are based on the OLS slopes and Newey and West (1987) standard errors, where the number of lags is chosen based on the number of statistically significant residual autocorrelations.¹
¹ Specifically, we compute 12 sample autocorrelations and compare their values with a cutoff of two approximate standard errors, 2/√T, where T is the sample size. The number of lags chosen is the minimum lag length at which no higher-order autocorrelation is larger than two standard errors. The number of lags chosen is indicated in the far right column.
The small R-squares in Table 69.1 suggest that predictability represents a tiny fraction of the variance in stock returns. However, even a small R-square can signal economically significant predictability. For example, Kandel and Stambaugh (1996) and Fleming et al. (2001) find that optimal portfolios respond by a substantial amount to small R-squares in standard models. Studies combining several instruments in multiple regressions report higher R-squares. For example, Harvey (1989), using five instruments, reports adjusted R-squares as high as 17.9% for size portfolios, while Ferson and Harvey (1991) report R-squares of 5.8% to 13.7% for monthly size and industry portfolio returns. These values suggest that the "true" R-square, if we could regress the stock return on its time-varying conditional mean, might be substantially higher than we see in Table 69.1. To accommodate this possibility, we allow the true R-square in our simulations to vary over the range from zero to 15%. For exposition we focus on an intermediate value of 10%.

To incorporate data mining, we compile a randomly selected sample of 500 potential instruments, through which our simulated analyst sifts to mine the data for predictor variables. All the data come from the web site http://www.economagic.com (Economic Time Series Page, maintained by Ted Bos). The sample consists of all monthly series listed on the main home page of the site, except under the headings
69 Spurious Regression and Data Mining in Conditional Asset Pricing Models
of LIBOR, Australia, Bank of Japan, and Central Bank of Europe. From the Census Bureau we exclude building permits by region, state, and metro areas (more than 4,000 series). From the Bureau of Labor Statistics we exclude all non-civilian labor force data and state, city, and international employment (more than 51,000 series). We use the Consumer Price Index (CPI) measures from the city average listings, but include no finer subcategories. The Producer Price Index (PPI) measures include the aggregates and the twodigit subcategories. From the Department of Energy we exclude data in Section 69.10, the International Energy series. We first randomly select (using a uniform distribution) 600 out of the 10,866 series that were left after the above exclusions. From these 600 we eliminated series that mixed quarterly and monthly data and extremely sparse series, and took the first 500 from what remained. Because many of the data are reported in levels, we tested for unit roots using an augmented Dickey-Fuller test (with a zero order time polynomial). We could not reject the hypothesis of a unit root for 361 of the 500 series and we replaced these series with their first differences. The 500 series are randomly ordered, and then permanently assigned numbers between one and 500. When a data miner in our simulations searches through, say 50 series, we use the sampling properties of the first 50 series to calibrate the parameters in the simulations. We also use our sample of potential instruments to calibrate the parameters that govern the amount of persistence in the “true” expected returns in the model. On the one hand, if the instruments we see in the literature, summarized in Table 69.1, arise from a spurious mining process, they are likely to be more highly autocorrelated than the underlying “true” expected stock return. On the other hand, if the instruments in the literature are a realistic representation of expected stock returns, the autocorrelations in Table 69.1 may be a good proxy for the persistence of the true expected returns.2 The mean autocorrelation of our 500 series is 15% and the median is 2%. Eleven of the 13 sample autocorrelations in Table 69.1 are higher than 15%, and the median value is 95%. We consider a range of values for the true autocorrelation based on these figures, as described below. 2 There are good reasons to think that expected stock returns may be persistent. Asset pricing models like the consumption model of Lucas (1978) describe expected stock returns as functions of expected economic growth rates. Merton (1973) and Cox et al. (1985) propose real interest rates as candidate state variables, driving expected returns in intertemporal models. Such variables are likely to be highly persistent. Empirical models for stock return dynamics frequently involve persistent, auto-regressive expected returns (e.g., Lo and MacKinlay, 1988; Conrad and Kaul, 1988; Fama and French, 1988b; or Huberman and Kandel, 1990).
69.5 The Models 69.5.1 Predictive Regressions In the model for the predictive regressions, the data are generated by an unobserved latent variable, Zt , as: rt C1 D C Zt C ut C1 ;
(69.4)
where ut C1 is white noise with variance, u 2 . We interpret the latent variable, Zt as the deviations of the conditional mean return from the unconditional mean, , where the expectations are conditioned on an unobserved “market” information set at time t. The predictor variables follow an autoregressive process:
(Z*_t, Z_t)′ = diag(ρ*, ρ) (Z*_{t−1}, Z_{t−1})′ + (ε*_t, ε_t)′,    (69.5)
where Z_t is the measured predictor variable, ρ is its autocorrelation, and ρ* is the autocorrelation of the true predictor Z*_t. The assumption that the true expected return is autoregressive (with parameter ρ*) follows previous studies such as Lo and MacKinlay (1988), Conrad and Kaul (1988), Fama and French (1988b), and Huberman and Kandel (1990). To generate the artificial data, the errors (ε*_t, ε_t) are drawn randomly as a normal vector with mean zero and covariance matrix Σ. We build up the time series of Z and Z* through the vector autoregression in Equation (69.5), where the initial values are drawn from a normal with mean zero and variances Var(Z) and Var(Z*). The other parameters that calibrate the simulations are {μ, σ_u², ρ, ρ*, Σ}. We have a situation in which the "true" returns may be predictable, if Z*_t could be observed. This is captured by the true R-squared, Var(Z*)/[Var(Z*) + σ_u²]. We set Var(Z*) to equal the sample variance of the S&P 500 return, in excess of a 1-month Treasury-bill return, multiplied by 0.10. When the true R-squared of the simulation is 10%, the unconditional variance of the r_{t+1} that we generate is equal to the sample variance of the S&P 500 return. When we choose other values for the true R-squared, these determine the values for the parameter σ_u². We set μ equal to the sample mean excess return of the S&P 500 over the 1926 through 1998 period, or 0.71% per month. The extent of the spurious regression bias depends on the parameters ρ and ρ*, which control the persistence of the measured and the true regressor. These values are determined by reference to Table 69.1 and from our sample of 500 potential instruments. The specifics differ across the special cases, as described below.
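As a concrete illustration, a minimal simulation of this data-generating process might look as follows. This is a sketch, not the authors' code; the S&P 500 variance input `var_sp` (set here to the later calibration value 0.057²) and the random-number generator handling are assumptions, and the true R-squared must be strictly positive.

```python
# Sketch of the DGP in Equations (69.4)-(69.5) with a diagonal covariance
# matrix, so the measured Z is independent of the true Z* (illustrative).
import numpy as np

def simulate_predictive(T, true_R2, rho_star, rho, var_sp=0.057**2,
                        mu=0.0071, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    var_zstar = 0.10 * var_sp                        # Var(Z*) calibration
    sig2_u = var_zstar * (1.0 - true_R2) / true_R2   # R2 = Var(Z*)/(Var(Z*)+sig2_u)
    z_star = np.empty(T)
    z = np.empty(T)
    z_star[0] = rng.normal(0.0, np.sqrt(var_zstar))  # initial draws from the
    z[0] = rng.normal(0.0, np.sqrt(var_zstar))       # stationary distribution
    for t in range(1, T):
        z_star[t] = rho_star * z_star[t - 1] + rng.normal(
            0.0, np.sqrt((1.0 - rho_star**2) * var_zstar))
        z[t] = rho * z[t - 1] + rng.normal(
            0.0, np.sqrt((1.0 - rho**2) * var_zstar))
    u = rng.normal(0.0, np.sqrt(sig2_u), T)
    r = mu + z_star[:-1] + u[1:]     # r_{t+1} = mu + Z*_t + u_{t+1}
    return r, z[:-1]                 # r_{t+1} paired with the measured Z_t
```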
While the stock return could be predicted if Z*_t could be observed, the analyst uses the measured instrument Z_t. If the covariance matrix Σ is diagonal, Z_t and Z*_t are independent, and the true value of δ in the regression (69.1) is zero. To focus on spurious regression in isolation, we specialize Equation (69.5) as follows. The covariance matrix Σ is a 2 × 2 diagonal matrix with variances (σ*², σ²). For a given value of ρ*, the value of σ*² is determined as σ*² = (1 − ρ*²)Var(Z*). The measured regressor has Var(Z) = Var(Z*). The autocorrelation parameters, ρ = ρ*, are allowed to vary over a range of values. (We also allow ρ and ρ* to differ from one another, as described below.) Following Granger and Newbold (1974), we interpret a spurious regression as one in which the "t-ratios" in the regression (69.1) are likely to indicate a significant relation when the variables are really independent. The problem may come from the numerator or the denominator of the t-ratio: the coefficient or its standard error may be biased. As in Granger and Newbold, the problem lies with the standard errors.3 The reason is simple to understand. When the null hypothesis that the regression slope δ = 0 is true, the error term of the regression Equation (69.1) inherits autocorrelation from the dependent variable. Assuming stationarity, the slope coefficient is consistent, but standard errors that do not account for the serial dependence correctly are biased. Because the spurious regression problem is driven by biased estimates of the standard error, the choice of standard error estimator is crucial. In our simulation exercises, it is possible to find an efficient unbiased estimator, since we know the "true" model that describes the regression error. Of course, this will not be known in practice. To mimic the practical reality, the analyst in our simulations uses the popular heteroskedasticity and autocorrelation consistent (HAC) standard errors from Newey and West (1987), with an automatic lag selection procedure. The number of lags is chosen by computing the autocorrelations of the estimated residuals and truncating the lag length when the sample autocorrelations become "insignificant" at longer lags. (The exact procedure is described in Footnote 1, and modifications to this procedure are discussed below.) This setting is related to Phillips (1986) and Stambaugh (1999).
3 While Granger and Newbold (1974) do not study the slopes and standard errors to identify the separate effects, our simulations designed to mimic their setting (not reported in the tables) confirm that their slopes are well behaved, while the standard errors are biased. Granger and Newbold use OLS standard errors, while we focus on the heteroskedasticity and autocorrelation-consistent standard errors that are more common in recent studies.
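The analyst's regression and its HAC t-ratio can be sketched as below. The lag-selection rule shown is a simplified stand-in for the procedure of Footnote 1 (truncating at the first residual autocorrelation inside a rough two-standard-error band), so its details are assumptions rather than the authors' exact rule.

```python
# Sketch of the analyst's Newey-West t-ratio with a simple automatic
# lag-selection rule (illustrative; not the authors' exact procedure).
import numpy as np
import statsmodels.api as sm

def hac_tstat(r, z, max_lags=12):
    X = sm.add_constant(np.asarray(z))
    resid = sm.OLS(r, X).fit().resid
    T = len(resid)
    nlags = 0
    for k in range(1, max_lags + 1):
        rho_k = np.corrcoef(resid[:-k], resid[k:])[0, 1]
        if abs(rho_k) < 2.0 / np.sqrt(T):   # rough 5% significance band
            break
        nlags = k
    fit = sm.OLS(r, X).fit(cov_type='HAC', cov_kwds={'maxlags': max(nlags, 1)})
    return fit.tvalues[1]                   # t-ratio on the lagged instrument
```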
Phillips derives asymptotic distributions for the OLS estimators of the regression (Equation (69.1)) in the case where ρ = 1, u_{t+1} ≡ 0, and {ε*_t, ε_t} are general independent, mean zero processes. We allow a nonzero variance of u_{t+1} to accommodate the large noise component of stock returns. We assume ρ* < 1 to focus on stationary, but possibly highly autocorrelated, regressors. Stambaugh (1999) studies a case where the errors {ε*_t, ε_t} are perfectly correlated, or equivalently, the analyst observes and uses the correct lagged stochastic regressor. A bias arises when the correlation between u_{t+1} and ε*_{t+1} is not zero, related to the well-known small-sample bias of the autocorrelation coefficient (e.g., Kendall 1954). In the pure spurious regression case studied here, the observed regressor Z_t is independent of the true regressor Z*_t, and u_{t+1} is independent of ε*_{t+1}. The Stambaugh bias is zero in this case. The point is that there is a problem in predictive regressions, in the absence of the bias studied by Stambaugh, because of spurious regression.
69.5.2 Conditional Asset Pricing Models

The data in our simulations of conditional asset pricing models are generated according to:

r_{t+1} = β_t r_{m,t+1} + u_{t+1},
β_t = 1 + Z*_t,    (69.6)
r_{m,t+1} = μ + k Z*_t + w_{t+1}.
Our artificial analyst uses the simulated data to run the regression model (Equation (69.3)), focusing on the t-statistics for the coefficients {α₀, α₁, b₀, b₁}. The variable Z*_t in Equation (69.6) is an unobserved latent variable that drives both expected market returns and time-varying betas. The term β_t in Equation (69.6) is a time-varying beta coefficient. As Z*_t has mean equal to zero, the expected value of beta is 1.0. When k ≠ 0 there is an interaction between the time variation in beta and the expected market risk premium. A common persistent factor drives the movements in both expected returns and conditional betas. Common factors in time-varying betas and expected market premiums are important in asset pricing studies such as Chan and Chen (1988), Ferson and Korajczyk (1995), and Jagannathan and Wang (1996), and in conditional performance evaluation, as in Ferson and Schadt (1996). There is a zero intercept, or "alpha," in the data generating process for r_{t+1}, consistent with asset pricing theory. The market return data, r_{m,t+1}, are generated as follows. The parameter μ was described earlier. The variance of the error is σ_w² = σ_sp² − k² Var(Z*), where σ_sp = 0.057 matches the S&P 500 return and Var(Z*) = 0.055 is the
estimated average monthly variance of the market betas on 58 randomly selected stocks from CRSP over the period 1926–1997.4 The predictor variables follow the autoregressive process (Equation (69.5)).
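To make this setup concrete, here is a minimal sketch of the conditional data-generating process and the regression (Equation (69.3)) run on one artificial sample. The idiosyncratic volatility `sig_u`, the HAC lag count, and the equality Var(Z) = Var(Z*) are illustrative assumptions, and k is kept small so that σ_w² stays positive.

```python
# Sketch of Equation (69.6) plus regression (69.3); illustrative only.
import numpy as np
import statsmodels.api as sm

def simulate_conditional(T=350, k=0.1, rho_star=0.95, rho=0.95, mu=0.0071,
                         sig_sp=0.057, var_zstar=0.055, sig_u=0.05, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    var_w = sig_sp**2 - k**2 * var_zstar   # sigma_w^2 = sigma_sp^2 - k^2 Var(Z*)
    assert var_w > 0, "k must be small enough to keep Var(w) positive"
    zs = np.zeros(T)
    z = np.zeros(T)
    for t in range(1, T):
        zs[t] = rho_star * zs[t - 1] + rng.normal(
            0.0, np.sqrt((1.0 - rho_star**2) * var_zstar))
        z[t] = rho * z[t - 1] + rng.normal(
            0.0, np.sqrt((1.0 - rho**2) * var_zstar))
    rm = mu + k * zs[:-1] + rng.normal(0.0, np.sqrt(var_w), T - 1)  # r_{m,t+1}
    beta = 1.0 + zs[:-1]                                            # beta_t
    r = beta * rm + rng.normal(0.0, sig_u, T - 1)                   # zero alpha
    zt = z[:-1]
    X = np.column_stack([np.ones(T - 1), zt, rm, rm * zt])          # Eq. (69.3)
    fit = sm.OLS(r, X).fit(cov_type='HAC', cov_kwds={'maxlags': 3})
    return fit.tvalues   # t-statistics for {alpha0, alpha1, b0, b1}
```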
69.6 Results for Predictive Regressions

69.6.1 Pure Spurious Regression

Table 69.3 summarizes the results for the case of pure spurious regression, with no data mining. We record the estimated slope coefficient in the regression (Equation (69.1)), its Newey-West t-ratio, and the coefficient of determination at each trial, and summarize their empirical distributions. The experiments are run for two sample sizes, based on the extremes in Table 69.1: T = 66 and T = 824 in Panels A and B, respectively. In Panel C, we match the sample sizes to the studies in Table 69.1. In each case, 10,000 trials of the simulation are run; 50,000 trials on a subset of the cases produce similar results. The rows of Table 69.3 refer to different values for the true R-squares. The smallest value is 0.1%, where the stock return is essentially unpredictable, and the largest value is 15%. The columns of Table 69.3 correspond to different values of ρ*, the autocorrelation of the true expected return, which runs from 0.00 to 0.99. In these experiments we set ρ = ρ*. The subpanels labeled "Critical t-statistic" and "Critical estimated R²" report empirical critical values from the 10,000 simulated trials, such that 2.5% of the t-statistics, or 5% of the R-squares, lie above these values. The subpanels labeled "Mean δ" report the average slope coefficients over the 10,000 trials. The mean estimated values are always small, and very close to the true value of zero at the larger sample size. This confirms that the slope coefficient estimators are well behaved, so that the bias due to spurious regression comes from the standard errors.
4 We calibrate the variance of the betas to actual monthly data by randomly selecting 58 stocks with complete CRSP data for January 1926 through December 1997. Following Fama and MacBeth (1973), we estimate simple regression betas for each stock's monthly excess return against the S&P 500 excess return, using a series of rolling 5-year windows, rolling forward one month at a time. For each window we also compute the standard error of the beta estimate. This produces a series of 805 beta estimates and standard error estimates for beta for each firm. We calibrate the variance of the true beta for each firm to equal the sample variance of the rolling beta estimates minus the average estimated variance of the estimator. Averaging the result across firms, the value of Var(Z*) is 0.0550. Repeating this exercise with firms that have data from January of 1926 through the end of 2004 increases the number of months used from 864 to 948 but decreases the number of firms from 58 to 46. The value of Var(Z*) in this case is 0.0549.
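The calibration in this footnote can be sketched as follows; a minimal version assuming simple OLS betas in 60-month rolling windows, not the authors' code.

```python
# Sketch of the rolling-beta calibration: Var(true beta) is approximated by
# var(rolling beta estimates) minus their average estimation variance.
import numpy as np

def calibrate_beta_variance(r_stock, r_market, window=60):
    r_stock, r_market = np.asarray(r_stock), np.asarray(r_market)
    betas, est_var = [], []
    for s in range(len(r_stock) - window + 1):   # roll one month at a time
        y = r_stock[s:s + window]
        x = r_market[s:s + window]
        xd = x - x.mean()
        b = xd @ (y - y.mean()) / (xd @ xd)
        resid = (y - y.mean()) - b * xd
        s2 = resid @ resid / (window - 2)        # residual variance
        betas.append(b)
        est_var.append(s2 / (xd @ xd))           # Var(b_hat) in this window
    return np.var(betas, ddof=1) - np.mean(est_var)
```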
When ρ* = 0 and there is no persistence in the true expected return, the table shows that the spurious regression phenomenon is not a concern. This is true even when the measured regressor is highly persistent. (We confirm this with additional simulations, not reported in the tables, where we set ρ* = 0 and vary ρ.) The logic is that when the slope in Equation (69.1) is zero and ρ* = 0, the regression error has no persistence, so the standard errors are well behaved. This implies that spurious regression is not a problem from the perspective of testing the null hypothesis that expected stock returns are unpredictable, even if a highly autocorrelated regressor is used. Table 69.3 shows that spurious regression bias does not arise to any serious degree, provided ρ is 0.90 or less and the true R² is one percent or less. For these parameters the empirical critical values for the t-ratios are 2.48 (T = 66, Panel A) and 2.07 (T = 824, Panel B). The empirical critical R-squares are close to their theoretical values. For example, for a 5% test with T = (66, 824), the F-distribution implies critical R-squared values of (5.9%, 0.5%). The values in Table 69.3 when ρ = 0.90 and the true R² = 1% are (6.2%, 0.5%); thus, the empirical distributions do not depart far from the standard rules of thumb. Variables like short-term interest rates and dividend yields typically have first-order sample autocorrelations in excess of 0.95, as we saw in Table 69.1. We find substantial biases when the regressors are highly persistent. Consider the plausible scenario with a sample of T = 824 observations, where ρ = 0.98 and the true R² = 10%. In view of the spurious regression phenomenon, an analyst who was not sure that the correct instrument is being used, and who wanted to conduct a 5%, two-tailed t-test for the significance of the measured instrument, would have to use a t-ratio of 3.6. The coefficient of determination would have to exceed 2.2% to be significant at the 5% level. These cutoffs are substantially more stringent than the usual rules of thumb. Panel C of Table 69.3 revisits the evidence from the literature in Table 69.1. The critical values for the t-ratios and R-squares are reported, along with the theoretical critical values for the R-squares implied by the F-distribution. We set the true R-squared value equal to 10% and set ρ = ρ* in each case. We find that 7 of the 17 statistics in Table 69.1 that would be considered significant using the traditional standards are no longer significant in view of the spurious regression bias. While Panels A and B of Table 69.3 show that spurious regression can be a problem in stock return regressions, Panel C finds that accounting for spurious regression changes the inferences about specific regressors that were found to be significant in previous studies. In particular, we question the significance of the term spread in Fama and French (1989), on the basis of either the t-ratio or the R-squared of the regression. Similarly, the book-to-market ratio of the Dow Jones index, studied by Pontiff and Schall (1998), fails to be significant with either statistic.
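The empirical critical values in Table 69.3 are simply percentiles of the simulated statistics. The sketch below, which reuses the `simulate_predictive` and `hac_tstat` helpers from the earlier sketches, shows the idea; the seed and trial loop are illustrative assumptions.

```python
# Sketch: empirical critical t-ratio from the Monte Carlo trials.
import numpy as np

def critical_value(T=824, true_R2=0.10, rho=0.98, n_trials=10_000, seed=0):
    rng = np.random.default_rng(seed)
    tstats = np.empty(n_trials)
    for i in range(n_trials):
        r, z = simulate_predictive(T, true_R2, rho_star=rho, rho=rho, rng=rng)
        tstats[i] = hac_tstat(r, z)
    # 2.5% of the simulated t-statistics lie above this value; the critical
    # R-squares are tabulated the same way at the 95th percentile
    return np.percentile(tstats, 97.5)
```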
Table 69.3 The Monte Carlo simulation results for regressions with a lagged predictor variable

Panel A: 66 Observations

R²\ρ       0        0.5      0.9      0.95     0.98     0.99
Means: δ
0.001      0.0480   0.0554   0.0154   0.0179   0.0312   0.0463
0.005      0.0207   0.0246   0.0074   0.0088   0.0137   0.0193
0.010      0.0142   0.0173   0.0055   0.0066   0.0096   0.0129
0.050      0.0055   0.0075   0.0029   0.0037   0.0040   0.0042
0.100      0.0033   0.0051   0.0023   0.0030   0.0026   0.0021
0.150      0.0024   0.0040   0.0020   0.0026   0.0020   0.0012
Critical t-statistics
0.001      2.1951   2.3073   2.4502   2.4879   2.4746   2.4630
0.005      2.2033   2.3076   2.4532   2.5007   2.5302   2.5003
0.010      2.2121   2.3123   2.4828   2.5369   2.5460   2.5214
0.050      2.2609   2.3335   2.6403   2.7113   2.7116   2.6359
0.100      2.2847   2.3702   2.8408   2.9329   2.9043   2.7843
0.150      2.2750   2.3959   3.0046   3.1232   3.0930   2.9417
Critical estimated R²
0.001      0.0593   0.0575   0.0598   0.0599   0.0610   0.0600
0.005      0.0590   0.0578   0.0608   0.0607   0.0616   0.0604
0.010      0.0590   0.0579   0.0619   0.0623   0.0630   0.0612
0.050      0.0593   0.0593   0.0715   0.0737   0.0703   0.0673
0.100      0.0600   0.0622   0.0847   0.0882   0.0823   0.0766
0.150      0.0600   0.0649   0.0994   0.1032   0.0942   0.0850

Panel B: 824 Observations

R²\ρ       0        0.5      0.9      0.95     0.98     0.99
Means: δ
0.001      0.0150   0.0106   0.0141   0.0115   0.0053   0.0007
0.005      0.0067   0.0049   0.0069   0.0055   0.0021   0.0011
0.010      0.0048   0.0035   0.0052   0.0040   0.0014   0.0012
0.050      0.0021   0.0017   0.0029   0.0021   0.0003   0.0014
0.100      0.0015   0.0013   0.0023   0.0016   0.0001   0.0014
0.150      0.0012   0.0011   0.0021   0.0014   0.0000   0.0014
Critical t-statistics
0.001      1.9861   2.0263   2.0362   2.0454   2.0587   2.0585
0.005      1.9835   2.0297   2.0429   2.1123   2.1975   2.2558
0.010      1.9759   2.0279   2.0655   2.1479   2.3578   2.4957
0.050      1.9878   2.0088   2.2587   2.5685   3.1720   3.7095
0.100      1.9862   2.0320   2.3758   2.7342   3.6356   4.4528
0.150      2.0005   2.0246   2.4164   2.8555   3.8735   4.9151
Critical estimated R²
0.001      0.0046   0.0047   0.0047   0.0047   0.0049   0.0049
0.005      0.0046   0.0047   0.0048   0.0051   0.0056   0.0059
0.010      0.0046   0.0047   0.0050   0.0054   0.0065   0.0073
0.050      0.0046   0.0047   0.0066   0.0085   0.0132   0.0183
0.100      0.0047   0.0049   0.0084   0.0125   0.0220   0.0316
0.150      0.0046   0.0050   0.0104   0.0166   0.0308   0.0450

Panel C: Table 69.1 simulation

Obs    ρ      Critical          Critical       Critical
              theoretical R²    t-statistic    estimated R²
393    0.97   0.0098            3.2521         0.0311
264    0.32   0.0146            2.0645         0.0151
264    0.15   0.0146            2.0560         0.0151
264    0.08   0.0146            2.0318         0.0146
420    0.97   0.0092            3.2734         0.0304
720    0.97   0.0053            3.2005         0.0194
732    0.92   0.0053            2.3947         0.0103
611    0.95   0.0063            2.8843         0.0167
611    0.97   0.0063            3.2488         0.0219
66     0.66   0.0586            2.4221         0.0656
184    0.79   0.0209            2.2724         0.0270
824    0.97   0.0047            3.1612         0.0173
552    0.98   0.0070            3.6771         0.0293

The table reports the 97.5 percentile of the Monte Carlo distribution of 10,000 Newey-West t-statistics, the 95 percentile for the estimated coefficients of determination, and the average estimated slopes from the regression r_{t+1} = α + δZ_t + v_{t+1}, where r_{t+1} is the excess return, Z_t is the predictor variable, and t = 1, …, T. The parameter ρ is the autocorrelation coefficient of the predictors, Z_t and Z*_t. The R² is the coefficient of determination from the regression of excess returns r_{t+1} on the unobserved, true instrument Z*_t. Panel A depicts the results for T = 66 and Panel B for T = 824. Panel C gives the simulation results for the number of observations and the autocorrelations in Table 69.1. In Panel C, the true R² is set to 0.1. The theoretical critical R² is from the F-distribution.
Several other variables are marginal, failing on the basis of one but not both statistics. These include the short-term interest rate (Fama and Schwert 1977, using the more recent sample of Breen et al. 1989), the dividend yield (Fama and French 1988a), and the quality-related yield spread (Keim and Stambaugh 1986). All of these regressors would be considered significant using the standard cutoffs. It is interesting to note that the biases documented in Table 69.3 do not always diminish with larger sample sizes; in fact, the critical t-ratios are larger in the lower right corner of the panels when T = 824 than when T = 66. The mean values of the slope coefficients are closer to zero at the larger sample size, so the larger critical values are driven by the standard errors. A sample as large as T = 824 is not by itself a cure for the spurious regression bias. This is typical of spurious regression with a unit root, as discussed by Phillips (1986) for infinite sample sizes and nonstationary data.5

5 Phillips derives asymptotic distributions for the OLS estimators of Equation (69.1) in the case where ρ = 1, u_{t+1} ≡ 0. He shows that the t-ratio for δ diverges for large T, while t(δ)/√T, δ, and the coefficient of determination converge to well-defined random variables. Marmol (1998) extends these results to multiple regressions with partially integrated processes, and provides references to more recent theoretical literature. Phillips (1998) reviews analytical tools for asymptotic analysis when nonstationary series are involved.
It is interesting to observe similar patterns, even with stationary data and finite samples. Phillips (1986) shows that the sample autocorrelation in the regression studied by Granger and Newbold (1974) converges in the limit to 1.0. However, we find only mildly inflated residual autocorrelations (not reported in the tables) for stock return samples as large as T = 2,000, even when we assume values of the true R² as large as 40%. Even in these extreme cases, none of the empirical critical values for the residual autocorrelations is larger than 0.5. Since u_{t+1} = 0 in the cases studied by Phillips, we expect to see explosive autocorrelations only when the true R² is very large. When the R² is small, the white noise component of the returns serves to dampen the residual autocorrelation. Thus, we are not likely to see large residual autocorrelations in stock return regressions, even when spurious regression is a problem. The residuals-based diagnostics for spurious regression, such as the Durbin-Watson tests suggested by Granger and Newbold, are not likely to be very powerful in stock return regressions. For the same reason, typical application of the Newey-West procedure, where the number of lags is selected by examining the residual autocorrelations, is not likely to resolve the spurious regression problem. Newey and West (1987) show that their procedure is consistent for the standard errors when the number of lags
used grows without bound as the sample size T increases, provided that the number of lags grows no faster than T^{1/4}. The lag selection procedure in Table 69.3 examines 12 lags. Even though no more than nine lags are selected for the actual data in Table 69.1, more lags would sometimes be selected in the simulations, and an inconsistency results from truncating the lag length.6 However, in finite samples an increase in the number of lags can make things worse. When "too many" lags are used, the standard error estimates become excessively noisy, which thickens the tails of the sampling distribution of the t-ratios. This occurs for the experiments in Table 69.3. For example, letting the procedure examine 36 autocorrelations to determine the lag length (the largest number we find mentioned in published studies), the critical t-ratio in Panel A, for a true R² = 10% and ρ = 0.98, increases to 4.8 from 2.9. Nine of the 17 statistics from Table 69.1 that are significant by the usual rules of thumb now become insignificant. The results calling these studies into question are therefore even stronger than before. Thus, simply increasing the number of lags in the Newey-West procedure is not likely to resolve the finite-sample, spurious regression bias.7 We discuss this issue in more detail in Section 69.8.1.

We draw several conclusions about spurious regression in stock return predictive regressions. Given persistent expected returns, spurious regression can be a serious concern well outside the classic setting of Yule (1926) and Granger and Newbold (1974). Stock returns, as the dependent variable, are much less persistent than the levels of most economic time series. Yet, when the expected returns are persistent, there is a risk of spurious regression bias. The regression residuals may not be highly autocorrelated, even when the spurious regression bias is severe. Given inconsistent standard errors, spurious regression bias is not avoided with large samples. Accounting for spurious regression bias, we find that 7 of the 17 t-statistics and regression R-squares from previous studies of predictive regressions that would be significant by standard criteria are no longer significant.
69.6.2 Spurious Regression and Data Mining

We now consider the interaction between spurious regression and data mining in the predictive regressions, where the instruments to be mined are independent, as in Foster et al. (1997). There are L measured instruments over which the analyst searches for the "best" predictor, based on the R-squares of univariate regressions. In Equation (69.5), Z_t becomes a vector of length L, where L is the number of instruments through which the analyst sifts. The error terms (ε*_t, ε_t) become an (L+1)-vector with a diagonal covariance matrix; thus, ε*_t is independent of ε_t. The persistence parameters in Equation (69.5) become an (L+1)-square diagonal matrix, with the autocorrelation of the true predictor equal to ρ*. The value of ρ* is either the average from our sample of 500 potential instruments, 15%, or the median value of 95% from Table 69.1. The remaining autocorrelations, denoted by the L-vector ρ, are set equal to the autocorrelations of the first L instruments in our sample of 500 potential instruments.8 When ρ* = 95%, we rescale the autocorrelations to center the distribution at 0.95 while preserving the range in the original data.9 The simulations match the unconditional variances of the instruments, Var(Z), to the data. The first element of the covariance matrix Σ is equal to σ*². For a typical i-th diagonal element of Σ, denoted by σ_i², the elements of ρ(Z_i) and Var(Z_i) are matched to the data, and we set σ_i² = [1 − ρ(Z_i)²]Var(Z_i).

Table 69.4 summarizes the results. The columns correspond to different numbers of potential instruments through which the analyst sifts to find the regression that delivers the highest sample R-squared. The rows refer to the different values of the true R-squared. The rows with true R² = 0 refer to data mining only, similar to Foster et al. (1997). The columns where L = 1 correspond to pure spurious regression bias. We hold fixed the persistence parameter for the true expected return, ρ*, while
6 At very large sample sizes, a huge number of lags can control the bias. We verify this by examining samples as large as T = 5,000, letting the number of lags grow to 240. With 240 lags the critical t-ratio when the true R² = 10% and ρ = 0.98 falls from 3.6 in Panel B of Table 69.3 to a reasonably well-behaved value of 2.23.
7 We conduct several experiments letting the number of lags examined be 24, 36, or 48, when T = 66 and T = 824. When T = 66 the critical t-ratios are always larger than the values in Table 69.3. When T = 824 the effects are small and of mixed sign. The most extreme reduction in a critical t-ratio, relative to Table 69.3, is with 48 lags, a true R² = 15%, and ρ = 0.99, where the critical value falls to 4.23 from 4.92.
8 We calibrate the true autocorrelations in the simulations to the sample autocorrelations, adjusted for first-order finite-sample bias as: ρ̂ + (1 + 3ρ̂)/T, where ρ̂ is the OLS estimate of the autocorrelation and T is the sample size.
9 The transformation is as follows. In the 500 instruments, the minimum bias-adjusted autocorrelation is −0.571, the maximum is 0.999, and the median is 0.02. We center the transformed distribution about the median in Table 69.1, which is 0.95. If the original autocorrelation ρ is less than the median, we transform it to: 0.95 + (ρ − 0.02){(0.95 + 0.571)/(0.02 + 0.571)}. If the value is above the median, we transform it to: 0.95 + (ρ − 0.02){(0.999 − 0.95)/(0.999 − 0.02)}.
Table 69.4 The Monte Carlo simulation results of regressions with spurious regression and data mining, with independent regressors

Panel A: 66 Observations; ρ* = 0.15

R²\L       1        5        10       25       50       100      250
Means: δ
0          0.0004   0.0002   0.0002   0.0004   0.0001   0.0001   0.0005
0.001      0.0114   0.0044   0.0069   0.0208   0.0078   0.0012   0.0162
0.005      0.0050   0.0017   0.0017   0.0113   0.0014   0.0031   0.0109
0.010      0.0035   0.0008   0.0014   0.0076   0.0002   0.0011   0.0098
0.050      0.0014   0.0004   0.0004   0.0018   0.0023   0.0013   0.0063
0.100      0.0009   0.0006   0.0004   0.0014   0.0013   0.0007   0.0044
0.150      0.0007   0.0007   0.0002   0.0009   0.0010   0.0010   0.0035
Critical t-statistics
0          2.2971   3.2213   3.5704   4.1093   4.4377   4.8329   5.2846
0.001      2.2819   3.2105   3.5418   4.1116   4.4351   4.8238   5.2803
0.005      2.2996   3.2250   3.5466   4.1190   4.4604   4.7951   5.2894
0.010      2.2981   3.2109   3.5492   4.1198   4.4728   4.7899   5.2900
0.050      2.2950   3.2416   3.5096   4.0981   4.4036   4.8803   5.2527
0.100      2.3175   3.2105   3.5316   4.1076   4.4563   4.8772   5.2272
0.150      2.3040   3.2187   3.5496   4.0644   4.5090   4.8984   5.2948
Critical estimated R²
0          0.0594   0.0974   0.1153   0.1387   0.1548   0.1738   0.1944
0.001      0.0589   0.0969   0.1149   0.1386   0.1546   0.1739   0.1944
0.005      0.0591   0.0972   0.1151   0.1383   0.1545   0.1734   0.1948
0.010      0.0592   0.0967   0.1158   0.1386   0.1544   0.1733   0.1950
0.050      0.0596   0.0970   0.1163   0.1390   0.1557   0.1738   0.1955
0.100      0.0608   0.0969   0.1165   0.1392   0.1570   0.1738   0.1954
0.150      0.0612   0.0975   0.1165   0.1397   0.1577   0.1745   0.1967

Panel B: 824 Observations; ρ* = 0.15

R²\L       1        5        10       25       50       100      250
Means: δ
0          0.0000   0.0000   0.0000   0.0000   0.0001   0.0002   0.0000
0.001      0.0004   0.0032   0.0017   0.0000   0.0028   0.0058   0.0015
0.005      0.0002   0.0012   0.0004   0.0000   0.0020   0.0031   0.0007
0.010      0.0001   0.0009   0.0004   0.0003   0.0015   0.0020   0.0004
0.050      0.0001   0.0005   0.0000   0.0005   0.0006   0.0009   0.0004
0.100      0.0000   0.0005   0.0001   0.0003   0.0001   0.0002   0.0003
0.150      0.0000   0.0003   0.0003   0.0003   0.0001   0.0002   0.0002
Critical t-statistics
0          2.0283   2.5861   2.8525   3.1740   3.3503   3.5439   3.8045
0.001      2.0369   2.6000   2.8534   3.1785   3.3616   3.5443   3.7928
0.005      2.0334   2.6043   2.8565   3.1769   3.3625   3.5440   3.7906
0.010      2.0310   2.6152   2.8694   3.1782   3.3544   3.5477   3.7917
0.050      2.0272   2.6229   2.8627   3.1846   3.3450   3.5552   3.8039
0.100      2.0115   2.6304   2.8705   3.1807   3.3648   3.5673   3.8041
0.150      2.0044   2.6327   2.8618   3.1766   3.3691   3.5723   3.7965
Critical estimated R²
0          0.0047   0.0079   0.0096   0.0116   0.0130   0.0145   0.0166
0.001      0.0047   0.0079   0.0096   0.0116   0.0130   0.0145   0.0166
0.005      0.0047   0.0080   0.0096   0.0116   0.0129   0.0145   0.0166
0.010      0.0047   0.0080   0.0096   0.0115   0.0129   0.0145   0.0166
0.050      0.0047   0.0081   0.0096   0.0116   0.0130   0.0145   0.0167
0.100      0.0047   0.0081   0.0097   0.0117   0.0131   0.0146   0.0168
0.150      0.0047   0.0082   0.0096   0.0117   0.0130   0.0146   0.0168

Panel C: 66 Observations; ρ* = 0.95

R²\L       1        5        10       25       50       100      250
Means: δ
0          0.0005   0.0002   0.0006   0.0001   0.0006   0.0003   0.0017
0.001      0.0140   0.0069   0.0212   0.0105   0.0134   0.0112   0.0557
0.005      0.0060   0.0042   0.0082   0.0068   0.0024   0.0033   0.0240
0.010      0.0042   0.0031   0.0051   0.0029   0.0018   0.0027   0.0145
0.050      0.0016   0.0006   0.0035   0.0023   0.0016   0.0019   0.0012
0.100      0.0010   0.0002   0.0021   0.0013   0.0017   0.0005   0.0028
0.150      0.0007   0.0005   0.0015   0.0008   0.0011   0.0001   0.0013
Critical t-statistics
0          2.3446   3.3507   3.6827   4.1903   4.4660   4.9412   5.2493
0.001      2.3641   3.3547   3.6776   4.1756   4.5157   4.9201   5.2441
0.005      2.4030   3.3864   3.7013   4.1984   4.5625   4.9381   5.2760
0.010      2.3939   3.4197   3.7308   4.1952   4.6039   4.9718   5.3083
0.050      2.5486   3.5482   3.9676   4.4703   4.9512   5.2027   5.5539
0.100      2.6955   3.7336   4.1899   4.7485   5.2335   5.5027   5.9006
0.150      2.8484   3.9724   4.4329   4.9748   5.5547   5.8256   6.2563
Critical estimated R²
0          0.0579   0.0974   0.1140   0.1374   0.1515   0.1689   0.1885
0.001      0.0587   0.0981   0.1143   0.1376   0.1518   0.1692   0.1884
0.005      0.0596   0.0987   0.1153   0.1385   0.1530   0.1699   0.1895
0.010      0.0604   0.1002   0.1166   0.1402   0.1543   0.1711   0.1910
0.050      0.0691   0.1113   0.1307   0.1552   0.1711   0.1859   0.2057
0.100      0.0802   0.1265   0.1508   0.1774   0.1952   0.2099   0.2307
0.150      0.0911   0.1451   0.1728   0.2021   0.2209   0.2370   0.2587

Panel D: 824 Observations; ρ* = 0.95

R²\L       1        5        10       25       50       100      250
Means: δ
0          0.0001   0.0000   0.0000   0.0000   0.0001   0.0002   0.0001
0.001      0.0027   0.0016   0.0007   0.0005   0.0015   0.0072   0.0039
0.005      0.0012   0.0004   0.0003   0.0006   0.0008   0.0029   0.0026
0.010      0.0009   0.0005   0.0000   0.0003   0.0008   0.0013   0.0006
0.050      0.0004   0.0005   0.0001   0.0002   0.0007   0.0006   0.0001
0.100      0.0003   0.0002   0.0001   0.0003   0.0000   0.0001   0.0004
0.150      0.0003   0.0000   0.0000   0.0002   0.0001   0.0002   0.0002
Critical t-statistics
0          1.9807   2.6807   2.8535   3.1579   3.3640   3.5673   3.8103
0.001      1.9989   2.6876   2.8758   3.1745   3.3702   3.5792   3.8252
0.005      2.0406   2.7588   2.9269   3.2218   3.4497   3.6493   3.9075
0.010      2.1108   2.8538   3.0150   3.3500   3.5548   3.7836   4.0351
0.050      2.4338   3.3118   3.6292   4.1202   4.3685   4.6795   4.9741
0.100      2.6274   3.6661   4.0003   4.5660   4.9129   5.2567   5.6937
0.150      2.7413   3.8720   4.2048   4.8481   5.2200   5.5846   6.0420
Critical estimated R²
0          0.0045   0.0080   0.0096   0.0113   0.0129   0.0145   0.0164
0.001      0.0046   0.0082   0.0097   0.0115   0.0130   0.0146   0.0167
0.005      0.0048   0.0086   0.0102   0.0121   0.0137   0.0153   0.0176
0.010      0.0050   0.0092   0.0108   0.0131   0.0146   0.0163   0.0187
0.050      0.0077   0.0145   0.0173   0.0216   0.0244   0.0273   0.0314
0.100      0.0113   0.0216   0.0264   0.0331   0.0374   0.0421   0.0482
0.150      0.0151   0.0293   0.0356   0.0446   0.0508   0.0568   0.0647

Panel E: Table 69.1 simulation

               ρ from Table 69.1                     ρ* = 0.95
Obs    ρ      Critical L      Critical L     ρ      Critical L      Critical L
              (t-statistic)   (R²)                  (t-statistic)   (R²)
393    0.97   2               1              0.95   4               2
264    0.32   2               5              0.95   1               1
264    0.15   2               5              0.95   1               1
264    0.08   5               >500           0.95   1               10
420    0.97   1               1              0.95   1               1
720    0.97   1               1              0.95   1               1
732    0.92   1               1              0.95   1               1
611    0.95   1               1              0.95   1               1
611    0.97   1               1              0.95   1               1
66     0.66   2               2              0.95   1               2
184    0.79   2               7              0.95   1               3
824    0.97   1               1              0.95   1               2
552    0.98   1               1              0.95   1               1

The table reports the 97.5 percentile of the Monte Carlo distribution of 10,000 Newey-West t-statistics, the 95 percentile for the estimated coefficients of determination, and the average estimated slopes from the regression r_{t+1} = α + δZ_t + v_{t+1}, where r_{t+1} is the excess return, Z_t is the predictor variable, and t = 1, …, T. The R² is the coefficient of determination from the regression of excess returns r_{t+1} on the unobserved, true instrument Z*_t, which has the autocorrelation ρ*. The parameter L is the number of instruments mined, where the one with the highest estimated R² is chosen. Panels A and B depict the results for T = 66 and T = 824, respectively, when the autocorrelation of the true predictor is ρ* = 0.15. Panels C and D depict the results for T = 66 and T = 824, respectively, when the autocorrelation of the true predictor is ρ* = 0.95, the median autocorrelation in Table 69.1. In Panel E, the true R² is set to 0.1, and the original distribution of instruments is transformed so that their median autocorrelation is set at 0.95. The left-hand side of Panel E gives the critical L for the given number of observations and autocorrelation that is sufficient to generate critical t-statistics or R²'s in excess of the corresponding statistics in Table 69.1. The right-hand side of Panel E gives the critical L that is sufficient to generate critical t-statistics or R²'s in excess of the corresponding statistics in Table 69.1 when ρ* = 0.95.
allowing ρ to vary depending on the measured instrument. When L = 1, we set ρ = 15%. We consider two values for ρ*: 15% or 95%.

Panels A and B of Table 69.4 show that when L = 1 (there is no data mining) and ρ* = 15%, there is no spurious regression problem, consistent with Table 69.3. The empirical critical values for the t-ratios and R-squared statistics are close to their theoretical values under normality. For larger values of L (there is data mining) and ρ* = 15%, the critical values are close to the values reported by Foster et al. (1997) for similar sample sizes.10 There is little difference in the results for the various true R-squares. Thus, with little persistence in the true expected return there is no spurious regression problem, and no interaction with data mining.

Panels C and D of Table 69.4 tell a different story. When the underlying expected return is persistent (ρ* = 0.95)
10 Our sample sizes, T, are not the same as in Foster et al. (1997). When we run the experiments for their sample sizes, we closely approximate the critical values that they report.
there is a spurious regression bias. When L = 1 we have spurious regression only. The critical t-ratio in Panel C increases to 2.8 from 2.3 as the true R-squared goes from zero to 15%. The bias is less pronounced here than in Table 69.3, with ρ = ρ* = 0.95, which illustrates that for a given value of ρ*, spurious regression is worse for larger values of ρ. Spurious regression bias interacts with data mining. Consider the extreme corners of Panel C. Whereas, with L = 1, the critical t-ratio increases to 2.8 from 2.3 as the true R-squared goes from zero to 15%, with L = 250 the critical t-ratio increases to 6.3 from 5.2 as the true R-squared is increased. Thus, data mining magnifies the effects of the spurious regression bias. When more instruments are examined, the more persistent ones are likely to be chosen, and the spurious regression problem is amplified. The slope coefficients are centered near zero, so the bias does not increase the average slopes of the selected regressors. Again, spurious regression works through the standard errors. We can also say that spurious regression makes the data mining problem worse. For a given value of L, the critical
t-ratios and R² values increase moving down the rows of Table 69.4. For example, with L = 250 and a true R² = 0, we can account for pure data mining with a critical t-ratio of 5.2. However, when the true R-squared is 15%, the critical t-ratio rises to 6.3. The differences moving down the rows are even greater when T = 824, in Panel D. Thus, in the situations where the spurious regression bias is more severe, its impact on the data mining problem is also more severe. Finally, Panel E of Table 69.4 revisits the studies from the literature in view of spurious regression and data mining. We report critical values for L, the number of instruments mined, sufficient to render the regression t-ratios and R-squares insignificant at the five percent level. We use two assumptions about the persistence in the true expected returns: (i) ρ* is set equal to the sample values from the studies, as in Table 69.1, or (ii) ρ* = 95%. With only one exception, the critical values of L are 10 or smaller. The exception is where the instrument is the lagged 1-month excess return on a 2-month Treasury bill, following Campbell (1987). This is an interesting example because the instrument is not very autocorrelated, at 8%, and when we set ρ* = 0.08 there is no spurious regression effect; the critical value of L exceeds 500. However, when we set ρ* = 95% in this example, the critical value of L falls to 10, illustrating the strong interaction between the data mining and spurious regression effects.
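The mining step itself is simple to sketch: regress the return on each candidate instrument in turn and keep the one with the highest sample R-squared. The snippet below is illustrative; it assumes `Z` is a matrix of already-aligned lagged instruments (one column per candidate) and omits the bias adjustment and rescaling of Footnotes 8 and 9.

```python
# Sketch of the data-mining search over L candidate instruments.
import numpy as np

def best_instrument(r, Z):
    """Return the column index of Z with the highest univariate
    regression R-squared for predicting r."""
    y = r - r.mean()
    best_j, best_r2 = -1, -np.inf
    for j in range(Z.shape[1]):
        zd = Z[:, j] - Z[:, j].mean()
        delta = zd @ y / (zd @ zd)          # OLS slope
        r2 = 1.0 - np.sum((y - delta * zd)**2) / np.sum(y**2)
        if r2 > best_r2:
            best_j, best_r2 = j, r2
    return best_j, best_r2
```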
69.7 Results for Conditional Asset Pricing Models

69.7.1 Cases with Small Amounts of Persistence

We first consider a special case of the model where we set ρ* = 0 in the data generating process for the market return and the true beta, so that Z* is white noise and σ²(ε*) = Var(Z*). In this case the predictable (but unobserved by the analyst) component of the stock market return and the betas follow white noise processes. We allow a range of values for the autocorrelation, ρ, of the measured instrument, Z, including values as large as 0.99. For a given value of ρ, we choose σ²(ε) = Var(Z*)(1 − ρ²), so the measured instrument and the unobserved beta have the same variance. We find in this case that the critical values for all of the coefficients are well behaved. Thus, when the true expected returns and betas are not persistent, the use of even a highly persistent regressor does not create a spurious regression bias in the asset pricing regressions of Equation (69.3). It seems intuitive that there should be no spurious regression problem when there is no persistence in Z*. Since the true coefficient on the measured instrument, Z, is zero, the
error term in the regression is unaffected by the persistence in Z under the null hypothesis. When there is no spurious regression problem there can be no interaction between spurious regression and data mining. Thus, standard corrections for data mining (e.g., White 2000) can be used without concern in these cases. In our second experiment the measured instrument and the true beta have the same degree of persistence, but their persistence is not extreme. We fix Var(Z) = Var(Z*) and choose, for a given value of ρ = ρ*, σ²(ε) = σ²(ε*) = Var(Z*)(1 − ρ²). For values of ρ < 0.95 and all values of the true predictive R-squared, R_p², the regressions seem generally well specified, even at sample sizes as small as T = 66. These findings are similar to the findings for the predictive regression (Equation (69.1)). Thus, the asset pricing regressions (Equation (69.3)) also appear to be well specified when the autocorrelation of the true predictor is below 0.95.
69.7.2 Cases with Persistence

Table 69.5 summarizes simulation results for a case that allows data mining and spurious regression. In this experiment, the true persistence parameter, ρ*, is set equal to 0.95. The table summarizes the results for time-series samples of T = 66, T = 350, and T = 960. The number of variables for which the artificial agent searches in mining the data ranges from one to 250. We focus on the two abnormal return coefficients, {α₀, α₁}, and on the time-varying beta coefficient, b₁. Table 69.5 shows that the means of the coefficient α₀, the fixed part of the alpha, are close to zero, and they get closer to zero as the number of observations increases, as expected of a consistent estimator. The 5% critical t-ratios for α₀ are reasonably well specified at the larger sample sizes, although there is some bias at T = 66, where the critical values rise with the extent of data mining. Data mining has little effect on the intercepts at the larger sample sizes. Since the lagged instrument has a mean of zero, the intercept is the average conditional alpha. Thus, the issue of data mining for predictive variables appears to have no serious implications for measures of average abnormal performance in the conditional asset pricing regressions, provided T > 66. This justifies the use of such models for studying the cross-section of average equity returns. The coefficients α₁, which represent the time-varying part of the conditional alphas, present a different pattern. We would expect a data mining effect, given that the data are mined based on the coefficients on the lagged predictor in the simple predictive regression. The presence of the interaction term, however, would be expected to attenuate the bias in the standard errors compared with the simple predictive
Table 69.5 Simulating a conditional asset pricing model

           T = 66                    T = 350                   T = 960
R_p²       L=1     L=25    L=250    L=1     L=25    L=250    L=1     L=25    L=250
Means: α₀
0.001      0.000   0.001   0.000    0.000   0.000   0.000    0.000   0.000   0.000
0.005      0.000   0.000   0.000    0.000   0.000   0.000    0.000   0.000   0.000
0.01       0.001   0.000   0.000    0.000   0.000   0.000    0.000   0.000   0.000
0.05       0.001   0.001   0.001    0.000   0.001   0.000    0.000   0.000   0.000
0.1        0.002   0.002   0.002    0.000   0.001   0.001    0.000   0.000   0.000
0.15       0.002   0.002   0.002    0.001   0.001   0.001    0.000   0.000   0.000
Critical 5% t-statistics for α₀
0.001      2.280   2.603   2.855    1.999   2.061   2.146    1.996   2.005   2.115
0.005      2.266   2.540   2.792    1.994   2.058   2.135    2.013   2.002   2.104
0.01       2.253   2.508   2.759    2.000   2.045   2.125    2.016   2.002   2.098
0.05       2.153   2.408   2.728    1.974   1.998   2.094    2.021   1.991   2.100
0.1        2.088   2.388   2.652    1.977   2.000   2.030    2.058   2.008   2.073
0.15       2.065   2.382   2.597    1.968   1.960   1.987    2.069   2.031   2.041
Means: α₁
0.001      0.001   0.007   0.003    0.001   0.002   0.003    0.001   0.002   0.001
0.005      0.001   0.005   0.001    0.001   0.001   0.002    0.001   0.002   0.001
0.01       0.001   0.005   0.000    0.001   0.001   0.002    0.001   0.002   0.001
0.05       0.001   0.005   0.002    0.001   0.001   0.002    0.001   0.001   0.002
0.1        0.001   0.005   0.000    0.001   0.001   0.002    0.001   0.001   0.000
0.15       0.001   0.004   0.000    0.001   0.000   0.002    0.001   0.001   0.001
Critical 5% t-statistics for α₁
0.001      2.392   3.992   5.305    2.023   3.240   3.891    1.910   3.097   3.748
0.005      2.390   3.961   5.252    2.024   3.206   3.874    1.905   3.092   3.719
0.01       2.387   3.912   5.198    2.025   3.198   3.872    1.902   3.091   3.712
0.05       2.412   3.924   5.237    2.039   3.172   3.855    1.902   3.062   3.706
0.1        2.426   3.912   5.163    2.036   3.159   3.837    1.912   3.040   3.690
0.15       2.423   3.913   5.086    2.024   3.155   3.785    1.909   3.024   3.629
Means: b₁
0.001      0.041   0.017   0.026    0.003   0.002   0.011    0.010   0.008   0.009
0.005      0.038   0.019   0.061    0.003   0.001   0.002    0.009   0.007   0.005
0.01       0.038   0.012   0.055    0.003   0.000   0.004    0.008   0.004   0.006
0.05       0.039   0.014   0.077    0.003   0.006   0.003    0.006   0.001   0.001
0.1        0.040   0.018   0.058    0.003   0.010   0.003    0.005   0.000   0.004
0.15       0.041   0.015   0.062    0.003   0.014   0.007    0.004   0.001   0.000
Critical 5% t-statistics for b₁
0.001      2.576   2.534   2.634    2.122   2.098   2.175    1.996   2.013   2.075
0.005      2.574   2.579   2.611    2.116   2.138   2.210    2.022   2.036   2.075
0.01       2.583   2.574   2.597    2.114   2.133   2.219    2.027   2.071   2.126
0.05       2.603   2.588   2.597    2.149   2.212   2.336    2.027   2.121   2.219
0.1        2.612   2.614   2.596    2.157   2.297   2.451    2.024   2.188   2.475
0.15       2.610   2.601   2.657    2.156   2.361   2.614    2.018   2.322   2.722

The table shows the results of 10,000 simulations of the estimates from the conditional asset pricing model, allowing for possible data mining of the lagged instruments. The regression model is: r_{t+1} = α₀ + α₁Z_t + b₀r_{m,t+1} + b₁r_{m,t+1}Z_t + u_{t+1}. T is the sample size, L is the number of lagged instruments mined, and R_p² is the true predictive R² in the artificial data generating process.
regression. The table shows only a small effect of data mining on the α₁ coefficient, but a large effect on its t-ratio. The overall effect is greatest at the smaller sample size (T = 66), where the critical t-ratios for the intermediate R_p² values (10% predictive R²) vary from about 2.4 to 5.2 as the number of variables mined increases from one to 250. The bias diminishes with T, especially when the number of mined variables is small, and for L = 1 there is no substantial bias at T = 350 or T = 960 months. The results on the α₁ coefficient are interesting in three respects. First, the critical t-ratios vary by only small amounts across the rows of the table. This indicates very little interaction between the spurious regression and data mining effects. Second, the table shows a smaller data mining effect than observed in the pure predictive regression. Thus, standard data mining corrections for predictive regressions will overcompensate in this setting. Third, the critical t-ratios for α₁ become smaller in Table 69.5 as the sample size is increased. This is just the opposite of what is found for the simple predictive regressions, where the inconsistency in the standard errors makes the critical t-ratios larger at larger sample sizes. Thus, the sampling distributions for time-varying alpha coefficients are not likely to be well approximated by simple corrections.11 Table 69.5 does not report the t-statistics for b₀, the constant part of the beta estimate. These are generally unbiased across all of the samples, except that the critical t-ratios are slightly inflated at the smaller sample size (T = 66) when data mining is not at issue (L = 1). Finally, Table 69.5 shows results for the b₁ coefficients and their t-ratios, which capture the time-varying component of the conditional betas. Here, the average values and the critical t-ratios are barely affected by the number of variables mined. When T = 66 the critical t-ratios stay in a narrow range, from about 2.5 to 2.6, and they cluster closely around a value of 2.0 at the larger sample sizes. There are no discernible effects of data mining on the distribution of the time-varying beta coefficients, except when the R² values are very high. This is an important result in the context of the conditional asset pricing literature, which we characterize as having mined predictive variables based on the regression (Equation (69.1)). Our results suggest that the empirical evidence in this literature for time-varying betas, based on the regression model (Equation (69.3)), is relatively robust to the data mining.
11 We conducted some experiments in which we applied a simple local-to-unity correction to the t-ratios, dividing by the square root of the sample size. We found that this correction does not result in a t-ratio that is approximately invariant to the sample size.
69.7.3 Suppressing Time-Varying Alphas

Some studies in the conditional asset pricing literature use regression models with interaction terms, but without the time-varying alpha component (e.g., Cochrane 1996; Ferson and Schadt 1996; Ferson and Harvey 1999). Since the time-varying alpha component is the most troublesome term in the presence of spurious regression and data mining effects, it is interesting to ask if regressions that suppress this term may be better specified. Table 69.6 presents results for models in which the analyst runs regressions without the α₁ coefficient. The results suggest that the average alpha coefficient, α₀, and its t-statistic remain well specified regardless of data mining and potential spurious regression. Thus, once again we find little cause for concern about the inferences on average abnormal returns using the conditional asset pricing regressions, even though they use persistent, data-mined lagged regressors. The distribution of the average beta estimate, b₀, is not shown in Table 69.6. The results are similar to those obtained in a factor model regression where no lagged instrument is used: the coefficients and standard errors generally appear well specified. However, we find that the coefficient measuring the time-varying beta is somewhat more susceptible to bias than in the regression that includes α₁. The b₁ coefficient is biased, especially when T = 66, and its mean varies with the number of instruments mined. The critical t-ratios are inflated at the higher values of R_p² and when more instruments are mined. These experiments suggest that including the time-varying alpha in the regression (69.3) helps "soak up" the bias, so that it does not adversely affect the time-varying beta estimate. We conclude that if one is interested in obtaining good estimates of conditional betas, then in the presence of potential data mining and persistent lagged instruments, the time-varying alpha term should be included in the regression.
69.7.4 Suppressing Time-Varying Betas

There are examples in the literature where the regression is run with a linear term for a time-varying conditional alpha but no interaction term for a time-varying conditional beta (e.g., Jagannathan and Wang 1996). Table 69.7 considers this case. First, the coefficient for the average beta in the regression with no b₁ term (not shown in Table 69.7) is reasonably well specified and largely unaffected by data mining on the lagged instrument. We find that the coefficients for alpha, α₀ and α₁, behave similarly to the corresponding coefficients in the full regression model (Equation (69.3)). The estimates of the average alpha are reasonably well behaved, and only
Table 69.6 Simulating a conditional asset pricing model with no time-varying alpha

           T = 66                    T = 350                   T = 960
R_p²       L=1     L=25    L=250    L=1     L=25    L=250    L=1     L=25    L=250
Means: α₀
0.001      0.000   0.000   0.001    0.000   0.000   0.000    0.000   0.000   0.000
0.005      0.001   0.001   0.001    0.000   0.000   0.000    0.000   0.000   0.000
0.01       0.001   0.001   0.001    0.000   0.000   0.000    0.000   0.000   0.000
0.05       0.002   0.002   0.002    0.000   0.000   0.001    0.000   0.000   0.000
0.1        0.002   0.002   0.003    0.000   0.001   0.001    0.000   0.000   0.000
0.15       0.003   0.003   0.003    0.001   0.001   0.001    0.000   0.000   0.001
Critical 5% t-statistics for α₀
0.001      2.165   2.146   2.123    1.951   1.988   1.947    1.990   1.965   1.934
0.005      2.138   2.114   2.089    1.949   1.981   1.929    1.983   1.961   1.922
0.01       2.132   2.102   2.054    1.944   1.966   1.918    1.992   1.956   1.931
0.05       2.042   2.020   1.986    1.907   1.921   1.917    1.983   1.970   1.921
0.1        1.984   1.956   1.929    1.896   1.910   1.885    1.988   1.962   1.904
0.15       1.949   1.922   1.893    1.900   1.884   1.829    1.993   1.965   1.869
Means: b₁
0.001      0.007   0.006   0.043    0.008   0.004   0.001    0.003   0.013   0.012
0.005      0.008   0.017   0.060    0.006   0.003   0.000    0.003   0.013   0.008
0.01       0.008   0.015   0.064    0.006   0.003   0.004    0.003   0.010   0.007
0.05       0.010   0.031   0.047    0.003   0.003   0.003    0.002   0.007   0.001
0.1        0.010   0.020   0.035    0.002   0.005   0.001    0.002   0.000   0.001
0.15       0.010   0.029   0.042    0.000   0.003   0.003    0.002   0.002   0.002
Critical 5% t-statistics for b₁
0.001      2.630   2.605   2.639    2.157   2.128   2.218    1.987   2.136   2.147
0.005      2.636   2.646   2.631    2.156   2.145   2.246    1.991   2.162   2.170
0.01       2.661   2.665   2.643    2.163   2.150   2.256    1.987   2.154   2.216
0.05       2.656   2.748   2.739    2.146   2.257   2.441    1.988   2.267   2.476
0.1        2.629   2.811   2.861    2.175   2.378   2.618    1.994   2.395   2.639
0.15       2.607   2.857   3.001    2.201   2.466   2.755    2.008   2.479   2.828

The table shows the results of 10,000 simulations of the estimates from the conditional asset pricing model with no time-varying alpha, allowing for the possibility of data mining for the lagged instruments. The regression model is: r_{t+1} = α₀ + b₀r_{m,t+1} + b₁r_{m,t+1}Z_t + u_{t+1}. T is the sample size, L is the number of lagged instruments mined, and R_p² is the true predictive R² in the artificial data generating process.
mildly affected by the extent of data mining at smaller sample sizes. The bias in α₁ is severe. The bias leads the analyst to overstate the evidence for a time-varying alpha, and the bias is worse as the amount of data mining increases. Thus, the evidence in the literature for time-varying alphas, based on these asset-pricing regressions, is likely to be overstated.
69.7.5 A Cross-Section of Asset Returns

We extend the simulations to study a cross-section of asset returns. We use five book-to-market (BM) quintile portfolios, equally weighted across the size dimension, as an illustration. The data are courtesy of Kenneth French.
In these experiments the cross-section of assets features cross-sectional variation in the true conditional betas. Instead of β_t = 1 + Z*_t, the betas are β_t = β₀ + β₁Z*_t, where the coefficients β₀ and β₁ are the estimates obtained from regressions of each quintile portfolio's excess return on the market portfolio excess return and the product of the market portfolio with the lagged value of the dividend yield. The set of β₀'s is {1.259, 1.180, 1.124, 1.118, 1.274}; the set of β₁'s is {−1.715, 1.000, 3.766, 7.646, 8.970}.12 The true predictive R-squared in the artificial data generating process is set to 0.5%. This value matches the smallest R-squared from the regression of the market portfolio on the lagged dividend yield with a window of 60 months.
12 The β₁ coefficient for the BM2 portfolio is 1.0, replacing the estimated value of 0.047. When the β₁ coefficient is 0.047 the simulated return becomes nearly perfectly correlated with r_m and the simulation is uninformative. The dividend yield is demeaned and multiplied by 10. The dividend yield has the largest average sample correlation with the five BM portfolios among the standard instruments we examine.
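The cross-sectional data-generating step can be sketched as follows. The β coefficients are the ones quoted above (with BM2's estimated value already replaced by 1.0, per the footnote); the idiosyncratic volatility `sig_u` and the array shapes are illustrative assumptions.

```python
# Sketch of the cross-sectional DGP: beta_t = beta0 + beta1 * Z*_t per quintile.
import numpy as np

BETA0 = np.array([1.259, 1.180, 1.124, 1.118, 1.274])   # from the text
BETA1 = np.array([-1.715, 1.000, 3.766, 7.646, 8.970])  # from the text

def cross_section_returns(zs, rm, sig_u=0.05, rng=None):
    """Five BM-quintile excess returns with zero alphas by construction;
    zs holds Z*_t and rm holds r_{m,t+1}, already aligned."""
    rng = np.random.default_rng() if rng is None else rng
    betas = BETA0[None, :] + np.outer(zs, BETA1)   # T x 5 conditional betas
    u = rng.normal(0.0, sig_u, betas.shape)
    return betas * rm[:, None] + u
```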
Table 69.7 Simulating a conditional asset pricing model with no time-varying beta

           T = 66                    T = 350                   T = 960
R_p²       L=1     L=25    L=250    L=1     L=25    L=250    L=1     L=25    L=250
Means: α₀
0.001      0.000   0.000   0.000    0.000   0.000   0.000    0.000   0.000   0.000
0.005      0.000   0.000   0.000    0.000   0.000   0.000    0.000   0.000   0.000
0.01       0.001   0.000   0.000    0.000   0.000   0.000    0.000   0.000   0.000
0.05       0.001   0.001   0.001    0.000   0.000   0.000    0.000   0.000   0.000
0.1        0.002   0.001   0.001    0.000   0.000   0.000    0.000   0.000   0.000
0.15       0.002   0.002   0.002    0.001   0.001   0.000    0.000   0.000   0.000
Critical 5% t-statistics for α₀
0.001      2.239   2.592   2.794    1.991   2.080   2.182    2.004   2.047   2.092
0.005      2.219   2.533   2.764    1.982   2.070   2.147    2.014   2.036   2.084
0.01       2.206   2.511   2.731    1.988   2.069   2.150    2.013   2.018   2.089
0.05       2.124   2.429   2.656    1.970   2.056   2.148    2.019   2.012   2.130
0.1        2.065   2.389   2.607    1.968   2.034   2.158    2.035   2.010   2.160
0.15       2.015   2.356   2.546    1.978   2.033   2.177    2.072   2.010   2.167
Means: α₁
0.001      0.001   0.000   0.002    0.001   0.001   0.000    0.000   0.001   0.000
0.005      0.001   0.000   0.001    0.001   0.001   0.001    0.000   0.001   0.001
0.01       0.001   0.000   0.001    0.001   0.001   0.001    0.000   0.001   0.001
0.05       0.000   0.001   0.003    0.001   0.001   0.001    0.000   0.001   0.002
0.1        0.000   0.001   0.005    0.001   0.000   0.000    0.000   0.000   0.001
0.15       0.000   0.000   0.003    0.001   0.000   0.001    0.000   0.001   0.000
Critical 5% t-statistics for α₁
0.001      2.307   4.015   5.298    2.068   3.172   3.912    2.020   3.073   3.794
0.005      2.311   4.005   5.169    2.066   3.149   3.890    2.017   3.061   3.761
0.01       2.312   3.998   5.142    2.061   3.148   3.891    2.017   3.058   3.754
0.05       2.322   3.968   5.040    2.054   3.156   3.885    2.003   3.029   3.739
0.1        2.323   3.908   5.041    2.056   3.138   3.849    2.012   3.001   3.713
0.15       2.323   3.901   5.075    2.056   3.124   3.783    2.005   3.003   3.659

The table shows the results of 10,000 simulations of the estimates from the conditional asset pricing model with no time-varying beta, allowing for the possibility of data mining for the lagged instruments. The regression model is: r_{t+1} = α₀ + α₁Z_t + b₀r_{m,t+1} + u_{t+1}. T is the sample size, L is the number of lagged instruments mined, and R_p² is the true predictive R² in the artificial data generating process.
Table 69.8 shows simulation results for the conditional model with time-varying alphas and betas. The means of the b₀ and b₁ coefficients are shown in excess of their true values in the simulations. The critical t-statistics for both α₁ and b₁ are generally similar to the case where R_p² = 0.5% in Table 69.5. As before, there is a large bias in the t-statistic for α₁ that increases with data mining but decreases somewhat
with the sample size. The t-statistics for the time-varying betas are generally well specified. We conduct additional experiments using the cross-section of asset returns, where the conditional asset pricing regression suppresses either the time-varying alphas or the time-varying betas. The results are similar to those in Table 69.8. When the time-varying betas are suppressed, there is severe bias in α₁ that diminishes somewhat with the sample size. When the time-varying alphas are suppressed, there is a mild bias in b₁.
Table 69.8 Conditional asset pricing models with a cross-section of returns

              T = 66                    T = 350                   T = 960
BM quintile   L=1     L=25    L=250    L=1     L=25    L=250    L=1     L=25    L=250
Means: α₀
BM1 (low)     0.002   0.001   0.001    0.002   0.002   0.002    0.002   0.002   0.002
BM2           0.000   0.000   0.000    0.000   0.000   0.000    0.000   0.000   0.000
BM3           0.000   0.002   0.001    0.002   0.002   0.002    0.002   0.002   0.003
BM4           0.001   0.003   0.001    0.004   0.005   0.006    0.005   0.006   0.007
BM5 (high)    0.004   0.002   0.006    0.006   0.006   0.005    0.008   0.007   0.006
Critical 5% t-statistics for α₀
BM1 (low)     2.157   2.593   2.705    1.691   1.847   1.914    1.526   1.618   1.651
BM2           2.297   2.523   2.742    1.916   2.067   2.156    1.960   2.034   2.056
BM3           2.296   2.681   2.916    2.093   2.218   2.380    2.143   2.206   2.301
BM4           2.343   2.686   3.060    2.059   2.253   2.399    2.132   2.233   2.368
BM5 (high)    2.369   2.659   3.099    2.135   2.233   2.326    2.269   2.284   2.359
Means: α₁
BM1 (low)     0.002   0.000   0.001    0.001   0.004   0.002    0.000   0.000   0.000
BM2           0.000   0.000   0.002    0.001   0.000   0.001    0.000   0.000   0.001
BM3           0.009   0.006   0.008    0.001   0.005   0.008    0.001   0.007   0.001
BM4           0.019   0.027   0.004    0.007   0.003   0.012    0.003   0.003   0.005
BM5 (high)    0.021   0.028   0.068    0.005   0.028   0.027    0.002   0.006   0.009
Critical 5% t-statistics for α₁
BM1 (low)     2.381   4.088   5.382    2.037   3.243   3.917    1.962   3.115   3.813
BM2           2.390   3.884   4.956    2.025   3.145   3.793    1.971   3.044   3.637
BM3           2.418   4.146   5.720    1.972   3.264   3.999    1.952   3.148   3.804
BM4           2.403   4.263   5.705    2.078   3.240   3.934    1.999   3.179   3.807
BM5 (high)    2.417   4.227   5.594    2.005   3.271   4.076    2.021   3.134   3.817
Means: b₁
BM1 (low)     0.018   0.007   0.054    0.007   0.002   0.051    0.021   0.001   0.004
BM2           0.003   0.032   0.015    0.000   0.004   0.009    0.010   0.000   0.002
BM3           0.108   0.050   0.128    0.066   0.033   0.042    0.041   0.004   0.034
BM4           0.230   0.050   0.062    0.016   0.054   0.087    0.028   0.068   0.032
BM5 (high)    0.479   0.075   0.389    0.041   0.113   0.006    0.136   0.032   0.058
Critical 5% t-statistics for b₁
BM1 (low)     2.612   2.522   2.548    2.065   2.168   2.181    2.045   2.115   2.061
BM2           2.521   2.591   2.559    2.146   2.126   2.149    2.056   2.056   2.067
BM3           2.573   2.501   2.585    2.083   2.159   2.089    2.103   2.023   2.061
BM4           2.624   2.556   2.552    2.048   2.114   2.091    2.130   2.004   2.035
BM5 (high)    2.628   2.640   2.536    2.177   2.149   2.068    2.084   2.050   1.982

The table shows the results of 10,000 simulations from a conditional asset pricing model, allowing for the possibility of data mining for the lagged instruments. The dependent variables are book-to-market quintile portfolios. The regression model is: r_{t+1} = α₀ + α₁Z_t + b₀r_{m,t+1} + b₁r_{m,t+1}Z_t + u_{t+1}. T is the sample size and L is the number of lagged instruments mined. The true predictive R² in the artificial data generating process is 0.005.
69.7.6 Revisiting Previous Evidence In this section, we explore the impact of the joint effects of data mining and spurious regression bias on the asset pricing evidence based on regression (Equation (69.3)). First, we revisit the studies listed in Table 69.2. Consider the models with both time-varying alphas and betas. If the data mining
searches more than 250 variables predicting the test asset return and T D 350, the 5% cut-off value to apply to the t-statistic for ’1 is larger than 3.8 in absolute value. For smaller sample sizes, the cut-off value is even higher. Note in Table 69.2 that the largest t-statistic for ’1 in Shanken (1990) with a sample size of 360 is 3:57 on the T-bill volatility, while the largest t-statistic for ’1 in Christopherson et al.
1086
(1998) with a sample size of 144 is 3.72 on the dividend yield. This means that the significance of the time-varying alphas in both of these studies is questionable. However, the largest t-statistic for b1 in Shanken (1990) exceeds the empirical 5% cut-off, irrespective of spurious regression and data mining adjustments. This illustrates that the evidence for time-varying beta is robust to the joint effects of data mining and spurious regression bias, while the evidence for timevarying alphas is fragile. Now consider the model with no time-varying alpha. If the data mining searches more than 250 variables to predict the test asset return, the 5% cut-off value to apply to the t-statistic on b1 is less than 3.5 in absolute value. Cochrane (1996) reports a t-statistic of 4:74 on the dividend yield in a timevarying beta, with a sample of T D 186. Thus, we find no evidence to doubt the inference that there is a time-varying beta. (However, the significance of the term premium in the time-varying beta, with a t-statistic of 1:76, is in doubt at the 10% level.) Finally, consider the model with no time-varying beta. If the data mining searches more than 25 variables to predict the test asset return, then the 5% cut-off value to apply to the t-statistic on ˛1 is larger than 3.1 in absolute value. The largest t-statistic in Jagannathan and Wang (1996) with a sample size of 330 is 3.1. Therefore, their evidence for a time-varying alpha does not survive even with a modest amount of data mining. We conclude that some aspects of the conditional asset pricing regression (Equation (69.3)) are robust to data mining over persistent predictor variables, while others are not. The regression delivers reliable estimates of average abnormal returns and betas. However, the estimates of time-varying alphas may have vastly overstated statistical significance when standard tests are used.
69.8 Solutions to the Problems of Spurious Regression and Data Mining 69.8.1 Solutions in Predictive Regressions The essential problem in dealing with the spurious regression bias is to get the right standard errors. We examine the Newey and West (1987) style standard errors that have been popular in recent studies. These involve a number of “lag” terms to capture persistence in the regression error. We use the automatic lag selection procedure described in footnote 1, and we compare it to a simple ordinary least squares (OLS) regression with no adjustment to the standard errors, and to a heteroskedasticity-only correction due to White (1980). Table 69.9 shows the critical t-ratios you
W. Ferson et al.
would have to use in a 5%, two-tailed test, accounting for the possibility of spurious regression. Here we consider an extreme case with D 99%, because if we can find a solution that works in this case it should also work in most realistic cases. The critical t-ratios range from 2.24 to 6.12 in the first three columns. None of the approaches delivers the right critical value, which should be 1.96. The table shows that a larger sample size is no insurance against spurious regression. In fact, the problem is the worst at the largest sample size. The Newey-West approach is consistent, which means that by letting the number of lags grow when you have longer samples, you should eventually get the right standard error and solve the spurious regression problem. So, the first potential solution we examine is simply to use more lags in the consistent standard errors. Unfortunately, it is hard to know how many lags to use. The reason is that in stock return regressions the large unexpected part of stock returns is in the regression error, and this “noise” masks the persistence in the expected part of the return. If you use too few lags the standard errors are biased and the spurious regression remains. The “White” example in column two is an illustration where the number of lags is zero. If you use too many lags the standard errors will be inefficient and inaccurate, except in the largest sample sizes. We use simulations to evaluate the strategy of letting the number of lags grow large. We find that in realistic sample sizes, more lags do not help the spurious regression problem. The fourth column of Table 69.9 (denoted NW(20)) shows an example of this where 20 lags are used in monthly data. The critical t-ratios are still much larger than two. In the smaller sample size .T D 60/ it is actually better to use the standard procedure, without any adjustments. A second potential solution to the spurious regression problem is to include a lagged value of the dependent variable as an additional right-hand side variable in the regression. The logic of this approach is that the spurious regression problem is caused by autocorrelation in the regression residuals, which is inherited from the dependent variable. Therefore, logic suggests that putting a lagged dependent variable in the regression should “soak up” the autocorrelation, leaving a clean residual. The columns of Table 69.9 labeled “lagged return” evaluate this approach. It helps a little bit, compared with no adjustment, but the critical t-ratios are still much larger than two at the larger sample sizes. For a hypothetical monthly sample with 350 observations, a t-ratio of 3.7 is needed for significance. The reason that this approach doesn’t work very well is the same reason that increasing the number of lags in the Newey-West method fails to work in finite samples. It is peculiar to stock return regressions, where the ex ante expected return may be persistent but the actual return includes a large amount of unpredictable noise. Spurious regression is driven in this case by persistence in
69 Spurious Regression and Data Mining in Conditional Asset Pricing Models
1087
Table 69.9 Possible solutions to the spurious regression problem: Critical t-ratios Lagged return
Detrended (12)
Observations
OLS
White
60 350 2,000
2:24 4:04 6:08
2:36 4:10 6:12
NW(auto)
NW(20)
OLS
NW(auto)
OLS
NW(auto)
2:71 3:87 4:62
3:81 3:77 4:17
2:19 3:74 5:49
2:67 3:73 4:58
2:06 2:28 2:33
2:46 2:21 1:94
Each cell contains the critical t -ratios at the 97.5 percentiles of 10,000 Monte Carlo simulations. OLS contains the critical t -ratios without any adjustment to the standard errors, in the White column the t -stats are formed using White’s standard errors, the NW(auto) t -stats use Newey-West standard errors based on the automatic lag selection, the NW(20) t -stats use the Newey-West procedure with 20 lags. The regression model of stock returns in columns two-to-five has one independent variable, the lagged instrument; in columns six and seven – two independent variables, the lagged instrument and the lagged return; in the last two columns the only independent variable, the lagged instrument, is stochastically detrended using a trailing 12-month moving average. The autocorrelation parameter of the ex ante expected return and the lagged predictor variable is set to 99% and the ex ante return variance is 10% of the total return variance
the ex ante return, but the noise makes the lagged return a poor instrument for this persistence.13 Of the various approaches we tried, the most practically useful insurance against spurious regression seems to be a form of “stochastic detrending” of the lagged variable, advocated by Campbell (1991). The approach is very simple. Just transform the lagged variable by subtracting off a trailing moving average of its own past values. Instead of regressing returns on Zt , regress them on: Xt D Zt .1=/†j D1;:::; Zt j :
(69.7)
While different numbers of lags could be used in the detrending, Campbell uses 12 monthly lags, which seems natural for monthly data. We evaluate the usefulness of his suggestion in the last two columns of Table 69.9. With this approach the critical t-ratios are less than 2.5 at all sample sizes, and much closer to 1.96 than any of the other examples. The simple detrending approach works pretty well. Detrending lowers the persistence of the transformed regressor, resulting in autocorrelations that are below the levels where spurious regression becomes a problem. Stochastic detrending can do this without destroying the information in the data about a persistent ex ante return, as would be likely to occur if the predictor variable is simply first differenced. Overall, we recommend stochastic detrending as a simple method for controlling the problem of spurious regression in stock returns.
13
More formally, consider a case where the ex ante return is an AR(1) process, in Box-Jenkins notation. The realized return is distributed as an AR(1) plus noise, which is ARMA(1,1). Regressing the return on the lagged return, the residual may still be highly persistent.
69.8.2 Solutions in Conditional Asset Pricing Models Since detrending works relatively well in simple predictive regressions, one would think of using it also in conditional asset pricing tests to correct the inflated t-statistics on the time-varying alpha coefficient. However, as we observed above, the bias in the t-statistic on a1 is largely due to data mining rather than spurious regression. As a result, high t-statistics on a1 for large number of data mining searches come not from the high autocorrelation of Zt but rather from its high cross-correlation with the asset return. Therefore, simple detrending does not work in this case because chosen instruments may not necessarily have high persistency.14
69.9 Robustness of the Asset Pricing Results This section summarizes the results of a number of additional experiments. We extend the simulations of the asset pricing models to consider examples with more than a single lagged instrument. We consider asset pricing models with multiple factors, motivated by Merton’s (1973) multiple-beta model. We also examine models where the data mining to select the lagged instruments focuses on predicting the market portfolio return instead of the test asset returns.
14 Note that the more the asset return volatility resembles that of the entire market (i.e., if it is lower than in our simulations) the higher is the likelihood of finding more evidence of spurious regression bias. Then a simple detrending will help adjust the t-statistics of both a1 and b1 just as it did in the predictive regression case.
1088
69.9.1 Multiple Instruments The experiments summarized above focus on a single lagged instrument, while many studies in the literature use multiple instruments. We modify the simulations, assuming that the researcher mines two independent instruments with the largest absolute t-statistics and then uses both of them in the conditional asset pricing regression (Equation (69.3)) with time-varying betas and alphas. (Thus, there are two a1 coefficients and two b1 coefficients.) These simulations reveal that the statistical behavior of both coefficients are similar to each other and similar to our results as reported in Table 69.5.
69.9.2 Multiple-Beta Models We extend the simulations to study models with three state variables or factors. In building the three-factor model, we make the following assumptions. All three risk premiums are linear functions of one instrument, Z . The factors differ in their unconditional means and their disturbance terms, which are correlated with each other. The variancecovariance matrix of the disturbance terms matches that of the residuals from regressions of the three Fama-French (1993, 1996) factors on the lagged dividend yield. The true coefficients for the asset return on all three factors and their interaction terms with the correct lagged instrument, Z , are set to unity. Thus, the true conditional betas on each factor are equal to 1 C Z . We find that the bias in the t-statistic for ’1 remains and is similar to the simulation in Table 69.5. There are no biases in the t-statistics associated with the b1 ’s for the larger sample sizes.
W. Ferson et al.
is confined to the smaller sample size, T D 66. Mining to predict the market return has little impact on the sampling distribution of b1 .
69.9.4 Simulations Under the Alternative Hypothesis Note that the return generating process (Equation (69.6)) does not include an intercept or alpha, consistent with asset pricing theory. Thus, the data are generated under the null hypothesis that an asset pricing model holds exactly. However, no asset pricing model is likely to hold exactly in reality. We therefore conduct experiments in which the data generating process allows for a nonzero alpha. We modify Equation (69.6) as follows: rt C1 D a1 Zt C ˇt rm;t C1 C ut C1 ; ˇt D 1 C Zt ; rm;t C1 D C k Zt C wt C1 :
(69.8)
In the system (Equation (69.8)), there is a time-varying alpha, proportional to Zt . We set the coefficient a1 D k and estimate the model (Equation (69.3)) again. With this modification, the bias in the time-varying alpha coefficient, ˛1 , is slightly worse at the larger R2 values and larger values of L than it was before. The overall patterns, including the reduction in bias for larger values of T , are similar. We also run the model with no time-varying beta, and the results are similar to those reported above for that case.
69.10 Conclusions 69.9.3 Predicting the Market Return Much of the previous literature looked at more than one asset to select predictor variables. For the examples reported in the previous tables, the data mining is conducted by attempting to predict the excess returns of the tests assets. But a researcher might also choose instruments to predict the market portfolio return. We examine the sensitivity of the results to this change in the simulation design. The results for the conditional asset pricing model with both time-varying alphas and betas are re-examined. Recall that when the instrument is mined to predict the test asset return, there is an upward bias in the t-statistic for ˛1 . The bias increases with data mining and decreases somewhat with T . When the instruments are mined to predict the market, the bias in ’1 is small and
Our results have distinct implications for tests of predictability and model selection. In tests of predictability, the researcher chooses a lagged variable and regresses future returns on the variable. The hypothesis is that the slope coefficient is zero. Spurious regression presents no problem from this perspective, because under the null hypothesis the expected return is not actually persistent. If this characterizes the academic studies of Table 69.1, the eight t-ratios larger than two suggest that ex ante stock returns are not constant over time. The more practical problem is model selection. In model selection, the analyst chooses a lagged instrument to predict returns, for purposes such as implementing a tactical asset allocation strategy, active portfolio management, conditional
69 Spurious Regression and Data Mining in Conditional Asset Pricing Models
performance evaluation or market timing. Here is where the spurious regression problem rears its ugly head. You are likely to find a variable that appears to work on the historical data, but will not work in the future. A simple form of stochastic detrending lowers the persistence of lagged predictor variables, and can be used to reduce the risk of finding spurious predictive relations. The pattern of evidence for the lagged variables in the academic literature is similar to what is expected under a spurious data mining process with an underlying persistent ex ante return. In this case, we would expect instruments to be discovered, then fail to work with fresh data. The dividend yield rose to prominence in the 1980s, but apparently fails to work for post-1990 data (Goyal and Welch 2003; Schwert 2003). The book-to-market ratio also seems to have weakened in recent data. When more data are available, new instruments appear to work (e.g., Lettau and Ludvigson 2001; Lee et al. 1999). Analysts should be wary that the new instruments, if they arise from the spurious mining process that we suggest, are likely to fail in future data, and thus fail to be practically useful. We also study regression models for conditional asset pricing models in which lagged variables are used to model conditional betas and alphas. The conditional asset pricing literature has for the most part used the same variables that were discovered based on simple predictive regressions, and our analysis characterizes the problem by assuming the data mining occurs in this way. Our results relate to several stylized facts that the literature on conditional asset pricing has produced. Previous studies find evidence that the intercept, or average alpha, is smaller in a conditional model than in an unconditional model, suggesting, for example, that the conditional CAPM does a better job of explaining average abnormal returns. Our simulation evidence finds that the estimates of the average alphas in the conditional models are reasonably well specified in the presence of spurious regression and data mining, at least for samples larger than T D 66. Some caution should be applied in interpreting the common 60-month rolling regression estimator, but otherwise we take no issue with the stylized fact that conditional models deliver smaller average alphas. Studies typically find evidence of time varying betas based on significant interaction terms. Here again we find little cause for concern. The coefficient estimator for the interaction term is well specified in larger samples and largely unaffected by data mining in the presence of persistent lagged regressors. There is an exception when the model is estimated without a linear term in the lagged instrument. In this case, the coefficient measuring the time-varying beta is slightly biased. Thus, when the focus of the study is to estimate accurate conditional betas, we recommend that a linear term be included in the regression model.
1089
Studies also find that even conditional models fail to explain completely the dynamic properties of stock returns. That is, the estimates indicate time-varying conditional alphas. We find that this result is the most problematic. The estimates of time variation in alpha inherit biases similar to, if somewhat smaller than, the biases in predictive regressions. We use our simulations to revisit the evidence of several prominent studies. Our analysis suggests that the evidence for time-varying alphas in the current literature should be viewed with some suspicion. Perhaps the current generation of conditional asset pricing models does a better job of capturing the dynamic behavior of asset returns than existing studies suggest. Finally, we think that our study, as summarized in this chapter, represents the beginning of what could be an important research direction at the nexus of econometrics and financial economics. The literature in this area has arrived at a good understanding of a number of econometric issues in asset pricing research; the two issues that we take on are only part of a much longer list that includes stochastic regressor bias, unit roots, cointegration, overlapping data, time aggregation, structural change, errors-in-variables, and many more. But what is less understood is how these econometric issues interact with each other. We have seen that the interaction of data mining and spurious regression is likely to be a problem of practical importance. Many other econometric issues also occur in combination in our empirical practice. We need to study these other interactions in future research.
References Breen, W., L. R. Glosten and R. Jagannathan. 1989. “Economic significance of predictable variations in stock index returns.” Journal of Finance 44, 1177–1190. Campbell, J. Y. 1987. “Stock returns and the term structure.” Journal of Financial Economics 18, 373–400. Campbell, J.Y. 1991. “A variance decomposition for stock returns.” Economic Journal 101, 157–179. Chan, C. K. and N. F. Chen. 1988. “An unconditional asset pricing test and the role of size as an instrumental variable for risk.” Journal of Finance 43, 309–325. Christopherson, J. A., W. E. Ferson and D. Glassman. 1998. “Conditioning manager alpha on economic information: another look at the persistence of performance.” Review of Financial Studies 11, 111–142. Cochrane, J. H. 1996. “A cross-sectional test of an investment-based asset pricing model.” Journal of Political Economy 104, 572–621. Cochrane, J. H. 1999. New facts in finance, Working Paper, University of Chicago, Chicago. Conrad, J. and G. Kaul. 1988. “Time-variation in expected returns.” Journal of Business 61, 409–425. Cox, J. C., J. E. Ingersoll Jr. and S. A. Ross. 1985. “A theory of the term structure of interest rates.” Econometrica 53, 363–384. Fama, E. F. 1970. “Efficient capital markets: a review of theory and empirical work.” Journal of Finance 25, 383–417.
1090 Fama, E. F. 1990. “Stock returns, expected returns, and real activity.” Journal of Finance 45, 1089–1108. Fama, E. F., L. Fisher, M. C. Jensen and R. Roll. 1969. “The adjustment of stock prices to new information.” International Economic Review 10, 1–21. Fama, E. F. and K. R. French. 1988a. “Dividend yields and expected stock returns.” Journal of Financial Economics 22, 3–25. Fama, E. F. and K. R. French. 1988b. “Permanent and temporary components of stock prices.” Journal of Political Economy 96, 246–273. Fama, E. F. and K. R. French. 1989. “Business conditions and expected returns on stocks and bonds.” Journal of Financial Economics 25, 23–49. Fama, E., and K. French. 1993. “Common risk factors in the returns on stocks and bonds.” Journal of Financial Economics 33, 3–56. Fama, E., and K. French. 1996. “Multifactor explanations of asset pricing anomalies.” Journal of Finance 51, 55–87. Fama, E. F. and G. W. Schwert. 1977. “Asset returns and inflation.” Journal of Financial Economics 5, 115–146. Fama, E. F. and J. MacBeth. 1973. “Risk, return, and equilibrium: empirical tests.” Journal of Political Economy 71, 607–636. Ferson, W. E. and C. R. Harvey. 1991. “Sources of predictability in portfolio returns.” Financial Analysts Journal 3, 49–56. Ferson, W. E. and C. R. Harvey. 1997. “The fundamental determinants of international equity returns: a perspective on country risk and asset pricing.” Journal of Banking and Finance 21, 1625–1665. Ferson, W. E. and C. R. Harvey. 1999. “Conditioning variables and the cross-section of stock returns.” Journal of Finance 54, 1325–1360. Ferson, W. E. and R. Korajczyk. 1995. “Do arbitrage pricing models explain the predictability of stock returns?” Journal of Business 68, 309–349. Ferson, W. E. and R. Schadt. 1996. “Measuring fund strategy and performance in changing economic conditions.” Journal of Finance 51, 425–462. Fleming, J., C. Kirby and B. Ostdiek. 2001. “The economic value of volatility timing.” Journal of Finance 61, 329–352. Foster, F. D., T. Smith and R. E. Whaley. 1997. Assessing goodnessof-fit of asset pricing models: the distribution of the maximal R-squared.” Journal of Finance 52, 591–607. Gibbons, M. R. and W. E. Ferson. 1985. “Testing asset pricing models with changing expectations and an unobservable market portfolio.” Journal of Financial Economics 14, 216–236. Goyal, A. and I. Welch. 2003. “Predicting the equity premium with dividend ratios.” Management Science 49, 639–654. Granger, C. W. J. and P. Newbold. 1974. “Spurious regressions in economics.” Journal of Econometrics 4, 111–120. Harvey, C. R. 1989. “Time-varying conditional covariances in tests of asset pricing models.” Journal of Financial Economics 24, 289–318. Hastie, T., R. Tibshirani and J. Friedman. 2001. The elements of statistical learning, Springer, New York. Huberman, G. and S. Kandel. 1990. “Market efficiency and value line’s record.” Journal of Business 63, 187–216. Jagannathan, R. and Z. Wang. 1996. “The conditional CAPM and the cross-section of expected returns.” Journal of Finance 51, 3–54.
W. Ferson et al. Kandel, S. and R. Stambaugh. 1996. “On the predictability of stock returns: an asset allocation perspective.” Journal of Finance 51, 385– 424. Kendall, M. G. 1954. “A Note on the bias in the estimation of autocorrelation.” Biometrica 41, 403–404. Keim, D. B. and R. F. Stambaugh. 1986. “Predicting returns in the bond and stock markets.” Journal of Financial Economics 17, 357–390. Kothari, S. P. and J. Shanken. 1997. “Book-to-market time series analysis.” Journal of Financial Economics 44, 169–203. Lettau, M. and S. Ludvigson. 2001. “Consumption, aggregate wealth and expected stock returns.” Journal of Finance 56, 815–849. Lee, C., J. Myers and B. Swaminathan. 1999. “What is the intrinsic value of the dow?” Journal of Finance 54, 1693–1742. Lo, A. W. and A. C. MacKinlay. 1988. “Stocks prices do not follow random walks.” Review of Financial Studies 1, 41–66. Lo, A. W. and A. C. MacKinlay. 1990. “Data snooping in tests of financial asset pricing models.” Review of Financial Studies 3, 431–467. Lucas, R. E. Jr. 1978. “Asset prices in an exchange economy.” Econometrica 46, 1429–1445. Maddala, G. S. 1977 Econometrics, McGraw-Hill, New York. Marmol, F. 1998. “Spurious regression theory with non-stationary fractionally integrated processes.” Journal of Econometrics 84, 233– 250. Merton, R. C. 1973. “An intertemporal capital asset pricing model.” Econometrica 41, 867–887. Newey, W. K. and K. D. West. 1987. “A simple, positive definite, heteroskedasticity and autocorrelation consistent covariance matrix.” Econometrica 55, 703–708. Petkova, R. and L. Zhang. 2005. “Is value riskier than growth?” Journal of Financial Economics 78, 187–202. Phillips, P. C. B. 1986. “Understanding spurious regressions in econometrics.” Journal of Econometrics 33, 311–340. Phillips, P. C. B. 1998. “New tools for understanding spurious regressions.” Econometrica 66, 1299–1326 Pontiff, J. and L. Schall. 1998. “Book-to-market as a predictor of market returns.” Journal of Financial Economics 49, 141–160. Schwert, G. W. 2003. “Anomalies and market efficiency,” in Handbook of the Economics of Finance, G. M. Constantinides, M. Harris, R. M. Stulz (Eds.). North Holland, Amsterdam. Shanken, J. 1990. “Intertemporal asset pricing: an empirical investigation.” Journal of Econometrics 45, 99–120. Sharpe, W. F. 1964. “Capital asset prices: a theory of market equilibrium under conditions of risk.” Journal of Finance 19, 425–442. Stambaugh, R. S. 1999. “Predictive regressions.” Journal of Financial Economics 54, 315–421. White, H. 1980. “A heteroscedasticity-consistent covariance matrix estimator and a direct test for heteroscedasticity.” Econometrica 48, 817–838. White, H. 2000. “A reality check for data snooping.” Econometrica 68, 1097–1126. Yule G. U. 1926. “Why do we sometimes get nonsense correlations between time series? A study in sampling and the nature of time series.” Journal of the Royal Statistical Society 89, 1–64.
Chapter 70
Issues Related to the Errors-in-Variables Problems in Asset Pricing Tests Dongcheol Kim
Abstract This article discusses the issue related to the errors-in-variables (EIV) problem in the asset pricing tests of the Fama-MacBeth two-pass methodology. The two-pass methodology suffers from the well-known errors-in-variables bias that could attenuate the apparent significance of the market beta. This paper provides a new correction for the EIV problem. After the correction, the market beta shows an economically and statistically significant explanatory power for average stock returns, and firm size has much less support than previously known. This article also examines the sensitivity of the EIV correction to the factors involved in the estimation of the market betas. Keywords Two-pass methodology r Errors-in-variables bias r Cross-section of stock returns r Market beta r Firm size
70.1 Introduction The capital asset pricing model (CAPM) of Sharpe (1964), Lintner (1965), and Black (1972) implies that there exists a positive linear relation between ex ante expected returns and market betas, and variables other than the market beta should not have any relation with the expected returns. However, recent findings suggest that market beta has very limited ability to explain the cross-section of average stock returns, while other idiosyncratic variables such as firm size and the bookto-market equity ratio have considerable power (see Banz 1981; Reinganum 1981; Keim 1983; Fama and French 1992; Davis 1994, among others). These findings cast doubt on the validity of the CAPM. Most empirical asset pricing tests that have found the contradictions to the CAPM, however, involve an errorsin-variables (EIV) problem in Fama and MacBeth’s (1973) two-pass estimation methodology: In the first-pass, beta D. Kim () Korea University Business School, Anam-dong, Sungbuk-ku, Seoul 136–701, Korea e-mail: [email protected]
estimates are obtained from separate time-series regressions for each asset, and in the second-pass, gammas are estimated cross-sectionally by regressing asset returns on the estimated betas. Therefore, the explanatory variable in the cross-sectional regression (CSR) is measured with error, which results in the EIV problem in estimating the explanatory power of market betas and the price of beta risk. The EIV problem results in an underestimation of the price of beta risk and an overestimation of the other CSR coefficients associated with variables observed without error (such as firm size or the book-to-market equity ratio). Hence, it is very important to determine whether the weak relation between average stock returns and market beta is the result of the misspecification of the asset pricing model or is simply a consequence of the EIV problem. Several estimation methods have been proposed to address the EIV problem. Litzenberger and Ramaswamy (1979) suggest a correction that involves a weighted least squares version of the CSR estimator with a modified design matrix. However, their corrected estimator can be obtained only when security residual variances are exactly known. Shanken (1992) modifies the traditional two-pass procedure and derives an asymptotic distribution of the CSR estimator within a multifactor framework in which asset returns are generated by portfolio returns and prespecified factors. He presents some support for the use of the modified twopass estimation as an alternative to the traditional two-pass estimation. Shanken also provides an adjustment for the standard errors of the CSR estimators with a multifactor interpretation. Instead of employing the two-pass procedure, Gibbons (1982) uses the maximum likelihood estimation (MLE) approach to eliminate the EIV problem by simultaneously estimating betas and beta risk prices. As Shanken mentions, however, Gibbons’ maximum likelihood estimator is asymptotically identical to the second-pass generalized least squares (GLS) estimator, and the advantage of the simultaneous estimation is, indeed, lost in linearizing the
Most of this article is reproduced from Kim’s previous papers (1995, 1997).
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_70,
1091
1092
nonlinear constraint. Gibbons’ estimator is thus still subject to an EIV bias. McElroy and Burmeister (1988) jointly estimate asset risk factors and their associated risk prices. In particular, Kim (1995, 1997) provides a direct correction factor for the CSR coefficients. The correction is especially useful when all individual assets are used in the CSR estimation, since his estimators for the CSR coefficients are N -consistent (i.e., consistent when the time-series sample size for estimating betas, T , is fixed and the number of assets, N , is allowed to increase without bound). Kim’s correction, therefore, provides a justification for the use of individual assets in asset pricing tests. Individual assets have been rarely used in the previous asset pricing tests, since the use of individual assets causes an estimation bias in estimating risk premia due to the EIV problem. Rather than using individual stocks, thus, formed portfolios have been used to mitigate the EIV bias. In fact, use of individual assets has several advantages. First, it ensures the full utilization of information about the cross-sectional behavior of individual stocks, which might otherwise be lost in forming portfolios. Second, it avoids the data-snooping bias (Lo and MacKinlay 1990). Third, more importantly, it avoids arbitrariness in forming testing portfolios. The CSR test results could be sensitive to the portfolio formation method and the number of portfolios. According to the portfolio formation method and/or the number of portfolios used in the CSR estimation, the estimation results for risk premia are mostly different in the literature (see, for example, Fama and French 1992; Jegadeesh 1992; Kothari et al. 1995). In contrast to previous research, Kim’s method provides direct EIV-correction factors for the risk premia estimators. Therefore, his method is different from Shanken (1992) in that Kim provides a direct correction to the CSR estimates, while Shanken provides a corrected form for the standard error of the CSR estimates. The purpose of this paper is to propose a new correction for the EIV problem and, subsequently, to reexamine the explanatory power of market beta and firm size for average stock returns in the context of the new correction. Without additional information about the measurement error of the beta variable in the CSR model, the correction for the EIV problem is not feasible. It is known, however, that the measurement error of the beta variable is the same as the estimation error of the beta estimated in the first-pass regression. Specifically, conditional on the market return, the measurement error can be linearly represented by the idiosyncratic error terms prior to each CSR. Based on this fact, the necessary information about the relationship between the measurement error variance and the idiosyncratic error variance can be extracted. By utilizing the extracted information, a new correction is derived that is performed through maximum likelihood estimation under either homoscedasticity or heteroscedasticity of the disturbance terms of the market
D. Kim
model. Since this EIV-correction estimation is N -consistent (i.e., consistent when the time-series sample size, T , is fixed and the number of assets, N , is allowed to increase without bound), it is most useful when all available individual stocks (without forming portfolios) are used in the CSR estimation. In fact, the formation of portfolios for the CSR estimation might cause a loss of valuable information about cross-sectional behavior among individual assets, since the cross-sectional variations would be smoothed out. The results show that after correcting for the EIV problem, market beta has economically and statistically significant explanatory power for average stock returns both in the presence and in the absence of the firm size variable in the regression, and that the firm size factor, although significant, becomes much less important. In other words, the results suggest that the previous findings that market betas are dead may have been exaggerated because of their failure to correct for the EIV problem. Nevertheless, the results indicate that there is a misspecification in the CAPM related to firm size.
70.2 The Errors-in-Variables Problem The Fama and MacBeth (1973) second-pass CSR model of estimating the return-risk relation at a specific time t is Rt D 0t C 1t ˇt C "t ;
(70.1)
where Rt D .R1t ; ; RNt /0 is the return vector in excess of the riskless return or the return on a zero-beta portfolio, ˇt D .ˇ1t ; ; ˇNt /0 is the market beta vector, "t D ."1t ; ; "Nt / is the idiosyncratic error vector, and N is the number of assets. Since ˇt is unobservable, the estimated market beta, ˇOt 1 is used as a proxy for the unknown ˇt . The independent variable, therefore, is observed with error ˇOt 1 D ˇt C t 1 ;
(70.2)
where ˇOt 1 D .ˇO1t 1 ; ; ˇNt1 / is the market beta vector estimated from the first-pass time-series regression model (the market model) using T time-series data available up to t 1, and t 1 D .1t 1 ; ; Nt1 /0 is the measurement error vector with mean vector and covariance matrix † . Hereafter, the bold face letter indicates a vector. In the Fama and MacBeth risk premia estimation procedure, the estimates O1t are taken to be the independent and identical sampled values of the price of beta risk, 1 The time-series sample mean .NO1 / of the estimates O1t is regarded as the final estimate of 1 and is used for testing whether the price of beta risk is positive
70 Issues Related to the Errors-in-Variables Problems in Asset Pricing Tests
and significantly different from zero.2 The time-series sample mean .NO0 / of the estimates O0t is used for testing whether 0 is zero 0 should not be significantly different from zero for the CAPM to be valid. As Shanken (1992) points out, use of the predictive beta .ˇOt 1 / in the CSR estimation has some advantages. First, the problem of a spurious cross-sectional relation arising from statistical correlation between returns and estimated market betas can be avoided. Second, the independence between the explanatory variable .ˇOt 1 / and the regression error term ."t / in the CSR can be maintained. Furthermore, the measurement error of the estimated beta could be assumed to be independent of the idiosyncratic error term, because the measurement error is a function of past idiosyncratic error terms prior to each CSR. It is well known that the traditional second-pass least squares estimate of 1t is not N -consistent. Assuming that the measurement error and the idiosyncratic error are independent, Richardson and Wu (1970) show that the traditional ordinary least squares (OLS) estimator, O1tOLS , is negatively (downward) biased: E O1tOLS 1t
2 2 1 1 1t C O.N 2 /; D 1C N 1 .1 C /2 (70.3) where D ˇ2t 1 =2t 1 , 2t 1 is the measurement error (cross-sectional) variance, and O.N 2 / is the term containing all remaining terms with order less than or equal to N 2 . The term ˇ2t 1 is used to designate N P .ˇOit1 ˇNt 1 /2 =.N 1/. Hence, 1t is underestimated i D1
under the traditional least squares estimation and so is the price of beta risk. The underestimation of the price of beta risk leads to an overestimation of the intercept 0 The misspecification of the CAPM is, therefore, exaggerated in the traditional least squares estimation. The firm size effect (Reinganum 1981; Banz 1981) is frequently cited as evidence of the misspecification of the CAPM. To examine whether firm size has the power to explain the cross-section of stock returns, the firm size variable is added to Equation (70.1): Rt D 0t C 1t ˇt C 2t Vt 1 C "t ;
(70.4)
2 Instead of using the time-series average, Amihud et al. (1993) propose a one-time 1 estimate from the joint-pooled time-series and crosssectional model. They analytically show that the joint-pooled estimator of 1 is more efficient than the time-series average of the least squares estimates of 1t obtained by the Fama and MacBeth’s two-pass procedure, since its variance is less than that of the time-series average. However, the joint-pooled estimator uses the beta variable measured with error and thus is still subject to an EIV bias.
1093
where Vt D .V1t 1 ; ; VNt1 /0 is the vector of the logarithm of market values measured without error at t 1. The CSR model of Equation (70.4) has one explanatory variable measured with error and another explanatory variable measured without error. The time-series sample mean .NO2 / of the estimates 2t is used for testing the explanatory power of firm size. Assuming that the measurement error and the idiosyncratic error are independent, McCallum (1972) and Aigner (1974) show that the biases of the traditional OLS estimator of 1t and 2t in Equation (70.4) are 2 E.O1tOLS / 1t D 4
3
5 1t C O N 1 ; 1 C 1 2O 1
ˇV
2 E.O2tOLS / 2t D D
ˇV O V2
4
3
1
1C 1
2O ˇV
(70.5)
5 1t C O.N 1 /
ˇV O OLS E O1t 1t C O.N 1 /; 2 V (70.6)
where ˇV O .ˇV O / is the cross-sectional covariance (correlation coefficient) between the estimated beta and the firm size variable, V2 is the cross-sectional variance of the firm size variable, and O.N 1 / is the term containing all remaining terms with order less than or equal to N 1 . Equation (70.3) shows that the OLS estimator of 1t ; O1tOLS , is negatively biased. Equation (70.5) now shows that after including the firm size variable, 1t , becomes even more underestimated, because the magnitude of negative bias in Equation (70.5) is greater than that in Equation (70.3).3 At the same time, Equation (70.6) shows that the traditional least squares coefficient estimate associated with the firm size variable, O2tOLS , is negatively biased, as long as the estimated beta and the firm size are negatively correlated. In other words, the CSR coefficient associated with the firm size variable is on average estimated more negatively than its true value under the traditional least squares estimation procedure. It is, therefore, possible that the explanatory power of the firm size variable has been overestimated in previous empirical studies, since the negative correlation between these two variables is empirically well-known. The extent of the overestimation depends on the magnitude of the correlation
In terms of O.N 1 /, the negative bias of O1tOLS is apparently greater when the size variable is included in the model than when the beta vari2 is always less than one. The greater able alone is in the model, since ˇV O the correlation coefficient between the beta variable and the second variable (firm size or book-to-market equity), the greater the negative bias of O1tOLS .
3
1094
coefficient between these two variables. The greater the negative correlation, the more severe the overestimation of the coefficient associated with firm size. Equation (70.6) indicates that if the estimated beta and the exactly measured second explanatory variable are positively correlated, the traditional least squares coefficient estimate associated with the second variable is positively biased. An example of this case is the book-to-market equity ratio. This variable is positively correlated with the estimated beta. When the book-to-market equity ratio is included as the second variable in Equation (70.4), the least squares estimate of 2t is positively (upward) biased; that is, the average value of the least squares estimates of 2t is greater than the true value. Therefore, the explanatory power of the bookto-market equity ratio for average stock returns reported by Fama and French (1992) is also exaggerated under the traditional least squares estimation procedure. The biases of the traditional least squares estimators could be eliminated if T -consistent beta estimates are used ( ! 1 as T ! 1), or if a finite set of well-diversified portfolios can be constructed ( ! 1 as t ! 0). However, concerns about changing risk premia parameters and beta nonstationarity limit the time-series length T that can be used. Similarly, the construction of a finite set of well-diversified portfolios is difficult in practice. Under these circumstances, an N -consistent estimation is necessary to asymptotically remove the biases. The N -consistent estimation could also resolve two other empirical issues: the optimal number of portfolios in the two-pass estimation that minimize the mean squared error of risk premia estimates and the best way to form portfolios.4
D. Kim
formation about the relation between the measurement error variance and the idiosyncratic error variance is extracted and this information is incorporated into the maximum likelihood estimation. Homoscedasticity of the disturbance term of the market model is assumed.5 Here, direct EIV-correction factors for the traditional least squares estimators of the CSR coefficients are provided. Rather than forming portfolios or increasing the time-series sample size, the proposed methodology increases the cross-sectional sample size (N) to its maximum. The N-consistent estimation is most useful when all available individual stocks (i.e., maximum cross-sectional sample size) are used in the CSR estimation. In fact, use of all available individual stocks ensures the full utilization of information about the cross-sectional behavior of individual stocks, which might otherwise be lost in forming portfolios. In order to derive the EIV-corrected N-consistent estimators of the CSR coefficients when one explanatory variable is measured with error and the other variable is measured without error in the model (see the Appendix in Kim (1995) for the case of K variables measured without error), define
" t D t t 1
Rt 0t 1t ˇt 2t Vt 1 D O ˇt 1 ˇt
This section presents an N-consistent estimation of the CSR coefficients (0t ; 1t ; 2t ) that asymptotically removes the biases of the Fama and MacBeth’s (1973) two-pass procedure. This N-consistent estimation is feasible when additional in4
Portfolio grouping is widely used to decrease the biases of the traditional least squares CSR estimates (OtOLS ). However, there is a tradeoff between the bias and the variance of the estimate according to the number of portfolios. As the number of portfolios (N ) is increased (and hence, there is a smaller number of individual securities within a portfolio), the magnitude of the bias becomes greater, while the variance of the estimate becomes smaller, and vice versa. Hence, there is an optimal number of portfolios with which a minimum mean squared error can be achieved. However, when risk premium is estimated by the time-series mean of the CSR estimates, the mean squared error of NO would be dominated by its bias, since its the risk premium estimate () variance would monotonically decrease as the testing period becomes longer.
N 0; ;
†" †" where D , and ˇt 1 is the (N 1) beta vector †" † estimated through the market model by using T time-series return observations from t T through t 1 (prior to each CSR). Then assume that ( 0 †" for t D t 0 (70.1A) E "t "t jRm D 0 for t ¤ t 0 E."t t 1 jRm / D †" D 0;
70.3 A Correction for the Errors-in-Variables Bias
(70.2A)
where Rm is the set of T market returns used to estimate betas prior to each CSR. The assumption (Equation (70.1A)) indicates that the idiosyncratic error terms "t are intertemporally independent and cross-sectionally heteroscedastic. Without loss of generality, assumption (Equation (70.2A)) can be maintained within the two-pass estimation framework, since conditional on the market return, ˇOt 1 is the best linear unbiased estimate by the classical regression theory and thus t 1 is a linear function of past idiosyncratic error terms "s prior to the CSR at time t.s t 1/. Put more specifically, the measurement error term of an asset i ’s beta variable in the CSR at time t is represented by it1 D
t 1 X
ais "is ;
(70.7)
sDt T
5
For the case of heteroscedasticity of the disturbance term, see Kim (1995).
70 Issues Related to the Errors-in-Variables Problems in Asset Pricing Tests
P 2 where ais D his .Rms RQ m /= tj1 hij .Rmj RQ m / , his D Pt 1 Dt T Pt 1 1=Var."is /, and RQ m D j Dt T hij Rmj = j Dt T hij is the weighted average of the market returns. Conditional on the market return, therefore, the measurement error term of the beta variable in each CSR (it1 ) is independent of the idiosyncratic error term in the CSR ("it ). This is a consequence of the use of the predictive beta.6 Without additional information about the error terms "t and t 1 , the MLE of the unknown parameter vector D .0t ; 1t ; 2t ; ˇt 1 / that minimizes the quadratic function (or the likelihood function) L D t 1 0t does not exist. However, based on the fact that the measurement error is linearly related to the idiosyncratic error terms prior to each CSR (Equation (70.7)) additional information about the relationship between the variances of the error terms can be obtained. That is, conditional on the market return, the ratio of the idiosyncratic error variance matrix D D, to the measurement error variance matrix is †" †1 where D D diag.ı1t ; ; ıNt /, and ıit D Var."it /=Var.it1 / D
t 1 X
.Rms RN m /2 ŒVar."it / = Var."is / :
1095
where "Ot D Rt 0t 1t ˇOt 1 2t Vt 1 , and 1 is a vector of ones. The solution of Equations (70.9), (70.10), and (70.11) is the feasible GLS version of the MLE’s of the CSR coefficients (0t 1t ; 2t ), since the unknown error covariance maO " , which trix †" is substituted with its consistent estimator, † 7 is estimated from the market model residuals. Regardless of O " is used, the solution has the same asympwhether †" or † totic distribution under certain conditions (see Zellner 1962; Schmidt 1976; Fomby et al. 1984). Obviously, the weighted least squares (WLS) or OLS version of the MLEs can also be obtained if †" is a diagonal or a scalar matrix, respectively. In deriving the N -consistent estimators of the CSR coefficients at time t the knowledge of the ratio, ıit , is crucial. When the disturbance term of the market model is homoscedastic (i.e., two idiosyncratic error variance terms in Equation (70.8), Var("is) and Var("it ), are equal), the ratio is known conditionally on the market return and is the same as the sum of the squared mean-adjusted market returns for all assets because two idiosyncratic error variance terms in Equation (70.8) are cancelled out. On the other hand, when the disturbance term of the market model is heteroscedastic, the ratio is different across assets, and the idiosyncratic error variance terms contained in the ratio should be estimated. The case of homoscedasticity and then the case of heteroscedasticity is next discussed next.
(70.8)
sDt T
With this additional information, the N -consistent MLE of 0t , 1t , and 2t can be obtained by (i) setting the derivative of L with respect to ˇt 1 to zero, (ii) substituting the unknown ˇt 1 in L with its MLE (from the equation @L=@ˇt 1 D 0), and (iii) setting the derivatives of L with respect to 0t , 1t , and 2t to zero; that is, 1 1 @L O " ˇOt 1 † D "O0t D 1t2 IN C D @1t O 1 C 1t "O0t D.1t2 IN C D/2 † Ot D 0 " "
(70.9)
70.3.1 Homoscedastic Market Model Disturbances When the disturbance term of the market model is intertemporally homoscedastic and the time-series sample size (T ) in estimating ˇ is equal, the ratio of the idiosyncratic error variance to the measurement error variance, ıit , is the same for all assets; that is, 2 ıit D ıt D T O m;t 1 ;
@L O 1 D "O0t D.1t2 IN C D/1 † " Vt 1 D 0 @2t
(70.10)
@L O 1 D "Ot † " 1 D0 @2t
(70.11)
2 where O m;t 1 is the MLE of the market variance obtained by using the market returns from t T through t 1. The closedform solution of Equations (70.9–70.11) is then available.
When betas estimated by using the full-period sample are used in the CSR as in Fama and French (1992), the measurement error term is a linear function of all idiosyncratic error terms over the testing period, and not of only past idiosyncratic error terms. In each CSR, therefore, one of the idiosyncratic error terms contained in the measurement error term is overlapped with the idiosyncratic error term of the CSR model. In this case, the independence between the idiosyncratic error term of the CSR model and the measurement error term cannot be maintained.
7 These MLEs under the normality condition are identical to the homoscedastic version of the method of moments estimators without any distributional assumption of the error terms, since the system of moment conditions is exactly identified and the differentiating equations for the MLE are equivalent to the equations defining the method of moments estimators. See also Judge et al. (1980), Johnston (1984), and Fuller (1987) for deriving the method of moments estimation for the case that one explanatory variable measured with error is alone in the model and †" D IN .
6
1096
D. Kim
By solving these equations, the following EIV-corrected MLEs of the CSR coefficients at time $t$ are obtained:

$$\hat{\gamma}_{1t} = \frac{M + \left[ M^2 + 4\,\delta_t\, m_{R\hat{\beta}}^2 \left(1 - \hat{\rho}_{RV}\,\hat{\rho}_{\hat{\beta}V}\big/\hat{\rho}_{R\hat{\beta}}\right)^2 \right]^{1/2}}{2\, m_{R\hat{\beta}} \left(1 - \hat{\rho}_{RV}\,\hat{\rho}_{\hat{\beta}V}\big/\hat{\rho}_{R\hat{\beta}}\right)}, \qquad (70.12)$$

$$\hat{\gamma}_{2t} = \frac{m_{RV} - \hat{\gamma}_{1t}\, m_{\hat{\beta}V}}{m_{VV}}, \qquad (70.13)$$

$$\hat{\gamma}_{0t} = \bar{R}_t - \hat{\gamma}_{1t}\,\bar{\hat{\beta}}_{t-1} - \hat{\gamma}_{2t}\,\bar{V}_{t-1}, \qquad (70.14)$$

where

$$M = m_{RR}\left(1 - \hat{\rho}_{RV}^2\right) - \delta_t\, m_{\hat{\beta}\hat{\beta}} \left(1 - \hat{\rho}_{\hat{\beta}V}^2\right),$$

$$m_{xy} = \frac{1}{N} \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(x_i - \tilde{x})(y_j - \tilde{y}) = \frac{1}{N}\,(x - \mathbf{1}\tilde{x})'\, \hat{\Sigma}_\varepsilon^{-1}\, (y - \mathbf{1}\tilde{y})$$

is the weighted and centered cross-sectional second sample co-moment about the weighted average between variables $x$ and $y$, $\tilde{x} = \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij} x_i \big/ \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}$ is the cross-sectional weighted average of a variable $x$, $w_{ij}$ is the $(i,j)$-th element of $\hat{\Sigma}_\varepsilon^{-1}$, and $\hat{\rho}_{xy} = m_{xy}/(m_{xx}\, m_{yy})^{1/2}$ is the estimated cross-sectional correlation coefficient between variables $x$ and $y$.^8 The subscripts $R$, $\hat{\beta}$, and $V$ correspond to the variables $R_t$, $\hat{\beta}_{t-1}$, and $V_{t-1}$, respectively. The time subscript $t$ is dropped in the expression of the cross-sectional sample co-moments ($m_{xy}$) for convenience. The above MLEs are $N$-consistent and asymptotically normal. The derivation of the MLEs and the proof of their consistency are shown in the appendix of Kim (1995).

^8 For example, the sample co-moment between $R_t$ and $\hat{\beta}_{t-1}$ is $m_{R\hat{\beta}} = \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}(R_{it} - \tilde{R}_t)(\hat{\beta}_{jt-1} - \tilde{\beta}_{t-1})$ for the GLS version; $m_{R\hat{\beta}} = \frac{1}{N}\sum_{i=1}^{N} w_i (R_{it} - \tilde{R}_t)(\hat{\beta}_{it-1} - \tilde{\beta}_{t-1})$ for the WLS version, where $w_i$ is the $i$-th element of the inverse of the diagonal error variance matrix; and $m_{R\hat{\beta}} = \frac{1}{N}\sum_{i=1}^{N} (R_{it} - \bar{R}_t)(\hat{\beta}_{it-1} - \bar{\beta}_{t-1})$ for the OLS version.

As mentioned earlier, the final EIV-corrected estimates of $\gamma_0$, $\gamma_1$, and $\gamma_2$ are the time-series averages ($\bar{\hat{\gamma}}_0, \bar{\hat{\gamma}}_1, \bar{\hat{\gamma}}_2$) of the estimates $\hat{\gamma}_{0t}$, $\hat{\gamma}_{1t}$, and $\hat{\gamma}_{2t}$, respectively. Note that although the disturbance term of the market model is intertemporally homoscedastic, the ratio of the idiosyncratic error variance to the measurement error variance, $\delta_{it}$, is not the same across assets if the time-series sample size ($T$) used in estimating $\beta$ is not equal across assets. In this case, we cannot obtain the closed-form solutions of Equations (70.12)-(70.14); the MLEs of the $\gamma$'s must instead be obtained by solving Equations (70.9), (70.10), and (70.11) iteratively.

By comparing the above EIV-corrected MLEs with the traditional least squares estimators that are subject to the EIV problem, the EIV-correction factors for the least squares estimators can be derived. After some algebra, the following relationships between the EIV-corrected MLEs and the traditional least squares (LS) estimates of the CSR coefficients at time $t$ are obtained:

$$\hat{\gamma}_{1t} = C_{1t}\, \hat{\gamma}_{1t}^{LS}, \qquad (70.15a)$$

$$\hat{\gamma}_{2t} = C_{2t}\, \hat{\gamma}_{2t}^{LS}, \qquad (70.16a)$$

$$\hat{\gamma}_{0t} = \hat{\gamma}_{0t}^{LS} - (C_{1t} - 1)\,\hat{\gamma}_{1t}^{LS}\,\tilde{\hat{\beta}}_{t-1} - (C_{2t} - 1)\,\hat{\gamma}_{2t}^{LS}\,\tilde{V}_{t-1}, \qquad (70.17a)$$

where

$$C_{1t} = 1 \bigg/ \left\{ 1 - \frac{1}{2} \left[ \frac{m_{RR}\left(1 - \hat{\rho}_{RV}^2\right)}{\delta_t\, m_{\hat{\beta}\hat{\beta}}\left(1 - \hat{\rho}_{\hat{\beta}V}^2\right)} + 1 \right] \left( 1 - (1 - Q)^{1/2} \right) \right\} \ge 1, \qquad (70.18a)$$

$$Q = \left(1 - \hat{\rho}_{R\hat{\beta}\cdot V}^2\right) \bigg/ \left[ \frac{1}{2} + \frac{1}{4} \left( \frac{\delta_t\, m_{\hat{\beta}\hat{\beta}}\left(1 - \hat{\rho}_{\hat{\beta}V}^2\right)}{m_{RR}\left(1 - \hat{\rho}_{RV}^2\right)} + \frac{m_{RR}\left(1 - \hat{\rho}_{RV}^2\right)}{\delta_t\, m_{\hat{\beta}\hat{\beta}}\left(1 - \hat{\rho}_{\hat{\beta}V}^2\right)} \right) \right], \qquad (70.19a)$$

and

$$\hat{\rho}_{R\hat{\beta}\cdot V}^2 = \frac{\left(\hat{\rho}_{R\hat{\beta}} - \hat{\rho}_{\hat{\beta}V}\,\hat{\rho}_{RV}\right)^2}{\left(1 - \hat{\rho}_{RV}^2\right)\left(1 - \hat{\rho}_{\hat{\beta}V}^2\right)}$$

is the estimated cross-sectional squared partial correlation coefficient between the variables $R_t$ and $\hat{\beta}_{t-1}$, given $V_{t-1}$. Here $C_{1t}$ and $C_{2t}$ are the direct EIV-correction factors for the least squares estimators of $\gamma_{1t}$ and $\gamma_{2t}$, respectively. When $\hat{\Sigma}_\varepsilon$ is a diagonal (scalar) matrix, Equations (70.15a), (70.16a), and (70.17a) represent a relationship between the WLS (OLS) version of the EIV-corrected MLEs and the traditional WLS (OLS) estimates. The above relationships also hold for the GLS version of the estimates when $\hat{\Sigma}_\varepsilon$ is a full matrix.

In the widely used special case when the beta variable alone is in the model, the EIV-corrected MLEs of the CSR coefficients ($\gamma_{0t}, \gamma_{1t}$) in Equations (70.1) and (70.2) and the
corresponding EIV-correction factor can also be obtained from the above equations by setting $\hat{\rho}_{RV} = \hat{\rho}_{\hat{\beta}V} = 0$:

$$\hat{\gamma}_{1t} = C_{1t}\,\hat{\gamma}_{1t}^{LS}, \qquad \hat{\gamma}_{0t} = \hat{\gamma}_{0t}^{LS} - \left(C_{1t} - 1\right)\hat{\gamma}_{1t}^{LS}\,\tilde{\hat{\beta}}_{t-1}, \qquad (70.15b)$$

where

$$C_{1t} = 1 \bigg/ \left\{ 1 - \frac{1}{2}\left[ \frac{m_{RR}}{\delta_t\, m_{\hat{\beta}\hat{\beta}}} + 1 \right]\left(1 - (1 - Q)^{1/2}\right) \right\} \ge 1, \qquad (70.16b)$$

$$Q = \left(1 - \hat{\rho}_{R\hat{\beta}}^2\right) \bigg/ \left[ \frac{1}{2} + \frac{1}{4}\left( \frac{\delta_t\, m_{\hat{\beta}\hat{\beta}}}{m_{RR}} + \frac{m_{RR}}{\delta_t\, m_{\hat{\beta}\hat{\beta}}} \right) \right]. \qquad (70.17b)$$
The EIV-correction factor $C_{1t}$ for the traditional least squares estimator of $\gamma_{1t}$ is always greater than or equal to one, regardless of the presence of the firm size variable in the cross-sectional regressions.^9 Hence, the EIV-corrected MLE of $\gamma_{1t}$, $\hat{\gamma}_{1t}$, has a greater positive value than the traditional least squares estimator, $\hat{\gamma}_{1t}^{LS}$, if the least squares estimator is positive, and vice versa. Since many empirical findings show that the traditional least squares estimate of the price of beta risk is positive, the EIV-corrected estimate of the price of beta risk is greater than the traditional positive least squares estimate. Put differently, the MLE of the price of beta risk corrects the downward bias of the traditional least squares estimate. The EIV-correction factor $C_{2t}$ for the least squares estimator of the coefficient associated with the firm size variable would be between zero and one in most cases. Therefore, the EIV-corrected coefficient estimate of the firm size variable, $\hat{\gamma}_{2t}$, is less negative than its least squares estimate, $\hat{\gamma}_{2t}^{LS}$, when the least squares estimate is itself negative, as in most cases. As a result, the ultimate EIV-corrected estimate of the coefficient associated with firm size ($\bar{\hat{\gamma}}_2$) would be less negative than its traditional least squares estimate ($\bar{\hat{\gamma}}_2^{LS}$).
^9 Since the denominator of $Q$ is greater than 1, $0 \le Q \le 1$. So

$$(1-Q)^{1/2} = 1 - \tfrac{1}{2}Q - \tfrac{1}{8}Q^2 - \tfrac{1}{16}Q^3 - \tfrac{5}{128}Q^4 - \cdots < 1 - \tfrac{1}{2}Q.$$

Now, the EIV-correction factor $C_{1t}$ satisfies

$$C_{1t} \ge 1 \bigg/ \left\{ 1 - \frac{Q}{4}\left[ \frac{m_{RR}\left(1-\hat{\rho}_{RV}^2\right)}{\delta_t\, m_{\hat{\beta}\hat{\beta}}\left(1-\hat{\rho}_{\hat{\beta}V}^2\right)} + 1 \right] \right\}
= 1 \bigg/ \left[ 1 - \frac{1-\hat{\rho}_{R\hat{\beta}\cdot V}^2}{1 + \delta_t\, m_{\hat{\beta}\hat{\beta}}\left(1-\hat{\rho}_{\hat{\beta}V}^2\right) \big/ m_{RR}\left(1-\hat{\rho}_{RV}^2\right)} \right] \ge 1.$$

It is clear from Equation (70.15b) that when the beta variable alone is in the model, the correction of the downward bias of the traditional least squares estimator of the price of beta risk induces a correction of the upward bias of the traditional least squares intercept estimator. More specifically, the EIV-corrected intercept estimator, $\hat{\gamma}_{0t}$, is smaller than the traditional least squares intercept estimator, $\hat{\gamma}_{0t}^{LS}$. Moreover, in the more general case when the firm size variable is included in the cross-sectional regression, Equation (70.17a) shows that the EIV-corrected intercept estimator is also smaller than the traditional least squares intercept estimator. Therefore, the ultimate intercept estimate by the EIV-correction method ($\bar{\hat{\gamma}}_0$) is smaller than that by the traditional least squares method ($\bar{\hat{\gamma}}_0^{LS}$). In summary, the EIV correction implies a greater (more positive) estimate of the price of beta risk, a smaller (less negative) estimate of the coefficient associated with firm size, and a smaller intercept estimate.

70.3.2 Sensitivity of the EIV Correction to the Factors

The extent of the measurement error of estimated betas depends on the length of the time-series data, $T$, used in estimating betas. If $T$ is small, the measurement error of the estimated betas is large, and the EIV bias is large as well.^10 If $T$ is extremely large, the measurement error of the estimated betas is negligible, and the EIV bias can be ignored. Thus, when $T$ is small, a large EIV correction is needed in order to obtain the EIV-corrected risk premium estimate, while only a small correction is needed when $T$ is large. The EIV correction for the cross-sectional regression (CSR) risk premia estimates is therefore conditional on the first-pass beta estimation. In other words, the magnitude of the EIV correction depends upon several factors related to the beta estimation. Here, a sensitivity analysis of the EIV correction with respect to these factors is presented, using the EIV-correction factor of Equations (70.16b) and (70.17b). Since $0 < Q < 1$, Kim's EIV correction factor for the LS estimate of the price of beta risk, $C_{1t}$, is approximated as

$$C_{1t} \approx 1 \bigg/ \left[ 1 - \frac{1-\hat{\rho}_{R\hat{\beta}}^2}{1 + T\,\hat{\sigma}_m^2\, m_{\hat{\beta}\hat{\beta}} \big/ m_{RR}} \right]. \qquad (70.17c)$$

$C_{1t}$ equals one when $T \to \infty$ and thus $\delta_t \to \infty$; that is, when consistent (large-$T$) market beta estimates are used, no EIV correction is needed.

^10 In most asset pricing tests, betas are estimated by using return observations over an arbitrarily chosen fixed length of period, say 5 years or the whole sample period, implicitly assuming that betas are stationary over that period. However, this assumption does not have a scientific justification. This estimation method ignores the possibility of shifts in beta and results in imprecise beta estimates.
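To make the magnitude of the correction concrete, the following sketch evaluates the beta-only correction factor of Equations (70.16b) and (70.17b) and its approximation (70.17c) for several estimation-window lengths. It is an illustration only: the function names and the numerical inputs (cross-sectional moments and market variance) are assumptions, not values from the chapter.

```python
import numpy as np

def correction_factor(m_RR, m_bb, rho2, delta_t):
    """Exact beta-only EIV-correction factor C_1t, Eqs. (70.16b)-(70.17b)."""
    x = delta_t * m_bb / m_RR
    Q = (1.0 - rho2) / (0.5 + 0.25 * (x + 1.0 / x))     # Eq. (70.17b), 0 <= Q <= 1
    return 1.0 / (1.0 - 0.5 * (m_RR / (delta_t * m_bb) + 1.0)
                  * (1.0 - np.sqrt(1.0 - Q)))           # Eq. (70.16b), always >= 1

def correction_factor_approx(m_RR, m_bb, rho2, T, sig2_m):
    """Approximation (70.17c)."""
    return 1.0 / (1.0 - (1.0 - rho2) / (1.0 + T * sig2_m * m_bb / m_RR))

# Hypothetical monthly inputs (assumed, for illustration only)
m_RR, m_bb, rho2, sig2_m = 0.0030, 0.25, 0.10, 0.0021
for T in (60, 120, 300):                 # beta-estimation window in months
    delta_t = T * sig2_m                 # delta_t = T * sigma_m^2
    print(T, round(correction_factor(m_RR, m_bb, rho2, delta_t), 4),
          round(correction_factor_approx(m_RR, m_bb, rho2, T, sig2_m), 4))
```

With these assumed inputs the correction is roughly 9% for T = 60 months and shrinks below 2% for T = 300, in line with the sensitivity analysis that follows.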
As shown in Equation (70.17c), the EIV correction factor depends on several factors: the length of the beta estimation period, or the presumed beta stationarity period ($T$); the estimated market variance using $T$ observations ($\hat{\sigma}_m^2$); the cross-sectional variance of the estimated betas ($m_{\hat{\beta}\hat{\beta}}$); the cross-sectional variance of returns ($m_{RR}$); and the cross-sectional correlation between the estimated betas and returns ($\hat{\rho}_{R\hat{\beta}}^2$). To obtain the sensitivity of Kim's EIV correction factor $C_{1t}$ to each of these factors, $C_{1t}$ is differentiated with respect to each of them:

$$\frac{\partial C_{1t}}{\partial m_{RR}} = \frac{\delta_t\, m_{\hat{\beta}\hat{\beta}} \left(1-\hat{\rho}_{R\hat{\beta}}^2\right)}{\left(m_{RR}\,\hat{\rho}_{R\hat{\beta}}^2 + \delta_t\, m_{\hat{\beta}\hat{\beta}}\right)^2 \big/ m_{RR}} \ge 0,$$

$$\frac{\partial C_{1t}}{\partial T} = -\frac{m_{RR}}{T}\,\frac{\partial C_{1t}}{\partial m_{RR}} \le 0, \qquad
\frac{\partial C_{1t}}{\partial \hat{\sigma}_m^2} = -\frac{m_{RR}}{\hat{\sigma}_m^2}\,\frac{\partial C_{1t}}{\partial m_{RR}} \le 0,$$

$$\frac{\partial C_{1t}}{\partial m_{\hat{\beta}\hat{\beta}}} = -\frac{m_{RR}}{m_{\hat{\beta}\hat{\beta}}}\,\frac{\partial C_{1t}}{\partial m_{RR}} \le 0, \qquad
\frac{\partial C_{1t}}{\partial \hat{\rho}_{R\hat{\beta}}^2} = \frac{m_{RR}}{1-\hat{\rho}_{R\hat{\beta}}^2}\left(1 + \frac{m_{RR}}{\delta_t\, m_{\hat{\beta}\hat{\beta}}}\right)\frac{\partial C_{1t}}{\partial m_{RR}} \ge 0.$$

The increases in $T$ and $\hat{\sigma}_m^2$ have a negative effect on the EIV correction factor; that is, the shorter the beta estimation period, the greater the EIV correction needed. The increases in $m_{RR}$ and $\hat{\rho}_{R\hat{\beta}}^2$ have a positive effect. The increase in $m_{\hat{\beta}\hat{\beta}}$ has a negative effect; that is, the larger the spread in betas, the smaller the EIV correction needed. This implies that a greater spread in betas needs to be maintained in order to have a smaller EIV bias in the risk premium estimates. In particular, when portfolios are formed in order to test the validity of the CAPM, a wide spread in portfolio betas should be maintained.

70.3.3 Heteroscedastic Market Model Disturbances

Miller and Scholes (1972), Brown (1977), Belkaoui (1977), Bey and Pinches (1980), and Barone-Adesi and Talwar (1983) report some evidence of heteroscedasticity in residual asset returns. Heteroscedasticity of residual asset returns reduces the efficiency of empirical tests in which homoscedasticity is assumed. It is worthwhile, therefore, to perform the EIV-corrected estimation under heteroscedasticity and compare it with the estimation under homoscedasticity of the disturbance term of the market model.

Under heteroscedasticity of the disturbance term of the market model, the ratio of the idiosyncratic error variance to the measurement error variance of Equation (70.8), $\delta_{it}$, is different across assets. The idiosyncratic error variance terms contained in the ratio $\delta_{it}$ can be estimated through a functional form of conditional heteroscedasticity on the market return, which can capture time-varying behavior of idiosyncratic error variances. A plausible form of the conditional heteroscedasticity in the disturbance term is that the square of the disturbance term covaries with the contemporaneous market return and the square of the contemporaneous market return; that is,

$$\hat{\varepsilon}_{is}^2 = \alpha_{0i} + \alpha_{1i}\, R_{ms} + \alpha_{2i}\, R_{ms}^2 + \text{error term}, \qquad s = t-T, \ldots, t-1, \qquad (70.19)$$

where $\hat{\varepsilon}_{is}^2$ is the squared residual return from the market model estimated using $T$ observations prior to each CSR. In fact, this model is equivalent to White's (1980) test for heteroscedasticity in the disturbance term of the market model. By substituting the idiosyncratic error variance terms contained in the ratio $\delta_{it}$ of Equation (70.8) with the estimated conditional residual variances from Equation (70.19), the value of the ratio is obtained for each firm. As long as the estimated coefficients of Equation (70.19) are different across assets, the ratios are also all different across assets. When the ratio $\delta_{it}$ is different across assets, the solution of Equations (70.9), (70.10), and (70.11) no longer has a closed form and requires the use of an iterative method. Unlike the earlier closed-form MLE, it is now hard to analyze the consistency of the iterative MLE analytically. It is worthwhile, therefore, to perform a Monte Carlo simulation to examine the consistency of the iterative MLE and to compare it with that of the closed-form MLE.
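As a concrete sketch of this step, Equation (70.19) can be estimated asset by asset with an ordinary least squares regression of the squared market-model residuals on the market return and its square, in the spirit of White's (1980) auxiliary regression. The function below is illustrative; its name and interface are assumptions, not the chapter's code.

```python
import numpy as np

def fit_conditional_heteroscedasticity(resid, r_m):
    """OLS of squared market-model residuals on R_m and R_m^2 (Eq. 70.19).

    resid : (T,) market-model residuals for one asset
    r_m   : (T,) contemporaneous market returns
    Returns the coefficients (alpha_0, alpha_1, alpha_2) and the fitted
    conditional residual variances used to build the ratio delta_it.
    """
    X = np.column_stack([np.ones_like(r_m), r_m, r_m ** 2])
    coef, *_ = np.linalg.lstsq(X, resid ** 2, rcond=None)
    return coef, X @ coef
```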
70.3.4 Simulation Evidence

In order to examine the estimation efficiency of the closed-form MLE under homoscedasticity of the disturbance term of the market model, the iterative MLE under conditional heteroscedasticity, and the traditional least squares estimate under both homoscedasticity and conditional heteroscedasticity, a Monte Carlo simulation is used. The simulation draws from a finite universe of stock returns whose behavior is governed by the capital asset pricing model (CAPM). To simulate stock returns, the necessary parameters are collected from actual stock return data. That is, betas and residual variances are taken from an estimation of the market model for 2199 firms listed on the Center for Research in Security Prices (CRSP) monthly return file over the sample period from January 1976 through December 1980. The coefficients ($\alpha_{0i}, \alpha_{1i}, \alpha_{2i}$) of the conditional heteroscedasticity Equation (70.19) are taken from a regression of the squared market model residuals on the contemporaneous market returns and the square of the contemporaneous market returns. The average market value of each firm over the period from January 1976 to December 1980 is also obtained, to form portfolios later in the simulation.

Based on these collected parameter values, 180 monthly returns are generated for each firm using the following market model:

$$R_{it} = \alpha_i + \beta_i R_{mt} + \varepsilon_{it}, \qquad t = 1, \ldots, 180,$$

where $R_{mt}$ is the CRSP equal-weighted market return from January 1971 through December 1985. The mean return ($E_i$) of each firm's simulated monthly returns is determined by the following CAPM equation: $E_i = 0.004 + 0.010768\,\beta_i$. Therefore, $\alpha_i$ is determined by $\alpha_i = E_i - \beta_i \bar{R}_m$. Under homoscedasticity, the error terms $\varepsilon_{it}$ are randomly generated from a normal distribution that has the same variance as the previously collected residual variance. Under conditional heteroscedasticity, the error terms are also randomly generated, with a time-varying variance determined by the coefficients of the conditional heteroscedasticity Equation (70.19); these coefficients were taken earlier from the actual stock return data. In this manner, a series of 180 monthly returns is generated for each of the 2199 firms.

Size-β portfolios are then formed; that is, portfolios are first formed based on firm size (the average market value), and each size-sorted portfolio is then subdivided on the basis of each firm's pre-ranking beta in December of each year. The pre-ranking beta is estimated using 60 monthly observations over the previous 5 years, rolling over every month. The composition of each portfolio is held constant for the following year. The post-ranking returns are calculated with equal weights, and the post-ranking betas are estimated by rolling over the previous 5-year post-ranking returns. Therefore, the testing period is from t = 121 to t = 180 (or equivalently, from January 1981 to December 1985). Notice that the given price of beta risk of $\gamma_1 = 0.010768$ is equal to the average market return from January 1981 to December 1985 minus the given riskless rate of $\gamma_0 = 0.004$. The numbers of portfolios formed are N = 9, 16, 25, 49, 64, 100, 196, 289, 400, and 2199 (all individual firms). The true value of $\gamma_1$ is estimated by the time-series average ($\bar{\hat{\gamma}}_1$) of the 60 estimated values of $\gamma_{1t}$ over the testing period for each replication. The total number of replications is 200. Table 70.1 presents the biases and the square root of the mean squared errors (MSEs) of the (WLS version)
EIV-corrected MLE and the traditional WLS estimate of $\gamma_1$. When the true disturbance term of the market model is homoscedastic, the bias of the MLE decreases and approaches zero as the number of portfolios, N, increases, while that of the WLS estimate is magnified. The MSEs of the two estimates behave in a similar fashion. When the true disturbance term of the market model is governed by the conditional heteroscedasticity of Equation (70.19), two different MLEs are computed: the first MLE is derived by assuming conditional heteroscedasticity, while the second MLE is derived by assuming homoscedasticity. The bias of the first MLE (estimated under the correct assumption of conditional heteroscedasticity) also decreases and approaches zero as N increases. However, the first MLE converges more slowly to the true value than does the MLE derived when the true disturbance term is homoscedastic. The bias of the second MLE (estimated under the incorrect assumption of homoscedasticity) decreases with N at first, but increases again at large values of N. As expected, the bias and MSE of the second MLE are greater than those of the first MLE. The bias and MSE of the WLS estimate in the case of conditional heteroscedasticity increase with N and are worse than in the case of homoscedasticity. In sum, when the true disturbance term of the market model is homoscedastic, the (closed-form) MLE of $\gamma_1$ converges to the true value as the number of portfolios, N, increases. When the true disturbance term of the market model is conditionally heteroscedastic, the (iterative) MLE of $\gamma_1$ also converges to the true value as long as it is estimated under the assumption of conditional heteroscedasticity. However, the convergence of the (closed-form) MLE estimated under the assumption of homoscedasticity is not guaranteed when the true disturbance term of the market model is conditionally heteroscedastic.
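A condensed sketch of the data-generating step of this simulation under homoscedasticity is given below. The values of $\gamma_0$, $\gamma_1$, the number of firms, and the number of months follow the text; the beta and residual-variance distributions stand in for the values collected from CRSP and are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_months = 2199, 180
gamma0, gamma1 = 0.004, 0.010768          # riskless rate and price of beta risk

# Stand-ins for the parameters collected from actual CRSP data (assumed):
betas = rng.uniform(0.3, 1.8, n_firms)
resid_sd = np.sqrt(rng.uniform(0.002, 0.02, n_firms))
r_m = rng.normal(0.01, 0.062, n_months)   # placeholder for the EW market return

# CAPM means E_i = gamma0 + gamma1*beta_i, hence alpha_i = E_i - beta_i*mean(R_m)
alpha = gamma0 + gamma1 * betas - betas * r_m.mean()

# Market model R_it = alpha_i + beta_i R_mt + eps_it with homoscedastic errors
eps = rng.normal(0.0, resid_sd[:, None], (n_firms, n_months))
R = alpha[:, None] + betas[:, None] * r_m[None, :] + eps
```

The portfolio-formation and two-pass estimation steps described in the text would then be applied to the simulated panel R.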
70.4 Results

70.4.1 Data

All NYSE and AMEX stocks listed on the CRSP monthly return file for at least 2 years during the period 1926-1991 are used for the CSR estimation with portfolio groupings. Among them, all stocks listed for at least 5 years are used for the CSR estimation without portfolio groupings (i.e., with individual stocks). All stocks listed on the monthly NASDAQ return file for at least 5 years during the period 1973-1991 are added to the NYSE and AMEX stocks for the CSR estimation for individual stocks. The restriction of at least 5-year survival for the CSR estimation for individual stocks is imposed in order to maintain the same value of the ratio $\delta_t$ for all stocks in the EIV-corrected maximum likelihood
Table 70.1 Simulation evidence: performance of the EIV-corrected and traditional estimates of the risk premium

                      Simulated under conditional      Simulated under conditional heteroscedasticity
                      homoscedasticity
Portfolio size (N)    EIV-corrected   Uncorrected      EIV-corrected          EIV-corrected          Uncorrected
                                                       (heteroscedasticity)   (homoscedasticity)

Panel A: Bias (x100)
9                     0.0229          0.0245           0.0280                 0.0391                 0.0416
16                    0.0220          0.0249           0.0273                 0.0377                 0.0421
25                    0.0213          0.0256           0.0237                 0.0320                 0.0389
49                    0.0210          0.0294           0.0219                 0.0307                 0.0440
64                    0.0178          0.0286           0.0202                 0.0284                 0.0458
100                   0.0174          0.0344           0.0175                 0.0209                 0.0481
196                   0.0129          0.0457           0.0139                 0.0074                 0.0597
289                   0.0066          0.0543           0.0106                 0.0029                 0.0780
400                   0.0047          0.0711           0.0074                 0.0036                 0.1007
2199                  0.0019          0.1668           0.0033                 0.0104                 0.2075

Panel B: Square root of the mean squared error (x100)
9                     0.0249          0.0264           0.0288                 0.0395                 0.0420
16                    0.0237          0.0264           0.0281                 0.0381                 0.0425
25                    0.0228          0.0270           0.0245                 0.0325                 0.0393
49                    0.0226          0.0305           0.0228                 0.0311                 0.0443
64                    0.0197          0.0298           0.0213                 0.0289                 0.0461
100                   0.0193          0.0353           0.0186                 0.0215                 0.0484
196                   0.0153          0.0464           0.0154                 0.0093                 0.0600
289                   0.0108          0.0549           0.0124                 0.0061                 0.0781
400                   0.0082          0.0721           0.0108                 0.0067                 0.1008
2199                  0.0048          0.1669           0.0069                 0.0109                 0.2076
The data are simulated by the following capital asset pricing model: $E_i = 0.004 + 0.010768\,\beta_i$, where $E_i$ and $\beta_i$ are the expected return on asset $i$ and the beta of the asset, respectively. The price of beta risk ($\gamma_1$) is estimated by the time-series average of the month-by-month CSR coefficient estimates $\hat{\gamma}_{1t}$ for each replication. The total number of replications is 200. When the true disturbance term of the market model is conditionally homoscedastic, the CSR coefficient $\gamma_{1t}$ is estimated by the EIV-corrected method (the WLS version of the MLE) and the uncorrected method (the traditional WLS) under the assumption of conditional homoscedasticity. When the true disturbance term of the market model is conditionally heteroscedastic, two different EIV-corrected MLEs are computed: the first (the iterative MLE) is computed under the assumption of conditional heteroscedasticity, while the second (the closed-form MLE) is computed under the assumption of conditional homoscedasticity. The conditional heteroscedasticity is modeled by $\varepsilon_{is}^2 = \alpha_{0i} + \alpha_{1i} R_{ms} + \alpha_{2i} R_{ms}^2$, where $\varepsilon_{is}$ and $R_{ms}$ are the disturbance term of the market model for an asset $i$ at time $s$ and the market return at time $s$, respectively.
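Given the 200 replication estimates, each cell of Table 70.1 reduces to a bias and a root-mean-squared-error computation across replications; a minimal sketch (the array name is an assumption):

```python
import numpy as np

def bias_and_rmse(gamma1_bar, gamma1_true=0.010768):
    """gamma1_bar: (n_replications,) per-replication time-series averages."""
    err = np.asarray(gamma1_bar) - gamma1_true
    return 100.0 * err.mean(), 100.0 * np.sqrt((err ** 2).mean())  # x100 as in the table
```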
estimation. Each firm's monthly returns and market value data (stock price per share times the number of shares outstanding) are obtained from the CRSP monthly return tape. The 1-month Treasury bill returns are taken from Ibbotson (1992) as the riskless returns. For the CSR estimation for portfolios, size-β portfolios are formed using the same grouping method as Fama and French (1992). That is, portfolios are first formed based on each firm's market value in June of year y. Each size-sorted portfolio is then subdivided on the basis of each firm's pre-ranking beta in June of year y. Each firm's pre-ranking beta for month t is estimated by using at least 24 monthly return observations over the previous 5 years up to month t - 1. All individual stocks are assigned to one of the size-β portfolios in order of increasing firm size and then pre-ranking beta magnitude. The composition of each portfolio is held constant for the following year, from July of year y to June of year y + 1, and the equal-weighted returns on the portfolios are calculated for this period. The portfolio's market value is also calculated with equal weights. In the end, there are post-ranking returns for July 1931 to December 1991 on the portfolios formed on size and pre-ranking betas. Each portfolio's post-ranking beta for month t is then estimated by using the previous 5-year post-ranking returns up to month t - 1. Since the first post-ranking beta is estimated using post-ranking returns from July 1931 through June 1936, the whole testing period (i.e., the estimation period of the gammas) is from July 1936 to December 1991. This post-ranking beta estimation method differs from Fama and French's method. They estimate post-ranking betas by using the full-period sample of post-ranking returns on each of 100 size-β portfolios formed with nonfinancial firms. Therefore, 5 more
Table 70.2 Cross-sectional dependence between residuals from the market models

                    Residuals from the market model          Residuals from the market model
                    with equal-weighted returns              with value-weighted returns
Portfolio           Average absolute       Rejection         Average absolute       Rejection
size (N)            correlation (SE)       rate at 5%        correlation (SE)       rate at 5%

9                   0.399 (0.005)          0.656             0.485 (0.006)          0.725
16                  0.341 (0.004)          0.578             0.450 (0.006)          0.684
25                  0.308 (0.004)          0.545             0.421 (0.006)          0.663
49                  0.244 (0.003)          0.415             0.359 (0.005)          0.595
64                  0.222 (0.003)          0.374             0.334 (0.005)          0.567
100                 0.189 (0.003)          0.298             0.288 (0.005)          0.496
196                 0.152 (0.002)          0.202             0.221 (0.004)          0.363
289                 0.136 (0.002)          0.146             0.188 (0.004)          0.268
400                 0.108 (0.002)          0.095             0.151 (0.003)          0.194
NYSE/AMEX           0.083 (0.001)          0.076             0.123 (0.001)          0.108
NASDAQ only         0.075 (0.001)          0.065             0.110 (0.001)          0.084
The average absolute correlation coefficient is the average of all pairwise absolute correlation coefficients estimated using overlapping 5-year monthly residuals from the market model from July 1936 through December 1991. The rejection rate is the percentage of rejections of the null hypothesis that there is no correlation between residuals, when the correlation t-test is performed for all computed correlation coefficients. For NYSE/AMEX, 1200 randomly chosen individual stocks are used to estimate all possible pairwise correlation coefficients between residuals. For OTC, another 1200 stocks are randomly chosen from among all OTC stocks.
years' post-ranking returns prior to the first testing month are needed in this case than in Fama and French's case. The numbers of portfolios to be formed are N = 9 (3x3), 16 (4x4), 25 (5x5), 49 (7x7), 64 (8x8), 100 (10x10), 196 (14x14), 289 (17x17), and 400 (20x20). For the CSR estimation for individual stocks (all NYSE/AMEX and all NYSE/AMEX/OTC), each firm's beta for month t is also estimated by using the previous 5-year return observations up to month t - 1. In order to be consistent with the case of the CSR estimation for portfolios, returns from July 1931 through June 1936 are used to estimate the first beta of each firm. In both cases of the CSR estimation for portfolios and for individual stocks, the whole testing period is from July 1936 through December 1991.
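The double sort described above can be sketched with a small helper; the following is a simplified illustration for a single formation date (column names and the function interface are assumptions, and the rolling pre-ranking beta estimation is omitted):

```python
import pandas as pd

def size_beta_portfolios(firms: pd.DataFrame, n_size: int = 10, n_beta: int = 10) -> pd.Series:
    """Assign each firm to one of n_size * n_beta size-beta portfolios.

    firms: DataFrame with columns 'mktcap' (June market value) and
    'prebeta' (pre-ranking beta). Simplified to one formation date.
    """
    size_decile = pd.qcut(firms["mktcap"], n_size, labels=False)       # sort on size first
    beta_decile = (
        firms.groupby(size_decile)["prebeta"]
             .transform(lambda b: pd.qcut(b, n_beta, labels=False))    # then on beta within size
    )
    return size_decile * n_beta + beta_decile                          # portfolio label 0..99
```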
70.4.2 Cross-Sectional Dependence Between Residuals

In estimating the risk prices, concerns about changing risk premia parameters and beta nonstationarity frequently limit the available time-series length T. In these cases, the GLS version of the EIV-corrected estimator may be infeasible because the inverse of the estimated error covariance matrix does not exist for large N, unless the error covariance matrix is diagonal. It is necessary, therefore, to examine whether the error covariance matrix is diagonal; that is, whether the cross-sectional dependence between residual asset returns is sufficiently weak that – especially when N is large – the WLS version of the estimator can be used.

In order to test the cross-sectional dependence between residuals, the market model is estimated for each portfolio in overlapping 5-year periods by rolling over every month, and then all possible pairwise correlation t-tests are performed with the 5-year residuals. The testing results are presented in Table 70.2. As the number of portfolios increases, the rejection rate of the null hypothesis that there is no correlation between the residuals decreases monotonically, and the magnitude of the average absolute correlation coefficient between the residuals also decreases.^11 In particular, the rejection rates of the null hypothesis for the case of all individual stocks are quite low. They are only about 0.076 for the NYSE/AMEX stocks and 0.065 for the OTC stocks at the 5% significance level, when the CRSP equal-weighted market returns are used.^12 When the value-weighted market returns are used, the results are similar. The low rejection rates of the null hypothesis, which could easily be due to Type I error, are quite supportive of the assumption of weak cross-sectional dependence between individual stock residuals. The small average magnitude of the absolute correlation coefficients is also evidence supporting the assumption of weak cross-sectional dependence when the number of assets is large. When the equal-weighted market returns are used, the average absolute correlation coefficients for the NYSE/AMEX and for the OTC individual stocks are only 0.083 and 0.065, respectively. Moreover, note that when the time-series length is 60, the marginal magnitude of the correlation coefficient required to reject the null hypothesis at the 5% significance level is approximately 0.254.^13 Based on these results, the WLS version of the estimation is quite acceptable, especially when all individual assets are used. Therefore, the risk premia will be estimated using the WLS version of the maximum likelihood estimation and the traditional WLS estimation.

^11 These empirical results are consistent with the following theoretical arguments. When the number of assets in the portfolios is $n$, the correlation coefficient between two equal-weighted portfolios' returns, $\rho_p$, is given by

$$\rho_p = \mathrm{Corr}\!\left(\frac{1}{n}\sum_{i=1}^{n} R_{it},\; \frac{1}{n}\sum_{j=1}^{n} R_{jt}\right) = \frac{\bar{\rho}}{\bar{\rho} + (1-\bar{\rho})/n},$$

where $\bar{\rho}$ is the average correlation coefficient between two individual assets' returns. As $n$ increases, $\rho_p$ goes to one. Conversely, as $n$ decreases – that is, as the number of portfolios (N) increases – the correlation coefficient between portfolio returns decreases. These arguments also hold for portfolios' residual returns.
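Both the portfolio-correlation formula of footnote 11 and the 0.254 threshold of footnote 13 are easy to verify numerically; a short sketch (function name assumed):

```python
import numpy as np

def portfolio_corr(rho_bar, n):
    """Correlation between two equal-weighted n-asset portfolios (footnote 11)."""
    return rho_bar / (rho_bar + (1.0 - rho_bar) / n)

print(portfolio_corr(0.083, 1))    # individual stocks: 0.083
print(portfolio_corr(0.083, 100))  # 100-stock portfolios: about 0.90

# Marginal correlation just significant at the 5% level with T = 60 (footnote 13)
T, t_crit = 60, 2.0017
print(t_crit / np.sqrt(T - 2 + t_crit ** 2))  # about 0.254
```

The first two lines illustrate why residual correlations that are negligible for individual stocks become substantial once stocks are aggregated into a few large portfolios.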
70.4.3 Beta Risk Price Estimates After the EIV Correction

Table 70.3 presents the time-series averages (t-statistics in parentheses) of the CSR coefficients estimated by the (WLS version) EIV-corrected MLE method and by the traditional WLS method under the assumption of homoscedasticity of the disturbance term of the market model, when the CRSP equal-weighted market return is used as a proxy for the return on the market portfolio. Panel A contains the results for all firms for the whole period from July 1936 through December 1991 (666 months), and Panel B contains the results for nonfinancial firms for the subperiod July 1963 through December 1990 (330 months); that is, the same period used by Fama and French (1992). Table 70.3 shows that, after the EIV correction, as the number of portfolios (N) increases, the relation between average stock returns and market beta becomes stronger and the intercept estimate becomes weaker, both for the whole period and for the subperiod. For the whole period, when the number of portfolios exceeds 49 and the beta variable alone is in the model, all t-statistics of the EIV-corrected estimates of the price of beta risk ($\gamma_1$) exceed two. In particular, when all NYSE and AMEX individual stocks are used, the estimate of the price of beta risk is 0.766% per month, with a t-statistic of 2.59. When all the post-January 1973 OTC individual stocks are added to the NYSE and AMEX stocks, the estimate of the price of beta risk increases to 0.809% per month, with a t-statistic of 2.63. For N ≥ 400, the intercept ($\gamma_0$) is estimated insignificantly when the EIV bias is corrected, a desirable feature for the CAPM. When the firm size variable is included in the model and all NYSE, AMEX, and OTC individual stocks are used, the EIV-corrected estimate of the price of beta risk continues to be statistically significant: it equals 0.704% per month and has a t-statistic of 2.14. On the other hand, when the traditional WLS estimation is performed, the estimate of the price of beta risk is statistically insignificant and its magnitude decreases as N increases. At the same time, as N increases, the intercept estimates grow larger and more significant, which results from the underestimation of $\gamma_1$. Put differently, as N increases, the estimation results from the traditional least squares method become more adverse to the CAPM due to the more severe EIV problem.

Table 70.3 also presents the time-series averages of the estimated coefficients associated with the firm size variable under the assumption of homoscedasticity of the disturbance term of the market model. The natural logarithm of the portfolio's average market value is used for the firm size variable.^14 As the number of portfolios increases, the size of the traditional negative WLS coefficient estimate of the firm size variable increases slightly, and its statistical significance becomes stronger. After correcting for the EIV problem, however, as the number of portfolios increases, the coefficient estimate associated with the firm size variable decreases in absolute magnitude and its statistical significance becomes weaker, both for the whole period and for the subperiod. The t-statistic of the EIV-corrected coefficient estimate associated with the firm size variable is much smaller than that of the traditional WLS estimate.

Table 70.4 presents results analogous to Table 70.3 for the case when the CRSP value-weighted market return is used. The results are quite similar, except that the impact of the EIV correction is more profound for the WLS estimate of the beta risk price and less profound for the WLS estimate of the coefficient associated with the size variable.^15 In particular, the
^12 Among all NYSE/AMEX stocks, 1200 stocks are randomly chosen to perform the correlation t-test. Another 1200 stocks are also randomly chosen from among all OTC stocks.

^13 The marginal correlation coefficient of 0.254 is derived as follows. Based on the fact that $\hat{\rho}\,\sqrt{T-2}\big/\sqrt{1-\hat{\rho}^2}$ approximately follows a $t$-distribution with $(T-2)$ degrees of freedom, $\hat{\rho}$ of 0.254 is obtained when $T$ is 60 and the critical value at the 5% significance level is $t_{0.025}(58) = 2.0017$.
^14 Other measures of the firm size variable, such as the square root of the average market value, the reciprocal of the average market value, and the average market value itself, were also tried. The results were similar.

^15 These empirical findings can be explained by the following arguments. The EIV correction factors are a function of the ratio $\delta_t = T\,\hat{\sigma}_{m,t-1}^2$. It can be easily proven that, as the ratio increases, the EIV correction factors $C_{1t}$ for $\hat{\gamma}_{1t}^{LS}$ of Equations (70.15a) and (70.15b) monotonically decrease as long as $\delta_t$ is greater than $m_{RR}/m_{\hat{\beta}\hat{\beta}}$, assuming the other terms contained in the EIV correction factors remain fixed; and it can be empirically confirmed that the ratio $\delta_t$ is always greater than $m_{RR}/m_{\hat{\beta}\hat{\beta}}$. Conversely, as the ratio increases, the EIV correction factor $C_{2t}$ for $\hat{\gamma}_{2t}^{LS}$ of Equation (70.16a) increases. Since the variance of the equal-weighted market return (0.003815 over the whole period) is greater than that of the value-weighted market return (0.002114 over the whole period), the ratio is greater when the equal-weighted market return is used than when the value-weighted market return is used. Put more specifically, when the value-weighted market return is used, the EIV correction for $\hat{\gamma}_{1t}^{LS}$ is more profound while the EIV correction for $\hat{\gamma}_{2t}^{LS}$ is less profound than when the equal-weighted market return is used.
Table 70.3 Risk price estimates (t-statistics) using the CRSP equal-weighted market return and assuming conditional homoscedasticity

Model: R_it = γ_0t + γ_1t β̂_it−1 + γ_2t log(V_it−1) + ω_it

                    EIV-corrected WLS (MLE)                          Uncorrected WLS
Portfolio size (N)  γ̄_0            γ̄_1            γ̄_2              γ̄_0            γ̄_1            γ̄_2

Panel A: July 1936 through December 1991

(beta alone)
9                   0.452 (3.29)   0.530 (1.92)                     0.503 (3.82)   0.476 (1.78)
16                  0.471 (3.48)   0.506 (1.86)                     0.523 (4.03)   0.450 (1.71)
25                  0.462 (3.44)   0.515 (1.89)                     0.522 (4.08)   0.449 (1.72)
49                  0.418 (3.06)   0.552 (2.01)                     0.492 (3.82)   0.471 (1.82)
64                  0.407 (3.00)   0.564 (2.07)                     0.487 (3.83)   0.475 (1.85)
100                 0.394 (2.82)   0.573 (2.08)                     0.498 (3.89)   0.456 (1.80)
196                 0.360 (2.47)   0.599 (2.14)                     0.502 (3.95)   0.438 (1.78)
289                 0.299 (1.99)   0.671 (2.33)                     0.512 (4.11)   0.425 (1.79)
400                 0.231 (1.43)   0.752 (2.52)                     0.525 (4.26)   0.413 (1.80)
NYSE/AMEX           0.230 (1.42)   0.766 (2.59)                     0.564 (5.06)   0.375 (1.76)
NYSE/AMEX/NASDAQ    0.207 (1.40)   0.809 (2.63)                     0.578 (5.22)   0.368 (1.76)

(beta and firm size)
9                   1.097 (4.98)   0.190 (0.65)   -0.086 (-2.77)    1.134 (5.25)   0.160 (0.57)   -0.089 (-2.87)
16                  1.017 (4.92)   0.212 (0.75)   -0.074 (-2.48)    1.062 (5.26)   0.176 (0.64)   -0.078 (-2.62)
25                  1.076 (5.27)   0.188 (0.67)   -0.084 (-2.88)    1.133 (5.71)   0.142 (0.52)   -0.089 (-3.05)
49                  1.001 (4.89)   0.240 (0.85)   -0.079 (-2.72)    1.089 (5.54)   0.169 (0.64)   -0.086 (-2.99)
64                  1.023 (5.06)   0.228 (0.81)   -0.083 (-2.85)    1.118 (5.77)   0.151 (0.57)   -0.091 (-3.14)
100                 0.988 (4.92)   0.252 (0.91)   -0.080 (-2.76)    1.115 (5.83)   0.147 (0.58)   -0.089 (-3.10)
196                 0.901 (4.37)   0.300 (1.06)   -0.070 (-2.47)    1.104 (5.90)   0.133 (0.55)   -0.087 (-3.07)
289                 0.833 (3.90)   0.377 (1.29)   -0.068 (-2.39)    1.167 (6.34)   0.100 (0.44)   -0.095 (-3.38)
400                 0.683 (2.86)   0.515 (1.64)   -0.059 (-2.28)    1.188 (6.38)   0.094 (0.44)   -0.100 (-3.52)
NYSE/AMEX           0.609 (2.58)   0.644 (1.93)   -0.057 (-2.16)    1.132 (6.70)   0.089 (0.90)   -0.105 (-3.95)
NYSE/AMEX/NASDAQ    0.566 (2.52)   0.704 (2.14)   -0.054 (-2.03)    1.090 (6.92)   0.091 (1.09)   -0.103 (-3.92)

Panel B: July 1963 through December 1990

(beta alone)
9                   0.425 (1.98)   0.353 (0.88)                     0.479 (2.30)   0.297 (0.76)
16                  0.413 (1.95)   0.363 (0.92)                     0.467 (2.29)   0.307 (0.80)
25                  0.422 (1.98)   0.352 (0.88)                     0.477 (2.33)   0.293 (0.77)
49                  0.376 (1.71)   0.393 (0.98)                     0.448 (2.14)   0.315 (0.83)
64                  0.335 (1.53)   0.436 (1.08)                     0.414 (2.00)   0.349 (0.93)
100                 0.359 (1.61)   0.406 (1.00)                     0.444 (2.15)   0.311 (0.84)
196                 0.242 (1.07)   0.522 (1.27)                     0.379 (1.87)   0.366 (1.01)
289                 0.190 (0.82)   0.574 (1.37)                     0.386 (1.93)   0.350 (0.99)
400                 0.141 (0.60)   0.636 (1.50)                     0.388 (1.96)   0.350 (1.03)
NYSE/AMEX           0.111 (0.58)   0.699 (1.73)                     0.410 (2.37)   0.313 (1.11)
NYSE/AMEX/NASDAQ    0.072 (0.30)   0.741 (1.95)                     0.408 (2.34)   0.316 (1.34)

(beta and firm size)
9                   1.354 (3.72)   -0.147 (-0.36)  -0.106 (-2.27)   1.428 (4.03)   -0.174 (-0.44)  -0.117 (-2.55)
16                  1.343 (3.56)   -0.149 (-0.32)  -0.102 (-2.10)   1.395 (4.09)   -0.169 (-0.44)  -0.116 (-2.54)
25                  1.396 (3.98)   -0.169 (-0.48)  -0.114 (-2.48)   1.460 (4.34)   -0.216 (-0.57)  -0.119 (-2.63)
49                  1.379 (3.95)   -0.161 (-0.40)  -0.113 (-2.48)   1.448 (4.36)   -0.207 (-0.55)  -0.120 (-2.66)
64                  1.309 (3.76)   -0.119 (-0.30)  -0.108 (-2.35)   1.406 (4.27)   -0.173 (-0.47)  -0.120 (-2.64)
100                 1.345 (3.96)   -0.143 (-0.36)  -0.111 (-2.45)   1.443 (4.52)   -0.203 (-0.56)  -0.122 (-2.72)
196                 1.241 (3.58)   -0.031 (-0.08)  -0.112 (-2.44)   1.397 (4.39)   -0.156 (-0.45)  -0.124 (-2.74)
289                 1.102 (3.14)    0.051 (0.12)   -0.101 (-2.22)   1.390 (4.47)   -0.162 (-0.49)  -0.123 (-2.77)
400                 1.121 (3.21)    0.080 (0.19)   -0.107 (-2.31)   1.451 (4.69)   -0.182 (-0.59)  -0.132 (-2.95)
NYSE/AMEX           0.688 (2.18)    0.484 (1.18)   -0.081 (-2.00)   1.197 (4.19)    0.061 (0.23)   -0.119 (-2.96)
NYSE/AMEX/NASDAQ    0.617 (2.07)    0.580 (1.50)   -0.078 (-1.81)   1.072 (4.10)    0.070 (0.43)   -0.107 (-2.71)
The risk price estimates ($\bar{\hat{\gamma}}_0$, $\bar{\hat{\gamma}}_1$, $\bar{\hat{\gamma}}_2$) are the time-series averages of the month-by-month CSR coefficient estimates. The monthly CSR coefficients are estimated through the EIV-corrected WLS version of the MLE and the traditional uncorrected WLS estimation, assuming conditional homoscedasticity of the disturbance term of the market model. For the whole period from July 1936 to December 1991, both financial and nonfinancial firms are used, while for the subperiod from July 1963 to December 1990, only nonfinancial firms are used. The portfolios are size-β portfolios: they are first formed based on each firm's market value in June of year y, and each size-sorted portfolio is then subdivided based on each firm's pre-ranking beta in June of year y, for July of year y through June of year y + 1. All firms traded for at least 2 years on the NYSE and AMEX are used to form the portfolios. NYSE/AMEX (NYSE/AMEX/NASDAQ) indicates individual stocks traded for at least 5 years on these exchanges. For the whole period (for the subperiod), on average, there are 1208 and 1579 stocks (1407 and 2083 nonfinancial stocks) in the monthly regressions for NYSE/AMEX and NYSE/AMEX/NASDAQ, respectively.
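All of the averages and t-statistics in Tables 70.3-70.5 follow the standard Fama and MacBeth (1973) aggregation of the month-by-month CSR coefficient estimates; a minimal sketch (the array name is an assumption):

```python
import numpy as np

def fama_macbeth_summary(gamma_t):
    """gamma_t: (T,) monthly CSR coefficient estimates, e.g. the gamma_1t series."""
    gamma_t = np.asarray(gamma_t)
    T = gamma_t.size
    mean = gamma_t.mean()                                  # time-series average
    tstat = mean / (gamma_t.std(ddof=1) / np.sqrt(T))      # t-statistic of the average
    return mean, tstat
```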
Table 70.4 Risk price estimates (t-statistics) using the CRSP value-weighted market return and assuming conditional homoscedasticity

Model: R_it = γ_0t + γ_1t β̂_it−1 + γ_2t log(V_it−1) + ω_it

                    EIV-corrected WLS (MLE)                          Uncorrected WLS
Portfolio size (N)  γ̄_0            γ̄_1            γ̄_2              γ̄_0            γ̄_1            γ̄_2

Panel A: July 1936 through December 1991

(beta alone)
9                   0.306 (1.82)   0.459 (1.80)                     0.407 (2.65)   0.362 (1.53)
16                  0.301 (1.80)   0.454 (1.80)                     0.414 (2.75)   0.346 (1.50)
25                  0.260 (1.53)   0.494 (1.94)                     0.394 (2.65)   0.366 (1.60)
49                  0.195 (1.14)   0.546 (2.13)                     0.365 (2.52)   0.386 (1.73)
64                  0.210 (1.24)   0.538 (2.11)                     0.385 (2.73)   0.373 (1.70)
100                 0.186 (1.06)   0.563 (2.17)                     0.403 (2.89)   0.360 (1.66)
196                 0.145 (0.82)   0.604 (2.29)                     0.443 (3.40)   0.328 (1.61)
289                 0.132 (0.68)   0.626 (2.25)                     0.526 (4.17)   0.263 (1.35)
400                 0.123 (0.58)   0.641 (2.36)                     0.540 (4.58)   0.249 (1.37)
NYSE/AMEX           0.120 (0.56)   0.650 (2.54)                     0.576 (5.29)   0.242 (1.49)
NYSE/AMEX/NASDAQ    0.104 (0.47)   0.684 (2.75)                     0.572 (5.51)   0.254 (1.61)

(beta and firm size)
9                   1.097 (4.53)   0.071 (0.30)   -0.073 (-2.53)    1.116 (4.67)   0.058 (0.25)   -0.074 (-2.57)
16                  1.073 (4.94)   0.072 (0.31)   -0.072 (-2.68)    1.103 (5.16)   0.051 (0.23)   -0.074 (-2.73)
25                  1.040 (4.91)   0.105 (0.46)   -0.074 (-2.79)    1.080 (5.19)   0.076 (0.34)   -0.076 (-2.86)
49                  1.023 (4.88)   0.128 (0.56)   -0.079 (-2.96)    1.088 (5.32)   0.081 (0.38)   -0.082 (-3.14)
64                  1.026 (4.96)   0.120 (0.52)   -0.078 (-2.92)    1.103 (5.49)   0.065 (0.31)   -0.082 (-3.06)
100                 1.024 (4.93)   0.140 (0.61)   -0.083 (-3.02)    1.133 (5.68)   0.060 (0.29)   -0.088 (-3.21)
196                 0.916 (4.40)   0.215 (0.93)   -0.078 (-2.79)    1.103 (5.61)   0.077 (0.40)   -0.087 (-3.09)
289                 0.867 (3.83)   0.257 (1.05)   -0.074 (-2.63)    1.182 (5.95)   0.045 (0.29)   -0.092 (-3.21)
400                 0.720 (3.11)   0.387 (1.51)   -0.074 (-2.58)    1.178 (5.90)   0.038 (0.22)   -0.095 (-3.20)
NYSE/AMEX           0.575 (2.49)   0.564 (2.10)   -0.073 (-2.75)    1.121 (6.28)   0.036 (0.39)   -0.103 (-3.74)
NYSE/AMEX/NASDAQ    0.528 (2.40)   0.634 (2.39)   -0.074 (-2.86)    1.079 (6.42)   0.038 (0.51)   -0.100 (-3.71)

Panel B: July 1963 through December 1990

(beta alone)
9                   -0.039 (-0.13)  0.573 (1.34)                    0.109 (0.40)   0.431 (1.12)
16                  -0.023 (-0.08)  0.554 (1.33)                    0.144 (0.56)   0.395 (1.08)
25                  -0.044 (-0.15)  0.567 (1.36)                    0.127 (0.50)   0.404 (1.12)
49                  -0.071 (-0.24)  0.589 (1.40)                    0.145 (0.59)   0.385 (1.09)
64                  -0.080 (-0.27)  0.600 (1.43)                    0.167 (0.70)   0.366 (1.06)
100                 -0.020 (-0.07)  0.550 (1.30)                    0.248 (1.08)   0.300 (0.90)
196                 -0.108 (-0.36)  0.641 (1.49)                    0.277 (1.30)   0.282 (0.90)
289                 -0.157 (-0.50)  0.697 (1.57)                    0.337 (1.65)   0.238 (0.79)
400                 -0.178 (-0.64)  0.722 (1.67)                    0.391 (2.02)   0.198 (0.70)
NYSE/AMEX           -0.142 (-0.50)  0.732 (1.73)                    0.415 (2.46)   0.183 (0.64)
NYSE/AMEX/NASDAQ    -0.163 (-0.59)  0.762 (1.98)                    0.422 (2.79)   0.178 (0.69)

(beta and firm size)
9                   1.198 (3.00)   -0.076 (-0.21)  -0.094 (-2.13)   1.226 (3.18)   -0.085 (-0.24)  -0.097 (-2.20)
16                  1.265 (3.48)   -0.073 (-0.19)  -0.108 (-2.32)   1.251 (3.52)   -0.137 (-0.40)  -0.095 (-2.23)
25                  1.307 (3.48)   -0.084 (-0.23)  -0.166 (-2.34)   1.212 (3.53)   -0.122 (-0.37)  -0.092 (-2.22)
49                  1.274 (3.31)   -0.060 (-0.16)  -0.109 (-2.35)   1.246 (3.74)   -0.125 (-0.38)  -0.100 (-2.36)
64                  1.234 (3.21)   -0.054 (-0.15)  -0.114 (-2.32)   1.259 (3.85)   -0.126 (-0.40)  -0.104 (-2.44)
100                 1.256 (3.26)   -0.052 (-0.15)  -0.116 (-2.26)   1.274 (3.93)   -0.132 (-0.43)  -0.105 (-2.45)
196                 1.073 (3.11)    0.040 (0.11)   -0.105 (-2.24)   1.252 (3.91)   -0.094 (-0.33)  -0.111 (-2.48)
289                 0.985 (2.67)    0.135 (0.37)   -0.106 (-2.20)   1.275 (4.03)   -0.110 (-0.40)  -0.112 (-2.49)
400                 0.793 (2.26)    0.185 (0.50)   -0.098 (-2.08)   1.316 (4.12)   -0.117 (-0.46)  -0.118 (-2.57)
NYSE/AMEX           0.611 (1.88)    0.496 (1.32)   -0.100 (-2.13)   1.204 (3.88)    0.178 (0.08)   -0.121 (-2.80)
NYSE/AMEX/NASDAQ    0.463 (1.70)    0.586 (1.65)   -0.089 (-2.15)   1.128 (3.77)    0.241 (0.19)   -0.114 (-2.63)
The risk price estimates ($\bar{\hat{\gamma}}_0$, $\bar{\hat{\gamma}}_1$, $\bar{\hat{\gamma}}_2$) are the time-series averages of the month-by-month CSR coefficient estimates. The monthly CSR coefficients are estimated through the EIV-corrected WLS version of the MLE and the traditional uncorrected WLS estimation, assuming conditional homoscedasticity of the disturbance term of the market model. For the whole period from July 1936 to December 1991, both financial and nonfinancial firms are used, while for the subperiod from July 1963 to December 1990, only nonfinancial firms are used. The portfolios are size-β portfolios: they are first formed based on each firm's market value in June of year y, and each size-sorted portfolio is then subdivided based on each firm's pre-ranking beta in June of year y, for July of year y through June of year y + 1. All firms traded for at least 2 years on the NYSE and AMEX are used to form the portfolios. NYSE/AMEX (NYSE/AMEX/NASDAQ) indicates individual stocks traded for at least 5 years on these exchanges. For the whole period (for the subperiod), on average, there are 1208 and 1579 stocks (1407 and 2083 nonfinancial stocks) in the monthly regressions for NYSE/AMEX and NYSE/AMEX/NASDAQ, respectively.
estimation results in Panel B for the subperiod from July 1963 through December 1990 are comparable to those presented in Table III of Fama and French (1992), since they use the same portfolio grouping method (100 size-β portfolios) with nonfinancial firms and also use the CRSP value-weighted market return. However, remember that their beta estimation method is different in that they estimate post-ranking betas by using the full-period sample of post-ranking returns on each of the size-β portfolios and allocate the
Table 70.5 Risk price estimates (t-statistics) assuming conditional heteroscedasticity

Model: R_it = γ_0t + γ_1t β̂_it−1 + γ_2t log(V_it−1) + ω_it

                    EIV-corrected WLS (MLE)                          Uncorrected WLS
Portfolio size (N)  γ̄_0            γ̄_1            γ̄_2              γ̄_0            γ̄_1            γ̄_2

(beta alone)
9                   0.463 (3.33)   0.519 (1.89)                     0.315 (1.91)   0.451 (1.76)
16                  0.471 (3.45)   0.506 (1.87)                     0.302 (1.86)   0.453 (1.77)
25                  0.466 (3.44)   0.511 (1.90)                     0.264 (1.58)   0.490 (1.91)
49                  0.418 (3.05)   0.552 (2.01)                     0.198 (1.27)   0.543 (2.11)
64                  0.408 (2.99)   0.563 (2.06)                     0.208 (1.33)   0.540 (2.09)
100                 0.395 (2.83)   0.572 (2.11)                     0.180 (1.12)   0.560 (2.16)
196                 0.351 (2.41)   0.607 (2.22)                     0.138 (0.81)   0.611 (2.33)
289                 0.294 (1.83)   0.676 (2.39)                     0.121 (0.63)   0.637 (2.35)
400                 0.225 (1.35)   0.758 (2.53)                     0.114 (0.50)   0.650 (2.43)
NYSE/AMEX           0.225 (1.33)   0.771 (2.61)                     0.109 (0.47)   0.661 (2.57)
NYSE/AMEX/NASDAQ    0.203 (1.30)   0.813 (2.66)                     0.095 (0.42)   0.693 (2.78)

(beta and firm size)
9                   1.102 (5.12)   0.186 (0.64)   -0.088 (-2.82)    1.102 (4.55)   0.068 (0.27)   -0.075 (-2.58)
16                  1.020 (5.01)   0.210 (0.75)   -0.075 (-2.51)    1.076 (4.89)   0.071 (0.31)   -0.074 (-2.68)
25                  1.079 (5.23)   0.184 (0.63)   -0.085 (-2.86)    1.044 (4.93)   0.103 (0.43)   -0.076 (-2.77)
49                  0.999 (4.83)   0.242 (0.89)   -0.079 (-2.73)    1.020 (4.82)   0.129 (0.57)   -0.077 (-2.86)
64                  1.040 (5.11)   0.216 (0.82)   -0.088 (-2.83)    1.026 (4.95)   0.121 (0.55)   -0.079 (-2.87)
100                 0.995 (4.93)   0.247 (0.92)   -0.082 (-3.77)    1.028 (4.97)   0.137 (0.58)   -0.084 (-2.98)
196                 0.897 (4.21)   0.305 (1.07)   -0.071 (-2.46)    0.914 (4.37)   0.217 (0.96)   -0.078 (-2.84)
289                 0.827 (3.59)   0.381 (1.32)   -0.066 (-2.35)    0.864 (3.81)   0.259 (1.08)   -0.073 (-2.60)
400                 0.682 (2.83)   0.517 (1.65)   -0.060 (-2.23)    0.714 (3.06)   0.390 (1.54)   -0.071 (-2.55)
NYSE/AMEX           0.604 (2.47)   0.647 (1.96)   -0.055 (-2.10)    0.714 (2.44)   0.567 (2.11)   -0.071 (-2.70)
NYSE/AMEX/NASDAQ    0.558 (2.41)   0.710 (2.17)   -0.052 (-2.01)    0.519 (2.37)   0.638 (2.40)   -0.069 (-2.77)
The risk price estimates ($\bar{\hat{\gamma}}_0$, $\bar{\hat{\gamma}}_1$, $\bar{\hat{\gamma}}_2$) are the time-series averages of the month-by-month CSR coefficient estimates for the whole period from July 1936 to December 1991. The monthly CSR coefficients are estimated through the EIV-corrected WLS version of the (iterative) MLE, assuming conditional heteroscedasticity of the disturbance term of the market model. The conditional heteroscedasticity is modeled by $\varepsilon_{is}^2 = \alpha_{0i} + \alpha_{1i} R_{ms} + \alpha_{2i} R_{ms}^2$, where $\varepsilon_{is}$ and $R_{ms}$ are the disturbance term of the market model for an asset $i$ at time $s$ and the market return at time $s$, respectively. The portfolios are size-β portfolios; they are first formed based on each firm's market value in June of year y, and each size-sorted portfolio is then subdivided based on each firm's pre-ranking beta in June of year y, for July of year y through June of year y + 1. All firms traded for at least 2 years on the NYSE and AMEX are used to form the portfolios. NYSE/AMEX (NYSE/AMEX/NASDAQ) indicates individual stocks traded for at least 5 years on these exchanges. For the whole period, on average, there are 1208 and 1579 stocks in the monthly regressions for NYSE/AMEX and NYSE/AMEX/NASDAQ, respectively.
post-ranking beta of a size-β portfolio to each stock in the portfolio. These allocated betas are then used in the CSR estimation for individual stocks. Even given this different method, Fama and French's results are broadly similar to those reported in Panel B. Both sets of results, which are contaminated by the EIV problem, show that beta has little power to explain average stock returns. After correcting for the EIV problem, however, beta shows more explanatory power for average stock returns; that is, the estimate of the price of beta risk for this subperiod in the CSR with beta alone is 0.762% per month, with a t-statistic of 1.98. However, when the size variable is included in the model, the EIV-corrected estimate of the price of beta risk is not as significant as that obtained for the whole period in Panel A, partly because the time-series sample size of the CSR coefficient estimates is smaller. It is 0.634% per month, with a t-statistic of 2.39, for the whole period, while it is 0.586% per month, with a t-statistic of 1.65, for the subperiod.
Table 70.5 presents the EIV-corrected estimation results for the whole period by the iterative MLE when the conditional heteroscedasticity of the disturbance term of the market model of Equation (70.19) is assumed.^16 The results are quite similar to those obtained under the assumption of homoscedasticity.^17 The similarity may be explained by the fact that, when all NYSE, AMEX, and OTC stocks are examined, the estimated intercept term ($\hat{\alpha}_{0i}$) in the proposed conditional heteroscedasticity Equation (70.19) accounts on average for 90% of the total residual variance ($\hat{\varepsilon}_i^2$), while the time-varying part ($\hat{\alpha}_{1i} R_{ms} + \hat{\alpha}_{2i} R_{ms}^2$) accounts for only 10%. In other words, although there is a heteroscedastic component in the disturbance term of the market model, the homoscedastic component (the intercept) overwhelmingly dominates the heteroscedastic component. Moreover, with 5 years of return data on individual stocks, the rejection rate of homoscedasticity by White's (1980) test is only 6.4% at the 5% significance level. It could be asserted, therefore, that heteroscedasticity of the market model disturbance term is not a serious concern in the EIV-corrected estimation of risk premia when individual stocks are used. This view is also supported by the simulation results of the previous section.

In summary, after correcting for the EIV problem, market beta has economically and statistically significant explanatory power for average stock returns for the whole period from July 1936 to December 1991, regardless of whether the firm size variable is included in the cross-sectional regressions and regardless of whether one adjusts for heteroscedasticity of the disturbance term of the market model. It is worthwhile, however, to notice that the results are sensitive to the choice of the testing period. For the subperiod July 1963 through December 1990, for example, when the size variable is included in the model, the explanatory power of market beta is weaker than that obtained for the whole period. However, it is more significant than that obtained without correcting for the EIV bias. The EIV problem, if not corrected, can lead to misleading conclusions. For example, the EIV problem caused Fama and French (1992) to underestimate the importance of market beta and to overestimate the importance of idiosyncratic factors. Nevertheless, the results of this paper indicate that firm size is still significant in explaining the cross-section of average returns, which is consistent with a misspecification in the CAPM.

^16 By using the traditional least squares estimates or the closed-form MLEs as initial values, Newton's iterative method converges quickly.

^17 Different forms of conditional heteroscedasticity have also been considered: $\hat{\varepsilon}_{it}^2 = \alpha_{0i} + \alpha_{1i} R_{mt}^2$ and $\hat{\varepsilon}_{it}^2 = \alpha_{0i} + \alpha_{1i}(R_{mt} - \bar{R}_m)^2$. However, the estimation results for the price of beta risk (not reported) are quite similar.
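The 90%/10% variance decomposition quoted above can be computed directly from the fitted coefficients of Equation (70.19); a sketch under the same assumptions as the earlier auxiliary-regression code:

```python
import numpy as np

def homoscedastic_share(alpha, r_m):
    """Share of the average residual variance due to the intercept of Eq. (70.19).

    alpha : (3,) fitted coefficients (alpha_0, alpha_1, alpha_2)
    r_m   : (T,) market returns over the estimation window
    """
    total = alpha[0] + alpha[1] * np.mean(r_m) + alpha[2] * np.mean(r_m ** 2)
    return alpha[0] / total
```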
70.4.4 Estimation Efficiency of the Beta Risk Price Estimates

While the traditional least squares estimate of the price of beta risk is biased, its asymptotic variance is smaller than that of the EIV-corrected MLE.^18 It is interesting, therefore, to compare the MLE with the traditional WLS estimator in terms of both bias and MSE as the number of portfolios increases. To do this, assume that the average of the CRSP monthly market returns minus the 1-month Treasury bill returns is the true price of beta risk. Only the results obtained under the assumption of homoscedasticity of the disturbance term of the market model are reported, since the results obtained under the assumption of heteroscedasticity are similar.

Kim (1995) shows in a figure the movements of the relative bias $\left((\bar{\hat{\gamma}}_1 - \gamma_1)/\gamma_1\right)$ and the square root of the relative MSE $\left(\sqrt{\mathrm{MSE}(\bar{\hat{\gamma}}_1)/\gamma_1^2}\right)$ of the estimates of the price of beta risk for the whole period, when beta is the only explanatory variable in the model. The relative bias of the MLE of $\gamma_1$ decreases almost monotonically and approaches zero as the number of portfolios increases. The relative MSE of the MLE of $\gamma_1$ also decreases with the number of portfolios. On the other hand, the bias of the traditional WLS estimate of $\gamma_1$ becomes worse as the number of portfolios increases, a result that is expected from the earlier analysis. The MSE of the WLS estimate also increases with the portfolio size, since the increase of the bias dominates the decrease of the variance.^19 The relative bias of the MLE of $\gamma_1$ is smaller when the value-weighted market returns are used than when the equal-weighted market returns are used, while the relative MSE is not very different. Kim (1995) also shows in a figure the relative bias and the square root of the relative MSE of the estimates of the price of beta risk for the whole period when both the beta and the firm size variables are in the model. Overall, the EIV-corrected MLE method dominates the traditional WLS estimation method, both with respect to bias and MSE and for any considered portfolio size.^20

^18 After manipulating the variance of the EIV-corrected MLE of the CSR coefficient $\gamma_{1t}$, $\mathrm{Var}(\hat{\gamma}_{1t})$, of Equation (A.5) in the appendix of Kim (1995), the following relation is obtained:

$$\mathrm{Var}(\hat{\gamma}_{1t}) = \frac{\mathrm{Var}(\hat{\gamma}_{1t}^{LS})}{1-R} + \frac{\delta_t\, R}{N\,(1-R)^2\left(1-\hat{\rho}_{\hat{\beta}V}^2\right)},$$

where $R = \mathrm{Var}(\eta_{it-1})/m_{\hat{\beta}\hat{\beta}}$. Since $m_{\hat{\beta}\hat{\beta}}$ is greater than $\mathrm{Var}(\eta_{it-1})$ (from Equation (70.2)), $0 \le R \le 1$. Therefore, the variance of the traditional least squares estimate of $\gamma_{1t}$, $\mathrm{Var}(\hat{\gamma}_{1t}^{LS})$, is smaller than the variance of its EIV-corrected MLE. Since the estimate of the price of beta risk is the time-series average of the CSR estimates of $\gamma_{1t}$, $\mathrm{Var}(\bar{\hat{\gamma}}_1) = (1/T)\,\mathrm{Var}(\hat{\gamma}_{1t})$ and $\mathrm{Var}(\bar{\hat{\gamma}}_1^{LS}) = (1/T)\,\mathrm{Var}(\hat{\gamma}_{1t}^{LS})$, where $T$ is the time-series sample size. Therefore, the variance of the least squares estimate of the price of beta risk is smaller than the variance of its EIV-corrected MLE.

^19 As the CSR sample size (N) and/or the time-series sample size (T) increase, the variance of the price of beta risk estimate decreases. However, the bias of the traditional least squares estimate $\bar{\hat{\gamma}}_1^{LS}$ is not eliminated but increased, even though both N and T increase. The MSE of $\bar{\hat{\gamma}}_1^{LS}$ is therefore dominated by its bias as N and T increase. On the other hand, both the variance and the bias of the MLE $\bar{\hat{\gamma}}_1$ decrease as N and T increase, and thus the MSE of $\bar{\hat{\gamma}}_1$ decreases.

^20 In a Monte Carlo simulation study (not reported), the MLE also outperforms the traditional WLS estimate in any considered portfolio formation method (portfolios by size, β, size-β, half in size and half in β, and independent size-β) with respect to both bias and MSE. The size-β portfolio formation gives better estimation results than the other portfolio formation methods.
70.5 Conclusions

This paper develops a new method for correcting the EIV problem in the estimation of the price of beta risk within the two-pass estimation framework. The correction for the EIV problem is performed by incorporating the extracted information about the relationship between the measurement error variance and the idiosyncratic error variance into the maximum likelihood estimation, under either homoscedasticity or heteroscedasticity of the disturbance term of the market model. This paper also reexamines the explanatory power of beta and firm size for average stock returns using the proposed correction. If no correction for the EIV problem is implemented, then, consistent with recent empirical studies, there is no significant relation between beta and expected returns – especially when firm size is included as an additional explanatory variable. With the EIV correction, both the price and the significance of beta increase. This is most evident when the full sample from July 1936 to December 1991 is used. As a result of the EIV correction, the intercept (which should be zero under the null hypothesis) becomes insignificantly different from zero. Finally, the EIV correction diminishes the impact of the firm size variable. Nevertheless, firm size remains a significant force in explaining the cross-section of average returns. Although the results suggest a more important role for market beta, the evidence is still consistent with a misspecification in the capital asset pricing model.
References

Aigner, D. J. 1974. “MSE dominance of least squares with errors-of-observation.” Journal of Econometrics 2, 365–372.
Amihud, Y., B. J. Christensen, and H. Mendelson. 1993. “Further evidence on the risk-return relationship.” Working paper, New York University.
Banz, R. W. 1981. “The relationship between return and market value of common stocks.” Journal of Financial Economics 9, 3–18.
Barone-Adesi, G. and P. P. Talwar. 1983. “Market models and heteroscedasticity of residual security returns.” Journal of Business & Economic Statistics 1, 163–168.
Belkaoui, A. 1977. “Canadian evidence of heteroscedasticity in the market model.” Journal of Finance 32, 1320–1324.
Bey, R. P. and G. E. Pinches. 1980. “Additional evidence of heteroscedasticity in the market model.” Journal of Financial and Quantitative Analysis 15, 299–322.
Black, F. 1972. “Capital market equilibrium with restricted borrowing.” Journal of Business 45, 444–455.
Brown, S. J. 1977. “Heteroscedasticity in the market model: a comment.” Journal of Business 50, 80–83.
Davis, J. L. 1994. “The cross-section of realized stock returns: the pre-COMPUSTAT evidence.” Journal of Finance 49, 1579–1593.
Fama, E. F. and J. D. MacBeth. 1973. “Risk, return, and equilibrium: empirical tests.” Journal of Political Economy 81, 607–636.
Fama, E. F. and K. R. French. 1992. “The cross-section of expected stock returns.” Journal of Finance 47, 427–465.
Fomby, T. B., R. C. Hill, and S. R. Johnson. 1984. Advanced econometric methods, Springer-Verlag, New York.
Fuller, W. A. 1987. Measurement error models, John Wiley & Sons, New York.
Gibbons, M. R. 1982. “Multivariate tests of financial models.” Journal of Financial Economics 10, 3–27.
Ibbotson, R. C. 1992. Stocks, bonds, bills, and inflation: 1991 yearbook, R. G. Ibbotson Associates, Chicago, IL.
Jegadeesh, N. 1992. “Does market risk really explain the size effect?” Journal of Financial and Quantitative Analysis 27, 337–351.
Judge, G. G., W. E. Griffiths, R. C. Hill, and T.-C. Lee. 1980. The theory and practice of econometrics, 1st Edition, John Wiley & Sons, New York.
Keim, D. B. 1983. “Size-related anomalies and stock return seasonality.” Journal of Financial Economics 12, 13–32.
Kim, D. 1995. “The errors in the variables problem in the cross-section of expected stock returns.” Journal of Finance 50, 1605–1634.
Kim, D. 1997. “A reexamination of firm size, book-to-market, and earnings-price in the cross-section of expected stock returns.” Journal of Financial and Quantitative Analysis 32, 463–489.
Kothari, S. P., J. Shanken, and R. G. Sloan. 1995. “Another look at the cross-section of expected stock returns.” Journal of Finance 50, 185–224.
Lintner, J. 1965. “The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets.” Review of Economics and Statistics 47, 13–37.
Litzenberger, R. H. and K. Ramaswamy. 1979. “The effect of personal taxes and dividends on capital asset prices: theory and empirical evidence.” Journal of Financial Economics 7, 163–196.
Lo, A. and A. C. MacKinlay. 1990. “When are contrarian profits due to stock market overreaction?” Review of Financial Studies 3, 175–205.
McCallum, B. T. 1972. “Relative asymptotic bias from errors of omission and measurement.” Econometrica 40, 757–758.
McElroy, M. B. and E. Burmeister. 1988. “Arbitrage pricing theory as a restricted nonlinear multivariate regression model.” Journal of Business and Economic Statistics 6, 29–42.
Miller, M. H. and M. Scholes. 1972. “Rates of return in relation to risk: a re-examination of some recent findings,” in Studies in the theory of capital markets, M. C. Jensen (Ed.), Praeger, New York.
Reinganum, M. R. 1981. “Misspecification of capital asset pricing: empirical anomalies based on earnings yields and market values.” Journal of Financial Economics 9, 19–46.
Richardson, D. H. and D.-M. Wu. 1970. “Least squares and grouping method estimation in the errors in variables model.” Journal of the American Statistical Association 65, 724–748.
Schmidt, P. 1976. Econometrics, Marcel Dekker, New York.
Shanken, J. 1992. “On the estimation of beta-pricing models.” Review of Financial Studies 5, 1–33.
Sharpe, W. F. 1964. “Capital asset prices: a theory of market equilibrium under conditions of risk.” Journal of Finance 19, 425–442.
White, H. 1980. “A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity.” Econometrica 48, 817–838.
Zellner, A. 1962. “An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias.” Journal of the American Statistical Association 57, 348–368.
Chapter 71
McMC Estimation of Multiscale Stochastic Volatility Models German Molina, Chuan-Hsiang Han, and Jean-Pierre Fouque
Abstract In this paper we propose to use Markov chain Monte Carlo (McMC) methods to estimate the parameters of stochastic volatility models with several factors varying at different time scales. The originality of our approach, in contrast with classical factor models, is the identification of two factors driving a univariate series at well-separated time scales. This is tested with simulated data as well as foreign exchange data.
Keywords Time scales in volatility · Bayesian estimation · Markov chain Monte Carlo · Foreign exchange volatility model · Multifactor model
71.1 Introduction

Volatility modeling and estimation from historical data has been extensively studied during the last two decades. Constant volatility models have been shown to provide a poor fit to financial time series, while dynamic structures provide a more realistic approach to volatility modeling. One of the first papers to tackle this problem was the original ARCH work by Engle (1982), followed by its generalization (GARCH) by Bollerslev (1986). In the last decade the GARCH family has expanded enormously, with papers of great relevance, such as the EGARCH model proposed by Nelson (1991), and other adaptations/generalizations, like threshold, power, and (fractionally) integrated as well as multivariate versions.
C.-H. Han () Department of Quantitative Finance, National Tsing-Hua University, Hsinchu, Taiwan 30013, ROC e-mail: [email protected] G. Molina Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, NC 27709, USA J.-P. Fouque Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106-3110, USA
A second approach to modeling time-varying volatility comes from stochastic volatility models. GARCH and stochastic volatility models differ in the structure of the volatility process. The former assumes dependence on the previous realized shocks in the series, while the latter assumes that volatility follows its own stochastic process, with potential dependence coming through the correlation structure between the series and the volatility processes. Estimation of stochastic volatility models has been an important issue in the literature. Bates (1996) gives a review of the different approaches to fitting these models. Generalizations of these models have been explored, including jumps in returns and in volatilities and/or fat tails in distributions, as in Jacquier et al. (1998), Chib et al. (2002) or ter Horst (2003), or non-Gaussian errors, as in Fridman and Harris (1998). Other extensions include multivariate stochastic volatility models, as proposed by Harvey et al. (1994). Multifactor models for multivariate series have been developed in the last decade, where several return series are driven by a common (smaller) set of underlying factors, as in Lopes (1999), Aguilar and West (2000), and Chib et al. (2002). Another approach to multifactor models came with the idea of multiscale modeling. Univariate series can be driven by several factors, with those factors varying at different time scales, as in Alizadeh et al. (2002). In fact, two-factor stochastic volatility models have been shown to produce the kurtosis, fat-tailed return distribution and long memory effect observable in many financial time series. This is well-documented in LeBaron (2001). Chernov et al. (2003) used the efficient method of moments to calibrate this model. One of the main results they found is that two factors are necessary for log-linear models. In this paper we propose the use of Bayesian methods for the parameter estimation in this class of stochastic volatility models. Bayesian methods have become more popular for estimation of complicated models with the introduction of Markov chain Monte Carlo (McMC) techniques. A review of these methods can be found in Robert and Casella (1999). Apart from its computational advantages, the possibility of incorporating prior knowledge of the parameters in the estimation
provides a useful tool to practitioners. Key papers in Bayesian estimation of stochastic volatility models are Jacquier et al. (1994) and Kim et al. (1998), among many others. Volatility is one of the most sensitive factors in derivative pricing. Black and Scholes introduced a closed form solution for derivative prices under constant volatility. Their work has been extended to incorporate either constant volatility plus jumps or stochastic volatility with/without jumps to provide a better fit of implied volatility surfaces. A recent alternative to these approaches is to consider volatility driven by short and long time scale factors. Under this framework, Fouque et al. (2003) use a combination of singular and regular perturbations to derive a closed form formula to approximate European option prices. They use the term structure of implied volatilities to calibrate some aggregate parameters in their formulas, and they find that two factors with well-separated time scales provide a significantly better fit than a one-factor model. As a consequence, under two-factor stochastic volatility models Monte Carlo simulations to evaluate option prices can be computed efficiently, a key issue from a practitioner's point of view. Fouque and Han (2004) propose the use of variance reduction techniques such as importance sampling and control variate methods to evaluate no-arbitrage prices of derivatives. In this paper we consider a class of stochastic volatility models where the volatility is the exponential of a sum of two mean reverting diffusion processes (more precisely Ornstein–Uhlenbeck processes). These mean reverting processes vary on well-separated time scales. We then rewrite these models in discrete time and use McMC techniques to estimate their parameters (Sect. 71.2). Our study is illustrated on simulated data (Sect. 71.3), where we show that we can effectively estimate several factors at well-separated time scales. We also present (Sect. 71.4) an application of our technique to a set of foreign exchange data. This paper focuses on the development of an McMC estimation algorithm for single time series driven by several volatility factors at different time scales. We informally address model selection issues through diagnostics of the McMC results, but do not enter into proper computation of Bayes factors or reversible jump algorithms. We instead restrict ourselves to proposing a Bayesian alternative to the estimation problems when more than one stochastic volatility factor drives a univariate series, and address identifiability problems inherent to such estimation. Our interest resides in applications in the foreign exchange market, this being one of the markets most intensively addressed in the statistical literature. The importance of the identification of a second factor, and its relevance for a more accurate estimation of the first factor, is outlined in the paper.
71.2 Multiscale Modeling and McMC Estimation

71.2.1 Continuous Time Model

For purposes of general notation, let K be the number of volatility factors driving a univariate time series, where K can be either one or two. We first introduce our K factor volatility model in continuous time. In this class of models the volatility is simply the exponential of the sum of K mean reverting Ornstein–Uhlenbeck (O–U) processes:

\[
\begin{aligned}
dS_t &= \mu S_t\,dt + \sigma_t S_t\,dW_t^{(0)} \\
\log(\sigma_t^2) &= \sum_{j=1}^{K} F_t^{(j)} \\
dF_t^{(j)} &= \alpha_j\big(\mu_j - F_t^{(j)}\big)\,dt + \beta_j\,dW_t^{(j)}, \quad j = 1,\dots,K,
\end{aligned}
\tag{71.1}
\]

where S is the asset price, μ is the constant rate of return, and the W^{(j)} are (possibly correlated) Brownian motions. Each of the factors F^{(j)} is a mean reverting O–U process, where μ_j denotes the long-run mean, α_j is the rate of mean reversion, and β_j is the "volatility of volatility" of the F^{(j)} factor. The typical time scale of this factor is given by 1/α_j. In our model these time scales are distant in magnitude, which we define as well-separated. They are also ordered, so that, in the two factor case, 0 < α_1 < α_2, and, therefore, the second factor corresponds to the shortest time scale and the first factor corresponds to the longest time scale. We also assume that the variances of the long-run distributions of these factors, given by β_j²/(2α_j), are of the same order. In particular this implies that the β_j's are also correspondingly ordered: β_1 < β_2 < ⋯ < β_K.
71.2.2 Discretization of the Model The goal of this section is to present a discretized version of the stochastic volatility model given in Equation (71.1). Given a fixed time increment > 0, we use the following Euler discretization of the spot price process .St / and volatil .j / ; j D 1; : : : ; K at the times tk D k : ities Ft StkC1 Stk D Stk C tk Stk log.t2k /
D
K X j D1
p
tk
.j /
Ftk
p .j / .j / .j / .j / Ftk Ftk1 D ˛j j Ftk1 C ˇj vtk ;
where ε_{t_k} and v_{t_k}^{(j)} are (possibly correlated) sequences of i.i.d. standard normal random variables. In this paper we restrict ourselves to the case where the sequence of ε's is independent of the v^{(j)}'s, which themselves can be correlated. This is usually the case in the Foreign Exchange market, where market participants operate at different time scales (intraday market makers, hedge funds or pension funds operate with quite different schedules, and based on different information sets), but not independently, and where the leverage effect is rarer than in equity markets. We regroup terms within the equations above such that

\[
\begin{aligned}
\frac{1}{\sqrt{\Delta}}\,\frac{S_{t_{k+1}} - S_{t_k}}{S_{t_k}} &= \mu\sqrt{\Delta} + \sigma_{t_k}\epsilon_{t_k} \\
\log(\sigma_{t_k}^2) &= \sum_{j=1}^{K} F_{t_k}^{(j)} \\
F_{t_k}^{(j)} - \mu_j &= (1 - \alpha_j\Delta)\big(F_{t_{k-1}}^{(j)} - \mu_j\big) + \beta_j\sqrt{\Delta}\,v_{t_k}^{(j)}
\end{aligned}
\]

is obtained. To study the hidden volatility process, we introduce the returns

\[
y_{t_k} = \frac{1}{\sqrt{\Delta}}\,\frac{S_{t_{k+1}} - S_{t_k}}{S_{t_k}},
\]

and the discrete driving (vector of) volatilities

\[
h_{t_k} = F_{t_k} - \mu,
\]

where F_{t_k} (respectively μ) denotes the vector (F_{t_k}^{(1)}, …, F_{t_k}^{(K)})′ (respectively (μ_1, …, μ_K)′). The autoregressive parameters will be denoted by

\[
\phi_j = 1 - \alpha_j\Delta, \quad j = 1,\dots,K,
\]

and the standard deviation parameters will be defined by

\[
\sigma_v^{(j)} = \beta_j\sqrt{\Delta}, \quad j = 1,\dots,K.
\]

With an abuse of notation, and since the observations are equally spaced, we will use t instead of t_k/Δ for the discrete time indices, so that t ∈ {0, 1, 2, …, T}. To summarize, our model is a K-dimensional AR(1) for the log-volatilities, and can be written in state space form as:

\[
y_t = e^{(\mathbf{1}'h_t + \mathbf{1}'\mu)/2}\,\epsilon_t \tag{71.2}
\]
\[
h_t = \Phi h_{t-1} + \Sigma_v^{1/2} v_t \quad \text{for } t = 1,\dots,T, \tag{71.3}
\]

where v_t is the vector (v_t^{(1)}, …, v_t^{(K)})′ of standard Gaussian variates, 1 is a (K-dimensional) vector of ones, the K-dimensional autoregressive (diagonal) matrix with elements φ_j is denoted by Φ, and the covariance matrix is denoted by Σ_v. In the following sections we assume that the covariance matrix Σ_v is diagonal, since the correlation parameter between the factors is indeed not identifiable. For a proof of equivalence between the diagonal and nondiagonal case, see Appendix 1. In Sect. 71.2.3 we make precise the parametrization of our discrete model, including the initial state and random variable notation.

71.2.3 Discrete Time Model Specification

The simplest one-factor stochastic volatility model can be defined in state-space form as in Equation (71.4). We follow the notation in Kim et al. (1998), and denote h_t as the log-volatility at time t, which follows an AR(1) process reverting to its long-term mean according to the mean-reversion parameter φ and with a volatility of volatility σ_v. The initial state h_0 is assumed to come from its invariant (implied marginal) distribution, so that:

\[
\begin{aligned}
y_t &= e^{(h_t + \mu)/2}\,\epsilon_t \\
h_t &= \phi h_{t-1} + \sigma_v v_t \\
h_0 &\sim N_1\big(h_0 \mid 0,\ \sigma_v^2/(1-\phi^2)\big) \\
(\epsilon_t, v_t)' &\sim N_2\big((\epsilon_t, v_t)' \mid 0,\ I\big) \\
(\mu, \phi, \sigma_v) &\sim \pi(\mu, \phi, \sigma_v).
\end{aligned}
\tag{71.4}
\]
To complete the joint distribution, a prior density π(μ, φ, σ_v) on the remaining parameters must also be specified; this will be done more explicitly in Sect. 71.2.4. For simplicity, the mean return is assumed equal to zero. A trivial modification of the procedure presented here would allow us to consider a nonzero mean in the returns. Our contribution in this paper does not focus on modeling extreme events (fat tails in the distributions and/or jumps), which is still a possible addition to the model presented here, but instead generalizes the assumption that a single volatility process drives the series. We will assume that the volatility processes are independent [Σ_v in Equation (71.5) being diagonal]. We show in Appendix 1 that, when the factors are positively correlated, this is equivalent to having independent factors both for estimation and for prediction. Indeed the covariance is not identifiable, since the model is overparametrized.
The one-factor model can be extended to the general case as

\[
\begin{aligned}
y_t &= e^{(\mathbf{1}'h_t + \mu)/2}\,\epsilon_t \\
h_t &= \Phi h_{t-1} + \Sigma_v^{1/2} v_t \\
h_0 &\sim N_K(h_0 \mid 0, \Omega) \\
(\epsilon_t, v_t')' &\sim N_{K+1}\big((\epsilon_t, v_t')' \mid 0,\ I_{K+1}\big) \\
(\mu, \Sigma_v, \Phi) &\sim \pi(\mu, \Sigma_v, \Phi),
\end{aligned}
\tag{71.5}
\]

where Σ_v is the (K × K) diagonal matrix of conditional variances {σ_k²}_{k=1}^K of the log-volatility processes; Ω = ΦΩΦ + Σ_v is the implied (stationary) marginal covariance matrix for the initial state log-volatility vector h_0; and Φ is a K × K diagonal matrix with mean reversion parameters. Since Φ and Σ_v are diagonal matrices, Ω is a diagonal matrix with entries ω_i = σ_i²/(1 − φ_i²). Notice that we do not have a vector μ in the observation equation (71.5), but instead a single parameter μ. This is because only the linear combination (sum) is identifiable. Therefore we define the mean log-volatility of the series as μ = 1′μ. One interesting feature of this model is its similarity to models that introduce jumps in volatility. If we set φ_i = 0 for some i, the implied model is equivalent to one that introduces a permanent (log-normal) source of independent jumps in the volatility series, as the h_{i,t} would be (temporally independent) normals added to the (log) variance process driving the series.
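Before turning to priors and estimation, it may help to see the data-generating process in code. The following is a minimal sketch (ours, not from the original paper) that simulates returns from Equation (71.5) with a diagonal Σ_v; the parameter values mirror Setting 2 of the simulation study in Sect. 71.3, and the function name is our own illustrative choice.

```python
import numpy as np

def simulate_msv(T, phi, sigma_v, mu, seed=0):
    """Simulate the K-factor discrete-time SV model of Equation (71.5)."""
    rng = np.random.default_rng(seed)
    phi, sigma_v = np.asarray(phi), np.asarray(sigma_v)
    K = len(phi)
    h = np.zeros((T + 1, K))
    # h_0 from the stationary distribution: omega_k = sigma_k^2 / (1 - phi_k^2)
    h[0] = rng.normal(0.0, sigma_v / np.sqrt(1.0 - phi**2))
    for t in range(1, T + 1):
        h[t] = phi * h[t - 1] + sigma_v * rng.normal(size=K)
    # observation equation: y_t = exp((1'h_t + mu) / 2) * eps_t
    y = np.exp((h[1:].sum(axis=1) + mu) / 2.0) * rng.normal(size=T)
    return y, h

# Setting 2 of Sect. 71.3: phi = (0.98, 0.60), sigma = (0.20, 0.80), mu = -1.5
y, h = simulate_msv(3000, [0.98, 0.60], [0.20, 0.80], -1.5)
```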
71.2.4 Prior Specification

We specify independent priors for each of the (sets of) parameters, which is the product measure π(μ, Φ, Σ_v) = π_1(Φ) π_2(μ) π_3(Σ_v). We set a uniform prior between −1 and 1 for each of the K mean reverting parameters. In order to guarantee identifiability of the different series, we specify a prior ordering of the parameters, namely π_1(Φ) ∝ 1_{1 > φ_1 > φ_2 > ⋯ > φ_K > −1}. Usually, with financial data, there is substantial prior information about μ and Σ_v. We could, in principle, use an informative normal prior for μ, π_2(μ) = N_1(μ | m_0, v_0) with fixed hyperparameters m_0 and v_0. Here, however, a vague prior is used by choosing a large value of v_0. A similar prior is chosen for the variance hyperparameters: π_3(Σ_v) ∝ ∏_{k=1}^K IG(σ_k² | a_k, b_k) for k = 1, …, K with fixed a_k, b_k. We specify values for these hyperparameters such that the prior is flat. The results do not seem to be sensitive to which large values of a_k and b_k are chosen.

71.2.5 Estimation

Since the volatility parameters h_t are nonlinear in the likelihood, a mixture approximation to the likelihood, as detailed in Kim et al. (1998), has become a commonly used linear approximation technique. Taking the logarithm of the squares of the mean corrected returns y_t transforms Equation (71.5) into

\[
\log(y_t^2) = \mathbf{1}'h_t + \mu + \underbrace{\log(\epsilon_t^2)}_{\epsilon_t^*}\,, \tag{71.6}
\]

and ε_t* ~ log χ₁² is then well-approximated as a mixture of seven normals with (known) means m_i, variances v_i and weights q_i. We introduce vectors of indicator variables δ_t = {δ_{1t}, …, δ_{7t}} for the particular component of the mixture at each time point t, in the same fashion as Kim et al. (1998), so that, consequently, we have

\[
[\epsilon_t^* \mid \delta_{it} = 1] \sim N_1(\epsilon_t^* \mid m_i - 1.2704,\ v_i).
\]

For simplicity of notation, let y* = {log(y_1²), …, log(y_T²)} and h = {h_0, h_1, …, h_T}. The new state-space representation can be written as

\[
\begin{aligned}
\big[y_t^* \mid \{\delta_{it}\}_{i=1}^7, h_t, \mu\big] &\sim N_1\Big(y_t^* \,\Big|\, \mathbf{1}'h_t + \mu - 1.2704 + \sum_{i=1}^7 \delta_{it} m_i,\ \sum_{i=1}^7 \delta_{it} v_i\Big) \\
[h_t \mid h_{t-1}, \Phi, \Sigma_v] &\sim N_K(h_t \mid \Phi h_{t-1}, \Sigma_v) \\
[h_0 \mid \Phi, \Sigma_v] &\sim N_K(h_0 \mid 0, \Omega) \\
[\mu, \Phi, \Sigma_v] &\sim \pi(\mu, \Sigma_v, \Phi) \\
[\delta] &\sim \prod_{t=1}^T MN(\delta_t \mid 1;\ q_1, \dots, q_7).
\end{aligned}
\tag{71.7}
\]

A Gibbs sampling scheme can now be employed, and consists of the following steps:

1. Initialize δ, μ, Φ, Σ_v
2. Sample h ~ [h | y*, μ, Σ_v, δ, Φ]
3. Sample δ ~ [δ | y*, μ, h]
4. Sample μ ~ [μ | y*, h, δ]
5. Sample Φ ~ [Φ | Σ_v, h]
6. Sample Σ_v ~ [Σ_v | h, Φ]
Details of the full conditionals are given in Appendix 2. In principle one could obtain improved sampling versions by integrating out some parameters from the full conditionals, as in Kim et al. (1998). The simpler version given here is efficient enough for our purposes, however.
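To fix ideas, here is a minimal Python sketch (ours) of the conjugate updates in steps 3, 4 and 6 above, written directly from the full conditionals in Appendix 2; the FFBS draw of h (step 2) and the Metropolis step for Φ (step 5) are omitted. The mixture constants q, m, v are the seven-component values tabulated in Kim et al. (1998) and must be supplied by the user; h is a (T + 1) × K array whose first row is h_0, and ystar holds log(y_t²).

```python
import numpy as np

def sample_delta(ystar, h, mu, q, m, v, rng):
    """Step 3: draw the 7-component mixture indicators delta_t."""
    e = ystar - h[1:].sum(axis=1) - mu          # residual, approx. log chi^2_1
    logp = (np.log(q) - 0.5 * np.log(2 * np.pi * v)
            - 0.5 * (e[:, None] - (m - 1.2704)) ** 2 / v)
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return np.array([rng.choice(len(q), p=pt) for pt in p])

def sample_mu(ystar, h, delta, m, v, m0, v0, rng):
    """Step 4: conjugate normal draw for mu."""
    a, b = m[delta], v[delta]                   # a_t, b_t from the indicators
    resid = ystar - h[1:].sum(axis=1) - a + 1.2704
    v0s = 1.0 / (1.0 / v0 + np.sum(1.0 / b))
    m0s = v0s * (m0 / v0 + np.sum(resid / b))
    return rng.normal(m0s, np.sqrt(v0s))

def sample_sigma2(h, phi, a_k, b_k, rng):
    """Step 6: inverse-gamma draw for each sigma_k^2 (Appendix 2)."""
    T = h.shape[0] - 1
    beta = ((h[1:] - phi * h[:-1]) ** 2).sum(axis=0) \
           + h[0] ** 2 * (1 - phi ** 2) + 2 * b_k
    alpha = T + 1 + 2 * a_k
    return 1.0 / rng.gamma(alpha / 2.0, 2.0 / beta)   # IG(alpha/2, beta/2)
```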
71.3 Simulation Study

In this section we study the behavior of the algorithm when the estimated model is the one generating the data, and also when we try to estimate using the wrong model (wrong number of factors). This will be investigated for different sample sizes and parameter settings. For all experiments, we set the true value of μ = −1.5, which is a value similar to the posterior mean of μ we obtain using real data. We will denote by K* the true number of volatility factors that generated the data, while K is the value with which we do the estimation. We generate the data under three possible parameter settings:

Setting 1: K* = 1, φ_1 = 0.95, σ_1 = 0.26
Setting 2: K* = 2, φ_1 = 0.98, σ_1 = 0.20, φ_2 = 0.60, σ_2 = 0.80
Setting 3: K* = 2, φ_1 = 0.98, σ_1 = 0.20, φ_2 = 0.30, σ_2 = 0.80

We consider for estimation both the one and two factor stochastic volatility models: K = 1, 2. Two sample sizes are considered: T = 1,500 or T = 3,000. Any combination of the above forms an experiment. As an example, a possible experiment comprises generating 3,000 data points with the first parameter setting (one volatility factor) and estimating it with K = 2 factors. This would correspond to the last column of Table 71.1. We considered K = 3 as well, but the algorithm failed to identify three factors for the experiments considered, so we restricted ourselves to the K = 2 factor case of interest. The results in Tables 71.1 and 71.2 are based on 100 Monte Carlo experiments for each of the settings and for each of the experiments. The results reported are Monte Carlo means of the posterior medians, with Monte Carlo means of the posterior standard deviations in parentheses to the right. Below those numbers are the standard deviations of the posterior medians and, in parentheses, the standard deviations of the posterior standard deviations. Three main conclusions can be drawn from these tables. First, we can accurately estimate at least two factors with a relatively small data length if the time scales are well separated. Second, overparametrization tends to induce lack of convergence (high Monte Carlo means of the posterior standard deviations of the redundant parameters). Third, underparametrization tends to induce apparent convergence around values of the parameters situated in between the true values, and either visualization of the chain or more proper model selection procedures are necessary.

Figure 71.1 shows the posterior distribution for φ_k when the data was generated with one volatility scale (K* = 1). After a burn-in of 50,000 iterations, we record, with a thinning of 25, the following 50,000 iterations. Two typical runs are given. Estimation of φ_1, under K = 1, is very stable around the true value (0.95). On the other hand, estimation of φ_2 is very unstable, since the data was generated with only one factor. This graph shows the pattern one should expect when the model is overparametrized. Either the estimates of φ_1 and φ_2 are confounded (the volatility series splits into two almost equivalent parts) or φ_2 never stabilizes. Figure 71.2 shows the posterior distribution for φ_k, k = 1, …, K when the data was generated with two factors (K* = 2, φ_1 = 0.98 and φ_2 = 0.6). The model with K = 1 is underparametrized, leading to an apparent convergence, but the value to which it converges is between the two true values that generated the data. The convergence is fine for the case when we estimate with K = 2.
71.4 Empirical Application: FX Data

We study the volatility in a foreign exchange time series. Since most time series could be argued to have a positive trend, we construct the detrended series as in Kim et al. (1998). In principle we could easily introduce mean processes (constant or time-dependent), but we refrain from doing so to allow comparisons with previous literature and to focus our efforts mainly on the volatility modeling. We study volatility in the daily exchange rate between the UK Sterling and the US Dollar from 10/09/1986 to 08/09/1996. This data is shown in Fig. 71.3. We transform the spot prices into detrended returns and perform the analysis by running the algorithm with one and two factors. The results are shown in Table 71.3. All results are posterior medians (posterior standard deviations). From the left plot in Fig. 71.4 one can argue that, for the GBP/USD series, using one factor gives reasonably stable results. There is a (very) heavy left tail in the posterior distribution of φ_1 under K = 1, though. This could be an indication of the existence of a second volatility factor running at a faster scale. If that was the case, the single volatility factor model would be picking up a mixture of both the slow and fast mean reverting processes, achieving an intermediate result in terms of its estimation, and showing a left tail of the distribution heavier than one would expect.
Table 71.1 Estimates of one factor stochastic volatility model

            True      Estimation with K = 1           Estimation with K = 2
            value     T = 1,500     T = 3,000         T = 1,500     T = 3,000
φ_1         0.95      0.95 (0.02)   0.95 (0.01)       0.96 (0.02)   0.96 (0.01)
                      0.02 (0.00)   0.01 (0.00)       0.01 (0.01)   0.01 (0.00)
φ_2         –         –             –                 0.66 (0.29)   0.61 (0.29)
                                                      0.36 (0.17)   0.38 (0.16)
μ           −1.5      −1.50 (0.14)  −1.50 (0.10)      −1.54 (0.15)  −1.50 (0.10)
                      0.14 (0.03)   0.10 (0.01)       0.14 (0.04)   0.01 (0.02)
σ_1         0.26      0.27 (0.04)   0.27 (0.03)       0.12 (0.06)   0.15 (0.05)
                      0.04 (0.01)   0.03 (0.00)       0.09 (0.03)   0.09 (0.03)
σ_2         –         –             –                 0.29 (0.05)   0.28 (0.04)
                                                      0.04 (0.01)   0.02 (0.01)

Parameter setting 1: Results from 100 Monte Carlo simulations generated with K* = 1 factor. Columns represent data lengths (T = 1,500; 3,000) and factors used for the estimation (K = 1, 2). Top lines are means of the Monte Carlo posterior medians (means of the Monte Carlo posterior standard deviations). Bottom lines are standard deviations of the posterior medians (standard deviations of the posterior standard deviations)
Table 71.2 Estimates of two factor stochastic volatility models

Parameter setting 2:
            True      Estimation with K = 1           Estimation with K = 2
            value     T = 1,500     T = 3,000         T = 1,500     T = 3,000
φ_1         0.98      0.86 (0.03)   0.87 (0.02)       0.97 (0.02)   0.98 (0.01)
                      0.05 (0.01)   0.03 (0.00)       0.02 (0.01)   0.01 (0.00)
φ_2         0.60      –             –                 0.53 (0.14)   0.55 (0.08)
                                                      0.15 (0.06)   0.07 (0.02)
μ           −1.5      −1.44 (0.14)  −1.46 (0.10)      −1.52 (0.32)  −1.49 (0.21)
                      0.26 (0.03)   0.22 (0.02)       0.24 (0.18)   0.17 (0.07)
σ_1         0.20      0.69 (0.06)   0.67 (0.05)       0.24 (0.08)   0.22 (0.04)
                      0.09 (0.01)   0.06 (0.00)       0.07 (0.03)   0.05 (0.01)
σ_2         0.80      –             –                 0.78 (0.08)   0.82 (0.05)
                                                      0.09 (0.01)   0.05 (0.00)

Parameter setting 3:
            True      Estimation with K = 1           Estimation with K = 2
            value     T = 1,500     T = 3,000         T = 1,500     T = 3,000
φ_1         0.98      0.89 (0.03)   0.91 (0.02)       0.97 (0.01)   0.98 (0.01)
                      0.05 (0.01)   0.04 (0.01)       0.01 (0.01)   0.01 (0.00)
φ_2         0.30      –             –                 0.23 (0.16)   0.28 (0.10)
                                                      0.15 (0.04)   0.09 (0.02)
μ           −1.5      −1.42 (0.16)  −1.41 (0.11)      −1.52 (0.28)  −1.50 (0.21)
                      0.26 (0.04)   0.21 (0.02)       0.25 (0.10)   0.18 (0.06)
σ_1         0.20      0.52 (0.07)   0.51 (0.05)       0.22 (0.05)   0.21 (0.03)
                      0.12 (0.01)   0.08 (0.01)       0.04 (0.02)   0.03 (0.01)
σ_2         0.80      –             –                 0.78 (0.08)   0.80 (0.05)
                                                      0.08 (0.01)   0.05 (0.00)

Parameter settings 2 and 3: Results from 100 Monte Carlo simulations generated with K* = 2 factors. Columns represent data lengths (T = 1,500; 3,000) and factors used for the estimation (K = 1, 2). Top lines are means of the Monte Carlo posterior medians (means of the Monte Carlo posterior standard deviations). Bottom lines are standard deviations of the posterior medians (standard deviations of the posterior standard deviations)
If we run the estimation with K = 2 we can see (Fig. 71.4, right plots) that there seems to be (at least) a second volatility factor running at a very fast mean reverting rate. The algorithm remains highly stable around a low value for the mean reverting parameter, and a high value for its volatility. Furthermore, the stability of the parameter estimates for the first factor also improves. Indeed, the inclusion of this second factor in the model allows us to identify that the first factor was moving at a lower frequency than previously thought (the posterior median of the mean reverting parameter for the slower scale is 0.988 under K = 2, but it was 0.916 under K = 1). This is a key result of the introduction of a second factor: a significant change in our estimate of the first factor, whose mean reversion speed, if such a second factor exists, is usually inflated. From a practitioner's point of view, accurate estimation of volatility persistence is a major factor, especially in derivative pricing (Fig. 71.5). Figure 71.6 shows the posterior distribution of the mean log-volatilities for each time point t and for each k when K = 2, together with their sum and the absolute value of the returns.
Fig. 71.1 Two typical runs (horizontal) where the estimation of φ_k was done using K = 1 (1st column) and K = 2 factors (2nd and 3rd columns), but the data was generated with K* = 1 (φ_1 = 0.95), after burn-in and thinning of 1 iteration every 25
Fig. 71.2 Two typical runs (horizontal) where the estimation of φ_k was done using K = 1 (1st column) and K = 2 (2nd and 3rd columns) factors, but the data was generated with K* = 2 (φ_1 = 0.98, φ_2 = 0.6), after burn-in and thinning of 1 iteration every 25
Fig. 71.3 Daily GBP/USD spot (top) and returns (bottom) for the period 10/09/1986–08/09/1996
Table 71.3 GBP/USD results: Parameter posterior medians (standard deviations), estimated under different numbers of factors, K = 1, 2

          φ_1             φ_2             μ                σ_1²            σ_2²
K = 1     0.916 (0.057)   – (–)           −1.220 (0.106)   0.134 (0.130)   – (–)
K = 2     0.988 (0.006)   0.149 (0.090)   −1.470 (0.255)   0.013 (0.007)   0.947 (0.132)

Fig. 71.4 Traceplots of φ_k, k = 1, …, K for the GBP/USD series estimated with K = 1 and K = 2 factors, after burn-in and thinning of 1 iteration every 25
Fig. 71.5 Histograms of σ_1, σ_2 and μ for the GBP/USD series estimated with K = 2 factors, after burn-in and thinning of 1 iteration every 25
Fig. 71.6 GBP/USD absolute value of the return series (first plot). Posterior means of the sum of the two factors (second plot). Posterior mean of the slow mean reverting factor (third plot) and posterior mean of the fast mean reverting factor (fourth plot), estimated using K = 2 factors, after burn-in and thinning of 1 iteration every 25 iterations, for a chain run for 100,000 iterations
Given the limitations of the algorithm in identifying more than two factors for the data lengths considered, we do not claim the lack of a third factor in this dataset, but the impossibility of identifying it. Notice in Fig. 71.6 that the short time scale process captures the extreme events, while the long scale factor captures the trends and is much less influenced by extreme events. There is a peak in the returns between the 1,900th and the 2,000th observation (that is, in August 1994). This peak occurs during a period of relative stability (the absolute values of the returns are clearly more stable than in previous periods). The slow mean reverting volatility process remains low at this time and is not affected by this peak, while the fast mean reverting volatility process exhibits a peak coinciding with that day. We do not address the question of whether this should be considered a jump in volatility or the effect of a second volatility process. However, the tight concentration of the posterior around positive values for the second factor, as shown in Table 71.3, seems to point towards the existence of a clear second volatility factor at a (much) faster scale. One can also easily construct any functional of the parameters in the model and do inference on it. Indeed, an interesting quantity to look at is the variance of the long-run distributions of the factors. For this dataset these long-run variances (as defined in Sect. 71.2.1) are of the same order when K = 2 (posterior means are 0.5597 for the slow mean-reverting process and 0.5589 for the fast mean reverting one), while, when estimating them with K = 1, we get a long-run variance of 0.8215. Given that the factors are quite separated, this result gives a very intuitive way to construct joint priors on the mean reverting parameters and the variance parameters.
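As a cross-check (ours), note that under the discretized parametrization of Sect. 71.2.2 the long-run variance β_j²/(2α_j) becomes σ_j²/(2(1 − φ_j)); plugging the posterior medians of Table 71.3 into this expression gives values close to the posterior means quoted above (means and medians differ slightly):

\[
\frac{0.013}{2(1-0.988)} \approx 0.54, \qquad \frac{0.947}{2(1-0.149)} \approx 0.56, \qquad \frac{0.134}{2(1-0.916)} \approx 0.80.
\]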
71.5 Implication on Derivatives Pricing and Hedging

In previous sections the existence of short and long time scales in driving volatility processes has been justified through the method of McMC. This finding is consistent with those in Alizadeh et al. (2002) and Chernov et al. (2003), based on data defined under the historical or physical probability measure. Because these time scales are preserved under a change of probability measure, one may also find empirical evidence of the existence of multiple time scales in volatility from market prices of derivatives. By means of singular and regular perturbation methods, higher order approximations of option prices are derived by taking into account the effect of stochastic volatility. Fast and slow time scales are shown to provide a good fit to the term structure of implied volatility, interest rates, or credit spread yields, etc. These results can be found in a series of papers by Fouque et al. (2003, 2006),
Cotton et al. (2004), Papageorgiou and Sircar (submitted), and references therein. Given a fully specified multiscale stochastic volatility model, Fouque and Han (2007, 2009) develop a martingale control variate method to deal with option pricing problems by means of Monte Carlo simulations. The martingale control represents the value of a portfolio process consisting of some imperfect hedging strategy. As a result, the variance reduced by this method indicates the mean squared error of misreplication by the associated hedging strategy. Han and Lai (in press) incorporate a randomized quasi-Monte Carlo method and show a significant improvement in the convergence of Monte Carlo simulations. Han (submitted) estimates the joint default probability of many names under structural models by an importance sampling method. This result is helpful for high-dimensional problems arising in, for example, coherent risk management of credit portfolios and the evaluation of Collateralized Debt Obligations and Basket Default Swaps.
71.6 Conclusions

This paper addressed identifiability and estimation problems for stochastic volatility models with two volatility factors, given that those can come at different, and well-separated, time scales. We show that the number of factors can be visually identified from a simple analysis of the Markov chains under different models, since overparametrized models will show clear signs of lack of convergence, while underparametrized models will show an incorrect appearance of convergence. This is of special relevance to industry practitioners, who might find model selection tools unfeasible under time constraints. This algorithm relies heavily on the factors being well-separated, and the degree of separation, as well as the length of the dataset, does have an influence on the ability to estimate second factors in a given series. We would, therefore, recommend running the algorithm with at least two factors as a first approach, as it seems to be quite robust to the length of the datasets and the parameter settings. If there is lack of convergence, reduce to K = 1. One could argue that, if the second factor is small (σ_2 close to zero), there is no practical point and no gain in introducing more than one factor. However, as shown in the simulated data and in the application, a (much) more accurate identification of the slow mean-reverting volatility factor is obtained. This will usually come in the form of a more persistent slow mean-reverting factor (φ_1 closer to one), and will, therefore, have a huge impact on volatility predictions.

Acknowledgments This material was based upon work supported by the National Science Foundation under Agreements No. DMS-0112069 and DMS-0071744. We would like to thank Prof. Mike West for the
foreign exchange data used in this paper, as well as the members of the Stochastic Computation research group at the Statistical and Applied Mathematical Sciences Institute (SAMSI), and the SAMSI Workshops participants for their valuable comments. We also thank Prof. Jim Berger for his help and support in the development of this paper.
References Aguilar O. and M. West. 2000. “Bayesian dynamic factor models and variance matrix discounting for portfolio allocation.” Journal of Business and Economic Statistics 18, 338–357. Alizadeh, S., M. Brandt, and F. Diebold. 2002. “Range-based estimation of stochastic volatility models.” Journal of Finance 57, 1047–1091. Bates, D. 1996. “Testing option pricing models,” in Statistical methods in finance, Vol. 14 of Handbook of Statistics, G. Maddala and C. Rao (Eds.). North Holland, Amsterdam, Chap. 20, pp. 567–611. Bollerslev, T. 1986. “Generalized autoregressive conditional heteroskedasticity.” Journal of Econometrics 31, 307–327. Chernov, M., R. Gallant, E. Ghysels, and G. Tauchen. 2003. “Alternative models for stock price dynamics.” Journal of Econometrics 116, 1–2, 225–257. Chib, S., F. Nardari, and N. Shephard. 2002. “Markov Chain Monte Carlo methods for stochastic volatility models.” Journal of Econometrics 108, 281–316. Cotton, P., J.-P. Fouque, G. Papanicolaou, and R. Sircar. 2004. “Stochastic volatility corrections for interest rate derivatives.” Mathematical Finance 14(2), 173–200. Engle, R. F. 1982. “Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation.” Econometrica 50, 987–1008. Fouque, J.-P. and C.-H. Han. 2004. “Variance reduction for Monte Carlo methods to evaluate option prices under multi-factor stochastic volatility models.” Quantitative Finance 4(5), 597–606. Fouque, J.-P. and C.-H. Han. 2007. “A martingale control variate method for option pricing with stochastic volatility.” ESAIM Probability & Statistics 11, 40–54. Fouque, J.-P. and C.-H. Han. 2009. “Asymmetric variance reduction for pricing American options,” Mathematical Modeling and Numerical Methods in Finance, A. Bensoussan, Q. Zhang, and Ph. Ciarlet (Eds.), Elsevier, Amsterdam. Fouque, J.-P., G. Papanicolaou, R. Sircar, and K. Sølna. 2003. “Multiscale stochastic volatility asymptotics.” SIAM Journal on Multiscale Modeling and Simulation 2(1), 22–42. Fouque, J.-P., R. Sircar, and K. Sølna. 2006. “Stochastic volatility effects on defaultable bonds.” Applied Mathematical Finance 13(3), 215–244. Fridman, M. and L. Harris. 1998. “A maximum likelihood approach for non-Gaussian stochastic volatility models.” Journal of Business and Economic Statistics 16, 284–291. Han, C.-H. and Y. Lai. in press. “Generalized control variate methods for pricing asian options.” Journal of Computational Finance. Han, C.-H. submitted. “Estimating joint default probability by efficient importance sampling with applications from bottom up.” Harvey, A., E. Ruiz, and N. Shephard. 1994. “Multivariate stochastic variance models.” Review of Economic Studies 61, 247–264. ter Horst, E. 2003. A Levy generalization of compound Poisson processes in finance: theory and applications, PhD thesis, Institute of Statistics and Decision Sciences. Jacquier, E., N. G. Polson, and P. E. Rossi. 1994. “Bayesian analysis of stochastic volatility models.” Journal of Business and Economic Statistics 12, 413–417. Jacquier, E., N. G. Polson, and P. E. Rossi. 1998. Stochastic volatility: univariate and multivariate extensions, CIRANO working paper.
Kim, S., N. Shephard, and S. Chib. 1998. “Stochastic volatility: likelihood inference and comparison with ARCH models.” The Review of Economic Studies 65, 361–393. LeBaron, B. 2001. “Stochastic volatility as a simple generator of apparent financial power laws and long memory.” Quantitative Finance 1(6), 621–631. Lopes, H. 1999. PhD thesis, Institute of Statistics and Decision Sciences. Melino, A. and S. Turnbull. 1990. “Pricing foreign currency options with stochastic volatility.” Journal of Econometrics 45, 239–265. Nelson, D. 1991. “Conditional heteroskedasticity in asset returns: a new approach.” Econometrica 59(2), 347–370. Papageorgiou, E. and R. Sircar. submitted. “Multiscale intensity models and name grouping for valuation of multi-name credit derivatives.” Robert, C. and G. Casella. 1999. Monte Carlo statistical methods, Springer Texts in Statistics. West, M. and J. Harrison. 1999. Bayesian forecasting and dynamic models, Springer, Berlin.
Appendix 71A Proof of Independent Factor Equivalence

In the case where factors are positively correlated, the inferential problem simplifies to one of independent volatility processes.

Proof for the 2-factor case. Let y_t ~ N_1(y_t | 0, e^{1′h_t + μ}) be a univariate time series driven by two volatility factors represented in the vector h_t. Define an i.i.d. sequence of stochastic (degenerate) vectors {ξ_t ~ N_2(ξ_t | 0, R)}_{t=1}^T, where R is a positive semidefinite 2 × 2 covariance matrix such that 0 < r_{11} = r_{22} = −r_{12}. We can see that 1′ξ_t ≡ 0 for all t. Then we can define the "equivalent" uncorrelated volatility vector h_t* = h_t − ξ_t with conditional diagonal covariance matrix D. The observation equation remains unchanged, namely

\[
y_t \sim N_1\big(y_t \mid 0,\ e^{\mathbf{1}'h_t + \mu}\big) = N_1\big(y_t \mid 0,\ e^{\mathbf{1}'(h_t^* + \xi_t) + \mu}\big) = N_1\big(y_t \mid 0,\ e^{\mathbf{1}'h_t^* + \mu}\big), \tag{71.8}
\]

while we can see in (71.9) that the evolution equations are now independent:

\[
\begin{aligned}
h_t &\sim N_2(h_t \mid \Phi h_{t-1},\ \Sigma_v) \\
h_t^* + \xi_t &\sim N_2\big(h_t^* + \xi_t \mid \Phi(h_{t-1}^* + \xi_{t-1}),\ \Sigma_v\big) \\
h_t^* &\sim N_2\big(h_t^* \mid \Phi h_{t-1}^*,\ \underbrace{R + \Phi R \Phi + \Sigma_v}_{=\,D}\big).
\end{aligned}
\tag{71.9}
\]

If Σ_{12} > 0 and φ_1, φ_2 ≠ 0, there exists R such that R + ΦRΦ + Σ_v = D for some diagonal positive definite matrix D with elements d_i². Matching elements gives us that r_{11} = r_{22} = Σ_{12}/(1 + φ_1φ_2) = −r_{12}. The variances of these independent volatility processes will be one-to-one versions of the original variances, since d_i² = Σ_{12}(1 + φ_i²)/(1 + φ_1φ_2) + Σ_{ii}. Therefore any two positively correlated volatility processes are equivalent to two independent processes, making a parameter like Σ_{12} nonidentifiable. The implied invariant (marginal) distribution for the initial state would just be a normal centered at zero and with diagonal covariance matrix Ω = ΦΩΦ + D.

Proof for the K-factor case. The proof can be extended to the general case. If our univariate time series were driven by an arbitrary number K of positively correlated volatility factors, represented in the vector h_t, we can define a sequence of stochastic vectors {ξ_t ~ N_K(ξ_t | 0, R)}_{t=1}^T that comply with the diagonality restriction R + ΦRΦ + Σ_v = D and with the equivalence restriction 1′R1 = 0. Solving r_{ij} = −Σ_{ij}/(1 + φ_iφ_j) for i ≠ j guarantees the diagonality of the transformed factors. One possible solution that guarantees the uniqueness of the equivalence condition can be found by imposing an auxiliary restriction, namely 1′R = 0′, which turns out to satisfy r_{ii} = −Σ_{j≠i} r_{ij}.
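The matching argument is easy to verify numerically. The following sketch (ours, in Python/NumPy) builds R from r_ij = −Σ_ij/(1 + φ_iφ_j) with the auxiliary restriction 1′R = 0′, and checks that R + ΦRΦ + Σ_v is diagonal; the parameter values are arbitrary illustrative choices.

```python
import numpy as np

phi = np.array([0.98, 0.60])                    # diagonal of Phi
Sigma = np.array([[0.04, 0.02],                 # Sigma_v with Sigma_12 > 0
                  [0.02, 0.64]])

K = len(phi)
R = np.zeros((K, K))
for i in range(K):
    for j in range(K):
        if i != j:
            R[i, j] = -Sigma[i, j] / (1 + phi[i] * phi[j])
for i in range(K):
    R[i, i] = -R[i].sum()                       # auxiliary restriction 1'R = 0

Phi = np.diag(phi)
D = R + Phi @ R @ Phi + Sigma
print(np.round(D, 12))                          # off-diagonal entries vanish
print(R.sum())                                  # 1'R1 = 0
```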
Appendix 71B Full Conditionals

Sampling h. After approximating the likelihood through a mixture of normals, and conditionally on the mixture components, we can (jointly) sample the vector of volatilities, including the initial state, using its multivariate DLM structure by running the Forward Filtering/Backward Smoothing algorithm as detailed in West and Harrison (1999).

Sampling δ. Each of the T indicator vectors, δ_t, comes from a seven-component multinomial distribution with one outcome and with probabilities proportional to

\[
q_i^* = q_i\, f_N\big(\log(y_t^2) \mid m_i + \mu + \mathbf{1}'h_t - 1.2704,\ v_i\big),
\]

where m_i, v_i, q_i, i = 1, …, 7 are the prior means, variances and weights of the mixture distributions and f_N denotes the normal density function. Therefore δ_t ~ MN(δ_t | 1; q_1*, …, q_7*).

Sampling μ. Under a normal prior π(μ) = N(m_0, v_0), the full conditional for μ is also normal, [μ | …] ~ N_1(μ | m_0*, v_0*), where

\[
m_0^* = v_0^*\Big(\frac{m_0}{v_0} + \sum_{t=1}^{T} \big(\log(y_t^2) - \mathbf{1}'h_t - a_t + 1.2704\big)/b_t\Big), \qquad
v_0^* = \Big(v_0^{-1} + \sum_{t=1}^{T} b_t^{-1}\Big)^{-1},
\]

and a_t, b_t are the mixture component means and variances, a_t = Σ_{i=1}^7 δ_{it} m_i and b_t = Σ_{i=1}^7 δ_{it} v_i.

Sampling Φ. Imposing stationarity and ordering a priori can be done with a uniform prior on the domain 1 > φ_1 > φ_2 > ⋯ > φ_K > −1. Then we can sample φ_k* ~ N_1(φ_k* | m_k, v_k), where

\[
m_k = v_k\,\sigma_k^{-2} \sum_{t=2}^{T} h_{k,t}\, h_{k,t-1}, \qquad v_k = \sigma_k^2 \Big(\sum_{t=1}^{T-1} h_{k,t}^2\Big)^{-1},
\]

until the proposed set of values follows the ordering conditions. Then, if we define Ω* = Φ*Ω*Φ* + Σ_v, we will set Φ = Φ* with probability equal to

\[
\min\Big(1,\ \exp\big\{\log[f_N(h_0 \mid 0, \Omega^*)] - \log[f_N(h_0 \mid 0, \Omega)]\big\}\Big).
\]

Sampling Σ_v. We can update the components of Σ_v either all at once or one at a time. Under a prior of the form π(Σ_v) ∝ ∏_{k=1}^K IG(σ_k² | a_k, b_k) (a_k, b_k can denote previous information), we can sample σ_k² ~ IG(σ_k² | α_k/2, β_k/2), where α_k = T + 1 + 2a_k and

\[
\beta_k = \sum_{t=1}^{T} \big[h_{k,t} - \phi_k h_{k,t-1}\big]^2 + h_{k,0}^2\,(1 - \phi_k^2) + 2 b_k.
\]
Chapter 72
Regime Shifts and the Term Structure of Interest Rates Chien-Chung Nieh, Shu Wu, and Yong Zeng
Abstract Since the seminal paper of Duffie and Kan (1996), most empirical research on the term structure of interest rates has focused on a class of linear models, generally referred to as "affine term structure models." Since these models produce a closed-form solution for the entire yield curve, they are very tractable in empirical applications. Nonlinearity can be introduced into dynamic models of the term structure of interest rates either by generalizing the affine specification to a quadratic form or by including a Poisson jump component as an additional state variable. In this paper, we survey some recent studies of dynamic models of the term structure of interest rates that incorporate Markov regime shifts. We not only summarize an early literature of regime-switching models that mainly focuses on the short-term interest rate, but also recent studies considering regime-switching models in discrete time and continuous time, respectively.
Keywords Term structure · Interest rates · Regime shifts
72.1 Introduction

Since the seminal paper of Duffie and Kan (1996), most empirical research on the term structure of interest rates has focused on a class of linear models, generally referred to as "affine term structure models." In this class of models,
C.-C. Nieh () Department of Banking and Finance, Tamkang University, Taipei, Taiwan, ROC e-mail: [email protected] S. Wu Department of Economics, The University of Kansas, Lawrence, KS, USA Y. Zeng Department of Mathematics and Statistics, The University of Missouri at Kansas City, Kansas City, MO, USA
the yield on a τ-period bond (free of default risk), i_{τ,t}, can be obtained as a linear function of some underlying state variables, X_t:

\[
i_{\tau,t} = A_\tau + B_\tau' X_t,
\]

where the coefficients A_τ and B_τ are determined by a system of differential equations. Since these models produce such a closed-form solution for the entire yield curve, they become very tractable in empirical applications. Piazzesi (2003) and Dai and Singleton (2003) provide excellent surveys of affine term structure models. Nonlinearity can be introduced into dynamic models of the term structure of interest rates either by generalizing the affine specification to a quadratic form along the lines of Ahn et al. (2002), or by including a Poisson jump component as an additional state variable as in Ahn and Thompson (1988), Das (2002) or Piazzesi (2005), among others. Another approach is to incorporate Markov regime shifts into an otherwise standard affine model. Bansal and Zhou (2002), Wu and Zeng (2005), and Dai et al. (2007) are some recent examples. These nonlinear models not only are more general than models without regime shifts, but also retain much of the tractability of the standard affine models. Moreover, there are natural economic interpretations of regime shifts. For example, much documented empirical evidence shows that the aggregate economy experiences recurrent shifts between distinct regimes of the business cycle. Such regime shifts ought to be reflected in the dynamics of asset prices, and bond yields in particular. Another motivation for the regime-switching models is the impact of monetary policy on interest rates. Most central banks in the world now use some short-term interest rate as their policy instrument. A notable feature of monetary policy behavior is that changes of the same direction in the policy rate's target are usually very persistent. For example, in the U.S. the Fed decreased its interest rate target 12 times consecutively between January 2001 and November 2002, and since June 2003 there have been 11 interest rate hikes by the Fed without a single decrease. Presumably such shifts in the overall monetary policy stance (from accommodative to tightening or vice versa) have more important effects on interest rates than a single
interest rate change does. A model with regime shifts is the most convenient tool to capture such policy behavior. In this chapter we survey some recent studies of dynamic models of the term structure of interest rates that incorporate Markov regime shifts. In Sect. 72.2 we summarize an early literature of regime-switching models that mainly focuses on the short-term interest rate. The success of these models has since motivated fully fledged dynamic asset pricing models for the whole yield curve under regime-switching. The next two sections attempt to summarize these recent studies. Section 72.3 considers regime-switching models in a discrete-time framework, while Sect. 72.4 contains continuous-time models. Section 72.5 provides some concluding remarks.
72.2 Regime-Switching and the Short-Term Interest Rate

The misspecification of existing single-regime models of the short-term interest rate has been widely discussed. One potential explanation for this misspecification is that the structural form of conditional means and variances is held fixed over the sample period. Ultimately, all such single-regime models, assuming that the short rate is mean-reverting, involve the estimation of a set of parameters that are assumed to be fixed throughout the entire sample period. However, the literature discussed below favors a regime-switching model of the short-term interest rate, since it is more flexible and constitutes an attractive line in describing the "stylized facts" of the short-term interest rate process. The regime-switching model is more attractive owing to its ability to incorporate significant nonlinearity, in contrast to the traditional linear property of the speed of reversion and long-run mean inherent in most single-regime models. The regime-switching model is flexible enough to incorporate a different speed of reversion to a different long-run mean at different times. The parameters of the regime-switching model, differing across regimes, can account for the possibility that the short rate data generating process (DGP) may undergo a finite number of changes over the sample period, which can capture the stochastic behavior of time-varying short-term interest rates. Since the regimes are never observed and the parameters are unknown and have to be estimated, probabilistic statements can be made about the relative likelihood of their occurrence, conditional on an information set. Periods of unprecedented interest rate volatility often coincide with changes in the business cycle caused by various economic or noneconomic shocks. In general, changes in monetary or fiscal policies result in business cycle fluctuations, which may cause interest rates to behave quite differently in different time periods. The real world has also experienced various shocks in the economic environment within the
past few decades. Examples include the 1973 OAPEC oil crisis, the October 1987 stock market crash, the 1997 Asian financial crisis, the 2000 dot com crash, the 2001 September 11 attacks, the 2007 subprime mortgage crisis, and so on.1 Since the stochastic behavior of short-term interest rates is well-described by regime-switching models, the earlier literature on regime-switching models applied to the interest rate process mainly focuses on the short-term interest rate. The research on these earlier regime-switching models of the short-term interest rate follows the classical paper by Hamilton (1988). In this strand of literature, the models are mainly about the short-term interest rate alone, and long-term rates are usually related to the short-term rate via the expectation hypothesis. Examples include Lewis (1991), Cecchetti et al. (1993), Evans and Lewis (1995), Sola and Driffill (1994), Garcia and Perron (1996), Gray (1996), Bekaert et al. (2001), and Ang and Bekaert (2002), among others. Most of the past literature has estimated two-state regime-switching models, with the exception of Garcia and Perron (1996) and Bekaert et al. (2001), which focused mainly on three-state models.
72.2.1 Short-Term Interest Rate Models

Short-term interest rate models should capture two well-known properties of the short rate process: mean-reversion and a leptokurtic unconditional distribution. The two most common classes of short rate models are known as diffusion models and GARCH (generalized autoregressive conditional heteroskedasticity) models.2 We review the diffusion model first in this paper and construct a regime-switching model later, which follows the framework of the diffusion model. In continuous time or diffusion models, the short-term interest rate is usually described as driven by Brownian motion. The dynamics of the short rate are thus expressed by a stochastic differential equation of the following form:

\[
dr = (\alpha + \beta r)\,dt + \sigma r^{\gamma}\,dW, \tag{72.1}
\]

where dW is the increment of a standard Brownian motion. This stochastic differential equation is usually transformed into an autoregressive (AR) model for estimation purposes:3

\[
\Delta r_t = \alpha + \beta r_{t-1} + \epsilon_t, \tag{72.2}
\]

1 OAPEC is an abbreviation for Organization of Arab Petroleum Exporting Countries, consisting of the Arab members of OPEC plus Egypt and Syria.
2 Engle (1982) shows that a possible cause of the leptokurtosis in the unconditional distribution is conditional heteroskedasticity.
3 See Chan et al. (1992) and Gray (1996) for more details.
where E[ε_t | Φ_{t−1}] = 0 and E[ε_t² | Φ_{t−1}] = σ² r_{t−1}^{2γ}, and Φ_{t−1} is the agents' information set at time t − 1. From this equation, mean reversion and leptokurtosis can be captured by setting β < 0 and γ > 0 (conditional heteroskedasticity), respectively.
72.2.2 Regime-Switching

Regime-switching can be viewed as state changes in a finite Markov chain. For the general model considered in Gray (1996), we specify a special case in which the short rate follows a regime-switching setting with regime-switching mean and variance:

\[
\Delta r_t = \alpha_{s_t} + \beta_{s_t} r_{t-1} + \epsilon_t, \qquad \epsilon_t \sim N\big(0,\ \sigma_{s_t}^2\big), \tag{72.3}
\]

where s_t is the unobserved state variable, presumed to follow a two-state Markov chain with transition probabilities p_{ij}. β_{s_t} is the parameter measuring the speed of reversion to the long-run mean in state s_t. The error term ε_t follows a normal distribution with zero mean and a standard deviation of σ_{s_t} in state s_t. Equation (72.3) is estimated in a regime-switching framework by quasimaximum likelihood as described in Hamilton (1989). The testable scheme is expressed as follows:

\[
(\alpha_{s_t}, \beta_{s_t}, \sigma_{s_t}) =
\begin{cases}
(\alpha_1, \beta_1, \sigma_1) & \text{if } s_t = 1 \\
(\alpha_2, \beta_2, \sigma_2) & \text{if } s_t = 2.
\end{cases}
\]

The evolution of the unobservable state variable is assumed to follow a two-state first-order Markov chain process satisfying p_{11} + p_{12} = p_{21} + p_{22} = 1, where p_{ij} = Pr(s_t = j | s_{t−1} = i) gives the probability that state i is followed by state j.4 The state at each time point determines which of the two normal densities is used to generate the model. In our case, the short-term interest rate is assumed to switch between two regimes (different long-run means and speeds of reversion) according to the transition probabilities.
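To illustrate the data-generating process in Equation (72.3), here is a minimal Python sketch (ours; the parameter values are arbitrary illustrative choices):

```python
import numpy as np

def simulate_ms_short_rate(T, alpha, beta, sigma, P, r0=0.05, seed=0):
    """Simulate the two-state Markov-switching short-rate model (72.3)."""
    rng = np.random.default_rng(seed)
    r = np.zeros(T + 1); r[0] = r0
    s = np.zeros(T, dtype=int)
    # first state from the ergodic distribution (see Sect. 72.2.2.2)
    pi1 = (1 - P[1, 1]) / ((1 - P[0, 0]) + (1 - P[1, 1]))
    s[0] = rng.choice(2, p=[pi1, 1 - pi1])
    for t in range(T):
        if t > 0:
            s[t] = rng.choice(2, p=P[s[t - 1]])  # Markov transition
        dr = alpha[s[t]] + beta[s[t]] * r[t] + sigma[s[t]] * rng.normal()
        r[t + 1] = r[t] + dr                     # Delta r_t = alpha + beta*r + eps
    return r, s

P = np.array([[0.95, 0.05],                      # rows sum to one
              [0.10, 0.90]])
r, s = simulate_ms_short_rate(1000, alpha=[0.001, 0.004],
                              beta=[-0.02, -0.08], sigma=[0.002, 0.010], P=P)
```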
72.2.2.1 Quasimaximum Likelihood Estimation of Parameters

There are various ways to estimate the regime-switching model.5 The estimation of the MS setting of Equation (72.3) mainly follows Garcia and Perron (1996), which employs Hamilton's (1989) Markov-switching estimation by quasimaximum likelihood.6 Let y_t = R_t, x_t = (1, V_t)′ and δ_{s_t} = (α_{s_t}, β_{s_t})′. Equation (72.3) can be expressed as7:

\[
y_t = x_t' \delta_{s_t} + \epsilon_t, \qquad \epsilon_t \sim N\big(0,\ \sigma_{s_t}^2\big).
\]

This regime-switching model assumes that the variance is also shifting between regimes. s_t is the unobserved state variable presumed to follow a two-state Markov chain with transition probabilities p_{ij}. As usual, we use capital letters X_t and Y_t to represent all the information available up to time t and θ to denote the vector of the unknown population parameters, i.e., X_t = (x_1, x_2, …, x_t)′, Y_t = (y_1, y_2, …, y_t)′ and θ = (θ_1′, θ_2′)′, where X_t is exogenous or predetermined and, conditional on s_{t−1}, s_t is independent of X_t. The grouping parameter vector can be decomposed as θ_1 = (α_1, α_2, β_1, β_2, σ_1, σ_2)′ and θ_2 = (p_{11}, p_{22})′. Denoting X̃_T = (x_1, x_2, …, x_T)′, Ỹ_T = (y_1, y_2, …, y_T)′, and S̃_T = (s_1, s_2, …, s_T)′, the joint density of Ỹ_T and S̃_T is:

\[
f\big(\tilde{Y}_T, \tilde{S}_T; \tilde{X}_T, \theta\big) = f\big(\tilde{Y}_T \mid \tilde{S}_T; \tilde{X}_T, \theta_1\big)\, f\big(\tilde{S}_T; \tilde{X}_T, \theta_2\big) = \prod_{t=1}^{T} f(y_t \mid s_t; x_t, \theta_1)\ \prod_{t=1}^{T} f(s_t \mid s_{t-1}; x_t, \theta_2).
\]

The log likelihood function can thus be expressed as:

\[
\ln f\big(\tilde{Y}_T, \tilde{S}_T; \tilde{X}_T, \theta\big) = \sum_{t=1}^{T} \ln\big[f(y_t \mid s_t; x_t, \theta_1)\big] + \sum_{t=1}^{T} \ln\big[f(s_t \mid s_{t-1}; x_t, \theta_2)\big].
\]

If the state were known, the parameter vector θ_2 would be irrelevant and the log likelihood function would be maximized with respect to θ_1:

\[
\frac{\partial \ln\big[f(\tilde{Y}_T, \tilde{S}_T; \tilde{X}_T, \theta)\big]}{\partial \theta_1} = \sum_{t=1}^{T} \frac{\partial \ln\big[f(y_t \mid s_t; x_t, \theta_1)\big]}{\partial \theta_1}.
\]

Now, let us introduce the estimation procedures in the following way.

4 The Markov property argues that the process of s_t depends on the past realizations only through s_{t−1}.
5 See Kim and Nelson (1999).
6 Garcia and Perron (1996) employ Hamilton's (1989) regime-switching model to explicitly account for regime shifts in an autoregressive model with three-state regime-switching mean and variance.
7 For simplicity, the following analysis is based only on one industry. We thus omit the symbol i.
72.2.2.2 Filter Probability

We assume that θ is already known.^8 Based on Hamilton (1994), the derivation begins with the unconditional probability of the state of the first observation,

p(s_t = 1) = (1 − p_22) / ((1 − p_11) + (1 − p_22)) = π,   p(s_t = 2) = 1 − π

Given Y_{t-1}, the joint probability of s_{t-1} and s_t is:

p(s_t, s_{t-1} | Y_{t-1}; X_t) = p(s_t | s_{t-1}, Y_{t-1}; X_{t-1}) · p(s_{t-1} | Y_{t-1}; X_{t-1}) = p(s_t | s_{t-1}) · p(s_{t-1} | Y_{t-1}; X_{t-1})   (72.4)

where the first equality is given by Bayes' theorem and the second by the independence property of the Markov chain. Since the transition probability p(s_t | s_{t-1}) and the filter probability at time t−1, p(s_{t-1} | Y_{t-1}; X_{t-1}), are both known at time t, it is not difficult to calculate p(s_t, s_{t-1} | Y_{t-1}; X_t) in Equation (72.4). Summing over s_{t-1} gives the conditional marginal distribution of s_t:

p(s_t | Y_{t-1}; X_t) = Σ_{s_{t-1}=1}^{2} p(s_t, s_{t-1} | Y_{t-1}; X_t)   (72.5)

The joint probability of y_t and s_t at time t is then calculated as:

p(y_t, s_t | Y_{t-1}; X_t) = f(y_t | s_t, Y_{t-1}; X_{t-1}) · p(s_t | Y_{t-1}; X_{t-1})   (72.6)

The first term on the right-hand side is the sample likelihood function, and the second term is from Equation (72.5), so Equation (72.6) can be calculated. Therefore, the filter inference about the probable regime at time t is given by:

P(s_t | Y_t; X_t) = p(y_t, s_t | Y_{t-1}; X_t) / p(y_t | Y_{t-1}; X_t) = f(y_t | s_t, Y_{t-1}; X_t) p(s_t | Y_{t-1}; X_t) / Σ_{s_t=1}^{2} f(y_t | s_t, Y_{t-1}; X_t) p(s_t | Y_{t-1}; X_t)

^8 The following derivations are mainly from Shen (1994). f(·) and p(·) denote continuous and discrete density functions, respectively.
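The filter recursion in Equations (72.4)–(72.6) is straightforward to implement. In the sketch below, dens[t, j] is assumed to hold the sample likelihood f(y_t | s_t = j, Y_{t-1}; X_t), e.g., a normal density with regime-dependent mean x_t′δ_j and variance σ_j²; the function name and array layout are ours, not the chapter's.

```python
# A sketch of the Hamilton filter for the two-state model above.
import numpy as np

def hamilton_filter(dens, p11, p22):
    """Return the filter probabilities P(s_t = j | Y_t) for a 2-state chain."""
    T = dens.shape[0]
    P = np.array([[p11, 1 - p11], [1 - p22, p22]])  # P[i, j] = Pr(s_t=j | s_{t-1}=i)
    filt = np.zeros((T, 2))
    pred = np.array([1 - p22, 1 - p11])
    pred = pred / pred.sum()           # unconditional state probabilities (pi, 1-pi)
    for t in range(T):
        joint = dens[t] * pred         # Equation (72.6): f(y_t, s_t | Y_{t-1})
        filt[t] = joint / joint.sum()  # filter inference about the regime at t
        pred = filt[t] @ P             # Equations (72.4)-(72.5): p(s_{t+1} | Y_t)
    return filt
```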
72.2.2.3 Smoothed Probability

The derivation of the filter probability utilizes the information up to time t. Alternatively, we can use the full sample of available information to draw the inference; this is more efficient in the sense that all the information up to time T is utilized instead of only that up to time t. Similarly, the smoothed probability of the first observation has to be derived. Consider the joint probability of y_t, s_t and s_1:

p(y_t, s_t, s_1 | Y_{t-1}; X_t) = f(y_t | s_t, s_1, Y_{t-1}; X_t) · p(s_t, s_1 | Y_{t-1}; X_{t-1}) = f(y_t | s_t, s_1, Y_{t-1}; X_t) · p(s_t | s_1) · p(s_1 | Y_{t-1}; X_{t-1})   (72.7)

where the first equality is given by Bayes' theorem and the second by Bayes' theorem together with the independence property of the Markov chain. The first term on the right-hand side of Equation (72.7) is the sample likelihood function. The second term can be derived by:

p(s_t | s_1) = Σ_{s_{t-1}=1}^{2} p(s_t, s_{t-1} | s_1) = Σ_{s_{t-1}=1}^{2} Σ_{s_{t-2}=1}^{2} p(s_t | s_{t-1}) p(s_{t-1} | s_{t-2}) p(s_{t-2} | s_1) = ... = Σ_{s_{t-1}=1}^{2} Σ_{s_{t-2}=1}^{2} ... Σ_{s_2=1}^{2} p(s_t | s_{t-1}) p(s_{t-1} | s_{t-2}) ... p(s_2 | s_1)

and the third term is the filter probability at time t−1. These terms are all known at time t and can be used to calculate Equation (72.7). The joint probability of s_t and s_1 is thus given by^9:

P(s_t, s_1 | Y_t; X_t) = p(y_t, s_t, s_1 | Y_{t-1}; X_t) / p(y_t | Y_{t-1}; X_t)

The conditional marginal probability of s_1 is then given by summing over s_t:

p(s_1 | Y_t; X_t) = Σ_{s_t=1}^{2} p(s_t, s_1 | Y_t; X_t)

and this is the smoothed probability of the first observation at time t. Similarly, the smoothed probability at time t+1 can be obtained from:

P(s_{t+1}, s_1 | Y_{t+1}; X_{t+1}) = p(y_{t+1}, s_{t+1}, s_1 | Y_t; X_{t+1}) / p(y_{t+1} | Y_t; X_{t+1})

where

p(y_{t+1}, s_{t+1}, s_1 | Y_t; X_{t+1}) = f(y_{t+1} | s_{t+1}, s_1, Y_t; X_{t+1}) · p(s_{t+1}, s_1 | Y_t; X_{t+1}) = f(y_{t+1} | s_{t+1}, s_1, Y_t; X_{t+1}) · p(s_{t+1} | s_1) · p(s_1 | Y_t; X_{t+1})

and

p(s_{t+1} | s_1) = Σ_{s_t=1}^{2} Σ_{s_{t-1}=1}^{2} ... Σ_{s_2=1}^{2} p(s_{t+1} | s_t) p(s_t | s_{t-1}) ... p(s_2 | s_1)

Summing over s_{t+1} then yields the smoothed probability of the first observation at time t+1:

p(s_1 | Y_{t+1}; X_{t+1}) = Σ_{s_{t+1}=1}^{2} p(s_{t+1}, s_1 | Y_{t+1}; X_{t+1})

By repeating the above steps, we are able to get the smoothed probability of the first observation at time T:

p(s_1 | Y_T; X_T) = Σ_{s_T=1}^{2} p(s_T, s_1 | Y_T; X_T)

Similarly, the smoothed probability of the t-th observation at time T is given by:

p(s_t | Y_T; X_T) = Σ_{s_T=1}^{2} p(s_T, s_t | Y_T; X_T),   t = 1, 2, ..., T

72.2.2.4 Estimation

According to the smoothed probability derived in the previous section, we can say that the observations were generated from the first state with probability p(s_t = 1 | Y_T; X_T) and from the second state with probability p(s_t = 2 | Y_T; X_T). Hamilton (1994) shows that the relevant conditions for the maximum likelihood estimates of the φ_{s_t} are:

Σ_{t=1}^{T} (y_t − x_t′ φ̂_j) x_t p(s_t = j | Y_T; X_T) = 0,   j = 1, 2   (72.8)

where

σ̂² = (1/T) Σ_{t=1}^{T} Σ_{j=1}^{2} (y_t − x_t′ φ̂_j)² p(s_t = j | Y_T; X_T)   (72.9)

Equation (72.8) implies that φ̂_j satisfies a weighted OLS orthogonality condition where each observation is weighted by the probability that it came from regime j. In particular, φ̂_j can be found from an OLS regression of y_t*(j) on x_t*(j):

φ̂_j = [ Σ_{t=1}^{T} x_t*(j) x_t*(j)′ ]^{-1} [ Σ_{t=1}^{T} x_t*(j) y_t*(j) ],   j = 1, 2

where

y_t*(j) = y_t √p(s_t = j | Y_T; X_T),   x_t*(j) = x_t √p(s_t = j | Y_T; X_T)

j denotes the present state, and the asterisk "*" is used to distinguish the weighted observations from the original observations. The estimate of σ² in Equation (72.9) is just the combined sum of the squared residuals from these two regressions divided by T. Hamilton (1994) also shows the maximum likelihood estimates for the transition probabilities:

p̂_ij = Σ_{t=2}^{T} p(s_t = j, s_{t-1} = i | Y_T; X_T) / Σ_{t=2}^{T} p(s_{t-1} = i | Y_T; X_T)

which is essentially the number of times state i is followed by state j divided by the number of times the process was in state i.

^9 The numerator is from Equation (72.7) and the denominator can be derived by summing Equation (72.7) over s_t and s_1.
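The weighted OLS condition (72.8), the variance estimate (72.9), and the transition-probability formula above suggest the following estimation pass. The sketch assumes the smoothed probabilities sp[t, j] = p(s_t = j | Y_T; X_T) and the pairwise smoothed probabilities spp[t, i, j] = p(s_{t-1} = i, s_t = j | Y_T; X_T) have already been computed by a backward pass (not shown); names and layout are illustrative.

```python
# One estimation pass implied by Equations (72.8)-(72.9) and the p_ij formula.
import numpy as np

def m_step(y, X, sp, spp):
    T, _ = X.shape
    phi, rss = [], 0.0
    for j in range(2):
        w = np.sqrt(sp[:, j])                  # weight each observation by sqrt(prob)
        Xw, yw = X * w[:, None], y * w         # x_t*(j) and y_t*(j)
        phi_j = np.linalg.lstsq(Xw, yw, rcond=None)[0]
        phi.append(phi_j)
        rss += np.sum((yw - Xw @ phi_j) ** 2)  # weighted squared residuals
    sigma2 = rss / T                           # Equation (72.9)
    # p_ij: expected transitions from i to j over expected time spent in i
    p = spp[1:].sum(axis=0) / sp[:-1].sum(axis=0)[:, None]
    return phi, sigma2, p
```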
72.3 Regime-Switching Term Structure Models in Discrete Time

Instead of focusing on a single interest rate, dynamic models of the term structure of interest rates attempt to model the joint movements of interest rates across the whole spectrum of maturities. The standard approach begins by postulating a set of state variables, denoted as X_t, that underlie the dynamics of interest rates. Under the no-arbitrage condition, a positive stochastic discount factor, denoted as M_t, exists and determines all bond prices; we can, therefore, make further parametric assumptions about M_t, and about the market price of risk in particular. Affine models of the term structure of interest rates are obtained by assuming that both the short-term interest rate and the market price of risk are linear functions of X_t.^10 The main advantage of this class of models is their tractability: the solution for the term structure of interest rates can be obtained as a linear function of the state variable X_t, which makes the models easy to implement empirically. In this section, we generalize the standard affine models by incorporating Markov regime shifts in the dynamics of X_t and M_t. We will see that regime shifts not only make an otherwise standard model more flexible, they also introduce a new regime-switching risk premium in addition to the risk premiums due to shocks to the state variable X_t.

72.3.1 State Variables

Assume that there are K possible regimes, and let S_t denote the regime at time t. S_t follows a Markov-switching process with transition probability π(S_t, S_{t+1}). For example, if S_t = s and S_{t+1} = s′, then π(S_t, S_{t+1}) gives the probability of switching from regime s at time t to regime s′ at time t+1. Let X_t be an N×1 vector that contains all other state variables. We assume that X_t follows a stationary mean-reverting process conditional on each regime; that is,

X_t = μ(S_t) + Φ(S_t) X_{t-1} + Σ(X_{t-1}, S_t) ε_t   (72.10)

where ε_t is an N×1 standard normal random vector, μ(S_t) and Φ(S_t) are a regime-dependent N×1 vector and N×N matrix, respectively, and Σ(X_{t-1}, S_t) is an N×N diagonal matrix given by^11

Σ(X_{t-1}, S_t) = diag( √(σ_{0,1}(S_t) + σ_{1,1}(S_t)′ X_{t-1}), ..., √(σ_{0,N}(S_t) + σ_{1,N}(S_t)′ X_{t-1}) )

Again, the presence of S_t indicates that the coefficients in Σ are regime-dependent. We can collect all the σ_{0,i}(S_t) (i = 1, ..., N) in an N×N diagonal matrix Σ_0(S_t), and collect all σ_{1,i}(S_t) in an N×N matrix Σ_1(S_t), with the i-th column being σ_{1,i}. In some models, such as Dai et al. (2007), the regime-dependence of the parameters is specified slightly differently: parameters such as μ and Φ depend on the regime at time t−1, S_{t-1}, instead of the regime at time t. In other models, such as that of Bansal and Zhou (2002), the parameters are assumed to depend on the regime at time t, S_t, as in the present chapter.
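A short simulation sketch of the state dynamics (72.10) for N = 2 factors and K = 2 regimes follows; all parameter values are illustrative, and σ_0/σ_1 are chosen so the term inside the square root stays non-negative (see footnote 11).

```python
# Simulation sketch of the regime-switching state dynamics in Equation (72.10).
import numpy as np

rng = np.random.default_rng(1)
T, N = 300, 2
pi = np.array([[0.95, 0.05], [0.10, 0.90]])    # pi(s, s') transition matrix
mu = [np.array([0.01, 0.02]), np.array([0.05, 0.00])]
Phi = [0.95 * np.eye(N), 0.80 * np.eye(N)]
sigma0 = [np.array([0.01, 0.01]), np.array([0.04, 0.02])]
sigma1 = [0.1 * np.eye(N), 0.2 * np.eye(N)]    # rows are sigma_{1,i}(s)'

S = np.zeros(T, dtype=int)
X = np.zeros((T, N))
for t in range(1, T):
    S[t] = rng.choice(2, p=pi[S[t - 1]])
    s = S[t]
    # diagonal volatility matrix acts element-wise on the shocks
    vol = np.sqrt(np.maximum(sigma0[s] + sigma1[s] @ X[t - 1], 0.0))
    X[t] = mu[s] + Phi[s] @ X[t - 1] + vol * rng.normal(size=N)
```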
72.3.2 The Stochastic Discount Factor

Under fairly general conditions (see, for example, Harrison and Kreps 1979), the absence of arbitrage in financial markets implies that there exists a positive stochastic discount factor, M_t, such that the price of an asset at time t, P_t, is given by

P_t = E_t(M_{t+1} X_{t+1})

where X_{t+1} is the random payoff of this asset at time t+1. In the case of bonds, if we let P_{n,t} denote the price of an n-period zero-coupon bond at time t, we have

P_{n,t} = E_t(M_{t+1} P_{n-1,t+1})
^10 In the case of stochastic volatility, affine models assume that the product of the market price of risk and the volatility term is a linear function of the state variable X_t. In other words, it is assumed that, under the risk-neutral probability measure, X_t follows a linear mean-reverting process.
^11 Of course, some regularity conditions need to be imposed on the parameters so that the term inside the square root is non-negative and the process is well defined. See Dai et al. (2006) for more details.
The model is completed by making specific assumptions about the stochastic discount factor, M_t, and the short-term interest rate. In particular, let

M_{t+1} = e^{m_{t+1}} = e^{−i_t − (1/2) λ_t′ λ_t − λ_t′ ε_{t+1}}

where i_t is the (one-period) short-term interest rate, λ_t is the market price of risk, and ε_{t+1} is the vector of fundamental shocks that drive the state variables. Models differ in their assumptions about the market price of risk. For example, in Gaussian affine models, it is assumed that λ_t is a linear function of the state variable X_t. Under regime shifts, we assume that

m_{t+1} = −i_t − (1/2) λ_t′(S_t, S_{t+1}) λ_t(S_t, S_{t+1}) − λ_t′(S_t, S_{t+1}) ε_{t+1}   (72.11)

where λ_t(S_t, S_{t+1}) indicates that the market price of risk not only is a function of X_t, but also depends on the regimes S_t and S_{t+1}. As pointed out by Dai and Singleton (2003), the presence of S_{t+1} has an important implication. If λ_t depends only on S_t, a regime shift at time t+1 will not have an impact on the stochastic discount factor, M_{t+1}, or on investors' intertemporal marginal rate of substitution. In other words, regime-switching does not present a systematic risk to investors and hence will not be priced in the bond market; assets are risky only because of the shock ε_{t+1}. What regime-switching will do, however, is make risk premiums, or the expected excess returns on risky assets, regime-dependent. The risk premiums will be time-varying not only because they depend on X_t, but also because the parameters are regime-dependent. On the other hand, if λ_t depends on S_{t+1} as well, there will be an additional risk premium associated with regime-switching, because in this case regime shifts have a direct impact on M_{t+1} and will be regarded as a systematic risk, just like the shock ε_{t+1}.

To obtain a closed-form solution for the term structure of interest rates, in general we need λ_t to satisfy

Σ_t λ_t(S_t, S_{t+1}) = λ_0(S_t, S_{t+1}) + Λ(S_t, S_{t+1}) X_t   (72.12)

or equivalently

λ_t(S_t, S_{t+1}) = Σ_t^{−1} [λ_0(S_t, S_{t+1}) + Λ(S_t, S_{t+1}) X_t]   (72.13)

where λ_0(S_t, S_{t+1}) is an N×1 vector of regime-dependent (on S_t and S_{t+1}) constants, Λ(S_t, S_{t+1}) is a regime-dependent (on S_t and S_{t+1}) N×N matrix, and Σ_t is the conditional volatility term of the state variable X_{t+1} as defined in Equation (72.10).

The last component of a dynamic model of the term structure of interest rates is a specification of the short-term interest rate, i_t. Here, following the standard affine models, we assume that

i_t = A_1(S_t) + B_1′(S_t) X_t   (72.14)

where A_1(S_t) and B_1(S_t) (an N×1 vector) are regime-dependent parameters.

72.3.3 Solving for the Term Structure of Interest Rates

Again letting P_{n,t} be the price of an n-period zero-coupon bond at time t, we have

P_{n,t} = E_t(e^{m_{t+1}} P_{n-1,t+1})

We guess that the solution for the bond price is, for some regime-dependent coefficients A_n(S_t) and B_n(S_t),

P_{n,t} = e^{−A_n(S_t) − B_n′(S_t) X_t}

Substituting into the asset pricing equation above, we obtain

e^{−A_n(S_t) − B_n′(S_t) X_t} = E_t( e^{−i_t − (1/2) λ_t′ λ_t − λ_t′ ε_{t+1}} · e^{−A_{n-1}(S_{t+1}) − B_{n-1}′(S_{t+1}) X_{t+1}} )   (72.15)

Note that λ_t = λ_t(S_t, S_{t+1}) depends on X_t, S_t and S_{t+1}. Since the regime-switching (from S_t to S_{t+1}) probability is given by π(S_t, S_{t+1}), Equation (72.15) can be written as

e^{−A_n(S_t) − B_n′(S_t) X_t} = Σ_{S_{t+1}} π(S_t, S_{t+1}) E( e^{ξ_{t+1}(S_t, S_{t+1})} | I_t, S_{t+1} )   (72.16)

where I_t is the information set at time t, and ξ_{t+1} is given by

ξ_{t+1}(S_t, S_{t+1}) = −i_t − (1/2) λ_t′ λ_t − λ_t′ ε_{t+1} − A_{n-1}(S_{t+1}) − B_{n-1}′(S_{t+1}) X_{t+1}   (72.17)

First note that

E( e^{ξ_{t+1}(S_t, S_{t+1})} | I_t, S_{t+1} ) = e^{ E(ξ_{t+1} | I_t, S_{t+1}) + (1/2) Var(ξ_{t+1} | I_t, S_{t+1}) }
which is approximately^12

E( e^{ξ_{t+1}(S_t, S_{t+1})} | I_t, S_{t+1} ) ≈ 1 + E(ξ_{t+1} | I_t, S_{t+1}) + (1/2) Var(ξ_{t+1} | I_t, S_{t+1})

Substituting back into Equation (72.16) and using the same approximation for e^{−A_n(S_t) − B_n′(S_t) X_t}, we have

−A_n(S_t) − B_n′(S_t) X_t = Σ_{S_{t+1}} π(S_t, S_{t+1}) [ E(ξ_{t+1} | I_t, S_{t+1}) + (1/2) Var(ξ_{t+1} | I_t, S_{t+1}) ]   (72.18)

where ξ_{t+1} is given in Equation (72.17). Equation (72.18) gives a system of difference equations for A_n(S_t) and B_n(S_t) that can be solved recursively, with the initial conditions A_0(S_t) = 0 and B_0(S_t) = 0 for all S_t, where A_1(S_t) and B_1(S_t) are given in Equation (72.14). For example, if there are K possible regimes at each time t, Equation (72.18) results in the following equations for A_n(s) and B_n(s) (s = 1, 2, ..., K):

B_n(s) = Σ_{s′=1}^{K} π(s, s′) [ (Φ − Λ(s, s′))′ B_{n-1}(s′) − (1/2) Σ_1′ B²_{n-1}(s′) ] + B_1(s)   (72.19)

A_n(s) = Σ_{s′=1}^{K} π(s, s′) [ A_{n-1}(s′) + B_{n-1}′(s′)(μ − Σ_0 λ_0(s, s′)) − (1/2) B_{n-1}′(s′) Σ_0 Σ_0′ B_{n-1}(s′) ]   (72.20)

where B²_{n-1}(s′) denotes the element-wise square of B_{n-1}(s′). Note that Equations (72.19) and (72.20) define a system of 2K difference equations that must be solved jointly.

Denote by i_{n,t} the continuously compounded n-period interest rate, or the yield on the n-period bond. By definition,

P_{n,t} = e^{−n · i_{n,t}}

Therefore, the term structure of interest rates can be obtained as, for any maturity n and time t,

i_{n,t} = a_n(S_t) + b_n′(S_t) X_t = A_n(S_t)/n + (B_n(S_t)/n)′ X_t   (72.21)

In this model we have to rely on the log-linear approximation to get an analytical solution for the term structure of interest rates, as in Bansal and Zhou (2002), because the coefficient Λ in Equation (72.12) and the coefficient B_1 in Equation (72.14) are regime-dependent. Exact solutions can be obtained if we assume that both Λ and B_1 do not depend on regimes. This is essentially the assumption made in Dai et al. (2007); in their models, the factor loading coefficient B_n in the solution for the term structure is also independent of regimes. More general models can be obtained by making the regime-switching probability time-varying; that is, we can assume that the conditional probability π(S_t, S_{t+1}) depends on X_t, as in Dai et al. (2007). However, in this case, we need M_{t+1} to be separable in S_{t+1} and ε_{t+1}, and we must place some additional restrictions on λ_t. This generalization can be more easily discussed in the continuous-time model below.

^12 This is the approximation used in Bansal and Zhou (2002).
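The system (72.19)–(72.20) is solved by simple forward iteration. The sketch below does this for K = 2 regimes and N = 1 factor under the reconstruction above; parameter values are illustrative, Φ, μ, Σ_0, Σ_1, λ_0 and Λ are indexed only by the summed-over regime s′ for brevity, and B² is the element-wise square.

```python
# Iterating the difference equations (72.19)-(72.20); a sketch, not the chapter's code.
import numpy as np

K, N, nmax = 2, 1, 120
pi = np.array([[0.95, 0.05], [0.10, 0.90]])
Phi = np.array([[[0.97]], [[0.90]]])
mu = np.array([[0.002], [0.010]])
Sig0 = np.array([[[0.0004]], [[0.0016]]])
Sig1 = np.array([[[0.02]], [[0.05]]])
lam0 = np.array([[-0.1], [-0.3]])
Lam = np.array([[[0.02]], [[0.05]]])
A1 = np.array([0.0, 0.0])
B1 = np.array([[1.0], [1.0]])              # one-period loading: i_t = A_1(s) + B_1(s)' X_t

A = np.zeros((nmax + 1, K))
B = np.zeros((nmax + 1, K, N))
A[1], B[1] = A1, B1
for n in range(2, nmax + 1):
    for s in range(K):
        accA, accB = 0.0, np.zeros(N)
        for sp in range(K):
            Bp = B[n - 1, sp]
            accB = accB + pi[s, sp] * ((Phi[sp] - Lam[sp]).T @ Bp
                                       - 0.5 * Sig1[sp].T @ Bp ** 2)
            accA = accA + pi[s, sp] * (A[n - 1, sp]
                                       + Bp @ (mu[sp] - Sig0[sp] @ lam0[sp])
                                       - 0.5 * Bp @ Sig0[sp] @ Sig0[sp].T @ Bp)
        B[n, s] = accB + B1[s]
        A[n, s] = accA + A1[s]
# Yields via Equation (72.21), e.g. in regime 1 at X_t = 0.02:
yields = (A[1:, 0] + B[1:, 0, 0] * 0.02) / np.arange(1, nmax + 1)
```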
72.4 Regime-Switching Term Structure Models in Continuous Time

Continuous-time models provide more analytical tractability. The affine term structure model of Duffie and Kan (1996) has been further studied in Dai and Singleton (2000) and generalized to "essentially affine" models in Duffee (2002) and "semi-affine" models in Duarte (2004). Das (2002) and Chacko and Das (2002) incorporated jumps in an otherwise standard affine model. Cheridito et al. (2007) consider more general specifications of the market price of risk. Quadratic term structure models are studied in Ahn et al. (2002). See the survey paper by Dai and Singleton (2003) and the book by Singleton (2006) for more recent developments. This section focuses on continuous-time term structure models with regime shifts.

To the best of our knowledge, Landen (2000) provides the first continuous-time regime-switching model of the term structure of interest rates. Under the risk-neutral pricing measure, she derives the dynamics of the yield curve and obtains an explicit solution for a special regime-switching affine case. Dai and Singleton (2003) propose a fully fledged dynamic model of the term structure of interest rates under regime-switching. Their model characterizes the dynamics of interest rates under both the risk-neutral and the physical measures in the presence of regime-switching risk premiums. Following the approach of Cox et al. (1985a, b) and Ahn and Thompson (1988), Wu and Zeng (2005) developed a regime-switching model from a general equilibrium framework under the systematic risk of regime shifts. Other studies of regime-switching term structure models include Elliott and Mamon (2001) and Wu and Zeng (2004, 2007), among others. Regime-switching models of the term structure of interest rates under default risk can be found in Bielecki and Rutkowski (2000, 2001).
In the rest of this section, we present a slightly more general framework than the one in Wu and Zeng (2006). We first review a useful representation of regime shifts introduced in Wu and Zeng (2007) with a slight generalization. We then follow the no-arbitrage approach to develop a general tractable multifactor dynamic model of the term structure of interest rates that includes not only regime shifts but also jumps. The model allows for regime-dependent jumps while both jump risk and regime-switching risk are priced. A closed-form solution for the term structure is obtained for an affine-type model under log-linear approximations.
72.4.1 A Useful Representation for Regime Shifts

Regime-shifting can be viewed as state changes in a finite-state Markov chain. There are three commonly used mechanisms for modeling a Markov chain with time-varying transition probabilities, and each mechanism has applications in the literature on the interest rate term structure. The first is the conditional Markov chain approach, which is discussed, with many applications, in the book by Yin and Zhang (1998); Bielecki and Rutkowski (2000, 2001) and Dai and Singleton (2003) apply this mechanism to modeling the term structure of interest rates. The second is the Hidden Markov Model (HMM) approach, which is summarized in the book by Elliott et al. (1995); Elliott and Mamon (2001) utilize the HMM approach to derive a model of the term structure of interest rates. The third is the marked point process, or random measure, approach, which is employed in Landen (2000). The mark space in Landen's representation is E = {(i, j) : i ∈ {0, 1, ..., N}, j ∈ {0, 1, ..., N}, i ≠ j}, the product space of regimes including all possible regime switches.

The third mechanism has an important advantage over the previous two: the regime process has a simple integral form, so that Ito's formula can be applied easily. Along the lines of the third mechanism, a simpler integral form for the regime process can be developed using only the space of regimes as the mark space. This representation is introduced in Wu and Zeng (2007), and we discuss it here with a slight generalization.

Let S_t represent the most recent regime. There are two steps to obtain the simple integral representation of S_t. First, we define the random counting measure m(t, ·) for every t > 0. Denote the mark space U = {0, 1, ..., N} as the set of all possible regimes, equipped with its power σ-algebra. Denote by u a generic point in U and by A a subset of U. We denote by m(t, A) the cumulative number of entries into a regime in A during the period (0, t]. For example, m(t, {u}) counts the cumulative number of shifts to regime u during (0, t]. Then m is the suitable random counting measure.

A marked point process, or a random measure, is uniquely characterized by its stochastic intensity kernel.^13 To define the stochastic intensity kernel for m(t, ·), we denote by η the usual counting measure on U. Note that η has these two properties: for A ⊂ U, η(A) = ∫ 1_A(u) η(du) (i.e., η(A) counts the number of elements in A), and ∫_A f(u) η(du) = Σ_{u∈A} f(u). The stochastic intensity kernel can depend on the current time and regime; we also allow m(t, ·) to depend on X_t, another state variable to be defined later. We define the stochastic intensity kernel as
γ_m(dt, du) = h(u; X(t), S(t), t) η(du) dt   (72.22)

where h(u; X(t), S(t), t) is the conditional regime-shift (from regime S(t) to regime u) intensity at time t, given X(t) and S(t). Note that h(u; X(t), S(t), t) corresponds to the element h_ij in the infinitesimal generator matrix of the Markov chain. We assume

h(u; X(t), S(t), t) = e^{h_0(u, S(t), t) + h_1′(u, S(t), t) X(t)}

where h_0(u, S(t), t) is a real value and h_1(u, S(t), t) is an L×1 vector. We choose h(u; X(t), S(t), t) in this exponential affine form so that we can obtain an approximate closed-form solution for the dynamics of the term structure. Intuitively, we can consider γ_m(dt, du) as the conditional probability of shifting from regime S(t) to regime u during [t, t+dt], given X(t) and S(t). We can express γ_m(t, A), the compensator of m(t, A), as

γ_m(t, A) = ∫_0^t ∫_A h(u; X(τ), S(τ), τ) η(du) dτ = Σ_{u∈A} ∫_0^t h(u; X(τ), S(τ), τ) dτ

Second, we express S(t) in the integral form below, using the random measure defined above:

S(t) = S(0) + ∫_{[0,t]×U} (u − S(τ−)) m(dτ, du)   (72.23)

Note that m(dτ, du) takes two possible values: 0 or 1. It takes the value one only at a regime-switching time t_i, and with u = S(t_i), the new regime at time t_i. Hence, the above expression is but

^13 See Last and Brandt (1995) for a detailed discussion of marked point processes, stochastic intensity kernels, and related results.
a telescoping sum: S(t) = S(0) + Σ_{t_i ≤ t} [S(t_i) − S(t_{i-1})] = S(t). Equivalently, in differential form,

dS(t) = ∫_U (u − S(t−)) m(dt, du)   (72.24)

To understand the above differential equation, suppose a regime switch from S(t−) to u occurs at time t; then S(t) − S(t−) = u − S(t−), implying S(t) = u. These two forms are crucial, because they allow the straightforward application of Ito's formula in the following subsections.
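For intuition, a discretized simulation sketch of S(t) follows: by Equation (72.22), over a small step dt a shift from the current regime s to regime u occurs with probability approximately h(u; X(t), s, t) dt, under the exponential-affine intensity assumed above. The two-regime parameters and the frozen path for X(t) are purely illustrative.

```python
# Discretized simulation sketch of the regime process S(t) driven by (72.22).
import numpy as np

rng = np.random.default_rng(2)
h0 = np.array([[0.0, -3.0], [-2.5, 0.0]])    # h_0[s, u]; diagonal entries unused
h1 = np.array([[0.0, 0.8], [0.5, 0.0]])      # h_1[s, u], for a scalar factor X

dt, T = 0.01, 50.0
n = int(T / dt)
S = np.zeros(n, dtype=int)
X = 0.02 * np.ones(n)                        # a frozen state path, for illustration
for k in range(1, n):
    s = S[k - 1]
    S[k] = s
    for u in (0, 1):
        # shift from s to u over [t, t+dt] with probability ~ h(u; X, s) * dt
        if u != s and rng.random() < np.exp(h0[s, u] + h1[s, u] * X[k - 1]) * dt:
            S[k] = u
            break
```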
72.4.2 Regime-Dependent Jump Diffusion Model for the Term Structure of Interest Rates

This section proposes a general multifactor term structure model under a regime-switching jump diffusion. A closed-form solution of the term structure is obtained for an affine-type model using log-linear approximation.
72.4.2.1 State Variables

In this section, the most recent regime S(t) follows Equation (72.23) or (72.24). The other L state variables, X_t, are described by the following equation, which is a continuous-time analogue of Equation (72.10) but includes an additional jump component:

dX_t = [Θ_0(S_t, t) + Θ_1(S_t, t) X_t] dt + Σ(X_t, S_t, t) dW_t + J(S_t, t) dN_t   (72.25)

where Θ_0(S_t, t) and Θ_1(S_t, t) are a regime-dependent L×1 vector and L×L matrix, respectively, and Σ(X_t, S_t, t) is an L×L diagonal matrix given by

Σ(X_t, S_t, t) = diag( √(σ_{0,1}(S_t, t) + σ_{1,1}(S_t, t)′ X_t), ..., √(σ_{0,L}(S_t, t) + σ_{1,L}(S_t, t)′ X_t) )

where the σ_{0,i}(S_t, t) (i = 1, ..., L) are regime-dependent coefficients and all σ_{1,i}(S_t, t) are regime-dependent L×1 vectors. W_t is an L×1 vector of independent standard Brownian motions; N_t is an L×1 vector of independent Poisson processes with L×1 time-varying and regime-dependent intensity δ_J(S_t, t); and J(S_t, t) is an L×L matrix of regime-dependent random jump sizes with a conditional density g(J | S_t, t). Given {S_t, t}, we assume that the J(S_t, t) are serially independent and are also independent of W_t and N_t.
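A first-order Euler sketch of Equation (72.25) for L = 1 factor and two regimes follows; the drift, volatility, jump intensity, and jump-size parameters are illustrative, the regime path is a placeholder (it could come from a simulation like the one in Sect. 72.4.1), and at most one jump per step is allowed as a first-order approximation.

```python
# Euler-scheme sketch of the regime-switching jump-diffusion state dynamics (72.25).
import numpy as np

rng = np.random.default_rng(3)
dt, n = 0.01, 5000
theta0 = np.array([0.002, 0.010]); theta1 = np.array([-0.10, -0.30])
sig0 = np.array([0.0001, 0.0004]); sig1 = np.array([0.01, 0.02])
delta_J = np.array([0.5, 2.0])              # jump intensities delta_J(s)
jump_sd = np.array([0.002, 0.005])          # jump-size scale of g(J | s)

S = rng.integers(0, 2, size=n)              # placeholder regime path (see above)
X = np.zeros(n); X[0] = 0.02
for k in range(1, n):
    s = S[k - 1]
    drift = theta0[s] + theta1[s] * X[k - 1]
    vol = np.sqrt(max(sig0[s] + sig1[s] * X[k - 1], 0.0))
    X[k] = X[k - 1] + drift * dt + vol * np.sqrt(dt) * rng.normal()
    if rng.random() < delta_J[s] * dt:      # at most one jump per step (first order)
        X[k] += jump_sd[s] * rng.normal()   # draw a jump size from g(. | s)
```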
72.4.2.2 The Short Rate

The instantaneous short-term interest rate r_t is a linear function of X_t, given S_t and t:

r_t = ψ_0(S_t, t) + ψ_1(S_t, t)′ X_t   (72.26)

where ψ_0(S_t, t) is a scalar and ψ_1(S_t, t) is an L×1 vector.

72.4.2.3 The Stochastic Pricing Kernel

Using a no-arbitrage argument (see Harrison and Kreps 1979 for technical conditions), we further specify the pricing kernel, M_t, which is also called the stochastic discount factor in Sect. 72.3.2. We first present it in stochastic differential equation (SDE) form as

dM_t / M_t = −r_t dt − λ_{D,t}′ dW_t − λ_{J,t}′ (dN_t − δ_J(S_t, t) dt) − ∫_U λ_S(u; X(t−), S(t−), t) [m(dt, du) − γ_m(dt, du)]   (72.27)

The L×1 vector of market prices of diffusion risk, conditional on X_t, S_t, and t, is

λ_{D,t} = λ_D(X_t, S_t, t) = Σ(X_t, S_t, t) Λ_D(S_t, t)

where Λ_D(S_t, t) = (λ_{D,1}(S_t, t), ..., λ_{D,L}(S_t, t))′ is an L×1 vector. The L×1 vector of market prices of jump risk, conditional on S_t and t, is

λ_{J,t} = λ_J(S_t, t) = (λ_{J,1}(S_t, t), ..., λ_{J,L}(S_t, t))′

And the market price of regime-switching (from regime S(t−) to regime u) risk, given X(t), S(t) and t, is

λ_S(u; X(t), S(t), t) = 1 − e^{λ_{0,S}(u, S(t)) + λ_{1,S}′(u, S(t)) X(t)}

where λ_{1,S}′ = (λ_{1,S,1}, ..., λ_{1,S,L}).
The specifications above complete the model for the term structure of interest rates, which can be solved by a change of probability measure. Specifically, for fixed T > 0, we define the equivalent martingale measure Q by the Radon–Nikodym derivative dQ/dP = ξ_T / ξ_0, where for t ∈ [0, T]

ξ_t = e^{−∫_0^t r_τ dτ} · e^{−∫_0^t λ_{D,τ}′ dW(τ) − (1/2) ∫_0^t λ_{D,τ}′ λ_{D,τ} dτ} · e^{∫_0^t λ_{J,τ}′ δ_{J,τ} dτ + ∫_0^t log(1 − λ_{J,τ})′ dN_τ} · e^{∫_0^t ∫_U λ_S(u; X_τ, S_τ) γ_m(dτ, du) + ∫_0^t ∫_U log(1 − λ_S(u; X_τ, S_τ)) m(dτ, du)}   (72.28)

Observe that the first two terms of ξ_t correspond to the discrete-time M_{t+1} = e^{m_{t+1}}, where m_{t+1} is given in Equation (72.11); the extra two terms correspond to the market prices of jump and regime-shift risk. In the absence of arbitrage, the price at time t of a default-free pure discount bond that matures at T, P(t, T), can be obtained as

P(t, T) = E_t^Q { e^{−∫_t^T r_τ dτ} } = E^Q { e^{−∫_t^T r_τ dτ} | F_t } = E^Q { e^{−∫_t^T r_τ dτ} | X_t, S_t }   (72.29)

with the boundary condition P(T, T) = 1, where the last equality comes from the Markov property of (X_t, S_t).

72.4.2.4 The Dynamics Under Q

To present the dynamics of the state variables X_t and S_t under the equivalent martingale measure Q, we first define

Σ_0′(S_t, t) = (σ_{0,1}(S_t, t), ..., σ_{0,L}(S_t, t)),
Σ_1(S_t, t) = (σ_{1,1}(S_t, t), ..., σ_{1,L}(S_t, t))′,
Λ_D = Λ_D(S_t, t) = (λ_{D,1}(S_t, t), ..., λ_{D,L}(S_t, t))′

Then the dynamics of X_t and S_t under Q are given by the following stochastic differential equations, respectively:

dX_t = Θ^Q(X_t, S_t, t) dt + Σ(X_t, S_t, t) dW̃_t + J(S_t, t) dÑ_t   (72.30)

dS_t = ∫_U (u − S_{t−}) m̃(dt, du)   (72.31)

The drift term is

Θ^Q(X_t, S_t, t) = [Θ_0(S_t, t) + Θ_1(S_t, t) X_t] − Σ(X_t, S_t, t)² Λ_D(S_t, t)   (72.32)
= Θ̃_0(S_t, t) + Θ̃_1(S_t, t) X_t   (72.33)

with

Θ̃_0(S_t, t) = Θ_0(S_t, t) − diag(Λ_D(S_t, t)) Σ_0(S_t, t)

and

Θ̃_1(S_t, t) = Θ_1(S_t, t) − diag(Λ_D(S_t, t)) Σ_1(S_t, t)

W̃_t is an L×1 standard Brownian motion under Q; Ñ_t is an L×1 vector of Poisson processes with intensity δ̃_J(S_t, t), whose elements are given by δ̃_{i,J}(S_t, t) = [1 − λ_{i,J}(S_t, t)] δ_{i,J}(S_t, t) for i = 1, ..., L; and m̃(t, A) is the marked point process with intensity kernel

h̃(u; X_t, S_t) = [1 − λ_S(u; X_t, S_t)] h(u; X_t, S_t) = e^{h̃_0(u, S(t)) + h̃_1′(u, S(t)) X_t}

where h̃_0(u, S(t)) = h_0(u, S(t)) + λ_{0,S}(u, S(t)) and h̃_1(u, S(t)) = h_1(u, S(t)) + λ_{1,S}(u, S(t)). The compensator of m̃(t, A) under Q becomes γ̃_m(dt, du) = h̃(u; X_t, S_t) η(du) dt.
72.4.2.5 Bond Pricing

Let P(t, T) = F(t, X_t, S_t, T) = F(t, X, S, T), where X = X_t and S = S_t. The following proposition gives the partial differential equation (PDE) determining the bond price.
Proposition 1. The price of the default-free pure discount bond, F(t, X, S, T), defined in Equation (72.29), satisfies the following partial differential equation:

∂F/∂t + (∂F/∂X′) Θ̃ + (1/2) tr[ Σ Σ′ (∂²F/∂X ∂X′) ] + E^Q_{S,t}(Δ_S F) + δ̃_J′ E^Q_{J,t}(Δ_X F) = r F   (72.34)

with the boundary condition F(T, X, S, T) = 1. Here Δ_S F = F(t, X_t, u, T) − F(t, X_t, S_t, T) and E^Q_{S,t}(Δ_S F) = ∫_U Δ_S F · h̃(u; X_t, S_t) η(du), that is, the mean of Δ_S F conditional on X_t and S_t under Q;

Δ_X F = ( F(t, X_t + J_1, S_t, T) − F(t, X_t, S_t, T), ..., F(t, X_t + J_L, S_t, T) − F(t, X_t, S_t, T) )′

and E^Q_{J,t}(Δ_X F) = ∫ Δ_X F · g̃(J | X_t, S_t) dJ, that is, the mean of Δ_X F conditional on X_t and S_t under Q.

The above proposition follows from the Feynman–Kac formula and the Markov property of (X, S). When we let F(t, X_t, S_t, T) = e^{A(S(t), t) + B′(S(t), t) X(t)}, with A(S(t), t) = A(S(t), t, T) and B(S(t), t) = B(S(t), t, T), Equation (72.34) becomes
∂A(S(t), t)/∂t + (∂B(S(t), t)/∂t)′ X(t) + B′(S(t), t) [Θ̃_0(S(t), t) + Θ̃_1(S(t), t) X(t)] + (1/2) Σ_0′(S(t), t) B²(S(t), t) + (1/2) B²(S(t), t)′ Σ_1(S(t), t) X(t)
+ ∫_U ( e^{Δ_S A(S(t),t) + Δ_S B′(S(t),t) X(t)} − 1 ) e^{h̃_0(u, S(t), t) + h̃_1′(u, S(t), t) X(t)} η(du)
+ δ̃_J(S(t), t)′ E^Q_{J,t}( e^{B′(S(t),t) J(S(t),t)} − 1 ) = ψ_0(S_t, t) + ψ_1(S_t, t)′ X_t

To solve for A(S(t), t) and B(S(t), t), we apply the following log-linear approximations:

e^{(Δ_S B′(S(t),t) + h̃_1′(u, S(t), t)) X(t)} ≈ 1 + (Δ_S B′(S(t), t) + h̃_1′(u, S(t), t)) X(t)

and

e^{h̃_1′(u, S(t), t) X(t)} ≈ 1 + h̃_1′(u, S(t), t) X(t)

Under these approximations:

( e^{Δ_S A(S(t),t) + Δ_S B′(S(t),t) X(t)} − 1 ) e^{h̃_0(u, S(t), t) + h̃_1′(u, S(t), t) X(t)}
= e^{(Δ_S A(S(t),t) + h̃_0(u, S(t), t)) + (Δ_S B′(S(t),t) + h̃_1′(u, S(t), t)) X(t)} − e^{h̃_0(u, S(t), t) + h̃_1′(u, S(t), t) X(t)}
≈ e^{Δ_S A(S(t),t) + h̃_0(u, S(t), t)} [ 1 + (Δ_S B′(S(t), t) + h̃_1′(u, S(t), t)) X(t) ] − e^{h̃_0(u, S(t), t)} [ 1 + h̃_1′(u, S(t), t) X(t) ]
= e^{h̃_0(u, S(t), t)} ( e^{Δ_S A(u, S(t), t)} − 1 ) + e^{h̃_0(u, S(t), t)} [ e^{Δ_S A(u, S(t), t)} ( Δ_S B′(u, S(t), t) + h̃_1′(u, S(t), t) ) − h̃_1′(u, S(t), t) ] X(t)
Now, by matching coefficients, we obtain a set of differential equations for A(S, t) = A(S(t), t, T) and B(S, t) = B(S(t), t, T), which can be solved for the term structure of interest rates.

Proposition 2. The price at time t of a default-free pure discount bond with maturity T − t is given by P(t, T) = e^{A(S,t) + B(S,t)′ X_t}, and the (T − t)-period interest rate is given by

R(t, T) = −A(S, t)/(T − t) − (B(S, t)/(T − t))′ X_t

where A(S, t) and B(S, t) are determined by the following differential equations:

∂B(S, t)/∂t + (1/2) Σ_1(S, t)′ B²(S, t) + Θ̃_1(S, t)′ B(S, t) + ∫_U e^{h̃_0(u; S, t)} [ e^{Δ_S A} ( Δ_S B + h̃_1(u; S, t) ) − h̃_1(u; S, t) ] η(du) = ψ_1(S)   (72.35)

and

∂A(S, t)/∂t + Θ̃_0(S)′ B(S, t) + (1/2) B²(S, t)′ Σ_0(S, t) + δ̃_J(S, t)′ E_J( e^{J′ B(S,t)} − 1 ) + ∫_U ( e^{Δ_S A} − 1 ) e^{h̃_0(u; S, t)} η(du) = ψ_0(S)   (72.36)

with boundary conditions A(S, 0) = 0 and B(S, 0) = 0, where Δ_S A = A(u, t) − A(S, t), Δ_S B = B(u, t) − B(S, t), and B²(S, t) = (B_1²(S, t), ..., B_L²(S, t))′.
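Proposition 2 reduces bond pricing to integrating 2K coupled ODEs from the boundary A(S, 0) = B(S, 0) = 0. The sketch below marches the system in time-to-maturity with a crude Euler step for K = 2 regimes, L = 1 factor and no jumps; the sign convention (d/dτ = −∂/∂t) and all parameter values are our illustrative assumptions.

```python
# Euler integration sketch of the ODE system (72.35)-(72.36) in time-to-maturity.
import numpy as np

dt, nsteps = 1.0 / 252, 2520
psi0 = np.array([0.01, 0.03]); psi1 = np.array([1.0, 1.0])
th0 = np.array([0.002, 0.010]); th1 = np.array([-0.10, -0.30])   # Theta~_0, Theta~_1
s0 = np.array([0.0001, 0.0004]); s1 = np.array([0.01, 0.02])     # Sigma_0, Sigma_1
h0t = np.array([[0.0, -3.0], [-2.5, 0.0]])                       # h~_0[s, u]
h1t = np.array([[0.0, 0.5], [0.4, 0.0]])                         # h~_1[s, u]

A, B = np.zeros(2), np.zeros(2)                                  # boundary values
for _ in range(nsteps):
    dA, dB = np.zeros(2), np.zeros(2)
    for s in range(2):
        u = 1 - s                                # the single alternative regime
        dsA, dsB = A[u] - A[s], B[u] - B[s]
        e0 = np.exp(h0t[s, u])
        dB[s] = (th1[s] * B[s] + 0.5 * s1[s] * B[s] ** 2
                 + e0 * (np.exp(dsA) * (dsB + h1t[s, u]) - h1t[s, u]) - psi1[s])
        dA[s] = (th0[s] * B[s] + 0.5 * s0[s] * B[s] ** 2
                 + e0 * (np.exp(dsA) - 1.0) - psi0[s])
    A, B = A + dA * dt, B + dB * dt
# Yields follow from Proposition 2: R = -A/tau - (B/tau)*X_t at tau = nsteps*dt.
```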
72.5 Conclusion

The interest rate is probably one of the most important macroeconomic variables and is closely watched by financial market participants and policy makers. Much research effort has been devoted to econometric modeling of the dynamic behavior of interest rates. This chapter provides a brief review of models of the term structure of interest rates under regime shifts. These models retain the tractability of the single-regime affine models on the one hand, and allow more general and flexible specifications of the market prices of risk, the dynamics of the state variables, and the short-term interest rate on the other. These additional flexibilities can be important in capturing some salient features of the term structure of interest rates in empirical applications. The models can also be applied to analyze the implications of regime-switching for bond portfolio allocations and for the pricing of interest rate derivatives.

An implicit assumption in the models reviewed in this chapter is that investors can observe the regimes. One extension is to assume that regimes are not observable or can only be imperfectly observed by investors. For example, if regimes represent different stances of monetary policy, investors have to use observable state variables, such as interest rates, to learn the current regime when monetary policy is not completely transparent. Another possible extension is to tie the regimes more closely to economic fundamentals. Most regime-switching models are motivated by business cycle fluctuations or shifts in policy regimes. In the term structure models reviewed in this chapter, however, regimes are identified only by the dynamics of interest rates and lack clear economic interpretations. Introducing other macroeconomic variables that are better indicators of the business cycle or of policy as additional state variables could give us deeper insight into the dynamic properties of interest rates across different economic regimes.
References

Ahn, C. and H. Thompson. 1988. "Jump diffusion processes and the term structure of interest rates." Journal of Finance 43, 155–174.
Ahn, D.-H., R. F. Dittmar, and A. R. Gallant. 2002. "Quadratic term structure models: theory and evidence." The Review of Financial Studies 15, 243–288.
Ang, A. and G. Bekaert. 2002. "Regime switches in interest rates." Journal of Business and Economic Statistics 20(2), 163–182.
Bansal, R. and H. Zhou. 2002. "Term structure of interest rates with regime shifts." Journal of Finance 57, 1997–2043.
Bekaert, G., R. J. Hodrick, and D. A. Marshall. 2001. "'Peso problem' explanations for term structure anomalies." Journal of Monetary Economics 48(2), 241–270.
Bielecki, T. and M. Rutkowski. 2000. "Multiple ratings model of defaultable term structure." Mathematical Finance 10, 125–139.
Bielecki, T. and M. Rutkowski. 2001. Modeling of the defaultable term structure: conditional Markov approach, Working paper, The Northeastern Illinois University.
Cecchetti, S. J., P. Lam, and N. C. Mark. 1993. "The equity premium and the risk-free rate: matching the moments." Journal of Monetary Economics 31, 21–46.
Chacko, G. and S. Das. 2002. "Pricing interest rate derivatives: a general approach." Review of Financial Studies 15, 195–241.
Chan, K. C., G. A. Karolyi, F. A. Longstaff, and A. B. Sanders. 1992. "An empirical comparison of alternative models of the short-term interest rate." Journal of Finance 47, 1209–1227.
Cheridito, P., D. Filipovic, and R. Kimmel. 2007. "Market price of risk specifications for affine models: theory and evidence." Journal of Financial Economics 83, 123–170.
Cox, J., J. Ingersoll, and S. Ross. 1985a. "An intertemporal general equilibrium model of asset prices." Econometrica 53, 363–384.
Cox, J., J. Ingersoll, and S. Ross. 1985b. "A theory of the term structure of interest rates." Econometrica 53, 385–407.
Dai, Q. and K. Singleton. 2000. "Specification analysis of affine term structure models." Journal of Finance 55, 1943–1978.
Dai, Q. and K. Singleton. 2003. "Term structure dynamics in theory and reality." The Review of Financial Studies 16, 631–678.
Dai, Q., A. Le, and K. Singleton. 2006. Discrete-time dynamic term structure models with generalized market prices of risk, Working paper, Graduate School of Business, Stanford University.
Dai, Q., K. Singleton, and W. Yang. 2007. "Regime shifts in a dynamic term structure model of the U.S. treasury bond yields." Review of Financial Studies 20, 1669–1706.
Das, S. 2002. "The surprise element: jumps in interest rates." Journal of Econometrics 106, 27–65.
Diebold, F. and G. Rudebusch. 1996. "Measuring business cycles: a modern perspective." Review of Economics and Statistics 78, 67–77.
Duffee, G. R. 2002. "Term premia and interest rate forecasts in affine models." Journal of Finance 57, 405–443.
Duffie, D. and R. Kan. 1996. "A yield-factor model of interest rates." Mathematical Finance 6, 379–406.
Duffie, D., J. Pan, and K. Singleton. 2000. "Transform analysis and asset pricing for affine jump-diffusions." Econometrica 68, 1343–1376.
Elliott, R. J. et al. 1995. Hidden Markov models: estimation and control, Springer, New York.
Elliott, R. J. and R. S. Mamon. 2001. "A complete yield curve description of a Markov interest rate model." International Journal of Theoretical and Applied Finance 6, 317–326.
Engle, R. F. 1982. "Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation." Econometrica 50, 987–1008.
Evans, M. D. and K. Lewis. 1995. "Do expected shifts in inflation affect estimates of the long-run Fisher relation?" The Journal of Finance 50(1), 225–253.
Garcia, R. and P. Perron. 1996. "An analysis of the real interest rate under regime shifts." The Review of Economics and Statistics 78(1), 111–125.
Gray, S. F. 1996. "Modeling the conditional distribution of interest rates as a regime-switching process." Journal of Financial Economics 42, 27–62.
Hamilton, J. D. 1988. "Rational-expectations econometric analysis of changes in regime: an investigation of the term structure of interest rates." Journal of Economic Dynamics and Control 12, 385–423.
Hamilton, J. D. 1989. "A new approach to the economic analysis of nonstationary time series and the business cycle." Econometrica 57, 357–384.
Hamilton, J. D. 1994. Time series analysis, Princeton University Press, Princeton, NJ.
Harrison, M. and D. Kreps. 1979. "Martingales and arbitrage in multiperiod security markets." Journal of Economic Theory 20, 381–408.
Kim, C. J. and C. Nelson. 1999. "A Bayesian approach to testing for Markov switching in univariate and dynamic factor models," Discussion Papers in Economics at the University of Washington 0035, Department of Economics at the University of Washington.
Landen, C. 2000. "Bond pricing in a hidden Markov model of the short rate." Finance and Stochastics 4, 371–389.
Last, G. and A. Brandt. 1995. Marked point processes on the real line, Springer, New York.
Lewis, K. 1991. "Was there a 'Peso problem' in the U.S. term structure of interest rates: 1979–1982?" International Economic Review 32(1), 159–173.
Piazzesi, M. 2005. "Bond yields and the federal reserve." Journal of Political Economy 113, 311–344.
Shen, C. H. 1994. "Testing efficiency of the Taiwan–US forward exchange market – a Markov switching model." Asian Economic Journal 8, 205–215.
Sola, M. and J. Driffill. 1994. "Testing the term structure of interest rates using a vector autoregression with regime switching." Journal of Economic Dynamics and Control 18, 601–628.
Wu, S. and Y. Zeng. 2004. "Affine regime-switching models for interest rate term structure," in Mathematics of Finance, G. Yin and Q. Zhang (Eds.), American Mathematical Society Series – Contemporary Mathematics, Vol. 351, pp. 375–386.
Wu, S. and Y. Zeng. 2005. "A general equilibrium model of the term structure of interest rates under regime-switching risk." International Journal of Theoretical and Applied Finance 8, 839–869.
Wu, S. and Y. Zeng. 2006. "The term structure of interest rates under regime shifts and jumps." Economics Letters 93, 215–221.
Wu, S. and Y. Zeng. 2007. "An exact solution of the term structure of interest rate under regime-switching risk," in Hidden Markov models in finance, R. S. Mamon and R. J. Elliott (Eds.), International Series in Operations Research and Management Science, Springer, New York, NY, USA.
Yin, G. G. and Q. Zhang. 1998. Continuous-time Markov chains and applications: a singular perturbation approach, Springer, Berlin.
Chapter 73
ARM Processes and Their Modeling and Forecasting Methodology Benjamin Melamed
Abstract The class of ARM (Autoregressive Modular) processes is a class of stochastic processes, defined by a nonlinear autoregressive scheme with modulo-1 reduction and additional transformations. ARM processes constitute a versatile class designed to produce high-fidelity models from stationary empirical time series by fitting a strong statistical signature consisting of the empirical marginal distribution (histogram) and the empirical autocorrelation function. More specifically, fitted ARM processes guarantee the matching of arbitrary empirical distributions, and simultaneously permit the approximation of the leading empirical autocorrelations. Additionally, simulated sample paths of ARM models often resemble the data to which they were fitted. Thus, ARM processes aim to provide both a quantitative and qualitative fit to empirical data. Fitted ARM models can be used in two ways: (1) to generate realistic-looking Monte Carlo sample paths (e.g., financial scenarios), and (2) to forecast via point estimates as well as confidence intervals. This chapter starts with a high-level discussion of stochastic model fitting and explains the fitting approach that motivates the introduction of ARM processes. It then continues with a review of ARM processes and their fundamental properties, including their construction, transition structure, and autocorrelation structure. Next, the chapter proceeds to outline the ARM modeling methodology by describing the key steps in fitting an ARM model to empirical data. It then describes in some detail the ARM forecasting methodology, and the computation of the conditional expectations that serve as point estimates (forecasts of future values) and their underlying conditional distributions from which confidence intervals are constructed for the point estimates. Finally, the chapter concludes with an illustration of the efficacy of the ARM modeling and forecasting methodologies through an example utilizing a sample from the S&P 500 Index.
B. Melamed () Department of Supply Chain Management and Marketing Sciences, Rutgers Business School – Newark and New Brunswick, Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]
Keywords ARM processes · Autoregressive modular processes · ARM modeling methodology · ARM forecasting methodology · Financial applications
73.1 Introduction The enterprise of modeling aims to fit a model from a parametric family to empirical data, subject to prescribed fitting criteria. From a high-level vantage point, fitting criteria formulate some measure of “deviation” of any candidate model from the empirical data, and modeling proceeds as a search for a model that minimizes that “deviation” measure. For example, regression-type modeling utilizes fitting criteria such as least squares, maximum likelihood, and others. A statistical signature (signature, for short) is a set of empirical or theoretic statistics: the former is computed from empirical data and the latter is derived from a theoretic model. Here, we shall be concerned with time series modeling under stationary conditions, so the signatures of interest will be derived from stationary empirical time series and stationary time-series models. Accordingly, members of a signature might include first-order statistics (e.g., the common mean, variance and/or higher moments, or the marginal distribution), second-order statistics (e.g., the autocorrelation function, or equivalently, the spectral density), or even higher-order statistics. Clearly, statistical signatures can be partially ordered by strength under set inclusion in a natural way as follows: signature A is stronger (dominates) signature B, if A contains B. In signature-oriented modeling, one simply utilizes fitting criteria based on a statistical signature. In other words, for a prescribed signature, one searches for a model whose signature fits or approximates its empirical counterpart. It is reasonable to expect that the predictive power of models (referred to as model fidelity) would generally increase as the fitting signature becomes stronger; stated more simply, the more empirical statistics are fitted by a model, the better the model. The need for high-fidelity models of stochastic sequences is self-evident in systems containing random
phenomena or components, such as temporal dependence in the context of queueing systems (Livny et al. 1993; Altiok and Melamed 2001). Examples include bursty traffic in the Next Generation Internet (especially compressed video – a prodigious consumer of bandwidth), bursty arrivals of demand at an inventory system, reliability (breakdown and repair) of components with non-independent uptimes and downtimes, interest rate evolution, and unfolding enemy arrivals (engagements) in a battlefield simulation.

This chapter reviews a class of high-fidelity and versatile models, called ARM (Autoregressive Modular) processes. It then proceeds to describe the ARM modeling and forecasting methodology. All variants of ARM processes share a signature-oriented modeling philosophy, where the underlying empirical time series is stationary in the sense that the underlying probability law is assumed to have at least time-invariant 2-dimensional joint distributions. Specifically, ARM processes aim to construct high-fidelity models from empirical data, which can capture a strong statistical signature of the empirical data by simultaneously fitting the empirical marginal distribution and the empirical leading (low-lag) autocorrelations. Accordingly, a prospective model is declared a good fit to an empirical time series, provided it satisfies the following requirements:

(a) The marginal distribution of the model fits exactly its empirical counterpart (histogram).
(b) The leading autocorrelations of the model approximate their empirical counterparts (the number of lags is left to the modeler and often determined by the availability of data).
(c) The model gives rise to Monte Carlo simulated paths that are visually "similar" to ("resemble") the empirical records.

Requirements (a) and (b) constitute crisp quantitative goodness-of-fit criteria (see also Cario and Nelson 1996). In contrast, the notions of sample path "similarity" or "resemblance" in requirement (c) are heuristic and cannot be defined with mathematical crispness; note that "resemblance" does not mean that the sample paths are identical point for point, but rather that human observers perceive them as "similar" in appearance. It should be pointed out that these notions are intuitively clear and that modelers routinely attempt to validate their models by visual inspections for "path similarity." Experience suggests that ARM processes can be fitted to arbitrary marginal distributions, a wide variety of autocorrelation functions (monotone, oscillating, alternating, etc.), and a broad range of sample path behaviors (non-directional, cyclical, etc.).

The following notation will be used throughout. The Laplace transform of a function f(x) is defined by f̃(s) = ∫_{−∞}^{∞} f(x) e^{−sx} dx (the tilde denotes the Laplace transform operator). The cumulative distribution function (cdf) of a random variable Z is denoted by F_Z and its probability density function (pdf) is denoted by f_Z; the corresponding mean and standard deviation are denoted by μ_Z and σ_Z, respectively. Similarly, the marginal density of a stationary random sequence {Z_n} is denoted by f_Z. The notation Z ~ F means that the random variable Z has the cdf F. The autocorrelation function of a real-valued random sequence {Z_n} with finite second moments is

ρ_Z(k, τ) = ( E[Z_k Z_{k+τ}] − E[Z_k] E[Z_{k+τ}] ) / ( σ_{Z_k} σ_{Z_{k+τ}} )   (73.1)

When ρ_Z(k, τ) does not depend on k in Equation (73.1), but only on the lag τ, we denote the autocorrelation function by ρ_Z(τ). The conditional density of a random variable Y given a random variable X is denoted by

f_{Y|X}(y|x) = (∂/∂y) P{Y ≤ y | X = x}   (73.2)

The indicator function of a set A is denoted either by 1_A(x) or 1{x ∈ A}, while symbols decorated by a hat always denote a statistical estimator. The imaginary number √(−1) is denoted by i, and the real-part operator by Re. Finally, the modulo-1 (fractional part) operator, ⟨·⟩, is defined by ⟨x⟩ = x − max{integer n : n ≤ x}.
73.2 Overview of ARM Processes This section provides an overview of ARM processes – a class of autoregressive stochastic processes, with modulo-1 reduction and additional transformations (Melamed 1999). It generalizes the class of TES (Transform-Expand-Sample) processes (Melamed 1991; Jagerman and Melamed 1992a, 1992b, 1994, 1995; Melamed 1993, 1997), and it further gives rise to a discretized version that subsumes the class of QTES (Quantized TES) processes (Melamed et al. 1996; Klaoudatos et al. 1999), a discretized variant of TES. However, while TES and QTES processes admit only iid (independent identically distributed) so-called innovation sequences (see below), ARM processes constitute a considerable generalization in that they admit dependent innovation sequences as well.
73.2.1 Background ARM Processes

We define two variants of ARM process classes, denoted by ARM⁺ and ARM⁻, respectively, and henceforth we append a plus or minus superscript to objects associated with these classes, but omit the superscript if the object is common to both classes. Accordingly, let U_0 ~ Unif[0,1) be a random variable, uniformly distributed on the interval [0,1). An innovation process is any real-valued stochastic sequence {V_n}_{n=0}^{∞}, where the sequence {V_n} is independent of U_0. All random variables are defined over some common probability space (Ω, Σ, P).

We next define two stochastic sequences, {U_n⁺}_{n=0}^{∞} and {U_n⁻}_{n=0}^{∞}, to be referred to as background ARM processes due to their auxiliary nature. These are given by the recursive equations

U_n⁺ = ⟨U_0 + V_0⟩ for n = 0;  U_n⁺ = ⟨U_{n-1}⁺ + V_n⟩ for n > 0   (73.3)

U_n⁻ = U_n⁺ for n even;  U_n⁻ = 1 − U_n⁺ for n odd   (73.4)

We remark that the definitions above have a simple geometric interpretation: the modulo-1 arithmetic renders background ARM processes random walks on the unit circle (a circle of unit circumference), where each point on the circle corresponds to a fraction in the range [0,1) (see Diaconis 1988). The innovation random variables correspond to step sizes of the random walk on the unit circle. In fact, the properties of modulo-1 arithmetic imply the following alternative representation of Equation (73.3):

U_n⁺ = ⟨U_0 + S_{0,n}⟩ = ⟨U_0 + ⟨S_{0,n}⟩⟩   (73.5)

where the S_{m,n} are partial sums of innovation random variables, given by

S_{m,n} = 0 if m > n;  S_{m,n} = Σ_{j=m}^{n} V_j if m ≤ n   (73.6)

The probability law of the innovation process determines whether or not the background process is directional. In particular, note that a non-symmetrical innovation density imparts a drift to the background process, resulting in a cyclical process around the unit circle. See Melamed (1999) for details.

73.2.2 Foreground ARM Processes

Target ARM processes are transformed background ARM processes, and are consequently referred to as foreground ARM processes. These are of the general forms

X_n⁺ = D(U_n⁺)   (73.7)

X_n⁻ = D(U_n⁻)   (73.8)

where D is a measurable transformation from the interval [0,1) to the real line, called a distortion. We assume throughout that ∫_0^1 D²(x) dx < ∞. In the sequel, we shall drop the superscript of background or foreground ARM processes when the distinction is immaterial.

Two fundamental properties of ARM processes facilitate ARM modeling (Melamed 1999):

- The marginal distribution of all background ARM processes is uniform on the interval [0,1), regardless of the innovation sequence selected!
- If the innovation sequence is stationary, so are the corresponding marginal distributions of background and foreground ARM processes.

Consequently, we gain broad modeling flexibility as follows. Any marginal distribution can be easily fitted to a foreground ARM process, by appeal to the Inverse Transform method: if U ~ Unif(0,1) and F is any (cumulative) distribution function, then F^{-1}(U) ~ F (see, e.g., Bratley et al. 1987; Law and Kelton 1991). Since this is true regardless of the innovation process selected, we can, in principle, attempt the fitting of both the empirical (marginal) distribution and the empirical autocorrelation function simultaneously. The two fittings are decoupled: the first is guaranteed an exact match via the Inverse Transform method, and the second is approximated by a suitable choice of an innovation process.

73.2.3 Transition Functions of Background ARM Processes

The transition densities of background ARM processes are given separately for ARM⁺ and ARM⁻ processes as follows. For any background ARM⁺ process {U_n⁺} and 0 ≤ u, v < 1, τ ≥ 1,

f_{U⁺_{k+τ} | U⁺_k}(v|u) = Σ_{ν=−∞}^{∞} f̃_{S_{k+1,k+τ}}(i2πν) e^{i2πν(v−u)} = 1 + 2 Σ_{ν=1}^{∞} Re[ f̃_{S_{k+1,k+τ}}(i2πν) e^{i2πν(v−u)} ]   (73.9)
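The construction in Equations (73.3)–(73.8) is only a few lines of code. The sketch below uses iid Unif(0, 0.1) innovations and, for illustration only, the identity distortion; in practice D would be the composite distortion of Sects. 73.2.5–73.2.6.

```python
# Simulation sketch of background and foreground ARM processes, Equations (73.3)-(73.8).
import numpy as np

rng = np.random.default_rng(4)
n = 1000
V = rng.uniform(0.0, 0.1, size=n)        # innovation sequence {V_n}
U0 = rng.random()

Uplus = np.mod(U0 + np.cumsum(V), 1.0)   # Equation (73.5): U_n+ = <U_0 + S_{0,n}>
Uminus = Uplus.copy()
Uminus[1::2] = 1.0 - Uplus[1::2]         # Equation (73.4): reflect odd-indexed terms
D = lambda u: u                          # distortion; replace with F^{-1} in practice
Xplus, Xminus = D(Uplus), D(Uminus)      # Equations (73.7)-(73.8)
```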
and for any background ARM⁻ process {U_n⁻} and 0 ≤ u, v < 1, τ ≥ 1,

f_{U⁻_{k+τ} | U⁻_k}(v|u) =
  Σ_{ν=−∞}^{∞} f̃_{S_{k+1,k+τ}}(i2πν) e^{i2πν(v−u)},   k even, τ even
  Σ_{ν=−∞}^{∞} f̃_{S_{k+1,k+τ}}(i2πν) e^{−i2πν(v+u)},  k even, τ odd
  Σ_{ν=−∞}^{∞} f̃_{S_{k+1,k+τ}}(i2πν) e^{−i2πν(v−u)},  k odd, τ even
  Σ_{ν=−∞}^{∞} f̃_{S_{k+1,k+τ}}(i2πν) e^{i2πν(v+u)},   k odd, τ odd
   (73.10)

Equation (73.10) can be further simplified analogously to Equation (73.9).

73.2.4 Autocorrelation Functions of Foreground ARM Processes

Let {X_n}_{n=0}^{∞} be a foreground ARM process of the form of Equations (73.7) or (73.8), with a finite positive variance and autocorrelation function ρ_X(k, τ). Then for any foreground ARM⁺ process {X_n⁺} and τ ≥ 1,

ρ_X⁺(k, τ) = (1/σ_X²) Σ_{ν=−∞, ν≠0}^{∞} f̃_{S_{k+1,k+τ}}(i2πν) |D̃(i2πν)|² = (2/σ_X²) Σ_{ν=1}^{∞} Re[ f̃_{S_{k+1,k+τ}}(i2πν) |D̃(i2πν)|² ]   (73.11)

and for any foreground ARM⁻ process {X_n⁻} and τ ≥ 1,

ρ_X⁻(k, τ) =
  (1/σ_X²) Σ_{ν≠0} f̃_{S_{k+1,k+τ}}(i2πν) |D̃(i2πν)|²,  k even, τ even
  (1/σ_X²) Σ_{ν≠0} f̃_{S_{k+1,k+τ}}(i2πν) D̃²(i2πν),    k even, τ odd
  (1/σ_X²) Σ_{ν≠0} f̃_{S_{k+1,k+τ}}(i2πν) |D̃(i2πν)|²,  k odd, τ even
  (1/σ_X²) Σ_{ν≠0} f̃_{S_{k+1,k+τ}}(i2πν) D̃²(i2πν),    k odd, τ odd
   (73.12)
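Equation (73.11) is cheap to evaluate numerically. The sketch below assumes iid Unif(0, α) innovations, for which f̃_{S_{k+1,k+τ}}(i2πν) = [f̃_V(i2πν)]^τ, and the identity distortion on [0,1), for which D̃(i2πν) = i/(2πν) and σ_X² = 1/12; both closed forms are standard Laplace-transform computations, not formulas quoted from this chapter.

```python
# Numerical evaluation sketch of the ARM+ autocorrelation formula (73.11).
import numpy as np

def rho_plus(tau, alpha, nmax=2000):
    nu = np.arange(1, nmax + 1)
    s = 1j * 2 * np.pi * nu
    fV = (1 - np.exp(-s * alpha)) / (s * alpha)    # Laplace transform of Unif(0, alpha)
    term = (fV ** tau) * (1.0 / (2 * np.pi * nu) ** 2)   # f_S~ * |D~|^2
    return (2.0 / (1.0 / 12.0)) * np.real(term).sum()

print([round(rho_plus(t, 0.1), 4) for t in range(1, 6)])  # slowly decaying lags
```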
73.2.5 Empirical Distributions

Practical ARM modeling focuses on distortions based on inverse distribution functions, via an application of the Inverse Transform method (Bratley et al. 1987). For a realization of an empirical sequence {Y_n}_{n=0}^{N}, an empirical density is often estimated by a histogram, which gives the observed relative frequencies of various ranges via a step function of the form

ĥ_Y(y) = Σ_{j=1}^{J} 1_{[l_j, r_j)}(y) · p̂_j / w_j   (73.13)

where J is the number of histogram cells, [l_j, r_j) is the support of cell j with width w_j = r_j − l_j > 0, and p̂_j is the estimator of the probability of cell j. The corresponding distribution function, Ĥ_Y, is a piecewise linear function, obtained by integrating Equation (73.13) with respect to y, and its inverse, Ĥ_Y^{-1}, is the piecewise linear function

Ĥ_Y^{-1}(x) = Σ_{j=1}^{J} 1_{[Ĉ_{j-1}, Ĉ_j)}(x) [ l_j + (x − Ĉ_{j-1}) w_j / p̂_j ],   0 ≤ x < 1   (73.14)

where {Ĉ_j}_{j=0}^{J} is the cdf of {p̂_j}_{j=1}^{J}, i.e., Ĉ_j = Σ_{i=1}^{j} p̂_i, 1 ≤ j ≤ J (Ĉ_0 = 0, Ĉ_J = 1).

It turns out that a distortion of the form D = Ĥ_Y^{-1} is rarely adequate in practice, because background ARM processes, being random walks on the unit circle, produce visual "discontinuities" when the background process crosses the circle origin. Furthermore, a monotone transformation (including Ĥ_Y^{-1}) preserves this effect (see Jagerman and Melamed 1992a; Melamed 1993 for a detailed discussion). To circumvent this problem, we use in practice a 2-stage composite distortion of the form

D(x) = Ĥ_Y^{-1}(S_ξ(x))   (73.15)

where the first stage, S_ξ, "smoothes" the sample paths of ARM processes (with the uniform marginal distributions preserved intact), while the second stage, Ĥ_Y^{-1}, carries out the inversion.

73.2.6 Stitching Transformations

The first stage of Equation (73.15) is called a stitching transformation and is given by

S_ξ(u) = u/ξ for 0 ≤ u ≤ ξ;  S_ξ(u) = (1 − u)/(1 − ξ) for ξ ≤ u ≤ 1   (73.16)

where the parameter ξ ∈ [0,1] is called a stitching parameter. We point out that the smoothing action of stitching transformations is effected for 0 < ξ < 1, due to their continuity on the unit circle (the term "stitching" is motivated by the fact that S_ξ(0) = S_ξ(1) for all 0 < ξ < 1). Note that for the extremal cases, ξ = 1 and ξ = 0, the corresponding stitching transformations are S_1(u) = u (the identity) and S_0(u) = 1 − u (the antithetic), and these are "non-smoothing," as they are not continuous on the unit circle. Practical modeling most often employs 0 < ξ < 1, and primarily the midpoint ξ = 0.5. For a discussion of the smoothing properties of stitching transformations and their uniformity-preserving property, refer to Melamed (1991), Jagerman and Melamed (1992a) and Melamed (1993). A geometric interpretation of the stitching parameter is more involved, but can be readily seen when the background process is highly autocorrelated and cyclical (has a drift about the unit circle). In this case, the stitching transformation curve can be seen to serve as the "profile" of a cycle, in the sense that the sample paths of the background process tend to "hug" that curve; the parameter ξ then specifies the apex of the cycle profile.

The introduction of stitching as a smoothing operation still allows us to calculate the Laplace transform of the resulting composite distortion, needed in Equations (73.11)–(73.12). More specifically, for any general composite distortion of the form

D_ξ(u) = D(S_ξ(u))   (73.17)

where D is any distortion, the corresponding Laplace transform is

D̃_ξ(i2πν) = ξ D̃(i2πνξ) + (1 − ξ) D̃(−i2πν(1 − ξ)),   0 ≤ ξ ≤ 1   (73.18)

(Jagerman and Melamed 1992a). Thus, if the original distortion D̃ is known, then any composite distortion D̃_ξ is readily computable as well. These have been computed in Jagerman and Melamed (1992b) for various distributions, including the histogram distortion D = Ĥ_Y^{-1} in Equation (73.14).
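A compact sketch of the composite distortion (73.15), combining the stitching transformation (73.16) with the piecewise-linear histogram inverse (73.14), follows; the three-cell histogram is illustrative.

```python
# Sketch of the 2-stage composite distortion D = H^{-1}(S_xi(.)) of Equation (73.15).
import numpy as np

l = np.array([0.0, 1.0, 2.0])               # left endpoints  l_j
w = np.array([1.0, 1.0, 3.0])               # widths          w_j
p = np.array([0.2, 0.5, 0.3])               # cell probabilities p_j
C = np.concatenate(([0.0], np.cumsum(p)))   # cdf C_j of {p_j}

def H_inv(x):
    """Piecewise-linear histogram inverse, Equation (73.14)."""
    j = np.searchsorted(C, x, side="right") - 1
    j = np.clip(j, 0, len(p) - 1)
    return l[j] + (x - C[j]) * w[j] / p[j]

def S_xi(u, xi=0.5):
    """Stitching transformation, Equation (73.16)."""
    u = np.asarray(u)
    return np.where(u <= xi, u / xi, (1.0 - u) / (1.0 - xi))

D = lambda u, xi=0.5: H_inv(S_xi(u, xi))    # composite distortion (73.15)
print(D(np.linspace(0.0, 0.99, 5)))
```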
73.3 The ARM Modeling Methodology

Suppose the modeler wishes to fit an ARM model to some empirical record $\{Y_n\}_{n=0}^{N}$, from which the empirical distribution (histogram) and empirical autocorrelation function have been estimated. The ARM modeling methodology searches for an innovation sequence, $V = \{V_n\}$, and a stitching parameter, $\xi$, such that the corresponding ARM process gives rise to an autocorrelation function that approximates well its empirical counterpart; recall that ARM theory ensures that the associated marginal distribution is automatically matched to its empirical counterpart. Such ARM modeling may be implemented as a heuristic interactive search with extensive modeler interaction, or as an algorithmic search with minimal modeler intervention. Both approaches require software support (Hill and Melamed 1995). The ARM modeling methodology unfolds in practice as follows:
First, the ARM process sign (ARM⁺ or ARM⁻) is selected. Such a selection is typically guided by the modeler's experience (however, trying both classes would simply double the modeling time). Next, the empirical inverse distribution function, $\hat H_Y^{-1}$, is constructed from the empirical data. Once the modeler specifies the number of histogram cells and their widths, the distortion transformation is readily computed from Equation (73.14) and Equation (73.18). The third step, and the hardest one, is to search for a pair, $(V, \xi)$, consisting of an innovation process and a stitching parameter, which jointly give rise to an ARM process that approximates well the leading empirical autocorrelations. In the heuristic approach, the modeler embarks on a heuristic search of pairs $(V, \xi)$, guided by visual feedback (Melamed 1993; Hill and Melamed 1995). This approach, however, may be time-consuming and consequently can engender considerable tedium. Alternatively, an algorithmic search approach can be devised, analogously to Jelenkovic and Melamed (1995). In this approach, a search is combined with nonlinear optimization to produce a set of ARM candidate models. First, the parameter space, $(V, \xi)$, is quantized into a moderately large, but finite, set $Q$ of ARM model specifications. The autocorrelation function is computed for each model in $Q$ via the computationally fast formulas in Equations (73.11) and (73.12). A "best set" of $m$ models is selected from $Q$ (the value of $m$ would typically be small, on the order of 10); the associated goodness-of-fit criterion selects those ARM models whose autocorrelation functions have the smallest distance from their empirical counterpart in some prescribed metric. A standard choice of such a metric is the weighted sum of the squared autocorrelation differences, the sum ranging over some span of lags, $\tau = 1, \ldots, L$; a sketch of this metric follows below. Next, a steepest-descent nonlinear optimization is performed for models in the "best set" only, to further reduce the aforementioned autocorrelation distance. The modeler then selects the "best of the best," either as the model with the overall smallest distance, or as an optimized model in the "best set" that gives rise to Monte Carlo sample paths which appear most "similar" to the empirical data.
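The following sketch illustrates the autocorrelation-distance criterion just described: a weighted sum of squared differences between a candidate model's autocorrelation function and the empirical one. The model autocorrelations would come from Equations (73.11)–(73.12); here they are simply an input array, and the default weights are an illustrative assumption.

```python
# Minimal sketch of the goodness-of-fit metric used in the algorithmic search.
import numpy as np

def empirical_acf(y, max_lag):
    """Biased sample autocorrelations rho(1..max_lag) of a series y."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(y, y)
    return np.array([np.dot(y[:-k], y[k:]) / denom
                     for k in range(1, max_lag + 1)])

def acf_distance(model_acf, emp_acf, weights=None):
    """Weighted squared-difference distance between two ACFs over lags 1..L."""
    model_acf, emp_acf = np.asarray(model_acf), np.asarray(emp_acf)
    if weights is None:
        weights = np.ones_like(emp_acf)   # equal weights (assumption)
    return float(np.sum(weights * (model_acf - emp_acf) ** 2))
```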
73.4 The ARM Forecasting Methodology

The computational basis of the ARM forecasting methodology is provided by the transition densities in Equations (73.9–73.10). These equations allow us to compute numerically the conditional densities $f_{U^+_{k+\tau}|U^+_k}(v|u)$ and $f_{U^-_{k+\tau}|U^-_k}(v|u)$ of ARM background processes; consequently, their foreground counterparts, $f_{X^+_{k+\tau}|U^+_k}(x|u)$ and $f_{X^-_{k+\tau}|U^-_k}(x|u)$, can be readily computed too, in view of Equations (73.7–73.8), which relate background and foreground ARM processes via a deterministic transformation. However, we cannot readily compute a transition density of foreground ARM processes of the form $f_{X^+_{k+\tau}|X^+_k}(x|y)$ or $f_{X^-_{k+\tau}|X^-_k}(x|y)$. The problem stems from the fact that for $0 < \xi < 1$ and a given stitched foreground ARM value, $x$, the stitching transformation (Equation (73.16)) has two background ARM value solutions,
$$u^{(1)}(x) = \xi x, \qquad u^{(2)}(x) = 1 - (1-\xi)\,x. \quad (73.19)$$
To get around this problem, consider any background ARM process $\{U_n\}_{n=0}^{\infty}$ and its foreground counterpart, $\{X_n\}_{n=0}^{\infty}$. ARM forecasting systematically uses a convex (probabilistic) combination of transition densities of the form

$$c_{k\pm\tau|k}(y|x) = p\, f_{X_{k\pm\tau}|U_k}\big(y \,\big|\, u^{(1)}(x)\big) + (1-p)\, f_{X_{k\pm\tau}|U_k}\big(y \,\big|\, u^{(2)}(x)\big) \quad (73.20)$$

for some probability value $0 \le p \le 1$, referred to as the mixing parameter, whose selection via a simple optimization procedure will be described later. More specifically, using the transition density of Equation (73.20), ARM forecasts consist of both point estimates and confidence intervals as follows: A point estimate of the future value of $X_{k+\tau}$ given the current value of $X_k$ is computed as the conditional expectation with respect to the density $c_{k+\tau|k}(y|x)$,

$$E_{c_{k+\tau|k}}[X_{k+\tau}\mid X_k = x] = p\,E\big[X_{k+\tau}\mid U_k = u^{(1)}(x)\big] + (1-p)\,E\big[X_{k+\tau}\mid U_k = u^{(2)}(x)\big]. \quad (73.21)$$
A confidence interval with a prescribed probability about the point estimate (Equation (73.21)) is computed via an appropriate transition density $c_{k+\tau|k}(y|x)$ of the form of Equation (73.20). We next proceed to describe the details of the ARM forecasting methodology.
73.4.1 Selection of the Mixing Parameter

The selection of the mixing parameter utilizes time reversal (Kelly 1979) of the background ARM process, thereby generalizing the approach of Jagerman and Melamed (1995). This approach calls for the computation of a prescribed number of backward (time-reversed) conditional expectations of the form

$$E_{c_{k-\tau|k}}[X_{k-\tau}\mid X_k = x] = p\,E\big[X_{k-\tau}\mid U_k = u^{(1)}(x)\big] + (1-p)\,E\big[X_{k-\tau}\mid U_k = u^{(2)}(x)\big]. \quad (73.22)$$

To this end, we take advantage of the fact that the Markovian property of background ARM processes is preserved under time reversal, and the backward transition densities are readily given in terms of the forward transition densities. Accordingly, for any ARM background process $\{U_n\}_{n=0}^{\infty}$ and $0 \le u, v < 1$, $\tau \ge 1$,

$$f_{U_{k-\tau}|U_k}(v|u) = f_{U_k|U_{k-\tau}}(u|v)\,\frac{f_{U_{k-\tau}}(v)}{f_{U_k}(u)} = f_{U_k|U_{k-\tau}}(u|v), \quad (73.23)$$

because the marginal density of ARM background processes is uniform on $[0,1)$. Using Equations (73.9–73.10), we can write the formulas for backward transition densities as follows. For any background ARM⁺ process $\{U^+_n\}$ and $0 \le u, v < 1$, $\tau \ge 1$,

$$f_{U^+_{k-\tau}|U^+_k}(v|u) = \sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu(u-v)} = 1 + 2\sum_{\nu=1}^{\infty}\mathrm{Re}\Big[\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu(u-v)}\Big], \quad (73.24)$$

and for any background ARM⁻ process $\{U^-_n\}$ and $0 \le u, v < 1$, $\tau \ge 1$,

$$f_{U^-_{k-\tau}|U^-_k}(v|u) = \begin{cases}\displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu(u-v)}, & k \text{ even, } \tau \text{ even},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu(u+v)}, & k \text{ even, } \tau \text{ odd},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{i2\pi\nu(u+v)}, & k \text{ odd, } \tau \text{ even},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{i2\pi\nu(u-v)}, & k \text{ odd, } \tau \text{ odd}.\end{cases} \quad (73.25)$$
Equation (73.25) can be simplified analogously to Equation (73.24). Observing that past values are known in practice, the idea is to generate retrograde point estimates and compare them to a window of (known) past values. The mixing parameter $p$ is then selected so as to minimize an objective function involving deviations of retrograde forecasts from known past values, analogously to ordinary regression. More specifically, for each time index $k$, consider the objective function

$$g_k(p) = \sum_{\tau=1}^{N} w_\tau\Big[p\,e^{(1)}_{k-\tau}(\hat x_k) + (1-p)\,e^{(2)}_{k-\tau}(\hat x_k) - \hat x_{k-\tau}\Big]^2, \quad (73.26)$$

where $p$ is the mixing parameter to be selected, $N$ is the window size, the $w_\tau$ are given non-negative weights, $e^{(i)}_{k-\tau}(\hat x_k) = E\big[D(U_{k-\tau})\mid U_k = u^{(i)}(\hat x_k)\big]$ for $i = 1, 2$, and $\hat x_k$ is the observed value of $X_k$. Our goal is to select a mixing parameter $p$ that minimizes Equation (73.26); this is done by differentiating it with respect to $p$, setting the derivative to zero, and then solving for $p$. The nominal solution is given by

$$p^* = \frac{\displaystyle\sum_{\tau=1}^{N} w_\tau\Big[e^{(1)}_{k-\tau}(\hat x_k) - e^{(2)}_{k-\tau}(\hat x_k)\Big]\Big[\hat x_{k-\tau} - e^{(2)}_{k-\tau}(\hat x_k)\Big]}{\displaystyle\sum_{\tau=1}^{N} w_\tau\Big[e^{(1)}_{k-\tau}(\hat x_k) - e^{(2)}_{k-\tau}(\hat x_k)\Big]^2}. \quad (73.27)$$

If Equation (73.27) yields a value between 0 and 1, then $p^*$ is selected as the actual mixing parameter (if the denominator vanishes, select any value of $p$, say, $p = 1$). Otherwise, if it yields a negative value, then the actual mixing parameter is set to $p = 0$, and if it yields a value that exceeds 1, it is set to $p = 1$.
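A minimal sketch of this selection rule, assuming the backward conditional expectations $e^{(1)}_{k-\tau}(\hat x_k)$ and $e^{(2)}_{k-\tau}(\hat x_k)$ have already been computed from the machinery of Equation (73.22), is given below; the function name and default weights are illustrative.

```python
# Minimal sketch of Equations (73.26)-(73.27): weighted least-squares choice
# of the mixing parameter p, clamped to [0, 1].
import numpy as np

def select_mixing_parameter(e1, e2, x_past, weights=None):
    """e1[t], e2[t]: backward expectations under u1(x_k) and u2(x_k);
    x_past[t]: observed values x_{k-tau}, tau = 1..N."""
    e1, e2, x_past = map(np.asarray, (e1, e2, x_past))
    if weights is None:
        weights = np.ones_like(x_past)    # equal weights (assumption)
    d = e1 - e2
    denom = np.sum(weights * d ** 2)
    if denom == 0.0:                      # degenerate case: any p will do
        return 1.0
    p = np.sum(weights * d * (x_past - e2)) / denom   # Equation (73.27)
    return float(np.clip(p, 0.0, 1.0))    # clamp values outside [0, 1]
```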
73.4.2 Computation of Conditional Expectations

Conditional expectations are used in the computation of the point forecasts in Equations (73.21) and (73.22). We begin with the computation of forward conditional expectations, utilizing Equation (73.9). For any background ARM⁺ process $\{U^+_n\}$ and $\tau \ge 1$,

$$\begin{aligned} E\big[X^+_{k+\tau}\mid U^+_k = u\big] &= \int_0^1 D(v)\, f_{U^+_{k+\tau}|U^+_k}(v|u)\, dv\\ &= \int_0^1 D(v)\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{-i2\pi\nu(v-u)}\,dv\\ &= \sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\int_0^1 D(v)\,e^{-i2\pi\nu(v-u)}\,dv\\ &= \sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{i2\pi\nu u}\,\tilde D(i2\pi\nu)\\ &= \bar X + 2\sum_{\nu=1}^{\infty}\mathrm{Re}\Big[\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{i2\pi\nu u}\,\tilde D(i2\pi\nu)\Big], \quad (73.28)\end{aligned}$$

where $\bar X = \tilde D(0)$ is the common mean of the foreground process (the $\nu = 0$ term). Utilizing Equation (73.10), a similar computation for any foreground ARM⁻ process, $\{X^-_n\}$, and $\tau \ge 1$ yields

$$E\big[X^-_{k+\tau}\mid U^-_k = u\big] = \begin{cases}\displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{i2\pi\nu u}\,\tilde D(i2\pi\nu), & k \text{ even, } \tau \text{ even},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{i2\pi\nu u}\,\tilde D(-i2\pi\nu), & k \text{ even, } \tau \text{ odd},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{-i2\pi\nu u}\,\tilde D(-i2\pi\nu), & k \text{ odd, } \tau \text{ even},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{-i2\pi\nu u}\,\tilde D(i2\pi\nu), & k \text{ odd, } \tau \text{ odd}.\end{cases} \quad (73.29)$$
Equation (73.29) can be further simplified analogously to the last line in Equation (73.28).
The computation of backward conditional expectations utilizes Equations (73.24) and (73.25), and is similar to that in Equation (73.28). For any background ARM⁺ process $\{U^+_n\}$ and $\tau \ge 1$,

$$E\big[X^+_{k-\tau}\mid U^+_k = u\big] = \sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu u}\,\tilde D(-i2\pi\nu) = \bar X + 2\sum_{\nu=1}^{\infty}\mathrm{Re}\Big[\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu u}\,\tilde D(-i2\pi\nu)\Big], \quad (73.30)$$
and for any foreground ARM⁻ process, $\{X^-_n\}$, and $\tau \ge 1$,

$$E\big[X^-_{k-\tau}\mid U^-_k = u\big] = \begin{cases}\displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu u}\,\tilde D(-i2\pi\nu), & k \text{ even, } \tau \text{ even},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{-i2\pi\nu u}\,\tilde D(i2\pi\nu), & k \text{ even, } \tau \text{ odd},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{i2\pi\nu u}\,\tilde D(-i2\pi\nu), & k \text{ odd, } \tau \text{ even},\\[4pt] \displaystyle\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,e^{i2\pi\nu u}\,\tilde D(i2\pi\nu), & k \text{ odd, } \tau \text{ odd}.\end{cases} \quad (73.31)$$
Equation (73.31) can be further simplified analogously to Equation (73.30).
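As a concrete numerical illustration of these Fourier-series formulas, the sketch below evaluates the forward conditional expectation of Equation (73.28) in the simplest case: a background ARM⁺ process with i.i.d. innovations uniform on $[-\alpha/2, \alpha/2]$ and the identity distortion $D(v) = v$, for which $\tilde f_V(i2\pi\nu) = \sin(\pi\nu\alpha)/(\pi\nu\alpha)$ and $\tilde D(i2\pi\nu) = -1/(i2\pi\nu)$ for $\nu \ne 0$, with $\bar X = 1/2$. The truncation point, innovation width, and Monte Carlo check are illustrative assumptions.

```python
# Minimal numerical sketch of Equation (73.28) for a uniform-innovation ARM+
# background process with identity distortion D(v) = v.
import numpy as np

ALPHA = 0.1      # innovation support width (assumed)
NU_MAX = 200     # truncation point of the Fourier series (assumed)

def f_tilde_V(nu, alpha=ALPHA):
    """E[exp(-i 2 pi nu V)] for V ~ Uniform[-alpha/2, alpha/2]."""
    return np.sinc(nu * alpha)   # sin(pi nu alpha) / (pi nu alpha)

def cond_expectation(u, tau):
    """E[X_{k+tau} | U_k = u] via the truncated series in Eq. (73.28)."""
    nu = np.arange(1, NU_MAX + 1)
    D_tilde = -1.0 / (1j * 2 * np.pi * nu)   # identity distortion, nu != 0
    terms = (f_tilde_V(nu) ** tau) * D_tilde * np.exp(1j * 2 * np.pi * nu * u)
    return 0.5 + 2.0 * np.sum(terms.real)    # nu = 0 term is the mean 1/2

def mc_expectation(u, tau, n_paths=200_000, seed=0):
    """Monte Carlo sanity check of the same conditional expectation."""
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, float(u))
    for _ in range(tau):
        x = (x + rng.uniform(-ALPHA / 2, ALPHA / 2, n_paths)) % 1.0
    return x.mean()

if __name__ == "__main__":
    for tau in (1, 3, 10):
        print(tau, cond_expectation(0.9, tau), mc_expectation(0.9, tau))
```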
73.4.3 Computation of Conditional Distributions

Conditional cumulative distribution functions (cdf) are used in the computation of confidence intervals for forecasts. For any foreground ARM process, $\{X_n\}_{n=0}^{\infty}$, we note the relations

$$F_{X_{k\pm\tau}|U_k}(x|u) = F_{S_\xi(U_{k\pm\tau})|U_k}\big(F_X(x)\,\big|\,u\big), \quad (73.32)$$

where $S_\xi(U_{k\pm\tau})$ is a stitched background random variable and $F_X$ is the common cdf of the foreground ARM random variables. Equation (73.32) shows that it suffices to derive the formulas for the conditional cdf $F_{S_\xi(U_{k\pm\tau})|U_k}(s|u)$, as the requisite one is readily obtained by setting $s = F_X(x)$ in Equation (73.32). Accordingly, utilizing Equation (73.9), one has for any background ARM⁺ process $\{U^+_n\}$ and $\tau \ge 1$,
$$\begin{aligned} F_{S_\xi(U^+_{k+\tau})|U^+_k}(s|u) &= \int_0^s f_{S_\xi(U^+_{k+\tau})|U^+_k}(r|u)\,dr\\ &= 1 - \int_{u^{(1)}(s)}^{u^{(2)}(s)} f_{U^+_{k+\tau}|U^+_k}(v|u)\,dv\\ &= 1 - \int_{u^{(1)}(s)}^{u^{(2)}(s)}\sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{-i2\pi\nu(v-u)}\,dv\\ &= 1 - \sum_{\nu=-\infty}^{\infty}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,e^{i2\pi\nu u}\int_{u^{(1)}(s)}^{u^{(2)}(s)} e^{-i2\pi\nu v}\,dv\\ &= s + \sum_{\nu\ne0}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,\frac{e^{-i2\pi\nu(u^{(2)}(s)-u)} - e^{-i2\pi\nu(u^{(1)}(s)-u)}}{i2\pi\nu}\\ &= s + 2\sum_{\nu=1}^{\infty}\mathrm{Re}\left[\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,\frac{e^{-i2\pi\nu(u^{(2)}(s)-u)} - e^{-i2\pi\nu(u^{(1)}(s)-u)}}{i2\pi\nu}\right]. \quad (73.33)\end{aligned}$$
Utilizing Equation (73.10), a similar computation for any foreground ARM⁻ process, $\{X^-_n\}$, and $\tau \ge 1$ yields

$$F_{S_\xi(U^-_{k+\tau})|U^-_k}(s|u) = \int_0^s f_{S_\xi(U^-_{k+\tau})|U^-_k}(r|u)\,dr = \begin{cases}\displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,\frac{e^{-i2\pi\nu(u^{(2)}(s)-u)} - e^{-i2\pi\nu(u^{(1)}(s)-u)}}{i2\pi\nu}, & k \text{ even, } \tau \text{ even},\\[6pt] \displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,\frac{e^{i2\pi\nu(u+u^{(1)}(s))} - e^{i2\pi\nu(u+u^{(2)}(s))}}{i2\pi\nu}, & k \text{ even, } \tau \text{ odd},\\[6pt] \displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,\frac{e^{i2\pi\nu(u^{(1)}(s)-u)} - e^{i2\pi\nu(u^{(2)}(s)-u)}}{i2\pi\nu}, & k \text{ odd, } \tau \text{ even},\\[6pt] \displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k+1,\,k+\tau}}(i2\pi\nu)\,\frac{e^{-i2\pi\nu(u+u^{(2)}(s))} - e^{-i2\pi\nu(u+u^{(1)}(s))}}{i2\pi\nu}, & k \text{ odd, } \tau \text{ odd}.\end{cases} \quad (73.34)$$
Equation (73.34) can be further simplified analogously to the last line in Equation (73.33). The computation of backward conditional distributions utilizes Equations (73.24) and (73.25), and is similar to that in Equations (73.33) and (73.34). For any background ARM⁺ process $\{U^+_n\}$ and $\tau \ge 1$,

$$F_{S_\xi(U^+_{k-\tau})|U^+_k}(s|u) = \int_0^s f_{S_\xi(U^+_{k-\tau})|U^+_k}(r|u)\,dr = s + \sum_{\nu\ne0}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,\frac{e^{-i2\pi\nu(u-u^{(1)}(s))} - e^{-i2\pi\nu(u-u^{(2)}(s))}}{i2\pi\nu}, \quad (73.35)$$
and for any foreground ARM⁻ process, $\{X^-_n\}$, and $\tau \ge 1$,

$$F_{S_\xi(U^-_{k-\tau})|U^-_k}(s|u) = \int_0^s f_{S_\xi(U^-_{k-\tau})|U^-_k}(r|u)\,dr = \begin{cases}\displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,\frac{e^{-i2\pi\nu(u-u^{(1)}(s))} - e^{-i2\pi\nu(u-u^{(2)}(s))}}{i2\pi\nu}, & k \text{ even, } \tau \text{ even},\\[6pt] \displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,\frac{e^{-i2\pi\nu(u+u^{(2)}(s))} - e^{-i2\pi\nu(u+u^{(1)}(s))}}{i2\pi\nu}, & k \text{ even, } \tau \text{ odd},\\[6pt] \displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,\frac{e^{i2\pi\nu(u+u^{(1)}(s))} - e^{i2\pi\nu(u+u^{(2)}(s))}}{i2\pi\nu}, & k \text{ odd, } \tau \text{ even},\\[6pt] \displaystyle s + \sum_{\nu\ne0}\tilde f_{S_{k-\tau+1,\,k}}(i2\pi\nu)\,\frac{e^{i2\pi\nu(u-u^{(2)}(s))} - e^{i2\pi\nu(u-u^{(1)}(s))}}{i2\pi\nu}, & k \text{ odd, } \tau \text{ odd}.\end{cases} \quad (73.36)$$
Again, Equations (73.35) and (73.36) can be further simplified analogously to the last line in Equation (73.33). We conclude this section by observing that the conditional densities corresponding to $F_{S_\xi(U_{k\pm\tau})|U_k}(s|u)$ in Equations (73.33–73.36) are readily obtained by differentiating them with respect to $s$. The transition densities for any foreground ARM process $\{X_n\}_{n=0}^{\infty}$ are then given by the formula

$$f_{X_{k\pm\tau}|U_k}(x|u) = f_{S_\xi(U_{k\pm\tau})|U_k}\big(F_X(x)\,\big|\,u\big)\; f_X(x). \quad (73.37)$$

Finally, the conditional densities $f_{S_\xi(U_{k\pm\tau})|U_k}(s|u)$ can be readily expressed in terms of the conditional densities $f_{U_{k\pm\tau}|U_k}(v|u)$. To this end, write

$$F_{S_\xi(U_{k\pm\tau})|U_k}(s|u) = F_{U_{k\pm\tau}|U_k}\big(u^{(1)}(s)\,\big|\,u\big) + 1 - F_{U_{k\pm\tau}|U_k}\big(u^{(2)}(s)\,\big|\,u\big) = F_{U_{k\pm\tau}|U_k}(\xi s\,|\,u) + 1 - F_{U_{k\pm\tau}|U_k}\big(1-(1-\xi)s\,\big|\,u\big),$$

and differentiating this equation with respect to $s$ yields

$$f_{S_\xi(U_{k\pm\tau})|U_k}(s|u) = \xi\, f_{U_{k\pm\tau}|U_k}(\xi s\,|\,u) + (1-\xi)\, f_{U_{k\pm\tau}|U_k}\big(1-(1-\xi)s\,\big|\,u\big). \quad (73.38)$$
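A minimal numerical sketch of the resulting confidence-interval computation is given below: the stitched conditional density is assembled from the background transition density via Equation (73.38), the conditional cdf is accumulated on a grid, and a central interval is read off by inversion; mapping the endpoints through $\hat H_Y^{-1}$ then yields a foreground interval, per Equation (73.32). The background density reuses the uniform-innovation example from the previous sketch, and all parameter values are illustrative.

```python
# Minimal sketch of a forecast confidence interval from Sect. 73.4.3.
import numpy as np

ALPHA, XI, NU_MAX = 0.1, 0.5, 200   # illustrative parameter values

def bg_density(v, u, tau):
    """f_{U_{k+tau}|U_k}(v|u) as a truncated Fourier series."""
    nu = np.arange(1, NU_MAX + 1)
    phi = np.sinc(nu * ALPHA) ** tau
    return 1.0 + 2.0 * np.sum(phi * np.cos(2 * np.pi * nu * (v - u)))

def stitched_density(s, u, tau):
    """Equation (73.38): density of S_xi(U_{k+tau}) given U_k = u."""
    return (XI * bg_density(XI * s, u, tau)
            + (1 - XI) * bg_density(1 - (1 - XI) * s, u, tau))

def central_interval(u, tau, level=0.90, grid=2000):
    """Central confidence interval for S_xi(U_{k+tau}) via cdf inversion."""
    s = np.linspace(0.0, 1.0, grid)
    pdf = np.array([stitched_density(si, u, tau) for si in s])
    cdf = np.cumsum(pdf) / np.sum(pdf)            # crude numerical cdf
    lo = s[np.searchsorted(cdf, (1 - level) / 2)]
    hi = s[np.searchsorted(cdf, 1 - (1 - level) / 2)]
    return lo, hi   # map endpoints through H_inv for a foreground interval

print(central_interval(0.9, tau=3))
```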
73.5 Example: ARM Modeling of an S&P 500 Time Series

Stochastic financial models are used to support investment decisions in financial instruments: short-term forecasting is employed to guide day-to-day trading, while longer-term planning (e.g., comparing alternative investment strategies) often calls for the generation of financial Monte Carlo scenarios. Furthermore, quantitative investment strategies require scenarios to estimate internal parameters of various financial models (Hull 2009), typically to value the current fair price of a security rather than to forecast its future price. As a time series modeling and forecasting methodology, ARM processes are suited for both purposes.

The modeling methodology described in Sect. 73.3 was applied to an empirical time series of 500 values drawn from the S&P 500 Index, yielding an ARM⁺ model. Figure 73.1 displays the best model found by the ARM modeling methodology. The screen image in Fig. 73.1 consists of four panels with various statistics providing visual information on the fit quality as follows:

1. The upper-left panel depicts the empirical S&P 500 Index time series and a simulated time series of the fitted model.
The initial values of the two time series were arranged to coincide.
2. The upper-right panel depicts an empirical histogram of the S&P 500 Index time series with a compatible histogram of the simulated time series superimposed on it.
3. The lower-left panel depicts the empirical autocorrelation function of the S&P 500 Index data, as well as its simulated and theoretical counterparts.
4. The lower-right panel depicts the empirical spectral density function of the S&P 500 Index data, as well as its simulated and theoretical counterparts.

An inspection of Fig. 73.1 reveals that the fitted ARM⁺ model provides a good fit to the empirical data in accordance with requirements (a)–(c) in Sect. 73.1. Next, the fitted ARM⁺ model was exercised to compute point forecasts (conditional expectations), using the ARM forecasting methodology as described in Sects. 73.4.1 and 73.4.2. Figure 73.2 displays various forecasting-related statistics for the empirical sample of S&P 500 Index data. The screen image in Fig. 73.2 consists of four panels with various statistics pertaining to point forecasts as follows:

1. The upper-left panel depicts the tail end of the empirical S&P 500 Index time series with forward forecasts superimposed on it. Each empirical value "sprouts" a branch consisting of a sequence of 3 forward forecasts, computed by optimizing the backward fitting of the previous 5 empirical values (if available).
2. The upper-right panel depicts the tail end of the empirical S&P 500 Index time series and the corresponding sequence of lag-1 forward forecasts. The deviation of a forecast value from the (empirical) true value can be gauged by the vertical distances between the two time series.
3. The lower-left panel depicts the sequence of the aforementioned deviation values.
4. The lower-right panel depicts a histogram of the aforementioned deviation values.

An inspection of Fig. 73.2 shows that the forecasts are quite good. In particular, inspection of the last panel shows that the deviations are confined to a tight range around the origin, supporting this observation. Figure 73.3 depicts a magnified view of the upper-left panel in Fig. 73.2. Note that the forecasting branches lie quite close to the empirical data, indicating good forecasting results. Figure 73.4 depicts a magnified view of the upper-right panel in Fig. 73.2. Again, the two curves (empirical data and forecasts) lie quite close to each other. A single out-of-sample forecast is shown in the upper-right corner (unmatched by empirical data). Finally, the fitted ARM⁺ model was exercised to compute forecast distributions (conditional distributions and
Fig. 73.1 Statistics of the best model fitted to a sample of S&P 500 Index data
Fig. 73.2 Multiple-lag forecasts of the best model fitted to a sample of S&P 500 Index data
Fig. 73.3 Multiple-lag forecasts for a sample of S&P 500 Index data
Fig. 73.4 Single-lag forecasts for a sample of S&P 500 Index data
Fig. 73.5 Forecasts of the best model fitted to a sample of S&P 500 Index data for the last value of the empirical data
densities), as described in Sect. 73.4.3. Figure 73.5 depicts these statistics for the last empirical sample value of the S&P 500 Index data. The screen image in Fig. 73.5 displays two panels with forecast information as follows:

1. The upper panel depicts the empirical S&P 500 Index time series with forward and backward forecasts superimposed on it for the last empirical value. Shown are the sequence of 3 forward forecasts and 5 backward forecasts, as well as the 90% confidence interval about each of them.
2. The lower panel depicts the density of the corresponding forward lag-1 forecast, conditioned on the last empirical value. Accordingly, the confidence interval for the lag-1 forward forecast was obtained by a simple search over the values of the cdf corresponding to the displayed pdf.

Recall that the forward forecast values in Fig. 73.5 constitute out-of-sample predictions of future values of the S&P 500 Index.
73.6 Summary

The class of ARM (Autoregressive Modular) processes is a class of stochastic processes, defined by a nonlinear autoregressive scheme with modulo-1 reduction and additional
transformations. ARM processes constitute a versatile class designed to produce high-fidelity models from stationary empirical time series by fitting a strong statistical signature consisting of the empirical marginal distribution (histogram) and the empirical autocorrelation function. More specifically, fitted ARM processes guarantee the matching of arbitrary empirical distributions, and simultaneously permit the approximation of the leading empirical autocorrelations. Additionally, simulated sample paths of ARM models often resemble the data to which they were fitted. Thus, ARM processes aim to provide both a quantitative and qualitative fit to empirical data. Fitted ARM models can be used in two ways: (1) to generate realistic-looking Monte Carlo sample paths (e.g., financial scenarios), and (2) to forecast via point estimates as well as confidence intervals. This chapter starts with a high-level discussion of stochastic model fitting and explains the fitting approach that motivates the introduction of ARM processes. It then continues with a review of ARM processes and their fundamental properties, including their construction, transition structure, and autocorrelation structure. Next, the chapter proceeds to outline the ARM modeling methodology by describing the key steps in fitting an ARM model to empirical data. It then describes in some detail the ARM forecasting methodology, and the computation of the conditional expectations that
serve as point estimates (forecasts of future values), as well as of their underlying conditional distributions, from which confidence intervals are constructed for the point estimates. Finally, the chapter concludes with an illustration of the efficacy of the ARM modeling and forecasting methodologies utilizing a sample from the S&P 500 Index.

Acknowledgements I would like to thank Dr. David Jagerman, Xiang Zhao, and Junmin Shi for reading and commenting on the manuscript.
References

Altiok, T. and B. Melamed. 2001. "The case for modeling correlation in manufacturing systems." IIE Transactions 33(9), 779–791.
Bratley, P., B. L. Fox, and L. E. Schrage. 1987. A guide to simulation, Springer, New York.
Cario, M. C. and B. L. Nelson. 1996. "Autoregressive to anything: time-series input processes for simulation." OR Letters 19, 51–58.
Diaconis, P. 1988. Group representations in probability and statistics, Lecture Notes – Monograph Series 11, Institute of Mathematical Statistics, Hayward, CA.
Hill, J. R. and B. Melamed. 1995. "TEStool: a visual interactive environment for modeling autocorrelated time series." Performance Evaluation 24(1&2), 3–22.
Hull, J. 2009. Options, futures, and other derivative securities, 7th ed., Prentice-Hall, Upper Saddle River.
Jagerman, D. L. and B. Melamed. 1992a. "The transition and autocorrelation structure of TES processes part I: general theory." Stochastic Models 8(2), 193–219.
Jagerman, D. L. and B. Melamed. 1992b. "The transition and autocorrelation structure of TES processes part II: special cases." Stochastic Models 8(3), 499–527.
Jagerman, D. L. and B. Melamed. 1994. "The spectral structure of TES processes." Stochastic Models 10(3), 599–618.
Jagerman, D. L. and B. Melamed. 1995. "Bidirectional estimation and confidence regions for TES processes." Proceedings of MASCOTS '95, Durham, NC, pp. 94–98.
Jelenkovic, P. and B. Melamed. 1995. "Algorithmic modeling of TES processes." IEEE Transactions on Automatic Control 40(7), 1305–1312.
Kelly, F. P. 1979. Reversibility and stochastic networks, Wiley, New York.
Klaoudatos, G., M. Devetsikiotis, and I. Lambadaris. 1999. "Automated modeling of broadband network data using the QTES methodology." Proceedings IEEE ICC '99, Vol. 1, Vancouver, Canada, pp. 397–403.
Law, A. and D. W. Kelton. 1991. Simulation modeling and analysis, McGraw-Hill, New York.
Livny, M., B. Melamed, and A. K. Tsiolis. 1993. "The impact of autocorrelation on queueing systems." Management Science 39(3), 322–339.
Melamed, B. 1991. "TES: a class of methods for generating autocorrelated uniform variates." ORSA Journal on Computing 3(4), 317–329.
Melamed, B. 1993. "An overview of TES processes and modeling methodology," in Performance evaluation of computer and communications systems, L. Donatiello and R. Nelson (Eds.), Lecture Notes in Computer Science, Springer, New York, pp. 359–393.
Melamed, B. 1997. "The empirical TES methodology: modeling empirical time series." Journal of Applied Mathematics and Stochastic Analysis 10(4), 333–353.
Melamed, B. 1999. "ARM processes and modeling methodology." Stochastic Models 15(5), 903–929.
Melamed, B., Q. Ren, and B. Sengupta. 1996. "The QTES/PH/1 queue." Performance Evaluation 26, 1–20.
Chapter 74
Alternative Econometric Methods for Information-based Equity-selling Mechanisms

Lee Cheng-Few and Yi Lin Wu

Y.L. Wu, Department of Quantitative Finance, College of Technology Management, National Tsing Hua University, Hsinchu 30013, Taiwan, ROC; e-mail: [email protected]
L. Cheng-Few, Department of Finance and Economics, School of Business, Rutgers University, Piscataway, NJ 08854-8054, USA

Abstract Extant research offers mixed empirical results on the information content of private placements. Hertzel and Smith (J Finance 48:459–485, 1993) suggest that, on average, private placement firms are undervalued. On the other hand, Hertzel et al. (J Finance 57:2595–2617, 2002) show that private placement firms experience significant negative long-run postannouncement stock price performance and that high levels of capital expenditures around private placements reflect managerial overoptimism. Empirical work examining the information content of private placements typically takes an approach based on proxies for information asymmetry, which suffers from an intrinsic errors-in-variables problem. This paper circumvents the empirical difficulty by developing a two-stage estimation approach and a conditional correlation approach. The conditional correlation coefficient varies between −1 and +1, which makes comparisons across samples feasible. Thus, it prevails over the two-stage estimation approach in identifying the significance of the information content of equity-selling mechanisms.

Keywords Private placement · Public offering · Conditional correlation · Probit models

74.1 Introduction

One important connection between asymmetric information and undervaluation in the area of corporate finance is the choice of equity-selling mechanisms. The fundamental implication of asymmetric information for the choice of equity-selling mechanisms is that firms issuing private (public) offerings will be on average undervalued (overvalued) (Myers and Majluf 1984; Hertzel and Smith 1993). If firms are undervalued such that issuing equity to public investors will dilute the value of the existing shareholders, then firms may choose not to issue equity publicly and instead resort to the private placement market so as to convey their favorable private information to sophisticated (accredited) investors in the negotiation process. In this framework, the decision to go to the private placement market conveys that the firms are undervalued. On the other hand, the extant behavioral finance literature suggests that private placements reflect managers' overconfidence about future firm prospects (Hertzel et al. 2002) or that private placement firms correctly anticipate naive investors' misvaluation and take advantage of the capital market conditions (Huson et al. 2006).

The empirical evidence on the information content of private placements is conflicting. Some studies find significant positive short-term stock price reactions to private placement announcements (Wruck 1989; Hertzel and Smith 1993) and favorable ex post outcomes (Hertzel and Rees 1998; Goh et al. 1999), unlike public offerings (Masulis and Korwar 1986; Jain 1992). Others find that, similar to public offering firms, private placement firms experience significant negative long-run postannouncement stock price performance (Krishnamurthy et al. 2005; Hertzel et al. 2002) and that high levels of capital expenditures around private placements reflect managerial overoptimism (Hertzel et al. 2002).

Empirical work examining the information content of private placements typically takes an approach based on proxies for information asymmetry (Wu 2004). The primary difficulty with the proxies is the errors-in-variables problem. Because information asymmetry is unobservable, any empirical attempt to capture information asymmetry requires the use of proxies. It is plausible that some proxies capture underlying intrinsic characteristics rather than information asymmetry. As a result, an inference drawn from the significance of the coefficients of the proxies for information asymmetry may be biased. To circumvent the potential biases embedded in the approach based on surrogates, this paper distinguishes observable and unobservable variables. This distinction is
important because, as will be shown later, we cannot ignore the private information simply because the surrogates and the private information are highly related, nor can we ignore the unobservable variables when the surrogates are empirically irrelevant. Furthermore, this paper uses two empirical strategies in the research design. First, this paper uses insider trading prior to the announcement of equity-selling mechanisms to measure the other insiders' decision. This is because most studies indicate significant changes in insider trading patterns before equity-offering announcements (Karpoff and Lee 1991; Kahle 2000; Clarke et al. 2001; Lee 1997), and researchers have concluded that insider trades reveal important nonpublic information (John and Mishra 1990; Damodaran and Liu 1993; Seyhun 1986; Meulbroek 1992). Second, this paper uses both the two-stage estimation approach and the conditional correlation approach. Specifically, this paper assumes that the same set of private information explains both insider trading activity and the choice of equity-selling mechanisms. The two-stage estimation approach uses the estimated residuals of the insider purchase regression and the insider sale regression (hereafter referred to as abnormal insider purchases and abnormal insider sales, respectively) to measure private information. Assuming that both insider trading activity and the equity-selling mechanism choice are driven by the same set of private information, the two-stage estimation approach hypothesizes that if abnormal insider purchases correlate positively, negatively, or not at all with the probability of making private placements, then the private placement firms are undervalued, overvalued, or evenly valued, respectively. An opposite prediction holds for the information content of abnormal insider sales. In addition to the two-stage estimation approach, this paper uses the conditional correlation approach to investigate the information content of the equity-selling mechanism choice. Expressing the equity-selling mechanism choice and insider purchases (or insider sales) as a pair of conditional equations, whereby the conditioning variables measure the public information, as long as the unobservable variables are correlated, the generalized residuals (or standardized residuals) will be correlated. Consequently, the conditional correlation approach hypothesizes that if the generalized residuals of the pair of conditional equations correlate positively, negatively, or not at all, then this statistical relationship serves as evidence that private placements signal positive, negative, or no firm-specific private information, respectively. Compared with the two-stage estimation approach, which is based on the unconstrained coefficient of the abnormal insider purchases (abnormal insider sales), the conditional correlation coefficient varies from −1 to +1. Consequently, it offers the additional advantage of allowing comparisons across samples of both the direction and magnitude of private information revealed in the equity-selling mechanism choices. In addition, this paper introduces nonparametric approaches
when the data are sensitive to the arbitrary imposition of a linear functional form and a normal distribution of the error disturbances. This paper is organized as follows. Section 74.2 describes and develops hypotheses about the informational content of equity-selling mechanisms. Section 74.3 derives the econometric estimation methodology. Section 74.4 concludes the paper.
74.2 The Information Contents of Equity-Selling Mechanisms

This section explores the information contents of equity-selling mechanisms. First, the information content may be that private placement firms are undervalued. Hertzel and Smith (1993) extend Myers and Majluf (1984) to the case of private placements. In Hertzel and Smith (1993), managers with favorable information who, under the Myers and Majluf assumptions, would choose not to issue equity publicly and thereby forego a profitable investment opportunity may resort to the private placement market so as to convey their favorable information to investors in the negotiation process. Furthermore, there is a difference in investor sophistication between the private placement market and the public offering market. The undervalued firms avoid the underinvestment problem by placing equity with sophisticated (accredited) investors who are able to assess true firm value. In equilibrium, firms that view themselves as undervalued choose private placements.

Secondly, private placements may reflect managers' overconfidence about future firm prospects. This is consistent with the documented negative long-term postplacement stock price performance (Krishnamurthy et al. 2005; Hertzel et al. 2002) and with the finding that high levels of capital expenditures around private placements reflect managerial overoptimism (Hertzel et al. 2002).

Lastly, private placements may reflect managers' anticipation of naive investors' overreaction, and their taking advantage of it. Unlike public offerings, which have negative abnormal returns in both the short term and the long term (Loughran and Ritter 1995, 1997), private placements have significant positive stock price reactions to announcements but are followed by negative long-term (3–5 years) postissue stock price performance. Behavioral finance attributes long-term return continuation (reversal) in public (private) offerings to underreaction (overreaction) to information conveyed in announcements. Naive investors incorrectly expect that the superior prepublic-offering performance will continue and that the poor preprivate-offering performance will improve. Huson et al. (2006) show that private placements are more likely to occur following periods of superior capital market
conditions. Furthermore, private placement offer prices are not efficient with respect to public information, as measured discounts are related to capital market conditions. In summary, private placement firms correctly judge naive investors' misvaluation and take advantage of the capital market conditions.

The above three information contents of equity-selling mechanisms differ in that the first reveals managers' superior knowledge about future firm cash flow, the second reveals management hubris about the future cash flow, while the last reveals managers' correct anticipation of the market misvaluation. Here, we use insider trading prior to the equity-selling mechanisms to measure the other contemporaneous informed parties' decision. Inferring the roles of equity-selling mechanisms from the ex post consequences (e.g., long-term stock price performances and accounting performances) would not be a powerful test, since the firms' qualities may have evolved directly due to the equity-selling mechanisms (Wruck 1989). There are several major reasons to expect that insider trading reflects insiders' superior knowledge about a firm's prospects, insiders' personal beliefs about their firms' prospects, or insiders' exploitation of market misvaluation: (1) studies on insider trading activity conclude that insiders trade profitably and that the power of using insider trading patterns to predict stock returns ranks above other potential trading strategies (Lakonishok and Lee 2001; Seyhun 1992); (2) Boehmer and Netter (1997) use insider trading as a proxy for insiders' personal beliefs about their firms' prospects to investigate managements' resistance to acquisitions and takeovers, and abnormal insider purchases prior to private placements may reflect insider overoptimism about the future prospects of the firm; (3) Rozeff and Zaman (1998) suggest that insider purchases (sales) are positively (negatively) related to the book-to-market ratio, which measures security misvaluation; (4) Piotroski and Roulstone (2005) document that insider trades reflect both personal beliefs and superior knowledge about future cash flow realizations; and (5) the private information set leading to insider trading is comprehensive. Therefore, it is unlikely that insider trades and equity-selling mechanisms are driven by different sets of private information. Most studies examining the timing of insider trades indicate significant changes in insider trading patterns before the announcements of various corporate events, including equity offerings (Kahle 2000), earnings announcements (Park et al. 1995), dividend announcements (John and Lang 1991), capital expenditure announcements (John and Mishra 1990), and merger announcements (Keown and Pinkerton 1981). For these reasons, this paper hypothesizes that the likelihood of placing equity privately correlates positively with the likelihood of insider purchases and negatively with the likelihood of insider sales.
74.3 Alternative Econometric Methods for Information-Based Equity-Selling Mechanisms

Empirical work examining the information contents of equity-selling mechanisms is typically based on proxies for information asymmetry (Wu 2004). The primary difficulty with the proxies is the errors-in-variables problem. Because information asymmetry is unobservable, any empirical attempt to capture information asymmetry requires the use of proxies. It is plausible that some proxies capture underlying intrinsic characteristics rather than information asymmetry. As a result, an inference drawn from the significance of the coefficients of the proxies for information asymmetry may be biased. To circumvent the potential biases embedded in this approach, this paper distinguishes observable and unobservable variables. This distinction is fundamental for understanding the parameters of interest and the inferences based on them.

This paper assumes that the firm-specific information set $x$ has two components such that $x = (x_{obs}, x_{unobs})$, where $x_{obs}$ and $x_{unobs}$ denote the vectors of the exhaustive sets of firm-specific variables observed by public investors and insiders, respectively. To control for the surrogates for the private information, $x_{obs}$ includes the firm's age at the IPO date, the duration in years between the IPO date and the equity offering announcement, the bid-ask spread, the number of shareholders, and monthly trading volume divided by the average of shares outstanding over the previous 2 years. To control for the impact of security misvaluation (Rozeff and Zaman 1998; Piotroski and Roulstone 2005), $x_{obs}$ includes the contemporaneous 1-month buy-and-hold equal-weighted market-adjusted return and the firm's book-to-market ratio. $x_{obs}$ also includes variables to capture preissue ownership structure and the extent of CEOs' organizational power (Wruck 1989; Wu 2004; Barclay et al. 2007), prior financial performance and financial constraints, firm size, issue size, firm risk, the impact of stock options (Biais and Hillion 1994), the stock exchanges (Lin and Howe 1990), and the identities of insiders (Seyhun 1986).¹

¹ See Lee and Wu (in press) for the detailed list of observable factors affecting insider trades and equity-selling mechanisms.

Let $(c^*_i, p^*_i)$ denote a pair of informed parties' decisions. While $(c^*_i, p^*_i)$ are unobserved, the underlying decision outcomes, that is, intensive insider purchases (sales) and the equity-selling mechanisms, are observed. $(c_i, p_i)$ are related to $(c^*_i, p^*_i)$ in the sense that $c_i \equiv I_{(0,\infty)}(c^*_i)$ and $p_i \equiv I_{(0,\infty)}(p^*_i)$ or $p_i \equiv p^*_i$, depending on the econometric specification. Specifically, $c_i$ equals one or zero, depending on whether the firm chooses private or public offerings, and $p_i$ measures an intensive insider purchases
(sales) event. We use two standard measures of an intensive insider purchase (sale) event: first, three or more insider purchases (sales) and no insider trade occurring in the opposite direction in the 1, 3, 6, or 12 months prior to the equity offering announcement (Jaffe 1974; Rozeff and Zaman 1988; Lin and Howe 1990); second, a positive (negative) difference between insider purchases and sales in the 1, 3, 6, or 12 months prior to the equity offering announcement (John and Lang 1991). We introduce two benchmark estimation approaches: the two-stage estimation approach and the conditional correlation approach. Both approaches are performed under three alternative econometric specifications: (1) two independent probit models (the single-equation model); (2) the bivariate probit model, where $p_i$ is a binary variable that equals one if three or more insider purchases (sales) and zero insider sales (purchases) occur in the 1, 3, 6, or 12 months prior to the equity offering announcement, and zero otherwise; and (3) the mixed binary and censored probit model, where $p_i$ is a censored variable that equals the positive (negative) net insider purchases occurring in the 1, 3, 6, or 12 months prior to the equity offering announcement, and zero otherwise. Note that in all three econometric specifications $c_i = 1$ if firm $i$ chooses private placements and $c_i = 0$ if firm $i$ chooses public offerings. In addition, this paper introduces nonparametric approaches as a robustness check when the data are sensitive to the functional form and normal distribution of the error disturbances.
74.3.1 The Two-Stage Estimation Approach

This subsection describes the two-stage estimation approach used to test the information contents of equity-selling mechanisms. The equity-selling mechanism and intensive insider purchases (sales) are denoted by $(c_i, p_i)$. Assume

$$p^*_{i,t} = x^T_{obs,i,t}\gamma_1 + x^T_{unobs,i,t}\gamma_2 + \lambda_t + \tilde u_{i,t}, \qquad c^*_{i,t} = x^T_{obs,i,t}\beta_1 + x^T_{unobs,i,t}\beta_2 + \lambda_t + \tilde v_{i,t}, \quad (74.1)$$

where $E(\tilde u_{i,t}\mid x_{obs,i,t}, x_{unobs,i,t}) = 0$ and $E(\tilde v_{i,t}\mid x_{obs,i,t}, x_{unobs,i,t}) = 0$; the subscript $i$ indexes the firm, $t$ denotes time, $c_i$ equals one or zero depending on whether the firm chooses private or public offerings, $p_i$ measures an intensive insider purchases (sales) event, and the year dummies $\lambda_t$ capture the unobserved time-varying factors that are common across firms. Again, we assume that the firm-specific information $x$ has two components, $x_{obs}$ and $x_{unobs}$, denoting the vectors of the exhaustive sets of firm-specific variables observed by public investors and insiders, respectively. $x_{obs}$ includes measures of security misvaluation, information asymmetry, firm ownership structure, financial performance, and so on.² We assume that the conditional distribution of $x_{unobs,i}$ given $x_{obs,i}$ depends on $x_{obs,i}$ only through a linear regression function; i.e., a linear relationship exists between the surrogates for the private information and the private information:

$$x_{unobs,i,t} = A\,x_{obs,i,t} + \tilde e_{i,t},$$

where $E(\tilde e_{i,t}\mid x_{obs,i,t}) = 0$ and $Var(\tilde e_{i,t}\mid x_{obs,i,t}) = \Sigma_e$. As the public investors cannot directly observe $x_{unobs,i,t}$, Equation (74.1) can be rewritten as

$$p^*_{i,t} = x^T_{obs,i,t}(\gamma_1 + A^T\gamma_2) + \lambda_t + \tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}, \qquad c^*_{i,t} = x^T_{obs,i,t}(\beta_1 + A^T\beta_2) + \lambda_t + \tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}. \quad (74.2)$$

The term $\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}$ in Equation (74.2) is called the residual of the intensive insider trading regression, or abnormal insider trading. From the law of iterated expectations, $E(\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}\mid x_{obs,i,t}) = 0$ and $E(\tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}\mid x_{obs,i,t}) = 0$, given that $E(\tilde u_{i,t}\mid x_{obs,i,t}) = 0$ and $E(\tilde v_{i,t}\mid x_{obs,i,t}) = 0$. Two-stage estimation leads to

$$c^*_{i,t} = \big(\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}\big)\,\beta_3 + x^T_{obs,i,t}(\beta_1 + A^T\beta_2) + \lambda_t + \tilde w_{i,t}, \quad (74.3)$$

with $\tilde w_{i,t} = \tilde v_{i,t} - \beta_3\,\tilde u_{i,t}$ and $E(\tilde w_{i,t}\mid x_{obs,i,t}) = 0$; then $\beta_3 = \beta_2/\gamma_2$. The estimated residuals of the intensive insider purchase regression (i.e., abnormal insider purchases), $\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}$, measure unobserved private information. This is because $\beta_3$ is purged of firm-specific public information and transitory differences for firm $i$ across years. The signs of $\beta_2$ and $\gamma_2$ indicate the sign of $\beta_3$. Accordingly, if $\beta_3$ is significantly positive, negative, or indistinguishable from zero, then the private placement signals positive, negative, or no firm-specific private information, respectively. An opposite interpretation holds for the estimated residuals of the intensive insider sale regression (i.e., abnormal insider sales).

² Throughout the paper, we assume that both of the insiders' decisions are related to the same $x_{obs}$. The logic extends directly to more complicated models where each decision is related to a unique vector of observable variables, since we can partition $x_{obs,i,t}$ into $(x_{obs1,i,t}, x_{obs2,i,t})$, where

$$p^*_{i,t} = (x^T_{obs1,i,t},\, x^T_{obs2,i,t})\begin{pmatrix}0\\ \gamma_1\end{pmatrix} + x^T_{unobs,i,t}\gamma_2 + \lambda_t + \tilde u_{i,t}, \qquad c^*_{i,t} = (x^T_{obs1,i,t},\, x^T_{obs2,i,t})\begin{pmatrix}\beta_1\\ 0\end{pmatrix} + x^T_{unobs,i,t}\beta_2 + \lambda_t + \tilde v_{i,t}.$$
Compared with the framework using the proxy approach (Wu 2004), an emphasis here is that the two-stage estimation approach distinguishes between public and private information. The interpretation of the information content of the equity-selling mechanism choice is based not on the coefficients of the surrogates of private information but on the coefficient ($\beta_3$) of the residuals of the intensive insider trading regression. Since the inference of the information role is drawn from the significance of $\beta_3$, the regression (Equation (74.3)) tells us that we cannot ignore the private information simply because the surrogates and the private information are highly related ($A$ is significant), nor can we ignore the unobservable variables when the surrogates are empirically irrelevant ($\beta_1 + A^T\beta_2$ is insignificant).
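A minimal sketch of the two-stage procedure on simulated data is given below, assuming standard probit machinery (statsmodels) and the probit generalized residual, $\phi(x'\hat\gamma)/\Phi(x'\hat\gamma)$ for $p_i = 1$ and $-\phi(x'\hat\gamma)/[1-\Phi(x'\hat\gamma)]$ for $p_i = 0$. All variable names and parameter values are illustrative, not estimates from the paper.

```python
# Minimal sketch of the two-stage estimation approach of Sect. 74.3.1.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 2000
x_obs = rng.normal(size=(n, 3))            # observable firm characteristics
x_unobs = rng.normal(size=n)               # private information (latent)

# Latent decisions driven by the same private information (by construction).
p_star = x_obs @ [0.5, -0.3, 0.2] + 1.0 * x_unobs + rng.normal(size=n)
c_star = x_obs @ [0.2, 0.4, -0.1] + 1.0 * x_unobs + rng.normal(size=n)
p = (p_star > 0).astype(int)               # intensive insider purchases
c = (c_star > 0).astype(int)               # private placement indicator

X = sm.add_constant(x_obs)

# Stage 1: probit of insider purchases on observables; keep generalized
# residuals as the "abnormal insider purchases" measure.
stage1 = sm.Probit(p, X).fit(disp=0)
xb = stage1.fittedvalues                   # linear index x'gamma_hat
gen_resid = np.where(p == 1,
                     norm.pdf(xb) / norm.cdf(xb),
                     -norm.pdf(xb) / (1 - norm.cdf(xb)))

# Stage 2: probit of the equity-selling choice on observables plus the
# abnormal-insider-purchase residual; the sign of beta3 carries the
# information content of the private placement decision.
stage2 = sm.Probit(c, np.column_stack([X, gen_resid])).fit(disp=0)
print("beta3 =", stage2.params[-1])        # positive here, by construction
```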
74.3.2 The Conditional Correlation Approach

In a purely statistical sense, for any two informed parties' decisions, as long as the pair of unobservable variables is correlated then, conditional on the observable variables, the residuals of these two informed parties' decisions will be correlated. The conditional correlation coefficient is derived from the correlation between the generalized residuals (or the standardized residuals) of the intensive insider trading regression and the equity-selling mechanism choice regression:

$$p^*_{i,t} = x^T_{obs,i,t}(\gamma_1 + A^T\gamma_2) + \lambda_t + \tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}, \qquad c^*_{i,t} = x^T_{obs,i,t}(\beta_1 + A^T\beta_2) + \lambda_t + \tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}. \quad (74.4)$$

The conditional correlation coefficient is given by

$$\mathrm{Corr}\big(\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t},\; \tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}\mid x_{obs,i,t}\big) = \frac{\mathrm{Cov}\big(\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t},\; \tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}\mid x_{obs,i,t}\big)}{\sqrt{\mathrm{Var}\big(\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}\mid x_{obs,i,t}\big)}\,\sqrt{\mathrm{Var}\big(\tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}\mid x_{obs,i,t}\big)}} = \frac{\gamma_2^T\Sigma_e\beta_2}{\big(\gamma_2^T\Sigma_e\gamma_2 + \sigma_u^2\big)^{1/2}\,\big(\beta_2^T\Sigma_e\beta_2 + \sigma_v^2\big)^{1/2}}.$$

The necessary and sufficient condition for $\mathrm{Corr}(\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t},\; \tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}\mid x_{obs,i,t})$ to be positive is $\gamma_2^T\Sigma_e\beta_2 > 0$. Two special cases can be summarized as follows: if $\dim(\Sigma_e) = p > 1$, with $\Sigma_e = \mathrm{diag}\{\sigma^2_{e,1},\ldots,\sigma^2_{e,p}\}$ and $\beta_{2,j}\gamma_{2,j} > 0$ for $j = 1,\ldots,p$, then $\gamma_2^T\Sigma_e\beta_2 > 0$; if $\dim(\Sigma_e) = 1$, then $\gamma_2^T\Sigma_e\beta_2 > 0$ if and only if $\gamma_2\beta_2 > 0$. The direction of the conditional correlation depends on the sign of the numerator, $\gamma_2^T\Sigma_e\beta_2$. In the case where the off-diagonal elements of $\Sigma_e$ are zero and each element of the two column vectors $\beta_2$ and $\gamma_2$ is of the same sign (except for the zero vector), $\gamma_2^T\Sigma_e\beta_2$ is greater than zero. If, instead of being multidimensional, $x_{unobs,i}$ can be fully captured in a single variable, then $\gamma_2\sigma^2_e\beta_2$ is greater than zero if and only if $\gamma_2\beta_2$ is greater than zero.

Note that the conditional correlation coefficient that we explore is quite different from a Pearson correlation. Conditioning on firm-specific public information and transitory differences for firm $i$ across years, the conditional correlation coefficient from Equation (74.4) is purged of both. Accordingly, if $\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}$ correlates positively, negatively, or not at all with $\tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}$, then this statistical relationship serves as evidence that private placements signal positive, negative, or no firm-specific private information, respectively.

Comparing Equations (74.2) and (74.4) shows that the sign of the coefficient ($\beta_3 = \beta_2/\gamma_2$) of the residuals of the intensive insider trading regression is the same as that of the conditional correlation coefficient.³ There are two major reasons that the conditional correlation approach may prevail over the two-stage estimation approach. First, the conditional correlation coefficient, which is bounded by −1 and 1, permits comparisons across samples of the magnitude of private information revealed in equity-selling mechanism choices, unlike the two-stage estimation approach, which suffers from the inherent incommensurability across samples of the coefficient of the residuals of the intensive insider trading regression ($\tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}$). Second, at each fitted probability, the variance of the residuals of the intensive insider trading regression equals the fitted probability times one minus the fitted probability. Thus if the fitted probability is extreme, the variance of the abnormal insider trading is either very small or very large; as a result, the abnormal insider trading may have a skewed distribution.

Both the two-stage estimation approach and the conditional correlation approach can be performed under the three variations of probit models: the two independent probit models, the bivariate probit model that explicitly accounts for the cross-equation impact, and the mixed binary and censored probit model where one response variable is continuous and the other is binary. For brevity, we consider the conditional correlation approach only; the derivation applies directly to the two-stage estimation approach.

³ Here, we assume that the conditional correlation coefficient remains constant over time. This assumption reduces the number of parameters considerably and hence makes efficient estimation possible. Note that we include year dummies $\lambda_t$ throughout the analysis to capture any unobserved time-varying factors that are common across firms.
74.3.2.1 Two Independent Probit Models (Single-Equation Model)

We can reparameterize Equation (74.2) as

$$p^*_{i,t} = x^T_{obs,i,t}\gamma + \lambda_t + \tilde\eta_{i,t}, \qquad c^*_{i,t} = x^T_{obs,i,t}\beta + \lambda_t + \tilde\varepsilon_{i,t}, \quad (74.5)$$

with $\gamma = \gamma_1 + A^T\gamma_2$, $\tilde\eta_{i,t} = \tilde e^T_{i,t}\gamma_2 + \tilde u_{i,t}$, $\beta = \beta_1 + A^T\beta_2$, and $\tilde\varepsilon_{i,t} = \tilde e^T_{i,t}\beta_2 + \tilde v_{i,t}$. As $p_i \equiv I_{(0,\infty)}(p^*_i)$ and $c_i \equiv I_{(0,\infty)}(c^*_i)$, then

$$p(p_{i,t} = 1) = p(p^*_{i,t} > 0) = p\big(-\tilde\eta_{i,t} < x^T_{obs,i,t}\gamma + \lambda_t\big) = \Phi\big(x^T_{obs,i,t}\gamma + \lambda_t\big),$$
$$p(c_{i,t} = 1) = p(c^*_{i,t} > 0) = p\big(-\tilde\varepsilon_{i,t} < x^T_{obs,i,t}\beta + \lambda_t\big) = \Phi\big(x^T_{obs,i,t}\beta + \lambda_t\big),$$

where $\big(\tilde\eta_{i,t},\,\tilde\varepsilon_{i,t}\big)^T \overset{i.i.d.}{\sim} N(0, I_2)$, $I_2$ being the identity matrix of order two, and $\phi$ and $\Phi$ denote, respectively, the density function and the cumulative distribution function of $N(0,1)$. The single-equation model computes the generalized residuals, $\hat\eta_i$ and $\hat\varepsilon_i$, given by the conditional means $E(\hat\eta_i\mid p_i)$ and $E(\hat\varepsilon_i\mid c_i)$, respectively. For example, $\hat\varepsilon_i = E(\hat\varepsilon_i\mid c_i = 1) = \phi(x^T_{obs,i,t}\beta + \lambda_t)/\Phi(x^T_{obs,i,t}\beta + \lambda_t)$ or $\hat\varepsilon_i = E(\hat\varepsilon_i\mid c_i = 0) = -\phi(x^T_{obs,i,t}\beta + \lambda_t)/\big[1 - \Phi(x^T_{obs,i,t}\beta + \lambda_t)\big]$. We then define a test statistic given by

$$\hat\rho = \frac{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}\hat\varepsilon_i\,\hat\eta_i}{\sqrt{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}\hat\varepsilon_i^2\;\cdot\;\dfrac{1}{n}\displaystyle\sum_{i=1}^{n}\hat\eta_i^2}}.$$

Under the null of i.i.d. generalized residuals, $\sqrt n\,(\hat\rho - \rho)$ is distributed asymptotically as a standard normal distribution. Consequently, $n\hat\rho^2$ is distributed asymptotically as a chi-squared ($\chi^2$) distribution with one degree of freedom. Compared with the test statistic

$$\Psi = \frac{\Big(\sum_{i=1}^{n}\hat\varepsilon_i\,\hat\eta_i\Big)^2}{\sum_{i=1}^{n}\hat\varepsilon_i^2\,\hat\eta_i^2}\;\overset{d}{\sim}\;\chi^2(1),$$

developed by Gouriéroux et al. (1987) and employed by Chiappori and Salanie (2000) for testing asymmetric information in the French automobile insurance market, there are two important advantages of adopting $\hat\rho$ instead of $\Psi$. First, in addition to the common assumption that observations are cross-sectionally independently and identically distributed (i.e., i.i.d.), their test statistic is only valid under one more assumption: that the pair of generalized residuals is conditionally independent, i.e., $\mathrm{Cov}(\hat\varepsilon_i, \hat\eta_i) = 0$. Second, $\Psi$ fails to retain information on both the direction and the magnitude of the correlation. Since the key objective of this paper is to detect the information content of equity-selling mechanisms based on the direction and the magnitude of the conditional association between private placements and insider trading, we adopt $\hat\rho$ as a test statistic instead of $\Psi$.
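A minimal sketch of the $\hat\rho$ statistic, assuming the two vectors of generalized residuals have already been computed (e.g., as in the two-stage sketch above), follows; the function name is illustrative.

```python
# Minimal sketch of the conditional-correlation test statistic of
# Sect. 74.3.2.1: rho_hat and its chi-squared transform n * rho_hat**2.
import numpy as np

def rho_hat_test(eps, eta):
    """eps, eta: generalized residuals of the equity-selling-choice and
    insider-trading probits.  Returns (rho_hat, n * rho_hat**2); the latter
    is asymptotically chi-squared(1) under the null of no correlation."""
    eps, eta = np.asarray(eps), np.asarray(eta)
    n = len(eps)
    rho = np.mean(eps * eta) / np.sqrt(np.mean(eps ** 2) * np.mean(eta ** 2))
    return rho, n * rho ** 2
```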
74.3.2.2 The Bivariate Probit Models

The bivariate probit model explicitly accounts for the cross-equation impact. As a result, it may yield estimators that are asymptotically more efficient than the single-equation approach, provided that the residuals from the two equations are correlated (Zellner and Lee 1965). Consider

$$p^*_{i,t} = x^T_{obs,i,t}\gamma + \lambda_t + \tilde\eta_{i,t}, \qquad c^*_{i,t} = x^T_{obs,i,t}\beta + \lambda_t + \tilde\varepsilon_{i,t},$$
where $p_{i,t}$ is a binary variable that equals one if three or more insider purchases (sales) and zero insider sales (purchases) occur in the 1, 3, 6, or 12 months prior to the equity offering announcement and zero otherwise; $c_{i,t}$ equals one or zero, depending on whether the firm chooses private or public offerings; and $\big(\tilde\eta_{i,t},\,\tilde\varepsilon_{i,t}\big)^T \overset{i.i.d.}{\sim} N(0,\Sigma)$, with $\Sigma = \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}$. Then $\mathrm{Corr}(\tilde\eta_{i,t}, \tilde\varepsilon_{i,t}) = \rho$.⁴ Let $\Phi$ denote the cumulative distribution function of $N(0,1)$ and set

$$F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big) = \int_{-\infty}^{x^T_{obs,i,t}\gamma + \lambda_t}\int_{-\infty}^{x^T_{obs,i,t}\beta + \lambda_t}\frac{1}{2\pi\sqrt{1-\rho^2}}\;e^{-\big[(\tilde\eta_{i,t})^2 + (\tilde\varepsilon_{i,t})^2 - 2\rho\,\tilde\eta_{i,t}\tilde\varepsilon_{i,t}\big]/\big[2(1-\rho^2)\big]}\;d\tilde\eta_{i,t}\,d\tilde\varepsilon_{i,t}.$$

Then

$$p(p_{i,t} = 1) = p(p^*_{i,t} > 0) = \Phi\big(x^T_{obs,i,t}\gamma + \lambda_t\big),$$
$$p(c_{i,t} = 1) = p(c^*_{i,t} > 0) = \Phi\big(x^T_{obs,i,t}\beta + \lambda_t\big),$$
$$p(p_{i,t} = 1,\, c_{i,t} = 1) = p(p^*_{i,t} > 0,\, c^*_{i,t} > 0) = F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big),$$
$$p(c_{i,t} = 1,\, p_{i,t} = 0) = p(c_{i,t} = 1) - p(c_{i,t} = 1,\, p_{i,t} = 1) = \Phi\big(x^T_{obs,i,t}\beta + \lambda_t\big) - F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big),$$
$$p(c_{i,t} = 0,\, p_{i,t} = 1) = p(p_{i,t} = 1) - p(c_{i,t} = 1,\, p_{i,t} = 1) = \Phi\big(x^T_{obs,i,t}\gamma + \lambda_t\big) - F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big),$$
$$p(c_{i,t} = 0,\, p_{i,t} = 0) = 1 - \Phi\big(x^T_{obs,i,t}\beta + \lambda_t\big) - \Phi\big(x^T_{obs,i,t}\gamma + \lambda_t\big) + F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big).$$

The log-likelihood function for the parameters $\theta \equiv (\beta^T, \gamma^T, \rho)^T$ is then given by

$$\begin{aligned} l(\theta) = \sum_{i=1}^{n}\Big[&\,c_{i,t}\,p_{i,t}\,\log F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big)\\ &+ c_{i,t}\,(1 - p_{i,t})\,\log\big[\Phi\big(x^T_{obs,i,t}\beta + \lambda_t\big) - F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big)\big]\\ &+ p_{i,t}\,(1 - c_{i,t})\,\log\big[\Phi\big(x^T_{obs,i,t}\gamma + \lambda_t\big) - F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big)\big]\\ &+ (1 - p_{i,t})(1 - c_{i,t})\,\log\big[1 - \Phi\big(x^T_{obs,i,t}\beta + \lambda_t\big) - \Phi\big(x^T_{obs,i,t}\gamma + \lambda_t\big) + F\big(x^T_{obs,i,t}\gamma + \lambda_t,\; x^T_{obs,i,t}\beta + \lambda_t;\,\rho\big)\big]\Big].\end{aligned}$$

⁴ Note that the equation $c^*_{i,t} = x^T_{obs,i,t}\beta + \lambda_t + \tilde\varepsilon_{i,t}$ is equivalent to the corresponding unnormalized equation with $\mathrm{var}(\tilde\varepsilon_{i,t}) = \sigma^2_\varepsilon$: the conditional correlation between $(c^*_{i,t}, p^*_{i,t})$ depends only on the signs of $(c^*_{i,t}, p^*_{i,t})$, not on their absolute scale.
74.3.2.3 The Mixed Binary and Censored Probit Models

A continuous response variable may better differentiate between varying levels of insiders' perceptions of their firms' prospects, compared to a binary response variable. Consider

$$p^*_{i,t} = x^T_{obs,i,t}\gamma + \lambda_t + \tilde\eta_{i,t}, \qquad c^*_{i,t} = x^T_{obs,i,t}\beta + \lambda_t + \tilde\varepsilon_{i,t}.$$

Let $p_{i,t} = p^*_{i,t}$, the net number of insider purchases, if the 1, 3, 6, or 12 months prior to the equity offering constitute an intensive insider purchase event, and $p_{i,t} = 0$ otherwise; $c_{i,t} = 1$ if firm $i$ chooses private placements and $c_{i,t} = 0$ if firm $i$ chooses public offerings; and $\big(\tilde\eta_{i,t},\,\tilde\varepsilon_{i,t}\big)^T \overset{i.i.d.}{\sim} N(0,\Sigma)$, with

$$\Sigma = \begin{pmatrix}\sigma^2 & \rho\sigma\\ \rho\sigma & 1\end{pmatrix}.$$

⁵ To save space in the derivation, we consider $x_{obs}$ only, ignoring the firm $i$ and time $t$ notation.

Let $\phi$ and $\Phi$ denote, respectively, the density function and the cumulative distribution function of $N(0,1)$. The joint density function of $c_{i,t}$ and $p^*_{i,t}$ equals the conditional distribution of $c^*_{i,t}$ given $p^*_{i,t}$ multiplied by the marginal density function of $p^*_{i,t}$.⁵ The conditional distribution of $c^*_{i,t}$ given $p^*_{i,t}$ is $N\big(x^T_{obs}\beta + \lambda_t + \rho\,(p^*_{i,t} - x^T_{obs}\gamma - \lambda_t)/\sigma,\; 1-\rho^2\big)$. Then

$$p_{c_{i,t},\,p^*_{i,t}}(1, p^*_{i,t}) = \Phi\!\left(\frac{x^T_{obs}\beta + \lambda_t + \rho\,(p^*_{i,t} - x^T_{obs}\gamma - \lambda_t)/\sigma}{\sqrt{1-\rho^2}}\right)\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left[-\frac{\big(p^*_{i,t} - x^T_{obs}\gamma - \lambda_t\big)^2}{2\sigma^2}\right],$$

$$p_{c_{i,t},\,p^*_{i,t}}(0, p^*_{i,t}) = \left[1 - \Phi\!\left(\frac{x^T_{obs}\beta + \lambda_t + \rho\,(p^*_{i,t} - x^T_{obs}\gamma - \lambda_t)/\sigma}{\sqrt{1-\rho^2}}\right)\right]\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left[-\frac{\big(p^*_{i,t} - x^T_{obs}\gamma - \lambda_t\big)^2}{2\sigma^2}\right].$$

Set the parameters $\theta \equiv (\beta^T, \gamma^T, \sigma^2, \rho)^T$ and

$$g(p_i, x_{obs}, \theta) \equiv \frac{x^T_{obs}\beta + \lambda_t + \rho\,(p_{i,t} - x^T_{obs}\gamma - \lambda_t)/\sigma}{\sqrt{1-\rho^2}}.$$

Then the log-likelihood function for the parameters is

$$l(\theta) = \sum_{i=1}^{n}\left[-\frac{\big(p_{i,t} - x^T_{obs}\gamma - \lambda_t\big)^2}{2\sigma^2} - \frac12\log\big(2\pi\sigma^2\big) + c_i\log\Phi\big(g(p_i, x_{obs}, \theta)\big) + (1 - c_i)\log\big[1 - \Phi\big(g(p_i, x_{obs}, \theta)\big)\big]\right].$$
We estimate $\theta$ by solving the likelihood equation (or score equation) $\partial l(\theta)/\partial\theta\,\big|_{\theta=\hat\theta} = 0$, where

$$\frac{\partial l(\theta)}{\partial\beta} = \sum_{i=1}^{n}\left[\frac{c_i}{\Phi\big(g(p_i,x_{obs},\theta)\big)} - \frac{1-c_i}{1-\Phi\big(g(p_i,x_{obs},\theta)\big)}\right]\frac{\phi\big(g(p_i,x_{obs},\theta)\big)}{\sqrt{1-\rho^2}}\;x_{obs},$$

$$\frac{\partial l(\theta)}{\partial\gamma} = \sum_{i=1}^{n}\left\{\frac{p_i - x^T_{obs}\gamma - \lambda_t}{\sigma^2} - \left[\frac{c_i}{\Phi\big(g(p_i,x_{obs},\theta)\big)} - \frac{1-c_i}{1-\Phi\big(g(p_i,x_{obs},\theta)\big)}\right]\frac{\rho\,\phi\big(g(p_i,x_{obs},\theta)\big)}{\sigma\sqrt{1-\rho^2}}\right\}x_{obs},$$

$$\frac{\partial l(\theta)}{\partial\sigma^2} = \sum_{i=1}^{n}\left\{-\frac{1}{2\sigma^2} + \frac{\big(p_i - x^T_{obs}\gamma - \lambda_t\big)^2}{2\sigma^4} - \left[\frac{c_i}{\Phi\big(g(p_i,x_{obs},\theta)\big)} - \frac{1-c_i}{1-\Phi\big(g(p_i,x_{obs},\theta)\big)}\right]\frac{\rho\,\phi\big(g(p_i,x_{obs},\theta)\big)\big(p_i - x^T_{obs}\gamma - \lambda_t\big)}{2\sigma^3\sqrt{1-\rho^2}}\right\},$$

$$\frac{\partial l(\theta)}{\partial\rho} = \sum_{i=1}^{n}\left[\frac{c_i}{\Phi\big(g(p_i,x_{obs},\theta)\big)} - \frac{1-c_i}{1-\Phi\big(g(p_i,x_{obs},\theta)\big)}\right]\phi\big(g(p_i,x_{obs},\theta)\big)\,\frac{(1-\rho^2)\big(p_i - x^T_{obs}\gamma - \lambda_t\big)/\sigma + \rho\sqrt{1-\rho^2}\;g(p_i,x_{obs},\theta)}{(1-\rho^2)^{3/2}}.$$

The observed Fisher information for $\theta$ is $-\partial^2 l(\theta)/\partial\theta\,\partial\theta^T\big|_{\theta=\hat\theta}$, whose blocks are obtained by differentiating the score components above. For example,

$$\frac{\partial^2 l(\theta)}{\partial\beta\,\partial\beta^T} = -\frac{1}{1-\rho^2}\sum_{i=1}^{n}\left\{\phi\big(g(p_i,x_{obs},\theta)\big)^2\left[\frac{c_i}{\Phi\big(g(p_i,x_{obs},\theta)\big)^2} + \frac{1-c_i}{\big(1-\Phi(g(p_i,x_{obs},\theta))\big)^2}\right] + g(p_i,x_{obs},\theta)\,\phi\big(g(p_i,x_{obs},\theta)\big)\left[\frac{c_i}{\Phi\big(g(p_i,x_{obs},\theta)\big)} - \frac{1-c_i}{1-\Phi\big(g(p_i,x_{obs},\theta)\big)}\right]\right\}x_{obs}\,x^T_{obs}.$$

The remaining blocks, $\partial^2 l/\partial\beta\,\partial\gamma^T$, $\partial^2 l/\partial\beta\,\partial\sigma^2$, $\partial^2 l/\partial\beta\,\partial\rho$, $\partial^2 l/\partial\gamma\,\partial\gamma^T$, $\partial^2 l/\partial\gamma\,\partial\sigma^2$, $\partial^2 l/\partial\gamma\,\partial\rho$, $\partial^2 l/\partial(\sigma^2)^2$, $\partial^2 l/\partial\sigma^2\,\partial\rho$, and $\partial^2 l/\partial\rho^2$, follow by the same mechanical differentiation; each involves only $\phi$, $\Phi$, $g(p_i, x_{obs}, \theta)$, the residual $p_i - x^T_{obs}\gamma - \lambda_t$, and the regressors $x_{obs}$, and since the matrix is symmetric, only the upper triangle need be computed.
74.3.3 The Nonparametric Approach The discussion so far assumes that the error disturbances are normally distributed.and that a parametric function for the conditional expectation of a dependent variable has been specified correctly. However, there is no prior assumption that the intrinsic relation between xunobs;i and xobs;i must be linear [see Chen et al. (2004) on using residual analysis technique to detect the nonlinear specification in corporate finance research]. It is also possible for the pair of informed parties’ decisions given observable variables to be linearly uncorrelated but dependent.
Since the functional forms of the parametric procedures are relatively restrictive. we propose a nonparametric approach based on sign test, and a chi-squares test for independence.
74.3.3.1 Sign Test and Chi-Squared Test for Independence A sign tests based on a 2 2 contingency table are performed to examine whether the signs of the generalized residuals for equity-selling mechanism choice (b "i ) match with the
74 Alternative Econometric Methods for Information-based Equity-selling Mechanisms
signs of the generalized residuals for insider trades (b i / from Equation (74.5) using the saturated models (models that include a exhaustive set of xobs;i ) under two independent probit models, the bivariate probit models, and the mixed binary and censored probit models.6 Here, we also use the nonparametric Chi-square test to examine if the estimated impact of abnormal insider purchases (sales) is sensitive to the methodological approach. The proposed grouping is to let xi be a vector of m (number of independent variables) dummies (either 0 or 1). Firms having the same values for m dummies are grouped by 2m : In each grouping, a two-by-two classification table generated by the values of ci and pi is computed. Let j D 0; 1: pi D 0
pi D 1
ci D 0
n00
n01
ci D 1
n10
n11
Define nj: D nj 0 C nj1 ; n:j D n0j C n1j ; and n:: D n:0 C n:1 . Now consider the test statistic Vm D P Œnj k .nj: n:k =n:: / 2 , which is the well-known 2 test for nj k
j;kD0;1
independence. If ci and pi are independently distributed, then Vm is 2 .1/. The probability of rejecting independence in each grouping is assumed to be 0.05. That is, if the value of Vm for a certain grouping is greater than 3.84, the 5% critical value for 2 .1/, the null hypothesis of independence is rejected. Furthermore, if Vm is assumed to be independent between groupings, then the probability of rejecting independence for the 2m groupings has a binomial distribution with the number of trials equals 2m and the probability of success equals 0.05, the probability of rejecting independence in each grouping. The statistic is generated by adding up the Vm values for 2m cells, which is distributed as a 2 .2m /. To implement this test, we first need to do the groupings. It is infeasible to use all observable variables (xobs /, since it will result in too small number of observations in each cell. We could use around four to five variables, depending on the
6
Bear in mind that the estimated generalized residuals in binary response models (ci ,pi ) are not direct measures of error terms from the underlying regression models where the dependent variables (ci ; pi / are not directly observable. The estimated correlation coefficient for the individual generalized residuals is as follows: EŒ.c E.c //.pi E.pi // E.ci D1;pi D1/E.ci D1/E.pi D1/ p b D pvar.c iE.c i//pvar.p D pvar.c : E.p // E.c // var.p E.p // i
i
i
i
i
i
i
i
T T Since Max(0,1-ˆ.xobs;i ˇ/ ˆ.xobs;i // E.ci D 1; pi D 1/ T T Min.ˆ.xobs;i ˇ/; ˆ.xobs;i //: is bounded within certain range and varies Therefore, the range of b with the independent variable. Because of this important caveat, we propose to use the sign test that focuses on the sign as a robustness test.
1161
sample size, that are highly significant in the regressions for equity-selling mechanism choices and the intensity of insider trading.
74.4 Conclusions Recently, there has been a considerable debate on the information content of the equity-selling mechanisms. This paper adopts a two-stage estimation approach and a conditional correlation methodology to reexamine this debate. This paper assumes that both the choice of equity-selling mechanisms and insider trades are driven by the same set of private information. This paper uses the estimated residuals from the insider trading regression as a measure of insiders’ superior information about the firm’s prospects and hypothesizes that private placement firms are undervalued when the estimated residuals from the insider purchase (sale) regression correlate positively (negatively) with the probability of making private placements. The conditional correlation coefficient we explore is quite different from Pearson correlations in that it is purged of public information and measures correlation between two set of private information. These two approaches help to circumvent the problems with cross-sectional studies that proxies for information asymmetry are plagued by significant measurement errors and the problems with event studies that long-term ex post stock performance may be endogenously affected by equity-selling mechanisms (Wruck 1989) and statistical significant event date returns could be driven by market microstructure problems (Lease et al. 1991) or long-term event-study methodologies (Mitchell and Stafford 2000). This conditional correlation is initially used by Chiappori and Salanie (2000). This paper relaxes an extremely restrictive assumption required by Chiappori’s and Salanie’s test statistic (p. 66) of conditional independence within residuals of a pair of regressions and retains information on both the direction and magnitude of correlation. This retention is crucial because the key objective of this paper is to detect the direction of the conditional association in order to infer whether private placements signal positive, negative, or no firm-specific private information, respectively. In our application of the conditional correlation approach, one of the pair of the informed parties’ decisions is the equity selling mechanism choice and the other informed parties’ decision is the insider trading activity. Instead of using the insider trading, we could use the revisions in analysts’ EPS forecasts. Studies widely use analysts as a good proxy for well-informed investors (e.g., Womack 1996). The rationality of analysts’ EPS forecasts is supported by Keane and Runkle (1998). The rationality is in the sense of Muth (1961), that is, the analysts’ EPS forecasts are equal to the expectation of actual
1162
EPS, conditional on the available information at forecasting date. This paper applies the conditional correlation approach to incorporate cases where both response variables are binary and cases where response variables are continuous instead of binary. Similar application of the conditional correlation approach in other setting may allow researchers to answer other important empirical questions more convincingly. Apart from its methodological contribution, the empirical results shed light on the debate over private placement puzzles and the estimates could offer suggestions to the SEC proposal of integrating private and public offerings (Davidson et al. 1998, pp. 151–382).
References Barclay, M. J., C. G. Holderness, and D. P. Sheehan. 2007. Private placements and managerial entrenchment, Unpublished working paper, Boston College. Biais, B. and P. Hillion. 1994. “Insider and liquidity trading in stock and options markets.” Review of Financial Studies 7, 743–780. Boehmer, E. and J. M. Netter. 1997. “Management optimism and corporate acquisitions: evidence from insider trading.” Managerial and Decision Economics 18, 693–708. Chen, S.-S., K. W. Ho, C.-F. Lee, and K. Shrestha. 2004, “Nonlinear models in corporate finance research: review, critique, and extensions.” Review of Quantitative Finance and Accounting 22(2), 141–169. Chiappori, P.-A. and B. Salanie. 2000. “Testing for asymmetric information in insurance markets.” Journal of Political Economy 108, 56–78. Clarke, J., C. Dunbar, and K. M. Kahle. 2001. “Long-run performance and insider trading in completed and canceled seasoned equity offerings.” Journal of Financial and Quantitative Analysis 36, 415–430. Damodaran, A. and C. H. Liu. 1993. “Insider trading as a signal of private information.” Review of Financial Studies 6, 79–119. Davidson, G. K., R. C. Nash, R. R. Plumridge. 1998. Private placements, Practicing Law Institute, New York. Goh, J., M. J. Gombola, H. W. Lee, and Y. L. Feng. 1999. “Private placement of common equity and earnings expectations.” Financial Review 34, 19–32. Gouriéroux, C., A. Monfort, E. Renault, and A. Trognon. 1987. “Generalized residuals.” Journal of Econometrics 34, 5–32. Hertzel, M. G. and L. Rees. 1998. “Earnings and risk changes around private placements of equity.” Journal of Accounting, Auditing and Finance 13, 21–35. Hertzel, M. G. and R. L. Smith. 1993. “Market discounts and shareholder gains for placing equity privately.” Journal of Finance 48, 459–485. Hertzel, M. G., M. Lemmon, J. Linck, and L. Rees. 2002. “Long-run performance following private placements of equity.” Journal of Finance 57, 2595–2617. Huson, M. R., P. H. Malatesta, and R. Parrino. 2006. Capital market conditions and the volume and pricing of private equity sales, Unpublished working paper, University of Alberta, University of Washington and University of Texas at Austin. Jaffe, J. F. 1974. “Special information and insider trading.” Journal of Business 47, 410–428.
L. Cheng-Few and Y.L. Wu Jain, P. C. 1992. “Equity issues and changes in expectations of earnings by financial analysts.” Review of Financial Studies 5, 669–683. John, K. and L. H. P. Lang. 1991. “Insider trading around dividend announcements: theory and evidence.” Journal of Finance 46, 1361–1389. John, K. and B. Mishra. 1990. “Information content of insider trading around corporate announcements: the case of capital expenditures.” Journal of Finance 45, 835–855. Kahle, K. M. 2000. “Insider trading and the long-run performance of new security issues.” Journal of Corporate Finance: Contracting, Governance and Organization 6, 25–53. Karpoff, J. M. and D. Lee. 1991. “Insider trading before new issue announcements.” Financial Management 20, 18–26. Keane, M. P. and D. E. Runkle. 1998. “Are financial analysts’ forecasts of corporate profits rational?” Journal of Political Economy 106, 768–805. Keown, A. J. and J. M. Pinkerton. 1981. “Merger announcements and insider trading activity: an empirical investigation.” Journal of Finance 36, 855–869. Krishnamurthy S., S. Paul, S. Venkat, and W. Tracie. 2005. “Does equity placement type matter? Evidence from long-term stock performance following private placements.” Journal of Financial Intermediation 14, 210–238. Lakonishok, J. and I. Lee. 2001. “Are insider trades informative?” Review of Financial Studies 14, 79–111. Lease, R. C., R. W. Masulis, W. Ronald, and J. Page. 1991. “An investigation of market microstructure impacts on event study returns.” Journal of Finance 46, 1523–1536. Lee, I. 1997. “Do firms knowingly sell overvalued equity?” Journal of Finance 52, 1439–1466. Lee, C.-F. and Y. L. Wu. in press. “An econometric investigation of the information content of equity-selling mechanisms.” Review of Quantitative Finance and Accounting. Lin, J. C. and J. S. Howe. 1990. “Insider trading in the OTC market.” Journal of Finance 45, 1273–1284. Loughran, T. and J. R. Ritter. 1995. “The new issues puzzle.” Journal of Finance 50, 23–51. Loughran, T. and J. R. Ritter. 1997. “The operating performance of firms conducting seasoned equity offerings.” Journal of Finance 52, 1823–1850. Masulis, R. W. and A. N. Korwar. 1986. “Seasoned equity offerings: an empirical investigation.” Journal of Financial Economics 15, 91–118. Meulbroek, L. A. 1992. “An empirical analysis of illegal insider trading.” Journal of Finance 47, 1661–1699. Mitchell, M. L. and E. Stafford. 2000. “Managerial decisions and longterm stock price performance.” Journal of Business 73, 287–329. Muth, J. F. 1961. “Rational expectations and the theory of price movements.” Econometrica 29, 315–335. Myers, S. C. and N. S. Majluf. 1984. “Corporate financing and investment decisions when firms have information that investors do not have.” Journal of Financial Economics 13, 187–221. Park, S., H. J. Jang, and M. P. Loeb. 1995. “Insider trading activity surrounding annual earnings announcements.” Journal of Business Finance & Accounting 22, 587–614. Piotroski, J.D. and D. Roulstone. 2005. “Do insider trades reflect both contrarian beliefs and superior knowledge about future cash flow realizations?” Journal of Accounting and Economics 39, 55–82. Rozeff, M. S. and M. A. Zaman. 1988. “Market efficiency and insider trading: new evidence.” Journal of Business 61, 25–44. Rozeff, M. S. and M. A. Zaman. 1998. “Overreaction and insider trading: evidence from growth and value Portfolios.” Journal of Finance 53, 701–716.
74 Alternative Econometric Methods for Information-based Equity-selling Mechanisms Seyhun, H. N. 1986. “Insiders’ profits, costs of trading, and market efficiency.” Journal of Financial Economics 16, 189–212. Seyhun, H. N. 1992. “Why does aggregate insider trading predict future stock returns?” Quarterly Journal of Economics 107, 1303–1331. Womack, K. 1996. “Do brokerage analysts’ recommendations have investment value?” Journal of Finance 51, 137–167.
1163
Wruck, H. K. 1989. “Equity ownership concentration and firm value: evidence from private equity financings.” Journal of Financial Economics 23, 3–28. Wu, Y. L. 2004. “The choice of equity-selling mechanisms.” Journal of Financial Economics 74, 93–119.
Chapter 75
Implementation Problems and Solutions in Stochastic Volatility Models of the Heston Type Jia-Hau Guo and Mao-Wei Hung
Abstract In Heston’s stochastic volatility framework, the main problem when implementing Heston’s semi-analytic formula for European-style financial claims is the inverse Fourier integration. The numerical integration scheme of a logarithm function with complex arguments has puzzled practitioners for many years. Without good implementation procedures, the numerical results obtained from Heston’s formula may not be robust, even for customarily used Heston parameters, as the time to maturity is increased. In this chapter, we compare three major approaches to solving the numerical instability problem inherent in the fundamental solution of the Heston model. Keywords Heston r Stochastic volatility r Semi-analytic Complex logarithm
r
75.1 Introduction The randomness of the variance process varies as the square root of variance in the Heston’s stochastic volatility framework (see Heston 1993). The literature on asset pricing using the Heston model has expanded dramatically over the last decade to successfully describe the empirical leptokurtic distributions of asset price returns. However, the implementations of Heston’s formula are not as straightforward as they may appear and most numerical procedures are not reported in detail (see Lee 2005). The complex logarithm contained in the formula of the Heston model is the primary problem. This chapter compares three main approaches to this problem: rotationcorrected angle, direct integration, and simple adjusted J.-H. Guo () College of Management, National Chiao Tung University, Hsinchu, Taiwan, ROC e-mail: [email protected] M.-W. Hung College of Management, National Taiwan University, Taipei, Taiwan, ROC
formula. Recently, the robustness of Heston’s formula has become one of the main issues on option pricing. It is a wellknown fact that the logarithm of a complex variable z D rei is multivalued; that is, ln z D ln jzj C i.arg.z/ C 2 n/ where arg.z/ 2 Œ ; / and n 2 Z. If one restricts the logarithm to its principal branch by setting n D 0 (similar to most software packages, such as CCC, Gauss, Mathematica, and others), it is necessarily discontinuous at the cut (see Fig. 75.1). The Heston model is represented in Lewis’ illustration, in which the type of financial claim is entirely decoupled from the calculation of the Green function. Different payoffs are then managed through elementary contour integration over functions and contours that depend on the payoff. In this way, one can see that the issue is fundamentally related to the Green function component of the solution. Once the implementation problems in the Green function component of the solution have been solved, the robustness of the formula for all European-style financial claims in Heston’s model can be assured. The rest of this chapter proceeds as follows: Sect. 75.2 gives the derivation of the transformed-based solution for Heston’s stochastic volatility model and introduces the discontinuity problem arising from the derived formula. Section 75.3 compares three main solutions to the discontinuity problem and gives some numerical examples to illustrate their usefulness. Section 75.4 summarizes the chapter.
75.2 The Transform-Based Solution for Heston’s Stochastic Volatility Model Heston’s stochastic volatility model is based on the system of stochastic differential equations, which represent the dynamics of the stock price and variance processes under the risk-neutral measure p Vt dW S .t/ p dV t D . Vt /dt C V Vt dW V .t/:
dSt D rSt dt C St
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_75,
(75.1) (75.2)
1165
1166
J.-H. Guo and M.-W. Hung q
1 W .x; V; / D 2
3
Z
i ImŒ C1
e i x G. ; V; /d
(75.6)
i ImŒ 1
2 1 0
arg(z)
−1 −2 −3 −10
−5
0
5
10
Fig. 75.1 Discontinuities occur at the cut by restricting the logarithm of a complex variable z D e i to the principal branch
St and Vt denote the stock price and its variance at time t, respectively; r is the risk-free interest rate. The variance evolves according to a square-root process: is the long-run mean variance, is the speed of mean reversion, and V is the parameter which controls the volatility of the variance process. WS and WV are two standard processes of Brownian motion having the correlation . The Heston partial differential equation for a European-style claim C.S; V; t/ with expiration T , is V S 2 @2 C V2 V @2 C @2 C @C C C C S V V @t 2 @S 2 @S @V 2 @V 2 @C @C C . V / rC D 0: (75.3) CrS @S @V The fundamental transform method proposed by Lewis (2000) can reduce Equation (75.3) from two variables to one and entirely separate every (volatility independent) payoff function from the calculation of the Green function. After substituting the following, D T t; x D log.S / C r.T t/,C.S; V; t/ D W .x; V; /e r.T t / into Equation (75.3), we have @W 1 D V @ 2
@2 W @W 2 @x @x
C V V
@2 W @x@V
1 @2 W @W : C V2 V C . V / 2 @V 2 @V
(75.4)
Let G. ; V; / denote the Fourier transform of W .x; V; /: Z
1
G. ; V; /
e i x W .x; V; /dx:
so that differentiation w.r.t. x becomes multiplication by i in the transform. By taking the derivative of both sides of Equation (75.5), and then replacing @W =@ inside the integral by the right-hand side of Equation (75.4), we translate into a PDE for G. ; V; /. 1 @2 G 1 @G D V2 V V . 2 i /G @ 2 @V 2 2 @G C.. V / i V V / @V
(75.7)
Hence, a solution G. ; V; / to Equation (75.7), which satisfies G. ; V; 0/ D 1, is called a fundamental transform. Given the fundamental transform, a solution for a particular payoff can be obtained by C.S; V; t/ D
1 r.T t / e 2 Z i ImŒ C1
e i x W . ; V; 0/G. ; V; /d ; i ImŒ 1
(75.8) where WQ . ; V; 0/ is the Fourier transform of the payoff function at maturity. We will deal with a few common types of payoff functions and see what restrictions are necessary for their Fourier transforms to exist.
75.2.1 Call Option At maturity, the payoff of a vanilla call option with strike K is Max ŒST K; 0 in terms of our original variables. In terms of the logarithmic variables, we have W .x; V; 0/ D Max Œe x K; 0 , so the Fourier transform of the payoff is of the form: Z 1
e i x W .x; V; 0/dx W . ; V; 0/ D Z
1 1
e i x .e x K/dxDK 1Ci
D
.i 2 /;
logŒK
(75.5)
(75.9)
1
Given the transform G. ; V; /, the inversion formula is
which does not exist unless Im Œ > 1.
75 Implementation Problems and Solutions in Stochastic Volatility Models of the Heston Type
1167
Im[A(t, f)]
75.2.2 Put Option The payoff of a vanilla put option with strike K is ı Max ŒK ST ; 0 . Its transformed payoff is also K 1Ci i 2 , but the restriction is Im Œ < 0.
75.2.3 Digital Call
20 15 10 5 0 −5
The payoff of a digital call with strike K is H ŒST K whereıH is a Heaviside function. Its transformed payoff is K i .i /, subject to Im Œ > 0.
75.2.4 Cash-Secured Put The payoff of a cash-secured put with strike ı 2 K is 1Ci . i /, Min ŒST ; K . Its transformed payoff is K subject to 0 < Im Œ < 1. The fundamental solution of Equation (75.7) is in the form G. ; V; / D e A.; /CB.; /V :
(75.10)
−10 −15 −40
−20
0
20
40 Re[f]
Fig. 75.2 The discontinuity occurs in the fundamental solution of the Heston model if the logarithm with complex arguments p is restricted to the principal branch. Underlying: dSt D rSt dt C Vt St dW S .t / with pS0 D 100 and r D 0:0319. Variance: dV t D . Vt / dt C V Vt dW V .t / with V0 D 0:010201; D 6:21; D 0:019; V D 0:61, and D 0:70. Time to maturity: T D 2:00. The red line was obtained by evaluating A.; / with the unfixed form given in Equation (75.14). The green dashed line was obtained by evaluating A.; / with the adjusted formula given in Equation (75.31), and is the correct curve. The logarithmic function for both cases is restricted to using only the principal branch
After substituting Equation (75.10) into Equation (75.7), a pair of ordinary differential equations for A.; / and B.; / is obtained Im[A(t ,f)]
A D B
(75.11)
1 1 B D B 2 V2 B. C i V / . 2 i /: 2 2
(75.12)
The solutions can be expressed by B.; / D
. C iV C d. // .1 e / (75.13) 2 d. / .1 g. /e / V d. /
A.; / D 2 . C iV C d. // V 2 log
g. /e d. / 1 g. / 1
15 10 5 0 −5 −20
t 5 4 3 2
−10 0
1 10 20 Re[f ]
! (75.14)
Fig. 75.3 Discontinuities arise quite naturally for customarily used Heston parameters, typically occurring in practice as time to maturity is increased. The other parameters are the same as those specified in Fig. 75.2
using the auxiliary functions C iV C d. / and C iV d. / q d. / D . 2 i /V2 C . C iV /2 : g. / D
(75.15)
If the complex, multivalued logarithm is restricted to the principal branch only, discontinuities are necessarily incurred at the cut of the complex logarithm along the integration path, resulting in an incorrect value for Heston’s
formula. Figure 75.2 illustrates the discontinuity problem in the implementation of the fundamental solution. In this example, depicted in Fig. 75.2, S0 D 100; r D 0:0319; V0 D 0:010201; D 0:70; D 6:21; D 0:019; V D 0:61, and ImŒ D 2. Reasonable parameters in practice may incur the numerically induced discontinuity such that the correct treatment of the phase jump is very crucial. In fact, in examples with long maturity periods, discontinuities are certain to arise
1168
J.-H. Guo and M.-W. Hung Re [f]
Re [f]
1
0.4
0.75 0.5
0.2
0.25
0
Im[G(f,V,t)]
0
Im[G(f,V,t)]
−0.25
−0.2
−0.5 −0.75
−0.4 −40
−20
0
20
−40
40
Fig. 75.4 If the power of a complex variable z˛ is restricted to the principal branch, ˛ 2 Z makes discontinuities diminish at the cut. The fundamental function G. ; V; / in Equation (75.16) is evaluated with the same parameters specified in Fig. 75.2 except for D 19:58421. In this scenario, ˛ D =V2 D 1
from the formula presented in Equation (75.14) for A.; /, if theı complex logarithm uses the principal branch only and 2 V2 is not an integer (see Fig. 75.3). One may shift the problem from the complex logarithm to the evaluation of G. ; V; / D
g. /e d. / 1 g. / 1
2˛
e ˛.CiV Cd. //CB.; /V ; (75.16)
ı where ˛ D V2 . However, this formula comes with the related branch switching problem of the complex power function and discontinuities do not diminish in its implementation. Note that taking a complex variable z to the power ˛ gives (75.17) z˛ D r ˛ e i ˛ : After restricting arg.z/ 2 Œ ; /, the complex plane is cut along the negative real axis. Whenever z crosses the negative real axis, the sign of its phase changes from to . Therefore, the phase of z˛ changes from ˛ to ˛ . This may lead to a jump because e i D e i )
e i ˛ ¤ e i ˛ if ˛ … Z : e i ˛ D e i ˛ if ˛ 2 Z
(75.18)
To demonstrate this, Fig. 75.4 gives the scenario that ˛ 2 Z and there is no jump at all. Figure 75.5 gives another scenario that ˛ … Z and the complex power function indeed jumps.
−20
0
20
40
Fig. 75.5 If the power of a complex variable z˛ is restricted to the principal branch, ˛ … Z makes discontinuities occur at the cut. The fundamental function G. ; V; / in Equation (75.16) is evaluated with the same parameters specified in Fig. 75.2. In this scenario, ˛ D =V2 D 0:3170921
to remedy phase jumps. As described in Kruse and Nögel (2005), if the imaginary value of the complex logarithm for one step differs from the previous one by more than 2 , the jump of 2 is added or subtracted to recover the continuity of phase. However, using this approach, the already complex integrals of Heston’s formula may become too complicated in practice. Therefore, simulation is also considered as a practical alternative for finding option prices (see Broadie and Kaya 2006). To make matters worse, discontinuities arise quite naturally for customarily used Heston parameters simply as time to maturity is increased, thereby illustrating the importance of the correct treatment of phase jumps for Heston’s formula. Kahl and Jäckel (2005) remedied these discontinuities using the rotation-corrected angle of the phase of a complex variable. Shaw (2006) dealt with this problem by replacing the call to the complex logarithm by direct integration of the differential equation. In addition, Guo and Hung (2007) also proposed a simple adjusted formula to solve this discontinuity problem. From a computational and convenience point of view, the solution of Guo and Hung (2007) can be implemented easily and is thereby suitable for practical application. These solutions are presented by the following statements.
75.3.1 Rotation-Corrected Angle 75.3 Solutions to the Discontinuity Problem of Heston’s Formula In the literature, various authors propose the idea of carefully keeping track of the branch by monitoring the complex logarithm function for each step along a discretized integral path
In order to guarantee the continuity of A.; /, an rotationcorrected term must be additionally calculated in advance. First, we introduce the notation g. / D rg . /e ig . / ;
(75.19)
d. / D ad . / C i bd . /:
(75.20)
75 Implementation Problems and Solutions in Stochastic Volatility Models of the Heston Type
The next step is ı to have a closer look at the denominator of .g. /e d. / 1/ .g. / 1/:
g. / 1 D rg . /e ig . / 1 D rg . /e i.g . /C2 m/ (75.21) where g . / C ; (75.22) m D int 2 g . /
D arg.g. / 1/;
(75.23)
rg . / D jg. / 1j;
(75.24)
and with int Œ denoting Gauss’s integer brackets. The same calculation is done with the numerator:
After replacing the call to the complex logarithm by direct integration of the differential equation, the complex logarithm cannot be a problem any more and the continuity of A.; / is guaranteed.
75.3.3 Simple Adjusted Formula Here, we briefly introduce the simple adjusted formula of Guo and Hung (2007) to the discontinuity problem in the implementation of Heston’s formula. The solution is to move exp Œd. / into the logarithm of A.; / by simply adjusting A.; / as follows: A.; / D
g. /e d. / 1 D rg . /e
ig . / .ad . /Ci bd . //
e
1169
. C iV d. // V2
g. /e d. / 1 d. / e 2 log g. / 1
1
! : (75.31)
D rg . /e ad . / e i .g . /Cbd . / / 1
D rgd . /e i.gd . /C2 n/
(75.25)
where g . / C bd . / C ; n D int 2
(75.26)
gd . / D arg.g. /e d. / 1/;
(75.27)
rgd . / D jg. /e d. / 1j:
(75.28)
Hence, we can compute the logarithm of .g. /e d. / 1/= .g. / 1/ quite simply as
The insight in the formula in Equation (75.31) is that the subtraction of the number 1 from a complex variable, c, results simply in a shift parallel to the real axis. Because an imaginary component must be added to move a complex number across the negative real axis, the phases of c 1 and c exist on the same phase interval. Therefore, the logarithms of c 1 and c have the same rotation count number. It illuminates this simple solution to assure that the phase of A.; / is continuous, without the necessity of a rotation-corrected term. The logarithm presented in Equation (75.31) is the only term possibly giving rise to discontinuity. Trivially, log
h . i g. /e d. / 1 D log rgd . / rg . / log g. / 1
g. /e d. / 1 d. / e D log g. /e d. / 1 g. / 1 log .g. / 1/e d. / :
Ci Œgd . / g . /C2 .n m/ ; (75.29) where 2 .n m/ is the rotation-corrected angle.
75.3.2 Direct Integration
(75.32) Because the subtraction of 1 does not affect count the rotation d. / of the phase of a complex variable, log g. /e 1 has ˘ the same rotation count˘number as log g. /e d. /˘ . Never theless, log g. /e d. / log .g. / 1/ e d. / needs no rotation-corrected terms for all levels of Heston parameters because
Another way to avoid the branch cut difficulties arising from the choice of the branch of the complex logarithm is to perform direct numerical integration of A.; / w.r.t. according to Equation (75.11). Given B.; /; A.; / can be obtained by Z
A.; / D
B.s; /ds: 0
(75.30)
log
g. /e d. / .g. / 1/ e d. /
D log
g. / g. / 1
D logŒg. / logŒg. / 1 (75.33) and, again, log Œg. / and log Œg. / 1 has the same rotation count number. Hence, the formula in Equation (75.31),
1170
J.-H. Guo and M.-W. Hung
for A.; /, provides a simple solution to the discontinuity problem for Heston’s stochastic volatility model. Compared to the rotation-corrected angle method, the simple adjusted-formula method needs no rotation-corrected terms in the already complex integral of Heston’s formula to recover its continuity for all levels of Heston parameters. Although the direct integration method neither needs the rotation-corrected terms to guarantee the continuity of Heston’s formula, it inevitably introduces the discretization bias into the evaluation of the Green function component of the solution. This bias may create another serious problem of computation. Many steps may be necessary to reduce the bias to an acceptable level and, hence, more computational effort is needed to guarantee that the bias is small enough. As a consequence, the direct integration method requires more
Computing Time (sec.)
2.5 2 1.5
1.812
0.141
0.156
1.875
1.937
1.531
1.453
1
1.828
1 0.656
0.5
0.157
0.172
0 0.1
0.5
0.157
0.109
1
1.5
2
2.5
0.157 0.156
3
3.5
computing time than the simple adjusted-formula method to avoid the discontinuity problem arising from the complex logarithm. Figure 75.6 illustrates a comparison of the computing time for applying the simple adjusted-formula and direct integration methods to evaluate a European call option. The direct integration method is more time-consuming than the simple adjusted-formula method. Moreover, the simple adjusted-formula method has an advantage in that its time consumption remains almost at the same level as the time to maturity increases. In contrast, the computing time via the direct integration method increases rapidly with an increase in time to maturity. These computational results were performed on a desktop PC with an Intel Pentium D 3.4 GHz processor and 1 GB of RAM, running Windows XP Professional. The codes were written using the Mathematica software. Table 75.1 is an illustration of the usefulness of the simple adjusted formula for evaluating European call options in Heston’s model using the complex logarithm restricted to the principal branch. The algorithm was verified using Monte Carlo simulation with the exact method proposed by Broadie and Kaya (2006) for the stochastic volatility process. Although the exact method of simulation has the advantage that its convergence rate is much faster than that of the conventional Euler discretization method, it is, of course, computationally more burdensome than the simple adjusted-formula method.
Time to Maturity (year)
Fig. 75.6 Computing time comparison under Heston’s stochastic volatility model for a European call option: direct integration versus simple adjusted formula. The red line represents the computing time for evaluating a standard call option price using A.; / via direct integration w.r.t. given in Equation (75.30). The green dashed line represents the computing time for evaluating the same call option price using A.; / via the simple adjusted formula given in Equation (75.31). The other parameters are the same as those specified in Fig. 75.2
Table 75.1 Impact of the discontinuity problem on the evaluation of European call options in Heston’s model on stochastic volatility T (year) 0.50 1.00 1.50 2.00 2.50 3.00
Monte Carlo simulation with the exact method (10,000 trials) 4:2658 6:7261 8:9510 10:9633 12:6100 14:2591
75.4 Conclusion This paper looks at the issue raised by branch cuts in the transform solutions for European-style financial claims in the Heston model. The multivalued nature of the complex logarithm and power functions results in numerical instability in
Fundamental solution of the Heston model Adjusted formula using formula in Equation (75.31) 4:2545 6:8061 8:9557 10:8830 12:6635 14:3366
Unfixed formula using formula in Equation (75.14) 4:2555 6:4483 8:3286 9:7079 10:5542 10:9778
Here S0 D 100:00; K D 100:00; r D 0:0319; V0 D 0:010201; D 0:70; D 6:21; D 0:019, and V D 0:61. Note that the evaluation of the fundamental solution of the Heston model using formula in Equation (75.31) still yields values consistent with those of the Monte Carlo simulation for all time-to-maturity cases, although the complex logarithm is restricted to the principal branch
75 Implementation Problems and Solutions in Stochastic Volatility Models of the Heston Type
the implementation of the fundamental transform. Compared to the work of Kahl and Jäckel (2005), neither the direct integration method of Shaw (2006) nor the simple adjusted formula of Guo and Hung (2007) requires rotation-corrected terms to assure the robustness of the evaluation of Heston’s formula. After taking computing time into consideration, the evidence shows that the simple adjusted-formula method is greatly superior to the direct integration method.
References Broadie, M. and Ö. Kaya. 2006. “Exact simulation of stochastic volatility and other affine jump diffusion processes.” Operations Research 54(2), 217–231. Guo, J. and M. Hung. 2007. “A note on the discontinuity problem in Heston’s stochastic volatility model.” Applied Mathematical Finance 14(4), 339–345.
1171
Heston, S. 1993. “A closed-form solution for options with stochastic volatility with application to bond and currency options.” Review of Financial Studies 6(2), 327–343. Kahl, C. and P. Jäckel. 2005. “Not-so-complex logarithm in the Heston model.” Wilmott Magazine 19, 94–103. Kruse, S. and U. Nögel. 2005. “On the pricing of forward starting options in Heston’s model on stochastic volatility.” Finance and Stochastics 9, 223–250. Lee, R. 2005. “Option pricing by transform methods: extensions, unification, and error control.” Journal of Computational Finance 7(3), 51–86. Lewis, A. 2000. Option valuation under stochastic volatility, Finance Press, CA. Shaw, W. 2006. Stochastic volatility: models of Heston type, lecture notes. http://www.mth.kcl.ac.uk/~shaww/web_page/papers/ StoVolLecture.pdfhttp://www.mth.kcl.ac.uk/~shaww/web_page/ papers/StoVolLecture.nb
Chapter 76
Revisiting Volume vs. GARCH Effects Using Univariate and Bivariate GARCH Models: Evidence from U.S. Stock Markets Zhuo Qiao and Wing-Keung Wong
Abstract This paper tests for the generalized autoregressive conditional heteroscedasticity (GARCH) effect on U.S. stock markets for different periods and reexamines the findings in Lamoureux and Lastrapes (Journal of Finance 45(1):221–229, 1990) by using alternative proxies for information flow, which are included in the conditional variance equation of a GARCH model. We also examine the spillover effects of volume and turnover on the conditional volatility using a bivariate GARCH approach. Our findings show that volume and turnover have effects on conditional volatility and that the introduction of volume/turnover as exogenous variable(s) in the conditional variance equation reduces the persistence of GARCH effects as measured by the sum of the GARCH parameters. Our results confirm the existence of the volume effect on volatility, consistent with the findings by Lamoureux and Lastrapes (Journal of Finance 45(1):221–229, 1990) and others, suggesting that the earlier findings were not a statistical fluke. Our findings also suggest that, unlike other anomalies, the volume effect on volatility is not likely to be eliminated after discovery. In addition, our study rejects the pure random walk hypothesis for stock returns. Our bivariate analysis also indicates that there are no volume or turnover spillover effects on conditional volatility among the companies, suggesting that the volatility of a company’s stock return may not necessarily be influenced by the volume and turnover of other companies. Keywords Volatility r GARCH effect r Volume effect r Turnover r Information flow r Multivariate GARCH r Mixture of distributions hypothesis
Z. Qiao Faculty of Business Administration, University of Macau, Macau, China e-mail: [email protected] W.-K. Wong () Department of Economics, Hong Kong Baptist University, Kowloon Tong, Hong Kong e-mail: [email protected]
76.1 Introduction The relation between the volume of trade and stock prices has received increasing attention from academic researchers. This relation is robust to various time intervals (hourly, daily, and weekly) and numerous financial markets (stocks, currency, and futures). While the empirical link between price volatility and volume appears strong, it is not obvious why this should be so. Various theoretical models have been proposed to explain this relationship. These include “asymmetric information” models proposed by Kyle (1985) and Admati and Pfleiderer (1988); “differences in opinion” models proposed by Varian (1985) and Shalen (1993); the sequential information arrival hypothesis proposed by Copeland (1976) and Smirlock and Starks (1985); and the “mixture of distributions” models proposed by Epps and Epps (1976), Harris (1986), Tauchen and Pitts (1983), and Jones et al. (1994). In the asymmetric information model, informed investors submit trades based on their private information. When informed investors trade more, volatility increases because of the generation of private information. However, recent evidence (see, for example, Daigler and Wiley 1999) shows that the positive volatility-volume relation is driven by the general public (less informed traders), while informed traders who observe order flow often decrease volatility. In the difference in opinion model, when public information switches from favorable to unfavorable or vice versa, investors have different beliefs concerning the stock, and this will generate trading among themselves. Hence, trading volume and absolute return are positively related because both are correlated with the flow of public information or the anticipation of information. The sequential information arrival hypothesis assumes that information is observed by each trader sequentially and randomly. From an initial position of equilibrium, where all traders possess the same information, a new flow of information arrives in the market and traders revise their expectations accordingly. Hence, trading volume may have the ability to predict current volatility and vice versa. In the mixture of distributions models (MDH), it is assumed that the joint distribution of volume and volatility is bivariate normal
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_76,
1173
1174
conditional upon a mixing variable, typically the number of information arrivals. All traders simultaneously receive the new price signals. Following the MDH model, Lamoureux and Lastrapes (1990) show that, using daily trading volume as a proxy for the mixing variable, the introduction of volume as an exogenous variable in the conditional variance equation eliminates the persistence of GARCH effects as measured by the sum of the GARCH parameters. Following the work of Lamoureux and Lastrapes (1990), there are many analyses studying this issue. However, the findings are not consistent. For example, Ragunathan and Peker (1997) find a strong contemporaneous effect of trading volume on volatility in the Sydney Futures Exchange. Miyakoshi (2002) finds that the inclusion of the trading volume variable in both ARCH and EGARCH models eliminates the ARCH/GARCH effect for individual stocks as well as for the index on the Tokyo Stock Exchange. Bohl and Henke (2003) and Wang et al. (2005) obtain similar findings for Polish and Chinese stock markets, respectively. Nonetheless, using unexpected trading volume as a proxy variable for the information flow, Aragó and Nieto (2005) show that the inclusion of trading volume does not substantially reduce the persistence of conditional volatility. Thus, it is still interesting to investigate this issue with a new proxy variable for the information flow. In this article, we first follow the methodology of Lamoureux and Lastrapes (1990) to revisit the volume-volatility relation in the U.S. stock markets. We then include turnover as another exogenous variable to study the issue. Several measures of volume and turnover have been proposed. Some studies of aggregate trading activity use the total number of shares traded on the market as a measure of volume, as seen for example in Epps and Epps (1976), Gallant et al. (1992) and Hiemstra and Jones (1994). Some use aggregate turnover – the total number of shares traded divided by the total number of shares outstanding – as a measure of volume, as do Smidt (1990), Campbell et al. (1993), Jones et al. (1994). Some use individual share volume in the analysis of price/volume and volatility/volume relations, as do Epps and Epps (1976), Lamoureux and Lastrapes (1990, 1994), and Andersen (1996). Some focus on the impact of information events on trading activity and use individual share turnover as a measure of volume (see, for example, Morse 1980; Lakonishok and Smidt 1986; and Stickel and Verrecchia 1994). Alternatively, the impact of information events can use individual dollar volume normalized by aggregate market dollar volume as a measure of volume, as studied in Tkac (1996). In addition, the total number of trades and the number of trading days per year could be used as measures of trading activity, as described in Conrad et al. (1994). Lo and Wang (2000) suggest that turnover yields the
Z. Qiao and W.-K. Wong
sharpest empirical implications in trading activity and is the most natural measure, though the other measures of volume do capture important aspects. We follow Lo and Wang’s approach to focus on the relation between turnover and volatility in the paper. First, we provide a comprehensive analysis of the role of turnover in the volume-volatility relation. The turnover data of individual shares are included in the variance equation to examine whether price volatility decreases and the GARCH effect is reduced with the turnover in variance and whether turnover is a more significant latent variable for information than volume. We examine this relation using data in the early part of the 1980s, as studied in Lamoureux and Lastrapes (1990), and we examine the data for the 1990s and 2000s to explore whether the relation still holds for the later periods. Our results show that both volume and turnover play significant roles in the volume-volatility relation. Our results are very similar to those found by Lamoureux and Lastrapes (1990). We reconfirm the significance of the volume and turnover effects on volatility and suggest that the earlier findings were not a statistical fluke. Most anomalies, for example the January effect, will diminish and eventually disappear as more investors come to know about the effect and exploit these opportunities. Our findings suggest that, unlike other anomalies, the volume effect on volatility is not likely to be exploited by investors, and hence, the volumevolatility relationship will not necessarily be eliminated after its discovery. In addition, our findings confirm the existence of both the GARCH effect and the volume-volatility relationship and reject the pure random walk hypothesis for stock returns. Academics and practitioners may believe that the volume and turnover from one company could have some impact on the stock return volatility of another company, but so far, this issue has not been studied. In this paper, we attempt to address this gap in the literature to examine the spillover effect of the volume and turnover from one company to another company by employing a bivariate BEKK(1,1) GARCH model. Our findings conclude that there is no volume or turnover spillover effects on conditional volatility among the companies. This result suggests that, since there are numerous companies traded in the market, a company’s stock return volatility may not necessarily be influenced by the volume and turnover of other companies. The remainder of this paper is structured as follows. Section 76.2 introduces the theory of the volume-volatility relation. Section 76.3 describes the data and methodology employed in our analysis of the relationship between volatility and volume on the NYSE for different periods. Section 76.4 presents the empirical results. We provide concluding remarks in the last section.
76 Revisiting Volume vs. GARCH Effects Using Univariate and Bivariate GARCH Models
76.2 The Mixture of Distribution Hypothesis MDH introduced by Clark (1973) has played a prominent role in explaining the leptokurtosis effect on the returns of financial assets. The central proposition of MDH is that the variance of daily price changes is driven by the random number of daily price-relevant pieces of information serving as the mixing variable. The study of the volume-GARCH effect is examined within the context of MDH by Clark (1973), Epps and Epps (1976), Tauchen and Pitts (1983), and Harris (1987), as shown in the following model: p rt D 1 Z1t Ft ; p Vt D 2 Ft C 2 Z2t Ft ; and Ft D ˛0 C ˛Ft 1 C ˆt ; Ft 0 where rt is the stock return on day t; Vt is daily trading volume, Ft is a latent mixing variable known as information flow, Z1 and Z2 are mutually and serially independent stochastic processes with zero mean and unit variance, and ˆt is a serially independent random variable with zero mean that is restricted to ensure that Ft is always non-negative. The moving average structure of the innovation to the Ft shown in the last formula is translated into the conditional variance. There are several ways to deal with the fact that the information arrivals are generally unknown. However, some studies (see, for example, Lamoureux and Lastrapes 1990) have looked for proxies of the mixing variable. They choose daily trading volume as the measure of daily information flows into the market and estimate a generalized GARCH model where volume is included as an exogenous variable in the variance equation. They find that, when volume is included, the other coefficients in the conditional variance equation become statistically insignificant for most of the 20 stocks in their sample.
1175
The daily rate of return is the continuously compounded return defined as follows: rt D 100In.Pt =Pt 1 / where Pt is the adjusted price according to dividend, allotment, and split at day t. The volume variable, V, adopted in this paper is the actual daily volume expressed in million shares. Another measure of volume-turnover, T , is obtained from the total number of shares traded divided by the total number of shares outstanding as follows: Tt D .Vt =NSt / 100 where NS is the outstanding personal shares. To study the volume-GARCH effect, we first use the GARCH(1,1) model (Bollerslev 1986), which restricts the conditional variance of a time series to depend on past squared residuals of the process and has been found to adequately fit many economic and financial time series (see Bollerslev 1987; Baillie and Bollerslev 1989; and Lamoureux and Lastrapes 1990). To examine the effect of volume and turnover on stock return volatility, we assume the stock daily return follows a simple mean equation: rt D c C "t
"t jˆt 1 N 0; t2 :
(76.1)
In our study, we assume the square of the residual is the proxy of daily volatility following any of the models below: Model 1 W t2 D ˇ0 C ˇ1 "2t1 C ˇ2 t21
(76.2)
Model 2 W t2 D ˇ0 C ˇ1 "2t1 C ˇ2 t21 C ˇ3 Vt
(76.3)
Model 3 W t2 D ˇ0 C ˇ1 "2t1 C ˇ2 t21 C ˇ3 Tt
(76.4)
and
76.3 Data and Methodology For the U.S. stock market, we choose the data set from the NYSE, comprising daily returns, volume, and turnover on the 20 most actively traded companies whose options are also traded on the Chicago Board Options Exchange (CBOE). We study three periods: the first period is from 01/01/1982 to 31/12/1983, which is similar to the period studied in Lamoureux and Lastrapes (1990); the second period is from 01/01/1992 to 31/12/1993, after the study by Lamoureux and Lastrapes (1990) had been published; and the last period is from 01/01/2001 to 31/12/2002, which provides information for any recent developments. The data are obtained from Datastream International.
Model 4 W t2 D ˇ0 C ˇ1 "2t1 C ˇ2 t21 C ˇ3 Tt C ˇ4 Vt : (76.5) We note that ˇ1 measures the ARCH effect, ˇ2 measures the GARCH effect, while the sum ˇ1 C ˇ2 is a measure of the persistence of a shock to the variance. As this sum approaches unity, the persistence of the shock to volatility becomes greater. An appealing explanation for the presence of the GARCH effect is based on the hypothesis that daily returns are generated by a mixture of distributions, in which the rate of daily information arrival is the stochastic mixing variable (Lamoureux and Lastrapes 1990). GARCH captures the time series properties of this mixing variable (Diebold 1986; Stock 1987) by reflecting time dependence in the process when information flow to the market is generated and
1176
Z. Qiao and W.-K. Wong
volatility shocks are allowed to persist over time. This persistence captures the propensity of returns like magnitude to cluster in time and explains the nonnormality and nonstability of empirical distributions of asset returns (Lamoureux and Lastrapes 1990). However, the sequential information arrival hypothesis (Copeland 1976; Tauchen and Pitts 1983; and Smirlock and Starks 1985) assumes that traders receive new information in a sequential, random fashion and revise their expectations accordingly. Hence, trading volume volatility may have the ability to predict current volatility and replace the significance of the GARCH effect in the daily returns (Lamoureux and Lastrapes 1990). On the other hand, as aggregate turnover is a good measure of volume (Smidt 1990; Campbell et al. 1993; and Jones et al. 1994), it could replace volume and serve to explain the diminishing GARCH effect on daily returns. Hence, in this paper we use both volume and turnover as exogenous variables. Thereafter, we examine whether the effects of volume and turnover from a company have any impact on the stock return volatility of another company by employing a bivariate
BEKK(1,1) model (Engle and Kroner 1995) such that: Model 5 W Rt D C C "t
for t D 1; 2; : : : TI
(76.6)
0 P P 0 0 where t D A0 A0 C A1 "t 1 "t 1 A1 C P "t 0 N.0; t /; B1 t 1 B1 ; Rt is a 2 1 return vector at time t, C is a 2 1 mean vector, A0 is a lower triangular matrix, but A1 and B1 are unrestricted square matrices. For our analysis, we add two exogenous volume variables V1 and V2 (T1 and T2 for the case of turnover effect) to the conditional variance equation.
76.4 Empirical Findings in NYSE Table 76.1 reports the various descriptive statistics for the unconditional distributions of the returns for 20 stocks on the NYSE for the three periods selected. The results show that a substantial departure from normality is unveiled by
Table 76.1 Descriptive statistics of samples

                                ------- 82–83 -------      ------- 92–93 -------      ------- 01–02 -------
Sample  Stock acronym        SW(p)  Med. volume  Med. MV   SW(p)  Med. volume  Med. MV   SW(p)  Med. volume  Med. MV
1       DIEBOLD              0.00     210.35       645.56  0.00     145.30       806.02  0.00     297.65     2,592.56
2       ENGELHARD            0.00     136.20       846.33  0.00     189.60     2,289.86  0.00     459.70     3,409.97
3       NBI                  0.00   2,483.20       721.46  0.00   2,995.20     4,534.67  0.00   3,705.15    54,881.51
4       WINNEBAGO            0.00      87.45     1,093.51  0.00      66.40     1,527.65  0.00   1,112.95     8,966.81
5       CORNING              0.00      53.35       329.09  0.00      33.20       175.27  0.00     146.55       695.73
6       3M CO.               0.00     664.45     8,894.53  0.00     582.60    22,456.90  0.00   1,964.85    45,950.39
7       ABBOTT LABS          0.02   2,101.60     4,766.16  0.00   2,225.00    23,996.37  0.00   3,615.75    75,682.13
8       AFLAC                0.00     495.00       209.45  0.00     840.70     2,760.65  0.00   1,696.95    15,344.05
9       AIRBORNE             0.00      26.20        77.12  0.00     148.60       427.96  0.00     362.25       617.88
10      ALASKA AIR GROUP     0.00      28.40       117.00  0.00      41.20       224.77  0.00     186.70       740.82
11      ALBERTSONS           0.00     152.00       736.90  0.00     406.20     6,313.67  0.00   1,379.30    12,144.54
12      ALCAN                0.00     250.85     2,301.04  0.00     266.40     4,335.27  0.00   1,119.35    11,632.78
13      ALCOA                0.00   1,311.20     2,445.81  0.00   2,282.40     6,018.73  0.01   3,471.55    29,912.09
14      ALLEGHENY EN.        0.00     140.00     1,096.36  0.00     125.80     2,717.53  0.00     662.80     4,589.23
15      ALLIANT              0.00      17.20       300.49  0.00      12.60       948.38  0.00     202.35     2,385.05
16      ALTRIA GP            0.00   5,342.40     7,394.10  0.00   5,402.70    66,956.69  0.00   6,711.95   103,307.25
17      NABORS INDS          0.00      16.20        56.73  0.00     133.10       432.89  0.00   1,990.35     5,151.74
18      NEWMONT MINING       0.02     100.85     1,397.90  0.00     255.90     3,144.66  0.00   3,278.30     7,271.83
19      PERSIAMERICA         0.00     235.85       481.49  0.00     191.80     1,526.30  0.00     180.90     2,264.17
20      GT.LAKES CHM.        0.00      64.70       270.89  0.00     180.30     4,884.98  0.00     217.70     1,297.33
        Mean                 0.00     695.87     1,709.10  0.00     826.25     7,823.96  0.00   1,638.15    19,441.89
        Median               0.00     146.00       729.18  0.00     190.70     2,739.09  0.00   1,116.15     6,211.79

Note: SW(p) is the p-value of the Shapiro-Wilk statistic; Med. volume is the median volume; MV is market capitalization. The time periods are 1982–1983, 1992–1993, and 2001–2002.
Consistent with the findings in Lamoureux and Lastrapes (1990), we find that the p-values of the Shapiro-Wilk statistic (denoted as SW(p) in the table) of the returns for all U.S. companies in our study are close to zero, implying that all series are non-normal.¹

The approximate maximum likelihood estimation of the "pure" GARCH(1,1) model as in Model 1 is carried out first, and the results are presented in Table 76.2. From the table, we find that a significantly high proportion of our samples have a significant β_1 or β_2 across all three sub-sample periods, which provides clear evidence of a GARCH effect. The average volatility persistence measure, β_1 + β_2, is about 0.5, implying a relatively high degree of volatility persistence. Our results also suggest a trend of more persistent volatility in the later periods.

We then estimate Model 2 and present the corresponding results in Table 76.3. As indicated in Table 76.3, once volume, V, is included as an exogenous variable, as shown in Model 2, the proportion of significant β_1 or β_2 in our three sample periods is considerably smaller than the corresponding results in Table 76.2. Furthermore, the volatility persistence measure β_1 + β_2 is dramatically reduced in all three periods. In contrast, the percentage of companies with significant estimates of the volume coefficient β_3 is

Table 76.2 Maximum likelihood estimates of GARCH(1,1) model

             % of significant     Persistence
Period       β_1       β_2        β_1 + β_2
1982–1983    65%       40%        0.3600
1992–1993    60%       60%        0.4694
2001–2002    85%       70%        0.5980
Note: β_1 (β_2) is the ARCH (GARCH) coefficient. Volatility persistence is the average of the persistence (β_1 + β_2) of all companies.

Table 76.3 Maximum likelihood estimates of GARCH(1,1) model with volume

             % of significant                Persistence
Period       β_1       β_2       β_3         β_1 + β_2
1982–1983    15%       10%       80%         0.0734
1992–1993    10%       0%        25%         0.01432
2001–2002    5%        5%        55%         0.0364

Note: β_1 (β_2) is the ARCH (GARCH) coefficient. β_3 is the coefficient capturing the volume effect on conditional volatility. Volatility persistence is the average of the persistence (β_1 + β_2) of all companies.
¹ We also apply other statistics, such as the skewness and kurtosis statistics, the Jarque-Bera test, the Anderson-Darling test, and the Kolmogorov-Smirnov test, to test for normality in all the tables. Since the results of the other normality statistics are similar to the results of the modified Shapiro-Wilk statistic, we skip reporting the results of the other normality tests, but they are available on request.
Table 76.4 Maximum likelihood estimates of GARCH(1,1) model with turnover

             % of significant                Persistence
Period       β_1       β_2       β_3         β_1 + β_2
1982–1983    15%       5%        95%         0.0233
1992–1993    20%       0%        100%        0.0260
2001–2002    5%        5%        100%        0.0365

Note: β_1 (β_2) is the ARCH (GARCH) coefficient. β_3 is the coefficient capturing the effect of turnover on conditional volatility. Volatility persistence is the average of the persistence (β_1 + β_2) of all companies.

Table 76.5 Maximum likelihood estimates of GARCH(1,1) model with turnover and volume

             % of significant                          Persistence
Period       β_1       β_2       β_3       β_4         β_1 + β_2
1982–1983    20%       10%       75%       15%         0.05190
1992–1993    5%        0%        50%       25%         0.0089
2001–2002    10%       0%        55%       35%         0.0147

Note: β_1 (β_2) is the ARCH (GARCH) coefficient. β_3 is the coefficient capturing the volume effect on conditional volatility. β_4 is the coefficient capturing the effect of turnover on conditional volatility. Volatility persistence is the average of the persistence (β_1 + β_2) of all companies.
notably larger. All these results suggest that once contemporaneous volume is included as an explanatory variable in the model, the GARCH effect diminishes, and daily trading volume has significant explanatory power regarding the volatility of daily returns.

Lo and Wang (2000) argue that turnover, T, may also be a suitable proxy for the flow of information into the market. Thus, we use T to replace V, as shown in Model 3, in modeling the U.S. stock returns, and display the results in Table 76.4. The results show that the coefficient of T is significantly different from zero in almost all cases in all three periods. In addition, similar to the results in Model 2, the percentage of companies with significant β_1 or β_2 shrinks considerably, and the persistence of volatility is reduced in all three periods. These results again strongly suggest that lagged variance contributes little after accounting for the rate of information flow, as measured by turnover.

To check whether the inclusion of both factors will be more useful, we then include both V and T in the variance equation, as shown in Model 4, and present the results in Table 76.5. The results show that, compared with the results in Tables 76.3 and 76.4, there is not much difference in the change of percentages of companies with significant β_1 or β_2, or in the reduction of the volatility persistence measure. We therefore conclude that, in general, only V or T needs to be included in the model.

To investigate the normality assumption, we plot² the empirical density functions for the residuals obtained from each of the four models in the three periods, and we plot the standard normal distribution for reference.

² The plots are available on request.
Table 76.6 Estimates of bivariate GARCH(1,1) model with volume

              Sample 1 and 2           Sample 2 and 3           Sample 1 and 3
              Coefficient (std error)  Coefficient (std error)  Coefficient (std error)
ARCH(2,1)     0.0000 (0.0455)          0.0000 (0.0699)          0.0000 (0.0687)
ARCH(1,2)     0.0000 (0.0388)          0.0000 (0.0630)          0.0000 (0.0523)
GARCH(2,1)    0.0000 (0.0549)          0.0000 (0.0545)          0.0000 (0.0583)
GARCH(1,2)    0.0001 (0.0622)          0.0000 (0.0543)          0.0000 (0.0550)
V(2,1)        0.0017 (0.0174)          0.0057 (0.0515)          0.0020 (0.0163)
V(1,2)        0.0022 (0.0243)          0.0008 (0.0078)          0.0019 (0.0087)
Note: ARCH(j, k) corresponds to the (j, k)-th element of A_1 in Equation (76.6), and GARCH(j, k) corresponds to the (j, k)-th element of B_1 in Equation (76.6). V(i, j) measures the effect of the volume of company j on the volatility of company i. ∗∗∗, ∗∗, and ∗ indicate significance at the 1, 5, and 10% levels, respectively.

Table 76.7 Estimates of bivariate GARCH(1,1) model with turnover

              Sample 1 and 2           Sample 2 and 3           Sample 1 and 3
              Coefficient (std error)  Coefficient (std error)  Coefficient (std error)
ARCH(2,1)     0.0000 (0.0413)          0.0000 (0.0468)          0.0000 (0.0605)
ARCH(1,2)     0.0000 (0.0255)          0.0000 (0.0437)          0.0000 (0.0556)
GARCH(2,1)    0.0000 (0.0503)          0.0000 (0.0482)          0.0000 (0.0526)
GARCH(1,2)    0.0001 (0.0574)          0.0000 (0.0437)          0.0000 (0.0484)
T(2,1)        0.0000 (0.0000)          0.0000 (0.0000)          0.0000 (0.0000)
T(1,2)        0.0000 (0.0000)          0.0000 (0.0000)          0.0000 (0.0000)

Note: ARCH(j, k) corresponds to the (j, k)-th element of A_1 in Equation (76.6), and GARCH(j, k) corresponds to the (j, k)-th element of B_1 in Equation (76.6). T(i, j) measures the effect of the turnover of company j on the volatility of company i. ∗∗∗, ∗∗, and ∗ indicate significance at the 1, 5, and 10% levels, respectively.
The plots show that the residuals from the last three models are closer to normality. This leads us to conclude that Models 2–4 are more adequate for explaining the volatility.

Now we examine the impact of volume and turnover from one company on the stock return volatility of another company by employing the bivariate BEKK(1,1) MGARCH model, as stated in Model 5. We build the bivariate BEKK(1,1) model between any two companies to analyze mutual spillover effects. Since the results are qualitatively the same, we report only the results based on Sample 1 versus Sample 2, Sample 2 versus Sample 3, and Sample 1 versus Sample 3, as exhibited in Tables 76.6 and 76.7.³ As indicated in these tables, all entries in both the cross-ARCH terms, ARCH(1,2) and ARCH(2,1), and the cross-GARCH terms, GARCH(1,2) and GARCH(2,1), are not significant; therefore, we conclude that there are no volatility spillover effects among the companies. In addition, all entries of both the cross-volume terms, V(1,2) and V(2,1), and the cross-turnover terms, T(1,2) and T(2,1), are also not significant, implying the existence of neither mutual volume nor turnover spillover effects on conditional volatility among the companies. This result suggests that, since there are numerous companies

³ Complete results are available on request. In addition, we report only the estimates for the "spillover" effects in Tables 76.6 and 76.7, as we are only interested in studying these effects in this paper; the estimates of the other terms are basically consistent with the other univariate models.
traded in the market, a company’s stock return volatility may not necessarily be influenced by the volume and turnover of other companies.
76.5 Conclusion

Although there are many empirical studies on the volume-volatility relation, there is still no consensus about what actually drives the relation. To enrich the understanding of the trading dynamics behind the relation, we reexamine the empirical findings of Lamoureux and Lastrapes (1990) by using volume and turnover to measure the effect of information flow on volatility. Using updated transaction data from the NYSE, we include volume, turnover, and both together in the variance equation of a GARCH model. We also reexamine whether the results of Lamoureux and Lastrapes (1990) have changed over time. Our results show that there is a trend toward more informative volume and/or turnover, and they reconfirm that the significant effect of volume on volatility is very similar to the results found by Lamoureux and Lastrapes (1990), suggesting that the earlier finding was not a statistical fluke. In the absence of volume as a mixing variable, the returns of sample stocks were described by a GARCH model. Introducing volume as a proxy for information arrival reduced the GARCH effect. It appears that the role of turnover is also pronounced for volatility, and turnover is a good proxy variable for information flow. In addition, our multivariate analysis concludes that there is no
mutual volume or turnover spillover effect on conditional volatility among the companies. One possible explanation is that, since there are numerous companies traded in the market, the volatility of a company's stock return may not necessarily be influenced by the trading volume (turnover) of other companies. Our findings show that the GARCH effect could be used to measure volatility in the stock markets and, conversely, volume or turnover could replace the presence of the GARCH effect in the stock returns.

Here, we would like to provide an explanation of the observations about volume and turnover. When volatility increases, stock prices move up or down sharply in a short period. Technical analysts will capture this opportunity to invest. This fits the old Wall Street adage that "it takes volume to move prices." A herd effect could then be created, and more and more traders would join, a situation that will create further volatility (the GARCH effect) and both higher volume and turnover. Hence, it is not surprising that volume or turnover can replace the GARCH effect in the model. Most anomalies, for example the January effect, will diminish and eventually disappear after their discovery as more and more investors exploit the effect. For example, after discovering the January effect, investors who expect a stock's price to appreciate in January will purchase the stock before January and sell it at the end of January to make a profit. This will drive up stock prices before January, push down prices at the end of January, and result in the diminishing or even the disappearance of the January effect. For the volume-volatility relationship, it is a different story. Volatility will create further volatility, and more and more investors will trade to take advantage of the active market. Hence, volume and turnover will increase. This phenomenon is self-created. Based on the findings in this paper, we hypothesize that, unlike other anomalies, both the GARCH effect and the volume-volatility relationship cannot be exploited away by investors and will persist in the market. In addition, the existence of both the GARCH effect and the volume-volatility relationship leads us to reject the pure random walk hypothesis for stock returns.

Understanding the relationship between volume and volatility will be useful in investment decision making. Information about the companies (Thompson and Wong 1991; Wong and Chan 2004) and the economic and financial environment (Wong and Li 1999; Fong et al. 2005; Broll et al. 2006; Wong 2007; Wong and Chan 2008; Wong and Ma 2008; Wong et al. 2006, 2008) can also be used to make better investment decisions. Another extension that would improve investment decision making is to include technical analysis (Wong et al. 2001, 2003) that incorporates the volume-volatility relationship.
Acknowledgements The authors are most grateful to Professor C.F. Lee for his substantive comments and suggestions, which have significantly improved this manuscript. The authors are grateful to Jun Xu for her helpful assistance with the paper. The second author would like to thank Professors Robert B. Miller and Howard E. Thompson for their continuous guidance and encouragement. The research is partially supported by the University of Macau and Hong Kong Baptist University.
References

Admati, A. R. and P. Pfleiderer. 1988. "A theory of intraday patterns: volume and price variability." Review of Financial Studies 1, 3–40.
Andersen, T. 1996. "Return volatility and trading volume: an information flow interpretation." Journal of Finance 51, 169–204.
Aragó, V. and L. Nieto. 2005. "Heteroskedasticity in the returns of the main world stock exchange indices: volume versus GARCH effects." International Financial Markets, Institution and Money 15, 271–284.
Baillie, R. T. and T. Bollerslev. 1989. "The message in daily exchange rates: a conditional variance tale." Journal of Business and Economic Statistics 7, 297–305.
Bohl, M. T. and H. Henke. 2003. "Trading volume and stock market volatility: the Polish case." International Review of Financial Analysis 12, 513–525.
Bollerslev, T. 1986. "Generalized autoregressive conditional heteroskedasticity." Journal of Econometrics 31, 307–327.
Bollerslev, T. 1987. "A conditional heteroskedasticity time series model for speculative prices and rates of return." Review of Economics and Statistics 69, 542–547.
Bollerslev, T., R. Y. Chou, and K. F. Kroner. 1992. "ARCH modeling in finance." Journal of Econometrics 52, 5–59.
Bollerslev, T., R. F. Engle, and D. B. Nelson. 1994. "ARCH model," in Handbook of econometrics IV, R. F. Engle and D. C. McFadden (Eds.). Elsevier, Amsterdam, pp. 2959–3038.
Broll, U., J. E. Wahl, and W. K. Wong. 2006. "Elasticity of risk aversion and international trade." Economics Letters 92(1), 126–130.
Campbell, J., S. Grossman, and J. Wang. 1993. "Trading volume and serial correlation in stock returns." Quarterly Journal of Economics 108, 905–939.
Clark, P. K. 1973. "A subordinate stochastic process model with finite variance for speculative prices." Econometrica 41, 135–155.
Conrad, J., A. Hammeed, and C. Niden. 1994. "Volume and autocovariances in short-horizon individual security returns." Journal of Finance 49, 1305–1329.
Copeland, T. E. 1976. "A model of asset trading under the assumption of sequential information arrival." Journal of Finance 31, 1149–1168.
Daigler, R. T. and M. K. Wiley. 1999. "The impact of trader type on the futures volatility-volume relation." Journal of Finance 6, 2297–2316.
Diebold, F. X. 1986. "Comment on modeling the persistence of conditional variance." Econometric Reviews 5, 51–56.
Epps, T. and M. Epps. 1976. "The stochastic dependence of security price changes and transaction volumes: implications for the mixture of distributions hypothesis." Econometrica 44, 305–321.
Engle, R. F. 1982. "Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation." Econometrica 50, 987–1008.
Engle, R. F. and K. F. Kroner. 1995. "Multivariate simultaneous generalized ARCH." Econometric Theory 11, 122–150.
Fong, W. M., H. H. Lean, and W. K. Wong. 2005. "Stochastic dominance and the rationality of the momentum effect across markets." Journal of Financial Markets 8, 89–109.
Gallant, R., P. E. Rossi, and G. Tauchen. 1992. "Stock prices and volumes." Review of Financial Studies 5, 199–244.
Harris, L. 1986. "Cross-security tests of the mixture of distributions hypothesis." Journal of Financial and Quantitative Analysis 21, 39–46.
Harris, L. 1987. "Transaction data tests of the mixture of distributions hypothesis." Journal of Financial and Quantitative Analysis 2, 127–141.
Hiemstra, C. and J. Jones. 1994. "Testing for linear and nonlinear Granger causality in the stock price-volume relation." Journal of Finance 49, 1639–1664.
Jones, C. M., G. Kaul, and M. L. Lipson. 1994. "Transactions, volume, and volatility." Review of Financial Studies 7(4), 631–651.
Kyle, A. S. 1985. "Continuous auctions and insider trading." Econometrica 53, 1315–1335.
Lakonishok, J. and S. Smidt. 1986. "Volume for winners and losers: taxation and other motives for stock trading." Journal of Finance 41, 951–974.
Lamoureux, C. G. and W. D. Lastrapes. 1990. "Heteroskedasticity in stock returns data: volume versus GARCH effects." Journal of Finance 45(1), 221–229.
Lamoureux, C. and W. Lastrapes. 1994. "Endogenous trading volume and momentum in stock-return volatility." Journal of Business and Economic Statistics 12, 253–260.
Lo, W. A. and J. Wang. 2000. "Trading volume: definitions, data analysis, and implications of portfolio theory." Review of Financial Studies 13, 257–300.
Miyakoshi, T. 2002. "ARCH versus information-based variances: evidence from the Tokyo stock market." Japan and the World Economy 2, 215–231.
Morse, D. 1980. "Asymmetric information in securities markets and trading volume." Journal of Financial and Quantitative Analysis 15, 1129–1148.
Ragunathan, V. and A. Peker. 1997. "Price variability, trading volume, and market depth: evidence from the Australian futures market." Applied Financial Economics 5, 447–454.
Royston, P. 1982. "An extension of Shapiro and Wilk's W test for normality to large samples." Applied Statistics 31, 115–124.
Shalen, C. T. 1993. "Volume, volatility, and the dispersion of beliefs." Review of Financial Studies 6, 405–434.
Smidt, S. 1990. "Long-run trends in equity turnover." Journal of Portfolio Management 17, 66–73.
Smirlock, M. and L. Starks. 1985. "A further examination of stock price changes and transaction volume." Journal of Financial Research 8, 217–225.
Stickel, S. and R. Verrecchia. 1994. "Evidence that volume sustains price changes." Financial Analysts Journal, 57–67.
Stock, J. H. 1987. "Measuring business cycle time." Journal of Political Economy 95, 1240–1261.
Tauchen, G. E. and M. Pitts. 1983. "The price variability–volume relationship on speculative markets." Econometrica 51(2), 485–505.
Thompson, H. E. and W. K. Wong. 1991. "On the unavoidability of 'unscientific' judgment in estimating the cost of capital." Managerial and Decision Economics 12, 27–42.
Tkac, P. 1996. A trading volume benchmark: theory and evidence, Working paper, University of Notre Dame.
Varian, H. 1985. "Divergence of opinion in complete markets: a note." Journal of Finance 40, 309–317.
Wang, P., P. J. Wang, and A. Y. Liu. 2005. "Stock return volatility and trading volume: evidence from the Chinese stock market." Journal of Chinese Economic and Business Studies 1, 39–54.
Wong, W. K. 2007. "Stochastic dominance and mean-variance measures of profit and loss for business planning and investment." European Journal of Operational Research 182, 829–843.
Wong, W. K. and R. Chan. 2004. "The estimation of the cost of capital and its reliability." Quantitative Finance 4(3), 365–372.
Wong, W. K. and R. Chan. 2008. "Markowitz and prospect stochastic dominances." Annals of Finance 4(1), 105–129.
Wong, W. K., B. K. Chew, and D. Sikorski. 2001. "Can P/E ratio and bond yield be used to beat stock markets?" Multinational Finance Journal 5(1), 59–86.
Wong, W. K. and C. K. Li. 1999. "A note on convex stochastic dominance theory." Economics Letters 62, 293–300.
Wong, W. K. and C. Ma. 2008. "Preferences over location-scale family." Economic Theory 37(1), 119–146.
Wong, W. K., M. Manzur, and B. K. Chew. 2003. "How rewarding is technical analysis? Evidence from Singapore stock market." Applied Financial Economics 13(7), 543–551.
Wong, W. K., K. F. Phoon, and H. H. Lean. 2008. "Stochastic dominance analysis of Asian hedge funds." Pacific-Basin Finance Journal 16(3), 204–223.
Wong, W. K., H. E. Thompson, S. Wei, and Y. F. Chow. 2006. "Do winners perform better than losers? A stochastic dominance approach." Advances in Quantitative Analysis of Finance and Accounting 4, 219–254.
Appendix 76A

Modeling the volatility in financial markets is very important for options trading and risk management. Although volatility is not directly observable, it has some common features. First, there exist volatility clusters; that is, volatility may be high in certain time periods and low in other periods. Second, volatility evolves over time in a continuous manner; that is, volatility jumps are rare. Third, volatility seems to react asymmetrically to positive market news and negative market news. These properties play important roles in the development of volatility models. Since the publication of the seminal papers by Engle (1982) and Bollerslev (1986), ARCH/GARCH models have been widely applied to the analysis and forecasting of volatility in financial markets. Their popularity stems mainly from their ability to capture the volatility clustering phenomenon. Let r_t denote a log-return series. It can be expressed as the summation of its mean, c, and a zero-mean innovation, ε_t, if r_t does not possess any significant autocorrelation:

r_t = c + ε_t.   (76.7)
To capture the volatility clustering phenomenon, we assume that the variance of ε_t, conditional on the information set at time t−1, is σ_t^2, as follows:

σ_t^2 = α_0 + Σ_{i=1}^{p} α_i ε_{t-i}^2,   (76.8)

where α_0 > 0 and α_i ≥ 0 for i = 1, ..., p.
The above equation represents an AR(p) process for ε_t^2, which is known as the autoregressive conditional heteroskedasticity (ARCH) model of Engle (1982). From the structure of the model, it can be seen that large past squared shocks, {ε_{t-i}^2}_{i=1}^{p}, imply a large conditional variance σ_t^2 for the mean-corrected return ε_t. Consequently, ε_t tends to assume a large variance whenever there are large past squared shocks. This means that, under the ARCH framework, large shocks tend to be followed by another large shock. An alternative representation of the ARCH(p) model is:

ε_t = σ_t a_t,   σ_t^2 = α_0 + Σ_{i=1}^{p} α_i ε_{t-i}^2,   (76.9)
where a_t is an iid white noise, often assumed to follow a standard normal or standardized Student-t distribution. Equation (76.9) is usually adopted in the literature, since it is convenient for deriving the properties of the ARCH model as well as for specifying the likelihood function for estimation. Although the ARCH model is simple, in practice it often requires a large lag order p, and thus many parameters, to adequately describe the volatility process of an asset return. Bollerslev (1986) proposes a useful extension known
as the generalized ARCH (GARCH) model. Let ε_t = r_t − c be the mean-corrected log-return; then ε_t follows a GARCH(p, q) model if:

ε_t = σ_t a_t,   σ_t^2 = α_0 + Σ_{i=1}^{p} α_i ε_{t-i}^2 + Σ_{j=1}^{q} β_j σ_{t-j}^2,   (76.10)

where α_0 > 0, α_i ≥ 0, β_j ≥ 0, and Σ_{i=1}^{max(p,q)} (α_i + β_i) < 1. As
specified before, a_t is an iid white noise, often assumed to follow a standard normal or standardized Student-t distribution. Here, it is understood that α_i = 0 for any i > p and β_j = 0 for any j > q. The latter constraint on α_i + β_j implies that the unconditional variance of ε_t is finite, whereas its conditional variance σ_t^2 evolves over time. Clearly, the above equation reduces to a pure ARCH(p) model if q = 0. Take a simple GARCH(1,1) model as an example:

σ_t^2 = α_0 + α_1 ε_{t-1}^2 + β_1 σ_{t-1}^2,   0 ≤ α_1, β_1 ≤ 1,   (α_1 + β_1) < 1.   (76.11)

It can be observed that a large ε_{t-1}^2 or σ_{t-1}^2 will produce a large σ_t^2, which means that a large ε_{t-1}^2 tends to be followed by another large ε_t^2, generating, again, the well-known behavior of volatility clustering in financial time series. For detailed investigations of GARCH models, readers may refer to Bollerslev et al. (1992, 1994).
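A short simulation sketch of the GARCH(1,1) in Equation (76.11) makes the clustering mechanism visible; the parameter values below are illustrative and satisfy α_1 + β_1 < 1.

```python
import numpy as np

rng = np.random.default_rng(0)
T, alpha0, alpha1, beta1 = 2000, 0.05, 0.08, 0.90

eps = np.zeros(T)
h = np.zeros(T)
h[0] = alpha0 / (1 - alpha1 - beta1)     # unconditional variance
eps[0] = np.sqrt(h[0]) * rng.standard_normal()
for t in range(1, T):
    h[t] = alpha0 + alpha1 * eps[t-1]**2 + beta1 * h[t-1]
    eps[t] = np.sqrt(h[t]) * rng.standard_normal()   # eps_t = sigma_t * a_t

# A large |eps_{t-1}| raises h_t, so large shocks cluster in time; the
# closer alpha1 + beta1 is to 1, the longer shocks to volatility persist.
```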
Chapter 77
Application of Fuzzy Set Theory to Finance Research: Method and Application

Shin-Yun Wang and Cheng Few Lee
Abstract The impact of implicit "fuzziness" is inevitable due to the subjective assessment made by investors. Human judgment of events may differ significantly based on individuals' subjective perceptions or personality tendencies for judgment, evaluation, and decisions; thus human judgment is often fuzzy. So we will use the fuzzy set theory to describe and eliminate the "fuzziness," that is, the subjective assessment made by investors. Because the financial market fluctuates from time to time, in reality the future state of a system might not be known completely owing to a lack of information, so investment problems are often uncertain or vague in a number of ways. The traditional probability financial model does not take into consideration the fact that investors often face fuzzy factors. Therefore, the fuzzy set theory may be a useful tool for modeling this kind of imprecise problem. This theory permits the gradual assessment of the membership of elements in relation to a set; this is described with the aid of a membership function valued in the real unit interval [0, 1]. The fuzzy set theory allows the representation of uncertainty and inexact information in the form of linguistic variables and has been applied to many areas. The application of the fuzzy set theory to finance research is proposed in this article.

Keywords Fuzzy set · Fuzziness · Membership function · Linguistic variables
77.1 Introduction

The application of fuzzy set theory to finance research is proposed in this article. Due to the fluctuation of the financial market from time to time, in reality, the future state of
S.-Y. Wang () National Dong Hwa University, Hualien, Taiwan e-mail: [email protected] C.F. Lee Rutgers University, New Brunswick, NJ, USA
a system might not be known completely due to lack of information, so investment problems are often uncertain or vague in a number of ways. This type of uncertainty has long been handled appropriately by probability theory or statistics. However, in many areas, such as funds, stocks, debt, derivatives and others, human judgment of events may be significantly different based on individuals’ subjective perceptions or personality tendencies for judgment, evaluation and decisions; thus it is often fuzzy. So we will use the fuzzy set theory to describe and eliminate the “Fuzziness,” which is the subjective assessment made by investors in finance research. While assessing the distribution of a primary variable in finance research, an expert should evaluate the influence of sample information. This involves the fuzzy factor of the expert’s subjective judgment. That is, the fuzzy factor of the expert’s subjective judgment in the finance research should not be overlooked. Otherwise, the evaluation will not accurately reflect the problem and will lead to inaccurate decision-making. The traditional probability financial model does not take into consideration that investors face fuzzy (vague/imprecise/uncertain) factors in financial analysis. Therefore, the fuzzy set theory proposed by Zadeh (1965) may be a useful tool for modeling this kind of imprecise problem. The impact of implicit “Fuzziness” is inevitable due to the subjective assessment made by investors. In the real world, there are times when input parameters and data cannot always be determined in the precise sense. Therefore, the fuzzy set theory naturally provides an appropriate tool in modeling the imprecise models. For example, all of the following statements could be regarded as fuzzy sets: The riskless interest rate may occur imprecisely, “there is a good chance for a riskless interest rate of 3% next year; the riskless interest rate is very unlikely to go below 1%; and it is most probably in the range of 1.5–2.5%.” In another example, “in a booming economy, there is about a 60% probability that riskless interest rate will grow 10% next year.” The phrases “booming economy” and “about 60%” mean implicitly that the probability for the event of “10% riskless interest rate” could be 55, 58, 60, or 65% – all of those statements can be characterized as fuzzy sets. Furthermore,
in decision-making there is much uncertainty. There are many factors that affect decision-making, including the human psychological state and external information inputs, which are difficult to capture in terms of probabilistic or stochastic measurement. In addition, an investor usually depends on an expert's judgment to derive the probability distribution of primary variables in a financial model. In other words, an investor often subjectively describes the uncertainty he or she faces with implicit fuzziness or impreciseness, which can be expressed using both random and fuzzy elements as a basis for subjectively assessing uncertainty. Thus, fuzzy probability is calculated and derived in accordance with the "degree of belief" of experts in the real world. The fuzzy set theory is used to measure the effect of this fuzziness, because the thoughts and controlled behaviors of humans involve both fuzziness and non-quantitative qualities. Therefore, the fuzzy set theory provides a more realistic methodology for finance research. The remainder of this article is organized as follows. The fuzzy set is introduced and discussed in Sect. 77.2, including the definition of the fuzzy set, the membership function and terminology, types of membership functions, fuzzy logic, the operations of fuzzy sets, defuzzification, and fuzzy decision. In Sect. 77.3 the applications of fuzzy set theory, such as options, real options, capital budgeting, and so forth, are introduced. The conclusions are presented in Sect. 77.4.
estimate of the standard deviation of the values. Then, any single value has an uncertainty equal to the standard deviation. However, if the values are averaged and the mean is reported, then the averaged measurement has uncertainty equal to the standard error, which is the standard deviation divided by the square root of the number of measurements. When the uncertainty represents the standard error of the measurement, then about 68.2% of the time the true value of the measured quantity falls within the stated uncertainty range. Therefore, no matter how accurate our measurements are, some uncertainty always remains. Possibility measures the degree to which something can happen, while probability measures whether or not it actually happens. The methods with which we deal with uncertainty are avoiding the uncertainty, statistical mechanics, and fuzzy sets (Zadeh 1965). The possibility theory derived from the fuzzy set theory is often compared with the probability theory. For a detailed discussion, refer to Zadeh (1978) and Dubois and Prade (1988). The concept of the fuzzy random variable was introduced by Kwakernaak (1978) and Puri and Ralescu (1986). The fuzzy random variable makes the combination of randomness and fuzziness more persuasive, since the probability theory can be used to model uncertainty, and the fuzzy set theory can be used to model imprecision.
77.2.1 The Definition of Fuzzy Set 77.2 Fuzzy Set Before illustrating the fuzzy set theory, which makes decisions under uncertainty, it is important to define what is uncertainty. Uncertainty is a term used in subtly different ways in a number of fields, including philosophy, statistics, economics, finance, insurance, psychology, engineering, and science. It applies to predictions of future events, to physical measurements already made, or to the unknown. Uncertainty must be taken in a way that is radically distinct from the familiar notion of risk, from which it has never been properly separated. The essential fact is that “risk” in some cases is a quantity susceptible to measurement, while at other times, it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating. It will appear that a measurable uncertainty, or “risk” proper, as we shall use the term, is so far different from an immeasurable one that it is not in effect an uncertainty at all. Often, the uncertainty of a measurement is found by repeating the measurement enough times to get a good
Fuzzy sets were introduced by Lotfi A. Zadeh in 1965. What Zadeh proposed is very much a paradigm shift that first gained acceptance in the Far East; its successful application has ensured its adoption around the world. Fuzzy sets are an extension of the classical set theory and are used in fuzzy logic. In the classical set theory, the membership of elements in relation to a set is assessed in binary terms according to a crisp condition: an element either belongs or does not belong to the set. By contrast, fuzzy set theory permits the gradual assessment of the membership of elements in relation to a set; this is described with the aid of a membership function valued in the real unit interval [0, 1]. Fuzzy sets extend the classical set theory, since, for a certain universe, a membership function may act as an indicator function, mapping all elements to either 1 or 0 as in the classical notion. Specifically, a fuzzy set is any set that allows its members to have different grades of membership (membership function) in the interval [0, 1]. A fuzzy set on a classical set X is defined as follows:

Ã = {(x, μ_A(x)) | x ∈ X}
77.2.2 Membership Function

The membership function μ_A(x) quantifies the grade of membership of the elements x to the fundamental set X. An element mapping to the value 0 means that the member is not included in the given set; 1 describes a fully included member. Values strictly between 0 and 1 characterize the fuzzy members.

77.2.2.1 Membership Function Terminology

Universe of discourse: The universe of discourse is the range of all possible values for an input to a fuzzy system.

Support: The support of a fuzzy set F is the crisp set of all points in the universe of discourse U such that the membership function of F is non-zero: Supp A = {x | μ_A(x) > 0, ∀x ∈ X}.

Core: The core of a fuzzy set F is the crisp set of all points in the universe of discourse U such that the membership function of F is 1: core A = {x | μ_A(x) = 1, ∀x ∈ X}.

Boundaries: The boundaries of a fuzzy set F are the crisp set of all points in the universe of discourse U such that the membership function of F is between 0 and 1: Boundaries A = {x | 0 < μ_A(x) < 1, ∀x ∈ X}.

Crossover point: The crossover point of a fuzzy set is the element in U at which its membership function is 0.5: μ(x) = 0.5.

Height: The height is the largest value of the membership function of the fuzzy set.

Normalized fuzzy set: A fuzzy set with Height(A) = 1.

Cardinality of the set (X finite): |A| = Σ_{x∈X} μ_A(x) = Σ_{x∈Supp(A)} μ_A(x).

Relative cardinality: ‖A‖ = |A| / |X|.

Convex fuzzy set: For X ⊆ R, a fuzzy set A is convex if, for all λ ∈ [0, 1], μ_A(λx_1 + (1−λ)x_2) ≥ min(μ_A(x_1), μ_A(x_2)).
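For a finite universe, the terminology above reduces to elementary computations. The sketch below uses an illustrative discrete fuzzy set; the grades are made-up numbers, not from the text.

```python
# Illustrative discrete fuzzy set: element -> membership grade in [0, 1]
A = {1: 0.2, 2: 0.6, 3: 1.0, 4: 0.5}

support = [x for x, mu in A.items() if mu > 0]          # {1, 2, 3, 4}
core = [x for x, mu in A.items() if mu == 1.0]          # {3}
boundaries = [x for x, mu in A.items() if 0 < mu < 1]   # {1, 2, 4}
height = max(A.values())                                # 1.0 -> A is normalized
cardinality = sum(A.values())                           # |A| = 2.3
relative_cardinality = cardinality / len(A)             # |A| / |X| = 0.575
```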
77.2.2.2 Types of Membership Functions

1. Numerical definition (discrete membership functions):

   A = Σ_{x_i ∈ X} μ_A(x_i) / x_i

2. Functional definition (continuous membership functions), including the S function, Z function, Π function, triangular shape, trapezoidal shape, and bell shape:

   A = ∫_X μ_A(x) / x

1. S function (monotonically increasing membership function):

   S(x; α, β, γ) = 0 for x ≤ α; 2((x − α)/(γ − α))^2 for α ≤ x ≤ β; 1 − 2((x − γ)/(γ − α))^2 for β ≤ x ≤ γ; 1 for x ≥ γ.

2. Z function (monotonically decreasing membership function):

   Z(x; α, β, γ) = 1 for x ≤ α; 1 − 2((x − α)/(γ − α))^2 for α ≤ x ≤ β; 2((x − γ)/(γ − α))^2 for β ≤ x ≤ γ; 0 for x ≥ γ.

3. Π function (combining the S function and the Z function; monotonically increasing and then decreasing):

   Π(x; β, γ) = S(x; γ − β, γ − β/2, γ) for x ≤ γ; 1 − S(x; γ, γ + β/2, γ + β) for x ≥ γ.

4. Trapezoidal membership function (piecewise continuous):

   μ_A(x) = 0 for x ≤ a_1; (x − a_1)/(a − a_1) for a_1 ≤ x ≤ a; 1 for a ≤ x ≤ b; (b_1 − x)/(b_1 − b) for b ≤ x ≤ b_1; 0 for b_1 ≤ x.

   [Figure: a trapezoid rising from a_1 to a, equal to 1 between a and b, and falling from b to b_1.]

5. Triangular membership function:

   μ_A(x) = 0 for x ≤ a_1; (x − a_1)/(a − a_1) for a_1 ≤ x ≤ a; (b_1 − x)/(b_1 − a) for a ≤ x ≤ b_1; 0 for b_1 ≤ x.

   [Figure: a triangle rising from a_1 to a peak of 1 at a and falling to b_1.]

6. Bell-shaped membership function:

   μ_A(x) = c · e^{−(x − a)^2 / b}
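The standard shapes above translate directly into code; the following sketch implements the S, triangular, and trapezoidal membership functions with the parameter names used in the text.

```python
import numpy as np

def s_mf(x, a, b, c):
    # Zadeh S-function with breakpoints a <= b <= c, b = (a + c) / 2
    x = np.asarray(x, dtype=float)
    return np.where(x <= a, 0.0,
           np.where(x <= b, 2 * ((x - a) / (c - a))**2,
           np.where(x <= c, 1 - 2 * ((x - c) / (c - a))**2, 1.0)))

def triangular_mf(x, a1, a, b1):
    # rises from a1 to a peak of 1 at a, falls back to 0 at b1
    x = np.asarray(x, dtype=float)
    up = (x - a1) / (a - a1)
    down = (b1 - x) / (b1 - a)
    return np.clip(np.minimum(up, down), 0.0, 1.0)

def trapezoidal_mf(x, a1, a, b, b1):
    # rises on [a1, a], flat at 1 on [a, b], falls on [b, b1]
    x = np.asarray(x, dtype=float)
    up = (x - a1) / (a - a1)
    down = (b1 - x) / (b1 - b)
    return np.clip(np.minimum(np.minimum(up, 1.0), down), 0.0, 1.0)

# e.g. triangular_mf(2.5, 1, 3, 5) -> 0.75; the peak value 1 occurs at x = a
```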
77.2.3 Fuzzy Logic

Before illustrating the mechanisms that make fuzzy logic machines work, it is important to define fuzzy logic. Fuzzy logic is a superset of conventional (Boolean) logic that has been extended to handle the concept of partial truth – truth values between "completely true" and "completely false." As its name suggests, it is the logic underlying modes of reasoning that are approximate rather than exact. The importance of fuzzy logic derives from the fact that most modes of human reasoning and, especially, common sense reasoning are approximate in nature. The essential characteristics of fuzzy logic are as follows.
1. In fuzzy logic, exact reasoning is viewed as a limiting case of approximate reasoning.
2. In fuzzy logic, everything is a matter of degree.
3. Any logical system can be fuzzified.
4. In fuzzy logic, knowledge is interpreted as a collection of elastic or, equivalently, fuzzy constraints on a collection of variables.
5. Inference is viewed as a process of propagation of elastic constraints.
77.2.4 The Operations of Fuzzy Set

Having reviewed the characteristics of fuzzy sets, we now introduce the operations of fuzzy sets. A fuzzy number is a convex, normalized fuzzy set Ã ⊆ R whose membership function is at least segmentally continuous and has the functional value μ_A(x) = 1 at precisely one element. This can be compared to the funfair game "guess your weight," where someone guesses the contestant's weight, with closer guesses being more correct, and where the guesser "wins" if they guess close enough to the contestant's weight, with the actual weight being completely correct (mapping to 1 by the membership function). A fuzzy interval is an uncertain set Ã ⊆ R with a mean interval whose elements possess the membership function value μ_A(x) = 1. As in fuzzy numbers, the membership function must be convex, normalized, and at least segmentally continuous.

Set-theoretic operations:

1. Subset: A ⊆ B ⟺ μ_A ≤ μ_B
2. Complement: Ā = X − A ⟺ μ_Ā(x) = 1 − μ_A(x)
3. Union: C = A ∪ B ⟺ μ_C(x) = max(μ_A(x), μ_B(x)) = μ_A(x) ∨ μ_B(x)
4. Intersection: C = A ∩ B ⟺ μ_C(x) = min(μ_A(x), μ_B(x)) = μ_A(x) ∧ μ_B(x)
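A quick numerical sketch of these pointwise operators, with arbitrary membership grades; the axioms that such operators must satisfy are formalized next.

```python
A = {"x1": 0.3, "x2": 0.8}
B = {"x1": 0.6, "x2": 0.5}

union = {x: max(A[x], B[x]) for x in A}      # {'x1': 0.6, 'x2': 0.8}
inter = {x: min(A[x], B[x]) for x in A}      # {'x1': 0.3, 'x2': 0.5}
comp_A = {x: 1 - A[x] for x in A}            # {'x1': 0.7, 'x2': 0.2}

# min/max also satisfy the t-norm / t-conorm boundary and commutativity
# conditions listed below:
assert all(min(A[x], 1.0) == A[x] and max(A[x], 0.0) == A[x] for x in A)
assert all(min(A[x], B[x]) == min(B[x], A[x]) for x in A)
```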
1. The complement function is c: [0, 1] → [0, 1], μ_Ā(x) = c(μ_A(x)):
   (a) Boundary conditions: c(0) = 1 and c(1) = 0;
   (b) Monotonic property: if μ_A(x_1) < μ_A(x_2) then c(μ_A(x_1)) ≥ c(μ_A(x_2));
   (c) Continuity: the complement function c(·) must be a continuous function;
   (d) Involution: c(c(μ_A(x))) = μ_A(x), ∀x ∈ U.

2. The intersection (t-norm) function is t: [0, 1] × [0, 1] → [0, 1], μ_{A∩B}(x) = t[μ_A(x), μ_B(x)]:
   (a) Boundary conditions: t(0, 0) = 0 and t(μ_A(x), 1) = t(1, μ_A(x)) = μ_A(x);
   (b) Monotonic property: if μ_A(x) < μ_C(x) and μ_B(x) < μ_D(x) then t(μ_A(x), μ_B(x)) ≤ t(μ_C(x), μ_D(x));
   (c) Commutativity: t(μ_A(x), μ_B(x)) = t(μ_B(x), μ_A(x));
   (d) Associativity: t(μ_A(x), t(μ_B(x), μ_C(x))) = t(t(μ_A(x), μ_B(x)), μ_C(x)).

3. The union (t-conorm) function is s: [0, 1] × [0, 1] → [0, 1], μ_{A∪B}(x) = s[μ_A(x), μ_B(x)]:
   (a) Boundary conditions: s(1, 1) = 1 and s(μ_A(x), 0) = s(0, μ_A(x)) = μ_A(x);
   (b) Monotonic property: if μ_A(x) < μ_C(x) and μ_B(x) < μ_D(x) then s(μ_A(x), μ_B(x)) ≤ s(μ_C(x), μ_D(x));
   (c) Commutativity: s(μ_A(x), μ_B(x)) = s(μ_B(x), μ_A(x));
   (d) Associativity: s(μ_A(x), s(μ_B(x), μ_C(x))) = s(s(μ_A(x), μ_B(x)), μ_C(x)).

Although one can create fuzzy sets and perform various operations on them, in general they are mainly used when creating fuzzy values and to define the linguistic terms of fuzzy variables. This is described in the section on fuzzy variables. At some point it may be an interesting exercise to add fuzzy numbers to the toolkit. These would be specializations of fuzzy sets with a set of operations such as addition, subtraction, multiplication, and division defined on them. For simplified calculation, we often use triangular fuzzy numbers. According to the characteristics of triangular fuzzy numbers and the extension principle put forward by Zadeh
(1965), the operational laws of two triangular fuzzy numbers Ã = (l_1, m_1, r_1) and B̃ = (l_2, m_2, r_2) are as follows:

1. Addition of two fuzzy numbers: (l_1, m_1, r_1) ⊕ (l_2, m_2, r_2) = (l_1 + l_2, m_1 + m_2, r_1 + r_2)
2. Subtraction of two fuzzy numbers: (l_1, m_1, r_1) ⊖ (l_2, m_2, r_2) = (l_1 − r_2, m_1 − m_2, r_1 − l_2)
3. Multiplication of two fuzzy numbers: (l_1, m_1, r_1) ⊗ (l_2, m_2, r_2) ≅ (l_1 l_2, m_1 m_2, r_1 r_2)
4. Division of two fuzzy numbers: (l_1, m_1, r_1) ⊘ (l_2, m_2, r_2) ≅ (l_1 / r_2, m_1 / m_2, r_1 / l_2)
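These operational laws are one-liners in code. The sketch below implements them for triangular fuzzy numbers represented as (l, m, r) tuples; as the ≅ signs indicate, multiplication and division are approximations, and division assumes the support of B̃ excludes zero.

```python
def tfn_add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def tfn_sub(a, b):
    return (a[0] - b[2], a[1] - b[1], a[2] - b[0])

def tfn_mul(a, b):   # approximation, exact for positive supports
    return (a[0] * b[0], a[1] * b[1], a[2] * b[2])

def tfn_div(a, b):   # approximation; requires 0 not in [b[0], b[2]]
    return (a[0] / b[2], a[1] / b[1], a[2] / b[0])

A, B = (1, 2, 3), (2, 4, 6)
print(tfn_add(A, B))   # (3, 6, 9)
print(tfn_sub(A, B))   # (-5, -2, 1)
```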
77.2.5 Defuzzify

When we get through the operations of fuzzy sets to determine the fuzzy interval, we then convert the fuzzy value into a crisp value. Below are some methods that convert a fuzzy set back into a single crisp (non-fuzzy) value. This is normally done after a fuzzy decision has been made and the fuzzy result must be used in the real world. For example, if the final fuzzy decision were to adjust the temperature setting on the thermostat a "little higher," then it would be necessary to convert this "little higher" fuzzy value to the "best" crisp value to actually move the thermostat setting by some real amount. The methods are as follows:

1. Maximum defuzzification finds the mean of the maximum values of a fuzzy set as the defuzzification value. Note that this does not always work well, because there can be x ranges where the y value is constant at the maximum value, and other places where the maximum value is reached for only a single x value. When this happens, the single value gets too much of a say in the defuzzified value.
2. Moment defuzzification returns a floating-point (double) value that represents the fuzzy set. It calculates the first moment of area of the fuzzy set about the y axis. The set is subdivided into different shapes by partitioning vertically at each point in the set, resulting in rectangles, triangles, and trapezoids. The center of gravity (moment) and area of each subdivision are calculated using the appropriate formulas for each shape. The first moment of area of the whole set is then:

   x′ = (Σ_{i=1}^{n} x′_i A_i) / (Σ_{i=1}^{n} A_i),

   where x′_i is the local center of gravity, A_i is the local area of the shape underneath line segment (p_{i−1}, p_i), and n is the total number of points. As an example, consider the set in the original diagram [figure omitted: a piecewise-linear fuzzy set partitioned into triangular, rectangular, and trapezoidal subsections]. For each shaded subsection, the area and centre of gravity are calculated according to the shape identified (i.e., triangle, rectangle, or trapezoid). The center of gravity of the whole set is then determined:

   x′ = (2.333·1.0 + 3.917·1.6 + 5.5·0.6 + 6.333·0.3) / (1.0 + 1.6 + 0.6 + 0.3) = 3.943...

3. Center of Area (COA) defuzzification finds the x value such that half of the area under the fuzzy set is on each side of the x value. In the case above (in the moment defuzzification section) the total area under the fuzzy set is 3.5 (1.0 + 1.6 + 0.6 + 0.3), so we would want to find the x value where the areas to the left and to the right both equal 1.75. This occurs where x = 3.8167. Note that in general the results of moment defuzzification and center of area defuzzification are not the same. Also note that in some cases the center of area can be satisfied by more than one value. For example, for the fuzzy set defined by the points (5,0), (6,1), (7,0), (15,0), (16,1), (17,0), the COA could be any value from 7.0 to 15.0, since the two identical triangles centered at x = 6 and x = 16 lie on either side of 7.0 and 15.0. We will return a value of 11.0 in this case (in general we try to find the middle of the possible x values).

4. Weighted average defuzzification finds the weighted average of the x values of the points that define a fuzzy set, using the membership values of the points as the weights. This value is returned as the defuzzification value. For example, if we have the following fuzzy set definition [figure omitted: two singleton points, x = 1.0 with membership 0.9 and x = 4.0 with membership 1.0], then the weighted average value of the fuzzy set points will be:

   (1.0·0.9 + 4.0·1.0) / (0.9 + 1.0) = 2.579.

   This is only moderately useful, since the value at 1.0 has too much influence on the defuzzified result. The moment defuzzification is probably most useful in this case. However, a place where this defuzzification method is very useful is when the fuzzy set is in fact a series of singleton values. It might be that a set of rules is of the Takagi-Sugeno-Kang type (first order) with formats like:

   If x is A and y is B then c = k,

   where x and y are fuzzy variables and k is a constant that is represented by a singleton fuzzy set. For example, we might have rules where the setting of the hot valve has several possibilities, say full closed, low, medium low, medium high, high, and full open, and these are singleton values rather than normal fuzzy sets. In this case medium low might be 2 on a scale from 0 to 5.
An aggregated conclusion for setting the hot valve position (after all of the rules have contributed to the decision) might look like the singleton set in the original figure [figure omitted: aggregated singleton outputs at x = 1.0, 2.0, 3.0, and 4.0 with memberships 0.25, 1.0, 0.5, and 0.5], and the weighted average defuzzification value for this output would be:

(1.0·0.25 + 2.0·1.0 + 3.0·0.5 + 4.0·0.5) / (0.25 + 1.0 + 0.5 + 0.5) = 2.556.

Note that neither a maximum defuzzification nor a moment defuzzification would produce a useful result in this situation. The maximum version would use only one of the points (the maximum one), giving a result of 2.0 (the x value of that point), while the moment version would not find any area to work with and would generate an exception. This description of the weighted average defuzzification method will be clearer after the sections on fuzzy values and fuzzy rules have been completed.
77.2.6 Fuzzy Decision

After the process of defuzzification, the next step is to make a fuzzy decision. A fuzzy decision is a model for decision-making in a fuzzy environment; the objective function and constraints are characterized by their membership functions, and the decision is the intersection of the fuzzy constraints and the fuzzy objective function. The fuzzy decision-making method consists of three main steps:

1. Representation of the decision problem: this step consists of three activities. (1) Identifying the decision goal and a set of the decision alternatives. (2) Identifying a set of the decision criteria. (3) Building a hierarchical structure of the decision problem under consideration.
2. Fuzzy set evaluation of decision alternatives: this step consists of three activities. (1) Choosing sets of the preference ratings for the importance weights of the decision criteria and the decision alternatives, including linguistic variables and triangular fuzzy numbers. (2) Evaluating the importance weights of the criteria and the degrees of appropriateness of the decision alternatives. (3) Aggregating the weights of the decision criteria.
3. Selection of the optimal alternative: this step includes two activities. (1) Prioritization of the decision alternatives using the aggregated assessments. (2) Choice of the decision alternative with the highest priority as the optimal one.
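A minimal sketch of the intersection-based fuzzy decision just described (in the spirit of Bellman and Zadeh): the decision membership is the pointwise minimum of the goal and constraint memberships, and the alternative with the highest decision membership is selected. All grades below are illustrative assumptions.

```python
alternatives = ["d1", "d2", "d3"]
mu_goal = {"d1": 0.7, "d2": 0.9, "d3": 0.4}        # fuzzy objective function
mu_constraint = {"d1": 0.8, "d2": 0.5, "d3": 0.9}  # fuzzy constraint

# decision = intersection of goal and constraint (pointwise min)
mu_decision = {d: min(mu_goal[d], mu_constraint[d]) for d in alternatives}
optimal = max(mu_decision, key=mu_decision.get)

print(mu_decision)   # {'d1': 0.7, 'd2': 0.5, 'd3': 0.4}
print(optimal)       # 'd1' has the highest priority
```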
77.3 Applications of Fuzzy Set Theory

Zadeh (1965) first introduced the fuzzy set theory, which allows the representation of uncertainty and inexact information in the form of linguistic variables. It applies to many areas, such as pattern recognition and clustering; fuzzy control (automobiles, air-conditioning, robotics); fuzzy decision-making (stock markets, finance, investment); expert systems (databases, information retrieval, image processing); as well as other fields (neural networks, genetic algorithms, and so forth). Recently, there have been many financial studies based on the fuzzy set theory. The book of collected papers edited by Ribeiro et al. (1999) applied the fuzzy set theory to the discipline called financial engineering. Chui-Yu and Chan (1998) proposed a capital budgeting model under uncertainty in which cash flow information is specified as a special type of fuzzy number, which is then used to estimate the present worth of each fuzzy project cash flow. Carlsson and Fuller introduced a (heuristic) real option rule in a fuzzy setting, where the present values of expected cash flows and expected costs are estimated by trapezoidal fuzzy numbers. Wu (2004) proposed the fuzzy pattern of the Black-Scholes formula and the put-call parity relationship. Lee et al. (2005a, b) adopted the fuzzy decision theory and Bayes' rule as a base for measuring fuzziness in the practice of option analysis, deriving a fuzzy B-S pricing model under a fuzzy environment. At the same time, they also applied fuzzy set theory to the Cox, Ross, and Rubinstein (CRR 1979) model to set up the fuzzy binomial option pricing model (OPM). Sánchez and Gómez (2003a, b, 2004) addressed the subject of fuzzy regression (FR) and the term structure of interest rates (TSIR). Below, we give brief examples of the fuzzy set theory in finance research, where innovative methods based on fuzzy set theory have been developed to provide more accurate estimates.
77.3.1 An Example of Option

An option gives the holder of the option the right to buy (sell) an asset by a certain date for a certain price. The optimal option price has been computed by the binomial model
77 Application of Fuzzy Set Theory to Finance Research: Method and Application
(1979) or the Black and Scholes model (1973). However, volatility and riskless interest rate are assumed as constant in those models. Hence, many subsequent studies emphasized the estimated riskless interest rate and volatility. Cox and Ross (1975) introduced the concept of Constant-Elasticityof-Variance for volatility. Hull and White (1987) released the assumption that the distribution of price of underlying asset and volatility are constant. Wiggins (1987), Scott (1987), Lee et al. (1991) released the assumption that the volatility is constant and assumed that the volatility followed Stochastic-Volatility. Amin (1993) and Scott (1987) considered that the Jump-Diffusion process of stock prices and the volatility were random process. Researchers have so far made substantial effort and achieved significant results concerning the pricing of options (e.g., Brennan and Schwartz 1977; Geske and Johnson 1984; Barone-Adesi and Whaley 1987). Empirical studies have shown that, given their basic assumptions, existing pricing models seem to have difficulty in properly handling the uncertainties inherent in any investment process. Option prices are affected by the five primary factors. These are striking price, current stock price, time, riskless interest rate, and volatility. Since the striking price and time until option expiration are both determined, current stock prices reflect on every period, but riskless interest rate determines the interest rate of currency market, and volatility can’t be observed directly, but can be estimated by historical data and situation analysis. Therefore, riskless interest rate and volatility are estimated. The concept of fuzziness can be used to estimate the two factors of riskless interest rate and volatility. The pricing characteristics of financial market can be represented by the mathematical description of the option pricing model. In addition, option prices can be determined by the following types of factors: 1. Objective factors of the contract (such as striking price, current stock price, time, etc.); 2. Subjective factors of the investor (such as riskless interest rate and volatility in accordance with the classification of “booming economy,” “fair economy,” or “depression,” etc.). As a result of the objective and subjective factors, the option pricing model can use the fuzzy set theoretical approach to study advanced methodology for the analysis of option price. Wu (2004) proposed the fuzzy pattern of Black-Scholes formula and put–call parity relationship. Lee et al. (2005a, b) modeled the option price on B-S model or binominal model, based on option price data available. So most of studies have focused on how to release the assumptions in the binomial model and the Black-Scholes model, including the fact that (1) the short-term riskless interest rate is constant, and (2) the volatility of a stock is constant. After loosening these
assumptions, the fuzzy set theory applies to the option pricing model, in order to replace the complex models of previous studies. Based on the fuzzy Black-Scholes model and fuzzy binomial model theories, two real-world examples have been given to demonstrate the efficacy of the theory.
77.3.2 The B-S Model Under Fuzzy Environment

When dealing with actual B-S issues, an investor not only faces a fuzzy sample information space but also remains in a fuzzy state space. For example, an industry forecasts its future riskless interest rate in accordance with the classification of "booming economy," "fair economy," or "depression." The definitions of "booming economy," "fair economy," and "depression" depend on the investor's subjective opinion. Therefore, the state space encountered by the investor also involves implicit fuzziness. Besides, the actions that the investor takes will cause the price structure to change accordingly in the analysis. Take the pricing change for example: when σ_t drops, R_t is expected to go down. However, due to the investor's environment, the timing of the decision-making, and the inability to give it a trial, it is virtually impossible to wait for the collection of perfect information. Under these circumstances, the alternative adopted by the investor of the B-S model under uncertainty contains fuzziness. In summary, the B-S model with which an investor actually deals is in a fuzzy state with fuzzy sample information and fuzzy actions. As a result, the B-S model can be defined with the fuzzy decision space B̃ = {F̃, Ã, P(F̃), C(Ã, F̃)}, in which F̃ = {F̃_1, F̃_2, ..., F̃_K} stands for the fuzzy state set; F̃_k, k = 1, 2, ..., K, stands for a fuzzy set in S; S = {S_1, S_2, ..., S_I} stands for the state set; S_i, i = 1, 2, ..., I, stands for a state of the state set; Ã = {Ã_1, Ã_2, ..., Ã_N} stands for the fuzzy action set; Ã_n, n = 1, 2, ..., N, stands for a fuzzy action set in D; D = {d_1, d_2, ..., d_J} stands for the action set; and d_j, j = 1, 2, ..., J, stands for an action or alternative available to the investor. P(F̃_k) is the prior probability of F̃_k; C(Ã, F̃) is the evaluation function of Ã × F̃.
(77.1)
i D1
Let MQ D fMQ 1 ; MQ 2 ; : : : ; MQ J g be the fuzzy sample information space in X . MQ j ; j D 1; 2; : : : ; J , is the fuzzy sample information; X D fx1 ; x2 ; : : : ; xm g is the sample information space, and fxr g; r D 1; 2; : : : ; m is an independent event.
1192
S.-Y. Wang and C.F. Lee
The posterior probability of fuzzy state FQk is defined in accordance with the posterior probability of fuzzy events after the fuzzy sample information MQ j is
From above description, we can derive the expected value S of Rt ; t , and the fuzzy B-S call option pricing. For instance, a company’s expert in its sales department has long performed the riskless interest rate Rt . If this sales I expert always forecasts the company’s riskless interest rate X P .FQk jMQ j / D FQk .Si / P .Si jMQ j / of the next term in accordance with the economy’s condii D1 tion, which might be classified as a “booming economy,” m a “fair economy,” and a “depression.” Let the state set be X P .xr jSi / MQ j .xr / P .Si / S D fS1 ; S2 ; : : : ; SI g, where Si ; i D 1; 2; : : : ; I stands I X rD1 for the riskless interest rate, and the fuzzy stat set be FQ D D FQk .Si / Qj/ P . M fFQ1 ; FQ2 ; : : : ; FQK g, in which FQk ; k D 1; 2; : : : ; K stands for i D1 an economy condition. If this company prepares a new lower m I X X FQk .Si / MQ j .xr / P .xr jSi / P .Si / price plan to respond to the price competition in the market, the sales department expects riskless interest rate. Let the i D1 rD1 D m sample information space be X D fx1 ; x2 ; : : : ; xm g, where X MQ j .xr / P .xr / fxr g; r D 1; 2; : : : ; m stands for rate of riskless interest rates rD1 growth under the different price plans and fxr g is an inde(77.2) pendent event. Also let the fuzzy sample information space MQ D fMQ 1 ; MQ 2 ; : : : ; MQ J g, in which MQ j ; j D 1; 2; : : : ; J An investor must consider the call price after he or she has stands for a riskless interest growth rate condition that might realized the fuzzy state FQk and the fuzzy sample informa- be classified as “high riskless interest growth rate” or “fair Q tion MQ j of the industry in order to draft an optimal deci- riskless interest growth rate.” The expected value of Rt in Fk Q Q Q Q state, R F is sion and action An . Let the call price be the value C.An ; Fk / t k Q FQ /, then the expected call price of evaluation function C.A; I m C.AQn jMQ j / of AQn can be defined as X X ŒSi .1 C xr P .xr jSi // FQk .Si / P .Si /: Rt .FQk / D C.AQn jMQ j / D
K X
C.AQn ; FQk /P .FQk jMQ j /:
i D1
The optimal action AQn can be determined by C.AQn jMQ j /. The call price C.AQn jMQ j / of the optimal fuzzy action AQn can be defined as N
(77.5)
(77.3)
kD1
C.AQn jMQ j / D ^ C.AQn jMQ j /;
Taking a change of the price policy in fuzzy environment, the company estimates its future t , which depends on the investor’s subjective judgment. Therefore, the expected value of t in FQk state, in which u D 1; 2; : : : ; v stands for a fuzzy action set in t ; EAQ.Q t / is
(77.4)
nD1
rD1
EAQ.Q t / D
v X
t u AQ.t u / P .t u /:
(77.6)
uD1
N
where, ^ is selecting the minimum value of N values. nD1
The lower the call option prices the better for investor to reduce the loss.
Combining Equations (77.5) and (77.6), the Fuzzy B-S Option Pricing Model and the expected call price C.AQn jMQ j / can be defined in accordance with Equation (77.3) as
K X ˇ ˇ C.AQn ˇMQ j / D C.AQn ; FQk / P .FQk ˇMQ j / kD1
D
K X
C.AQn ; FQk /
D
kD1
ˇ FQk .Si / P .Si ˇMQ j /
i D1
kD1 K X
I X
I P m P
C.AQn ; FQk /
i D1 rD1
FQk .Si / MQ j .xr / P .xr jSi / P .Si / m P rD1
: MQ j .xr / P .xr / (77.7)
77 Application of Fuzzy Set Theory to Finance Research: Method and Application
According to the expected call price C.AQn jMQ j /, the optimal action AQn under the fuzzy B-S model can be defined as
1193
The lower the call option prices the better for investor to reduce the loss.
terest rates growth under the different price plans and fxr g is an independent event. Also let the fuzzy sample information space MQ D fMQ 1 ; MQ 2 ; : : : ; MQ J g, in which MQ j ; j D 1; 2; : : : ; J stands for a riskless interest growth rate condition that might be classified as “high riskless interest growth rate” or “fair riskless interest growth rate”. The expected value of Rt in FQk state, Rt FQk can be defined in accordance with Equation (77.3) as
77.3.3 The derivation of fuzzy B-S OPM
Rt .FQk / D
N
C.AQn jMQ j / D ^ C.AQn jMQ j /:
(77.8)
nD1
I X
ŒSi .1 C
i D1
m X
xr P .xr jSi // FQk .Si / P .Si /:
rD1
(77.9) The expected value S of Rt ; t , and the fuzzy B-S call option pricing are derived in the following. 77.3.3.1 Expected Value of Rt For instance, a company’s expert in its sales department has long performed the riskless interest rate Rt . If this sales expert always forecasts the company’s riskless interest rate of the next term in accordance with the economy’s condition, which might be classified as a “booming economy,” a “fair economy,” and a “depression.” Let the state set be S D fS1 ; S2 ; : : : ; SI g, where Si ; i D 1; 2; : : : ; I stands for the riskless interest rate, and the fuzzy stat set be FQ D fFQ1 ; FQ2 ; : : : ; FQK g, in which FQk ; k D 1; 2; : : : ; K stands for an economy condition. If this company prepares a new lower price plan to respond to the price competition in the market, the sales department expects riskless interest rate. Let the sample information space be X D fx1 ; x2 ; : : : ; xm g, where fxr g; r D 1; 2; : : : ; m stands for rate of riskless in-
77.3.3.2 Expected Value of t Taking a change of the price policy in fuzzy environment, the company estimates its future t , which depends on the investor’s subjective judgment. Therefore, the expected value of t in FQk state, in which u D 1; 2; : : : ; v stands for a fuzzy action set in t ; EAQ.Q t / can be defined in accordance with Equation (77.3) as EAQ.Q t / D
v X
t u AQ.t u / P .t u /:
(77.10)
uD1
77.3.3.3 The Fuzzy B-S OPM Combining Equations (77.9) and (77.10), the Fuzzy B-S Option Pricing Model and the expected call price C.AQn jMQ j / can be defined in accordance with Equation (77.8) as
K X ˇ ˇ C.AQn ˇMQ j / D C.AQn ; FQk / P .FQk ˇMQ j / kD1
D
K X
C.AQn ; FQk /
I X
ˇ FQk .Si / P .Si ˇMQ j /
i D1
kD1
m I X X
D
K X
C.AQn ; FQk /
FQk .Si / MQ j .xr / P .xr jSi / P .Si /
i D1 rD1
kD1
m X
:
(77.11)
MQ j .xr / P .xr /
rD1
According to the expected call price C.AQn jMQ j /, the optimal action AQn under the fuzzy B-S model can be defined as N
C.AQn jMQ j / D ^ C.AQn jMQ j /: nD1
(77.12)
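To make the mechanics concrete, the following short Python sketch evaluates Equations (77.1), (77.2), (77.11), and (77.12) for a toy problem with three crisp states, two fuzzy states, and two actions. All membership grades, probabilities, and candidate call prices are illustrative assumptions, not values from this chapter.

```python
import numpy as np

P_S = np.array([0.3, 0.5, 0.2])        # P(S_i): prior over the crisp states
F = np.array([[1.0, 0.4, 0.0],         # F~_k(S_i): membership grades of two
              [0.0, 0.6, 1.0]])        # fuzzy states ("boom", "depression")
M_j = np.array([0.9, 0.5, 0.1])        # M~_j(x_r): one piece of fuzzy sample info
P_x_S = np.array([[0.6, 0.3, 0.1],     # P(x_r | S_i): row i, column r
                  [0.3, 0.4, 0.3],
                  [0.1, 0.3, 0.6]])
C = np.array([[3.2, 4.1],              # C(A~_n, F~_k): call price of action n
              [3.6, 3.8]])             # under fuzzy state k

prior = F @ P_S                                  # Equation (77.1)
P_x = P_x_S.T @ P_S                              # P(x_r) = sum_i P(x_r|S_i) P(S_i)
post = F @ ((P_x_S @ M_j) * P_S) / (M_j @ P_x)   # Equation (77.2)
expected_call = C @ post                         # Equation (77.11)
n_star = expected_call.argmin()                  # Equation (77.12): take the minimum
print(prior, post, expected_call, n_star)
```

Because the decision rule minimizes the expected call price, the action selected is the one whose price is least exposed to the fuzzy state that the sample information makes most plausible.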
77.3.4 General Inference

According to the definition of a fuzzy set given by Zadeh (1965), $\tilde{A}$ is a fuzzy set of $X$ with membership function $\mu_{\tilde{A}}: X \to [0, 1]$; that is, when $x \in X$, then $\mu_{\tilde{A}}(x) \in [0, 1]$. When the range of the membership function is restricted to $\{0, 1\}$, $\tilde{A}$ is transformed into a crisp set $A$, and $\mu_{\tilde{A}}$ is transformed into the characteristic function $C_A$. For a decision maker in the B-S model, this kind of transformation means that the estimates of the primary variables $(S, K, T, R, \sigma)$ are without fuzziness; that is, $\mu_{\tilde{A}}(x_i) = 1$ or $0$. Under these circumstances, $\mu_{\tilde{A}}(x) = C_A(x) = 1$ if $x \in A$, and $\mu_{\tilde{A}}(x_i) = 0$ if $x \notin A$; therefore, the probability of the occurrence of the fuzzy event $\tilde{A}$ can be defined as

$$P(\tilde{A}) = \sum_{i=1}^{I} \mu_{\tilde{A}}(x_i)\, P(x_i) = \sum_{i=1}^{I} C_A(x_i)\, P(x_i) = P(A). \qquad (77.13)$$

According to the traditional probabilistic B-S model, the decision space faced by an investor is $B = \{S, D, P(S_i), C(d_j, S_i)\}$; in the B-S model, $C = S\,N(d_1) - K e^{-RT} N(d_2)$, and the $C$ value is $C(d_j \mid x_r)$. Therefore, under the state $S_i$ and the sample information $x_r$, the expected call price $C(d_{j^*} \mid x_r)$ of the optimal alternative is defined as

$$C(d_{j^*} \mid x_r) = \min_{j} \left\{ \sum_{i=1}^{I} C(d_j, S_i)\, P(S_i \mid x_r) \right\} = \bigwedge_{j=1}^{J} \sum_{i=1}^{I} C(d_j, S_i)\, P(S_i \mid x_r). \qquad (77.14)$$
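For comparison with the fuzzy case, the crisp rule in Equation (77.14) reduces to choosing the action with the smallest expected call price under the posterior state probabilities. A minimal Python sketch, with illustrative numbers:

```python
import numpy as np

P_S_given_x = np.array([0.2, 0.5, 0.3])   # P(S_i | x_r) for one observed x_r
C_dS = np.array([[3.0, 3.5, 4.2],         # C(d_j, S_i): call price of action j
                 [3.4, 3.3, 3.9]])        # in state i

expected_cost = C_dS @ P_S_given_x        # sum_i C(d_j, S_i) P(S_i | x_r)
j_star = expected_cost.argmin()           # the minimum ("wedge") in (77.14)
print(expected_cost, j_star)
```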
77.4 An Example of Fuzzy Binomial OPM

The fuzzy binomial OPM is illustrated below, including the one-step model and the inference of the n-step fuzzy binomial OPM.

77.4.1 One-Step Fuzzy Binomial OPM

According to the triangular fuzzy number, we can set up the stock price, call price, volatility, riskless interest rate, and probability so that each takes three values: low, medium, and high. The derivation of the call prices proceeds as follows.

[Figure: One-step fuzzy binomial trees. The left tree shows the stock price moving from $S$ at $t = 1$ to the fuzzy up node $(P_d \cdot S_{ul},\ P_m \cdot S_{um},\ P_u \cdot S_{ur})$ and the fuzzy down node $((1-P_u) \cdot S_{dl},\ (1-P_m) \cdot S_{dm},\ (1-P_d) \cdot S_{dr})$ at $t = 2$. The right tree discounts the option payoffs $(e^{-r_h \Delta t} C_{ul},\ e^{-r_m \Delta t} C_{um},\ e^{-r_l \Delta t} C_{ur})$ and $(e^{-r_l \Delta t} C_{dl},\ e^{-r_m \Delta t} C_{dm},\ e^{-r_h \Delta t} C_{dr})$ back from $t = 2$ to the fuzzy call price $(C_d, C_m, C_u)$ at $t = 1$.]

Assume that an option will mature in a given period, and that the stock price is initially set to $S$ at $t = 1$. It can move up or down in the next period ($t = 2$). There are three possible scenarios in the up movement, $S_{ur}\,(= ur \cdot S)$, $S_{um}\,(= um \cdot S)$, or $S_{ul}\,(= ul \cdot S)$, and three scenarios in the down movement, $S_{dr}\,(= dr \cdot S)$, $S_{dm}\,(= dm \cdot S)$, or $S_{dl}\,(= dl \cdot S)$. Hence, in each time interval the stock price moves from its initial value $S$ to one of the six new values $S_{ur}$, $S_{um}$, $S_{ul}$, $S_{dr}$, $S_{dm}$, or $S_{dl}$, where

$S_{ur}$: stock price of the up movement under the greatest volatility, $S_{ur} = S_0\, e^{\sigma(1+\delta)\sqrt{\Delta t}}$;
$S_{um}$: stock price of the up movement under the medium volatility, $S_{um} = S_0\, e^{\sigma\sqrt{\Delta t}}$;
$S_{ul}$: stock price of the up movement under the smallest volatility, $S_{ul} = S_0\, e^{\sigma(1-\delta)\sqrt{\Delta t}}$;
$S_{dr}$: stock price of the down movement under the smallest volatility, $S_{dr} = S_0\, e^{-\sigma(1-\delta)\sqrt{\Delta t}}$;
$S_{dm}$: stock price of the down movement under the medium volatility, $S_{dm} = S_0\, e^{-\sigma\sqrt{\Delta t}}$;
$S_{dl}$: stock price of the down movement under the greatest volatility, $S_{dl} = S_0\, e^{-\sigma(1+\delta)\sqrt{\Delta t}}$.

The call prices at maturity are $C_{ur} = \max(S_{ur} - X, 0)$, $C_{um} = \max(S_{um} - X, 0)$, $C_{ul} = \max(S_{ul} - X, 0)$, $C_{dr} = \max(S_{dr} - X, 0)$, $C_{dm} = \max(S_{dm} - X, 0)$, and $C_{dl} = \max(S_{dl} - X, 0)$.

We can combine a bond and the stock to replicate the call price. Take the greatest volatility as an example:

$$C_u = \Delta_u S + B_u,$$

where
$\Delta_u$: the number of stocks bought under the greatest volatility (stocks are assumed to be infinitely divisible);
$S$: the current stock price;
$B_u$: the amount invested in bonds under the greatest volatility.

Assume we buy $\Delta_u$ stocks under the greatest volatility and invest the amount $B_u$ in riskless assets at the rates $(r_l, r_m, r_h)$. After one period, the value of the portfolio for the up movement is $\Delta_u \cdot ur \cdot S + e^{r_h \Delta t} B_u$, and the value for the down movement is $\Delta_u \cdot dl \cdot S + e^{r_h \Delta t} B_u$, where
$ur$: the range of the greatest volatility of the up movement; the movement from $S$ to $S_{ur}\,(= ur \cdot S)$ is therefore an up movement under the greatest volatility;
$dl$: the range of the greatest volatility of the down movement; the movement from $S$ to $S_{dl}\,(= dl \cdot S)$ is therefore a down movement under the greatest volatility;
$e^{r_h \Delta t}$: the compounding factor at the riskless interest rate;
$e^{r_h \Delta t} B_u$: the sum of the riskless interest and the principal of the bond after one period under the greatest volatility.

Suppose the value of the portfolio under either the up or the down movement equals the value of the corresponding call price. Then

$$\Delta_u \cdot ur \cdot S + e^{r_h \Delta t} B_u = C_{ur}, \qquad (77.15)$$

$$\Delta_u \cdot dl \cdot S + e^{r_h \Delta t} B_u = C_{dl}. \qquad (77.16)$$

From Equations (77.15) and (77.16) we get

$$B_u = \frac{ur \cdot C_{dl} - dl \cdot C_{ur}}{(ur - dl)\, e^{r_h \Delta t}}, \qquad (77.17)$$

$$\Delta_u = \frac{C_{ur} - C_{dl}}{(ur - dl)\, S}, \quad \text{the hedge ratio under the greatest volatility.} \qquad (77.18)$$

Substituting Equations (77.17) and (77.18) for $\Delta_u$ and $B_u$ in $C_u = \Delta_u S + B_u$ yields

$$C_u = \frac{C_{ur} - C_{dl}}{ur - dl} + \frac{ur \cdot C_{dl} - dl \cdot C_{ur}}{(ur - dl)\, e^{r_h \Delta t}} = \frac{(e^{r_h \Delta t} - dl)\, C_{ur} + (ur - e^{r_h \Delta t})\, C_{dl}}{(ur - dl)\, e^{r_h \Delta t}},$$

where

$$P_u = \frac{e^{r_h \Delta t} - dl}{ur - dl}, \quad 1 - P_u = \frac{ur - e^{r_h \Delta t}}{ur - dl};$$

$$P_m = \frac{e^{r_m \Delta t} - dm}{um - dm}, \quad 1 - P_m = \frac{um - e^{r_m \Delta t}}{um - dm};$$

$$P_d = \frac{e^{r_l \Delta t} - dr}{ul - dr}, \quad 1 - P_d = \frac{ul - e^{r_l \Delta t}}{ul - dr}.$$

The probabilities of the three up movements are $P_u$, $P_m$, and $P_d$, while the probabilities of the three down movements are $1 - P_u$, $1 - P_m$, and $1 - P_d$, respectively. In the above equations, $C_u$, $C_m$, and $C_d$ are the right, medium, and left values of the current fuzzy option price, so we obtain the fuzzy binomial OPM under the greatest volatility. In a one-step fuzzy binomial OPM, the current call price is the triangular fuzzy number $(C_u, C_m, C_d)$ with

$$C_u = e^{-r_l \Delta t}\, [P_u\, C_{ur} + (1 - P_u)\, C_{dl}].$$

Similarly, the fuzzy binomial OPM under the medium and the smallest volatility is

$$C_m = e^{-r_m \Delta t}\, [P_m\, C_{um} + (1 - P_m)\, C_{dm}], \qquad C_d = e^{-r_h \Delta t}\, [P_d\, C_{ul} + (1 - P_d)\, C_{dr}].$$
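A minimal Python sketch of this one-step construction follows. The inputs ($S_0 = X = 100$, $\sigma = 0.20$, $\delta = 0.10$, and the triangular rate $(0.04, 0.05, 0.06)$) are illustrative assumptions, not figures from the chapter.

```python
import numpy as np

S0, X, dt = 100.0, 100.0, 1.0
sigma, delta = 0.20, 0.10                 # volatility and its fuzzy spread
rl, rm, rh = 0.04, 0.05, 0.06             # triangular riskless rate (r_l, r_m, r_h)

ur = np.exp(sigma * (1 + delta) * np.sqrt(dt))   # up factor, greatest volatility
um = np.exp(sigma * np.sqrt(dt))                 # up factor, medium volatility
ul = np.exp(sigma * (1 - delta) * np.sqrt(dt))   # up factor, smallest volatility
dr, dm, dl = 1 / ul, 1 / um, 1 / ur              # matching down factors

def call_payoff(s):
    return max(s - X, 0.0)                       # payoff at maturity

Cur, Cum, Cul = call_payoff(S0*ur), call_payoff(S0*um), call_payoff(S0*ul)
Cdr, Cdm, Cdl = call_payoff(S0*dr), call_payoff(S0*dm), call_payoff(S0*dl)

Pu = (np.exp(rh * dt) - dl) / (ur - dl)          # weights derived above
Pm = (np.exp(rm * dt) - dm) / (um - dm)
Pd = (np.exp(rl * dt) - dr) / (ul - dr)

Cu = np.exp(-rl * dt) * (Pu * Cur + (1 - Pu) * Cdl)   # right value
Cm = np.exp(-rm * dt) * (Pm * Cum + (1 - Pm) * Cdm)   # middle value
Cd = np.exp(-rh * dt) * (Pd * Cul + (1 - Pd) * Cdr)   # left value
print((Cd, Cm, Cu))                              # triangular fuzzy call price
```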
77.4.2 Two-Step Fuzzy Binomial OPM

The two-step fuzzy OPM can be processed further, as follows. The fuzzy stock prices are combined at $t = 3$; different paths of the stock price may generate the same values. The numbers of nodes at the different periods are inferred later. Next, we use the stock price at $t = 3$ and the exercise price to compute the option price; that is, we move from the stock price at $t = 3$ to the call price at $t = 3$. Having obtained the call price from the stock price, we then infer the call price from $t = 3$ back to $t = 2$ and, by the same process, obtain the call price at $t = 1$. Finally, a fuzzy ranking can be defuzzified, and the BNP (best nonfuzzy performance) value of the fuzzy number $\tilde{R}_i$ can be found using the following equation, where $U\tilde{R}_i$, $M\tilde{R}_i$, and $L\tilde{R}_i$ denote the upper, middle, and lower values of $\tilde{R}_i$:

$$BNP_i = \frac{(U\tilde{R}_i - L\tilde{R}_i) + (M\tilde{R}_i - L\tilde{R}_i)}{3} + L\tilde{R}_i, \quad \forall i. \qquad (77.19)$$

After using Equation (77.19), the call price at each node is again a single triangular fuzzy number. We can follow the same process to get a triangular fuzzy number for the current call price at $t = 2$, and we can compute the interval value of the call price at $t = 1$.
77.4.3 N-Step Fuzzy Binomial OPM

To obtain the option price from the stock price in the general case, the details and the inference process are derived and explained below.
Step 1: Estimate $\sigma$ and $\delta$ to compute $S_{ur}$, $S_{um}$, $S_{ul}$, $S_{dr}$, $S_{dm}$, and $S_{dl}$. We estimate $\sigma$ and $\delta$ to compute the volatility. In other words, $\sigma(1+\delta)$ is the volatility for the right value of the up movement, $\sigma(1-\delta)$ is the volatility for the left value of the up movement, and $S_{ur}\,(= ur \cdot S)$, $S_{um}\,(= um \cdot S)$, and $S_{ul}\,(= ul \cdot S)$ are the degrees of the up movement. In the down movement, $\sigma(1-\delta)$ is the volatility for the right value of the down movement, $\sigma(1+\delta)$ is the volatility for the left value of the down movement, and $S_{dr}\,(= dr \cdot S)$, $S_{dm}\,(= dm \cdot S)$, and $S_{dl}\,(= dl \cdot S)$ are the degrees of the down movement.

Step 2: From the stock price at $t = n$ to the call price at $t = n$. We use the stock price at $t = n$ and the exercise price to compute the option price. The process is the same as in the CRR model: $C_n = \max(S_n - X, 0)$, where $C_n$ is the call price at time $n$, $S_n$ is the stock price at time $n$, and $X$ is the exercise price.

Step 3: The call price from $t = n$ to $t = n - 1$. Here we integrate the fuzzy set with the CRR model and use different probabilities for the different volatilities and riskless interest rates. $C_u$, $C_m$, and $C_d$ are the largest, middle, and smallest values of the fuzzy option price. Each node of the call price has three call prices, not just a single one as in the CRR model. So we use the call prices $C_{ur}$ and $C_{dl}$ for the greatest volatility to get $C_u = e^{-r_l \Delta t}[P_u C_{ur} + (1 - P_u) C_{dl}]$; $C_{um}$ and $C_{dm}$ for the medium volatility to get $C_m = e^{-r_m \Delta t}[C_{um} P_m + C_{dm}(1 - P_m)]$; and $C_{ul}$ and $C_{dr}$ for the smallest volatility to get $C_d = e^{-r_h \Delta t}[C_{ul} P_d + C_{dr}(1 - P_d)]$.

Step 4: The call price from $t = n - 1$ to $t = n - 2$. We apply the same principle as for the call price from $t = n$ to $t = n - 1$, but now there are three call prices at each node at $t = n - 1$. We therefore compute the BNP value of the triangular fuzzy number, so that a single triangular fuzzy number represents the call price at every node. If we simply repeated the step from $t = n$ to $t = n - 1$, the value at every node would be a multiple call price.

Step 5: The call price from $t = n - 2$ to $t = n - 3, \ldots, t = 1$. The same principle applies as for the call price from $t = n - 1$ to $t = n - 2$. Computing the BNP value gives a reasonable call price from Equation (77.19), which can be specialized to Equation (77.20). Hence, we obtain a single triangular fuzzy number for the current call price, not multiple solutions:

$$C_t = \frac{(C_u - C_d) + (C_m - C_d)}{3} + C_d. \qquad (77.20)$$
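The backward recursion can be sketched as follows. To stay short, the sketch builds the terminal payoffs on the medium-volatility lattice only and collapses each node with the BNP value of Equation (77.20); this simplifies the chapter's full construction, and all parameter values are illustrative.

```python
import numpy as np

def fuzzy_crr_call(S0, X, sigma, delta, rates, T, n):
    """Simplified n-step fuzzy CRR call price; BNP-collapsed at each step."""
    rl, rm, rh = rates
    dt = T / n
    ur = np.exp(sigma * (1 + delta) * np.sqrt(dt))
    um = np.exp(sigma * np.sqrt(dt))
    ul = np.exp(sigma * (1 - delta) * np.sqrt(dt))
    dr, dm, dl = 1/ul, 1/um, 1/ur
    Pu = (np.exp(rh * dt) - dl) / (ur - dl)
    Pm = (np.exp(rm * dt) - dm) / (um - dm)
    Pd = (np.exp(rl * dt) - dr) / (ul - dr)
    # terminal payoffs on the medium-volatility lattice (a simplification)
    S_T = S0 * um ** np.arange(n, -1, -1) * dm ** np.arange(0, n + 1)
    C = np.maximum(S_T - X, 0.0)
    for _ in range(n):
        Cu = np.exp(-rl*dt) * (Pu * C[:-1] + (1 - Pu) * C[1:])  # right values
        Cm = np.exp(-rm*dt) * (Pm * C[:-1] + (1 - Pm) * C[1:])  # middle values
        Cd = np.exp(-rh*dt) * (Pd * C[:-1] + (1 - Pd) * C[1:])  # left values
        C = ((Cu - Cd) + (Cm - Cd)) / 3 + Cd                    # BNP, Eq. (77.20)
        root = (Cd, Cm, Cu)
    return C[0], tuple(v[0] for v in root)       # defuzzified price, root triple

bnp, (cd, cm, cu) = fuzzy_crr_call(100.0, 100.0, 0.20, 0.10,
                                   (0.04, 0.05, 0.06), 1.0, 50)
print((cd, cm, cu), bnp)
```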
77.5 An Example of Real Options

In real options, the options involve real assets. To have a real option means to have the possibility, for a certain period, to either choose for or against making an investment decision. Real options can be valued using analogues of the option theories that have been developed for financial options, but they are not necessarily the same. Real options concern the strategic decisions of a company, where the degrees of freedom are limited to the capabilities of the company. Different stakeholders play a role in these strategic decisions, especially if the resources needed for an investment are significant and the continuity of the company is thereby at stake. In addition, the real option approach is quite different from traditional discounted cash flow investment approaches, under which it is very hard to make a decision when there is uncertainty about the exact outcome of the investment. Since those methods ignore the value of flexibility and discount heavily for external uncertainty, many interesting and innovative activities and projects are cancelled because of the uncertainties.

We shall introduce a real option rule in a fuzzy setting (Carlsson and Fullér 2003), where the present values of expected cash flows and expected costs are estimated by triangular fuzzy numbers, and the optimal exercise time is determined by using the fuzzy real option model. Usually, the present value of expected cash flows cannot be characterized by a single number. Managers are, however, able to estimate the present value of expected cash flows by using a triangular possibility distribution of the form $\tilde{F} = [f_l, f_m, f_r]$; that is, the most possible value of the present value of expected cash flows is $f_m$ (the core of the triangular fuzzy number $\tilde{F}$), $f_r$ is the greatest value, and $f_l$ is the smallest value for the present value of expected cash flows. In a similar manner, one can estimate the expected costs by using a triangular possibility distribution of the form $\tilde{X} = [x_l, x_m, x_r]$; that is, the most possible value of the expected costs is $x_m$ (the core of the triangular fuzzy number $\tilde{X}$), $x_r$ is the greatest value, and $x_l$ is the smallest value for the expected costs. In addition, these values are also influenced by such factors in the financial market as the riskless interest rate (discount rate) and the volatility of the cash inflow. We use historical data to estimate these values by means of triangular possibility distributions of the form

$$\tilde{R} = [r_l, r_m, r_r], \qquad \tilde{\delta} = [\delta_l, \delta_m, \delta_r], \qquad \tilde{\sigma} = [\sigma_l, \sigma_m, \sigma_r].$$

That is, the most possible values of the discount rate, the risk-adjusted discount rate, and the volatility of the cash inflow are $r_m$, $\delta_m$, and $\sigma_m$ (the cores of the triangular fuzzy numbers $\tilde{R}$, $\tilde{\delta}$, and $\tilde{\sigma}$); $r_r$ is the greatest and $r_l$ the smallest value of the discount rate; $\delta_r$ is the greatest and $\delta_l$ the smallest value of the risk-adjusted discount rate; and $\sigma_r$ is the greatest and $\sigma_l$ the smallest value of the volatility of the cash inflow.

The information needed for capital budgeting is generally not known with certainty. The sources of uncertainty may be the net cash inflow, the discount rate, the life of the project, and so forth. Hence, in addition to considering the fuzzy volatility $\tilde{\sigma}$ and the fuzzy investment cost $\tilde{X}$, we also consider the fuzzy cash inflow $\tilde{F}$ and the fuzzy rates $\tilde{\delta}$ and $\tilde{R}$. In these circumstances we suggest the following formula for computing fuzzy real option values:

$$FROV(\tilde{F}, \tilde{X}, \tilde{\delta}, \tilde{R}, \tilde{\sigma}, T) = \tilde{F} \otimes e^{-\tilde{\delta} \otimes T} \otimes N(\tilde{d}_1) \ominus \tilde{X} \otimes e^{-\tilde{R} \otimes T} \otimes N(\tilde{d}_2),$$

where

$$\tilde{d}_1 = \frac{\ln(\tilde{F} \oslash \tilde{X}) \oplus (\tilde{R} \ominus \tilde{\delta} \oplus \tilde{\sigma}^2/2) \otimes T}{\tilde{\sigma} \otimes \sqrt{T}}, \qquad \tilde{d}_2 = \tilde{d}_1 \ominus \tilde{\sigma} \otimes \sqrt{T},$$

and where $\tilde{F}$ denotes the probabilistic mean value of the present value of expected cash flows, $\tilde{X}$ stands for the probabilistic mean value of expected costs, $\tilde{\sigma}(\tilde{F})$ is the variance of the present value of expected cash flows, $\tilde{\delta}$ stands for the risk-adjusted discount rate, $\tilde{R}$ stands for the discount rate, and $T$ stands for the time interval. Using the above rules for arithmetic operations on triangular fuzzy numbers, we find

$$FROV = (f_l, f_m, f_r) \otimes e^{-\tilde{\delta} \otimes T} \otimes N(\tilde{d}_1) \ominus (x_l, x_m, x_r) \otimes e^{-\tilde{R} \otimes T} \otimes N(\tilde{d}_2) = \big( f_l\, N(\tilde{d}_1)\, e^{-\tilde{\delta} T} - x_r\, N(\tilde{d}_2)\, e^{-\tilde{R} T},\ \ f_m\, N(\tilde{d}_1)\, e^{-\tilde{\delta} T} - x_m\, N(\tilde{d}_2)\, e^{-\tilde{R} T},\ \ f_r\, N(\tilde{d}_1)\, e^{-\tilde{\delta} T} - x_l\, N(\tilde{d}_2)\, e^{-\tilde{R} T} \big),$$

where

$$\tilde{d}_1 = \left[ \frac{\ln(f_l/x_r) + r_l T - \delta_r T + \sigma_l^2 T/2}{\sigma_r \sqrt{T}},\ \frac{\ln(f_m/x_m) + r_m T - \delta_m T + \sigma_m^2 T/2}{\sigma_m \sqrt{T}},\ \frac{\ln(f_r/x_l) + r_r T - \delta_l T + \sigma_r^2 T/2}{\sigma_l \sqrt{T}} \right], \qquad \tilde{d}_2 = \tilde{d}_1 \ominus \tilde{\sigma} \otimes \sqrt{T}.$$

In the following we generalize the probabilistic decision rule for the optimal investment strategy to a fuzzy setting: where the maximum deferral time is $T$, make the investment (exercise the option) at the time $t^*$, $0 \le t^* \le T$, for which the option value $\tilde{C}_{t^*}$ attains its maximum:

$$\tilde{C}_{t^*} = \max_{t = 0, 1, \ldots, T} \tilde{C}_t = \tilde{F}_t \otimes e^{-\tilde{\delta} \otimes t} \otimes N(\tilde{d}_1) \ominus \tilde{X} \otimes e^{-\tilde{R} \otimes t} \otimes N(\tilde{d}_2),$$

where

$$\tilde{F}_t = PV(c\tilde{f}_0, c\tilde{f}_1, \ldots, c\tilde{f}_T; \tilde{\delta}) - PV(c\tilde{f}_0, c\tilde{f}_1, \ldots, c\tilde{f}_t; \tilde{\delta}) = PV(c\tilde{f}_{t+1}, \ldots, c\tilde{f}_T; \tilde{\delta});$$

that is,

$$\tilde{F}_t = c\tilde{f}_0 + \sum_{j=1}^{T} \frac{c\tilde{f}_j}{(1 + \tilde{\delta})^j} - c\tilde{f}_0 - \sum_{j=1}^{t} \frac{c\tilde{f}_j}{(1 + \tilde{\delta})^j} = \sum_{j=t+1}^{T} \frac{c\tilde{f}_j}{(1 + \tilde{\delta})^j},$$

where $c\tilde{f}_j$ denotes the expected (fuzzy) cash flow at time $j$ and $\tilde{\delta}$ is the risk-adjusted discount rate (or required rate of return on the project). However, finding a maximizing element of the set $\{\tilde{C}_0, \tilde{C}_1, \ldots, \tilde{C}_T\}$ is not an easy task because it involves the ranking of triangular fuzzy numbers. In our computerized implementation we have employed the following function to determine the expected value of the fuzzy real option $\{\tilde{C}_l, \tilde{C}_m, \tilde{C}_r\}$ of triangular form:

$$E(C_t) = \frac{(C_r - C_l) + (C_m - C_l)}{3} + C_l,$$

where $C_r$, $C_m$, and $C_l$ are the largest, middle, and smallest values of the fuzzy real option value.
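For illustration, the FROV expansion above can be evaluated component-wise on triangular numbers. The sketch below pairs the extreme values of the rates crosswise so that the left component is the smallest, which is an assumption about how the fuzzy exponents resolve; all triangular inputs are illustrative, not figures from the chapter.

```python
import numpy as np
from scipy.stats import norm

F = (90.0, 110.0, 130.0)    # PV of expected cash flows (f_l, f_m, f_r)
Xc = (95.0, 100.0, 105.0)   # expected costs (x_l, x_m, x_r)
R = (0.04, 0.05, 0.06)      # discount rate (r_l, r_m, r_r)
dlt = (0.01, 0.02, 0.03)    # risk-adjusted rate (delta_l, delta_m, delta_r)
sig = (0.15, 0.20, 0.25)    # volatility of cash inflow (sigma_l, sigma_m, sigma_r)
T = 2.0

def d1_comp(f, x, r, d, s_num, s_den):
    """One component of the d~1 triple, per the expansion above."""
    return (np.log(f / x) + r*T - d*T + s_num**2 * T / 2) / (s_den * np.sqrt(T))

d1 = (d1_comp(F[0], Xc[2], R[0], dlt[2], sig[0], sig[2]),
      d1_comp(F[1], Xc[1], R[1], dlt[1], sig[1], sig[1]),
      d1_comp(F[2], Xc[0], R[2], dlt[0], sig[2], sig[0]))
d2 = tuple(d1[i] - sig[2 - i] * np.sqrt(T) for i in range(3))  # d~2 = d~1 (-) sig*sqrt(T)

# FROV components: left is most pessimistic, right most optimistic (assumed pairing)
frov = (F[0]*norm.cdf(d1[0])*np.exp(-dlt[2]*T) - Xc[2]*norm.cdf(d2[0])*np.exp(-R[0]*T),
        F[1]*norm.cdf(d1[1])*np.exp(-dlt[1]*T) - Xc[1]*norm.cdf(d2[1])*np.exp(-R[1]*T),
        F[2]*norm.cdf(d1[2])*np.exp(-dlt[0]*T) - Xc[0]*norm.cdf(d2[2])*np.exp(-R[2]*T))

e_frov = ((frov[2] - frov[0]) + (frov[1] - frov[0])) / 3 + frov[0]  # E(C_t) above
print(frov, e_frov)
```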
77.6 Fuzzy Regression

Traditional regression is based on the assumption that uncertainty is due to the randomness of the variables; in fuzzy regression, the uncertainty is assumed to come from membership grades. Ordinary regression techniques deal only with crisp observed data, but a fuzzy regression model can handle subjective and non-crisp data. In this section, we propose applications of fuzzy regression techniques to financial problems.

Some articles in the financial literature suggest using fuzzy numbers to model interest rate uncertainty, but they do not explain how to quantify these rates with fuzzy numbers. Articles by Sánchez and Gómez (2003a, b, 2004) addressed the subject of fuzzy regression (FR) and the term structure of interest rates (TSIR). Following Tanaka et al. (1982), their regression models included a fuzzy output, fuzzy coefficients, and a non-fuzzy input vector. The fuzzy components were assumed to be triangular fuzzy numbers (TFNs). The basic idea was to minimize the fuzziness of the model by minimizing the total support of the fuzzy coefficients, subject to including all the given data. Key components of the Sánchez and Gómez methodology included constructing a discount function from a linear combination of splines, the coefficients of which were assumed to be TFNs, and using the minimum and maximum negotiated prices of fixed income assets to represent the fuzziness of the dependent variable observations.

Conventional regression analysis is based on the conception that the uncertainty in the observed data comes from their random character. Here we consider both the random property and fuzzy perception, and construct the regression model using fuzzy random variables. We define the response as a fuzzy random variable and call the result the fuzzy regression model. A fuzzy regression model is used for evaluating the functional relationship between the dependent and independent variables in a fuzzy environment. Most fuzzy regression models assume fuzzy outputs and parameters but non-fuzzy (crisp) inputs. In general, there are two approaches to the analysis of fuzzy regression models: linear-programming-based methods and fuzzy least-squares methods. In 1992, Sakawa and Yano considered fuzzy linear regression models with fuzzy outputs, fuzzy parameters, and fuzzy inputs, and formulated multiobjective programming methods for the model estimation along with a linear-programming-based approach.

Fuzzy linear regression builds on standard (classical) statistical linear regression, which takes the form

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, 2, \ldots, m, \qquad (77.21)$$

where the dependent (response) variable $y_i$, the independent (explanatory) variable $x_i$, and the coefficients (parameters) $\beta_0$ and $\beta_1$ are crisp values, and $\varepsilon_i$ is a crisp random error term with $E(\varepsilon_i) = 0$, variance $\sigma^2(\varepsilon_i) = \sigma^2$, and covariance $\sigma(\varepsilon_i, \varepsilon_j) = 0,\ \forall i, j,\ i \neq j$. Although statistical regression has many applications, problems can occur in the following situations:

1. Difficulties verifying the distribution assumptions;
2. The number of observations is inadequate (a small data set);
3. Ambiguity of events or of the degree to which they occur;
4. Vagueness in the relationship between the input and output variables;
5. Inaccuracy and distortion introduced by linearization.

Thus, statistical regression is problematic if it is difficult to verify that the error is normally distributed, if the data set is too small, if there is ambiguity associated with the event, if the linearity assumption is inappropriate, or if there is vagueness in the relationship between the independent and dependent variables. These are the very situations that fuzzy regression is meant to address. There are two general (not mutually exclusive) ways to develop a fuzzy regression model: (1) models where the relationship between the variables is fuzzy, and (2) models where the variables themselves are fuzzy. An obvious way to bring fuzzy regression more in line with statistical regression is to model the fuzzy regression along the same lines. In the case of a single explanatory variable, we start with the standard linear regression model, which in a comparable fuzzy model takes the form

$$\tilde{Y}_i = \beta_0 + \beta_1 \tilde{X}_i + \tilde{\varepsilon}_i, \quad i = 1, 2, \ldots, m. \qquad (77.22)$$

Rearranging the terms in Equation (77.22),

$$\tilde{\varepsilon}_i = \tilde{Y}_i - \beta_0 - \beta_1 \tilde{X}_i, \quad i = 1, 2, \ldots, m. \qquad (77.23)$$

From a least squares perspective, the problem then becomes

$$\min \tilde{S} = \sum_{i=1}^{m} (\tilde{Y}_i - b_0 - b_1 \tilde{X}_i)^2. \qquad (77.24)$$

There are a number of ways to implement fuzzy least-squares regression (FLSR), but the two basic approaches are FLSR using distance measures and FLSR using compatibility measures. The studies of Sánchez and Gómez provide some interesting insights into the use of fuzzy regression for the analysis of the TSIR. Their methodology, however, relies on possibilistic regression, and some of its limitations can be circumvented using FLSR techniques.
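To make the linear-programming-based (possibilistic) approach concrete, the following Python sketch fits a Tanaka-style model with symmetric triangular coefficients (center $c_j$, spread $s_j \ge 0$): it minimizes the total spread subject to every observation lying inside the predicted fuzzy interval. The data and the inclusion level $h$ are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
h = 0.0                                    # inclusion level in [0, 1)

X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
A = np.abs(X) * (1 - h)                    # spread weights per observation

# decision variables: [c0, c1, s0, s1]
cost = np.concatenate([np.zeros(2), A.sum(axis=0)])   # total spread to minimize
# inclusion constraints: Xc - As <= y  and  -Xc - As <= -y
A_ub = np.block([[X, -A], [-X, -A]])
b_ub = np.concatenate([y, -y])
bounds = [(None, None)] * 2 + [(0, None)] * 2          # centers free, spreads >= 0

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
c0, c1, s0, s1 = res.x
print(f"Y ~ ({c0:.3f}, {s0:.3f}) + ({c1:.3f}, {s1:.3f}) x")
```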
77.7 Conclusion

The impact of implicit fuzziness is inevitable because of the subjective assessments made by investors in finance research. How high or low each criterion is can be described by linguistic terms, which can in turn be expressed as fuzzy numbers. The concept of fuzzy numbers attempts to deal with real problems by means of possibility, and it is much easier to comprehend. Because vague concepts are frequently present in decision data, crisp values are inadequate for modeling real-life situations, and we cannot predict values exactly. In fact, there exists a value interval, which is more elastic and corresponds better to actual situations. As financial markets fluctuate from time to time, the input data in financial formulas cannot always be specified in a precise sense. Fuzzy set theory provides a useful tool for conquering this kind of impreciseness, thereby eliminating possible human negligence regarding the content or structure of financial markets. It is important that researchers be familiar with these techniques as well. If this article can help in this regard, it will have served its purpose.
References

Amin, K. I. 1993. "Jump diffusion option valuation in discrete time." Journal of Finance 48(5), 1833–1863.
Barone-Adesi, G. and R. E. Whaley. 1987. "Efficient analytic approximation of American option values." Journal of Finance 42(2), 301–320.
Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81(3), 637–654.
Brennan, M. J. and E. S. Schwartz. 1977. "The valuation of American put options." Journal of Finance 32(2), 449–462.
Carlsson, C. and R. Fullér. 2003. "A fuzzy approach to real option valuation." Fuzzy Sets and Systems 139, 297–312.
Chui-Yu, C. and S. P. Chan. 1998. "Capital budgeting decisions with fuzzy projects." The Engineering Economist 43(2), 125–150.
Cox, J. C. and S. A. Ross. 1975. Notes on option pricing I: constant elasticity of variance diffusion. Working paper, Stanford University.
Cox, J. C., S. A. Ross, and M. Rubinstein. 1979. "Option pricing: a simplified approach." Journal of Financial Economics 7(3), 229–263.
Dubois, D. and H. Prade. 1988. Possibility theory. Plenum, New York.
Geske, R. and H. E. Johnson. 1984. "The American put valued analytically." Journal of Finance 39(5), 1511–1524.
Hull, J. and A. White. 1987. "The pricing of options on assets with stochastic volatilities." Journal of Finance 42(2), 281–300.
Kwakernaak, H. 1978. "Fuzzy random variables I: definitions and theorems." Information Sciences 15, 1–29.
Lee, C. F., G. H. Tzeng, and S. Y. Wang. 2005a. "A new application of fuzzy set theory to the Black-Scholes option pricing model." Expert Systems with Applications 29(2), 330–342.
Lee, C. F., G. H. Tzeng, and S. Y. Wang. 2005b. "A fuzzy set approach to generalize the CRR model: an empirical analysis of S&P 500 index options." Review of Quantitative Finance and Accounting 25(3), 255–275.
Lee, J. C., C. F. Lee, and K. C. J. Wei. 1991. "Binomial option pricing with stochastic parameters: a beta distribution approach." Review of Quantitative Finance and Accounting 1(3), 435–448.
Puri, M. L. and D. A. Ralescu. 1986. "Fuzzy random variables." Journal of Mathematical Analysis and Applications 114, 409–422.
Ribeiro, R. A., H. J. Zimmermann, R. R. Yager, and J. Kacprzyk (Eds.). 1999. Soft computing in financial engineering. Physica-Verlag, Heidelberg.
Sánchez, J. de A. and A. T. Gómez. 2003a. "Applications of fuzzy regression in actuarial analysis." Journal of Risk and Insurance 70(4), 665–699.
Sánchez, J. de A. and A. T. Gómez. 2003b. "Estimating a term structure of interest rates for fuzzy financial pricing by using fuzzy regression methods." Fuzzy Sets and Systems 139(2), 313–331.
Sánchez, J. de A. and A. T. Gómez. 2004. "Estimating a fuzzy term structure of interest rates using fuzzy regression techniques." European Journal of Operational Research 154, 804–818.
Scott, L. 1987. "Option pricing when variance changes randomly: theory, estimation and an application." Journal of Financial and Quantitative Analysis 22(4), 419–438.
Tanaka, H., S. Uejima, and K. Asai. 1982. "Linear regression analysis with fuzzy model." IEEE Transactions on Systems, Man and Cybernetics 12(6), 903–907.
Wiggins, J. B. 1987. "Option values under stochastic volatility: theory and empirical evidence." Journal of Financial Economics 19(2), 351–372.
Wu, H.-C. 2004. "Pricing European options based on the fuzzy pattern of Black-Scholes formula." Computers & Operations Research 31, 1069–1081.
Zadeh, L. A. 1965. "Fuzzy sets." Information and Control 8, 338–353.
Zadeh, L. A. 1978. "Fuzzy sets as a basis for a theory of possibility." Fuzzy Sets and Systems 1, 3–28.
Chapter 78

Hedonic Regression Analysis in Real Estate Markets: A Primer

Ben J. Sopranzetti

Abstract This short primer provides an overview of the nature and variety of hedonic pricing models that are employed in the market for real estate. It explores the history of hedonic modeling and summarizes the field's utility-theory-based, microeconomic foundations. It also provides empirical examples of each of the three principal methodologies and a discussion of, and potential solutions for, common problems associated with hedonic modeling.

Keywords Hedonic models · Regression · Real estate
78.1 Introduction

Hedonic modeling first originated as a method for valuing the demand and the price of farm land.1 Although Court (1939) is widely considered to be the father of hedonic modeling, his paper had nothing to do with real estate; instead, Court created a hedonic pricing index for automobiles. Regardless of how they are used, hedonic regressions deconstruct the price of an asset into the asset's component parts and then use some form of ordinary least squares regression analysis to examine how each individual piece uniquely contributes to the item's overall value.

The consumer price index is probably the most famous example of the use of hedonic regressions as a control mechanism for differences in the quality of products over time. The consumer price index measures the change over time in the price of a bundle of goods. But if the quality of the goods in the bundle changes over time, then obtaining a high-quality prediction of the value of the index at some future point in time can be problematic. For example, imagine trying to predict the price of an automobile today based on pricing information from 1963. A Chevrolet Corvette costs substantially more money today than it did back in 1963. Some of the increase in the car's price is due to inflation, but another part of the price increase is because Corvettes are far safer, faster, and lighter today than they were back then. Corvettes nowadays enjoy a vast technical superiority over their vintage forebears. The technical improvements clearly add value; as we will see below, they add "hedonic utility." So when trying to predict the price of a new Corvette with a turbo-charged engine, air conditioning, six-speed transmission, airbags, and a sport-tuned suspension, one cannot examine only the simple inflationary price increase of a Corvette; in addition, one must examine the price increases of the additional individual improvements.

Although the consumer price index might arguably be the most famous use of hedonic modeling techniques, hedonic pricing models are also widely utilized to price other items, such as electronics, clothing and, in particular, real estate (the focus of this paper), where they are most often utilized to correct for the heterogeneity among properties and houses. Since each house has idiosyncratic characteristics that make it unique, estimating demand or prices from real estate data can be challenging. Rather than pricing a given house or property directly, a researcher can deconstruct the house and property into their value-adding components, such as lot size, square footage, number of bathrooms, number of bedrooms, neighborhood quality, and so forth. A well-specified hedonic model will estimate the contribution of each of these features to the total price separately. If it is the price that is estimated, then the hedonic model is called an "additive" model. If, instead, the elasticity is estimated, then it is called a "log" model. This short primer will explore in some detail the nature and variety of hedonic pricing models, and should provide a solid foundation for any researcher interested in employing this widely utilized empirical technique.
78.2 The Theoretical Foundation

Although many empirical papers using hedonic modeling techniques were published in the years that followed Court's
B.J. Sopranzetti Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]
1 Haas (1922) and Wallace (1926) both use hedonic-style models to value farmland in the midwestern U.S.
work, Lancaster's (1966) seminal paper is the first attempt to create a theoretical foundation for hedonic modeling. To this end, Lancaster presented a groundbreaking theory of hedonic utility. Lancaster argues that it is not necessarily a good itself that creates utility, but instead the individual "characteristics" of a good that create utility. Specifically, an item's utility is simply the aggregate of the utilities of each of its individual characteristics. For example, the utility that comes from owning an expensive car comes not so much from the car itself, but from the fact that it provides not only transportation but also fast acceleration, enhanced safety, attractive styling, increased prestige, and so forth. Furthermore, he argues that items can be arranged into groups based on the characteristics they contain. Consumers make their purchasing decisions within a group based on the number of characteristics a good possesses per unit cost. For example, people make their home purchase decisions based on the number of bedrooms, bathrooms, and so forth.

Although Lancaster is the first to discuss hedonic utility, he says nothing about pricing or pricing models. Rosen (1974) is the first to present a theory of hedonic pricing. Rosen argues that an item can be valued as the sum of its utility-generating characteristics; that is, an item's total price should be the sum of the individual prices of its characteristics. This implies that an item's price can be regressed on the characteristics to determine the way in which each characteristic uniquely contributes to the price. Although Rosen did not formally present a functional form for the hedonic pricing function, his model clearly implies a nonlinear pricing structure. For a more thorough review of the extant literature, please see the excellent surveys by Follain and Jimenez (1985) and Sheppard (1999).
78.3 The Data

The data are 7,088 observations of real estate transactions of properties located in the Klein school district of suburban Houston between January 1, 1992 and December 31, 1995. The data set provides information on property, home, and transaction characteristics. Property characteristics include the lot size in square feet, the distance to the central business district, and information on neighborhood submarkets within the overall Klein area. Home characteristics include the size of the house in square feet, the year built, the number of bathrooms, the number of bedrooms, a dummy variable for the existence of a pool, and whether the house has any known defects. Transaction characteristics include the list price, the transaction price, the list date, the number of days that the property remained on the market prior to being sold, whether the property is sold "as is," whether the home was sold by a financial institution that had previously foreclosed on the property, and the identity of the listing and selling real estate brokers.

From the initial 7,088 observations, 146 observations are deleted due to missing or obviously incorrect data. In addition, several screens are employed to increase the homogeneity of the properties in our sample and to eliminate observations in which data may not have been entered correctly. Properties are omitted if their

1. age exceeds 30 years (24 observations);
2. lot size is smaller than 5,000 or larger than 50,000 square feet (165 observations);
3. living space exceeds 4,000 square feet (362 observations); or
4. number of bathrooms (full and half) exceeds six (six observations).

The final sample includes 6,385 observations. Table 78.1 reports the summary statistics for the variables.

Table 78.1 Summary statistics

Name      Mean     St. Dev   Minimum   Maximum
Sale      98,114   44,438    18,000    385,000
FT        2,250    668       819       3,994
Age       12.60    5.90      0         28
Lot       9,563    3,972     5,000     49,933
Bed       3.63     0.60      1         6
Bath      2.22     0.44      1         5.1
Pool      0.14     0.35      0         1
FC        0.05     0.21      0         1
Asis      0.01     0.12      0         1
D1dummy   0.03     0.16      0         1
CBD       20.26    3.50      10.61     28.86
New       0.02     0.13      0         1
Summer    0.41     0.49      0         1
78.4 The Linear Model

The basic additive hedonic equation is one where the value of an asset is regressed against the characteristics that determine its value. This linear model is appropriate when there are heterogeneous products and heterogeneous buyers and when the heterogeneous items to be valued can be easily replaced or restocked once they have been purchased; that is, when there is a continuous, uninterrupted supply of the item to be priced. In this case, the pricing model of the item is very simply the sum of the prices of its component parts, where

$$\text{Value} = f(S, N, L, C, T),$$

where
S represents the structural characteristics of the home and property, for example, square footage, property size, and number of bedrooms;
N represents the neighborhood characteristics;
L represents the location within a given market;
C represents the contract conditions, for example, whether the property is sold as is, or whether there is a condo fee; and
T represents the date or time at which the transaction price is observed.

For ease of notation, we allow the matrix X to represent the combination of the individual vectors S, N, L, C, and T:

$$\text{VALUE} = X\beta + \varepsilon. \qquad (78.1)$$

Thus an item's expected price is the characteristics $X$ times $\beta$, where $\beta$ represents a vector of marginal prices.
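For illustration, Equation (78.1) can be estimated with any OLS routine. The Python sketch below uses a handful of hypothetical transaction records; the column names and values are illustrative, not the chapter's Klein-district data.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical transaction records (illustrative names and values)
df = pd.DataFrame({
    "price": [95_000, 120_000, 87_500, 140_000, 110_000, 99_500],
    "sqft":  [1_800,  2_400,   1_600,  2_900,   2_200,   1_950],
    "pool":  [0,      1,       0,      1,       0,       0],
})

X = sm.add_constant(df[["sqft", "pool"]])
hedonic = sm.OLS(df["price"], X).fit()
print(hedonic.params)                 # each beta is a characteristic's marginal price
print(hedonic.predict(X.iloc[:1]))    # fitted value for the first home
```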
78.5 Empirical Specification

78.5.1 The Dependent Variable

In real estate modeling, it is common to use the most recent transaction price as the dependent variable. There are also studies that use rent as the dependent variable, but rents are problematic since different apartments may have different terms in the rental agreement; for example, some might include heat and hot water or parking. One way of dealing with the "rent" problem is to obtain the cost of utilities (or parking) for properties where the utilities are not included in the rental agreement, and add these costs to the base rental price to get an adjusted rental price. Another possibility is to use the actual rental price as the dependent variable and then add a dummy independent variable that equals one if the unit includes utilities or parking and zero otherwise. Using actual transaction prices circumvents these problems, but exposes the researcher to another problem: current transactions may not be representative of the total housing stock, a selection bias.

78.5.2 Independent Variables

One of the principal criticisms of the hedonic modeling of real estate prices is the severity of the omitted variable problem: the coefficient estimates are often not robust to changes in the model's specification. This implies that researchers must be very careful when interpreting the coefficients of a hedonic regression. A sampling of recent hedonic real estate models yields the following common independent variables:

Structural characteristics of the home and property: square footage of the unit, square footage of the property, total number of rooms, total number of bedrooms, total number of bathrooms, the existence of a pool, any known defects, structural type (single family, duplex, condominium, etc.), age, air conditioning, finished basement, fireplaces, garages, etc.
Neighborhood characteristics: quality of the school system, quality of the neighborhood, median salary.
Location within a given market: distance from the central business district, proximity to a train station, distance to the supermarket, distance to schools, flooding area.
Contract conditions: Was the property sold as is? Was it a foreclosure?

For a more complete discussion of the potential explanatory variables, please see Hocking (1976), Leamer (1978), and Amemiya (1980).

78.5.3 Example Using the Linear Model

Table 78.2 presents the results of a linear hedonic pricing model where the dependent variable is the property's transaction price. The independent variables include the square footage of the home and its square, the age of the home and its square, the size of the lot and its square, the number of bedrooms and its square, dummy variables representing the number of bathrooms, the proximity in miles to the central business district, and dummy variables for a pool, a foreclosure, a sale as-is, known defects, whether the home is brand new, and whether the property was listed in the summer.

Table 78.2 OLS regression of transaction price on the independent variables

Variable     Coefficient   T-stat
FT           0.94591       10.33
SQFT         8.65E-04      15.35
Age          3035.3        20.71
SQAGE        59.356        10.25
Lot          1.7881        10.01
SQLOT        1.62E-05      3.59
Bed          42,839        11.40
SQBED        6207.4        12.35
Bath 2       6305.7        2.11
Bath 3       4199.3        5.82
Bath 4       15,936        17.26
Bath 5       16,196        6.26
Pool         11,447        16.85
Foreclose    10,025        9.03
Asis         11,932        6.21
Defect       383.98        0.28
CBD          267.98        4.12
New          3004.3        1.60
Summer       1260.8        2.81
Constant     17,355        2.78
R²           84.3%

The results of the hedonic regression will likely not be surprising to anyone who has ever shopped for a home. The hedonic model has parceled out the value of the home and property into its component parts and has succeeded in explaining 84.3% of the variation in the transaction price. The relationship between real estate transaction prices and the square footage of the home is concave: for small homes, small increases in size have a larger marginal impact than for larger homes. The same is true for the square footage of the property and for the bedrooms. The relationship between transaction price and age is convex, implying that the market price of young houses depreciates at a larger rate than that of older houses. It is not surprising, given the hot summers in Texas, that a pool adds substantial value (in this case $11,447), nor is it surprising that foreclosed-upon properties and properties that are sold as-is tend to sell for less money than other properties. Properties farther from the central business district also tend to sell for higher prices. Interestingly, properties listed in the summer tend to sell for prices that are on average $1,260 higher than those not listed in the summer.

The principal use of a linear hedonic model is to help researchers construct a property's transaction price from its component parts. So, using the above model, a three-year-old, 1,800-square-foot home, with a 200 ft by 200 ft lot, three bedrooms and two baths, with a pool, sold at a foreclosure sale, with no known defects, located seven miles from the central business district, and listed in the summertime would have a predicted transaction price of $102,092.
78.6 The Semi-Log Model

When the item to be priced cannot be easily restocked, for example, a home, then non-linearities arise in the hedonic pricing structure. To deal with this, it is common for researchers to employ the following semi-logarithmic functional form for the hedonic model:

$$\text{VALUE} = e^{X\beta + \varepsilon}, \qquad (78.2)$$

thus

$$\ln \text{VALUE} = X\beta + \varepsilon. \qquad (78.3)$$

In this case, the log of an item's expected price is the sum of its characteristics $X$ times $\beta$, and the marginal price of each individual attribute $x$ is

$$\text{PRICE}(x) = e^{xb}, \qquad (78.4)$$

where $x$ is the current level of the characteristic and $b$ is the regression coefficient. Notice that the semi-log form implies that the price of a given characteristic varies with its level; that is, the prices are non-linear. The semi-log structural hedonic pricing model has several advantages over its linear counterpart. The principal advantage is that it permits the value of a given characteristic (the number of bathrooms, for example) to vary proportionately with the value of other characteristics (the number of bedrooms). This is not the case with a linear model, where a second bathroom adds the same value to a house that has one bedroom as it does to one that has five bedrooms.
78.6.1 Example Using the Semi-Log Model

Table 78.3 presents the results of a semi-log hedonic pricing model where the dependent variable is the log of the property's transaction price. Again, the results are not surprising.

Table 78.3 OLS regression of LOGSALE on the independent variables

Variable     Coefficient   T-stat
FT           6.76E-04      28.68
SQFT         5.57E-08      11.89
Age          2.50E-02      20.48
SQAGE        4.33E-04      8.98
Lot          2.02E-05      13.62
SQLOT        2.70E-10      7.17
Bed          0.3214        10.28
SQBED        4.46E-02      10.66
Bath 2       1.29E-02      0.52
Bath 3       5.32E-02      8.86
Bath 4       0.11094       14.44
Bath 5       9.35E-02      4.35
Pool         0.10416       18.43
Foreclose    0.13635       14.77
Asis         0.18429       11.52
Defect       9.26E-03      0.81
CBD          4.17E-03      7.71
New          2.02E-02      1.29
Summer       1.38E-02      3.69
Constant     9.5387        183.40
R²           87.5%

An advantage of the semi-log form is that the model's coefficients are easily interpreted. The percentage change in the value of the house for a unit change in an independent variable can be represented as $e^b - 1$, where $b$ is the regression coefficient.2 For example, if the coefficient on the variable that represents a pool equals 0.104, then adding a pool to a house would increase its value by $e^{0.104} - 1 = 10.96\%$.
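This conversion is easily automated; a short check of the pool example above:

```python
import numpy as np

b_pool = 0.104                       # semi-log coefficient on the pool dummy
pct = np.exp(b_pool) - 1             # Halvorsen-Palmquist adjustment
print(f"a pool raises the predicted price by {pct:.2%}")   # about 10.96%
```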
78.7 The Box-Cox Model

Although not as widely employed, due to the difficulty of interpreting the coefficients, a more general form of the hedonic pricing model was first presented by Halvorsen and Pollakowski (1981), which applies the seminal work of Box and Cox (1964) to hedonic modeling. The basic Box-Cox transformation is

$$X^{(\lambda)} = \begin{cases} \dfrac{X^{\lambda} - 1}{\lambda}, & \text{if } \lambda \neq 0,\ X > 0 \\ \ln X, & \text{if } \lambda = 0 \end{cases} \qquad (78.5)$$

So the basic form of the Box-Cox model is given by

$$\text{VALUE} = X^{(\lambda)} \beta + \varepsilon. \qquad (78.6)$$

Notice that when $\lambda$ equals one, the Box-Cox structural form reduces to the linear form. A more general form of the Box-Cox model can be expressed as

$$\text{VALUE}^{(\theta)} = \beta_0 + \sum_j \beta_j X_j^{(\lambda)} + \frac{1}{2} \sum_j \sum_k \gamma_{jk}\, X_j^{(\lambda)} X_k^{(\lambda)}. \qquad (78.7)$$

In this case, when $\theta$ and $\lambda$ are both equal to one and the cross-products $\gamma_{jk}$ are all zero, Equation (78.7) reduces to a simple linear regression model. When $\theta$ and $\lambda$ are both equal to zero and the cross-products $\gamma_{jk}$ are also all equal to zero, the model reduces to a straightforward log-log functional form. The Box-Cox model was first introduced into the mainstream finance literature by C. F. Lee in his seminal 1976 paper, which examines the functional form and the dividend effect in the electric utility industry. Other excellent examples of the application of the Box-Cox model in finance include Lee and Kau (1976) and Lee et al. (1980). One of the most innovative uses of Box-Cox hedonic models in real estate has been to back out property depreciation rates for federal tax purposes. Hulten and Wykoff (1981) present evidence of a constant geometric depreciation rate derived from a hedonic model with a Box-Cox transformation of the value of industrial and commercial buildings. The more flexible Box-Cox approach permits relationships to emerge instead of forcing a predetermined, and perhaps ad hoc, functional form.

2 See Halvorsen and Palmquist (1980).

78.7.1 Example Using the Box-Cox Model

Table 78.4 presents the results of a hedonic pricing model where the dependent variable is a Box-Cox transformation of the property's transaction price. Notice that lambda emerges from the model and is not exogenously imposed.

Table 78.4 Box-Cox regression of the transaction price on the independent variables

Variable     Coefficient   T-stat
FT           2.35E-04      31.29
SQFT         2.19E-08      14.67
Age          7.82E-03      20.13
SQAGE        1.34E-04      8.71
Lot          6.49E-06      13.69
SQLOT        8.79E-11      7.33
Bed          0.10082       10.12
SQBED        1.39E-02      10.43
Bath 2       8.94E-03      1.13
Bath 3       1.70E-02      8.87
Bath 4       3.37E-02      13.76
Bath 5       2.86E-02      4.17
Pool         3.28E-02      18.21
Foreclose    4.49E-02      15.26
Asis         6.17E-02      12.10
Defect       3.08E-03      0.85
CBD          1.39E-03      8.05
New          7.83E-03      1.57
Summer       4.40E-03      3.69
Constant     6.1796        372.70
R²           87.5%
λ            0.100
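For illustration, one common way to let lambda emerge from the data is to apply a maximum-likelihood Box-Cox transform to the dependent variable and then run OLS on the transformed series. The sketch below uses randomly generated stand-in data, not the Klein-district sample.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
sqft = rng.normal(2250, 650, size=500).clip(min=500)        # stand-in regressor
price = 20_000 + 35 * sqft * rng.lognormal(0, 0.15, 500)    # stand-in prices (> 0)

y_bc, lam = stats.boxcox(price)        # ML Box-Cox transform; lam is lambda
fit = sm.OLS(y_bc, sm.add_constant(sqft)).fit()
print(lam, fit.params)
```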
78.8 Problems with Hedonic Modeling

78.8.1 The Identification Problem

There is an inherent identification problem that occurs when one attempts to model an item's demand function when the prices that one has to work with come from the interaction of the item's supply and demand functions: it is difficult to separate the supply and demand impacts on price. There is a second problem with hedonic modeling that comes from the non-linear nature of the pricing structure. In a simple demand model, the price of an item is taken as given and consumers make their purchase decision (quantity) based on the exogenous price; that is, consumers are price-takers. But, as seen above, non-linear hedonic models imply that the price of a characteristic is correlated with its quantity; consequently, buyers will select not only the quantity of a characteristic but, by design, also its price. Several authors, such as Blomquist and Worley (1982) and Diamond and Smith (1985), have attempted to solve this problem through the use of instrumental variables.

78.8.2 The Equilibrium Pricing Problem

A key aspect of demand modeling is that observed prices are assumed to be equilibrium prices. Unfortunately, in markets such as real estate, where adjustment costs can be large, the notion of observed prices being equilibrium ones becomes problematic. There are several papers that attempt to deal with the disequilibrium character of real estate transaction prices. See Maclennan (1977, 1982) for a good overview of the disequilibrium problem.

There are several ways in which researchers have attempted to deal with the disequilibrium problem. Bowden (1978) addresses the problem by utilizing only those observations that are either at or near equilibrium; he employs a switching regression technique. See Anas and Eum (1984) for an application of this "disequilibrium" hedonic modeling technique. Although switching regression models can be effective at mitigating the disequilibrium problem, they are not without their challenges. The major problem is how to differentiate between equilibrium and disequilibrium prices. Another problem is that modelers are often more interested in predicting actual future transaction prices rather than so-called equilibrium prices.

Another way to deal with the disequilibrium problem is to find a way to adjust prices back to their equilibrium levels. There are several papers that use this technique.3 The technique involves estimating a time series index of prices, then finding a way to determine which prices are equilibrium ones (for example, prices may be deemed to be near their equilibrium values if there was little or no price change from period to period). One takes this set of equilibrium prices and estimates their determining characteristics, so that for each period there is an actual price, an estimated index price, and an estimated equilibrium price. From these prices it is possible to determine the extent to which a price is out of equilibrium. The last step is to find the determinants of the disequilibrium and adjust the initial prices accordingly.
78.9 Recent Developments

There have been some interesting recent developments in the area of hedonic modeling. Below is a brief synopsis of a few recent papers and issues that are on the cutting edge of the literature.

Costanigro et al. (2007) argue that when researchers disregard heterogeneity across assets (they examine red and white wine), they introduce an aggregation bias into their estimated prices. They suggest that a model that estimates hedonic functions that are specific to price ranges yields more accurate predictions. Collins et al. (2007) examine a similar uniqueness problem, but with respect to art rather than wine.

Goetzmann and Peng (2006) analyze and present a model that adjusts for the bias that occurs because of a house seller's reservation price in transaction-based hedonic price indices. They present a hedonic model where the ratio of sellers' reservation prices to the actual market value has an impact on trading volume and can lead to a bias in the observed transaction prices. They find that there is an upward bias to index returns when trading volume decreases.

Diewert (2002) and Feenstra and Knittel (2004) examine the problem of quality adjustments in a hedonic model. These papers examine whether the output price index for a durable good can also be used as a (partial) input price index. Although these papers focus on the producer price index, the quality adjustments can be applied more widely. The authors find that the relevant input price is the value of the characteristics bundle of the underlying asset (i.e., the market price) divided by the quantity of the individual characteristics associated with the bundle. For example, if one were to buy a cleaning service, one would take the price of the cleaning service divided by the total number of benefits provided by the service (e.g., the time savings, the convenience, the trustworthiness, and the quality of the work).

Lastly, Bajari and Benkard (2005) reexamine hedonic models of demand for differentiated products. They nicely generalize Rosen's original hedonic model to allow for unobservable product characteristics and for the hedonic pricing function to have a nonseparable form. They use a semiparametric approach to demonstrate that if there are only a few products, one can construct bounds on an individual's utility parameters, in addition to other important quantities, for example, aggregate demand and consumer surplus.4
3 See Abraham and Hendershott (1996), Malpezzi (1998), and Drieman and Follain (2000).
4 Please see Bontemps et al. (2008), Jensen and Webster (2007), Ferreira and McMillan (2007), Hahn and Mathews (2007), and Epple et al. (2006) for additional work on this interesting theme.
78.10 Conclusion

This short primer provides an overview of the literature and the microeconomic theory that underpins modern hedonic pricing models. At their most basic, hedonic pricing models deconstruct an asset's price into the prices of the asset's individual component parts, and then use some form of ordinary least squares regression analysis, using either a linear, semi-log, or Box-Cox structural form, to examine how each individual component part uniquely contributes to the item's overall value. This paper explores each of these structural forms and their associated problems, and in addition offers some guidance on common treatments of the dependent and independent variables.
References Abraham, J. M. and P. H. Hendershott. 1996. “Bubbles.” Journal of Housing Research 7(2), 191–208. Amemiya, T. 1980. “Selection of regressors.” International Economic Review 21, 331–354. Anas, A. and S. J. Eum. 1984. “Hedonic analysis of a housing market in disequilibrium.” Journal of Urban Economics 15 87–106. Bajari, P. and C. L. Benkard. 2005. “Demand estimation with heterogeneous consumers and unobserved product characteristics: a hedonic approach.” Journal of Political Economy 113(6), 1239. Bayer P., F. Ferreira, and R. McMillan. 2007. “A unified framework for measuring preferences for schools and neighborhoods.” Journal of Political Economy 115(4), 588–638. Blomquist, G. and L. Worley. 1982. “Specifying the demand for housing characteristics: the exogeniety issue,” in The economics of urban amenities, D. B. Diamond and G. Tolley (Eds.). Academic Press, New York. Bowden, R. J. 1978. The econometrics of disequilibrium. North Holland. Box, G. E. P. and D. Cox. 1964. An analysis of transformations.” Journal of the American Statistical Association, Society Series B, 26, 211–252. Bontemps, C., M. Simioni, and Y. Surry. 2008. “Semiparametric hedonic price models: assessing the effects of agricultural nonpoint source pollution.” Journal of Applied Econometrics 23(6), 825–842. Collins, A., A. E. Scorcu, R. Zanola. 2007. “Sample selection bias and time instability of hedonic art price indexes, Quaderni, Working Papers DSE Nı 610. Costanigro, M., J. J. McCluskey, and R. C. Mittelhammer. 2007. “Segmenting the wine market based on price: hedonic regression when different prices mean different products.” Journal of Agricultural Economics 58(3), 454–466. Court, A. T. 1939. “Hedonic price indexes with automotive examples.” The dynamics of automobile demand, General Motors, New York. Diamond, D. B. Jr. and B. Smith. 1985. “Simultaneity in the market for housing characteristics.” Journal of Urban Economics 17, 280–292. Diewert, W. E. 2002. Hedonic producer price indexes and quality adjustment, Discussion Paper No.: 02–14, University of British Columbia, Vancouver.
1207 Epple, D., R. Romano, and H. Sieg. 2006. “Admission, tuition, and financial aid policies in the market for higher education.” Econometrica 74(4), 885–928. Feenstra, R. C. and C. R. Knittel. 2004. Re-Assessing the U.S. Quality Adjustment to Computer Prices: The Role of Durability and Changing Software (October 2004). NBER Working Paper No. W10857. Follain, J. R. and E. Jimenez. 1985. “Estimating the demand for housing characteristics: a survey and critique.” Regional Science and Urban Economics 15(1), 77–107. Goetzmann, W. and L. Peng. 2006. “Estimating house price indexes in the presence of seller reservation prices.” Review of Economics and Statistics 88(1), 100–112. Haas, G. C. 1922. “Sales prices as a basis for farm land appraisal.” Technical bulletin, Vol 9, The University of Minnesota Agricultural Experiment Station, St. Paul. Hahn, W. F. and K. H. Mathews. 2007. Characteristics and hedonic pricing of differentiated beef demands. Agricultural Economics 36(3), 377–393. Halvorsen, R. and R. Palmquist. 1980. “The interpretation of dummy variables in semilogrithmic regressions.” American Economic Review 70, 474–475. Halverson, R. and H. O. Pollakowski. 1981. “Choice of functional form equations.” Journal of Urban Economics 10(1), 37–49. Hocking, R. R. 1976. “The analysis and selection of variables in linear regression.” Biometrics 32, 1–49. Hulton, C. and F. Wycoff. 1981. “The measurement of economic depreciation,” in Depreciation, inflation, and the taxation of income from capital, C. Hulten (Ed.). Urban Institute Book, Washington. Jensen, P. H. and E. Webster. 2007. “Labelling characteristics and demand for retail grocery products in Australia.” Australian Economic Papers 47(2), 129–140. Lancaster, K. J. 1966. “A new approach to consumer theory.” Journal of Political Economy 74, 132–157. Leamer, E. E. 1978. Specification searches: ad hoc inference with nonexperimental data. Wiley, New York. Lee, C. F. 1976. “Functional form and the dividend effect of the electric utility industry.” Journal of Finance 31(5), 1481–1486. Lee, C. F. and J. B. Kau. 1976. “The functional form in estimating the density gradient: an empirical investigation.” Journal of American Statistical Association 71, 326–327. Lee, C. F., F. J. Fabozzi, and J. C. Francis. 1980. “Generalized functional form for mutual fund returns.” Journal of Financial and Quantitative Analysis 15(5), 1107–1120. Maclennan, D. 1977. “Some thoughts on the nature and purpose of hedonic price functions.” Urban Studies 14, 59–71. Maclennan, D. 1982. Housing economics, Longman, London. Malpezzi, S. 1998. “A simple error correction model of house prices,” Wisconsin-Madison CULER working papers 98–11, University of Wisconsin Center for Urban Land Economic Research. Rosen, S. 1974. “Hedonic prices and implicit markets: product differentiation in pure competition.” Journal of Political Economy 82(1), 34–55. Sheppard, S. 1999. “Hedonic analysis of housing markets,” in Handbook of regional and urban economics, P. C. Chesire and E. S. Mills (Eds.). Vol 3, Elsevier, New York. Wallace, H. A. 1926. “Comparative farmland values in Iowa,” Journal of Land and Public Utility Economics 2, 385–392.
Chapter 79
Numerical Solutions of Financial Partial Differential Equations Gang Nathan Dong
Abstract This paper provides a general survey of important PDE numerical solutions and studies in detail of certain numerical methods specific to finance with programming samples. These important numerical solutions for financial PDEs include finite difference, finite volume, and finite element. Finite difference is simple to discretize and easy to implement; however, explicit method is not guaranteed stable. The finite volume has an advantage over finite difference in that it does not require a structured mesh. If a structured mesh is used, as in most cases of pricing financial derivatives, the finite volume and finite difference method yield the same discretization equations. Finite difference method can be considered a special case of the finite element method. In general, the finite element method has better quality in obtaining an approximated solution when compared to the finite difference method. Since most PDEs in financial derivatives pricing have simple boundary conditions, the implicit method of finite difference is preferred to finite element method in applications of financial engineering. Keywords Numerical solution r Black–Scholes PDE r Finite difference r Finite element r Finite volume
79.1 Introduction Numerical methods seek to iteratively solve financial partial differential equations if analytical solutions of PDEs are not conveniently available. The development of analytical and numerical methods, based primarily on PDEs and their numerical solutions, has helped to increase the relevance of such development in the everyday practice of quantitative finance.
In general, it is not easy or even possible to find the exact analytical solutions of parabolic initial boundary value PDE problems, such in the case of Black–Scholes equation. This is the main reason for resorting to some numerical techniques. In recent years, spurred by the enormous economical success of financial derivatives, a need for sophisticated computational algorithms has developed. For example, to price an American option, quantitative financial analysts have asked for the numerical solution of a free-boundary PDE. Fast and accurate numerical methods have become essential tools to price financial derivatives and to manage portfolio risks. The required methods aggregate to the new subfield of computational finance. Computational finance is a field where different cultures meet. Hence, a wide array of academia and practitioners, with diverse backgrounds, are most interested in finding practical methods to solve the analytical problems in finance. In this chapter, we survey three different types of numerical methods, namely finite difference, finite volume, and finite element, and provide applications to solve Black– Scholes PDE, the equation for pricing options on stocks, all to show readers the advantages and disadvantages of each method.
79.2 The Model Given underlying stock price x, interest rate r, stock return volatility ¢, time t, strike price K and options price u, the Black–Scholes PDE is: 1 @2 u @u @u C 2 x 2 2 C rx ru D 0 @t 2 @x @x
(79.1)
The terminal condition is: G.N. Dong () Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_79,
u.T / D x.T / K; x > X
(79.2)
u.T / D 0; x X
(79.3)
1209
1210
G.N. Dong
79.3 Discretization The purpose of discretization is the conversion of a continuous problem with infinite degrees of freedom to a discrete problem with only finite number of unknowns that are easy for computer algorithm to solve. Instead of finding a solution everywhere in a continuous domain, the discretization method obtains an approximated solution at discrete points on the same domain. It defines a mesh network on the continuous domain, interpolates the intermediate values between the mesh points, and constructs a system of equations. The classical numerical method for partial differential equations is the finite difference method where the discrete problem is obtained by replacing derivatives with difference quotients involving the values of the unknown at discrete points on the mesh. The mathematical concerns of this approximation method are sometimes serious and difficult to address, including the questions of adequacy, accuracy, convergence, and stability.
The approximations of the solutions at these mesh points are denoted by: j (79.6) Ui u.xi ; tj / Its partial derivatives can be approximated by three Eulermethod based finite difference methods: @u u.x C x; t/ u.x; t/ D lim
x!0 @x
x j
j
Ui U i C1 (Forward difference)
x u.x; t/ u.x x; t/ @u D lim
x!0 @x
x j
(79.7)
j
Ui Ui 1 (Backward difference)
x @u u.x C x; t/ u.x x; t/ D lim
x!0 @x 2 x
j
(79.8)
j
Ui 1 U (Central difference) i C1 2 x
(79.9)
The second order derivative of x can be defined as: @2 u Œu.x C x; t/u.x; t/ Œu.x; t/u.x x; t/ D lim
x!0 @x 2
x 2
79.4 Finite Difference To approximate the model equation, the finite difference method divides the price domain [0,X] and time domain [0,T] by a set of lines parallel to the x and t axes to form a mesh. For further simplification, the lines in this mesh are equally spaced, and the number of lines is finite for both price (x) and time (t) as in Fig. 79.1. Define the spatial (price) spacing as x and time spacing as t, and mesh points as:
Tmax
xi D i x; x D 1=N; i D 0; 1; 2; : : : ; N
(79.4)
ti D j t; t D 1=M; j D 0; 1; 2; : : : ; M
(79.5)
j
j
j
2Ui C Ui 1 U i C1
x 2
79.4.1 Explicit Method We use central difference method for the first order derivative of x as described in Hull (2003) and backward difference method for the second order derivative of t as shown in Fig. 79.2:
Time Tmax ⫽ MΔt
Time ⫽ MΔt
(79.10)
i−1
i
i+1
j+1 j 3Δt
3Δt 2Δt
2Δt
Δt
Δt
0
Δx
2Δx
Fig. 79.1 Discretization mesh
3Δx
xmax ⫽ NΔx Price
0
Δx
2Δx
3Δx
Fig. 79.2 Discretization mesh of explicit method
4 xmax ⫽ NΔx Price
79 Numerical Solutions of Financial Partial Differential Equations j
j 1
U Ui @u u.x; t/ u.x; t t/ D lim i
t !0 @t
t
t
1211
Ui
D ai Ui 1 C bi Ui C ci Ui C1
(79.13)
ai D
1
t.ri C s 2 i 2 / 2
(79.14)
of t and x is chosen. Explicit method is not guaranteed to be stable. The error term might not disappear as
t and x become smaller and smaller, and instead it might accumulate and cause nonconvergence. Morton and Mayers (2005) analyze the accuracy, consistency, and convergence of the explicit method using truncation error and Fourier/Von Neumann Analysis. Wilmott (2006) poses an upper bound on the size of time step and suggests reducing the time step by a factor of four if halving the price step in order to improve accuracy. In practice, we must avoid using this discretization method and always consider other stable schemes such as the implicit method. To implement the explicit method for Black–Scholes PDE, we need to clarify the boundary condition that the options value is less than the difference between the maximum price of the underlying security and the exercise price discounted by interest rate r at that time t:
bi D 1 t.r C s 2 i 2 /
(79.15)
u.x; t/ xmax e r.T t / E
(79.17)
1 ci D t.ri C s 2 i 2 / 2
(79.16)
u.0; t/ 0
(79.18)
(79.11) We now can write the Black–Scholes PDE in this discretization form: j
j 1
j
j
Cri x
j
j
U 2Ui C Ui 1 1 C 2 i 2 x 2 i C1 2
x 2
Ui Ui
t
j
Ui C1 Ui 1 j rU i D 0 2 x
(79.12)
We can further simplify it based on Christensen and Munk (2004): j 1
j
j
j
This scheme has first-order accuracy, O. t), with only an exception of second-order accuracy if a specific ratio function [bsme]= bsme(K, p, t, s, r) Tmax = 1; Xmax = 1; dX = 0.01; dT = 0.0025; N = round(Xmax/dX)+1;M = round(Tmax/dT)+1; U = zeros(N,M); vertx = 0:dX:Xmax; for i=1:N if vertx(i) > K U(i,M) = vertx(i) - K; end end for j=M-1:-1:1 for i=1:N a = dT*(-r*i+(s*i)^2)/2; b = 1-dT*(r+(s*i)^2);
The following is the sample program code written in Matlab with xmax D $1:00 and tmax D 1 year:
1212
G.N. Dong
c = dT*(r*i+(s*i)^2)/2; if (i == 1) U(i,j) = 0; else if (i == N) U(N,j) = Xmax-K*exp(-r*(M-j)*dT); else U(i,j) = a*U(i-1,j+1)+b*U(i,j+1)+c*U(i+1,j+1); end end end end i = round((N-1)*p/Xmax)+1; j = round((M-1)*(Tmax-t)/Tmax)+1; bsme = U(i,j); end
We note that, not every combination of input parameters produces a realistic result. We have to pay extra attention to the ratio of t to x2 and the value of x by keeping them small.
Time Tmax ⫽ MΔt
i−1
i
i+1
j+1 j
79.4.2 Implicit Method 3Δt
While the advantage of the explicit method is simplicity, the benefit of the implicit method is stability. We use the forward difference method to approximate the second order derivative of t as shown in Fig. 79.3: j C1
j
U @u u.x; t C t/ u.x; t/ Ui D lim i
t !0 @t
t
t (79.19)
2Δt Δt
0
Δx
2Δx
3Δx
Fig. 79.3 Discretization mesh of implicit method
xmax ⫽ NΔx Price
79 Numerical Solutions of Financial Partial Differential Equations
The Black–Scholes PDE in this discrete method becomes: j C1
Ui
j
j
j
1213
It can be further reduced to:
j
U 2Ui C Ui 1 1 Ui C 2 i 2 x 2 i C1
t 2
x 2 j
j C1
j
Ui 1 U j rU i D 0 Cri x i C1 2 x
(79.20)
j
j
j
Ui
D ai Ui 1 C bi Ui C ci Ui C1
(79.21)
ai D
1
t.ri s 2 i 2 / 2
(79.22)
bi D 1 C t.r C s 2 i 2 /
(79.23)
1 ci D t.ri C s 2 i 2 / 2
(79.24)
For each time step at j from 0 to M, there are N equations that need to be resolved: 2
b1 6a2 6 6 6 6 6 4
c1 b2 a3
c2 b3
c3 :::
::: bN 1 aN
3 2 j 3 2 U j C1 a U j U1 1 0 1 j 7 j C1 6 76 U U 6 6 7 2 7 6 2j 7 6 j C1 76 U 7 6 U3 76 3 7 D 6 76 ::: 7 6 ::: 76 7 6 j C1 cN 1 5 4 UNj 1 5 4 UN 1 j j C1 j bN U U cN U
It is same as the explicit method that the approximation error is of order O( t). However, the implicit method guarantees convergence such that the error term will eventually become zero as t and x are infinitely small. Finding the
function [bsmi]= bsmi(K, p, t, s, r) X = [0:0:0]; B = [0:0:0]; Tmax = 1; Xmax = 1; dX = 0.01; dT = 0.0025; N = round(Xmax/dX)+1;M = round(Tmax/dT)+1; U = zeros(N,M); A = zeros(N,N); vertx = 0:dX:Xmax; for i=1:N if vertx(i) > K B(i) = vertx(i) - K;
N
N
3 7 7 7 7 7 7 7 5
N C1
solution of N linear equations Ax D b for all M time steps is a big disadvantage of the implicit method. It is a tradeoff between complexity and stability. The implementation is more complicated than the previous explicit method in Matlab program:
1214
G.N. Dong
U(i,M) = B(i); end end for i=1:N a = dT*(r*i-(s*i)^2)/2; b = 1+dT*(r+(s*i)^2); c = -dT*(r*i+(s*i)^2)/2; if (i>1) A(i,i-1)=a end A(i,i)=b if (i
79 Numerical Solutions of Financial Partial Differential Equations
1215
i = round((N-1)*p/Xmax)+1; j = round((M-1)*(Tmax-t)/Tmax)+1; bsmi = U(i,j); end
79.4.3 Crank–Nicolson Method Although the explicit method is computationally simple and the implicit method has a low truncation error of order O. t), the average of both methods can become more efficient than its predecessors in terms of the error order of j C1
Ui
j
1 Ui C 2 i 2 x 2
t 4
1 C ri x 2
j C1
j C1
Ui C1 Ui 1 2 x
j C1
approximation, which is O. t2 ). This method is named after Crank and Nicolson (1947). An alternative is the Douglas Method, which is a “weighted” average of both explicit and implicit method. The Black–Scholes PDE in Crank–Nicolson scheme is: j C1
j C1
j
j
j
C Ui 1 2Ui C Ui 1 U Ui C1 2Ui C i C1 2
x
x 2 ! j j Ui C1 Ui 1 1 j C1 j C C Ui / D 0 r.Ui 2 x 2
!
(79.25)
It can be further simplified as: j
j C1
j
j C1
ai D
j C1
ai Ui 1 C bi Ui C ci Ui C1 D ai Ui 1 C .4 bi /Ui j C1
ci Ui C1
(79.26)
1
t.ri s 2 i 2 / 2
(79.27)
bi D 2 C t.r C s 2 i 2 /
(79.28)
1 ci D t.ri C s 2 i 2 / 2
(79.29)
This scheme is more efficient than both the implicit and the explicit methods due to its second-order error term, and it is guaranteed to be stable. Similar to the implicit method, the computer program needs to solve N equations of Ax D b for each time step j from N to 1 backwards: 2
b1 6a2 6 6 6 6 6 4
c1 b2 a3
c2 b3
c3 :::
::: bN 1 aN
3 j C1 j C1 j 32 j 3 2 c1 U1 a1 U0 .4 b1 /U0 U1 j j C1 j C1 j C1 7 6 76 a2 U1 C .4 b2 /U2 c2 U3 U2 7 7 6 7 76 6 7 7 6 j j C1 j C1 j C1 76 U 7 6 a U C .4 b /U c U 7 3 2 3 3 4 3 76 3 7 D 6 7 76 ::: 7 6 7 ::: 76 7 6 j C1 j C1 7 cN 1 5 4 UNj 1 5 4 aN 1 UNj C1 5 2 C .4 bN 1 /UN 1 cN 1 UN j j C1 j C1 j bN U aN U C .4 bN /U cN U N
N 1
N
N C1
1216
G.N. Dong
The Matlab program code of Crank–Nicolson Method: function [bsmc]= bsmc(K, p, t, s, r) X = [0:0:0]; B = [0:0:0]; Tmax = 1; Xmax = 1; dX = 0.01; dT = 0.0025; N = round(Xmax/dX)+1;M = round(Tmax/dT)+1; U = zeros(N,M); A = zeros(N,N); vertx = 0:dX:Xmax; for i=1:N if vertx(i) > K B(i) = vertx(i) - K; U(i,M) = B(i); end end for i=1:N a = dT*(r*i-(s*i)^2)/2; b = 2+dT*(r+(s*i)^2); c = -dT*(r*i+(s*i)^2)/2; if (i>1) A(i,i-1)=a end A(i,i)=b if (i
79 Numerical Solutions of Financial Partial Differential Equations
1217
end for j=M-1:-1:1 B(1) = U(1,j+1)*(2-dT*(r+(s*i)^2))+U(2,j+1)*dT*(r*i+(s*i)^2)/2; for i=2:N-1 B(i) = U(i-1,j+1)*(dT*((s*i)^2-r*i)/2)+U(i,j+1)*(2-dT*(r+(s*i)^2))+ U(i+1,j+1)*dT*(r*i+(s*i)^2)/2; end B(N) = U(N-1,j+1)*(dT*((s*N)^2-r*N)/2)+U(N,j+1)*(2-dT*(r+(s*N)^2))+ (Xmax-K*exp(-r*(M-j-1)*dT))*dT*(r*N+(s*N)^2)/2+ (Xmax-K*exp(-r*(M-j)*dT))*dT*(r*N+(s*N)^2)/2; X = linsolve(A,B’); for i=1:N U(i,j) = X(i); end end i = round((N-1)*p/Xmax)+1; j = round((M-1)*(Tmax-t)/Tmax)+1; bsmc = U(i,j); end
79.5 Finite Volume Finite volume is a method for representing and evaluating PDE as algebraic equations. Similar to the finite difference method, values are calculated at discrete points on a mesh. This method defines “finite volume” or cell, which is a small volume surrounding each node point on a mesh. Volume integrals in a partial differential equation that contain a divergence term are converted to surface integrals, using the divergence theorem. These terms are then evaluated as fluxes at the surfaces of each finite volume. Law of conservation states that the flux entering a given volume is identical to that leaving the adjacent volume.
Finite volume method is more flexible than finite difference method, and can handle cases where the underlying partial differential equation becomes convection dominated or degenerates at boundary. It is often used with unstructured meshes and it can converge even if the initial condition is not smooth or discontinuous. To approximate the model equation, the finite volume method divides the price and time domains to form a mesh and defines a finite volume or cell surrounding each discrete point. The cell boundary is usually located at the midpoint of two neighboring discrete points as shown in Fig. 79.4.
1218
G.N. Dong
The approximation of this model equation using the finite volume method is the same as the result of using the finite difference explicit method:
ith cell
0 1 j j j j j Ui 1 Ui A 1 2 2 @ Ui C1 Ui Ui C C xi
t 2
x 1
x 1
x i+½
x i-½
x i-1
xi
x i+1
j C1
ai
Ui
iC 2
Fig. 79.4 Discretization mesh and finite volume (cell) j
Crxi
Cell boundaries: x
iC
xi C1 C xi 2 xi C xi 1 D 2 D
1 2
x
1 i 2
(79.30)
D xi C1 xi
(79.32)
x
D xi xi 1
(79.33)
The partial derivatives of u(x,t) can be approximated by: Ui C1 Ui @u 1 @x i C 2
x 1
(79.34)
Ui Ui 1 @u @x i 21
x 1
(79.35)
iC
2
i 2
Cell or finite volume: ai D
xi C xi 1 xi C1 C xi 2 2
(79.36)
We can now integrate the Black–Scholes PDE over the cell ai : Z ai
@u ds C @t
Z
2
ai
1 2 2@ u x ds C 2 @x 2
ai
Z ai
Z ai
j C1
Z ai
@u rx ds @x
Finite element method is preferred for solving PDE over complex domains: when the domain changes as during a solid state reaction with a moving boundary, when the desired precision varies over the entire domain, or when the solution lacks smoothness. The solution can be found either by eliminating the differential equation completely in steady state problems, or by rendering the PDE into an approximating system of ordinary differential equations, which are then solved using standard techniques such as finite difference and Runge-Kutta. To apply the finite element method to our model equation, we define a bounded spatial domain N where xN D xmax D Œ0; x N
Z ruds D 0
T
0
ai
0
Z
Z
T
C
xN
r
x 0
Z
T 0
1 2 2
@u vdsdt @x
Z
xN
x2 0
Z
@2 u vdsdt @x 2
Z
T
xN
uvdsdt D 0
r 0
0
(79.43)
1
2
i
2
(79.39)
Ui C1 Ui 1 U 1 / D rxi i 2 2 (79.40)
ruds D rU i ai
0
@u vdsdt C @t
(79.38)
iC
Z
xN
0
1 2 2 @2 u 1 Ui 1 Ui A Ui C1 Ui x ds 2 xi2 @ C 2 2 @x 2
x 1
x 1
@u rx ds rxi .U 1 iC @x 2
Z
ai
j
@u U Ui ds ai i @t
t
(79.42)
The detail analysis of this method can be found in Forsyth and Vetzal (2006).
(79.37) Z
j
rU i ai
79.6 Finite Element
x
1 i 2
!
(79.31)
Distance between two points: 1 iC 2
j
Ui C1 Ui 1 2
i 2
(79.41)
After integration by parts, it becomes Z
T 0
"Z
xN 0
2 @u vds @t 2 Z
xN
C.r / 2
0
Z
xN
x2 0
2 2 @u.x/ @u @v N ds C xN v.x/ N @x @x 2 @x
@u x vds r @x
Z
xN
# uvds dt D 0
0
(79.44)
79 Numerical Solutions of Financial Partial Differential Equations
It can be further simplified to Z
T
Derive algebraic equations for the coefficients of the ex-
@u Œm. ; v/ C a.u; v/ dt D 0 @t
(79.45)
BC W u.x; T / D Max.x K; 0/
(79.46)
0
1219
Where
pansion. Foufas and Larson (2004) derive a system of matrix equations by simplifying the piecewise continuous function to first-order. Apply Galerkin method to solve this system of equations. Gockenbach (2002) has detailed mathematical treatments on this method.
Z xN @u 2 @u.x/ @u N m ;v D vds xN 2 v.x/ N (79.47) @t 2r @t 0 @t Z xN Z 2 xN 2 @u @v @u a.u; v/ D .r 2 / vds ds x 2 0 @x @x 0 @x Z xN 2 C xu. uvds (79.48) N x/v. N x/ N r 2 0 This is the weak form of the boundary value problem. In order to complete the discretization scheme of finite element to find the approximate solution of u(x,t), we need to follow these specific steps: Choose a particular basis function, such as piecewise
polynomials, that can expand the trial and test functions.
79.7 Empirical Result Table 79.1 compares the numerical results of European call option prices calculated by Finite Difference’s Explicit Method, Implicit Method, Crank–Nicolson Method, and Black–Scholes close-form solution. In most scenarios, all three finite difference methods generate same call option prices of 4-decimal-point precision and they are very close to the results using Black–Scholes formula. Explicit Method reveals instability when volatility is above 20%, whereas Implicit Method and Crank–Nicolson Method remain stable in this computation.
Table 79.1 Numerical results of European call option prices and comparison to Black-Scholes solution Stock Strike Year to Risk-free rate Volatility Explicit price price maturity (%) (%) method
Implicit method
Crank– Nicolson
Black– Scholes
0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50
0.0297 0.4910 0.4054 0.3102 0.2151 0.1200 0.0006 0.0014 0.0147 0.0249 0.0344 0.0435 0.0525 0.0155 0.0212 0.0393 0.0496 0.0603 0.0711 0.0246 0.0252 0.0357 0.0421 0.0488 0.0555
0.0297 0.4910 0.4054 0.3102 0.2151 0.1200 0.0006 0.0014 0.0147 0.0249 0.0344 0.0435 0.0525 0.0155 0.0212 0.0393 0.0496 0.0603 0.0712 0.0246 0.0252 0.0357 0.0421 0.0488 0.0556
0.0293 0.4905 0.4049 0.3098 0.2146 0.1195 0.0004 0.0023 0.0147 0.0246 0.0338 0.0428 0.0515 0.0153 0.0210 0.0387 0.0488 0.0593 0.0698 0.0244 0.0250 0.0351 0.0414 0.0479 0.0545
0.50 0.01 0.10 0.20 0.30 0.40 0.60 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50
0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.01 0.2 0.4 0.6 0.8 1 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
10 10 10 10 10 10 10 10 10 10 10 10 10 1 5 15 20 25 30 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 1 5 15 20 25 30
0.0297 0.4910 0.4054 0.3102 0.2151 0.1200 0.0005 0.0014 0.0147 0.0249 0.0344 0.0435 0.0525 0.0155 0.0212 0.0393 0.0496 0.0604 0.0712 0.0246 0.0252 0.0357 0.0422 1 1
1220
To illustrate the discretization and dynamics of Finite Difference method, we plot three waterfall figures as Figs. 79.5, 79.6 and 79.7. The mesh size of Time to Maturity
G.N. Dong
is 400 with maximum time of 1 year. The mesh size of underlying stock price is 100 with maximum price of $1. Figures 79.5 and 79.6 indicate that high volatility causes higher premium for out-of-money call options. The same effect exists when risk-free rate is high as in Fig. 79.7.
79.8 Conclusion
Fig. 79.5 Implicit method when strike price D $0.5, risk-free rate D 10% and volatility D 10%
Conceptually, the finite difference and finite volume methods are approximation to the PDE; the finite element method is an approximation to the solution. The finite difference is simple to discretize and easy to implement, but the explicit method is not guaranteed stable; therefore the implicit method is always preferred. Finite volume has an advantage over finite difference that it does not require a structured mesh. If a structured mesh is used, the finite volume and the finite difference method yield the same discretization equations. The most attractive feature of the finite element method is its ability to handle complex geometries and boundaries. Finite difference method can be considered a special case of finite element method. In general, the finite element method has better quality in obtaining an approximated solution compared to the finite difference method. Since most PDEs in financial derivatives pricing have simple boundary conditions, the implicit method of finite difference is preferred to the finite element method in applications of financial engineering. When we are faced with a derivative pricing PDE that requires finding solutions numerically, we need to consider various key criteria for evaluating these discretization methods: approximation accuracy, solution convergence, error stability, extension flexibility, computation performance and memory occupation.
Fig. 79.6 Implicit method when strike price D $0.5, risk-free rate D 10% and volatility D 100%
References
Fig. 79.7 Implicit method when strike price D $0.5, risk-free rate D 100% and volatility D 10%
Brandimarte, P. 2006. Numerical methods in finance and economics: a MATLAB-based introduction, Wiley, New York. Christensen, M. and C. Munk. 2004. Note on numerical solution of Black–Scholes-Merton PDE, Working paper, University of Southern Denmark. Duffy, D. 2006. Finite difference methods in financial engineering: a partial differential equation approach, Wiley, New York. Forsyth, P. and K.Vetzal. 2006. Numerical PDE methods for pricing path dependent options, University of Waterloo, Ontario, Canada. Foufas, G. and M. Larson. 2004. Valuing European, barrier and lookback options using finite element method and duality techniques, Chalmers University of Technology, Sweden. Gockenbach, M. 2002. Partial differential equations: analytical and numerical methods, SIAM Press. Hull, J. 2003. Options futures and other derivatives, 5th ed., Prentice Hall, Upper Saddle River, New Jersey.
79 Numerical Solutions of Financial Partial Differential Equations Mathews, J. and K. Fink. 1999. Numerical methods using Matlab, Prentice Hall, Upper Saddle River, New Jersey. Morton, K. W. and D. F. Mayers. 2005. Numerical solution of partial differential equations: an introduction, Cambridge University Press, Cambridge. Topper, J. 2005. Financial engineering with finite elements, Wiley, New York. Wilmott, J. 2006. Wilmott on quantitative finance, Wiley, New York, pp. 1213–1215.
1221
Further Reading Mathews and Fink (1999) and Brandimarte (2006) are good introduction books on Matlab programming and application in finance and economics. Duffy (2006) and Topper (2005) have put together comprehensive examples that show the use of finite difference and finite element methods in pricing financial derivative instruments.
Chapter 80
A Primer on the Implicit Financing Assumptions of Traditional Capital Budgeting Approaches Ivan E. Brick and Daniel G. Weaver
Abstract The most common capital budgeting approaches use the basic constant risk-adjusted discount models. The most popular valuation approach discounts the unlevered cash flows by the after-tax weighted average cost of capital. The adjusted present value (APV) approach takes the sum of the present value of the unlevered cash flow discounted by the cost of equity of the unlevered firm and the value of the tax benefits of debt discounted by the cost of debt. Traditional approaches for calculating the after-tax weighted average cost of capital and cost of equity of the unlevered firm assume that firms maintain a constant market-value percentage of debt. However, firms typically maintain a book-value percentage of debt. Brick and Weaver (Rev Quant Finance Account 9:111–129, 1997) present an approach to estimate the cost of equity of an unlevered firm when the firm maintains a constant book-value-based leverage ratio. We demonstrate that both the Modigliani and Miller (Am Econ Rev 53:433–443, 1963) and Miles and Ezzell (J Finan Quant Anal 15:719–730, 1980) approaches may yield substantial valuation errors when firms determine debt levels based on book-value percentages. In contrast, our method makes no errors as long as managers know the marginal tax benefit of debt. If the managers do not know the marginal tax benefits of debt, our approach will still result in the smallest valuation error. Keywords Capital budgeting r After-tax weighted average cost of capital r APV r cost of equity of the unlevered firm
80.1 Introduction Textbooks generally recommend using the after-tax weighted average cost of capital (ATWACOC) to discount unlevered cash flows to calculate the net present value of a project.1 A survey conducted by Bruner et al (1998) suggests that the ATWACOC approach is the preferred approach for capital I.E. Brick () and D.G. Weaver Rutgers University, Newark, NJ, USA e-mail: [email protected]; [email protected]
budgeting purposes employed by the leading U.S. companies. An approach pioneered by Myers (1974) suggests using the adjusted present value (APV) approach to evaluate the NPV of projects. In this latter approach, the net present value of the project equals the sum of the present value of the unlevered cash flow discounted by the cost of equity of the unlevered firm and the value of the tax benefits of debt. Both of these approaches implicitly assume a constant financing policy. The ATWACOC approach assumes that the market-value proportions of debt and equity is kept constant throughout the life of the project. The APV approach can be used under two alternative financing policies. APV may be used if one assumes that the level of debt is a known constant throughout the life of the asset. In this case, we use the Modigliani and Miller (1963) approach to estimate the cost of equity of the unlevered firm and discount the tax benefits of debt by the cost of debt. Alternatively, one may assume (as is done for the ATWACOC) that the level of debt is a constant percentage of the market value of the project. In this case, one uses the Miles and Ezzell (1980) model to estimate the cost of equity of the unlevered firm, which is then used to discount the unlevered cash flows. Moreover, the level of debt is uncertain after the first period and therefore one must use the cost of equity of the unlevered firm to discount the tax benefits of debt after the first period.2 However, as Booth (2002, 2007) demonstrates, using the APV approach is difficult to implement. In particular, to estimate the NPV using the APV approach, one must calculate separately the value of the tax benefits of debt. Since the perpetuity assumption is overly restrictive for most projects, generally one must assume that the firm maintains a constant percentage of the market value of the project. With this assumption, one can use the Miles and Ezzell (1980) framework to correctly estimate the cost of equity of the unlevered firm. But until one estimates the NPV of the project using the APV approach, how does one know the appropriate level 1
See for example, Brealey et al. (2007), Gitman (2007), and Ross et al. (2008). Berk and DeMarzo (2007) use both the ATWACOC and the APV approaches. 2 Also see Arzac and Glosten (2005) and Farber et al. (2006).
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_80,
1223
1224
of debt? And if the manager does not know the level of debt, how can the manager correctly use the APV to find the NPV? Hence, Booth (2002) and most textbooks implicitly conclude that since we do not have to actually estimate the appropriate level of debt to find the correct NPV when using the ATWACOC approach, that approach is best and easiest to use. However, for the ATWACOC approach to yield an accurate project valuation estimate, the firm must maintain a constant percentage debt based on the current market value of the project. In contrast, most firms maintain a debt-equity mix based on a long run target book value.3 Generally, the market value of debt is not much different from the book value of debt unless interest rates have changed dramatically since the initial offering of the debt. But it is not uncommon for the market value of equity to differ substantially from its book value. In this case, the existing frameworks of ATWACOC or APV (i.e., using either the Modigliani and Miller (1963) model or Miles and Ezzell (1980) framework to estimate the cost of equity of the unlevered firm) will yield inaccurate estimates of the NPV of the project. In fact, as shown by Bernhard (1978) and Brick and Weaver (1984), they may also give erroneous decisions such as rejecting a truly profitable project. Brick and Weaver (1997) derive the correct cost of equity of an unlevered firm in the case that firms gear their financing mix towards target book-value debt-equity ratios.4 In this paper, we present their methodology and compare and contrast the Brick and Weaver approach to those recommended by the typical finance textbook.5
I.E. Brick and D.G. Weaver
Note that our methodology essentially relies on the Myers (1974) APV framework. However, given personal taxes, the gain from leverage will then be a function of both the corporate tax rate and the marginal personal tax rates of both debt and equity securities. Unfortunately, managers cannot readily observe the true gain from leverage. If their estimate of the gain from leverage is incorrect, then the efficacy of the capital budgeting methods to yield an accurate estimate of the net present value will depend upon whether or not firms rebalance their leverage ratio based upon the market value or book value of the assets. If the firm maintains a constant market based leverage, then the ATWACOC will always yield the correct NPV. But if the firm is maintaining a constant target book value leverage, our approach will, in general, be most accurate. The organization of this paper is as follows: Sect. 80.2 describes the most popular approaches to capital budgeting; Sect. 80.3 presents an approach that correctly estimates the cost of equity of an unlevered firm; Sect. 80.4 demonstrates by example the potential bias in the APV of a project if the manager estimates the cost of equity of an unlevered firm using the Modigliani and Miller (1963) or the Miles and Ezzell (1980) framework. Additionally, we demonstrate the accuracy of our approach. We also examine the implication of personal taxes on valuation in Sect. 80.5. In Sect. 80.6, we offer concluding remarks.
80.2 Textbook Approaches to NPV
3
Lev (1969), Marsh (1982), Frecka and Lee (1983), Lee and Wu (1988) and Graham and Harvey (2002), document that firms generally maintain a debt-equity financing mix based on a target “book-value” debtto-asset-ratio. Generally, restrictive covenants are stated using bookvalue measures as opposed to market-value measures (see for example, Smith and Warner 1979). The purpose of the restrictive covenants is to protect the bondholders from stockholder expropriation of their wealth. A violation of these restrictive covenants results in the firm being in technical default and incurring associated default costs. A potential explanation for the focus of lenders and borrowers on book-value measures is that the implied limits of the restrictive covenants would be more volatile with market-value measures, resulting in a greater probability of the firm violating a restrictive covenant. Additionally, restrictive covenants that set a limit on the debt-equity ratio of the firm in terms of market-value might induce stockholders to take on more risk to raise the value of equity, decrease the value of debt, and reduce the implied leverage ratio of the firm. 4 Beranek (1977, 1978) has developed a capital budgeting approach that would correctly determine whether the project is profitable even if the firm does not maintain a constant market-value based leverage ratio. However, this approach, unlike the one proposed in this paper and that of Brick and Weaver (1997), cannot give an exact estimate of the project’s net present value and, therefore, would not be useful in evaluating mutually exclusive projects. 5 The literature also recommends alternative capital budgeting approaches. Arditti and Levy (1977), Kaplan and Ruback (1995), and Ruback (2002) suggest that the before-tax weighted average cost of capital (BTWACC) should be used to discount levered cash flows (also
According to the ATWACOC, the manager discounts the expected unlevered cash flow of the firm, X, by the ATWACOC. More formally, we have: NPV.ATWACOC/ D I C
n X
Xt =.1 C ATWACOC/t
tD1
(80.1) where Xt is the expected unlevered cash flow at time t and I is the initial investment. ATWACOC equals L.1 £/R C .1 L/Ke where L is the market value proportion of debt, £ is the corporate tax rate, R is the cost of debt and Ke is the cost of equity. Throughout this paper, we assume that there is no risk of default. The APV approach is the sum of the discounted value of the unlevered cash flow discounted by the cost of equity of
referred as Capital Cash Flows) which explicitly incorporate the tax deductibility of interest expense. The equity residual (ER) method examines the project from the stockholders’ perspective. These two alternative approaches also assume that the level of debt is a constant percentage of the value of the project or the firm (See Boudreaux and Long 1979, Chambers et al. 1982 and Ruback 2002).
80 A Primer on the Implicit Financing Assumptions of Traditional Capital Budgeting Approaches
the unlevered firm and the present value of the tax benefits of debt. Assuming that the firm maintains an overall financing policy of a constant level of debt in perpetuity, then the manager can use the Modigliani and Miller (1963) framework to estimate the cost of equity of the unlevered firm. More formally, NPV.APVMM/ D I C
n X
Xt =.1 C KeuMM /t
tD1
C
n X
£ rD=.1 C R/t
(80.2)
tD1
where KeuMM is the cost of equity of an unlevered firm consistent with the Modigliani and Miller (1963) framework and is equal to ATWACOC/.1 £L/. D is the constant level of debt used to finance the project. Alternatively, if leverage is a constant percentage of the firm’s or project’s market value, then the debt level is known with certainty at time 0 since the value of the project is known. However, the future level of debt for t 1 is uncertain since the future market value of the firm or project is not known with certainty at time t D 0. In this case, the tax benefits of debt for t 1 should be discounted by the cost of equity of the unlevered firm. Moreover, in this setting, the firm is assumed to rebalance its debt in every period to maintain constant weight. Miles and Ezzell (1980) derive, in this case, a closed form formula for the cost of equity of the unlevered firm. In particular, KeuME D ŒATWACOC.1 C R/ C L£ R = Œ1 C R.1 £L/ (80.3) Consequently, the NPV of the project using the Miles-Ezzell (1980) assumptions is:6 NPV.APVME/ D I C
n X
Xt =.1 C KeuME /t
tD1
C
n X
£ RDt1 = .1 C R/.1 C KeuME /t1
tD1
(80.4) To better understand the differences of the above approaches, consider the following example. Assume that a firm’s ATWACOC is 9.6%, with L D :4; £ D :4; R D 10% and Ke D 12%. The firm is considering an investment of $10 million. The life of the asset is 10 years and the unlevered cash flow is $2.5 million per year. Using Equation (80.1), the
6 The derivation of this formula is given in the next section. Also see Fernandez (2004) and Cooper and Nyborg (2006).
1225
NPV(ATWACOC) D $5,628,969.59 and the market value of the project at time zero is $15,628,969.59. Consequently, to maintain a constant debt equity ratio consistent with L D :4, the level of debt raised by the firm to finance this project is 40% of $15,628,969.59 or $6,251,587.84. According to the Modigliani and Miller (1963) framework, KeuMM D 11:4286%. Consequently, using Equation (80.2), we find that the NPV(APVMM) D $15,998,729.44. To understand why we have different values for NPV, consider Table 80.1. The first column is time t. The second column gives you the unlevered cash flow for each period. The third column is the market value of the project for time t, found by obtaining the present value at time t of the unlevered cash flows from t C 1 to the end of the life of the project using the ATWACOC as the discount rate. Notice that the value of the project declines through time. Accordingly, if the firm wants to maintain a constant leverage ratio of 40%, the level of debt should decline. This is an automatic assumption of the ATWACOC approach. However, NPV(APVMM) approach assumes that the level of debt remains at a constant value of $6,251,587.84. As a result, the tax benefits of debt are higher in this APV approach because it automatically assumes a greater level of debt.7 To calculate the NPV(APVME), we first must calculate KeuME using Equation (80.3). Accordingly, KeuME D 11:2177%. We discount column (2) of Table 80.1 by 11.2177%, which yields the value of the unlevered project equal to $4,589,633.33. The value of the tax benefits of debt is found by first discounting the interest payments payments of column (5) at KeuME to Time 1 and discount that value and the assumed tax benefit of interest at time 1 by the cost of debt. Summing the value of the unlevered project and the value of the tax benefits of debt, we once again get the same NPV as the ATWACOC. This highlights another difference between the ATWACOC approach and the APV approach using the Modigliani and Miller approach in that the ATWACOC and the APV using the Miles and Ezzell framework assume that the interest payments after t D 1 are uncertain (as is implied by the use of KeuME to discount the interest payments). Note, however, that one cannot estimate the NPV using the APV approach without using the ATWACOC to first find the market value of the project in each period. Hence, we can readily understand the popularity of the ATWACOC.
7 Note that if the unlevered cash flows are growing and that the market value of the firm and debt increases through time, then the APV approach using the Modigliani and Miller (1963) framework would generally have a biased downward estimate of the true NPV. The reason for this is that the debt level according to this APV framework assumes that the debt level is constant instead of increasing.
1226
I.E. Brick and D.G. Weaver
Table 80.1 This Table delineates the implicit level of debt financing and interest expense of the capital budgeting example set forth in Sect. 80.2 according to the ATWACOC capital budgeting approach
Time (1)
Xt (2)
Market value (3)
Debt (4)
Interest (5)
0 1 2 3 4 5 6 7 8 9 10
-$10,000,000 $2,500,000 $2,500,000 $2,500,000 $2,500,000 $2,500,000 $2,500,000 $2,500,000 $2,500,000 $2,500,000 $2,500,000
$15,628,969.59 $14,629,350.67 $13,533,768.34 $12,333,010.10 $11,016,979.07 $9,574,609.06 $7,993,771.53 $6,261,173.59 $4,362,246.26 $2,281,021.90 $0.00
$6,251,587.84 $5,851,740.27 $5,413,507.33 $4,933,204.04 $4,406,791.63 $3,829,843.62 $3,197,508.61 $2,504,469.44 $1,744,898.50 $912,408.76 $0.00
$625,158.78 $585,174.03 $541,350.73 $493,320.40 $440,679.16 $382,984.36 $319,750.86 $250,446.94 $174,489.85 $91,240.88
80.3 Theoretical Valuation of Cash Flows
VL D
1 X
Xft =.1 C Keu /t
tD1
The theoretically correct value of a firm or project is given by the certainty equivalent present value of unlevered cash flows and the tax benefits of debt (see Taggart 1991). More formally, VL D
1 X
›t Xft =.1 C R/t C
tD1
1 X
£ “t R Dft1 =Œ.1 C R/t
tD1
(80.5) where Xft is the expected unlevered cash flow at time t, ›t and “t are the certainty equivalent factors for the firm’s unlevered cash flow and interest expense at time time t, R is the risk free interest rate assumed to be constant throughout the life of the firm, and Dft is the level of debt at time t. For the certainty equivalent model to be equivalent to a constant risk-adjusted discount model, we have to assume that ›t D ›t and “t D “t .8 We assume in this paper that the level of debt at time t is a proportion of the firm’s book value. We also assume that the debt is default free. Hence, the tax benefits of debt at time 1 is known with certainty at time 0 because the book value of the project is known with certainty at time zero. Consequently, “1 D 1. We will assume that the change in the firm’s book value for t 1 is strictly proportional to the unlevered cash flow of the previous period. Accordingly, the level of debt and tax benefits of debt are uncertain after t D 0. As a result, “t D ›t1 for t > 1. These assumptions are consistent with the Miles and Ezzell (1980) and the ATWACOC frameworks. Given these assumptions, the relationship between the cost of equity of the unlevered firm and the certainty equivalent factor is given by › D .1 C R/=.1 C Keu /. As a result, Equation (80.5) maybe rewritten as:
8
See Robichek and Myers (1966).
C
1 X
£ R Dft1 =Œ.1 C R/.1 C Keu /t1
(80.6)
tD1
We assume that the firm maintains a debt to book value ratio of ’. We assume that the expected book value of the firm for t 1 is of the same risk class as the operating cash flows of the firm. Consequently, this assumption is also consistent with the basic risk assumption concerning future tax benefits of debt used by Miles and Ezzell (1980). We assume that the expected unlevered after-tax cash flow of the firm, Xft , grows at the rate g for n periods and, thereafter, at the growth rate of zero. For illustrative purposes assume that additions to firm book value for t n are a function of the unlevered cash flow at time t. Let Xt D .1 ™/CFt where ™ is the investment rate as a percentage of CFt where CFt is the unlevered operating after tax cash flow of the firm.9 As a result, the expected book-value of the firm also grows at rate g for n periods and remains a constant thereafter.10 According to our assumptions, ™ > 0 for t n and ™ is equal to the depreciation rate for t > n. This latter simplifying assumption allows us to assume that the book value of the firm is constant after t D n. Let BVFt denote the book value of the firm at time t. Consequently,
9 The operating cash flow does not include the investment made by the firm. However, the factor .1 ™/ incorporates the amount of money invested for working capital, and long-term tangible and intangible assets. The investment rate ™ need not be constant and can for some period be greater than one implying the firm does not have sufficient earnings to fulfill the investment goals of the firm. For simplicity, we shall assume that ™ < 1 and is constant. 10 This assumption simply prevents the expected future book-value from having an infinite value.
80 A Primer on the Implicit Financing Assumptions of Traditional Capital Budgeting Approaches
BVFj D BVF0 C
t X
affected by the more complicated assumption. Second, the framework we use is generally consistent with the implicit risk assumptions that Miles and Ezzell (1980) use to evaluate a finite-life project. This simplification is for illustrative purposes. However, Beaver and Ryan (1993) empirically demonstrate that the book value of a firm’s assets is positively correlated with lagged values of its market value. Thus, “t D ›t1 is an appropriate assumption. Accordingly, the certainty equivalent of BVFt ; CE.BVFt / is:
™CFF0 .1 C g/j for t n
jD1
and BVFj D BVF0 C
n X
™CFF0 .1 C g/j for t > n:
(80.7)
jD1
Given our assumption regarding the change in book-value through time, the risk associated with the book-value at time t equals the risk associated with the historical levels of net retention. Hence, the certainty equivalent coefficient factor for BVft should be a weighted average of ›t where the weights reflect the historical contribution of previous retentions. We chose to greatly simplify this assumption for three reasons. First, the general implication of our paper would not be
VL D
1 X
›t XFt =.1 C R/t C
tD1
1 X
1227
CE.BV/Ft D ›t1 BVFt for t n and CE.BV/Ft D ›n BVFt for t > n
(80.8)
The theoretical value of the firm is therefore:
£ ’ ›t1 R BVFt1 =.1 C R/t
tD1
D XF =.R G/ XF .1 C G/n =Œ.R G/.1 C R/n C XF .1 C G/n =ŒR.1 C R/n ( C£’
n X
) R›
t1
BVFt1 /=.1 C R/ C R › t
n1
BVFn =ŒR.1 C R/ n
(80.9)
tD1
where XF is the expected unlevered cash flow at t D 1, and G D › C g› 1. The first three expressions following the second equal sign of Equation (80.9) equal the value of the unlevered cash flow. The last two expressions are the present value of the tax benefits of debt. Equation (80.9) incorporates
our assumption that the firm’s operating cash flows grows at rate g until t D n and thereafter the growth rate is zero. As a result, the interest payments of the firm grows at rate g until t D n C 1 and remains constant thereafter. Since by › D .1 C Rf /=.1 C Keu /, we can rewrite Equation (80.9) as:
VL D XF =.Keu g/ XF .1 C g/ =Œ.Keu g/.1 C Keu / C XF .1 C g/ =ŒKeu .1 C Keu / n
( C ˛
n X
n
R BVFt1 / =Œ.1 C R/.1 C Keu /
t1
n
C BVFn = .1 C R/.1 C Keu /n1
n
) (80.10)
tD1
As Brick and Weaver (1997) demonstrate, setting VL to the actual market value of the firm, one can use a numerical iterative technique to find the correct Keu . The above expressions are based on simplifying assumptions that allow us to derive the cost of equity of the unlevered firm. Changing the assumptions that we make regarding the dividend policy, the frequency of rebalancing, the debt-equity mix, and cash flow
growth rate would change the formulation of the firm’s value but not the broad implications of our results. Analysts typically make assumptions concerning cash flow growth patterns. Once this is done, a real world analysis can easily be done on a spreadsheet. We would then use the derived Keu obtained in Equation (80.10) and substitute that value for KeuME of Equation (80.4) to find the NPV of the project.
1228
I.E. Brick and D.G. Weaver
Table 80.2 This Table delineates the firm’s cash flow and book values of assets and debt based upon the assumptions given in Sect. 80.4 Firm’s operating Unlevered cash Time cash flow flow Book value Debt Interest (1)
(2)
(3)
(4)
(5)
(6)
0 1 2 3 4 5 6 7 8 9 10 11 1
$166,666,666.67 $183,333,333.33 $201,666,666.67 $221,833,333.33 $244,016,666.67 $268,418,333.33 $295,260,166.67 $324,786,183.33 $357,264,801.67 $392,991,281.83 $392,991,281.83
$100,000,000.00 $110,000,000.00 $121,000,000.00 $133,100,000.00 $146,410,000.00 $161,051,000.00 $177,156,100.00 $194,871,710.00 $214,358,881.00 $235,794,769.10 $235,794,769.10
$800,000,000.00 $866,666,666.67 $940,000,000.00 $1,020,666,666.67 $1,109,400,000.00 $1,207,006,666.67 $1,314,374,000.00 $1,432,478,066.67 $1,562,392,540.00 $1,705,298,460.67 $1,862,494,973.40 $1,862,494,973.40
$320,000,000.00 $346,666,666.67 $376,000,000.00 $408,266,666.67 $443,760,000.00 $482,802,666.67 $525,749,600.00 $572,991,226.67 $624,957,016.00 $682,119,384.27 $744,997,989.36 $744,997,989.36
$32,000,000.00 $34,666,666.67 $37,600,000.00 $40,826,666.67 $44,376,000.00 $48,280,266.67 $52,574,960.00 $57,299,122.67 $62,495,701.60 $68,211,938.43 $74,499,798.94
80.4 An Example Assume that › D 0:973451327 and R D :1. Since › D .1 C R/=.1 C Keu /, then the assumed value of › implies that Keu D :13. Assume further that XF D $100 million, g D 10%; n D 10, and £ D :4 Further assume that ™ D :4 and the initial book value of the firm is $800 million. Assume that the target debt to total assets (book value) is 40%. Table 80.2 summarizes the cash flow implications of the above parameters. The original level of debt is 40% of the book value at time zero, implying that the firm borrows $320 million to finance this firm. Accordingly, the level of interest payments, assuming that the cost of borrowing is 10%, is $32 million at t D 1. Note that at t D 1, the expected after- tax operating cash flow is $166.67 million (see column 2). The investment rate of 40% implies that the firm will increase its book value by $66.67 million (see column 4). Further note that the cash flow grows at 10% per year until and including year 10. Thus, the expected unlevered cash flow of the firm and the expected book value of the firm at t 11 are identical to that in year 10.11 Using Equation (80.9) we can calculate the equilibrium value of the firm, which is $1,498,571,279.22. The value of equity is found by reducing the value of the firm by $320 million of debt outstanding at t D 0. Thus, the value of equity is $1,178,571,279.22. Note also that if we were to use Equation (80.10), set the value of the firm, VL , equal to $1,498,571,279.22, and solve for Keu , we would find it to be equal to 13%.
11
In actuality, the cash flow at time 11 should be much greater $235,794,769.10 since the investment rate of the firm as a percentage of the operating cash flows is equal to the depreciation rate and less than ™. For simplicity, we assume that the unlevered cash flow and the book value for t 11 are identical to that of t D 10. Changing these assumptions will not affect the implications of our results.
Table 80.3 The implied cash flow to stockholders of the example delineated in Table 80.2 Time CFstock 0 1 2 3 4 5 6 7 8 9 10 11 1
$107,466,666.67 $118,533,333.33 $130,706,666.67 $144,097,333.33 $158,827,066.67 $175,029,773.33 $192,852,750.67 $212,458,025.73 $234,023,828.31 $257,746,211.14 $191,094,889.74
Using the above table we can find the ATWACOC. First note that L D $320; 000; 000=$1; 498; 571; 279:22 or 21.35%. Second, we need to compute the cash flow to stockholders for each year and find the interest rate that equates the present value of the cash flow to stockholders to the value of equity. Note that the cash flow to stockholders at time t is equal to CFstock;t D XFt .1 £/Interestt .Debtt1 Debtt / (80.11) Table 80.3 delineates the cash flow to stockholders of our example as set in Table 80.2. The discount rate that equates the present value of the cash flow to stockholder to the value of equity of $1,178,571,279.22 is the cost of equity of the firm. In this case, it is equal to 14.04%. We can then compute the ATWACOC, which equals L.1 £/Rf C .1 L/Ke , or in our case, it is approximately 12.323%. Interestingly, if we were to discount the unlevered cash flow by
80 A Primer on the Implicit Financing Assumptions of Traditional Capital Budgeting Approaches
our computed ATWACOC, we would obtain a firm value of $1,410,410,431.95, an error of 5.89%. In order to evaluate the efficacy of the traditional capital budgeting approaches when firms maintain constant book value-weights, as opposed to market value weights, we consider the following capital budgeting example. The firm has a project whose cash flow for years 1–10 is $2.5 million per year and these cash flows are as uncertain as the unlevered cash flow of the firm described in Table 80.2. We also assume that the initial book value of the project is $10 million and it depreciates on a straight line basis to zero in ten years. For simplicity, assume that the book value is as uncertain as the firm’s cash flow. In truth, this may not be the case and if the firm is not making any more investment in this project throughout the life of the project, the book value in any given year is really certain. But for now, let us maintain the uncertainty assumption so that the risk profile of debt level in any given period is similar to that assumed by the ATWACOC and Miles and Ezzell (1980) approaches. Table 80.4 summarizes the implicit financing assumptions of the various capital budgeting approaches. As in Table 80.1, the level of financing assumed by both the ATWACOC and the Miles and Ezzell (1980) approaches assume that the level of debt in every period is rebalanced such as it always represents 21.35% of the market value of the project. But in reality, the firm is maintaining a level of debt that is 40% of the firm’s book value. Note that according to Equation (80.3) the cost of equity of the unlevered firm is 13.202% while the ATWACOC is 12.323% (as calculated previously). Therefore, according to the ATWACOC and Miles and Ezzell (1980) approaches, the NPV of the project is $3,940,640.34. The Brick and Weaver (1997) approach indicates that the cost of equity of the unlevered firm is actually 13%, yielding the true NPV of $4,143,885.04. Accordingly, the traditional textbook
approaches to capital budgeting result in an error of 4.9% of the true value. The actual error might be even larger if the book value is known with certainty, as previously noted, since the tax benefits of debt would then be discounted only by the cost of debt, yielding a higher NPV of $4,182,477.95.
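The procedure described above lends itself to a short numerical check. The following sketch (ours, not the authors'; any spreadsheet would do the same) backs out the cost of equity as the rate that equates the present value of the Table 80.3 cash flows to stockholders to the value of equity, and then forms the ATWACOC. The 10% riskless rate is our assumption, chosen to be consistent with the reported ATWACOC of approximately 12.323%.

```python
# Back out the cost of equity from the Table 80.3 cash flows to stockholders,
# then compute the ATWACOC = L(1 - tau)R_f + (1 - L)K_e. A sketch only;
# the riskless rate rf below is an assumption, not stated in the text.

cf_stock = [107_466_666.67, 118_533_333.33, 130_706_666.67, 144_097_333.33,
            158_827_066.67, 175_029_773.33, 192_852_750.67, 212_458_025.73,
            234_023_828.31, 257_746_211.14]          # years 1-10
cf_perpetual = 191_094_889.74                        # level from year 11 on
equity_value = 1_178_571_279.22
L, tau, rf = 0.2135, 0.40, 0.10                      # rf assumed

def pv_to_stockholders(r):
    """PV of the finite cash flows plus the year-11 perpetuity."""
    pv = sum(cf / (1 + r) ** t for t, cf in enumerate(cf_stock, start=1))
    return pv + (cf_perpetual / r) / (1 + r) ** 10

# Bisection: PV is decreasing in r, so raise the lower bound while PV is too high.
lo, hi = 0.01, 0.50
for _ in range(100):
    mid = (lo + hi) / 2
    if pv_to_stockholders(mid) > equity_value:
        lo = mid
    else:
        hi = mid
ke = (lo + hi) / 2
atwacoc = L * (1 - tau) * rf + (1 - L) * ke
print(f"cost of equity ~ {ke:.4%}, ATWACOC ~ {atwacoc:.4%}")  # ~14.04%, ~12.32%
```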
80.5 Personal Tax and Miller Equilibrium

The previous sections assume no personal taxes. One might expect that asset prices and the resulting expected yields are related to the personal tax liabilities incurred by investors for income obtained from holding financial securities. For example, yields of municipal bonds are lower than the yields of corporate bonds due to the income tax exemption that municipal bonds offer investors. The tax liability for equity income can be much lower than that of debt income for several reasons. First, corporate interest income is taxed at the normal tax bracket of the individual investor, while capital gains and dividend income are taxed at a maximum of 15% (as of December 2007). Second, investors can postpone their capital gains liability as long as they do not sell their investment. Third, investors can use the inherent tax-timing option of their equity to further reduce the expected marginal tax rate of holding equity (Constantinides 1983, 1984).12 Miller (1977) demonstrates that the tax benefit gain from leverage is related to the difference between the corporate tax advantage of having debt and the personal tax advantage for investors to
12 Actually, there are also valuable tax-timing options for long-term bonds. However, the value of these options increases with the volatility of the underlying security, and equity is more volatile than debt.
Table 80.4 This Table delineates the implicit financing assumptions of the project of Table 80.1, assuming that the firm maintains a constant market value or book value leverage ratio

Time | Unlevered cash flow of the project | Market value using the ATWACOC | Level of debt using the ATWACOC and Miles and Ezzell | Book value of project | Level of debt if maintaining a constant 40% book value weight
(1) | (2) | (3) | (4) | (5) | (6)
0 | $10,000,000.00 | $13,643,326.43 | $3,060,016.33 | $10,000,000.00 | $4,000,000.00
1 | $2,500,000.00 | $12,897,490.63 | $2,892,735.30 | $9,000,000.00 | $3,600,000.00
2 | $2,500,000.00 | $12,055,760.44 | $2,703,946.43 | $8,000,000.00 | $3,200,000.00
3 | $2,500,000.00 | $11,105,806.43 | $2,490,884.40 | $7,000,000.00 | $2,800,000.00
4 | $2,500,000.00 | $10,033,713.92 | $2,250,428.34 | $6,000,000.00 | $2,400,000.00
5 | $2,500,000.00 | $8,823,779.20 | $1,979,056.10 | $5,000,000.00 | $2,000,000.00
6 | $2,500,000.00 | $7,458,279.47 | $1,672,792.71 | $4,000,000.00 | $1,600,000.00
7 | $2,500,000.00 | $5,917,213.27 | $1,327,152.09 | $3,000,000.00 | $1,200,000.00
8 | $2,500,000.00 | $4,178,007.48 | $937,071.41 | $2,000,000.00 | $800,000.00
9 | $2,500,000.00 | $2,215,186.68 | $496,836.86 | $1,000,000.00 | $400,000.00
10 | $2,500,000.00 | $0.00 | $0.00 | $0.00 | $0.00
hold equity. In particular, Miller finds that the tax gain from leverage, G_L, is given by 1 − (1 − T)(1 − T_pe)/(1 − T_pd), where T is the corporate tax rate, T_pe is the marginal personal tax rate on equity income, and T_pd is the marginal personal tax rate on debt income. Note that if T_pe = T_pd, then the gain from leverage is identical to that found in the Modigliani and Miller (1963) framework. With personal taxes, Taggart (1991) shows that the theoretical value of the firm is given by:

V_L = Σ_{t=1}^{∞} κ^t XF_t / (1 + R_fe)^t + Σ_{t=1}^{∞} G_L β^t R Debt_{t−1} / (1 + R_fe)^t   (80.12)

where R_fe is the pre-personal-tax riskless return on equity securities.13 In equilibrium, the after-personal-tax return on equity must equal the after-personal-tax return on debt, or R_fd(1 − T_pd) = R_fe(1 − T_pe), where R_fd is the pre-personal-tax riskless return on debt securities. Consider the special case where a Miller (1977) equilibrium holds, where G_L = 0. In this case, the tax benefits of debt are zero and, consequently, the financing decision is irrelevant.14 Theoretically, the after-tax weighted average cost of capital should not be affected by the financing mix chosen by the firm. In this case, as long as the firm maintains a constant market value leverage ratio, the ATWACOC approach will provide an accurate assessment of the NPV of
13 Sick (1990) derives an analogous valuation model.
14 DeAngelo and Masulis (1980) show that the Miller equilibrium will not result in an economy in which firms can lose the tax deductibility of interest when taxable income is negative. Moreover, Miller and Scholes (1978) show how individuals can defer the dividend tax such that investors are indifferent between receiving dividends and capital gains. The Miller equilibrium should then collapse to the Modigliani and Miller (1963) framework, assuming one can do the same with interest income. Thus, for simplicity, we examine the efficacy of the capital budgeting framework for two cases: when G_L = τ and when G_L = 0.
the project. To illustrate this point, consider the case in which the personal tax on equity income is zero while the marginal personal tax rate on debt income is 40%. Let R_fe = 5%, implying that R_fd = 8.333%. Assume that κ = 0.92920354, implying again that K_eu = 13%.15 Assume further, as we did in Table 80.2, that XF = $100 million, g = 10%, n = 10, and τ = 0.4. Further assume that θ = 0.4 and the initial book value of the firm is $800 million. Using Equation (80.12), we can deduce the equilibrium value of the firm, which, given our parametric assumptions, equals $1,320,705,066.04. Finally, we will set the leverage rate so that the initial level of debt is approximately equal to 40% of the book value. Hence, let L = 24.23%. Table 80.5 summarizes the cash flow implications of the above parameters. The market value of the firm at time t is obtained by using Equation (80.12) to find the present value of cash flows from t + 1 to ∞ back to time t. Note that the level of debt is maintained such that it is always 24.23% of the value of the firm. The value of equity at time zero is found by subtracting the value of debt, $320,006,837.50, from the initial value of the firm. Consequently, the value of equity is now $1,000,698,228.54. The implicit cash flow to stockholders is given in Table 80.6. The interest rate that equates the present value of the cash flow to stockholders depicted by Table 80.6 to the value of equity of $1,000,698,228.54 is 15.558%, implying that the ATWACOC is 13.00%. This implies that if the firm finances all projects such that the level of debt financing is always 24.23% of the value of the project, the ATWACOC will yield the correct NPV. Since the ATWACOC does not require knowledge of G_L, the ATWACOC approach is the superior capital budgeting technique because of its accuracy and ease of use. The
15 That is because the certainty equivalent factor is given by κ = (1 + R_fe)/(1 + K_eu).
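As a check on the numbers above, the following sketch (ours; the parameters are those stated in this section) evaluates Equation (80.12) under the Miller equilibrium, where G_L = 0 and the second summation vanishes. Because κ = (1 + R_fe)/(1 + K_eu), each term κ^t XF_t/(1 + R_fe)^t collapses to XF_t/(1 + K_eu)^t, so the firm value is simply the unlevered cash flows discounted at K_eu = 13%.

```python
# Equation (80.12) with G_L = 0: unlevered cash flows grow 10% per year for
# ten years, then stay level forever. A sketch of the valuation, not the
# authors' spreadsheet.

keu = 0.13                                                     # cost of unlevered equity
xf = [100_000_000.00 * 1.1 ** (t - 1) for t in range(1, 11)]   # years 1-10

pv_finite = sum(c / (1 + keu) ** t for t, c in enumerate(xf, start=1))
pv_terminal = (xf[-1] / keu) / (1 + keu) ** 10   # level perpetuity from year 11
firm_value = pv_finite + pv_terminal
print(f"firm value ~ ${firm_value:,.2f}")        # ~ $1,320,705,066
```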
Table 80.5 This Table summarizes the example based upon the assumptions of Sect. 80.5 under the Miller (1977) equilibrium and assuming that the firm maintains a constant market value leverage ratio

Time | Firm's operating cash flow | Unlevered cash flow | Market value of firm | Book value of assets | Debt
(1) | (2) | (3) | (4) | (5) | (6)
0 | $0.00 | $0.00 | $1,320,705,066.04 | $800,000,000.00 | $320,006,837.50
1 | $166,666,666.67 | $100,000,000.00 | $1,392,396,724.63 | $866,666,666.67 | $337,377,726.38
2 | $183,333,333.33 | $110,000,000.00 | $1,463,408,298.83 | $940,000,000.00 | $354,583,830.81
3 | $201,666,666.67 | $121,000,000.00 | $1,532,651,377.68 | $1,020,666,666.67 | $371,361,428.81
4 | $221,833,333.33 | $133,100,000.00 | $1,598,796,056.78 | $1,109,400,000.00 | $387,388,284.56
5 | $244,016,666.67 | $146,410,000.00 | $1,660,229,544.16 | $1,207,006,666.67 | $402,273,618.55
6 | $268,418,333.33 | $161,051,000.00 | $1,715,008,384.90 | $1,314,374,000.00 | $415,546,531.66
7 | $295,260,166.67 | $177,156,100.00 | $1,760,803,374.93 | $1,432,478,066.67 | $426,642,657.75
8 | $324,786,183.33 | $194,871,710.00 | $1,794,836,103.68 | $1,562,392,540.00 | $434,888,787.92
9 | $357,264,801.67 | $214,358,881.00 | $1,813,805,916.15 | $1,705,298,460.67 | $439,485,173.48
10 | $392,991,281.83 | $235,794,769.10 | $1,813,805,916.15 | $1,862,494,973.40 | $439,485,173.48
11–∞ | $392,991,281.83 | $235,794,769.10 | $1,813,805,916.15 | $1,862,494,973.40 | $439,485,173.48
Table 80.6 The cash flow to stockholders for the example described in Table 80.5

Time | Implied interest expense | CFstock
1 | $26,667,236.46 | $101,370,547.00
2 | $28,114,810.53 | $110,337,218.11
3 | $29,548,652.57 | $120,048,406.46
4 | $30,946,785.73 | $130,558,784.30
5 | $32,282,357.05 | $141,925,919.76
6 | $33,522,801.55 | $154,210,232.18
7 | $34,628,877.64 | $167,474,899.50
8 | $35,553,554.81 | $181,785,707.29
9 | $36,240,732.33 | $197,210,827.17
10 | $36,623,764.46 | $213,820,510.43
11–∞ | $36,623,764.46 | $213,820,510.43
APV approach will also yield an identical answer as long as the manager knows that G_L = 0.16 But in general, managers do not observe the value of G_L. If the firm does maintain a constant market-value-based leverage ratio, then only the ATWACOC will yield an unbiased estimate of the NPV, because it is the only method that does not require knowledge of G_L.17 However, if firms maintain a constant book value leverage ratio, then none of the approaches described in this paper will give an accurate estimate of the NPV of the project. To illustrate, consider our example again with the same assumptions as above in this section, except now we will assume that the firm maintains a 40% book value leverage ratio. Assume that G_L = 0, but the managers assume that it is equal to τ, which equals 40%.18 Table 80.7 presents the relevant parameters with the 40% book value leverage assumption. Please note that the levels of debt at time zero in Tables 80.5 and 80.7 are almost identical. Hence, our results are not affected by assuming very different initial levels of debt. Rather, our results differ because the debt rebalancing assumptions are different for the two tables. The value of the firm is $1,320,705,066.04, as it was for the example depicted in Table 80.5. The value of equity, $1,000,705,066.04, is slightly different, since the debt level of Table 80.7 is almost $7,000 less than the
16 Using Equation (80.3) and substituting G_L (which equals zero) for τ, one obtains that R_eu^ME = 13%.
17 The Miles and Ezzell (1980) framework also will not work, since the framework would require the estimation of the cost of equity of the unlevered firm. If the manager assumes that G_L = τ when in fact the gain from leverage is zero, then Equation (80.3) will give the wrong estimate for the cost of equity of the unlevered firm. In particular, assuming G_L = τ, one obtains that R_eu^ME = 13.525%.
18 Essentially, a manager does not observe R_fe, or believes that R_fe is equal to the yield of default-free debt. Ruback (2002) suggests that R_fe is the return of the zero-beta portfolio. The problem is that the return of the zero-beta portfolio can be estimated ex post but it is not easy to do so ex ante.
debt level in Table 80.5. Accordingly, the discount rate that equates the present value of the cash flow to stockholders as summarized in column 7 of Table 80.7 to the firm's value of equity ($1,000,705,066.04) is 16.6357%, and the resulting ATWACOC is 13.8164%. Now consider our project of Table 80.4. In particular, the firm is considering an investment of $10 million. The life of the asset is 10 years and the unlevered cash flow is $2.5 million per year. Using Equation (80.1), NPV(ATWACOC) = $3,134,251.83. Recall that the true ATWACOC should be equal to 13% and therefore the true NPV = $3,565,608.69. Hence, the percentage error using the ATWACOC in this case is 12.10%. To calculate the APV using the Brick and Weaver (1997) methodology, we would have to set V_L of Equation (80.10) to $1,320,705,066.04 and back out the cost of equity of the unlevered firm. Recall that the manager mistakenly believes that R_fe = R, which equals 8.3333%, and that G_L = τ. Using Equation (80.10), we find that the cost of equity of the unlevered firm is 14.342%. Accordingly, using our approach, the NPV of the project is $3,439,511.06. That is, our approach will also give an inaccurate answer for the NPV if one mistakenly believes that the gain from leverage is given by the Modigliani and Miller (1963) framework (i.e., G_L = τ) when in truth it is zero. The error in this case is 3.54%, or less than two-thirds the error obtained using the traditional ATWACOC.
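The size of the error is easy to reproduce. The sketch below (ours, not the authors' spreadsheet) prices the $10 million project as a ten-year, $2.5 million annuity at the mis-estimated ATWACOC of 13.8164% and at the true rate of 13%.

```python
# NPV of a level annuity net of the initial investment, evaluated at the
# mis-estimated and the true discount rates quoted in the text.

def npv(rate, investment=10_000_000.0, cash_flow=2_500_000.0, years=10):
    """NPV = -investment + cash_flow * annuity factor at `rate`."""
    annuity = (1 - (1 + rate) ** -years) / rate
    return -investment + cash_flow * annuity

npv_wrong = npv(0.138164)   # ~ $3,134,252
npv_true = npv(0.13)        # ~ $3,565,609
error = (npv_true - npv_wrong) / npv_true
print(f"{npv_wrong:,.0f} vs {npv_true:,.0f}; error ~ {error:.2%}")  # ~12.1%
```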
80.6 Conclusion

Calculating the cost of capital for use in risk-adjusted discount capital budgeting procedures generally assumes that the level of debt is either a constant or a constant percentage of the (market) value of the project or firm. However, firms generally maintain a debt-equity financing mix based on a target "book-value" debt-to-asset ratio rather than a "market-value" based leverage ratio, namely, the debt-to-total-firm-value ratio. We show that if firms do maintain a constant book-value debt-to-asset ratio, then the use of capital budgeting procedures based on market-value ratios yields substantial errors in estimating the true net present value of the project. In this paper, we present the methodology of Brick and Weaver (1997) to correctly estimate the cost of equity of an unlevered firm assuming that the firm maintains a constant book-value leverage ratio. Assuming that the tax benefit of debt is purely a function of the corporate tax rate, we show that the correct cost of equity of the unlevered firm can easily be calculated using a spreadsheet. This estimate can then be used as the appropriate discount rate to accurately estimate project net present value using an APV framework, resulting in correct project selection.
Table 80.7 This Table summarizes the example set forth in Sect. 80.5 assuming the Miller (1977) equilibrium and that the firm maintains a constant book value leverage ratio

Time | Firm's operating cash flow | Unlevered cash flow | Book value of the firm | Debt | Interest | CFstock
(1) | (2) | (3) | (4) | (5) | (6) | (7)
0 | $0.00 | $0.00 | $800,000,000.00 | $320,000,000.00 | – | –
1 | $166,666,666.67 | $100,000,000.00 | $866,666,666.67 | $346,666,666.67 | $26,666,666.67 | $110,666,666.67
2 | $183,333,333.33 | $110,000,000.00 | $940,000,000.00 | $376,000,000.00 | $28,888,888.89 | $122,000,000.00
3 | $201,666,666.67 | $121,000,000.00 | $1,020,666,666.67 | $408,266,666.67 | $31,333,333.33 | $134,466,666.67
4 | $221,833,333.33 | $133,100,000.00 | $1,109,400,000.00 | $443,760,000.00 | $34,022,222.22 | $148,180,000.00
5 | $244,016,666.67 | $146,410,000.00 | $1,207,006,666.67 | $482,802,666.67 | $36,980,000.00 | $163,264,666.67
6 | $268,418,333.33 | $161,051,000.00 | $1,314,374,000.00 | $525,749,600.00 | $40,233,555.56 | $179,857,800.00
7 | $295,260,166.67 | $177,156,100.00 | $1,432,478,066.67 | $572,991,226.67 | $43,812,466.67 | $198,110,246.67
8 | $324,786,183.33 | $194,871,710.00 | $1,562,392,540.00 | $624,957,016.00 | $47,749,268.89 | $218,187,938.00
9 | $357,264,801.67 | $214,358,881.00 | $1,705,298,460.67 | $682,119,384.27 | $52,079,751.33 | $240,273,398.47
10 | $392,991,281.83 | $235,794,769.10 | $1,862,494,973.40 | $744,997,989.36 | $56,843,282.02 | $264,567,404.98
11–∞ | $392,991,281.83 | $235,794,769.10 | $1,862,494,973.40 | $744,997,989.36 | $62,083,165.78 | $198,544,869.63
However, given personal taxes, the gain from leverage will be a function of both the corporate tax rate and the marginal personal tax rates on debt and equity securities. Unfortunately, managers cannot observe the true gain from leverage. If their estimate of the gain from leverage is incorrect, then the ability of the capital budgeting methods to yield an accurate estimate of the net present value will depend on whether firms rebalance their leverage ratio based on the market value or the book value of the assets. If the firm maintains a constant market-based leverage ratio, then the ATWACOC will always yield the correct NPV. But if the firm is maintaining a constant target book value leverage ratio, our approach will be the most accurate. In future work, we will explore how to extend our framework in the event that a firm can default on its debt obligations. This paper demonstrates that the capital budgeting methods are sensitive to the assumed financing policies of the firm. The major textbooks also take into account the implicit real options of capital budgeting projects. For example, textbooks use an option framework to examine the implicit benefits of guarantees or the ability to abandon projects. The traditional capital budgeting approaches as discussed in this paper do not take into account real options. As a result, the traditional capital budgeting approaches of constant risk-adjusted discount rates produce a downward-biased estimate of the NPV of a project if the project has a real option component. The real option methodology suggested by textbooks implicitly assumes that the debt capacity of such options is zero. Whether or not this is a correct assumption or a matter of importance is a subject for future research.

Acknowledgments The authors have benefited from the comments of Cheng-Few Lee.
References

Arditti, F. D. and H. Levy. (1977). "The weighted average cost of capital as a cutoff rate: a critical examination of the classical textbook weighted average." Financial Management (Fall 1977), 6, 24–34.
Arzac, E. and L. Glosten. (2005). "A reconsideration of tax shield valuation." European Financial Management (September 2005), 11, 453–461.
Beaver, W. H. and S. G. Ryan. (1993). "Accounting fundamentals of the book-to-market ratio." Financial Analysts Journal (November–December 1993), 49, 50–56.
Beranek, W. (1977). "The weighted average cost of capital and shareholder wealth maximization." Journal of Financial and Quantitative Analysis (March 1977), 12, 17–31.
Beranek, W. (1978). "Some new capital budgeting theorems." Journal of Financial and Quantitative Analysis (December 1978), 809–823.
Bernhard, R. (1978). "Some new capital budgeting theorems: comment." Journal of Financial and Quantitative Analysis (December 1978), 13, 825–829.
Berk, J. and P. DeMarzo. (2007). Corporate finance. Pearson Addison Wesley, Boston.
Booth, L. (2002). "How to find value when none exists: pitfalls in using APV and FTE." Journal of Applied Corporate Finance (Spring 2002), 15, 8–17.
Booth, L. (2007). "Capital cash flows, APV and valuation." European Financial Management (January 2007), 13, 29–48.
Boudreaux, K. J. and R. W. Long. (1979). "The weighted average cost of capital as a cutoff rate: a further analysis." Financial Management (Summer 1979), 8, 7–14.
Brealey, R., S. Myers, and F. Allen. (2007). Principles of corporate finance, 9th edn. McGraw Hill, New York.
Brick, I. E. and D. G. Weaver. (1984). "A comparison of capital budgeting techniques in identifying profitable investments." Financial Management (Winter 1984), 13, 29–39.
Brick, I. E. and D. G. Weaver. (1997). "Calculating the cost of capital of an unlevered firm for use in project evaluation." Review of Quantitative Finance and Accounting 9, 111–129.
Bruner, R. F., K. M. Eades, R. S. Harris, and R. C. Higgins. (1998). "Best practices in estimating the cost of capital: survey and synthesis." Financial Practice and Education (Spring/Summer 1998), 8, 13–28.
Chambers, D. R., R. S. Harris, and J. J. Pringle. (1982). "Treatment of financing mix in analyzing investment opportunities." Financial Management (Summer 1982), 11, 24–41.
Constantinides, G. M. (1983). "Capital market equilibrium with personal tax." Econometrica 51, 611–636.
Constantinides, G. M. (1984). "Optimal stock trading with personal taxes: implications for prices and the abnormal January returns." Journal of Financial Economics 13, 65–89.
Cooper, I. A. and K. G. Nyborg. (2006). "The value of tax shields is equal to the present value of the tax shields." Journal of Financial Economics 81, 215–225.
DeAngelo, H. and R. W. Masulis. (1980). "Optimal capital structure under corporate and personal taxation." Journal of Financial Economics 8, 3–39.
Farber, A., R. Gillet, and A. Szafarz. (2006). "A general formula for the WACC." International Journal of Business 11, 211–218.
Fernandez, P. (2004). "The value of tax shields is not equal to the present value of tax shields." Journal of Financial Economics 73, 145–165.
Frecka, T. J. and C. F. Lee. (1983). "Generalized financial ratio adjustment processes and their implications." Journal of Accounting Research (Spring 1983), 21, 308–316.
Gitman, L. (2007). Principles of managerial finance, 12th edn. Pearson Prentice Hall, Boston.
Graham, J. R. and C. R. Harvey. (2002). "How do CFOs make capital budgeting and capital structure decisions?" Journal of Applied Corporate Finance (Spring 2002), 15, 8–23.
Kaplan, S. N. and R. Ruback. (1995). "The valuation of cash flow forecasts: an empirical analysis." Journal of Finance (September 1995), 50, 1059–1093.
Lee, C. F. and C. Wu. (1988). "Expectation formation and financial ratio adjustment processes." The Accounting Review (April 1988), 63, 292–306.
Lev, B. (1969). "Industry averages as targets for financial ratios." Journal of Accounting Research (Autumn 1969), 7, 290–299.
Marsh, P. (1982). "The choice between equity and debt: an empirical study." Journal of Finance (March 1982), 37, 121–144.
Miles, J. and R. Ezzell. (1980). "The weighted average cost of capital, perfect capital markets, and project life: a clarification." Journal of Financial and Quantitative Analysis (September 1980), 15, 719–730.
Miller, M. (1977). "Debt and taxes." Journal of Finance (May 1977), 32, 261–275.
Miller, M. and M. Scholes. (1978). "Dividends and taxes." Journal of Financial Economics 6, 333–364.
Modigliani, F. and M. Miller. (1963). "Corporate income taxes and the cost of capital: a correction." American Economic Review (June 1963), 53, 433–443.
Myers, S. C. (1974). "Interactions of corporate financing and investment decisions – implications for capital budgeting." Journal of Finance (March 1974), 29, 1–25.
Robichek, A. and S. C. Myers. (1966). "Risk-adjusted discount rates." Journal of Finance (December 1966), 21, 727–730.
Ross, S., R. Westerfield, and B. Jordan. (2008). Fundamentals of corporate finance, 8th edn. McGraw Hill Irwin, New York.
Ruback, R. (2002). "Capital cash flows: a simple approach to valuing risky cash flows." Financial Management (Summer 2002), 31, 85–103.
Sick, G. A. (1990). "Tax adjusted discount rates." Management Science (December 1990), 36, 1432–1450.
Smith, C. Jr. and J. B. Warner. (1979). "On financial contracting: an analysis of bond covenants." Journal of Financial Economics (June 1979), 7, 117–161.
Taggart, R. A. (1991). "Consistent valuation and cost of capital expressions with corporate and personal taxes." Financial Management (Autumn 1991), 20, 8–20.
Chapter 81
Determinants of Flows into U.S.-Based International Mutual Funds

Dilip K. Patro
Abstract The last few decades have witnessed a dramatic growth of U.S.-based mutual funds that invest in non-U.S. stock markets. This paper provides a comprehensive analysis of flows into these international mutual funds for 1970–2003. Our analysis uncovers several new facts about mutual fund flows. First, the empirical findings show a strong relationship between flows into U.S.-based international mutual funds and the correlation between the returns of the fund's assets and the returns of the U.S. market, consistent with investors' desire for international diversification. Furthermore, a stronger flow-performance relationship is observed when these correlations are low. As expected, the flows are lower when the volatility of the fund is higher. Second, the flows are related to contemporaneous and past fund returns, supporting an "information asymmetry" as well as a "return chasing" hypothesis for international capital flows.

Keywords International capital flows · International mutual funds
81.1 Introduction

This paper analyzes flows into U.S. mutual funds that invest in international (non-U.S.) stock markets. While there were fewer than 20 such funds prior to 1980, there are currently almost 1,500 such funds investing all across the globe. This dramatic growth is the result of increased globalization of capital markets, reduction of cross-border investment barriers, and perhaps an increased awareness of the potential benefits of international investments. Despite this dramatic growth of international mutual funds, there is little empirical evidence on what factors drive flows into and out of these funds and what determines U.S. households' (investors')
D.K. Patro, Risk Analysis Division, Office of the Comptroller of the Currency (OCC), Washington, DC 20219, USA. e-mail: [email protected]
decision to purchase international funds. The objective of this paper, therefore, is to provide a comprehensive analysis of the flows into U.S.-based international mutual funds during 1970–2003. The relationship between mutual fund flows, performance, and fund characteristics has been studied for diversified domestic equity mutual funds (see, for example, Gruber 1996; Chevalier and Ellison 1997; Sirri and Tufano 1998; Barber et al. 2005, among others). Apart from their tremendous growth in numbers and size, examining flows into international mutual funds is important and interesting because it allows us to examine and answer some important questions about U.S. investor behavior regarding investment in foreign markets and international capital flows.1 It is often suggested that investors should hold an internationally diversified portfolio. As a result of the low correlations between international equity markets, investors could potentially improve their reward-to-risk ratio by investing globally (see Levy and Sarnat 1970; De Santis and Gerard 1997; Solnik 2004). However, there is no direct evidence that investors follow this strategy. Therefore, we hypothesize and test if flows into international mutual funds are driven by an "international diversification motive." As Errunza et al. (1999) demonstrate, U.S. investors do not need to directly trade abroad to benefit from global diversification. Investing in mutual funds is another way of achieving that objective. We also examine how the flow-performance relationship varies for different levels of correlation between the funds' assets and the U.S. market. To our knowledge, this is one of the first studies to examine if U.S. households' decisions to buy international mutual funds are driven by a desire for international diversification. Unlike flows into domestic funds, flows into international mutual funds are also cross-border capital flows and therefore have implications for understanding determinants of international capital flows. Thus, the second set of hypotheses
1 Using a sample of international mutual funds also gives us a new and independent sample to reexamine the flow-performance relationship for mutual funds.
tested is drawn from the predictions of models of international capital flows such as Bohn and Tesar (1996) and Brennan and Cao (1997). The hypothesis tested, based on the predictions of these studies, is that international mutual fund flows are related to past and current returns of the fund.2 This also builds on the empirical findings of these as well as other papers such as Tesar and Werner (1994, 1995), Bekaert and Harvey (2000), and Froot et al. (2001). The analysis complements the findings of these studies, which focus on flows of institutional investors, by providing results on the behavior of small investors regarding international investments.3 Since most mutual fund investors are small investors, this is also one of the first papers to examine the portfolio choices of small investors when it comes to foreign investments. The results from flows of small investors could potentially differ from results from flows of institutions, since there is growing evidence that the behavior of institutional and small investors differs (see, for example, Grinblatt and Keloharju 2000). The empirical results support many of our hypotheses, and we uncover several new facts about international mutual fund flows. We find that the flows into international mutual funds are indeed driven by the investors' desire for international diversification. The flow-performance relationship is also sensitive to the correlation between the fund's assets and the U.S. market. Second, there is a strong relationship between flows and returns for international mutual funds, consistent with the predictions of Bohn and Tesar (1996) and Brennan and Cao (1997). These findings provide a better understanding of international mutual fund flows and their growth, and of the behavior of small investors with regard to international investments. The rest of the paper is organized as follows. Section 81.2 discusses the motivations for our hypotheses. Section 81.3 discusses the data. The methodology and the empirical results are discussed in Sect. 81.4. Section 81.5 summarizes the main findings and concludes.
81.2 Motivation and Hypotheses

The size of the U.S. international mutual fund market is now almost $300 billion. Furthermore, there are almost 1,500 such mutual funds. Clearly, this is an important and growing
2 This analysis is done both at the fund level and at the aggregate level for flows of all funds.
3 The investment decisions of small investors have also received growing attention in the literature (see, for example, Barber and Odean 2000; Grinblatt and Keloharju 2000; Bhattacharya and Groznik 2008).
sector of the U.S. economy. Yet there is little empirical evidence on the determinants of flows into these international mutual funds. This paper aims to fill that void. It has been suggested and shown empirically (see, for example, Levy and Sarnat 1970; Solnik 2004) that international diversification can improve the reward-to-risk ratio for an investor due to the low correlations across national equity markets. Motivated by these observations, the first hypothesis we test is whether flows into international mutual funds are driven by investors' desire to diversify internationally. Assuming that the U.S. investor holds a well-diversified portfolio of U.S. equities, we test if the correlation between the U.S. market portfolio and the funds' assets (invested in foreign markets) is a significant determinant of fund flows. For example, if σ_US is the standard deviation of a U.S.-only portfolio and σ_f is the standard deviation of a foreign portfolio, the standard deviation σ_p of a portfolio with weights w_1 and w_2 on the U.S. and foreign portfolios would be:

σ_p = [w_1² σ_US² + w_2² σ_f² + 2 w_1 w_2 ρ σ_US σ_f]^(1/2)   (81.1)

where ρ is the correlation between the two portfolios. Holding returns (or performance), weights (an asset allocation decision), and variances constant, since lower correlation would imply lower portfolio variance, we expect that across funds, when the correlation is low, the flows are higher. That is, the higher the perceived benefits of diversification, the higher are the flows. Similarly, from the above equation, we expect a negative relationship between flows and foreign portfolio volatility, since higher foreign portfolio volatility increases the internationally diversified portfolio's volatility, holding all else constant. Furthermore, we hypothesize that the relationship between flows and past performance (which is well documented in the literature for domestic funds) not only exists for international funds – it is stronger when these correlations are lower. The theoretical basis for our tests relating flows and fund returns is based on the predictions of Brennan and Cao (1997), who develop a dynamic model of international capital flows under the assumption of information asymmetry between the domestic and foreign investors in a particular market. The model abstracts from currency risk and barriers to investments.4 In their model, the domestic investors in a market are better informed vis-à-vis foreign investors. Therefore, following a public signal, foreign investors revise their expectations about returns more than domestic investors. As a result, when the signal is good, foreign investors buy more equity and drive up the prices. This suggests a contemporaneous relationship between international flows and returns
4 See, for example, Stulz (1999) and Errunza and Losq (1989) for international asset pricing models that account for investment barriers.
for a market. A lagged version of this model predicts that flows are related to past returns. Bohn and Tesar (1996) use mean-variance analysis to show that the investor's decision to invest in foreign markets has a "portfolio rebalancing" component and a "return chasing" component. The authors show that flows are related to expected returns, which they interpret as evidence supporting the return-chasing hypothesis. This is also consistent with the theoretical predictions of Brennan and Cao (1997). Thus, based on these studies, the second set of hypotheses we test is that the international mutual fund flows are related to past and current returns of the funds themselves. This would be consistent with a search across countries for higher returns. In this analysis, we also include the changes in the trade-weighted value of the dollar to proxy for currency risk (see Adler and Dumas 1983). We do not have guidance from theory on the sign of this variable.5 The above-mentioned studies, as well as studies such as Tesar and Werner (1995), provide empirical support for these predictions using aggregate market flows recorded by the U.S. Treasury. However, there are many biases associated with country-level capital flows data (see Tesar and Werner 1995; Bekaert and Harvey 2000).6 Therefore, Froot et al. (2001) use a novel dataset of daily flows from State Street Bank – one of the largest custodian banks. We complement the findings of these studies by using flows into international mutual funds. Apart from overcoming the biases of not accounting for capital gains and so forth, the mutual fund dataset is of interest because it is mainly held by small investors or households, whereas the treasury data is aggregate data for institutions.7 Therefore, our analysis complements the findings of these studies by providing results on the behavior of small investors regarding international investments. Several studies, including Patel et al. (1994), Gruber (1996), Chevalier and Ellison (1997), Sirri and Tufano (1998), and Nanda et al. (2004), find that flows into and out of domestic mutual funds in the U.S. are related to past performance. This is consistent with the theoretical prediction of Berk and Green (2004). This is also consistent with the "trend following" hypothesis from international capital flows. We hypothesize that this relationship will be similar
5 The hypotheses based on models of capital flows are tested both at the fund level and at the aggregate flow level. Since the results are similar, we focus on the fund-level flows.
6 These biases include mis-reporting of transactions, transactions at foreign financial centers, exclusion of reporting of sales and purchases for less than $2 million, and no accounting for capital gains.
7 We calculated the correlation of aggregate monthly mutual fund flows with the monthly U.S. flows to foreign stocks and find the correlation is 0.02 and not significant. This confirms that the flow patterns of individuals and institutions differ. We also analyze aggregate flows for international mutual funds and find results similar to those obtained using fund flows. When we use aggregate flows, we use market returns in place of fund returns.
for international mutual funds and test the findings of these earlier studies using a new, independent sample. As Sirri and Tufano (1998) suggest, investors may use past performance as indicative of future performance. Furthermore, the authors find that fund flows are related to investors' search costs, proxied by the size of the fund complex and advertising fees. Khorana and Servaes (1999) find that the decision to start a new fund is related to the level of assets for funds with the same objective, past performance, and fund fees. Chevalier and Ellison (1997) show that the performance-flow relationship is sensitive to the age of the fund. Motivated by these findings, we include the past performance of the funds, the size of the fund complex, and the age and fees of the funds as control variables in our analysis. The total net assets of the fund at the end of the previous quarter are used to control for fund size, since the same dollar flow will have a larger effect on smaller funds. Thus, based on the previous theoretical and empirical literature, we hypothesize that after controlling for fund performance and fund characteristics, the international mutual fund flows are a function of the fund's correlation with the U.S. market and past and current fund returns.8 To supplement our analysis, we test if U.S. investors prefer domestic funds to foreign funds when controlling for fund characteristics. The results from tests of these hypotheses will complement the earlier findings from studies using aggregate capital flows at the country level and fund flows for domestic mutual funds. In the next section we discuss the data used to test our hypotheses.
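Before turning to the data, the diversification logic of Equation (81.1) can be illustrated with a short sketch; the weights, volatilities, and correlations below are hypothetical and chosen only to show the direction of the effect.

```python
# Equation (81.1): two-asset portfolio standard deviation. Lower correlation
# between the U.S. portfolio and a foreign fund lowers sigma_p, all else
# held constant; the inputs are hypothetical.

import math

def portfolio_std(w_us, sd_us, sd_f, rho):
    """Two-asset portfolio standard deviation per Equation (81.1)."""
    w_f = 1.0 - w_us
    var = (w_us ** 2 * sd_us ** 2 + w_f ** 2 * sd_f ** 2
           + 2 * w_us * w_f * rho * sd_us * sd_f)
    return math.sqrt(var)

for rho in (0.2, 0.5, 0.8):   # hypothetical correlations
    print(rho, round(portfolio_std(0.7, 0.15, 0.20, rho), 4))
# Lower rho -> lower sigma_p, hence the hypothesized higher flows.
```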
81.3 Data

The data used in the empirical analysis are from the CRSP survivor-bias-free database of U.S. mutual funds. This database, originally constructed and used by Carhart (1997), has been used by many researchers (see, for example, Wermers 2000). From this database we sample all international mutual funds – funds that invest in non-U.S. equity securities – for the entire history from January 1, 1962 through December 31, 2003. To provide a close comparison of our findings for flows of international mutual funds with flows for domestic mutual funds, we also select all diversified domestic equity mutual funds that invest primarily in U.S. equity. For the domestic funds sample, as is the case with many other studies, we exclude all bond funds, balanced funds, sector funds, precious metal funds, and funds that have less than
It is possible that other factors such as market timing affect flows into international mutual funds. However, we believe that to be of importance for short-term daily or even weekly flows. Our focus is more on quarterly and monthly flows.
1238
50% of assets in equity securities in a given year.9 For the sample of international mutual funds, global funds – funds that invest in both U.S. and non-U.S. securities – are excluded. The diversified domestic funds include all growth, income and growth and income funds. The international mutual funds include all U.S. based mutual funds that invested in non-U.S. equity for the past 42 years. The CRSP mutual fund database includes monthly returns from 1962. However, the total net assets of the funds (which are used to calculate the flows) are available annually from 1962, quarterly from 1970, and monthly from 1991. To have the largest possible sample of all mutual funds for the longest possible period, we focus on quarterly mutual fund flows from 1970 to 2003, a period of 34 years and 136 quarters. Thus, as long as a fund exists for a quarter, it is included in our analysis. We also use the monthly data from 1991 to 2003 to examine if our results using quarterly flows are robust to using monthly flows. Our full sample includes 2,412 international mutual funds of which 1,456 were in existence at the end of 2003. The sample of diversified domestic mutual funds includes 10,019 funds of which 6,641 were in existence at the end of December 2003. Even though we use the quarterly flow data from 1970, the returns data prior to 1970 are still used for calculating past performance. The appendix provides the details for our sample and a list of the ten largest mutual funds as of December 2003 for all international, emerging, and domestic funds. This appendix also describes in detail the selection and categorization of all international funds. Table 81.1 reports some descriptive statistics for our sample of funds. Panel A is for all international funds, panel B is for diversified domestic funds. The statistics clearly indicate the dramatic growth of U.S.-based international equity funds. While there were only 17 such funds in 1962 and 135 funds in 1989, there were 1,569 funds in 2003. This dramatic growth provides further motivation to study determinants of flows into these funds. The TNA-weighted annual average return for these funds is 12.69%, which is higher than the 12.27% average return for domestic mutual funds. The total fees of the funds are the expense ratio plus the front-end load (if any) amortized over a 7-year period as in Sirri and Tufano (1998). The international funds have higher median total fees of 2.08%, while the domestic funds have median total fees of 1.79%.
9
Flows in the first and the 99th percentile were excluded to reduce the impact of outliers. Including them does not affect the significance of the variables included. However, this mitigates the results from being affected by flows that are very low because the funds are “incubator funds.”
D.K. Patro
81.4 Methodology and Empirical Results This section discusses the methodology used to test the hypotheses, which are drawn from the theoretical models and prior empirical research, followed by a discussion of the empirical results. The net flow for a fund in a given quarter (or month) is defined as: Flowi;t D
TNAi;t TNAi;t 1 .1 C ri;t / MAi;t TNAi;t 1
(81.2) where TNAi;t is fund i’s total net assets at the end of quarter t, ri;t is the fund’s return during the quarter, and MAi;t is the assets acquired during the quarter through mergers.10 Therefore, Flowi;t measures the growth rate of the fund’s assets in excess of the growth due to capital gains and dividends. This is the dependent variable in all our regressions. For all our empirical tests we use different versions of the following panel model, where the data are pooled across funds and quarters. Pooling data across funds and time allows us to test determinants of cross-sectional variations in fund flows as well as the dynamics of fund flows over time.11 This is a generic specification and we use several versions of this model along with dummy variables to test our various hypotheses. Flowi;t D ˛ C ˇ1 .fund0 s_correlation_with_US_ market/i;t C ˇ2 .market_returns/i;t Cˇ3 .fund_performance/t C ˇ3 .total_fees/i;t Cˇ4 .real_changes_in_dollar/t Cˇ5 .Category_Flow/i;t C ˇ6 .Std_devn_fund_ returns/i;t 1 C ˇ7 .Log_Age/i;t 1 C ˇ8 .Log_TNA/i;t 1 Cˇ9 .Log_Complex_TNA/i;t 1 C "i;t
(81.3)
The correlation of the fund with the U.S. market, relative market performance (U.S. vs. Non-U.S.), and variables for currency crises are examples of new independent variables that are unique to international funds not used in previous research. The first independent variable is the correlation of the fund’s last 12-month returns with the return on the U.S.
10
As in several other studies including Sirri and Tufano (1998), we assume that the flows occur at the end of the period (quarter or month). 11 We also estimated models with fixed-effects for fund categories as well as time (years) and excluding the market returns. However, since the findings are very similar, they are not reported.
1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996
Panel (A)
Year
10:30 20:89 16:30 14:39 8:69 28:57 14:17 10:84 0:07 16:20 17:34 18:77 27:93 37:35 26:77 2:98 8:55 24:41 33:23 3:98 20:42 22:65 3:16 31:42 15:57 1:82 17:55 28:43 6:08 33:64 9:06 11:59 0:76 35:67 21:16
CRSP VW return (%)
8:91 22:83 16:78 12:66 10:23 23:99 10:88 8:35 3:94 14:52 19:17 14:87 26:41 37:14 24:08 7:46 6:37 18:67 32:83 4:88 22:30 22:35 6:71 32:16 18:16 5:19 17:10 31:64 3:22 30:90 7:77 9:98 1:36 38:12 23:45
S&P 500 return (%)
10:51 31:21 37:60 14:17 22:15 37:10 3:74 19:43 34:30 6:18 24:43 1:03 0:86 24:61 7:86 56:72 69:94 24:93 28:59 10:80 23:20 12:50 11:85 32:95 8:06 11:55 6:36
MSCI EAFE return (%) 17 18 17 15 14 13 14 16 18 19 19 18 17 17 16 17 19 18 18 20 22 24 30 45 65 96 115 135 161 194 232 314 552 699 833
Number of funds
Table 81.1 International mutual funds: descriptive statistics
11:13 8:48 12:71 7:72 6:12 27:38 27:99 6:36 10:56 23:76 21:45 21:33 23:86 31:27 18:02 1:44 26:50 18:07 37:10 0:12 7:67 26:56 4:17 37:07 44:05 11:41 15:41 21:42 12:03 13:22 5:08 38:54 2:81 5:93 12:33
Mean annual return (%) 9:99 10:38 17:88 3:77 8:77 33:48 27:57 12:90 12:19 25:14 11:59 24:36 27:84 29:13 18:61 1:15 19:15 33:05 49:20 5:20 16:74 25:63 4:15 38:78 48:49 12:64 15:32 24:20 9:02 15:87 1:97 43:75 2:77 10:13 15:84
TNAweighted mean annual return (%) 11:19 8:59 13:34 7:96 6:76 29:44 28:12 2:99 10:38 22:32 20:69 20:70 23:88 31:49 17:86 1:66 25:76 16:60 36:33 0:24 7:66 26:84 3:71 37:09 42:89 11:92 15:28 21:17 12:18 13:57 4:77 39:61 2:90 6:06 12:31
Annualized quarterly return (%)
22:40 1:56 11:01 26:27 14:72 23:58 27:09 28:24 18:81 0:73 16:46 32:88 46:81 5:43 18:76 27:64 4:77 39:50 47:23 12:92 13:39 23:62 9:26 15:11 2:09 42:98 2:15 9:98 15:40
Annualized TNAweighted quarterly return (%) 14:45 15:60 17:15 14:85 11:90 11:55 21:40 19:86 16:35 20:23 31:26 38:12 24:80 24:80 22:70 18:79 18:70 24:46 39:30 38:90 39:03 54:33 54:01 70:69 70:89 38:82 30:18 38:09 34:20 35:08 37:25 74:37 47:44 33:99 32:26
Median TNA ($ millions) 330:90 310:10 284:50 230:90 168:40 241:70 506:24 585:63 547:50 779:69 779:71 691:43 484:60 563:22 616:43 536:23 481:99 551:73 875:04 1014:83 1230:37 1721:19 2357:99 4545:77 10252:82 11321:01 11737:58 15620:80 17350:88 22122:28 26818:76 79240:82 110540:18 130722:96 181465:62
Total net assets ($ millions)
(continued)
1.83 1.69 1.80 1.81 1.70 1.82 0.94 1.28 1.40 1.95 2.02 2.03 2.15 2.21 2.14 2.23 2.31 2.21 2.09 2.06 2.23 2.19 2.00 2.05 1.84 2.05 2.09 2.20 2.04 1.75 2.00 2.00 2.06 2.13 2.14
Median total fees (%)
81 Determinants of Flows into U.S.-Based International Mutual Funds 1239
Year
33:99 29:68 21:78 8:52 11:96 21:98 28:71 12:90
30:35 22:31 25:39 11:16 11:26 20:98 33:09 12:45
2:06 20:33 27:30 13:96 21:21 15:66 39:17 13:03
MSCI EAFE return (%) 1; 141 1; 337 1; 461 1; 585 1; 662 1; 651 1; 569 415:85
Number of funds 7:45 11:50 49:73 13:37 16:55 12:48 38:28 12:69
Mean annual return (%) 0:20 2:29 52:89 18:53 17:75 14:44 40:51 11:07
Mean 0.060 0.716 0.041 0.085 0.022 0.055 0.844 2.057 3.332 8.661 0.069 0.080
0:86 1:31 52:09 19:17 17:86 14:38 40:24 10:89
Annualized quarterly return (%)
Median 0.002 0.779 0.044 0.021 0.012 0.051 1.180 2.120 3.343 8.879 0.068 0.079
6:37 9:90 46:03 14:12 16:92 12:99 37:59 12:41
Annualized TNAweighted quarterly return (%)
Std. dev. 0.301 0.779 0.289 0.359 0.044 0.025 1.392 0.786 2.025 2.109 0.158 0.102
28:80 23:33 32:70 25:69 20:32 19:64 24:80 34:83
Median TNA ($ millions)
Minimum 1.241 0.853 1.226 0.000 0.139 0.000 2.485 0.020 0.693 0.683 0.421 0.289
223322:42 248375:23 367343:54 311298:60 239440:49 231901:01 288720:50 74863:92
Total net assets ($ millions)
Maximum 5.000 0.995 4.741 22.476 0.317 0.603 3.736 9.470 10.567 13.369 1.646 0.932
2:08 2:15 2:17 2:16 2:17 2:17 2:16 2:08
Median total fees (%)
This table reports the descriptive statistics for the sample of international equity mutual funds. The table reports both annual returns and annualized quarterly returns. The total fees are the annual expenses plus one-seventh of the front-end load (if any) a The MSCI EAFE index is not available before December 1969
Variables Flow Correlation of fund returns with U.S. market index U.S. market adjusted fund return Squared U.S. market adjusted fund return Flows to all international funds Std. dev. of last 12 months returns Logarithm of age (in years) Total fees Logarithm of lag TNA Logarithm of lag fund complex TNA Past 3 years mean return Past 5 years mean return
N 40418 42451 44327 44327 50956 50956 50956 44671 43881 50159 22524 13030
TNAweighted mean annual return (%)
Panel (B): Descriptive statistics for dependent and independent variables
1997 1998 1999 2000 2001 2002 2003 Average 1970–2003
S&P 500 return (%)
CRSP VW return (%)
Table 81.1 (continued)
1240 D.K. Patro
81 Determinants of Flows into U.S.-Based International Mutual Funds
market index. We expect that the flows will be negatively related to this variable, since higher correlation implies lower diversification potential. The market returns are the returns on the U.S. market and the return on the home market of the fund’s assets. These foreign market returns are returns on the Morgan Stanley Capital International (MSCI) index for that country or region. The U.S. market return is the return on the U.S. CRSP value weighted index. Several measures of fund_performance are used in alternate specifications. We use the fund’s past and current quarter’s returns, past year’s returns, market adjusted returns, rankings based on risk-adjusted returns, and rankings based on the intercept (alpha) from a factor model. When market adjusted fund returns (MAR) are used, its squared term is also included because of the non-linear relationship between flows and performance documented by earlier studies on domestic funds. We expect a positive relationship between flows and fund’s past performance. The fund’s returns for the current quarter or past quarter are also used to test the information asymmetry and the trend chasing hypotheses, respectively. We include the size of the fund complex, Log_Complex_ TNA – the total net assets of all international funds of the fund complex (such as Templeton, Fidelity, Vanguard and so forth) as a proxy for search costs. The percentage change in the Federal Reserves’ real trade weighted value of the dollar is used as a proxy for currency risk. Risk of the fund is proxied by Std_devn – the standard deviation of monthly returns of the fund for the 12 months prior to that quarter, although we also use risk-adjusted performance measures. The motivation for using simple measures such as the mean and standard deviation of past returns is because most often that is what is advertised or revealed to investors. In addition to the return and risk variables that are used to test our hypothesis, we use several control variables that have been shown to be significant determinants of domestic mutual fund flows. This also allows us to test if these variables have similar effects as shown for domestic funds. The fees of the fund is the total fees calculated as the expense ratio plus the front-end load (if any) amortized over a 7-year period as in Sirri and Tufano (1998).12 Category_flow is the flow to all international funds in that quarter. This is used to control for any general market wide trends. Log_age is the logarithm of the age of the fund in years and Log_TNA is the size of the fund. Several specifications of this pooled model are estimated. Since we are using panel data, heteroskedasticity and serial
1241
correlation are potential problems. For each specification, we use the Wooldridge (2002) test for autocorrelation in panel data and find that we can reject the null of first-order serial correlation. Furthermore, we have estimated the pairwise correlations of the independent variables to ensure that none of them are very high. Finally, we also tested for heteroskedasticity using the White test. Since we cannot reject heteroskedasticity, we estimate and report White t-statistics based on standard errors robust to heteroskedasticity. In addition to the pooled panel analysis quantile regression are also performed. This is especially useful since as the descriptive statistics indicate we observe some extreme values for the flows. The dependent variable used is the conditional median.
81.4.1 Diversification and International Mutual Fund Flows We test our hypothesis that the flows into U.S.-based international mutual funds are negatively related to the correlation of the fund’s returns with the return on the U.S. market index. In Table 81.2 we report the results from panel regressions of fund flows on various fund characteristics. These characteristics may be interpreted as control variables for this analysis and the correlation – which is calculated using the monthly return of the fund and the CRSP value weighted market index for 12 months prior to the quarter. For comparison, Table 81.2 and all the subsequent tables report the results from quintile regression for comparison. In general, although the magnitudes differ, we find similar results using pooled analysis and quintile regressions. We report results from four alternate specifications. The results are striking and indicate that the correlations are significantly negative in all specifications (for example, in model 2, the t-statistic is .7:007/). These findings support our assertions that there is a strong “diversification motive” for U.S. households investing in international mutual funds. Holding all else constant, funds receive lower flows when correlations are high, because the potential diversification benefits are low. The empirical evidence clearly supports the diversification hypothesis.13 In practice, some of this could be due to the fact that the funds themselves focus on the benefits of international diversification when marketing international mutual funds. Furthermore, using brokerage data,
13
12 We obtain very similar results if we include only expense ratios.
13 We also used Fama-MacBeth regressions as a robustness check. Here we are only able to examine cross-sectional variations, excluding market returns. Since the results are similar, in that they do not affect the significance of any variable, they are not reported.
Logarithm of lag TNA
Total fees
Logarithm of age (in years)
Std. dev. of last 12 months returns
Flows to all international funds
U.S. market adjusted fund return (MAR)2
U.S. market adjusted fund return (MAR)
Intercept
0.172 (15.859) 0.129
(20.159) 0.013
(1.236) 0.993
(18.192) 0:179
.2:306/ 0:059 .26:006/ 0:013 .6:075/ 0:009 .9:162/
0.133 (14.745) 0.126
(20.011) 0.017
(1.681) 1.082
(20.058) 0:239
.3:171/ 0:063
.27:786/ 0:012 .5:667/ 0:009
.8:731/
Table 81.2 Diversification and international mutual fund flows OLS regressions
.9:347/
.25:916/ 0:013 .6:121/ 0:009
.2:693/ 0:058
(18.152) 0:215
(0.160) 0.995
(14.488) 0.002
0.179 (15.882) 0.168
.3:682/
.3:943/ 0:010 .3:289/ 0:006
.4:618/ 0:017
(6.479) 0:437
(1.440) 0.687
(9.188) 0.026
0.094 (5.463) 0.125
(8.668)
(51.017) 0:01 .15:744/ 0.002
.15:200/ 0:036
(48.921) 0:333
(18.678) 0.581
(39.090) 0.027
0.075 (26.192) 0.068
Quantile regressions
(8.375)
(48.174) 0:011 .16:141/ 0.002
.13:739/ 0:035
(45.124) 0:302
(16.685) 0.567
(39.404) 0.024
0.083 (25.344) 0.068
(7.099)
(44.926) 0:011 .15:118/ 0.002
.12:821/ 0:035
(42.337) 0:308
(10.584) 0.568
(31.702) 0.019
0.088 (24.928) 0.089
(continued)
(6.127)
(5.136) 0:01 .10:742/ 0.003
.7:035/ 0:009
(21.788) 0:252
(6.823) 0.359
(15.292) 0.032
0.033 (5.480) 0.07
34,889 0.095
.0:039/
.1:426/
(0.533)
34,889 0.094
0
0:031
0.008
35,538 0.090
(8.878)
34,889 0.061
(3.338) 11,502 0.051
(4.072) 11,502 0.062
34,889 0.060
.0:457/ 0.038
.1:284/ 0.133
35,538 0.060
0.004
.5:981/
0:046
(4.260)
.3:718/ 0:024
(3.636) 0:013
0:001
0:027
.6:977/ 0:032
1:467 0:019
.3:483/
.5:665/
(1.992) 0:014
.5:139/
(2.097)
.2:977/ 0:058
(2.223) 0:039
.7:227/ 0:067
0
.7:007/
0
(7.977) 0:069
0
(8.049) 0:064
0.002
(7.667)
0.006
0.006
0.006
This table reports the results from pooled panel regressions of quarterly net mutual fund flows on market returns and fund characteristics. The data span 136 quarters from 1970 to 2003. The table reports the coefficient estimates and, in parentheses, robust t-statistics based on White heteroskedasticity-consistent standard errors. * Significant at 10%; ** significant at 5%; *** significant at 1%.
Observations Adjusted R-squared/pseudo R-squared
Past 5 years mean return
Interaction of high correlation dummy and U.S. market adjusted fund return Interaction of high correlation dummy and U.S. market adjusted fund return2 Past 3 years mean return
Correlation of fund returns with U.S. market index
Table 81.2 (continued) Logarithm of lag fund complex TNA
Bailey et al. (2008) find that investors with more wealth and awareness of the benefits of diversification are more likely to invest in foreign securities.14 Perhaps the same is true for investors in mutual funds, which can further explain our findings. Note that the results are not only statistically significant, they are economically significant too – for example, in model 2 in Table 81.2, the coefficient of −0.064 for correlation corresponds to a 0.64% increase in flows for a 0.1 decline in correlation. For a fund with $100 million in assets, this translates to $0.64 million. The results are very similar when we use the U.S. beta of the fund in place of correlation.15 Furthermore, the results reported are for all funds, including diversified international funds, regional funds, and country funds. When we run these regressions for only diversified funds or only emerging market funds, the results are similar and therefore not reported. For these sub-samples the correlations are again negative and significant at the one percent level. The significance of the results for diversified funds also indicates that the results are not driven by a desire to invest in a few countries with higher expected returns. Models 3 and 4 of Table 81.2 present results when an interaction variable, which is the product of a high correlation dummy (correlations higher than the median) and the fund's return in excess of the U.S. market return (and its squared term), is included. The coefficients are always negative and significant in two instances. This variable is used to test if the relationship between flows and past performance is weaker when the correlations are high. The negative sign, and the significance in two cases, support this hypothesis. Thus it appears that not only do U.S. investors prefer funds that have a low correlation with U.S. markets, but the flow-performance relationship is also stronger when the correlations are low. The standard deviation of the last year's returns also has a significantly negative sign in all specifications, consistent with our hypothesis that as volatility increases, the benefits of diversification decline and therefore the flows decline. In Table 81.3, we report results from three alternate specifications to examine if our findings are robust. In model 1, the past year's returns of the fund are used. First, the previous year's return of the fund is used to rank all the funds from 0 to 1, which we define as Rank. Then we define the tercile ranks as: Bottom_tercile = Min(1/3, Rank), Middle_tercile = Min(1/3, Rank − Bottom_tercile), and Top_tercile = Min(1/3, Rank − Bottom_tercile − Middle_tercile). This piecewise linear relationship between fund flows and previous year's returns
D.K. Patro
follows Sirri and Tufano (1998). This is useful because of the non-linear relationship between flows and returns shown by Carhart (1997), Gruber (1996), and Chevalier and Ellison (1997). Thus, this specification allows us to test whether there is an asymmetric response to good and poor performance. In model 2, we use the same methodology but use a two-factor model that includes the world market return and the changes in the real trade-weighted value of the dollar (based on the international asset pricing model of Adler and Dumas 1983). We use the alpha from the following two-factor model,16 using the 12 months of returns prior to that quarter:

$$R_{pt} - R_{ft} = a + \beta_f (R_{wt} - R_{ft}) + \beta_{2f} R_{fxt} + \varepsilon_{pt} \qquad (81.4)$$

where $R_{pt}$ is the return on the fund, $R_{ft}$ is the risk-free rate proxied by the return on 30-day T-bills, $R_{wt}$ is the return on the MSCI world market index, and $R_{fxt}$ is the return on the Federal Reserve's real trade-weighted value of the dollar. In model 3, we use the world market returns instead of U.S. market returns for measuring performance. The results in Table 81.3 indicate that there is a significant positive relationship between top performers and fund flows (t-stat of 12.592). We find a surprisingly significant positive relationship between fund flows and performance for funds in the bottom performance tercile. What is interesting, however, is that the coefficient of the correlation variable is always negative and significant, suggesting that our findings are quite robust to alternate specifications. Here and in later tables, the control variables are significant and have signs consistent with previous findings in the context of diversified domestic mutual funds. A priori, there is no reason to believe them to be otherwise. Therefore, these results confirm our expectations and support the findings from studies using domestic funds with a new dataset. The coefficients for the total fees of the funds are always significantly negative, which is similar to findings in Sirri and Tufano (1998) and more recent evidence in Barber et al. (2005) for domestic funds. The previous 12 months' standard deviation of returns is significant in some specifications. The size of the fund, measured by the TNA at the end of the previous quarter, is significantly negative, while the size of the fund family, measured by the total TNA of all funds belonging to that family in that quarter, is significantly positive. These findings are similar to the findings for domestic funds. The negative sign on fund size implies that bigger funds have smaller percentage flows for the same dollar flows.
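The tercile construction used in model 1 is easy to make concrete. The following pandas fragment is a minimal sketch of the Sirri and Tufano (1998) piecewise ranking described above; the fund labels, the return values, and the use of percentile ranks to map returns into (0, 1] are our own illustrative assumptions, not the chapter's code.

import pandas as pd

def tercile_pieces(prev_year_returns: pd.Series) -> pd.DataFrame:
    """Piecewise-linear performance terciles in the style of Sirri and Tufano (1998).

    Funds are ranked on the previous year's return to a fractional Rank in (0, 1];
    Rank is then split into bottom, middle, and top pieces, each capped at 1/3.
    """
    rank = prev_year_returns.rank(pct=True)   # fractional rank, worst to best
    bottom = rank.clip(upper=1/3)             # Min(1/3, Rank)
    middle = (rank - bottom).clip(upper=1/3)  # Min(1/3, Rank - Bottom_tercile)
    top = rank - bottom - middle              # remainder, at most 1/3
    return pd.DataFrame({"bottom_tercile": bottom,
                         "middle_tercile": middle,
                         "top_tercile": top})

# Hypothetical example: five funds' previous-year returns.
returns = pd.Series([0.12, -0.03, 0.25, 0.07, 0.18],
                    index=["A", "B", "C", "D", "E"])
print(tercile_pieces(returns))

The three pieces sum to Rank, so a regression on the pieces fits a separate flow-performance slope within each tercile, which is what permits the asymmetric response tested in the text.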
14 The authors assume that investors who are well diversified at home are better aware of the benefits of international diversification. This is also consistent with our assumption that the investors are well diversified at home.
15 The U.S. beta is estimated by regressing the fund's last twelve months of returns on the CRSP value-weighted U.S. market index.
16 We also estimated a six-factor model including these two factors, the three Fama-French factors, and the Carhart momentum factor. The results are similar, so they are not reported.
[Table 81.3 Diversification and international mutual fund flows: alternate performance measures. The table reports OLS and quantile regressions of quarterly net mutual fund flows on bottom, middle, and top performance terciles (based on raw returns, the two-factor model of Equation (81.4), and world market performance), the world market adjusted fund return (MARW) and its square, interactions of the high correlation dummy with the world market adjusted fund return and its square, the change in the real value of the dollar, the correlation of fund returns with the U.S. market index, and fund characteristics (logarithm of lag TNA, logarithm of lag fund complex TNA, total fees, logarithm of age in years, standard deviation of the last 12 months' returns, flows to all international funds). The data span 136 quarters from 1970 to 2003; White heteroscedasticity-consistent (robust) t-statistics in parentheses; observations (roughly 34,800 per specification) and adjusted R-squared reported. Coefficient columns omitted.]
The significantly positive sign of the fund family size could be due to the search costs and familiarity with fund brand names suggested by Sirri and Tufano (1998) and Khorana and Servaes (1999). Similarly, the age of the fund is significantly negative, as in Chevalier and Ellison (1997). In all specifications we also observe a significant relationship between fund flows and flows to all international funds.
81.4.2 Information Asymmetry, Return Chasing, and International Mutual Fund Flows

In Table 81.4 we report the results from our panel regressions where we test the predictions of Bohn and Tesar (1996) and Brennan and Cao (1997). Results from three alternate specifications are reported. In these regressions we do not include the fund performance variables, as they are highly correlated with the fund returns that are used to test our hypothesis. The flows are significantly (t-stat of 12.957 in model 1) and positively related to current quarter fund returns, as predicted by the model of Brennan and Cao (1997). This is consistent with the model and supports the idea that there may be information asymmetry between local and foreign investors in international equity markets. The flows are also significantly (t-stat of 7.042 in model 3) related to the previous quarter's returns. The results clearly indicate that the net quarterly flows into or out of U.S.-based international mutual funds exhibit "return chasing" or "trend following." These latter results are also consistent with the empirical evidence for domestic mutual funds. The change in the real value of the dollar for the current quarter is also significantly positive, although it changes sign in model 3 when lagged fund returns are included.
81.4.3 Home Bias and International Mutual Fund Flows

Although not the main focus of our analysis, we also examine whether, controlling for fund characteristics, there is a preference for domestic funds by U.S. households. To examine whether U.S. investors prefer domestic funds – because of the well-documented home bias (see, for example, French and Poterba 1991) – we report results (see Table 81.5) from panel regressions where we pool data for diversified domestic equity funds and international funds. A dummy variable is used to indicate the international funds. The coefficient on that variable is significantly negative (t-stat of −6.988). This suggests that, after controlling for the fund size, fees, complex size and so forth – everything else being equal – there is a strong preference for domestic funds. When we use a dummy variable for emerging market funds, the coefficient for the emerging market funds is significantly negative (t-stat of −4.334). This suggests that ceteris paribus there is an aversion to foreign funds and even more aversion to emerging market funds. When the various fund characteristics are interacted with the dummy variable for international funds, only the age variable is negative and significant. Therefore, when U.S. investors invest globally, there is an investor preference for funds that help them diversify; however, there is a strong preference for domestic funds.

81.4.4 Monthly International Mutual Fund Flows

The results in the previous sections were based on quarterly flows, as that gives the largest number of funds for the longest time period. The CRSP database of mutual funds also has the monthly TNAs of funds since 1991. Therefore, as a robustness check, we re-estimate our main regressions using this dataset. Although this spans a shorter time period, we have more frequent observations for this sample. The results in general are very similar to what we found using quarterly flows and are therefore not reported. The sign and significance of the other control variables are also similar to what was obtained using quarterly flows. Furthermore, since the quantile regressions give results similar to those obtained using pooled analysis, the results are very robust.

81.5 Conclusion
We provide a comprehensive analysis of the growth of U.S.-based international mutual funds to understand what drives investors to buy these funds. Our analysis uncovers several new and unique features of flows into international mutual funds. First, the empirical findings show a strong relationship between flows into U.S.-based international mutual funds and both the correlation of the fund's assets with the U.S. market and the standard deviation of the fund's returns, consistent with the international diversification argument. Second, the flows are related to contemporaneous and past fund returns, supporting an "information asymmetry" as well as a "return chasing" or "trend following" hypothesis for international capital flows.
[Table 81.4 Information asymmetry, trend following and international mutual fund flows. The table reports OLS and quantile regressions of quarterly net mutual fund flows on current and previous quarter fund returns, the U.S. market adjusted fund return (MAR) and its square, the correlation of fund returns with the U.S. market index, changes in the real value of the dollar, and fund characteristics (total fees, logarithm of lag TNA, logarithm of lag fund complex TNA, logarithm of age in years, standard deviation of the last 12 months' returns, flows to all international funds). The data span 136 quarters from 1970 to 2003; White heteroscedasticity-consistent (robust) t-statistics in parentheses; observations (roughly 34,800 per specification) and adjusted R-squared reported. Coefficient columns omitted.]
[Table 81.5 Home bias and international mutual fund flows. The table reports OLS and quantile regressions of quarterly net mutual fund flows, pooling diversified domestic equity funds and international funds, on a dummy for foreign funds, a dummy for emerging market funds, the interaction of the foreign dummy with the U.S. market adjusted fund return (MAR), MAR and its square, flows to all domestic funds, flows to all international funds, changes in the real value of the dollar, and fund characteristics (total fees, logarithm of lag TNA, logarithm of lag fund complex TNA, logarithm of age in years, standard deviation of the last 12 months' returns). The data span 136 quarters from 1970 to 2003; White heteroscedasticity-consistent t-statistics in parentheses; 195,586 observations per specification, with adjusted R-squared reported. Coefficient columns omitted.]
References

Adler, M. and B. Dumas. 1983. "International portfolio choices and corporation finance: a synthesis." Journal of Finance 38, 925–984.
Bailey, W., A. Kumar, and D. Ng. 2008. "Foreign investments of U.S. individual investors: causes and consequences." Management Science 54, 443–459.
Barber, B. M. and T. Odean. 2000. "Trading is hazardous to your wealth: the common stock investment performance of individual investors." Journal of Finance 55, 773–806.
Barber, B. M., T. Odean, and L. Zheng. 2005. "Out of sight, out of mind: the effects of expenses on mutual fund flows." Journal of Business 78, 2095–2119.
Bekaert, G. and C. Harvey. 2000. "Capital flows and the behavior of emerging market equity returns," in Capital inflows to emerging markets, S. Edwards (Ed.). NBER and University of Chicago Press, Chicago, pp. 159–194.
Berk, J. and R. C. Green. 2004. "Mutual fund flows and performance in rational markets." Journal of Political Economy 112, 1269–1295.
Bhattacharya, U. and P. Groznik. 2008. "Melting pot or salad bowl: some evidence from U.S. investments abroad." Journal of Financial Markets 11(3), 228–258.
Bohn, H. and L. Tesar. 1996. "U.S. equity investment in foreign markets: portfolio rebalancing or return chasing?" American Economic Review 86, 77–81.
Brennan, M. and H. Cao. 1997. "International portfolio investment flows." Journal of Finance 52, 1851–1880.
Carhart, M. M. 1997. "On persistence in mutual fund performance." Journal of Finance 52, 57–82.
Chevalier, J. and G. Ellison. 1997. "Risk taking by mutual funds as a response to incentives." Journal of Political Economy 105, 1167–1200.
De Santis, G. and B. Gerard. 1997. "International asset pricing and portfolio diversification with time-varying risk." Journal of Finance 52, 1881–1912.
Errunza, V. R. and E. Losq. 1989. "Capital flow controls, international asset pricing, and investors' welfare: a multi-country framework." Journal of Finance 44, 1025–1037.
Errunza, V., K. Hogan, and M.-W. Hung. 1999. "Can the gains from international diversification be achieved without trading abroad?" Journal of Finance 54, 2075–2107.
French, K. and J. Poterba. 1991. "Investor diversification and international equity markets." American Economic Review 81, 222–226.
Froot, K. A., P. G. J. O'Connell, and M. S. Seasholes. 2001. "The portfolio flows of international investors." Journal of Financial Economics 59, 151–194.
Grinblatt, M. and M. Keloharju. 2000. "The investment behavior and performance of various investor types: a study of Finland's unique data set." Journal of Financial Economics 55, 43–67.
Gruber, M. J. 1996. "Another puzzle: the growth in actively managed mutual funds." Journal of Finance 51, 783–810.
Khorana, A. and H. Servaes. 1999. "The determinants of mutual fund starts." Review of Financial Studies 12(5), 1043–1074.
Levy, H. and M. Sarnat. 1970. "International diversification of investment portfolios." American Economic Review 60, 668–675.
Nanda, V., Z. Jay Wang, and L. Zheng. 2004. "Family values and the star phenomenon." Review of Financial Studies 17, 667–698.
Patel, J., R. J. Zeckhauser, and D. Hendricks. 1994. "Investment flows and performance: evidence from mutual funds, cross-border investments, and new issues," in Japan, Europe and the international financial markets: analytical and empirical perspectives, R. Satl, R. Levitch and R. Ramachandran (Eds.). Cambridge University Press, New York, pp. 51–72.
Sirri, E. R. and P. Tufano. 1998. "Costly search and mutual fund flows." Journal of Finance 53, 1589–1622.
Solnik, B. 2004. International investments, Pearson-Addison Wesley, Salt Lake City.
Stulz, R. M. 1999. "International portfolio flows and security markets," in International capital flows, M. Feldstein (Ed.). University of Chicago Press, Chicago.
Tesar, L. and I. Werner. 1994. "International equity transactions and U.S. portfolio choice," in The internationalization of equity markets, J. Frankel (Ed.). University of Chicago Press, Chicago, pp. 185–220.
Tesar, L. and I. Werner. 1995. "U.S. equity investment in emerging stock markets." World Bank Economic Review 9, 109–130.
Wermers, R. 2000. "Mutual fund performance: an empirical decomposition into stock-picking talent, style, transaction costs, and expenses." Journal of Finance 55, 1655–1695.
Wooldridge, J. M. 2002. Econometric analysis of cross section and panel data, MIT Press, Cambridge, MA.
Appendix 81A Econometric Analysis of Panel Data

$$Y_{it} = \alpha_i + x_{it}'\beta + e_{it} \qquad (81.5)$$

where $i = 1, \ldots, N$ and $t = 1, \ldots, T$. The standard assumption is that the intercepts $\alpha_i$ and the errors $e_{it}$ are independent, and that $e_{it}$ is independently and identically distributed across firms and time and is not correlated with the independent variables $x_{it}$. In that case, OLS estimation results in consistent estimates. The common assumption is that $\beta$ is constant across firms. When $\alpha_i = \alpha$ for all $i$, the estimation is called a pooled estimation, and the usual least-squares techniques apply. If, however, we assume that the intercepts vary across firms, we can still estimate Equation (81.5) using what are called "fixed effects." If standard assumptions hold, we obtain unbiased and consistent estimates. This is sometimes also referred to as an LSDV or Least Squares Dummy Variable model. We can allow for different intercepts for different firms, and also for different intercepts across time using time dummies. If, instead, we treat $\alpha_i$ as a random draw from a distribution, the model is termed a random effects model. The pooled estimator of $\beta$ is $b = (X'X)^{-1}(X'Y)$, where $X$ is the $NT \times K$ matrix of observations on the independent variables across the $N$ firms and $Y$ is the $NT \times 1$ vector of dependent variables. There are $K$ coefficients in the $\beta$ vector.
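A minimal numerical sketch may help fix ideas. The following Python fragment, our own illustration on simulated data rather than anything from the chapter, contrasts the pooled OLS estimator with the fixed-effects (within/LSDV) estimator for the model (81.5).

import numpy as np

rng = np.random.default_rng(0)
N, T, K = 50, 8, 2                      # firms, periods, regressors

alpha = rng.normal(size=N)              # firm-specific intercepts alpha_i
beta = np.array([1.0, -0.5])
x = rng.normal(size=(N, T, K))
e = rng.normal(scale=0.1, size=(N, T))
y = alpha[:, None] + x @ beta + e       # Y_it = alpha_i + x_it' beta + e_it

# Pooled OLS: one common intercept (alpha_i = alpha for all i).
X = np.column_stack([np.ones(N * T), x.reshape(-1, K)])
b_pooled = np.linalg.lstsq(X, y.reshape(-1), rcond=None)[0]

# Fixed effects (within transformation): demeaning each firm's data
# sweeps out alpha_i; this is numerically equivalent to LSDV.
xw = (x - x.mean(axis=1, keepdims=True)).reshape(-1, K)
yw = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
b_fe = np.linalg.lstsq(xw, yw, rcond=None)[0]

print("pooled slope estimates:      ", b_pooled[1:])
print("fixed-effects slope estimates:", b_fe)

With independent firm effects both estimators are consistent for beta; when the alpha_i are correlated with the regressors, only the fixed-effects estimates remain reliable.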
Data Appendix

To identify and categorize all international mutual funds, we used the following four classifications and the sub-categories available in the CRSP mutual fund database:
1. The Weisenberger fund types: INT – international equity.
2. Policy: C&I – Canadian and international.
3. ICDI's fund objective codes: IE – international equity.
4. Strategic Insight's fund objective codes: ECH – Chinese Equity Funds invest primarily in equity securities of companies in China. ECN – Canada Equity Funds invest primarily in equity securities of companies in Canada. ESC – Single Country Equity Funds invest primarily in equity securities of companies whose main trading market is in a single country outside the United States, Canada, China or Japan.

However, funds such as the Fidelity France Fund, New England Growth Fund of Israel, and Pioneer India Fund were re-classified as funds from the respective countries. From 1962 to 1992 the classification is mainly based on the categories of Weisenberger and Policy. From 1992 to 2003 the classifications are mainly based on ICDI's fund objective codes and Strategic Insight's fund objective codes. Finally, we rechecked each and every fund's name and reclassified it into one of the following categories. An asterisk (*) denotes emerging market funds.

EID – International Developing Markets Equity Funds invest primarily in equity securities whose main trading markets are non-industrialized or developing market countries.
EIG – International Growth Funds invest primarily in equity securities whose main trading markets are outside the United States for capital appreciation.
EIS – International Small Company Funds invest primarily in equity securities of small capitalization companies whose main trading areas are outside the United States.
EIT – International Total Return Funds invest primarily in equity securities whose main trading markets are outside the United States for capital appreciation and current or future income.
EJP – Japanese Equity Funds invest primarily in equity securities of companies in Japan.
ELT – Latin America Equity Funds invest primarily in equity securities of companies in Latin America.
EPC – Pacific Equity Funds including Japan Funds invest primarily in equity securities of companies in the Pacific Region including Japan.
EPX – Pacific Equity excluding Japan Funds invest primarily in equity securities of companies in the Pacific Region excluding Japan.
ERP – European Equity Funds invest in equity securities whose primary trading markets are confined to Europe or specific European countries.
FLG – Flexible Global Funds are generally free to assign up to 100% of their assets across various asset classes, including foreign and domestic equities, fixed-income securities and money market instruments. This is used if other classifications indicate that it is an international fund.
JPN – Japanese equity.
PAC – Pacific equity.
Number of funds by region/country: Australia 4; Austria 1; Belgium 2; Brazil 1; Canada 8; China 23; Developing 279; European 165; France 3; Germany 7; Holland 2; Hong Kong 2; India 5; International 1,338; International small cap 87; International total returns 177; Israel 3; Italy 4; Japan 56; Korea 6; Latin American 51; Malaysia 1; Mexico 3; New Zealand 1; Nordic 2; Pacific 89; Pacific (excluding Japan) 73; Poland 2; Russia 2; South Africa 2; Singapore 1; Spain 3; Sweden 1; Switzerland 2; Taiwan 1; UK 4.
The following are lists of the largest funds in three categories: domestic funds, international funds, and emerging market funds.

Largest domestic funds at the end of 2003 (total assets, $ millions):
Vanguard 500 Index/Inv – 75,342.5
Fidelity Magellan – 67,995.1
Investment Company of America Fund/A – 58,353.4
Washington Mutual Investors Fund/A – 55,575
Growth Fund of America/A – 48,073.8
Standard & Poors Depository Rcpt – 43,815.4
Fidelity Contra Fund – 36,051.4
Income Fund of America/A – 31,955.0
Fidelity Growth & Income – 30,571.7
Vanguard Institutional Index/Instl – 29,457.5
Fidelity Low Priced Stock – 26,725.2

Largest international funds at the end of 2003 (total assets, $ millions):
EuroPacific Growth Fund/A – 29,907.6
Capital Income Builder Fund/A – 20,605.0
Fidelity Diversified International – 13,559.1
Templeton Foreign Fund/A – 12,039.9
Artisan International Fund – 9,591.1
Vanguard International Growth – 6,424.1
Vanguard European Stock Index – 6,251.5
Morgan Stanley Instl:Intl Equity/A – 5,639.0
iShares MSCI EAFE Index Fd – 5,350.0
Vanguard Total Internatl Stock Index – 5,279.0
T Rowe Price Internatl Stock Fund – 5,197.4
Chapter 82
Predicting Bond Yields Using Defensive Forecasting
Glenn Shafer and Samuel Ring
Abstract This chapter introduces a new method of forecasting, defensive forecasting, and illustrates its potential by showing how it can be used to predict the yields of corporate bonds in real time. Defensive forecasting identifies a betting strategy that succeeds if forecasts are inaccurate and then makes forecasts that will defeat this strategy. The theory of defensive forecasting is based on the game-theoretic framework for probability introduced by Shafer and Vovk in 2001. In this framework, we frame statistical tests as betting strategies. We reject a hypothesis when a strategy for betting against it multiplies the capital it risks by a large factor. We prove each classical theorem, such as the law of large numbers, by constructing a betting strategy that multiplies the capital it risks by a large factor if the theorem's prediction fails. Defensive forecasting plays against strategies of this type.

Keywords Defensive forecasting · Game-theoretic probability · On-line prediction · Predicting bond yields · Support vector machines
82.1 Introduction

Defensive forecasting identifies a betting strategy that succeeds if forecasts are inaccurate and then makes forecasts that will defeat this strategy. The theory of defensive forecasting is based on the game-theoretic framework for probability introduced by Shafer and Vovk in 2001 (see Shafer and Vovk 2001 and www.probabilityandfinance.com). In this framework, we frame statistical tests as betting strategies. We reject a hypothesis when a strategy for betting against it
G. Shafer and S. Ring, Rutgers University, Newark, NJ, USA. e-mail: [email protected]
multiplies the capital it risks by a large factor. We prove each classical theorem, such as the law of large numbers, by constructing a betting strategy that multiplies the capital it risks by a large factor if the theorem’s prediction fails. Defensive forecasting plays against strategies of this type. In Sect. 82.2, we introduce the game-theoretic framework. In Sect. 82.3, we explain the intuition and the theory of defensive forecasting. In Sect. 82.4, we use data on corporate bond prices to illustrate the method and compare it to two other methods, a support vector machine and an implementation of Merton’s model. We conclude, in Sect. 82.5, with a brief summary. In the remainder of this introductory section, we explain how game-theoretic probability relates to the general problem of on-line prediction, why game-theoretic probability is relevant to finance, and why making predictions or forecasts that pass statistical tests has been considered so difficult.
82.1.1 On-Line Prediction

The problem of predicting prices in financial markets in real time is an example of on-line prediction, which has attracted attention in machine learning in recent years (see Cesa-Bianchi and Lugosi 2006; Vovk et al. 2005a and the wiki at http://www.onlineprediction.net). In general, on-line prediction can be thought of as an interaction between two players, say Forecaster and Reality, who move in sequence as follows:

PROTOCOL 1. ON-LINE PREDICTION
FOR $n = 1, 2, \ldots, N$:
  Reality announces $x_n$.
  Forecaster announces $f_n$.
  Reality announces $y_n$.

Here $f_n$ is Forecaster's prediction of $y_n$, and $x_n$ is auxiliary information Forecaster can use. The two players move in sequence and see each other's moves as they are made. So Forecaster can use $(x_1, y_1), \ldots, (x_{n-1}, y_{n-1})$ and $x_n$ to predict $y_n$. There are many variations on this basic protocol. Sometimes we allow play to continue indefinitely instead of
stopping it at a fixed horizon $N$; sometimes there is a stopping rule that depends on the moves by Reality and Forecaster. We can elaborate the protocol with additional players and rules restricting the moves, and we can make it into a game by adding rules for winning. One popular elaboration, called prediction with expert advice, adds other forecasters (experts) and a loss function that penalizes errors in the forecasts; the problem for Forecaster is then to choose his $f_n$ so that he does nearly as well as the best of the experts. Often this can be achieved by taking $f_n$ to be some sort of average of the experts' forecasts. Another popular elaboration, often called Bayesian, introduces a probability distribution for the sequence $(x_1, y_1), \ldots, (x_N, y_N)$, so that Forecaster can set $f_n$ equal to the conditional expected value of $y_n$ given $(x_1, y_1), \ldots, (x_{n-1}, y_{n-1}), x_n$. The game-theoretic framework for probability treats $y_n$ as a payoff and treats $f_n$ as a price for it, and it adds a player, whom we call Skeptic, who gambles at these prices. This produces the following protocol, where $\mathcal{K}_n$ designates Skeptic's capital after the $n$th round:

PROTOCOL 2. ON-LINE PREDICTION WITH SKEPTIC
$\mathcal{K}_0 := 1$.
FOR $n = 1, 2, \ldots, N$:
  Reality announces $x_n$.
  Forecaster announces $f_n$.
  Skeptic announces $s_n$.
  Reality announces $y_n$.
  $\mathcal{K}_n := \mathcal{K}_{n-1} + s_n(y_n - f_n)$.

Here we have set Skeptic's initial capital, $\mathcal{K}_0$, equal to one monetary unit, but other choices for the initial capital are sometimes convenient. We can make Protocol 2 into a game by specifying goals for the players. Different goals or rules for winning produce games appropriate for different chapters of probability theory and finance theory. Shafer and Vovk's 2001 book (Shafer and Vovk 2001) emphasizes games in which Skeptic tries to multiply his initial capital by a large factor without risking additional capital (without moving so that Reality can make $\mathcal{K}_n$ negative for some $n$). A typical theorem of classical probability theory (the law of large numbers or the law of the iterated logarithm, for example) says an event $A$ whose happening or failing is determined by Reality's moves has probability close to 1. The corresponding game-theoretic theorem says that Skeptic has a strategy that will multiply the capital he risks by a large factor unless the event happens. Theorems of this type constitute a flexible and generalized version of probability theory, in which only forecasts $f_1, \ldots, f_N$ are required; a complete probability distribution for Reality's moves need not be given.
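Protocol 2 is straightforward to render in code. The following minimal Python sketch is our own illustration, not part of the chapter: the three player functions are hypothetical stand-ins, and the capital update is exactly $\mathcal{K}_n = \mathcal{K}_{n-1} + s_n(y_n - f_n)$.

import random

def play_protocol2(reality_x, forecaster, skeptic, reality_y, N=1000):
    """One play of Protocol 2, returning Skeptic's final capital."""
    capital, history = 1.0, []
    for n in range(N):
        x = reality_x(n)                 # Reality announces x_n
        f = forecaster(history, x)       # Forecaster announces f_n
        s = skeptic(history, x, f)       # Skeptic announces s_n
        y = reality_y(n, x)              # Reality announces y_n
        capital += s * (y - f)           # K_n := K_{n-1} + s_n (y_n - f_n)
        history.append((x, y))
    return capital

# Hypothetical players: Reality flips a fair coin; Forecaster predicts 0.5;
# Skeptic stakes a small fixed amount on the outcome exceeding the forecast.
final = play_protocol2(
    reality_x=lambda n: None,
    forecaster=lambda hist, x: 0.5,
    skeptic=lambda hist, x, f: 0.1,
    reality_y=lambda n, x: random.randint(0, 1),
)
print("Skeptic's final capital:", final)

Against the well-calibrated forecast 0.5, Skeptic's capital wanders but has no systematic drift; this is the behavior the theorems below make precise.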
As we have already noted, a probability distribution $P$ for Reality's moves $(x_1, y_1), \ldots, (x_N, y_N)$ can be thought of as a complete strategy for Forecaster: it tells Forecaster to set $f_n$ equal to $E_P(y_n \mid (x_1, y_1), \ldots, (x_{n-1}, y_{n-1}), x_n)$, where $E_P$ designates the expected value with respect to $P$. But game-theoretic probability does not emphasize strategies for Forecaster; instead it emphasizes strategies for Skeptic, which are tests of Forecaster. Both prediction with expert advice and Bayesian statistics can be thought of as methods that average strategies for Forecaster to obtain a strategy that performs well under diverse circumstances. Defensive forecasting is analogous but different. It averages strategies for Skeptic (tests of Forecaster) to obtain a portmanteau strategy for Skeptic (test of Forecaster). Then it has Forecaster play against the portmanteau strategy for Skeptic.
82.1.2 Applying the Framework to Finance

The game-theoretic framework can be used to derive various stylized facts in finance from an efficient market hypothesis alone, without making probabilistic assumptions. Examples include CAPM (Vovk and Shafer 2008) and the absence of lagged correlations in equity returns (Wu and Shafer 2007). Here we present a simpler example: the stylized fact that changes in market prices over an interval of time of length $dt$ scale as $\sqrt{dt}$. In a securities market where shares are traded 252 days a year, for example, the typical change in price of a share from one year to the next is $\sqrt{252}$, or about 16, times as large as the typical change from one day to the next. There is a standard way of explaining this. We begin by assuming that price changes are governed by some objective but unknown probability distribution. Then we argue that successive changes must be uncorrelated; otherwise someone who knew the correlation (or learned it by observation) could devise a trading strategy with positive expected value. Uncorrelatedness of 252 successive daily price changes implies that their sum, the annual price change, has variance 252 times as large and hence standard deviation, or typical value, $\sqrt{252}$ times as large. This is a simple argument, but it uses the mysterious chance mechanism twice, first when we use the probabilistic concept of correlation, and then when we interpret market efficiency as the absence of a trading strategy with positive expected value. As we now explain, we can replace the appeal to a chance mechanism with a purely game-theoretic argument, in which the assumption of market efficiency is expressed by saying that a speculator will not multiply the capital he risks by a large factor.
PROTOCOL 3. DAILY TRADING IN THE MARKET FOR A SECURITY
$\mathcal{K}_0 := 1$.
Market announces $y_0 \in \mathbb{R}$.
FOR $n = 1, 2, \ldots, N$:
  Investor announces $s_n \in \mathbb{R}$.
  Market announces $y_n \in \mathbb{R}$.
  $\mathcal{K}_n := \mathcal{K}_{n-1} + s_n(y_n - y_{n-1})$.

For simplicity, consider the above protocol, which describes a market in shares of a corporation. Investor plays the role of Skeptic; he tries to make money, and we assume he will not do so without risking more than his initial stake, which we take to be $1. Market plays the roles of Forecaster (by giving opening prices) and Reality (by giving closing prices). We suppose that today's opening price is yesterday's closing price, so that Market gives only one price each day, at the end of the day. When Investor holds $s_n$ shares during day $n$, he makes $s_n(y_n - y_{n-1})$, where $y_n$ is the price at the end of day $n$. For simplicity, we ignore the fact that the price $y_n$ of a share cannot be negative. This does not invalidate our results; they are worst-case results, telling us what Investor can accomplish regardless of Market's behavior, and so they are not invalidated by a further restriction on Market's moves. Stronger results may be obtained, however, when the restriction $y_n \ge 0$ is imposed; see Vovk and Shafer (2003). Since no chance mechanism is assumed to be operating, we cannot appeal to the idea of the variance of a probability distribution for price changes to explain what $\sqrt{dt}$ scaling means. But we can use

$$\sqrt{\frac{1}{N}\sum_{n=1}^{N}(y_n - y_{n-1})^2} \qquad (82.1)$$

as the typical daily change, and we can compare it to the magnitude of the change we see over the whole game, say

$$\max_{0 \le n \le N} |y_n - y_0|. \qquad (82.2)$$

The quantity (Equation (82.2)) should have the same order of magnitude as $\sqrt{N}$ times the quantity (Equation (82.1)). Equivalently, we should have

$$\sum_{n=1}^{N}(y_n - y_{n-1})^2 \asymp \max_{0 \le n \le N}(y_n - y_0)^2, \qquad (82.3)$$

where $\asymp$ is understood to mean that the two quantities have the same order of magnitude. Do we have any reason to think that (Equation (82.3)) should hold? Indeed we do. As it turns out, there is a scoring martingale that becomes large (makes a lot of money
for Investor) if (Equation (82.3)) is violated. Market (whose moves are affected by all the participants in the market, but only slightly by Investor) tends to set prices so that Investor will not make a lot of money, and as we will see in Sect. 82.3, he can more or less do so. So we may expect (Equation (82.3)) to hold. The strategy that makes money for Investor if (Equation (82.3)) is violated is an average of two scoring strategies, a momentum strategy (hold more shares after the price goes up) and a contrarian strategy (hold more shares after the price goes down).
1. The momentum strategy turns $1 into $D/E or more if $\sum_{n}(y_n - y_{n-1})^2 \le E$ and $\max_n (y_n - y_0)^2 \ge D$.
2. The contrarian strategy turns $1 into $E/D or more if $\sum_{n}(y_n - y_{n-1})^2 \ge E$ and $\max_n (y_n - y_0)^2 \le D$.
For details, see Vovk and Shafer (2003).
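To make the scaling claim concrete, here is a small numerical check of (82.3) on a simulated random-walk price path; the path, seed, and parameters are our own illustrative assumptions, not market data.

import numpy as np

rng = np.random.default_rng(1)
N = 252                                                    # one year of daily prices
y = 100 + np.cumsum(rng.normal(scale=1.0, size=N + 1))     # simulated price path
dy = np.diff(y)

quad_var = np.sum(dy ** 2)                # left-hand side of (82.3)
max_move = np.max((y - y[0]) ** 2)        # right-hand side of (82.3)
print("sum of squared daily changes:", quad_var)
print("squared maximal displacement:", max_move)
print("ratio:", quad_var / max_move)      # typically of order 1

A path that trends strongly (large displacement, small quadratic variation) would reward the momentum strategy, while a path that oscillates violently within a narrow range would reward the contrarian strategy; on a typical random-walk path the two quantities stay comparable.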
82.1.3 Probability Forecasting

Although our goal in this paper is to explain the use of defensive forecasting to predict bond yields, we will introduce the game-theoretic framework using the simpler example where one tries to predict events by giving probabilities for them. We fit this into the framework of Protocols 1 and 2 by supposing that

$$y_n := \begin{cases} 1 & \text{if the event happens,} \\ 0 & \text{if the event does not happen.} \end{cases}$$

In this case, the price $f_n$ for $y_n$ is the probability of the event. Writing $p_n$ instead of $f_n$ and omitting for simplicity the auxiliary information $x_1, \ldots, x_N$, we obtain this protocol:

PROTOCOL 4. PROBABILITY FORECASTING
$\mathcal{K}_0 := 1$.
FOR $n = 1, 2, \ldots, N$:
  Forecaster announces $p_n \in [0, 1]$.
  Skeptic announces $s_n \in \mathbb{R}$.
  Reality announces $y_n \in \{0, 1\}$.
  $\mathcal{K}_n := \mathcal{K}_{n-1} + s_n(y_n - p_n)$.

If $s_n > 0$, Skeptic is betting on the event; he gets $s_n(1 - p_n)$ if it happens and loses $s_n p_n$ if it fails. If $s_n < 0$, he is betting against the event; he gets $-s_n p_n$ if it fails and loses $-s_n(1 - p_n)$ if it happens. At first glance, it appears that Skeptic and Reality can easily beat Forecaster in Protocol 4. When Forecaster gives a low probability, Skeptic bets on the event and Reality makes it happen; when Forecaster gives a high probability, Skeptic
bets against the event and Reality makes it fail. This idea was formalized by Dawid (1986). As Dawid explained, if Reality follows the strategy

$$y_n := \begin{cases} 1 & \text{if } p_n < 0.5 \\ 0 & \text{if } p_n \ge 0.5 \end{cases}$$

and Skeptic follows the strategy

$$s_n := \begin{cases} 1 & \text{if } p_n < 0.5 \\ -1 & \text{if } p_n \ge 0.5, \end{cases}$$
then Skeptic makes a profit of at least 0.5 on every round: $1 - p_n$ when $p_n < 0.5$ and $p_n$ when $p_n \ge 0.5$. But this example is not as convincing as it first appears. Its weak point is that it requires Reality and Skeptic to be able to tell whether $p_n$ is exactly equal to 0.5 and to switch their moves with infinite abruptness (discontinuously) as $p_n$ shifts from 0.5 to something ever so slightly less. Because the idea of an infinitely abrupt shift lives in an idealized mathematical world, we should be wary about drawing practical conclusions from it. To show that Dawid's counterexample depends on the artificiality of the mathematical idealization, it suffices to show that it disappears when we hobble Skeptic and Reality in a way that is reasonable or inconsequential in practice. In this chapter we do this by requiring that Skeptic follow a strategy that makes his bet a continuous function of the forecast $p_n$. This is not a practical impediment, because a continuous function can change with any degree of abruptness short of infinite. It can easily be shown – we reproduce in Sect. 82.3.1 below the proof given in Vovk et al. (2004) – that Forecaster can beat any continuous strategy. And as we explain in Sect. 82.3.2, there are continuous strategies for Skeptic that make a lot of money unless Forecaster's probabilities perform well.
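For concreteness, the following short Python sketch replays Dawid's counterexample; the particular forecast sequence is an arbitrary illustration of our own, and the point to notice is the infinitely abrupt switch at $p_n = 0.5$, which is exactly what the continuity requirement rules out.

def dawid_round(p):
    """Reality's and Skeptic's discontinuous responses to the forecast p."""
    y = 1 if p < 0.5 else 0
    s = 1 if p < 0.5 else -1
    return y, s

capital = 1.0
forecasts = [0.3, 0.7, 0.5, 0.49999, 0.9]   # any forecasts whatsoever
for p in forecasts:
    y, s = dawid_round(p)
    gain = s * (y - p)
    capital += gain
    assert gain >= 0.5                       # profit of at least 0.5 every round
print("Skeptic's capital after", len(forecasts), "rounds:", capital)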
82.2 Game-Theoretic Probability

The game-theoretic framework for probability has its roots in work by Jean Ville in the 1930s (Ville 1939). Ville worked under the assumption that complete probability distributions are given, and for the most part he considered only the case where we seek betting strategies that multiply the capital risked by an infinite factor (achieving this corresponds to the happening of an event of zero probability, rather than merely the happening of an event of small probability). Moreover, he spelled out his reasoning fully only for the case where outcomes are binary. But he explained fully the idea of proving theorems by constructing betting strategies.
In Sect. 82.2.1, we review Ville's theorem, which establishes the equivalence, within classical probability, between the happening of an event of small probability and the success of a betting strategy. In Sect. 82.2.2, we review how Ville thought about martingales. In Sect. 82.2.3, we discuss a martingale that Ville used to prove the strong law of large numbers in the case of successive independent events that all have the same probability $p$. We then turn to the generalization to the case where no joint probability distribution for what we may observe is given. In Sect. 82.2.4, we return to Protocol 4. We discuss the weak and strong laws of large numbers for this protocol, and as an illustration, we sketch a very simple proof of the weak law. In Sect. 82.2.5, we discuss a protocol in which predictions $m_n$ of quantities $y_n$ are interpreted as prices at which Skeptic can buy or sell the $y_n$, and we explain how we can obtain laws of large numbers even in this case.
82.2.1 Ville's Theorem

In order to state Ville's theorem, consider a sequence $Y_1, Y_2, \ldots$ of binary random variables with a joint probability distribution $P$. Suppose, for simplicity, that $P$ assigns every finite sequence $y_1, \ldots, y_n$ of 0s and 1s positive probability, so that its conditional probabilities for $Y_n$ given values of the preceding variables are always unambiguously defined. Since the forecasts will be the conditional probabilities $P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1})$, we do not need Forecaster as a player. The protocol becomes simply the following:

PROTOCOL 5. VILLE'S PROBABILITY PROTOCOL
$\mathcal{K}_0 := 1$.
FOR $n = 1, 2, \ldots$:
  Skeptic announces $s_n \in \mathbb{R}$.
  Reality announces $y_n \in \{0, 1\}$.
  $\mathcal{K}_n := \mathcal{K}_{n-1} + s_n(y_n - P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1}))$.
Restriction on Skeptic: Skeptic must choose the $s_n$ so that his capital is always nonnegative ($\mathcal{K}_n \ge 0$ for all $n$) no matter how Reality moves.

Notice that Skeptic and Reality now continue for an infinite number of rounds instead of stopping after round $N$. Ville's theorem for Protocol 5 says that Skeptic's getting infinitely rich is equivalent to an event of zero probability happening, in the following sense:
1. When Skeptic follows a strategy that gives $s_n$ as a function of $y_1, \ldots, y_{n-1}$,
$$P(\mathcal{K}_0, \mathcal{K}_1, \ldots \text{ is unbounded}) = 0. \qquad (82.4)$$
2. If $A$ is a measurable subset of $\{0, 1\}^\infty$ with $P(A) = 0$, then Skeptic has a strategy that guarantees
$$\lim_{n \to \infty} \mathcal{K}_n = \infty$$
whenever $(y_1, y_2, \ldots) \in A$.

We can summarize these two statements by saying that Skeptic's being able to multiply his capital by an infinite factor is equivalent to the happening of an event with probability zero. Practical applications require that we consider small probabilities rather than zero probabilities. In this case, we can use the following finitary version of Ville's theorem, which is proven in Chap. 8 of Shafer and Vovk (2001).
1. When Skeptic follows a strategy that gives $s_n$ as a function of $y_1, \ldots, y_{n-1}$,
$$P\left(\sup_{n = 0, 1, \ldots} \mathcal{K}_n \ge \frac{1}{\delta}\right) \le \delta \qquad (82.5)$$
for every $\delta > 0$.
2. If $A$ is a measurable subset of $\{0, 1\}^\infty$ with $P(A) \le \delta$, then Skeptic has a strategy that guarantees
$$\liminf_{n \to \infty} \mathcal{K}_n \ge \frac{1}{\delta}$$
whenever $(y_1, y_2, \ldots) \in A$.

We can summarize these results by saying that Skeptic's being able to multiply his capital by a factor of $1/\delta$ or more is equivalent to the happening of an event with probability $\delta$ or less. If we assume that the game is played only for a fixed number of rounds $N$ instead of being continued indefinitely, then the preceding statements simplify to the following:
1. When Skeptic follows a strategy that gives $s_n$ as a function of $y_1, \ldots, y_{n-1}$,
$$P\left(\max_{0 \le n \le N} \mathcal{K}_n \ge \frac{1}{\delta}\right) \le \delta \qquad (82.6)$$
for every $\delta > 0$.
2. If $A$ is a subset of $\{0, 1\}^N$ with $P(A) < \delta$, then Skeptic has a strategy that guarantees $\mathcal{K}_N > \frac{1}{\delta}$ whenever $(y_1, \ldots, y_N) \in A$.

82.2.2 Martingales in Ville's Picture

Now suppose Skeptic begins with capital $\alpha$, not necessarily equal to 1, and drop the requirement that he keep his capital nonnegative.

PROTOCOL 6. VILLE'S PROBABILITY PROTOCOL AGAIN
Parameters: real number $\alpha$, positive probability distribution $P$ on $\{0, 1\}^\infty$
$\mathcal{K}_0 := \alpha$.
FOR $n = 1, 2, \ldots$:
  Skeptic announces $s_n \in \mathbb{R}$.
  Reality announces $y_n \in \{0, 1\}$.
  $\mathcal{K}_n := \mathcal{K}_{n-1} + s_n(y_n - P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1}))$.   (82.7)

Recall the meaning of the assumption that the probability distribution $P$ is positive: for every sequence $y_1, \ldots, y_n$ of 0s and 1s, $P(Y_1 = y_1, \ldots, Y_n = y_n) > 0$. A strategy for Skeptic in Protocol 6 is a rule that specifies, for each $n$ and each possible sequence of prior moves $y_1, \ldots, y_{n-1}$ by Reality, a move $s_n$ for Skeptic. When such a strategy is fixed and $\alpha$ is given, Skeptic's capital $\mathcal{K}_n$ becomes a function of Reality's moves $y_1, \ldots, y_n$. Ville was the first to call the sequence of functions $\mathcal{K}_0, \mathcal{K}_1, \ldots$ a martingale. In Protocol 6, knowing the capital process $\mathcal{K}_0, \mathcal{K}_1, \ldots$ produced by a strategy for Skeptic is equivalent to knowing the strategy and the initial capital $\alpha$. You can find $\mathcal{K}_0, \mathcal{K}_1, \ldots$ from $\alpha$ and the strategy, and you can find $\alpha$ and the strategy from $\mathcal{K}_0, \mathcal{K}_1, \ldots$. This equivalence was Ville's justification for calling a capital process a martingale; before his work, "martingale" was a name for a gambler's strategy, not a name for his capital process. The name stuck; now we call capital processes martingales even in protocols where the strategy cannot be recovered from them. Let us call a martingale $\mathcal{K}_0, \mathcal{K}_1, \ldots$ a scoring martingale if $\mathcal{K}_0 = 1$ and $\mathcal{K}_n$ is always nonnegative – i.e., $\mathcal{K}_n(y_1, \ldots, y_n) \ge 0$ for all $n$ and all $y_1, \ldots, y_n$. Let us call a strategy for Skeptic that produces a scoring martingale when it starts with unit capital a scoring strategy. When Skeptic plays a scoring strategy starting with $\mathcal{K}_0 = 1$, he can be sure that $\mathcal{K}_n \ge 0$ for all $n$ no matter how Reality plays. He may be risking the entire initial unit of capital, but he is not putting any other capital – his own or anyone else's – at risk. We relate Protocol 6 and its probability distribution $P$ to the empirical world by predicting that a scoring martingale will be bounded. This prediction can be used in two ways:

Prediction. The prediction that a particular scoring martingale $\mathcal{K}_n$ is bounded can imply other predictions that are
interesting in their own right. It is easy to give examples. In §82.2.3, we will look at a scoring martingale whose being bounded, as predicted with probability one by Equation (82.4), implies the strong law of large numbers. In §82.2.4, we will look at a scoring martingale whose being bounded by $1/\delta$, as predicted with probability $1 - \delta$ by Equation (82.6), implies the weak law of large numbers.

Testing. The actual values of a scoring martingale test the validity of the probability distribution $P$. The larger these "scores" are, the more strongly $P$ is refuted.

Ville used scoring martingales in both ways. Equation (82.7), the rule for updating the capital in Protocol 6, implies that $\mathcal{K}_{n-1}$ is the expected value at time $n - 1$ of the future value of $\mathcal{K}_n$:
$$E(\mathcal{K}_n \mid y_1, \ldots, y_{n-1}) = \mathcal{K}_{n-1}(y_1, \ldots, y_{n-1}). \qquad (82.8)$$
Because its left-hand side is equal to
$$\mathcal{K}_n(y_1, \ldots, y_{n-1}, 1)\,\frac{P(y_1, \ldots, y_{n-1}, 1)}{P(y_1, \ldots, y_{n-1})} + \mathcal{K}_n(y_1, \ldots, y_{n-1}, 0)\,\frac{P(y_1, \ldots, y_{n-1}, 0)}{P(y_1, \ldots, y_{n-1})},$$
Equation (82.8) is equivalent to
$$Q(y_1, \ldots, y_{n-1}, 1) + Q(y_1, \ldots, y_{n-1}, 0) = Q(y_1, \ldots, y_{n-1}), \qquad (82.9)$$
where $Q$ is the function on finite strings of 0s and 1s defined by
$$Q(y_1, \ldots, y_n) = \mathcal{K}_n(y_1, \ldots, y_n)\,P(y_1, \ldots, y_n).$$
Equation (82.9) is necessary and sufficient for a nonnegative function $Q$ on finite strings of 0s and 1s to define a probability measure on $\{0, 1\}^\infty$. So a nonnegative martingale with respect to $P$ is the same thing as the ratio of a probability measure $Q$ to $P$ – i.e., a likelihood ratio with $P$ as denominator. Joe Doob imported Ville's notion of a martingale into measure-theoretic probability, where the gambling picture is not made explicit, by taking Equation (82.8) as the definition of a martingale (Doob 1953). Game-theoretic probability goes in a different direction. It uses Ville's definition of a martingale in more general forecasting games, where the forecasts are given by a player in the game, not by a probability distribution $P$ for Reality's moves. In these games, martingales are not likelihood ratios. They are simply capital processes for Skeptic. One very important property of martingales that Ville used, which does carry over to the more general games studied in game-theoretic probability, is this: an average of scoring strategies produces the same average of the scoring martingales. In order to avoid any confusion, let us spell out what this means in the case where we are taking a simple average of two strategies and also in the general case where we are averaging a class of strategies.

Simplest case. Suppose the scoring strategies $S^1$ and $S^2$ recommend moves $s_n^1(y_1, \ldots, y_{n-1})$ and $s_n^2(y_1, \ldots, y_{n-1})$, respectively, on the $n$th round of the game. The average of $S^1$ and $S^2$ – call it $S$ – recommends the move
$$s_n(y_1, \ldots, y_{n-1}) = \frac{1}{2}s_n^1(y_1, \ldots, y_{n-1}) + \frac{1}{2}s_n^2(y_1, \ldots, y_{n-1})$$
on the $n$th round of the game. If we write $\mathcal{K}_n^1$, $\mathcal{K}_n^2$, and $\mathcal{K}_n$ for the corresponding scoring martingales, then
$$\mathcal{K}_n(y_1, \ldots, y_n) = \frac{1}{2}\mathcal{K}_n^1(y_1, \ldots, y_n) + \frac{1}{2}\mathcal{K}_n^2(y_1, \ldots, y_n)$$
for all $n$ and all $y_1, \ldots, y_n$.

General case. In general, consider strategies $S^\xi$, where the index $\xi$ ranges over a set $\Xi$. We may average these strategies with respect to a probability distribution $\mu$ on $\Xi$, obtaining a strategy $S$ with moves
$$s_n(y_1, \ldots, y_{n-1}) = \int_\Xi s_n^\xi(y_1, \ldots, y_{n-1})\,\mu(d\xi),$$
provided only that this integral always converges. The corresponding scoring martingales then average in the same way:
$$\mathcal{K}_n(y_1, \ldots, y_n) = \int_\Xi \mathcal{K}_n^\xi(y_1, \ldots, y_n)\,\mu(d\xi).$$
These assertions are true because the increment $\mathcal{K}_n - \mathcal{K}_{n-1}$ is always linear in Skeptic's move $s_n$. This linearity is a feature of all the protocols considered in Shafer and Vovk (2001).
82.2.3 Borel's Strong Law in Game-Theoretic Form

Ville studied some strategies for Skeptic in the special case of Protocol 6 in which $P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1})$ is always equal to $p$. Let us review the scoring martingale he used to prove the strong law of large numbers for this special case: Skeptic can become infinitely rich unless Reality makes the relative frequency of 1s converge to $p$. Let us assume that $P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1})$ is always equal to $p$:
PROTOCOL 7. FORECASTING WITH A CONSTANT PROBABILITY $p$
Parameter: real number $p$ satisfying $0 < p < 1$
$\mathcal{K}_0 := 1$.
FOR $n = 1, 2, \ldots$:
  Skeptic announces $s_n \in \mathbb{R}$.
  Reality announces $y_n \in \{0, 1\}$.
  $\mathcal{K}_n := \mathcal{K}_{n-1} + s_n(y_n - p)$.

This protocol represents what classical probability calls independent Bernoulli trials: successive events that are independent and all have the same probability. The scoring martingale Ville used to prove the strong law of large numbers for this protocol is
$$\mathcal{K}_n(y_1, \ldots, y_n) := \frac{r_n!\,(n - r_n)!}{(n + 1)!}\,p^{-r_n} q^{-(n - r_n)}, \qquad (82.10)$$
where $q$ is equal to $1 - p$ and $r_n$ is the number of 1s among $y_1, \ldots, y_n$:
$$r_n := \sum_{i=1}^n y_i$$
(Ville 1939, p. 52; Ville 1955; Robbins 1970). To confirm that Equation (82.10) is a scoring martingale, it suffices to notice that it is a likelihood ratio with $P$ as the denominator. In fact, $p^{r_n} q^{n - r_n} = P(y_1, \ldots, y_n)$ and
$$\frac{r_n!\,(n - r_n)!}{(n + 1)!} = \int_0^1 \theta^{r_n}(1 - \theta)^{n - r_n}\,d\theta = Q(y_1, \ldots, y_n), \qquad (82.11)$$
where $Q$ is the probability distribution obtained by averaging, over $\theta$, the probability distributions under which the events $y_i = 1$ are independent and all have the same probability $\theta$, possibly different from $p$.
Proposition 1. The strong law of large numbers (Émile Borel). With probability one,
$$\lim_{n \to \infty} \frac{r_n}{n} = p. \qquad (82.12)$$

Sketch of proof. Because Equation (82.10) is a scoring martingale, Ville's theorem (Equation (82.4)) says that, with probability one, there is a constant $C$ such that
$$\frac{r_n!\,(n - r_n)!}{(n + 1)!}\,p^{-r_n} q^{-(n - r_n)} \le C \qquad (82.13)$$
for $n = 0, 1, \ldots$. One can deduce Equation (82.12) from Equation (82.13) using Stirling's formula.
The reader may verify that Equation (82.10) is Skeptic's capital process when he follows the strategy that prescribes the moves
$$s_n(y_1, \ldots, y_{n-1}) := \frac{1}{pq}\left(\frac{r_{n-1} + 1}{n + 1} - p\right)\mathcal{K}_{n-1}(y_1, \ldots, y_{n-1}). \qquad (82.14)$$
The ratio $(r_{n-1} + 1)/(n + 1)$ is close to the relative frequency of 1s so far, $r_{n-1}/(n - 1)$. Roughly speaking, the strategy (Equation (82.14)) bets that $y_n$ will be 1 when the relative frequency is more than $p$, and that it will be 0 when the relative frequency is less than $p$. So whenever the relative frequency diverges from $p$, Reality must move it back towards $p$ to keep Skeptic from increasing his capital. The rate of convergence of $r_n/n - p$ to zero implied by Equation (82.13) is $\sqrt{(\log n)/n}$ (Ville 1955). Using other scoring martingales (obtained by averaging $\theta^{r_n}(1 - \theta)^{n - r_n}$ as in Equation (82.11), but with respect to distributions concentrated around $p$ rather than the uniform distribution), Ville also derived the faster convergence asserted by the law of the iterated logarithm (Shafer and Vovk 2001, chapter 5). We can also use martingale methods to prove other classical theorems, including the weak law of large numbers (Shafer and Vovk 2001, chapter 6) and the central limit theorem (Shafer and Vovk 2001, chapter 7). These take us outside classical probability, to situations like finance (Protocol 3) and forecasting (Protocol 4). In these situations, where we do not have a probability distribution, we cannot prove that something happens with probability one. But we can prove that if it does not happen, Skeptic can multiply the capital he risks by an infinite or large factor.
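The behavior of Ville's martingale (82.10) can be seen in a short simulation. The sketch below, a minimal illustration of our own under assumed Bernoulli data, computes $\log \mathcal{K}_n$ via the log-gamma function; the score stays modest when Reality draws outcomes with probability $p$ and grows without bound when the true frequency differs from $p$.

import math, random

def ville_log_martingale(ys, p):
    """log of Ville's scoring martingale (82.10) along the outcome sequence ys."""
    q = 1 - p
    logs, r = [], 0
    for n, y in enumerate(ys, start=1):
        r += y
        # log[ r!(n-r)!/(n+1)! * p^{-r} q^{-(n-r)} ]
        log_k = (math.lgamma(r + 1) + math.lgamma(n - r + 1) - math.lgamma(n + 2)
                 - r * math.log(p) - (n - r) * math.log(q))
        logs.append(log_k)
    return logs

random.seed(2)
p = 0.3
fair = [1 if random.random() < p else 0 for _ in range(2000)]      # frequency near p
biased = [1 if random.random() < 0.4 else 0 for _ in range(2000)]  # frequency near 0.4

print("log K_N when outcomes have probability p:  ", round(ville_log_martingale(fair, p)[-1], 2))
print("log K_N when outcomes have probability 0.4:", round(ville_log_martingale(biased, p)[-1], 2))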
82.2.4 Bernoulli's Theorem in Game-Theoretic Form

As a first step outside classical probability, let us again consider Protocol 4, the protocol for probability forecasting with a finite horizon $N$. Because this protocol has a finite horizon, we prove the weak law for it rather than the strong law. Consider the sum
$$S_n := \sum_{i=1}^n (y_i - p_i).$$
The weak law of large numbers says that if $N$ is sufficiently large, the relative frequency of 1s, $(1/N)\sum_{n=1}^N y_n$, should be close to the average probability forecast, $(1/N)\sum_{n=1}^N p_n$. This means $S_N/N$ should be close to zero. We state this precisely as follows:
Proposition 2. The weak law of large numbers (Jacob Bernoulli). There exists a scoring martingale that will exceed $1/\delta$ unless
$$\left|\frac{S_N}{N}\right| \le \frac{1}{\sqrt{N\delta}}. \qquad (82.15)$$

Proof. Consider the strategy for Skeptic that prescribes the move
$$s_n = \frac{2S_{n-1}}{N}.$$
This produces the capital process
$$\mathcal{K}_n = 1 + \frac{1}{N}\left(S_n^2 - \sum_{i=1}^n (y_i - p_i)^2\right),$$
which is a scoring martingale: $\mathcal{K}_n \ge 0$ for $n = 1, \ldots, N$ because $(y_i - p_i)^2 \le 1$ for each $i$. For the same reason, the hypothesis $\mathcal{K}_N \le 1/\delta$ implies Equation (82.15).

When $\epsilon$ and $\delta$ are small positive numbers, and $N$ is extremely large, so that
$$N \ge \frac{1}{\delta\epsilon^2},$$
Proposition 2 implies that we can be as confident that $|S_N/N| \le \epsilon$ as we are that Skeptic will not multiply his capital by the large factor $1/\delta$. The martingale in the proof appears in measure-theoretic form in Kolmogorov's 1929 proof of the weak law (Kolmogorov 1929). A positive probability distribution $P$ on $\{0, 1\}^N$ can serve as a strategy for Forecaster, prescribing
$$p_n := P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1}).$$
When such a strategy is imposed, Protocol 4 reduces to a finite-horizon version of Ville's protocol (Protocol 5), and Proposition 2 says that there is a scoring martingale in this finite-horizon protocol that will exceed $1/\delta$ unless
$$\left|\frac{1}{N}\sum_{n=1}^N \bigl(y_n - P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1})\bigr)\right| \le \frac{1}{\sqrt{N\delta}}, \qquad (82.16)$$
and by Ville's theorem, this implies that
$$P\left(\left|\frac{1}{N}\sum_{n=1}^N \bigl(y_n - P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1})\bigr)\right| \le \frac{1}{\sqrt{N\delta}}\right) \ge 1 - \delta.$$
When $P(Y_n = 1 \mid Y_1 = y_1, \ldots, Y_{n-1} = y_{n-1})$ is always equal to $p$, we are in a finite-horizon version of Protocol 7, and we get
$$P\left(\left|\frac{1}{N}\sum_{n=1}^N y_n - p\right| \le \frac{1}{\sqrt{N\delta}}\right) \ge 1 - \delta,$$
the elementary version of the classical weak law usually derived using Chebyshev's inequality. Forecaster need not follow a strategy defined by a probability distribution for $y_1, \ldots, y_N$. He can use information from outside the game to decide how to move, or he can simply follow his whims. So instead of having a classical probability distribution, we are in what Dawid called the prequential framework (Dawid 1984). In this framework, we cannot say that Equation (82.16) has probability at least $1 - \delta$, because we have no probability distribution, but Proposition 2 has the same intuitive meaning as such a probability statement. We consider Skeptic's multiplying his capital by $1/\delta$ or more against a good Forecaster unlikely in the same way as we consider the happening of an event unlikely when it is given probability $\delta$ or less by a valid probability distribution. Although we have focused on the weak law here, it is also straightforward to prove the strong law for probability forecasting with an infinite horizon. Kumon and Takemura have given a proof using a modified form of Ville's martingale (Equation (82.10)) (Kumon and Takemura 2008). In chapter 3 of Shafer and Vovk (2001), Shafer and Vovk gave an alternative proof using a martingale extracted from Kolmogorov's 1930 proof of the strong law (Kolmogorov 1930).
82.2.5 Bounded Forecasting To drive home the point that game-theoretic probability generalizes classical probability, suppose Forecaster’s task on each round is to give not a probability for an event but a prediction of a bounded quantity. To fix ideas, we assume that the quantity and the prediction are always in the interval from 0 to 1. P ROTOCOL 8. F INITE - HORIZON BOUNDED FORECASTING
Parameter: natural number N K0 WD 1. FOR n D 1; : : : ; N : Forecaster announces mn 2 Œ0; 1 . Skeptic announces sn 2 R. Reality announces yn 2 Œ0; 1 . Kn WD Kn1 C sn .yn mn /. As it turns out, the game-theoretic treatment of Protocol 9 is scarcely different from that of Protocol 4. When we set Sn WD
n X i D1
.yi mi /;
82 Predicting Bond Yields Using Defensive Forecasting
we find that Proposition 2 still holds, with exactly the same proof. Only slight changes are needed in the proposition and proof if some other finite interval is substituted for Œ0; 1 .
82.3 Defensive Forecasting We now turn from strategies for Skeptic, which test Forecaster, to strategies for Forecaster, which seek to withstand such tests. Forecaster engages in defensive forecasting when he plays in order to defeat a particular strategy for Skeptic. As it turns out, Forecaster can defeat any particular strategy for Skeptic, provided only that each move prescribed by the strategy varies continuously with respect to Forecaster’s previous move. Forecaster wants to defeat more than a single strategy for Skeptic. He wants to defeat simultaneously all the scoring strategies Skeptic might use. But as we will see, Forecaster can often amalgamate the scoring strategies he needs to defeat by averaging them, and then he can play against the average. Defeating the average may be good enough, because when any one of the scoring strategies rejects Forecaster’s validity, the average will generally reject as well, albeit less strongly. We begin by explaining defensive forecasting in the case of probability forecasting, leaving the reader to consult other expositions for other protocols. In Sect. 82.3.1, we review the simple proof that Forecaster can defeat any continuous strategy for Skeptic in the binary case. In Sect. 82.3.2, we explain how to average strategies for Skeptic to test for the calibration of binary probability forecasts, and in Sect. 82.3.3, we explain how resolution can also be achieved. In Sect. 82.3.4, we discuss briefly how defensive forecasting extends from to more general prediction problems such as we encounter in finance, where are predicting a non-binary quantity rather than giving a probability for an event. There is much more to say about defensive forecasting, much of it in recent papers by Vovk (2005c, 2004, 2005a,b, 2006, 2007a,b,c). We should also mention related work by Levin (1976, 1984), Gács (2005), Foster and Vohra (1998), Sandroni et al. (2003), and Kakade and Foster (2004).
82.3.1 Defeating a Continuous Strategy for Skeptic In this section, we repeat a simple proof, first given in Vovk et al. (2004), showing that Forecaster can defeat any
1265
particular fixed strategy for Skeptic, provided that each move prescribed by the strategy varies continuously with respect to Forecaster’s previous move. Consider a strategy S for Skeptic in Protocol 4, the protocol for probability forecasting. Write S.p1 ; y1 ; : : : ; pn1 ; yn1 ; pn /
(82.17)
for the move that S prescribes on the nth round of the game. At the beginning of the nth round, just before Forecaster makes his move pn , the earlier moves p1 ; y1 ; p1 ; : : : ; pn1 ; yn1 are known and therefore fixed, and so Equation (82.17) becomes a function of pn only, say Sn .pn /: This defines a function Sn on the interval Œ0; 1 . If Sn is continuous for all n and all p1 ; y1 ; : : : ; pn1 ; yn1 , let us say that S is forecast-continuous. When we fix the strategy S, Skeptic no longer has a role to play in the game, and we can omit him from the protocol. P ROTOCOL 9. B INARY PROBABILITY FORECASTING AGAINST A FIXED TEST
Parameter: Strategy S for Skeptic K0 WD ˛: FOR n D 1; 2; : : : ; N : Forecaster announces pn 2 Œ0; 1 . Reality announces yn 2 f0; 1g. Kn WD Kn1 C Sn .pn /.yn pn /. Proposition 3. If the strategy S is forecast-continuous, Forecaster has a strategy that ensures K0 K1 K2 . Proof. By the intermediate-value theorem, the continuous function Sn is always positive, always negative, or else satisfies Sn .p/ D 0 for some p 2 Œ0; 1 . So Forecaster can use the following strategy: If Sn is always positive, take pn WD 1 If Sn is always negative, take pn WD 0 Otherwise, choose pn so that Sn .pn / D 0
This guarantees that Sn .pn /.yn pn / 0, so that Kn Kn1 . If the reader finds it confusing that the notation Sn .pn / leaves the dependence on the earlier moves p1 ; y1 ; p1 ; : : : ; pn1 ; yn1 implicit, he or she may wish to think about the following alternative protocol, which leaves Skeptic in the game and has him announce the function Sn just before Forecaster makes his move pn :
1266
G. Shafer and S. Ring
P ROTOCOL 10. S KEPTIC CHOOSES A STRATEGY ON EACH ROUND
K0 WD ˛. FOR n D 1; 2; : : : ; N : Skeptic announces continuous Sn W Œ0; 1 ! R. Forecaster announces pn 2 Œ0; 1 . Reality announces yn 2 f0; 1g. Kn WD Kn1 C Sn .pn /.yn pn /. Protocol 10 gives Skeptic a little more flexibility than Protocol 9 does; it allows Skeptic to take into account information coming from outside the game as well as the previous moves p1 ; y1 ; : : : ; pn1 ; yn1 when he decides on Sn . But it still requires that Sn .pn / depend on pn continuously, and it still makes sure that Forecaster knows Sn before he makes his move pn . (Like all protocols in this chapter, Protocol 10 is a perfect-information protocol; the players move in sequence and see each other’s moves as they are made.) So the proof and conclusion of Proposition 3 still hold: Forecaster has a strategy that ensures K0 K1 K2 .
P
nWpn p yn for the number in the latter group. Thus the fraction of 1s on those rounds where pn is near p is
P nWpn p
P
nWpn p
If a forecaster is doing a good job, we expect about 30% of the events to which he gives probabilities near 30% to happen. If this is true not only for 30% but also for all other probability values, then we say the forecaster is well calibrated. Here are two ways to apply this somewhat fuzzy concept to the probabilities p1 ; : : : ; pN and outcomes y1 ; : : : ; yN . 1. For a given p 2 Œ0; 1 , divide the pn into those we consider near p and those we consider not near. Write P 1 nWpn p for the number in the group we consider near p. Divide this group further into those for which yn is equal to 0 and those for which yn is equal to 1. Write
:
K.pn ; p/ WD e .pn p/
(82.18)
2
(82.19)
(with > 0), which takes the value 1 when pn is exactly equal to p, a value close to 1 when pn is near p, and a value close to 0 when pn is far from p. Then we say the forecasts are well calibrated if PN
nD1 K.pn ; p/yn
PN
nD1
P ROTOCOL 11. F ORECASTER & R EALITY FOR n D 1; : : : ; N : Forecaster announces pn 2 Œ0; 1 . Reality announces yn 2 f0; 1g.
1
We say the forecasts are well calibrated at p if Equation (82.18) is approximately equal to p. We say the forecasts are well calibrated overall if they are well calibrated at p for every p for which the denominator of Equation (82.18), the number of pn near p, is large. 2. Alternatively, instead of dividing the pn sharply into those we consider near p and those we consider not near, we can introduce a continuous measure K.pn ; p/ of similarity, a function such as the Gaussian kernel
82.3.2 Calibration We now exhibit a forecast-continuous strategy S for Skeptic in Protocol 9 that multiplies Skeptic’s capital by a large factor whenever Forecaster’s probabilities fail to be well calibrated. According to Proposition 3, Forecaster can give probabilities that are well calibrated by playing against S. We follow closely the more general exposition in Vladimir Vovk’s “Non-asymptotic calibration and resolution” Vovk (2005c). Leaving aside Skeptic for a moment, consider binary forecasting with Forecaster and Reality alone:
yn
K.pn ; p/
p
(82.20)
for every p for which the denominator is large. The second approach, making the closeness of pn and p continuous, fits here, because Skeptic can enforce Equation (82.20) using a forecast-continuous strategy. Recall that a kernel on a set on Z is a function K W Z 2 ! R that satisfies two conditions: It is symmetric: K.p; p 0 / D K.p 0 ; p/ for all p; p 0 2 Z. Pm Pm It is positive definite: i D1 j D1 i j K.zi ; zj / 0 for
all real numbers 1 ; : : : ; m and all elements z1 ; : : : ; zm of Z. The Gaussian kernel (Equation 82.19) is a kernel on Œ0; 1 in this sense. It also satisfies 0 K.p; p 0 / 1 for all p; p 0 2 Z. And as we have already noted, K.p; p 0 / is 1 when p D p 0 , close to 1 when p 0 is near p, and close to 0 when p 0 is far from p. Our reasoning in this section works when K is any kernel on Œ0; 1 satisfying these conditions. The goal that Equation (82.20) hold for all p for which PN nD1 K.pn ; p/ is large is fuzzy on two counts: “ p” is fuzzy, and “large” is fuzzy. But reasonable interpretations of these two fuzzy predicates lead to the following precise goal. G OAL ˇN ˇ ˇX ˇ p ˇ ˇ K.pn ; p/.yn pn /ˇ N ˇ ˇ ˇ nD1
(82.21)
82 Predicting Bond Yields Using Defensive Forecasting
1267
Heuristic Proposition 1 Under reasonable interpretations of “ p” and “large,” Equation (82.21) implies that P Equation (82.20) holds for all p for which N nD1 K.pn ; p/ is large.
According to the proof of Proposition 3, playing against the K29 betting strategy means using the following algorithm to choose pn for n D 1; : : : ; N . K29 F ORECASTING A LGORITHM
Explanation 1 Condition (Equation 82.20) is equivalent to PN
nD1 K.pn ; p/.yn PN nD1 K.pn ; p/
p/
0:
(82.22)
PN
nD1 K.pn ; p/.pn PN nD1 K.pn ; p/
p/
0
P K.pn ; p/ is large. So Equation (82.22) holding when N PNnD1 when nD1 K.pn ; p/ is large is equivalent to nD1 K.pn ; p/.yn PN nD1 K.pn ; p/
pn /
0
(82.23)
PN holding when nD1 K.pn ; p/ is large. If we take PN K.p ; p/ being large to mean that n nD1 N X
n1 X
K.p; pi /.yi pi / D 0
K.pn ; p/
p N;
nD1
then this condition together with Equation (82.21) implies ˇ ˇP ˇ ˇ N ˇ nD1 K.pn ; p/.yn pn /ˇ 1; PN nD1 K.pn ; p/ which is a reasonable interpretation of Equation (82.23). In order to derive a strategy for Forecaster that guarantees the goal (Equation 82.21) in Protocol 11, let us imagine that Skeptic is allowed to enter the game and bet, as in Protocol 10. Our strategy for Forecaster will be to play against the following forecast-continuous strategy for Skeptic.
has at least one solution p in the interval Œ0; 1 , set pn equal to such a solution. Pn1 If i D1 K.p; pi /.yi pi / > 0 for all p 2 Œ0; 1 , set pn equal to 1. Pn1 If i D1 K.p; pi /.yi pi / < 0 for all p 2 Œ0; 1 , set pn equal to 0. Why does the K29 forecasting algorithm work? We will give a formal proof that it works and then discuss its success briefly from an intuitive viewpoint. Proposition 4. Suppose Forecaster plays in Protocol 11 using the K29 algorithm with a kernel K satisfying K.p; p/ 1 for all p 2 Œ0; 1 . Then Equation (82.21) will hold no matter how Reality chooses y1 ; : : : ; yN . Proof. Mercer’s theorem says that for any kernel K on Œ0; 1 , there is a mapping ˆ (called a feature mapping) of Œ0; 1 into a Hilbert space H such that K.p; p 0 / D ˆ.p/ ˆ.p 0 /
Sn .p/ D
sn D
K.pn ; pi /.yi pi /:
Proposition 3 says that when Forecaster plays the K29 forecasting algorithm against this strategy, Skeptic’s capital does not increase. So N X
sn .yn pn /
nD1
K.p; pi /.yi pi /
We call this strategy for Skeptic the K29 betting strategy because it is modeled on the martingale that Kolmogorov used in Kolmogorov (1929) to prove the weak law of large numbers (see Sect. 82.2.4).
n1 X i D1
0 KN K0 D
i D1
(82.25)
for all p; p 0 in Œ0; 1 , where is the dot product in H Mercer (1909); Schölkopf and Smola (2002). Under the K29 betting strategy, Skeptic’s moves are
K29 B ETTING S TRATEGY n1 X
(82.24)
i D1
We are assuming that K.pn ; p/ is small when p is far from pn . This implies that
PN
If the equation
D
n1 N X X
K.pn ; pi /.yn pn /.yi pi /
nD1 i D1
1 XX K.pn ; pi /.yn pn /.yi pi / 2 nD1 i D1 N
D
N
1268
G. Shafer and S. Ring
1X K.pn ; pn /.yn pn /2 2 nD1 N
2 N 1 X D .yn pn /ˆ.pn / 2 nD1 1X k.yn pn /ˆ.pn /k2 : 2 nD1 N
Hence
2 N N X X k.yn pn /ˆ.pn /k2 : .yn pn /ˆ.pn / nD1 nD1 (82.26) From Equation (82.26) and the fact that kˆ.p/k D
p K.p; p/ 1;
The pertinence of this formulation becomes clear when we recognize that Forecaster’s calibration cannot be rejected because of what Reality does on a single round of the forecasting game. A statistical test of calibration (such as the K29 betting strategy) can reject calibration (multiply its initial capital by a large factor) only as a result of a trend involving many trials. To reject calibration at p, we must see many rounds with pi near p, and the relative frequency of 1s for these pi must diverge substantially from p. The K29 formulation avoids any such divergence by always putting the next pn at values of p where no trend is emerging – where, on the contrary, calibration is excellent so far. This is only natural. In general, we avoid choices that have worked out poorly in the past. The interesting point here is that this is sufficient to avoid rejection by a statistical test. Forecaster can make sure he is well calibrated by acting as if the future will be like the past, regardless of what Reality does on each round.
we obtain N X p .yn pn /ˆ.pn / N :
82.3.3 Resolution (82.27)
nD1
Using Equation (82.25), the Cauchy-Schwartz inequality, and then Equation (82.27), we obtain ˇ ˇ N ˇ ˇN ! ˇ ˇ X ˇ ˇX ˇ ˇ ˇ ˇ K.pn ; p/.yn pn /ˇ D ˇ ˆ.pn /.yn pn / ˆ.p/ˇ ˇ ˇ ˇ ˇ ˇ nD1
nD1
N X p ˆ.pn /.yn pn / kˆ.p/k N : nD1
Additional insight about the success of the K29 forecasting algorithm can be gleaned from Equation (82.24): n1 X
K.p; pi /.yi pi / D 0:
i D1
In practice, this equation usually has a unique solution in Œ0; 1 , and so this solution is Forecaster’s choice for pn . Because K.p; pi / is small when p and pi are far apart, the main contribution to the sum comes from terms for which pi is close to p. So the equation is telling us to look for a value of p such that yi pi average to zero for i such that pi is close to p. This is precisely what calibration requires. So we can say that on each round, the algorithm chooses as pn the probability value where calibration is the best so far.
Probability forecasting is usually based on more information than the success of previous forecasts for the various probability values. If rainfall is more common in April than May, for example, then a weather forecaster should take this into account. It should rain on 30% of the April days for which he gives rain a probability of 30% and also on 30% of the May days for which he gives rain a probability 30%. This property is stronger than mere calibration, which requires only that it rain on 30% of all the days for which the forecaster says 30%. It is called resolution. To see that defensive forecasting can achieve resolution as well as mere calibration, we consider a protocol in which Reality provides auxiliary information xn that Skeptic and Forecaster can use in making their moves: P ROTOCOL 12. F ORECASTING WITH AUXILIARY I N FORMATION xn Parameter: natural number N , set X K0 WD ˛. FOR n D 1; : : : ; N : Reality announces xn 2 X. Skeptic announces continuous Sn W Œ0; 1 ! R. Forecaster announces pn 2 Œ0; 1 . Reality announces yn 2 f0; 1g. Kn WD Kn1 C Sn .pn /.yn pn /. In this context, we need a kernel K W .Œ0; 1 X/2 ! Œ0; 1 to measure the similarity of .p; x/ to .p 0 ; x 0 /. We may choose it so that K..p; x/.p 0 ; x 0 // is 1 when .p; x/ D .p 0 ; x 0 /, close to 1 when .p; x/ is near .p 0 ; x 0 /, and close to 0 when .p; x/ is far from .p 0 ; x 0 /. Once we have chosen such a kernel, we may say that the forecasts have good resolution if
82 Predicting Bond Yields Using Defensive Forecasting
1269
PN
nD1 K..pn ; xn /.p; x//yn
PN
nD1 K..pn ; xn /.p; x//
p
for every pair .p; x/ for which the denominator is large. This is a straightforward generalization of calibration, and the entire theory that we have reviewed for calibration generalizes directly. In the generalization, the K29 betting strategy is Sn .p/ D
n1 X
K..p; xn /; .pi ; xi //.yi pi /;
(82.28)
i D1
and K29 forecasting algorithm achieves good resolution by playing against it. The K29 forecasting strategy is again very natural. It chooses pn to satisfy n1 X
K..pn ; xn /; .pi ; xi //.yi pi / D 0
(82.29)
No assumptions are needed concerning the space X, because the information xn that Skeptic uses can take any form. But the spaces F, S, and Y must satisfy certain regularity conditions. Roughly speaking, Forecaster’s move space F must be the convex closure of Reality’s moves space Y, and Skeptic’s move space S must be a linear subspace of Rm . Under these conditions, the arguments in Propositions 3 and 4 generalize (we appeal to a fixed-point theorem rather than to the intermediate-value theorem). Implementation generalizes in a relatively straightforward way. The crucial step is again the choice of a kernel K to measure the similarity of .f; xn /, where f is a candidate for the prediction fn , with an earlier pair .fi ; xi /. Once K is selected, we proceed as in Equations (82.28) and (82.29); the testing strategy for Skeptic is Sn .f / D
82.3.4 Defensive Forecasting of Prices In Vovk et al. (2005b), the theory of defensive forecasting is generalized from the binary case, which we have been reviewing so far, to the case where the outcomes and predictions can be real numbers or even vectors. The starting point is the general protocol for on-line prediction we laid out in Protocol 2, except that move spaces satisfying certain conditions are specified. We may write the protocol in this way, where the move spaces F, S, and Y are subsets of a linear space, say the EuclideanPspace Rm , and hu; vi is the inner product in Rm : hu; vi WD m i D1 ui vi . P ROTOCOL 13. L INEAR P ROTOCOL K0 WD 1. FOR n D 1; 2; : : : ; N : Reality announces xn 2 X. Forecaster announces fn 2 F. Skeptic announces sn 2 S. Reality announces yn 2 Y. Kn WD Kn1 C hsn ; .yn fn /i.
K..f; xn /; .fi ; xi //.yi fi /;
i D1
i D1
In other words, it chooses pn so that we already have good resolution for .pi ; xi / near .pn ; xn /. This is analogous to the practice of varying one’s actions with the situation in accordance with past experience in different situations. If x is the friend you are spending time with and p is your activity, and experience tells you that bowling has been the most enjoyable activity with Tom, bridge with Dick, and tennis with Harry, then this is how you will choose in the future. See Vovk (2005c) for details.
n1 X
and the strategy for Forecaster chooses fn to satisfy n1 X
K..fn ; xn /; .fi ; xi //.yi fi / D 0:
(82.30)
i D1
82.4 Predicting Bond Yields We now illustrate defensive forecasting by using it to predict prices for corporate bonds. In order to provide some context, we also forecast the same prices using a support vector machine (Vapnik 1995; Boser et al 1992; Joachims 1999) and a method based on the Merton model (Merton 1974; Hull 2002). Our results are intended only as an illustration; we have not looked at all the ways defensive forecasting, support vector machines, and Merton’s model can be implemented for this problem. Corporate bonds are traded much less frequently than the shares of the same firms. The TRACE data base, which we use, often records only a few trades of bonds issued by a given firm on a given day. But the data is available to traders in real time. We consider the position of the trader towards the end of the day, and we imagine him or her using information about earlier trades to predict the price of the last bond from that firm to trade that day. His data includes not only the prices at which the firm’s bonds have traded earlier and details on these bonds but also the firm’s accounting data and current data on the firm’s stock price. Following standard practice, we work not with the price of the bond, but with the yield, which carries the same information. Altogether, we predict 1050 closing yields, for bonds from 41 different firms.
1270
G. Shafer and S. Ring
82.4.1 Data We use data from the following databases and time periods (Table 82.1): We also use treasury rates for all maturities from 3 months to 10 years, 2000–2006, and recovery rates provided by Altman and Kishore (1996). The TRACE database covers bond trading for 2,995 firms over 1,136 days, for a total of over 700,000 potential firmdate pairs. But not every bond traded every day, and the data for many trades are incomplete. In order to give all three of our prediction methods a chance to perform well, we consider only firm-date pairs that meet the following conditions: 1. The firm has only one class of debt outstanding. (This assumption is satisfied by 150 of the firms in the TRACE database. It is required by the Merton model.) 2. The firm is a US firm and is not a financial firm or utility. (Financial firms and utilities are considered difficult to model because of their unusual leverage structure.) 3. There is at least 24 months of data for the firm in all four databases. 4. The firm has at least three bonds outstanding. 5. All these outstanding bonds traded at least twice that day. 6. Other needed data (treasury yields, for example) are available for the day. Only 1,050 firm-date pairs satisfy all these conditions. They involve 41 different firms. For each of the 1,050 firm-date pairs, we predict the yield for the last trade of the day for a bond issued by the firm. For the most part, the different bonds of the 41 firms are about equally represented; there are few cases where we predict the price for one issue of a firm many more times than another. We predict the yield at which a bond trades using ten variables: 1. SD(E) – The 24-month historical equity return volatility 2. Equity – The market value of equity on the trade date 3. Amount Outstanding – The principal amount of the debt issue 4. T - The time to maturity of the debt issue 5. Daily Ratio – The percentage of the overall firm debt that the debt issue represents
6. Dividend – The dollar value of the annual dividend payout 7. Coupon – The dollar value of the coupon payment of the debt security 8. Assets – The book value of the assets of the firm 9. Recovery – The dollar value of the recovery of the debt issue calculated using the rate obtained by industry and seniority from Altman and Kishore (1996). 10. r – The risk free rate for time T computed using the Vasicek model. The historical data on which we base our predictions consists of all the trades that TRACE recorded from 2002 to 2006 for the 41 firms. Altogether there are 64,582 of these trades; this includes the 1,050 trades for which we are predicting the yield. Following machine-learning terminology, we call the vector consisting of the values of the 11 variables (the ten predictors and the yield) for a trade an example. In the case of the machine learning methods (the support vector machine and defensive forecasting), each prediction uses the values of the ten predictor variables for the example being predicted and the values of all 11 variables for all previous examples of trades for bonds on that firm. In general, this will include many of the 64,582 trades that are not included among the 1,050 trades for which we are making predictions. In fact, the first of the 1,050 firm-date pairs for which we make a prediction occurs in 2003, after a year of previous data. So all our predictions use a fairly large learning sample – i.e., many previous examples. For the Merton model, we only use the data associated with the 1,050 examples for which we make predictions. The Merton model is a parametric model, and the data serves to help us estimate its five parameters. First, we use the book value of assets as the value of the firm. Second, we use the modified historical equity return standard deviation to estimate asset return volatility. Finally, the maturity, risk free rate, and face value of debt are used directly as parameters. The model produces a zero coupon bond price equivalent prediction, which we convert into a yield prediction. The Merton model is the simplest of a large class of structural models for pricing bonds Duffie and Singleton (2003). We use it here only as a benchmark; the other structural models generally make better predictions.
82.4.2 Implementation
Table 82.1 Databases used Name Data
Period
TRACE
Bond prices
2002–2006
Fixed income securities database
Bond issue information
1976–2003
CompuStat
Accounting information
2000–2006
CRSP
Equity prices
2000–2006
An example is of the form .f; x/, where f is the yield of the bond, and x is vector consisting of values of the ten predictor variables. The ten variables are all numerical; we standardize them by subtracting their mean and dividing by their standard deviation. Because we are using subscripts to distinguish the examples, we use superscripts to distinguish the ten values in the vector x; we write x D .x 1 ; : : : ; x 10 /.
82 Predicting Bond Yields Using Defensive Forecasting
1271
Table 82.2 Results
jEj Merton model Support vector machine Defensive forecasting
Like defensive forecasting, a support vector machine requires a kernel. In both cases, we use kernels built from Gaussians (also called radial basis functions), but we found that the methods did best with different kernels. For the support vector machine, we settled on the kernel K..f; x/; .f 0 ; x 0 // WD e .f
f 0 /
C
10 X
e
.x j x 0 j /
j D1
For defensive forecasting, we settled on K..f; x/; .f 0 ; x 0 // WD e .f f
0/
10 Y
e .x
j x 0 j /
j D1
D e .f f
0C
P10
j D1 .x
j x 0 j /
/:
Defensive forecasting requires us to solve the Equation (82.30). In order to make this solution more feasible numerically, we use a variation on K29, called K29* (see page 17 of Vovk (2005c)). The K29* algorithm replaces the expression for Skeptic with the following equation: n1 X
K..fi ; xi /; .fn ; xn //.yi fi / K..fn ; xn /; .fn ; xn //fn D 0
iD1
We solve this equation using bisection, with bounds of 100 and 100; for details see Press et al. (2007). Since the equation uses past predictions fi to make the prediction fn , a prediction must be made for each of the preceding examples in the learning sample. In order to get the prediction process started at the beginning of the learning sample, we predict the yields of the first ten examples by the mean yield of the ten examples. From that point onward, we use defensive forecasting to make predictions fi sequentially for i D 11; :::; n. THE SVM algorithm implementation that we use is a basic implementation provided by RapidMiner’s Yale package. This algorithm is a direct implementation of the theory in Vapnik’s 1995 book (Vapnik 1995).
82.4.3 Empirical Results In order to compare the predictions made by the three methods, we compute four measures:
Average (%) 1.78 0.89 0.28
jPEj Stand. dev. (%) 2.03 5.17 0.74
Average (%) 24.78 19.30 5.37
Stand. dev. (%) 19.63 122.47 23.31
1. E D y f . This is the error, the difference between the predicted yield f and the actual yield y. 2. jEj. 3. PE D Ey . This is the percentage error. 4. jPEj. As is typical with on-line machine learning, we observed that these measures improved significantly as more examples were included in the learning sample. Table 82.2 shows the average of these four measures of error over the 1,050 predictions for the three different methods. Defensive forecasting did a better job of making predictions than the other two methods in this particular illustration. But we need to explore alternative kernels for the two machine learning methods (support vector machines and defensive forecasting), and we also need to consider structural models that are more powerful than Merton’s model.
82.5 Conclusion The game-theoretic framework is an alternative to the measure-theoretic framework for probability theory. In this new framework, statistical tests are betting strategies. This idea leads to a new method of on-line prediction, defensive forecasting, in which the forecaster plays against a betting strategy that will make money if the forecasts do not have good statistical properties. In problems where there is a lot of information to use in making predictions, defensive forecasting depends strongly on the choice of a kernel to measure the similarity of examples. The method is analogous, in this respect, to other machine learning methods, such as support vector machines. For the prediction of corporate bond yields in real time, using both accounting information and prior prices for the corporation’s shares and bonds, both defensive forecasting and support vector machines are competitive with credit risk models. For the TRACE data considered in this chapter, defensive forecasting performed quite well.
References Altman, E. and V. Kishore. 1996. “Almost everything you wanted to know about recoveries on defaulted bonds.” Financial Analysts Journal 52, 57–64.
1272 Boser, B., I. Guyon, and V. VN. 1992. “A training algorithm for optimal margin classifiers,” in 5th Annual ACM workshop on COLT, ACM Press, Pittsburgh, PA, pp. 144–152. Cesa-Bianchi, N. and G. Lugosi. 2006. Prediction, learning, and games, Cambridge University Press, Cambridge. Dawid, A. P. 1984. “Statistical theory: The prequential approach (with discussion).” Journal of the Royal Statistical Society Series A 147:278–292. Dawid, A. P. 1986. “Probability forecasting,” in Encyclopedia of Statistical Sciences, S. Kotz, N. L. Johnson, and C. B. Read (Eds.), Vol. 7, Wiley, New York, pp. 210–218. Doob, J. L. 1953. Stochastic processes. Wiley, New York. Duffie, D. and K. J. Singleton (2003) Credit risk: pricing, measurement, and management, Princeton University Press, Princeton, NJ. Foster, D. P. and R. V. Vohra (1998) “Asymptotic calibration.” Biometrika 85, 379–390. Gács P. 2005. “Uniform test of algorithmic randomness over a general space.” Theoretical Computer Science 341, 91–137. Hull, J. C. 2002. Options, futures, and other derivatives, 5th Edition, Prentice Hall, Hoboken, NJ. Joachims, T. 1999. Making large-scale SVM learning practical. MIT Press, Cambridge, MA. Kakade, S. and D. Foster. 2004. “Deterministic calibration and Nash equilibrium,” in Proceedings of the seventeenth annual conference on learning theory, Lecture Notes in Computer Science Vol. 3120, J. Shawe-Taylor and Y. Singer (Eds.). Springer, Heidelberg, pp. 33–48. Kolmogorov, A. N. 1929. “Sur la loi des grands nombres. Atti della Reale Accademia Nazionale dei Lincei, Serie VI.” Rendiconti 9, 470–474. Kolmogorov, A. N. 1930. “Sur la loi forte des grands nombres.” Comptes rendus hebdomadaires des séances de l’Académie des Sciences 191, 910–912. Kumon, M. and A. Takemura. 2008. “On a simple strategy weakly forcing the strong law of large numbers in the bounded forecasting game.” Annals of the Institute of Statistical Mathematics 60, 801–812. Levin, L. A. 1976. “Uniform tests of randomness. Soviet Mathematics Doklady 17, 337–340, the Russian original in: Doklady AN SSSR 227(1), 1976. Levin, L. A. 1984. “Randomness conservation inequalities; information and independence in mathematical theories.” Information and Control 61, 15–37. Mercer, J. 1909. “Functions of positive and negative type, and their connection with the theory of integral equations.” Philosophical Transactions of the Royal Society of London Series A, Containing Papers of a Mathematical or Physical Character 209, 415–446. Merton, R. C. 1974. “On the pricing of corporate debt: the risk structure of interest rates.” Journal of Finance 28, 449–470. Press, W., S. Teukolsky, W. Vetterling, and B. Flannery. 2007. Numerical recipes: the art of scientific computing, 3rd edn. Cambridge University Press, Cambridge. Robbins, H. 1970. “Statistical methods related to the law of the iterated logarithm.” Annals of Mathematical Statistics 41, 1397–1409. Sandroni, A., R. Smorodinsky, and R. Vohra. 2003. “Calibration with many checking rules.” Mathematics of Operations Research 28, 141–153. Schölkopf, B. and A. J. Smola. 2002 Learning with kernels. MIT Press, Cambridge, MA. Shafer, G. and V. Vovk (2001) Probability and finance: it’s only a game. Wiley, New York.
G. Shafer and S. Ring Vapnik, V. 1995. The nature of statistical learning theory. Springer, New York. Ville, J. A. 1939. Étude critique de la notion de collectif. GauthierVillars, Paris, this differs from Ville’s dissertation, which was defended in March 1939, only in that a 17-page introductory chapter replaces the dissertation’s one-page introduction. For a translation into English of the passages where Ville spells out an example of a sequence that satisfies von Mises’s conditions but can still be beat by a gambler who varies the amount he bets, see http://www.probabilityandfinance.com/misc/ville1939.pdf. Ville, J. A. 1955. Notice sur les travaux scientifiques de M. Jean Ville. Prepared by Ville when he was a candidate for a position at the University of Paris and archived by the Institut Henri Poincaré in Paris. Vovk, V. 2005a. Competing with wild prediction rules, Working Paper 16, December 2005, http://www.probabilityandfinance.com, revised January 2006. Vovk, V. 2005b. Competitive on-line learning with a convex loss function, Working Paper 14, May 2005, http://www.probabilityandfinance.com, revised September 2005. Vovk, V. 2005c. On-line regression competitive with reproducing kernel Hilbert spaces, Working Paper 11, November 2005, http://www.probabilityandfinance.com, revised January 2006. Vovk, V. 2006. Predictions as statements and decisions, Working Paper 17, http://www.probabilityandfinance.com. Vovk, V. 2007a. Continuous and randomized defensive forecasting: unified view, Working Paper 21, http://www.probabilityandfinance.com. Vovk, V. 2007b. Defensive forecasting for optimal prediction with expert advice, Working Paper 20, http://www.probabilityandfinance.com. Vovk, V. 2007c. Leading strategies in competitive on-line prediction, Working Paper 18, http://www.probabilityandfinance.com. Vovk, V. 2004. Non-asymptotic calibration and resolution, Working Paper 13, November 2004, http://www.probabilityandfinance.com, revised July 2006. Vovk, V. and G. Shafer. 2008 “The game-theoretic capital asset pricing model.” International Journal of Approximate Reasoning 49, 175–197, see also Working Paper 1 at http://www.probabilityandfinance.com. Vovk, p V. and G. Shafer. 2003. A game-theoretic explanation of the dt effect, Working Paper 5, January, http://www.probabilityandfinance.com. Vovk, V., A. Gammerman, and G. Shafer. 2005a. Algorithmic Learning in a Random World. Springer, New York. Vovk, V., I. Nouretdinov, A. Takemura, and G. Shafer 2005b. Defensive forecasting for linear protocols, Working Paper 10, http://www.probabilityandfinance.com. An abridged version appeared in Algorithmic Learning Theory: Proceedings of the 16th International Conference, ALT 2005, Singapore, October 8–11, 2005, edited by Sanjay Jain, Simon Ulrich Hans, and Etsuji Tomita, http://www-alg.ist.hokudai.ac.jp/thomas/ALT05/alt05.jhtml on pp. 459–473 of Lecture Notes in Computer Science, Vol. 3734, Springer, New York, 2005. Vovk, V., A. Takemura, and G. Shafer. 2004. Defensive forecasting, Working Paper 8, September 2004, http://www.probabilityandfinance.com. An abridged version appeared in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, www.gatsby.ucl.ac.uk/aistats/, revised January 2005. Wu, W. and G. Shafer. 2007. Testing lead-lag effects under game-theoretic efficient market hypotheses, Working Paper 23, http://www.probabilityandfinance.com.
Chapter 83
Range Volatility Models and Their Applications in Finance Ray Yeutien Chou, Hengchih Chou, and Nathan Liu
Abstract There has been a rapid growth of range volatility due to the demand of empirical finance. This paper contains a review of the important development of range volatility, including various range estimators and range-based volatility models. In addition, other alternative models developed recently, such as range-based multivariate volatility models and realized ranges, are also considered here. Finally, this paper provides some relevant financial applications for range volatility. Keywords CARR r DCC r GARCH r High/low range r Realized volatility r Multivariate volatility r Volatility
83.1 Introduction With the continual development of new financial instruments, there is a growing demand for theoretical and empirical knowledge of financial volatility. It is well known that financial volatility has played such a central role in derivative pricing, asset allocation, and risk management. Following Barndorff-Nielsen and Shephard (2003) or Andersene et al. (2003), financial volatility is a latent factor and hence is not directly observable, however. Financial volatility can only be estimated using its signature on certain known market price process; when the underlying process is more sophisticated
R.Y. Chou () Institute of Economics, Academia Sinica, #128, Yen-Jio-Yuan Road, Sec 2, Nankang, Taipei, Taiwan, ROC e-mail: [email protected] and Institute of Business Management, National Chiao Tung University, Taipei, Taiwan, ROC H. Chou Department of Shipping & Transportation Management, National Taiwan Ocean University, Keelung, Taiwan, ROC e-mail: [email protected] N. Liu Department of Finance, Feng Chia University, Taichung, Taiwan, ROC e-mail: [email protected]
or when observed market prices suffer from market microstructure noise effects, the results are less clear. It is well known that many financial time series exhibit volatility clustering or autocorrelation. In incorporating the characteristics into the dynamic process, the generalized autoregressive conditional heteroskedasticity (GARCH) family of models proposed by Engle (1982) and Bollerslev (1986), and the stochastic volatility (SV) models advocated by Taylor (1986) are two popular and useful alternatives for estimating and modeling time-varying conditional financial volatility. However, as pointed by Alizadeh et al. (2002), Brandt and Diebold (2006), Chou (2005), and other authors, both GARCH and SV models are inaccurate and inefficient, because they are based on the closing prices of the reference period, and thus failing to use the information contents inside the reference. In other words, the path of the price inside the reference period is totally ignored when volatility is estimated by these models. Especially in turbulent days with the declines and recoveries of the markets, the traditional closeto-close volatility indicates a low level while the daily price range shows correctly that the volatility is high. The price range, defined as the difference between the highest and lowest market prices over a fixed sampling interval, has been known for a long time and recently experienced renewed interest as an estimator of the latent volatility. The information contained in the opening, highest, lowest, and closing prices of an asset is widely used in Japanese candlestick charting techniques and other technical indicators (Nison (1991). Early application of range in the field of finance can be traced to Mandelbrot (1971), and the academic work on the range-based volatility estimator began in the early 1980s. Several authors, dating back to Parkinson (1980), developed from it several volatility measures far more efficient than the classical return-based volatility estimators. Building on the earlier results of Parkinson (1980), many studies1 show that one can use the price range information 1 See Garman and Klass (1980), Beckers (1983), Ball and Torous (1984), Wiggins (1991), Rogers and Satchell (1991), Kunitomo (1992), Yang and Zhang (2000), Alizadeh et al. (2002), Brandt and Diebold (2006), Brandt and Jones (2006), Chou (2005, 2006), Cheung (2007),
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_83,
1273
1274
to improve volatility estimation. In addition to being significantly more efficient than the squared daily return, Alizadeh et al. (2002) also demonstrate that the conditional distribution of the log range is approximately Gaussian, thus greatly facilitating maximum likelihood estimation of stochastic volatility models. Moreover, as pointed by Alizadeh et al. (2002), and Brandt and Diebold (2006), the range-based volatility estimator appears robust to microstructure noise such as bid-ask bounce. By adding microstructure noise to the Monte Carlo simulation, Shu and Zhang (2006) also support that the findings of Alizadeh et al. (2002), that range estimators are fairly robust toward microstructure effects. Cox and Rubinstein (1985) stated the puzzle that despite the elegant theory and the support of simulation results, the range-based volatility estimator has performed poorly in empirical studies. Chou (2005) argued that the failure of all the range-based models in the literature is caused by their ignorance of the temporal movements of price range. Using a proper dynamic structure for the conditional expectation of range, the conditional autoregressive range (CARR) model, proposed by Chou (2005), successfully resolves this puzzle and retains its superiority in empirical forecasting abilities. The in-sample and out-of-sample volatility forecasting using S&P 500 Index data shows that the CARR model does provide more accurate volatility estimator compared with the GARCH model. Similarly, Brandt and Jones (2006) formulate a model that is analogous to Nelson’s (1991) EGARCH model, but uses the square root of the intra-day price range in place of the absolute return. Both studies find that the range-based volatility estimators offer a significant improvement over their return-based counterparts. Moreover, Chou et al. (2007) extend CARR to a multivariate context using the dynamic conditional correlation (DCC) model proposed by Engle (2002a). They find that this range-based DCC model performs better than other return-based volatility models in forecasting covariances. This paper will also review alternative range-based multivariate volatility models. Recently, many studies use high frequency data to get an unbiased and highly efficient estimator for measuring volatility, see Andersene et al. (2003) and McAleer and Medeiros (2008) for a review. The volatility built by nonparametric methods is called realized volatility, which is calculated by the sum of nonoverlapping squared returns within a fixed time interval. Martens and van Dijk (2007) replace the squared return by the price range to get a more efficient estimator; namely, the realized range. In their empirical study, the realized range significantly improves over realized return
Martens and van Dijk (2007), Chou and Wang (2007), Chou et al. (2007), and Chou and Liu (2008a,b).
R.Y. Chou et al.
volatility. In addition, Christensen and Podolskij (2007) independently develop the realized range and show that this estimator is consistent and relatively efficient under some specific assumptions. The remainder of the chapter is laid out as follows. Section 83.2 introduces the price range estimators. Section 83.3 describes the range-based volatility models, including univariate and multivariate ones. Section 83.4 presents the realized range. The financial applications of range volatility are provided in Sect. 83.5. Finally, the conclusion is showed in Sect. 83.6.
83.2 The Price Range Estimators A few price range estimators and their estimation efficiency are briefly introduced and discussed in this section. A significant practical advantage of the price range is that for many assets, daily opening, highest, lowest, and closing prices are readily available. Most data suppliers provide daily highest/lowest as summaries of intra-day activity. For example, Datastream records the intraday price range for most securities, including equities, currencies, and commodities, going back to 1955. Thus, range-based volatility proxies are therefore easily calculated. When using this record, the additional information yields a great improvement when used in financial applications. Roughly speaking, knowing these records allows us to get closer to the real underlying process, even if we do not know the whole path of asset prices. For an asset, let’s define the following variables: Ot D the opening price of the tth trading day, Ct D the closing price of the tth trading day, Ht D the highest price of the tth trading day, Lt D the lowest price of the tth trading day. The Parkinson 1980) estimator efficiency intuitively comes from the fact that the price range of intraday gives more information regarding the future volatility than two arbitrary points in this series (the closing prices). Assuming that the asset price follows a simple diffusion model without a drift term, his estimator O P2 can be written: O P2 D
1 .ln Ht ln Lt /2 : 4 ln 2
(83.1)
But instead of using two data points, the highest and lowest prices, four data points, the opening, closing, highest, and lowest prices, might also give extra information. Garman and Klass (1980) propose several volatility estimators based on the knowledge of the opening, closing, highest, and
83 Range Volatility Models and Their Applications in Finance
lowest prices. Like Parkinson (1980), they assume the same 2 as: diffusion process and propose their estimator O GS n 2 D 0:511Œln.Ht =Lt / 2 0:19 ln.Ct =Ot / O GK
(83.2) As mentioned in Garman and Klass (1980), their estimator 2 2 can be presented practically as O GK 0 D 0:5Œln.Ht =Lt / 2 Œ2 ln 2 1 Œln.Ct =Ot / . Since the price path cannot be monitored when markets are closed, however, Wiggins (1991) finds that the both Parkinson estimator and Garman–Klass estimator are still biased downward compared to the traditional estimator, because the observed highs and lows are smaller than the actual highs and lows. Garman and Klass (1980) and Grammatikos and Saunders (1986), nevertheless, estimate the potential bias using simulation analysis and show that the bias decreases with an increasing number of transaction. Therefore, it is relatively easy to adjust the estimates of daily variances to eliminate the source of bias. Because Parkinson (1980) and Garman and Klass (1980) estimators implicitly assume that log-price follows a geometric Brownian motion with no drift term, further refinements are given by Rogers and Satchell (1991) and Kunitomo (1992). Rogers and Satchell (1991) add a drift term in the stochastic process that can be incorporated into a volatility estimator using only daily opening, highest, lowest, and clos2 can be written: ing prices. Their estimator O RS t 1 X D ln .Hn =On / Œln .Hn =On / ln .Cn =On / : N nDt N
C ln .Ln =On / Œln .Ln =On / ln .Cn =On /
(83.3)
Rogers et al. (1994) report that the Rogers–Satchell estimator yields theoretical efficiency gains compared to the Garman– Klass estimator. They also report that the Rogers–Satchell estimator appears to perform well with changing drift and as few as 30 daily observations. Kunitomo (1992) uses the opening and closing prices to estimate a modified range corresponding to a hypothesis of a Brownian bridge of the transformed log-price. Basically this also tries to correct the highest and lowest prices for the drift term: O K2
1 D ˇN
t h
i X On ; ln HO n = L pDt N
where two estimators HO n D ArgŒM ax fPti ŒOn C.Cn On /= ti
Pti
ti C .Cn On / jti 2 Œn 1; n g and LO n D ArgŒMinfPti ti
Œln.Ht / C ln.Lt / 2 ln.Ot / o 2Œln.Ht =Ot / ln.Lt =Ot / 0:383Œln.Ct =Ot / 2
2 O RS
1275
(83.4)
Pti
ŒOn C .Cn On /=ti C .Cn On / jti 2 Œn 1; n g are denoted as the end-of-the-day drift correction highest and lowest prices. ˇN D 6=.N 2 / is a correction parameter. Finally, Yang and Zhang (2000) make further refinements by deriving a price range estimator that is unbiased, independent of any drift, and consistent in the presence of opening 2 thus can be written price jumps. Their estimator O YZ 2 O YZ D
t h i X 1 ln.On =Cn1 /ln.On =Cn1 / .N 1/ nDtN
C
t h i X k ln.On =Cn1 / ln.On =Cn1 / .N 1/ nDtN
2 ; C .1 k/O RS
(83.5)
where k D 1:34C.N0:34 . The symbol XN is the uncondiC1/= .N 1/ 2 tional mean of X , and RS is the Rogers–Satchell estimator. The Yang–Zhang estimator is simply the sum of the estimated overnight variance, the estimated opening market variance, and the Rogers and Satchell (1991) drift independent estimator. The resulting estimator therefore explicitly incorporates a term for the closed market variance. Shu and Zhang (2006) investigate the relative performance of the four range-based volatility estimators, including Parkinson, Garman–Klass, Rogers–Satchell, and Yang– Zhang estimators for S&P 500 Index data, and find that the price range estimators all perform very well when an asset price follows a continuous geometric Brownian motion. However, significant differences among various range estimators are detected if the asset return distribution involves an opening jump or a large drift. In term of efficiency, all previous estimators exhibit very substantial improvements. Defining the efficiency measure of a volatility estimator O i2 as the estimation variance compared with the close-close estimator O 2 ; that is: Eff .O i2 / D
Var.O 2 / : Var.O i2 /
(83.6)
Parkinson (1980) reports a theoretical relative efficiency gain ranging from 2.5 to 5, which means that the estimation variance is 2.5 to 5 times lower. Garman and Klass (1980) report that their estimator has an efficiency of 7.4; while the Yang and Zhang (2000) and Kunitomo (1992) variance estimators result in a theoretical efficiency gain of, 7.3 and 10, respectively.
1276
R.Y. Chou et al.
83.3 The Range-Based Volatility Models
83.3.2 The MA Model
This section provides a brief overview of the models used to forecast range-based volatility. In what follows, the models are presented in increasing order of complexity. For an asset, the range of the log-prices is defined as the difference between the daily highest and lowest prices in a logarithm type. It can be denoted by:
MA methods are widely used in time series forecasting. In most cases, a moving average of length N where N D 20, 60, 120 days is used to generate log range forecasts. Choosing these lengths is fairly standard because these values of N correspond to 1 month, 3 months, and 6 months of trading days, respectively. The expression for the N day moving average is shown below:
Rt D ln.Ht / ln.Lt /:
(83.7)
According to Christoffersen (2002), for the S&P 500 data the autocorrelations of the range-based volatility, Rt , show more persistence than the squared-return autocorrelations. Thus, range-based volatility estimator of course could be used instead of the squared return for evaluating the forecasts from volatility models, and with the time series of Rt , one can easily constructs a volatility model under the traditional autoregressive framework. Instead of using the data of range, nevertheless, Alizadeh et al. (2002) focus on the variable of the log range, ln.Rt /, since they find that in many applied situations, the log range approximately follows a normal distribution. Therefore, all the models introduced in the section except for Chou’s CARR model are estimated and forecasted using the log range. The following range-based volatility models are first introduced in some simple specifications, including random walk, moving average (MA), exponentially weighting moving average (EWMA), and autoregressive (AR) models. Hanke and Wichern (2005) think these models are fairly basic techniques in the applied forecasting literature. Additionally, we also provide some models at a much higher degree of complexity, such as the stochastic volatility (SV), CARR, and range-based multivariate volatility models.
N 1 1 X ln.Rt j /: N j D0
(83.9)
83.3.3 The EWMA Model EWMA models are also very widely used in applied forecasting. In EWMA models, the current forecast of log range is calculated as the weighted average of the one period past value of log range and the one period past forecast of log range. This specification is appropriate provided the underlying log range series has no trend. EŒln.Rt C1 /jIt D EŒln.Rt /jIt 1 C .1 / ln.Rt /: (83.10) The smoothing parameter, , lies between zero and unity. If is zero then the EWMA model is the same as a random walk. If is one then the EWMA model places all of the weight on the past forecast. In the estimation process the optimal value of was chosen based on the root mean squared error criteria. The optimal is the one that records the lowest MSE.
83.3.4 The AR Model This model uses an autoregressive process to model log range. There are n lagged values of past log range to be used as drivers to make a one period ahead forecast.
83.3.1 The Random Walk Model The log range ln.Rt / can be viewed as a random walk. It means that the best forecast of the next period’s log range is this period’s estimate of log range. As in most papers, the random walk model is used as the benchmark for the purpose of comparison. EŒln.Rt C1 /jIt D ln.Rt /;
EŒln.Rt C1 /jIt D
EŒln.Rt C1 / D ˇ0 C ˇi
n X
ln.Rt C1i /:
(83.11)
i D1
83.3.5 The Discrete-Time Range-Based SV Model (83.8)
where It is the information set at time t. The estimator EŒln.Rt C1 /jIt is obtained conditional on It .
Alizadeh et al. (2002) present a formal derivation of the discrete time SV model from the continuous time SV model.
83 Range Volatility Models and Their Applications in Finance
1277
The conditional distribution of the log range is approximately Gaussian:

\ln R_{t+1} \mid \ln R_t \sim N[\ln \bar{R} + \rho(\ln R_t - \ln \bar{R}),\ \beta^2 \Delta t],    (83.12)

where \Delta t = T/N, T is the sample period, and N is the number of intervals. The parameter \beta models the volatility of the latent volatility. Following Harvey et al. (1994), a linear state space system consisting of a state equation and a signal equation can be written:

\ln R_{(i+1)\Delta t} = \ln \bar{R} + \rho_{\Delta t}(\ln R_{i\Delta t} - \ln \bar{R}) + \beta\sqrt{\Delta t}\,\nu_{(i+1)\Delta t},    (83.13)

\ln|f(s_{i\Delta t,(i+1)\Delta t})| = \ln R_{i\Delta t} + E[\ln|f(s_{i\Delta t,(i+1)\Delta t})|] + \varepsilon_{(i+1)\Delta t}.    (83.14)

Equation (83.13) is the state equation and Equation (83.14) is the signal equation. In Equation (83.14), E is the mathematical expectation operator. The state equation errors \nu are i.i.d. N(0, 1), and the signal equation errors \varepsilon have zero mean. A two-factor model can be represented by the following state equations:

\ln R_{(i+1)\Delta t} = \ln \bar{R} + \ln R_{1,(i+1)\Delta t} + \ln R_{2,(i+1)\Delta t},
\ln R_{1,(i+1)\Delta t} = \rho_{1,\Delta t}\,\ln R_{1,i\Delta t} + \beta_1\sqrt{\Delta t}\,\nu_{1,(i+1)\Delta t},
\ln R_{2,(i+1)\Delta t} = \rho_{2,\Delta t}\,\ln R_{2,i\Delta t} + \beta_2\sqrt{\Delta t}\,\nu_{2,(i+1)\Delta t}.    (83.15)

The error terms \nu_1 and \nu_2 are contemporaneously and serially independent N(0, 1) random variables. Alizadeh et al. (2002) estimate and compare both one-factor and two-factor latent volatility models for currency futures prices and find that the two-factor model shows more desirable regression diagnostics.

83.3.6 The Range-Based EGARCH Model

Brandt and Jones (2006) incorporate the range information into the EGARCH model, naming the result the range-based EGARCH model. The model significantly improves both in-sample and out-of-sample volatility forecasts. The daily log range and daily log return are distributed as follows:

\ln(R_t) \mid I_{t-1} \sim N(0.43 + \ln h_t,\ 0.29^2),  \quad  r_t \mid I_{t-1} \sim N(0,\ h_t^2),    (83.16)

where h_t is the conditional volatility of the daily log return r_t. The range-based EGARCH for the daily volatility can then be expressed as:

\ln h_t - \ln h_{t-1} = \kappa(\theta - \ln h_{t-1}) + \phi X^{R}_{t-1} + \delta\, r_{t-1}/h_{t-1},    (83.17)

where \theta denotes the long-run mean of the volatility process and \kappa denotes the speed of mean reversion. The coefficient \delta determines the asymmetric effect of lagged returns. The innovation

X^{R}_{t-1} = \frac{\ln(R_{t-1}) - 0.43 - \ln h_{t-1}}{0.29}    (83.18)

is defined as the standardized deviation of the log range from its expected value, so \phi measures the sensitivity to the lagged log range. In short, the range-based EGARCH model replaces the innovation term of the modified EGARCH with the standardized log range.

83.3.7 The CARR Model

This section provides a brief overview of the CARR model used to forecast range-based volatility. The CARR model is a special case of the multiplicative error model (MEM) of Engle (2002b). Instead of modeling the log range, Chou (2005) models the process of the price range directly. With the time series of the price range R_t, Chou (2005) presents the CARR model of order (p, q), or CARR(p, q):

R_t = \lambda_t \varepsilon_t,  \quad  \varepsilon_t \sim f(\cdot),
\lambda_t = \omega + \sum_{i=1}^{p} \alpha_i R_{t-i} + \sum_{j=1}^{q} \beta_j \lambda_{t-j},    (83.19)
where \lambda_t is the conditional mean of the range based on all information up to time t, and the distribution of the disturbance term \varepsilon_t, or the normalized range, is assumed to have a density function f(\cdot) with unit mean. Since \varepsilon_t is positively valued, given that both the price range R_t and its expected value \lambda_t are positively valued, a natural choice for the distribution is the exponential distribution. The equation for the conditional expectation of the range can easily be extended to incorporate other explanatory variables, such as trading volume, time to maturity, and lagged returns:

\lambda_t = \omega + \sum_{i=1}^{p} \alpha_i R_{t-i} + \sum_{j=1}^{q} \beta_j \lambda_{t-j} + \sum_{k=1}^{L} l_k X_k.    (83.20)
This model is called the CARR model with exogenous variables, or the CARRX model.
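As a concrete illustration of the CARR(1,1) case of Equation (83.19), the following sketch estimates (\omega, \alpha, \beta) by quasi-maximum likelihood under the exponential distribution for the normalized range, the natural choice mentioned above. The simulated range series and the starting values are illustrative assumptions, not results from the chapter.

import numpy as np
from scipy.optimize import minimize

def carr_neg_loglik(params, R):
    """Negative exponential quasi-likelihood of a CARR(1,1), Eq. (83.19)."""
    omega, alpha, beta = params
    lam = np.empty_like(R)
    lam[0] = R.mean()                        # initialize at the unconditional mean
    for t in range(1, len(R)):
        lam[t] = omega + alpha * R[t - 1] + beta * lam[t - 1]
    if np.any(lam <= 0):
        return np.inf
    # R_t = lam_t * eps_t with eps_t ~ Exp(1)  =>  density (1/lam) exp(-R/lam)
    return np.sum(np.log(lam) + R / lam)

# hypothetical daily price ranges (positive by construction)
rng = np.random.default_rng(1)
lam_true, R = 1.0, []
for _ in range(1000):
    R.append(lam_true * rng.exponential(1.0))
    lam_true = 0.1 + 0.2 * R[-1] + 0.7 * lam_true
R = np.array(R)

res = minimize(carr_neg_loglik, x0=[0.1, 0.2, 0.6], args=(R,),
               method="Nelder-Mead")
print(res.x)                                 # estimates of (omega, alpha, beta)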
The CARR model is essentially a symmetric model. In order to describe the leverage effect of financial time series, Chou (2006) divides the whole price range into two single-side price ranges, the upward range and the downward range. He defines the upward range UPR_t as the difference between the daily high and the opening price, and the downward range DNR_t as the difference between the opening price and the daily low, at time t. In other words,

UPR_t = \ln(H_t) - \ln(O_t),    (83.21)

DNR_t = \ln(O_t) - \ln(L_t).    (83.22)

Similarly, with the time series of a single-side price range, UPR_t or DNR_t, Chou (2006) extends the CARR model to the asymmetric CARR (ACARR) model. In volatility forecasting, the asymmetric model also performs better than the symmetric model.

83.3.8 The Range-Based DCC Model

Multivariate volatility models have been extensively researched in recent studies. They provide relevant financial applications in various areas, such as asset allocation, hedging, and risk management; Bauwens et al. (2006) offer a review of the multivariate volatility models. As to extensions of the univariate range models, Fernandes et al. (2005) propose a multivariate CARR model using the formula Cov(X, Y) = [V(X + Y) - V(X) - V(Y)]/2. Analogous to Fernandes et al.'s (2005) work, Brandt and Diebold (2006) use no-arbitrage conditions to build the covariances in terms of variances. However, this kind of method essentially applies only to the bivariate case. Chou et al. (2007) combine the CARR model with the DCC model of Engle (2002a) to propose a range-based volatility model, which uses ranges to replace the GARCH volatilities in the first step of the DCC. They conclude that the range-based DCC model performs better than other return-based models (MA100, EWMA, CCC, return-based DCC, and diagonal BEKK) on the statistical measures RMSE and MAE, based on four benchmarks of implied and realized covariance.

The DCC model is a two-step forecasting model that estimates univariate GARCH models for each asset and then calculates the time-varying correlations using the transformed standardized residuals from the first step. Related discussions of the DCC model can be found in Engle and Sheppard (2001), Engle (2002a), and Cappiello et al. (2006). It can be viewed as a generalization of the constant conditional correlation (CCC) model proposed by Bollerslev (1990). The conditional covariance matrix H_t of a k \times 1 return vector r_t in the CCC model (r_t \mid I_{t-1} \sim N(0, H_t)) can be expressed as

H_t = D_t R D_t,    (83.23)

where D_t is a k \times k diagonal matrix with the time-varying standard deviations \sqrt{h_{i,t}} of the ith return series from GARCH on the ith diagonal, and R is the sample correlation matrix of r_t. The DCC is formulated as the following specification:

H_t = D_t R_t D_t,

R_t = \mathrm{diag}\{Q_t\}^{-1/2}\, Q_t\, \mathrm{diag}\{Q_t\}^{-1/2},

Q_t = S \circ (\iota\iota' - A - B) + A \circ Z_{t-1} Z_{t-1}' + B \circ Q_{t-1},  \quad  Z_t = D_t^{-1} r_t,    (83.24)

where \iota is a vector of ones and \circ is the Hadamard product of two identically sized matrices, computed element by element. Q_t and S are the conditional and unconditional covariance matrices, respectively, of the standardized residual vector Z_t from GARCH. A and B are parameter matrices to be estimated; in most cases, however, they are set as scalars. In a word, DCC differs from CCC only by allowing R to be time varying.
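A minimal sketch of the correlation step in Equation (83.24) with scalar parameters a and b follows. The standardized residuals Z would come from the first-step univariate GARCH fits, or from CARR fits in the range-based DCC of Chou et al. (2007); here they are simulated purely for illustration.

import numpy as np

def dcc_correlations(Z, a, b):
    """Scalar DCC recursion, Eq. (83.24): Q_t and the implied R_t."""
    T, k = Z.shape
    S = np.cov(Z, rowvar=False)              # unconditional covariance of Z
    Q = S.copy()
    R_path = np.empty((T, k, k))
    for t in range(T):
        d = 1.0 / np.sqrt(np.diag(Q))
        R_path[t] = Q * np.outer(d, d)       # R_t = diag{Q}^(-1/2) Q diag{Q}^(-1/2)
        Q = S * (1 - a - b) + a * np.outer(Z[t], Z[t]) + b * Q
    return R_path

# hypothetical standardized residuals from the first step
rng = np.random.default_rng(2)
Z = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=500)
R_path = dcc_correlations(Z, a=0.05, b=0.90)
print(R_path[-1, 0, 1])                      # latest conditional correlation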
83.4 The Realized Range Volatility

Volatility measurement has been widely investigated since the use of high-frequency data became common. In particular, the realized volatility, calculated as the sum of squared intraday returns, provides a more efficient estimate of volatility. Reviews of realized volatility are given in Andersen et al. (2001, 2003), Barndorff-Nielsen and Shephard (2003), Andersen et al. (2006a,b), and McAleer and Medeiros (2008). Martens and van Dijk (2007) and Christensen and Podolskij (2007) replace the squared intraday return with the high-low range to obtain a new estimator called the realized range. Initially, we assume that the asset price P_t follows a geometric Brownian motion:

dP_t = \mu P_t\, dt + \sigma P_t\, dz_t,    (83.25)

where \mu is the drift term, \sigma is the constant volatility, and z_t is a Brownian motion. A trading day is divided into n equal-length intervals. The daily realized volatility RV_t on day t can then be expressed as:

RV_t = \sum_{i=1}^{n} (\ln P_{t,i} - \ln P_{t,i-1})^2,    (83.26)

where P_{t,i} is the price at the end of the ith interval on trading day t and \Delta is the length of each interval, so that n\Delta is the trading time length in a trading day. Moreover, the realized range RR_t is:
RR_t = \frac{1}{4 \ln 2} \sum_{i=1}^{n} (\ln H_{t,i} - \ln L_{t,i})^2,    (83.27)
where H_{t,i} and L_{t,i} are the highest price and the lowest price in the ith interval of the tth trading day, respectively. As mentioned before, several studies, such as Garman and Klass (1980), suggest improving efficiency by also using the open and close prices. Furthermore, assuming that P_t follows a continuous sample path martingale,

\ln P_t = \ln P_0 + \int_0^t \mu_s\, ds + \int_0^t \sigma_s\, dz_s,  \quad  \text{for } 0 \le t < \infty,    (83.28)

Christensen and Podolskij (2007) show that this range-based estimator remains a consistent estimator of the integrated volatility in the presence of stochastic volatility.
An obvious and important concern is that the realized range may be seriously affected by microstructure noise. Martens and van Dijk (2007) consider a bias-adjustment procedure, which scales the realized range by the ratio of the average level of the daily range to the average level of the realized range. They find that the scaled realized range is more efficient than the (scaled) realized volatility.
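The following sketch computes the realized volatility of Equation (83.26), the realized range of Equation (83.27), and a Martens-van Dijk style scaling from a matrix of intraday prices. The data layout (days by intraday observations) and the use of consecutive observation pairs as "intervals" are simplifying assumptions for illustration only.

import numpy as np

def realized_measures(prices):
    """prices: (days, points-per-day) array of intraday prices."""
    logp = np.log(prices)
    # realized volatility, Eq. (83.26): sum of squared intraday returns
    rv = np.sum(np.diff(logp, axis=1) ** 2, axis=1)
    # realized range, Eq. (83.27); here each consecutive pair of observations
    # plays the role of one interval, so its high/low are just the endpoints
    hi = np.maximum(logp[:, 1:], logp[:, :-1])
    lo = np.minimum(logp[:, 1:], logp[:, :-1])
    rr = np.sum((hi - lo) ** 2, axis=1) / (4.0 * np.log(2.0))
    # Martens-van Dijk style bias adjustment: scale by the ratio of the
    # average daily-range estimator to the average realized range
    daily_range2 = (logp.max(axis=1) - logp.min(axis=1)) ** 2 / (4.0 * np.log(2.0))
    rr_scaled = rr * daily_range2.mean() / rr.mean()
    return rv, rr, rr_scaled

rng = np.random.default_rng(3)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.001, (250, 79)), axis=1))
rv, rr, rr_scaled = realized_measures(prices)
print(rv.mean(), rr.mean(), rr_scaled.mean())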
83.5 The Financial Applications and Limitations of the Range Volatility

The range discussed in this paper is a measure of volatility. From a theoretical point of view, it indeed provides a more efficient estimator of volatility than the return does, which is intuitively reasonable given the additional information contained in the range data. In addition, close-to-close return volatility neglects intraday price fluctuations: two trading days with nearly identical closing prices may nevertheless exhibit very different price movement within the day. We can therefore conclude that the high-low volatility should contain additional information relative to the close-to-close volatility. Moreover, data on the range are readily available at low cost. Hence, most research related to volatility can be applied to the range. Poon and Granger (2003) provide extensive discussions of the applications of volatility in financial markets. The range estimator undoubtedly has some inherent shortcomings. It is well known that financial asset prices are very volatile and easily influenced by instantaneous information, and in statistics the range is very sensitive to outliers. Chou (2005) provides an answer by using the quantile range. For example, a new range estimator can be calculated as the difference between the averages of the top and the bottom 5% of observations.
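One way to implement the quantile range just described is sketched below: average the top and bottom 5% of intraday log prices and take the difference, which blunts the effect of a single outlier trade. The simulated price path and the outlier are illustrative assumptions.

import numpy as np

def quantile_range(intraday_prices, q=0.05):
    """Average of the top q fraction of log prices minus average of the bottom q."""
    logp = np.sort(np.log(intraday_prices))
    k = max(1, int(q * len(logp)))
    return logp[-k:].mean() - logp[:k].mean()

rng = np.random.default_rng(4)
day = 100 * np.exp(np.cumsum(rng.normal(0, 0.001, 390)))
day[200] *= 1.05                                  # a single outlier trade
print(np.log(day.max()) - np.log(day.min()))      # standard range: inflated
print(quantile_range(day))                        # quantile range: robust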
In theory, many of the range estimators in the previous sections depend on the assumption of a continuous-time geometric Brownian motion. The range estimators derived in Parkinson (1980) and Garman and Klass (1980) require a geometric Brownian motion with zero drift; Rogers and Satchell (1991) allow a nonzero drift, and Yang and Zhang (2000) further allow overnight price jumps. Moreover, only finitely many observations are available to construct the range, so the range will exhibit some unexpected bias, especially for assets with lower liquidity and finite transaction volume. Garman and Klass (1980) pointed out that discrete trading effectively produces later openings and earlier closings, and that the difference between the observed highs and lows will be less than that between the actual highs and lows; the calculated high-low estimator should therefore be biased downward. In addition, Beckers (1983) noted that the highest and lowest prices may be traded by disadvantaged buyers and sellers, so the range values might be less representative for measuring volatility. Before the range was adapted to dynamic structures, however, its application was very limited. Within the SV framework, Gallant et al. (1999) and Alizadeh, Brandt, and Diebold (2002) incorporate the range into equilibrium asset pricing models. Chou (2005) and Brandt and Jones (2006), on the other hand, fill the gap between discrete-time dynamic models and the range. Their work greatly extends the applications of range volatility. In early studies, Bollerslev et al. (1992) give good illustrations of conditional volatility applications. Based on the conditional mean-variance framework, Chou and Liu (2008a) show that the economic value of volatility timing for the range is significant in comparison to the return, which means that range volatility can be applied in practical settings. In addition, Corrado and Truong (2007) report that the range estimator has forecasting ability for volatility similar to that of implied volatility. However, implied volatilities are not available for many assets, and option markets in many countries are not sufficiently developed; in such cases, the range is more practical. More recently, Kalev and Duong (2008) utilize Martens and van Dijk's (2007) realized range to test the Samuelson Hypothesis for futures contracts.
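For reference, here is a sketch of three of the daily variance estimators discussed in this section, written from the standard formulas of Parkinson (1980), Garman and Klass (1980), and Rogers and Satchell (1991); the open/high/low/close values are made-up inputs.

import numpy as np

def parkinson(h, l):
    # requires zero-drift geometric Brownian motion
    return (np.log(h / l) ** 2) / (4.0 * np.log(2.0))

def garman_klass(o, h, l, c):
    # adds open/close information for efficiency, still zero drift
    return 0.5 * np.log(h / l) ** 2 - (2.0 * np.log(2.0) - 1.0) * np.log(c / o) ** 2

def rogers_satchell(o, h, l, c):
    # drift-independent: allows a nonzero drift term
    return np.log(h / c) * np.log(h / o) + np.log(l / c) * np.log(l / o)

# hypothetical OHLC data for one day
o, h, l, c = 100.0, 102.5, 99.0, 101.0
print(parkinson(h, l), garman_klass(o, h, l, c), rogers_satchell(o, h, l, c))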
83.6 Conclusion Volatility plays a central role in many areas of finance. In view of the theoretical and practical studies, the price range provides an intuitive and efficient estimator of volatility. In this paper, we begin our discussion by reviewing the range estimators. There has been a dramatic increase in the number
of publications on this topic since Parkinson (1980) introduced the high-low range. Since then, new range estimators have been proposed that incorporate the opening and closing prices, distributing feasible weights among the highest, lowest, opening, and closing prices. Through this analysis, we can gain a better understanding of the nature of the range. Some dynamic volatility models combined with the range are also introduced in this study; they have led to broad applications in finance. In particular, the CARR model incorporates both the superiority of the range in forecasting volatility and the elasticity of the GARCH model. Moreover, the range-based DCC model, which combines CARR with DCC, contributes to multivariate applications. This research may provide an alternative approach to risk management and asset allocation. Finally, replacing the squared intraday returns of the realized volatility with high-low ranges yields a more efficient estimator called the realized range. Undoubtedly, the range is statistically sensitive to outliers, but few researchers have addressed this problem; it would be useful and meaningful to replace the standard range with the quantile range to obtain a robust measure. Moreover, multivariate work on the range is still in its infancy. Future research is clearly required on this topic.
References Alizadeh, S., M. Brandt, and F. Diebold. 2002. “Range-based estimation of Stochastic Volatility models.” Journal of Finance 57, 1047–1091. Andersen, T., T. Bollerslev, F. Diebold, and H. Ebens. 2001. “The distribution of realized stock return volatility.” Journal of Financial Economics 61, 43–76. Andersen, T., T. Bollerslev, F. Diebold and P. Labys. 2003. “Modeling and forecasting realized volatility.” Econometrica 71, 579–625. Andersen, T., T. Bollerslev, P. Christoffersen, and F. Diebold. 2006a. “Volatility and correlation forecasting,” in Handbook of economic forecasting, G. Elliott, C. W. J. Granger, and A. Timmermann (Eds.). North-Holland, Amsterdam, pp. 778–878. Andersen, T., T. Bollerslev, P. Christoffersen, and F. Diebold. 2006b. “Practical volatility and correlation modeling for financial market risk management,” in Risks of financial institutions, M. Carey, and R. Stulz (Eds.). University of Chicago Press for NBER, Chicago. Ball, C. and W. Torous. 1984. “The maximum likelihood estimation of security price volatility: theory, evidence, and application to option pricing.” Journal of Business 57, 97–112. Barndorff-Nielsen, O. and N. Shephard. 2003. “How accurate is the asymptotic approximation to the distribution of realised volatility?,” in Identification and inference for econometric models, A Festschrift for Tom Rothenberg, D. Andrews, J. Powell, P. Ruud, and J. Stock (Eds.). Econometric Society Monograph Series, Cambridge University Press, Cambridge. Bauwens, L., S. Laurent, and J. V. K. Rombouts. 2006. “Multivariate GARCH models: a survey.” Journal of Applied Econometrics 21, 79–109. Beckers, S. 1983. “Variances of security price returns based on highest, lowest, and closing prices.” Journal of Business 56, 97–112.
Bollerslev, T. 1986. “Generalized autoregressive conditional heteroscedasticity.” Journal of Econometrics 31, 307–327. Bollerslev, T. 1990. “Modeling the coherence in short-run nominal exchange rates: a multivariate generalized ARCH model.” Review of Economics and Statistics 72, 498–505. Bollerslev, T., R. Y. Chou, and K. Kroner. 1992. “ARCH modeling in finance: a review of the theory and empirical evidence.” Journal of Econometrics 52, 5–59. Brandt, M. and F. Diebold. 2006. “A no-arbitrage approach to range-based estimation of return covariances and correlations.” Journal of Business 79, 61–74. Brandt, M. and C. Jones. 2006. “Volatility forecasting with range-based EGARCH models.” Journal of Business and Economic Statistics 24, 470–486. Cappiello, L., R. Engle, and K. Sheppard. 2006. “Asymmetric dynamics in the correlations of global equity and bond returns.” Journal of Financial Econometrics 4, 537–572. Cheung, Y. 2007. “An empirical model of daily highs and lows.” International Journal of Finance and Economics 12, 1–20. Chou, R. 2005. “Forecasting financial volatilities with extreme values: the conditional autoregressive range (CARR) model.” Journal of Money, Credit and Banking 37, 561–582. Chou, R. 2006. “Modeling the asymmetry of stock movements using price ranges.” Advances in Econometrics 20A, 231–257. Chou, R. Y. and N. Liu. 2008a. The economic value of volatility timing using a range-based volatility model, Working paper, Academia Sinica. Chou, R. Y. and N. Liu. 2008b. Estimating time-varying hedge ratios with a range-based multivariate volatility model, Working paper, Academia Sinica. Chou, H. and D. Wang. 2007. “Forecasting volatility of UK stock market: a test of conditional autoregressive range (CARR) model.” International Research Journal of Finance and Economics 10, 7–13. Chou, R. Y., N. Liu, and C. Wu. 2007. Forecasting time-varying covariance with a range-based dynamic conditional correlation model, Working paper, Academia Sinica. Christensen, K. and M. Podolskij. 2007. “Realized range-based estimation of integrated variance.” Journal of Econometrics 141, 323–349. Christoffersen, P. F. 2002. Elements of financial risk management, Academic Press, San Diego. Corrado, C. and C. Truong. 2007. “Forecasting stock index volatility: comparing implied volatility and the intraday high-low price range.” Journal of Financial Research 30, 201–215. Cox, J. and M. Rubinstein. 1985. Options markets, Prentice-Hall, NJ. Engle, R. F. 1982. “Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation.” Econometrica 50, 987–1007. Engle, R. 2002a. “Dynamic conditional correlation: a simple class of multivariate GARCH models.” Journal of Business and Economic Statistics 20, 339–350. Engle, R. 2002b. “New frontiers for ARCH models.” Journal of Applied Econometrics 17, 425–446. Engle, R. and K. Sheppard. 2001. “Theoretical and empirical properties of dynamic conditional correlation multivariate GARCH.” NBER Working Paper 8554. Fernandes, M., B. Mota, and G. Rocha. 2005. “A multivariate conditional autoregressive range model.” Economics Letters 86, 435–440. Gallant, R., C. Hsu, and G. Tauchen. 1999. “Using daily range data to calibrate volatility diffusions and extracting integrated volatility.” Review of Economics and Statistics 81, 617–631. Garman, M. and M. Klass. 1980. “On the estimation of security price volatilities from historical data.” Journal of Business 53, 67–78. Grammatikos, T. and A. Saunders. 1986.
“Futures price variability: a test of maturity and volume effects.” Journal of Business 59, 319–330.
Hanke, J. and D. Wichern. 2005. Business forecasting, 8th ed., Prentice Hall, New York. Harvey, A., E. Ruiz, and N. Shephard. 1994. “Multivariate stochastic variance models.” Review of Economic Studies 61, 247–264. Kalev, P. S. and H. N. Duong. 2008. “A test of the Samuelson Hypothesis using realized range.” Journal of Futures Markets 28, 680–696. Kunitomo, N. 1992. “Improving the Parkinson method of estimating security price volatilities.” Journal of Business 65, 295–302. Mandelbrot, B. 1971. “When can price be arbitraged efficiently? A limit to the validity of the random walk and martingale models.” Review of Economics and Statistics 53, 225–236. Martens, M. and D. van Dijk. 2007. “Measuring volatility with the realized range.” Journal of Econometrics 138, 181–207. McAleer, M. and M. Medeiros. 2008. “Realized volatility: a review.” Econometric Reviews 27, 10–45. Nelson, D. 1991. “Conditional heteroskedasticity in asset returns: a new approach.” Econometrica 59, 347–370. Nison, S. 1991. Japanese candlestick charting techniques, New York Institute of Finance, New York. Parkinson, M. 1980. “The extreme value method for estimating the variance of the rate of return.” Journal of Business 53, 61–65.
Poon, S. H. and C. W. J. Granger. 2003. “Forecasting volatility in financial markets: a review.” Journal of Economic Literature 41, 478–639. Rogers, L. and S. Satchell. 1991. “Estimating variance from high, low and closing prices.” Annals of Applied Probability 1, 504–512. Rogers, L., S. Satchell, and Y. Yoon. 1994. “Estimating the volatility of stock prices: a comparison of methods that use high and low prices.” Applied Financial Economics 4, 241–247. Shu, J. H. and J. E. Zhang. 2006. “Testing range estimators of historical volatility.” Journal of Futures Markets 26, 297–313. Taylor, S. J. 1986. Modelling financial time series, Wiley, Chichester. West, K. and D. Cho. 1995. “The predictive ability of several models of exchange rate volatility.” Journal of Econometrics 69, 367–391. Wiggins, J. 1991. “Empirical tests of the bias and efficiency of the extreme-value variance estimator for common stocks.” Journal of Business 64, 417–432. Yang, D. and Q. Zhang. 2000. “Drift-independent volatility estimation based on high, low, open, and closing prices.” Journal of Business 73, 477–491.
Chapter 84
Examining the Impact of the U.S. IT Stock Market on Other IT Stock Markets Zhuo Qiao, Venus Khim-Sen Liew, and Wing-Keung Wong
Abstract Because of its very important role in modern production and management, information technology (IT) has become a major driver of economic growth and has speeded up the integration of the global economy since the 1990s. Due to the prominent position of the IT industry in the U.S., the U.S. IT stock market is believed to have driven up IT stock markets in other countries. In this paper, we adopt a multivariate GARCH model of Baba et al. (Unpublished manuscript, Department of Economics, University of California, San Diego, 1990) to investigate the linkages between the U.S. IT stock market and several non-U.S. IT markets; namely, Japan, France, Canada, Finland, Sweden, and Hong Kong. Our findings reveal that, generally, the U.S. IT market contributes strong volatility spillover effects to non-U.S. IT markets rather than mean spillover effects, implying that the U.S. IT market plays a dominant role in the volatility of world IT markets. In addition, our analysis of the dynamic path of the correlation coefficients implies that during the formation, spread, and collapse of the IT bubble, the relationships between the U.S. and non-U.S. IT markets were strong, but the relationships weakened after the IT bubble burst. Keywords Information technology · IT bubble · Stock market · Integration · Volatility · Spillover effect · Multivariate GARCH (MGARCH) · Conditional correlation
Z. Qiao Faculty of Business Administration, University of Macau, Macau, China e-mail: [email protected] V.K.-S. Liew Faculty of Economics and Business, Universiti Malaysia Sarawak, Malaysia e-mail: [email protected] W.-K. Wong () Department of Economics, Hong Kong Baptist University, Hong Kong, Kowloon Tong, Hong Kong e-mail: [email protected]
84.1 Introduction The rapid growth and diffusion of information technology (IT) have had a great impact on modern production and management. IT became a major driver of economic growth throughout the 1990s, and, thus, IT markets have attracted great attention from investors and policy makers. For example, Oliner and Sichel (2000) found that developments in computer hardware, software, and network infrastructure accounted for about two-thirds of the acceleration in labor productivity for the non-farm business sector in the U.S. between the first and second halves of the 1990s. Gains in the new technology fueled the fastest-growing companies in history through the second half of the 1990s. Writing at the peak of the IT boom, Gordon (2000) stated that the true enthusiasts treat the New Economy as a fundamental industrial revolution as great or greater in importance than the concurrence of inventions, particularly in electricity and the internal combustion engine, which transformed the world at the turn of the last century. However, Gordon (2000) pointed out that the miracle of U.S. economic performance that occurred on the back of the growth in the IT sector in the second half of the 1990s began to unravel when the NASDAQ fell by half between March and December 2000. In 2000 and 2001, it was reported that 784 IT companies went out of business, and in those 2 years, 143,440 workers in the IT industry in the U.S. lost their jobs (Maich 2003). The profits of Yahoo, a company whose primary source of revenues is online advertising, collapsed from earnings of nearly U.S. $300 million in 2000 to almost nothing in 2001. Yahoo’s stock price slumped from a high of about U.S. $240 in early 2000 to U.S. $17 on March 9, 2001, the first anniversary of the 5,000-level peak of the NASDAQ. Over the same period, IT stocks in countries other than the U.S. also collapsed, focusing attention on the fact that the collapse in IT stocks was a global phenomenon. With the progress of global economic integration, researchers have examined the linkages among major national stock markets. Important studies include Jeon and von Furstenberg (1990), Hamao et al. (1990), Campbell and Hamao (1992), Longin and Solnik (1995), Hamori and Imamura (2000), Masih and Masih (2001), and Edwards
and Susmel (2001). A common finding in these studies is that co-movements across national stock markets have increased since the 1990s. This finding has resulted in growing interest in the importance of industry factors in explaining international return variation as investors consider alternative diversification strategies to reduce risk. However, in the literature on the co-movements between national stock markets, only a few studies have examined the interdependence among industry-based stock markets. Among them, Jorge and Iryna (2002) applied univariate T-GARCH models to examine whether price changes and volatility spillovers were generated from the U.S. or from the Asia-Pacific region, using stock data for the telecommunications, media, and technology (TMT) and non-TMT sectors from January 1990 to May 2001. Their findings suggested that the U.S. market plays an important role in determining price dynamics in Asia-Pacific stock markets for both the TMT and non-TMT sectors. They also found that Asia-Pacific stock markets have little or no influence on U.S. stock markets, especially for TMT stocks. In a separate endeavor, Jeon and Jang (2004) discovered only unidirectional causation from the U.S. high-tech to the South Korean hightech index, using a vector autoregression (VAR) model to examine the interrelationship between the NASDAQ and the KOSDAQ stock market indices for high-tech industries in South Korea as well as the relationship between the stock prices of South Korean and U.S. semiconductor firms for the period July 1996 to February 2001. In this chapter, we contribute to the literature by applying a multivariate GARCH (MGARCH) model formulated by Baba et al. (BEKK) (1990), or the so-called BEKK GARCH model, to analyze linkages between the information technology stock market in the U.S. and those in Japan, France, Canada, Finland, Sweden, and Hong Kong. Briefly, the BEKK MGARCH model allows one to investigate the lead-lag relationships or informational spillover effects of two or more variables, using both the first (causality in mean) and second order (causality in variance) specifications. After the seminal work of Engle and Kroner (1995), this model has gained popularity in financial market studies to test for volatility transmission or spillover effects. For instance, Karolyi (1995) used this framework to model international transmissions of stock returns and volatility in the context of U.S. and Canada, whereas Caporale et al. (2002) applied it to East Asian markets to examine the causality relationship between stock prices and the volatility of exchange rates. More recently, Liao and Williams (2004) employed this framework to model stock market interdependence in groups in the European Community. Considering the usefulness of this model in the identification of informational spillover effects in financial markets, this study adopts it to analyze international linkages between information technology stocks.
To preview our findings, our estimation of the BEKK MGARCH model suggests that the U.S. IT market contributes a mean spillover effect only to a few (Japan and Hong Kong), but not all, IT markets, and the IT markets of some other countries (France, Canada, and Sweden) do generate significant mean spillover effects to the U.S. IT market. This finding is in sharp contrast to the common belief that the U.S. IT market plays a dominant role in the world. Nevertheless, our findings reveal that in general the U.S. IT market still plays a dominant role in affecting the volatility of world IT markets in the sense that it contributes strong volatility spillover effects to other IT stock markets except Canada. In addition, our analysis of the dynamic path of correlation coefficients supports the idea that, during the formation, spread, and collapse of the IT bubble, the relationship between the U.S. and non-U.S. IT markets is strong. In contrast, after the burst of the IT bubble, the relationship has been weakened ever since. Overall, our research results provide additional information on the relationship between the U.S. and non-U.S. IT markets, which, in turn, should be useful for investors when making investment decisions and for policy makers when setting policy for the IT industry. The remainder of the chapter is organized as follows: Sect. 84.2 discusses the data and methodology. The empirical results are presented and analyzed in Sect. 84.3. We make some concluding remarks in the final section.
84.2 Data and Methodology

We use the weekly IT indices of the U.S. (US), Japan (JP), France (FR), Canada (CA), Finland (FIN), Sweden (SWE), and Hong Kong (HK) taken from DataStream International in our study. Our sample covers the period from January 1995 to December 2005, with 574 observations in total for each index. The weekly Wednesday indices are used to alleviate the effects of noise characterizing daily data and to avoid the day-of-the-week effect (Lo and MacKinlay 1988). In addition, to avoid exchange rate bias, all indices are expressed in U.S. dollars. The weekly continuously compounded rate of return, r_t, on date t is defined as

r_t = 100\,(\ln p_t - \ln p_{t-1}),    (84.1)

where p_t is the corresponding price index on date t for each of the IT stock price indices.

Since the U.S. IT industry is believed to hold a prominent position in the world, the U.S. IT market is expected to lead the IT markets in other countries. To study this possibility, we examine the linkages between the U.S. IT market and the IT markets in non-U.S. countries, in terms of both mean
and volatility spillover effects. In particular, we adopt the following bivariate VAR(m)-BEKK(1,1) MGARCH framework to model the dynamic linkages between the two IT markets:

r_{1t} = c_1 + \sum_{i=1}^{m} \varphi_{11}^{i} r_{1,t-i} + \sum_{i=1}^{m} \varphi_{12}^{i} r_{2,t-i} + \varepsilon_{1t},
r_{2t} = c_2 + \sum_{i=1}^{m} \varphi_{21}^{i} r_{1,t-i} + \sum_{i=1}^{m} \varphi_{22}^{i} r_{2,t-i} + \varepsilon_{2t},    (84.2)
where r_{1t} and r_{2t} denote the stock index returns of the U.S. IT market and one of the non-U.S. IT markets, respectively, and \varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t})' is the vector of error terms. In addition, in order to capture the heteroskedasticity in the return series and to ensure that the variance matrix of the error terms is positive definite, we further combine the model in Equation (84.2) with the following bivariate BEKK(1,1) model (Baba et al. 1990; Engle and Kroner 1995):¹

\varepsilon_t \mid \Phi_{t-1} \sim N(0, \Sigma_t),    (84.3)

where \Sigma_t = \begin{pmatrix} \sigma_t^{11} & \sigma_t^{12} \\ \sigma_t^{21} & \sigma_t^{22} \end{pmatrix} = A_0 A_0' + A_1 \varepsilon_{t-1} \varepsilon_{t-1}' A_1' + B_1 \Sigma_{t-1} B_1'. Here, we assume that \varepsilon_t follows a bivariate normal distribution conditional on the past information set \Phi_{t-1}; \Sigma_t is the symmetric and positive semi-definite variance-covariance matrix of \varepsilon_t; A_0 is a lower triangular matrix; and A_1 and B_1 are unrestricted square matrices. This framework enables us to fully examine the dynamics of \Sigma_t. Usefully, it allows for interaction between the conditional variances and covariances, thereby allowing us to observe the contemporaneous information flows across markets. Expanding the BEKK(1,1) model into individual dynamic equations generates the following variance and covariance equations:

\sigma_t^{11} = (A_0^{11})^2 + (A_1^{11}\varepsilon_{1,t-1} + A_1^{12}\varepsilon_{2,t-1})^2 + \left[(B_1^{11})^2 \sigma_{t-1}^{11} + 2 B_1^{11} B_1^{12} \sigma_{t-1}^{12} + (B_1^{12})^2 \sigma_{t-1}^{22}\right],
\sigma_t^{22} = (A_0^{21})^2 + (A_0^{22})^2 + (A_1^{21}\varepsilon_{1,t-1} + A_1^{22}\varepsilon_{2,t-1})^2 + \left[(B_1^{21})^2 \sigma_{t-1}^{11} + 2 B_1^{21} B_1^{22} \sigma_{t-1}^{12} + (B_1^{22})^2 \sigma_{t-1}^{22}\right],
\sigma_t^{12} = A_0^{11} A_0^{21} + A_1^{11} A_1^{21} \varepsilon_{1,t-1}^2 + (A_1^{11} A_1^{22} + A_1^{21} A_1^{12})\,\varepsilon_{1,t-1}\varepsilon_{2,t-1} + A_1^{12} A_1^{22} \varepsilon_{2,t-1}^2 + B_1^{11} B_1^{21} \sigma_{t-1}^{11} + (B_1^{12} B_1^{21} + B_1^{11} B_1^{22})\,\sigma_{t-1}^{12} + B_1^{12} B_1^{22} \sigma_{t-1}^{22}.    (84.4)
¹ BEKK(1,1) is usually sufficient to model volatility in financial time series. See, for example, Baba et al. (1990).
In this framework, the dynamics of the conditional variances and the conditional covariance are modeled directly, and the volatility spillover effects across return series, indicated by the off-diagonal entries of the coefficient matrices A_1 and B_1, can also be estimated. This BEKK specification is a more general and flexible MGARCH model, as no restrictions are imposed on the coefficients; the first-order ARCH(i,j) and GARCH(i,j) terms are the elements of the ARCH and GARCH coefficient matrices A_1 and B_1 in Equations (84.3) and (84.4).
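To illustrate how Equations (84.3) and (84.4) generate the conditional covariance path, the following sketch filters a bivariate BEKK(1,1) recursion for given parameter matrices. The parameter values and the residual series are illustrative assumptions, not the estimates reported in this chapter.

import numpy as np

def bekk_filter(eps, A0, A1, B1):
    """Conditional covariances: Sigma_t = A0 A0' + A1 e e' A1' + B1 Sigma B1'."""
    T, k = eps.shape
    C = A0 @ A0.T
    Sigma = np.cov(eps, rowvar=False)        # initialize at the sample covariance
    path = np.empty((T, k, k))
    for t in range(T):
        path[t] = Sigma
        e = eps[t][:, None]
        Sigma = C + A1 @ (e @ e.T) @ A1.T + B1 @ Sigma @ B1.T
    return path

# illustrative parameter matrices (A0 lower triangular, as required)
A0 = np.array([[0.3, 0.0], [0.1, 0.3]])
A1 = np.array([[0.25, 0.03], [0.02, 0.25]])
B1 = np.array([[0.95, 0.01], [0.01, 0.95]])

rng = np.random.default_rng(5)
eps = rng.normal(0, 1, (500, 2))             # stand-in for the VAR residuals
path = bekk_filter(eps, A0, A1, B1)
corr = path[:, 0, 1] / np.sqrt(path[:, 0, 0] * path[:, 1, 1])
print(corr[-1])                              # conditional correlation (see footnote 2)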
84.3 Empirical Results

To analyze the stock indices of the IT sectors for all countries studied in this chapter, we first plot the indices in Fig. 84.1 and display the descriptive statistics of their weekly returns in Table 84.1. The figure shows clearly that the stock indices of the IT sectors increased rapidly prior to 1999 and slumped sharply after 2001. The table reveals that, with the exception of the Hong Kong market, the returns are skewed to the left. In addition, consistent with the literature, the kurtosis coefficients indicate that the distributions of the series have fat tails, with the values of the corresponding Jarque-Bera statistics indicating non-normality for each of the stock returns. To examine the serial correlations in both the levels and the squared levels of the stock returns, we applied the Ljung-Box (LB) test to both the return series (LB(q)) and the squares of the return series (LB²(q)), where q represents the number of lags included in the computation of the LB statistics. As shown in Table 84.1, the significance of the LB² statistics at lag 6 implies the existence of strong serial correlation in the squared levels, revealing the presence of time-varying volatility, such as ARCH or GARCH effects, in the return series. To test the stationarity of each return series, we conduct both the Augmented Dickey-Fuller (ADF) and the Phillips-Perron (PP) unit root tests and present the results in Table 84.2. Consistent with the literature, the results of both the ADF and PP tests indicate that all of the return series are stationary. We then proceed to use the bivariate VAR(m)-BEKK MGARCH model as stated in Equations (84.2)-(84.4), with m = 1 in particular, to analyze the data, and we present the results in Tables 84.3a and 84.3b. The statistical results shown in the tables enable us to analyze the dynamic linkages between the U.S. IT stock market and the other non-U.S. IT markets. For instance, the estimated coefficient \varphi_{12}^{1} (\varphi_{21}^{1}) for the US-JP pair measures the mean spillover effect from the JP (U.S.) to the U.S. (JP) IT stock market, and ARCH(1,2) and GARCH(1,2) (ARCH(2,1) and GARCH(2,1)) measure the volatility spillover effect from the JP (U.S.) IT stock market to the U.S. (JP) IT stock market.
Fig. 84.1 Plot of IT stock price indices for different countries. Note: This figure exhibits the stock price indices of the US and non-US IT stock markets: US, JP, FR, CA, FIN, SWE and HK, for the period January 1995 through December 2005
Table 84.1 Descriptive statistics for stock returns of the IT sectors

              US        JP        FR        CA        FIN       SWE       HK
Mean          0.364     0.185     0.195     0.214     0.064     0.125     0.028
Median        0.670     0.644     0.084     0.345     0.342     0.177     0.532
Maximum       25.320    28.023    45.572    16.622    14.432    22.875    22.542
Minimum       -35.325   -52.469   -34.958   -21.399   -18.467   -29.531   -38.371
Std. dev.     6.680     7.705     7.176     4.416     4.677     5.334     6.317
Skewness      -0.731    -1.007    -0.330    -0.356    -0.060    -0.304    0.703
Jarque-Bera   296.066   1253.066  923.246   64.188    37.487    258.572   307.928
LB(6)         12.122    5.617     19.694    6.371     6.680     8.007     23.537
LB²(6)        20.501    20.258    206.40    87.635    91.765    97.915    75.721
Notes: ***, **, and * indicate statistical significance at the 1, 5, and 10% levels, respectively. The Jarque-Bera statistic has a χ² distribution with two degrees of freedom under the null hypothesis of normally distributed errors. LB(6) is the Ljung-Box statistic based on the levels of the time series up to the sixth order. LB²(6) is the Ljung-Box statistic based on the squared levels. Both statistics are asymptotically distributed as χ²(6)
Table 84.2 Unit root tests for index returns

                 ADF                        PP
Market       t-Statistic   p-Value      t-Statistic   p-Value
US           -25.258       0.000        -25.226       0.000
JP           -23.669       0.000        -23.706       0.000
FR           -23.370       0.000        -23.400       0.000
CA           -21.607       0.000        -21.998       0.000
FIN          -25.854       0.000        -25.830       0.000
SWE          -24.025       0.000        -24.047       0.000
HK           -23.423       0.000        -23.706       0.000
Notes: The null hypothesis is that the series has a unit root. ADF test refers to Augmented Dickey-Fuller unit root test. The PP test refers to the Phillips-Perron unit root test
Table 84.3a Estimates for VAR(1)-BEKK(1,1) fitted on US-JP, US-FR, and US-CA

Markets\Estimates    US-JP              US-FR              US-CA
c1                   0.315 (0.152)      0.305 (0.163)      0.229 (0.160)
c2                   0.084 (0.166)      0.311 (0.201)      0.215 (0.237)
φ¹11                 -0.056 (0.050)     -0.108 (0.057)     -0.145 (0.055)
φ¹12                 -0.009 (0.036)     0.095 (0.042)      0.071 (0.032)
φ¹21                 0.171 (0.042)      0.034 (0.062)      -0.087 (0.081)
φ¹22                 -0.056 (0.042)     0.011 (0.058)      0.116 (0.063)
A(1,1)               0.238 (0.286)      0.038 (1.289)      0.199 (0.275)
A(2,1)               -0.430 (0.509)     -0.6932 (6.233)    -0.807 (1.687)
A(2,2)               0.233 (0.866)      0.6053 (0.122)     0.009 (154.815)
ARCH(1,1)            0.287 (0.045)      0.207 (0.055)      0.249 (0.046)
ARCH(1,2)            0.030 (0.031)      0.044 (0.041)      0.006 (0.037)
ARCH(2,1)            -0.023 (0.045)     -0.118 (0.063)     -0.026 (0.070)
ARCH(2,2)            0.196 (0.032)      0.376 (0.044)      0.311 (0.057)
GARCH(1,1)           0.961 (0.017)      0.989 (0.016)      0.971 (0.016)
GARCH(1,2)           -0.016 (0.012)     -0.023 (0.016)     -0.003 (0.013)
GARCH(2,1)           0.033 (0.015)      0.065 (0.022)      0.040 (0.027)
GARCH(2,2)           0.962 (0.011)      0.900 (0.022)      0.933 (0.022)
Notes: The estimates are based on Equations (84.2)-(84.4) in the text. The first-order ARCH(i,j) and GARCH(i,j) terms are the elements of the ARCH and GARCH coefficient matrices A1 and B1 in Equations (84.3) and (84.4). Numbers in parentheses are standard errors. ***, **, and * indicate significance at the 1, 5, and 10% levels, respectively

Table 84.3b Estimates for VAR(1)-BEKK(1,1) fitted on US-FIN, US-SWE, and US-HK
Markets\Estimates    US-FIN             US-SWE             US-HK
c1                   0.322 (0.191)      0.297 (0.167)      0.379 (0.180)
c2                   0.594 (0.286)      0.410 (0.318)      0.269 (0.258)
φ¹11                 -0.081 (0.060)     -0.123 (0.054)     -0.055 (0.053)
φ¹12                 0.053 (0.037)      0.062 (0.028)      0.023 (0.032)
φ¹21                 0.003 (0.076)      -0.119 (0.095)     0.158 (0.063)
φ¹22                 -0.069 (0.057)     -0.021 (0.068)     0.018 (0.052)
A(1,1)               0.126 (1.058)      0.385 (0.269)      0.380 (0.226)
A(2,1)               1.447 (12.402)     -0.758 (0.668)     1.435 (0.448)
A(2,2)               0.631 (28.120)     0.009 (52.541)     0.000 (2.142)
ARCH(1,1)            0.268 (0.049)      0.237 (0.047)      0.117 (0.049)
ARCH(1,2)            -0.018 (0.035)     0.028 (0.027)      0.073 (0.023)
ARCH(2,1)            0.072 (0.078)      -0.043 (0.078)     -0.104 (0.073)
ARCH(2,2)            0.230 (0.043)      0.216 (0.031)      0.402 (0.038)
GARCH(1,1)           0.993 (0.026)      0.969 (0.018)      0.995 (0.009)
GARCH(1,2)           -0.027 (0.022)     -0.008 (0.008)     -0.037 (0.008)
GARCH(2,1)           0.070 (0.037)      0.056 (0.025)      0.028 (0.016)
GARCH(2,2)           0.908 (0.026)      0.960 (0.008)      0.895 (0.017)
Notes: The estimates are based on Equations (84.2)-(84.4) in the text. The first-order ARCH(i,j) and GARCH(i,j) terms are the elements of the ARCH and GARCH coefficient matrices A1 and B1 in Equations (84.3) and (84.4). Numbers in parentheses are standard errors. ***, **, and * indicate significance at the 1, 5, and 10% levels, respectively
Among the estimates of the mean equations in Tables 84.3a and 84.3b, the value of \varphi_{21}^{1} indicates a highly significant positive effect for U.S.-JP and U.S.-HK, implying that there are mean spillover effects from the U.S. IT stock market to both the Japan and Hong Kong IT stock markets. On the other hand, we also find a significantly positive \varphi_{12}^{1} for U.S.-FR, U.S.-CA, and U.S.-SWE, revealing that there
also exist positive mean spillover effects from the France, Canada, and Sweden IT stock markets to the U.S. IT stock market. The only pair that does not contain any spillover effect in the mean equations is U.S.-FIN, implying that there is no mean spillover effect between the U.S. and Finland. Our results reveal that the U.S. IT market does not play a dominant role in terms of mean spillover effects, since it contributes a mean spillover effect to only a few other IT markets, while some other IT markets also generate significant mean spillover effects to the U.S. IT market.

It is also important to analyze the spillover effects in volatility between the IT markets by examining the off-diagonal items in the variance and covariance equations. From the tables, we find that the GARCH(2,1) estimates are significantly positive for all the pairs in our study except U.S.-CA. However, the ARCH(2,1) estimates are not significantly different from zero, except in the case of U.S.-FR, where the ARCH(2,1) estimate has a marginally significant negative value. These results imply that the U.S. IT market does contribute some volatility spillover effects to other IT markets. On the other hand, the ARCH(1,2) and GARCH(1,2) parameters are significant only for the U.S.-HK pair. This implies that the U.S. IT market does play a leading role in influencing the volatility of other IT markets, whereas other IT markets (except HK) do not contribute any volatility spillover effect to the U.S. IT market. In addition, it is interesting to examine the dynamic evolution path of the correlation between the IT markets.

Fig. 84.2 Contemporaneous correlations among IT stock markets. Note: This figure displays the time-varying correlation coefficients of the US and non-US IT stock markets: US-JP, US-FR, US-CA, US-FIN, US-SWE and US-HK for the period January 1995 through December 2005
To achieve this, we plot in Fig. 84.2 the time-varying conditional correlation coefficients estimated from the VAR(1)-BEKK(1,1) model for each pair of markets.² By eyeballing these figures, we discover an interesting finding: during the period 1999-2002, that is, the period of the formation, spread, and collapse of the IT bubble, the correlation between the U.S. and non-U.S. IT markets is high and exhibits an upward trend. This indicates that the market relationships were very close during the IT bubble, which became a global event that affected each stock market significantly. However, after the burst of the IT bubble, the correlation path shows a sharp downward-sloping trend, suggesting that the relationship between the U.S. and non-U.S. IT markets has become weaker ever since. Finally, we conduct a diagnostic check of the models and report the test results in Table 84.4. The Ljung-Box test of white noise is applied to both the standardized residuals and
² The conditional correlation between markets 1 and 2 at any time t is defined as \rho_{1,2,t} = \sigma_t^{12} / \sqrt{\sigma_t^{11} \sigma_t^{22}}.
Table 84.4 Diagnostic Check of the VAR (1)-BEKK (1,1) model fitting
Markets\Statistics   US-JP     US-FR     US-CA
LB(6)-US             1.186     0.546     1.327
LB²(6)-US            11.055    8.750     8.747
LB(6)-JP             6.903     N/A       N/A
LB²(6)-JP            9.553     N/A       N/A
LB(6)-FR             N/A       3.571     N/A
LB²(6)-FR            N/A       3.141     N/A
LB(6)-CA             N/A       N/A       5.485
LB²(6)-CA            N/A       N/A       7.897

Markets\Statistics   US-FIN    US-SWE    US-HK
LB(6)-US             0.835     0.381     1.823
LB²(6)-US            11.874    9.295     8.011
LB(6)-FIN            6.197     N/A       N/A
LB²(6)-FIN           2.593     N/A       N/A
LB(6)-SWE            N/A       3.359     N/A
LB²(6)-SWE           N/A       2.148     N/A
LB(6)-HK             N/A       N/A       8.386
LB²(6)-HK            N/A       N/A       8.858
Notes: Numbers in parentheses are standard errors. ***, **, and * indicate significance at the 1, 5, and 10% levels, respectively. LB(6) and LB²(6) are the Ljung-Box statistics based on the level and the squared level of the time series up to the sixth order
the squared standardized residual series, to test for serial correlation in the first and second moments of the residuals. Since none of the Ljung-Box test statistics is significant at the 5% level, we conclude that the fitted models are adequate and successful in capturing the dynamics in the first two moments of the index return series.
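Here is a sketch of the Ljung-Box statistic reported in Tables 84.1 and 84.4, computed directly from its definition; the simulated series stands in for the standardized residuals of the fitted models, and the p-value uses a chi-square distribution with q degrees of freedom.

import numpy as np
from scipy.stats import chi2

def ljung_box(x, q=6):
    """LB(q) = n(n+2) * sum_{k=1}^{q} rho_k^2 / (n - k)."""
    x = np.asarray(x) - np.mean(x)
    n = len(x)
    denom = np.sum(x ** 2)
    acf = [np.sum(x[k:] * x[:-k]) / denom for k in range(1, q + 1)]
    stat = n * (n + 2) * sum(r ** 2 / (n - k) for k, r in enumerate(acf, start=1))
    return stat, chi2.sf(stat, df=q)

rng = np.random.default_rng(6)
z = rng.normal(size=574)                     # stand-in standardized residuals
print(ljung_box(z, 6))                       # test on the levels
print(ljung_box(z ** 2, 6))                  # test on the squared levels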
84.4 Conclusions

Information technology plays a very important role in modern production and management. It has become a major driver of economic growth and has greatly speeded up the integration of the global economy since the 1990s. In this chapter, we adopt the bivariate VAR(1)-BEKK(1,1) MGARCH model to examine the linkages between the U.S. IT stock market and those of several non-U.S. countries. Our results show that, contrary to common belief, the U.S. IT market does not play a leading role in terms of mean spillover effects, since the U.S. IT market contributes mean spillover effects to the IT markets of only some countries, while the IT markets of some other countries also contribute mean spillover effects to the U.S. IT market. Nevertheless, in general, the U.S. IT market still plays a leading role in terms of volatility, in the sense that it has strong volatility spillover effects on other IT stock markets except Canada. The evolution path of the correlation coefficients between the U.S. and non-U.S. IT markets suggests that during the formation, spread, and collapse of the IT bubble, the relationship between the U.S. and non-U.S. IT markets was strong. In contrast, after
the burst of the IT bubble, the relationship has been weaker. Overall, our research results provide additional information on the relationship between the U.S. and non-U.S. IT markets, which, in turn, should be useful for investors when making investment decisions and for policy makers when setting policy for the IT industry. Acknowledgments The authors are most grateful to Professor C.F. Lee for his substantive comments and suggestions that have significantly improved this manuscript. The third author would like to thank Professors Robert B. Miller and Howard E. Thompson for their continuous guidance and encouragement. The research is partially supported by University of Macau, Universiti Malaysia Sabah, and Hong Kong Baptist University.
References

Baba, Y., R. F. Engle, D. Kraft, and K. F. Kroner. 1990. “Multivariate simultaneous generalized ARCH.” Unpublished manuscript, Department of Economics, University of California, San Diego. Bollerslev, T., R. F. Engle, and J. M. Wooldridge. 1988. “A capital asset pricing model with time varying covariances.” Journal of Political Economy 96(1), 116–131. Campbell, J. Y. and Y. Hamao. 1992. “Predictable stock returns in the United States and Japan: a study of long-term capital market integration.” Journal of Finance 47, 43–69. Caporale, M. G., N. Pittis, and N. Spagnolo. 2002. “Testing for causality in variance: an application to the East Asian markets.” International Journal of Finance and Economics 7, 235–245. Edwards, S. and R. Susmel. 2001. “Volatility dependence and contagion in emerging equity markets.” Journal of Development Economics 66, 505–532. Engle, R. F. and K. F. Kroner. 1995. “Multivariate simultaneous generalized ARCH.” Econometric Theory 11, 122–150.
1290 Gordon, R. 2000. “Does the ‘new economy’ measure up to the great inventions of the past?” Journal of Economic Perspectives 14, 49–74. Hamao, Y. R., W. Masulis, and V. K. Ng. 1990. “Correlations in price changes and volatility across international stock markets.” Review of Financial Studies 3, 281–307. Hamori, S. and Y. Imamura. 2000. “International transmission of stock prices among G7 countries: LA-VAR approach.” Applied Economics Letters 7, 613–618. Jeon, B. N. and B. S. Jang. 2004. “The linkage between the US and Korean stock markets: the case of NASDAQ, KOSDAQ, and the semiconductor stocks.” Research in International Business and Finance 18, 319–340. Jeon, B. N. and G. M. von Furstenberg. 1990. “Growing international co-movement in stock price indexes.” Quarterly Review of Economics Business 30, 15–30. Jorge, C. H. and I. Iryna. 2002. Asian flu or Wall Street virus? Price and volatility spillovers of the tech and non-tech sectors in the United States and Asia, IMF working paper wp/02/154. Karolyi, A. G. 1995. “Multivariate GARCH model of international transmissions of stock returns and volatility: the case of the Unites States and Canada.” Journal of Business and Economics Statistics 13, 11–25. Liao, A. and J. Williams. 2004. “Volatility transmission and changes in stock market interdependence in the European Community.” European Review of Economic and Finance 3, 203–231. Lo, A. W. and A. C. MacKinlay. 1988. “Stock market prices do not follow random walks: evidence from a simple specification test.” Review of Financial Studies 1, 41–66. Longin, F. and B. Solnik. 1995. “Is the correlation in international equity returns constant: 1960–1990?” Journal of International Money and Finance 14, 3–23. Maich, S. 2003. “Analysts try to justify latest internet rise: a boom or bubble?” National Post, 9 September, 1. Masih, R. and A. M. Masih. 2001. “Long and short term dynamic causal transmission amongst international stock markets.” Journal of International Money and Finance 20, 563–587. Oliner, S. and D. Sichel. 2000. “The resurgence of growth in the late 1990s: is information technology the story?” Journal of Economic Perspectives 14, 3–22.
Appendix 84A When modeling multivariate economic and financial time series using vector autoregressive (VAR) models, squared residuals often exhibit significant correlations. For univariate time series, various univariate GARCH models have been developed to model the serial correlation in the second moment of the underlying univariate time series. For multivariate time series, econometricians extend the univariate GARCH models to the multivariate GARCH (MGARCH) models to model and predict the time varying volatility and volatility co-movement of multivariate time series. It is widely observed that financial volatilities move together over time across different assets and different markets; therefore, it is more important to analyze this feature through a MGARCH framework than through separate univariate GARCH models. MGARCH models have numerous applications to financial data. They provide a powerful tool
to validate many theories and provide analysis for many empirical studies in many areas, such as asset pricing, portfolio selection, option pricing, hedging and risk management, and so forth. One important application of this type of model is the study of the relations between the volatilities and co-volatilities of two financial markets, as shown in this paper. Let \{r_t\} be a k \times 1 multivariate return series; each r_t can be represented as:

r_t = c_t + \varepsilon_t,  \quad  t = 1, 2, \ldots, T,    (84A.1)

where c_t is the conditional expectation of r_t given the past information set \Phi_{t-1} and \varepsilon_t is white noise with zero mean. For most return series, it is sufficient to adopt a simple k \times 1 vector ARMA model for \{c_t\}. The conditional covariance matrix of \{\varepsilon_t\} given \Phi_{t-1} is a k \times k positive-definite matrix \Sigma_t defined by \Sigma_t = \mathrm{Cov}(\varepsilon_t \mid \Phi_{t-1}). A multivariate volatility modeling technique is used to capture the time evolution of \Sigma_t. There are many ways to generalize univariate volatility models to the multivariate case. Bollerslev et al. (1988) first provided the basic framework for an MGARCH model. They extended the univariate GARCH representation to the vectorized conditional-variance (VEC) matrix representation, which follows the traditional autoregressive moving average time series analogue. A useful member of the VEC representation family is the diagonal form (DVEC). A DVEC(p, q) model specifies \Sigma_t as follows:

\Sigma_t = A_0 + \sum_{i=1}^{p} A_i \circ (\varepsilon_{t-i} \varepsilon_{t-i}') + \sum_{j=1}^{q} B_j \circ \Sigma_{t-j},    (84A.2)

where \circ stands for the Hadamard product. For example, a bivariate DVEC(1,1) model can be expressed as:

\sigma_t^{11} = A_0^{11} + A_1^{11} \varepsilon_{1,t-1}^2 + B_1^{11} \sigma_{t-1}^{11},
\sigma_t^{22} = A_0^{22} + A_1^{22} \varepsilon_{2,t-1}^2 + B_1^{22} \sigma_{t-1}^{22},    (84A.3)
\sigma_t^{21} = A_0^{21} + A_1^{21} \varepsilon_{1,t-1} \varepsilon_{2,t-1} + B_1^{21} \sigma_{t-1}^{21}.
Clearly, the volatility of each series follows a GARCH process, while the covariance process can also be treated as a GARCH model in terms of the cross-moments of the errors. Since the covariance matrix must be symmetric, in practice it suffices to treat \Sigma_t as symmetric and consider only the lower triangular part of the system. A covariance matrix must also be positive semi-definite (PSD). However, \Sigma_t in the DVEC model cannot be guaranteed to be PSD; this is a weakness of the DVEC. Furthermore, the DVEC cannot be used to measure the dynamic dependence between volatility series. In particular, each conditional variance and covariance depends only on its own lagged elements and the corresponding cross-product of shocks or error terms.
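A small numerical sketch of the point above: because the DVEC(1,1) recursion of Equation (84A.3) updates each covariance element separately, the implied matrix need not stay positive semi-definite. The parameter values below are deliberately extreme so as to trigger a violation, and are purely illustrative.

import numpy as np

# DVEC(1,1) element-by-element update, Eq. (84A.3)
A0 = np.array([[0.01, 0.02], [0.02, 0.01]])   # constants, as a symmetric matrix
A1 = np.array([[0.05, 0.90], [0.90, 0.05]])   # very large cross-ARCH loading
B1 = np.array([[0.90, 0.05], [0.05, 0.90]])

Sigma = np.eye(2)
rng = np.random.default_rng(7)
for _ in range(200):
    e = rng.normal(0, 1, 2)
    Sigma = A0 + A1 * np.outer(e, e) + B1 * Sigma   # Hadamard products
    if np.min(np.linalg.eigvalsh(Sigma)) < 0:
        print("covariance matrix lost positive semi-definiteness")
        break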
To guarantee the positive-definite constraint, Engle and Kroner (1995) proposed the BEKK model. A BEKK(p, q) model can be expressed as:

\Sigma_t = A_0 A_0' + \sum_{i=1}^{p} A_i (\varepsilon_{t-i} \varepsilon_{t-i}') A_i' + \sum_{j=1}^{q} B_j \Sigma_{t-j} B_j',    (84A.4)

where A_0 is a lower triangular matrix, but A_i and B_j are unrestricted square matrices. It is easy to show that \Sigma_t is guaranteed to be symmetric and PSD in the above formula. In addition, the dynamics allowed in the BEKK model are richer than those in the DVEC model, which can be illustrated by expressing the elements of \Sigma_t in the following BEKK(1,1) model:

\sigma_t^{11} = (A_0^{11})^2 + (A_1^{11}\varepsilon_{1,t-1} + A_1^{12}\varepsilon_{2,t-1})^2 + \left[(B_1^{11})^2 \sigma_{t-1}^{11} + 2 B_1^{11} B_1^{12} \sigma_{t-1}^{12} + (B_1^{12})^2 \sigma_{t-1}^{22}\right],
\sigma_t^{22} = (A_0^{21})^2 + (A_0^{22})^2 + (A_1^{21}\varepsilon_{1,t-1} + A_1^{22}\varepsilon_{2,t-1})^2 + \left[(B_1^{21})^2 \sigma_{t-1}^{11} + 2 B_1^{21} B_1^{22} \sigma_{t-1}^{12} + (B_1^{22})^2 \sigma_{t-1}^{22}\right],
\sigma_t^{12} = A_0^{11} A_0^{21} + A_1^{11} A_1^{21} \varepsilon_{1,t-1}^2 + (A_1^{11} A_1^{22} + A_1^{21} A_1^{12})\,\varepsilon_{1,t-1}\varepsilon_{2,t-1} + A_1^{12} A_1^{22} \varepsilon_{2,t-1}^2 + B_1^{11} B_1^{21} \sigma_{t-1}^{11} + (B_1^{12} B_1^{21} + B_1^{11} B_1^{22})\,\sigma_{t-1}^{12} + B_1^{12} B_1^{22} \sigma_{t-1}^{22}.    (84A.5)
Obviously, BEKK is more flexible as it allows for the interaction among conditional variances and covariances. A limitation of the BEKK model is that it requires the estimation of a large number of parameters, especially when the dimensions of the variance-covariance matrix are large. However, this weakness is not important in any bivariate case, such as the model used in this chapter.
Chapter 85
Application of Alternative ODE in Finance and Economics Research Cheng-Few Lee and Junmin Shi
Abstract This chapter introduces the ordinary differential equation (ODE) and its applications in finance and economics research. There are various approaches to solving an ordinary differential equation. Among them, the most commonly used are the classical approach for a linear ODE and the Laplace transform approach. We demonstrate the applications of ODE in both deterministic and stochastic systems. As an application in deterministic systems, we apply ODE to solve a simple gross domestic product (GDP) model and an investment model of a firm, which is well studied in Gould (The Review of Economic Studies 1, 47-55, 1968). As an application in stochastic systems, we employ ODE in a jump-diffusion model and derive the expected discounted penalty given that the asset value follows a phase-type jump diffusion process. In particular, we study a capital structure management model and derive the optimal default-triggering level. For related results, please see Chen et al. (Finance and Stochastics 11, 323-355, 2007). Keywords Default-triggering level · Expected discounted penalty · Investment model · Laplace transform · Ordinary differential equation
85.1 Introduction The theory of differential equations has become an essential tool in finance and economics studies, particularly since computers have become commonly available. An ordinary differential equation (ODE) only involves a single unknown function with one variable and its derivative, whereas nonordinary differential equations, such as partial differential equations, have an unknown function dependent on more than one variable. In the chapter, we shall introduce the ordinary differential equation and its application in finance and economics C.-F. Lee and J. Shi () Business School, Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]; [email protected]
research. There are various approaches to solving an ordinary differential equation. Among them, the most commonly used approaches are the classical solution technique for a linear ODE and the Laplace transform approach. The Laplace transform approach is powerful in solving integro-differential equations. It was originally applied in engineering. As a powerful tool, it is also extensively employed in finance and economics [see Churchill (1972) and Schiff (1999) for details and references therein]. We demonstrate the applications of ODE in both deterministic and stochastic systems. As an application in deterministic systems, we apply ODE to solve a typical gross domestic product (GDP) model and an investment model of a firm. Gould (1968) studies the adjustment costs in the theory of investment of a firm by considering a rationally managed firm in a competitive industry, acting to maximize the present value of all future net cash flows. To this end, Gould (1968) derives an Euler–Lagrange equation, which leads to a second-order differential equation. By the classical solution technique and the Laplace transform, we obtain the optimal capital input function. As an application in stochastic systems, we employ ODE in a jump-diffusion model and derive the expected discounted penalty given that the asset value follows a phase-type jump-diffusion process. In particular, we study a capital structure management model and derive the optimal default-triggering level. Chen et al. (2007) derive an integro-differential equation for the expected present value of a firm, and apply an ODE approach to solve the equation. The remainder of this chapter is organized as follows. In Sect. 85.2, we introduce some fundamental concepts and two solution techniques. In particular, the classical solution technique is introduced in Sect. 85.2.1, the Laplace transform is introduced in Sect. 85.2.2, and the Euler–Lagrange equation is introduced in Sect. 85.2.3. The applications of the ordinary differential equation are illustrated in Sects. 85.3 and 85.4. In particular, its application in a deterministic system is demonstrated in Sect. 85.3 by considering a simple GDP model and an investment model for a firm. Its application in a stochastic system is demonstrated in Sect. 85.4 by considering a capital structure management model. Finally, this chapter is concluded in Sect. 85.5.
85.2 Ordinary Differential Equation
In this section, we shall introduce some fundamental concepts of ODE and two basic solution techniques. Let y be an unknown function y: R → R of the variable x, and y^{(i)} the i-th order derivative of the function y; then a function

F(x, y, y', ..., y^{(n-1)}) = y^{(n)}    (85.1)

is called an ordinary differential equation (ODE) of order n. When a differential equation has the form

F(x, y, y', ..., y^{(n-1)}, y^{(n)}) = 0,    (85.2)

it is called an implicit ODE, whereas the form given in Equation (85.1) is an explicit ODE. An ODE is said to be linear if the function F can be written as a linear combination of the derivatives of y; that is,

\sum_{i=0}^{n} a_i(x) y^{(i)} = r(x).    (85.3)

In Equation (85.3), if r(x) = 0, the linear ODE is homogeneous; otherwise it is inhomogeneous. There are various established methods to solve different types of differential equations. The basic and widely used ones are the classical techniques and the Laplace transform approach.

85.2.1 Classical Techniques

The general form of a linear ordinary differential equation with constant coefficients is given by

d^n y/dx^n + k_n d^{n-1}y/dx^{n-1} + ... + k_3 d^2 y/dx^2 + k_2 dy/dx + k_1 y = r(x).    (85.4)

The general solution contains two parts:

y = y_H + y_P,    (85.5)

where y_H is the homogeneous part of the solution and y_P is the particular part of the solution, which satisfies Equation (85.4). The homogeneous part of the solution y_H is the part of the solution that gives zero when substituted in the left-hand side of the equation. Hence, y_H is a solution to

d^n y/dx^n + k_n d^{n-1}y/dx^{n-1} + ... + k_3 d^2 y/dx^2 + k_2 dy/dx + k_1 y = 0.    (85.6)

The above equation can be symbolically written as

D^n y + k_n D^{n-1} y + ... + k_2 D y + k_1 y = 0,    (85.7)

or equivalently,

(D^n + k_n D^{n-1} + ... + k_2 D + k_1) y = 0,    (85.8)

where the differentiating operator D^j = d^j/dx^j, j = 1, 2, ..., n. For some {r_i; i = 1, 2, ..., n}, if

D^n + k_n D^{n-1} + ... + k_2 D + k_1 = (D - r_1)(D - r_2) ... (D - r_n),    (85.9)

then the ODE given by Equation (85.7) can be rewritten as

(D - r_n)(D - r_{n-1}) ... (D - r_1) y = 0.    (85.10)

Given the roots {r_i; i = 1, 2, ..., n} of Equation (85.10), consider the following two cases.

Case 1: Roots are distinct

Since each of the n factors can be placed before y, there are n different solutions (real or complex) corresponding to the n different factors, given by C_1 e^{r_1 x}, C_2 e^{r_2 x}, ..., C_n e^{r_n x}, where C_1, C_2, ..., C_n are constants. Then the homogeneous part of the solution is

y_H = C_1 e^{r_1 x} + C_2 e^{r_2 x} + ... + C_{n-1} e^{r_{n-1} x} + C_n e^{r_n x}.    (85.11)

Case 2: Some roots are identical

If there are m identical roots, denoted by r_m, to the homogeneous equation, then the solution has the following form

y_H = (C_1 + C_2 x + C_3 x^2 + ... + C_m x^{m-1}) e^{r_m x} + C_{m+1} e^{r_{m+1} x} + ... + C_n e^{r_n x}.    (85.12)
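As a quick numerical companion to Equations (85.9)–(85.11), the following Python sketch (a hypothetical example, not taken from the chapter) factors the characteristic polynomial of a third-order homogeneous ODE with numpy and checks one of the resulting exponential solutions.

import numpy as np

# Homogeneous ODE: y''' - 6 y'' + 11 y' - 6 y = 0
# Characteristic polynomial: r^3 - 6 r^2 + 11 r - 6 = (r - 1)(r - 2)(r - 3)
roots = np.roots([1, -6, 11, -6])        # coefficients, highest power first
print("characteristic roots:", np.sort(roots.real))   # -> 1, 2, 3

# Distinct roots (Case 1): y_H = C1 e^x + C2 e^{2x} + C3 e^{3x}.
# Numerical check that y(x) = e^{2x} solves the ODE at an arbitrary point:
x = 0.7
y = np.exp(2 * x)
residual = 8 * y - 6 * (4 * y) + 11 * (2 * y) - 6 * y   # y''' - 6y'' + 11y' - 6y
print("residual:", residual)             # ~0 up to rounding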
85.2.2 Laplace Transformation

If f(x) is a function of the non-negative real number x, the Laplace transform, denoted by \tilde{f}(s), is given by

\tilde{f}(s) = L{f(x)} = \int_0^\infty e^{-sx} f(x) dx,    (85.13)
where s is a parameter, which can be a real or complex number. We can get back f(x) by taking the inverse Laplace transform of \tilde{f}(s):

f(x) = L^{-1}{\tilde{f}(s)}.

The Laplace transform is very powerful in solving integro-differential equations. It gives the solution directly and avoids evaluating arbitrary constants separately. As typical examples, the following are Laplace transforms of some elementary functions [see details in Churchill (1972) and Schiff (1999)]:

1. L(1) = 1/s,
2. L(x^n) = n!/s^{n+1}, where n = 0, 1, 2, 3, ...,
3. L(e^{ax}) = 1/(s - a).

The following are the inverse Laplace transforms of some elementary functions:

1. L^{-1}(1/s) = 1,
2. L^{-1}(1/(s - a)) = e^{ax},
3. L^{-1}(1/s^n) = x^{n-1}/(n - 1)!, where n = 1, 2, 3, ...,
4. L^{-1}(1/(s - a)^n) = e^{ax} x^{n-1}/(n - 1)!.

Some properties of the Laplace transform are listed as follows.

(a) Linear property: If a, b are constants and f(x), g(x) are functions of x, then

L{a f(x) + b g(x)} = a L{f(x)} + b L{g(x)}.    (85.14)

(b) Shifting property: If L{f(x)} = \tilde{f}(s), then

L{e^{ax} f(x)} = \tilde{f}(s - a).    (85.15)

(c) Scaling property: If L{f(x)} = \tilde{f}(s), then

L{f(ax)} = (1/a) \tilde{f}(s/a).    (85.16)

(d) Laplace transforms of derivatives: If the first n derivatives of f(x) are continuous, then

L{f^{(n)}(x)} = \int_0^\infty e^{-sx} f^{(n)}(x) dx = s^n \tilde{f}(s) - f^{(n-1)}(0) - s f^{(n-2)}(0) - s^2 f^{(n-3)}(0) - ... - s^{n-1} f(0).    (85.17)

The steps to solve an ordinary differential equation via the Laplace transform technique are presented as follows:

1. Take the Laplace transform of both sides of the ODE;
2. Express \tilde{f}(s) as a function of s;
3. Take the inverse Laplace transform of both sides to get the solution.
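The three steps above can be carried out symbolically. The sketch below uses the sympy library to solve the simple first-order ODE y' + 2y = 1 with y(0) = 0; the equation is chosen purely for illustration, and the transform of y' follows the derivative rule (85.17) with n = 1.

import sympy as sp

x, s = sp.symbols("x s", positive=True)
Y = sp.symbols("Y")

# Step 1: transform both sides of y' + 2 y = 1 using L{y'} = s Y(s) - y(0):
#         s Y + 2 Y = 1/s      (with y(0) = 0)
# Step 2: express Y as a function of s
Ysol = sp.solve(sp.Eq(s * Y + 2 * Y, 1 / s), Y)[0]      # Y = 1/(s (s + 2))
# Step 3: invert the transform
y = sp.inverse_laplace_transform(Ysol, s, x)
print(sp.simplify(y))   # (1 - exp(-2 x))/2, possibly with a Heaviside(x) factor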
85.2.3 Euler–Lagrange Equation

One of the powerful tools applied in optimization problems in finance and economics research is the Euler–Lagrange equation (Lagrange's equation). The Euler–Lagrange equation was developed by Leonhard Euler and Joseph-Louis Lagrange in the 1750s. It is used to solve for functions that optimize a given cost or profit functional. It is analogous to the result from calculus that a smooth function attains its extreme values at points where its derivative vanishes. The Euler–Lagrange equation is an equation satisfied by a function f(t) that maximizes (or minimizes) the functional

J = \int_a^b F(t, f(t), f'(t)) dt,    (85.18)

where the function F(t, x, y) ∈ C^1(R × X × Y). The Euler–Lagrange equation is the differential equation

F_x(t, f(t), f'(t)) - (d/dt) F_y(t, f(t), f'(t)) = 0,    (85.19)

where F_x and F_y denote the partial derivatives of F(t, x, y) with respect to x and y.

85.3 Applications of ODE in Deterministic System

In this section, we demonstrate the application of ODE in deterministic systems. In this effort, we first study a simple gross domestic product model, and then an investment model for a firm.
85.3.1 Gross Domestic Product

A differential equation expresses the rate of change of the current state as a function of the current state. A simple illustration of this type of dependence is the change of the gross domestic product (GDP) over time. Consider the state x(t) of the GDP of the economy at time t. The rate of change of the GDP is proportional to the current GDP; that is,

\dot{x}(t) = g x(t),    (85.20)
where \dot{x}(t) = (d/dt) x(t) and the growth rate g is a given constant at any time t. Then the GDP at time t can be obtained by solving the above differential equation. Equation (85.20) can be written as

\dot{x}(t)/x(t) = (d/dt) ln x(t) = g.

Therefore, ln x(t) = g t + c, where c is a constant. Equivalently, x(t) = e^c e^{gt}. Furthermore, if we set t = 0, then we have x(0) = e^c. Finally, we have the following solution:

x(t) = x(0) e^{gt}.

The solution tells us that the GDP decreases (increases) exponentially over time when g is negative (positive).
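A short numerical check, under assumed values g = 0.03 and x(0) = 100 (illustrative only): a forward Euler discretization of Equation (85.20) should approach the closed-form solution x(0) e^{gt}.

import numpy as np

g, x0, T, n = 0.03, 100.0, 10.0, 1000    # growth rate, initial GDP, horizon, steps
dt = T / n

x = x0
for _ in range(n):                       # Euler step for x'(t) = g x(t)
    x += g * x * dt

print(f"Euler: {x:.4f}   closed form: {x0 * np.exp(g * T):.4f}")   # both ~134.99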
85.3.2 Investment of the Firm

Consider a rationally managed firm in a competitive industry, acting to maximize the present value of all future net cash flows. Let P(t) denote the price of output as a function of time, w(t) denote the wage rate, L(t) denote the labor input, and C(I) represent the costs associated with investing in capital stock at the gross investment rate I; in particular, C(I) = q_0 I + q_1 I^2. Let further F(K, L) denote the output (production) quantity as a function of the capital and labor inputs. The gross investment rate I has the form I = \dot{K} + D(K), where D(K) is the level of replacement investment needed to maintain a capital stock K. In particular, the depreciation is a constant percentage of the capital stock; namely, D(K) = \delta K. The present value of the firm is given by [see Equation 1 in Gould (1968)]

V = \int_0^\infty e^{-R(t)} [P(t) F(K, L) - w(t) L(t) - C(I)] dt,    (85.21)

where e^{-R(t)}, with R(t) = \int_0^t r(\tau) d\tau, is the discount factor at time zero for cash flows to be received at t, and r(\tau) is the instantaneous interest rate prevailing at time \tau. In a certainty world, the firm chooses the functions K(t) and L(t) to maximize V. To this end, we calculate the Euler–Lagrange equations. The Euler–Lagrange equations satisfied by the functions that maximize V are [see Equations 4 and 5 in Gould (1968)]

e^{-R(t)} [P (\partial F/\partial L) - w] = 0    (85.22)

and

e^{-R(t)} [P (\partial F/\partial K) - (r + \delta) C' + (\ddot{K} + \delta \dot{K}) C''] = 0,    (85.23)

where \dot{K} = (d/dt) K(t) and \ddot{K} = (d/dt) \dot{K}(t). From Equation (85.22) it follows that the value of the marginal product of labor should be equal to the wage rate everywhere on the optimum path. Since F(K, L) is homogeneous of degree one, its first partial derivatives are functions which are homogeneous of degree zero, so that \partial F/\partial L = F_L(K, L) = F_L(K/L). Therefore, Equation (85.22) provides the capital-labor ratio at any time point:

K/L = F_L^{-1}(w/P).    (85.24)

The partial derivative F_K is also homogeneous of degree zero and hence may similarly be treated as a function of the capital-labor ratio. Thus, by Equation (85.24), we have

P F_K(K, L) = P F_K[F_L^{-1}(w/P)].    (85.25)

Substituting Equation (85.25) into Equation (85.23) and simplifying, we have

\ddot{K} - r \dot{K} - (r + \delta) \delta K + [P G(w/P) - q_0 (r + \delta)]/(2 q_1) = 0,    (85.26)

where G(w/P) = F_K[F_L^{-1}(w/P)]. Since I(t) = \dot{K} + \delta K, Equation (85.26) can be written as an ordinary differential equation as follows:

\dot{I} - [r(t) + \delta] I = f(t),    (85.27)

where f(t) = -[P G(t) - q_0 (r(t) + \delta)]/(2 q_1). To solve the ordinary differential Equation (85.27), we provide the following lemma.
Lemma 1. The ordinary differential equation

\dot{P}(t) + g(t) P(t) = q(t)    (85.28)

has the general solution of the form

P(t) = [C + \int_{t_1}^{t} q(x) exp(\int_{t_0}^{x} g(s) ds) dx] exp(-\int_{t_0}^{t} g(s) ds),    (85.29)

where C, t_0 and t_1 are constants.

Proof. The homogeneous differential equation \dot{P}(t) + g(t) P(t) = 0, or \dot{P}(t)/P(t) = -g(t), has a solution of the form P(t) = c exp(-\int_{t_0}^{t} g(s) ds).
Now substituting the constant c by a function c(t), we have

P(t) = c(t) exp(-\int_{t_0}^{t} g(s) ds).    (85.30)

Then taking the derivative with respect to t, we obtain

\dot{P}(t) = \dot{c}(t) exp(-\int_{t_0}^{t} g(s) ds) - c(t) exp(-\int_{t_0}^{t} g(s) ds) g(t) = \dot{c}(t) exp(-\int_{t_0}^{t} g(s) ds) - P(t) g(t).

Therefore,

\dot{P}(t) + P(t) g(t) = \dot{c}(t) exp(-\int_{t_0}^{t} g(s) ds).    (85.31)

By Equations (85.28) and (85.31), we have q(t) = \dot{c}(t) exp(-\int_{t_0}^{t} g(s) ds), or equivalently

\dot{c}(t) = q(t) exp(\int_{t_0}^{t} g(s) ds).

Thus,

c(t) = C + \int_{t_1}^{t} q(x) exp(\int_{t_0}^{x} g(s) ds) dx.    (85.32)

The proof is completed by substituting Equation (85.32) into Equation (85.30). □

As an application of Lemma 1, Gupta et al. (2007) apply the result to solve the optimal dividend policy. By Lemma 1, the general solution to Equation (85.27) is given as

I(t) = I_0 exp(\int_{t_0}^{t} [r(\tau) + \delta] d\tau) + \int_{t_0}^{t} f(s) exp(\int_{s}^{t} [r(\tau) + \delta] d\tau) ds,    (85.33)

where the solution passes through (t_0, I_0). Since we assume an infinite time horizon, the specific solution which maximizes V is

I(t) = -\int_{t}^{\infty} f(s) exp(-\int_{t}^{s} [r(\tau) + \delta] d\tau) ds.    (85.34)

Recall that I(t) = \dot{K} + \delta K. Taking the Laplace transformation of both sides, we have

\tilde{I}(s) = s \tilde{K}(s) - K(0) + \delta \tilde{K}(s).    (85.35)

Rearranging Equation (85.35), we further have

\tilde{K}(s) = [\tilde{I}(s) + K(0)]/(s + \delta).    (85.36)

By taking the inverse Laplace transformation, we finally have the capital input function

K(t) = I(t) * e^{-\delta t} + K(0) e^{-\delta t},    (85.37)

where I(t) * e^{-\delta t} is the convolution product of I(t) and e^{-\delta t}.
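Equation (85.37) can be verified numerically. The sketch below assumes an arbitrary investment path I(t) (the sine-shaped path and all parameter values are purely illustrative), integrates K'(t) = I(t) - \delta K(t) directly, and compares the result with the convolution formula.

import numpy as np

delta, K0 = 0.10, 50.0
t = np.linspace(0.0, 20.0, 4001)
dt = t[1] - t[0]
I = 5.0 + np.sin(0.5 * t)                # assumed investment path

# Direct Euler integration of K'(t) = I(t) - delta K(t)
K_euler = np.empty_like(t)
K_euler[0] = K0
for k in range(len(t) - 1):
    K_euler[k + 1] = K_euler[k] + dt * (I[k] - delta * K_euler[k])

# Equation (85.37): K(t) = (I * e^{-delta t})(t) + K(0) e^{-delta t}
kernel = np.exp(-delta * t)
K_conv = np.convolve(I, kernel)[: len(t)] * dt + K0 * kernel

print("max |difference|:", np.max(np.abs(K_euler - K_conv)))   # small (O(dt))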
85.4 Applications of ODE in Stochastic System

In this section, we shall investigate an application of ODE in a stochastic system. To begin with, we will present some preliminary results obtained in Chen et al. (2007). Then we shall apply the ODE approach in a capital structure management model.

85.4.1 Preliminary

In a given probability space (\Omega, F, P), there are a standard Brownian motion W = {W_t; t \ge 0} and a compound Poisson process Z = {Z_t = \sum_{n=1}^{N(t)} Y_n; t \ge 0}. Here the Poisson process N = {N_t; t \ge 0} has parameter \lambda > 0 and the random variables {Y_n; n \in N} are independent and identically distributed with bounded phase-type density function

f(y) = p f_{(+)}(y) for y > 0, and f(y) = (1 - p) f_{(-)}(y) for y < 0,    (85.38)

where 0 \le p \le 1. We consider the process X = {X_t; t \ge 0} defined by

X_t = x + c t + \sigma W_t - Z_t,    (85.39)

where x, c \in R and \sigma > 0. Define \psi(\cdot) as the characteristic exponent of {X_t; t \ge 0}, such that

E_x[e^{\theta X_1}] = E[e^{\theta X_1} | X_0 = x] = e^{\psi(\theta)} e^{\theta x},    (85.40)

where

\psi(\theta) = D \theta^2 + c \theta + \lambda \int e^{-\theta y} f(y) dy - \lambda    (85.41)
and

D = \sigma^2 / 2.    (85.42)

Accordingly, the infinitesimal generator L of X has a domain containing C_0^2(R), and for any h \in C_0^2(R), we have

L h(x) = D h''(x) + c h'(x) + \lambda \int h(x - y) f(y) dy - \lambda h(x).    (85.43)

The first zero-hitting time is a stopping time defined as

\tau = inf{t > 0; X_t \le 0}.

Let the penalty for hitting zero be denoted by g(X_\tau), a function of X_\tau. Then the expected discounted penalty is

\Phi(x) = E_x[e^{-r\tau} g(X_\tau)].    (85.44)

For \Phi(x), we have the following theorem [see Theorem 2.4 in Chen et al. (2007)].

Theorem 1. The expected discounted penalty \Phi(x) satisfies the following integro-differential equation

(L - r) \Phi(x) = 0.    (85.45)

If r \ge 0 and E_x[X_1] > 0, we have \Phi(x) \in C_0^2(R). To obtain the expected discounted penalty \Phi(x), we employ both the Laplace transform approach and an ODE approach.

85.4.1.1 Laplace Transform Approach

By taking the Laplace transform of both sides of Equation (85.45), we have

D [s^2 \tilde{\Phi}(s) - s \Phi(0) - \Phi'(0)] + c [s \tilde{\Phi}(s) - \Phi(0)] + \lambda \tilde{\Phi}(s) \tilde{f}(s) - (\lambda + r) \tilde{\Phi}(s) = 0.    (85.46)

Therefore,

\tilde{\Phi}(s) = [D s \Phi(0) + D \Phi'(0) + c \Phi(0)] / [D s^2 + c s - \lambda - r + \lambda \tilde{f}(s)].    (85.47)

Then we have the following theorem.

Theorem 2. The expected discounted penalty function given by Equation (85.44) has the following form

\Phi(x) = L^{-1}{ [D s \Phi(0) + D \Phi'(0) + c \Phi(0)] / [D s^2 + c s - \lambda - r + \lambda \tilde{f}(s)] }.    (85.48)

Example 1. (Exponentially distributed {Y_n}) If {Y_n} are iid and exponentially distributed, that is, f(x) = \eta e^{-\eta x} for x \ge 0, where \eta > 0, then we have

\tilde{f}(s) = L{f(x)} = \int_0^\infty e^{-sx} \eta e^{-\eta x} dx = \eta/(\eta + s).    (85.49)

Hence,

\Phi(x) = L^{-1}{ [D s \Phi(0) + D \Phi'(0) + c \Phi(0)] / [D s^2 + c s - \lambda - r + \lambda \eta/(s + \eta)] }
        = L^{-1}{ (s + \eta)[D s \Phi(0) + D \Phi'(0) + c \Phi(0)] / [D s^3 + (c + D\eta) s^2 + (c\eta - \lambda - r) s - r\eta] },

where the \Phi^{(n)}(0) have been obtained recursively as

\Phi^{(n+2)}(0) = (1/D)[-c \Phi^{(n+1)}(0) + (\lambda + r) \Phi^{(n)}(0) - \lambda E_n(0)],    (85.50)

with \Phi(0) = g(0), where E_n(0) is given by Theorem 3.7 in Chen et al. (2007).

85.4.1.2 Alternative ODE Approach

Instead of the Laplace transform approach, in the following we apply an ODE approach to derive the expected discounted penalty. Following Theorem 1, we have that the expected discounted penalty \Phi(x) satisfies an ODE, which is given by the following theorem [see Chen et al. (2007) for details].

Theorem 3. The expected discounted penalty \Phi(x) satisfies the following ordinary differential equation

D_1 \Phi = 0,    (85.51)

where D_1 is the differential operator with the characteristic polynomial P_1 given by

P_1(\theta) = P_0(\theta) (\psi(\theta) - r),    (85.52)

and P_0(\theta) is the "minimal" polynomial with leading coefficient 1 such that the zeros of P_0(\theta) coincide with the poles of \tilde{f}(\theta), and \psi(\theta) = D \theta^2 + c \theta + \lambda \tilde{f}(\theta) - \lambda.
Example 2. (Exponential distribution, continued) If {Y_n} are iid and exponentially distributed with parameter \eta, then by Equation (85.49), we have

\psi(\theta) = D \theta^2 + c \theta + \lambda \eta/(\eta + \theta) - \lambda,    (85.53)

P_0(\theta) = \theta + \eta,

and

P_1(\theta) = P_0(\theta)(\psi(\theta) - r)
            = (\theta + \eta)[D \theta^2 + c \theta + \lambda \eta/(\eta + \theta) - \lambda - r]
            = D \theta^2 (\theta + \eta) + c \theta (\theta + \eta) + \lambda \eta - (\lambda + r)(\theta + \eta)
            = D \theta^3 + (D\eta + c) \theta^2 + (c\eta - \lambda - r) \theta - r\eta.

Therefore, we have the ordinary differential equation

D_1 \Phi = \Phi''' + (\eta + c/D) \Phi'' + ((c\eta - \lambda - r)/D) \Phi' - (r\eta/D) \Phi = 0.    (85.54)

85.4.2 Application in Capital Structure Management

Consider that under a risk-neutral measure Q, the value of the firm's assets is given by [see Example 3.9 in Chen et al. (2007)]

V_t = V e^{c t + \sigma W_t - Z_t}.    (85.55)

The setting of debt issuance and coupon payment follows the convention in Leland and Toft (1996). In the time interval [t, t + dt], the firm issues new debt with face value a dt and maturity profile k(t) = m e^{-mt}. With the exponential maturity profile, the face value of the debt maturing in [t, t + dt] is the same as the face value of the newly issued debt. Thus the face value of all pending debt is equal to a constant A = a \int_0^\infty e^{-mt} dt = a/m. We further assume that default occurs at time \tau = inf{t > 0; V_t \le L}, where the default-triggering level L will be determined optimally shortly. All bondholders receive coupons at rate b until default occurs. All debt is of equal seniority, and once default occurs, the bondholders get the remaining value \beta V_\tau after the bankruptcy cost (1 - \beta) V_\tau. We assume that there is a corporate tax rate \delta, and the coupons paid can be offset against tax. We assume that the market has a risk-free rate r > 0, and \zeta > 0 is the proportional rate at which profit is disbursed to investors. Then the present value of all debt is

D(V, L) = [(mA + bA)/(m + r)] E[1 - e^{-(m+r)\tau}] + \beta E[V_\tau e^{-(m+r)\tau}],    (85.56)

and the present value of the firm is

v(V, L) = V + (bA\delta/r) E[1 - e^{-r\tau}] - (1 - \beta) E[V_\tau e^{-r\tau}].    (85.57)

Thereby, the value of the equity of the firm is given by

S(V, L) = v(V, L) - D(V, L).    (85.58)

By imposing the smooth-pasting condition, we calculate the optimal default-triggering level L such that the value of the equity of the firm is maximized. For ease of exposition, let X_t = ln(V_t/L) and x = ln(V/L). Then the asset value given by Equation (85.55) can be rewritten as Equation (85.39). Note that \tau = inf{t > 0; X_t \le 0}. Under the risk-neutral measure Q, we have

D(V, L) = [A(m + b)/(m + r)] [1 - G_{m+r}(x)] + \beta L H_{m+r}(x)    (85.59)

and

v(V, L) = V + (bA\delta/r) [1 - G_r(x)] - (1 - \beta) L H_r(x),    (85.60)

where

G_s(x) = E_x[e^{-s\tau}]    (85.61)

and

H_s(x) = E_x[e^{X_\tau} e^{-s\tau}].    (85.62)

Thereby we have

\partial S(V, L)/\partial V = \partial [v(V, L) - D(V, L)]/\partial V
= 1 - (\delta A b/r) G_r'(x) (1/V) - (1 - \beta) L H_r'(x) (1/V) + [A(m + b)/(m + r)] G_{m+r}'(x) (1/V) - \beta L H_{m+r}'(x) (1/V).

By imposing the smooth-pasting condition \partial S(V, L)/\partial V |_{V=L} = 0, we obtain the optimal default-triggering level [see Equation (6.23) in Chen et al. (2007)]

L^* = [(\delta A b/r) G_r'(0) - [A(m + b)/(m + r)] G_{m+r}'(0)] / [1 - (1 - \beta) H_r'(0) - \beta H_{m+r}'(0)].    (85.63)

If there are only downward jumps, we get

H_s'(0) = (d/dx) H_s(x)|_{x=0} = 1 + (s - \psi(1))/(D(1 - \rho_s)),    (85.64)
where \rho_s is a positive real number such that \psi(-\rho_s) = s. Similarly, we have

G_s'(0) = (d/dx) G_s(x)|_{x=0} = -s/(D \rho_s).    (85.65)

Substituting Equations (85.64) and (85.65) into Equation (85.63), we have the optimal default-triggering level [see Equation (3.26) in Chen et al. (2007)]

L^* = [A(m + b)/\rho_{m+r} - \delta A b/\rho_r] / [\beta (m + r - \psi(1))/(\rho_{m+r} - 1) + (1 - \beta)(r - \psi(1))/(\rho_r - 1)].    (85.66)

Example 3. (Exponential distribution, continued) By Equation (85.53) and the fact that \psi(-\rho_s) = s, we have

\rho^3 - (\eta + c/D) \rho^2 + ((c\eta - \lambda - s)/D) \rho + s\eta/D = 0.    (85.67)

To solve the above cubic equation, the well-known formula is provided in the following lemma [see Korn and Korn (2000) for details].

Lemma 2. The cubic equation

z^3 + a_2 z^2 + a_1 z + a_0 = 0    (85.68)

has three roots given as follows:

z_1 = -a_2/3 + (S + T),
z_2 = -a_2/3 - (S + T)/2 + (i\sqrt{3}/2)(S - T),
z_3 = -a_2/3 - (S + T)/2 - (i\sqrt{3}/2)(S - T),

where S = (R + \sqrt{U})^{1/3}, T = (R - \sqrt{U})^{1/3}, U = Q^3 + R^2, Q = (3 a_1 - a_2^2)/9, and R = (9 a_1 a_2 - 27 a_0 - 2 a_2^3)/54.

By Lemma 2, we have the positive real roots \rho_r and \rho_{m+r} of Equation (85.67) (setting s = r and s = m + r, respectively). Then, plugging them into Equation (85.66), we finally obtain the optimal default-triggering level L^*.
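In practice one need not apply Cardano's formula by hand: the sketch below finds the roots of Equation (85.67) with numpy.roots and plugs the positive real roots \rho_r and \rho_{m+r} into Equation (85.66). All parameter values are illustrative assumptions, and since the cubic can have more than one positive real root, the code simply takes the largest one; which root is appropriate should be checked against Chen et al. (2007).

import numpy as np

# Illustrative parameters (not taken from the chapter)
sigma, c, lam, eta = 0.25, 0.05, 1.0, 3.0
r, m, beta, delta = 0.05, 0.5, 0.6, 0.35
A, b = 100.0, 0.07
D = sigma ** 2 / 2

def rho(s):
    # rho^3 - (eta + c/D) rho^2 + ((c eta - lam - s)/D) rho + s eta/D = 0, cf. (85.67)
    roots = np.roots([1.0, -(eta + c / D), (c * eta - lam - s) / D, s * eta / D])
    real = roots[np.isclose(roots.imag, 0.0)].real
    return real[real > 0].max()          # largest positive real root (see note above)

psi1 = D + c + lam * eta / (eta + 1) - lam     # psi(1) from Equation (85.53)
rho_r, rho_mr = rho(r), rho(m + r)

num = A * (m + b) / rho_mr - delta * A * b / rho_r
den = beta * (m + r - psi1) / (rho_mr - 1) + (1 - beta) * (r - psi1) / (rho_r - 1)
print("optimal default-triggering level L* =", num / den)      # cf. (85.66)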
85.5 Conclusion

The ordinary differential equation is a powerful tool and has been widely used in finance and economics. We first introduce the basic concepts of ODE and the related approaches (the classical approach and the Laplace transform) that are commonly used to solve ordinary differential equations. The applications of ordinary differential equations are demonstrated in both deterministic and stochastic systems. As an application in a deterministic system, we consider a simple GDP model and an investment model for a firm. As an application in a stochastic system, we consider a capital structure management model to calculate the optimal default-triggering level such that the value of the equity of the firm is maximized.
References
Chen, Y. T., C. F. Lee, and Y. C. Sheu. 2007. "An ODE approach for the expected discounted penalty at ruin in a jump-diffusion model." Finance and Stochastics 11, 323–355.
Churchill, R. V. 1972. Operational mathematics, McGraw-Hill, New York.
Gould, J. P. 1968. "Adjustment costs in the theory of investment of the firm." The Review of Economic Studies 1, 47–55.
Gupta, M. C., A. C. Lee, and C. F. Lee. 2007. Effectiveness of dividend policy under the capital asset pricing model: a dynamic analysis. Working paper, Fox School of Business.
Korn, G. A. and T. M. Korn. 2000. Mathematical handbook for scientists and engineers: definitions, theorems, and formulas for reference and review, Courier Dover, New York.
Leland, H. E. and K. Toft. 1996. "Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads." The Journal of Finance 51, 987–1019.
Schiff, J. L. 1999. The Laplace transform: theory and applications, Springer, New York.
Chapter 86
Application of Simultaneous Equation in Finance Research Carl R. Chen and Cheng-Few Lee
Abstract This chapter introduces the concept of the simultaneous equation system and the application of such a system to finance research. We first discuss the order and rank conditions of the identification problem. We then introduce the two-stage least squares (2SLS) and three-stage least squares (3SLS) estimation methods. The pros and cons of these estimation methods are summarized. Finally, the results of a study of executive compensation structure and risk-taking are used to illustrate the difference between the single equation and simultaneous equation methods.

Keywords Simultaneous equation · Two-stage least squares · Three-stage least squares · Model identification
86.1 Introduction

Rarely does a single equation arise in economic theory. In a multi-equation system, OLS fails to yield unbiased and consistent estimators for the structural equations. Therefore, appropriate estimation methods must be applied to the estimation of structural equation parameters. This paper first discusses situations where a simultaneous equation system may arise. We then explain why OLS estimation is not appropriate. Section 86.2 introduces the two most frequently used methods to estimate structural parameters in a system of equations. Before the 2SLS and 3SLS methods are synthesized, we explain the order condition and the rank condition of model identification. 2SLS and 3SLS are then introduced, and the differences between these two methods are discussed. Section 86.3 gives examples from the literature where applications of the simultaneous equation models in finance are shown. In the finance application, Chen et al. (2006) employ a two-equation model to examine the relationship between executives' incentive compensation and firm risk-taking. Because both executive compensation and firm risk are endogenous, 2SLS is more appropriate than the OLS method.

Empirical finance research often employs a single equation for estimation and testing. However, a single equation rarely arises in economic or financial theory. Using the OLS method to estimate equation(s) that should otherwise be treated as a simultaneous equation system is likely to produce biased and inconsistent parameter estimators. To illustrate, let us start with a simple Keynesian consumption function specified as follows:

C.R. Chen (✉) University of Dayton, Dayton, OH, USA e-mail: [email protected]
C_t = \alpha + \beta Y_t + \varepsilon_t    (86.1)

Y_t = C_t + I_t    (86.2)

I_t = I_0    (86.3)
where C_t is the consumption expenditure at time t, Y_t is the national income at time t, I_t is the investment expenditure at time t, which is assumed fixed at I_0, and \varepsilon_t is the stochastic disturbance term at time t. Equation (86.1) is the consumption function; Equation (86.2) is the equilibrium condition (national income accounting identity); and Equation (86.3) is the investment function. Some of the variables in the model are endogenous, others are exogenous. For example, C_t and Y_t are endogenous, meaning they are determined within the model. On the other hand, I_t is an exogenous variable, which is not determined in the model, hence not correlated with \varepsilon_t. A simultaneous equation bias arises when OLS is applied to estimate the consumption function, because Y_t is correlated with the disturbance term \varepsilon_t, which violates the OLS assumption that independent variables are orthogonal to the disturbance term. To see this, through a series of substitutions, we can obtain the reduced form equations from the structural Equations (86.1)–(86.3) as:
C.-F. Lee Business School, Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]
C_t = \alpha/(1 - \beta) + \beta I_0/(1 - \beta) + \varepsilon_t/(1 - \beta)    (86.4)

Y_t = \alpha/(1 - \beta) + I_0/(1 - \beta) + \varepsilon_t/(1 - \beta)    (86.5)
Based on Equation (86.5), Y_t is clearly correlated with the disturbance term \varepsilon_t, which violates the OLS assumption that independent variables and the disturbance term are uncorrelated. This bias is commonly referred to in the literature as the simultaneous equation bias. Furthermore, the OLS estimate of \beta in Equation (86.1) is not only biased, but also inconsistent, meaning the \beta estimate does not converge to the true \beta as the sample size becomes very large.
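The simultaneous equation bias is easy to see in a small simulation. The sketch below (all parameter values are arbitrary assumptions) generates data from the reduced form Equations (86.4)–(86.5) and shows that OLS applied to the consumption function (86.1) overestimates \beta.

import numpy as np

rng = np.random.default_rng(42)
alpha, beta_true, n = 20.0, 0.6, 100_000

I = rng.uniform(50.0, 100.0, n)          # exogenous investment I_t
eps = rng.normal(0.0, 5.0, n)            # disturbance eps_t

Y = (alpha + I + eps) / (1.0 - beta_true)   # reduced form: Y_t loads on eps_t
C = alpha + beta_true * Y + eps

# OLS of C on Y is biased upward because cov(Y, eps) > 0
X = np.column_stack([np.ones(n), Y])
beta_ols = np.linalg.lstsq(X, C, rcond=None)[0][1]
print(f"true beta = {beta_true}, OLS estimate = {beta_ols:.3f}")   # noticeably > 0.6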
86.2 Two-Stage and Three-Stage Least Squares Method

To resolve the simultaneous equation bias problem illustrated in Sect. 86.1, in this section we discuss two popular simultaneous equation estimation methods. Different methods are available to handle the estimation of a simultaneous equation model: indirect least squares, the instrumental variable procedure, two-stage least squares, three-stage least squares, the limited information maximum likelihood method, and the full information maximum likelihood method, just to name a few. In this section, we will focus on two methods for which popular statistical and/or econometric software is readily available.
86.2.1 Identification Problem

Before getting into the estimation methods, it is necessary to discuss the "identification problem." An identification problem arises when we cannot identify the difference between, say, two functions. Consider the demand and supply model of gold. The structural equations can be written as:

Q_d = \alpha_0 + \alpha_1 P + \varepsilon    (86.6)

Q_s = \beta_0 + \beta_1 P + e    (86.7)

Q_d = Q_s = Q    (86.8)

Equation (86.6) is the demand function for gold, where the demand Q_d is determined by the price of gold, P; Equation (86.7) is the supply of gold, and it is a function of the gold price; Equation (86.8) is an identity stating the market equilibrium. Can we apply the OLS method to Equations (86.6) and (86.7) to obtain parameter estimates? To answer this question, we first obtain the "reduced form" equations for P and Q through substitutions:

P = (\beta_0 - \alpha_0)/(\alpha_1 - \beta_1) - (\varepsilon - e)/(\alpha_1 - \beta_1)    (86.9)

Q = (\alpha_1 \beta_0 - \alpha_0 \beta_1)/(\alpha_1 - \beta_1) + (\alpha_1 e - \beta_1 \varepsilon)/(\alpha_1 - \beta_1)    (86.10)
Obviously it is impossible to recover the structural parameters from OLS estimates of Equations (86.9) and (86.10), because there are four parameters (\alpha_0, \alpha_1, \beta_0, and \beta_1) to be estimated, but there are only two equations. Therefore, we cannot estimate the parameters in the structural equations. This is the situation called "under-identification." To differentiate the demand equation from the supply equation, we now assume that the demand curve for gold may shift due to changes in economic uncertainty, which can be proxied by, say, stock market volatility, V, which is assumed to be exogenous. Hence Equations (86.6) and (86.7) can be modified as:

Q_d = \alpha_0 + \alpha_1 P + \alpha_2 V + \varepsilon    (86.11)

Q_s = \beta_0 + \beta_1 P + e    (86.12)

The reduced form becomes

P = (\beta_0 - \alpha_0)/(\alpha_1 - \beta_1) - [\alpha_2/(\alpha_1 - \beta_1)] V - (\varepsilon - e)/(\alpha_1 - \beta_1) = \pi_0 + \pi_1 V + v_1    (86.13)

Q = (\beta_0 + \beta_1 \pi_0) + \beta_1 \pi_1 V + (\alpha_1 e - \beta_1 \varepsilon)/(\alpha_1 - \beta_1) = \lambda_0 + \lambda_1 V + v_2    (86.14)

Because V is assumed exogenous and uncorrelated with the residuals v_1 and v_2, OLS can be applied to the reduced form Equations (86.13) and (86.14) to obtain estimators of \pi_0, \pi_1, \lambda_0, and \lambda_1. In examining Equation (86.14), we find that \lambda_0 = \beta_0 + \beta_1 \pi_0 and \lambda_1 = \beta_1 \pi_1. Since \pi_0, \pi_1, \lambda_0, and \lambda_1 are all obtained from the OLS estimates, \beta_0 and \beta_1 can be solved. Therefore, the supply function [Equation (86.12)] is said to be identified. However, from the demand function, we find that

\pi_0 = (\beta_0 - \alpha_0)/(\alpha_1 - \beta_1), and \pi_1 = -\alpha_2/(\alpha_1 - \beta_1).

Since there are only two equations, we cannot possibly estimate three unknowns, \alpha_0, \alpha_1, and \alpha_2; hence, the demand function is not identified. Based on the discussions of Equations (86.11) and (86.12), we thus know that in a two-equation model, if one variable is omitted from one equation, then this equation is identified. On the other hand, there is no omitted variable in Equation (86.11); hence the demand function is not identified. Now let us further modify Equations (86.11) and (86.12) as follows:

Q_d = \alpha_0 + \alpha_1 P + \alpha_2 V + \varepsilon    (86.15)

Q_s = \beta_0 + \beta_1 P + \beta_2 D + e    (86.16)
All variables are defined as before except that now we have added a new variable D in the supply equation. Let D be the government deficit of Russia, which is assumed exogenous to the system. When Russia's budget deficit deteriorates, the government increases gold production for cash. The reduced form of P and Q based on Equations (86.15) and (86.16) is

P = (\beta_0 - \alpha_0)/(\alpha_1 - \beta_1) - [\alpha_2/(\alpha_1 - \beta_1)] V + [\beta_2/(\alpha_1 - \beta_1)] D - (\varepsilon - e)/(\alpha_1 - \beta_1) = \pi_0 + \pi_1 V + \pi_2 D + v_1    (86.17)

Q = (\alpha_0 + \alpha_1 \pi_0) + (\alpha_1 \pi_1 + \alpha_2) V + \alpha_1 \pi_2 D + (\alpha_1 e - \beta_1 \varepsilon)/(\alpha_1 - \beta_1) = \lambda_0 + \lambda_1 V + \lambda_2 D + v_2    (86.18)
Based on the OLS estimates of Equations (86.17) and (86.18), we can obtain unique estimates for the structural parameters \alpha_0, \alpha_1, \alpha_2, \beta_0, \beta_1, and \beta_2; hence both the demand and supply functions are identified. In this case, we call the situation "exactly identified." In a scenario where there are multiple solutions for the structural parameters, the equation is said to be "over-identified." For example, in Equations (86.19) and (86.20), we modify the supply equation by adding another exogenous variable, q, representing the lagged quantity of gold produced (i.e., the supply of gold in the last period), which is predetermined:

Q_d = \alpha_0 + \alpha_1 P + \alpha_2 V + \varepsilon    (86.19)

Q_s = \beta_0 + \beta_1 P + \beta_2 D + \beta_3 q + e    (86.20)

The reduced form becomes

P = (\beta_0 - \alpha_0)/(\alpha_1 - \beta_1) - [\alpha_2/(\alpha_1 - \beta_1)] V + [\beta_2/(\alpha_1 - \beta_1)] D + [\beta_3/(\alpha_1 - \beta_1)] q - (\varepsilon - e)/(\alpha_1 - \beta_1) = \pi_0 + \pi_1 V + \pi_2 D + \pi_3 q + v_1    (86.21)

Q = (\alpha_0 + \alpha_1 \pi_0) + (\alpha_1 \pi_1 + \alpha_2) V + \alpha_1 \pi_2 D + \alpha_1 \pi_3 q + (\alpha_1 e - \beta_1 \varepsilon)/(\alpha_1 - \beta_1) = \lambda_0 + \lambda_1 V + \lambda_2 D + \lambda_3 q + v_2    (86.22)

Based on Equations (86.21) and (86.22), we find \alpha_1 \pi_2 = \lambda_2; hence the structural equation parameter \alpha_1 can be estimated as \lambda_2/\pi_2. However, we also find \alpha_1 \pi_3 = \lambda_3, hence \alpha_1 can also take another value, \lambda_3/\pi_3. Therefore, \alpha_1 does not have a unique solution, and we say the model is "over-identified."

The condition we employ in the above discussions for model identification is the so-called "order condition of identification." To summarize the order condition of model identification, a general rule is that the number of variables excluded from an equation must be at least the number of structural equations in the system minus one.¹ Although the "order condition" is a popular way of model identification, it provides only a necessary condition for model identification, not a sufficient condition. Alternatively, the "rank condition" provides both necessary and sufficient conditions for model identification. An equation satisfies the rank condition if and only if at least one determinant of rank (M - 1) can be constructed from the column coefficients corresponding to the variables that have been excluded from the equation, where M is the number of equations in the system. However, the "rank condition" is more complicated than the "order condition," and it is difficult to determine in a large simultaneous equation model. The following example based on Equations (86.23)–(86.25) provides some basic ideas about the "rank condition." Note that Equations (86.23)–(86.25) are similar to Equations (86.11), (86.12), and (86.8) with terms rearranged:

Q_d - \alpha_0 - \alpha_1 P - \alpha_2 V = \varepsilon    (86.23)

Q_s - \beta_0 - \beta_1 P = e    (86.24)

Q_d - Q_s = 0    (86.25)

In the following table, all structural parameters are stripped from the equations and placed in a matrix.

Equations | Intercept | Q_d | Q_s | P | V
Equ. (86.23) | -\alpha_0 | 1 | 0 | -\alpha_1 | -\alpha_2
Equ. (86.24) | -\beta_0 | 0 | 1 | -\beta_1 | 0
Equ. (86.25) | 0 | 1 | -1 | 0 | 0

Since the variables Q_d and V are excluded from Equation (86.24), the determinant of the remaining parameters in columns Q_d and V is

| 1  -\alpha_2 |
| 1   0        |  = \alpha_2.

Because this has a rank of 2, which is equal to the number of equations minus 1, Equation (86.24) is identified. On the other hand, a determinant of rank 2 cannot be constructed for Equation (86.23) because it has only one zero coefficient; hence Equation (86.23) is "under-identified."²

¹ The discussions of the order condition draw heavily from Ramanathan (1995).
² For more detailed discussions of the rank condition, see econometric books such as Greene (2003), Judge et al. (1985), Fisher (1966), Blalock (1969), and Fogler and Ganapathy (1982).
86.2.2 Two-Stage Least Squares

The two-stage least squares (2SLS) method is easy to apply and can be applied to a model that is exactly identified or over-identified. To illustrate, let us use Equations (86.15) and (86.16) for demonstration, and rewrite them as follows:

Q_d = \alpha_0 + \alpha_1 P + \alpha_2 V + \varepsilon    (86.26)

Q_s = \beta_0 + \beta_1 P + \beta_2 D + e    (86.27)
Based on the "order condition," Equations (86.26) and (86.27) each have one variable excluded from the other equation, which is equal to the number of equations minus one. Hence the model is identified. Since the endogenous variable P is correlated with the disturbance term, the first stage of the 2SLS calls for the estimation of the "predicted P" (P̂) using a reduced form containing all exogenous variables. To do this, we can apply OLS to the following equation:

P = \pi_0 + \pi_1 V + \pi_2 D + v    (86.28)
OLS will yield unbiased and consistent estimates because both V and D are exogenous, hence not correlated with the disturbance term v. With the parameters in Equation (86.28) estimated, the "predicted P" can be calculated as:

P̂ = \hat{\pi}_0 + \hat{\pi}_1 V + \hat{\pi}_2 D.

This P̂ is the instrumental variable to be used in the second-stage estimation, and it is not correlated with the structural equation disturbance terms. Substituting P̂ into Equations (86.26) and (86.27), we have

Q_d = \alpha_0 + \alpha_1 P̂ + \alpha_2 V + \varepsilon    (86.29)

Q_s = \beta_0 + \beta_1 P̂ + \beta_2 D + e    (86.30)

Since P̂ is not correlated with the disturbance terms, the OLS method can be applied to Equations (86.29) and (86.30).
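The two stages can be written out directly with least squares algebra. The following Python sketch simulates the gold market of Equations (86.26)–(86.27) with assumed parameter values and recovers the demand equation by 2SLS; plain numpy is used here so the two stages stay explicit, although a dedicated econometrics package would do the same job.

import numpy as np

rng = np.random.default_rng(1)
n = 50_000
V, D = rng.normal(size=n), rng.normal(size=n)   # exogenous shifters
eps, e = rng.normal(size=n), rng.normal(size=n) # structural disturbances

a0, a1, a2 = 10.0, -1.0, 2.0     # demand: Q = a0 + a1 P + a2 V + eps
b0, b1, b2 = 2.0, 1.0, 1.5       # supply: Q = b0 + b1 P + b2 D + e

# Market clearing gives the reduced form for P, then Q from the supply equation
P = (a0 - b0 + a2 * V - b2 * D + eps - e) / (b1 - a1)
Q = b0 + b1 * P + b2 * D + e

# Stage 1: regress P on all exogenous variables, cf. Equation (86.28)
Z = np.column_stack([np.ones(n), V, D])
P_hat = Z @ np.linalg.lstsq(Z, P, rcond=None)[0]

# Stage 2: replace P by P_hat in the demand equation, cf. Equation (86.29)
X2 = np.column_stack([np.ones(n), P_hat, V])
coef = np.linalg.lstsq(X2, Q, rcond=None)[0]
print("2SLS demand estimates (a0, a1, a2):", np.round(coef, 3))   # near (10, -1, 2)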
86.2.3 Three-Stage Least Squares

The 2SLS method is a limited information method. On the other hand, the three-stage least squares (3SLS) method is a full information method. A full information method takes into account the information from the complete system; hence it is more efficient than the limited information method. Simply put, the 3SLS method incorporates information obtained from the variance–covariance matrix of the system disturbance terms to estimate the structural equation parameters. On the other hand, the 2SLS method assumes that \varepsilon and e in Equations (86.29) and (86.30) are independent and estimates the structural equation parameters separately; thus it might lose some information when in fact the disturbance terms are not independent. This section briefly explains the 3SLS estimation method. Let the structural equations, in matrix form, be:

Y_i = Z_i \delta_i + \varepsilon_i, where i = 1, 2, ..., m.    (86.31)

In Equation (86.31), Y_i is a vector of n observations on the left-hand-side endogenous variables; Z_i is a matrix consisting of the right-hand-side endogenous and exogenous variables, i.e., Z_i = [y_i : x_i]; and \delta_i is a vector of structural equation parameters such that \delta_i' = [\alpha_i : \beta_i]. Let

Y = (Y_1, ..., Y_m)', Z = diag(Z_1, ..., Z_m), \delta = (\delta_1, ..., \delta_m)', \varepsilon = (\varepsilon_1, ..., \varepsilon_m)'.

Then,

Y = Z \delta + \varepsilon.    (86.32)

If we multiply both sides of Equation (86.32) by a matrix X', where X' = diag(x', ..., x'), that is,

X'Y = X'Z \delta + X'\varepsilon,    (86.33)

then the variance–covariance matrix of the disturbance term in Equation (86.33), X'\varepsilon, will be

E(X'\varepsilon \varepsilon' X) = \Sigma \otimes x'x = [\sigma_{11} x'x ... \sigma_{1m} x'x; ...; \sigma_{m1} x'x ... \sigma_{mm} x'x].    (86.34)

The 3SLS structural equation parameters can thus be estimated as

\hat{\delta} = {Z'X [\hat{\Sigma}^{-1} \otimes (x'x)^{-1}] X'Z}^{-1} Z'X [\hat{\Sigma}^{-1} \otimes (x'x)^{-1}] X'Y.    (86.35)

A question arises in the estimation process because the \sigma's are unknown, hence the matrix \Sigma^{-1} is also unknown. This problem can be resolved by using the residuals from the structural equations estimated by the 2SLS to form the mean
sum of residual squares and use them to estimate \Sigma^{-1}. Standard econometric software such as SAS can easily be used to estimate Equation (86.35). In sum, 3SLS takes three stages to estimate the structural parameters. The first stage is to estimate the reduced form system; the second stage uses 2SLS to estimate the \hat{\Sigma} matrix; and the third stage completes the estimation using Equation (86.35). Since \Sigma contains information pertinent to the correlations between disturbance terms in the structural equations, 3SLS is called a full information method.³ Since 3SLS is a full information estimation method, the parameters estimated are asymptotically more efficient than the 2SLS estimates. However, this statement is correct only if the model is correctly specified. In effect, 3SLS is quite vulnerable to model misspecification, because misspecification in a single equation can easily propagate itself into the entire system.

³ For more detailed discussions, see Ghosh (1991), Judge et al. (1985), and Greene (2003).
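Equation (86.35) takes only a few lines once the 2SLS residuals are in hand. The sketch below is a minimal illustration on simulated data with correlated disturbances; all parameter values are assumptions, and the demand/supply setup mirrors the previous example.

import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(2)
n = 50_000
V, D = rng.normal(size=n), rng.normal(size=n)
eps = rng.normal(size=n)
e = 0.5 * eps + rng.normal(size=n)               # correlated disturbances
a0, a1, a2 = 10.0, -1.0, 2.0                     # demand parameters
b0, b1, b2 = 2.0, 1.0, 1.5                       # supply parameters
P = (a0 - b0 + a2 * V - b2 * D + eps - e) / (b1 - a1)
Q = b0 + b1 * P + b2 * D + e

x = np.column_stack([np.ones(n), V, D])          # all exogenous variables
Z1 = np.column_stack([np.ones(n), P, V])         # demand regressors
Z2 = np.column_stack([np.ones(n), P, D])         # supply regressors

def two_sls(Zi, y):
    Pz = x @ np.linalg.lstsq(x, Zi, rcond=None)[0]   # project regressors on x
    return np.linalg.lstsq(Pz, y, rcond=None)[0]

# Second stage: 2SLS residuals estimate Sigma in Equation (86.34)
u = np.column_stack([Q - Z1 @ two_sls(Z1, Q), Q - Z2 @ two_sls(Z2, Q)])
S_inv = np.linalg.inv(u.T @ u / n)

# Third stage: Equation (86.35) with Z = diag(Z1, Z2) and X = diag(x, x)
Zb, Xb = block_diag(Z1, Z2), block_diag(x, x)
Yb = np.concatenate([Q, Q])
W = np.kron(S_inv, np.linalg.inv(x.T @ x))       # Sigma^{-1} kron (x'x)^{-1}
XZ, XY = Xb.T @ Zb, Xb.T @ Yb
delta = np.linalg.solve(XZ.T @ W @ XZ, XZ.T @ W @ XY)
print("3SLS estimates (a0, a1, a2, b0, b1, b2):", np.round(delta, 3))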
86.3 Application of Simultaneous Equation in Finance Research

In this section, we use an example employing the simultaneous equation model to illustrate how the system can be applied to finance research. The corporate governance literature has long debated whether corporate executives' interests should be aligned with those of the shareholders. Agency theory argues that unless there is an incentive to align the managers' and shareholders' interests, managers facing the agency problem are likely to exploit the firm for personal interest at the expense of shareholders. One way to align the interests is to make executive compensation incentive-based. Chen et al. (2006) study the effect of bank executive incentive compensation on firm risk-taking. A single-equation model of the effect of executive compensation on firm risk-taking would look like:

Risk = \alpha_0 + \alpha_1 (Comp) + \alpha_2 (LTA) + \alpha_3 (Capital) + \alpha_4 (NI) + \sum_{i=5}^{m} \alpha_i (Dgeo) + \sum_{i=m+1}^{n} \alpha_i (Dyear) + \mu    (86.36)
where Risk is a measure of firm risk; Comp is the executive incentive compensation (i.e., option-based compensation); LTA is total assets in log form;
Capital is the bank's capital ratio; NI is the non-interest income, in percentage; Dgeo is a binary variable measuring the bank's geographic diversification; and Dyear is a yearly dummy variable. Chen et al. (2006) argue that OLS estimates of Equation (86.36) will suffer from simultaneity bias because executive compensation is endogenous to the model and is likely to be correlated with the disturbance term \mu. Therefore, Chen et al. (2006) introduce another equation to model executive compensation:

Comp = \beta_0 + \beta_1 (Risk) + \beta_2 (LTA) + \beta_3 (SP) + \sum_{i=4}^{m} \beta_i (Drate) + \omega.    (86.37)
where SP is the underlying stock price and Drate is a series of dummy variables measuring annual interest rates. For example, Rate92 is defined as the T-bill rate of 1992 if the data is from year 1992; otherwise, a value of 0 is assigned to Rate92. The interest rate dummies control for the impact of interest rates on option value. Taking Equations (86.36) and (86.37) together, we find that applying OLS to these two structural equations will not yield unbiased estimates, because the right-hand-side variables include the endogenous variables Comp and Risk. Therefore, Chen et al. (2006) apply 2SLS to these two equations, and Table 86.1 reports some of their results. Chen et al. first report OLS estimates of the risk equation and find that the executives' compensation structure does not impact firm risk-taking. The 2SLS results reported in Table 86.1, however, reveal that once the simultaneity of the firm risk decision and executive compensation is taken into account, executive compensation does affect firm risk-taking. The incorrect inference derived from the OLS estimates is thus due to simultaneity bias.
86.4 Conclusion

When deciding on the appropriate method to estimate structural equations, one must be cautious that in the real world the distinction between endogenous and exogenous variables is often not as clear-cut as one would like. Economic theory, therefore, must play an important role in model construction. Furthermore, as explained in Sect. 86.2, although full information methods produce more efficient estimates, they are not always better than the limited information method. This is because, for example, 3SLS is vulnerable to model specification errors. If an equation is misspecified, the error will propagate into the entire system of equations.
Table 86.1 Simultaneous equation model showing the relation between total risk and option-based compensation estimated using two-stage least squares (2SLS)

                     | Model: σ_j & OPTION/TOTAL_COMP       | Model: σ_j & ACCUMULATED_OPTION
Variable             | Equation 1        | Equation 2       | Equation 1        | Equation 2
OPTION/TOTAL_COMP    | 0.00028 (2.17)    | –                | –                 | –
ACCUMULATED_OPTION   | –                 | –                | 0.00021 (4.86)    | –
σ_j                  | –                 | 1065.1 (3.13)    | –                 | 1877.65 (5.93)
LN(TA)               | -0.00123 (-3.73)  | 2.191 (3.13)     | -0.0015 (-6.12)   | 2.5616 (3.93)
CAPITAL_RATIO        | -0.27 (-1.83)     | –                | -0.08 (-6.21)     | –
NON_INT_INCOME%      | 0.0012 (0.27)     | –                | -0.0025 (-0.83)   | –
GEO_DUMMY            | -0.0009 (-0.99)   | –                | 0.0006 (0.85)     | –
STOCK_PRICE          | –                 | 0.0758 (2.69)    | –                 | 0.157 (5.99)
D92/DRate92          | 0.0059 (3.8)      | -1.842 (-1.88)   | 0.0055 (5.54)     | -0.3406 (-0.37)
D93/DRate93          | 0.0056 (3.15)     | -1.054 (-0.89)   | 0.0052 (4.98)     | 0.6085 (0.55)
D94/DRate94          | 0.0022 (1.42)     | 0.7567 (0.88)    | 0.0024 (2.37)     | 0.4149 (0.52)
D95/DRate95          | 0.00006 (0.05)    | -1.1032 (-1.77)  | 0.0003 (0.36)     | -0.3342 (-0.57)
D97/DRate97          | 0.0033 (2.66)     | 0.404 (0.57)     | 0.0029 (3.11)     | 1.0367 (1.58)
D98/DRate98          | 0.0095 (7.68)     | 1.592 (2.3)      | 0.0093 (10.15)    | 1.0771 (1.67)
D99/DRate99          | 0.0071 (5.47)     | 3.591 (4.7)      | 0.0071 (7.55)     | 0.981 (1.38)
D00/DRate00          | 0.0136 (10.53)    | 1.259 (2.0)      | 0.0139 (14.65)    | 1.4515 (2.47)
R²                   | 30%               | 17%              | 45%               | 19%

σ_j is a measure of total risk; OPTION/TOTAL_COMP is the percentage of total compensation in the form of stock options; ACCUMULATED_OPTION is the accumulated option value measuring the executive's wealth; LN(TA) is the natural log of total assets; CAPITAL_RATIO is the capital-to-assets ratio; NON_INT_INCOME% is the percentage of income that is from non-interest sources; GEO_DUMMY is a binary variable measuring geographic diversification; the D92–D00 dummies are coded as 1 or 0 for each year from 1992–2000, with 1996 the excluded year. ***, **, * indicate significance at the 1, 5 and 10% levels, respectively.
References

Blalock, H. M. 1969. Theory construction: from verbal to mathematical formulations, Prentice-Hall, Englewood Cliffs, NJ.
Chen, C. R., T. Steiner, and A. Whyte. 2006. "Does stock option-based executive compensation induce risk-taking? An analysis of the banking industry." Journal of Banking and Finance 30, 915–946.
Fisher, F. M. 1966. The identification problem in econometrics, McGraw-Hill, New York.
Fogler, H. R. and S. Ganapathy. 1982. Financial econometrics, Prentice-Hall, Englewood Cliffs, NJ. Ghosh, S. K. 1991. Econometrics: theory and applications, Prentice Hall, Englewood Cliffs, NJ. Greene, W. H. 2003. Econometric analysis, Prentice Hall, Upper Saddle River, NJ. Judge, G. G., W. E. Griffiths, R. C. Hill, H. Lütkepohl, and T. Lee. 1985. The theory and practice of econometrics, Wiley, New York. Ramanathan, R. 1995. Introductory econometrics with application, Dryden, Fort Worth, TX.
Chapter 87
The Fuzzy Set and Data Mining Applications in Accounting and Finance Wikil Kwak, Yong Shi, and Cheng-Few Lee
Abstract To reduce the complexity of computing trade-offs among multiple objectives, our series of papers adopts a fuzzy set approach to solve various accounting or finance problems such as international transfer pricing, human resource allocation, accounting information system selection, and capital budgeting problems. A solution procedure is proposed to systematically identify a satisfying selection of possible solutions that can reach the best compromise value for the multiple objectives and multiple constraint levels. The fuzzy solution can help top management make a realistic decision regarding its various resource allocation problems as well as the firm's overall strategic management when environmental factors are uncertain. In addition, we provide an update on the effectiveness of a multiple criteria linear programming (MCLP) approach to data mining for bankruptcy prediction using financial data. Data mining applications have been receiving more attention in general business areas, but there is a need to use more of these applications in accounting and finance areas when dealing with large amounts of financial and non-financial data. The results of the MCLP data mining approach in our bankruptcy prediction studies are promising and may extend to other countries.

Keywords Fuzzy set · Data mining · Transfer pricing · Human resource allocation · Accounting information system selection · Capital budgeting · Firm bankruptcy predictions
W. Kwak () University of Nebraska at Omaha, Omaha, NE, USA e-mail: [email protected] Y. Shi University of Nebraska at Omaha, Omaha, NE, USA and Chinese Academy of Sciences, Beijing, China C.-F. Lee Rutgers University, Newark, NJ, USA
87.1 Introduction

In this paper, we provide a state-of-the-art review of our past work on fuzzy set and data mining approaches to some critical problems in accounting and finance. These applications include international transfer pricing, human resource allocation in a CPA firm, accounting information system selection, capital budgeting, and firm bankruptcy predictions. The purpose of this paper is to show that in addition to traditional methods, such as statistics and logit or probit models, there are many quantitative methods, exemplified by fuzzy sets and data mining, that can alternatively be used to analyze and solve these challenging accounting and finance problems.
87.2 A Fuzzy Approach to International Transfer Pricing

As the global economy has become more integrated, international transfer pricing issues have become more important, because most corporations have dramatically increased the transfer of goods or services among their divisions as a result of increased outsourcing of products or services when restructuring or downsizing their organizations. In addition, transfer pricing guidelines are an important element of the international tax system because 60% of world trade is performed by multinational corporations. Transfer pricing problems have been extensively studied by a number of scholars. Many of them have recognized that a successful transfer pricing strategy should consider multiple criteria or objectives, such as overall profit after tax, total market share, divisional autonomy, performance evaluation, and utilized production capacity (Abdel-khalik and Lusk 1974; Bailey and Boe 1976; Merville and Petty 1978; Yunker 1983; Lecraw 1985; Lin et al. 1993). As an alternative methodology, Kwak et al. (2002) proposed a fuzzy set approach to solve transfer pricing problems with multiple criteria and multiple decision makers (MCMDM). This approach is based on two fundamental
human cognitive processes: (1) all decision makers (DMs) involved in the transfer pricing decision making have goal-setting and compromising behavior when seeking multiple criteria, and (2) each DM has a preference for the transfer price level. The solution procedure systematically identifies a fuzzy optimal selection of possible transfer prices that can not only reach the best compromised value for the multiple criteria, but also use the best price range according to the multiple decision makers' preferences. In comparison with the analytic hierarchy process (AHP) approach of Kwak et al. (1996), the fuzzy set approach does not have to elicit weights of importance for both the multiple criteria and the multiple DMs' preferences. Furthermore, the fuzzy set approach is easier for a firm to apply, because it can be implemented using any available commercial computer software that solves linear programming problems. Practically speaking, since linear programming has only a single criterion (objective) and a single resource availability level (right-hand side), it has limitations in handling real-world transfer pricing problems. For instance, linear programming cannot be used to solve the problem in which a corporation tries to maximize overall profit after tax and total market share simultaneously. This dilemma is overcome by a technique called multiple criteria (MC) linear programming (Zeleny 1974; Goicoechea et al. 1982; Steuer 1986). To extend the framework of MC linear programming, Seiford and Yu (1979) and Yu (1985) formulated a model of multiple criteria and multiple constraint levels (MC²) linear programming. This model is rooted in two facts. First, from the linear system structure point of view, the criteria and constraints may be "interchangeable." Thus, like multiple criteria, multiple constraint (resource availability) levels can be considered. Second, from the application point of view, it is more realistic to consider multiple resource availability levels (discrete right-hand sides) than a single resource availability level in isolation. The philosophy behind this perspective is that the availability of resources can fluctuate depending on changes in the decision situation, such as the desirability levels believed by the different managers. For example, if the differentiation of budget among managers in transfer pricing problems (Watson and Baumler 1975) is represented by different levels of budget, then this differentiation can be resolved by identifying some best compromises of budget levels as the consensus budget. The theoretical connections between MC linear programming and MC² linear programming can be found in Gyetvan and Shi (1992). Decision problems related to MC² linear programming have been extensively studied (Lee et al. 1990; Shi and Yu 1992). Key ideas of MC² linear programming, a primary theoretical foundation of this paper, are outlined as follows.
An MC² linear programming problem can be formulated as

Max \lambda^t C x
s.t. A x \le D \gamma, x \ge 0,

where C \in R^{q \times n}, A \in R^{m \times n}, and D \in R^{m \times p} are matrices of q \times n, m \times n, and m \times p dimensions, respectively; x \in R^n are the decision variables; \lambda \in R^q is called the criteria parameter and \gamma \in R^p is called the constraint level parameter. Both (\lambda, \gamma) are assumed unknown. Denote the index set of the basic variables {x_{j_1}, ..., x_{j_m}} for the MC² problem by J = {j_1, ..., j_m}. Note that the basic variables may contain some slack variables. Without confusion, J is also called a basis for the MC² problem. Since a basic solution J depends on the parameters (\lambda, \gamma), we define that (1) a basic solution J is feasible for the MC² problem if and only if there exists a \gamma^0 > 0 such that J is a feasible solution for the MC² problem with respect to \gamma^0; and (2) J is potentially optimal for the MC² problem if and only if a \lambda^0 > 0 and a \gamma^0 > 0 exist such that J is an optimal solution for the MC² problem with respect to (\lambda^0, \gamma^0). For an MC² problem, a number of potentially optimal solutions {J} may exist as the parameters (\lambda, \gamma) vary, depending on decision situations. Seiford and Yu (1979) derived a simplex method to systematically locate the set of all potentially optimal solutions {J}. The computer software for the simplex method (called MC² software) was developed by Hao and Shi (1996). This software consists of five subroutines in each iteration: (1) pivoting, (2) determining primal potential bases, (3) determining dual potential bases, (4) determining the effective constraints for the primal weight set, and (5) determining the effective constraints for the dual weight set (Chap. 8 of Yu 1985). It is written in PASCAL and operates in a man-machine interactive fashion. The user can not only view the tableau of each iteration but also trace the past iterations. In the next section, the framework of the MC² problem, as well as its software, will be used to formulate and solve international transfer pricing problems using the fuzzy set approach. The MC² linear programming framework can resolve the above shortcomings inherent in the previous transfer pricing models. Based on Al-Eryani et al. (1990), Yunker (1983), and Tang (1992), the four important objectives of transfer pricing problems in most corporations are considered: (1) maximizing the overall profit after tax; (2) maximizing the total market share; (3) maximizing the subsidiary profit (note that the subsidiary profit maximization is used to reflect the degree of subsidiary autonomy in decision making; it may differ from the overall profit); and (4) maximizing the utilized production capacity. Even though the following model
contains only these four specific objectives, the generality of the modeling process fits all transfer pricing problems with multiple criteria and multiple constraint levels. To mathematically formulate the fuzzy MCMDM transfer pricing model, let k be the number of divisions in the corporation under consideration, and t be the index number of the products that each division of the corporation produces. Define x_{ij} as the units of the jth product made by the ith division, i = 1, ..., k; j = 1, ..., t. For the coefficients of the objectives, let p_{ij} be the unit overall profit generated from the jth product made by the ith division; m_{ij} be the market share value for the jth product made by the ith division in the market; s_{ij} be the unit subsidiary profit generated from the jth product made by the ith division; and c_{ij} be the unit utilized production capacity of the ith division to produce the jth product. For the coefficients of the constraints, let b_{ij} be the budget allocation rate for producing the jth product by the ith division. For the coefficients of the constraint levels, let b_{ij}^s be the budget availability level believed by the sth manager (or executive) for producing the jth product by the ith division, s = 1, ..., h; let d_i^s be the production capacity level believed by the sth manager for the ith division; d_{ij}^s be the production capacity level believed by the sth manager for the ith division to produce the jth product; and e_{ij}^s be the initial inventory level believed by the sth manager for the ith division to hold the jth product. Then, the multiple factor transfer pricing model is:
Max Σ_{i=1}^{k} Σ_{j=1}^{t} p_ij x_ij
Max Σ_{i=1}^{k} Σ_{j=1}^{t} m_ij x_ij
Max Σ_{i=1}^{k} Σ_{j=1}^{t} s_ij x_ij
Max Σ_{i=1}^{k} Σ_{j=1}^{t} c_ij x_ij        (87.1)

subject to:

Σ_{i=1}^{k} Σ_{j=1}^{t} b_ij x_ij ≤ (b_ij^1, …, b_ij^h) γ
Σ_{j=1}^{t} x_ij ≤ (d_i^1, …, d_i^h) γ
x_ij ≤ (d_ij^1, …, d_ij^h) γ
−x_ij + x_{i+1,j} ≤ (e_ij^1, …, e_ij^h) γ
x_ij ≥ 0, i = 1, …, k; j = 1, …, t,        (87.2)

where γ = (γ_1, …, γ_h)ᵀ is the constraint level parameter.

Note that even though this MCMDM transfer pricing model includes only four specific groups of objectives for presentation purposes, the modeling process easily adapts to all international transfer pricing problems with multiple criteria and multiple constraint levels, such as maximization of after-tax profit for the parent company as another criterion or minimization of foreign taxes and tariffs as additional constraints. The above model (Equations (87.1) and (87.2)) is an MC2 problem with four objectives and four constraint levels. Solving this type of real-world problem is quite complicated. Therefore, a fuzzy set approach is needed to make the computational task as practical as possible. Based on a compromised solution approach for multiple objective decision making problems, Shi and Liu (1993) propose fuzzy MC2 linear programming. Their fuzzy approach adopts the compromised solution, or the "satisficing solution," between upper and lower bounds of acceptability for objective payoffs. The following presents the fuzzy set approach for an international transfer pricing problem.

Step 1: Given the model (Equations (87.1) and (87.2)), we may use any available computer software to solve the following series of LP programs. For illustration purposes, assume t = 5, k = 5, and h = 5:

(i) Max (and Min) Σ_{j=1}^{5} p_j x_j        (87.3)

subject to:

Σ_{j=1}^{5} b_ij x_j ≤ (c_1i, …, c_5i)(γ_1, …, γ_5)ᵀ,
Σ_{p=1}^{5} γ_p = 1, γ_p > 0, p = 1, …, 5,
and x_j ≥ 0, for j = 1, …, 5.        (87.4)

(ii) Max (and Min) Σ_{j=1}^{5} m_ij x_j, i = 1, …, 5        (87.5)

subject to: Equation (87.4).

(iii) Max (and Min) Σ_{j=1}^{5} s_ij x_j, i = 1, …, 5        (87.6)

subject to: Equation (87.4).

(iv) Max (and Min) Σ_{j=1}^{5} c_ij x_j, i = 1, …, 5        (87.7)

subject to: Equation (87.4).
Group (i) of the above LP problems has two types of optimization: one is maximization and the other is minimization. Groups (ii), (iii), and (iv) have six problems; all of them use the same constraints (Equation (87.4)). The total number of problems is eight. We define the objective value of the maximization problems by U_i as the upper bound of DMs' acceptability for the objective payoff, and the objective value of the minimization problems by L_i as the lower bound of DMs' acceptability for the objective payoff, i = 1, …, 12.

Step 2: Let f(x)_i, i = 1, …, 12, be the objective functions in Step 1. Then, we solve the following linear program:

Max β        (87.8)

subject to:

β ≤ (f(x)_i − L_i)/(U_i − L_i), i = 1, …, 12,        (87.9)
β ≥ 0,
Σ_{j=1}^{5} b_ij x_j ≤ (c_1i, …, c_5i)(γ_1, …, γ_5)ᵀ,
Σ_{p=1}^{5} γ_p = 1, γ_p > 0, p = 1, …, 5,
and x_j ≥ 0, for j = 1, …, 5.

The resulting values of x_j are the fuzzy optimal solution and the value of β is the satisficing level of the solution for the DMs.

The above fuzzy solution method has systematically reduced the complexity of the MCMDM transfer pricing model (Equations (87.1) and (87.2)) because it transforms the MC2 LP problem into an LP problem. The above two solution steps can be easily implemented by employing available commercial computer software. In the following, a prototype of the transfer pricing model in a multinational corporation is illustrated to demonstrate the implications for decision makers.

As an illustration of the multiple factor transfer pricing model, the Union Plastic Corporation has two divisions that process raw materials into intermediate or final products. Division 1, which is located in Taipei, Taiwan, manufactures two kinds of plastics, called Products 1 and 2. Product 1 in Division 1 is an intermediate product and cannot be sold externally. It can, however, be processed further by Division 2 into a final product. Division 2, which is located in Omaha, Nebraska, U.S., processes Product 2 and finalizes Product 1 of Division 1. Central management and the two divisional managers agree on the following multiple objectives:

(i) Maximize the overall company's after-tax profit.
(ii) Maximize the market share goal of Product 2 in Division 1, and Products 1 and 2 in Division 2.
(iii) Maximize the utilized production capacity of the company so that each division manager can avoid any underutilization of normal production capacity.

The data related to these objectives are given in Table 87.1.

Table 87.1 Union Plastic Corporation

                                      Division 1              Division 2
                                      Product 1   Product 2   Product 1   Product 2
Unit profit ($)                       −4          8           13          5
Market price ($)                      0           40          46.2        38
Unit utilizing production
capacity (hours)                      4           4           3           2

In Table 87.1, Product 1 of Division 1 is a byproduct that has no sale value in Division 1 at all. The unit profit of −$4 for this product represents its production cost. All other products generate profits. We use the market price of each product to maximize the market share goal. Note that the market prices of the three salable products in Table 87.1 are different.

In this example, besides central management, the selling divisional manager and the purchasing divisional manager may have different interpretations of the same resource availability across the divisions. The selling divisional manager views the constraint level based on the material, manpower, and equipment under control, while the purchasing divisional manager views the constraint level based on available cash flow. Central management will make the final decision on the company's transfer price setting based on the compromises of both managers. All divisional managers will carry out central management's decision, although they have the autonomy to provide the information about divisional profits and costs for central management. The interpretations of both managers for the resource constraint levels are summarized in Table 87.2.

Table 87.2 The constraint levels of the vice presidents

                                                  The selling           The purchasing
                                                  divisional manager    divisional manager
Transfer product constraint                       0                     100
Budget constraint ($)                             45,000                40,000
Production capacity of product 2 in division 1    38,000                12,000
Production capacity of product 1 in division 2    45,000                50,000
Production capacity of product 2 in division 2    36,000                10,000

All executives and divisional managers agree on the units of resources consumed to produce the products. The data are given in Table 87.3.

Table 87.3 The unit consumptions of resources

                                                  Division 1              Division 2
                                                  Product 1   Product 2   Product 1   Product 2
Transfer product constraint                       −1.0        0           1.0         0
Budget constraint ($)                             0           0.4         0.4         0.4
Production capacity of product 2 in division 1    0           1.0         0           0
Production capacity of product 1 in division 2    0           0           1.0         0
Production capacity of product 2 in division 2    0           0           0           1.0

Let x_ij be the units of the jth product produced by the ith division, i = 1, 2; j = 1, 2. Then:

Step 1. We use the LINDO (Schrage 1991) computer software to solve three maximization problems and three minimization problems for the three objectives, respectively. The results are: U1 = 856,900; U2 = 4,720,000; U3 = 546,000; and L1 = −20,000; L2 = 0; L3 = 0. To see a more concentrated value on γ1 and γ2, we add γ1 ≥ 0.3 and γ2 ≥ 0.3 to each problem. Then, the results are: U1 = 801,380; U2 = 4,427,900; U3 = 516,700; and L1 = −20,000; L2 = 0; L3 = 0. The LINDO software yields slacks and dual prices, and a dual price is an opportunity cost or marginal cost of transfer prices. When γ1 + γ2 = 1, the objective function value of this example is $856,900 for the first objective maximization. The slack for γ1 is 6,500, and the transfer prices of {x11, x12, x21, x22} are $4.00, $12.50, $3.00, and $4.00, respectively. When γ1 ≥ 0.3 and γ2 ≥ 0.3, the objective function value is $801,380. The slack for x12 is 1,540, and the transfer prices x11 = $4.00, x21 = $8.00, and x22 = $9.00 will be produced to achieve the objective payoff V(J1), and the resource of x22 has the amount of s5 unused. According to the marginal cost pricing approach, the optimal transfer prices of {J1, J2} are designated as the shadow prices of {J1, J2}.

Step 2. Let β be the satisficing level of the fuzzy optimal solution for the DMs. Then, we have the fuzzy solution problem as shown in the appendix. Solving this problem, we obtain: β = 0.987; x11 = 48,262.7, x12 = 38,000, x21 = 45,000, and x22 = 29,500 units; and γ1 = 1 and γ2 = 0. If we impose the preference parameter of the decision makers γi ≥ 0.3, then the solution is β = 0.991; x11 = 48,784, x12 = 30,200, x21 = 46,500, and x22 = 28,200 units; and γ1 = 0.7 and γ2 = 0.3. Here γ1 is the preference of the selling divisional manager and γ2 is the preference of the purchasing divisional manager. However, the current fuzzy solution does not provide shadow prices. For the current solution with both managers' preferences, we only find fuzzy optimum units for each product. For example, if we transfer x11 at $4.00, x12 at $12.50, x21 at $3.00, and x22 at $4.00, the objective function value will be $791,964, which is less than the optimum value for the profit maximization goal, but we can satisfy both managers according to their other preferences. With the above shadow prices (or transfer prices), the constraint in Step 2 is satisfied. However, other shadow prices in Step 1 do not satisfy the constraints in Step 2.

The fuzzy set approach for the transfer pricing problems discussed in this paper has some managerial implications. First, the model integrates the multiple objective transfer pricing problem with multiple DMs. In our model, multiple constraint levels are allowed to incorporate each DM's preference of budget availability. This improvement in modeling can minimize the uncertainty of determining the level of budget availability. Second, the fuzzy set approach facilitates DMs' participation to avoid suboptimization of overall company goals. This model supports divisional autonomy more than any other mathematical programming model by allowing differences in each decision maker's preference. Third, solving MC2 problems in a real-world setting is not practical because of computational complexity. The fuzzy set approach is more straightforward and easy to solve using currently available computer software. Finally, the structure of the MCMDM international transfer pricing model is flexible. We can easily add or delete objectives or constraints depending on economic or political situation changes, which are common in international business environments.
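LINDO is not essential to the two-step procedure; any LP library can reproduce it. The sketch below is a minimal Python/scipy version with illustrative coefficients (it is not the Union Plastic model, and it fixes the constraint levels rather than carrying the γ variables, which would enter both steps as additional columns). It computes the Step 1 bounds U_i and L_i, and then solves Step 2 by rewriting β ≤ (f(x)_i − L_i)/(U_i − L_i) as f(x)_i − (U_i − L_i)β ≥ L_i with β as one extra decision variable.

# Two-step fuzzy compromise ("satisficing") solution for a multiple
# objective LP: Step 1 finds payoff bounds U_i, L_i; Step 2 maximizes
# the common satisficing level beta. Illustrative coefficients only.
import numpy as np
from scipy.optimize import linprog

F = np.array([[5.0, 4.0],      # objective 1 coefficients
              [1.0, 3.0]])     # objective 2 coefficients
A = np.array([[2.0, 3.0],
              [4.0, 1.0]])
b = np.array([24.0, 16.0])

# Step 1: maximize and minimize each objective over the feasible set.
U, L = [], []
for f in F:
    U.append(-linprog(-f, A_ub=A, b_ub=b, bounds=[(0, None)] * 2).fun)
    L.append(linprog(f, A_ub=A, b_ub=b, bounds=[(0, None)] * 2).fun)
U, L = np.array(U), np.array(L)

# Step 2: variables are (x1, x2, beta); maximize beta subject to
#   f_i(x) - (U_i - L_i)*beta >= L_i  plus the original constraints.
n = A.shape[1]
c = np.zeros(n + 1); c[-1] = -1.0                 # minimize -beta
A_beta = np.hstack([-F, (U - L).reshape(-1, 1)])  # -f_i(x)+(U_i-L_i)beta <= -L_i
A_ub = np.vstack([np.hstack([A, np.zeros((A.shape[0], 1))]), A_beta])
b_ub = np.concatenate([b, -L])
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * n + [(0, 1)])
print("beta =", res.x[-1], "x =", res.x[:n])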
87.3 A Fuzzy Set Approach to Human Resource Allocation of a CPA Firm

Formal planning for staff personnel in a CPA firm is very important because it enables a firm to function more efficiently and effectively, given the current dynamic market situation and the difficulty of retaining highly qualified accountants. The declining number of accounting graduates and new CPA candidates requires accounting firms to rethink their human resource allocation problems. Information released by the American Institute of CPAs, based on the 1998-1999 academic year, shows a 22% drop in the number of accounting graduates compared to the 1995-1996 academic year (Accounting Today 2001). If this trend continues, accounting firms will be forced to do formal planning to address their human resource allocation problems. Firms will be required to perform a variety of services such as management advisory services, auditing, and tax planning for their clients with less manpower. If a firm decides to have a formal planning model, the model should incorporate the economic and professional objectives of its personnel. As one of the professional objectives, 120 hours of continuing professional education every 3 years for all members in public practice was proposed and accepted several years ago by the Council of the American Institute of Certified Public Accountants (AICPA). In addition, rotation of audit partners and staff was proposed by the Chartered Accountants' Joint Ethics Committee to increase "auditor independence" by the calendar year 1995 (Business Review Weekly 1997). Generally, planning an audit staff in a large accounting firm is a complex task, and multiple goals are necessary in today's highly competitive environment. Linear programming focuses on a single goal, usually profit maximization or cost minimization, but this is not the situation for an accounting firm, which is a service organization. Since the fuzzy set approach provides a simultaneous solution to a complex system of competing objectives, it seems to be a proper tool for an accounting firm's human resource allocation problem. Kwak et al. (2003) explored a fuzzy set approach to the human resource allocation problem of the functional departments of a CPA firm.

To mathematically formulate the multiple objective audit-staffing model, let i be the position of the audit staff in a CPA firm under consideration. Define x_i as the positions of the audit staff by the office, i = 1, …, t. For the coefficients of the objectives, let p_i be the gross audit fees generated by the ith staff; let q_i be the income made by the ith staff; and let r_i be the management-personnel-to-staff ratio to maintain audit quality. For the coefficients of the constraints, let b_i be the hourly billing rates and let c_i be the projected profit believed by the kth manager for the firm. Here, we allow multiple constraint levels for the billing rate, which is believed by different managers. Generally, we assume that at least two different types of personnel managers get involved in the human resource allocation problem. Then, the multiple objective audit staff-planning model is:
Max Σ_{i=1}^{t} p_i x_i
Max Σ_{i=1}^{t} q_i x_i
Max Σ_{i=1}^{t} r_i x_i

subject to:

Σ_{i=1}^{t} b_i x_i ≤ c_1γ_1 + … + c_tγ_k,
Σ_{j=1}^{k} γ_j = 1, γ_j ≥ 0,
x_i ≥ 0, i = 1, …, t; j = 1, …, k.

Note that even though this model includes only three specific groups of objectives for presentation purposes, the modeling process easily adapts to all audit-staffing problems with multiple objectives and multiple constraint levels. We could use an integer LP program for this type of problem to enforce integer solutions, which may be easier to interpret and implement for the decision maker. The proposed model is an LP problem with three objectives and four constraint levels. In the case of billing rates, a multiple constraint level is used. In other words, each manager's projection of the increased billing rate for the upcoming year can be different. The differences in the right-hand side values are accepted in this model, which allows a group decision-making process. Solving this type of real-world problem is quite a task as the market conditions or the firm environment changes. The proposed fuzzy set approach for an audit-staffing problem can be:

Step 1: Given the proposed model in this paper, we may use any available computer software to solve the following series of LP problems. For illustration purposes, assume t = 5 and k = 2:

(i) Max (and Min) Σ_{i=1}^{5} p_i x_i
subject to:

Σ_{i=1}^{5} b_i x_i = (c_1γ_1 + … + c_5γ_2),
Σ_{j=1}^{2} γ_j = 1, γ_j ≥ 0,
and x_i ≥ 0, for i = 1, …, 5.        (87.10)

(ii) Max (and Min) Σ_{i=1}^{5} q_i x_i

subject to: Equation (87.10).

(iii) Max (and Min) Σ_{i=1}^{5} r_i x_i

subject to: Equation (87.10).

Group (i) of the above LP problems has two types of optimization: one is maximization and the other is minimization. Groups (ii) and (iii) contribute four more problems, two each. All of them use the same constraints (Equation (87.10)). The total number of problems is six. We define the objective value of the maximization problems by U_i as the upper bound of decision makers' (DMs') acceptability for the objective payoff, and the objective value of the minimization problems by L_i as the lower bound of DMs' acceptability for the objective payoff, i = 1, 2, 3.

Step 2: Let f(x)_i, i = 1, 2, 3, be the objective functions in Step 1. Then, we solve the following linear program:

Max β        (87.11)

subject to:

β ≤ (f(x)_i − L_i)/(U_i − L_i), i = 1, 2, 3,        (87.12)
β ≥ 0,
Σ_{i=1}^{5} b_i x_i = (c_1γ_1 + … + c_5γ_2),
Σ_{j=1}^{2} γ_j = 1, γ_j ≥ 0,
and x_i ≥ 0, for i = 1, …, 5.

The resulting values of x_i are the fuzzy optimal solution and the value of β is the satisficing level of the solution for the DMs.
The above fuzzy solution method has systematically reduced the complexity of the multiple objective audit staff-planning model because it transforms the multiple objective, multiple constraint level LP problem into a single LP problem. The above two solution steps can be easily implemented by employing available commercial computer software. In the following, a numerical example of the audit staff-planning model in a CPA firm is presented to demonstrate the implications for decision makers.

A numerical example, which is similar to Killough and Souders (1973), is developed to present the fuzzy set approach and its solution method. The proposed model in this paper is concerned only with a firm's audit function, although tax and management advisory services can be added if desired. In addition, the model is limited to a planning horizon of 1 year; however, the timing horizon can be easily extended if needed. The CPA firm's audit staff is assumed to consist of five levels. At the highest level is Partner A and at the second level is Partner B, followed in descending hierarchical order by manager, senior staff auditor, and junior staff auditor. Besides these five levels, the personnel partner, who is responsible for manpower planning, will be added. In this paper, let us assume that the personnel partner is trying to develop the most effective audit staff workload allocation plan for the next year. Given projected billings and the present number of staff members at each level, he needs to know whether the firm should increase or decrease the number of its auditor employees at each level. Based on past experience, the personnel partner decides the objectives to be achieved. The first objective is to increase gross audit fees by 5% over the past year. To achieve this objective, the personnel partner has the subgoals of increasing the hourly billing rate per classification by 2% and increasing chargeable hours by 4%, wherever possible. The second objective is to maintain a ratio of at least one management person (partners and managers) to every five staff (senior and junior staff auditors). This is a required condition to maintain the quality of the audit work and to retain the company's good reputation. The third objective is to maintain a net income of $5,500,000. Tables 87.4-87.6 present the relevant information. Table 87.4 contains the information concerning audit personnel, their working hours, the billing rate per hour, and the gross audit fees earned for the past year. Table 87.5 contains projected information on chargeable hours, nonchargeable hours, and billing rates for the upcoming year. Table 87.6 presents projected revenues and expenses predicated on these same goals.
Table 87.4 Audit personnel, working hours, billing rates, and fees

Position    Number     Working hours/   Total hours/   Chargeable hours/   Nonchargeable hours/   Billing
            employed   person/year      position       person/year         person/year            rate/hour
Partner A   2          2,500            5,000          2,000               500                    $80
Partner B   5          2,250            11,250         2,000               250                    $70
Manager     12         2,300            27,600         2,100               200                    $60
Senior      40         2,250            90,000         2,000               250                    $50
Junior      52         2,000            104,000        1,700               300                    $40

Gross audit fees earned for the past year (chargeable hours/position/year × billing rate/hour): $10,068,000.

Table 87.5 Projected information for the next year based on goals set by the firm

1. Chargeable hours (an increase of 4%)
   Partner A: 4,000 × 104% = 4,160
   Partner B: 10,000 × 104% = 10,400
   Manager: 25,200 × 104% = 26,208
   Senior: 80,000 × 104% = 83,200
   Junior: 88,400 × 104% = 91,936
2. Billing rates/hour (an increase of 2%)
   Partner A: $80 × 102% = $81.60
   Partner B: $70 × 102% = $71.40
   Manager: $60 × 102% = $61.20
   Senior: $50 × 102% = $51.00
   Junior: $40 × 102% = $40.80

Table 87.6 Projected revenues and expenses

1. Gross audit fees:
   Partner A: 4,160 h × $81.60/h = $339,456
   Partner B: 10,400 h × $71.40/h = $742,560
   Manager: 26,208 h × $61.20/h = $1,603,929
   Senior: 83,200 h × $51.00/h = $4,243,200
   Junior: 91,936 h × $40.80/h = $3,750,989
   Total: $10,680,134
2. Expenses, salaries: Partner A ($50,000 X1); Partner B ($45,000 X2); Manager ($30,000 X3); Senior ($25,000 X4); Junior ($20,000 X5), where Xi represents the number of auditors employed.
3. Other expenses, estimated fixed costs (e.g., rents, insurance, depreciation expenses for office equipment, etc.): total $2,750,000.

Formulation of the model variables is as follows:

X1 = Number of audit partners A required.
X2 = Number of audit partners B required.
X3 = Number of audit managers required.
X4 = Number of audit senior staff required.
X5 = Number of audit junior staff required.
X6 = New hourly billing rate for partners A.
X7 = New hourly billing rate for partners B.
X8 = New hourly billing rate for managers.
X9 = New hourly billing rate for senior staff.
X10 = New hourly billing rate for junior staff.

The objectives and constraints for the numerical example are as follows. First, the objective of a 5% increase in gross audit fees may be expressed as:

4,160X6 + 10,400X7 + 26,208X8 + 83,200X9 + 91,936X10 ≥ $10,680,134.

It is desirable to provide a net income of $5,500,000 in the upcoming year for the growth and enhancement of the firm's partners. Therefore, the second objective can be expressed as:

4,160X6 + 10,400X7 + 26,208X8 + 83,200X9 + 91,936X10 − 50,000X1 − 45,000X2 − 30,000X3 − 25,000X4 − 20,000X5 ≥ $8,250,000.

The personnel partner believes that it is desirable to maintain a ratio of at least one management person (partners and managers) to every five staff persons (senior and junior audit staff). This is a desirable condition to maintain the quality of audit work: after all the field work is done, at least one of the partners should review that the job is done properly, and all the audit work, from planning to issuing the audit report, should be supervised properly. This third objective can be expressed as:

5(X1 + X2 + X3) − (X4 + X5) ≥ 0.

The following are the constraints:

(i) Personnel requirement. The constraints for the number of audit personnel required (from Tables 87.4 and 87.5) can be expressed as:
2,500X1 ≤ 5,160
2,250X2 ≤ 11,650
2,300X3 ≤ 28,608
2,250X4 ≤ 93,200
2,000X5 ≤ 107,536

Here the nonchargeable hours are maintained at present levels to provide the necessary time for formal training, the upgrading of services, and obtaining new clients.

(ii) Billing rates. The multiple constraints for the new hourly billing rates can be expressed as:

X6 ≤ 81.6γ1 + 85.0γ2
X7 ≤ 71.4γ1 + 68.0γ2
X8 ≤ 61.2γ1 + 65.0γ2
X9 ≤ 51.0γ1 + 55.0γ2
X10 ≤ 40.8γ1 + 45.0γ2

Here the new hourly billing rates have two different levels. The first right-hand side value is the rate estimated by the personnel partner and the second right-hand side value is the rate estimated by the managing partner. The fuzzy set approach used in this paper allows multiple constraint levels as well as multiple objectives; however, for illustration purposes, we only allow multiple billing rates.

(iii) Minimization of overtime. Job pressure, frequent traveling, and overtime can explain the high turnover of top-quality certified public accountants. The personnel partner decides to minimize overtime to retain experienced auditors. This constraint can be expressed as:

500X1 ≤ 1,000
250X2 ≤ 1,250
200X3 ≤ 2,400
250X4 ≤ 10,000

(iv) Constraints on senior and junior audit staff. It is desired to keep at least 90 senior and junior staff auditors and one managing partner to increase sales and maintain the quality of audit work. The personnel partner wants to maintain at least the current Partner B, manager, and senior staff levels. These constraints can be expressed as:

X4 + X5 ≥ 90, X1 ≥ 1, X2 ≥ 5, X3 ≥ 10, and X4 ≥ 40.

A Fuzzy Solution:

Step 1. The LINDO (Schrage 1991) computer software was used to solve three maximization problems and three minimization problems for the three objectives, respectively. The results are: U1 = 1,109,340; U2 = 8,902,440; U3 = 5.000; and L1 = 0; L2 = −2,760,360; L3 = −13.768.
Step 2. Let β be the satisficing level of the fuzzy optimal solution. Then, we have the fuzzy solution problem below:

Maximize β

subject to:

4,160X6 + 10,400X7 + 26,208X8 + 83,200X9 + 91,936X10 − 1,109,340β ≥ 0
4,160X6 + 10,400X7 + 26,208X8 + 83,200X9 + 91,936X10 − 50,000X1 − 45,000X2 − 30,000X3 − 25,000X4 − 20,000X5 − 11,662,800β ≥ −2,760,360
5(X1 + X2 + X3) − (X4 + X5) − 18.768β ≥ −13.768
2,500X1 ≤ 5,160
2,250X2 ≤ 11,650
2,300X3 ≤ 28,608
2,250X4 ≤ 93,200
2,000X5 ≤ 107,536
X6 − 81.6γ1 − 85γ2 ≤ 0
X7 − 71.4γ1 − 68γ2 ≤ 0
X8 − 61.2γ1 − 65γ2 ≤ 0
X9 − 51.0γ1 − 55γ2 ≤ 0
X10 − 40.8γ1 − 45γ2 ≤ 0
500X1 ≤ 1,000
250X2 ≤ 1,250
200X3 ≤ 2,400
250X4 ≤ 10,000
X4 + X5 ≥ 90
X1 ≥ 1, X2 ≥ 5, X3 ≥ 10, X4 ≥ 40
γ1 + γ2 = 1, and β ≤ 1.

Solving this problem, we obtain β = 0.99, which implies that 99% of the objectives are satisfied under the current constraints. The satisfaction rate is high for this numerical example because the problem is simple and well defined. The current solution shows that 1.965 Partner A, 5 Partner B, 12 managers, 40 senior staff auditors, and 50 junior staff auditors are required for the next year. Here, 1.965 Partner A means fewer than 2 Partner A positions. If the firm has partially retiring partners, it can hire part-time partners. However, if this is not possible, it has to use integer LP to obtain integer solutions. This solution indicates that the firm needs to hire fewer management personnel and fewer junior staff auditors for the upcoming year. Finally, γ1 = 0 and γ2 = 1. This result implies that the managing partner's opinion dominates the personnel partner's opinion. If we force γ1 ≥ 0.3, the satisfaction rate (β) is 97% and
Partner A is 1.889. Other values are the same. Here γ1 ≥ 0.3 means that the managing partner's opinion still dominates the personnel partner's opinion. If γ1 ≥ 0.5, the satisfaction rate decreases to 95.7% and Partner A is 1.839. This case implies that the personnel partner's opinion is stronger than the managing partner's opinion. However, the solutions are quite similar in this numerical example because the constraint level is narrowly defined.

The fuzzy set approach for the audit staff planning problem discussed in this paper has some managerial implications. First, the model integrates multiple objectives. In this model, multiple constraint levels are also allowed to incorporate each DM's preference for new hourly billing rates. Second, the fuzzy set approach facilitates each decision maker's participation to avoid suboptimization of the overall firm's goals. Third, the fuzzy set approach is straightforward and easy to solve using currently available computer software. Fourth, the fuzzy set approach is flexible enough to easily add more objectives or constraints according to the changing environment. Fifth, the fuzzy set approach allows a group decision-making process, which is quite common in today's business environment, by allowing multiple right-hand side values. Finally, the fuzzy set approach allows sensitivity analysis of the opinions of the decision makers, as we have illustrated in the solutions of the numerical example.
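A quick arithmetic check of the Step 2 program above: after clearing denominators, each constraint β ≤ (f(x)_i − L_i)/(U_i − L_i) becomes f(x)_i − (U_i − L_i)β ≥ L_i, so the coefficient on β must equal U_i − L_i. The snippet below (a verification aid in Python, not part of the original study) reproduces the three β coefficients shown in the Step 2 program from the Step 1 bounds.

# Check that each Step 2 constraint of the audit-staffing model has the
# form f_i(x) - (U_i - L_i)*beta >= L_i, using the Step 1 bounds.
U = [1_109_340, 8_902_440, 5.000]
L = [0, -2_760_360, -13.768]
for i, (u, l) in enumerate(zip(U, L), start=1):
    print(f"objective {i}: beta coefficient U - L = {u - l:,}")
# objective 1: 1,109,340; objective 2: 11,662,800; objective 3: 18.768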
87.4 A Fuzzy Set Approach to Accounting Information System Selection

Formal planning for accounting information system selection in a firm is very important because it enables the firm to function more efficiently and effectively, given the current dynamic information technological environment. If a firm decides to have a formal planning model, the model should first incorporate the economic and professional objectives of its users. In addition, strategic planning is another important factor in information system management (Brancheau and Wetherbe 1987). However, information system managers, corporate officers, and users may have different perspectives on their accounting information system (Buss 1983). Therefore, accounting information system selection for a firm is a complex task, and multiple goals are necessary in today's highly competitive business environment. Kwak (2006) proposed a fuzzy set approach to this accounting information system selection problem with multiple objectives and multiple decision makers or users of the firm to incorporate multiple constraints.

To mathematically formulate the multiple objective accounting information system selection model, let i be the information system in a firm under consideration. Define x_i as the optimum decision variables for an accounting information system, i = 1, …, t. For the coefficients of the objectives, let p_i be the cost savings of information processing, q_i the score of decision-making flexibility, and r_i the ranking of the improved planning process. For the coefficients of the constraints, let a_i be the cash flow availability for investment, b_i the system analyst availability, c_i the programmer availability, d_i the CPU time availability, and e_i the resource availability believed by the kth manager for the firm. Here, we allow multiple constraint levels for the cash flow available for the investment in a new accounting information system, as believed by different managers. Generally, we assume that at least two different types of information system managers get involved in the accounting information system selection problem. Then, the multiple objective accounting information system selection model is:

Max Σ_{i=1}^{t} p_i x_i
Max Σ_{i=1}^{t} q_i x_i
Max Σ_{i=1}^{t} r_i x_i

subject to:

Σ_{i=1}^{t} a_i x_i ≤ e_1γ_1 + … + e_tγ_k
Σ_{i=1}^{t} b_i x_i ≤ e_1γ_1 + … + e_tγ_k
Σ_{i=1}^{t} c_i x_i ≤ e_1γ_1 + … + e_tγ_k
Σ_{i=1}^{t} d_i x_i ≤ e_1γ_1 + … + e_tγ_k
Σ_{j=1}^{k} γ_j = 1, γ_j ≥ 0,
x_i ≥ 0, i = 1, …, t; j = 1, …, k,

where t is the number of available decision variables and k is the number of constraint levels. Then, the fuzzy set approach for an accounting information system selection problem is:

Step 1: Given the proposed model in this paper, we may use any available computer software to solve the following series of LP problems. For illustration purposes, assume t = 6 and k = 2:

(i) Max (and Min) Σ_{i=1}^{6} p_i x_i

subject to:

Σ_{i=1}^{6} a_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{i=1}^{6} b_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{i=1}^{6} c_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{i=1}^{6} d_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{j=1}^{2} γ_j = 1, γ_j ≥ 0,
and x_i ≥ 0, for i = 1, …, 6.        (87.13)

(ii) Max (and Min) Σ_{i=1}^{6} q_i x_i

subject to: Equation (87.13).

(iii) Max (and Min) Σ_{i=1}^{6} r_i x_i

subject to: Equation (87.13).

Group (i) of the above LP problems has two types of optimization: one is maximization and the other is minimization. Groups (ii) and (iii) contribute four more problems, two each. All of them use the same constraints (Equation (87.13)). The total number of problems is six. We define the objective value of the maximization problems by U_i as the upper bound of decision makers' (DMs') acceptability for the objective payoff, and the objective value of the minimization problems by L_i as the lower bound of DMs' acceptability for the objective payoff, i = 1, 2, 3.

Step 2: Let f(x)_i, i = 1, 2, 3, be the objective functions in Step 1. Then, we solve the following linear program:

Max β        (87.14)

subject to:

β ≤ (f(x)_i − L_i)/(U_i − L_i), i = 1, 2, 3,        (87.15)
β ≥ 0,
Σ_{i=1}^{6} a_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{i=1}^{6} b_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{i=1}^{6} c_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{i=1}^{6} d_i x_i ≤ (e_1γ_1 + … + e_6γ_2)
Σ_{j=1}^{2} γ_j = 1, γ_j ≥ 0,
and x_i ≥ 0, for i = 1, …, 6.

The resulting values of x_i are the fuzzy optimal solution and the value of β is the satisficing level of the solution for the DMs.

A numerical example, which is similar to Santhanam et al. (1989), is developed to present the fuzzy set approach and its solution method. The proposed model in this paper is adopted to compare the fuzzy set approach with the goal programming approach. In the case of our fuzzy set approach, we do not have to specify the priority or preference of goals in advance. In addition, we also allow more than one decision maker's estimated value, such as the amount of investment available for the new accounting system, as the right-hand side constraint value. The firm wishes to achieve the following goals: to reduce information processing costs, to increase flexibility in decision making, and to improve the planning process. The total allocated budget for the information system selection is $100,000. Table 87.7 contains the information concerning each accounting system's contribution towards the goals. Table 87.8 contains projected information on resource utilization.
Table 87.7 Accounting information system contribution towards goals

System #   Cost savings in $   Decision-making flexibility (score)   Improving planning process (rank)
1          20,000              5                                     3
2          50,000              5                                     1
3          10,000              4                                     2
4          30,000              6                                     5
5          25,000              2                                     4
6          15,000              5                                     6

Table 87.8 Accounting information system resource utilization

System #                Investment in dollars   Number of analysts   Number of programmers   CPU time required
1                       15,000                  2                    4                       10,000
2                       80,000                  5                    6                       40,000
3                       40,000                  1                    3                       30,000
4                       30,000                  4                    1                       10,000
5                       25,000                  3                    3                       30,000
6                       30,000                  2                    4                       20,000
Resource availability   100,000                 6                    10                      100,000

Analysis: The formulation of the fuzzy set model variables is as follows: X_i = project number i. Objectives and constraints for the numerical example are as follows. First, the objective of reducing information processing costs can be expressed as:

Max (Min) Z = 20,000X1 + 50,000X2 + 10,000X3 + 30,000X4 + 25,000X5 + 15,000X6.
It is desirable to increase flexibility in decision making, and the second objective can be expressed as:

Max (Min) Z = 5X1 + 5X2 + 4X3 + 6X4 + 2X5 + 5X6.

It is also desirable to improve the planning process, and this third objective can be expressed as:

Max (Min) Z = 3X1 + 1X2 + 2X3 + 5X4 + 4X5 + 6X6.

The following are the constraints:

(i) Investment cash flow availability. The constraints on the investment cash flow available for the accounting information system selection (from Table 87.8) can be expressed as:

15,000X1 + 80,000X2 + 40,000X3 + 30,000X4 + 25,000X5 + 30,000X6 ≤ 100,000.

Here the maximum available cash flow for the accounting information system project is $100,000. One important characteristic of the fuzzy set approach is that it allows more than one right-hand side value. For example, the president thinks that the cash flow available for Project 6 is $40,000, whereas the controller thinks that it is $30,000. For illustration purposes, only multiple cash flow availability is allowed. To incorporate the multiple criteria (multiple right-hand side values), the constraint for the two different amounts of cash flow availability can be expressed as:

X6 − 30,000γ1 − 40,000γ2 ≤ 0.

(ii) Number of system analysts available. The constraint for the number of system analysts available can be expressed as:

2X1 + 5X2 + 1X3 + 4X4 + 3X5 + 2X6 ≤ 6.

Here the maximum availability of analysts is six.

(iii) Number of programmers available. The maximum number of programmers available for this year is ten. This constraint can be expressed as:

4X1 + 6X2 + 3X3 + 1X4 + 3X5 + 4X6 ≤ 10.

(iv) CPU time required. The CPU time required for each accounting information system project development is given in Table 87.8. The maximum available CPU time for this year is 100,000 s. This constraint can be expressed as:

10,000X1 + 40,000X2 + 30,000X3 + 10,000X4 + 30,000X5 + 20,000X6 ≤ 100,000.
Finally, Project 6 is top management's priority to establish a new branch, and this constraint can be expressed as X6 = 1.

A Fuzzy Solution:

Step 1. The LINDO (Schrage 1991) computer software was used to solve three maximization problems and three minimization problems for the three objectives, respectively. The results are: U1 = 55,000; U2 = 15; U3 = 12.154; and L1 = 15,000; L2 = 5; L3 = 6.

Step 2. Let β be the satisficing level of the fuzzy optimal solution. Solving this problem, we obtain β = 0.92, which implies that 92% of the objectives are satisfied under the current constraints. The satisfaction rate is high for this numerical example because the problem is simple and well defined. The current solution shows that Projects 1 and 6 should be selected. The original GP solution selected Projects 3, 5, and 6. The difference arises because GP sets priorities among its goals: general GP solutions try to satisfy the first goal as much as possible, then the second goal, and so on. In addition, the first goal of the original GP example tries to reduce information processing costs by $50,000. That is the reason why we reached a different solution here. Finally, γ1 = 0 and γ2 = 1. This result implies that the controller's opinion dominates the president's opinion. If we force γ1 ≥ 0.3, the satisfaction rate (β) is 92% and the other values are the same. Here γ1 ≥ 0.3 means that the controller's opinion still dominates the president's opinion. If γ1 ≥ 0.5, the satisfaction rate is still the same. This case implies that the controller's opinion is as important as the president's opinion. However, the solutions are the same in this numerical example because the constraint level is narrowly defined.

The fuzzy set approach for the accounting information system project selection problem discussed in this paper has some managerial implications. First, the model integrates multiple objectives. In this model, multiple constraint levels are also allowed to incorporate each decision maker's preference for new cash flow availability for investments. Second, the fuzzy set approach facilitates each decision maker's participation to avoid suboptimization of a firm's overall goals, which means each decision maker can express his or her opinions. Third, the fuzzy set approach is flexible enough to easily add more objectives or constraints according to the changing environment. Finally, the fuzzy set approach allows a group decision-making process, which is quite common in today's business environment, by allowing multiple right-hand side values.
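The accounting information system example is small enough to reproduce with an off-the-shelf solver. The sketch below encodes the Table 87.7 objective coefficients and the Table 87.8 resource constraints with a single $100,000 budget level, X6 fixed at 1, and 0 ≤ X_i ≤ 1 in place of the binary restriction, so it computes LP-relaxation bounds. Under these simplifications the cost-savings bounds come out at the reported U1 = 55,000 and L1 = 15,000, while the other bounds need not match the LINDO figures above (this is a hedged sketch, not the original LINDO formulation).

# LP-relaxation bounds for the accounting information system example
# (Tables 87.7 and 87.8). Single constraint level; X6 fixed at 1;
# 0 <= Xi <= 1 instead of binary, so results are relaxation bounds.
import numpy as np
from scipy.optimize import linprog

objectives = {
    "cost savings":  [20_000, 50_000, 10_000, 30_000, 25_000, 15_000],
    "flexibility":   [5, 5, 4, 6, 2, 5],
    "planning rank": [3, 1, 2, 5, 4, 6],
}
A = np.array([
    [15_000, 80_000, 40_000, 30_000, 25_000, 30_000],   # investment $
    [2, 5, 1, 4, 3, 2],                                 # analysts
    [4, 6, 3, 1, 3, 4],                                 # programmers
    [10_000, 40_000, 30_000, 10_000, 30_000, 20_000],   # CPU seconds
])
b = np.array([100_000, 6, 10, 100_000])
bounds = [(0, 1)] * 5 + [(1, 1)]        # project 6 is mandated (X6 = 1)

for name, coef in objectives.items():
    c = np.array(coef, dtype=float)
    hi = -linprog(-c, A_ub=A, b_ub=b, bounds=bounds).fun   # U_i
    lo = linprog(c, A_ub=A, b_ub=b, bounds=bounds).fun     # L_i
    print(f"{name}: U = {hi:,.3f}, L = {lo:,.3f}")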
87.5 Fuzzy Set Formulation to Capital Budgeting

Capital budgeting problems have drawn a great deal of attention from both researchers and practitioners (e.g., see Liberatore et al. 1992). While practical techniques such as payback or the accounting rate of return have been widely used, discounted cash flow (DCF) methods, including net present value (NPV) and internal rate of return (IRR), are the primary quantitative methods in capital budgeting (Kim and Farragher 1981). The payback method estimates how long it will take to recover the original investment. However, this method incorporates neither the cash flows after the payback period nor the variability of those cash flows (Boardman et al. 1982). The accounting rate of return method measures a return on the original cost of the investment. Both of the above methods ignore the time value of money. In addition, DCF methods may not be adequate to evaluate long-term investments because of a bias in favor of short-term investments with quantifiable benefits (Mensah and Miranti 1989).

Researchers have suggested a number of capital budgeting methods utilizing mathematical programming techniques to improve effectiveness in the evaluation and control of large capital projects over several periods (e.g., Baumol and Quandt 1965; Bernard 1969; Hillier 1969; Howe and Patterson 1985; Pike 1983, 1988; Turney 1990; and Weingartner 1963). The primary goal of a typical mathematical capital budgeting approach is to maximize DCFs that measure a project's desirability on the basis of its expected net present value. This kind of single criterion analysis, however, ignores strategic criteria such as future growth opportunities (Cheng 1993). Furthermore, if operating conditions change, management may change their plans; for example, they may change input and output mixes or abandon the project in a multiperiod situation. The ever-increasing involvement of stakeholders, other than shareholders, in a business organization supports a multiple objective approach (Bhaskar 1979). Other empirical studies have also found that firms use multiple criteria in their capital budgeting decisions (e.g., Bhaskar and McNamee 1983 and Thanassoulis 1985).

The goal programming approach has been widely used to tackle multiple objective problems (e.g., Bhaskar 1979; Choi and Levary 1989; and Santhanam et al. 1989). It emphasizes the weights of importance of the multiple objectives with respect to the decision maker's (DM's) preference. As an alternative methodology, Kwak et al. (2002) presented a fuzzy approach to solving MCMDM capital budgeting problems.
This approach is based on two fundamental human cognitive processes: (1) all DMs involved in the capital budgeting decision making have goal-setting and compromise behavior in seeking multiple criteria, and (2) each DM has a preference for the budget availability level. The solution procedure systematically identifies a fuzzy optimal selection of possible projects that not only reaches the best compromise value for the multiple criteria, but also uses the best budget availability level according to the multiple decision makers' preferences. In comparison with the AHP approach of Kwak et al. (1996), the fuzzy approach does not have to elicit weights of importance for either the multiple criteria or the multiple DMs' preferences. Furthermore, the fuzzy approach is easier for a firm to apply, because it can be implemented by using any available commercial computer software that solves mixed integer programming problems.

Note that even though this MCMDM capital budgeting model includes only four specific groups of objectives for the sake of clear presentation, the modeling process adapts to all capital budgeting problems with multiple criteria and multiple constraint levels. For instance, we can consider an uncertainty (risk) factor of an investment. If the objective functions are expressed in cash flows, then the risk factor involved can be estimated by the standard deviation of net cash flows, as Hillier (1963) suggested. This risk factor can be one of the objective functions in the modeling process.

Let i be the time period for the firm to consider, and j be the index number of the projects that the firm can select. We define x_j as the jth project that can be selected by the firm, j = 1, …, t. For the coefficients of the objectives, let v_j be the net present value generated by the jth project; g_ij be the net sales increase generated by the jth project in the ith time period to measure sales growth, i = 1, …, s; r_ij be the ROI generated by the jth project in the ith time period; and l_ij be the firm's equity-to-debt ratio in the ith time period after the jth project is selected, to measure leverage. Here, the optimum debt-to-equity ratio for the firm is assumed to be 1, and the inverse of the debt-to-equity ratio is used to measure leverage. For the coefficients of the constraints, let b_ij be the cash outlay required by the jth project in the ith time period. For the coefficients of the constraint levels, let c_ki be the budget availability level believed by the kth manager (or executive) for the firm in the ith time period, k = 1, …, u. In this paper, for illustration, if we suppose t = 10, s = 5, and u = 5, then the model is:

Max Σ_{j=1}^{10} v_j x_j
Max Σ_{j=1}^{10} g_ij x_j, i = 1, …, 5
Max Σ_{j=1}^{10} r_ij x_j, i = 1, …, 5
Max Σ_{j=1}^{10} l_ij x_j, i = 1, …, 5        (87.16)

subject to:

Σ_{j=1}^{10} b_ij x_j ≤ (c_1iγ_1 + … + c_5iγ_5), i = 1, …, 5,        (87.17)

and x_j = {0, 1}, where k = 1, …, 5 for c_ki represents the five possible DMs (i.e., president, controller, production manager, marketing manager, and engineering manager). The weights of the 16 total objectives can be expressed by (λ_1, …, λ_16) with Σλ_q = 1, 0 < λ_q < 1, q = 1, …, 16, and the weights of the five DMs can be expressed by (γ_1, …, γ_5) with Σγ_p = 1, 0 < γ_p < 1, p = 1, …, 5. Based on Equations (87.16) and (87.17), we formulate a MCMDM capital budgeting model with an optimal fuzzy (satisficing) selection of investment projects as follows:

Step 1: Given the model (Equations (87.16) and (87.17)), we use any available computer software, such as LINDO, to solve the following IP programs:

(i) Max (and Min) Σ_{j=1}^{10} v_j x_j        (87.18)

subject to:

Σ_{j=1}^{10} b_ij x_j ≤ (c_1iγ_1 + … + c_5iγ_5), i = 1, …, 5,
Σ_{p=1}^{5} γ_p = 1, γ_p ≥ 0, p = 1, …, 5,
and x_j = {0, 1}, for j = 1, …, 10.        (87.19)

(ii) Max (and Min) Σ_{j=1}^{10} g_ij x_j, i = 1, …, 5        (87.20)

subject to: Equation (87.19).

(iii) Max (and Min) Σ_{j=1}^{10} r_ij x_j, i = 1, …, 5        (87.21)

subject to: Equation (87.19).

(iv) Max (and Min) Σ_{j=1}^{10} l_ij x_j, i = 1, …, 5        (87.22)
subject to: Equation (87.19).

Group (i) of the above IP problems has two types of optimization: one is maximization and the other is minimization. Groups (ii), (iii), and (iv) have ten problems each. All of them use the same constraints (Equation (87.19)). The total number of problems is 32. We define the objective value of the maximization problems by U_i as the upper bound of DMs' acceptability for the objective payoff, and the objective value of the minimization problems by L_i as the lower bound of DMs' acceptability for the objective payoff, i = 1, …, 16.

Step 2: Let f(x)_i, i = 1, …, 16, be the objective functions in Step 1. Then, we solve the following mixed-integer program:
Max β        (87.23)

subject to:

β ≤ (f(x)_i − L_i)/(U_i − L_i), i = 1, …, 16,        (87.24)
β ≥ 0,
Σ_{j=1}^{10} b_ij x_j ≤ (c_1iγ_1 + … + c_5iγ_5), i = 1, …, 5,
Σ_{p=1}^{5} γ_p = 1, γ_p ≥ 0, p = 1, …, 5,
and x_j = {0, 1}, for j = 1, …, 10.

The resulting values of x_j are the fuzzy optimal solution and the value of β is the satisficing level of the solution for the DMs.

Using the well-known Lorie-Savage problem (Lorie and Savage 1955), we construct a prototype of the MCMDM capital budgeting model to demonstrate the usefulness of the fuzzy optimal solution and its managerial implications for multiple criteria and multiple decision makers. We assume that the Lorie-Savage Corporation has ten projects under consideration, and all projects have 5-year time periods. The executives (president, controller, production manager, marketing manager, and engineering manager) attempt to optimize the following conflicting multiple objectives:

(i) Maximize the net present value of each project. Net present value is computed as the sum of all the discounted, estimated future cash flows, using the required rate of return, minus the initial investment.
(ii) Maximize the growth of the firm due to each project. Growth can be measured by the net sales increase of each project. This measurement implies a strategic success factor of a firm.
(iii) Maximize the profitability of the company. Profitability can be measured by the ROI for each time period generated by each project.
(iv) Maximize the flexibility of financing. The flexibility of financing (leverage) can be measured by the debt-to-equity ratio. If a firm's debt-to-equity ratio is higher than the industry average or the optimum level, the firm's cost of debt financing becomes more expensive. In this example, we use the inverse of the debt-to-equity ratio and assume 1 as the optimum debt-to-equity ratio.

The data related to these objectives are given in Table 87.9. In this example, all projects are assumed to be independent. In Table 87.9, the firm's cost of capital is assumed to be known a priori and to be independent of the investment decisions. Based on these assumptions, the net present value of each project can be defined as the sum of the cash flows discounted by the cost of capital. Cash outlay is the amount of expenditure required for project j, j = 1, 2, 3, …, 10, in each time period.
Table 87.9 Net present value for the Lorie-Savage Corporation (in millions)

                          Net cash outlays for each period
Project   Present value   Period 1   Period 2   Period 3   Period 4   Period 5
1         $20             $24        $16        $6         $6         $8
2         16              12         10         2          5          4
3         11              8          6          6          6          4
4         4               6          4          7          5          3
5         4               1          6          9          2          3
6         18              18         18         20         15         15
7         7               13         8          10         8          8
8         19              14         8          12         10         8
9         24              16         20         24         16         16
10        4               4          6          8          6          4
Table 87.10 Net sales increase for the Lorie-Savage Corporation (in millions)

          Net sales increase for each project
Project   Period 1   Period 2   Period 3   Period 4   Period 5
1         $120       $130       $145       $150       $170
2         100        120        140        150        160
3         80         90         95         95         100
4         40         50         50         55         60
5         40         45         50         55         60
6         110        120        140        150        165
7         60         70         65         70         80
8         110        120        100        110        120
9         150        170        180        190        200
10        35         40         40         50         50

Table 87.11 Return on investment for the Lorie-Savage Corporation (in percentage)

          Return on investment for each project
Project   Period 1   Period 2   Period 3   Period 4   Period 5
1         10         12         14         15         17
2         10         12         18         16         17
3         12         15         15         15         18
4         8          15         10         8          12
5         15         10         8          20         18
6         12         12         10         15         15
7         8          12         10         12         12
8         14         16         13         15         16
9         12         10         9          12         12
10        10         8          9          8          12
To measure growth, the net sales increase during each time period for each project is estimated. These data are given in Table 87.10. To measure the profitability of each project, ROI is estimated after reflecting additional income and capital expenditures from each investment during each time period. These data are presented in Table 87.11. To measure leverage, the inverse of the debt-to-equity ratio is used. Here, the optimum debt-to-equity ratio is assumed to be 1 for the Lorie-Savage Corporation. These data are displayed in Table 87.12. The five key executives, i.e., the DMs, in this company (president, controller, production manager, marketing manager, and engineering manager) have different beliefs regarding resource availability. In addition, each DM has a different opinion on the budget availability level. Of course, the president may make the final decision based upon a compromise of the executives' opinions. However, a process of collecting DMs' preferences should improve the quality of the final decision. Budget availability level data are given in Table 87.13. Employing the fuzzy solution method, we solve this problem as follows. Let x_j be the index of the jth project: if the project is selected, x_j = 1; otherwise x_j = 0, for j = 1, …, 10. Then:
Step 1. We use the LINDO (Schrage 1991) computer software to solve 16 maximization problems and 16 minimization problems for the 16 objectives, respectively. The results are: U1 = 55; U2 = 350; U3 = 390; U4 = 430; U5 = 450; U6 = 490; U7 = 61; U8 = 68; U9 = 58; U10 = 74; U11 = 75; U12 = 4.78; U13 = 4.83; U14 = 4.82; U15 = 4.81; and U16 = 4.79; and L_i = 0, for i = 1, …, 16.

Step 2. Let β be the satisficing level of the fuzzy optimal solution for the DMs. Then, we have a mixed IP problem (see Kwak et al. 2002 for the details). Here, to guarantee that the best compromise of budget availability levels reflects each participating executive's (DM's) preference, we impose the constraint parameter of budget availability γ_p ≥ 0.1, for p = 1, …, 5. The fuzzy optimal solution for this numerical example is then to select projects 2, 3, 4, 5, and 8. The DMs' satisficing level β for all 16 objectives over the five periods is 94.67%. The best weights for the budget availability are: γ1 of the President is 0.1; γ2 of the Controller is 0.13; γ3 of the Production Manager is 0.1; γ4 of the Marketing Manager is 0.57; and γ5 of the Engineering Manager is 0.1.
Table 87.12 Debt to equity for the Lorie-Savage Corporation

          Inverse of debt to equity ratio
Project   Period 1   Period 2   Period 3   Period 4   Period 5
1         0.85       0.90       0.98       0.95       0.95
2         0.90       0.98       0.95       0.95       0.96
3         0.96       0.97       0.98       0.92       0.92
4         0.98       0.95       0.96       0.92       0.95
5         0.90       0.95       0.95       0.98       0.95
6         0.90       0.95       0.94       0.95       0.95
7         0.96       0.98       0.98       0.98       0.98
8         0.96       0.95       0.90       0.92       0.95
9         0.90       0.88       0.85       0.95       0.95
10        0.98       0.95       0.95       0.98       0.95

Table 87.13 Budget availability level for the Lorie-Savage Corporation (in millions)

                       Estimated budget availability
Decision maker         Period 1   Period 2   Period 3   Period 4   Period 5
President              $50        $30        $30        $35        $30
Controller             40         45         30         30         20
Production manager     55         40         20         30         35
Marketing manager      45         30         40         45         30
Engineering manager    50         40         45         30         35

This result implies that the Marketing Manager's opinion is more important than the others'. According to this information and Table 87.13, we see that the best compromise of these DMs' budget availability levels is $46.35 million (= 50(0.1) + 40(0.13) + 55(0.1) + 45(0.57) + 50(0.1)) for Period 1; $33.95 million for Period 2; $36.2 million for Period 3; $39.05 million for Period 4; and $29.7 million for Period 5. Thus, this solution reflects the most satisficing compromise for the multiple objectives and satisfies the best compromise of budget availability levels based on the multiple DMs' preferences.

Now we use the data of this example to conduct a comparison study of four different approaches to capital budgeting problems: (1) linear programming, (2) goal programming, (3) MCMDM with the analytic hierarchy process (AHP), and (4) MCMDM with fuzzy optimal selection. The computational results are summarized in Table 87.14. In the linear programming approach, we separately maximize four objectives: (1) the net present value (NPV), (2) the growth of the firm (Growth), (3) the profitability of the firm (Profit), and (4) the flexibility of financing (Finance). The president's budget availability from Table 87.13 is used for all four problems. The weights of 5, 4, 3, 2, and 1 have been applied to Periods 1-5, respectively, in aggregating the objective for problems (ii), (iii), and (iv). In the goal programming approach, in addition to the use of the president's budget availability, the 16 goal values have been set up to convert the objectives into goal deviation constraints. The weights of 4, 3, 2, and 1 are used in the minimization of the
goal deviations for the objectives NPV, Growth, Profit, and Finance, respectively. In the MCMDM with AHP of Kwak et al. (1996), AHP was used to identify first the weights of importance for the 16 objectives, and then for the five decision makers. Since this approach has the MC2 framework, the multiple criteria approach is a special case of it. In the fuzzy MCMDM approach, without assigning or identifying any weights, the solution procedure systematically produces a fuzzy optimal selection of projects with the satisficing level.

Comparing the fuzzy model suggested in this paper with other existing methods, the fuzzy method has some important managerial implications. First, like the MCMDM with AHP of Kwak et al. (1996), the model integrates the multiple objective capital budgeting problem with multiple DMs. The most common problems in a capital investment decision-making situation are (1) how to estimate cash flows and (2) how to incorporate the uncertainty factors of decision makers (Hillier 1963; Lee 1993; Ouederni and Sullivan 1991). In our model, we allow multiple constraint levels to incorporate each DM's preference of budget availability. This improvement in modeling can minimize the uncertainty of determining the level of budget availability. Neither linear programming nor goal programming models can overcome these modeling difficulties. Second, our model can be applied to solve capital budgeting problems not only with the representation of a complex structure, but also with motivational implications. Our method facilitates DMs' participation to avoid
Table 87.14 Comparison of optimal solutions for four approaches. For each approach (linear programming maximizing NPV, Growth, Profit, and Finance separately; goal programming; MCMDM-AHP; and MCMDM-Fuzzy), a "+" indicates that the corresponding project among x1-x10 is chosen; the MCMDM-Fuzzy approach chooses projects 2, 3, 4, 5, and 8.
suboptimization of overall company goals (Locke et al. 1981). By allowing DMs to express their preference structures in a systematic fashion, this model fosters more autonomous control than any other mathematical programming model. The model also aids the coordination and synthesis of conflicting multiple objectives that are usually measured differently by a number of decision participants. Third, employing the fuzzy approach of Liu and Shi (1994) dramatically reduces the solution complexity of the resulting MC2 IP program. Solving MC2 IP problems is very complex; real-life-size MC2 IP problems are intractable with current computational technology, and computer software for solving MC2 IP problems is still under development (Shi and Lee 1997). Furthermore, this approach to capital budgeting is likely to outperform the AHP approach of Kwak et al. (1996) in the sense that the former systematically finds a satisficing (fuzzy) solution without eliciting trade-offs among objectives and/or DMs' preferences. Practically speaking, the fuzzy approach is more straightforward and thus consumes less time on trade-offs than the MCMDM with AHP of Kwak et al. (1996). However, both approaches share the advantage of using currently available computer software. Finally, the framework of the MCMDM capital budgeting problem is highly flexible. The DMs can reach an agreement interactively. For example, an alternative budgeting policy can be obtained by providing a different preference matrix of objectives and/or resource availability levels.
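For readers who want to experiment with the Lorie-Savage data, problems of this size are easy for modern mixed-integer solvers. The sketch below is a deliberately reduced version, a single NPV objective with one fixed budget level per period (the president's column of Table 87.13) rather than the full 16-objective MC2 IP with γ weights, just to show the 0-1 mechanics in Python.

# 0-1 selection of Lorie-Savage projects: maximize present value subject
# to per-period budget limits (Table 87.9 outlays; the president's budget
# levels from Table 87.13). Single-objective sketch, not the MC2 IP.
import numpy as np
from scipy.optimize import linprog

pv = np.array([20, 16, 11, 4, 4, 18, 7, 19, 24, 4], dtype=float)
outlay = np.array([   # rows: periods 1-5; columns: projects 1-10
    [24, 12, 8, 6, 1, 18, 13, 14, 16, 4],
    [16, 10, 6, 4, 6, 18, 8, 8, 20, 6],
    [6, 2, 6, 7, 9, 20, 10, 12, 24, 8],
    [6, 5, 6, 5, 2, 15, 8, 10, 16, 6],
    [8, 4, 4, 3, 3, 15, 8, 8, 16, 4],
], dtype=float)
budget = np.array([50, 30, 30, 35, 30], dtype=float)  # president's levels

res = linprog(-pv, A_ub=outlay, b_ub=budget,
              bounds=[(0, 1)] * 10,
              integrality=np.ones(10))   # x_j in {0, 1}; needs SciPy >= 1.9
chosen = [j + 1 for j in range(10) if res.x[j] > 0.5]
print("projects:", chosen, "total PV:", -res.fun)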
87.6 A Data Mining Approach to Firm Bankruptcy Predictions

Data mining is a process of discovering interesting patterns in databases that are useful in business decision-making processes (see Bose and Mahapatra 2001). With today's powerful computing capabilities, we expect to see more applications of data mining in the business world. This new tool is widely used in finance and marketing applications such as forecasting stock prices, loan default prediction, customer buying behavior analysis, product promotion analysis, and online sales support. In addition, many applications have recently appeared in accounting, where accountants deal with large amounts of computerized operational and financial data and where the potential payoffs from data mining are high. We can apply approaches similar to those used in finance or marketing studies to accounting problems such as fraud detection, detection of tax evasion, and audit planning for financially distressed firms. The purpose of this section is to review a multiple criteria linear programming (MCLP) approach to data mining for bankruptcy prediction in different countries (Kwak et al. 2006a; Kwak et al. 2006b). Data mining is a tool that can be used to analyze data to generate descriptive or predictive models to solve business problems; see Chan and Lewis (2002), Witten and Frank (2000), and Berry and Linoff (1997). According to Shi et al. (2002), data mining can also be classified based on its process, methodology, and mathematical tools. From the process view, data mining consists of the following stages: (1) cleaning, (2) integration, (3) selection, (4) transformation, (5) data mining processes, and (6) pattern evaluation or interpretation (see Han and Kamber 2001). Data cleaning is required to remove noise and inconsistent data. Data integration may be needed where multiple data sources are used. Selecting and transforming data for achieving business goals is known as data warehousing. Pattern evaluation or interpretation provides the analysis of mined data for use in business decision making by managers. From the methodological view, data mining can be achieved by association, classification, clustering, and prediction. From the mathematical point of view, the algorithms of data mining can be implemented by binary or induction decision trees, statistics, neural networks, and so forth (see Shi et al. 2002). The goal of a predictive model like our bankruptcy prediction study is to predict future outcomes based on past data. Bankruptcy prediction is an interesting topic because investors, lenders, auditors, management, unions, employees, and the general public are interested in how bankrupt firms
proceed, especially after hearing news about the Enron, WorldCom, and United Airlines bankruptcy filings. Wilson and Sharda (1994) used the neural networks approach and Sung et al. (1999) used the decision tree approach for bankruptcy prediction, but our study is the first to use the MCLP approach in data mining for bankruptcy prediction. In recent years, some attempts have been made to predict bankruptcy using computer immunology (see Cheh 2002). Although it is quite promising, the science of computer immunology is still in its infancy. Furthermore, the application of computer immunology to bankruptcy prediction has been far more challenging than its applications in computer security (see D'haeseleer et al. 1996; Hofmeyr and Forrest 2000). The challenges may come from creating an optimal cost-benefit trade-off between the benefits of quantifying or discretizing financial data and the cost of information loss in the process of granulating financial data for detecting bankruptcy. In computer security detection, the discretization problem may be less severe because of the binary nature of the computer viruses or security breaches with which the detectors deal, whereas financial data are decimal. Meanwhile, a multiple criteria linear programming data mining approach has recently been applied to credit card portfolio management. This approach has proven to be robust and powerful even for a large sample drawn from a huge financial database. The results of the MCLP approach in our bankruptcy prediction study are promising, as this approach performs better than traditional multiple discriminant analysis or logistic regression analysis using financial data. From the aspect of mathematical tools of data mining, the algorithms can be implemented by many different types of mathematical techniques. For example, Shi et al. (2002) state that classification or prediction methods can be constructed by decision trees, statistics, and neural networks, as previous bankruptcy studies have done. Based on Freed and Glover's (1986) linear programming (LP) model, Kou et al. (2003) developed a general formulation of data classification using multiple criteria linear programming (MCLP). The basic concept of the formulation can be shown as follows. Given a set of r variables about the bankrupt or non-bankrupt sample firms in the database, a = (a1, …, ar), let Ai = (Ai1, …, Air) ∈ R^r be the sample observations of data for the variables, where i = 1, …, n and n is the sample size. We want to determine the coefficients of the variables, denoted by X = (x1, …, xr)^T, and a boundary value b to separate two classes, B (Bankrupt) and N (Non-bankrupt); that is,

Ai X ≤ b, Ai ∈ B (Bankrupt) and Ai X > b, Ai ∈ N (Non-bankrupt)
(87.25)
Fig. 87.1 A two-class MCLP model (Bankrupt cases lie on the side Ai X ≤ b and Non-bankrupt cases on the side Ai X > b of the boundary Ai X = b, with adjusted boundaries Ai X = b − α and Ai X = b + α)
Consider now two kinds of measurements for better separation of Bankrupt and Non-bankrupt firms. Let αi be the overlapping degree with respect to Ai, and βi be the distance from Ai to its adjusted boundary. In addition, we define α to be the maximum overlapping of the two-class boundary for all cases Ai (αi < α) and β to be the minimum distance of all cases Ai from their adjusted boundaries (βi > β). Marking the Ai of the Non-bankrupt class and the Ai of the Bankrupt class with different symbols, Fig. 87.1 shows that our goal is to minimize the sum of the αi and maximize the sum of the βi simultaneously. By adding αi into (87.25), we have:

Ai X ≤ b + αi, Ai ∈ B and Ai X > b − αi, Ai ∈ N
(87.26)
However, by considering βi, we can rewrite (87.26) as

Ai X = b + αi − βi, Ai ∈ B and Ai X = b − αi + βi, Ai ∈ N  (87.27)

Our two-criterion linear programming model is stated as

Minimize Σi αi and Maximize Σi βi  (87.28)
subject to

Ai X = b + αi − βi, Ai ∈ B,
Ai X = b − αi + βi, Ai ∈ N,

where the Ai are given, X and b are unrestricted, and αi, βi ≥ 0 (see Shi et al. 2001, p. 429).

The above MCLP model can be solved in many different ways. One method is to use the compromise solution approach, as in Shi and Yu (1989) and Shi (2001), to reform model (87.28) by systematically identifying the best trade-offs between Σi αi and Σi βi. To show this graphically, we assume the ideal value of −Σi αi is α* > 0 and the ideal value of Σi βi is β* > 0. Then, if −Σi αi > α*, we define the regret measure as −dα+ = α* + Σi αi; otherwise, it is 0. If −Σi αi < α*, the regret measure is defined as dα− = α* + Σi αi; otherwise, it is 0. Thus, we have (i) α* + Σi αi = dα− − dα+, (ii) |α* + Σi αi| = dα− + dα+, and (iii) dα−, dα+ ≥ 0. Similarly, we derive β* − Σi βi = dβ− − dβ+, |β* − Σi βi| = dβ− + dβ+, and dβ−, dβ+ ≥ 0 (see Fig. 87.2).

Fig. 87.2 Compromise formulation (the ideal point (α*, β*) in the (−Σi αi, Σi βi) plane, with objective Min dα− + dα+ + dβ− + dβ+)

The MCLP model has thus gradually evolved into

Minimize dα− + dα+ + dβ− + dβ+
(87.29)
subject to

α* + Σi αi = dα− − dα+,
β* − Σi βi = dβ− − dβ+,
Ai X = b + αi − βi, Ai ∈ B,
Ai X = b − αi + βi, Ai ∈ N,

where the Ai, α*, and β* are given, X and b are unrestricted, and αi, βi, dα−, dα+, dβ−, dβ+ ≥ 0.

The data separation of the MCLP classification method is determined by solving the above problem. Two versions of actual software have been developed to implement the MCLP method on large-scale databases. The first version is based on the well-known commercial SAS platform (SAS/OR 1990). In this software, SAS codes, including the SAS linear programming procedure, are used to solve model (87.29) and produce the data separation. The second version of the software is written in C++ and runs on a Linux platform (see Kou and Shi 2002). The reason for developing a Linux version of the MCLP classification software is that the majority of database vendors, such as IBM, are aggressively moving to Linux-based system development. This Linux version goes along with the trend of information technology. Because many large companies currently use a SAS system for data analysis, the SAS version is also useful for conducting data mining analysis under the SAS environment.
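To make the compromise model concrete, the following sketch sets up (87.29) as a single linear program and solves it with SciPy. It is an illustrative implementation, not the chapter's SAS or C++ software; the ideal values alpha_star and beta_star and the class coding in y are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import linprog

def mclp_two_class(A, y, alpha_star=0.1, beta_star=10.0):
    """Compromise-solution MCLP of Equation (87.29) -- an illustrative sketch.

    A: (n, r) array of observations; y: +1 for Non-bankrupt, -1 for Bankrupt.
    Variable order: X (r values), b, alpha (n), beta (n),
                    d_a_minus, d_a_plus, d_b_minus, d_b_plus.
    """
    n, r = A.shape
    nvar = r + 1 + 2 * n + 4
    c = np.zeros(nvar)
    c[-4:] = 1.0                              # minimize the four regret measures

    A_eq = np.zeros((n + 2, nvar))
    b_eq = np.zeros(n + 2)
    for i in range(n):
        A_eq[i, :r] = A[i]                    # A_i X
        A_eq[i, r] = -1.0                     # -b
        # Bankrupt:      A_i X - b - alpha_i + beta_i = 0
        # Non-bankrupt:  A_i X - b + alpha_i - beta_i = 0
        sign = 1.0 if y[i] > 0 else -1.0
        A_eq[i, r + 1 + i] = sign             # alpha_i coefficient
        A_eq[i, r + 1 + n + i] = -sign        # beta_i coefficient
    # alpha* + sum(alpha) = d_a_minus - d_a_plus
    A_eq[n, r + 1: r + 1 + n] = 1.0
    A_eq[n, -4], A_eq[n, -3] = -1.0, 1.0
    b_eq[n] = -alpha_star
    # beta* - sum(beta) = d_b_minus - d_b_plus
    A_eq[n + 1, r + 1 + n: r + 1 + 2 * n] = -1.0
    A_eq[n + 1, -2], A_eq[n + 1, -1] = -1.0, 1.0
    b_eq[n + 1] = -beta_star

    bounds = [(None, None)] * (r + 1) + [(0, None)] * (2 * n + 4)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    return res.x[:r], res.x[r]                # coefficients X and boundary b
```

A new observation z is then assigned to the Non-bankrupt class when z · X > b, in line with (87.25).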
We did empirical tests for three purposes. First, we test the effectiveness of our MCLP data mining model for Japanese bankruptcy prediction using Altman's (1968) and Ohlson's (1980) variables separately and compare the effectiveness of the two sets of predictors. We are not testing Altman's (1968) and Ohlson's (1980) models, but we are testing the effectiveness of their predictor variables in the MCLP data mining model. Second, we test the effectiveness of both sets of variables together using our MCLP data mining model to determine whether additional variables increase the model's predictive accuracy. Third, we compare our prediction rates with those from U.S. studies. Our primary comparison is with Kwak et al. (2006b) because they use the same MCLP data mining model that we do, and their sample period of 1992–1998 closely corresponds to our sample period of 1989–1999. Table 87.15 shows the descriptive statistics of predictor variables for Japanese bankrupt and non-bankrupt sample firms between 1989 and 1999. The t-values presented in Panel B are test statistics for comparing the mean values for non-bankrupt and bankrupt firms. Eight of the thirteen variables have statistically different means. Bankrupt firms are more highly levered (based on TL/TA and RE/TA) and less liquid (based on WCA/TA) than are non-bankrupt firms. Non-bankrupt firms have better operating performance than bankrupt firms based on EBIT/TA and SALE/TA, but there is no statistical difference in the means for NI/TA. This probably reflects the higher interest expense included in net income for the more highly leveraged bankrupt firms. The fact that the funds from operations-to-total liabilities (FU/TL) for bankrupt firms is negative and significantly lower than that for non-bankrupt firms also reflects the impact of higher interest costs experienced by these bankrupt firms. For the bankrupt firms, average EBIT and average NI are both negative (losses), and over 75% of these firms reported net losses during the current and preceding years (INTWO). Means for size, total current liabilities-to-total current assets (CL/CA), changes in net income (CHIN), and market value of equity-to-book value of total debt (MKV/TD) are not significantly different. The average size of the firms should be similar because we matched the control firms based on size and industry. Results for the OENEG variable are interesting. If a firm's total liabilities exceed its total assets, then OENEG equals 1; otherwise, OENEG equals 0. Panel A of Table 87.15 reports that the mean, minimum, and maximum values of OENEG for sample bankrupt firms are all 0. This means that all sample Japanese bankrupt firms' total assets exceeded total liabilities. However, the mean value of OENEG for non-bankrupt firms is 0.009, which is very small
Table 87.15 Descriptive statistics of predictor variables for Japanese bankrupt and non-bankrupt sample firms between 1989 and 1999

Panel A: Bankrupt firms
Variables   N    Mean      Std Dev   Min     Max
Size        37   0.0003    0.0004    0.00    0.00
TL/TA       37   0.7261    0.2655    0.14    0.99
WCA/TA      37   0.0096    0.2994    -0.70   0.61
CL/CA       37   1.1906    0.8712    0.22    4.17
NI/TA       37   -0.0537   0.2230    -1.33   0.07
FU/TL       37   -0.0061   0.1376    -0.70   0.21
INTWO       37   0.7568    0.4350    0.00    1.00
OENEG       37   0.0000    0.0000    0.00    0.00
CHIN        36   0.0442    0.8471    -1.00   1.00
RE/TA       37   0.0977    0.2537    -0.75   0.73
EBIT/TA     37   -0.0029   0.0496    -0.18   0.15
MKV/TD      37   1.27369   2.9669    -4.06   10.86
SALE/TA     37   0.8080    0.3784    0.19    1.56

Panel B: Non-bankrupt firms
Variables   N     Mean      Std Dev   Min     Max      t-value(a)
Size        111   0.0003    0.0004    0.00    0.00     0.48
TL/TA       111   0.6012    0.2145    0.06    1.01     2.88
WCA/TA      111   0.1019    0.1953    -0.40   0.50     1.76
CL/CA       111   0.9332    0.6476    0.12    5.40     1.65
NI/TA       111   0.0001    0.0543    -0.29   0.14     1.45
FU/TL       111   0.0821    0.2138    -0.21   1.76     2.90
INTWO       111   0.3514    0.4796    0.00    1.00     4.55
OENEG       111   0.0090    0.0949    0.00    1.00     1
CHIN        110   0.0098    0.7523    -1.00   1.00     0.23
RE/TA       111   0.2809    0.1958    -0.20   0.83     4.01
EBIT/TA     111   0.0142    0.0312    -0.13   0.12     1.97
MKV/TD      111   27.7273   96.1406   5.07    844.50   1.51
SALE/TA     111   1.0891    0.7386    0.12    6.42     3.00

Variable descriptions:
Size = total assets/gross domestic product. We used GDP instead of Ohlson's GNP price-level index because of data availability in the PACAP database
TL/TA = total liabilities/total assets (Ohlson (1980) ratio)
WCA/TA = working capital/total assets (Altman (1968) and Ohlson (1980) ratio)
CL/CA = total current liabilities/total current assets (Ohlson (1980) ratio)
NI/TA = net income/total assets (Ohlson (1980) ratio)
FU/TL = funds from operations/total liabilities (Ohlson (1980) ratio)
INTWO = 1 if net income < 0 or lag(net income) < 0; else 0 (Ohlson (1980) dummy variable)
OENEG = 1 if TL/TA > 1; else 0 (Ohlson (1980) dummy variable)
CHIN = (net income − lag(net income))/[absolute(net income) + absolute(lag net income)] (Ohlson (1980) ratio)
RE/TA = retained earnings/total assets (Altman (1968) ratio)
EBIT/TA = earnings before interest and taxes/total assets (Altman (1968) ratio)
MKV/TD = market value of equity/book value of total debt (Altman (1968) ratio)
SALE/TA = sales/total assets (Altman (1968) ratio)
*p < 0.10; **p < 0.05; ***p < 0.001
(a) t-value for testing mean differences between bankrupt and non-bankrupt firms
but reflects the fact that one of the non-bankrupt firms in the sample had total liabilities in excess of total assets. Overall, Japanese firms, even bankrupt ones, maintain assets in excess of liabilities, although their asset values may be inflated by the bubble economy of the 1990s and inflated land prices.
Table 87.16 reports our MCLP model results using Altman’s (1968) five variables (Panel A), Ohlson’s (1980) nine variables (Panel B), and the combination of both sets of variables (Panel C). These results are based on running the MCLP model for the entire sample period. We ran our MCLP
model by each year, but the results by year are not reported here because of small sample size problems.

Table 87.16 Predictability of the MCLP model using various financial predictor variables measured 1 year prior to bankruptcy for Japanese firms (37 bankrupt and 111 non-bankrupt) from 1989 to 1999

Panel A: Altman's (1968) five predictor variables
All years                 N     Number correct   Percent correct   Percent error
Type-I                    37      6               16.22             83.78
Type-II                  111    102               91.89              8.11
Overall prediction rate                           72.97%            27.03%

Panel B: Ohlson's (1980) nine predictor variables
All years                 N     Number correct   Percent correct   Percent error
Type-I                    37     26               70.27             29.73
Type-II                  111     89               80.18             19.82
Overall prediction rate                           77.70%            22.30%

Panel C: Combination of Altman's (1968) and Ohlson's (1980) predictor variables
All years                 N     Number correct   Percent correct   Percent error
Type-I                    37     19               51.35             48.65
Type-II                  111     91               81.98             18.02
Overall prediction rate                           74.32%            25.68%

Panel A shows an overall prediction (accuracy) rate of 72.97% using Altman's (1968) predictor variables. This reflects a Type I prediction rate of only 16.22%, where Type I errors are a misclassification of bankrupt firms as non-bankrupt, and Type II errors are a misclassification of non-bankrupt firms as bankrupt. In the case of bankruptcy prediction, the costs of Type I errors are more significant than the costs of Type II errors. For example, investors and creditors suffer actual economic damage from the bankrupt firms because of a Type I error, but there will be no actual damage from a Type II error. Studies using Altman's (1968) variables with U.S. data and multiple discriminant analysis reflect non-stationarity of prediction rates in different sample periods. Altman (1968) reported accuracy rates of 83.5% overall and 96% Type I, but this overall rate dropped to 78.4% with Begley et al.'s (1996) 1980s holdout sample. Grice and Ingram (2001) used a sample from 1988 to 1991 and reported overall prediction (Type I) accuracy of 56% (68%) using Altman's (1968) coefficients and 87.6% (48.6%) using newly estimated coefficients from a 1985–1987 sample. Kwak et al. (2006b) use the same MCLP approach used in this study for a 1990s sample of U.S. firms, and they document overall and Type I prediction rates of 88.56% and 46.15%, respectively, using Altman's (1968) variables. Although our overall prediction rate of approximately 73% is within the range of rates in these prior studies, our Type I prediction rate is significantly lower than that reported in any of these prior U.S. studies.
Table 87.16, Panel B reports prediction rates from our MCLP model using Ohlson's (1980) nine variables. The overall prediction rate is 77.70%, and the Type I prediction
rate is 70.27%. Both rates are higher than those rates using Altman’s (1968) variables. Although the Type II rate is lower (80.18% in Panel B versus 91.89% in Panel A), the Type I rate increased dramatically (from 16.22% in Panel A to 70.27% in Panel B). These results indicate that Ohlson’s (1980) variables have more predictive power than Altman’s (1968) variables for bankruptcy prediction in Japan during the 1990s using our MCLP model. Ohlson (1980) reported accuracy rates of 96.4% overall and 32.4% for Type I with his conditional logit model for U.S. data from 1970 to 1976. When Grice and Dugan (2001) tested Ohlson’s (1980) model (variables and coefficients) for U.S. firms in two subsequent sample periods (1988–1991 and 1992–1999), they found Type I accuracy of approximately 95% but overall accuracy of below 40%. Kwak et al. (2006b) report overall and Type I prediction rates of 86.76% and 43%, respectively, using Ohlson’s (1980) variables in their MCLP model. Our overall prediction rate of 77.70% is much higher than that documented by Grice and Dugan (2001) but somewhat lower than that reported in Kwak et al. (2006b). Our Type I rate of 70.27% is somewhat lower than Grice and Dugan’s (2001) but much higher than Kwak et al.’s (2006b). To determine whether a combined set of variables provides better bankruptcy prediction results, we applied the MCLP data mining approach using Ohlson’s (1980) and Altman’s (1968) variables together. The results, presented in Table 87.16, Panel C, are inferior to those using just Ohlson’s (1980) variables (overall (Type I) prediction rate of 74.32% (51.35%) in Panel C versus 77.70% (70.27%) in Panel B). These results support the superiority of using Ohlson’s (1980) nine variables to predict bankruptcy using Japanese financial data.
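The prediction rates reported in Table 87.16 follow from simple classification counts. A minimal sketch of the computation (the arrays are hypothetical stand-ins for model output, not data from the study):

```python
import numpy as np

def prediction_rates(actual_bankrupt, predicted_bankrupt):
    """Type I rate: share of bankrupt firms classified correctly;
    Type II rate: share of non-bankrupt firms classified correctly."""
    actual = np.asarray(actual_bankrupt, dtype=bool)
    pred = np.asarray(predicted_bankrupt, dtype=bool)
    type1 = (pred & actual).sum() / actual.sum()
    type2 = (~pred & ~actual).sum() / (~actual).sum()
    overall = (pred == actual).mean()
    return type1, type2, overall

# Panel A of Table 87.16: 6 of 37 bankrupt and 102 of 111 non-bankrupt firms
# correct gives 16.22%, 91.89%, and (6 + 102)/148 = 72.97% overall.
```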
Fig. 87.3 Cumulative distribution of bankrupt (cum_Bad) and non-bankrupt (cum_Good) firms
Figure 87.3 presents the cumulative distribution of bankrupt and non-bankrupt firms for all years using Ohlson's model; the figure visualizes the bankruptcy prediction results. As Fig. 87.3 shows, our MCLP data mining approach is stable at about 70% for the Type I prediction rate, while the Type II prediction rate is more than 80% across all years. Our study makes two important contributions. It provides evidence of the effectiveness of the MCLP data mining approach to bankruptcy prediction, and it offers a useful and practical tool for those who want to analyze the bankruptcy risk of Japanese firms. Even though Japan's economy and accounting system were undergoing changes in the early 2000s, these results from the 1990s are relevant because they examine a period after significant deregulation and diminishing main-bank controls.
87.7 Conclusion

The fuzzy set approach to the international transfer pricing, human resource allocation, accounting information system selection, and capital budgeting problems discussed in this paper has several managerial implications. First, the model integrates multiple objectives; multiple constraint levels are also allowed, to incorporate each DM's preferences (for example, regarding new hourly billing rates). Second, the fuzzy set approach facilitates each decision maker's participation to avoid suboptimization of the overall firm's goals. Third, the fuzzy set approach is more straightforward and easier to solve using currently available computer software. Fourth, the fuzzy set approach is flexible enough to easily add more objectives or constraints as the environment changes. Fifth, the fuzzy set approach allows a group decision-making process, which is common in today's business environment, by allowing multiple right-hand-side values. Finally, the fuzzy set approach allows sensitivity analysis of the opinions of decision makers, as we have
illustrated in the solutions of the numerical examples. We can extend the fuzzy set approach to other real-world problems in business. The framework of this model can be applied to other areas, such as financial planning, portfolio determination, inventory management, resource allocation, and audit sampling for a CPA firm, if the decision variables and formulation are expressed appropriately. The last part of this paper used the MCLP data mining approach to assess the predictability of bankruptcy. The MCLP data mining approach has recently been applied to Japanese bankruptcy prediction (Kwak et al. 2006b) and to credit card portfolio management (Shi et al. 2002). This approach has proved to be robust and powerful even for a large sample drawn from a huge financial database, but it is also effective with smaller sample sizes. The results of the MCLP data mining approach in our bankruptcy prediction studies are promising and may extend to other countries. Our MCLP data mining studies make two important contributions: they provide evidence of the effectiveness of the MCLP data mining approach to bankruptcy prediction, and they offer a useful and practical tool for those who want to analyze the bankruptcy risk of firms.

Acknowledgements This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 70621001, 70531040, 70501030, 10601064, 70472074), the National Natural Science Foundation of Beijing (Grant No. 9073020), 973 Project of the Chinese Ministry of Science and Technology (Grant No. 2004CB720103), and BHP Billiton Corporation of Australia.
References

Abdel-khalik, A. R. and E. J. Lusk. 1974. "Transfer pricing – a synthesis." The Accounting Review 49, 8–24.
Accounting Today. 2001. February 12–25, 2001, 15(3), 8.
Al-Eryani, M. F., P. Alam, and S. H. Akhter. 1990. "Transfer pricing determinants of U.S. multinationals." Journal of International Business Studies 21, 409–425.
Altman, E. 1968. "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy." The Journal of Finance 23(3), 589–609.
Bailey, A. D. and W. J. Boe. 1976. "Goal and resource transfers in the multigoal organization." The Accounting Review, 559–573.
Baumol, W. J. and R. E. Quandt. 1965. "Investment and discount rate under capital rationing – a programming approach." Economic Journal 75, 317–329.
Begley, J., J. Ming, and S. Watts. 1996. "Bankruptcy classification errors in the 1980s: an empirical analysis of Altman's and Ohlson's models." Review of Accounting Studies 1, 267–284.
Bernard, R. H. 1969. "Mathematical programming models for capital budgeting – a survey, generalization and critique." Journal of Financial and Quantitative Analysis 4, 111–158.
Berry, M. J. A. and G. Linoff. 1997. Data mining techniques for marketing, sales, and customer support, Wiley, New York.
Bhaskar, K. A. 1979. "Multiple objective approach to capital budgeting." Accounting and Business Research 10, 25–46.
Bhaskar, K. A. and P. McNamee. 1983. "Multiple objectives in accounting and finance." Journal of Business Finance & Accounting 10, 595–621.
Boardman, C. M., W. J. Reinhart, and S. E. Celec. 1982. "The role of payback period in the theory and application of duration to capital budgeting." Journal of Business Finance & Accounting 9, 511–522.
Bose, I. and R. K. Mahapatra. 2001. "Business data mining – a machine learning perspective." Information and Management 39, 211–225.
Brancheau, J. C. and J. C. Wetherbe. 1987. "Key issues in information systems management." MIS Quarterly 11, 23–45.
Business Review Weekly. 1997. "Model of a competitive audit bureau." July 21, 1997, 19(27), 88.
Buss, M. D. J. 1983. "How to rank computer projects." Harvard Business Review 62, 118–125.
Chan, C. and B. Lewis. 2002. "A basic primer on data mining." Information Systems Management 19, 56–60.
Cheh, J. J. 2002. "A heuristic approach to efficient production of detector sets for an artificial immune algorithm-based bankruptcy prediction system." Proceedings of 2002 IEEE World Congress on Computational Intelligence, pp. 932–937.
Cheng, J. K. 1993. "Managerial flexibility in capital investment decision: insights from the real-options literature." Journal of Accounting Literature 12, 29–66.
Choi, T. S. and R. R. Levary. 1989. "Multi-national capital budgeting using chance-constrained goal programming." International Journal of Systems Science 20, 395–414.
D'haeseleer, P., S. Forrest, and P. Helman. 1996. "An immunological approach to change detection: algorithms, analysis and implications." Proceedings of the 1996 IEEE Symposium on Security and Privacy, pp. 100–119.
Freed, N. and F. Glover. 1986. "Evaluating alternative linear programming models to solve the two-group discriminant problem." Decision Sciences 17, 151–162.
Goicoechea, A., D. R. Hansen, and L. Duckstein. 1982. Multiobjective decision analysis with engineering and business applications, Wiley, New York.
Grice, J. S. and M. T. Dugan. 2001. "The limitations of bankruptcy prediction models: some cautions for the researcher." Review of Quantitative Finance and Accounting 17, 151–166.
Grice, J. S. and R. W. Ingram. 2001. "Tests of generalizability of Altman's bankruptcy prediction model." Journal of Business Research 54, 53–61.
Gyetvan, F. and Y. Shi. 1992. "Weak duality theorem and complementary slackness theorem for linear matrix programming problems." Operations Research Letters 11, 249–252.
Han, J. and M. Kamber. 2001. Data mining: concepts and techniques, Morgan Kaufmann Publishers, San Francisco, CA.
Hao, X. R. and Y. Shi. 1996. MC² Program: version 1.0: a C++ program run on PC or Unix, College of Information Science and Technology, University of Nebraska at Omaha.
Hillier, F. S. 1963. "The derivation of probabilistic information for the evaluation of risky investments." Management Science 10, 443–457.
Hillier, F. S. 1969. The evaluation of risky interrelated investments, North-Holland Publishing Co., Amsterdam.
Hofmeyr, S. A. and S. Forrest. 2000. "Architecture for an artificial immune system." Evolutionary Computation 8(4), 443–473.
Howe, K. M. and J. H. Patterson. 1985. "Capital investment decisions under economies of scale in flotation costs." Financial Management 14, 61–69.
Killough, L. N. and T. L. Souders. 1973. "A goal programming model for public accounting firms." The Accounting Review XLVIII(2), 268–279.
Kim, S. H. and E. J. Farragher. 1981. "Current capital budgeting practices." Management Accounting 6, 26–30.
Kou, G. and Y. Shi. 2002. Linux-based multiple linear programming classification program: version 1.0, College of Information Science and Technology, University of Nebraska-Omaha, Omaha, NE.
Kou, G., X. Liu, Y. Peng, Y. Shi, M. Wise, and W. Xu. 2003. "Multiple criteria linear programming approach to data mining: models, algorithm designs and software development." Optimization Methods and Software 18, 453–473.
Kwak, W. 2006. "An accounting information system selection using a fuzzy set approach." The Academy of Information and Management Sciences Journal 9(2), 79–91.
Kwak, W., Y. Shi, H. Lee, and C. F. Lee. 1996. "Capital budgeting with multiple criteria and multiple decision makers." Review of Quantitative Finance and Accounting 7, 97–112.
Kwak, W., Y. Shi, H. Lee, and C. F. Lee. 2002. "A fuzzy set approach in international transfer pricing problems." Advances in Quantitative Analysis of Finance and Accounting Journal 10, 57–74.
Kwak, W., Y. Shi, and K. Jung. 2003. "Human resource allocation for a CPA firm: a fuzzy set approach." Review of Quantitative Finance and Accounting 20(3), 277–290.
Kwak, W., Y. Shi, and J. Cheh. 2006a. "Firm bankruptcy prediction using multiple criteria linear programming data mining approach." Advances in Investment Analysis and Portfolio Management 2, 27–49.
Kwak, W., Y. Shi, S. W. Eldridge, and G. Kou. 2006b. "Bankruptcy prediction for Japanese firms: using multiple criteria linear programming data mining approach." The International Journal of Business Intelligence and Data Mining 1(4), 401–416.
Lecraw, D. J. 1985. "Some evidence on transfer pricing by multinational corporations," in Multinationals and transfer pricing, A. M. Rugman and L. Eden (Eds.). St. Martin's Press, New York.
Lee, C. F. 1993. Statistics for business and financial economics, Heath, Lexington, MA.
Lee, Y. R., Y. Shi, and P. L. Yu. 1990. "Linear optimal designs and optimal contingency plans." Management Science 36, 1106–1119.
Liberatore, M. J., T. F. Mohnhan, and D. E. Stout. 1992. "A framework for integrating capital budgeting analysis with strategy." The Engineering Economist 38, 31–43.
Lin, L., C. Lefebvre, and J. Kantor. 1993. "Economic determinants of international transfer pricing and the related accounting issues, with particular reference to Asian Pacific countries." The International Journal of Accounting 28, 49–70.
Liu, Y. H. and Y. Shi. 1994. "A fuzzy programming approach for solving a multiple criteria and multiple constraint level linear programming problem." Fuzzy Sets and Systems 65, 117–124.
Locke, E. A., K. N. Shaw, and G. P. Latham. 1981. "Goal setting and task performance: 1969–1980." Psychological Bulletin 89, 125–152.
Lorie, J. H. and L. J. Savage. 1955. "Three problems in rationing capital." Journal of Business 28, 229–239.
Mensah, Y. M. and P. J. Miranti, Jr. 1989. "Capital expenditure analysis and automated manufacturing systems: a review and synthesis." Journal of Accounting Literature 8, 181–207.
Merville, L. J. and W. J. Petty. 1978. "Transfer pricing for the multinational corporation." The Accounting Review 53, 935–951.
Ohlson, J. 1980. "Financial ratios and the probabilistic prediction of bankruptcy." Journal of Accounting Research 18(1), 109–131.
Ouederni, B. N. and W. G. Sullivan. 1991. "Semi-variance model for incorporating risk into capital investment analysis." The Engineering Economist 36, 83–106.
Pike, R. 1983. "The capital budgeting behavior and corporate characteristics of capital-constrained firms." Journal of Business Finance & Accounting 10, 663–671.
Pike, R. 1988. "An empirical study of the adoption of sophisticated capital budgeting practices and decision-making effectiveness." Accounting and Business Research 18, 341–351.
Santhanam, R., K. Muralidhar, and M. Schniederjans. 1989. "A zero-one goal programming approach for information system project selection." OMEGA: International Journal of Management Science 17(6), 583–593.
SAS/OR. 1990. User's guide, SAS Institute Inc., Cary, NC.
Schrage, L. 1991. User's manual: linear, integer, and quadratic programming with Lindo, The Scientific Press, San Francisco, CA.
Seiford, L. and P. L. Yu. 1979. "Potential solutions of linear systems: the multi-criteria multiple constraint level program." Journal of Mathematical Analysis and Applications 69, 283–303.
Shi, Y. 2001. Multiple criteria multiple constraint-levels linear programming: concepts, techniques and applications, World Scientific Publishing, River Edge, NJ.
Shi, Y. and H. Lee. 1997. "A binary integer linear programming with multicriteria and multiconstraint levels." Computers and Operations Research 24, 259–273.
Shi, Y. and Y. H. Liu. 1993. "Fuzzy potential solutions in multicriteria and multiconstraints levels linear programming problems." Fuzzy Sets and Systems 60, 163–179.
Shi, Y. and P. L. Yu. 1989. "Goal setting and compromise solutions," in Multiple criteria decision making and risk analysis using microcomputers, B. Karpak and S. Zionts (Eds.). Springer, Berlin, pp. 165–204.
Shi, Y. and P. L. Yu. 1992. "Selecting optimal linear production systems in multiple criteria environments." Computer and Operations Research 19, 585–608.
Shi, Y., M. Wise, M. Luo, and Y. Lin. 2001. "Data mining in credit card portfolio management: a multiple criteria decision making approach," in Multiple criteria decision making in the new millennium, M. Koksalan and S. Zionts (Eds.). Springer, Berlin, pp. 427–436.
Shi, Y., Y. Peng, W. Xu, and X. Tang. 2002. "Data mining via multiple criteria linear programming: applications in credit card portfolio management." International Journal of Information Technology & Decision Making 1(1), 131–151.
Steuer, R. 1986. Multiple criteria optimization: theory, computation and application, Wiley, New York.
Sung, T. K., N. Chang, and G. Lee. 1999. "Dynamics of modeling in data mining: interpretive approach to bankruptcy prediction." Journal of Management Information Systems 16(1), 63–85.
Tang, R. Y. W. 1992. "Transfer pricing in the 1990s." Management Accounting 73, 22–26.
Thanassoulis, E. 1985. "Selecting a suitable solution method for a multi objective programming capital budgeting problem." Journal of Business Finance & Accounting 12, 453–471.
Turney, S. T. 1990. "Deterministic and stochastic dynamic adjustment of capital investment budgets." Mathematical Computation & Modeling 13, 1–9.
Watson, D. J. H. and J. V. Baumler. 1975. "Transfer pricing: a behavioral context." The Accounting Review 50, 466–474.
Weingartner, H. M. 1963. Mathematical programming and the analysis of capital budgeting problems, Prentice-Hall, Englewood Cliffs, NJ.
Wilson, R. L. and R. Sharda. 1994. "Bankruptcy prediction using neural networks." Decision Support Systems 11, 545–557.
Witten, I. H. and E. Frank. 2000. Data mining: practical machine learning tools and techniques with Java implementations, Morgan Kaufmann Publishers, San Francisco, CA.
Yu, P. L. 1985. Multiple criteria decision making: concepts, techniques and extensions, Plenum, New York.
Yunker, P. J. 1983. "Survey study of subsidiary autonomy, performance evaluation and transfer pricing in multinational corporations." Columbia Journal of World Business 19, 51–63.
Zeleny, M. 1974. Linear multiobjective programming, Springer, Berlin.
Chapter 88
Forecasting S&P 100 Volatility: The Incremental Information Content of Implied Volatilities and High-Frequency Index Returns Bevan J. Blair, Ser-Huang Poon, and Stephen J. Taylor
Abstract The information content of implied volatilities and intraday returns is compared in the context of forecasting index volatility over horizons from 1 to 20 days. Forecasts of two measures of realized volatility are obtained after estimating ARCH models using daily index returns, daily observations of the VIX index of implied volatility, and sums of squares of 5-min index returns. The in-sample estimates show that nearly all relevant information is provided by the VIX index, and hence there is not much incremental information in high-frequency index returns. For out-of-sample forecasting, the VIX index provides the most accurate forecasts for all forecast horizons and performance measures considered. The evidence for incremental forecasting information in intraday returns is insignificant.

Keywords ARCH models · Forecasting · High-frequency returns · Implied volatility · Stock index volatility
88.1 Introduction The ability of ARCH models to provide good estimates of equity return volatility is well documented. Many studies show that the parameters of a variety of different ARCH models are highly significant in-sample (see Bollerslev 1987; Nelson 1991; Glosten et al. 1993; and surveys by Bollerslev et al. 1992, 1994; Bera and Higgins 1993).
This paper is reprinted from Journal of Econometrics, 105(1) (2001) 5–26.
B.J. Blair, Ingenious Asset Management, London, UK
S.-H. Poon, Manchester Business School, University of Manchester, Manchester, UK
S.J. Taylor, Department of Accounting and Finance, Lancaster University, Lancaster, UK, e-mail: [email protected]
There is less evidence that ARCH models provide good forecasts of equity return volatility. Some studies (Akgiray 1989; Heynen and Kat 1994; Franses and Van Dijk 1995; Brailsford and Faff 1996; Figlewski 1997) examine the outof-sample predictive ability of ARCH models, with mixed results. All find that a regression of realized volatility on forecast volatility produces a low R2 statistic (often less than 10%) and hence the predictive power of the forecasts may be questionable. Dimson and Marsh (1990) show that data snooping can produce enhanced in-sample predictive ability that is not transferable to out-of-sample forecasting. Nelson (1992) uses theoretical methods to show that the predictive ability of ARCH models is very good at high frequencies, even when the model is misspecified, but that out-of-sample forecasting of medium- to long-term volatility can be poor. Nelson and Foster (1995) develop conditions for ARCH models to perform favorably at medium- to long-term forecast horizons. An alternative to ARCH volatility forecasts is to use implied volatilities from options. These are known to covary with realized volatility (Latane and Rendleman 1976; Chiras and Manaster 1978), and hence it is of interest to compare the forecasting accuracy of ARCH with implied volatility. To do this, many studies use S&P 100 Index options that are American. Day and Lewis (1992) find that implied volatilities perform as well but no better than forecasts from ARCH models. Mixtures of the two forecasts outperform both univariate forecasts. Canina and Figlewski (1993) provide contrary evidence to Day and Lewis. They find that implied volatilities are poor forecasts of volatility and that simple historical volatilities outperform implied volatilities. Christensen and Prabhala (1998) show for a much longer period that while implied volatilities are biased forecasts of volatility they outperform historical information models when forecasting volatility. These papers use implied volatilities that contain measurement errors, because early exercise is ignored and/or dividends are ignored and/or the spot index contains stale prices. The implied volatility series are potentially flawed and this may account for some of the conflicting results. Fleming et al. (1995) describe an implied volatility index (VIX) that
eliminates misspecification problems. We use the same index to obtain new results about the information content of option prices. Fleming (1998) uses a volatility measure similar to VIX to show that implieds outperform historical information. From the recent studies of Christensen and Prabhala (1998) and Fleming (1998), it can be concluded that the evidence now favors implied volatilities as more informative than daily returns when forecasting equity volatility. The same conclusion has been obtained for foreign exchange volatility from daily data, but with more certainty (Jorion 1995; Xu and Taylor 1995). High-frequency returns, however, have the potential to change these conclusions. Taylor and Xu (1997) show that 5-min FX returns contain volatility information incremental to that provided by options. In this paper we explore the incremental volatility information of high-frequency stock index returns for the first time.
The importance of intraday returns for measuring realized volatility is demonstrated for FX data by Taylor and Xu (1997), Andersen and Bollerslev (1998), and Andersen et al. (2001b), and for equities by Ebens (1999) and Andersen et al. (2001a). Andersen and Bollerslev (1998) prove that regression methods will give low R² values when daily squared returns measure realized volatility, even for optimal GARCH forecasts, because squared returns are noisy estimates of volatility. They show that intraday returns can be used to construct a realized volatility series that essentially eliminates the noise in measurements of daily volatility. They find remarkable improvements in the forecasting performance of ARCH models for FX data when the models are used to forecast the new realized series, compatible with theoretical analysis. They do not, however, compare ARCH forecasts with implied forecasts.
This paper answers some important empirical questions for the S&P 100 Index. First, how does the predictive quality of volatility forecasts from ARCH models, which use daily index returns and/or intraday returns, compare with forecasts from models that use information contained in implied volatilities? Second, how important is the selection of the measure of realized volatility in assessing the predictive accuracy of volatility forecasts?
The paper is arranged as follows. Section 88.2 discusses the various datasets we use and issues surrounding their construction. Methods for estimating and forecasting volatility that use daily index returns, 5-min returns, and implied volatilities are presented in Sect. 88.3. Section 88.4 provides results from both in-sample estimation and out-of-sample forecasting of S&P 100 volatility. Section 88.5 sets out our conclusions.
88.2 Data

There are three main types of data: daily index returns, daily implied volatilities, and 5-min index returns. These are available to us for the 13 years from 1987 to 1999 inclusive. The in-sample period is from January 2, 1987 to December 31, 1992, providing 1,519 daily observations, followed by the out-of-sample period from January 4, 1993 to December 31, 1999, providing 1,768 daily observations.
88.2.1 Daily Index Returns

Daily returns from the S&P 100 Index are defined in the standard way by the natural logarithm of the ratio of consecutive daily closing levels. Index returns are not adjusted for dividends. Some models in this paper have been estimated using dividend-adjusted index returns and the results from volatility estimation and forecasting are not significantly different.
88.2.2 Implied Volatilities

Implied volatilities are considered to be the market's forecast of the volatility of the underlying asset of an option. To calculate an implied volatility, an option valuation model is needed as well as inputs for that model (index level, risk-free rate, dividends, contractual provisions of the option) and an observed option price. Many of these variables are subject to measurement errors that may induce biases in a series of implied volatilities. The use of an inappropriate option valuation model will also cause mismeasurements in the implied volatilities. For instance, S&P 100 options are American, but if a European model is used to calculate implied volatilities then error will be induced by the omission of the early exercise option. Moreover, when a valuation model is used that includes the early exercise option, if the timing and level of dividends are assumed constant rather than using actual timings and dividends, then the early exercise option is not taken up as frequently, pricing errors will occur, and implied volatilities will be mismeasured (see Harvey and Whaley 1992). Important biases may also be induced by relatively infrequent trading of the stocks in the index (see Jorion 1995). Also, if closing prices are used, there may be biases due to the different closing times of stock and options markets. Finally,
if bid or ask option prices are used, the first-order autocorrelation of the implied volatility series may be negative and so biases will result from bid/ask effects.
Implied volatilities used in many previous studies (including Day and Lewis 1992; Canina and Figlewski 1993; Christensen and Prabhala 1998) contain relevant measurement errors whose magnitudes are unknown. We follow Fleming et al. (1995) and use an implied volatility index (VIX) that mitigates the above problems. VIX is a weighted index of American implied volatilities calculated from eight near-the-money, near-to-expiry, S&P 100 call and put options, and it is constructed in such a way as to eliminate mismeasurement and "smile" effects. This makes it a more accurate measurement of implied market volatility. VIX uses the binomial valuation method with trees that are adjusted to reflect the actual amount and timing of anticipated cash dividends. Each option price is calculated using the midpoint of the most recent bid/ask quote to avoid problems due to bid/ask bounce. VIX uses both call and put options to increase the amount of information and, most important, to eliminate problems due to mismeasurement of the underlying index and put/call option clientele effects. VIX is constructed so that it represents a hypothetical option that is at-the-money and has a constant 22 trading days (30 calendar days) to expiry. VIX uses pairs of near-the-money exercise prices, which are just above the current index price and just below it. VIX also uses pairs of times to expiry, one that is nearby (at least 8 calendar days to expiration) and one that is second nearby (the following contract month). For a detailed explanation of the construction of the VIX index see Fleming et al. (1995). Because VIX eliminates most of the problems of mismeasurement, we use it as our measure for S&P 100 Index implied volatility. Daily values of VIX at the close of option trading are used.¹
Even though VIX is robust to mismeasurement, it is still a biased predictor of subsequent volatility. It is our understanding that bias occurs because of a trading time adjustment that typically multiplies² conventional implied volatilities by approximately 1.2. The variance of daily index returns from 1987 to 1999 is equivalent to an annualized volatility or standard deviation of 17.6%. The average value of VIX squared is equivalent to an annualized volatility of 21.7% during the same period, which is consistent with an average scaling factor of 1.23. With this in mind, our empirical methods are designed to be robust against this bias.
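A quick arithmetic check of the scaling factors quoted above (the numbers come from the text; the snippet merely reproduces the calculation):

```python
# Average VIX-implied volatility relative to realized daily-return volatility:
scaling = 0.217 / 0.176            # = 1.23, the average scaling factor
# Trading-time adjustment sqrt(Nc/Nt) for 30 calendar and 22 trading days:
adjustment = (30 / 22) ** 0.5      # = 1.17, close to the typical factor of 1.2
print(round(scaling, 2), round(adjustment, 2))
```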
88.2.3 High-Frequency Stock Returns
ARCH and implied volatility models provide forecasts of volatility. These forecasts require some measure of the latent volatility process so that their performance can be evaluated. Because the volatility process is not observed, researchers have used a variety of methods to compute ex post estimates of volatility, often called realized volatility. The most common method for computing a realized volatility is to square the interperiod returns. For example, if we are forecasting daily volatility the realized measure is the squared daily return. Andersen and Bollerslev (1998) show that this method produces inaccurate forecasts for correctly specified volatility models. Consider a representation of excess returns r_t such that r_t = σ_t z_t, where σ_t is the time-varying volatility and z_t ~ i.i.d.(0, 1). The realized volatility measure using squared returns is r_t² = σ_t²z_t², and if σ_t is independent of z_t then E(r_t²|σ_t²) = σ_t². However, the realized measure r_t² will be a very noisy estimate of σ_t², because not only are ARCH models generally misspecified but poor predictions of r_t² will also be due to the noisy component z_t². By sampling more frequently and producing a measure based on intraday data, Andersen and Bollerslev (1998) show that the noisy component is diminished and that in theory the realized volatility is then much closer to the volatility during a day. They find, for foreign exchange data, that the performance of ARCH models improves as the amount of intraperiod data used to measure realized volatility increases. Related results for FX and equity data are presented, respectively, in Andersen et al. (2001b) and Andersen et al. (2001a).
In this paper we use 5-min returns from the S&P 100 Index to calculate a measure of realized volatility, here called INTRA. These 5-min returns are constructed from the contemporaneous index levels recorded when S&P 100 options are traded. The original dataset contains an index level whenever an option is traded, but we have chosen a 5-min frequency as this is the highest and best frequency that Andersen and Bollerslev (1998) use. The latest index level available before a 5-min mark is used in the calculation of 5-min returns. To construct the measure INTRA of daily realized volatility we square and then sum 5-min returns for the period from 08:30 CST until 15:00 CST and then add the square of the previous "overnight" return. Thus INTRA on a Tuesday, for example, is the squared index return from Monday 15:00 to Tuesday 08:30 plus 78 squared 5-min index returns on Tuesday, commencing with the return from 08:30 to 08:35 and concluding with the return from 14:55 to 15:00. This volatility measure is also used in ARCH models to deduce the information content of intraday returns. The square root of the average value of INTRA from 1987 to 1999 is equivalent to an annualized volatility of 13.4%. This is much less than the equivalent volatility calculated from the variance of daily returns for the same period, namely 17.6%. This is a consequence of substantial positive correlation between consecutive intraday returns, which could be explained by stale prices in the S&P 100 Index.

¹ VIX data are available at http://www.cboe.com/tools/historical/vix.htm.
² VIX multiplies the conventional implied by √(Nc/Nt), with Nc and Nt calendar days and trading days, respectively, until expiry. See Fleming et al. (1995), Equation (88.2) and Fig. 88.3.
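A minimal sketch of the INTRA computation for a single trading day, assuming a vector of 5-min index levels is available (the function and its arguments are illustrative, not from the paper):

```python
import numpy as np

def intra_measure(levels_5min, prev_close):
    """INTRA for one day: squared overnight log return plus the sum of
    squared 5-min log returns from 08:30 to 15:00 CST (79 levels, 78 returns)."""
    levels = np.asarray(levels_5min, dtype=float)
    overnight_sq = np.log(levels[0] / prev_close) ** 2
    five_min_sq = np.diff(np.log(levels)) ** 2
    return overnight_sq + five_min_sq.sum()
```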
88.3 Methodology for Forecasting Volatility

88.3.1 In-Sample Models

To compare the in-sample performance of several models that use information from stock index returns and implied volatilities, ARCH models are estimated for daily index returns r_t from January 2, 1987 to December 31, 1992, with a dummy term d_t for the Black Monday crash in 1987. The most general specification is as follows:

r_t = μ + λ1 d_t + ε_t,  (88.1)

ε_t = h_t^(1/2) z_t,  z_t ~ i.i.d.(0, 1),  (88.2)

h_t = (α0 + α1 ε²_{t−1} + α2 s_{t−1} ε²_{t−1} + λ2 d_{t−1})/(1 − βL) + γ INTRA_{t−1}/(1 − βI L) + δ VIX²_{t−1}/(1 − βV L).  (88.3)

Here L is the lag operator, h_t is the conditional variance of the return in period t, s_{t−1} is 1 when ε_{t−1} < 0 and otherwise it is zero, d_t is 1 when t refers to 19 October 1987 and is 0 otherwise, VIX_{t−1} is the daily implied index volatility computed from an annualized volatility as VIX/√252, and INTRA_{t−1} is the measure of volatility based on 5-min returns from the S&P 100 Index. The parameters λ1 and λ2, respectively, represent the excess return on the crash day (−24%) and an exceptional, transient effect in variance immediately following the crash.
By placing restrictions on certain parameters, six different volatility models are obtained using different daily information sets:
1. The GJR(1,1) model of Glosten et al. (1993) with a crash dummy in the variance equation, also estimated by Blair et al. (2001, 2002): γ = δ = βI = βV = 0.
2. A volatility model that uses intraday returns information alone: α1 = α2 = δ = βV = 0.
3. A specification that uses the historic information in both daily returns and INTRA but ignores option information: δ = βV = 0.
4. A volatility model that uses the latest information in the VIX series alone: α1 = α2 = γ = βI = 0.
5. A low-frequency specification that uses only the information provided by daily measurements of VIX and index returns: γ = βI = 0.
6. A volatility model that uses information in both INTRA and VIX without explicit use of daily returns: α1 = α2 = 0.
These are all special cases of:
7. The unrestricted model using all the available information in daily returns, INTRA, and VIX.
The parameters are estimated by the usual quasi-likelihood methodology,³ so that the likelihood function is defined by assuming standardized returns, z_t, have Normal distributions even though this assumption is known to be false. Inferences are made using robust p-values, based on Bollerslev and Wooldridge (1992) standard errors, that are robust against misspecification of the shape of the conditional distributions. It is then possible to attempt to decide which types of information are required to obtain a satisfactory description of the conditional distributions of daily returns. To assess predictive power, ε²_t is regressed on the conditional variance h_t. Higher values of the correlation R indicate more accurate in-sample forecasts of ε²_t.

³ The general parameter vector is (μ, α0, α1, α2, β, λ1, λ2, γ, βI, δ, βV). The log-likelihood function was maximized using GAUSS with the constraints α1 ≥ 0 and α1 + α2 ≥ 0.
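Each distributed-lag term X_t/(1 − bL) in Equation (88.3) satisfies the recursion y_t = X_t + b y_{t−1}, so the conditional variance can be evaluated in one pass. The following sketch illustrates this (Python, not the authors' GAUSS code; the crash-dummy terms are omitted and the start-up values are set to the sample variance):

```python
import numpy as np

def conditional_variance(r, intra, vix_daily, params):
    # params: (mu, a0, a1, a2, beta, gamma, b_i, delta, b_v); crash dummies omitted
    mu, a0, a1, a2, beta, gamma, b_i, delta, b_v = params
    eps2 = (r - mu) ** 2                      # squared residuals
    s = (r - mu < 0).astype(float)            # s_t = 1 for negative residuals
    h = np.full(len(r), eps2.mean())          # start-up value: sample variance
    g, i_term, v = eps2.mean(), 0.0, 0.0      # the three geometric-lag components
    for t in range(1, len(r)):
        g = a0 + (a1 + a2 * s[t - 1]) * eps2[t - 1] + beta * g
        i_term = gamma * intra[t - 1] + b_i * i_term
        v = delta * vix_daily[t - 1] ** 2 + b_v * v
        h[t] = g + i_term + v                 # Equation (88.3)
    return h
```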
88.3.2 Forecasting Methods

Time series of forecasts are obtained by estimating rolling ARCH models. Each model is estimated initially over the final 1,000 trading days of the in-sample period, from January 20, 1989 to December 31, 1992, and forecasts of realized volatility are made for the next day, say day T + 1, using the in-sample parameter estimates. The model and data are then rolled forward one day, deleting the observation(s) at time T − 999 and adding the observation(s) at time T + 1; the model is reestimated and a forecast is made for time T + 2. This rolling method is repeated until the end of the out-of-sample forecast period. The one-step-ahead forecasts provide predictions for January 4, 1993 to December 31, 1999 inclusive and define time series of length 1,768. On each day, forecasts are also made for 5-, 10-, and 20-day volatility.
Two measures of realized volatility are predicted. The first measure is squared excess returns, so that forecasts are made at time T of

(r_{T+1} − μ)²  and  Σ_{j=1}^{N} (r_{T+j} − μ)²,  N = 5, 10, 20.

These quantities are predicted because a correctly specified ARCH model provides optimal forecasts of (r_{T+1} − μ)².
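A sketch of the rolling scheme just described; fit and predict stand for whichever specification (GJR, VIX, or INTRA) is being rolled forward and are placeholders rather than functions from the paper:

```python
def rolling_one_step_forecasts(data, fit, predict, window=1000):
    # Re-estimate on days T-999..T, then forecast day T+1; repeat to the end.
    forecasts = []
    for T in range(window, len(data)):
        params = fit(data[T - window:T])
        forecasts.append(predict(params, data[T - window:T]))
    return forecasts
```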
We assume the conditional expected return is constant and our results are not sensitive to the choice of μ; results are reported for an assumed annual expected return of 10%. The second measure of realized volatility is INTRA, which uses 5-min and "overnight" returns. Forecasts are produced at time T for

INTRA_{T+1}  and  Σ_{j=1}^{N} INTRA_{T+j},  N = 5, 10, 20.
These quantities can be expected to provide a more predictable measure of realized volatility than squared excess daily returns, although on average INTRAT C1 is less than .rT C1 /2 as already remarked in Sect. 88.2.3. Four different sets of forecasts are estimated for each measure of realized volatility to compare predictive power. These forecasts are first presented for the realized measure defined by squared excess returns. 88.3.2.1 Historic Volatility A simple estimate of volatility is evaluated to see what simple methods can achieve in comparison with more sophisticated methodologies. The simple one-step-ahead forecast is here called historic volatility (HV) and it equals the sample variance of daily returns over the period from time T 99 to time T inclusive, given by 1 X .r T j rN T /2 ; 100 j D0
1 X r T j : 100 j D0
99
hT C1 D
99
rN T D
(88.4) To produce 5-, 10-, and 20-day volatility forecasts the onestep-ahead forecast is multiplied by 5, 10, and 20, respectively.
88.3.2.2 GJR(1,1) Forecasts

The one-step-ahead forecast, h_{T+1}, of (r_{T+1} − μ)² is defined by the recursive formula

h_{t+1} = α₀ + α₁(r_t − μ)² + α₂ s_t (r_t − μ)² + β h_t,   (88.5)

where s_t equals 1 when r_t − μ < 0 and otherwise equals 0. The parameters α₀, α₁, α₂, β are estimated from the information I_T^{(1)} = {r_{T−i}; 0 ≤ i ≤ 999}. Forecasts for 5-, 10-, and 20-day volatility are produced by aggregating expectations:

E(h_{T+j} | I_T^{(1)}) = α₀ + p_{gjr} E(h_{T+j−1} | I_T^{(1)}),   j > 1,   (88.6)

with p_{gjr} the persistence, equal to α₁ + ½α₂ + β for the GJR(1,1) model, here assuming that returns have symmetric distributions. Then Σ_{j=1}^{N} (r_{T+j} − μ)² is forecast to be Σ_{j=1}^{N} E(h_{T+j} | I_T^{(1)}).

88.3.2.3 Forecasts of Volatility Using VIX

These forecasts use information contained in VIX alone to forecast volatility. The one-step-ahead forecast, h_{T+1}, is obtained by estimating an ARCH model with conditional variances defined by

h_t = α₀ + δ VIX²_{t−1} / (1 − β_V L).   (88.7)

Parameters α₀, δ, β_V are estimated from I_T^{(2)} = {r_{T−i}, VIX_{T−i}; 0 ≤ i ≤ 999}. To produce 5-, 10-, and 20-day volatility forecasts it is assumed that E(h_{T+j} | I_T^{(2)}) = E(h_{T+j−1} | I_T^{(2)}) for j > 1, so that the one-step-ahead forecast is multiplied by 5, 10, and 20, respectively. This simple multiplicative method may handicap VIX forecasts; however, information about the term structure of implied volatility is not available from VIX.

88.3.2.4 Forecasts of Volatility Using Intraday Returns

These forecasts use the information in intraday returns alone to forecast volatility. Now h_{T+1} is obtained from conditional variances defined by

h_t = α₀ + γ INTRA_{t−1} / (1 − β_I L),   (88.8)

and the parameters α₀, γ, β_I are estimated⁴ from I_T^{(3)} = {r_{T−i}, INTRA_{T−i}; 0 ≤ i ≤ 999}. To produce 5-, 10-, and 20-day volatility forecasts, we aggregate expectations defined by

E(h_{T+j} | I_T^{(3)}) = (1 − β_I)α₀ + β_I E(h_{T+j−1} | I_T^{(3)}) + γ E(INTRA_{T+j−1} | I_T^{(3)}),   j > 1.

To make progress, it is necessary to assume that there is a proportional relationship between expectations of squared excess returns and INTRA, so that

E(INTRA_{T+j} | I_T^{(k)}) = c_T E((r_{T+j} − μ)² | I_T^{(k)}) = c_T E(h_{T+j} | I_T^{(k)})   (88.9)
⁴ Specifications (88.5), (88.7), and (88.8) are estimated with the constraints α₁ ≥ 0, α₁ + α₂ ≥ 0, and β, β_I, β_V ≥ 0.
for some constant c_T that is the same for all positive j and for the information sets indexed by k = 1, 2, 3. Positive dependence among intraday returns causes c_T to be less than one, as noted in Sect. 88.2.3. Making the assumption stated by Equation (88.9), expectations are calculated from

E(h_{T+j} | I_T^{(3)}) = (1 − β_I)α₀ + (β_I + γ c_T) E(h_{T+j−1} | I_T^{(3)}),   j > 1,   (88.10)

and Σ_{j=1}^{N} (r_{T+j} − μ)² is forecast to be Σ_{j=1}^{N} E(h_{T+j} | I_T^{(3)}). To calculate the forecasts at time T, the constant c_T is estimated by the ratio
ĉ_T = Σ_{t=T−999}^{T} INTRA_t / Σ_{t=T−999}^{T} (r_t − μ)².   (88.11)
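Equations (88.8), (88.10), and (88.11) combine into a short forecasting routine. The sketch below assumes the full INTRA history is available for the one-step filter and uses an illustrative flat argument list; it is not the authors' implementation.

```python
import numpy as np

def intra_model_forecast(intra, r, mu, a0, g, bI, T, N):
    """Forecast sum_{j=1..N} (r_{T+j}-mu)^2 with the INTRA model."""
    # c_T estimated by the ratio in Equation (88.11)
    c_hat = intra[T - 999:T + 1].sum() / ((r[T - 999:T + 1] - mu) ** 2).sum()
    # one-step forecast: h_{T+1} = a0 + g * INTRA_T / (1 - bI*L)
    lags = intra[T::-1]                                # INTRA_T, INTRA_{T-1}, ...
    h = a0 + g * np.dot(bI ** np.arange(lags.size), lags)
    total = h
    for _ in range(N - 1):                             # aggregate via (88.10)
        h = (1 - bI) * a0 + (bI + g * c_hat) * h
        total += h
    return total   # multiply by c_hat to forecast sums of INTRA (Sect. 88.3.2.5)
```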
88.3.2.5 Forecasts of INTRA

To predict future values of INTRA we again make the plausible assumption stated in Equation (88.9). Then if some method (HV, GJR, VIX, or INTRA) produces a forecast f_{T,N} of Σ_{j=1}^{N} (r_{T+j} − μ)², that method's forecast of Σ_{j=1}^{N} INTRA_{T+j} equals ĉ_T f_{T,N}.

88.3.3 Forecast Evaluation

Mean square forecast errors are compared to assess the relative predictive accuracy of the four forecasting methods. Given forecasts x_{T,N} made at times T = s, ..., n − N of quantities y_{T,N} known at time T + N, we report the proportion of variance explained by the forecasts:

P = 1 − Σ_{T=s}^{n−N} (y_{T,N} − x_{T,N})² / Σ_{T=s}^{n−N} (y_{T,N} − ȳ)².   (88.12)

We also report the squared correlation, R², from the regression y_{T,N} = α + β x_{T,N} + u_{T,N}, which provides the proportion of variance explained by the ex post best linear combination α + β x_{T,N}. Hence P, which measures the accuracy of forecasts, is at most equal to the popular R² statistic that is often interpreted as a measure of information content.

The above regression only measures the predictive power of a single method in isolation. Day and Lewis (1992) found that mixtures of forecasts from ARCH models and from implied volatilities produced higher adjusted R² statistics than univariate forecasts. Consequently, multiple regressions are also estimated. Multiple R² statistics are used to assess the information content of the mixtures. No model that uses a mixture of the VIX, GJR, and intraday information sets is used to produce forecasts directly. Rather, bivariate and trivariate mixtures of forecasts are evaluated to see if the implied volatility forecasts subsume the information in GJR and intraday return forecasts.
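Both summary statistics are one-liners in code; a minimal sketch:

```python
import numpy as np

def explained_proportion(y, x):
    """P of Equation (88.12): one minus the MSE of the forecasts divided
    by the variance of realized volatility."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    return 1.0 - ((y - x) ** 2).sum() / ((y - y.mean()) ** 2).sum()

def squared_correlation(y, x):
    """R^2 of the regression y = alpha + beta*x + u."""
    return float(np.corrcoef(y, x)[0, 1] ** 2)
```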
88.4 Results

88.4.1 In-Sample ARCH Results

Table 88.1 presents parameter estimates, robust t-ratios, log-likelihoods, and squared correlations R² for the seven ARCH models defined in Sect. 88.3.1. The results are obtained from 6 years of data, from January 2, 1987 to December 31, 1992. The log-likelihoods increase monotonically across the columns of Table 88.1. The excess log-likelihood for a model is defined as its maximum log-likelihood minus the corresponding figure for the first model.

The first model is the standard GJR model that uses previous index returns to describe the conditional variance, augmented by a transient volatility increase following the crash on October 19, 1987. The multiplier for squared negative returns (α₁ + α₂) is three times that for squared positive returns (α₁), indicating a substantial asymmetric effect, although the estimates of α₁ and α₂ are not significantly different from zero at the 10% level, reflecting the relatively short sample period. The persistence estimate is α₁ + ½α₂ + β = 0.9693.

The second model uses only the measure INTRA based on 5-min returns to describe conditional variances. It has a substantially higher log-likelihood than the GJR model. The INTRA series is filtered by the lag function γ/(1 − β_I L) with β_I approximately 0.25. This suggests that most of the volatility information in 5-min returns can be obtained from the most recent day, although long memory effects in INTRA, beyond the scope of this paper, might be anticipated from the results of Ebens (1999) and Andersen et al. (2001a). The low value of β_I contrasts with the high value of β for daily returns (0.94 for Model 1).

When all the information in the history of daily and intraday index returns is used in the third model, the log-likelihood is not significantly higher than for Model 2, using the non-robust likelihood-ratio test and a 5% significance level. Furthermore, the parameters for the variable INTRA are similar for Models 2 and 3, while the parameters for squared daily returns are much smaller for Model 3 than for Model 1.
Table 88.1 Models for the S&P 100 Index, daily returns from January 1987 to December 1992. Daily index returns r_t are modeled by the ARCH specification:

r_t = μ + λ₁ d_t + ε_t,   ε_t = h_t^{1/2} z_t,   z_t ~ i.i.d.(0, 1),

h_t = α₀ + (α₁ ε²_{t−1} + α₂ s_{t−1} ε²_{t−1})/(1 − βL) + λ₂ d_{t−1} + γ INTRA_{t−1}/(1 − β_I L) + δ VIX²_{t−1}/(1 − β_V L);

d_t is 1 for 19 October 1987 and otherwise is 0, s_t is 1 if ε_t is negative and otherwise is zero, INTRA is a sum of squared intraday returns, and VIX is a measure of implied volatility.

Parameter estimates, by model (robust t-ratios in brackets):
(1) α₀×10⁶ = 2.6891 (1.53), α₁ = 0.0136 (1.53), α₂ = 0.0280 (1.19), β = 0.9417 (33.67), λ₂×10³ = 1.563 (2.13); log L = 4,833.66; R² = 0.197.
(2) α₀×10⁶ = 44.797 (3.85), β = 0.9793 (118.7), λ₂×10³ = 0.590 (1.67), γ = 0.6396 (3.64), β_I = 0.2523 (1.51); log L = 4,845.80; excess log L = 12.14; R² = 0.318.
(3) α₀×10⁶ = 0.7298 (1.48), α₁ = 0.0085 (1.56), α₂ = 0.0078 (0.79), β = 0.9773 (130.8), λ₂×10³ = 0.432 (1.26), γ = 0.5661 (3.32), β_I = 0.2350 (1.53); log L = 4,848.35; excess log L = 14.69; R² = 0.307.
(4) α₀×10⁶ = 9.6881 (0.74), β = 0.5209 (1.53), λ₂×10³ = 2.104 (1.40), δ = 0.4313 (6.31), β_V = 0.1509 (2.73); log L = 4,851.97; excess log L = 18.31; R² = 0.377.
(5) α₀×10⁶ = 12.5372 (0.72), α₁ = 0.0741 (1.34), α₂ = 0.0485 (1.20), β = 0.3039 (1.93), λ₂×10³ = 0.800 (0.81), δ = 0.4101 (6.06), β_V = 0.1553 (2.59); log L = 4,858.48; excess log L = 24.82; R² = 0.367.
(6) α₀×10⁶ = 6.3594 (0.55), β = 0.5954 (3.58), λ₂×10³ = 1.618 (0.68), γ = 0.3718 (1.86), β_I = 0.0360 (0.23), δ = 0.3283 (4.47), β_V = 0.1943 (3.40); log L = 4,859.70; excess log L = 26.04; R² = 0.376.
(7) α₀×10⁶ = 0.3110 (0.70), α₁ = 0.0029 (0.47), α₂ = 0.0029 (0.47), β = 0.9695 (95.89), λ₂×10³ = 0.259 (0.67), γ = 0.3742 (1.86), β_I = 0.0539 (0.39), δ = 0.2816 (3.27), β_V = 0.1778 (3.49); log L = 4,860.79; excess log L = 27.13; R² = 0.358.

Notes. Parameters are estimated by maximizing the quasi-log-likelihood function, defined by assuming conditional normal densities. Robust t-ratios, calculated using GAUSS, are shown in brackets. Excess log-likelihood is relative to Model (1). Estimates of μ range from 0.00039 to 0.00047; λ₁ is −0.237. The constraints α₁ ≥ 0 and α₁ + α₂ ≥ 0 are applied.
The only information used in the fourth model to describe conditional variances is contained in the VIX series. The significance of this information is clear from the high robust t-ratio for the parameter δ, equal to 6.31. The fourth model has a higher likelihood than the second model that uses daily and 5-min returns. Model 4 has the same number of parameters as Model 2 and the difference in their log-likelihoods is 6.17. Thus implied volatilities are more informative than 5-min returns, at least in-sample. The VIX series is filtered by the function δ/(1 − β_V L) with β_V approximately 0.15. Although β_V is near zero, it is significantly different at the 5% level, with a robust t-ratio of 2.73. This result suggests that the options market is not informationally efficient, since the latest option prices do not contain all relevant information about the next day's volatility. Although VIX quadrupled on the crash day, from 36% on Friday 16th to 150% on Monday 19th, the estimate of the crash parameter λ₂ is positive.

Models 5, 6, and 7 are compared with Model 4 to try to decide if VIX contains all the relevant information about the next day's volatility. Any incremental information in the history of 5-min and daily returns is probably to be found in
the 5-min returns, first because Model 6 has a higher log-likelihood than Model 5 and second because Model 7, which uses all the information, improves only slightly on Model 6, which does not use daily returns. Consequently, we focus on comparing Model 4 that uses VIX information with Model 6 that uses both VIX and INTRA information. The log-likelihood for Model 6 is 7.73 more than for Model 4. The former model has an additional two parameters and hence a non-robust likelihood-ratio test would conclusively prefer the former model. However, the robust t-ratio for the INTRA parameter γ equals 1.86, so that γ has a robust one-tailed p-value that is low (3.1%) but not very close to zero. It may be concluded that while VIX is more informative than INTRA there is probably incremental information in 5-min returns. As VIX squared and INTRA do not have comparable magnitudes, for reasons given in Sects. 88.2.2 and 88.2.3, the fact that the VIX parameter δ is less than the INTRA parameter γ is not particularly relevant when comparing information content. The average level of VIX squared is approximately 2.6 times the average level of INTRA. In the equation for the
conditional variance h_t, the ratio of the impact of VIX information to the impact of INTRA information is therefore approximately

2.6 × [δ/(1 − β_V)] / [γ/(1 − β_I)] = 2.8,

using the parameter estimates for Model 6. This ratio emphasizes that VIX is much more informative than INTRA when applying an in-sample methodology.

Fig. 88.1 In-sample comparison of conditional variance forecasts of squared returns (logarithmic scale), 2 January 1987–31 December 1992.

Figure 88.1 provides a comparison of the in-sample conditional variances for three ARCH specifications and the squared returns that they forecast. The specifications are Models 1, 2, and 4 that, respectively, use the information in daily returns (labeled GJR in Fig. 88.1), intraday returns (labeled INTRA), and implied volatilities (labeled VIX). The values of the four time series are plotted on a logarithmic scale truncated at 10⁻⁵. It can be seen that the squared returns are a very noisy series and that the three conditional variances move closely together.
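The arithmetic behind this ratio can be checked directly from the Model 6 estimates quoted in Table 88.1; the rounded table values give approximately 2.75, which the chapter reports as 2.8.

```python
# Relative impact of VIX versus INTRA in the Model 6 variance equation.
delta, beta_V = 0.3283, 0.1943
gamma, beta_I = 0.3718, 0.0360
ratio = 2.6 * (delta / (1 - beta_V)) / (gamma / (1 - beta_I))
print(ratio)   # ~2.75 with rounded estimates; reported as 2.8 in the text
```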
88.4.2 Out-of-Sample Forecasting

The out-of-sample accuracy of volatility forecasts is compared from January 4, 1993 to December 31, 1999. Figure 88.2a provides similar information to Fig. 88.1 and shows the one-step-ahead GJR, INTRA, and VIX forecasts compared with the realized volatility defined by squared returns. It is now possible to see periods when the three forecasts are some way apart and other periods when they are very similar. Figure 88.2b provides comparisons when there is much less noise in the series of realized volatility. The quantity predicted is now the sum of INTRA values over 20 days. It is seen that volatility was low for the first 4 years
on the plot (1993–1996), higher in 1997, and then much higher in 1998 and 1999. The GJR forecasts are the most sensitive to large movements in the index, and occasional spikes in the forecasts can be seen. The VIX forecasts can be seen to track the quantity forecasted more closely than the other forecasts during some periods, for example, throughout 1993 at the left side of the figure. Figure 88.3 plots the sums of INTRA values over 20 days against the VIX forecasts, for which R² equals 0.57.
88.4.2.1 Relative Accuracy of the Forecasts

Tables 88.2 and 88.3 summarize the out-of-sample accuracy of the volatility forecasts. Each table provides information for forecasts 1, 5, 10, and 20 days ahead of the two measures of realized volatility. Both measures are sums of squares, summing squared daily excess returns for one measure and squared 5-min and overnight returns for the other measure. Table 88.2 reports the accuracy measure P defined by Equation (88.12). This proportion of explained variability is a linear function of the mean square error of forecasts, with higher values corresponding to more accurate forecasts. Table 88.3 reports the (multiple) squared correlation R² from regressions of realized volatility on one or more forecasts. These R² statistics provide useful information about the information content of the various forecasts.

Tables 88.2 and 88.3 unequivocally support the conclusion that the implied volatility index VIX provides more accurate forecasts of realized volatility than the historic volatility HV, the low-frequency ARCH forecast GJR given by the model of Glosten et al. (1993), and the intraday volatility INTRA calculated from 5-min and overnight returns. The VIX forecasts are more accurate than the other univariate forecasts for all 16 combinations of realized volatility
measure (sums of squared excess returns or INTRA), forecast horizon (1, 5, 10, or 20), and summary statistic (P or R²).

Fig. 88.2 (a) Out-of-sample comparison of 1-day-ahead forecasts of squared returns (logarithmic scale), 4 January 1993–31 December 1999. (b) Actual values and forecasts of realized volatility over 20 days, measured using intraday returns (logarithmic scale), 4 January 1993–31 December 1999.

The proportions of explained variability P given in Table 88.2 can be summarized by five remarks. First, VIX always has the highest value.⁵ Second, GJR and INTRA have similar values when forecasting for the next day. Third, INTRA ranks second to VIX when forecasting five or more days into the future. Fourth, the values of P generally increase as the horizon N increases, with the exception of the GJR forecasts. Fifth, the values of P are much higher for predictions of INTRA than for predictions of squared daily excess returns, as anticipated from the work of Andersen and Bollerslev (1998). Changing the target to be forecast from squared daily excess returns to sums of squared intraday returns approximately quadruples the proportions P for 1-day-ahead forecasts and more than doubles them for 5-day-ahead forecasts.

⁵ However, there are few differences in the values of P that are statistically significant at conventional levels.

The rows of Table 88.3 are arranged in the order that produces columns of ranked values of R² for N = 5, 10, 20, and almost a ranked column when N is 1. It is seen first that the variance of the 100 previous daily returns is less informative than all other forecasts. Second, INTRA has a higher statistic than GJR, except when N is 1, as also occurs in Table 88.2. Third, linear combinations of GJR and INTRA are less informative than VIX. Fourth, combinations of VIX and other forecasts are apparently only more informative than VIX alone when forecasting 1 or 5 days ahead using VIX and GJR. The lack of incremental information is striking for forecasts 10 and 20 days ahead; the value of R² increases by less than 0.001 when the first realized volatility measure is regressed on VIX, GJR, and INTRA instead of VIX alone, while the increase is from 0.559 to 0.565 for the second measure when N equals 10 and from 0.569 to 0.577 when N equals 20. The superior performance of VIX as the forecast horizon increases is consistent with its design as a measure of market expectations over 22 trading days. Fifth, the values of R² generally increase as N increases and the values are higher, often much higher, when realized volatility is measured using intraday returns.

The differences between the two performance measures, R² − P, are generally small. In particular, they are always less than 0.05 for the VIX forecasts, thus indicating that a linear function of these forecasts, optimized ex post, is not much more accurate than the ex ante forecasts. However, the differences R² − P are substantial for the GJR forecasts, and this is a consequence of the low values of P when forecasting 10 or 20 days ahead.
Fig. 88.3 Scatter plot of actual values and VIX forecasts of realized volatility over 20 days, measured using intraday returns, 4 January 1993–31 December 1999.

Table 88.2 The relative accuracy of volatility forecasts from January 1993 to December 1999. Realized volatility y_{T,N} is defined by either Σ_{j=1}^{N} (r_{T+j} − μ)² or Σ_{j=1}^{N} INTRA_{T+j}. Forecasts x_{T,N} made at time T are obtained from historic volatility estimates HV, the GJR model, the INTRA series, or the VIX series. These forecasts are defined in Sect. 88.3.2 and make use of Equations (88.4) (HV), (88.5) (GJR), (88.7) (VIX), and (88.8) (INTRA). The accuracy of forecasts is measured by the proportion of explained variability P defined in Equation (88.12), which is a linear function of the mean square error of the forecasts. Values of P are calculated from 1,769 − N forecasts.

Forecast   N = 1   N = 5   N = 10   N = 20
(A) Values of P for forecasts of sums of squared excess returns
HV         0.037   0.089   0.112    0.128
GJR        0.106   0.085   0.016    0.013
INTRA      0.099   0.204   0.214    0.250
VIX        0.115   0.239   0.297    0.348
(B) Values of P for forecasts of sums of the intraday volatility measure INTRA
HV         0.167   0.243   0.255    0.255
GJR        0.375   0.289   0.169    0.181
INTRA      0.383   0.494   0.455    0.465
VIX        0.401   0.533   0.534    0.545
Table 88.3 Correlations and multiple correlations for regressions of realized volatility on volatility forecasts from January 1993 to December 1999. Realized volatility is defined by either Σ_{j=1}^{N} (r_{T+j} − μ)² or Σ_{j=1}^{N} INTRA_{T+j}. Forecasts made at time T are obtained from historic volatility estimates HV, the GJR model, the INTRA series, or the VIX series. These forecasts are defined in Sect. 88.3.2 and make use of Equations (88.4) (HV), (88.5) (GJR), (88.7) (VIX), and (88.8) (INTRA). The table documents the squared correlation R² from regressions of realized volatility on one or more forecasts. Values of R² are calculated from 1,769 − N forecasts.

Explanatory variables      N = 1   N = 5   N = 10   N = 20
(A) Values of R² for forecasts of sums of squared excess returns
HV                         0.043   0.111   0.151    0.197
GJR                        0.118   0.181   0.189    0.223
INTRA                      0.099   0.212   0.238    0.285
GJR and INTRA              0.119   0.217   0.240    0.287
VIX                        0.129   0.249   0.304    0.356
VIX and INTRA              0.129   0.250   0.304    0.356
VIX and GJR                0.139   0.253   0.304    0.356
VIX, GJR, and INTRA        0.144   0.253   0.304    0.356
(B) Values of R² for forecasts of sums of the intraday volatility measure INTRA
HV                         0.185   0.282   0.309    0.335
GJR                        0.423   0.449   0.395    0.403
INTRA                      0.385   0.506   0.482    0.499
GJR and INTRA              0.443   0.525   0.490    0.504
VIX                        0.445   0.567   0.559    0.569
VIX and INTRA              0.448   0.575   0.563    0.576
VIX and GJR                0.491   0.586   0.564    0.576
VIX, GJR, and INTRA        0.495   0.586   0.565    0.577

88.4.2.2 Properties of Univariate Forecast Errors

Tests for bias in the forecasts of squared excess returns find no evidence of bias for all horizons N that have been considered. There is, however, statistically significant evidence that the forecasts of future INTRA values are downward biased for the out-of-sample period. For example, the average value of the INTRA forecast obtained from VIX information is 86% of the average value of INTRA, for each of the four forecasting horizons; t-tests for a zero mean applied to nonoverlapping forecast errors range from 3.02 to 4.09. The observed level of bias is consistent with an overall increase in estimates ĉ_T of the terms c_T defined by Equation (88.9), from 0.6 in 1993 to 0.8 in 1999; that is, consistent with a decrease in the dependence between consecutive intraday returns. As the estimates ĉ_T use information from the previous 4 years they are, with hindsight, less than the parameters c_T that they estimate, and hence the forecasts are on average less than the intraday volatility that they predict.

The values of R² in Table 88.3 imply that there is not much incremental predictive information in the GJR and INTRA forecasts, and that any such information is to be found at the shortest forecasting horizons. The increases in R², when GJR and INTRA are both added to the regressions that only employ VIX, are 5.0, 1.9, 0.6, and 0.8% when predicting realized volatility measured by INTRA at horizons 1, 5, 10, and 20, respectively; the increases are much less when predicting squared excess returns and their sums. Consequently, we only discuss the incremental predictive information beyond VIX for one-step-ahead forecasts.

Let GJRf, VIXf, and INTRAf refer to the forecasts of the next squared excess return defined by Equations (88.5), (88.7), and (88.8), respectively. A heteroscedasticity-adjusted regression of the forecast errors from VIX on the three sets of forecasts then gives

(r_{T+1} − μ)² − VIXf_T = a + 0.16 VIXf_T + 0.27 GJRf_T − 0.23 INTRAf_T + residual,

with standard errors 0.16, 0.13, and 0.15 for the three forecasts and a negative intercept a. The R² of this regression is only 0.4%, with an F-statistic equal to 2.37 and a p-value of 7%. The standard errors ignore the sampling error in the parameters that are used to calculate the forecasts and hence they underestimate the correct standard errors (see West 2001). Nevertheless, it can be concluded that the evidence for incremental forecasting information in daily and intraday returns is insignificant when predicting squared daily returns.

The forecasts of INTRA_{T+1} are ĉ_T multiplied by the forecasts of the squared excess return. They provide the estimated regression

INTRA_{T+1} − ĉ_T VIXf_T = a − 0.02 ĉ_T VIXf_T + 0.42 ĉ_T GJRf_T − 0.12 ĉ_T INTRAf_T + residual.

The forecast errors now vary significantly with the forecasts. The R² of the regression is 3.1%, the F-statistic equals 18.85, and the constant a has a t-ratio of 5.45, reflecting the bias in the forecasts discussed above. The standard errors of the three forecasts are 0.08, 0.06, and 0.07, with t-ratios of 0.27 for VIXf, 6.54 for GJRf, and 1.67 for INTRAf. These t-ratios must be interpreted cautiously because the standard errors may be misleading. The high t-ratio for GJRf compared with that for INTRAf is counterintuitive. The evidence for some incremental information is more intuitive and can be attributed to the mismatch between the 1-day-ahead forecast horizon and the 1-month horizon that defines VIX.
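These encompassing regressions can be reproduced in outline as follows. Because the chapter does not spell out its heteroscedasticity adjustment, this sketch substitutes White (HC0) robust standard errors as a stand-in, and the variable names are illustrative.

```python
import numpy as np
import statsmodels.api as sm

def encompassing_regression(y, vixf, gjrf, intraf):
    """Regress VIX forecast errors on all three one-step forecasts."""
    X = sm.add_constant(np.column_stack([vixf, gjrf, intraf]))
    res = sm.OLS(np.asarray(y) - np.asarray(vixf), X).fit(cov_type="HC0")
    return res.params, res.tvalues, res.rsquared
```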
88.5 Conclusion

Previous studies of low-frequency (daily or weekly) index returns and implied volatilities have produced conflicting conclusions about the informational efficiency of the S&P 100 options market. Our in-sample analysis of low-frequency data using ARCH models finds no evidence for incremental information in daily index returns beyond that provided by the VIX index of implied volatilities. This conclusion is in agreement with the recent evidence of Christensen and Prabhala (1998) and Fleming (1998). The extension here of the historic information set to include high-frequency (5-min) returns shows, furthermore, that there appears to be only minor incremental information in high-frequency returns; this information is almost subsumed by implied volatilities.
Out-of-sample comparisons of volatility forecasts show that VIX provides more accurate forecasts than either low-frequency or high-frequency index returns, regardless of the definition of realized volatility and the horizon of the forecasts. A mixture of the VIX forecasts and forecasts from index returns shows that there is probably some incremental forecasting information in daily returns when forecasting 1 day ahead. For the longer forecast horizon of 20 trading days, which closely matches the life of the hypothetical option that defines VIX, daily observations of VIX contain all the relevant forecasting information. Our results for equity volatility confirm the conclusion of Andersen and Bollerslev (1998), obtained for foreign exchange volatility, that intraday returns provide much more accurate measures of realized volatility than daily returns, as well as providing more accurate forecasts. Although high-frequency returns are highly informative about future volatility, we have found that implied volatilities were more informative throughout our sample period from 1987 to 1999.

Acknowledgments We are particularly grateful to two anonymous referees, as well as Francis Diebold, Kenneth West, Dan Galai, Nagpurnanand Prabhala, and Jan Jakobsen for their constructive comments on earlier versions of this paper. We have also benefited from the comments of participants at meetings held by the Bolsa de Derivados do Porto (Oporto, 1999), the London School of Economics (London, 1999), the American Finance Association (Boston, 2000), the Financial Management Association (Edinburgh, 2000), and the Bachelier Finance Society (Paris, 2000). We thank Robert Whaley for VIX data.
References

Akgiray, V. 1989. "Conditional heteroskedasticity in time series of stock returns: evidence and forecasts." Journal of Business 62, 55–80.
Andersen, T. G. and T. Bollerslev. 1998. "Answering the skeptics: yes, standard volatility models do provide accurate forecasts." International Economic Review 39, 885–905.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and H. Ebens. 2001a. "The distribution of stock return volatility." Journal of Financial Economics 61, 43–76.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys. 2001b. "The distribution of exchange rate volatility." Journal of the American Statistical Association 96, 42–55.
Bera, A. K. and M. L. Higgins. 1993. "ARCH models: properties, estimation and testing." Journal of Economic Surveys 7, 305–362.
Blair, B. J., S. H. Poon, and S. J. Taylor. 2001. "Modelling S&P 100 volatility: the information content of stock returns." Journal of Banking & Finance 25, 1665–1679.
Blair, B. J., S. H. Poon, and S. J. Taylor. 2002. "Asymmetric and crash effects in stock volatility." Applied Financial Economics 12, 319–329.
Bollerslev, T. 1987. "A conditional heteroskedastic time series model for speculative prices and rates of returns." Review of Economics and Statistics 69, 542–547.
Bollerslev, T., R. Y. Chou, and K. F. Kroner. 1992. "ARCH modelling in finance: a review of theory and empirical evidence." Journal of Econometrics 52, 5–59.
Bollerslev, T., R. F. Engle, and D. B. Nelson. 1994. "ARCH models," in Handbook of econometrics, Vol. IV. North-Holland, Amsterdam, pp. 2959–3037.
Bollerslev, T. and J. M. Wooldridge. 1992. "Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances." Econometric Reviews 11, 143–179.
Brailsford, T. J. and R. W. Faff. 1996. "An evaluation of volatility forecasting techniques." Journal of Banking & Finance 20, 419–438.
Canina, L. and S. Figlewski. 1993. "The informational content of implied volatility." Review of Financial Studies 6, 659–681.
Chiras, D. P. and S. Manaster. 1978. "The information content of option prices and a test for market efficiency." Journal of Financial Economics 6, 213–234.
Christensen, B. J. and N. R. Prabhala. 1998. "The relation between implied and realized volatility." Journal of Financial Economics 50, 125–150.
Day, T. E. and C. M. Lewis. 1992. "Stock market volatility and the informational content of stock index options." Journal of Econometrics 52, 267–287.
Dimson, E. and P. Marsh. 1990. "Volatility forecasting without data-snooping." Journal of Banking & Finance 14, 399–421.
Ebens, H. 1999. Realized stock volatility, Working paper, Department of Economics, Johns Hopkins University.
Figlewski, S. 1997. "Forecasting volatility." Financial Markets, Institutions and Instruments 6, 1–88.
Fleming, J. 1998. "The quality of market volatility forecasts implied by S&P 100 Index option prices." Journal of Empirical Finance 5, 317–345.
Fleming, J., B. Ostdiek, and R. E. Whaley. 1995. "Predicting stock market volatility: a new measure." Journal of Futures Markets 15, 265–302.
Franses, P. H. and D. Van Dijk. 1995. "Forecasting stock market volatility using (non-linear) GARCH models." Journal of Forecasting 15, 229–235.
Glosten, L. R., R. Jagannathan, and D. E. Runkle. 1993. "On the relation between the expected value and the volatility of the nominal excess return on stocks." Journal of Finance 48, 1779–1801.
Harvey, C. R. and R. E. Whaley. 1992. "Dividends and S&P 100 Index options." Journal of Futures Markets 12, 123–137.
Heynen, R. C. and H. M. Kat. 1994. "Volatility prediction: a comparison of stochastic volatility, GARCH(1,1) and EGARCH(1,1) models." Journal of Derivatives 2(2), 50–65.
Jorion, P. 1995. "Predicting volatility in the foreign exchange market." Journal of Finance 50, 507–528.
Latane, H. A. and R. J. Rendleman. 1976. "Standard deviations of stock price ratios implied in option prices." Journal of Finance 31, 369–381.
Nelson, D. B. 1991. "Conditional heteroskedasticity in asset returns: a new approach." Econometrica 59, 347–370.
Nelson, D. B. 1992. "Filtering and forecasting with misspecified ARCH models I: getting the right variance with the wrong model." Journal of Econometrics 52, 61–90.
Nelson, D. B. and D. P. Foster. 1995. "Filtering and forecasting with misspecified ARCH models II: making the right forecast with the wrong model." Journal of Econometrics 67, 303–335.
Taylor, S. J. and X. Xu. 1997. "The incremental volatility information in one million foreign exchange quotations." Journal of Empirical Finance 4, 317–340.
West, K. D. 2001. "Encompassing tests when no model is encompassing." Journal of Econometrics 105, 287–308.
Xu, X. and S. J. Taylor. 1995. "Conditional volatility and the informational efficiency of the PHLX currency options markets." Journal of Banking & Finance 19, 803–821.
Chapter 89
Detecting Structural Instability in Financial Time Series Derann Hsu
Abstract This paper reviews the literature that deals with structural shifts in statistical models for financial time series. The issue of structural instability is at the heart of the capital asset pricing theories on which many modern investment portfolio concepts and strategies rely. The need to better understand the nature of structural evolution, and to devise effective control of the risk in real-world investments, has become ever more urgent in recent years. Since the early 2000s, concepts of financial engineering and the resulting product innovations have advanced at an astonishing pace, but unfortunately often based on a faulty premise, and without proper regulatory keep-up. These developments have threatened to destabilize worldwide financial markets. We explain the principles and concepts behind these phenomena. Along the way we highlight developments related to these issues over the past few decades in the statistical, econometric, and financial literature. We also discuss recent events that illuminate the dynamics of structural evolution in the mortgage lending and credit industry.

Keywords Structural instability · Sudden shifts of variance · Volatility clustering · GARCH · Change-point problem · Bayesian analysis · Evolution of risk · Instability principle · Law of long time · Iterated cumulative sums of squares algorithm
89.1 Introduction

This article organizes and explains to the reader, in nontechnical language, the literature that provides the concepts, methods, and applications used in examining the stability of the statistical structure of financial time series. The author tells the stories of how the subject came to light about 35 years ago, when he and his colleagues first started the work in this area. He then elaborates why the subject will be, and should

D. Hsu, University of Wisconsin–Milwaukee, Milwaukee, WI, USA; e-mail: [email protected]
be, one core part of the knowledge base for every investment professional, and amateur as well. It is especially important to those who are responsible for risk management in financial institutions. It is also very useful to those who merely manage their own retirement plans, such as 401(k)s, or savings accounts.

Financial theories and practices over the past four decades have been built predominantly on a foundation of probability theories, and the models and methods derived therefrom. One aspect that has often been overlooked is the degree of stability of the statistical models that many of us rely on to formulate theories, develop strategies, and conduct risk management. It has been a general tendency among researchers as well as practitioners to take comfort in the models that they have diligently constructed. The presumption is that the models will remain stable for the foreseeable future. But in reality, no one can assure that the empirically based models are footed on solid ground. They may well sit on soft mud and can tumble down during heavy rains. Even the bedrock a model appears to stand on can turn over when an earthquake hits, and sooner or later it sure will.
89.2 Genesis of the Literature The seven papers by Hsu et al. (Hsu et al. 1974; Wichern et al. 1976; Hsu 1977, 1979, 1982a, b, 1984) listed in the references at the end of this article appear to have initiated the research work in this area. The first paper in JASA (Hsu et al. 1974) was motivated by a debate about whether stable Paretian distributions can properly characterize stock market returns. The author was mystified by the suggestion by Fama (1965) and Mandelbrot (1963, 1967) that stock returns carry infinite variance. He thought it was a pretty bold claim. That made him wonder what was really going on behind stock prices. Variability of stock returns was indeed at times erratic. But the stable Paretian distributions that include in them a class of distributions that have infinite variance were just too bizarre for stock returns. He soon realized that the statistical structure of stock returns is anything but stable,
and it did not matter whether he assumed it to be a member of the Paretian distributions or not. After having researched a considerable amount of stock price data, he was convinced that the inherent instability of the probabilistic structure of stock returns is a challenging issue, in both an intellectual and a practical sense. This motivated him to take up the work of designing schemes to detect variance changes in stock return data. It appeared that the variance of stock returns was occasionally subject to a "step" shift. It may be driven by a sudden increase of uncertainty in the political or economic environment. In many, though not all, situations events can be identified that are likely linked to a sharp rise of the stock price variability. His next paper in JRSS (Hsu 1977) developed two testing methods and applied them to the analysis of the weekly return series compiled from the Dow Jones Industrial Average (DJIA) for the period from January 1, 1971 through August 2, 1974 (162 weeks). The test methods are formulated to test the null hypothesis
H₀: σ₁² = ⋯ = σ_M² = σ₀²   (σ₀² is unknown)

against

H₁: σ₁² = ⋯ = σ_k² = σ₀²;   σ²_{k+1} = ⋯ = σ_M² = σ₀² + δ,   |δ| > 0,

where σ₁², ..., σ_M² are variances of the sequence of M consecutively observed independent normal random variables, denoted by Z₁, ..., Z_M, all having a known and constant mean μ; k is unknown (k = 1, 2, ..., M − 1); and δ is unknown. It is then defined that

X_i = (Z_i − μ)²,   i = 1, 2, ..., M.

The test statistic, derived from a concept of "locally most powerful test," is specified as

T = { Σ_{i=1}^{M} (i − 1) X_i } / { (M − 1) Σ_{i=1}^{M} X_i },   0 ≤ T ≤ 1.

The statistic under H₀ has a mean equal to 0.5 and variance equal to

Var(T) = (M + 1) / {6(M − 1)(M + 2)}.

The test statistic after standardization approaches a normal distribution with mean 0 and variance 1 when the sample size exceeds 30.

The second test statistic is a ratio of the cumulative sum of squares (after removing the mean) of the second part of the series to that of the first part, with the partition point shifting from the second time point to the last second time point in a screening fashion. The location of the maximum value of the test statistic in the sequence of the ratios is useful for identifying the change point. Analysis of the DJIA weekly return series indicates that in the second week of March 1973 the variance of index returns rose to 3.2 times the level observed in the preceding period. It was conceivably triggered by the political tumults that began to engulf the country as news of the Watergate wiretapping scandal erupted. Added on to it, just a few months later, was the Arabic oil embargo that sent the U.S. economy into a tailspin.

He then continued to work on improving the power and efficiency of the statistical procedure in detecting variance change. He also designed a way to estimate the change point with the added strength of robustness, meaning to desensitize the performance of the inference procedure against departure from the normal distribution assumption that is conventionally made in the main testing scheme; see Hsu (1982). His work also moved beyond the detection of variance change and toward the detection of instability of the beta coefficient, which represents a stock's systematic risk and is the core element in the modern investment portfolio theory. He showed that in reality some of the beta coefficients were not steady. It could be a fundamental problem for the whole battery of financial portfolio theories, models, and their applications. His interest at that time was to try to link the change in the beta coefficient to the forces that might have caused or induced it. In his paper in JFQA (Hsu 1984), he further postulated an "evolution" concept for the risk structure of stock market returns. The idea was borrowed from evolutionary biology, and the thought was that the risk structure of stock market returns will eventually shift as the market forces evolve and drive it through a stage of "sudden," "irregular," though not always dramatic, "genetic" changes.

Behind the idea of evolutionary risk structure is a principle that he called the "instability principle." It comes from the necessity for survival of species, or it is simply a principle the universe has been created, or has evolved, with. Random, previously nonexistent factors will and must come to life to maintain ecological developments on earth. The stimulation of "changes" will keep living beings vivid and vigorous over time. Trembling "earthquakes" are signs that the planet is alive and well. The hot molten rock at the core of the planet will erupt occasionally to show that it is functioning.

A main theorem of the instability principle is a natural law that Hsu calls the "law of long time." It says that any probabilistic phenomenon describable by a statistical or econometric model will have a shift in its structure (referring to its formulation, specification, or parameter values) with probability one, as time goes on. Though he is not perfectly certain that the so-called "probability" as it is conventionally described (using the example of flipping a coin) is well defined, his idea behind the "law of long time" is to say that no
probabilistic phenomenon is perfectly steady. It is a matter of relative degree. Phenomena in physics may have a high degree of stability and can be observed repeatedly and confirmed readily. On the other hand, economic, financial, and social behaviors may have a much lower degree of stability. And he has come to believe that phenomena and behaviors in financial markets probably have one of the lowest degrees of stability among any activities observable or comprehensible by mankind. By that logic, any financial analysis without proper attention to structural instability is flawed at best and foolhardy at worst. It has often been observed that when a price trend is deemed to be certain and to persist forever, it turns. In that sense, conventional capital asset pricing models that are presumably built on a steady variance-covariance matrix are destined to fail in reality. Furthermore, an escape route often taken by economists and financial researchers is to model risk (volatility) or betas as a homogeneous or stationary stochastic process. But this kind of approach cannot evade the supreme force of the "law of long time." Whatever stochastic process is used for that purpose, it does not have the assurance of perfect structural stability. That is risk at a higher dimension and is by nature inherent. This concept applies to the GARCH modeling techniques widely used by economists to handle time-varying error variance in regressions, and by financial researchers for risk analysis; see Engle (1982), Bollerslev (1986), and Nelson (1991). We will come back to the subject in a later section.
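For readers who want to experiment, here is a compact sketch of the single-change-point statistic T and its standardization described earlier in this section. The function name and interface are illustrative assumptions; the formulas follow the text.

```python
import numpy as np

def hsu_T_statistic(z, mu):
    """Locally most powerful test for a single variance shift (Hsu 1977):
    T = sum_i (i-1) X_i / ((M-1) sum_i X_i), with X_i = (Z_i - mu)^2."""
    x = (np.asarray(z, float) - mu) ** 2
    M = x.size
    i = np.arange(1, M + 1)
    T = np.sum((i - 1) * x) / ((M - 1) * x.sum())
    # under H0: E[T] = 0.5 and Var(T) = (M+1)/(6(M-1)(M+2));
    # the standardized statistic is approximately N(0, 1) for M > 30
    zstat = (T - 0.5) / np.sqrt((M + 1) / (6.0 * (M - 1) * (M + 2)))
    return T, zstat
```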
89.3 Problems of Multiple Change Points

The statistical techniques reported in Hsu's papers (1974–1984) were formulated for detecting a single change point in a sequence of random variables. There were two methodological tracks that he utilized. One is a traditional approach using a so-called "locally most powerful test" initiated by Kander and Zacks (1966) and explained in an earlier section, while the other is a Bayesian robust technique suggested by Box and Tiao (1973). While a single change point can be isolated within a segment of a time series by visual inspection of the relevant time series chart, statistical testing and estimation of the change point (the time point at which a shift of the parameter of interest takes place) can be carried out using the schemes proposed in Hsu's papers or some improved versions developed by others. When there are multiple change points in a time series the procedure will need further extensions. Wichern et al. (1976) use a sequential moving window for preliminary testing and screening and then estimate the potential change points when any are identified. Since then,
several improved techniques have been proposed and applied to analysis of stock prices. Baufays and Rasson (1985) used a likelihood function-based scheme, Booth and Smith (1982) employed a conventional Bayesian method, Inclan (1993) used a Bayesian odds approach, Barry and Hartigan (1993) used a product partition model, Inclan and Tiao (1994) developed an algorithm of iterated cumulative sums of squares (ICSS), and Chen and Gupta (1997) used a binary procedure combined with the Schwarz information criterion (SIC). The varied techniques mentioned above have their relative advantages, either in identifying change points more sensitively, estimating the change points more precisely, or executing the task more efficiently computation-wise.

Because of the importance of the ICSS algorithm developed by Inclan and Tiao (1994), which will be discussed in detail in a later section, some description of the method is as follows. The statistic used in the testing scheme is

D_k = C_k/C_T − k/T,   k = 1, 2, ..., T,   with D₀ = D_T = 0,

where C_k = Σ_{t=1}^{k} a_t² is a cumulative sum of squares of the random variables {a_t} with mean 0 and variance σ_t², t = 1, 2, ..., T, and T is the number of observations in the sample series. In the series of computed D_k, the location that provides max_k |D_k| is used as the estimate of the change point, if the corresponding value of √(T/2) |D_k| exceeds 1.358. The number 1.358 is the large-sample critical point at the 5% significance level of the test statistic. When it is suspected that more than one change point may be located in the series, the testing and estimation routine is applied to the first part and the second part of the time series separately. The routine can be iterated till no new change point is found. This detection procedure is easy to use and is highly efficient in terms of the CPU time on a computer.
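A minimal sketch of the D_k screening and its iteration on sub-segments may clarify the mechanics. This is a simplified rendering of the ICSS idea (the published algorithm refines the located points further), and the function and variable names are illustrative.

```python
import numpy as np

def icss_change_points(a, crit=1.358):
    """Simplified Inclan-Tiao (1994) cumulated-sums-of-squares screening,
    applied recursively to sub-segments until no shift is detected."""
    def scan(lo, hi, found):
        seg = a[lo:hi]
        T = seg.size
        if T < 2:
            return
        C = np.cumsum(seg ** 2)
        k = np.arange(1, T + 1)
        D = C / C[-1] - k / T                 # D_T = 0 by construction
        j = int(np.argmax(np.abs(D)))
        if np.sqrt(T / 2.0) * abs(D[j]) > crit:   # 5% large-sample critical value
            cp = lo + j + 1                   # first index of the new regime
            found.append(cp)
            scan(lo, cp, found)               # re-test each sub-segment
            scan(cp, hi, found)
    out = []
    scan(0, np.asarray(a, float).size, out)
    return sorted(out)
```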
In another set of literature, several econometricians seemingly independently developed some generally applicable techniques for the detection of structural change points in regression analysis. Andrews (1993) used Wald, Lagrange multiplier, and likelihood ratio tests coupled with a generalized method of moments estimation procedure for change point detection in nonlinear parametric models. Bai (1996) used weighted empirical distribution functions of estimated residuals, which are asymptotically distribution free, for testing parameter constancy in linear regression. Bai (1994) also used a least squares estimation procedure for the estimation of a shift in a linear process. Chu et al. (1996) developed two methods for monitoring structural change in a real time setting, one based on parameter fluctuations, and the other based on CUSUM monitoring procedures. Finally, Bai and Perron (1998) derived properties of the least squares estimators for multiple change points in linear regression. The research papers summarized above represent a broad scope of econometric developments aimed at the issues of structural instability in regression models. They are clearly applicable to financial modeling and risk management.¹ Extensions of some of the methods used above to the analysis of multivariate time series were developed by Bai et al. (1998).
89.4 Here Came the GARCH and Its Brethren

In light of the discussions included in the previous sections, there appears to be a clear need for financial and economic analysts to relax to some extent the constancy assumption on the regression coefficients and the error variance specified in standard econometric models. Engle (1982) was among the earlier authors to propose a way to relax the constancy assumption on the variance of the error term. Instead of inventing an economic theory to predict or explain why and how the error variance shifts over time, he took an easy way out by asking the data to speak for themselves. It naturally went the way of statistical time series modeling for the error variance. He picked a class of time series models that he called autoregressive conditional heteroscedasticity (ARCH) models. The idea is to let the squared errors follow an autoregressive model of order q; that is, to let the current squared error linearly regress on the squared errors of the immediately preceding q time points, assuming that the data points are collected periodically in calendar or other natural sequence. The approach has the advantage of letting the error variance float the way the data say it would, hoping that the autoregressive model will do its job properly. In a traditional econometric model, this is just a side show. Analysts would be glad to rid themselves of the agony of sticking to a constant variance assumption of which they cannot be sure. Adding to it is the benefit of better assured reliability of the estimated regression coefficients. It is a seemingly easy decision. The idea quickly gained traction among economists. The ARCH model was then extended to the generalized autoregressive conditional heteroscedasticity (GARCH) model by Bollerslev (1986). The model is to have the current variance value regress on p immediately preceding estimated variance values and q immediately preceding squared errors. It is a
D. Hsu
mixed autoregressive and moving average model of order p and q, respectively. In statistical jargons, the error variance is to follow an ARMA (p, q) model. Empirical applications of the GARCH model soon pointed to the needs for modifications. Nelson (1991) suggested a logarithmic transformation of the variance value with a skewness modification, and is often called EGARCH model because it implies an exponential function in its original scale. This modification is to prevent the occurrence of negative values for the variance as they sometimes appear in the empirical modeling process of the original GARCH model, which does not impose value restrictions. Another significant modification was made by Glosten et al. (1993). It used an ARMA (1, 1) model where the current variance value depends more on the immediately preceding squared error when that error carries a negative sign rather than a positive sign. This was motivated by the observations that stock market return variance tends to be larger during a market downturn; that is, when market returns are negative. However, one will have to come to the basic question: “Why does the error term have to follow a particular ARMA model with a set of specific parameters?” There are no ready answers. When one is dealing with the error term in the original econometric model, it is not a question to be taken seriously – after all, why would someone worry about the variance pattern in the “random errors,” or “residuals,” which the model does not care to account for? Nevertheless, the GARCH models and its brethrens can be a good tool for statistical descriptive or characterization purposes. But the model itself does not carry theoretical insights, or simple interpretations. In the spirit of the “instability principle” explained in a previous section, we would be more concerned about the stability of the GARCH model, say, in the mean level of the variance, especially when it is applied to the pricing of financial options contracts, in which the GARCH model plays the crucial role of projecting future risk level. As a ready illustration, take the real world example Engle (2001) used to explain how the GARCH model can be used to forecast the quantity, “value at risk” (VAR) for a composite portfolio based on some relevant U.S. asset returns data observed up to the end of first quarter of year 2000. Using a GARCH model, Engle derived forecasted values of VAR for the next 12 months of out-of-sample period. The forecasts turned out to be visibly too large relative to the subsequently observed values. This was due to an unexpected stabilization effect from the Federal Reserve’s aggressive cuts in interest rates in the wake of the burst of high tech stock market bubble. In this particular case, the bias happened to be on the conservative side and was likely inconsequential. But what if the forecasts and the risk assessment were made shortly prior to the stock market crash in October 1987, or sometime
89 Detecting Structural Instability in Financial Time Series
in August 1998 right before the worldwide financial crisis? The practical consequences could be very serious. Further, there are reported research findings that demonstrate that stock return variance is often subject to step shifts, instead of gradual shifts as described by a typical stationary ARMA model. Examples as previously discussed have been provided in Hsu (1977, 1979, 1982a, b, 1984), Hsu et al. (1974), Wichern et al. (1976). Substantial evidence has been reported in Lamoureux and Lastrapes (1990) that shows that step shifts in return variance have led the authors to the selection of the integrated GARCH (I-GARCH) model for some major stock returns series in which the underlying ARMA model has a unit root, or is non-stationary. It basically defies the original purpose of using the GARCH model to maintain a steady unconditional variance, or for the time series of variance to remain stationary. In other words, when the ARMA model that describes the conditional variance process reaches the non-stationarity region, the forecasts of variance derived there-from serve little practical functions. Further examples of step-shifts of variance that have confounded the GARCH modeling were provided by Chu (1995), Wilson et al. (1996), and Aggarwal et al. (1999). In the three papers just cited, the authors took up the task of detecting changes in the mean level of variance in the GARCH model. After properly isolating the various unconditional variance regimes in a time series and estimating a different mean level of variance for each segment of time series governed by a regime, the authors found that the values of the ARMA parameters in the segmented GARCH model, compared to those fitted to the entire time series, decrease dramatically in almost all cases and vanish in most. A study applying the findings to the monitoring of constancy of variance in conditionally heteroscedastic time series was conducted, and the results reported, by Horvath et al. (2006). In a broad perspective, GARCH modeling is useful in sweeping under the rug unwanted random errors from an econometric model and packing them in an empirically based ARMA models to characterize the time pattern of the error variance. It however does not necessarily provide significant assistance in obtaining insights in financial studies. It could instead, in some cases, misguide practitioners to think that the GARCH model can properly forecast future risk. In reality, there may be an important process that is driving the risk level but cannot be captured or characterized by the parameter values in the GARCH model. To understand the risk hidden behind a company’s stock price, one has to rely on a broad understanding of the soundness of the company’s business model, strategies, the competitions it faces, and the environment of the industry the company is in. An ex-post measure of the stock’s historical volatility and the volatility’s historical time pattern may not reveal relevant information about the company’s true level of future risk. The crucial part of the financial risk often comes to light only after a sudden breakdown of the GARCH model. It could be a disruptive
rise of the volatility to a level unprecedented in the history of the returns series. The event could take place subsequent to, say, a company’s announcement of exceedingly large unexpected earnings shortfalls, news of governmental probes of corporate accounting scandals, or the like. In short, one cannot rely on a weather forecaster who herself may be caught in the eye of a developing hurricane.2
89.5 Examples of Structural Shift Analysis in Financial Time Series Among the empirical work on “sudden” shifts of variance in financial time series, the papers by Aggarwal et al. (1999) and Wilson et al. (1996) deserve special attention. The first paper presents an analysis of stock market index returns of ten largest emerging markets by country, six major developed markets by country, three composites of multiple-country regional markets plus the world market, all for the period from May 1985 through April 1995. The second paper is to study the returns from 1-month oil future contracts, S&P 500 Index, and a composite portfolio of the common stocks of oil and natural gas producing companies in the U.S., all for the period from January 1984 through December 1992. Both papers shed important light on the nature and causes of sudden (step-type) variance changes in a broad political, economic, and social setting, as well as their implications on financial asset pricing and hedging. In the next part of this section, we will discuss the two papers in some detail and outline the main findings reported therein.
89.5.1 Analysis of Volatility in Emerging and Developed Stock Markets In the paper by Aggarwal, Inclan, and Leal (heretofore AIL) (1999), a collection of weekly market index returns for emerging markets, developed markets (both by countries), and several multiple-country regional markets and the world 2
To put things in a proper perspective, especially in the spirit of scholarship, the many researchers who have contributed to the GARCH literature or employed the techniques in various financial and economic applications are extraordinarily intelligent and serious scholars. The author admires their devotion to the goal to improve the methodological applicability and theoretical understanding of various complex phenomena involved in financial markets. The survey paper by Bollerslev et al. (1992), in the special issue titled “ARCH Models in Finance” of the Journal of Econometrics 52(1) (1992), outlines the spirit and accomplishments in this field in the first decade of the GARCH era. Several papers in the same issue indeed were devoted to the revisions of the standard CAPM and options pricing schemes in light of the time varying, yet stationary, returns variance. They are commendable contributions to the advancement of financial modeling and theoretical research.
market are analyzed to examine the occurrence of large sudden shifts of returns volatility in each of the index series. The method used to detect such shifts is the algorithm developed by Inclan and Tiao (1994), the Iterated Cumulative Sums of Squares (ICSS) procedure, which we described in an earlier section. The technique is easy to use, ranks high in computational efficiency, and has been the most widely used algorithm in the multiple-change-point variance-shift detection literature. The authors first conducted a series-by-series variance-shift scanning test and applied the estimation routine to each of the stock market index return series. The series involved are country-by-country stock market indices of (A) emerging Latin American markets: Argentina, Brazil, Chile, and Mexico; (B) emerging Asian markets: India, Malaysia, the Philippines, South Korea, Taiwan, and Thailand; (C) developed major markets: Hong Kong, Singapore, Germany, Japan, the U.K., and the U.S.; and (D) Morgan Stanley Capital International (MSCI) indices for emerging markets, the Far East region, the Latin America region, and the world as a whole. There are 20 index series in all, covering the period from May 1985 through April 1995. From the ICSS analysis of each of the 20 series, the authors uncovered numerous sudden shifts. The findings are reported in Table 2 of their paper, and some important aspects of the results are highlighted in the paragraphs that follow. (We note that "volatility" in the following context means the standard deviation of the return series.) Several of the dramatic rises in volatility in the market indices of Argentina, Brazil, and Mexico in the late 1980s were linked to hyperinflation in these countries. In the case of Mexico, stock market volatility doubled, relative to the period prior to it, during the Peso crisis that began in March 1994 and lasted through the end of the sample period in April 1995. In the Indian market, a doubling of market volatility during July 1990 – March 1991, relative to the prior 2 years, was thought to be related to the government's balance-of-payments crisis and election-related instability. Subsequently, during February 1992 – May 1992, Indian market volatility quadrupled, relative to the year before, due to a stock market scandal. In the Malaysian market, the October 1987 worldwide market crash and Chinese–Malay riots appeared to have caused a tripling of market volatility during October 1987 – January 1988. An implementation of increased reserve requirements and capital control measures again appeared to have caused a tripling of market volatility during the December 1993 – March 1994 period. In the Philippines market, the Marcos–Aquino conflict and a coup attempt were thought to be linked to a doubling of market volatility during the period February 1986 – September 1987. Later, in the period April 1989 – March
1991, the issues of debt overload, exchange closures, and a coup attempt were associated with a tripling of market volatility. In the Taiwan market, the October 1987 worldwide market crash and the central bank's withdrawal of support for the local currency were thought to have caused the quadrupling of volatility over the period July 1987 – February 1991. In the Thai market, the October 1987 global market crash, coincident with the death of the country's president and the ensuing political unrest, as well as high inflation, took a big toll on the stock market, reflected in a doubling of market volatility. Later, during the period August 1990 – March 1991, a military coup by a junta and the revelation of high-ranking government officials' corruption scandals were thought to have caused a doubling of market volatility. In the developed markets, the October 1987 crash was associated with a quadrupling of market volatility in the U.S., U.K., Hong Kong, and Singapore, but it lasted for only about three weeks in the U.S. and Hong Kong, and for about three months in the U.K. and Singapore. Two special local events made big marks on their respective stock markets. The Tiananmen Square turmoil in China caused a quadrupling of market volatility in the Hong Kong market during May 1989 – July 1989. Fears of a possible collapse of the banking system in Japan during April 1992 – September 1992 caused plunges in the value of the Japanese Nikkei 225 stock market index and a doubling of market volatility. The important lesson that we learn from these observations is that the volatility of stock returns is not a pure economic or statistical phenomenon. It is strongly influenced by political, social, and economic activities in the country in which the market operates. After understanding how the level of volatility is influenced by the various factors, and the predictability of these influencing factors, one would then hopefully be able to find a better way to gauge the risk level of investing in a particular market. In other words, the results from variance-shift detection only provide the first step in gathering historical information that may help one build a conceptual framework for risk measurement and projection. The perception that a GARCH model or its brethren can help one gaze into a crystal ball to view future volatility is naïve at best; it is more fantasy than reality. To resolve the issue of whether the GARCH model is appropriate or useful for characterizing time-varying variance in financial time series, and to see how sudden shifts in return variance affect the performance of the GARCH model, the aforementioned authors, AIL, sought to filter out the net effects of the sudden shifts from the GARCH modeling of the series in question. A hybrid model based on GARCH(1, 1) is formulated as follows:

Y_t = μ + e_t,   e_t | I_{t−1} ~ N(0, h_t),
h_t = ω + d_1 D_1 + … + d_n D_n + α e²_{t−1} + β h_{t−1},

where Y stands for the observed returns; N represents the conditional normal probability density function with mean zero and variance h_t; I_{t−1} is the information available up to time t − 1; and D_1, …, D_n are dummy variables taking a value of one from each point of sudden change of variance onward, and zero elsewhere. The parameters α and β are the first-order moving average and the first-order autoregressive parameters, respectively, in the GARCH model. A pure GARCH(1, 1) model in the conventional setting does not accommodate the impacts of step jumps in the average level of the variance. When it is force-fitted to data that exhibit such jumps, the estimated values of the parameters α and β will be distorted, and mostly pushed higher to reflect the superimposed non-stationarity. The increments in the parameter values due to these dubious or inflated factors depend on the timing and sizes of the jumps in the level of the unconditional variance across the various segments of the time series. In most cases, the distortions can be very substantial. As such, an effective way to see whether this is the case is simply to isolate the variance regimes using the dummy variables as indicated in the model above and re-estimate the GARCH model within each segment of the time series. That is what the authors did. In their Table 3, the refitted hybrid model produces truly significant values of either α or β, or both, in only the Taiwan and Japan series, compared with 20 out of 20 series in the pure GARCH(1, 1) model. It is a clear indication that the GARCH effects in empirical modeling, which had previously been reported by various researchers, may in large part have been driven by the effects of sudden variance jumps. The main issue here is not which statistical model formulation, the pure GARCH or sudden jumps, works better. The real issue is the nature of the variance shift taking place. If it is a step-type shift, then modeling the variance process with a conventional pure ARMA model, which is designed to fit relatively smooth and gradual movements in the value of the variance, is a misguided approach to modeling financial time series. The most worrisome consequence of misapplying pure GARCH models to financial time series is that it gives users the false impression that the future movement of the level of variance is predictable. One would be able to sense the upcoming crash of a stock market only when one is able to comprehend and identify the causes of the disequilibrium and the forces that may have moved the price level to an unsustainable level. This understanding is unlikely to show up in any crystal ball, "GARCH" included.
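To make the mechanics of the hybrid model concrete, the following sketch (not from the AIL paper; the data and parameter values are simulated and purely illustrative) fits the GARCH(1, 1) likelihood with a single variance-regime dummy by maximum likelihood:

    import numpy as np
    from scipy.optimize import minimize

    def neg_loglik(params, y, D):
        # params = [mu, omega, d_1..d_n, alpha, beta]; D is a T x n dummy matrix
        n = D.shape[1]
        mu, omega = params[0], params[1]
        d, alpha, beta = params[2:2 + n], params[2 + n], params[3 + n]
        e = y - mu
        h = np.empty(len(y))
        h[0] = e.var()                          # initialize at the sample variance
        for t in range(1, len(y)):
            h[t] = omega + D[t] @ d + alpha * e[t - 1] ** 2 + beta * h[t - 1]
        h = np.maximum(h, 1e-12)                # guard against non-positive variance
        return 0.5 * np.sum(np.log(2 * np.pi * h) + e ** 2 / h)

    # Simulated returns with a step jump in unconditional variance at t = 500.
    rng = np.random.default_rng(0)
    y = np.concatenate([rng.normal(0, 1.0, 500), rng.normal(0, 3.0, 500)])
    D = (np.arange(len(y)) >= 500).astype(float)[:, None]  # one from the break onward

    x0 = np.array([0.0, 0.5, 0.5, 0.10, 0.50])
    bounds = [(None, None), (1e-6, None), (None, None), (0.0, 0.999), (0.0, 0.999)]
    fit = minimize(neg_loglik, x0, args=(y, D), bounds=bounds, method="L-BFGS-B")
    mu, omega, d1, alpha, beta = fit.x
    print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, alpha + beta = {alpha + beta:.3f}")

With the dummy absorbing the level shift, the fitted α + β on such data is typically far below one; refitting the same series with the dummy column removed (D = np.zeros((len(y), 1))) pushes the estimated persistence toward the I-GARCH boundary, reproducing the distortion described above.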
89.5.2 Detecting Volatility Changes Across the Oil Sector

The paper by Wilson, Aggarwal, and Inclan (hereafter WAI) (1996) has a different research motivation. It investigates whether the oil futures market in the U.S. is linked to the equity market of oil and gas producing companies in the U.S., proxied by an equally weighted portfolio of the common stocks of 160 such firms, closely enough that the latter can serve as an effective hedging instrument for the former. Along the way, the S&P 500 market index was also examined, to see whether the oil company stock portfolio is linked more closely to the equity market index or to the prices of oil futures contracts. The daily data involved in the analysis cover the period January 1, 1984 through December 31, 1992, with the number of observations ranging from 2,192 to 2,276 among the three series. The analysis was again carried out using the ICSS algorithm to detect the time points of sudden shifts in the variance of the daily rates of return on each of the three time series. The idea is to see whether the timings of sudden shifts in volatility in the three series coincide with one another. A confirmation or rejection in this regard is important because many oil hedging strategies hinge on the timing of such variance shifts in the counterpart instruments' returns series. The analysis of the 1-month-maturity oil futures contracts reveals 15 sudden variance shifts over the sample period. The authors linked some of the change points to important developments in the oil industry during the period. From November 1985 through December 1986, OPEC members were in bitter disputes over oil production quotas. In late 1985, the Saudi Arabian oil minister delivered on his earlier threats to launch an all-out oil production surge, attempting to flood the world market with oil and bring down oil prices. This was to punish fellow oil-producing countries that had allegedly cheated on their respective production quotas. Prices of oil futures contracts consequently dropped from $35 per barrel at the beginning of the period to as low as $10 per barrel, while volatility rose more than fourfold at times during the period, relative to the immediately preceding period. During the first Gulf War, in which Iraqi invasion forces were driven out of Kuwait by U.S. armed forces, the volatility of oil futures prices from January 14, 1991 through January 23, 1991 rose to a level seven times the normal level. In summary, these findings show that the volatility of oil futures prices can be strongly affected by events around the globe, such as a dispute over production quotas or a war that may interrupt the production or shipping of oil.
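For readers who want to experiment with the detection step, here is a minimal sketch of the centered cumulative-sum-of-squares statistic at the core of the Inclan–Tiao procedure; the full ICSS algorithm applies this single-break test iteratively to the sub-segments it finds. The data here are simulated, and 1.358 is the standard 5% asymptotic critical value for this statistic.

    import numpy as np

    def d_k(returns):
        """Centered cumulative sum of squares: D_k = C_k / C_T - k / T."""
        r = np.asarray(returns, dtype=float)
        r = r - r.mean()
        C = np.cumsum(r ** 2)
        k = np.arange(1, len(r) + 1)
        return C / C[-1] - k / len(r)

    def single_break(returns, crit=1.358):
        """Return (k*, statistic) if max_k sqrt(T/2)|D_k| exceeds the critical
        value, else None; k* estimates the location of the variance change."""
        D = d_k(returns)
        T = len(D)
        stat = np.sqrt(T / 2.0) * np.abs(D)
        k_star = int(stat.argmax())
        return (k_star, float(stat[k_star])) if stat[k_star] > crit else None

    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0, 1.0, 400), rng.normal(0, 2.0, 400)])
    print(single_break(x))        # detects a break near observation 400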
WAI's analysis of the returns series of the S&P 500 Index came up with results that are more directly related to the general financial market. The U.S. stock market crash in October 1987 was the dominant feature of the volatility shifts during the sample period. The portfolio of common stocks of oil and gas producing companies in the U.S., on the other hand, was found to have change points largely different from those of the S&P 500 Index, although a major spike of volatility did occur during October 1987. This was expected, because stocks of U.S. oil companies are part of the U.S. equity market, and financial market turmoil naturally affects the returns of oil and gas stocks. The lack of close association between the sudden shifts in the price volatility of the oil futures market and those of the portfolio of oil and gas producing companies' stocks may render inadequate the strategy of hedging holdings of oil futures contracts with oil producing companies' common stocks. An analysis using a hybrid model that combines the GARCH model and the sudden shifts in volatility, as in the AIL paper, was also carried out. The results are similar to the cases discussed in the AIL paper, in which the efficacy of the GARCH model diminishes markedly. However, in the returns series of oil futures contracts, the GARCH effects linger at a modest level rather than simply vanishing.
89.5.3 Other Recent Applications of Variance Shift Detection

The analytical approach outlined earlier has been utilized to study financial time series arising in many other contexts and countries. Malik (2003) studied the behavior of foreign exchange rate time series and found a dominant effect of sudden variance shifts in the GARCH framework; the research concludes that the pure GARCH model overestimates the persistence of conditional volatility, that is, the parameters α and β in the model. Malik et al. (2005) reported similar findings for Canadian stock returns. Moore and Wang (2007) studied sudden variance switching in the stock markets of new European Union (EU) states and found that entrance into the EU helped stabilize the entering countries' stock market volatility. Diamandis (2008) used the hybrid model described earlier in this section to study the effects of financial liberalization on four Latin American equity markets. Wang and Theobald (2008) explored regime switching in the stock return volatility of six East Asian emerging markets for the impacts of financial liberalization. Marcello et al. (2008) investigated the asymmetric variance and spillover effects revealed by regime shifts in the Spanish stock market. Lastly, Wang and Moore (2009) conducted a study similar to the AIL paper on five central European stock markets and drew similar conclusions.
In summary, the analysis using the AIL approach discussed earlier in this section has attracted increasing attention among financial researchers and analysts. The analysis provides important inputs into a new paradigm of financial asset pricing theories. Academic researchers as well as financial practitioners will need to recognize the degree of instability of the system they are dealing with and look at ongoing activities in financial markets from a dynamic point of view.
89.6 Implications of Structural Instability to Financial Theories and Practice

In contrast to straightforward statistical modeling, in-depth empirical descriptions and explanations of stock market volatility changes over time have been provided by a handful of financial researchers. Schwert (1989) studied the phenomenon of volatility changes by analyzing U.S. stock price data, along with related economic data, tracing back to the year 1857. Alongside the time series of major U.S. stock market index returns, he examined various U.S. macroeconomic time series ranging from production, inflation, monetary growth, bond yields, and market trading days to financial leverage over the entire sample period. Despite his extensive investigative efforts, he concluded that he could not find definite determinants of the volatility of U.S. stock market returns. The question he asked after this seemingly exhaustive examination is "Why was stock market volatility so high in the U.S. during the period 1929–1939?" – because it is the puzzle that has so far defied rational explanation. He noted, however, that he was hesitant to cede the unexplained phenomena to social psychologists as proof of fads or bubbles. In an in-depth theoretical paper, McQueen and Vorkink (2004) formulated a preference-based equilibrium asset pricing model that links wealth changes to loss aversion and to a mental scoreboard. They reasoned that, after a significant experience of losses, investors' mental scorecards make them more sensitive to news and more risk averse until the experience fades away. When the fear of losses diminishes, investors return to their old mental state and begin to take risk again. The authors theorized that this psychological cycle explains volatility clustering, or the persistence of conditional volatility. In other words, the world is evolving around us; it drives our emotions, and our emotions enhance or mute the changes. Market volatility may reflect such interactions. The empirical study that we discussed in a previous section regarding the analysis of 20 index return series of stock markets around the world reminds us that there are events that we do not fully comprehend, and much less
are able to predict, that will influence investors' decisions. Perception, intuition, and perhaps animal spirits may affect the market more than the cold, rational reasoning that many economic and financial theories postulate. Understanding of human psychology, cultural propensities, social and political trends, and the legal and regulatory structure and its weaknesses, flaws, or loopholes may better help investors visualize the future of the market and its hidden risk. As an example, let us look at recent developments in financial markets, as of summer 2008 when this article is written, that may help the reader do more of his or her own thinking about the formation and evolution of an episode of structural shift in a statistical risk assessment model that is fundamental to mortgage-backed securities. By the end of 2005, there were clear signs that housing markets around the world had gone through an unprecedented boom. A dominant factor behind this boom was the surging popularity of mortgage-backed securities. At the start, securitization of mortgage loans was indeed a clever scheme that created liquidity, investment opportunities, and rich returns for investors. It reduces risk and simultaneously increases the returns of the packaged loans. Securitization begins with pooling a large number of loans from diverse geographical areas and from different origination institutions. The pool of loans is used as a base to issue pass-through securities (which pass through interest payments and principal paybacks from borrowers to investors), commonly called "mortgage-backed securities," which are sold to investors. To provide further risk reduction, various types or issues of mortgage-backed securities are shuffled and reshuffled to form new securities called collateralized loan obligations (CLOs) or collateralized debt obligations (CDOs). If the default rates of the original loans and the variance–covariance matrix among them are steady, the default risks associated with the upper (AAA) and middle (AA to BB) tranches (layers classified by default risk) of CLOs and CDOs are very small and can be calculated with reasonable accuracy. The perception among investors prior to the summer of 2007 was that CLOs and CDOs, especially the upper tranche, which accounted for the vast majority of mortgage-backed securities, were almost risk free, because of the extra diversification and safeguards built into the securities. They largely, and justifiably it seemed, carried the AAA rating. Mortgage brokers, real estate brokers, home appraisers, loan officers, mortgage lenders, investment bankers, and ultimately the investors and home buyers were all told that there was little or near-zero chance that anything would go wrong in a major way. And indeed, there were troops of high-powered mathematicians, statisticians, actuarial experts, reputable bankers, and top-notch analysts who were ever ready to testify in support of the soundness and mathematical accuracy of the default probability calculations backed by voluminous historical databases.
But when everyone believed in the miracle financial engineers had so cleverly created, the chain of participants in the system started to let down their guard. The procedure for screening borrowers became lax and in many cases nonexistent. Prudence in mortgage lending decisions deteriorated precipitously, and quickly the common sense of banking practice was thrown to the wind. This happened in epidemic fashion and progressed on a massive scale. The correlation structure of the default rates of individual loans underwent a drastic shift: the cross-correlations among individual loans converged toward one, while default rates rose across the board. This pierced the heart of the diversification argument that gave birth to CDOs and CLOs. Sure enough, reported default (delinquency and foreclosure) rates of mortgage loans in the U.S. soared rapidly by late 2006. By mid-2007, the worldwide mortgage lending system was jolted by the realization that the quality of mortgage-backed securities in the bottom tranches (called "equity," with a high concentration of sub-prime loans) had declined to a shockingly low level, and the liquidity of that class of securities simply evaporated. Trading of many CDOs and CLOs was frozen as a result. As of the writing of this article in the summer of 2008, hundreds of billions of dollars of the value of mortgage loans had been written down by the holding financial institutions around the world. The turmoil in U.S. housing markets and the global banking and credit system was deepening by the day, and governmental interventions had been called upon and were taking shape around the world. These observations illustrate that a statistically based securities design, while sound and clever at the onset, can, when perceived to be so, promptly invite abuses or reckless applications. That may lead to its eventual demise. Furthermore, the lesson we have learned from the reasoning described above is that one should pay close attention to potential structural shifts in the statistical models on which one relies. No statistical model, however carefully and ingeniously constructed, will last very long, especially in financial applications. So be ever vigilant and guard against a potential episode of "abrupt evolution." Detection of structural instability in financial time series is an ever-evolving art.
89.7 Direction of Future Research and Developments

It would be appropriate to outline our thoughts about what to expect of future research and developments in the area of structural shift detection.

1. There is a need to allow both sudden and gradual structural shifts in a single modeling process. The role of GARCH models needs reevaluation or repositioning.
2. The concept of the "degree of statistical instability" of a financial system needs a formal and thorough classification system.
3. Financial asset pricing theories need rebuilding or a thorough overhaul in light of the prevalence of structural instability in real-world phenomena.
4. The soundness, or degree of stability, of a financial system has much to do with the timeliness and soundness of the federal government's monetary and fiscal policy and, very importantly, of its regulatory legislation, policy, and enforcement. How to get the government to move at the same pace and be on the same page as financial markets is an urgent issue. The recent mortgage lending crisis is an example of how things can go wrong and of the costs of not stopping corrupt behaviors before it is too late.
5. The stability, soundness, and efficiency of a financial system do not come naturally. They require extensive and well-thought-out education on the part of participants. Rationality does not come at birth; it comes from learning, self-evaluation, habit formation, long-term self-discipline, and social interactions. We cannot expect every investor to act in an optimal fashion unless he or she is provided with the proper information and guidance necessary to avoid foolish or self-destructive acts. The stock market crash in 1987 (linked to reckless computer trading), the worldwide financial crisis in 1997–1998 (linked to lax standards in making loans to developing countries), the high-tech stock bubble of the late 1990s (linked to wild and uneducated expectations), and the housing bubble of 2002–2007 (linked to misconceptions about risk diversification) have all warned us that the financial system is vulnerable to participants' self-destructive behaviors. These behaviors arose in part from a lack of proper financial education on the part of individual investors and the low costs incurred for moral hazards on the part of institutional investors. Remedies to these problems deserve future attention from researchers, governments, and practitioners.
89.8 Epilogue

All of the previous sections of this article were written in mid-August 2008; the final stage of the article was completed at the end of October 2008. In the ensuing eight weeks, the U.S. financial system and its counterparts in almost all countries around the world went through a "perfect storm." A series of financial crises occurred in rapid sequence, starting with the U.S. government's takeover of Fannie Mae and Freddie Mac on September 7, followed by Lehman Brothers' bankruptcy on September 15, and then a tidal wave of near collapses of AIG, Merrill Lynch, Washington Mutual, Wachovia, National City Corp., and a score
of other European banks and financial institutions. A federal bailout plan of over $700 billion was approved by the U.S. Congress on October 3, 2008, in the hope of unfreezing the worldwide credit system. The roots of the problems have been explained in previous sections. However, we would like to report in this addendum that analyses using the detection methods described in the previous sections of this article all suggested, unanimously, that a variance shift in U.S. stock market daily index returns took place on the day Lehman Brothers declared bankruptcy, that is, September 15, 2008. The standard deviation of S&P 500 Index daily returns tripled for the period from September 15, 2008 through October 28, 2008 (the last available data point at the time these analyses were carried out), relative to the standard deviation of returns over the 3 months prior to the change point. The size of the standard deviation shift is extraordinary in light of the empirical findings that we have accumulated across various countries over the past four decades. This points to an urgent need for drastic changes in the ways (i) financial institutions are structured, (ii) derivative products are designed and regulated, and, most important of all, (iii) market participants are educated.
References

Aggarwal, R., C. Inclan, and R. Leal. 1999. "Volatility in emerging stock markets." Journal of Financial and Quantitative Analysis 34, 33–55.
Andrews, D. W. K. 1993. "Tests for parameter instability and structural change with unknown change point." Econometrica 61, 821–856.
Bai, J. 1994. "Least squares estimation of a shift in linear processes." Journal of Time Series Analysis 15, 453–472.
Bai, J. 1996. "Testing for parameter constancy in linear regression: an empirical distribution function approach." Econometrica 64, 597–622.
Bai, J., R. L. Lumsdaine, and J. H. Stock. 1998. "Testing for and dating common breaks in multivariate time series." Review of Economic Studies 65, 395–432.
Bai, J. and P. Perron. 1998. "Estimating and testing linear models with multiple structural changes." Econometrica 66, 47–78.
Barry, D. and J. A. Hartigan. 1993. "A Bayesian analysis for change point problems." Journal of the American Statistical Association 88(421), 309–319.
Baufays, P. and J. P. Rasson. 1985. "Variance changes in autoregressive models," in Time series analysis: theory and practice 7, O. D. Anderson (Ed.). North Holland, New York, pp. 119–127.
Bollerslev, T. 1986. "Generalized autoregressive conditional heteroscedasticity." Journal of Econometrics 31, 307–327.
Bollerslev, T., R. Y. Chou, and K. F. Kroner. 1992. "ARCH modeling in finance." Journal of Econometrics 52, 5–59.
Booth, N. B. and A. F. M. Smith. 1982. "A Bayesian approach to retrospective identification of change-points." Journal of Econometrics 19, 7–22.
Box, G. E. P. and G. C. Tiao. 1973. Bayesian inference in statistical analysis, Addison-Wesley, Reading, MA.
Chen, J. and A. K. Gupta. 1997. "Testing and locating variance changepoints with application to stock prices." Journal of the American Statistical Association 92, 739–747.
Chu, C.-S. J. 1995. "Detecting parameter shift in GARCH models." Econometric Reviews 14, 241–266.
Chu, C.-S. J., M. Stinchcombe, and H. White. 1996. "Monitoring structural change." Econometrica 64, 1045–1065.
Diamandis, P. 2008. "Financial liberalization and changes in the dynamic behaviour of emerging market volatility: evidence from four Latin American equity markets." Research in International Business and Finance 22(3), 362–377.
Engle, R. F. 1982. "Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation." Econometrica 50(4), 987–1007.
Engle, R. F. 2001. "GARCH 101: the use of ARCH/GARCH models in applied econometrics." Journal of Economic Perspectives 15(4), 157–168.
Fama, E. F. 1965. "The behavior of stock-market prices." Journal of Business 38, 34–105.
Glosten, L. R., R. Jagannathan, and D. E. Runkle. 1993. "On the relation between the expected value and the volatility of the nominal excess return on stocks." Journal of Finance 48(5), 1779–1801.
Horvath, L., P. Kokoszka, and A. Zhang. 2006. "Monitoring constancy of variance in conditionally heteroskedastic time series." Econometric Theory 22, 373–402.
Hsu, D. A., R. B. Miller, and D. W. Wichern. 1974. "On the stable Paretian behavior of stock-market prices." Journal of the American Statistical Association 69(345), 108–113.
Hsu, D. A. 1977. "Tests for variance shift at an unknown time point." Journal of the Royal Statistical Society, Series C, 26(3), 279–284.
Hsu, D. A. 1979. "Detecting shifts of parameter in gamma sequences with applications to stock price and air traffic flow analysis." Journal of the American Statistical Association 74(365), 31–40.
Hsu, D. A. 1982a. "A Bayesian robust detection of shift in the risk structure of stock market returns." Journal of the American Statistical Association 77(377), 29–39.
Hsu, D. A. 1982b. "Robust inferences for structural shift in regression models." Journal of Econometrics 19, 89–107.
Hsu, D. A. 1984. "The behavior of stock returns: Is it stationary or evolutionary?" Journal of Financial and Quantitative Analysis 19(1), 11–28.
Inclan, C. 1993. "Detection of multiple changes of variance using posterior odds." Journal of Business and Economic Statistics 11, 189–200.
Inclan, C. and G. C. Tiao. 1994. "Use of cumulative sums of squares for retrospective detection of changes of variance." Journal of the American Statistical Association 89(427), 913–923.
Kander, Z. and S. Zacks. 1966. "Test procedures for possible changes in parameters of statistical distributions occurring at unknown time points." Annals of Mathematical Statistics 37, 1196–1210.
Lamoureux, C. G. and W. D. Lastrapes. 1990. "Persistence in variance, structural change, and the GARCH model." Journal of Business and Economic Statistics 8(2), 225–234.
Malik, F. 2003. "Sudden changes in variance and volatility persistence in foreign exchange markets." Journal of Multinational Financial Management 13(3), 217–230.
Malik, F., B. T. Ewing, and J. E. Payne. 2005. "Measuring volatility persistence in the presence of sudden changes in the variance of Canadian stock returns." Canadian Journal of Economics 38(3), 1037–1056.
Mandelbrot, B. 1963. "The variation of certain speculative prices." Journal of Business 36, 394–419.
Mandelbrot, B. 1967. "The variation of some other speculative prices." Journal of Business 40, 393–413.
Marcello, J. L. M., J. L. M. Quiros, and M. M. M. Quiros. 2008. "Asymmetric variance and spillover effects: regime shifts in the Spanish stock market." Journal of International Financial Markets, Institutions and Money 18(1), 1–15.
McQueen, G. and K. Vorkink. 2004. "Whence GARCH? A preference-based explanation for conditional volatility." Review of Financial Studies 17(4), 915–949.
Moore, T. and P. Wang. 2007. "Volatility in stock returns for new EU member states: Markov regime switching model." International Review of Financial Analysis 16(3), 282–292.
Nelson, D. B. 1991. "Conditional heteroscedasticity in asset returns: a new approach." Econometrica 59(2), 347–370.
Schwert, G. W. 1989. "Why does stock market volatility change over time?" Journal of Finance 44(5), 1115–1153.
Wang, P. and M. Theobald. 2008. "Regime-switching volatility of six east Asian emerging markets." Research in International Business and Finance 22(3), 267–283.
Wang, P. and T. Moore. 2009. "Sudden changes in volatility: the case of five central European stock markets." Journal of International Financial Markets, Institutions and Money 19(1), 33–46.
Wichern, D. W., R. B. Miller, and D. A. Hsu. 1976. "Changes of variance in first-order autoregressive time series models – with an application." Journal of the Royal Statistical Society, Series C, 25(3), 248–256.
Wilson, B., R. Aggarwal, and C. Inclan. 1996. "Detecting volatility changes across the oil sector." Journal of Futures Markets 16(3), 313–330.
Chapter 90
The Instrument Variable Approach to Correct for Endogeneity in Finance

Chia-Jane Wang
Abstract The endogeneity problem has received a mixed treatment in corporate finance research. Although many studies implicitly acknowledge its existence, the literature does not consistently account for endogeneity using formal econometric methods. This chapter reviews the instrumental variables approach to endogeneity from the point of view of a finance researcher who is implementing instrumental variable methods in empirical studies. This review is organized into two parts. Part I discusses the general procedure of the instrumental variables approach, the related diagnostic statistics for assessing the validity of instruments, which are important but not frequently used in finance applications, and some recent advances in econometrics research on weak instruments. Part II surveys corporate finance applications of instrumental variables. We found that the instrumental variables used in finance studies are often chosen arbitrarily and very few diagnostic statistics are performed to assess the adequacy of IV estimation. The resulting IV estimates are thus questionable.

Keywords Endogeneity · Instrumental variable estimation · 2SLS · Weak instruments · Corporate finance
90.1 Introduction

Corporate finance often involves a number of decisions that are intertwined and endogenously chosen by managers and/or debtholders and/or shareholders. For example, in order to maximize value, firms must form an effective board of directors and grant their managers an optimal pay-performance compensation contract. Debtholders have to decide how much debt, and with what features (junior or senior, convertible, callable, maturity length, etc.), the debt should be structured. Given that these endogenously chosen
C.-J. Wang, Manhattan College, Riverdale, NY, USA
e-mail: [email protected]
variables are interrelated and are often partially driven by unobservable firm characteristics, the endogeneity problem can make standard OLS results hard to interpret (Hermalin and Weisbach 2003). The usual approach to the endogeneity problem is to implement the instrumental variables estimation method: one chooses instrumental variables that are correlated with the endogenous regressors but uncorrelated with the structural equation errors, and then employs a two-stage least squares (2SLS) procedure. The endogeneity problem has received a mixed treatment in corporate finance research. Although many studies implicitly acknowledge its existence, the literature does not consistently account for endogeneity using formal econometric methods. There is an increasing emphasis on addressing the endogeneity problem in recent work, and the simultaneous equations model is now used more commonly. However, the instrumental variables are often chosen arbitrarily, and few diagnostic statistics are performed to assess the adequacy of IV estimation. This chapter reviews the instrumental variables approach to endogeneity from the point of view of a finance researcher who is implementing instrumental variable methods in empirical studies. We do not say much about distribution theory or the mathematical proofs of the estimation methods and test statistics, as they are well covered in econometrics textbooks and articles. The chapter proceeds as follows: Sect. 90.2 describes the statistical issue raised by endogeneity. Section 90.3 discusses the commonly used instrumental variables approaches to endogeneity: two-stage least squares estimation (2SLS) and generalized method of moments estimation (GMM). Section 90.4 discusses the conditions for a valid instrument and the related diagnostic statistics, which are critical but not frequently used in finance applications. Section 90.5 considers the weak instrument problem, that is, instruments that are exogenous but have weak explanatory power for the endogenous regressor; some recent advances in econometrics research on statistical inference with weak instruments are briefly discussed. Section 90.6 surveys corporate finance applications of instrumental variables. Section 90.7 concludes.
90.2 Endogeneity: The Statistical Issue

To illustrate the endogeneity issue, assume that we wish to estimate the parameter β of a linear regression model for a population of firms,

Y = Xβ + u   (90.1)

where Y is the dependent variable, typically an outcome such as return or profitability; X is the regressor that explains the outcome; and u is the unobservable random disturbance, or error. If u satisfies the classical regression conditions, the parameter β can be consistently estimated by standard OLS procedures. This can be seen from the probability limit of the OLS estimator:

plim β̂_OLS = β + Cov(X, u)/Var(X)   (90.2)
When the disturbance and the regressor are uncorrelated, so that the second term is zero, the OLS estimator is consistent. But if the disturbance is correlated with the regressor, that is, if the explanatory variable in Equation (90.1) is potentially endogenous,¹ the usual OLS estimation generally yields a biased estimator. In applied finance work, endogeneity is often caused by omitted variables and/or simultaneity. The omitted variables problem arises when the explanatory variable X is hard to measure or depends on unobservable factors that are part of the error term u, so that X and u are correlated. The correlation of explanatory variables with unobservables may be due to self-selection: the firm makes the choice of X for reasons that are unobservable. An example is the private information held by a firm making a debt issue, in which the terms and structure of the debt offering are likely to be correlated with the firm's unobserved private information. The Heckman two-step procedure (Heckman 1979) is extensively used in corporate finance for modeling this omitted variable.² Endogeneity may also be caused by simultaneity, in which one or more of the explanatory variables are determined simultaneously along with the dependent variable. For example, if Y is firm value and X is management compensation, the management compensation contract is partly determined by the firm value Y anticipated from choosing X, and the usual OLS estimator is biased in this situation.
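A short simulation makes the bias tangible. Everything below is illustrative: v plays the role of an unobserved factor (say, managerial ability) that drives both the regressor and the error, and z is a valid instrument that moves x but not u.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 200_000
    z = rng.normal(size=N)             # instrument: affects x, unrelated to u
    v = rng.normal(size=N)             # unobserved factor behind both x and u
    x = z + v + rng.normal(size=N)     # endogenous regressor: Cov(x, u) > 0
    u = v + rng.normal(size=N)
    y = 2.0 * x + u                    # true beta = 2

    beta_ols = (x @ y) / (x @ x)       # plim = 2 + Cov(x, u)/Var(x) = 2.33
    beta_iv = (z @ y) / (z @ x)        # IV estimator, cf. Equation (90.4): -> 2
    print(round(beta_ols, 3), round(beta_iv, 3))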
90.3 Instrumental Variables Approach to Endogeneity

90.3.1 Instrumental Variables and Two-Stage Least Squares (2SLS)

The instrumental variables (IV) approach provides a general solution to the problem of endogenous explanatory variables. To see how, consider the following setup:

Y = Xβ + u   (90.3)
where Y is an N × 1 vector of observations on the dependent variable, X is an N × K matrix of explanatory variables, and u is an unobservable mean-zero N × 1 vector of disturbances correlated with some elements of X. To use the IV approach, we need at least as many instruments as explanatory variables in the model. The instruments should be sufficiently correlated with the endogenous explanatory variables but asymptotically uncorrelated with u; the explanatory variables that are exogenous can serve as their own instruments, as they are uncorrelated with u. Specifically, valid instruments must satisfy the orthogonality condition, and there must be no linear dependencies among the exogenous variables. In a just-identified setup, using an N × K matrix Z to instrument for X, the IV estimator can be solved as

β̂_IV = (Z′X)⁻¹(Z′Y) = (Z′X)⁻¹(Z′Xβ + Z′u)   (90.4)

β̂_IV = β + (Z′X)⁻¹Z′u   (90.5)
If Z and u are not correlated and Z′X has full rank, that is,

E(Z′u) = 0   (90.6)

rank E(Z′X) = K   (90.7)

¹ Traditionally, a variable is defined as endogenous if it is determined within the context of a model. However, in applied econometrics "endogenous" is used more broadly to describe the situation where an explanatory variable is correlated with the disturbance and the resulting estimator is biased (Wooldridge 2002).
² The finance literature using self-selection models has little interest in estimating the endogenous decision itself (the parameter β in Equation (90.1)), but is more interested in using self-selection models to reveal and test for the private information that influences the decision. In contrast, this chapter focuses on how to implement the IV approach to estimate the parameter consistently. Readers interested in finance applications of self-selection models are referred to Li and Prabhala (2005).
then the second term on the right-hand side of Equation (90.5) becomes 0, and β̂_IV will be consistent and the unique solution for the true parameter β when conditions (90.6) and (90.7) hold. When we have more exogenous variables than needed to identify the parameters, for example, if the instrument matrix Z is N × L with L > K, the model is said to be overidentified and there are (L − K) overidentifying
restrictions, because (L − K) instruments could be discarded and the parameters would still be identified. If L < K, the parameters cannot be identified. Therefore, the order condition L ≥ K is necessary for the rank condition, which requires that Z be sufficiently related to X so that Z′X has full column rank K. In an overidentified model, any linear combination of the instruments can also be used as an instrument. Under homoskedasticity, two-stage least squares (2SLS) is the most efficient IV estimator out of all possible linear combinations of the valid instruments, since 2SLS chooses the combination most highly correlated with the endogenous explanatory variables. The name "two-stage least squares" comes from its two-step procedure:

Step 1: Obtain the fitted value of each endogenous explanatory variable by regressing it on all instrumental variables. This is the first-stage regression. In matrix notation, the matrix of fitted values can be expressed as X̂ = Z(Z′Z)⁻¹Z′X (if any x_i is exogenous, its fitted value is itself).
Step 2: Run the OLS regression of the dependent variable on the exogenous explanatory variables in the structural Equation (90.3) and on the fitted values obtained from the first-stage regression in place of the observations on the endogenous variables. This is the second-stage regression.

The IV/2SLS estimator that uses the instruments X̂ can be written as β̂ = (X̂′X)⁻¹X̂′Y. Substituting Z(Z′Z)⁻¹Z′X for X̂, this 2SLS estimator can be written as

β̂ = [X′Z(Z′Z)⁻¹Z′X]⁻¹ X′Z(Z′Z)⁻¹Z′Y   (90.8)

In practice, it is best to use a software package with a 2SLS command rather than carry out the two-step procedure by hand, because the OLS standard errors reported by the second-stage regression are not the 2SLS standard errors: the 2SLS residuals and covariance matrix must be calculated from the original observations on the explanatory variables instead of their fitted values.

In searching for valid instruments, the orthogonality condition and the rank condition are equally important. When an instrument is not fully exogenous and the correlation between the instrument and the explanatory variable is too small, the bias in the IV estimator may not be smaller than the bias in the OLS estimator. To see this, compare the bias of the OLS estimator with the bias of the IV estimator for model (90.3):

β̂_OLS − β = Cov(X, u)/Var(X) = σ_u ρ_{X,u}/σ_X   (90.9)

β̂_IV − β = Cov(Z, u)/Cov(X, Z) = σ_u ρ_{Z,u}/(σ_X ρ_{X,Z})   (90.10)

where ρ_{i,j} is the correlation coefficient between variable i and variable j. Thus, the IV estimator has smaller bias than OLS only if ρ²_{X,Z} is larger than ρ²_{Z,u}/ρ²_{X,u}.³ Although this comparison involves population correlations with the unobservable variable u and cannot be computed directly, it indicates that when the instrument is only weakly correlated with the endogenous regressor, the IV estimator is not likely to improve upon OLS (see Bartels 1991 for a discussion of quasi-instrumental variables). Even if the instruments are perfectly exogenous, low relevance of the instruments can inflate asymptotic standard errors and reduce the power of tests. We can see from Equation (90.10) that the inconsistency of the IV estimator can become extremely large as the correlation between X and Z approaches zero, making the IV estimator undesirable. The orthogonality of an instrument is difficult to ascertain, because we cannot test the correlation between an observable instrument and the unobservable disturbance. But in an overidentified model (i.e., with more instruments than regressors), the overidentifying restrictions test can be used to evaluate the validity of the additional instruments under the assumption that at least one instrument is valid; we discuss tests for the validity of instruments in Sect. 90.4. For checking the rank condition, one often estimates the reduced form for each endogenous explanatory variable to make sure that at least one of the instruments not in X is significant. If the reduced-form regression fits poorly, the model is said to suffer from a weak instrument problem, and standard asymptotic theory cannot be used for inference; we discuss the weak instruments problem in Sect. 90.5.

³ We follow Larcker and Rusticus (2007) in comparing the squared terms to avoid the problem of sign flips.
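As a concrete illustration of Equation (90.8) and of the warning about second-stage standard errors, the following sketch (our own illustration, not from the chapter) computes the 2SLS estimate and its conventional standard errors directly, taking care to build the residuals from the original X rather than from the first-stage fitted values:

    import numpy as np

    def two_sls(Y, X, Z):
        """2SLS via Equation (90.8). Z must include the exogenous columns
        of X, which serve as their own instruments."""
        Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # first-stage fitted values
        beta = np.linalg.solve(Xhat.T @ X, Xhat.T @ Y)
        u = Y - X @ beta              # residuals from the ORIGINAL X, not Xhat
        N, K = X.shape
        sigma2 = (u @ u) / (N - K)
        cov = sigma2 * np.linalg.inv(Xhat.T @ Xhat)
        return beta, np.sqrt(np.diag(cov))

Running a naive second-stage OLS of Y on X̂ would give the same coefficients but the wrong standard errors, since it would measure the residual scale with X̂ in place of X.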
90.3.2 Hypothesis Testing with 2SLS

Testing hypotheses about a single parameter estimate in model (90.3) is straightforward using an asymptotic t statistic. For testing restrictions on multiple parameters, Wooldridge (2002) provides a method to compute a residual-based F statistic. Rewrite Equation (90.3) as a partitioned model,

Y = X₁β₁ + X₂β₂ + u   (90.11)

where X₁ is N × K₁, X₂ is N × K₂, and K₁ + K₂ = K; let Z denote an N × L matrix of instruments and assume the rank and orthogonality conditions hold. Our interest is in testing the K₂ restrictions

H₀: β₂ = 0 against H₁: β₂ ≠ 0   (90.12)
To calculate the F statistic for 2SLS, we need the sum of squared residuals from the restricted second-stage regression, denoted ŜSR_r; the sum of squared residuals from the unrestricted second-stage regression, denoted ŜSR_ur; and the sum of squared residuals from the unrestricted 2SLS, denoted SSR_ur. The F statistic is calculated as

F ≡ [(ŜSR_r − ŜSR_ur)/SSR_ur] · [(N − K)/K₂] ∼ F_{K₂, N−K}   (90.13)

When the homoskedasticity assumption cannot be maintained, we need to calculate heteroskedasticity-robust standard errors for 2SLS. Some statistical packages compute these standard errors with a simple command. Wooldridge (2002) shows that the robust standard errors can be computed in the following steps:

Step 1: Apply the 2SLS procedure and obtain the 2SLS residual û_i for each observation i, i = 1, …, N; then use û_i to calculate SSR and σ̂², where σ̂² ≡ SSR/(N − K), and Var(β̂_j), where Var(β̂) = σ̂²(X̂′X̂)⁻¹; the standard error is denoted se(β̂_j), j = 1, …, K.
Step 2: Obtain the fitted value of each explanatory variable, denoted x̂_ij, j = 1, …, K.
Step 3: Regress each x̂_ij on all other x̂_ik, k ≠ j, and obtain the residuals from these regressions, denoted r̂_ij for each j.
Step 4: Compute the heteroskedasticity-robust standard error of β̂_j, denoted se_heter(β̂_j):

se_heter(β̂_j) = [N/(N − K)]^{1/2} [se(β̂_j)/σ̂]² (m̂_j)^{1/2},  where m̂_j = Σ_{i=1}^{N} r̂²_ij û²_i   (90.14)

90.3.3 Instrumental Variables and Generalized Method of Moments

When we have a system of equations with sufficient instruments, we can still apply the 2SLS procedure to each single equation in the system and obtain acceptable results. However, in many cases we can obtain more efficient estimators by estimating the parameters of the system equations jointly. The approach to system instrumental variables estimation is based on the principle of the generalized method of moments (GMM). As discussed in the previous section, the orthogonality conditions require that valid instruments be uncorrelated with the disturbance; that is, the population moment condition E(Z′u) equals zero. By this principle, the optimal parameter estimate is chosen so that the corresponding sample moment is also equal to zero. To show this, reconsider the linear model (Equation (90.3)) for a random sample from the population, with Z an N × L matrix of instruments orthogonal to u, for the set of L linear equations in K unknowns; assuming the rank condition holds, the sample moment condition must satisfy

N⁻¹ Σ_{i=1}^{N} Z′_i (Y_i − X_i β̂) = 0   (90.15)

where the i subscript is included for notational clarity. When we have an exactly identified model, where L = K and Z′X is invertible, we can solve for the generalized method of moments (GMM) estimator β̂ in full matrix notation as β̂ = (Z′X)⁻¹Z′Y, which is the same IV estimator obtained in Equation (90.4). When L > K, the model is overidentified; we cannot choose K unknowns to satisfy the L equations, and
generally Equation (90.15) will not have a solution. If we cannot set the sample moment condition exactly equal to zero, we can at least choose the parameter estimate β̂ so that the vector in Equation (90.15) is as close to zero as possible. One idea is to minimize the squared Euclidean length of the L × 1 vector in Equation (90.15), and the optimal GMM estimator β̂ is the minimizer. The starting point for GMM estimation is to specify the GMM criterion function Q(β̂, Y), which is the sample moment in Equation (90.15) in quadratic form with an L × L symmetric and positive definite weighting matrix Ŵ:

Q(β̂, Y) ≡ [Σ_{i=1}^{N} Z′_i (Y_i − X_i β̂)]′ Ŵ [Σ_{i=1}^{N} Z′_i (Y_i − X_i β̂)]   (90.16)

The GMM estimator β̂ is obtained by minimizing the criterion function Q(β̂, Y) over β̂, and the unique solution in full matrix notation is

β̂ = (X′Z Ŵ Z′X)⁻¹ (X′Z Ŵ Z′Y)   (90.17)

It can be shown that if the rank condition holds and the chosen weighting matrix Ŵ is positive definite, the resulting GMM estimator is asymptotically consistent. There is no shortage of such matrices. However, it is important to choose the weighting matrix that produces the GMM estimator with the smallest asymptotic variance. To do this, we first need to find the asymptotic variance matrix of √N(β̂ − β). Plugging (90.17) into (90.3) gives

β̂ = β + (X′Z Ŵ Z′X)⁻¹ (X′Z Ŵ Z′u)   (90.18)
Thus, the asymptotic variance matrix Avar √N(β̂ − β) can be shown to be

Avar √N(β̂ − β) = Var[(C′WC)⁻¹ C′W Z′u] = (C′WC)⁻¹ C′WSWC (C′WC)⁻¹   (90.19)

where C ≡ Z′X and S ≡ Var(Z′u). It can be shown that if the weighting matrix W is chosen such that W = S⁻¹, the GMM estimator has the smallest asymptotic variance; the details can be found in Hayashi (2000, p. 212). Setting W ≡ S⁻¹, Equation (90.19) simplifies to

Avar √N(β̂ − β) = (C′S⁻¹C)⁻¹ = (X′Z S⁻¹ Z′X)⁻¹   (90.20)
Therefore, the efficiency of the GMM estimator depends on whether we can consistently estimate S (the variance of the asymptotic distribution of Z′u). Since the 2SLS estimator is consistent, though not necessarily efficient, it can be used as an initial estimator to obtain residuals, which in turn can be used to estimate S. The two-step efficient GMM procedure is given in Wooldridge (2002) as follows:

Step 1: Use the 2SLS estimator, denoted β̃, as the initial estimator, since it is consistent, and obtain the 2SLS residuals ũ_i = Y_i − X_i β̃, i = 1, 2, …, N. For a system of equations, apply 2SLS equation by equation; this allows the possibility that different instruments are used for different system equations.
Step 2: Estimate S, the variance of the asymptotic distribution of Z′u, by

Ŝ = N⁻¹ Σ_{i=1}^{N} Z′_i ũ_i ũ′_i Z_i   (90.21)

Then use W = Ŝ⁻¹ as the weighting matrix to obtain the efficient GMM estimator: plugging Ŝ⁻¹ into Equation (90.17) gives

β̂ = (X′Z Ŝ⁻¹ Z′X)⁻¹ (X′Z Ŝ⁻¹ Z′Y)   (90.22)
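The following is a compact sketch of the two-step procedure in plain matrix algebra (our illustration, not from the chapter). The weighting matrix below is the unscaled version of Equation (90.21), which leaves the estimator unchanged and makes the final inverse directly the finite-sample covariance of β̂:

    import numpy as np

    def gmm_two_step(Y, X, Z):
        # Step 1: 2SLS as the initial consistent estimator.
        Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
        b1 = np.linalg.solve(Xhat.T @ X, Xhat.T @ Y)
        u = Y - X @ b1
        # Step 2: estimate S = Var(Z'u) from the residuals, then reweight.
        Zu = Z * u[:, None]              # row i is u_i * Z_i
        W = np.linalg.inv(Zu.T @ Zu)     # Equation (90.21), up to the 1/N factor
        XZ = X.T @ Z
        A = XZ @ W @ XZ.T
        beta = np.linalg.solve(A, XZ @ W @ (Z.T @ Y))   # Equation (90.22)
        V = np.linalg.inv(A)             # covariance of beta, cf. Equation (90.23)
        return beta, V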
Following Equation (90.19), the asymptotic variance matrix V̂ of the optimal GMM estimator can be estimated as

V̂ = (X′Z Ŝ⁻¹ Z′X)⁻¹   (90.23)

The square roots of the diagonal elements of V̂ are the asymptotic standard errors of the optimal GMM estimator.

90.3.4 Hypothesis Testing Using GMM

t statistics for hypothesis testing after GMM estimation can be computed directly using the asymptotic standard errors obtained from the variance matrix V̂. For testing multiple restrictions, the GMM criterion function can be used to calculate the test statistic. For example, suppose our interest is in testing Q restrictions on the K unknowns in the system; the Wald statistic then has a limiting null χ²_Q distribution. To apply this statistic, we need to assume the optimal weighting matrix W = S⁻¹ is used to obtain the GMM estimator both with and without imposing the Q restrictions. Define the residuals evaluated at the unrestricted GMM estimator β̂_u as û_u ≡ Y_i − X_i β̂_u, and the residuals evaluated at the restricted GMM estimator β̂_r as û_r ≡ Y_i − X_i β̂_r. Plugging the residuals û_u and û_r into the criterion function (Equation (90.16)) for the unrestricted and restricted models, respectively, the criterion function statistic is computed as the difference between the two criterion function values, divided by the sample size N. The criterion function statistic has a chi-square distribution with Q degrees of freedom.

90.4 Validity of Instrumental Variables

90.4.1 Test for Exogeneity of Instruments

When researchers explore various alternative ways to address the endogeneity problem and consider instrumental variables estimation to be the most promising econometric approach, the next step is to find and justify the instruments used. The orthogonality condition (Equation (90.6)) is the sufficient condition for a set of instruments to be valid; when the condition is not satisfied, the IV estimator is inconsistent. In practice, we cannot test whether an observable instrument is uncorrelated with the unobservable disturbance. But when we have more instruments than needed to identify an equation, we can use any subset of the instruments for estimation, and the estimates should not differ significantly if the instruments are valid. Following the principle of Hausman (1978), we can test whether the additional instruments are valid by comparing the estimates of an overidentified model with those of a just-identified model. The model we wish to test is

Y = Xβ + u   (90.24)

where X is a vector of 1 × K explanatory variables with some elements correlated with u, Z is a vector of 1 × L instruments,
and L > K, so the model has L − K overidentifying restrictions, and we can use any 1 × K subset of Z as instruments for X. Let Z₁ be a vector of 1 × (L − K) extra instruments. The overidentifying restrictions require that E(Z′₁u) = 0. The LM statistic can be computed by regressing the 2SLS residuals from the original model (Equation (90.24)) on the full set of instruments Z and multiplying the resulting uncentered R² by N; the asymptotic distribution of the statistic is χ²_{L−K}. Another way to test the overidentifying restrictions is to use a test statistic based on the difference between the minimized values of the IV criterion function for the overidentified model and the just-identified model. By the moment condition, the minimized value of the criterion function is equal to zero for the just-identified model. Thus the test statistic is the minimized value of the criterion function for the overidentified model (Equation (90.24)), divided by the estimate of the error variance from the same model. This test is called Sargan's test, after Sargan (1958), and numerically Sargan's test is identical to the LM statistic discussed above. The usefulness of the overidentifying restrictions test is that if we cannot reject the null, we can have some confidence in the overall set of instruments used. This suggests that it is preferable to have an overidentified model for empirical application, because the overidentifying restrictions can and should always be tested. However, we also need to note that the overidentifying restrictions test is performed under the assumption that at least one instrument is valid. If this assumption does not hold, all instruments may have similar bias and the test will not reject the null. It could also be that the test has low power⁴ for detecting the endogeneity of some of the instruments. The heteroskedasticity-robust test can be computed as follows. Let H be any 1 × (L − K) subset of Z. Regress each element of H on the fitted value of X and collect the residuals, denoted r̂. The asymptotic χ²_{L−K} test statistic is obtained as N minus the sum of squared residuals from regressing 1 on the products û r̂. For system equations using GMM estimation, the test of overidentifying restrictions can be computed by a similar criterion-function-based procedure. The overidentification test statistic is the minimized criterion function (Equation (90.16)) evaluated at the optimal (efficient) GMM estimator, and there is no need to divide by an estimator of the error variance, because the GMM criterion function takes account of the covariance matrix of the error terms:

[N^{−1/2} Σ_{i=1}^{N} Z′_i (Y_i − X_i β̂)]′ Ŵ [N^{−1/2} Σ_{i=1}^{N} Z′_i (Y_i − X_i β̂)] ∼ χ²_{L−K}   (90.25)

The GMM estimator using the optimal weighting matrix is called the minimum chi-square estimator, because the GMM estimator β̂ is chosen to make the criterion function a minimum. If the chosen weighting matrix is not optimal, the expression in Equation (90.25) fails to hold. This test statistic is often called Hansen's J statistic, after Hansen (1982), or the Hansen–Sargan statistic for its close relationship with Sargan's test in IV/2SLS estimation. The Hansen J statistic is distributed as chi-square in the number of overidentifying restrictions, L − K, since K degrees of freedom are lost in estimating the K parameters,⁵ and it is consistent in the presence of heteroskedasticity. It is strongly suggested that an overidentifying restrictions test on the instruments always be performed before formally using IV estimation, but we also need to be cautious in interpreting the test results. There are several situations in which the null will be rejected. One possibility is that the model is correctly specified and some of the instruments are indeed correlated with the disturbance and thus are invalid. The other possibility is that the model is not correctly specified; for instance, some variables may be omitted from the regression function. In either case, the overidentifying test statistic leads us to reject the null hypothesis, but we cannot be certain which is the case. Another problem is that in small samples the actual size of the Hansen J test far exceeds the nominal size, and the test usually rejects too often.

⁵ A potential problem is that the test is not consistent against some failures of the orthogonality condition, due to the loss of degrees of freedom from L to L − K.
If the instruments are only weakly related to the endogenous explanatory variables, the power of the test can be low.
90 The Instrument Variable Approach to Correct for Endogeneity in Finance
where u is the structural error, X is the explanatory variable, Z is the instrumental variable, and N is the sample size. The first term in the second bracket of Equation (90.26) corresponds to the asymptotic variance of the OLS estimator, the second term corresponds to the additional asymptotic variance produced by using IV estimator rather than the OLS estimator, and the third term captures the inconsistency if the instruments are not truly exogenous. Thus, even when we 2 D 0, the find a perfectly exogenous instruments so that Zu standard error for the IV estimator will exceed the standard error for the OLS estimator by 1= 2 XZ . As the correlation between X and Z gets close to zero, the standard error for the IV estimator can get extremely large. Because the IV estimator is less efficient than the OLS estimator, unless we have strong evidence that the explanatory variable is endogenous, the IV estimation is not preferred to the OLS estimation. Therefore, it is always useful to run a test for endogeneity before using the IV approach. This endogeneity test dates back to Durbin (1954), subsequently extended by Wu (1973) and Hausman (1978). Thus, we refer to the test of this type as Durbin-Wu-Hausman tests. To illustrate the procedure, write a population model as Y1 D “Y2 C Z1 C u
(90.27)
where Y1 is the dependent variable, Y2 is 1 K2 vector of the possible endogenous explanatory variables, Z1 is 1 K1 vector of exogenous explanatory variables, and K1 C K2 D K; also Z1 is a subset of the 1 L exogenous variables Z, assuming Z satisfies the orthogonality condition and E.Z0 Y2 / has full rank. Our interest is to test the null hypothesis that Y2 is actually exogenous: H0 W Y1 D “Y2 C Z1 C u; E Y02 u D 0 against H1 W Y1 D “Y2 C Z1 C u; E Y02 u ¤ 0 Under H0 both IV and the OLS estimators are consistent and they should not differ significantly. Under H1 only the IV estimator is consistent. Thus, the original idea of the endogeneity test is to check whether the IV estimator is significantly different from the OLS estimator. The test can also be made by just comparing the estimated coefficients of the parameters of interest, which is ˇO in our case. The suitable Hausman statistic thus can be calculated as .“O IV h i1 .“O I V “O OLS /.6 “O OLS /0 Var.“O IV / Var.“O OLS /
6
If the assumption of homoskedasticity cannot be made, this standard error is invalid because the asymptotic variance of the difference is no longer the difference in asymptotic variances.
1363
We often use Hausman (1978) regression-based form of the test, which is easier to compute and asymptotically equivalent to the original endogeneity test. Following the procedure described in Wooldridge (2002), the first step is to regress each element of the K2 possibly endogenous variables against all exogenous variables Z. The 1 K2 vector of the obtained reduced form error is denoted as . Since Z is uncorrelated to the structural error u and the reduced form error , Y2 is endogenous if only if u and are correlated. To test this, we can project u onto as u D C e
(90.28)
where is uncorrelated to e, and e is uncorrelated to Z, thus the test of whether Y2 is exogenous is equivalent to test whether the joint test of œ (the K2 restrictions) is significantly different from 0. By plugging equation (Equation (90.9)) into (Equation (90.8)), we have the equation Y1 D “Y2 C Z1 C C e
(90.29)
Since e is uncorrelated to , Z1 and Y2 by construction, the test of the null hypothesis can be done by using a standard joint F test on œ in an OLS regression (for the single endogenous variable, a t-statistic can be used in the same procedure), and the F-statistic has the F.K2 , N-K-K2 ) distribution. If we reject the null H0 W œ D 0, there is evidence that at least some elements of Y2 are indeed endogenous. So the use of IV approach is justified assuming the instruments are valid. If the heteroskedasticity is suspected under H0 , the test can be made robust to heteroskedasticity in (since D e under H0 ) by computing the heteroskedasticity-robust standard errors.7 Another use of this regression-based form of the test is that the OLS estimates of “ and œ for Equation (90.29) should be identical to their 2SLS estimates for the Equation (90.27) using Z as the instruments (see Davidson and MacKinnon 1993 for the details), and that allows us to examine whether the differences in the OLS and 2SLS point estimates are practically significant.
7 Since the robust (Hubert-White) standard errors are asymptotically valid to the presence of heteroskedasticity of unknown form including homoskedasticity, these standard errors are often reported in empirical research especially when the sample size is large. Several statistical packages such as Stata now report these standard errors with a simple command, so it is easy to obtain the heteroskedasticity-robust standard errors.
1364
C.-J. Wang
90.5 Identification and Inferences with Weak Instruments 90.5.1 Problems with Weak Instruments and Diagnosis As stated in the very beginning of this chapter, a valid instrument should be asymptotically uncorrelated with the structural error (orthogonality) but sufficiently correlated with the endogenous explanatory variable for which it is supposed to serve as instrument (relevance). The relevance condition is critical for the structural model to be identified. If the instruments have little relevance, the instruments will not enter the first-stage regression, then the sampling distribution of IV statistics are generally not normal and the IV estimates and standard tests are unreliable. A simple way to detect the presence of weak instruments is to look at the first stage F-statistic for the null hypothesis that the instruments are jointly equal to zero. Staiger and Stock (1997) suggests that the instruments are considered to be weak if the first-stage F-statistic is less than 10. Some empirical studies use R2 in the first stage regression as a measure of instrument relevance. However, Shea (1997) argues that for the model with multiple endogenous regressors, when instruments are highly collinear IV may work poorly even if R2 is high for each first stage regression. For instance, suppose both the vector of endogenous regressors X and the vector of instruments Z have rank of 2, while only one element of Z is highly correlated to X. In this situation the regression of each element of X onto Z will have high R2 even though “ in the structural model may not be identified. Instead, Shea (1997) proposes a partial R2 , which measures the instrument relevance by taking intercorrelations among the instruments into account. The idea is that the instruments should work best when the part of instruments important to one endogenous regressor is linearly independent of the part important to the other endogenous regressor. Taking the example above, Shea’s partial R2 is the squared correlation between the component of one endogenous regressor X1 orthogonal to the other endogenous regressor X2 (i.e., the residuals from regressing X1 onto X2 / and the component of the fitted values of X1 orthogonal to the fitted values of X2 (i.e., the residuals from regressing the fitted values of X1 onto the fitted values of X2 /. Shea’s partial R2 can be corrected for the degrees of freedom by
2 Rp D 1 Œ.N 1/=.N K/ 1 Rp2 2
(90.30)
where Rp is the corrected partial R2 , Rp2 is uncorrected partial R2 , N is the sample size, and K is the number of exogenous variables in the system.
Another type of the tests for instrument relevance is testing whether the equation is identified. Rothenberg et al. (1984) shows that at a formal level the strength of instruments can be characterized in terms of the so-called concentration parameter associated with the first stage regression. He considers a simple linear model with a single endogenous regressor X and without included exogenous regressors: Y D X“ C u
(90.31)
X D Z… C
(90.32)
Where Y and X are N 1 vectors of observations on endogenous variables, Z is an N K matrix of instruments, and u and v are N 1 error vectors. The errors are assumed to be i.i.d as N.0; †/ where ¢ 2 u , ¢ 2 uv and ¢ 2 v are elements of †, and the correlation between the error terms ¡ D ¢ 2 uv =.¢u ; ¢v /. In matrix notation the 2SLS estimator can be written as “O 2SLS D .X0 PZ X/1 .X0 PZ Y/, where the idempotent matrix PZ D Z0 .Z0 Z/1 Z0 . The concentration parameter 2 associated with the first stage reduced form is defined as 2 D …0 Z0 Z …= 2 v (90.33) The concentration parameter 2 can be interpreted in terms of F, the first stage F statistic for testing the hypothesis … D 0 (i.e., the instruments do not enter the first stage regression). Let FQ be the infeasible counterpart of F using the true value of ¢ 2 v . Then K FQ has a chi-squared distribution of K degrees of freedom and noncentrality parameter 2 . When the sample size is large, the E.K FQ / Š 2 =K C 1, thus for large values of 2 =K, (F-1) can be used as an estimator for 2 /K. Rothenberg et al. (1984) shows that the concentration parameter 2 plays an important role in the approximation to the distributions of 2SLS estimators and test statistics. He emphasizes that for the normal approximation to the distribution of the 2SLS estimator to be precise the concentration parameter 2 must be large. Thus, a small F statistic (i.e., a smaller value of 2 =K) can indicate the presence of weak instruments. There are several tests developed using the concentration matrix to test for weak identification. Cragg and Donald (1993) proposed a statistic using the minimum eigenvalue of the concentration parameter to test the null hypothesis of underidentification, which occurs when the concentration matrix is singular. Stock and Yogo (2002) argue that when the concentration matrix is nonsingular but its eigenvalues are sufficiently small, the inferences based on conventional normal approximation distributions are misleading even though the parameters might be identified. Thus, the minimum eigenvalues of the concentration matrix 2 =K can be used to detect the presence of weak instruments. By this principle Stock and
90 The Instrument Variable Approach to Correct for Endogeneity in Finance
Yogo (2002) develop two alternative quantitative definitions of the weak instrument. A set of instruments is considered to be weak if 2 =K is small enough so the bias in IV estimator to the bias in OLS estimator exceeds a certain threshold, depending on the researcher’s tolerance, for example 10%. Alternatively, a set of instruments is considered to be weak if 2 =K is small enough so the conventional ’-level Wald test based IV statistic has an actual size exceeding a certain threshold, again depending on the researcher’s tolerance. Stock and Yogo (2002) propose using the first-stage F-statistic for making inferences about 2 =K and develop the critical values of F-statistic corresponding to the weak instrument threshold 2 =K. For example, if a researcher requires the 2SLS relative bias no more than 10%, for a model with three instruments, the computed first-stage F statistic has to be larger than 9.08 for the threshold value 2 =K larger than 3.71, so the null hypothesis that the 2SLS relative bias is less than or equal to 10% (i.e., instruments are weak) will be rejected. The readers are referred to Stock and Yogo (2002) for a tabulation of the critical values for weak instrument tests with multiple endogenous regressors.
90.5.2 Possible Cures and Inferences with Weak Instruments Many of the key issues of weak instruments have been studied for decades, however, most of the research on the estimation and inferences robust to weak instruments is quite recent and their applications in finance still remain to be seen. Therefore, this section just simply touches on this topic and refers the reader to the original articles for details. In their survey of weak instrument and identification, Stock et al. (2002) considered the Anderson-Rubin statistic as “fully robust” to weak instruments, in the sense that this procedure has the correct size regardless of the value of concentration parameter. For testing the null hypothesis for “ D “0 , the Anderson-Rubin statistic (Anderson and Rubin 1949) is computed as AR.“0 / D
.Y Xˇ0 /0 PZ .Y Xˇ0 /=K FK; NK .Y Xˇ0 /0 MZ .Y Xˇ0 /=.N K/ (90.34)
where PZ D Z.Z 0 Z/1 Z 0 and MZ D I PZ . Testing the null hypothesis that the coefficients of the endogenous regressors in the structural equation are jointly equal to zero is numerically equivalent to estimating the reduced form of the equation (with the full set of instruments as regressors) and testing that the coefficients of the excluded instruments are jointly equal to zero. Therefore, the AR statistic is often used for testing overidentifying restrictions. The Anderson-Rubin
1365
procedure provides valid tests and confidence set under weak-instrument asymptotics,8 but it has low power when too many instruments are added (see Dufour 2003). Other considered fully robust tests by Stock et al. (2002) include Moreira’s conditional test by Moreira (2002), which fixes the size distortion of the test in the presence of weak instruments and can be used to make reliable inference about the coefficients of endogenous variables in the structural equation, and Kleibergen’s statistic by Kleibergen (2002), which is robust under both conventional and weak-instrument asymptotics. Stock et al. (2002) considered several k-class estimators that are “partially robust” to weak instruments, in the sense that these k-class estimators are more reliable than 2SLS. The K-class is a set of estimators defined by the following estimating equation with an arbitrary scalar k: O ˇ.k/ D ŒX 0 .I kM Z /X 1 ŒX 0 .I kM Z /Y
(90.35)
This class includes 2SLS, LIML, and Fuller-k. 2SLS is a k-class estimator with k equal to 1; LIML is a k-class estimator with k equal to the LIML eigenvalue; Fuller-k or called Fuller’s modified LIML, proposed by Fuller (1977), sets k D kLIML ’=.N-K/, where K is the total number of instruments and ’ is the Fuller constant, and the Fuller estimator with ’ D 1 yields unbiased results to second order with fixed instruments and normal errors; Jackknife 2SLS estimator, proposed by Angrist et al. (1999), is asymptotically equivalent to k-class estimator with k D 1 C K=.N-K/ (Chao and Swanson 2005). For a discussion of LIML and k-class estimators, see Davidson and MacKinnon (1993). Stock et al. (2002) found that LIML, Fuller-k, and Jacknife estimators have lower critical value for weak-instrument test than 2SLS (so the null will not be rejected too often), and thus are more reliable when the instruments are weak. However, Hahn et al. (2004) argued that due to the lack of finite moment LIML sometimes performs well but sometimes poorly in the weak instrument situation. They found that the interquartile range and the root MSE of the LIML often far exceed those of the 2SLS and hence suggest extreme caution in using the LIML in the presence of weak instrument. Instead, Hahn et al. (2004) recommend the Jackknife 2SLS estimator and Fuller-k estimator because the two estimators do not have the “no moment” problem that LIML has. Theoretical calculations and simulations show that Jackknife 2SLS estimator improves on 2SLS when many instruments are used and thus the weak instrument problem usually
8 Stock, Wright, and Yogo (2002) defines weak-instrument asymptotics as the alternative asymptotics methods that can be used to analyze IV statistics in the presence of weak instruments. Weak-instrument asymptotics involves a sequence of models chosen to keep concentration parameters constant as sample size N!1 and the number of instruments held fixed.
1366
occurs (see Chao and Swanson 2005; Angrist et al. 1999). Hahn et al. (2004) also find that the bias and mean square error using Fuller-k estimator are smaller than those using 2SLS and LIML.
90.6 Empirical Applications in Corporate Finance In order to provide some insight into the use of IV estimation by finance researchers, we follow the methodology used by Larcker and Rusticus (2007) to conduct a search using the key words “2SLS,” “simultaneous equations,” “instrumental variables,” and “endogeneity” for papers published in Journal of Finance, Journal of Financial Economics, Review of Financial Studies, and Journal of Financial and Quantitative Analysis during the period 1997–2008. As shown in Table 90.1, our search produced 28 published articles that use instrumental variable approach to solve the endogeneity bias in the study of capital structure, agency/ownership structure, pricing of public offering, debt covenants, financial institutions, microstructure, diversification/acquisition, venture capital/private equity, product market, and macroeconomics. Compared with the survey of Larcker and Rusticus (2007) for the IV applications in accounting research (which found 42 such articles in the recent decade), our list shows that the IV estimation is less commonly used in finance research9 and often employed in corporate finance related studies. Similar to the finding of Larcker and Rusticus (2007) on IV applications in accounting research, there is little attempt in empirical finance research to develop a formal structural equation to identify endogenous and exogenous variables in the first place. Most take the endogeneity in the variables of interest as granted and only 14% conducted the endogeneity test (Hausman test). Almost all decided to use IV approach to solve the endogeneity problem directly without considering the alternatives. As we discussed in Sects. 90.3 and 90.30 of this chapter, without valid instruments, the IV estimator can produce even more biased results than the OLS estimator and finding good instruments can be a very challenging. Therefore, the researcher should investigate the nature of the endogeneity and explores alternative methods before selecting the IV approach. We found that only Sorensen (2007) and Mitton (2006) evaluated the situation and chose alternative methods and fixed effects model instead to solve the endogeneity problem due to the lack of good instruments.
9
Another reason we found a smaller number of finance papers using IV may be that we limit our key words to those appearing in the abstract. Thus, our data may be more representative of the general situation of the finance research using IV as their main tests.
C.-J. Wang
Table 90.1 Finance research that use instrumental variables methods Capital structure/leverage ratio Faulkender and Petersen (RFS 2006) Yan (JFQA 2006) Molina (JF 2005) Johnson (RFS 2003) Desai, Foley, and Hines (JF 2004) Harvey, Lins and Roper (JFE 2004) Agency/ownership structure Bitler, Moskowitz and Vissing-Jorgensen (JF 2005) Ortiz-Molina (JFQA 2006) Palia (RFS 2001) Cho (JFE 1998) Wei, Xie and Zhang (JFQA 2005) Daines (JFE 2001) Pricing of public offering Cliff and Denis (JF 2004) Lee and Wahal (JFE 2004) Lowry and Shu (JFE 2002) Debt covenants Dennis, Nandy and Sharpe (JFQA 2000) Chen, Lesmond and Wei (JF 2007) Macro/product market Beck, Levine and Loayza (JFE 2000) Campello (JFE 2006) Garmaise (RFS 2008) Financial institutions Ljungqvist, Marston, Wilhelm (JF 2006) Berger, Miller, Petersen, Rajan and Stein (JFE 2005) Microstructure Brown, Ivkovi´c, Smith and Weisbenner, (JF 2008) Conrad, Johnson and Wahal (JFE 2003) Kavajecz and Odders-White (RFS 2001) Diversification/acquisition Campa and Kedia (JF 2002) Hsieh and Walkling (JFE 2005) Venture capital/private equity Gompers and Lerner (JFE 2000) The sample is based on an electronic search for the term “2SLS,” “Instrumental Variables,” “Simultaneous Equations,” and “Endogeneity” for papers published in Journal of Finance, Journal of Financial Economics, Review of Financial Studies, and Journal of Financial and Quantitative Analysis during the period 1997–2008
Table 90.2 shows that the standard two-stage least square regression method is the most commonly used procedure (17 out of 28), and the GMM/3SLS procedure is more concentrated on the study of financial economics. Half of the research did not discuss formally why a specific variable is selected as an instrumental variable and even fewer explicitly justify theoretically or statistically the validity of selected instruments by examining the orthogonality and relevance of
90 The Instrument Variable Approach to Correct for Endogeneity in Finance
Table 90.2 Descriptive statistics for finance research that use instrumental variable methods
1367
(A) Types of IV applications Standard two-stage least squares Capital structure/leverage ratio (4) Agency/ownership structure (6) Pricing of public offering (1) Debt covenants (2) Financial institutions (1) Microstructure (2) Diversification/acquisition (1) Total: 17 Two-stage Heckman Pricing of public offering/debt covenants (2) Financial institutions (1) Microstructure (1) Diversification/acquisition (1) Venture Capital/private equity (1) Total: 6 GMM/3SLS Capital structure/leverage ratio (2) Macro/product market (3) Total: 5 (B) Features of IV application Specification/justification
Two-stage
Heckman
GMM/3SLS
Total
Discussion of model/instruments Endogeneity test
8 (47%) 3 (18%)
3 (50%) 1 (17%)
3 (60%) 0
14 (50%) 4 (14%)
7 (41%) 3 (18%) 0
2 (33%) 2 (33%) 1 (17%)
2 (40%) 2 (40%) 0
11 (39%) 7 (25%)
2 (12%)
0 1 (17%)
3 (60%) 1 (20%)
5 (18%) 2 (7%)
Reported statistics IV Relevance First-stage regression F statistic for strength of IV Partial R2 for IV IV Orthogonality Overidentifying restrictions test Concerns on WI
the instruments. All studies have overidentified models but only 18% performed the overidentifying restrictions test to examine the orthogonality of the instruments. One study even cited the low R2 in the first stage as evidence that the instruments do not appear to be related to the error terms. It raises a great concern about the validity of the instruments used and the possible bias in the resulted IV estimates. From the study, 39% reported the first stage results along with R2 ; however, it should be noted that the first-stage R2 does not represent the relevance of the excluded instruments but the overall explanatory power of all exogenous variables. Thus, the strength of instruments can be overstated if judged by the first-stage R2 . Only 25% formally performed the first-stage F statistic for the instruments relevance, one study also used Shea’s partial R2 (Shea 1997), and none used Cragg and Donald’s underidentification test
(Cragg and Donald 1993) or Stock and Yogo’s weak instrument test (Stock and Yogo 2002). Given the absence of the weak instrument test in most finance studies, many of the estimation results using IV approach are questionable. It is also not surprising that only two studies among all addressed the concerns with the weak instrument problem. Ljungqvist et al. (2006) used five bank pressure proxies as instruments for analyst behavior in their two-step Heckman MLE model for competing underwriting mandates but detected the weak instrument problem by the low F statistic. They interpreted the insignificant estimates in the second step as possibly a result of weak instrument. Beck et al. (2000) discussed the possible weak instrument problem (though not judged by any test) associated with a difference estimator using the lagged value of the dependent variable, and then decided to use the alternative method by estimating the regression in
1368
difference jointly with the regression in level to reduce the potential finite sample bias. It appears that the recently developed weak-instrument-robust estimators and inferences have not yet applied in finance research.
90.7 Conclusion The purpose of this chapter is to present a practical procedure for using the instrumental variable approach to solve the endogeneity problem in empirical finance studies. The endogeneity problem has received a mixed treatment in finance research. The literature does not consistently account for endogeneity using formal econometric methods. When the IV approach is used, the instrumental variables are often chosen arbitrarily and few diagnostic statistics are performed to assess the adequacy of IV estimation. Given the challenge of finding good instruments, it is important that the researcher analyzes the nature of the endogeneity and the direction of the bias if possible, and then explores alternative empirical approaches so the problem can be solved more appropriately. For example, in the presence of omitted variables, if the unobservable effects, which are part of the error term, can be treated as random variables rather than the parameters to be estimated, panel data models can be used to obtain consistent estimates. When the IV approach is considered to be the most appropriate estimation method, the researcher needs to find and justify the instruments theoretically and statistically. One way to describe an instrumental variable is that a valid instrument Z for the potential endogenous variable X should be redundant in the structural equation when X is already included, which means that Z will not affect the dependent variable Y in any way other than through X. To examine the orthogonality statistically, an overidentified model is preferred and hence the overidentifying restrictions can be used. If the OID test cannot be rejected, we then can have some confidence in the orthogonality of the overall instruments. To examine the instrument relevance, the partial R2 and the first-stage F-statistic on the joint significance of the instruments should be performed at the minimum. The newly developed weak instrument tests (Cragg and Donald 1993; Stock and Yogo 2002) can also be used as robust check, especially for the finite samples. The researcher should keep in mind that the IV estimation method provides a general solution to the endogeneity problem; however, without strong and exogenous instruments, the IV estimator is more biased and inconsistent than the simple OLS estimator.
C.-J. Wang
References Anderson, T. W. and H. Rubin. 1949. “Estimation of the parameters of a single equation in a complete system of stochastic equations.” Annuals of Mathematical Statistics 20, 46–63. Angrist, J. D., G. W. Imbens, and A. B. Krueger. 1999. “Jackknife instrumental variables estimation.” Journal of Applied Econometrics 14, 57. Bartels, L. M. 1991. “Instrumental and ‘quasi-instrumental’ variables.” American Journal of Political Science 35, 777–800. Beck, T., R. Levine, and N. Loayza. 2000. “Finance and the sources of growth.” Journal of Financial Economics 58, 261–300. Berger, A. N., N. H. Miller, M. A. Petersen, R. G. Rajan, and J. C. Stein. 2005. “Does function follow organizational form? Evidence from the lending practices of large and small banks.” Journal of Financial Economics 76, 237. Bitler, M. P., T. J. Moskowitz, and A. Vissing-Jorgensen. 2005. “Testing agency theory with entrepreneur effort and wealth.” Journal of Finance 60, 539. Brown, J. R., Z. Ivkovic, P. A. Smith, and S. Weisbenner. 2008. “Neighbors matter: causal community effects and stock market participation.” Journal of Finance 63, 1509. Campa, J. M. and S. Kedia. 2002. “Explaining the diversification discount.” Journal of Finance 57, 1731. Campello, M. 2006. “Debt financing: Does it boost or hurt firm performance in product markets?” Journal of Financial Economics 82, 135. Chao, J. C. and N. R. Swanson. 2005. “Consistent estimation with a large number of weak instruments.” Econometrica 73, 1673. Chen, L., D. A. Lesmond, and J. Wei. 2007. “Corporate yield spreads and bond liquidity.” Journal of Finance 62, 119. Cho, M.-H. 1998. “Ownership structure, investment, and the corporate value: an empirical analysis.” Journal of Financial Economics 47, 103. Cliff, M. T. and D. J. Denis. 2004. “Do initial public offering firms purchase analyst coverage with underpricing?” Journal of Finance 59, 2871. Conrad, J., K. M. Johnson, and S. Wahal. 2003. “Institutional trading and alternative trading systems.” Journal of Financial Economics 70, 99. Cragg, J. G. and S. G. Donald. 1993. “Testing identifiability and specification in instrumental variable models.” Econometric Theory 9, 222. Daines, R. 2001. “Does delaware law improve firm value?” Journal of Financial Economics 62, 525. Davidson, R. and J. G. MacKinnon, 1993. Estimation and inference in econometrics. Oxford University Press, New York. Dennis, S., D. Nandy, and I. G. Sharpe. 2000. “The determinants of contract terms in bank revolving credit agreements.” Journal of Financial and Quantitative Analysis 35, 87. Desai, M. A., C. F. Foley, and J. R. Hines Jr. 2004. “A multinational perspective on capital structure choice and internal capital markets.” The Journal of Finance 59, 2451–2487. Dreher et al. Dufour, J.-M. 2003. “Identification, weak instruments, and statistical inference in econometrics.” Canadian Journal of Economics 36, 767– 808. Durbin, J. 1954. “Errors in variables.” Review of the International Statistical Institute 22, 23–32. Faulkender, M. and M. A. Petersen. 2006. “Does the source of capital affect capital structure?” Review of Financial Studies 19, 45.
90 The Instrument Variable Approach to Correct for Endogeneity in Finance Fuller, W. A. 1977. Some properties of a modification of the limited information estimator. Econometrica 45, 939. Garmaise, M. J. 2008. “Production in entrepreneurial firms: the effects of financial constraints on labor and capital.” Review of Financial Studies 21, 544. Gompers, P. and J. Lerner. 2000. “Money chasing deals? The impact of fund inflows on private equity valuations.” Journal of Financial Economics 55, 281. Hahn, J., J. Hausman, and G. Kuersteiner. 2004. “Estimation with weak instruments: accuracy of higher-order bias and MSE approximations.” Econometrics Journal 7, 272. Hansen, L. P. 1982. “Large sample properties of generalized method of moments estimators.” Econometrica 50, 1029. Harvey, C. R., K. V. Lins, and A. H. Roper. 2004. “The effect of capital structure when expected agency costs are extreme.” Journal of Financial Economics 74, 3. Hausman, J. A. 1978. “Specification tests in econometrics.” Econometrica 46, 1251. Hayashi, F. 2000. Econometrics, Princeton University Press, Princeton and Oxford. Heckman, J. 1979. “Sample selection as a specification error.” Econometrica 47, 1531–1161. Hermalin, B. E. and M. S. Weisbach. 2003. “Boards of directors as an endogenously determined institution: a survey of the economic literature.” Federal Reserve Bank of New York Economic Policy Review 9, 7–26. Hsieh, J. and R. A. Walkling. 2005. “Determinants and implications of arbitrage holdings in acquisitions.” Journal of Financial Economics 77, 605. Johnson, S. A. 2003. “Debt maturity and the effects of growth opportunities and liquidity risk on leverage.” Review of Financial Studies 16, 209. Kavajecz, K. A. and E. R. Odders-White. 2001. “An examination of changes in specialists’ posted price schedules.” Review of Financial Studies 14, 681. Kleibergen, F. 2002. “Pivotal statistics for testing structural parameters in instrumental variables regression.” Econometrica 70, 1781. Larcker, D. F. and T. O. Rusticus. 2007. On the use of instrumental variables in accounting research, Working Paper. Lee, P. M. and S. Wahal. 2004. “Grandstanding, certification and the underpricing of venture capital backed IPOs.” Journal of Financial Economics 73, 375. Li, K. and Prabhala, N. R. 2005. Self-selection models in corporate finance, Working Paper. Ljungqvist, A., F. Marston, and W. J. Wilhelm Jr. 2006. “Competing for securities underwriting mandates: banking relationships and analyst recommendations.” Journal of Finance 61, 301.
1369
Lowry, M. and S. Shu. 2002. “Litigation risk and IPO underpricing.” Journal of Financial Economics 65, 309. Mitton, T. 2006. “Stock market liberalization and operating performance at the firm level.” Journal of Financial Economics 81, 625. Molina, C. A. 2005. “Are firms underleveraged? An examination of the effect of leverage on default probabilities.” Journal of Finance 60, 1427. Moreira, M. J. 2002. Tests with correct size in the simultaneous equations model, University of California, Berkeley. Ortiz-Molina, H. 2006. “Top management incentives and the pricing of corporate public debt.” Journal of Financial and Quantitative Analysis 41, 317. Palia, D. 2001. “The endogeneity of managerial compensation in firm valuation: a solution.” Review of Financial Studies 14, 735. Rothenberg, T. J., Z. Griliches, and M. D. Intriligator. 1984. “Approximating the distributions of econometric estimators and test statistics,” in Handbook of econometrics. Volume II (Handbooks in Economics series, book 2. Amsterdam; New York and Oxford: NorthHolland; distributed in the U.S. and Canada by Elsevier Science, New York). Sargan. J. D. 1958. “The estimation of economic relationships using instrumental variables.” Econometrica 26, 393–415. Shea, J. 1997. “Instrument relevance in multivariate linear models: a simple measure.” Review of Economics and Statistics 79, 348. Sorensen, M. 2007. “How smart is smart money? A two-sided matching model of venture capital.” The Journal of Finance 62(38), 2725–2762. Staiger, D. and J. H. Stock. 1997. “Instrumental variables regression with weak instruments.” Econometrica 65, 557. Stock, J. H., J. H. Wright, and M. Yogo. 2002. “A survey of weak instruments and weak identification in generalized method of moments.” Journal of Business and Economic Statistics 20, 518. Stock, J. H., and M. Yogo. 2002. Testing for weak instruments in linear IV regression, (National Bureau of Economic Research, Inc, NBER Technical Working Paper: 284). Wei, Z., F. Xie, and S. Zhang. 2005. “Ownership structure and firm value in China’s privatized firms: 1991–2001.” Journal of Financial and Quantitative Analysis 40, 87. Wooldridge, J. M. 2002. Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Massachusetts. Wu, D.-M. 1973. “Alternative tests of independence between stochastic regressors and disturbances.” Econometrica 41, 733. Yan, A. 2006. “Leasing and debt financing: substitutes or complements?” Journal of Financial and Quantitative Analysis 41, 709.
Chapter 91
Bayesian Inference of Financial Models Using MCMC Algorithms Xianghua Liu, Liuling Li, and Hiroki Tsurumi
Abstract After using the Monty Hall problem to explain Bayes’ rule, we introduce Gibbs sampler and Markov Chain Monte Carlo (MCMC) algorithms. The marginal posterior probability densities obtained by the MCMC algorithms are compared to the exact marginal posterior densities. We present two financial applications of MCMC algorithms: the CKLS model of the Japanese uncollaterized call rate and the Gaussian copula model of S&P500 and FTSE100.
91.2 Bayesian Inference and MCMC Algorithms 91.2.1 Bayes’ Theorem Bayesian statistical inference is based on the conditional probability of an event B given an event A:
P .B j A/: (91.1) Keywords Monty Hall problem r Gibbs sampler r MCMC algorithms r Uncollaterized call rate r Gaussian copulas r SP500 r In the expression above, A has already happened, and it is FTSE100 treated as given. The use of Equation (91.1) may be brought home if we apply it to the Monty Hall problem: Monty Hall Problem: Suppose that you are on a game 91.1 Introduction show. You are given the choice of three doors: Behind Door Bayesian statistical inference has made a substantial inroad 2 is a car; behind Door 1 and Door 3 are goats. You pick not only in finance and economics but also in many other Door 1. Monty Hall the host of the show knows what is befields. It has been spreading widely at an increasing rate. Its hind the doors. He opens another door, say Door 3,which has success has been helped by the advent of computer technol- a goat. Monty then says to you, “Do you want to 2pick Door 2?” Is it to your advantage to switch your choice? ogy that lifts heavy burden on computation. The answer is that by switching your choice you can imBayesian inference relies on numerical integration, and since the early 1980s the Gibbs sampler,1 a stochastic-based prove the probability of winning a car to two-thirds from onenumerical integration procedure, has made numerical inte- third. There are many ways to solve the Monty Hall problem. gration of high dimensions easier. Gibbs sampling, or more The easiest answer is obtained by using Equation (91.1) exgenerally Markov Chain Monte Carlo (MCMC) algorithms, pressed as has provided solutions to estimation problems that are difficult to be solved by algorithms based on maximization such as the maximum likelihood or generalized methods of moments. In this chapter we first explain the essence of Bayesian inference and the Gibbs sampler. Then we present two applications: one is the spot asset pricing model and the other is a copula model.
which is known as Bayes’ rule. If we put A D O3 and B D C2 Equation (91.2) becomes
X. Liu (), L. Li, and H. Tsurumi Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]
2
1
The Gibbs sampler is named in honor of J.W. Gibbs, and the algorithm was devised by Geman and Geman (1984) about 80 years after the death of J.W. Gibbs. The interested reader should google “Gibbs sampling.”
P .B j A/ D
P .C2 j O3 / D
P .A j B/P .B/ P .A/
P .O3 j C2 /P .C2 / P .O3 /
(91.2)
(91.3)
The celebrated Monty Hall problem has been discussed over and over again. Most recently, the New York Times had an article on it entitled “Monty Hall Meets Cognitive Dissonance” April 7, 2008. If you Google “Monty Hall problem New York Times,” one of the first items that comes out in the search is to play the game. The second item is” Monty Hall Meets Cognitive Dissonance” April 7, 2008.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_91,
1371
1372
X. Liu et al.
where
In the Monty Hall problem we see that there are two events at work: guessing which door a car is behind and guessing which door Monty Hall opens.
C2 D event the car is behind Door 2 O3 D event Door 3 is opened by Monty Hall Similarly we can define C1 , and O2 and so forth. As we show below the answer to the Monty Hall problem is
Since you, the player, have no idea where the car is, the prior probability (i.e,. prior to Monty Hall opening Door 3) is 1 : 3
You know that Monty Hall opens either Door 2 or Door 3. Hence P.O2 / D P.O3 / D 1=2. Monty Hall knows that the car is behind Door 2. Hence P.O3 jC2 / D 1. Putting all together we have P .C2 j O3 / D
1
1 2
1 3
are the probabilities Monty Hall faces. Equation (91.2) is the relationship between the joint and conditional distributions. The reason why it is called Bayes’ rule lies in the use of Equation (91.2) in statistics. In statistics we have data y and parameters ™. If we put B D ™ and A D y then Equation (91.2) becomes . j y/ D
f .y j / ./ f .y/
(91.5)
where 2 1 D ¤ : 3 3
Suppose that the car is behind Door 1. Then P .C1 j O3 / D
i D 1; 2; 3
are the (prior) probabilities the player of the game faces, whereas P .O3 j C1 / and P .O3 j C2 /
2 1 P .C2 j O3 / D ¤ P .C2 / D : 3 3
P .C1 / D P .C2 / D P .C3 / D
P .Ci / and P .Oi /
P .O3 j C1 /P .C1 / D P .O3 /
1 2
1 3
1 2
D
1 D P .C1 / 3 (91.4)
since the car is behind Door 1 Monty opens either Door 2 or Door 3 with probability 1/2. Equation (91.4) shows that the event of Door 3 being opened does not change the probability of C1 . Naturally, we have P .C1 j O3 / C P .C2 j O3 / D 1:
f.yj™/ D the joint probability density function (pdf) of y given ™, hence the likelihood function; .™/ D prior pdf of ™; f.y/ D marginal pdf of y; .™jy/ D posterior pdf of ™. Since data y is realized random variables, f(y) is a constant. Hence, Equation (91.5) becomes . j y/ / ./ l. j y/
(91.6)
where l.™jy/ D f.yj™/ is the likelihood function. The Bayes’ rule (or theorem) reads as posterior is proportional to prior likelihood In Bayesian inference we need to derive the marginal pdf of a parameter. Suppose that ™ consist of ten elements: ™ D .™1 ; ™2 ; : : : ; ™10 /. The marginal posterior pdf of ™1 is
Z .1 j data/ /
Z
.1 ; 2 ; ; 10 j data/d.2 /d.3 / d.10 /:
If a closed form solution to this multiple integration is not available we need to resort to numerical integration.
91.2.2 Markov Chain Monte Carlo Algorithms There are various ways to carry out numerical integration: using, for example, a quadrature formula, importance
(91.7)
sampling, and Markov Chain Monte Carlo (MCMC) algorithms. MCMC algorithms are stochastic numerical integration methods based on the property of Markov chains that converge under certain conditions to steady state probabilities. The steady state probabilities are the marginal distributions. Casella and George (1992) and Robert and Gasella (2004) are handy reference sources on MCMC algorithms.
91 Bayesian Inference of Financial Models Using MCMC Algorithms
MCMC algorithms generally consist of the target density and proposal density. The target density is the posterior pdf of, say, ™1 given ™2 ; : : : ; ™10 : .1 j 2 ; ; 10 /:
Convergence tests on draws of “1 KST
0.676 (0.751) 0.757 (0.616) 1.181 (0.240)
FT Zg
The proposal density q.1 j 2 ; ; 10 / is the density from which we draw random numbers. If q.™1 j/ D .™1 j/ then we can draw random numbers from .™1 j/. In this case we say .™1 j/ is fully conditional, and we may use the Gibbs sampler. If not, we have to resort to other methods such as the accept–reject method or the MetropolisHastings method. Given the joint posterior pdf .™1 ; ™2 ; : : : ; ™10 jdata/, the Gibbs sampler is carried out in the following steps: Step 1: Choose the initial values, ™2 .0/ ; ™3 .0/ ; : : : ; ™10 .0/ . Step 2: Draw ™1 from .™1 j™2 .0/ ; : : : ; ™10 .0/ / and set ™1 as ™1 .1/ . Draw ™2 from .™2 j™1 .1/ ; ™3 .0/ ; : : : ; ™10 .0/ / and set ™2 as ™2 .1/ . :: : Draw ™j from .™j j™1 .1/; ™2 .1/; : : :; ™j .1/1 ; ™j .0/C1; : : : ; ™10 .0/ / :: : Draw ™10 from p.™10 j™1 .1/ ; ™2 .1/ ; : : : ; ™9 .1/ / and set ™10 as ™10 . Repeat Step 2 m times and collect .1/
.i / .i / .i / 1 ; 2 ; ; 10 ; i D 1; ; m Step 3: Discard (i.e., burn) the first mb draws and keep the rest. The theoretical basis of MCMC algorithms is the ergodic Markov chains converging to the steady state distributions and the steady state distributions are invariant to the initial starting states. Based on this theory, we burn initial draws of ™k .i/ ’s. We need to judge when the draws ™k .i/ ’s are converged. Plots of MCMC draws as well as some convergence tests are used to judge whether the draws have converged or not.3 We demonstrate that the Gibbs sampler yields the marginal posterior pdfs. We generate data using the simple regression model 3
1373
Goldman et al. (2008) compare the Kolmogorov-Smirnov, fluctuation, and Zg tests and recommend to use the filtered KolmogorovSmirnov test.
“2
¢
0.840 (0.481) 0.859 (0.451) 1:564 (0.080)
0.475 (0.978) 1.024 (0.245) 0:055 (0.841)
KST D Kolmogorov-Smirnov test FT D Fluctuation test Figures in parentheses are the prob-values. See Goldman et al. (2008)
yi D ˇ1 C ˇ2 xi C "i ; i D 1; ; n where "i N.0; 2 /: We set ˇ1 D ˇ2 D 1:0; 2 D 3:0; xi UniformŒ0; 5 and n D 100. The number of draws is 3,100 and we burn the first 100 draws. The exact posterior pdfs of “1 and “2 are univariate t distributions, while the posterior pdf of ¢ 2 is the inverted-gamma distribution. We compare the MCMC draws with the exact posterior pdfs. Since the posterior pdf of “1 ; “2 , and ¢ 2 is fully conditional we can use the Gibbs sampler as follows: Step 1: Choose an initial value for ¢ .0/ . Step 2: Draw “1 and “2 together from the bivariate normal distribution: # " # ! " .i / ˇO1 ˇ1 2.i 1/ 0 1 N ; .X X / .i / ˇO2 ˇ2 where ˇO1 and ˇO2 are the posterior means; ˇ1 and ˇ2 are the i -th draw of “1 and “2 ; 2.i 1/ is the i –1th draw of ¢ 2 . X is a n 2 matrix of observations on the regressors. Step 3: Draw ¢ 2.i/ from the inverted gamma distribution IG.˛; / where .i /
˛D
.i /
.y Xˇ .i / /0 .y Xˇ .i / / n2 ;D 2 2
and y is a n 1 matrix of observations on the regressand. The plots of the draws of the Gibbs sampler are given in Fig. 91.1. The draws exhibit the random patterns typical of the steady state distributions. Table below gives the convergence test statistics. Although the Kolmogorov -Smirnov test is preferable to the fluctuation and Zg tests, we present all three test statistics. The test statistics support convergence of the MCMC draws.
1374
X. Liu et al.
Gibbs sampler draws
Fig. 91.1 First 2,000 draws after the burns
2.50 2.00 1.50 1.00 0.50 0.00 -0.50
0
500
1000
1500
2000
1500
2000
1500
2000
Gibbs sampler draws
Draws of β1 1.3 1.2 1.1 1.0 0.9 0.8 0.7 0.6 0
500
1000
Gibbs sampler draws
Draws of β2 2.3 2.2 2.1 2.0 1.9 1.8 1.7 1.6 1.5 1.4
0
500
1000
Draws of σ Figure 91.2 presents the exact posterior pdfs and the posterior pdfs from the Gibbs sampler. We see that the posterior pdfs by the Gibbs sampler are very close to the exact posterior pdfs.
91.3 CKLS Model with ARMA-GARCH Errors Beginning in the mid-1970s, the spot asset prices have been modeled mathematically by diffusion processes. In econometrics, the parameters of these processes have been estimated in discrete time approximation. A model used by Chan (1992) has been applied to various spot asset prices since the model is flexible enough to contain some popular models. The discrete time specification of the CKLS model is
rt D ˛ C ˇrt C rt ut
(91.8)
where rt is the spot rate. If œ D 0, Equation 91.8 becomes Vasicek’s model. If œ D :5 it becomes the CIR (Cox, Ingersoll and Ross) model. If œ D 1 it becomes the Brennan-Schwartz model. When “ D 0 and œ D 0 Equation (91.8) becomes Merton’s model. Equation (91.8) has been estimated by generalized methods of moments (GMM), maximum likelihood methods (MLE), and by Bayesian procedures. Qian et al. (2005) compared the performances of the GMM, MLE, and Bayesian methods and found that the Bayesian and MLE methods tend to dominate the GMM method especially when œ and/or ¢ get large. In their paper, the error term ut was assumed to be the Gaussian white noise. Here we will specify ut follows an ARMA-GARCH error process:
rt D ˛ C ˇrt 1 C rt 1 ut p q X X ut D
j ut j C"t C j "t j j D1
j D1
91 Bayesian Inference of Financial Models Using MCMC Algorithms
posterior densities
Fig. 91.2 Posterior pdfs: Exact vs. Gibbs sampler
1375
1.40 1.20 1.00 0.80 0.60 0.40 0.20 0.00 0.00
0.50
1.00
1.50
2.00
β1
posterior densities
5
Legend
4
exact Gibbs sampler
3 2 1 0 0.60
0.80
1.00
1.20
1.40
2.0
2.2
β2 posterior densities
4.00 3.00 2.00 1.00 0.00 1.4
1.6
1.8
σ "t N 0; t2 r s X X t2 D a0 C aj "2tj C bj t2j j D1
(91.9)
j D1
There are 3 C p C .q C 1/ C .r C 1/ C s parameters to be estimated. We modify the MCMC algorithms of Nakatsuma (2000) as follows: 1. To draw œ we use the modified efficient jump method proposed in Tsurumi and Shimizu (2008). 2. Instead of using nonlinear maximization procedures for the MA and GARCH parameters we use random walk draws.
procedures. In our experience a modified efficient jump or a random walk draw works better than a nonlinear maximization procedure. We apply the CKLS model with ARMA-GARCH errors to the Japanese short-term interest rate, uncollateralized overnight call rate, which is equivalent to the U.S. overnight federal funds rate. Since the uncollateralized overnight call rate began on July 31 1985 and the Bank of Japan adopted the so called zero-interest rate policy in 1999, the sample period of the data is from July 31 1985 to October 31 1998. We have 3,388 observations. The mean, standard deviation (std) and coefficients of skewness and kurtosis are as follows: Mean
r
In many MCMC draws of non-linear parameters such as ™j ’s, aj ’s, and bj ’s, nonlinear maximization estimation is often imbedded in the MCMC algorithms. However, it is more natural to design Markov chains without using maximization
0:002
Std 0:114
Skewness 1:232
Kurtosis 37:185
We observe that the change in the call rate is highly leptokurtic and slightly skewed, reflecting the stylized fact about many financial returns.
1376
X. Liu et al.
Table 91.1 MCMC draws of the CKLS model with ARMA-GARCH errors Posterior means (Standard deviations) ˛ ˇ
1
2 1 a0 a1 b1 b2
6,000/3 0:00063 (0.00090) 0:00015 (0.00031) 0.5289 (0.0892) 0.5310 (0.0952) 0:0794 (0.0268) 0:6806 (0.0880) 0.00124 (0.0079) 0.3130 (0.0700) 0.3847 (0.0944) 0.2692 (0.0704)
100,000/50 0:00063 (0.00064) 0:00016 (0.00026) 0.5463 (0.0420) 0.5390 (0.0738) 0:0814 (0.0739) 0:6879 (0.0589) 0.0005 (0.0035) 0.3045 (0.0388) 0.3934 (0.0675) 0.2867 (0.0562)
Notes: (1) 6,000/3 denotes MCMC draws of 6,000 after the burn of 1,000 and every 3rd draw are kept. (2) 100,000/50 denotes MCMC draws of 100,000 after the burn of 1,000 and every 50th draw are kept
Table 91.1 presents the summary statistics of the MCMC draws for the ARMA(2,1)-GARCH(1,2) errors for the draws of 6,000 and 100,000. We note that the posterior means are more or less the same for the MCMC draws of 6,000 and 100,000. Does this mean that we attain convergence with 6,000 draws? Table 91.2 presents the estimates of the AR(1) coefficient of the MCMC draws, and unfiltered and filtered Kolmogorov-Smirnov tests (KST) for 6,000 and 100,000 draws. As shown in Table 91.2, the filtered Kolmogorov-Smirnov test shows that except for four parameters the MCMC draws exhibit convergence. When the number of draws is increased to 100,000, the MCMC draws of all 10 parameters show convergence judged by the filtered Kolmogorov-Smirnov test. With 6,000 draws we keep every 3rd draw, while with 100,000 draws we keep every 50th draw. When the interval of the draws to be kept is increased from 50 to 3, the autocorrelation of the draws reduces substantially, and only two unfiltered KSTs reject convergence. All the filtered KSTs show convergence. Table 91.2 illustrates that: 1. Convergence of MCMC draws needs to be judged parameter by parameter.
2. When the autocorrelation of the draws is higher than 0.95, the unfiltered KST reject the null hypothesis of convergence, indicating that when the autocorrelation of the draws is high the KST tends to reject convergence even if convergence is attained. 3. When the autocorrelation of the draws is high, we should use the filtered KST. The two parameters of interest in the CKLS model is “ and œ. Figure 91.3 presents the marginal posterior pdfs for “ and œ. We see that the probability of “ being negative is 0.74. While the posterior mean of œ is 0.546, the 95% highest posterior density interval includes œ D 0:5, indicating that the CIR model may hold.
91.4 Copula Model for FTSE100 and S&P500 Applications of copulas have been fast growing. Using copulas we can build joint distributions from independent marginal distributions, thus incorporating dependence among variables.4 One of the primary uses of copulas is to measure nonlinear dependence through Kendall’s tau, Spearman’s rho, and Gini indices. In statistical applications, a Gaussian copula, Clayton copula, and Frank copula have been used. While the maximum likelihood estimation has been popular in estimating copulas, there are some Bayesian papers: Hoff (2006) and Ausin and Lopes (2006) to name a few. In this section using a Gaussian copula, we build a bivariate joint distribution of S&P500 and FTSE100. S&P500 and FTSE100 are two leading stock indices in the U.S. and in Europe. We will model the returns on these two financial indices using a copula. We use monthly data from January 1981 to October 2006 with a total of 319 observations. The data are downloaded from Global Financial Data. The descriptive statistics are given in Table 91.3. We specify regression models: yt1 D xt1 ˇ1 C 1 et1 yt 2 D xt 2 ˇ2 C 2 et 2
(91.10–91.11)
where yt1 D returns on FTSE100 at time t; yt2 D returns on S&P500 at time t, and xt1 D xt2 D 1. Judged by the correlation between yt1 and yt2 in Table 91.3, we see error terms et1 and et2 may not be treated independent. Also, we see from Table 91.3 that the yields of both FTSE100 and S&P500 are leptokurtic and slightly negatively skewed. 4 According to Wikipedia, Li (2000) used a Gaussian copula for the pricing of collateralized debt obligations, one of the new financial instruments much discussed in the current turmoil in the financial market.
91 Bayesian Inference of Financial Models Using MCMC Algorithms
1377
Table 91.2 Autocorrelation of MCMC draws and unfiltered and filtered Kolmogorov–Smirnov tests (left block: 6,000/3; right block: 100,000/50)

| Parameter | AR(1) | KST | Filtered KST | AR(1) | KST | Filtered KST |
|-----------|-------|-----|--------------|-------|-----|--------------|
| α  | 0.551 | 0.984 (0.288)  | 0.827 (0.500)  | 0.126 | 0.750 (0.627)  | 0.775 (0.585) |
| β  | 0.555 | 0.275 (0.078)  | 0.783 (0.573)  | 0.019 | 0.825 (0.504)  | 0.900 (0.393) |
| λ  | 0.977 | 1.66* (0.008)  | 1.297 (0.069)  | 0.589 | 0.550 (0.923)  | 0.550 (0.923) |
| φ₁ | 0.976 | 3.18* (0)      | 1.45* (0.029)  | 0.579 | 1.58* (0.014)  | 0.959 (0.341) |
| φ₂ | 0.656 | 1.29 (0.067)   | 0.827 (0.500)  | 0.149 | 1.025 (0.244)  | 1.000 (0.280) |
| θ₁ | 0.973 | 2.75* (0)      | 1.36* (0.048)  | 0.610 | 1.50* (0.022)  | 1.319 (0.062) |
| a₀ | 0.966 | 1.14* (0.038)  | 0.894 (0.400)  | 0.619 | 0.575 (0.896)  | 0.625 (0.830) |
| a₁ | 0.976 | 2.59* (0)      | 1.66* (0.01)   | 0.592 | 0.950 (0.327)  | 1.075 (0.198) |
| b₁ | 0.986 | 3.31* (0)      | 1.65* (0.01)   | 0.640 | 0.875 (0.428)  | 0.475 (0.977) |
| b₂ | 0.980 | 2.82* (0)      | 1.275 (0.078)  | 0.553 | 0.875 (0.428)  | 1.100 (0.178) |

Note: (1) AR(1) = Σ_i (θ_j^{(i−1)} − θ̄_j)(θ_j^{(i)} − θ̄_j) / Σ_i (θ_j^{(i−1)} − θ̄_j)², where θ_j^{(i)} is the i-th draw of parameter θ_j; (2) see notes in Table 91.1 for 6,000/3 and 100,000/50; (3) KST = unfiltered Kolmogorov–Smirnov test; (4) Filtered KST = filtered Kolmogorov–Smirnov test (see Goldman et al. 2008); (5) * denotes failure of convergence at the 5% significance level; (6) the figures in parentheses are prob-values.

Fig. 91.3 Posterior pdfs of β and λ (two density panels: β over roughly −0.0015 to 0.0010, λ over roughly 0.450 to 0.650)
While the skewness may be negligible, the kurtosis is significantly larger than 3. Hence we use exponential power distributions (EPDs). There are multivariate EPDs (Gómez Sanchez-Mariano et al. 1998, 2002), but the shape parameters need to be identical for e_{t1} and e_{t2}. To allow different shape parameters and at the same time introduce dependence between S&P500 and FTSE100, we use a copula. Specifically, we assume that e_{t1} and e_{t2} have independent EPDs:
f_i(e_{ti}) = [α_i / (2^{(α_i+1)/α_i} ω_i Γ(1/α_i))] exp( −(1/2)|e_{ti}/ω_i|^{α_i} ),  i = 1, 2   (91.12)

where

ω_i = [2^{−2/α_i} Γ(1/α_i) / Γ(3/α_i)]^{1/2}.
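A small numerical check of Equation (91.12), assuming the scale constant ω_i shown above (our notation for the reconstructed normalizer): for any shape α_i the density should integrate to one and have unit variance, with α = 2 recovering the standard normal.

```python
import numpy as np
from scipy.special import gamma
from scipy.integrate import quad

def epd_density(e, alpha):
    # Standardized exponential power density of Equation (91.12):
    # zero mean, unit variance; alpha = 2 is the normal, alpha < 2
    # gives fatter tails (leptokurtosis).
    omega = np.sqrt(2.0 ** (-2.0 / alpha) * gamma(1.0 / alpha) / gamma(3.0 / alpha))
    const = alpha / (2.0 ** ((alpha + 1.0) / alpha) * omega * gamma(1.0 / alpha))
    return const * np.exp(-0.5 * np.abs(e / omega) ** alpha)

# Verify that the density integrates to 1 and has unit variance.
for a in (1.0, 1.77, 2.0):
    total, _ = quad(lambda e: epd_density(e, a), -np.inf, np.inf)
    var, _ = quad(lambda e: e * e * epd_density(e, a), -np.inf, np.inf)
    print(f"alpha={a}: integral={total:.4f}, variance={var:.4f}")
```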
Table 91.3 Summary statistics for FTSE100 and S&P500 (monthly returns)

|             | FTSE100 | S&P500  |
|-------------|---------|---------|
| Mean        | 0.0085  | 0.0086  |
| Min         | −0.0121 | −0.0102 |
| Max         | 0.1443  | 0.1318  |
| Std dev     | 0.0462  | 0.0431  |
| Skewness    | −0.9050 | −0.5859 |
| Kurtosis    | 6.673   | 5.456   |
| Correlation | 0.706   |         |
The EPD above has zero mean and unit variance. Given a bivariate Gaussian copula, the copula density is given by

h(e_{t1}, e_{t2}) = (1 − ρ²)^{−1/2} exp[ (2ρs_t r_t − ρ²s_t² − ρ²r_t²) / (2(1 − ρ²)) ] ∏_{i=1}^{2} f_i(e_{ti})   (91.13)

where s_t = Φ^{−1}(F_{t1}(x)) and r_t = Φ^{−1}(F_{t2}(x)),

F_{ti} = ∫_{−∞}^{x} f_i(x) dx,  i = 1, 2,

and Φ(·) is the cumulative distribution function of the standardized normal variate, for t = 1, …, n.

Let p(θ) be the prior, where θ = (β₁, β₂, σ₁, σ₂, α₁, α₂, ρ). We use an informative prior as follows:

p(θ) = p(ρ)[ ∏_{i=1}^{2} p(β_i)p(α_i)p(σ_i) ] = IW(s, 1) ∏_{i=1}^{2} N(b_i, τ_{i0}²) IG(c_{αi}, d_{αi}) IG(c_{σi}, d_{σi})   (91.14)

where s, b_i, τ_{i0}², c_{αi}, d_{αi}, c_{σi}, and d_{σi} are the parameters of the prior; IW and IG denote the inverted Wishart and inverted gamma distributions, respectively. The posterior pdf is

p(θ | data) ∝ p(θ) ∏_{t=1}^{n} h(e_{t1}, e_{t2}).   (91.15)

There are two ways of estimating the parameter θ: one-step and two-step methods. Two-step methods are frequently used in maximum likelihood estimation (MLE). Applied to our copula model, the two-step MLE procedure is as follows.

Step 1: From the likelihood function

l(β₁, β₂, σ₁, σ₂, α₁, α₂ | data) = ∏_{t=1}^{n} f₁(e_{t1}) f₂(e_{t2})

obtain the maximum likelihood estimates β̂₁, β̂₂, σ̂₁, σ̂₂, α̂₁, and α̂₂.

Step 2: Obtain the MLE of ρ from the Gaussian copula

∏_{t=1}^{n} (1 − ρ²)^{−1/2} exp[ (2ρs_t r_t − ρ²s_t² − ρ²r_t²) / (2(1 − ρ²)) ]

where s_t and r_t are evaluated at the MLEs β̂₁, β̂₂, σ̂₁, σ̂₂, α̂₁, and α̂₂.

The two-step MCMC algorithm is as follows.

Step 1: Obtain MCMC draws of β₁, β₂, σ₁, σ₂, α₁, and α₂ from the posterior pdf

p(β₁, β₂, σ₁, σ₂, α₁, α₂ | data) ∝ ∏_{i=1}^{2} p(β_i)p(σ_i)p(α_i) · l(β₁, β₂, σ₁, σ₂, α₁, α₂ | data)

Step 2: Obtain MCMC draws of ρ given the MCMC draws of β_i, σ_i, and α_i, i = 1, 2.

In contrast to the two-step MCMC algorithm, the one-step MCMC algorithm draws all the parameters, including ρ, from Equation (91.15). As shown in Ausin and Lopes (2006) and in Bhasker (2008), single-step estimation tends to yield better results than two-step estimation. In our MCMC draws we go further: we derive one-period-ahead predictive values of FTSE100 and S&P500, ỹ_{n+1} = (ỹ_{n+1,1}, ỹ_{n+1,2}), where ỹ_{n+1,1} is the one-period-ahead forecast of FTSE100 and ỹ_{n+1,2} is the one-period-ahead forecast of S&P500.
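To make the two-step logic concrete, here is a minimal sketch. It is a simplification: in place of the fitted EPD marginal CDFs of Step 1, it maps each residual series to normal scores through the empirical CDF (the function names are ours, not the chapter's), and then maximizes the Gaussian copula log-likelihood of Equation (91.13) over ρ.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def normal_scores(e):
    # Map a residual series to normal scores via its empirical CDF;
    # this stands in for the fitted EPD CDFs F_t1, F_t2 of Step 1.
    ranks = np.argsort(np.argsort(e)) + 1.0
    return norm.ppf(ranks / (len(e) + 1.0))

def copula_rho(e1, e2):
    # Step 2: maximize the Gaussian-copula log-likelihood of
    # Equation (91.13) over the correlation parameter rho.
    s, r = normal_scores(e1), normal_scores(e2)

    def neg_loglik(rho):
        q = (2 * rho * s * r - rho ** 2 * (s ** 2 + r ** 2)) / (2 * (1 - rho ** 2))
        return -np.sum(-0.5 * np.log(1 - rho ** 2) + q)

    return minimize_scalar(neg_loglik, bounds=(-0.99, 0.99), method="bounded").x
```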
Table 91.4 Posterior means and standard deviations of the Gaussian copula model for FTSE100 and S&P500

| Parameter | Mean | Std dev |
|-----------|------|---------|
| β₁        | 0.01 | 0.01    |
| β₂        | 0.01 | 0.01    |
| σ₁        | 0.05 | 0.02    |
| σ₂        | 0.04 | 0.02    |
| ρ         | 0.68 | 0.07    |
| α₁        | 2.17 | 0.90    |
| α₂        | 1.77 | 0.74    |

Table 91.5 Mean squared forecast errors (MSFE) of one-period-ahead forecasts

|                  | Mean   | Standard error |
|------------------|--------|----------------|
| Model 1: SP500   | 0.0019 | 0.0258         |
| Model 1: FTSE100 | 0.0029 | 0.0040         |
| Model 2: SP500   | 0.0026 | 0.0393         |
| Model 2: FTSE100 | 0.0053 | 0.0472         |

Note: Model 1 is the Gaussian copula model. Model 2 is the univariate model.

We obtain MCMC draws of (ỹ_{n+1}, θ) using the following joint predictive posterior distribution as the target density:

f_pred(ỹ_{n+1}, θ) = p(θ) p(ỹ_{n+1}) ∏_{t=1}^{n+1} h(e_{t1}, e_{t2})   (91.16)
and we use a flat prior for ỹ_{n+1}: p(ỹ_{n+1}) ∝ constant. We have designed MCMC with Metropolis–Hastings algorithms for the draw of (ỹ_{n+1}, θ). Table 91.4 presents the posterior means and standard deviations. We see that the posterior means of the shape parameters α₁ and α₂ are 2.17 and 1.77, respectively. The posterior mean of the correlation coefficient ρ is 0.68. Will the copula model that incorporates the correlation between S&P500 and FTSE100 improve on the prediction from the univariate model? In the univariate model the error terms of the returns of S&P500 and FTSE100 are assumed to follow independent EPDs. As the measure of prediction performance we use the mean squared forecast error (MSFE) of one-period-ahead prediction. By the MCMC algorithms we can generate draws of the one-period-ahead forecast, ỹ_{n+1}^{(j)}. From these we can obtain the distribution of the MSFE:
MSFE^{(j)} = ( ỹ_{n+1}^{(j)} − y_{n+1} )²,  j = 1, …, m   (91.17)
where y_{n+1} is the realized S&P500 or FTSE100 return at t = n + 1, MSFE^{(j)} is the j-th MCMC draw of the MSFE, and m is the number of MCMC draws. For each model we may plot the distribution of the MSFE or report its mean and standard deviation. The model whose MSFE distribution is closer to zero is the better model. Table 91.5 presents the means and standard deviations of the one-period-ahead MSFEs. Model 1 is the Gaussian copula model and Model 2 is the univariate model. We see that for both S&P500 and FTSE100 the MSFE of the copula model is better than that of the univariate model, since the means and standard errors of the copula model are smaller than those of the univariate model.
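In code, the Table 91.5 summaries are immediate once the predictive draws are in hand; a minimal sketch (the function name is ours):

```python
import numpy as np

def msfe_summary(forecast_draws, realized):
    # Equation (91.17): one squared-error draw per MCMC forecast
    # draw; report the mean and standard deviation as in Table 91.5.
    msfe = (np.asarray(forecast_draws) - realized) ** 2
    return msfe.mean(), msfe.std()
```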
91.5 Conclusion

We used the Monty Hall problem to explain the Bayes theorem. The Bayes theorem is a probability statement. In statistics and econometrics we use the Bayes theorem to make inferences on the parameters of the distribution from which data are drawn. Once data become available they are treated as realized random variables and thus they are fixed. The parameters, on the other hand, are treated as random. Given the realized data, the joint posterior distribution of the parameters is obtained by the Bayes theorem. The derivation of the joint posterior distribution of the parameters is straightforward. The difficult part is to obtain the marginal posterior distribution of a parameter, since we need to integrate out the remaining parameters from the joint posterior distribution. Most of the models in finance are nonlinear in parameters, and we are unable to obtain the marginal posterior distribution of a parameter in closed form. We need to resort to numerical integration. Since the early 1990s, stochastic integration procedures known as Markov Chain Monte Carlo (MCMC) algorithms have rapidly replaced other numerical integration procedures such as quadrature formulas, importance sampling, and the Laplace approximation. We used the linear regression model to explain the Gibbs sampler, which is used when fully conditional posterior distributions are available. In using MCMC algorithms we need to check whether the MCMC draws of the parameters have converged or not. The check on convergence needs to be made by plotting the draws and by using convergence tests such as the Kolmogorov–Smirnov test, the fluctuation test, and the Zg test. As empirical illustrations we estimated the Chan, Karolyi, Longstaff, and Sanders (CKLS) model for the daily Japanese short-term interest rate between July 31, 1985 and October 31, 1998. Then we applied a bivariate copula model to incorporate the correlation between monthly returns on the FTSE100 and S&P500 indices. We showed that the predicted values of the returns on the FTSE100 and S&P500 indices are closer to the realized returns when we incorporate the correlation. While we recognize that MCMC algorithms are powerful tools to obtain marginal distributions, we should point out
that the design of the MCMC algorithms needs to be made with utmost care. In particular, we need to select an appropriate proposal density.
References

Ausin, M. C. and H. F. Lopes. 2006. "Time-varying joint distribution through copulas." Mimeograph.
Bhasker, S. 2008. "Financial market interdependence: the copula-GARCH model." Mimeograph.
Brennan, M. and E. Schwartz. 1980. "Analyzing convertible bonds." Journal of Financial and Quantitative Analysis 15, 907–929.
Casella, G. and E. I. George. 1992. "Explaining the Gibbs sampler." The American Statistician 46(3), 167–174.
Chan, K., A. Karolyi, F. Longstaff, and A. Sanders. 1992. "An empirical comparison of alternative models of the short-term interest rate." Journal of Finance 47, 1209–1227.
Cox, J., J. Ingersoll, and S. Ross. 1980. "An analysis of variable rate loan contracts." Journal of Finance 35, 389–403.
Geman, S. and D. Geman. 1984. "Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images." IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721–741.
Goldman, E., E. Valieva, and H. Tsurumi. 2008. "Kolmogorov–Smirnov, fluctuation and Zg tests for convergence of Markov Chain Monte Carlo draws." Communication in Statistics, Simulation and Computations 37(2), 368–379.
Gómez Sanchez-Mariano, E. G., E. G. Gómez-Villegas, and J. M. Marín. 1998. "A multivariate generalization of the power exponential family of distributions." Communication in Statistics 27, 589–600.
Gómez Sanchez-Mariano, E. G., E. G. Gómez-Villegas, and J. M. Marín. 2002. "A matrix variate generalization of the power exponential family of distributions." Communication in Statistics 31, 2167–2182.
Hoff, P. D. 2006. "Marginal set likelihood for semiparametric copula estimation." Mimeograph.
Li, D. X. 2000. "On default correlation: a copula function approach." Journal of Fixed Income 9(4), 43–54.
Merton, R. 1973. "An intertemporal capital asset pricing model." Econometrica 41, 867–887.
Nakatsuma, T. 2000. "Bayesian analysis of ARMA-GARCH models: a Markov chain sampling approach." Journal of Econometrics 95, 57–69.
Qian, X., R. Ashizawa, and H. Tsurumi. 2005. "Bayesian, MLE, and GMM estimation of spot rate model." Communication in Statistics 34(12), 2221–2233.
Robert, C. P. and G. Casella. 2004. Monte Carlo statistical methods, 2nd ed. Springer, New York.
Tsurumi, H. and H. Shimizu. 2008. "Bayesian inference of the regression model with error terms following the exponential power distribution and unbiasedness of the LAD estimator." Communication in Statistics 37(1), 221–246.
Vasicek, O. A. 1977. "An equilibrium characterization of the term structure." Journal of Financial Economics 5, 177–188.
Chapter 92
On Capital Structure and Entry Deterrence Fathali Firoozi and Donald Lien
Abstract The theoretical literature on the link between an incumbent firm's capital structure (financial leverage, debt/equity ratio) and entry into its product market is based on two classes of arguments, the "limited liability" arguments and the "deep pocket" arguments. However, these two classes of arguments provide contradictory predictions. This study provides a distinct strategic model of the link between capital structure and entry that is capable of rationally producing both of the existing contradictory predictions. The focus is on the role of beliefs and the cost of adjusting capital structure, two of the factors that are absent in the existing models. The beliefs may be exogenously given or endogenized by a rational expectations criterion.

Keywords Market entry · Leverage · Game theory
92.1 Introduction

There is a substantial literature on the strategic role that a firm's capital structure (financial leverage, debt-equity ratio) could play in its product markets. A small but significant subset of this literature has focused on the link between a firm's capital structure and entry into its product market. The seminal studies in this subset include Brander and Lewis (1986) and Poitevin (1989). The arguments can be classified into two classes with contradictory predictions. One class, referred to as the "limited liability" arguments initially proposed by Brander and Lewis (1986), goes along the following line. Since shareholders do not have priority over debt-holders in case of bankruptcy, a financial restructuring that amounts to a reduction in the firm's debt/equity ratio changes the debt-holder and share-holder structure in a way that forces the firm to be less aggressive in its product market and thus encourages entry. The second class,

F. Firoozi and D. Lien, Department of Economics, University of Texas at San Antonio, San Antonio, TX, USA. e-mail: [email protected]; [email protected]
referred to as the "deep pocket" arguments initially proposed by Poitevin (1989), suggests the following. A reduction in the incumbent's debt/equity ratio effectively increases the debt-financing capability of the firm. Since capacity expansions are often debt-financed, an entrant views an incumbent with a low debt/equity ratio as having the financial strength to engage in predatory expansion through debt financing if entry occurs. Therefore, the stated leverage reduction has the effect of discouraging entry, thus contradicting the prediction given by the limited liability arguments. Clearly, the connection between capital structure and market entry, as well as the predictions summarized above, are not limited to domestic competition. Similar issues arise in international entry contexts. In fact, the impact of capital structure on international market entry by multinationals has recently received attention in international trade circles. Related studies include Chao et al. (2002), Chao and Yu (1996, 2000), Marjit and Mukherjee (1998, 2001), Mukherjee (2002), Patron (2007), and Vishwasrao et al. (2007). In this study we propose a distinct strategic link between an incumbent firm's capital structure and entry into its product market. We propose a strategic model that brings into a two-period stochastic setting two distinct links. The first link is between capital structure and capacity expansion, based on a number of studies in the financial literature, including Stulz (1990), Harris and Raviv (1991), and Li and Li (1996). The second link is the connection between capacity expansion and entry deterrence, based on a number of classic studies, including Spence (1977) and Dixit (1980). The proposed unified model in this study has the novel feature of reconciling the existing contradictory predictions advanced by the limited liability arguments and the deep pocket arguments as summarized above regarding the link between capital structure and entry. The results in the present study demonstrate that the crucial factors that could generate the existing contradictory results consist of the probabilistic beliefs held by potential entrants regarding the impact of capital structure on operational expansion, as well as the cost to the incumbent of adjusting capital structure. Further, we show that the stated belief may be exogenous or completely endogenized by a rational expectations criterion. Needless to say, the results
in the present study are directly applicable in international entry contexts as well, given that capital structure plays a role in determining the ability of an incumbent domestic firm to expand. The basic components of the model are constructed in Sect. 92.2. Its equilibrium is studied in Sect. 92.3. The link between capital structure and entry is then studied in Sect. 92.4 with a special focus on the role of beliefs. Some concluding remarks appear in the last section.
92.2 The Setting

Consider an incumbent monopolist in the first period of a two-period time span and a potential entrant in period 1 contemplating entry in period 2. The potential entrant in period 1 is concerned about the incumbent's cost structure in period 2. This cost structure depends on whether the incumbent maintains its current productive capacity or will expand in period 2. If the entrant is certain that the incumbent will expand, then it will not enter. Although the entrant is uncertain about such expansion by the incumbent, it can develop a rational probabilistic assessment regarding such an expansion. Hence, the entrant has a probabilistic assessment in period 1, referred to as the entrant's belief and denoted by β, regarding whether the incumbent will engage in capacity expansion in period 2 if entry occurs. It is well known that a low leverage (low debt/equity ratio) is often an essential prerequisite for subsequent expansion, consistent with the observation that capacity expansions are often debt-financed. For instance, the studies of Stulz (1990), Harris and Raviv (1991), and Li and Li (1996) have suggested that financial leverage is negatively correlated with the subsequent undertaking of growth and investment opportunities. This literature shows that low leverage is often an essential component for subsequent expansion. As such, the stated probabilistic belief depends on the incumbent's capital structure (financial leverage or debt/equity ratio) in period 1, information that is assumed to be publicly available, thus observable by the potential entrant. Further, as will be shown in Sect. 92.3, the functional form of the connection between the entrant's belief and the incumbent's capital structure may be rather general and exogenous, or endogenized by a rational expectations criterion. A class of games often referred to as dynamic games of incomplete information has the distinctive feature of incorporating beliefs (Aumann and Hart 1994). We now characterize a context within this class. Typical models in the stated general class are by nature interactive, dynamic, and stochastic, and thus require certain stylized set-ups for tractability. Therefore, a number of additional assumptions must be made. We construct a setting that reflects the essential features for this study and avoids unnecessary complications.
At the start of period 1 and in the absence of any concern about entry or expansion, the incumbent has implemented an optimal capital structure and carries the physical production scale or capacity K. The incumbent's production is such that the marginal cost is partially and inversely related to its production scale or capacity. In the sense that an innovation induced increase in scale or capacity will reduce unit costs, investment in capacity can be equivalently viewed as investment in innovation in our setting. Therefore, given that the incumbent's current capacity is represented by K, a rise in K reflects investment in innovation that yields lower per unit costs for the incumbent. In period 1, the potential entrant envisions two possibilities for the capacity status of the incumbent in period 2: the present capacity K or the expanded capacity K̄, so K < K̄. The potential entrant is uncertain about whether the incumbent will continue with the present capacity K in period 2 or will have the expanded capacity K̄. Further, the potential entrant must finalize its entry decision in period 1 before the incumbent has manifested its capacity for period 2. As such, the entrant will develop a likelihood of innovation induced capacity expansion by the incumbent that depends on the incumbent's financial positioning in period 1. This likelihood, which is the potential entrant's belief, will be characterized shortly. As will be shown, the incumbent has a rational incentive to discourage entry. Thus, when faced with a possibility of entry, the incumbent maintains a capital structure that could accommodate expansion. A crucial observation that will be incorporated shortly is that financial positioning for innovation induced expansion is costly to the incumbent. No presumption is made about the length of each period except that the end of period 1 is marked by a common observation that either the potential entrant has finalized the decision to enter in period 2 or the threat of entry is terminated. We next specify the market demand, costs, and the nature of interaction if entry occurs in period 2. Throughout, we assume that the output is the numeraire. As for the market demand for the output in period 2, the following structure is assumed:

X = a − bP   (92.1)
where a > 0 represents the market size, b > 0 is the price sensitivity of demand, X is the total demand for the output, and P is the market price. In further characterization of the incumbent, we bypass the usual control and agency issues by assuming owner-manager unity. This unity also allows us not to distinguish between financial and operational profits and reflect the two in an overall profit function. Any reduction in financial leverage (reduction in debt/equity ratio) is treated as a costly investment in the present study for the following reason. In the case of leverage reduction by issuing new equity, the stated adjustment cost is in accord with the literature that has shown equity issuance
will generally carry an adverse effect on the firm's valuation by financial markets. References include Myers and Majluf (1984) and Asquith and Mullins (1986), who showed that the announcement of an equity issue and the firm's share prices are negatively correlated. Further, we assume that the cost of leverage reduction in period 1 is distributed as cost in periods 1 and 2, in accordance with a common practice that distributes an investment cost over the current and future periods when the benefits are expected to be realized. This practice is often adopted to avoid a misleading profit report for any period. To reflect the above assumptions and specifications, we incorporate the incumbent's financial leverage (L) into its overall profit function via costs. Furthermore, we adopt the usual assumption that the incumbent's production capacity (K) is inversely related to its marginal cost, so that an innovation induced increase in capacity as reflected by a rise in K will reduce its marginal cost. All of the stated cost characteristics are incorporated by adopting the following marginal cost function for period 2, which reflects both financial and operational costs to the incumbent in period 2:

A(L, K) = μ − λL − γK   (92.2)
where μ > 0 is a constant, λ > 0 reflects the cost effect of a leverage reduction, γ > 0 represents the reduction in production cost associated with an innovation induced capacity expansion, L is the incumbent's financial leverage (capital structure, debt/equity ratio) in period 1, and K is the incumbent's capacity in period 2. As elaborated earlier, leverage reduction in period 1 has cost effects that distribute over periods 1 and 2, which is reflected by the second term in Equation (92.2). Since our focus is on the impact of L and K, it suffices to assume a constant marginal cost in relation to output. These aspects also allow a further simplification by assuming that the conventional fixed costs are zero, so that the marginal cost reflects the average cost as well. To illustrate in a conventional cost-output diagram, each period's average cost (AC) and marginal cost (MC) schedules may be viewed as coinciding horizontal lines that shift with changes in L or K. This line is initially shifted up when the leverage L is reduced, as determined by the leverage reduction cost coefficient (λ), and shifted down when the capacity K is increased, as determined by the capacity coefficient (γ). A smaller value for λ implies a smaller financial cost associated with any leverage reduction. A larger value for γ reflects a larger cost reduction emerging from any innovation induced expansion of capacity. As for the potential entrant, in the case of entry in period 2, the entrant will have a given marginal cost parameter c > 0. As in the case of the incumbent, we assume zero conventional fixed costs, so that the cost parameter (c) represents both the marginal cost and average cost to the potential entrant in the case of entry.
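For concreteness, a one-line sketch of the cost function, in the symbols of our reconstruction of Equation (92.2); the parameter values are hypothetical.

```python
def marginal_cost(L, K, mu=1.0, lam=0.2, gam=0.3):
    # Equation (92.2): A(L, K) = mu - lam*L - gam*K. Lowering
    # leverage (dL < 0) shifts the coinciding AC = MC line up by
    # lam*|dL|; expanding capacity (dK > 0) shifts it down by
    # gam*dK. All values here are hypothetical.
    return mu - lam * L - gam * K
```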
Beliefs. We now specify the potential entrant's belief as defined earlier in general form. From the potential entrant's perspective in period 1, the likelihood that the incumbent will adopt the expanded innovation induced capacity K̄ in period 2 is a function of the incumbent's financial leverage in period 1. This likelihood reflects the potential entrant's belief and is defined by:

β(L) = Prob[K = K̄ | L]   (92.3)

so β(L) is the potential entrant's probability assessment that the incumbent will adopt the expanded capacity K̄ in period 2, given the incumbent's financial leverage L in period 1. Thus:

1 − β(L) = Prob[K = K | L]   (92.4)

is the likelihood that the expansion will not occur. We assume that a smaller L (smaller debt/equity ratio) corresponds to a larger likelihood of expansion, so

dβ/dL = (d/dL) Prob[K = K̄ | L] ≤ 0   (92.5)
As will be shown, the assumption dβ/dL ≤ 0 alone is not sufficient to guarantee that a reduction in the debt/equity ratio (dL < 0) will discourage entry. In fact, the extent of the belief response (dβ > 0) by the potential entrant, as well as the cost of the leverage reduction for the incumbent, will be the crucial determining factors in the interactive equilibrium. It is clear that different potential entrants may assign different exogenous functional forms and different slopes to β(L), depending on factors such as their individual prior knowledge, perceptions, and characters. Even if beliefs within an industry are uniformly determined by the industry characteristics, cross-industry beliefs may well be different. As will be shown in Sect. 92.3, the functional form of β(L) may be endogenized by a rational expectations criterion. As stated, the incumbent will have two possible cost schemes in period 2, one with the existing capacity K and one with the innovation induced expanded capacity K̄. These two cost schemes are specified via (92.2) as:

Ā = A(L, K) = μ − λL − γK   (92.6)

A̲ = A(L, K̄) = μ − λL − γK̄   (92.7)

It follows from K < K̄ that A̲ < Ā. From the potential entrant's perspective in period 1, the incumbent in period 2 will have the low cost A̲ (high capacity K̄) with probability β and the high cost Ā (low capacity K) with probability 1 − β. It is clear that:

dĀ/dL = dA̲/dL = −λ < 0   (92.8)

reflecting the assertion justified prior to Equation (92.2) that a leverage reduction (dL < 0) has an adverse cost effect.
92.3 Equilibrium

Utilizing the setting in the last section and a usual interaction scheme, we first develop the reaction functions for the subgame that involves entry and then study the equilibrium. Further, we show that the belief function β(L) may be endogenized by a rational expectations mechanism. A determinant of the entry decision is the nature of post-entry interaction between the entrant and the incumbent. We assume Cournot interaction. Let X_e and X_m be the Cournot equilibrium output levels for the entrant and the incumbent in period 2 in the case of entry. Utilizing Equation (92.1), the market price is given by:

P = (1/b)(a − X) = (1/b)[a − (X_m + X_e)]   (92.9)
The incumbent’s profit function …m is then defined, leading to the problem:
where the second equality follows from Equation (92.9). But the potential entrant is uncertain about the incumbent’s reaction function Xm . This uncertainty emerges from the potential entrant’s incomplete information about the incumbent’s capacity in period 2. After observing the incumbent’s leverage .L/ in period 1, the potential entrant utilizes own belief ˇ.L/ and the incumbent’s possible reactions in Equations (92.12) and (92.13) to maximize the expectation of own profit in Equation (92.14), leading to the problem profit: Max E…e Xe
E…e D ˇ…e .K/ C .1 ˇ/…e .K/ ˚ D ˇ.L/ .1=b/Œa bc X m Xe Xe CŒ1 ˇ.L/ f.1=b/Œa bc X m Xe Xe g ˚ D .1=b/ .a bc Xe / ˇ.L/X m Œ1 ˇ.L/ X m Xe (92.15)
Max …m Xm
…m D PXm AXm D .1=b/Œa .Xm C Xe / Xm A.L; K/Xm (92.10) where A.L; K/ is defined in Equation (92.2). The first order condition for the maximization of …m over Xm generates the solution: Xm D .1=2/Œa bA.L; K/ Xe
(92.11)
The second derivative of …m is 2=b < 0, which shows its concavity in Xm , thus satisfying the second order condition for maximum. Equations (92.9)–(92.11) show that the incumbent will enjoy higher values for its market price, profits, and output in the absence of the entrant (Xe D 0). This demonstrates the existence of a rational motivation for the incumbent to engage in entry deterrence activities. For the two possible capacities and cost schemes AN and A as specified in Equations (92.6) and (92.7), the incumbent’s two possible reaction functions are specified via Equation (92.11) as: X m .Xe / D .1=2/Œa bA Xe
(92.12)
X m .Xe / D .1=2/Œa bA Xe
(92.13)
where X m is the low capacity (high cost) output and X m is the high capacity (low cost) output of the incumbent. It follows from A < A that X m < X m . The incumbent will choose X m over X m if the profit associated with K is larger than that associated with K, and vice versa. As for the potential entrant, the post-entry profit function under complete information is: …e D PXe cXe D .1=b/Œa .Xm C Xe / Xe cXe D .1=b/Œa bc Xm Xe Xe
(92.14)
Here we assume risk neutrality so that maximization of expected profit and maximization of expected utility of profit generate the same results. The maximization of E…e over Xe leads to the first order condition: .a bc 2Xe / ˇX m .1 ˇ/X m D 0
(92.16)
The second derivative of E…e is 2=b < 0, which shows its concavity in Xe . The potential entrant’s reaction function is then the solution to Equation (92.16): Xe .X m ; X m / D .1=2/Œ.a bc/ ˇX m .1 ˇ/X m (92.17) The equilibrium output values for the two players are the simultaneous solutions to the three reaction functions in Equations (92.12), (92.13), and (92.17) that can be written in the matrix form: 2
32 3 2 3 Xe 12 0 a bA 41 0 2 5 4 X m 5 D 4 a bA 5 2 ˇ .1 ˇ/ Xm a bc
(92.18)
The solution to Equation (92.18) identifies the post-entry interactive optimal output values, which are given by: Xe .L/ D .1=3/Œ.a 2bc/ C ˇbA C .1 ˇ/bA
(92.19)
X m .L/ D .1=6/Œ2.a C bc/ .3 C ˇ/bA .1 ˇ/bA (92.20) X m .L/ D .1=6/Œ2.a C bc/ ˇbA .4 ˇ/bA
(92.21)
Here ˇ , A , and A are functions of L as specified in the last section.
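As a quick numerical check of Equations (92.18)–(92.21), the following sketch solves the 3×3 system directly and verifies the closed form for X_e; all parameter values are hypothetical.

```python
import numpy as np

def equilibrium_outputs(a, b, c, beta, A_lo, A_hi):
    # Solve the 3x3 system (92.18) for (X_e, Xbar_m, Xunder_m).
    # A_lo is the expanded-capacity (low) marginal cost and A_hi the
    # current-capacity (high) marginal cost, with A_lo < A_hi.
    M = np.array([[1.0, 2.0, 0.0],
                  [1.0, 0.0, 2.0],
                  [2.0, beta, 1.0 - beta]])
    rhs = np.array([a - b * A_lo, a - b * A_hi, a - b * c])
    return np.linalg.solve(M, rhs)   # (X_e, Xbar_m, Xunder_m)

# Hypothetical values; the closed form (92.19) reproduces the
# linear-algebra solution.
a, b, c, beta, A_lo, A_hi = 10.0, 1.0, 2.0, 0.6, 1.0, 1.5
X_e, X_bar, X_under = equilibrium_outputs(a, b, c, beta, A_lo, A_hi)
assert np.isclose(X_e, ((a - 2 * b * c) + beta * b * A_lo + (1 - beta) * b * A_hi) / 3.0)
```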
Before proceeding to a characterization of the link between the incumbent's leverage and entry, some intuitive observations are noted in the results of Equations (92.19)–(92.21). It is clear that a participant's optimal output is inversely related to its own marginal cost and directly related to the other participant's marginal cost; a rise in the incumbent's costs (Ā or A̲) decreases the incumbent's optimal output and increases the potential entrant's post-entry optimal output. Similarly, a rise in the potential entrant's cost (c) decreases its post-entry optimal output and increases the incumbent's output. All optimal output values are directly related to the size of the market demand (a). The potential entrant's belief (β) about the incumbent's expansion has a crucial impact not only on its post-entry optimal output but on the incumbent's optimal output as well. Since the potential entrant's post-entry optimal output as specified in Equation (92.19) is a function of the incumbent's financial leverage in the pre-entry period, the context may seem to be one of signaling, with the leverage serving as the signal for the incumbent. However, as stated in the last section, the context as it stands is not one of signaling but one in the more general class of dynamic games of incomplete information. In a typical game from the subclass of signaling, there exists private information or information asymmetry, which are clearly absent in the present context. But there are restrictions that could reduce the present context to one of signaling. For instance, if we assume that the incumbent has private information that it will not engage in innovation induced capacity expansion in a post-entry game but intends to signal otherwise via its leverage in period 1 to deter entry, the context then would be one of signaling with the leverage serving as the signal. In that case, the optimal post-entry output values would still be those specified by Equations (92.19) and (92.21) above, and one may proceed in the direction of designing an optimal signal. However, for the present purpose of highlighting the role of beliefs in linking capital structure to entry, a restriction of the context to one of signaling is not needed.

Rationalization of beliefs. We now show that the entrant's belief β(L) as defined earlier can in fact be endogenized through a rationalization process. Suppose the expansion by the incumbent in period 2 requires a cost of ξ, which is known in period 2 but is a random variable in period 1. Let F(·) denote the probability distribution of ξ:

F(x) = Prob[ξ ≤ x]
Utilizing the equilibrium solutions (Equations (92.19)–(92.21)), we can evaluate the incumbent's capacity and production choices. Specifically, X̄_m (the incumbent's optimal output under the expanded capacity) is chosen over X̲_m if and only if the profit under expansion, net of the expansion cost ξ, exceeds the profit without expansion; that is,

(1/b)[a − X̄_m − X_e]X̄_m − A̲X̄_m − ξ ≥ (1/b)[a − X̲_m − X_e]X̲_m − ĀX̲_m   (92.22)

Utilizing Equations (92.19)–(92.21) and some algebraic manipulation, the above inequality reduces to:

ξ ≤ γ(K̄ − K)[ (a + bc)/3 − (b/12)(8μ − 8λL − 5γK − 3γK̄) ] + β(L)bγ²(K̄ − K)²/6   (92.23)

Thus, under the usual rational expectations criterion, the probability of capacity expansion is the likelihood of the above inequality; that is,

β(L) = F( γ(K̄ − K)[ (a + bc)/3 − (b/12)(8μ − 8λL − 5γK − 3γK̄) ] + β(L)bγ²(K̄ − K)²/6 )   (92.24)

Therefore, a rational specification for β(L) in the present context is the solution that emerges from Equation (92.24). Note that differentiation of Equation (92.24) by the chain rule generates:

β′(L) = F′{·}( 8λbγ(K̄ − K)/12 + β′(L)bγ²(K̄ − K)²/6 )

leading to the solution:

β′(L) = [ F′{·} · 8λbγ(K̄ − K)/12 ] / [ 1 − F′{·} · bγ²(K̄ − K)²/6 ]   (92.25)
We have shown in Equation (92.5) that β(L) must satisfy β′(L) ≤ 0. Within the support of the density function F
we have F′ ≠ 0, thus β′(L) ≠ 0. Since the numerator in Equation (92.25) above is nonnegative, the strict inequality β′(L) < 0 holds if and only if the following condition holds:

1 − F′{·} bγ²(K̄ − K)²/6 < 0   (92.26)

For example, if the expansion cost ξ is uniformly distributed over [ξ₀, ξ₁], then F′{·} = 1/(ξ₁ − ξ₀) and the parametric condition in Equation (92.26) for β′(L) < 0 becomes:

6(ξ₁ − ξ₀) < bγ²(K̄ − K)²   (92.27)

Thus, in the present context, the statement in Equation (92.27) establishes a sufficient parametric condition for the rationality of the belief that emerges from Equation (92.24).
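The fixed point in Equation (92.24) is easy to solve numerically. Here is a sketch for the uniform-cost case, with hypothetical parameter values and our own function and variable names (g1 and g2 collect the terms of (92.24)):

```python
import numpy as np
from scipy.optimize import brentq

def rational_belief(L, a, b, c, mu, lam, gam, K, K_bar, xi0, xi1):
    # Solve Equation (92.24) for beta(L) when xi ~ Uniform[xi0, xi1].
    # Writing (92.24) as beta = F(g1(L) + g2*beta), the function
    # h(beta) = beta - F(g1 + g2*beta) satisfies h(0) <= 0 <= h(1),
    # so a root exists on [0, 1].
    g1 = gam * (K_bar - K) * ((a + b * c) / 3.0
         - (b / 12.0) * (8 * mu - 8 * lam * L - 5 * gam * K - 3 * gam * K_bar))
    g2 = b * gam ** 2 * (K_bar - K) ** 2 / 6.0
    F = lambda x: min(max((x - xi0) / (xi1 - xi0), 0.0), 1.0)
    return brentq(lambda beta: beta - F(g1 + g2 * beta), 0.0, 1.0)

# Hypothetical illustration.
print(rational_belief(L=1.0, a=10, b=1, c=2, mu=1.0, lam=0.2, gam=0.3,
                      K=1.0, K_bar=2.0, xi0=0.0, xi1=5.0))
```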
92.4 Capital Structure and Entry Deterrence

The central issue of capital structure and entry deterrence can now be closely studied. Recall the potential entrant's post-entry interactive optimal profit function stated in Equation (92.15):

EΠ_e = (1/b)[(a − bc − X_e) − βX̄_m − (1 − β)X̲_m]X_e   (92.28)

Substitution of the output equilibria, Equations (92.19)–(92.21), into Equation (92.28) generates:

EΠ_e(L) = [1/(9b)][ (a − 2bc)² + 2(1 − β)(a − 2bc)bĀ + β(2 − β)(a − 2bc)bA̲ + β(1 − β)(2 − β)b²A̲Ā + β²(1 − β)b²A̲² + (1 − β)²b²Ā² ]   (92.29)
where β, Ā, and A̲ are functions of L with dβ/dL < 0 and dĀ/dL = dA̲/dL = −λ < 0, as specified in the last section. Since the potential entrant's decision on entry is determined by its post-entry interactive optimal profits, a link between the incumbent's capital structure and entry is identified by the slope dEΠ_e/dL. However, the entrant's post-entry profit function as specified in Equation (92.29) is highly nonlinear in L, and even with the adoption of a simple functional form for β(L), a characterization of the slope dEΠ_e/dL does not seem to be a tractable task. One significant revelation at this point is that for values of L where dEΠ_e/dL > 0 [< 0], a decrease in the incumbent's leverage discourages [encourages] entry through a decrease [an increase] in the potential
entrant’s post-entry profits. Some intuitive explanations for this result will be given shortly. The determinants of entry and their roles are summarized in Equation (92.29). A monotonicity condition. Our primary purpose is to highlight the role of beliefs in characterizing the link between capital structure and entry deterrence. Therefore, for the purpose of conducting further comparative static analysis without getting nailed down right away by the stated complexities in Equation (92.29), it suffices that we focus on the case where the initial relationship between the potential entrant’s post-entry optimal output and its post-entry optimal profit is monotonically increasing. Thus, changes that bring a rise in optimal output will also bring a rise in optimal profit. We now show that for this monotonicity condition to hold, it suffices that initially the market size (a) and/or the incumbent’s cost (A) be sufficiently large. Since the monotonicity of a random variable with respect to a parameter extends to its expectation, the monotonicity condition regarding the potential entrant’s profit function in Equation (92.14) extends to the potential entrant’s expected profit function in Equation (92.15). Thus, it suffices to work with the potential entrant’s profit function in Equation (92.14). Substituting the incumbent’s reaction function Equation (92.11) for Xm in Equation (92.14) generates: 1 1 1 …e D .1=b/ a bc C bA Xe Xe 2 2 2 Therefore, d…e 1 D .1=b/ .a C bA/ .bc C Xe / dXe 2 It follows that d…e =dXe > 0 holds for sufficiently large market size (a) and/or sufficiently large incumbent’s cost (A). Note that dXe and d…e here are changes in optimal values initiated by a parametric change. At an optimum for a given set of parametric values, the stated derivative is zero by the first order condition. Under the stated monotonicity condition, comparative static results regarding the potential entrant’s optimal output (Xe ) extend directly to those regarding the entrant’s optimal profit (…e ). The potential entrant’s post-entry optimal output as specified in Equation (92.19) can be written as: Xe .L/ D .1=3/Œ.a 2bc/ C bA ˇ b.A A/
(92.30)
By the statements in Equations (92.6) and (92.7), b.AA/ D b.K K/ > 0, and with a use of Equation (92.6) for A, Equation (92.30) reduces to: h Xe .L/ D .1=3/ .a 2bc C b b K/ i b.K K/ˇ.L/ bL (92.31)
The strategic and cost effects. It follows that a reduction in leverage (L) by the incumbent in period 1 has two distinct and opposing effects on the potential entrant's post-entry optimal output in period 2, as reflected by the second term containing β(L) and the third term containing L in Equation (92.31). The second term, which will be referred to as the strategic effect, works through the potential entrant's belief. Consider the case dβ/dL < 0. Since bγ(K̄ − K) > 0, the strategic effect of a reduction in the incumbent's leverage results in a decrease in the potential entrant's post-entry optimal output. This is due to the fact that an increase in the potential entrant's belief that the incumbent will adopt the innovation induced expansion implies a higher likelihood of a lower marginal cost for the incumbent and a lower post-entry optimal output for the potential entrant. The extent of the strategic effect is determined mainly by the potential entrant's belief (β), the incumbent's capacity savings coefficient (γ) as specified in Equation (92.2), and the expected expansion (K̄ − K). The third term in Equation (92.31), which will be referred to as the cost effect, works mainly through the incumbent's financial cost parameter (λ) associated with a leverage reduction as specified in Equation (92.2). The cost effect of a reduction in the incumbent's leverage will increase the potential entrant's post-entry optimal output. As discussed prior to and after Equation (92.2), a reduction in the incumbent's leverage involves a financial value cost for the incumbent, which increases the incumbent's marginal cost and thus raises the potential entrant's post-entry optimal output. It is clear that the incumbent will benefit from the strategic effect of reducing its leverage but suffers from the cost effect of doing so. The fact that a leverage reduction entails a financial cost that is incorporated into the interactive process raises the following case. If the potential entrant's belief is not sufficiently sensitive to the incumbent's leverage, then a leverage reduction by the incumbent could in fact encourage entry via a cost effect that overrides the strategic effect. The notion of the potential entrant's belief sensitivity to leverage, defined by |dβ/dL|, will play a crucial role in the following analysis of this issue. We consider the two cases associated with dβ/dL ≤ 0. Case (i): dβ/dL = 0. Consider the case where a reduction in financial leverage by the incumbent has no impact on the potential entrant's likelihood assessment that the incumbent will subsequently engage in innovation induced expansion. Note that this extreme belief is rather irrational according to the rational expectations criterion studied at the end of the last section, where the slope dβ/dL is strictly negative.
However, it serves as a benchmark for our analysis. It follows from Equation (92.31) that:

dX_e/dL = −(1/3)λb < 0

Hence, by the monotonicity condition, dΠ_e/dL < 0. It is clear that in this case a leverage reduction by the incumbent increases the entrant's post-entry optimal output and profit, thus encouraging entry. A leverage reduction in this case has only a cost effect; its strategic effect is zero. A summary is given below.

Proposition 1. Under the stated monotonicity condition and in the case where initially there is no belief sensitivity to leverage (dβ/dL = 0), a reduction in financial leverage (reduction in debt/equity ratio) by the incumbent has no strategic effect and its cost effect will encourage entry.

On purely intuitive grounds, the above result may be explained as follows. When the potential entrant pays close attention to an incumbent's cost and profitability reports but pays no attention to the incumbent's capital structure, then a costly reduction in financial leverage by the incumbent could in fact encourage entry, although the leverage reduction puts the incumbent in a better financial position to engage in innovation induced expansion and predation if entry occurs. The case where a leverage reduction has a nonzero strategic effect beside the cost effect, and may discourage entry, is discussed next. Note that the rationalization of beliefs discussed earlier involved the random cost parameter ξ with the probability distribution F(·). In the special case where there is no uncertainty, so that ξ assumes a certain value, F(·) is a constant with a jump at the value of ξ. It follows then from Equation (92.24) that dβ/dL = 0, hence the analysis and results summarized in Proposition 1 above apply in the case where the parameter ξ is deterministic.

Case (ii): dβ/dL < 0. We now assume the potential entrant has some sensitivity to financial leverage as a tool of expansion, so that the belief sensitivity to leverage (|dβ/dL|) is nonzero. We analyze the strategic role of leverage in entry deterrence in this case as follows. The adopted negative sign here is consistent with the rational beliefs studied at the end of the last section. We continue under the stated monotonicity condition. Define G > 0 and H > 0 as follows:

G = (1/3)γb(K̄ − K)   (92.32)

H = (1/3)λb   (92.33)
Then the potential entrant's post-entry optimal output as specified in Equation (92.31) shows that:

dX_e/dL = −H − G(dβ/dL)   (92.34)

It follows from Equation (92.34) that the condition dX_e/dL > 0 is equivalent to:

dβ/dL < −H/G = −λ/[γ(K̄ − K)] < 0   (92.35)

The preceding condition is equivalent to:

|dβ/dL| > λ/[γ(K̄ − K)]   (92.36)
Thus Equation (92.36) is a necessary and sufficient condition for dX_e/dL > 0 and, by the monotonicity condition, for dΠ_e/dL > 0. If the entrant's belief sensitivity to leverage (|dβ/dL|) exceeds the lower bound specified on the right side of Equation (92.36), then the strategic effect of a leverage reduction by the incumbent exceeds its cost effect, so that the potential entrant's post-entry optimal output and profit fall and entry is discouraged. The results are summarized in the following proposition.

Proposition 2. Under the stated monotonicity condition, when the potential entrant's belief sensitivity to leverage (|dβ/dL|) exceeds the lower bound specified in Equation (92.36), the strategic effect of a reduction in financial leverage (reduction in debt/equity ratio) by the incumbent will exceed its cost effect, so that the leverage reduction will discourage entry. The stated bound is determined by the incumbent's cost characteristics (λ; γ) and its expected expansion (K̄ − K). The smaller is λ/[γ(K̄ − K)], the larger is the likelihood that a lowering of leverage discourages entry.

The above result further highlights the significance of prior beliefs held by potential entrants in linking an incumbent's capital structure to entry deterrence. It also gives a strategic justification for certain behavior often manifested by incumbents, including attempts to heighten the public expectation of their future capacity expansion (K̄), increase their capacity savings coefficient (γ), and decrease their financial cost coefficient (λ). As shown above, these changes decrease the bound λ/[γ(K̄ − K)] and thus increase the likelihood that a lowering of financial leverage will discourage entry.

An entry deterrence criterion. Suppose the entrant requires that its post-entry production level exceed the quantity X_e⁰ in order to remain viable in the case of entry. Assuming that the incumbent is aware of this minimal output requirement by the entrant, entry deterrence by the incumbent amounts to
setting its control variable, namely its capital structure in the present setting, so that the entrant's expected post-entry output level X_e is reduced to the level X_e⁰. Thus, the entry-deterring financial leverage L̂ for the incumbent is the value of L that generates X_e(L) = X_e⁰, where X_e(L) is given by Equation (92.31) above. Define:

I = (1/3)[a − 2bc + bμ − bγK] − X_e⁰   (92.37)

Under case (ii) elaborated above, applying the definitions and results in Equations (92.32), (92.33), (92.37), and (92.31) reduces the equality X_e(L) = X_e⁰ to the following equation in L:

Gβ(L) + HL − I = 0   (92.38)

Thus the incumbent's entry-deterring financial leverage is the solution L̂ to Equation (92.38). If the function on the left side of Equation (92.38) is strictly monotonic in L, then Equation (92.38) has a unique solution. But this function has the slope [G(dβ/dL) + H], which is negative by Equation (92.35); thus Equation (92.38) in fact has a unique entry-deterring leverage solution L̂. Note that the incumbent's financial leverage is not the only factor that can deter entry. Other factors that can bring the output criterion X_e = X_e⁰ to hold in the present setting are those specified in the solution for X_e in Equation (92.31). For instance, a sufficiently high marginal cost for the entrant, as reflected by a sufficiently high value for c in Equation (92.31), could in fact reduce the entrant's expected output X_e to the entry-deterring level X_e⁰.
92.5 Conclusion

The theoretical and empirical literature has not settled the issue of how an incumbent's capital structure (financial leverage, debt/equity ratio) may affect entry into its product market. There are two classes of existing theories that generate opposing predictions, one based on the "deep pocket" arguments and the other based on the "limited liability" arguments. The focus of the present model has been on two of the factors that are absent in the existing models of the stated link; namely, the belief held by the potential entrant regarding the incumbent's expansion, and the cost to the incumbent of adjusting capital structure. By incorporating these two factors into a distinct strategic link between an incumbent's capital structure and entry into its product market, the study has presented a model that is capable of rationally producing both of the opposing predictions in the existing literature. The incorporated probabilistic belief, which is formed by the potential entrant and defined as a function of the incumbent's capital structure (L), may be exogenous or completely endogenized by a rational criterion. Beside the belief, another crucial factor in linking financial leverage to entry appears to be the cost of leverage reduction incurred by the incumbent. When the potential entrants' belief about the incumbent's innovation induced expansion has sufficiently high sensitivity to the incumbent's capital structure, the prediction of the model is that a reduction in the incumbent's financial leverage (reduction in debt/equity ratio) will discourage entry. The explanation here is that the positive strategic impact of reducing leverage overrides its cost effect for the incumbent, so that the net effect in equilibrium is to discourage entry through a reduction in post-entry optimal output and profit for the potential entrant. This justification is fundamentally different from the one given by the existing "deep pocket" arguments, which suggest that a reduction in the incumbent's debt/equity ratio effectively increases both the available cash and the debt-financing capability of the incumbent, thus discouraging entry. A distinctive aspect of the present setup is that it can rationally explain both positive and negative correlations between capital structure and entry deterrence. When the potential entrants' belief has sufficiently low sensitivity to the incumbent's capital structure, the strategic effect of a reduction in leverage (reduction in debt/equity ratio) by the incumbent is overridden by the cost of reducing the leverage; thus the leverage reduction could in fact encourage entry. The case may be elaborated further on intuitive grounds as follows. When potential entrants pay close attention to an incumbent's current cost and profitability reports but not sufficient attention to the incumbent's capital structure, a costly reduction in financial leverage by the incumbent could in fact encourage entry, even under the condition that the lower leverage financially puts the incumbent in a better strategic position to engage in subsequent competitive expansions if entry occurs. This justification is very different from the one given by the existing "limited liability" arguments, which are typically presented as follows: since shareholders do not have priority over debt-holders in case of bankruptcy, a reduction in the incumbent's debt/equity ratio changes the debt-holder and share-holder structure in a way that forces the incumbent to be less aggressive in its product market and thus encourages entry. The small body of empirical literature that exists on this issue is inconclusive and has suggested both positive and negative correlations between capital structure and entry, with theoretical justifications based on the deep pocket and limited liability arguments as elaborated above. Our study has suggested an alternative rational model that closes the existing gap but introduces a new explanatory variable into the empirical specifications; namely, the belief held by the entrants. Although this belief and its crucial role are now theoretically defined and rationalized, the empirical challenge that remains is to find out how to measure or identify a proxy for the belief variable. The study can also be extended in a number of directions. For instance, a subsequent undertaking could analyze the issue of the entrant's capital structure and entry deterrence by the incumbent. Another extension may study the case where the incumbent and the entrant compete in the formation of capital structure, each with the intention of influencing the other player's response.
References

Asquith, P. and D. Mullins. 1986. "Equity issues and offering dilution." Journal of Financial Economics 15, 61–89.
Aumann, R. and S. Hart (Eds.). 1994. Handbook of game theory, Vol. 2, North-Holland, Amsterdam.
Brander, J. and T. Lewis. 1986. "Oligopoly and financial structure: the limited liability effect." American Economic Review 76, 956–970.
Chao, C., W. Chou, and E. Yu. 2002. "Domestic equity control, capital accumulation and welfare: theory and China's evidence." World Economy 25(8), 1115–1127.
Chao, C. and E. Yu. 1996. "Are wholly foreign-owned enterprises better than joint ventures?" Journal of International Economics 40(1–2), 225–237.
Chao, C. and E. Yu. 2000. "Domestic equity controls of multinational enterprises." Manchester School 68(3), 321–330.
Dixit, A. 1980. "The role of investment in entry deterrence." Economic Journal 90(357), 95–106.
Harris, M. and A. Raviv. 1991. "The theory of capital structure." Journal of Finance 46, 297–355.
Li, D. and S. Li. 1996. "A theory of corporate scope and financial structure." Journal of Finance 51, 691–709.
Marjit, S. and A. Mukherjee. 1998. "Technology collaboration and foreign equity participation: a theoretical analysis." Review of International Economics 6(1), 142–150.
Marjit, S. and A. Mukherjee. 2001. "Technology transfer under asymmetric information: the role of equity participation." Journal of Institutional and Theoretical Economics 157(2), 282–300.
Mukherjee, A. 2002. "Subsidy and entry: the role of licensing." Oxford Economic Papers 54(1), 160–171.
Myers, S. and N. Majluf. 1984. "Corporate financing and investment decisions when firms have information that investors do not have." Journal of Financial Economics 13(2), 187–221.
Patron, H. 2007. "The value of information about central bankers' preferences." International Review of Economics and Finance 16, 139–148.
Poitevin, M. 1989. "Financial signaling and the deep-pocket argument." Rand Journal of Economics 20, 26–40.
Spence, M. 1977. "Entry, capacity, investment and oligopolistic pricing." Bell Journal of Economics 10, 1–19.
Stulz, R. 1990. "Managerial discretion and optimal financing policies." Journal of Financial Economics 26, 3–27.
Vishwasrao, S., S. Gupta, and H. Benchekroun. 2007. "Optimum tariffs and patent length in a model of north-south technology transfer." International Review of Economics and Finance 16, 1–14.
Chapter 93
VAR Models: Estimation, Inferences, and Applications Yangru Wu and Xing Zhou
Abstract Vector auto-regression (VAR) models have been used extensively in finance and economic analysis. This paper provides a brief overview of the basic VAR approach by focusing on model estimation and statistical inferences. Applications of VAR models in some finance areas are discussed, including asset pricing, international finance, and market micro-structure. It is shown that such an approach provides a powerful tool to study financial market efficiency, stock return predictability, exchange rate dynamics, and the information content of stock trades and market quality.

Keywords VAR · Granger-causality test · Impulse response · Variance decomposition · Co-integration · Asset return predictability · Market quality · Information content of trades · Informational efficiency
Y. Wu and X. Zhou, Rutgers Business School, Newark and New Brunswick, NJ, USA. e-mail: [email protected]

93.1 Introduction

Following seminal work by Sims (1980), the vector autoregression (VAR) approach has been developed as a powerful modeling tool for studying the interactions among economic and financial variables and for forecasting. Because basic VAR models focus on the statistical representation of the dynamic behavior of time series data, without imposing much restriction on the underlying economic structure, and because they can be easily estimated, they have gained increasing popularity in both economics and finance. In this chapter, we provide an overview of basic VAR models and some of their most popular applications in financial economics.

93.2 A Brief Discussion of VAR Models

A basic p-lag VAR model has the following form:

$$X_t = A + \Phi_1 X_{t-1} + \Phi_2 X_{t-2} + \dots + \Phi_p X_{t-p} + \varepsilon_t, \quad t = 1, 2, \dots, T, \qquad (93.1)$$

where $X_t = (x_{1,t}, x_{2,t}, \dots, x_{n,t})'$ is an $(n \times 1)$ vector of economic time series, the $\Phi_i$ are coefficient matrices, and $\varepsilon_t$ is an $(n \times 1)$ vector of residuals. The residual vector is assumed to have zero mean, zero autocorrelation, and a time-invariant covariance matrix $\Sigma$. For example, a p-lag bivariate VAR model can be expressed as:

$$\begin{pmatrix} x_{1,t} \\ x_{2,t} \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \begin{pmatrix} \varphi_{1,1}^{1} & \varphi_{1,2}^{1} \\ \varphi_{2,1}^{1} & \varphi_{2,2}^{1} \end{pmatrix} \begin{pmatrix} x_{1,t-1} \\ x_{2,t-1} \end{pmatrix} + \dots + \begin{pmatrix} \varphi_{1,1}^{p} & \varphi_{1,2}^{p} \\ \varphi_{2,1}^{p} & \varphi_{2,2}^{p} \end{pmatrix} \begin{pmatrix} x_{1,t-p} \\ x_{2,t-p} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{pmatrix}, \qquad (93.2)$$

where $\operatorname{cov}(\varepsilon_{1,t}, \varepsilon_{2,s}) = \sigma_{12}$ if $t = s$, and $\operatorname{cov}(\varepsilon_{1,t}, \varepsilon_{2,s}) = 0$ otherwise.
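To make the setup in Equations (93.1)–(93.2) concrete, the following minimal sketch (not part of the original chapter) simulates a bivariate VAR(1) and recovers its parameters by equation-by-equation OLS. All parameter values and names such as `Phi1` are illustrative assumptions, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed true parameters of a bivariate VAR(1): X_t = A + Phi_1 X_{t-1} + eps_t
A = np.array([0.1, 0.2])
Phi1 = np.array([[0.5, 0.1],
                 [0.2, 0.3]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])   # time-invariant residual covariance

# Simulate T observations
T = 500
X = np.zeros((T, 2))
eps = rng.multivariate_normal(np.zeros(2), Sigma, size=T)
for t in range(1, T):
    X[t] = A + Phi1 @ X[t - 1] + eps[t]

# OLS estimation: regress X_t on [1, X_{t-1}], one equation per variable
Y = X[1:]                                      # (T-1) x 2
Z = np.hstack([np.ones((T - 1, 1)), X[:-1]])   # (T-1) x 3 regressor matrix
B_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # 3 x 2: [intercepts; lag coefs]
A_hat, Phi1_hat = B_hat[0], B_hat[1:].T
resid = Y - Z @ B_hat
Sigma_hat = resid.T @ resid / (T - 1)          # residual covariance estimate

print("A_hat:", A_hat)
print("Phi1_hat:\n", Phi1_hat)
```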
93.2.1 Estimation

In order to estimate a VAR model, the number of lags of the endogenous variables has to be determined first, as the results, and hence the inferences from estimation, can be very sensitive to the lag choice. The general rule is to choose the lag p that minimizes some model selection criterion. The two most common information criteria are the Akaike (AIC) and the Schwarz–Bayesian (SBC):

$$\text{AIC}(p) = \ln\left|\hat{\Sigma}(p)\right| + \frac{2}{T}\, p n^2, \qquad (93.3)$$

and

$$\text{SBC}(p) = \ln\left|\hat{\Sigma}(p)\right| + \frac{\ln T}{T}\, p n^2, \qquad (93.4)$$

where $\hat{\Sigma}(p) = T^{-1} \sum_{t=1}^{T} \hat{\varepsilon}_t \hat{\varepsilon}_t'$, and p, T, and n represent the lag, the sample size, and the number of variables, respectively. Once the lag is determined, the VAR model can be estimated using either the ordinary least squares (OLS) method or the maximum likelihood method.¹

¹ For further discussions on VAR techniques, see Watson (1994), Lutkepohl (1999), Stock and Watson (2001), among others.
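The lag-selection rule in Equations (93.3)–(93.4) can be sketched directly in code. The helper names below (`fit_var`, `aic_sbc`) are ours, and the sketch assumes the data matrix `X` from the simulation above; it is an illustration rather than a production implementation.

```python
import numpy as np

def fit_var(X, p):
    """OLS-estimate a VAR(p); return coefficients, residual covariance, residuals."""
    T, n = X.shape
    # Regressors: [1, X_{t-1}, ..., X_{t-p}] for t = p, ..., T-1
    Z = np.hstack([np.ones((T - p, 1))] +
                  [X[p - i:T - i] for i in range(1, p + 1)])
    Y = X[p:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ B
    Sigma_hat = resid.T @ resid / (T - p)
    return B, Sigma_hat, resid

def aic_sbc(X, p):
    """Equations (93.3)-(93.4): ln|Sigma_hat(p)| plus the penalty term p*n^2."""
    T, n = X.shape
    _, Sigma_hat, _ = fit_var(X, p)
    logdet = np.linalg.slogdet(Sigma_hat)[1]
    return (logdet + 2.0 / T * p * n**2,          # AIC
            logdet + np.log(T) / T * p * n**2)    # SBC

# Choose the lag that minimizes each criterion over a candidate range, e.g.:
# best_aic = min(range(1, 9), key=lambda p: aic_sbc(X, p)[0])
```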
93.2.2 Inferences

Since a p-lag VAR model contains many parameters, interpreting the estimation results can be difficult, especially when p is large. Instead of focusing on each individual parameter estimate, some summary measures are usually employed to provide useful information on the dynamics among the variables included in the model. In this section, we focus on three major kinds of summary analysis.

93.2.2.1 Granger Causality Tests

One important application of the VAR approach is forecasting. Specifically, this approach can be used to address such questions as whether some variables contain valuable information about the future dynamics of other variables in the model. Granger (1969) proposes a test of the forecasting relationship between two variables, which was further developed in Sims (1972). According to Granger (1969), in a bivariate VAR model as discussed above, if $x_1$ helps predict $x_2$, then $x_1$ Granger-causes $x_2$. Otherwise, $x_1$ fails to Granger-cause $x_2$. More formally, $x_1$ fails to Granger-cause $x_2$ if, for all $s > 0$, the mean squared error (MSE) of a forecast of $x_{2,t+s}$ based on $(x_{2,t}, x_{2,t-1}, \dots)$ is the same as the MSE of a forecast of $x_{2,t+s}$ based on both $(x_{1,t}, x_{1,t-1}, \dots)$ and $(x_{2,t}, x_{2,t-1}, \dots)$. To test whether $x_1$ Granger-causes $x_2$, we can estimate the following model:

$$x_{2,t} = a_2 + \varphi_{2,1}^{1} x_{1,t-1} + \dots + \varphi_{2,1}^{p} x_{1,t-p} + \varphi_{2,2}^{1} x_{2,t-1} + \dots + \varphi_{2,2}^{p} x_{2,t-p} + \varepsilon_{2,t}, \qquad (93.5)$$

and then conduct an F-test of the null hypothesis:

$$H_0: \varphi_{2,1}^{1} = \dots = \varphi_{2,1}^{p} = 0.$$

If the null hypothesis is rejected, we can conclude that $x_1$ Granger-causes $x_2$. While Granger-causality tests provide useful information on the forecasting ability of the model variables, the predictive power of one variable is not equivalent to true causality. Therefore, results from Granger-causality tests need to be interpreted with great caution.
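The F-test of the restriction in Equation (93.5) compares the restricted and unrestricted sums of squared residuals. The following is a minimal hand-rolled sketch (not part of the original chapter); the function name `granger_f_test` is ours, and the only external dependency assumed is `scipy` for the F distribution.

```python
import numpy as np
from scipy import stats

def granger_f_test(x1, x2, p):
    """F-test of H0: phi_{2,1}^1 = ... = phi_{2,1}^p = 0 in Equation (93.5)."""
    T = len(x2)
    Y = x2[p:]
    lags_x2 = np.column_stack([x2[p - i:T - i] for i in range(1, p + 1)])
    lags_x1 = np.column_stack([x1[p - i:T - i] for i in range(1, p + 1)])
    const = np.ones((T - p, 1))

    def ssr(Z):
        B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
        e = Y - Z @ B
        return e @ e

    ssr_restricted = ssr(np.hstack([const, lags_x2]))            # without x1 lags
    ssr_unrestricted = ssr(np.hstack([const, lags_x2, lags_x1])) # with x1 lags
    df = (T - p) - (2 * p + 1)
    F = ((ssr_restricted - ssr_unrestricted) / p) / (ssr_unrestricted / df)
    p_value = 1 - stats.f.cdf(F, p, df)
    return F, p_value
```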
93.2.2.2 Impulse Response Analysis

Another approach to examining the interactions among the variables in the VAR model is to use impulse response functions, which represent the reactions of the model variables to shocks hitting the system. In order to estimate the impulse response function, we first transform the VAR model into its vector moving average form:

$$X_t = \mu + \varepsilon_t + \Psi_1 \varepsilon_{t-1} + \Psi_2 \varepsilon_{t-2} + \Psi_3 \varepsilon_{t-3} + \dots, \qquad (93.6)$$

where

$$\frac{\partial X_{t+s}}{\partial \varepsilon_t'} = \Psi_s.$$

Therefore, the (i, j) component of $\Psi_s$, $\psi_{i,j}^{s}$, captures the effect of a one-unit shock to variable j at time t ($\varepsilon_t^{j}$) on variable i at time t + s ($x_{i,t+s}$) and hence is interpreted as the impulse response function. However, such an interpretation is only possible if the elements of $\varepsilon_t$ are not correlated; that is, if the variance–covariance matrix of $\varepsilon_t$ is a diagonal matrix. Sims (1980) proposes a recursive causal ordering to solve this problem. For example, if $X_t = (x_{1,t}, x_{2,t}, x_{3,t})$, then the variables in $X_t$ can be ordered in a way such that $x_{1,t}$ affects $x_{2,t}$ and $x_{3,t}$ but not vice versa, and $x_{2,t}$ affects $x_{3,t}$ but not vice versa. Which particular ordering to use should be determined by the specific context and the economic models being examined. With such a recursive causal ordering, the VAR model can be rewritten as:

$$\begin{aligned}
x_{1,t} &= a_1 + \Pi_{1,1} X_{t-1} + \Pi_{1,2} X_{t-2} + \dots + \Pi_{1,p} X_{t-p} + \eta_{1,t} \\
x_{2,t} &= a_2 + \beta_{2,1} x_{1,t} + \Pi_{2,1} X_{t-1} + \Pi_{2,2} X_{t-2} + \dots + \Pi_{2,p} X_{t-p} + \eta_{2,t} \\
&\;\;\vdots \\
x_{n,t} &= a_n + \beta_{n,1} x_{1,t} + \dots + \beta_{n,n-1} x_{n-1,t} + \Pi_{n,1} X_{t-1} + \Pi_{n,2} X_{t-2} + \dots + \Pi_{n,p} X_{t-p} + \eta_{n,t}.
\end{aligned} \qquad (93.7)$$

Transforming this VAR model into the Wold representation:

$$X_t = \mu + \Theta_0 \eta_t + \Theta_1 \eta_{t-1} + \Theta_2 \eta_{t-2} + \Theta_3 \eta_{t-3} + \dots, \qquad (93.8)$$

where $\Theta_0$ is a lower triangular matrix. The impulse response function can then be estimated as:

$$\frac{\partial x_{i,t+s}}{\partial \eta_{j,t}} = \theta_{i,j}^{s}, \qquad (93.9)$$

where i, j = 1, 2, ..., n and s > 0.
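A standard way to compute the $\Psi_s$ of Equation (93.6), and the orthogonalized responses of Equations (93.8)–(93.9), is via the companion matrix of the VAR and a Cholesky factor of $\Sigma$. The sketch below (our addition, with the function name `var_irf` assumed) illustrates the mechanics under that Cholesky identification.

```python
import numpy as np

def var_irf(Phi_list, Sigma, horizons):
    """MA coefficients Psi_s of a VAR (Equation (93.6)) via the companion form,
    plus orthogonalized responses Theta_s = Psi_s @ P, where Sigma = P P'
    (Cholesky factor, i.e., a recursive causal ordering as in (93.7)-(93.9))."""
    n, p = Phi_list[0].shape[0], len(Phi_list)
    F = np.zeros((n * p, n * p))            # companion matrix
    F[:n, :] = np.hstack(Phi_list)          # first block row: Phi_1 ... Phi_p
    F[n:, :-n] = np.eye(n * (p - 1))
    P = np.linalg.cholesky(Sigma)           # lower triangular, like Theta_0
    Psi, Theta = [], []
    Fs = np.eye(n * p)
    for s in range(horizons + 1):
        Psi_s = Fs[:n, :n]                  # Psi_s = upper-left block of F^s
        Psi.append(Psi_s)
        Theta.append(Psi_s @ P)             # orthogonalized impulse responses
        Fs = Fs @ F
    return Psi, Theta
```

For example, `var_irf([Phi1_hat], Sigma_hat, 10)` traces out ten periods of responses for the bivariate VAR estimated earlier.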
93.2.2.3 Variance Decomposition

A related issue concerning the influence of a one-unit shock to the system is its contribution to the variance of the forecast error in predicting the future values of each variable in the system. Specifically, if we write the forecast error of $X_{t+s}$ at time t as:

$$X_{t+s} - \hat{X}_{t+s|t} = \varepsilon_{t+s} + \Lambda_1 \varepsilon_{t+s-1} + \Lambda_2 \varepsilon_{t+s-2} + \dots + \Lambda_{s-1} \varepsilon_{t+1}, \qquad (93.10)$$

the MSE of this forecast is:

$$\text{MSE}\left(\hat{X}_{t+s|t}\right) = E\left[\left(X_{t+s} - \hat{X}_{t+s|t}\right)\left(X_{t+s} - \hat{X}_{t+s|t}\right)'\right] = \Sigma + \Lambda_1 \Sigma \Lambda_1' + \dots + \Lambda_{s-1} \Sigma \Lambda_{s-1}'. \qquad (93.11)$$

For one variable in the model, $x_i$, the forecast error has the form:

$$x_{i,t+s} - \hat{x}_{i,t+s|t} = \sum_{h=0}^{s-1} \lambda_{i,1}^{h}\, \eta_{1,t+s-h} + \dots + \sum_{h=0}^{s-1} \lambda_{i,n}^{h}\, \eta_{n,t+s-h}. \qquad (93.12)$$

With the same recursive causal ordering as discussed in the previous section, the variance–covariance matrix is orthogonal. Therefore, the variance of the forecast error for $x_{i,t+s}$ is:

$$\operatorname{var}\left(x_{i,t+s} - \hat{x}_{i,t+s|t}\right) = \sigma_1^2 \sum_{h=0}^{s-1} \left(\lambda_{i,1}^{h}\right)^2 + \dots + \sigma_n^2 \sum_{h=0}^{s-1} \left(\lambda_{i,n}^{h}\right)^2. \qquad (93.13)$$
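The decomposition in Equation (93.13) is easy to compute once the orthogonalized responses are available. The self-contained sketch below (our addition; `fevd` is an assumed name) normalizes the shocks via the Cholesky factor, so each $\sigma_j^2 = 1$ and the contributions can be read off directly.

```python
import numpy as np

def fevd(Phi_list, Sigma, s):
    """Forecast error variance decomposition at horizon s (Equation (93.13)):
    share of var(x_{i,t+s} - xhat_{i,t+s|t}) due to each orthogonalized
    (Cholesky) shock j; each row of the result sums to one."""
    n, p = Sigma.shape[0], len(Phi_list)
    F = np.zeros((n * p, n * p))             # companion matrix
    F[:n, :] = np.hstack(Phi_list)
    F[n:, :-n] = np.eye(n * (p - 1))
    P = np.linalg.cholesky(Sigma)            # Sigma = P P', unit-variance shocks
    contrib, Fh = np.zeros((n, n)), np.eye(n * p)
    for _ in range(s):                        # horizons h = 0, ..., s-1
        Theta_h = Fh[:n, :n] @ P              # orthogonalized MA coefficients
        contrib += Theta_h**2                 # (i, j): squared response of i to shock j
        Fh = Fh @ F
    return contrib / contrib.sum(axis=1, keepdims=True)
```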
93.3 Applications of VARs in Finance

VAR models have been used very broadly in the finance literature. In this chapter, we focus on a few important applications to illustrate the advantages, and some potential concerns, of applying this methodology in empirical studies.

93.3.1 Stock Return Predictability and Optimal Asset Allocation

The VAR has been widely regarded as a useful model for studying stock return predictability. Campbell and Shiller (1987) propose the following VAR model for asset returns:

$$\begin{pmatrix} \Delta y_t \\ S_t \end{pmatrix} = \begin{pmatrix} a(L) & b(L) \\ c(L) & d(L) \end{pmatrix} \begin{pmatrix} \Delta y_{t-1} \\ S_{t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}, \qquad (93.14)$$

where, in the case of the present value model of stock prices, $y_t$ is the dividend and $S_t$ is the difference between the stock price and a multiple of dividends; and, in the case of the term structure of interest rates, $y_t$ is the one-period interest rate and $S_t$ is the spread between long- and short-term interest rates. For stock market data, they use real annual prices and real dividends on a broad stock index from 1871 to 1986. For term structure data, they use the monthly U.S. Treasury 20-year yield series and the 1-month Treasury bill rate. They find that $y_t$ is non-stationary and its first difference $\Delta y_t$ is stationary, while $S_t$ is in general stationary in both the stock market and the term structure cases. There is weak evidence of cointegration between stock prices and dividends, although the cointegration relationship in the term structure is more significant. They find that the yield spread Granger-causes short-rate changes and that excess returns on long-term bonds are significantly predictable, although the expectations hypothesis of the term structure is formally rejected. As for the stock market, dividend changes are found to be highly predictable and price–dividend spreads Granger-cause dividend changes. The present value model of stock prices is, however, rejected.

Using a log-linear approximation for stock prices and dividends, Campbell and Shiller (1989) estimate the following bivariate VAR:

$$\begin{pmatrix} \delta_t \\ r_{t-1} - \Delta d_{t-1} \end{pmatrix} = \begin{pmatrix} a(L) & b(L) \\ c(L) & d(L) \end{pmatrix} \begin{pmatrix} \delta_{t-1} \\ r_{t-2} - \Delta d_{t-2} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}, \qquad (93.15)$$

where $d_t$ is log dividends, $\delta_t$ is the log dividend–price ratio, $r_t$ is the ex post discount rate, and $r_t - \Delta d_t$ is interpreted as the growth-adjusted discount rate. Campbell and Shiller (1989) find that stock returns are somewhat predictable. The lagged log dividend–price ratio has a positive impact on stock returns, and the lagged real dividend growth rate has a negative impact. The finding that the dividend–price ratio has predictive power for future stock returns is consistent with earlier studies (e.g., Shiller 1984; Flood et al. 1986). The log dividend–price ratio Granger-causes real dividend growth. Short-term discount rates are not helpful in explaining stock price movements.
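The stationarity and cointegration pretests that motivate specifications such as Equation (93.14) can be sketched as follows. This toy example is our addition; it uses the `adfuller` and `coint` functions from `statsmodels`, and the simulated "dividend" and "price" series are purely illustrative.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller, coint

rng = np.random.default_rng(1)

# Toy log dividend as a random walk, and log price cointegrated with it
d = np.cumsum(0.01 + 0.1 * rng.standard_normal(400))
p = 3.0 + d + 0.2 * rng.standard_normal(400)   # stationary "spread" p - d

adf_level = adfuller(d)          # H0: unit root; typically not rejected in levels
adf_diff = adfuller(np.diff(d))  # typically rejected for first differences
eg = coint(p, d)                 # Engle-Granger test; H0: no cointegration

print(f"ADF level p-value: {adf_level[1]:.3f}")
print(f"ADF first-difference p-value: {adf_diff[1]:.3f}")
print(f"Engle-Granger cointegration p-value: {eg[1]:.3f}")
```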
Hodrick (1992) studies the statistical properties of three models for predicting stock returns over long horizons. The three model specifications are as follows: (1) an OLS regression of stock returns on the past dividend yield, originally proposed by Fama and French (1988); (2) a reorganization of the first specification, in which the log stock return is regressed on cumulative past dividend yields; and (3) a first-order VAR in three variables: log stock returns, dividend yields, and the 1-month Treasury-bill return relative to its previous 12-month moving average. Using simulations, Hodrick (1992) finds that the VAR alternative has the correct size and provides unbiased long-horizon statistics, and it is therefore preferable to the other two model specifications for predicting stock returns over long horizons. Empirically, the VAR tests provide strong evidence of predictive power for the 1-month-ahead return. The VAR approach is a useful alternative way to properly calculate various statistics in long-horizon prediction, including the implied regression slope coefficients, the implied R², and variance ratios. Several authors report that the logs of earnings, dividends, and stock prices are integrated of order one and that both the dividend payout and the dividend yield are stationary (e.g., Cochrane 1992; Mankiw et al. 1991; Lee 1996). That is, the log earnings and dividends series are cointegrated of order one. Log stock prices and log dividends are also cointegrated of order one. Lee (1998) employs a tri-variate structural VAR model to identify the various components of the stock price and to examine the response of stock prices to different types of shocks: permanent and temporary changes in earnings and dividends, and changes in discount factors and nonfundamental factors. The variables in the structural VAR are the change in log earnings, the spread between log dividends and earnings, and the spread between log stock prices and dividends. Lee (1998) finds that about half of the stock price variation is unrelated to earnings or dividend changes. Time-varying excess stock returns account for much of the remaining variation in stock prices. Deviations of stock prices from fundamentals do not persist for long, suggesting a fad interpretation of stock market movements. The VAR has also been employed to study optimal consumption and portfolio decisions over long horizons when asset returns are not i.i.d. For example, Campbell and Viceira (1999) use a restricted VAR consisting of the excess stock market return over the risk-free rate and the log dividend–price ratio. The VAR parsimoniously describes the investment opportunity set changing over time for the long-horizon investor. The predictability of asset returns over time significantly increases the complexity of the optimal consumption and portfolio solution. Using a log-linear approximation, Campbell and Viceira (1999) obtain an analytical solution. The optimal portfolio consists of two components. The first component is the myopic demand, which is what one would obtain in the absence of asset return predictability. This is the classical
result on optimal asset allocation. The second component is the hedging demand, which arises due to the serial correlation of asset returns. Campbell et al. (2003) extend the model to allow multiple assets in the VAR; Viceira (2001) examines long-horizon consumption and portfolio decisions with nontradable labor income. These studies demonstrate how the VAR can be conveniently employed to yield interesting and important insights into consumption and portfolio optimization.
93.3.2 Exchange Rate Prediction

The VAR is widely used to study exchange rate dynamics. We summarize several papers on exchange rate predictability. There is a large literature on exchange rate forecasting, beginning with the seminal work of Meese and Rogoff (1983), who report that most structural and time-series exchange rate models, including an unconstrained VAR consisting of the exchange rate and six key macroeconomic variables, cannot produce a better forecast of future spot exchange rates than the naive random walk model. Subsequent authors have developed various models and techniques to better understand exchange rate dynamics, hoping to produce more accurate forecasts of exchange rates than the naive random walk. Bekaert and Hodrick (1992) consider the following VAR model for two countries (e.g., the U.S., which is called country 1, and Japan, which is called country 2):

$$Y_{t+1} = A_0 + A_1 Y_t + u_{t+1}, \qquad (93.16)$$

where $Y_t = [r_{1t}, r_{2t}, rs_{2t}, dy_{1t}, dy_{2t}, fp_{2t}]'$; $r_{jt}$ is the excess equity market return in country j; $rs_{jt}$ is the excess dollar rate of return on a currency-j money market investment; $dy_{jt}$ is the dividend yield in country j; and $fp_{jt}$ is the forward premium on currency j in terms of U.S. dollars. They find that dividend yields, which are known to have predictive power for equity returns, can predict excess returns in foreign exchange markets. Similarly, forward premiums, which are known to predict excess returns in the foreign exchange market, have predictive power for equity excess returns. They also find that excess returns in the foreign exchange market exhibit strong positive persistence.

Baillie and Bollerslev (1989) employ a VAR to study the long- and short-run dynamics of spot and forward exchange rates for seven major currencies, again against the U.S. dollar. They
report that spot and forward exchange rates can be characterized as unit-root processes and that spot and forward exchange rates are cointegrated for each currency pair. Furthermore, one common unit root, or stochastic trend, is detected in the multivariate time series model for the seven spot and forward exchange rates. These findings suggest that the seven exchange rates share one long-run relationship and that the disequilibrium error around this long-run relationship can partly explain subsequent short-run movements in the exchange rates. Diebold et al. (1994) examine one immediate implication of the Baillie and Bollerslev (1989) finding that spot and forward exchange rates are cointegrated: cointegration implies an error-correction representation of spot and forward exchange rates that should forecast the change in the future spot exchange rate better than the naive random walk model. The empirical findings of Diebold et al. (1994) are, however, quite negative. The VAR model with the cointegration restriction does not produce more accurate out-of-sample forecasts of exchange rates than the simple martingale model at short horizons. Mark (1995) and Mark and Sul (2001), however, find substantial predictability of exchange rates over longer horizons (3–4 years) when monetary fundamental variables are employed. Numerous papers study the information content of the forward foreign exchange premium in predicting future spot exchange rate changes. According to uncovered interest rate parity, when the forward exchange rate trades at a 1% premium relative to the current spot exchange rate, the future spot exchange rate should be expected to depreciate by 1%. Empirically, however, uncovered interest rate parity is strongly rejected. Furthermore, the forward premium often forecasts the change in the future spot exchange rate in the opposite direction. Hai et al. (1997) propose a permanent-transitory components model for the spot and forward exchange rates. The permanent component, which is shared by the spot and forward exchange rates, is a random walk without drift, while the transitory components of the spot and forward exchange rates are assumed to follow a vector autoregressive moving average process. The model is estimated using the Kalman filter. They report that this simple parametric model is useful in understanding why the forward rate may be an unbiased predictor of future spot rates even though an increase in the forward premium predicts a dollar appreciation. The estimates of the expected excess return on short-term dollar-denominated assets are persistent and reasonable in magnitude. Mark and Wu (1998) find that a vector error correction model for the spot and forward exchange rates can account for many salient features of the data. In particular, the estimated risk premium series is highly persistent, and the risk premium and the expected future depreciation of the spot exchange rate alternate between positive and negative values
and change sign infrequently. They show that the standard intertemporal asset pricing model is not capable of generating reasonable foreign exchange risk premiums and therefore does not help to explain why the forward exchange premium forecasts future spot exchange rate changes in the wrong direction. On the other hand, a noise-trader model along the lines of De Long et al. (1990) is potentially promising in explaining the anomaly.
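A Meese–Rogoff-style horse race between a fundamentals-based forecast and the random walk can be sketched with a rolling out-of-sample comparison. The code below is our illustration (the function name and the single-predictor setup are assumptions, not the papers' exact designs).

```python
import numpy as np

def oos_rmse_vs_random_walk(s, fundamentals, window=120):
    """Rolling out-of-sample comparison in the spirit of Meese and Rogoff (1983):
    forecast spot-rate changes from lagged fundamentals versus a no-change
    (random walk) forecast, and compare root mean squared errors."""
    ds = np.diff(s)                        # spot exchange rate changes
    X = fundamentals[:-1]                  # lagged predictors, aligned with ds
    errs_model, errs_rw = [], []
    for t in range(window, len(ds)):
        Z = np.column_stack([np.ones(window), X[t - window:t]])
        b, *_ = np.linalg.lstsq(Z, ds[t - window:t], rcond=None)
        pred = np.concatenate([[1.0], np.atleast_1d(X[t])]) @ b
        errs_model.append(ds[t] - pred)
        errs_rw.append(ds[t])              # random walk predicts zero change
    rmse = lambda e: float(np.sqrt(np.mean(np.square(e))))
    return rmse(errs_model), rmse(errs_rw)
```

If the first returned RMSE exceeds the second, the fundamentals-based model loses to the random walk out of sample, which is the Meese–Rogoff result for short horizons.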
93.3.3 Measuring Market Quality and the Informational Content of Stock Trades

The VAR approach has also been applied to high-frequency data to measure the informational content of stock trades and the quality of security markets. For example, Hasbrouck (1991) uses the following vector autoregressive system to model the interaction of trades and quote revisions:

$$r_t = \sum_{i=1}^{I} a_i r_{t-i} + \sum_{i=0}^{I} b_i x_{t-i}^{0} + v_{1,t}, \qquad (93.17)$$

and

$$x_t^{0} = \sum_{i=1}^{I} c_i r_{t-i} + \sum_{i=1}^{I} d_i x_{t-i}^{0} + v_{2,t}, \qquad (93.18)$$

where t indexes transactions, and $r_t$ and $x_t^{0}$ represent price changes and trade direction, respectively. Within this framework, the information content of a trade is measured as the ultimate impact of the trade innovation on the stock price and can be inferred from the impulse response function of price changes to a shock to $x_t^{0}$. Specifically, if an order arrives at time t, its cumulative impact on quote revisions through step m is represented by:

$$\alpha_m(v_{2,t}) = \sum_{i=0}^{m} E\left[r_{t+i} \mid v_{2,t}\right]. \qquad (93.19)$$
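The sketch below (our addition, under simplifying assumptions: a single security, known lag length, and none of Hasbrouck's signed-size refinements) estimates the system (93.17)–(93.18) by OLS and propagates a unit trade innovation forward to obtain the cumulative impact $\alpha_m$ of Equation (93.19).

```python
import numpy as np

def trade_quote_alpha(r, x0, I=5, m=20):
    """OLS sketch of Hasbrouck's (1991) system (93.17)-(93.18); note that x_t^0
    enters the return equation contemporaneously. Returns alpha_m, the
    cumulative quote revision after a one-unit trade innovation v_2."""
    T = len(r)
    lag = lambda z, i: z[I - i:T - i]
    Zr = np.column_stack([lag(r, i) for i in range(1, I + 1)]
                         + [lag(x0, i) for i in range(0, I + 1)])
    Zx = np.column_stack([lag(r, i) for i in range(1, I + 1)]
                         + [lag(x0, i) for i in range(1, I + 1)])
    coef_r, *_ = np.linalg.lstsq(Zr, r[I:], rcond=None)
    coef_x, *_ = np.linalg.lstsq(Zx, x0[I:], rcond=None)
    a, b = coef_r[:I], coef_r[I:]          # b[0] is the contemporaneous term b_0
    c, d = coef_x[:I], coef_x[I:]
    # Propagate a one-unit shock to v_2 (the trade innovation) through the system
    r_path, x_path = np.zeros(m + 1), np.zeros(m + 1)
    x_path[0] = 1.0
    r_path[0] = b[0] * x_path[0]
    for s in range(1, m + 1):
        past_r = r_path[max(0, s - I):s][::-1]   # r_{s-1}, ..., r_{s-I}
        past_x = x_path[max(0, s - I):s][::-1]
        x_path[s] = c[:len(past_r)] @ past_r + d[:len(past_x)] @ past_x
        r_path[s] = (a[:len(past_r)] @ past_r + b[0] * x_path[s]
                     + b[1:1 + len(past_x)] @ past_x)
    return r_path.sum()                     # alpha_m: cumulative quote revision
```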
As m increases, $\alpha_m(v_{2,t})$ can be shown to converge to the revision in the efficient price resulting from the shock $v_{2,t}$. Notably, this model differs from the usual VAR specification in that the contemporaneous net trading volume appears as one of the independent variables explaining quote returns. This is largely due to the trade/quote timing convention assumed in the study. According to Hasbrouck (1991), trades ($x_t$) take place after market makers post bid and ask quotes ($r_{t-1}$). Based on the trades that occur, market makers revise their quotes ($r_t$), and more trades follow. Therefore, even though $x_t$ and $r_t$ carry the same subscript t, they are
not determined simultaneously. In fact, $r_t$ takes place after $x_t$ and hence cannot influence $x_t$. Such a recursive causal ordering is also necessary in order to estimate the impulse response function.

In another influential paper, Hasbrouck (1993) uses the VAR specification to develop a new measure of market quality. In this model, security transaction prices are decomposed into a random walk component, identified as the efficient price, and a residual stationary component, termed the pricing error. Specifically, the (logarithm of the) actual transaction price at t, $p_t$, is modeled as:

$$p_t = m_t + s_t, \qquad (93.20)$$

where $m_t$ is the efficient price, which follows a random walk, and $s_t$ is the pricing error, a zero-mean covariance-stationary stochastic process. As discussed in Hasbrouck (1993), the pricing error consists of two components: an information-uncorrelated component (e.g., arising from inventory effects) and an information-correlated component (arising from adverse selection effects). The dispersion of the pricing error, denoted $\sigma_s$, measures how closely the actual transaction price tracks the efficient price, and hence measures market quality: the smaller the $\sigma_s$, the higher the market quality. To empirically determine $\sigma_s$, Hasbrouck (1993) estimates a vector autoregressive (VAR) model over a set of four price and trade variables $\{r_t, x_t^{0}, x_t, x_t^{1/2}\}$, where $r_t = p_t - p_{t-1}$, $x_t^{0}$ is the trade direction (equal to 1 for buys and −1 for sells), and $x_t$ and $x_t^{1/2}$ represent the signed size and the signed square root of size, respectively. The $\sigma_s$ is then calculated for each firm as a function of the estimated variance–covariance matrix and the coefficient matrix of the VAR model under the Beveridge–Nelson (1981) identification restriction.² Applying this measure to a sample of NYSE firms, Hasbrouck (1993) finds that the average $\sigma_s$ is about 0.33% of the stock price, and that it is elevated at both the beginning and the end of the trading session.
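The following toy example is our addition and is much simpler than Hasbrouck's full VAR-based estimator: assuming (strongly) that $m_t$ is a random walk and $s_t$ is i.i.d. and independent of $m_t$, the pricing-error variance is identified from the first-order autocovariance of price changes, since then $\operatorname{cov}(\Delta p_t, \Delta p_{t-1}) = -\operatorname{var}(s)$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate Equation (93.20): p_t = m_t + s_t, with m_t a random walk and
# s_t an i.i.d. pricing error independent of m_t (a strong simplification
# relative to Hasbrouck's (1993) general covariance-stationary s_t).
T = 100_000
m = np.cumsum(0.01 * rng.standard_normal(T))
s = 0.05 * rng.standard_normal(T)
p = m + s

# Under these assumptions cov(dp_t, dp_{t-1}) = -var(s)
dp = np.diff(p)
autocov1 = np.cov(dp[1:], dp[:-1])[0, 1]
sigma_s_hat = np.sqrt(max(-autocov1, 0.0))
print(f"true sigma_s = 0.050, estimated = {sigma_s_hat:.3f}")
```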
93.3.4 Relative Informational Efficiency

The relative informational efficiency of related asset markets has been of great interest to finance scholars for decades. Numerous studies have employed the VAR approach to address substantive questions, such as whether prices and trading activity in one market reflect new information faster than those in other markets. Following seminal work by Black (1975), there has been a large literature studying inter-market relationships between the stock and option markets. The underlying
rationale is that any information related to the value of a firm's equity security should also be reflected in its derivative contracts, and which market moves first in response to the arrival of new information is determined to a large degree by informed traders' preferences for trading. According to Black (1975), the option market might be more attractive to informed traders than the market for the underlying stock because options offer higher financial leverage, and the option market is characterized by less stringent margin requirements and no uptick rule for short selling. Whether the option market indeed leads the stock market in reflecting new information has been directly examined in numerous empirical studies. One approach is to analyze the dynamic relationship between the stock and option markets using a VAR model. For example, Chan et al. (2002) propose the following multivariate VAR model for the trades and quote revisions of options and their underlying stocks:

$$r_t = a_1 r_{t-1} + \dots + a_p r_{t-p} + b_0 x_t + b_1 x_{t-1} + \dots + b_p x_{t-p} + \varepsilon_{1,t}, \qquad (93.21)$$

and

$$x_t = c_1 r_{t-1} + \dots + c_p r_{t-p} + d_1 x_{t-1} + \dots + d_p x_{t-p} + \varepsilon_{2,t}. \qquad (93.22)$$

In this model, $r_t = (r_t^{s}, r_t^{c}, r_t^{p})'$ and $x_t = (x_t^{s}, x_t^{c}, x_t^{p})'$, where $r_t^{s}$, $r_t^{c}$, and $r_t^{p}$ denote returns calculated using quote midpoints, and $x_t^{s}$, $x_t^{c}$, and $x_t^{p}$ represent net trading volume (buyer-initiated volume minus seller-initiated volume) in the stock, call, and put option markets during time interval t, respectively. Further, the error terms $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ are assumed to have zero means and to be jointly and serially uncorrelated. This specification is in the same spirit as Hasbrouck (1991), since the contemporaneous net trading volume appears as one of the explanatory variables for quote returns but not vice versa. The quote returns and net trading volumes are calculated at 5-minute intervals for a sample of the 14 most actively traded NYSE stocks with options traded on the CBOE, over a total of 231 trading days. The authors find that stock net trading volume, but not option trading volume, carries explanatory power for both contemporaneous and subsequent stock and option quote revisions. On the other hand, both stock and option quote revisions predict subsequent quote changes in both markets. These findings suggest that informed traders initiate trades only in the stock market. However, they also trade in the option market by submitting limit orders. This is consistent with the notion that the lower liquidity in the option market might discourage informed traders from trading with market orders, despite the high financial leverage options offer.³
² For more estimation details, refer to Hasbrouck (1993).
³ Whether options lead stocks in the price discovery process is still an open question. Early studies in this literature present strong evidence in favor of an option lead in prices. For example, Latane and Rendleman (1976) and Beckers (1981) show that the volatility implied in option prices predicts future stock price volatility. Consistently, Manaster and Rendleman (1982) find that option-implied stock prices contain valuable information about the equilibrium prices of the underlying stocks that has not been revealed in the stock market. However, Vijh (1988) questions the results in Manaster and Rendleman (1982), since using daily closing prices introduces a bias associated with the bid–ask spread and nonsynchronous trading. After purging the effects of bid–ask spreads, Stephan and Whaley (1990) find that the stock market leads the option market. Nevertheless, Chan et al. (1993) argue that the stock lead is due to the relatively smaller stock tick; if the average of the bid and ask is used instead of transaction prices, neither market leads the other. Later studies of the stock–option lead–lag relationship have focused more on trading volume. Easley et al. (1998) show that "positive news option volumes" and "negative news option volumes" have predictive power for future stock price changes. See also Pan and Poteshman (2003) and Cao et al. (2003). By measuring the relative share of price discovery occurring in the stock and option markets, Chakravarty et al. (2004) conclude that informed trading takes place in both the stock and option markets, suggesting an important role for option volume.
The VAR approach has also been used in a recent literature on the relative informational efficiency of stocks and corporate bonds. As both stocks and corporate bonds are claims on the issuing firm's assets, any information on the value of these assets will affect their prices. An analysis of the lead–lag relationships between stock and bond price movements can then be used to address the question of whether stocks lead bonds in reflecting firm-specific information:

$$R_{s,t} = \alpha_s + \sum_{i=1}^{I} \beta_{s,s}^{i} R_{s,t-i} + \sum_{i=1}^{L} \gamma_{s,b}^{i} R_{b,t-i} + \varepsilon_{s,t}, \qquad (93.23)$$

and

$$R_{b,t} = \alpha_b + \sum_{i=1}^{I} \beta_{b,s}^{i} R_{s,t-i} + \sum_{i=1}^{L} \gamma_{b,b}^{i} R_{b,t-i} + \varepsilon_{b,t}, \qquad (93.24)$$
where $R_{s,t}$ ($R_{b,t}$) represents the return on the stock (bond) at time t. Within this framework, Granger-causality tests are conducted to identify the lead–lag relationships. Hotchkiss and Ronen (2002) examine pre-TRACE (FIPS) transaction price summaries for 55 high-yield bonds and find that stocks do not lead bonds, consistent with Zhou (2006), who finds that lagged prices for the TRACE 50 high-yield bonds contain valuable information about current stock returns and serve an important role in disseminating firm-specific information.⁴

⁴ Zhou's (2006) analysis incorporates option market trades in addition to stock and bond market transactions. Chava and Tookes (2005) also incorporate the option markets and examine the volatility reaction of stocks, bonds, and options to macroeconomic and firm-specific information, finding significant effects near announcements. Overall, they find that corporate bond and option trades have information content for future stock price movements.
In contrast, Downing et al. (2007) conclude that stock returns do lead non-convertible bond returns at the hourly level in times of financial distress, and that the corporate bond market is therefore less informationally efficient.⁵ Gurun et al. (2008) also find significant stock leads for daily bond indices, which diminish with certain information releases. The finding that corporate bonds are less informationally efficient seems consistent with the notion that liquidity is much lower, and transaction costs are much higher, for corporate bonds than for stocks. However, in a recent study, Ronen and Zhou (2008) question the ability of the VAR to capture the information flow between stocks and bonds. Since for any given firm there is typically a multiplicity of bond issues (in contrast to a single or very few stock issues), VAR analysis based on pair-wise comparisons of each bond with the issuer's stock can be misleading. In fact, it cannot reveal the information most desired by a trader: whether there exists at least one bond constituting an information-based trading venue. For more discussion, see Ronen and Zhou (2008).

⁵ For convertible bonds, Downing et al. (2007) find that these results also hold, but only for those with conversion options deep in the money. Kwan (1996) first examines the relation between individual stocks and bonds using weekly quote data and also finds evidence in support of stock leads.
93.4 Conclusion

This chapter endeavors to summarize the basic structure of a VAR model and some related estimation methods and structural analyses. It also offers a quick overview of some important applications of the VAR approach in both economics and finance. Despite all their advantages, it is important to realize that VAR models can only be applied to stationary time series data. Therefore, in order to properly implement this approach in empirical studies, certain tests of the stationarity of the time series data have to be conducted first. If the null hypothesis of stationarity is rejected, a vector error correction model (VECM) should be used instead. For more details on VAR and VECM models, please refer to Hamilton (1994).
References
Baillie, R. and T. Bollerslev. 1989. "Common stochastic trends in a system of exchange rates." Journal of Finance 44, 167–181.
Beckers, S. 1981. "Standard deviations implied in option prices as predictors of future stock price variability." Journal of Banking and Finance 5, 363–381.
Bekaert, G. and R. Hodrick. 1992. "Characterizing predictable components in excess returns on equity and foreign exchange markets." Journal of Finance 47, 467–509.
Black, F. 1975. "Fact and fantasy in the use of options." Financial Analysts Journal 31, 36–41, 61–72.
Campbell, J. and R. Shiller. 1987. "Cointegration and tests of present value models." Journal of Political Economy 95, 1062–1088.
Campbell, J. and R. Shiller. 1989. "The dividend-price ratio and expectations of future dividends and discount factors." Review of Financial Studies 1, 195–228.
Campbell, J. and L. Viceira. 1999. "Consumption and portfolio decisions when expected returns are time varying." Quarterly Journal of Economics 114, 433–495.
Campbell, J., Y. L. Chan, and L. Viceira. 2003. "A multivariate model of strategic asset allocation." Journal of Financial Economics 67, 41–80.
Cao, C., Z. Chen, and J. M. Griffin. 2003. "The informational content of option volume prior to takeovers." Journal of Business 78, 1071–1109.
Chan, K., P. Chung, and H. Johnson. 1993. "Why option prices lag stock prices: a trading-based explanation." Journal of Finance 48, 1957–1967.
Chan, K., Y. P. Chung, and W. M. Fong. 2002. "The informational role of stock and option volume." Review of Financial Studies 15, 1049–1075.
Chakravarty, S., H. Gulen, and S. Mayhew. 2004. Informed trading in stock and option markets, Working paper, Purdue University, Virginia Tech, University of Georgia and SEC.
Chava, S. and H. Tookes. 2005. Where (and how) does news impact trading? Working paper, Texas A&M University and Yale School of Management.
Cochrane, J. 1992. "Explaining the variance of price-dividend ratios." Review of Financial Studies 5, 243–280.
De Long, B., A. Shleifer, L. Summers, and R. Waldman. 1990. "Noise trader risk in financial markets." Journal of Political Economy 98, 703–738.
Diebold, F., J. Gardeazabal, and K. Yilmaz. 1994. "On cointegration and exchange rate dynamics." Journal of Finance 49, 727–735.
Downing, C., S. Underwood, and Y. Xing. 2007. "The relative informational efficiency of stocks and bonds: an intraday analysis." Journal of Financial and Quantitative Analysis, forthcoming.
Easley, D., M. O'Hara, and P. S. Srinivas. 1998. "Option volume and stock prices: evidence on where informed traders trade." Journal of Finance 53, 431–466.
Fama, E. and K. French. 1988. "Dividend yields and expected stock returns." Journal of Financial Economics 22, 3–26.
Flood, R., R. Hodrick, and P. Kaplan. 1986. An evaluation of recent evidence on stock market bubbles, NBER working paper.
Granger, C. W. J. 1969. "Investigating causal relations by econometric models and cross spectral methods." Econometrica 37, 424–438.
Gurun, U. G., R. Johnston, and S. Markov. 2008. "Sell-side debt analysts and market efficiency," working paper.
Hai, W., N. Mark, and Y. Wu. 1997. "Understanding spot and forward exchange rate regressions." Journal of Applied Econometrics 12, 715–734.
Hamilton, J. D. 1994. Time series analysis, Princeton University Press, Princeton.
Hasbrouck, J. 1991. "Measuring the information content of stock trades." Journal of Finance 46, 179–207.
Hasbrouck, J. 1993. "Assessing the quality of a security market: a new approach to transaction-cost measurement." Review of Financial Studies 6, 191–212.
Hodrick, R. 1992. "Dividend yields and expected stock returns: alternative procedures for inference and measurement." Review of Financial Studies 5, 357–386.
Hotchkiss, E. S. and T. Ronen. 2002. "The informational efficiency of the corporate bond market: an intraday analysis." Review of Financial Studies 15, 1325–1354.
Kwan, S. H. 1996. "Firm-specific information and the correlation between individual stocks and bonds." Journal of Financial Economics 40, 63–80.
Latane, H. and R. J. Rendleman. 1976. "Standard deviation of stock price ratios implied in option prices." Journal of Finance 31, 369–381.
Lee, B.-S. 1996. "Time-series implications of aggregate dividend behavior." Review of Financial Studies 9, 589–618.
Lee, B.-S. 1998. "Permanent, temporary and nonfundamental components of stock prices." Journal of Financial and Quantitative Analysis 33, 1–32.
Lutkepohl, H. 1991. Introduction to multiple time series analysis, Springer, Berlin.
Manaster, S. and R. J. Rendleman. 1982. "Option prices as predictors of equilibrium stock prices." Journal of Finance 37, 1043–1057.
Mankiw, N. G., D. Romer, and M. D. Shapiro. 1991. "Stock market forecastability and volatility: a statistical appraisal." Review of Economic Studies 58, 455–477.
Mark, N. 1995. "Exchange rates and fundamentals: evidence on long-horizon predictability." American Economic Review 85, 201–218.
Mark, N. and D. Sul. 2001. "Nominal exchange rates and monetary fundamentals: evidence from a small post-Bretton Woods panel." Journal of International Economics 53, 29–52.
Mark, N. and Y. Wu. 1998. "Rethinking deviations from uncovered interest parity: the role of covariance risk and noise." Economic Journal 108, 1686–1706.
Meese, R. A. and K. Rogoff. 1983. "Empirical exchange rate models of the seventies: do they fit out of sample?" Journal of International Economics 14, 3–24.
Pan, J. and A. Poteshman. 2003. The information in option volume for stock prices, Working paper, MIT and University of Illinois.
Ronen, T. and X. Zhou. 2008. "Where did all the information go? Trade in the corporate bond market." Working paper, Rutgers University.
Shiller, R. 1984. "Stock prices and social dynamics." Brookings Papers on Economic Activity 2, 457–498.
Sims, C. A. 1972. "Money, income, and causality." American Economic Review 62, 540–552.
Sims, C. A. 1980. "Macroeconomics and reality." Econometrica 48, 1–48.
Stephan, J. and R. Whaley. 1990. "Intraday price changes and trading volume relations in the stock and stock option markets." Journal of Finance 45, 191–220.
Stock, J. and M. Watson. 2001. "Vector autoregressions." Journal of Economic Perspectives 15, 101–115.
Viceira, L. 2001. "Optimal portfolio choice for long-horizon investors with nontradable labor income." Journal of Finance 56, 433–470.
Vijh, A. M. 1988. "Potential biases from using only trade prices of related securities on different exchanges." Journal of Finance 43, 1049–1055.
Watson, M. 1994. "Vector autoregressions and cointegration," in Handbook of econometrics, Vol. IV, R. F. Engle and D. McFadden (Eds.). Elsevier, Amsterdam.
Zhou, X. 2006. Information-based trading in the junk bond market, Working paper, Cornell University.
Chapter 94
Signaling Models and Product Market Games in Finance: Do We Know What We Know?* Kose John and Anant K. Sundaram
Abstract Important results in a large class of financial models of signaling and product market games hinge on assumptions about second-order complementarities or substitutabilities between arguments in the maximand. Such second-order relationships are determined by the technology of the firm in signaling models, and by market structure in product market games. To the extent that the underlying economics (in theoretical specifications) or the data (in empirical tests) cannot distinguish between such complementarities and substitutabilities, the theoretical robustness and the empirical tests of many models are rendered questionable. Based on three well-known models from the finance literature, we discuss the role that these assumptions play in theory development and provide empirical evidence that is consistent with the arguments advanced here.

Keywords Financial signaling · Second order complementarities or substitutabilities · Product market games · Strategic substitutes and complements · Empirical evidence

K. John, New York University, NY, USA. e-mail: [email protected]
A.K. Sundaram, Dartmouth College, Hanover, New Hampshire, USA
94.1 Introduction

Central results in many models in financial economics – particularly models of the interaction between product market structure and financial market behavior, as well as models of signaling – hinge on assumptions concerning second-order relationships between arguments in a function to be optimized. These second-order relationships can be defined as those resulting in super- or submodularity of the maximand (Topkis 1978, 1979; Milgrom and Roberts 1990a, b), and they reflect the underlying economics of the firm or the industry. We argue below that changing the assumption from one to the other – a change that is tantamount to implicitly changing
the assumptions on technology or market structure – can produce result reversals or indeterminacy, raising troublesome questions about both the theoretical robustness and the empirical testability of many well-accepted models. Data that cannot discriminate among these differing technologies and market structures may produce conflicting results, or no results at all, and it is possible that studying the market value announcement effects of various real and financial decisions by looking at average effects can be misleading.

Following Topkis (1978, 1979) and Milgrom and Roberts (1990a, b), we first define super- and submodularity. We then examine their implications for signaling models and product market games. We show how the assumption of one or the other leads to result indeterminacy, or even reversals, using three examples: Miller and Rock (1985) on dividend signaling, Brander and Lewis (1986) on the effect of leverage increases, and Maksimovic (1990) on the effect of loan commitments. Based on recent empirical evidence, we argue that empirical tests that only examine average announcement effects may need to be reexamined by parsing data sets to reflect the presence of super- and submodularity, since the opposite effects of each may cancel the other out, or the unduly large presence of one or the other in a data set may produce results that are not generalizable to all firms. Many such models make implicit assumptions about the nature of competition, or the nature of the technologies in the firms and industries under study. Unless these assumptions can be validated by the data under scrutiny, we cannot generalize the results even if they appear to be consistent with the chosen model. An illustration of our point is provided by the Miller and Rock (1985) signaling model
of dividends (explored in greater detail in Sect. 94.3). We show that the choice of technology in Miller and Rock is essentially arbitrary. An equally, if not more, plausible choice of technology will give rise to a signaling equilibrium in which the strategies of firms with respect to investment policies, and the announcement effects of dividend changes, are the exact opposite of those obtained in the Miller–Rock equilibrium. The only rationale for the Miller–Rock choice of technology is that it yields an announcement effect of dividend changes on stock prices that is consistent with the average (across firms) effect documented in some empirical studies. Recent studies taking a closer look at the evidence (e.g., John and Lang 1991; Lang and Litzenberger 1989; Shih and Suk 1992) have found both stock price increases and decreases in response to announcements of dividend increases. Moreover, such dichotomous announcement effects can be related to the super- or submodularity of the firm's technology.¹ The point of this paper is simply this: instead of making a model choice arbitrarily to yield the average effect observed empirically, it may be important to examine firm and industry characteristics closely and relate them to explanations for the positive and negative effects observed in the data.

* The authors are respectively Charles William Gerstenberg Professor of Banking and Finance at the Stern School of Business at New York University ([email protected]), and a professor of finance and Faculty Director of Executive Education at the Tuck School of Business at Dartmouth College ([email protected]). Earlier versions were presented at the Western Finance Association meetings, the European Finance Association meetings, the Third Annual Conference on Finance and Accounting in New York, and at the Stockholm School of Economics. We are grateful to Clas Bergstrom, Paolo Fulghieri, "Max" Maksimovic, James Seward, and Lakshmi Shyam-Sunder for comments.
94.2 Supermodularity: Definitions

The concept of supermodularity revolves around the idea that there are second-order complementarities between arguments in a function to be optimized, and that, therefore, there may be "highest" and "lowest" optimal solutions to the function. That is, the optimal solutions are such that all the arguments are simultaneously at their highest values or simultaneously at their lowest values at the optimum. In economic modeling, this can imply that there are Pareto-best and Pareto-worst equilibria (Milgrom and Roberts 1990b). Consider a function $f(x_1, x_2)$, where $f(\cdot)$ is twice continuously differentiable in $x_1, x_2$. Then supermodularity is defined as follows:

$$f \text{ is supermodular if and only if } \frac{\partial^2 f}{\partial x_1 \partial x_2} > 0 \qquad \text{(SM)}$$

(Topkis 1978); a weakened version of this definition can also be found in Milgrom and Roberts (1990a, b). If $-f(\cdot)$ is supermodular, then $f(\cdot)$ is submodular; if $f(\cdot)$ is both supermodular and submodular, then it is a valuation (Topkis 1978).

¹ Such dichotomous effects have been observed – but largely left unexplained – in other related areas, such as the announcement effects of leverage changes (e.g., Masulis 1983; Eckbo 1986; Mikkelson and Partch 1986), announcement effects of investments (e.g., John and Mishra 1990) and dividend initiations (e.g., John and Lang 1991), and announcement effects of changes in research and development expenditures (e.g., Sundaram et al. 1996) and antitakeover measures (e.g., John et al. 1992). For more details, see Sect. 94.5 of the paper.
Although we assume twice-continuously differentiable functions in the definition above (and in the rest of the paper), in general the concept of supermodularity does not require assumptions of convexity, concavity, or differentiability.² Further, there is no necessary connection between supermodularity and increasing or decreasing returns to scale (see Milgrom and Roberts 1990a, b).³ The assumption of super- or submodularity appears in a wide range of models: oligopoly games, auction games, technology adoption games, games of macroeconomic coordination, and so forth (for a review of the various applications, see Milgrom and Roberts 1990b). We argue that these concepts also apply to a wide class of product market and signaling games in finance. Using three well-known examples, we analyze how the phenomena of super- and submodularity affect theoretical and empirical results.
94.3 Supermodularity in Signaling Models

In the early signaling models, such as Leland and Pyle (1977) or Ross (1977), larger managerial holdings or larger debt levels were unambiguous signals of higher quality (i.e., higher levels of the attribute of private information). Similarly, in the dividend signaling models that followed (e.g., Bhattacharya 1979; Miller and Rock 1985; John and Williams 1985), higher dividend payouts were assumed to connote better firm prospects. From recent work in financial signaling, it is clear that the interpretation of increases in the signal has to be made conditional on the super- or submodularity of firm technologies. In Ambarish et al. (1987) and John and Mishra (1990), whether or not better firms invest larger or smaller amounts than the full-information-optimal level depends on the characteristics of the firm's technology; consequently, whether announcements of investment increases will lead to increases or decreases in the firm's stock price also depends on the firm's technology.
The generalization of similar conditions for the non-differentiable case motivates the concepts of super modularity and sub modularity. The details can be found in Milgrom and Roberts (1990b). 3 As an example, suppose the function is of the kind: f .x1 ; x2 / D x1 C x2 C ˇx1 x2 Using (SM), we see that, if x1 0; x2 0 then f .:/ is supermodular if ˇ > 0, submodular if ˇ < 0, and a valuation if ˇ D 0. Consider another function, the familiar Cobb-Douglas function: ˇ
f .x/ D x1 x2
Here, f .:/ is supermodular if, for all x1 0; x2 0; sgn./ D sgn.ˇ/. Thus, even the Cobb-Douglas function can exhibit super- or submodularity, and be decreasing or increasing returns to scale under either.
94 Signaling Models and Product Market Games in Finance: Do We Know What We Know?
Similarly, in John and Lang (1991), dividend increases can signal better or worse firm prospects depending on the firm’s characteristics.4 The idea can be generally understood simply as resulting from implicit assumptions concerning super- or submodularity of firm technologies. We illustrate this argument using the well-known Miller and Rock (1985) model.
94.3.1 Dividend Signaling We consider an extended version of the model in Miller and Rock (1985). The investment opportunity set is generalized in a straightforward way to be a function of not only investment, but also the firm quality attribute, which is private information to corporate insiders. Let t D 0 indicate the initial date and t D 1, the final date. At t D 0, the firm has current cash C , which includes cash from operations as well as any external financing done at the firm level selling riskless claims.5 The firm has access to a technology (or an investment opportunity set) f .I; /. The technology represents a set of assets such that an investment of amount I made at t D 0 results in a cash flow of net present value of f .I; / at t D 1 for a firm of quality (specified in detail below). Corporate insiders are assumed to act on behalf of current shareholders. It is common knowledge that an exogenous fraction k .0 < k < 1) of current shareholders will sell their shares at t D 0 in the market. Just prior to that, the insiders announce a dividend D, and invest I , where I is the residual amount C D, in the technology of the firm. Information about future cash flow is asymmetric. To specify this asymmetry as simply as possible, represent by , with 2 ‚, the value of the single state variable affecting the firm’s future profitability (for example, may measure its insiders’ forecasted return on either its assets in place or alternatively, its opportunities to invest).6 The firm’s identity, , is private information; it is known to the insiders, but not observed directly by the outsiders. In particular, insiders 4 Empirical evidence of such dichotomous announcement effects of dividend initiations on stock prices is also presented in John and Lang (1991). 5 The outside claims that the firm can sell are restricted to be riskless debt and equity. This assumption is made for model tractability. For convenience, we include riskless claims in C and treat separately the equity sold by current shareholders. Such an assumption is common to papers in this area, see Miller and Rock (1985), John and Williams (1985), Ambarish et al. (1987) and John and Mishra (1990). 6 Although we analyze the case of one-dimensional signaling for expositional convenience, similar results obtain in more complex models of “efficient signaling” where a least-cost mix of multiple signals are used to convey one-dimensional private information. See, for example, Ambarish et al. (1987), and John and Mishra (1990).
1401
observe the private attribute before picking D, whereas the outsiders must infer from the insiders’ actions. By contrast, prior probability ./ of possessing the private attribute is common knowledge.7 At t D 1, the cash flows are realized and distributed prorata to shareholders. By convention, we take higher to represent more favorable private information. In other words, for all feasible I , we have f .I; 1 / > f .I; 2 / when 2 > 1 . Also, the function f .:; / is positive, twice-continuously differentiable, strictly increasing and strictly concave in investment, with suitable corner conditions. As argued in Miller and Rock (1985), corporate insiders will solve kP C .1 k/ŒD C f .C D; / ;
(94.1)
which is the weighted average of the wealth of the shareholders who are selling and those who remain. The market price, P , can however only depend on common knowledge and the observable actions of insiders. In other words, the pricing function used by the market will be a function of D (and parameterized by all common knowledge). Let P .D/ be the cum-dividend price assigned by the market for the entire firm value. We use the signaling framework developed in Riley (1979) to analyze the model. Following Riley (1979), let the insiders’ objective be U.; D; P /, as a function of the private information, the signal, and the market pricing function. The corporate objective is now to choose the dividend payout D to maximize their objective in (94.1), conditional on P . This optimization is equivalent to: (94.2) Max U.; D; P / D0
Let the true value of the firm; that is, the value which the market would have assigned to the firm (cum-dividend) had it been informed of the value of be V .; D/. Then, V .; D/ D D C f .C D; /
(94.3)
Riley (1979) defines a signaling equilibrium fD./, P ŒD./ g as dividend schedules D./ and a pricing functional P .:/ such that for D./ which solves the incentive compatibility condition Equation (94.2), the following competitive rationality condition is also satisfied: P ŒD./ D V Œ; D./
(94.4)
The existence of the signaling equilibrium and its properties are shown in detail in Riley (1979), and we only sketch the 7
For example, the current cash C is fully revealed to outsiders through costless corporate audits; similarly, corporate audits will verify that I D C D will be the level of investment actually undertaken.
1402
K. John and A.K. Sundaram
arguments below. Let us first examine the full-information optimum investment level I ı ./ for firm type . It is given by the familiar condition: @f .I ı ; / D1 @I
(94.5)
Also, define D ı ./ D C I ı ./. Now we show that the properties of the signaling equilibrium are crucially dependent on whether the firm’s technology, f .I; /, is super- or submodular as a function of I ı and ; i.e., thatı is, whether @2 f @I @ > 0 for all and all I 0, or @2 f @I @ < 0 for all and all I 0.8 Proposition 1. If the firm technologies are supermodular, all but the poorest firms overinvest in the signaling equilibrium and pay lower dividends with increasing ; if firm technologies are submodular, all but the poorest firms underinvest in the signaling equilibrium and pay larger dividends with larger . Proof. Using Riley’s assumptions (i) to (vi) in Riley (1979) we can show that when the technology is submodular, fD./; P ŒD./ g is an equilibrium. It is sufficient to verify Riley’s crucial assumption (v). It can be verified as follows for the objective function in Equation (94.1): @fŒ.1 k/=k Œ.@f =@I / 1 g @.UD =UP / D <0 @ @ (94.6) ı if and only if @2 f @I @ < 0, i.e., if f is submodular in I and . Since I ı ./ is decreasing in for submodular technologies, it follows that I./ < I ı ./ for all types except the P -type: I. P / D I ı . P /, where P represents the of the poorest firms. The corresponding dividend D./ D C I./ is increasing in . Thus all but the poorest firms underinvest. Conversely, for supermodular technologies, Riley’s conditions can only be satisfied for the signal I./ (and corresponding to a decreasing schedule of dividends D./ D C I.//. The crucial Riley condition (v) can be verified only for I./ C D./. Now, all the but the poorest firms overinvest in the signaling equilibrium (I./ > I ı ./), and higher firms pay smaller dividends. QED. We now examine the announcement effect of dividends on stock prices. Differentiating Equation (94.1), kŒ@P =@D .1 k/Œ.@f =@I / 1/ D 0
(94.7)
which, depending on sgnŒ.@f =@I / 1 , is positive for submodular firms and negative for supermodular firms. Underinvestment for submodular technologies implies @f =@I > 1, and overinvestment for supermodular technologies implies @f =@I < 1. Thus, the characteristics of the firm’s technology will determine whether stock prices increase or decrease in response to announcements of dividend increases.
94.3.2 An Example Let us consider a particular example of a submodular technology, which is isomorphic to the one considered in the Miller–Rock paper: f .I; / D ˇ ln.I C /
(94.9)
where ˇ is a positive constant (Miller–Rock provide an explicit solution to this technology; that this technology is submodular is easy to see: @2 f .:/=@I @ D ˇ=.1 C /2 < 0/. This technology has the “usual” properties that one would look for in a cash flow function: it is increasing in investment level I and the quality attribute ; it is strictly concave, twice-continuously differentiable, and so forth. However, note that the full-information investment level I ı ./ defined by @f .I ı ; /=@I is decreasing in .9 As we have shown in Proposition 1 (see, in particular, (Equation (94.6)), the firm underinvests in the signaling equilibrium and the announcement effect of dividend increases is positive. Now consider a simple example of a supermodular technology: p (94.10) f .I; / D I (That this technology is supermodular is easy to check: @2 f .:/=@I @ D .0:5/I .1=2/ > 0/. This technology also has the “usual” properties one would look for a cash flow function: it is increasing in the investment level I , and the quality attribute ; it is strictly concave, twice-continuously differentiable, and so forth. Indeed, we might note that (94.10) has arguably the more attractive property that, in the full information equilibrium, the optimal investment level is increasing in firm quality, a feature that might actually make it a more plausible candidate for a cash flow technology than the one in Equation (94.9). Note now that with Equation (94.10) as the firm technology (and following Equation (94.6)), firms will overinvest in the signaling equilibrium, and therefore, the announcement effects of dividend increases will be negative.
which implies @P =@D D Œ.1 k/=k Œ.@f =@I / 1 ;
(94.8)
It should be noted that I ı . / is increasing in for supermodular technologies and decreasing in for submodular technologies.
8
⁹ Indeed, we might argue that the optimal level of investment being lower for a higher-quality firm is not an attractive characteristic.
In other words, for an equally, if not more, plausible technology than that in Miller–Rock, we get a result that is the exact opposite of theirs. The choice of a supermodular technology such as the one in Equation (94.10) is just as appropriate as the submodular one in Equation (94.9). The main argument in favor of the submodular technology choice is that it yields results consistent with the positive average announcement effects observed in some of the data (in particular, the earlier studies). Instead of making arbitrary a priori choices of technology based on average evidence, we suggest that detailed studies of technology characteristics and the related announcement effects are necessary to understand the true underlying phenomena.
94.4 Supermodularity in Product Market Games

In games involving product market competition, supermodular games are those in which each player's strategy set is partially ordered and the marginal returns to increasing one's strategy increase in the competitor's strategy. In other words, the game is said to exhibit "strategic complementarity" (SC) (Bulow et al. 1985; see also Fudenberg and Tirole 1984 and Tirole 1988).¹⁰ If marginal returns are decreasing in the competitor's strategy, then there is submodularity and the game is said to exhibit "strategic substitutability" (SS). Consider, for example, a duopoly. If $\pi$ denotes profit and the variables of strategic choice¹¹ for players 1 and 2 are $q_1$ and $q_2$, then profits are (strictly) supermodular with respect to the strategies $q_i$, $i = 1, 2$, if and only if $\partial^2\pi/\partial q_1\,\partial q_2 > 0$; if the reverse inequality holds, then profits are submodular and the strategies are strategic substitutes.¹² In terms of reaction functions in models of oligopoly, submodularity implies downward-sloping reaction functions, while supermodularity implies that reaction functions are upward sloping. Three variables determine supermodularity in oligopoly games: the nature of the demand function, the nature of the cost function, and the nature of competition (for example, whether it is Cournot, Bertrand, market share, or perfectly competitive).
10. It can also occur in situations in which players' strategies are multidimensional: if the marginal returns to any one component of the player's strategy increase with increases in the other components, then the component is supermodular with respect to the other components.
11. If Cournot or market-share competition is assumed, then the strategic variable is quantity; if Bertrand, then the strategic variable is price; under perfect competition and monopoly, the variable can be modeled either as quantity or price.
12. The concepts of SS and SC have wide general applicability in oligopoly theory. Tirole (1988) refers to them as a "taxonomy" for business strategies, whereby strategic-complements competition can be interpreted as aggressive competition, and strategic-substitutes competition as passive competition.
Many of the linkages between SS/SC and these three sets of functions have been explored in some detail in Bulow et al. (1985), and we will not go into them here. In general, Cournot and market-share competition with linear demand implicitly entails an assumption of SS, regardless of the shape of the cost function; on the other hand, if constant-elasticity demand is assumed, then we can have SS or SC depending on the size of the demand elasticity.¹³ With homogeneous goods and Cournot/market-share competition, demand functions that are linear or strictly concave to the origin imply SS, while demand functions that are sufficiently convex to the origin imply SC. In the case of Bertrand competition, increasing or constant marginal costs will imply SC, while decreasing marginal costs are likely to imply SS. Perfect competition (and Bertrand competition in homogeneous goods, which also produces the result that prices equal marginal costs) will imply SS. Thus, models cannot simply assume, say, "Cournot competition with linear demand" and attempt to draw conclusions of any reasonable empirical generality from it, since the assumption that is really being made is that the game exhibits SS. Whether or not this is a reasonable assumption would require prior empirical analysis of the industries or firms under scrutiny. We show how this matters in two recent models that deal with interactions between product markets and financial markets: Brander and Lewis (1986) and Maksimovic (1990).¹⁴ In each case, we restrict ourselves to one or two key conclusions of the models, since our intention is to illustrate the broader point concerning the empirical robustness of such work.

13. The lower the demand elasticity in a Cournot model, ceteris paribus, the greater the possibility that SC obtains in the product market. To the extent that lower demand elasticities characterize differentiated products, we might argue that many products in the real world are as likely to be SC as they are SS.
14. Our choice of these two models is for reasons of parsimony. The general issues raised below appear in many other models that consider the implications of interactions between product markets and financial markets (e.g., Rotemberg and Scharfstein 1990; Brander and Spencer 1989).
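As a concrete check of the demand-shape claims in this paragraph, the following sketch (our illustration; the functional forms are assumed) computes the profit cross-partial $\partial^2\pi/\partial q_1\,\partial q_2$ under linear and constant-elasticity demand.

```python
# Sketch (our illustration; functional forms assumed) of how demand shape
# alone determines SS vs. SC in homogeneous-goods Cournot competition.
import sympy as sp

q1, q2, a, b, eta = sp.symbols('q1 q2 a b eta', positive=True)
Q = q1 + q2

pi_linear = (a - b * Q) * q1                 # linear inverse demand
print(sp.diff(pi_linear, q1, q2))            # -> -b < 0: always SS

pi_ces = Q ** (-1 / eta) * q1                # constant-elasticity inverse demand
cross = sp.simplify(sp.diff(pi_ces, q1, q2))
for e in (sp.Integer(3), sp.Rational(2, 3)):
    print('eta =', e, '->', sp.simplify(cross.subs({q1: 1, q2: 1, eta: e})))
# eta = 3 gives a negative cross-partial (SS); eta = 2/3 gives a positive one
# (SC): low demand elasticity can flip the game to strategic complements.
```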
94.4.1 Effect of Leverage Decisions [Brander and Lewis (1986)]

Brander and Lewis (1986) (hereafter, BL) developed a model in which financial and output decisions follow in sequence, and argued that the limited liability effect of debt may
commit a leveraged firm to a more aggressive output strategy, and consequently its competitor to a more passive output strategy. BL's setup is as follows. Firms 1 and 2 are Cournot competitors in output markets, producing competing products $q^1$ and $q^2$, respectively. The profit of firm $i$ is $R^i(q^i, q^j, z^i)$, where $z^i$ (independent of and identical to $z^j$) is a random variable that affects output and is distributed on the interval $[z', z'']$ according to the density function $f(z^i)$. BL consider two situations with respect to the realization of $z^i$: $R^i_{z^i} > 0$, where marginal profits are higher in the better states of the world, and $R^i_{z^i} < 0$, where the reverse obtains. For our purposes, it is sufficient to consider the former case. They also assume that $R^i$ satisfies the "usual" (p. 958) properties: $R^i_j < 0$, $R^i_{ii} < 0$, and $R^i_{ij} < 0$. From our point of view, vis-à-vis the main result of BL, the most important assumption is the third one: the assumption that the game exhibits strategic substitutability; that is, that reaction functions are downward sloping.¹⁵ (After stating their main result and sketching the proof, we shall show why this is a critical assumption.)¹⁶ Firms can take on debt $D^i$, and the objective of firm $i$ is to choose output levels that maximize the expected value of the firm, $V^i$, to its stockholders:

$$V^i(q^i, q^j) = \int_{\hat z}^{z''}\big[R^i(q^i, q^j, z^i) - D^i\big]\,f(z^i)\,dz^i, \qquad (94.11)$$
where $\hat z$, defined by $R^i(\cdot) = D^i$, captures the limited liability effect. Note that the assumption $R^i_{ij} < 0$ will imply $V^i_{ij} < 0$, or that firm value is submodular in outputs. Similarly, the assumption $R^i_{ii} < 0$ will imply $V^i_{ii} < 0$; this would, for example, be the case where the demand function of the firm is concave to the origin and marginal costs are constant. The important result of the paper is Proposition 2, which, for the case $R^i_{z^i} > 0$, is as follows (BL suggest that it represents the "... key insight in the analysis" (p. 963)): Given $R^i_{z^i} > 0$, a unilateral increase in firm $i$'s debt, $D^i$, causes an increase in $q^i$ and a decrease in $q^j$. The proof is fairly straightforward, simply requiring that the first-order conditions for Equation (94.11) be totally differentiated, then using Cramer's rule and assumptions concerning the signs of the variables to determine comparative statics effects. To sketch the proof, consider the case of $R^i_{z^i} > 0$. If we totally differentiate the first-order conditions for Equation (94.11), we have

$$V^i_{ii}\,dq^i + V^i_{ij}\,dq^j + V^i_{iD^i}\,dD^i = 0, \qquad (94.12)$$
$$V^j_{ji}\,dq^i + V^j_{jj}\,dq^j + V^j_{jD^i}\,dD^i = 0. \qquad (94.13)$$

15. This is equivalent to the assumption that the profit function is submodular in quantities. Note, however, that with respect to the uncertainty term $z^i$, BL allow for the possibility of both super- and submodularity.
16. BL do appear to recognize this possibility, but in passing. They note (p. 961) that the condition can be violated by feasible demand and cost structures, but they do not explore it any further.
Using Cramer's rule to solve for $dq^i/dD^i$ and $dq^j/dD^i$ gives

$$dq^i/dD^i = -V^i_{iD^i}\,V^j_{jj}\big/B, \qquad (94.14)$$
$$dq^j/dD^i = V^i_{iD^i}\,V^j_{ji}\big/B, \qquad (94.15)$$
where $B > 0$ (under the assumption of equilibrium stability) and, further, $V^j_{jj}$ is assumed to be less than zero. However, $V^j_{ji}$ can be greater than zero or less than zero, depending on whether $V$ is super- or submodular in quantities. Now, BL's assertion that $D^i$ causes an increase in $q^i$ is dependent on the assumption that the sign of $V^j_{jj}$ is negative. (If $V^j_{jj}$ is positive instead—that is, if the demand function is concave to the origin and costs constant—then an increase in leverage has an output-reducing effect for firm $i$.) The assertion that $D^i$ causes a decrease in the competitor's output is, however, dependent on the assumption that $V^j_{ji}$ is submodular in quantities. Now, instead, let us assume that $V$ is supermodular in quantities, i.e., that $V^j_{ji} > 0$. Then, by examining Equation (94.15), we have the following proposition: Given $R^i_{z^i} > 0$, a unilateral increase in firm $i$'s debt, $D^i$, causes an increase in $q^i$ and an increase in $q^j$. The original result was that an aggressive capital structure choice by a firm will lead to a passive strategic response by its competitors. This intuition depended on the assumption of submodularity of firm value with respect to competitors' quantities. If, instead, firm value is supermodular in competitors' quantities, then aggressive capital structure choices by firms lead to an aggressive strategic response from their competitors. Consequently, we would observe the following: (1) an increase in leverage by one firm will be followed by an increase in leverage by other firms; (2) there will be a general increase in industry output levels following leverage increases. If such an industry-wide increase in output has adverse profitability consequences for all firms in the industry, and if there is a one-to-one correspondence between firm profits and stock prices, then we may well observe that leverage increases are accompanied by stock price decreases. In order to see this, let us examine the profits of firm $i$ around equilibrium, when perturbed by a leverage increase (i.e., an increase in $D^i$):

$$\frac{d\pi^i}{dD^i} = \frac{\partial\pi^i}{\partial q^i}\,\frac{dq^i}{dD^i} + \frac{\partial\pi^i}{\partial q^j}\,\frac{dq^j}{dD^i} + \frac{\partial\pi^i}{\partial D^i}.$$
Note that the first term on the right-hand side is zero in equilibrium, and the third term is positive (from the analysis above). Now consider the second term: $\partial\pi^i/\partial q^j$ is less than zero by the assumption of conventional substitutes; however, we note from Equation (94.15) that $dq^j/dD^i$ can be positive or negative depending on whether $\pi(\cdot)$ is supermodular or submodular. Thus, the second term (and hence the impact of leverage increases on firm profit) can be negative or ambiguous depending on whether $\pi(\cdot)$ is submodular (the BL case) or supermodular. Indeed, if $\pi(\cdot)$ is sufficiently supermodular, $dq^j/dD^i$ can be sufficiently large and positive so as to overwhelm the effect of $\partial\pi^i/\partial D^i$; that is, leverage increases can have negative stock price consequences (assuming, as is normal, that stock prices covary positively with firm profits), a result that implies the reverse of the BL insight. The crucial issue then is this: if we empirically implement the BL idea using a data set that contains products or markets where both SS and SC forms of competition are present, then we are likely either to find conflicting results in the data or unlikely to find any results at all. The only way to address this would be to empirically analyze the market prior to the analysis of valuation implications, to explore whether manifestations of SS or SC are present in the data (for example, in the choice of industries).
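To make the comparative statics concrete, here is a small numerical sketch of the BL setup. All functional forms (linear demand with an additive uniform shock, zero production costs) are our assumptions, chosen to satisfy BL's SS conditions; they are not BL's own parameterization.

```python
# Numerical sketch of BL's setup (our parameterization, not BL's):
# R^i = (a - q_i - q_j + z) q_i with z ~ Uniform[zl, zu], so R_z = q_i > 0 and
# R^i_ij = -1 < 0 (strategic substitutes). Equity value is (94.11).
import numpy as np

a, zl, zu = 10.0, -3.0, 3.0
grid = np.linspace(0.1, 6.0, 800)       # candidate outputs
zs = np.linspace(zl, zu, 1201)          # nodes for the expectation over z

def best_response(qj, Di):
    R = (a - grid[:, None] - qj + zs[None, :]) * grid[:, None]
    V = np.maximum(R - Di, 0.0).mean(axis=1)   # E[max(R - D, 0)], cf. (94.11)
    return grid[np.argmax(V)]

def equilibrium(D1, D2=0.0):
    q1 = q2 = 3.0
    for _ in range(100):                        # best-response iteration
        q1, q2 = best_response(q2, D1), best_response(q1, D2)
    return round(q1, 2), round(q2, 2)

for D1 in (0.0, 8.0, 16.0):
    print('D1 =', D1, '-> (q1, q2) =', equilibrium(D1))
# Raising D1 pushes q1 up and q2 down: BL's Proposition 2 under submodularity.
# With a supermodular V (upward reaction functions) q2 would rise instead.
```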
94.4.2 Loan Commitments in a Duopoly [Maksimovic (1990)] Maksimovic (1990) (hereafter, M) developed a model of loan commitments in a Cournot duopoly. The argument is that, when product markets are imperfect, the provision of loan commitments leads to a subsidy effect from reduced capital costs, giving the firm the incentive to increase output. Under the assumptions of the model, the increased output will lead to a reduction in the rival’s output and hence, an increase in the firm’s profits; consequently, the firm has the incentive to take on loan commitments. Likewise, the rival has the same set of options and will do the same. The net result is that both firms receive loan commitments and overproduce relative to a no-commitments equilibrium, leading to a decline in industry profit. Thus, an important implication of M’s model is that while “. . . it is rational for a firm to obtain a loan commitment, all the firms in that industry taken together are made worse off by the existence of loan commitments (p. 1641).” We show below that this result hinges on the implicit assumption of submodularity (i.e., strategic substitutes). As before, we present a simplified version of the model and its key result. Next, we show how results can get reversed under supermodularity.
There are two firms, 1 and 2, homogeneous-product Cournot competitors facing a linear demand schedule.¹⁷ The inverse demand function is $p = a - b(q^1 + q^2)$, where $q^i$, $i = 1, 2$, is the output of firm $i$. Marginal costs are assumed to be constant (and normalized to 1) for simplicity, and are subsumed in the intercept of the demand function, $a$. Firm $i$ can obtain a loan commitment at constant marginal cost $r^i$. The profit function for firm $i$ is

$$\pi^i(q^1, q^2) = \big[a - b(q^1 + q^2) - r^i\big]\,q^i. \qquad (94.16)$$

The first-order condition for Equation (94.16) gives us

$$\partial\pi^i/\partial q^i = a - 2bq^i - bq^j - r^i = 0, \qquad (94.17)$$

which, in turn, implies a downward-sloping reaction function.¹⁸ Clearly, a decrease in $r^i$ (i.e., a subsidy effect resulting from the provision of a loan commitment) will lead to an increase in the firm's output and profit and, consequently, a decline in the rival's output and profit. Hence, the firm will have the incentive to obtain a commitment. The rival will have the same incentive, and will also obtain a commitment. A prisoner's dilemma situation results whereby both firms produce more than in the precommitment equilibrium, increasing total industry output and reducing total industry profit. This is M's main result.

Let us now generalize M's model somewhat by specifying a general demand function (which also subsumes the normalized constant marginal cost), writing $P = P(q^1, q^2)$ with $\partial P/\partial q^i < 0$, and retaining the assumption that the product is homogeneous (and hence, implicitly, conventional substitutes).¹⁹ Now the profit function can be written as

$$\pi^i = P(q^1, q^2)\,q^i - r^i q^i, \quad i = 1, 2. \qquad (94.18)$$

17. Assuming homogeneous-product Cournot competition with linear demand is implicitly an assumption of strategic substitutes; see below.
18. Downward-sloping reaction functions result from the assumption of strategic substitutes. Strategic substitutes imply that $\partial^2\pi^i/\partial q^1\,\partial q^2 < 0$ in Equation (94.17); it is easy to verify that it is equal to $-1$ in M's setup. This is a consequence of assuming homogeneous products, linear demand, and Cournot competition.
19. Conventional substitutability or complementarity (that is, the sign of the first derivative of firm $i$'s profit with respect to firm $j$'s output) is independent of the notion of SS and SC. None of the general arguments of this paper depend on the assumption of conventional substitutes or complements. It is normal to assume conventional substitutes in most economic models.
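The derivation of Equation (94.17) and the downward-sloping reaction function can be checked symbolically; the sketch below (ours, not from the chapter) does so with sympy.

```python
# Symbolic check (ours) of the first-order condition (94.17) and the
# downward-sloping reaction function in M's linear-demand setup.
import sympy as sp

q1, q2, a, b, r1 = sp.symbols('q1 q2 a b r1', positive=True)
pi1 = (a - b * (q1 + q2) - r1) * q1            # Equation (94.16), firm 1

foc = sp.expand(sp.diff(pi1, q1))              # a - 2*b*q1 - b*q2 - r1
reaction = sp.solve(sp.Eq(foc, 0), q1)[0]      # (a - b*q2 - r1)/(2*b)
print('FOC:', foc)
print('slope dq1/dq2:', sp.diff(reaction, q2))   # -1/2: downward sloping
print('dq1/dr1:', sp.diff(reaction, r1))         # -1/(2*b): subsidy raises q1
```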
For a variable $X$, let $X_i$ denote the first (partial) derivative with respect to $q^i$, and $X_{ij}$ denote the second (partial) derivative of $X$ with respect to $q^i$ and $q^j$, $i = 1, 2$ and $j = 2, 1$. Let $X_r$ and $X_{ir}$ similarly denote the first and second partial derivatives, respectively, with respect to $r^i$. The first-order condition is

$$\pi^i_i = P + q^i P_i - r^i = 0, \qquad (94.19)$$
and the second-order conditions are assumed to be met.²⁰ Strategic substitutes or complements are determined by the sign of $\pi^i_{ij}$. If $\pi^i_{ij} = P_j + q^i P_{ij} < 0$, then profit is submodular (and strategic substitutes obtain), but if $\pi^i_{ij} > 0$, then profit is supermodular (and strategic complements obtain). Note that $\pi^i_{ij}$ can be of either sign, since no prior restrictions are imposed on the sign of $P_{ij}$, the shape of the demand function.²¹ Now we are ready to explore the main result in M in some detail, by examining the movement of the key variables around equilibrium. Taking the total derivative of Equation (94.19) for the two firms, we have

$$\pi^1_{11}\,dq^1 + \pi^1_{12}\,dq^2 + \pi^1_{1r}\,dr = 0, \qquad (94.20)$$
$$\pi^2_{21}\,dq^1 + \pi^2_{22}\,dq^2 = 0 \qquad (94.21)$$

(note that the partial derivative of firm 2's profit function with respect to firm 1's cost of financing, $\pi^2_{2r}$, is zero, since $r^1$ does not enter firm 2's profit function). Using Cramer's rule and simplifying,

$$dq^1/dr = \pi^2_{22}\,\pi^1_{1r}\big/\big(\pi^1_{12}\pi^2_{21} - \pi^1_{11}\pi^2_{22}\big), \qquad (94.22)$$
$$dq^2/dr = -\pi^2_{21}\,\pi^1_{1r}\big/\big(\pi^1_{12}\pi^2_{21} - \pi^1_{11}\pi^2_{22}\big). \qquad (94.23)$$

20. This requires that $\pi^i_{ii} < 0$ and $\pi^1_{11}\pi^2_{22} - \pi^1_{12}\pi^2_{21} > 0$.
21. Linear demand would mean that $P_{ij} = 0$. Since $P_j < 0$ by the assumption of conventional substitutes, SS always results. $P_{ij} > 0$ will imply demand functions that are strictly convex to the origin, while $P_{ij} < 0$ will imply demand functions that are strictly concave to the origin. Strictly concave demand functions with homogeneous-products Cournot competition will also always imply SS, while sufficiently convex demand functions could imply SC, depending on the degree of convexity.
We see that Equation (94.23)—that is, the rival firm's reaction to an increase in the subsidy for firm 1—depends on $\pi^2_{21}$ and hence on the super- or submodularity of the profit function. To examine the output effects of a change in $r$, we need to examine the signs of Equations (94.22) and (94.23). $dq^1/dr$ will always be less than zero, since $\pi^2_{22} < 0$ by the second-order condition, $\pi^1_{1r} < 0$ (output is concave in $r$), and equilibrium stability requires that the denominator be less than zero. However, $dq^2/dr$ can be greater than or less than zero, depending on whether SS or SC obtains: if SS, then $dq^2/dr$ will be greater than zero, and if SC, then $dq^2/dr$ will be less than zero. The interpretation is as follows: an increase in the subsidy to firm 1 (i.e., a decrease in $r$) will always increase firm 1's output, but it could decrease or increase firm 2's output depending on whether profits are sub- or supermodular in quantities. By implicitly assuming SS, M only gets the result that firm 2 will decrease output when $r$ decreases for firm 1.

Again, as in the case of BL, we can examine how SS and SC affect industry profits. By perturbing profits around the equilibrium resulting from a small change in $r^i$, it can be shown that, under strategic complements, an increase in the loan commitment to firm 1 could decrease the profits of firm 1. This, in turn, could not only imply that firm 1 (and similarly, firm 2, using similar arguments) will have no incentive to take on a loan commitment, but it could actually imply that the firms have the incentive to make (rather than receive) loan commitments.²² Our arguments in relation to both BL and M are summarized in Proposition 2:

Proposition 2. Under submodularity, an increase in leverage (BL) and a decrease in loan commitments (M) will result in positive firm profits; however, under supermodularity, the profit effects of leverage increases and loan commitment decreases will be ambiguous (and perhaps negative).

22. The intuition here is similar to that in the industrial organization literature on how there may be an incentive for a firm to raise its rivals' costs, since the benefits resulting from costs imposed on rivals would more than compensate for any additional costs one brings upon oneself. See, for example, Salop and Scheffman (1983).
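The sign reversal in Equation (94.23) can be illustrated numerically. In the sketch below (our illustration, with assumed demand functions), linear demand gives SS and a sufficiently convex constant-elasticity demand gives SC; the rival's output response to a financing subsidy for firm 1 flips sign accordingly.

```python
# Numerical sketch (ours; demand functions assumed) of the sign reversal in
# (94.23): the rival's response to firm 1's subsidy under SS vs. SC demand.
import numpy as np

def equilibrium(P, r1, r2=1.0, n_iter=400):
    grid = np.linspace(0.005, 5.0, 5000)
    q1 = q2 = 1.0
    for _ in range(n_iter):                      # best-response iteration
        q1 = grid[np.argmax((P(grid + q2) - r1) * grid)]
        q2 = grid[np.argmax((P(grid + q1) - r2) * grid)]
    return q1, q2

demands = {'linear (SS)': lambda Q: 10.0 - Q,
           'convex (SC)': lambda Q: Q ** -1.5}   # inelastic, strictly convex

for name, P in demands.items():
    base = equilibrium(P, r1=1.0)
    subs = equilibrium(P, r1=0.8)                # loan commitment lowers r1
    print(name, ' dq1 = %.3f' % (subs[0] - base[0]),
          ' dq2 = %.3f' % (subs[1] - base[1]))
# The subsidy raises q1 in both cases, but the rival contracts under SS
# (dq2 < 0) and expands under SC (dq2 > 0), reversing M's conclusion.
```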
94.5 Empirical Evidence

In signaling models, the usual practice in the literature has been to examine the average announcement effect, and to pick a model structure that produces results consistent with this effect. However, it has been noted in a variety of contexts that even though the average effect is positive (negative), a sizable fraction of firms in the sample have an announcement effect that is negative (positive). For example, the average positive announcement effect in the case of dividend signaling has been documented in Aharony and Swary (1980), Asquith and Mullins (1983), and Healy and Palepu (1988); however, it has not been unusual for 30–40% of the firms in their samples to exhibit an announcement effect that is negative. Similarly, in the case of leverage increases, Eckbo (1986) finds that leverage increases are accompanied by stock price increases for 40–48% of the firms in the sample, and decreases in the remaining cases; in an analysis of 171 straight debt public offerings, Mikkelson and Partch (1986) document 56% negative effects and the rest positive. Studies that dichotomize firms along the criteria examined here produce results consistent with our arguments. For example, Lang and Litzenberger (1989) use Tobin's Q to dichotomize firms and obtain evidence consistent with our perspective; John and Lang (1991) obtain results
on the announcement effects of dividend initiations using a technology-based dichotomy and find similar results; using a similar classification, John et al. (1992) find dichotomous announcement effects from the adoption of antitakeover amendments. In the area of product market games, Sundaram et al. (1996) examine the impact of R&D spending announcements on the market value of announcing firms and their competitors, using both average effects and effects when the data are parsed by SS and SC firms. They operationalize a measure of SS and SC, called the "competitive strategy measure" (CSM), as follows: CSM is obtained by correlating the ratio of the change in the firm's own profits to the change in the firm's own revenue with the change in rest-of-industry revenues (using data for 40 quarters prior to the announcement quarter); they thus attempt to directly measure the underlying construct, the second derivative of the profit function with respect to own-firm and competitor-firm quantities. For the sample taken as a whole, they find that the average announcement effect of R&D expenditure changes is not significantly different from zero. However, the announcement effect for the firm announcing the R&D expenditure change is negatively and significantly related to CSM, the measure of strategic interaction: firms with passive (SS) competitors have larger positive announcement effects than firms with aggressive (SC) competitors. When the firms in their sample are split by SS and SC types, they find that SS-type firms have a positive and significant announcement effect, while SC-type firms have a negative (although not significant) announcement effect. These results are consistent with the arguments we advance here, which suggest that non-significant averages may actually hide significant effects of opposite signs if the data are parsed using underlying market structure criteria such as SS and SC. Building on the work of Sundaram et al. (1996), a number of recent papers explore the effects of this dichotomy on a number of important firm decisions, such as new product introductions (e.g., Chen et al. 2002), capital structure choices (Lyandres 2006; De Jong et al. 2007), and strategic IPOs (initial public offerings) (e.g., Chod and Lyandres 2008). Kedia (2006) examines the effect of the nature of strategic competition on the incentive features included in an optimally designed top-management compensation structure and finds empirical results in support of the dichotomy studied in this paper.
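A minimal sketch of the CSM construction described above follows (our reading of the verbal description; the data layout, column handling, and function names are assumptions, not Sundaram et al.'s code).

```python
# Hedged sketch of the CSM construction (our reading of the verbal
# description; data layout and names are assumptions, not the authors' code).
import numpy as np
import pandas as pd

def csm(own_profit, own_revenue, rival_revenue):
    """Correlation of d(profit)/d(own revenue) with d(rest-of-industry
    revenue) over the 40 quarters before the announcement quarter."""
    ratio = (own_profit.diff() / own_revenue.diff()).replace(
        [np.inf, -np.inf], np.nan)
    df = pd.concat([ratio, rival_revenue.diff()], axis=1).dropna().tail(40)
    return df.iloc[:, 0].corr(df.iloc[:, 1])

# csm(...) > 0 suggests strategic complements (SC); < 0 suggests substitutes.
```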
94.6 Conclusion

Our analysis demonstrated that the phenomena of supermodularity and submodularity have crucial implications for the generalizability of many theoretical results. Notably, we examined three cases: depending on whether product markets
are super- or submodular, the stock price effects of leverage increases can be positive or negative; similarly, firms may have the incentive to make loan commitments rather than receive them; and depending on whether firm technologies are super- or submodular, the stock price effects of dividend increases can be positive or negative for a broad class of firms. The implications of all these observations are twofold. First, it is important to take a closer look at choices of technology in signaling models, and at choices of market structure, demand functions, and cost functions in product market games in finance. Arbitrary modeling choices that rationalize average effects may miss crucial driving factors. Second, average effects are just that: going a step further to sort out technology types and market structure types, and systematically relating them to the direction (positive, ambiguous, or negative) of announcement effects, is likely to yield more insightful results for both researchers and practitioners. In other words, unless our data sets are pre-analyzed and firms (or industries) classified as super- and submodular, we will obtain average results that may or may not apply to a particular firm or industry. There is also the likelihood that we obtain weak results, since the effects of the two phenomena may be canceling each other out, or even the reverse of expected results, if one or the other phenomenon overwhelms the data or choice of industries. Accordingly, much research remains to be done.
References
Aharony, J. and I. Swary. 1980. "Quarterly dividends and earnings announcements and stockholder returns: evidence from capital markets." Journal of Finance 35, 1–12.
Ambarish, R., K. John, and J. Williams. 1987. "Efficient signaling with dividends and investments." Journal of Finance 42, 321–344.
Asquith, P. and D. Mullins. 1983. "The impact of initiating dividend payments on shareholders' wealth." Journal of Business 56, 77–96.
Bhattacharya, S. 1979. "Imperfect information, dividend policy, and the 'bird in the hand' fallacy." Bell Journal of Economics 10, 259–270.
Brander, J. and T. Lewis. 1986. "Oligopoly and financial structure: the limited liability effect." American Economic Review 76, 956–970.
Brander, J. and B. Spencer. 1989. "Moral hazard and limited liability: implications for the theory of the firm." International Economic Review 30, 833–850.
Bulow, J., J. Geanakoplos, and P. Klemperer. 1985. "Multimarket oligopoly: strategic substitutes and complements." Journal of Political Economy 93, 488–511.
Chen, S., K. W. Ho, K. H. Ik, and C. F. Lee. 2002. "How does strategic competition affect firm values? A study of new product announcements." Financial Management 31 (Summer), 67–84.
Chod, J. and E. Lyandres. 2008. Strategic IPOs and product market competition, Working Paper, University of Rochester.
De Jong, A., T. T. Nguyen, and M. A. Van Dijk. 2007. "Strategic debt: evidence from Bertrand and Cournot competition." ERIM Report Series Reference No. ERS-2007-057-F&A.
Eckbo, B. E. 1986. "Valuation effects of corporate debt offerings." Journal of Financial Economics 15, 119–152.
Fudenberg, D. and J. Tirole. 1984. "The fat-cat effect, the puppy dog ploy, and the lean and hungry look." American Economic Review Papers and Proceedings 74, 361–366.
Healy, P. M. and K. G. Palepu. 1988. "Earnings information conveyed by dividend initiations and omissions." Journal of Financial Economics 21, 149–176.
John, K. and L. Lang. 1991. "Insider trading around dividend announcements: theory and evidence." Journal of Finance 46, 1361–1390.
John, K., L. Lang, and F. Shih. 1992. Antitakeover measures and insider trading: theory and evidence, Working Paper, NYU Stern Graduate School of Business.
John, K. and B. Mishra. 1990. "Information content of insider trading around corporate announcements: the case of capital expenditures." Journal of Finance 45, 835–855.
John, K. and J. Williams. 1985. "Dividends, dilution and taxes: a signaling equilibrium." Journal of Finance 40, 1053–1070.
Kedia, S. 2006. "Estimating product market competition: methodology and applications." Journal of Banking and Finance 30(3), 875–894.
Lang, L. and R. Litzenberger. 1989. "Dividend announcements: cash flow signaling versus free cash flow hypothesis?" Journal of Financial Economics 24, 181–192.
Leland, H. and D. Pyle. 1977. "Informational asymmetries, financial structure and financial intermediation." Journal of Finance 32, 371–381.
Lyandres, E. 2006. "Capital structure and interaction among firms in output markets: theory and evidence." The Journal of Business 79(5), 2381–2422.
Maksimovic, V. 1990. "Product market imperfections and loan commitments." Journal of Finance 45, 1641–1653.
Masulis, R. 1983. "The impact of capital structure change on firm value: some estimates." Journal of Finance 38, 107–126.
Mikkelson, W. R. and M. M. Partch. 1986. "Valuation effects of security offerings and the issuance process." Journal of Financial Economics 15, 31–60.
Milgrom, P. and J. Roberts. 1990a. "The economics of modern manufacturing." American Economic Review 80, 511–528.
Milgrom, P. and J. Roberts. 1990b. "Rationalizability, learning, and equilibrium in games with strategic complementarities." Econometrica 58, 1255–1278.
Miller, M. and K. Rock. 1985. "Dividend policy under asymmetric information." Journal of Finance 40, 1031–1051.
Riley, J. 1979. "Informational equilibrium." Econometrica 47, 331–359.
Ross, S. 1977. "The determination of financial structure: the incentive signaling approach." Bell Journal of Economics 8, 23–40.
Rotemberg, J. and D. Scharfstein. 1990. "Shareholder value maximization and product market competition." Review of Financial Studies 3, 367–393.
Salop, S. and D. Scheffman. 1983. "Raising rivals' costs." American Economic Review 73, 267–271.
Shih, F. and D. Suk. 1992. Information content of insider trading and insider ownership around special dividend announcements, Working Paper, Rider College.
Sundaram, A., T. John, and K. John. 1996. "An empirical analysis of strategic competition and firm values: the case of R&D competition." Journal of Financial Economics 40, 459–486.
Tirole, J. 1988. Theory of industrial organization, MIT Press, Cambridge, MA.
Topkis, D. 1978. "Minimizing a submodular function on a lattice." Operations Research 26, 305–321.
Topkis, D. 1979. "Equilibrium points in non-zero sum n-person submodular games." SIAM Journal of Control and Optimization 17, 773–787.
Chapter 95
Estimation of Short- and Long-Term VaR for Long-Memory Stochastic Volatility Models Hwai-Chung Ho and Fang-I Liu
Abstract The phenomenon of long-memory stochastic volatility (LMSV) has been extensively documented for speculative returns. This research investigates the effect of LMSV on estimating the value at risk (VaR), or the quantile, of returns. The proposed model allows the return's volatility component to be short- or long-memory. We derive various types of limit theorems that can be used to construct confidence intervals of VaR for both short-term and long-term returns. For the latter case, the results are in particular of interest to financial institutions with exposure to long-term liabilities, such as pension funds and life insurance companies, which need a quantitative methodology to control market risk over longer horizons.

Keywords Long-memory stochastic volatility model · Stochastic volatility model · Value-at-risk · Quantile

H.-C. Ho (✉), Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan, ROC; e-mail: [email protected]
H.-C. Ho and F.-I. Liu, Department of Finance, National Taiwan University, Taipei, Taiwan, ROC
95.1 Introduction

The quantitative analysis of financial risks has always been an issue of central importance to portfolio management. There are several types of risks in financial markets, such as credit, liquidity, foreign exchange, and market risks. Of these, market risk is probably the most difficult to hedge and has therefore received a great deal of attention from both researchers and practitioners. The value-at-risk (VaR) quantity has been used widely to measure market risk; it is the maximum potential change in value of a portfolio of financial instruments with a given probability over a certain horizon. From a statistical perspective, VaR is equivalent to a quantile of the distribution of returns. To estimate the VaR of a portfolio or a market index, it is necessary to find a proper model for the returns. To model the equity return, one of the major issues pertains to the property of volatility clustering, as observed especially in long series of returns and frequently traded data. Stationary models proposed to describe volatility clustering include the ARCH (or GARCH) family (Engle 1982; Bollerslev 1986), exponential GARCH (EGARCH; Nelson 1991), and the stochastic volatility (SV) model (e.g., Taylor 1986; Harvey et al. 1994). As their main feature, these models include a nonlinear, positive volatility process in the model specification to govern the volatility dynamics of the underlying returns. Using the square or absolute value of returns as a proxy of the volatility process, many empirical studies (e.g., Ding et al. 1993) find that the volatility process is itself not only correlated but exhibits highly persistent autocorrelations, which ARCH processes cannot produce. The decay rate of the autocorrelations is so slow that it can be better extrapolated by a positive sequence that is not summable. A time series with autocorrelations of such a decay rate is usually called long memory, and its autocovariance function is represented as

$$\gamma(j) = j^{2d-1} L(j) \sim c\,j^{2d-1}, \quad j \to \infty, \qquad (95.1)$$

for some $0 < d < 1/2$, with a positive function $L$ that is slowly varying at infinity (i.e., $\lim_{x\to\infty} L(\tilde a x)/L(x) = 1$ for all $\tilde a > 0$). Lobato and Savin (1998) examine the daily returns of the S&P 500 index for the period July 1962 to December 1994 and find strong evidence of persistent correlation in the square or absolute value of returns. In addition to index returns, the long-memory phenomenon of volatility has been documented in individual stocks (Ray and Tsay 2000), as well as in some high-frequency data, such as minute-by-minute stock returns (Ding and Granger 1996) and foreign exchange rate returns (Bollerslev and Wright 2000). To provide a better fit for data with strong volatility dependence, Breidt et al. (1998) suggest the long-memory stochastic volatility (LMSV) model, which is a natural extension of the traditional SV model. The LMSV process is a stationary time series formed by martingale differences that
contain a latent volatility process exhibiting long memory. The main message delivered by the LMSV model is that the magnitude of the shocks to the returns dissipates at a much slower rate than the short-memory conditional variance models like ARCH or GARCH can depict. We introduce the specific form of the LMSV model in Sect. 95.2. The presence of long-memory dependence affects the variance of the VaR estimates and must be taken into consideration in order to obtain adequate assessments of their variation. In the remainder of this chapter, we focus on risk evaluation under long-memory dependence in the volatility process. To assess the uncertainty of VaR estimates, we construct a confidence interval of daily VaR when the return follows the LMSV process. We also derive the explicit formula for long-horizon VaR and its confidence interval. The issue of long-run risk is important but scarcely mentioned in the existing literature. Yet financial institutions with long-term liabilities, such as pension funds and life insurance companies, certainly need a proper risk management methodology that controls for risk over longer horizons.
95.2 Long Memory in Stochastic Volatility

The ARCH-type (including GARCH family) process and the stochastic volatility (SV) process are two major classes of models proposed for asset returns that incorporate the stylized fact of volatility clustering. Both have been studied widely in the literature and enjoy equal success. In this article, we concentrate only on the SV model, because it provides the modeling flexibility to allow the volatility process to be either short-memory or long-memory. Let $\{X_t\}$ be the market value process of an asset and $r_t = \log(X_t/X_{t-1})$ be the log returns, with mean $\mu$ and distribution function $F_r(\cdot)$. The SV model then is defined by

$$r_t = \mu + e_t, \quad e_t = v_t u_t, \quad v_t = \bar\sigma \exp(Z_t/2), \qquad (95.2)$$

where $\bar\sigma > 0$, $\{e_t\}$ are zero-mean errors with distribution function $F_e(\cdot)$, and $\{u_t\}$ is an i.i.d. sequence with $Eu_t = 0$ and $Eu_t^2 = 1$ that is independent of the latent volatility component $\{Z_t\}$. Furthermore, $\{Z_t\}$ is a linear process defined as

$$Z_t = \sum_{i=0}^{\infty} a_i\,\varepsilon_{t-i}, \qquad (95.3)$$

where $\{\varepsilon_t\}$ indicates i.i.d. random variables with mean 0, finite variance $\sigma_\varepsilon^2$, and distribution function $F_Z(\cdot)$. The traditional SV model (Taylor 1986; Harvey et al. 1994), which is the short-memory version of Equation (95.2), requires that the coefficient sequence $\{a_i\}$ be summable, which implies that the autocovariance sequence is also summable. The LMSV model introduced by Breidt et al. (1998) also follows the structure specified in Equation (95.2) but uses innovation coefficients $\{a_i\}$ of $\{Z_t\}$ of the form $a_i \sim C i^{-\beta}$ for some $\beta \in (1/2, 1)$ and a positive constant $C$ (here $g_n \sim h_n$ signifies $\lim_{n\to\infty} g_n/h_n = 1$). We set $d = 1 - \beta$. Because $a_i \sim C i^{-\beta}$, which is clearly not summable, the $j$th autocovariance $\gamma_Z(j)$ of $\{Z_t\}$ can be expressed as $\gamma_Z(j) = \sigma_\varepsilon^2 \sum_{i=0}^{\infty} a_i a_{i+j}$, which has the same asymptotic form $c\,j^{2d-1}$ as stated in Equation (95.1).

Two important models of the long-memory linear process in Equation (95.3) are the fractional autoregressive integrated moving average (FARIMA) process of Adenstedt (1974), Granger and Joyeux (1980), and Hosking (1981), and the fractional Gaussian process of Mandelbrot and van Ness (1968). The FARIMA$(p, d, q)$ process is formulated as the unique stationary solution of the difference equation

$$(1-B)^d\,\phi(B)\,Z_t = \theta(B)\,\varepsilon_t, \quad 0 < d < 1/2, \qquad (95.4)$$

where $B$ is the shift operator, and $\phi(\cdot)$ and $\theta(\cdot)$ are the characteristic polynomials of orders $p$ and $q$, respectively. When $d = 0$, Equation (95.4) produces nothing but the standard ARMA$(p, q)$ process. We do not consider the case in which $-1/2 < d < 0$, because it represents the class of antipersistent time series that usually result from oversmoothing and are of no interest for the issues we are addressing. Let $\gamma(\cdot)$ and $\rho(\cdot)$ denote the autocovariance and autocorrelation functions of a FARIMA$(0, d, 0)$ process, respectively. Then it is known that

$$\gamma(0) = \sigma_\varepsilon^2\,\frac{\Gamma(1-2d)}{\Gamma^2(1-d)} \equiv \sigma_Z^2, \qquad (95.5)$$

$$\gamma(j) = \sigma_\varepsilon^2\,\frac{\Gamma(j+d)\,\Gamma(1-2d)}{\Gamma(j-d+1)\,\Gamma(d)\,\Gamma(1-d)} \sim c_1\,|j|^{2d-1} \ \text{ as } j \to \infty, \qquad (95.6)$$

where

$$c_1 = \sigma_\varepsilon^2\,\frac{\Gamma(1-2d)}{\Gamma(d)\,\Gamma(1-d)} = \frac{\Gamma(1-d)}{\Gamma(d)}\,\sigma_Z^2,$$

and

$$\rho(j) = \frac{\Gamma(j+d)\,\Gamma(1-d)}{\Gamma(j-d+1)\,\Gamma(d)} = \prod_{0<k\le j}\frac{k-1+d}{k-d} \sim \frac{\Gamma(1-d)}{\Gamma(d)}\,|j|^{2d-1} \ \text{ as } j \to \infty \qquad (95.7)$$

(e.g., Brockwell and Davis 1991).
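The Gamma-function formulas (95.5)–(95.7) are easy to evaluate in log-space to avoid overflow at large lags; a small numerical companion (ours, not part of the chapter):

```python
# Numerical companion (ours) to (95.5)-(95.7): evaluate the FARIMA(0, d, 0)
# autocovariances in log-space and check the power-law tail c1 * j**(2d - 1).
import numpy as np
from scipy.special import gammaln

def farima_acvf(d, sigma_eps=1.0, max_lag=1000):
    j = np.arange(max_lag + 1)
    log_g = (2 * np.log(sigma_eps) + gammaln(j + d) + gammaln(1 - 2 * d)
             - gammaln(j - d + 1) - gammaln(d) - gammaln(1 - d))
    return np.exp(log_g)          # Equation (95.6); j = 0 reduces to (95.5)

d = 0.3
g = farima_acvf(d)
c1 = g[0] * np.exp(gammaln(1 - d) - gammaln(d))   # c1 = sigma_Z^2 G(1-d)/G(d)
print(g[1000] / (c1 * 1000.0 ** (2 * d - 1)))     # -> close to 1
```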
Fractional Gaussian noise (FGN) is also a stationary process, with mean 0 and variance $\sigma^2_{FGN}$. The autocovariance function of FGN is given by

$$\gamma(j) = \frac{\sigma^2_{FGN}}{2}\Big[|j+1|^{2H} - 2|j|^{2H} + |j-1|^{2H}\Big]. \qquad (95.8)$$

If $H \ne 1/2$, then

$$\gamma(j) \sim \sigma^2_{FGN}\,H(2H-1)\,|j|^{2H-2} \qquad (95.9)$$

as $j \to \infty$. When $1/2 < H < 1$, Equation (95.9) tends to 0 so slowly that $\sum_{j=1}^{\infty}\gamma(j)$ diverges. Comparing Equations (95.6) and (95.9), we note that the relationship between the two exponents is $2d-1 = 2H-2$; that is,

$$H = d + 1/2. \qquad (95.10)$$

The LMSV model described in Equations (95.2) and (95.3) thus exhibits the desirable property that the sequence itself is white noise while the squared series displays long memory, which coincides with the stylized fact observed in many speculative returns (Ding et al. 1993). Another useful property of the SV model is that, if the mean $\mu$ is known, taking the logarithm of the square of the series, centered by its mean, yields the following simple additive model:

$$y_t = \log(r_t - \mu)^2 = \big[\log\bar\sigma^2 + E\log u_t^2\big] + Z_t + \big[\log u_t^2 - E\log u_t^2\big] \equiv \mu_y + Z_t + \varepsilon_t, \qquad (95.11)$$

where $\{\varepsilon_t\}$ is i.i.d. with mean zero and variance $\sigma^2_\varepsilon$. If $u_t$ is standard normal, then $E\log u_t^2 = -1.27$ and $\sigma^2_\varepsilon = \pi^2/2$ (Wishart 1947). From Equation (95.11), it is evident that the autocovariances of $\{y_t\}$ are identical to those of $\{Z_t\}$, except at lag zero, because $\{\varepsilon_t\}$ is an i.i.d. sequence independent of $\{Z_t\}$. Therefore, though $\{Z_t\}$ is not observable, finding empirical evidence of long memory in $\{Z_t\}$ can be achieved by performing statistical tests on a volatility proxy, namely, the logarithm of the squared series.

95.3 VaR Calculation

We can define VaR as the maximal loss of an asset or a portfolio during a given period for a given probability. Because VaR is simply a particular quantile of future portfolio values, conditional on current information, the sample quantile of the log return is a natural way to deal with the problem (see Dowd 2001). We choose the quantile estimator based on the absolute deviation loss, defined as the solution of the following minimization problem (e.g., Koenker and Bassett 1978; Koenker 2005):

$$\hat\lambda_0(\tau) = \inf\Big\{\lambda\in\mathbb R : \tau\sum_{r_t\ge\lambda}|r_t-\lambda| + (1-\tau)\sum_{r_t<\lambda}|r_t-\lambda|\Big\} = \inf\Big\{\lambda\in\mathbb R : \sum_{t=1}^{n}\rho_\tau(r_t-\lambda)\Big\}, \qquad (95.12)$$

where $\hat\lambda_0(\tau)$ is the $\tau$th sample quantile, $\rho_\tau(x)$ (usually called the check function) is given by $x\big(\tau - I(x<0)\big)$, and $I(x<0)$ is the indicator function of $\{x<0\}$. Define $\lambda_0(\tau)$ as the $\tau$th quantile of $F_r(\cdot)$. When the returns are i.i.d. random variables, we know that

$$\sqrt n\,\big(\hat\lambda_0(\tau)-\lambda_0(\tau)\big) \xrightarrow{d} N\Big(0,\ \sigma^2(\tau)\big/\big[F_e'(\lambda(\tau))\big]^2\Big), \qquad (95.13)$$

with $\sigma^2(\tau) = \tau(1-\tau)$ (Mosteller 1946). It is a known reality in empirical finance that financial series exhibit data dependence, which calls for a more general approach to VaR estimation that caters to the dependence. For data with weak dependence, the asymptotic normality in Equation (95.13) also holds, with $\sigma^2(\tau) = \tau(1-\tau) + 2\sum_{k=1}^{\infty}\mathrm{cov}\big(I(Y_t < \lambda_0(\tau)),\, I(Y_{t+k} < \lambda_0(\tau))\big)$ (e.g., Yoshihara 1995). The dependence structures covered by these techniques are very wide, including ARMA, ARCH, GARCH, and SV, as long as they satisfy $\alpha$-mixing conditions. Despite extensive research into sample quantiles for stationary sequences, corresponding work on the important class of nonlinear time series remains largely unaccomplished. In this section, we survey some recent research by Ho and Liu (2008) that attempts to fill this gap by deriving the asymptotic distributions of the sample quantile for the LMSV model. In detail, we first modify the objective function in Equation (95.12) as follows.

95.3.1 Daily VaR

$$Z_n(\delta) = \frac{n}{n^{2d+1}}\sum_{t=1}^{n}\Big[\rho_\tau\big(e_t - \lambda(\tau) - \delta\,n^{-(\frac12-d)}\big) - \rho_\tau\big(e_t - \lambda(\tau)\big)\Big], \qquad (95.14)$$
: The minimum of Zn .ı/ is 1 n 2 d O0 ./ 0 ./ . Relying on the basic corollary provided by Hjort and Pollard (1993), which offers a general method for proving the asymptotic normality of estimators defined by the minimization of convex functions under reasonable regularity conditions, if Zn .ı/ converges in distribution to some function Z .ı/ that has a unique minimum, then the minimum of Zn .ı/ also converges to that of Z .ı/ in distribution. Therefore, we begin by deriving the limiting distribution of Zn .ı/ ; from which it follows that
d 1 n 2 d O0 ./ ./ ! arg min .Z .ı// :
(95.15)
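A simulation sketch (ours; the parameter values are assumed) shows the objects involved: a truncated FARIMA(0, d, 0) volatility process, LMSV returns per (95.2), and the sample-quantile daily VaR of (95.12); it also checks the constants −1.27 and π²/2 appearing in (95.11).

```python
# Simulation sketch (ours; parameter values assumed) of the objects above:
# truncated FARIMA(0, d, 0) volatility, LMSV returns (95.2), the sample-
# quantile daily VaR of (95.12), and the constants -1.27 and pi^2/2 in (95.11).
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(0)
n, d, sigma_bar, tau, K = 5000, 0.35, 0.01, 0.05, 2000

k = np.arange(K)                       # MA coefficients of (1 - B)^(-d)
psi = np.exp(gammaln(k + d) - gammaln(d) - gammaln(k + 1))
Z = np.convolve(rng.standard_normal(n + K - 1), psi, mode='valid')
u = rng.standard_normal(n)
r = sigma_bar * np.exp(Z / 2.0) * u    # LMSV returns with mu = 0

print('5% daily VaR estimate:', np.quantile(r, tau))   # minimizer of (95.12)
print('E log u^2 ~', np.log(u ** 2).mean(), ' Var ~', np.log(u ** 2).var())
```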
Assumption 1. The distribution function $F_e$ is continuous and has continuous first and second derivatives $F_e'$ and $F_e''$ at $\lambda(\tau)$.
Theorem 1. Let $\{r_t\}$ be a LMSV process with mean $\mu$, and define $Z_n(\delta)$ as in Equation (95.14). If Assumption 1 holds, then for any $\delta$,

$$Z_n(\delta) \xrightarrow{d} Z(\delta), \quad \text{where} \quad Z(\delta) = \frac12\,F_e'(\lambda(\tau))\,\delta^2 - \delta W,$$

with $W$ a normal random variable with mean 0 and variance

$$c\,\Big\{2\,E\Big[F_Z^{(1)}\big(\ln\lambda(\tau)^2 - \ln u^2\big)\,I(u > 0)\Big]\Big\}^2\,\frac{1}{2d+1}\cdot\frac{1}{d}.$$
Theorem 2. According to the results of Theorem 1, if Z .ı/ has a unique minimum, then
1 n 2 d O0 ./ 0 ./ d
!
1 W N Fe0 . .//
2
0;
!
./ 1 1 c : 4 d 2d C 1
./2 F2 GN 0; 4
! : (95.18)
Even if the LMSV sequence itself is white noise, the sample quantile converges at the same rate as the sample mean of a long-range dependence process. It also stands to reason that the memory will be induced once it is transformed by the check function. Therefore, if we were blind to the longmemory persistence of asset volatilities but treated returns as uncorrelated or weakly dependent series, the confidence intervals consequently constructed for 0 ./ would be inaccurate and misleading. For the proofs of Theorems 1 and 2, see Ho and Liu (2008).
95.3.2 Long-Term VaR Although VaR is an easy and intuitive concept, its measurement represents a very challenging statistical problem, especially for long time horizons. Of the different methodologies for calculating VaR, few are concerned with long-term issues. We project the asset return rt using the SV model and attempt to derive a general formula for the long-term VaR. We define the VaR of a long position over the time horizon T with probability ˛ as ˛DP
T X
! rt < V˛ .T / :
(95.19)
t D1
Let 2 denote the variance of the returns rt . Then, Equation (95.19) implies PT
t D1 rt
˛DP
T
p T
! V˛ .T / T p < : T
T d p ! N .0; 1/ ; T
t D1 rt
!
0;
d O0 ./ 0 ./ ! N
PT
1 n 2 d O0 ./ 0 ./ d
n
1 2 d
According to the central limit theorem, as T ! 1; the sum of the sample return under the SV model approaches as stan(95.16) dard normal distribution, which is
That is, when Zt FARIMA.0; d; 0/ ;
!N
and when Zt FGN,
1 ./2 Z2 .1 d / 1 ; (95.17) 4 .d / d 2d C 1
(95.20)
where 2 D E .rt /2 . The central limit theorem of the SV model has been proven by Ho et al. (2008). We extend the idea to the
derivation of the $T$-horizon VaR. That is, the central limit theorem implies that

$$\frac{V_\alpha(T) - T\mu}{\sigma\sqrt T} \to \Phi^{-1}(\alpha), \quad \text{as } T \to \infty. \qquad (95.21)$$

Therefore, a natural estimate of $V_\alpha(T)$, denoted $\hat V_\alpha(T)$, would be

$$\hat V_\alpha(T) = T\mu + \hat\sigma\sqrt T\,\Phi^{-1}(\alpha), \qquad (95.22)$$

obtained by solving

$$\frac{V_\alpha(T) - T\mu}{\sigma\sqrt T} = \Phi^{-1}(\alpha),$$

where $\sigma$ gets replaced by the sample standard deviation $\hat\sigma$, and $\hat\sigma = \big(\sum_{t=1}^{T}(r_t - \mu)^2/T\big)^{1/2}$. To obtain the true value of $V_\alpha(T)$ and its confidence interval, we must address the asymptotic behavior of $\hat\sigma$. If the volatility process $\{Z_t\}$ (cf. Equations (95.2) and (95.3)) is short-memory, such that the coefficients $\{a_i\}$ in Equation (95.3) are summable, the estimator $\hat\sigma^2$ obeys the central limit theorem (Ho 2006):

$$\sqrt T\,(\hat\sigma^2 - \sigma^2) \xrightarrow{d} N(0, g^2), \qquad (95.23)$$

where

$$g^2 = \lim_{T\to\infty} \mathrm{Var}\Big(T^{-\frac12}\sum_{t=1}^{T} v_t^2\,(u_t^2 - 1)\Big) + \lim_{T\to\infty} \mathrm{Var}\Big(T^{-\frac12}\sum_{t=1}^{T} (v_t^2 - \sigma^2)\Big).$$

Furthermore, for the long-memory SV process, this derivation of the root-$T$ convergence of $\hat\sigma^2$ does not apply. Ho (2006) also shows that $\hat\sigma^2$ is still asymptotically normal but converges to the true variance at a rate slower than $\sqrt T$. Specifically, as $T \to \infty$,

$$T^{\frac12-d}(\hat\sigma^2 - \sigma^2) = T^{\frac12-d}\Big(\sum_{t=1}^{T} v_t^2\big/T - \sigma^2\Big) + o_p(1) \xrightarrow{d} N(0, h^2), \qquad (95.24)$$
where $h^2$ depends on the distribution of $\{Z_t\}$, $d$, and $C$. We assume $\{u_t\}$ in Equation (95.2) is a sequence of i.i.d. standard normal random variables. Using the normality of $r_t$ conditional on $\mathcal F_T = \{Z_1, Z_2, \dots, Z_T\}$ and the fact that $T^{\frac12-d}(\sigma_T^2 - \sigma^2) = O_P(1)$, where $\sigma_T^2 = \sum_{t=1}^{T} v_t^2/T$, we obtain

$$V_\alpha(T) = T\mu + \sigma\sqrt T\,\Phi^{-1}(\alpha) + o\big(T^{\frac12-d}\big). \qquad (95.25)$$

Equation (95.25) therefore is a general formula for the true long-horizon VaR, applicable for both the short-memory and long-memory cases of the SV model. When $d = 0$, it represents the true long-horizon VaR for the short-memory SV process. The $100(1-\beta)\%$ confidence interval for $V_\alpha(T)$, for both short memory and long memory, is as described in Theorem 3. For the proofs of Equation (95.25) and Theorem 3, please refer to Ho et al. (2008).

Theorem 3. Assume that the returns $\{r_t\}$ follow the SV model specified in Equation (95.2), with $\{u_t\}$ a sequence of i.i.d. standard normal random variables, and $Eu_1^8 + Ev_1^4 < \infty$. Suppose the mean $\mu$ of the return process $r_t$ is known, and the variance $\sigma^2$ of $r_t$ is bounded below by a known positive constant $\delta$, such that $0 < \delta < \sigma^2$.

(i) If the volatility process $\{Z_t\}$ is long-memory, with memory parameter $0 < d < 1/2$, then

$$\frac{V_\alpha(T) - \hat V_\alpha(T)}{\Phi^{-1}(\alpha)\,T^{d}} \xrightarrow{d} N\!\left(0,\ \frac{h^2}{4\sigma^2}\right). \qquad (95.26)$$

In turn, the $100(1-\beta)\%$ confidence interval for the $T$-horizon VaR, $V_\alpha(T)$, is

$$\hat V_\alpha(T) + T^{d}\,\Phi^{-1}(\alpha)\,L \le V_\alpha(T) \le \hat V_\alpha(T) + T^{d}\,\Phi^{-1}(\alpha)\,U, \qquad (95.27)$$

where $L$ and $U$ are the $100(\beta/2)\%$ and the $100(1-\beta/2)\%$ quantiles of $N(0, h^2/4\sigma^2)$, respectively. In turn, $N(0, h^2/4\sigma^2)$ is the asymptotic distribution of $T^{\frac12-d}(\hat\sigma - \sigma)$ for the long-memory case.

(ii) If the volatility process $\{Z_t\}$ is short-memory, such that the coefficients $a_i$ in Equation (95.3) are summable, then

$$\frac{V_\alpha(T) - \hat V_\alpha(T)}{\Phi^{-1}(\alpha)} \xrightarrow{d} N\!\left(0,\ \frac{g^2}{4\sigma^2}\right), \qquad (95.28)$$

and $g^2$ depends on the second moment and autocovariance of $v_t^2$. The confidence interval in Equation (95.27) for part (i) holds for the short-memory case, modified by setting $d$ equal to 0. In addition, $L$ and $U$ are revised to the $100(\beta/2)\%$ and the $100(1-\beta/2)\%$ quantiles of $N(0, g^2/4\sigma^2)$, respectively. Finally, $N(0, g^2/4\sigma^2)$ is the asymptotic distribution of $T^{\frac12}(\hat\sigma - \sigma)$ for the short-memory case.

95.3.3 Implementation
To use Equation (95.16) to evaluate the uncertainty of daily VaR estimates, we must estimate the usually unknown normalization rate $d$. This estimation poses a challenging
problem, because $d$ is determined by the decay rate $\beta$ of the innovation coefficients of the unobservable series $Z_t$. Because we do not know the mean of $r_t$ in Equation (95.2) when investigating the daily VaR, the results that Deo and Hurvich (2001) offer are not applicable. To obtain the normalization rate in Equation (95.16) without estimating $d$ directly, we employ the sampling window method introduced by Hall et al. (1998) and Nordman and Lahiri (2005) to estimate the normalizing constant $n^{\frac12-d}\big\{c\,\frac{1}{2d+1}\,\frac{1}{d}\big\}^{1/2}$. Whereas Nordman and Lahiri (2005) deal only with linear processes, we can show that the sampling window method applied to $\{r_t^2\}$ produces a consistent estimate of the normalizing constant needed for Equation (95.16) to hold (see Theorem 7.1; Ho and Liu 2008). To evaluate the confidence interval of the $T$-horizon VaR, the sampling window method of Hall et al. (1998) is still applicable; it captures the underlying dependence structure and the limiting variance $h^2/4\sigma^2$.
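A minimal implementation sketch of the point estimate (95.22) with a confidence interval in the spirit of (95.27) follows. The overlapping-block resampling below is our crude stand-in for the Hall et al. (1998) sampling window method, not the authors' exact procedure; the inputs `r` (returns) and `mu` (known mean) are assumptions.

```python
# Implementation sketch of (95.22) with a confidence interval in the spirit of
# (95.27). The overlapping-block resampling is our crude stand-in for the
# sampling window method of Hall et al. (1998), not the authors' procedure;
# `r` (returns) and `mu` (known mean) are assumed inputs.
import numpy as np
from scipy.stats import norm

def long_horizon_var(r, mu, T, alpha=0.05, beta=0.10, block=250, n_boot=2000):
    rng = np.random.default_rng(1)
    sig = np.sqrt(np.mean((r - mu) ** 2))
    point = T * mu + sig * np.sqrt(T) * norm.ppf(alpha)   # Equation (95.22)
    starts = rng.integers(0, len(r) - block, n_boot)
    sig_b = np.array([np.sqrt(np.mean((r[s:s + block] - mu) ** 2))
                      for s in starts])                    # block sigma-hats
    dev = np.quantile(sig_b - sig, [beta / 2, 1 - beta / 2])
    ends = point + np.sqrt(T) * norm.ppf(alpha) * dev      # cf. (95.27)
    return point, (float(ends.min()), float(ends.max()))

# Example: point, ci = long_horizon_var(r, mu=0.0, T=250)
```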
95.4 Conclusions

LMSV models hold the promise of improved long-run volatility forecasts and more accurate pricing of long-term contracts. The implications of this study therefore extend to the context of risk management. In light of the evidence of long-memory volatility in the S&P 500 index, the risk of VaR estimates will be underestimated if the persistent autocorrelations of the index's volatility are treated as short memory. For practitioners, the estimation of long-term VaR is more difficult than that of short-term VaR, because forecasts of the volatility process become distorted over longer horizons. We derive explicit equations for the long-horizon VaR and its confidence intervals and thus provide general forms that can address both short-memory and long-memory stochastic volatility effects. Using an analytic equation to calculate the VaR and its confidence intervals can provide insurers with a better assessment of the risk to which their long-duration insurance products are subject.

Acknowledgment The research is supported by NSC grant NSC 97-2118-M-001-002-MY3.
References
Adenstedt, R. K. 1974. "On large-sample estimation for the mean of a stationary sequence." Annals of Statistics 2, 1095–1107.
Bollerslev, T. 1986. "Generalized autoregressive conditional heteroscedasticity." Journal of Econometrics 31, 307–327.
Bollerslev, T. and J. H. Wright. 2000. "Semiparametric estimation of long-memory volatility dependencies: the role of high-frequency data." Journal of Econometrics 98, 81–106.
Breidt, F. J., N. Crato, and P. De Lima. 1998. "The detection and estimation of long memory in stochastic volatility." Journal of Econometrics 83, 325–348.
Brockwell, P. J. and R. A. Davis. 1991. Time series: theory and methods, Springer, New York.
Deo, R. S. and C. M. Hurvich. 2001. "On the log periodogram regression estimator of the memory parameter in long memory stochastic volatility models." Econometric Theory 17, 686–710.
Ding, Z. and C. Granger. 1996. "Modeling volatility persistence of speculative returns: a new approach." Journal of Econometrics 73, 185–215.
Ding, Z., C. W. J. Granger, and R. F. Engle. 1993. "A long memory property of stock market returns and a new model." Journal of Empirical Finance 1, 83–106.
Dowd, K. 2001. "Estimating VaR with order statistics." Journal of Derivatives 8, 23–70.
Engle, R. F. 1982. "Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation." Econometrica 50, 987–1007.
Granger, C. W. J. and R. Joyeux. 1980. "An introduction to long-memory time series models and fractional differencing." Journal of Time Series Analysis 1, 15–29.
Hall, P., B.-Y. Jing, and S. N. Lahiri. 1998. "On the sampling window method for long-range dependent data." Statistica Sinica 8, 1189–1204.
Harvey, A. C., E. Ruiz, and N. Shephard. 1994. "Multivariate stochastic variance models." Review of Economic Studies 61, 247–264.
Hjort, N. L. and D. Pollard. 1993. Asymptotics for minimisers of convex processes, Statistical Research Report, University of Oslo.
Ho, H. C. 2006. "Estimation errors of the Sharpe ratio for long-memory stochastic volatility models," in Time series and related topics, IMS Lecture Notes–Monograph Series 52, pp. 165–172.
Ho, H. C. and T. Hsing. 1996. "On the asymptotic expansion of the empirical process of long-memory moving averages." The Annals of Statistics 24, 922–1024.
Ho, H. C. and T. Hsing. 1997. "Limit theorems for functional of moving averages." The Annals of Probability 25, 1636–1669.
Ho, H. C. and F. I. Liu. 2008. Sample quantile analysis on long-memory stochastic volatility models and its applications in estimating VaR, Working paper.
Ho, H. C., S. S. Yang, and F. I. Liu. 2008. A stochastic volatility model for reserving long-duration equity-linked insurance: long-memory, Working paper.
Hosking, J. R. M. 1981. "Fractional differencing." Biometrika 68, 165–176.
Koenker, R. 2005. Quantile regression, Cambridge University Press, New York.
Koenker, R. and G. Bassett. 1978. "Regression quantiles." Econometrica 46, 33–50.
Lobato, I. N. and N. E. Savin. 1998. "Real and spurious long-memory properties of stock market data." Journal of Business & Economic Statistics 16, 261–268.
Mandelbrot, B. and J. W. van Ness. 1968. "Fractional Brownian motions, fractional noises and applications." SIAM Review 10, 422–437.
Mosteller, F. 1946. "On some useful 'inefficient' statistics." Annals of Mathematical Statistics 17, 377–408.
Nelson, D. B. 1991. "Conditional heteroscedasticity in asset returns: a new approach." Econometrica 59, 347–370.
Nordman, D. J. and S. N. Lahiri. 2005. "Validity of the sampling window method for long-range dependent linear processes." Econometric Theory 21, 1087–1111.
Ray, B. K. and R. S. Tsay. 2000. "Long range dependence in daily stock volatilities." Journal of Business & Economic Statistics 18(2), 254–262.
Taylor, S. 1986. Modelling financial time series, Wiley, New York.
Wishart, J. 1947. "The cumulants of the Z and of the logarithmic χ² and t distributions." Biometrika 34, 170–178.
Yoshihara, K. 1995. "The Bahadur representation of sample quantiles for sequences of strongly mixing random variables." Statistics and Probability Letters 24, 299–304.
Chapter 96
Time Series Modeling and Forecasting of the Volatilities of Asset Returns Tze Leung Lai and Haipeng Xing
Abstract Dynamic modeling of asset returns and their volatilities is a central topic in quantitative finance. Herein we review basic statistical models and methods for the analysis and forecasting of volatilities. We also survey several recent developments in regime-switching, change-point and multivariate volatility models.

Keywords Conditional heteroskedasticity · Stochastic volatility · Regime-switching · Change-point

T. L. Lai, Department of Statistics, Stanford University, Stanford, CA 94305-4065, USA
H. Xing (✉), Department of Applied Mathematics and Statistics, SUNY at Stony Brook, Stony Brook, NY 11794-3600, USA; e-mail: [email protected]
96.1 Introduction

Volatility of asset returns or interest rates plays a central role in quantitative finance and appears in basic formulas and theories in derivative pricing, risk management, and asset allocation. In derivative pricing, the volatility of the underlying asset return plays a key role, and the market convention is to list option prices in terms of volatility units. Nowadays, derivative contracts on volatilities are traded in financial markets, so dynamic models of volatilities are indispensable in pricing such derivatives. In financial risk management, the Basel Accord requires banks and other financial institutions to calculate their minimum capital requirement for covering the risks of their trading positions, which are aggregated into a risk measure involving volatility forecasts. In Markowitz's portfolio theory, the optimal weights allocated to the assets in the portfolio involve the volatilities of the asset returns and their correlations. In view of the fundamental importance of
T.L. Lai Department of Statistics, Stanford University, Stanford, CA 94305-4065, USA H. Xing () Department of Applied Mathematics and Statistics, SUNY at Stony Brook, Stony Brook, NY 11794-3600, USA e-mail: [email protected]
volatility modeling and forecasting, there is a large literature on the subject, and surveys of various directions in the literature have been provided by Knight and Satchell (1998), Engle and Patton (2001), and Poon and Granger (2003). In this chapter, we review statistical methods and models for evaluating and forecasting the volatilities of asset returns. Section 2 describes some stylized facts of asset returns and their volatilities, and gives a brief review of classical volatility models, including the GARCH-type and stochastic volatility models. Section 3 surveys several recent developments, primarily in the area of regime-switching and changepoint volatility models and in relating volatilities to exogenous variables. Section 4 summarizes multivariate regression models and describes in this connection some recent work on mean–variance portfolio optimization when the means and covariances of the underlying assets are unknown and have to be estimated from time series data.
96.2 Conditional Heteroskedasticity Models 96.2.1 Stylized Facts on Time Series of Asset Returns Lai and Xing (2008a, pp. 140–144) summarize various stylized facts from the literature on empirical analysis of asset returns and their volatilities. In particular, the empirical analysis has revealed clustering of large changes and returns. Such volatility clustering is common in the intraday, daily and weekly returns in equity, commodity and foreign exchange markets, and results in much stronger autocorrelations of the squared returns than the original returns that are usually weakly autocorrelated. Moreover, there is asymmetry of magnitudes in upward and downward movements of asset returns, and the volatility response to a large positive return is considerably smaller than that to a negative return of the same magnitude. This asymmetry is sometimes referred to as a leverage effect. A possible cause of the asymmetry is that a drop of a stock price increases the debt-to-asset ratio of the
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_96,
1417
Chapter 96
Time Series Modeling and Forecasting of the Volatilities of Asset Returns Tze Leung Lai and Haipeng Xing
Abstract Dynamic modeling of asset returns and their volatilities is a central topic in quantitative finance. Herein we review basic statistical models and methods for the analysis and forecasting of volatilities. We also survey several recent developments in regime-switching, change-point and multivariate volatility models. Keywords Conditional heteroskedasticity volatility r Regime-switching r Change-point
r
Stochastic
96.1 Introduction Volatility of asset returns or interest rates plays a central role in quantitative finance and appears in basic formulas and theories in derivative pricing, risk management and asset allocation. In derivative pricing, volatility of the underlying asset return plays a key role, and the market convention is to list option prices in terms of volatility units. Now a days, derivative contracts on volatilities are traded in financial markets, so dynamic models of volatilities are indispensable in pricing such derivatives. In financial risk management, the Basel Accord requires banks and other financial institutions to calculate their minimum capital requirement for covering the risks of their trading positions that are aggregated into a risk measure involving volatility forecasts. In Markowitz’s portfolio theory, the optimal weights allocated to the assets in the portfolio involve the volatilities of the asset returns and their correlations. In view of the fundamental importance of
T.L. Lai Department of Statistics, Stanford University, Stanford, CA 94305-4065, USA H. Xing () Department of Applied Mathematics and Statistics, SUNY at Stony Brook, Stony Brook, NY 11794-3600, USA e-mail: [email protected]
volatility modeling and forecasting, there is a large literature on the subject, and surveys of various directions in the literature have been provided by Knight and Satchell (1998), Engle and Patton (2001), and Poon and Granger (2003). In this chapter, we review statistical methods and models for evaluating and forecasting the volatilities of asset returns. Section 2 describes some stylized facts of asset returns and their volatilities, and gives a brief review of classical volatility models, including the GARCH-type and stochastic volatility models. Section 3 surveys several recent developments, primarily in the area of regime-switching and changepoint volatility models and in relating volatilities to exogenous variables. Section 4 summarizes multivariate regression models and describes in this connection some recent work on mean–variance portfolio optimization when the means and covariances of the underlying assets are unknown and have to be estimated from time series data.
96.2 Conditional Heteroskedasticity Models 96.2.1 Stylized Facts on Time Series of Asset Returns Lai and Xing (2008a, pp. 140–144) summarize various stylized facts from the literature on empirical analysis of asset returns and their volatilities. In particular, the empirical analysis has revealed clustering of large changes and returns. Such volatility clustering is common in the intraday, daily and weekly returns in equity, commodity and foreign exchange markets, and results in much stronger autocorrelations of the squared returns than the original returns that are usually weakly autocorrelated. Moreover, there is asymmetry of magnitudes in upward and downward movements of asset returns, and the volatility response to a large positive return is considerably smaller than that to a negative return of the same magnitude. This asymmetry is sometimes referred to as a leverage effect. A possible cause of the asymmetry is that a drop of a stock price increases the debt-to-asset ratio of the
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_96,
1417
1418
T.L. Lai and H. Xing
stock, which in turn increases the volatility of returns to the equity holders. In addition, the news of increasing volatility makes the future of the stock more uncertain, and the asset price and its return therefore become more volatile.
96.2.2 Historic Volatility and Exponentially Weighted Moving Averages Let t be the volatility of the asset return rt at time t. A commonly used estimate of t2 at time t 1 is the sample variance based on the most recent k observations: 1 X .rt i rN /2 ; k 1 i D1 k
b 2t D
(96.1)
ı P where rN D kiD1 rt i k. The volatility given by Equation (96.1) is called the historic volatility. If k is in days, we can convert Equation (96.1) in the daily basis to the annual p volatility by A, where the annualizing factor A is the number of trading days, usually taken as around 252; see Lai and Xing (2008a, p. 66). Historic volatility uses equal weights for the observations in the moving window of returns. Since the interest is to estimate the current level of volatility, one may want to put more weight to the most recent observations, yielding an estimate of the form k X ˛i u2ti (96.2) b 2t D Pk
96.2.3 The GARCH Model How should the weights ˛i and in the moving average scheme (Equation (96.2)) be chosen and how should V be estimated? Regarding them as unknown parameters in a model of the time-varying volatilities t2 , we can estimate them by fitting a model to the observed data. In particular, a model that matches well with Equation (96.2) is Engle’s (1982) autoregressive conditional heteroskedastic (ARCH) model, denoted by ARCH.k/ and defined by ut D t t ;
2t1 C .1 /u2t1 : b 2t D b
(96.3)
RiskMetrics, originally developed by J. P. Morgan and made publicly available in 1994, uses EWMA with D 0:94 for updating daily volatility estimates in its database. Noting that 2t1 Equation (96.3) says that b 2i is convex combination of b 2 and ut 1 , we can modify Equation (96.3) by including also a long-run variance rate V : 2t1 : b 2t D .1 ˛ ˇ/V C ˛u2t1 C ˇb
(96.4)
k X
˛i u2ti ;
(96.5)
i D1
in which t are i.i.d. random variables with mean 0 and variance 1, having either the standard normal or the following standardized Student-t distribution. Let x be a Student-t distribution with > 2 degrees of freedom. Then t D ıp =. 2/ has a standardized Student-t distribution x with variance 1 and probability density function f ./ D
Œ. C 1/=2 .C1/=2 p 1C : 2 Œ=2 . 2/
(96.6)
Since the ARCH model (Equation (96.5)) is similar to an autoregressive model, weak (or covariance) stationarity requires that the zeros of the characteristic polynomial 1˛1 z ˛k zk lie outside the unit circle. As the ˛i are usually assumed to be nonnegative, this requirement is equivalent to
i D1
where i D1 ˛i D 1, and we use um to denote the centered log return rm rN , or simply the log return rm (since rN 2 is typP 2 ). The parically small in comparison with k 1 kiD1 rmi ticular case of Equation (96.2) with ˛i D .1 /i 1 for 0 < < 1, k D 1 and uj D 0 for j 0, is called an exponentially weighted moving average (EWMA) and has the recursive representation
t2 D ! C
˛1 C C ˛k < 1:
(96.7)
Under Equation (96.7), the long-run volatility of ut is given by 2 D E.u2t / D
! : 1 ˛1 ˛k
(96.8)
Other properties of ARCH and estimation of the model will be discussed in Sect. 96.3.3. Bollerslev (1986) subsequently generalized equation (96.5) to a model of the form ut D t t ;
t2 D ! C
h X i D1
ˇi t2i C
k X
˛j u2tj ;
(96.9)
j D1
where the random variables t are i.i.d. with mean 0 and variance 1 and have either the standard normal or a standardized Student-t distribution. This model is called the generalized
96 Time Series Modeling and Forecasting of the Volatilities of Asset Returns
ARCH (GARCH) model and is denoted by GARCH.h; k/. It can be considered as an ARMA model of volatility with martingale difference innovations: Specifically, letting ˇi D 0 if i > h, ˛j D 0 if j > k, and t D u2t t2 , we can rewrite Equation (96.9) as u2t D ! C
max .h;k/ h X X .˛i C ˇi /u2ti C t ˇj t j : i D1
j D1
1419
96.2.5 ARMA-GARCH and ARMA-EGARCH Models We can combine an ARMA.p; q/ model for rt with the GARCH.h; k/ model (Equation (96.9)) for the innovations ut , leading to the ARMA.p; q/-GARCH.h; k/ model r t D 0 C
(96.10) Moreover, E.t jus ; s t 1/ D 0 because E.u2t jus ; s t 1/ D t2 . Hence the same invertibility and stationarity assumptions for ARMA models apply to GARCH models; see Lai and Xing (2008a, Section 9.3.2) where maximum likelihood estimation of GARCH parameters, likelihood inference and model-based forecasts of future volatilities are also described.
t2 D ! C
p X
i rt i C
i D1
j D1
h X
k X
ˇi t2i C
i D1
b r t C1jt D 0 C
ut D t t ;
log.t2 / D ! C
h X i D1
ˇi log.t2i /C
k X j D1
j ut j ;
ut D t t ;
˛j u2tj ;
(96.12)
j D1
in which the t are i.i.d. standard normal or standardized Student-t random variables. The parameters of the combined ARMA.p; q/-GARCH.h; k/ can be estimated by maximum likelihood; see Lai and Xing (2008a, pp. 156–157). The onestep ahead forecasts of the asset return and its volatility at time t are given by
96.2.4 The Exponential GARCH Model As pointed out in Sect. 96.2.1, a stylized fact of the volatility of asset returns is the leverage effect that the volatility response to large positive returns is considerably smaller than that to negative returns of the same magnitude. Because the GARCH model is defined by t2 and u2tj , such leverage effect cannot be incorporated. To address the asymmetry, several modifications of the GARCH model have been proposed in the literature, the most popular being the exponential GARCH (EGARCH) model proposed by Nelson (1990). Letting fj ./ D ˛j C j .jj Ejj/, we can write the EGARCH.h; k/ model in the form
q X
p X i D1
b 2tC1jt D ! C
k X j D1
i rt C1i C
q X
j ut C1j ;
j D1
˛j u2tC1j C
h X
ˇi t2C1i : (96.13)
j D1
The k-step ahead forecasts can be obtained by making use of Equation (96.13) and the law of iterated conditional expectations. Replacing the second equation in Equation (96.12) by Equation (96.11) yields the ARMA.p; q/-EGARCH.h; k/ model.
fj .t j /:
96.2.6 Volatility Persistence and Integrated GARCH Models (96.11)
The random variable fj .t / is a sum of two zero-mean random variables ˛j t and j .jt j Ejt j/, and can be written in the form .˛j C j /t j Ejt j; if t 0; fj .t / D .˛j j /t j Ejt j; if t < 0; which shows the asymmetry of the volatility response to positive and to negative returns. Since Equation (96.11) represents log.t2 / as a linear process, t2 and therefore ut also are strictly stationary if the zeros of 1 ˇ1 z ˇh zh lie outside the unit circle; see Lai and Xing (2008a, Section 6.3.4) where EGARCH forecasts of future volatilities and maximum likelihood estimation of EGARCH parameters are also described.
Fitting GARCH(1,1) models to financial time series often displays high volatility in the sense that the estimated ˛ C ˇ values are close to 1. More generally, the integrated GARCH (IGARCH) model, introduced by Engle and Pk Bollerslev Ph(1986), has the form (Equation (96.9)) with j D1 ˛j C i D1 ˇi D 1. Although IGARCH models have infinite (unconditional) variance, they are in fact strictly stationary. In view of the ARMA representation (10) with martingale difference innovations, the shocks t to the variance in an IGARCH model do not decay over time, suggesting integration in variance that is analogous to ARIMA models. Note that an integrated ARMA model, ARIMA.p; d; q/, for a unit-root nonstationary time series xt is a stationary ARMA.p; q/ for .1 B/d xt , where B is the backshift
1420
T.L. Lai and H. Xing
operator Bxt D xt 1 and therefore .1 B/xt D xt xt 1 . Extending d from positive integers to 12 < d < 12 yields the fractionally integrated ARMA model, ARFIMA.p; d; q/, for which .1 B/d xt is stationary ARMA; see Lai and Xing (2008a, p. 219). Noting that Equation (96.10) represents a GARCH model for ut as an ARMA model for u2t , Baillie et al. (1996) have used fractional integration to extend IGARCH models to FIGARCH (fractionally integrated GARCH) models, with a slow hyperbolic rate of decay for the influence of the past innovations t j .
The QML treats t as if it were normal so that Equation (96.15) is a linear Gaussian state-space model to which the Kalman filter can be applied to give not only the minimumvariance linear predictor ht jt 1 of ht based on observations up to time t but also its variance Var.ht jt 1 /. Let et D yt E.log 21 / ht jt 1 and vt D Vt jt 1 C Var.log 21 /. If t were normal (with mean 0 and variance Var.log 21 /), then et would be N.0; vt / and therefore the log-likelihood function would be given by
96.2.7 Stochastic Volatility Models Although the GARCH(1, 1) model t2 D ! C ˇt21 C ˛u2t1 resembles an AR(1) model for t2 , it involves the observed u2t1 instead of an unobservable innovation in the usual AR(1) model. Replacing ˛u2t1 by a zero-mean random disturbance has the disadvantage that it no longer ensures t2 to be nonnegative. A simple way to get around this difficulty is to consider log t2 instead of t2 , leading to the stochastic volatility (SV) model introduced by Taylor (1986): ut D t t ;
t2 D e ht ; ht D 0 C 1 ht 1 C C p ht p Ct ; (96.14)
which has AR.p/ dynamics for log t2 . The t and t in Equation (96.14) are assumed to be independent normal random variables, with t N.0; 1/ and t N.0; 2 /. There is an extensive literature on the properties of SV models; see Taylor (1994), Capobianco (1996), Ghysels et al. (1996), Shephard (1996), and Barndorff-Nielsen and Shephard (2001). A complication of the SV model is that unlike usual AR.p/ models, the ht in Equation (96.14) is an unobserved state undergoing AR.p/ dynamics, while the observations are ut such that ut jht N.0; e ht /. The likelihood function of
D .; 0 ; : : : ; p /T , based on a sample of n observations u1 ; : : : ; un , involves n-fold integrals, making it prohibitively difficult to compute the MLE by numerical integration for usual sample sizes. A tractable alternative to the MLE is the following quasimaximum likelihood (QML) estimator. To fix the ideas, consider the case p D 1 and let 0 D !, 1 D . Letting yt D log u2t , note that yt D ht C log t2 , where log t2 is distributed like log 21 , which has mean 1:27. Let t D log t2 E.log 21 /. Note that this is a linear statespace model with unobserved states ht and observations yt satisfying ht D !C ht 1 Ct ;
yt D ht CE.log 21 /Ct :
(96.15)
1X 1X 2 log vt e =vt 2 t D1 2 t D1 t n
l.!; ; 2 / D
n
(96.16)
up to additive constants that do not depend on .!; ; 2 /. The QML estimator that maximizes Equation (96.16) is consistent and asymptotically normal but not efficient because it uses an incorrect density function for the nonnormal t . Another alternative to the MLE is the Bayes estimator that assumes the following combined normal and inverted chisquare prior distribution of . 2 ; !; /: ˇ .!; /j 2 N..!0 ; 0 /; 2 V0 /ˇj j<1 ; (96.17) ˇ where j 0 j < 1 and N.; /ˇj j<1 denotes the bivariate normal distribution restricted to the region f.!; / W j j < 1g so that the corresponding AR(1) model for ht is stationary. Therefore the conditional distribution of WD .!; ; 2 / given h WD .h1 ; : : : ; hn / again has the same form as Equation (96.17) but with .m; ; !0 ; 0 ; V0 / replaced by their posterior counterparts. The difficulty with posterior estimation in SV models is that h is actually unobservable and the observations are u1 ; : : : ; un . Gibbs sampling can be used to obtain a Monte Carlo approximation to the posterior distribution of . ; h/. Let ht D .h1 ; : : : ; ht 1 ; ht C1 ; : : : ; hn /, which removes ht from h. Using f .j/ to denote conditional densities, we can use the AR(1) dynamics for ht to obtain m 2m ; 2
! ! 1 u2t .ht ut /2 1 f .ht ju; ; ht / / exp 2 exp ; t 2 2 2t t2 (96.18) whereı 2 D 2 =.1 C 2 / and t D !.1 / C .ht 1 C ht C1 / .1 C 2 /. Moreover, the conditional distribution of given u and h is a combined normal and inverted 2 : m C
Pn
2 t D2 vt
2mCn1 ; 2 ˇ ˇ .!; /ˇ 2 N..! ; /; 2 V /ˇj j<1 : (96.19)
96 Time Series Modeling and Forecasting of the Volatilities of Asset Returns
Therefore, given h and .!; /, let vt D ht ! log ht 1 for 2 t n and generate 1= 2 from the 2 -distribution in Equation (96.19). With 2 thus generated, let zt D .1; log ht 1 /T , !, n X V1 D zt zTt 2 C V1 0 ; t D2
(
.! ; / D V T
T V1 0 .!0 ; 0 /
C
n X
ı
ht zt
1421
and the references therein. In this section we give a brief survey of the development in incorporating possible parameter changes over time in the volatility models of Sect. 96.2.
96.3.1 Regime-Switching Volatility Models
) 2
;
t D2
and generate .!; / from the truncated bivariate normal distribution in Equation (96.19). The Gibbs sampler iterates these two steps in generating h1 ; : : : ; hn , 2 , !, and until convergence; see Jacquier et al. (1994) for details. Implementation and software for Gibbs sampling can be found on the WinBUGS Web site (http://www.mrcbsu.cam.ac.uk/bugs/). Empirical evidence has suggested that the ut in Equation (96.14) have fatter tails than normal; see Gallant et al. (1997). To accommodate leptokurtic distribup tions for t in SV models, it is convenient to use t D t zt , in which zt are i.i.d. standard normal variables that are independent of t , and t has an inverted 2 -distribution: p =t 2 . This framework implies that t t zt has Student’s t-distribution with degrees of freedom. Harvey and Shephard (1996) propose another extension of the SV model to describe the leverage effect of asset return volatilities by allowing correlations between t and t . These extended SV models can still be estimated by Gibbs sampling when certain prior distributions are imposed on the unknown parameters; see Jacquier et al. (2004). A survey of various estimation methods for SV models is given by Broto and Ruiz (2004).
96.3 Regime-Switching, Change-Point and Spline-GARCH Models of Volatility As mentioned in Sect. 96.2.6, empirical studies of exchange rates, interest rates and stock returns have found high volatility persistence in the estimation of the GARCH(1, 1) parameters !; ˛ and ˇ, giving values of the MLE b of WD ˛ C ˇ close to 1. In his comments on Engle and Bollerslev (1986), Diebold (1986) noted that in fitting GARCH models to interest rate data the choice of a constant term ! not accommodating to shifts in monetary policy regimes might have led to an apparently integrated GARCH model. Subsequent work in the literature has demonstrated that if parameter changes are ignored in fitting time series models whose parameters in fact undergo occasional changes, then the fitted models tend to exhibit long memory; see Perron (1989), Lamoureux and Lastrapes (1990), Hillebrand (2005), Lai and Xing (2006),
Regime-switching volatility models allow parameter jumps over a small number of regimes. The seminal paper by Hamilton and Susmel (1994) on regime-switching ARCH models provides the basic ideas and paves the way for subsequent developments. They modify the usual ARCH model (Equation (96.5)) by introducing a scale factor t that remains constant within the same unobservable regime: ut D t t t ; t2
D!C
k X
˛i .ut i =t i /2 C .ut 1 =t 1 /2 1fut 1 0g ;
i D1
(96.20) where 1fut 1 0g is an indicator variable which assumes the value 1 or 0 and which is used to incorporate the leverage effect. Suppose that there are K possible values of t , which is assumed to be a Markov chain with a given K K transition probability matrix. Then one has a hidden Markov model with K states and its likelihood function can be maximized numerically. Hamilton and Susmel have found that Equation (96.20) provides a better fit and better forecasts for the weekly returns of the value-weighted portfolio of stocks traded on NYSE from the week ending on July 3, 1962 to the week ending on December 29, 1987. They attribute most of the persistence in stock price volatility to Markovian switching among low, moderate and high volatility regimes. Cai (1994) discusses the possibility of extending regime switching to GARCH models while considering a similar regime-switching ARCH model. He comments that “combining the Markov-switching model with GARCH includes tremendous complications in actual estimation.” Gray (1996) points out that this difficulty arises because the hidden states of the parameters in regime-switching GARCH models are path-dependent. To circumvent this difficulty, he suggests 2 modeling t2 by a finite mixture of conditional variances t;k .k D 1; : : : ; K/, with each t;k following a GARCH process, instead of using directly a regime-switching model for t2 . The weights in the mixture are determined by the transition probabilities of an underlying Markov chain. Regime-switching stochastic volatility (RSSV) models have been introduced by So et al. (1998), who modify Equation (96.14) in the case p D 1 as ut D t t ;
log.t2 / D !t C log.t21 /Ct ;
(96.21)
1422
T.L. Lai and H. Xing
in which !t is an unobservable Markov chain assuming K possible values. They use Gibbs sampling to estimate !t and .
96.3.2 Spline-GARCH Models Engle and Rangel (2008) propose to modify the GARCH model (Equation (96.9)) in the following way to allow structural changes over a long time-scale while maintaining GARCH-type changes over a short time-scale: p yt D C t ht t ;
(96.22)
in which t is an i.i.d. sequence and ht is generated by the GARCH model 1 0 I J I J X X X X 2 A @ ai bj C ai wt i C bj ht j ; ht D 1 i D1
j D1
i D1
j D1
p with ws D hs s :
(96.23)
Whereas the vector of unknown GARCH parameters, D .a1 ; : : : ; aI ; b1 ; : : : ; bJ /T , is assumed to be time-invariant, the additional factor t is used to model long-term changes in volatilities. Engle and Rangel propose to use an exponential spline function with prespecified knots to represent t by log t D ˇ T xt C
0t
C
K X
k .t
tk /2C ;
(96.24)
kD1
where xt are exogenous variables (including the constant 1 to incorporate the intercept) and zC D max.z; 0/. The number K and the locations of the knots are determined exogenously so that the parameters ; 0 ; : : : ; K , and ˇ of the splineGARCH model can be estimated by maximum likelihood.
96.3.3 Stochastic Change-Point ARX-GARCH Models Besides the inherent computational complexity, another issue with regime-switching models is the interpretation of the regimes and the choice of its number. An alternative approach to regime switching is to incorporate possible parameter changes of unknown magnitude and at unknown times. McCulloch and Tsay (1993) consider autoregressive models with random shifts in mean and error variances, and Wang and Zivot (2000) assume the number of change-points to be known and use conjugate priors to model changes
in regression parameters and error variance. Markov chain Monte Carlo methods are used to estimate the time-varying parameters. Lai et al. (2005) have recently introduced a much more tractable model that has explicit recursive formulas for estimating the time-varying parameters. The model has been further extended to include GARCH dynamics (Equation (96.9)) in the short time-scale. Here we summarize both models and the associated estimators. Consider a stochastic regression model with time-varying regression coefficients and error variances of the form yt D ˇ Tt xt C t
p ht t ;
(96.25)
where the stochastic regressor xt consists of lagged variables yt 1 , yt 2 ; : : : and exogenous variables, the regression coefficient ˇ t and the long-term volatility level t are piecewise constant, and ht is generated by the GARCH model (Equation (96.23)) that relates short-term volatility fluctuations to conditional heteroskedasticity. We use the following Bayesian model to describe the stochastic dynamics of the piecewise constant t D .ˇ Tt ; t /T , where t D .2t2 /1 . Letting It D 1f t ¤ t 1 g , which is the indicator variable (assuming the values 0 and 1) that shows whether a parameter change occurs at time t, the prior distribution assumes that It , t t0 , are independent with P .It D 1/ D p, and that
t D .1 It / t 1 C It .zTt ; t /; where .zT1 ; 1 /; .zT2 ; 2 /; are i.i.d. random vectors such that t 2m =;
zt jt N.z; V=.2t //:
(96.26)
To begin with, suppose that instead of Equation (96.23), ht is given by ht 1, which has been considered by Lai et al. (2005) who make use of Jt D maxfj t W Ij D 1g to derive explicit recursive formulas for the Bayes estimates E.ˇ t jYt / and E.t2 jYt /, where Yt D .x1 ; y1 ; : : : ; xt ; yt /. Note that conditional on Yt and Jt D j , the distribution of
t .D t 1 D D j / is given by ı 2t 2mCt j C1 j;t ;
ı ˇ t jt N.zj;t ; Vj;t .2t //; (96.27)
where Vi;j D V
1
C
j X
!1 xTt xt
;
t Di
zi;j D Vi;j
V zC 1
j X
! xt yt ;
(96.28)
t Di
X 1 D C zT V1 z C yt2 zTi;j V1 i;j zi;j : 2 t Di j
i;j
96 Time Series Modeling and Forecasting of the Volatilities of Asset Returns
Lai et al. (2005) make use of the following recursion for pj;t D P .Jt D j jYt /: pj;t / pj;t pf .yt jIt D 1/ WD .1 p/pj;t 1 f .yt jYj;t 1 ; Jt D j /
ı P fp C 1i 0 t <j 0 n ijt g, where ijt D ijt ijt
if j D t; if j t 1; (96.29)
where we use f .j/ to denote conditional densities as in Equation (96.18). Let gi;j D .m C j i C 1/=2 and
1423
D
ppi t .1 p/pi t qt C1;j fij f00 =.fi t ft C1;j /
i t D j; i t j n; (96.35)
and analogous to pi;t for the forward filter, the weights qt C1;j for the backward filter are given by qt C1;j / qtC1;j pft C1;t C1 =f00 D .1 p/qt C2;j ft C1;j =ft C2;j
if j D t C 1; if j > t C 1:
g
fij D .det.Vi;j //1=2 .gi;j /i;j i;j ; f00 D .det.V//1=2 .m/m=2 :
(96.30)
Then the conditional densities in Equation (96.29) can be expressed in terms of the quantities in Equation (96.30), yielding D pj;t
Hence pj;t D
pft t =f00 .1 p/pj;t 1 fjt =fj;t 1 pj;t =
Pt
i DkC1 pi;t
E.ˇ t jYt / D
t X
if j D t; if j t 1: (96.31)
and
D
t X
pj;t
j D1
b 2n;j D
j;t : 2.gj;t 1/
(96.32)
Lai et al. (2005) also develop explicit formulas for E.ˇ t jYn / and E.t2 jYn / when 1 t n. Let e Jt D minfj > t W Ij C1 D 1g be the future change-time closest to J t / D .i; j / with i t < j , t. Conditional on Yn and .Jt ; e the posterior distribution of t .D i D D j / is ı 2t 2mCj i C1 ij ;
ı ˇ t jt N.zij ; Vij .2t //: (96.33)
Using Bayes’ theorem to combine the forward and backward filters, they have shown that E.ˇt jYn / D
b ˇ n1;j ˇ n;j D b n . o hn;j CxTn Pn1;j xn ; C Pn 1;j xn yn b ˇ Tn1;j xn b (96.36)
pj;t zj;t ;
j D1
E.t2 jYt /
To extend these recursions to the case where ht is given by Equation (96.23), Lai and Xing (2008b) first assume that the parameter vector D .a1 ; : : : ; aI ; b1 ; : : : ; bJ /T in Equation (96.23) is known. For n j C 1, they use the recursions . n o hn;j C xTn Pn1;j xn ; Pn;j D Pn1;j Pn1;j xn xTn Pn1;j b n 2. o b hn;j C xTn Pn1;j xn ; n;j D n1;j C yn b ˇ Tn1;j xn
X
mCnj 2 2 1 b C m C n j 1 n1;j mCnj 1 n 2 . o b
yn b hn;j C xTn Pn1;j xn ; ˇ Tn1;j xn
b hn;j D 1
i D1
C
J X
X
! bj
lD1
bl b hnl;j C
I X
ı 2 n1;j : ai .yni b ˇ Tn1;j xni /2 b
i D1
The weight pj;n are given by Equation (96.29) with Vj;n and zj;n replaced by n X ı 1 Vj;n D V1 C ht;j xTt xt b ; t Dj n X ı ht;j : xt yt b zj;n D Vj;n V1 z C
ijt zi;j ;
i;j ; D ijt 2.gi;j 1/ 1i t j n
ai
J X
lD1
t Dj
1i t j n
E.t2 jYn /
I X
(96.34)
j;n D
n X 1 C zT V1 z zTj;n V1 z C yt2 =b ht;j : j;n j;n 2 t Dj
(96.37)
1424
T.L. Lai and H. Xing
With the modifications given by Equations (96.36) and (96.37), they extend the formulas (96.32) for the Bayes estimates E.ˇt jYt / and E.t2 jYt /. Similarly they modify the formulas (96.34) for the Bayes estimates E.ˇ t jYn / and E.t2 jYn / when the ht are generated by the GARCH model (Equation (96.23)). They also describe how the hyperparameters p; z; V; can be estimated by maximum likelihood or variants thereof. Approximations that involve only a bounded number of components in the mixtures in Equations (96.32) and (96.34) and associated approximations to the likelihood function are also given in Lai and Xing (2008b).
96.4 Multivariate Volatility Models and Applications to Mean–Variance Portfolio Optimization For a multivariate time series yt 2 Rp , let † t D Cov.ut jFt 1 / be the conditional covariance matrix of ut given the information set Ft 1 of events up to time t 1. These is an extensive literature on dynamic models of † t . In Sect. 96.4.1 we give a brief review of this literature and point out the difficulties with these multivariate volatility models due to the “curse of dimensionality.” In Sect. 96.4.2 we describe some recent work to circumvent these difficulties in connection with Markowitz’s mean–variance portfolio optimization when the means and covariances of the asset returns in the future period are unknown and have to be estimated from historical time series.
96.4.1 Multivariate GARCH Models A straightforward extension of the univariate GARCH(1, 1) model t2 D ! C ˇt21 C ˛u2t1 to the multivariate setting is to use the vech (half-vectorization, or half-stacking) operator that transforms a p p symmetric matrix M into a vector, denoted by vech.M/, which consists of the p.p C1/=2 lower diagonal (including diagonal) elements of M. This leads to a multivariate GARCH model of the form vech.† t / D ! C Bvech.† t 1 / C Avech.ut u0t /;
Instead of using the vech operator, an alternative approach is to replace s2 and u2s in Equation (96.9) by † s and us uTs and the scalar parameters !; ˛j ; ˇi by matrices, leading to † t D AAT C
h X i D1
Bi † t i BTi C
k X
Aj .ut j uTtj /ATj I
j D1
(96.39) see Engle and Kroner (1995) who refer to their earlier work with Baba and Kraft. The model, commonly called BEKK, involves a lower triangular matrix A and always gives nonnegative definite † t . The number of parameters is p 2 .k C h/ C p.p C 1/=2, which increases with p 2 . Many other multivariate generalizations of the GARCH.h; k/ model and SV model have been proposed in the literature; see the recent survey by Bauwens et al. (2006). An important statistical issue is the curse of dimensionality in parameter estimation unless p is small. Another issue that arises if one models the GARCH dynamics separately for the volatilities of the components of ut and for their correlations as in Equation (96.38) is that the resultant matrix may not be nonnegative definite.
96.4.2 A New Approach and Its Application to Mean–Variance Portfolio Optimization Markowitz’s celebrated mean–variance portfolio optimization theory assumes that the means and covariances of the underlying asset returns are known. In practice, they are unknown and have to be estimated from historical data. Plugging the estimates into Markowitz’s efficient frontier that assumes known parameters has led to portfolios that may perform poorly and have counter-intuitive asset allocation weights; this has been referred to as the “Markowitz optimization enigma.” Lai et al. (2008) recently introduced a new approach to resolve the enigma. Not only does the new approach provide substantial improvements over previous methods, but it also allows flexible modeling to incorporate dynamic features and structural changes in the training sample of historical data, as illustrated by simulation and empirical studies. They propose to use a stochastic regression model of the form
(96.38) p.pC1/ 2
where A and B are
p.pC1/ matrices. Note that 2 p.pC1/ 2 Equation (96.38) involves 2 C p.pC1/ parameters. 2 2 Moreover, A and B have to satisfy certain constraints for † t to be positive definite, making it difficult to fit such models.
ri t D ˇ Ti xi t C i t ;
(96.40)
for the time series of asset returns of the i th asset, where the components of xi t include 1, factor variables such as the return of a market portfolio like the S&P500 index at time t,
96 Time Series Modeling and Forecasting of the Volatilities of Asset Returns
and lagged variables ri;t 1 ; ri;t 2 , : : : They also include conditional heteroskedasticity in Equation (96.40) by assuming the i t to follow the GARCH(1,1) model i t D si t zi t ;
2 2 si2t D !i C ai si;t 1 C bi ri;t 1 ;
(96.41)
in which zi1 ; zi 2 ; : : : are i.i.d. with mean 0 and variance 1. Their basic idea is to introduce covariates (including lagged variables to account for time series effects) and possibly also conditional heteroskedasticity so that the unmodeled errors i t or zi t can be regarded as i.i.d. Therefore, instead of i.i.d. returns that are commonly assumed in mean–variance optimization theory, their new approach combines domain knowledge of the m assets with time series modeling to obtain better predictors of the returns and their volatilities for the future period. The model (Equation (96.40)) is intended to produce i.i.d. t D .1t ; : : : ; mt /T or i.i.d. zt D .z1t ; : : : ; zmt /T after adjusting for conditional heteroskedasticity as in Equation (96.41). Note that Equations (96.40) and (96.41) models the asset returns separately, instead of jointly in a multivariate regression or multivariate GARCH model which has the difficulty of having too many parameters to estimate. While the vectors t (or zt ) are assumed to be i.i.d., Equation (96.40) (or Equation (96.41)) does not assume their components to be uncorrelated since it treats the components separately rather than jointly. This avoids the curse of dimensionality of vector ARX or multivariate GARCH models. Another innovation of the approach of Lai et al. (2008) is their new formulation of mean–variance portfolio optimization, when the means and covariances of the asset returns are unknown and have to be estimated from data, as a stochastic optimization problem to which stochastic control techniques can be applied.
96.5 Conclusion Volatility modeling, which is of fundamental importance in quantitative finance, still remains an active area of research after three decades of continual development. The recent developments reviewed herein show the vibrancy of the subject.
References Baillie, R. T., T. Bollerslev, and H. O. Mikkelsen. 1996. “Fractionally integrated generalized autoregressive conditional heteroskedasticity.” Journal of Econometrics 74, 3–30. Barndorff-Nielsen, O. E. and N. G. Shephard. 2001. “Non-Gaussian Ornstein–Uhlenbeck based models and some of their uses in financial econometrics.” Journal of the Royal Statistical Society, Series B 63, 167–241.
1425
Bauwens, L., S. Laurent, and J. V. K. Rombouts. 2006. “Multivariate GARCH models: a survey.” Journal of Applied Econometrics 21, 79–109. Bollerslev, T. 1986. “Generalized autoregressive conditional heteroskedasticity.” Journal of Econometrics 31, 307–327. Broto, C. and E. Ruiz. 2004. “Estimation methods for stochastic volatility models: a survey.” Journal of Economic Surveys 18, 613–649. Cai, J. 1994. “A Markov model of switching-regime ARCH.” Journal of Business and Economic Statistics 12, 309–316. Capobianco, E. 1996. “State-space stochastic volatility models: a review of estimation algorithms.” Applied Stochastic Models and Data Analysis 12, 265–279. Diebold, F. X. 1986. “Modeling the persistence of conditional variances: a comment.” Econometric Reviews 5, 51–56. Engle, R. F. 1982. “Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflations.” Econometrica 50, 987–1007. Engle, R. F. and T. Bollerslev. 1986. “Modeling the persistence of conditional variances.” Econometric Reviews 5, 1–50. Engle, R. F. and A. J. Patton. 2001. “What good is a volatility model?” Quantitative finance 1, 237–245. Engle, R. F. and J. G. Rangel. 2008. “The spline-GARCH model for low-frequency volatility and its global macroeconomic causes.” The Review of Financial Studies 21, 1187–1222. Gallant, A. R., D. Hsieh, and G. Tauchen. 1997. “Estimation of stochastic volatility models with diagnostics.” Journal of Econometrics 81, 159–192. Ghysels, E., A. C. Harvey, and E. Renault. 1996. “Stochastic volatility,” in Statistical methods in finance, G. S. Maddala and C. R. Rao (Eds.). North Holland, Amsterdam, pp. 119–191. Gray, S. F. 1996. “Modeling the conditional distribution of interest rates as a regime-switching process.” Journal of Financial Economics 42, 27–62. Hamilton, J. D. and R. Susmel. 1994. “A conditional heteroskedasticity and change in regimes.” Journal of Econometrics 64, 307–333. Harvey, A. C. and N. Shephard. 1996. “Estimation of asymmetric stochastic volatility model for asset returns.” Journal of Business and Economic Statistics 14, 429–434. Hillebrand, E. 2005. “Neglecting parameter changes in GARCH models.” Journal of Econometrics 129, 121–138. Jacquier, E., N. G. Polson, and P. E. Rossi. 1994. “Bayesian analysis of stochastic volatility models (with discussion).” Journal of Business and Economic Statistics 12, 371–417. Jacquier, E., N. G. Polson, and P. E. Rossi. 2004. “Bayesian analysis of stochastic volatility models with fat-tails and correlated errors.” Journal of Econometrics 122, 185–212. Knight, J. and S. Satchell (Eds.). 1998. Forecasting volatility in the financial markets, Butterworth Heinemann, London. Lai, T. L. and H. Xing. 2006. “Structural change as an alternative to long memory in financial time series,” in Advances in econometrics, H. Carter and T. Fomby (Eds.), Vol. 20, Elsevier, Amsterdam, pp. 209– 228. Lai, T. L. and H. Xing. 2008a. Statistical models and methods for financial markets, Springer, New York. Lai, T. L. and H. Xing. 2008b. Hidden Markov models of structural changes in econometric time series, Technical Report, Department of Statistics, Stanford University. Lai, T. L., H. Liu, and H. Xing. 2005. “Autoregressive models with piecewise constant volatility and regression parameters.” Statistica Sinica 15, 279–301. Lai, T. L., H. Xing, and Z. Chen. 2008. Mean–variance portfolio optimization when means and covariances are unknown, Technical Report, Department of Statistics, Stanford University. Lamoureux, C. G. and W. D. 
Lastrapes. 1990. “Persistence in variance, structural change and the GARCH model.” Journal of Business and Economic Statistics 8, 225–234.
1426 McCulloch, R. E. and R. S. Tsay. 1993. “Bayesian inference and prediction for mean and variance shifts in autoregressive time series.” Journal of the American Statistical Association 88, 968–978. Nelson, D. B. 1990. “Stationarity and persistence in GARCH(1,1) model.” Econometric Theory 6, 318–334. Perron, P. 1989. “The great crash, the oil price shock and the unit root hypothesis.” Econometrica 57, 1361–1401. Poon, S. and C. W. J. Grander. 2003. “Forecasting volatility in financial markets: a review.” Journal of Economic Literature 41, 478–539. Shephard, N. 1996. “Statistical aspects of ARCH and stochastic volatility,” in Statistical models in econometric, finance and other fields, O. E. Barndorff-Nielsen, D. R. Cox, and D. V. Hinkley (Eds.). Chapman & Hall, London, pp. 1–67.
T.L. Lai and H. Xing So, M. K. P., K. Lam, and W. K. Li. 1998. “A stochastic volatility model with Markov switching.” Journal of Business and Economic Statistics 16, 244–253. Taylor, S. 1986. Modeling financial time series, Wiley, Chichester. Taylor. S. 1994. “Modeling stochastic volatility: a review and comparative study.” Mathematical Finance 4, 183–204. Wang, J. and E. Zivot. 2000. “A Bayesian time series model of multiple structural changes in level, trend and variance.” Journal of Business and Economic Statistics 18, 374–386.
Chapter 97
Listing Effects and the Private Company Discount in Bank Acquisitions Atul Gupta and Lalatendu Misra
Abstract We examine acquirer gains in bids for listed and unlisted targets in a large sample of bank acquisitions. Consistent with findings for the non-financial sector, we report that acquiring firm abnormal returns are significantly larger when the target is unlisted compared to bids for listed targets. Using pre-bid financial data for acquirer and targets in regressions, we estimate the magnitude of the acquisition discount for unlisted firms. We find that the multiple of deal value to target equity is 13.75% lower for unlisted targets, on average, and it is influenced by the method of payment and whether the bid is across state lines. Keywords Mergers r Bank acquisitions r Acquisition discounts r Earnings multiples r Public targets r Private targets r Listing effect r Method of payment r Interstate acquisitions
97.1 Introduction Several recent papers document that although acquiring firms in corporate control transactions involving non-financial firms earn zero abnormal returns, acquirer gains are positive and statistically significant when the target is an unlisted firm (Hansen and Lott 1996; Chang 1998; Fuller et al. 2002; and Moeller et al. 2004). Further, Faccio et al. (2006) find that such a “listing effect” is robust across 17 Western European countries. These findings confirm the presence of a “private company discount,” in that the differential gains to acquiring firms imply that owners of unlisted firms receive a lower value for their holdings than do owners of comparable targets that are publicly traded. In addition, while stock financed deals yield acquirer losses when the target is publicly traded A. Gupta Department of Finance, Bentley University, Waltham, MA 02452, USA L. Misra () Department of Finance, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX 78249, USA e-mail: [email protected]
(Travlos 1986), acquiring firm gains in stock financed deals are larger in acquisitions of unlisted target (Fuller et al. 2002; Moeller et al. 2004). In this paper we examine acquirer gains in acquisitions of listed and unlisted targets for a sample of U.S.-based commercial banks, document the magnitude of the acquisition discount for unlisted targets, and analyze the influence of firm-specific and deal characteristics on the observed discount. As outlined in the next section, examining the listing discount in a sample containing only commercial bank acquisitions has several advantages, including the ability to control for inter-industry differences in the listing discount and the availability of public financial data for both listed and unlisted targets. Our empirical analysis is based on a sample of 1,041 acquisitions over the years 1981 through 2005, where both the acquiring and target firms are commercial banks. Acquiring firms in the sample are publicly traded, as are 416 of the target firms; the remaining 625 targets are unlisted. Event study findings indicate that acquiring firms suffer significantly smaller losses when the target is an unlisted firm relative to bids for listed targets; the difference in acquirer abnormal returns is statistically significant, confirming the presence of a listing effect in commercial bank acquisitions. We follow the approach in Officer (2007) and use valuation multiples to estimate the magnitude of the listing discount. Since financial data for both listed and unlisted targets are available from regulatory filings, we use cross-sectional regressions to estimate the magnitude of the valuation discount for unlisted targets; our analysis provides estimates of the discount after controlling for differences among listed and unlisted targets, as well as differences in acquirers of listed and unlisted targets for a variety of financial characteristics. Our estimate of the private company discount has a mean value of 13.75%, which is somewhat lower than the approximately 17% discount in acquisitions of unlisted nonfinancial firms documented by Officer (2007). Our work contributes to the literature in several ways. First, while several papers have documented acquiring firm losses in financial mergers (Houston and Ryngaert 1994; Becher 2000; DeLong 2001; and Houston et al. 2001), we are
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_97,
1427
1428
not aware of any paper documenting the wealth impact for financial acquirers in acquisitions of unlisted targets. Second, by focusing on a single industry with an active market for corporate control we are able to control for possible interindustry differences that may influence observed differences in bidder gains in acquisitions of listed versus unlisted targets. Third, since commercial banks are required to provide financial data to regulatory authorities, we are able to get detailed data on unlisted targets for the pre-bid period. This allows us to control for a variety of target characteristics that can be expected to influence the listing effect and the acquisition discount documented by studies focusing on the nonfinancial sector. The rest of the paper is organized as follows. In the next section we summarize the literature on the listing effect in the non-financial sector, outline reasons why owners of unlisted targets may obtain a lower value for their holdings, and discuss how using a sample of commercial bank acquisitions allows us to control for some of these possible effects. Sample characteristics are described in Sect. 97.3, followed by event study results in Sect. 97.4, multiples based analysis in Sect. 97.5, and cross-sectional tests in Sect. 97.6. The last section contains concluding comments.
97.2 Why Acquiring Firms May Pay Less for Unlisted Targets Several recent papers have documented that announcement period abnormal returns to acquiring firms are larger when the target is unlisted, as we have already noted. The observed larger gain to the acquirer from the acquisition of an unlisted firm implies that when considering two similar acquisition targets, acquiring firms are willing to pay and target owners are willing to accept a lower value for their holdings. This phenomenon has been variously termed an acquisition discount, listing effect, valuation discount, or a private company discount. Several possible explanations have been proposed for the presence of such an acquisition discount and the method of payment effect. Chang (1998) suggests that owners of unlisted targets may join the board of the acquiring firm, and the addition of a large blockholder may yield valuation gains from improved monitoring. He finds that acquirer gains are larger in stock-financed acquisitions where such a blockholder actually results. Fuller et al. (2002), and Moeller et al. (2004), however, report significant acquirer gains in both stock and cash financed acquisitions; the latter finding suggesting that the blockholder/monitoring hypothesis provides at best a partial explanation for the phenomenon. Moeller et al. (2004) also find that smaller acquirers obtain
A. Gupta and L. Misra
larger abnormal returns, and larger acquirers obtain smaller abnormal returns, suggesting that differences in bidder size in acquisitions of listed versus unlisted targets may explain a part of the listing effect. Officer (2007) hypothesizes that sales of unlisted firms and subsidiaries of publicly traded parents are motivated by a need for liquidity and may face a situation similar to the “fire sale” environment described by Schliefer and Vishny (1992). The acquiring firm in these transactions is aware of the liquidity problems facing the target, and is able to benefit from this bargaining advantage and obtain a better deal. He documents acquisition discounts of approximately 17% for unlisted targets and 28% for unlisted subsidiaries of publicly held firms relative to comparable acquisitions of listed targets. He also finds that the acquisition discount is positively related to the target’s need for liquidity and to loan-spreads in the corporate debt market. Faccio et al. (2006) use data for 17 Western European countries to examine a variety of explanations for the listing effect on acquirer abnormal returns, including whether the unlisted target is a standalone firm or a subsidiary, the method of payment, whether a blockholder is created, acquirer size, whether the deal is friendly or hostile, the acquirer’s Tobin’s-Q ratio, whether the acquirer and target are in the same industry, the relative size of the deal, acquirer ownership, and the pre-announcement performance of the acquiring firm. Their cross-sectional regressions yield only two variables that are consistently significant – acquirer size and the listing status of the target. They find that the listing effect exists across both time and across countries, and conclude that the fundamental factors that give rise to the listing effect remain elusive. Explanations for observed acquisition discounts can be categorized into reasons why (1) an acquiring firm should be willing to pay less, and (2) why a target firm is willing to accept a lower price for their holdings. We explore each of these reasons below.
97.2.1 Motivations of the Acquiring Firm Acquiring firms in corporate mergers must use limited information about the target to come up with an appropriate bid. This task is complicated by the presence of informational asymmetry, in that owners of the target firm are presumed to have better information about their firm than does the management of the bidding firm. Since unlisted firms in nonfinancial industries are not required to file financial data with regulatory authorities or otherwise publicly disclose them, the acquirer’s informational asymmetry problem is exacerbated with an unlisted target entity. The lack of substantive information about the unlisted target and the increased risk
97 Listing Effects and the Private Company Discount in Bank Acquisitions
implies that the acquiring firm may want to pay less in such deals relative to comparable transactions with listed targets. Unlisted targets tend to be smaller firms and may be more risky than listed firms in their industry; higher perceived risk, coupled with a paucity of reliable information, can be expected to reduce both the number of potential bidders and the amount an acquirer would be willing to pay. As argued earlier, acquiring firms may also have a relative bargaining advantage in deals involving unlisted targets, enabling them to offer a lower price. Another potential contributor to the listing effect follows from the possibility that when both the acquiring and target firms are publicly traded, diversified shareholders can be expected to have long positions in both firms. Stockholders of the acquiring firm may consequently be less concerned about losses these firms experience upon bid announcement, because gains to the target firm enable them to make up these losses. As Hansen and Lott (1996) suggest, diversified stockholders are more likely to be concerned about aggregate gains than about the wealth impact on the acquiring firm by itself. When the target is an unlisted firm, however, these stockholders will in fact be concerned only about wealth gains to the acquiring firm, suggesting that management will be more diligent in their evaluation of unlisted targets and offer lower bid premiums; this logic implies that bidder gains in acquisitions of unlisted targets should be positive. The price that acquiring firms are willing to pay is also likely to be influenced by systematic differences in financial or other characteristics of listed versus unlisted targets. The latter, for example, tend to be smaller firms on average, and acquirers of listed firms tend to be larger than those of unlisted targets. The role of such factors in explaining the listing discount requires data on pre-bid financial characteristics of unlisted targets, something that is not available for firms in the non-financial sector.
97.2.2 Motivations of Owners of Unlisted Targets It seems reasonable to assume that owners of unlisted firms face a variety of constraints that may influence the price they are willing to accept for their ownership claim. In particular, they are likely to hold large, highly concentrated and illiquid positions in their firm. The illiquidity is expected to severely limit portfolio diversification for the owner or manager, thus increasing the unsystematic risk of their portfolios. Selling out to a publicly traded firm benefits them by providing a favorable resolution to such illiquidity and diversification related problems; to the extent that such benefits have value, the owners may be willing to accept a lower price for their holdings.
1429
Capital constraints faced by owners of unlisted firms may also motivate them to accept a lower value for their holdings. A lack of capital market access can be expected to limit capital availability with negative implications for the firm’s growth prospects. Unlisted firms have a choice between being acquired and doing an IPO, both of which have attendant transactions costs. To the extent that IPO costs are likely to be particularly large for smaller firms, the owners may be willing to accept a lower acquisition multiple in order to avoid the costs entailed in an IPO; such savings may be transferred in part to the acquiring firm, thus contributing to the observed acquisition discount.
97.2.3 Why Study Acquisitions in the Commercial Banking Industry? There are several reasons why commercial bank acquisitions provide a useful environment to study the private company discount. First, the industry has witnessed a significant volume of control activity involving both listed and unlisted targets since the 1980s. Second, since banks that are regulated by the Federal Reserve System, Federal Deposit Insurance Corporation, and the Comptroller of the Currency are required to file Reports of Condition and Income (Call Reports), we are able to obtain detailed information on the financial characteristics of both acquirers and targets. This allows us to examine the extent to which differences in firmspecific characteristics between listed and unlisted targets influence the acquisition discount and the observed gains to the acquiring firms. Third, unlisted firms in the non-financial sector are by their nature informationally opaque, and limited information on the target firm can be expected to reduce the price a bidder is willing to pay (and a seller willing to accept) in control transactions. Such informational asymmetries are significantly reduced in the case of unlisted commercial banks because detailed financial data on these firms are available from regulatory filings. Fourth, as reported by Maquieira et al. (1998), acquiring firm gains are larger in non-diversifying mergers; a single industry study provides a natural control mechanism for possible synergistic effects of focus-reduction or enhancements. Finally, unlisted banking firms have better liquidity and maintain larger capital positions relative to listed banks, suggesting that liquidity constraints or the potential for imminent distress are less likely to drive any observed acquisition discounts in these transactions. The impact of informational asymmetry on bidder gains and the acquisition discount in bids for unlisted targets is likely to be a lower in commercial bank acquisitions due to ready availability of financial data for these firms. Officer (2007) estimates that informational asymmetry may
contribute up to 25% of the observed discount; this suggests that observed bidder gains and the acquisition discount should be lower for acquisitions in the banking sector relative to non-bank firms. Finally, since estimates of the acquisition discount require an analysis of valuation multiples, it becomes necessary to construct a set of listed targets that are comparable to unlisted targets. Kaplan and Ruback (1995) find that more precise results are obtained by selecting comparable firms that are in the same industry and undergoing the same transaction. Since our sample consists of acquisition bids in a single industry, sample firms in our study meet these requirements by definition. The missing element is matching by firm size, which we control for in cross-sectional regressions.
97.3 Sample Characteristics

97.3.1 Sample Selection

We obtain a sample of public banks acquiring public or non-public banks from the Securities Data Corporation (SDC) mergers and acquisitions database. Our initial sample of 5,411 cases is culled to exclude cases where the acquirer and target have the same identity, to eliminate transactions that represent the acquisition of selected branches or subsidiaries, and to retain only those transactions where the entire target firm is acquired. Transactions involving federally assisted acquisitions of insolvent financial institutions are also excluded (Gupta et al. 1997). Each bid contained in the sample represents the initial merger announcement for a unique target; announcements of bid revisions and announcements of bids by additional bidders for the same target are not included in the sample. We exclude cases where acquirer returns are not available from the Center for Research in Security Prices (CRSP) daily files. This reduces the sample to 1,579 cases. We searched the LexisNexis database for PR Newswire reports and the Wall Street Journal in the 1981–1990 period for news reports describing the merger, and excluded cases where we were unable to find a news report about the merger. We recorded the identity, date, and other salient features of the merger that were found to differ from SDC in the published report. We collected the Call Report data for 1,442 cases, for eight quarters or as available, preceding the announcement. We also exclude cases where the deal value is not publicly disclosed at the time of the announcement. We were able to obtain deal value, stock returns, and Call Report data for both the acquirer and the target for 1,195 cases. We apply a relative size screen and restrict the final sample to 1,049 cases where the asset size of the target exceeds 1% of the acquirer's and is less than 100% of the acquirer's size.
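The screens above amount to a simple filter pipeline. The sketch below expresses them in pandas; the file name and every column name are hypothetical placeholders rather than actual SDC field names, so this is an illustration of the procedure, not our code.

```python
import pandas as pd

deals = pd.read_csv("sdc_bank_deals.csv")  # hypothetical extract of the SDC data

sample = (
    deals
    .loc[lambda d: d.acquirer_id != d.target_id]      # acquirer and target differ
    .loc[lambda d: d.pct_of_target_acquired == 100]   # entire target acquired
    .loc[lambda d: ~d.branch_or_subsidiary_deal]      # no branch/subsidiary deals
    .loc[lambda d: ~d.federally_assisted]             # no assisted failures
    .sort_values("announce_date")
    .drop_duplicates("target_id", keep="first")       # initial bid per target only
    .loc[lambda d: d.deal_value.notna()]              # deal value disclosed
)

# Relative-size screen: target assets between 1% and 100% of acquirer assets
rel = sample.target_assets / sample.acquirer_assets
sample = sample[(rel > 0.01) & (rel < 1.0)]
```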
97.3.2 Selected Sample Characteristics

We show the time distribution of merger bids in Panel A of Table 97.1; the number of bids in a year ranges from 14 to 105, and there is evidence of heightened takeover activity during the 1990s. A similar pattern can be seen both for interstate versus intrastate bids and for bids where the target is a listed or unlisted firm. Our study period spans several substantive changes in the regulatory structure facing commercial banking firms. In particular, while several states permitted banks to engage in acquisitions across state lines, it was not until the passage in 1994 of the Riegle-Neal Interstate Banking and Branching Efficiency Act that nationwide expansion and consolidation was permitted; the regulation became effective on June 1, 1997. As we show in Panel B, the sample contains 690 bids in the pre-Riegle-Neal period and 351 cases in the post-Riegle-Neal period. The distribution of bids by method of payment indicates that a large majority of bank acquisitions use all-stock financing; this is evident for the full sample of bids and also for each sub-sample. We present some summary information on the size of the sample firms in Table 97.2. We aggregate Call Report data at the highest holding company level and use this data. We show in Panel A of the table that the sample contains 1,041 publicly held acquiring firms with 416 listed targets and 625 unlisted targets.¹ Firms that acquire listed (public) targets are significantly larger than acquirers of unlisted (private) firms, both in terms of mean and median values of total assets. We find a similar pattern for target firms, in that both the mean and median total assets of public targets are significantly larger than those of private targets. As we show in the last column of Panel A, the relative size of the merger participants (measured as the ratio of target to acquirer assets) is significantly greater for acquisitions of listed targets. We also find that there are statistically significant differences in acquirer total assets, target total assets, and relative size in both the pre-Riegle-Neal and post-Riegle-Neal periods. Acquirers of listed targets are larger, their targets are larger, and the relative size of the bidder-target pair is greater than in bids that involve unlisted targets. Our results based on sorting the sample between interstate and intrastate bids are reported in Panel C of Table 97.2. The differences in target and acquirer sizes are broadly similar to those reported in Panels A and B of the table.

¹ Out of these 625 private targets, 184 banks trade on the OTC Pink Sheets or on the Bulletin Board and their returns data are not available on CRSP. We treat these unlisted firms as if they are privately owned. We have examined the robustness of this assumption for our subsequent results by splitting the sample into three groups: public, quasi-public, and private. Results are robust to this specification.
Table 97.1 Distribution of sample over time
Panel A: By year

| Year | Public target | Private target | Intrastate | Interstate | Total |
| 1981 | 1 | 14 | 11 | 4 | 15 |
| 1982 | 3 | 11 | 6 | 8 | 14 |
| 1983 | 10 | 12 | 10 | 12 | 22 |
| 1984 | 7 | 14 | 14 | 7 | 21 |
| 1985 | 18 | 6 | 7 | 17 | 24 |
| 1986 | 19 | 24 | 14 | 29 | 43 |
| 1987 | 12 | 23 | 12 | 23 | 35 |
| 1988 | 5 | 13 | 9 | 9 | 18 |
| 1989 | 10 | 18 | 15 | 13 | 28 |
| 1990 | 4 | 13 | 8 | 9 | 17 |
| 1991 | 17 | 16 | 11 | 22 | 33 |
| 1992 | 17 | 32 | 16 | 33 | 49 |
| 1993 | 24 | 63 | 31 | 56 | 87 |
| 1994 | 19 | 86 | 43 | 62 | 105 |
| 1995 | 22 | 57 | 40 | 39 | 79 |
| 1996 | 27 | 46 | 27 | 46 | 73 |
| 1997 | 32 | 39 | 28 | 43 | 71 |
| 1998 | 29 | 29 | 23 | 35 | 58 |
| 1999 | 37 | 26 | 28 | 35 | 63 |
| 2000 | 26 | 10 | 14 | 22 | 36 |
| 2001 | 22 | 7 | 14 | 15 | 29 |
| 2002 | 12 | 8 | 7 | 13 | 20 |
| 2003 | 19 | 6 | 12 | 13 | 25 |
| 2004 | 12 | 5 | 5 | 12 | 17 |
| 2005 | 12 | 47 | 33 | 26 | 59 |
| Total | 416 | 625 | 438 | 603 | 1,041 |

Panel B: By period

| Period | Public target | Private target | Intrastate | Interstate | Total |
| Pre-Riegle | 225 | 465 | 283 | 407 | 690 |
| Post-Riegle | 191 | 160 | 155 | 196 | 351 |
| Total | 416 | 625 | 438 | 603 | 1,041 |

Panel C: By payment

| Payment | Public target | Private target | Intrastate | Interstate | Total |
| Cash payment | 48 | 105 | 79 | 74 | 153 |
| Other payment | 81 | 121 | 105 | 97 | 202 |
| Stock payment | 287 | 399 | 254 | 432 | 686 |
| Total | 416 | 625 | 438 | 603 | 1,041 |
Public and private target acquisitions over time are shown, as well as intrastate and interstate acquisitions. Summary breakdowns of the sample by pre- and post-Riegle-Neal periods and by payment type are also presented
97.4 Event Study Analysis

97.4.1 Event Study Approach

Our empirical analysis proceeds in two stages. We employ standard event study techniques to document acquirer gains in acquisitions of public and private targets (see, for example, MacKinlay 1997). For each firm $i$, the regression parameters $(\alpha_i, \beta_i)$ are estimated from the market model over the estimation period $t = (-250, -10)$. The market model is shown below:

$$R_{it} = \alpha_i + \beta_i R_{mt} + \epsilon_{it} \qquad (97.1)$$

Over the event period $t = (-1, +1)$, the abnormal return for each firm, $AR_{it}$, is estimated daily by applying the regression parameters estimated from the market model:

$$AR_{it} = R_{it} - (\alpha_i + \beta_i R_{mt}) \qquad (97.2)$$

Daily abnormal returns are averaged in event time across the sample of firms to obtain the average abnormal returns $AAR_t$ for the event period $t = (-10, +10)$.
Table 97.2 Distribution of acquirer and target assets in acquisitions made by public banks during the 1981–2005 period

Panel A: Assets (in $m)

| Sample | Cases | Acquirer assets, mean | Acquirer assets, median | Target assets, mean | Target assets, median | Relative size (%), mean | Relative size (%), median |
| Public | 416 | 32,137 | 9,478 | 6,416 | 933 | 22.00 | 12.70 |
| Private | 625 | 4,775 | 2,120 | 234 | 129 | 10.50 | 6.62 |
| Total | 1,041 | 15,709 | 3,749 | 2,704 | 232 | 15.10 | 8.61 |
| Statistic | | 7.55 | 134.61 | 5.41 | 393.26 | 10.78 | 41.17 |

Panel B: By period

Pre-Riegle (January 1981–June 1997)
| Public | 225 | 22,250 | 11,515 | 3,597 | 1,107 | 21.40 | 13.10 |
| Private | 465 | 4,815 | 2,399 | 227 | 119 | 9.56 | 5.82 |
| Total | 690 | 10,501 | 3,852 | 1,326 | 198 | 13.40 | 7.48 |
| Statistic | | 11.11 | 108.05 | 8.76 | 243.12 | 8.76 | 108.05 |

Post-Riegle (July 1997–December 2005)
| Public | 191 | 43,783 | 7,751 | 9,736 | 804 | 22.60 | 12.50 |
| Private | 160 | 4,658 | 1,535 | 253 | 149 | 13.30 | 9.42 |
| Total | 351 | 25,948 | 3,177 | 5,413 | 327 | 18.40 | 11.10 |
| Statistic | | 3.87 | 47.85 | 2.92 | 107.06 | 8.76 | 108.05 |

Panel C: By location

Intrastate
| Public | 114 | 5,614 | 2,424 | 1,018 | 593 | 31.80 | 25.20 |
| Private | 324 | 2,075 | 1,188 | 169 | 109 | 14.10 | 9.85 |
| Total | 438 | 2,996 | 1,357 | 390 | 145 | 18.70 | 12.10 |
| Statistic | | 6.29 | 18.04 | 8.67 | 107.02 | 9.41 | 33.31 |

Interstate
| Public | 302 | 42,149 | 15,752 | 8,453 | 1,224 | 18.30 | 9.68 |
| Private | 301 | 7,681 | 4,201 | 304 | 158 | 6.64 | 3.86 |
| Total | 603 | 24,943 | 7,544 | 4,386 | 413 | 12.50 | 6.17 |
| Statistic | | 5.73 | 114.71 | 4.24 | 253.54 | 9.36 | 53.14 |

The mean and median of acquirer total assets, target total assets, and relative size are shown according to whether the target bank is public or private. Sub-period statistics are shown in Panel B for the Pre-Riegle (January 1981–June 1997) and Post-Riegle (July 1997–December 2005) periods. In the Statistic rows, the t-statistic for the difference of means across the public and private sub-samples is reported under the mean columns, and Pearson's continuity-corrected X²-statistic for the test of medians under the median columns. The symbols *, **, *** denote statistical significance at the 10, 5, and 1% levels, respectively
$$AAR_t = \frac{1}{N}\sum_{i=1}^{N} AR_{it} \qquad (97.3)$$

Cumulative average abnormal returns $CAAR_{a,b}$ are obtained by summing the $AAR_t$ over the time interval $t = a, \ldots, b$:

$$CAAR_{a,b} = \sum_{t=a}^{b} AAR_t \qquad (97.4)$$

In the absence of abnormal performance, the expected values of $AAR_t$ and $CAAR_{a,b}$ are zero. The test statistics for $AAR_t$ and $CAAR_{a,b}$ are respectively based on the Average Standardized Abnormal Return $ASAR_t$ and the Average Standardized Cumulative Abnormal Return $ASCAR_{a,b}$, where

$$ASAR_t = \frac{1}{N}\sum_{i=1}^{N} \frac{AR_{it}}{S_{it}} \qquad (97.5)$$

$$ASCAR_{a,b} = \frac{1}{\sqrt{b-a+1}}\sum_{t=a}^{b} ASAR_t \qquad (97.6)$$

and $S_{it}$, the square root of firm $i$'s estimated forecast variance, is given by

$$S_{it} = \left[ S_i^2 \left( 1 + \frac{1}{L_i} + \frac{(R_{mt} - \bar{R}_m)^2}{\sum_{k=1}^{L_i} (R_{mk} - \bar{R}_m)^2} \right) \right]^{1/2} \qquad (97.7)$$

where
$S_i^2$ = residual variance for firm $i$ from the market model regression,
$L_i$ = number of observations for firm $i$ in the estimation period,
$R_{mt}$ = market return for day $t$ of the event period,
$\bar{R}_m$ = mean market return for the estimation period,
$R_{mk}$ = market return for day $k$ of the estimation period.
Assuming cross-sectional independence, $ASAR_t$ approaches a normal distribution with mean zero and variance $1/N$. For large samples, the statistics $Z_t$ and $Z_{a,b}$ are unit normal and can be used to test the null hypothesis that $AAR_t$ and $CAAR_{a,b}$ equal zero, where

$$Z_t = \sqrt{N}\, ASAR_t \qquad (97.8)$$

$$Z_{a,b} = \sqrt{N}\, ASCAR_{a,b} = \frac{\sqrt{N}}{\sqrt{b-a+1}}\sum_{t=a}^{b} ASAR_t \qquad (97.9)$$

We also employ a non-parametric test to check the robustness of our conclusions regarding value enhancements. We compute the proportion of positive abnormal returns and the corresponding binomial z-statistic. Based on the market model, a positive residual is as likely as a negative residual on any day. Thus, a binomial z-statistic for the proportion of positive observations is given by

$$z_p = \frac{PPOS - p}{\sqrt{p(1-p)/N}} \qquad (97.10)$$

where $N$ is the total number of observations, $PPOS$ is the proportion of positives (the number of cases with a positive abnormal return divided by $N$), and $p$ is the expected proportion of positives (0.5).
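The statistics above are mechanical to compute once the return matrices are in hand. The following NumPy function is a minimal sketch of Equations (97.1)-(97.10) under the stated windows; the input arrays are hypothetical stand-ins, not our CRSP data, and the windowing conventions are one reasonable reading of the text.

```python
import numpy as np

def event_study(R, Rm):
    """Minimal sketch of Eqs. (97.1)-(97.10).

    R  : (N, 261) firm returns for event days t = -250, ..., +10
    Rm : (261,)   market returns on the same days (hypothetical inputs)
    """
    est, evt = slice(0, 241), slice(240, 261)  # (-250,-10) and (-10,+10)
    N = R.shape[0]
    x, y = Rm[est], R[:, est]
    xbar = x.mean()
    ssx = ((x - xbar) ** 2).sum()
    beta = ((x - xbar) * (y - y.mean(axis=1, keepdims=True))).sum(axis=1) / ssx
    alpha = y.mean(axis=1) - beta * xbar                         # Eq. (97.1)
    resid = y - alpha[:, None] - beta[:, None] * x
    L = x.size
    s2 = (resid ** 2).sum(axis=1) / (L - 2)                      # residual variance

    AR = R[:, evt] - (alpha[:, None] + beta[:, None] * Rm[evt])           # (97.2)
    S = np.sqrt(s2[:, None] * (1 + 1 / L + (Rm[evt] - xbar) ** 2 / ssx))  # (97.7)
    AAR = AR.mean(axis=0)                                        # (97.3)
    CAAR = AAR.cumsum()                                          # (97.4)
    ASAR = (AR / S).mean(axis=0)                                 # (97.5)
    Z_t = np.sqrt(N) * ASAR                                      # (97.8)
    Z_ab = np.sqrt(N) * ASAR.sum() / np.sqrt(ASAR.size)          # (97.9)
    ppos = (AR > 0).mean(axis=0)                 # daily share of positive ARs
    z_p = (ppos - 0.5) / np.sqrt(0.25 / N)                       # (97.10)
    return AAR, CAAR, Z_t, Z_ab, z_p
```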
97.4.2 Gains to Acquiring Firms for Listed and Unlisted Targets

We summarize gains to acquiring firms in bids for listed and unlisted targets in Table 97.3. We present the full sample results in the last column of the table and show that acquiring firms suffer significant losses in bids for listed targets; the announcement period abnormal return is −2.02% and is statistically significant at the 1% level (z-statistic = −15.79). When the target is an unlisted firm the acquiring firm's loss is −0.30%, which is also significant at the 1% level (z-statistic = −2.81). We also report that in bids for private targets the acquiring firm's abnormal return is 1.72% larger than in bids for public targets, and this difference is statistically significant at the 1% level (z-statistic = 7.80). This finding supports the presence of a listing effect on acquirer returns in bank mergers. We sub-sample according to method of payment and find that when the target is an unlisted firm, acquiring firms lose only when the bid is all-stock financed. Even in this case, however, the listing effect is positive and statistically significant. We report sub-sample results for interstate versus intrastate bids sorted by the method of payment in the first two columns of Table 97.3; these confirm the presence of a statistically significant listing effect except when the method of payment is cash. Our finding that acquiring firms incur a significant loss when the target is an unlisted firm and the method of payment is all-stock runs counter to results reported for bids in the non-financial sector. In particular, those studies report that bidder gains in bids for unlisted targets are largest when the deal is all-stock financed. As the first two columns of the table indicate, this result is driven by the subset of interstate bids; acquiring firms that bid for unlisted targets domiciled in the same state experience insignificant announcement period wealth changes.

97.5 Findings Based on Multiples

97.5.1 Acquisition Multiples

We use the dollar value of the bid to estimate acquisition multiples for the two types of target firms. The equity multiple $EM_i$ is computed as the ratio of the bid amount to the book value of equity for the most recent quarter:

$$EM_i = \frac{\text{Estimated Value Paid}_i}{\text{Book Value of Equity}_i} \qquad (97.11)$$

We then estimate cross-sectional regressions using the acquisition multiples as the dependent variable to examine how differences in acquirer and target characteristics affect the acquisition multiple. Given that financial data for both listed and unlisted targets are available from regulatory filings, our regressions control for the possible impact of several measures of growth, leverage, and profitability on the observed multiples.
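To make Equation (97.11) concrete, consider an illustrative deal (the dollar figures here are hypothetical, not from the sample): a bid valued at \$150 million for a target reporting \$62.5 million of book equity in its most recent quarterly filing yields

$$EM_i = \frac{150}{62.5} = 2.40,$$

which coincides with the mean multiple received by listed targets in our sample (Table 97.4).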
97.5.2 Summary Statistics for Multiples

We present summary statistics on the bid amount and the ratio of the bid amount to the book value of target equity in Table 97.4. As we show in Panel A, the average amount paid for listed targets is $1.2 billion with a median of $170 million, versus an average payment of $41 million for unlisted targets with a median of $22 million. This statistically significant difference in both mean and median values is not surprising, given that listed targets are significantly larger than unlisted firms, as we presented in Table 97.2. Our estimate of the equity multiple offered to the target firm has a mean value of 2.40 and median of 2.23 for public targets,
Table 97.3 Summary of acquirer abnormal returns from (−1 to +1) around the announcement of the acquisition

| | N | Intrastate CAR (%) | z | N | Interstate CAR (%) | z | N | Full sample CAR (%) | z |
All bids:
| Acquirer CAR: public | 114 | −2.43 | −8.59 | 302 | −1.87 | −13.26 | 416 | −2.02 | −15.79 |
| Acquirer CAR: private | 324 | −0.12 | −0.52 | 301 | −0.51 | −3.51 | 625 | −0.30 | −2.81 |
| Total | 438 | −0.72 | −4.83 | 603 | −1.19 | −11.86 | 1,041 | −0.99 | −12.16 |
| Private less public: CAR | | 2.32 | 3.93 | | 1.36 | 6.91 | | 1.72 | 7.80 |
Cash payment:
| Acquirer CAR: public | 11 | −1.78 | −1.68 | 37 | 0.02 | 0.25 | 48 | −0.42 | −1.31 |
| Acquirer CAR: private | 68 | −0.04 | −0.27 | 37 | 0.25 | 0.84 | 105 | 0.06 | 0.23 |
| Total | 79 | −0.21 | −0.37 | 74 | 0.13 | 0.77 | 153 | −0.17 | −0.80 |
| Private less public: CAR | | 1.83 | 0.88 | | 0.23 | 0.42 | | 0.36 | 0.34 |
Other payment:
| Acquirer CAR: public | 36 | −2.08 | −4.17 | 45 | −1.28 | −4.85 | 81 | −1.63 | −6.39 |
| Acquirer CAR: private | 69 | −0.03 | −0.34 | 52 | −0.39 | −1.16 | 121 | −0.18 | −1.02 |
| Total | 105 | −0.74 | −2.72 | 97 | −0.80 | −4.16 | 202 | −0.77 | −4.84 |
| Private less public: CAR | | 2.05 | 2.16 | | 0.89 | 2.45 | | 1.45 | 3.26 |
Stock payment:
| Acquirer CAR: public | 67 | −2.73 | −7.47 | 220 | −2.30 | −13.24 | 287 | −2.40 | −15.20 |
| Acquirer CAR: private | 187 | −0.20 | −0.65 | 212 | −0.58 | −3.25 | 399 | −0.40 | −2.81 |
| Total | 254 | −0.87 | −4.39 | 432 | −1.46 | −11.73 | 686 | −1.24 | −11.98 |
| Private less public: CAR | | 2.52 | 3.28 | | 1.72 | 7.17 | | 2.00 | 7.69 |

The CARs are shown in percent with the corresponding z-statistics and significance levels. The symbols *, **, *** denote statistical significance at the 10, 5, and 1% levels, respectively
versus an average equity multiple of 2.07 for private targets with a median of 1.91. Differences in both the mean and median values of the equity multiples paid to listed and unlisted targets are statistically significant, with public targets getting a higher value for their ownership stake. These findings suggest the presence of an acquisition discount in deals where the target is an unlisted firm; the analysis, however, does not control for differences in size and possible differences in financial characteristics of listed versus unlisted targets or their acquirers. We use multivariate regressions to allow for these possibilities, and report the findings in Sects. 97.5.3, 97.5.4 and 97.6 of the paper. We show related findings for sub-samples in Panels B, C, and D of Table 97.4 according to the timing of the bid, differences in location, and the method of payment. As we show in the second column of Table 97.4, there are statistically significant differences in the amount bid for listed targets relative to those that are unlisted. These findings are as expected, given that unlisted targets are in all cases significantly smaller than listed targets. Our results based on the sub-samples presented in the last column of the table are more interesting. Our findings for bids made during the pre-Riegle-Neal period are similar to those for the full sample of bids; listed targets received significantly larger value for their holdings relative to owners of unlisted firms. Differences in the post-Riegle-Neal period, however, are not statistically significant. Potential explanations for this finding include the possibility that ongoing consolidation in the industry made both listed and unlisted targets more valuable to potential acquirers. In addition, as we show in Panel B of Table 97.2, firms that acquired listed targets in the post-Riegle-Neal period are on average almost twice as large as those in the pre-Riegle-Neal period. If, as found by Moeller et al. (2004), smaller firms make better acquisitions, then the difference in bidder size during these two time periods may contribute to the observed findings. As we show in Panel C, listed targets obtain acquisition multiples that are significantly larger in both interstate and intrastate bids. The observed multiples for both types of targets are, however, larger in interstate bids, a finding consistent with the proposition that in bank mergers in-market deals are expected to create relatively more value (Houston and Ryngaert 1994). We present our findings on acquisition multiples in deals sorted by the method of payment in Panel D of the table. The difference in median values for listed versus unlisted targets is not statistically significant for deals financed with cash or when mixed financing is used. The mean difference is significant in both cases, but only at the 5% and 10% levels. We find that listed targets receive a significantly larger multiple than do unlisted targets in bids where the method of payment is the acquirer's stock. Taken together, the findings on the relationship between deal multiples and the method of financing
Table 97.4 Value estimated to be paid by the acquirer to the target at the time of announcement

| | Cases | Estimated value ($m), mean | Estimated value ($m), median | Equity multiple, mean | Equity multiple, median |
Panel A: Total
| Public | 416 | 1,243 | 170 | 2.40 | 2.23 |
| Private | 625 | 41 | 22 | 2.07 | 1.91 |
| Total | 1,041 | 522 | 42 | 2.20 | 2.03 |
| Statistic | | 5.35 | 373.43 | 4.91 | 29.20 |
Panel B: Regime
Pre-Riegle
| Public | 225 | 456 | 168 | 2.10 | 2.00 |
| Private | 465 | 34 | 19 | 1.90 | 1.80 |
| Total | 690 | 172 | 33 | 2.00 | 1.80 |
| Statistic | | 9.65 | 233.01 | 3.06 | 7.62 |
Post-Riegle
| Public | 191 | 2,170 | 171 | 2.70 | 2.60 |
| Private | 160 | 63 | 36 | 2.60 | 2.50 |
| Total | 351 | 1,210 | 85 | 2.70 | 2.50 |
| Statistic | | 3.27 | 98.37 | 0.95 | 1.28 |
Panel C: Location
Intrastate
| Public | 114 | 170 | 105 | 2.31 | 2.17 |
| Private | 324 | 31 | 19 | 1.99 | 1.83 |
| Total | 438 | 67 | 26 | 2.08 | 1.92 |
| Statistic | | 11.06 | 93.93 | 3.48 | 6.27 |
Interstate
| Public | 302 | 1,648 | 217 | 2.44 | 2.23 |
| Private | 301 | 53 | 28 | 2.16 | 1.98 |
| Total | 603 | 852 | 74 | 2.30 | 2.09 |
| Statistic | | 4.23 | 235.72 | 2.87 | 12.55 |
Panel D: Payment
Cash
| Public | 48 | 186 | 105 | 2.14 | 1.93 |
| Private | 105 | 35 | 20 | 1.90 | 1.75 |
| Total | 153 | 82 | 31 | 1.98 | 1.82 |
| Statistic | | 6.82 | 37.85 | 1.80 | 1.62 |
Other
| Public | 81 | 895 | 157 | 2.24 | 1.98 |
| Private | 121 | 39 | 22 | 1.85 | 1.72 |
| Total | 202 | 383 | 41 | 2.00 | 1.80 |
| Statistic | | 2.27 | 84.42 | 2.30 | 2.06 |
Stock
| Public | 287 | 1,518 | 203 | 2.49 | 2.31 |
| Private | 399 | 44 | 23 | 2.19 | 2.02 |
| Total | 686 | 661 | 51 | 2.31 | 2.14 |
| Statistic | | 4.62 | 274.34 | 3.71 | 20.15 |

Estimated value as a multiple of the target's most recent quarter book equity is shown. The mean and median are presented. The two-sample t-statistic for the difference of means and Pearson's continuity-corrected X²-statistic for the test of equality of medians are reported. The symbols *, **, *** denote statistical significance at the 10, 5, and 1% levels, respectively
suggest that this variable plays an important role in these acquisitions; we examine this in greater detail in the regression analysis reported later in the paper.
97.5.3 Relating Acquisition Multiples to Firm and Deal Characteristics

In this section we report findings from cross-sectional regressions designed to examine how the estimated acquisition multiples are related to a set of deal, acquiring firm, and target firm characteristics. Summary statistics on the set of explanatory variables used are presented in Table 97.5 and regression results in Sect. 97.5.4 of the paper. Our explanatory variables fall into five broad categories: profitability, capital and asset-liability structure, asset quality, loan composition, and other. We discuss each of these in turn.

97.5.3.1 Profitability

Our proxies for profitability include ROA, ROE, and interest margin (net interest as a percentage of total assets). As shown
Table 97.5 Selected operating and profitability statistics

Columns in each block: acquirer mean, target mean, t-statistic (means test), acquirer median, target median, Wilcoxon z-statistic.

Panel A: Profitability

ROA (%)
| Public | 2.85 | 2.48 | 5.50 | 2.92 | 2.56 | 6.39 |
| Private | 2.93 | 2.40 | 7.14 | 2.96 | 2.55 | 8.29 |
| Total | 2.90 | 2.43 | 8.94 | 2.94 | 2.55 | 10.47 |
| Statistic | 1.50 | 0.75 | | 0.64 | 0.01 | |

ROE (%)
| Public | 37.20 | 30.30 | 8.20 | 37.80 | 32.10 | 8.76 |
| Private | 36.30 | 27.70 | 11.38 | 36.50 | 29.70 | 12.71 |
| Total | 36.60 | 28.70 | 14.02 | 36.90 | 30.60 | 15.45 |
| Statistic | 1.48 | 2.38 | | 4.47 | 11.41 | |

Interest margin (net interest as % of total assets)
| Public | 5.41 | 5.77 | 2.42 | 5.47 | 5.56 | 2.58 |
| Private | 6.02 | 5.65 | 2.60 | 5.63 | 5.82 | 2.48 |
| Total | 5.77 | 5.70 | 0.76 | 5.54 | 5.64 | 0.50 |
| Statistic | 2.66 | 1.38 | | 0.45 | 0.40 | |

Panel B: Capital structure

Equity-to-asset (book value of equity as % of total assets)
| Public | 7.82 | 8.11 | 3.00 | 7.67 | 7.88 | 3.72 |
| Private | 8.16 | 8.94 | 7.92 | 8.02 | 8.49 | 7.48 |
| Total | 8.02 | 8.61 | 2.80 | 7.91 | 8.20 | 8.19 |
| Statistic | 3.19 | 5.72 | | 3.45 | 0.85 | |

Deposit-to-asset (total deposits as % of total assets)
| Public | 75.50 | 81.80 | 13.02 | 76.70 | 83.70 | 13.45 |
| Private | 81.60 | 87.40 | 20.56 | 82.60 | 88.40 | 23.78 |
| Total | 79.20 | 85.10 | 27.32 | 80.30 | 87.00 | 13.71 |
| Statistic | 8.23 | 10.25 | | 100.71 | 103.27 | |

Other-to-asset (debt and other as % of total assets)
| Public | 16.70 | 10.10 | 15.50 | 14.97 | 7.71 | 11.96 |
| Private | 10.20 | 3.66 | 23.28 | 8.81 | 1.93 | 18.57 |
| Total | 12.80 | 6.25 | 25.71 | 11.40 | 3.82 | 21.74 |
| Statistic | 13.10 | 17.52 | | 131.75 | 203.42 | |

Loan-to-equity (multiple)
| Public | 8.49 | 8.18 | 2.24 | 8.33 | 7.95 | 3.75 |
| Private | 7.78 | 7.03 | 7.26 | 7.70 | 7.03 | 7.04 |
| Total | 8.06 | 7.49 | 6.88 | 7.99 | 7.39 | 7.87 |
| Statistic | 6.21 | 7.28 | | 22.76 | 29.20 | |

Panel C: Asset quality

Loan loss (loan loss allowance as % of total loans)
| Public | 1.62 | 1.56 | 2.01 | 1.48 | 1.38 | 2.93 |
| Private | 1.62 | 1.46 | 4.27 | 1.43 | 1.28 | 7.38 |
| Total | 1.62 | 1.50 | 4.70 | 1.44 | 1.33 | 7.63 |
| Statistic | 0.09 | 2.18 | | 2.79 | 11.42 | |

Net charge-off (net charge-off as % of total loans)
| Public | 0.36 | 0.37 | 0.29 | 0.29 | 0.25 | 1.18 |
| Private | 0.30 | 0.30 | 0.14 | 0.24 | 0.18 | 4.36 |
| Total | 0.32 | 0.33 | 0.08 | 0.26 | 0.21 | 4.06 |
| Statistic | 4.30 | 3.04 | | 11.42 | 15.09 | |

Panel D: Loan composition

Return on loans (interest on loans as % of total loans)
| Public | 21.70 | 22.50 | 6.19 | 21.00 | 21.90 | 7.60 |
| Private | 22.20 | 22.90 | 4.35 | 21.70 | 22.70 | 9.21 |
| Total | 22.00 | 22.70 | 6.86 | 21.50 | 22.40 | 12.01 |
| Statistic | 2.10 | 1.58 | | 11.08 | 9.46 | |

Real estate loan (real estate loans as % of total loans)
| Public | 50.50 | 57.80 | 8.80 | 51.60 | 59.60 | 8.48 |
| Private | 52.60 | 58.90 | 9.33 | 53.60 | 59.80 | 8.72 |
| Total | 51.80 | 58.50 | 12.80 | 52.70 | 59.70 | 12.08 |
| Statistic | 1.92 | 0.92 | | 0.85 | 0.03 | |

Loan-to-assets (total loans as % of total assets)
| Public | 63.50 | 62.80 | 1.23 | 65.00 | 63.30 | 1.38 |
| Private | 61.70 | 58.70 | 5.28 | 62.70 | 59.90 | 5.05 |
| Total | 62.40 | 60.40 | 4.89 | 63.70 | 61.50 | 5.06 |
| Statistic | 3.12 | 5.46 | | 11.42 | 10.58 | |

Panel E: Miscellaneous

SIGMA of ROA (based on ROA of the last eight quarters, %)
| Public | 0.34 | 0.34 | 0.08 | 0.33 | 0.32 | 1.37 |
| Private | 0.35 | 0.38 | 2.25 | 0.35 | 0.33 | 0.24 |
| Total | 0.35 | 0.36 | 2.03 | 0.34 | 0.32 | 2.83 |
| Statistic | 1.70 | 2.57 | | 2.04 | 0.67 | |

Asset growth (annualized, based on assets of eight quarters, %)
| Public | 31.80 | 16.30 | 6.37 | 19.00 | 10.30 | 8.83 |
| Private | 25.90 | 9.62 | 11.73 | 17.20 | 6.50 | 14.87 |
| Total | 28.30 | 12.30 | 12.47 | 17.80 | 7.84 | 17.00 |
| Statistic | 2.50 | 5.30 | | 1.21 | 29.20 | |

Salary cost (total salary expense as % of total assets)
| Public | 3.53 | 4.00 | 6.94 | 3.49 | 3.84 | 6.94 |
| Private | 3.83 | 4.03 | 3.31 | 3.70 | 3.88 | 3.45 |
| Total | 3.71 | 4.02 | 6.76 | 3.62 | 3.87 | 7.02 |
| Statistic | 4.65 | 0.37 | | 11.08 | 0.45 | |

Operating cost (total operating costs as % of total assets)
| Public | 16.90 | 17.30 | 2.55 | 16.20 | 16.30 | 0.79 |
| Private | 16.80 | 16.80 | 0.31 | 16.10 | 16.30 | 0.93 |
| Total | 16.90 | 17.00 | 1.24 | 16.20 | 16.30 | 2.71 |
| Statistic | 0.45 | 2.14 | | 0.12 | 0.01 | |

Liquidity (cash and securities as % of total assets)
| Public | 13.70 | 14.20 | 1.64 | 9.45 | 8.54 | 0.99 |
| Private | 18.20 | 18.80 | 1.61 | 15.69 | 14.70 | 3.19 |
| Total | 16.40 | 17.00 | 2.19 | 14.40 | 11.50 | 0.68 |
| Statistic | 5.25 | 4.73 | | 6.50 | 6.51 | |

For the public and private samples, a paired t-statistic is reported for the acquirer versus target difference in means and a Wilcoxon signed rank z-statistic for the difference in medians. Across the public and private samples (the Statistic rows), the two-sample t-statistic for means and Pearson's X²-statistic for the median test are reported under the mean and median columns, respectively. The symbols *, **, *** denote statistical significance at the 10, 5, and 1% levels, respectively
in Panel A of the table, acquirers of both listed and unlisted targets have significantly higher levels of ROA and ROE than the corresponding target firms. In contrast, the acquirer’s interest margin is significantly larger (smaller) than the target’s for unlisted (listed) targets. The difference in mean values of ROA and ROE for acquirers of listed versus unlisted targets is not statistically significant, but acquirers of unlisted targets have a significantly higher interest margin than those of listed targets. For target firms, we find insignificant differences in ROA and interest margin for listed versus unlisted firms, but the ROE is significantly larger for listed targets. When viewed as a whole, it is difficult to identify a systematic pattern of differences in profitability between the subsets of acquiring and target firms; we expect the multivariate analysis to provide further insight into how these differences may impact the acquisition discount.
97.5.3.2 Capital and Asset-Liability Structure

We present summary statistics for four measures related to capital: the ratios of equity-to-assets, deposits-to-assets, other-to-assets, and loans-to-equity. Equity-to-assets is a measure of the firm's capital position. As shown in Panel B of Table 97.5, target firms in all cases have significantly larger values of this ratio relative to acquiring firms. In addition, the equity-to-assets ratio is significantly larger for unlisted relative to listed targets. The deposits-to-assets ratio is a proxy for the firm's ability to leverage its deposit base into income-generating assets; we find that acquirers of both listed and unlisted targets have significantly smaller values of this ratio relative to the corresponding target firms. In addition, both unlisted targets and their acquirers have higher deposits-to-assets ratios than do listed targets and their acquirers. Smaller, unlisted targets appear to have a better ability to leverage their deposit base into assets, due perhaps in part to their superior ability at relationship-based lending (Berger and Udell 2002). The ratio labeled other-to-assets primarily captures the firm's ability to access debt markets. Not surprisingly, this ratio is significantly larger for acquiring relative to target firms, and for listed relative to unlisted targets. A similar pattern is evident for the loan-to-equity ratio.
97.5.3.3 Asset Quality

We use two measures to proxy for loan quality. As we report in Panel C of the table, the ratio of loan loss allowance to total loans (Loan Loss) shows significant differences between acquiring and target firms; acquiring firms have larger values of this ratio in all cases. In addition, we find that unlisted targets have significantly lower values of this ratio than do
listed targets, suggesting that loan quality is better managed at unlisted target firms. Findings from the second measure of loan quality, the ratio of net charge-offs to total loans (Net Charge-Off), are mixed, with the exception that unlisted targets have significantly lower values of this ratio than do listed targets.
97.5.3.4 Loan Composition

We present three proxies for loan composition and returns in Panel D of Table 97.5. Return-on-loans serves as a proxy for the firm's ability to support higher rates of interest on outstanding loans. We find that target firms have significantly higher values of this ratio. When combined with the findings on asset quality, this result suggests that target firms do better at asset management than does the set of acquiring firms. We also find that target firms in the sample make relatively more real estate loans; the ratio of real estate loans to total loans (Real Estate Loan) is significantly larger for target firms relative to the corresponding acquiring firms. Values of this variable are of comparable magnitude for listed versus unlisted targets. Our findings on the loan-to-assets ratio indicate that there are significant differences between unlisted targets and their acquirers; the latter have larger values of this ratio. In addition, listed targets have significantly larger values of loan-to-assets relative to unlisted targets, contributing perhaps to the lower ROE measures given in Panel A of the table.
97.5.3.5 Other

This category includes a set of five variables. We use the standard deviation of ROA as a proxy for income volatility; as shown in Panel E of the table, findings on this variable are mixed. An annualized measure of asset growth over the eight quarters preceding the bid (Asset Growth) is used as a proxy for the firm's growth prospects; acquiring firms in all cases display significantly higher growth rates than do the corresponding targets. In addition, listed targets display higher growth rates than do unlisted targets. The latter finding suggests that the limited growth prospects of unlisted targets may imply that they have less value than listed targets of comparable size. The measures salary cost and operating cost serve as proxies for possible expense preference behavior and operating efficiency (Verbrugge and Jahera 1981). Salary costs are significantly smaller for acquiring firms than for the corresponding targets, but are of comparable magnitude for listed versus unlisted targets. Findings on operating costs are mixed, with the exception that acquiring firms have lower values of this measure than do the target firms. Given that
acquiring firms are much larger than the corresponding target firms, this difference may be driven by scale economies in operating costs at larger institutions. The last variable in Panel E is liquidity, computed as the ratio of cash and securities to total assets. We include this measure because Officer (2007) finds that the acquisition discount for unlisted firms in the non-financial sector is related to the target firm’s need for liquidity. As noted in Panel E of the table, liquidity is significantly higher for unlisted relative to listed targets, suggesting that a need for liquidity is unlikely to be a significant driver of the acquisition discount in acquisitions involving commercial banks.
97.6 Cross-Sectional Analysis

97.6.1 Regression Specification

Our primary interest in this section is to examine whether there are significant differences in the valuation multiples received by owners of unlisted targets relative to those received by listed targets. We do this by estimating regressions with the deal value to equity multiple (EM) as the dependent variable, and include a variety of target firm, bidding firm, and deal characteristics as potential explanatory variables. A dummy variable that takes a value of 1 if the target is listed (List) and zero otherwise is designed to capture the possibility that an acquisition discount still exists after controlling for other characteristics that may influence the acquisition multiple. The set of explanatory variables includes controls for target characteristics captured in the following nine variables: (1) the relative size of the target (the ratio of target to bidder assets); (2) the target firm's age (information asymmetry may be declining in firm age); (3) distance from the acquirer (to capture the possible influence of in-market deals on the valuation multiple); (4) dummy variables to distinguish between high and low values of the net interest margin; (5) the deposit-to-assets ratio; (6) liquidity; (7) loan losses; (8) asset growth; and (9) ROA. Acquirer characteristics include an acquirer size dummy that takes a value of 1 for higher-than-median values of total assets and is zero otherwise, which may control for the firm size effect (Moeller et al. 2004); a dummy for high levels of the acquirer equity-to-assets ratio; and a dummy variable that equals 1 if the acquirer ROA is greater than the target ROA. We measure five different aspects of deal characteristics, including (1) the cash method of payment; (2) other non-stock methods of payment; (3) the regulatory regime (pre-Riegle-Neal or post-Riegle-Neal); (4) a location dummy that equals 1 for interstate bids and is zero otherwise; and (5) an interaction term between post-Riegle-Neal and interstate. Finally, we have a dummy variable that captures whether the target is a listed firm. This variable is of primary importance in our study. The regression equation is summarized as follows:

$$EM_i = \alpha + \sum_{j=1}^{10} \beta_j\, \text{Target\_Char}_{i,j} + \sum_{k=1}^{3} \gamma_k\, \text{Acq\_Char}_{i,k} + \sum_{m=1}^{5} \delta_m\, \text{Deal\_Char}_{i,m} + \eta\, \text{List}_i + \epsilon_i \qquad (97.12)$$

We estimate robust regressions employing White (1980) sandwich estimators of variance. In some specifications we employ year fixed effects. We report the results of four regression specifications.
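As a rough illustration of this specification, the sketch below estimates an equation of the form (97.12) by OLS with White (1980) heteroskedasticity-robust standard errors, using statsmodels on synthetic stand-in data. All column names and the simulated data generating process are our own assumptions for the sketch, not the paper's variables or results.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1041  # number of bids in the final sample

# Synthetic stand-in for the bid-level data set; all names are illustrative.
cont = ["rel_size", "log_age", "log_distance"]
dums = ["high_nim", "high_dep_to_asset", "high_liquidity", "high_loan_loss",
        "high_asset_growth", "high_roa", "acq_large", "acq_high_equity",
        "acq_roa_gt_target", "pay_cash", "pay_other", "post_riegle",
        "interstate", "listed"]
df = pd.DataFrame(rng.normal(size=(n, len(cont))), columns=cont)
for d in dums:
    df[d] = rng.integers(0, 2, n)
df["post_riegle_x_interstate"] = df["post_riegle"] * df["interstate"]
df["em"] = 2.0 + 0.26 * df["listed"] + rng.normal(scale=0.6, size=n)

X = sm.add_constant(df.drop(columns="em"))
res = sm.OLS(df["em"], X).fit(cov_type="HC0")  # White (1980) sandwich s.e.
print(res.params["listed"], res.bse["listed"])
```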
97.6.2 Regression Results

We present the regression results in the first column of Table 97.6. Focusing first on target characteristics, Relative Size has a negative and statistically significant coefficient. This finding is consistent with Houston and Ryngaert (1994), DeLong (2001), and Houston et al. (2001), who report that bidder gains are increasing in the relative size of the bid; lower acquisition multiples received by target firms can be expected to improve bidder gains from the bid. The variable Years since inception measures the age of the target firm, and it has a negative and statistically significant coefficient; since target age is likely to be related to target size, this finding is not surprising. Measures of target profitability produce mixed results. The dummy variable for ROA has a positive and statistically significant coefficient, confirming that more profitable targets receive higher acquisition multiples; the Net Interest Margin dummy also has a positive coefficient but is not statistically significant. The dummy variable for Loan Loss is negative and statistically significant, confirming that better asset quality yields higher acquisition multiples for the target. Consistent with the expectation that higher asset growth should have value, the Asset Growth dummy has a positive and statistically significant coefficient. The ratio of Deposits-to-Assets serves as a proxy for the firm's ability to leverage its deposit base into income-producing assets; as expected, this variable has a positive and statistically significant coefficient. The variable Distance from acquirer is included to capture the possibility that "in-market" deals create more value (Houston and Ryngaert 1994); the coefficient has the wrong sign, and it is not statistically significant. We include a measure of Target Liquidity because Officer (2007) finds a positive relationship between target liquidity and the acquisition discount. This variable has a negative and statistically significant coefficient, indicating that in bank mergers lower levels of target liquidity are more valuable. This finding is
Table 97.6 Regression results of equity multiple estimated to be paid to target against explanatory variables

| Variable | Model #1 coef. (t) | Model #2 coef. (t) | Model #3 coef. (t) | Model #4 coef. (t) |
Target characteristics:
| Relative target asset size (log) | −0.105 (−2.07) | −0.113 (−2.14) | −0.131 (−2.42) | −0.134 (−2.27) |
| Years since inception (log) | −0.079 (−1.97) | −0.099 (−2.28) | −0.077 (−1.94) | −0.090 (−2.05) |
| High net interest margin dummy | 0.085 (1.14) | 0.058 (0.85) | 0.127 (1.71) | 0.111 (1.58) |
| High deposit to asset dummy | 0.145 (2.32) | 0.132 (2.09) | — | — |
| High equity to asset dummy | — | — | −0.210 (−2.62) | −0.232 (−2.96) |
| High liquidity level dummy | −0.131 (−2.32) | −0.139 (−1.63) | −0.124 (−2.11) | −0.099 (−1.20) |
| High loan loss dummy | −0.117 (−2.07) | −0.113 (−1.97) | −0.099 (−1.77) | −0.169 (−2.69) |
| High asset growth dummy | 0.188 (3.10) | 0.129 (1.72) | 0.176 (2.92) | 0.110 (1.58) |
| High ROA dummy | 0.242 (3.11) | 0.262 (3.26) | 0.277 (3.44) | 0.263 (2.95) |
Acquirer characteristics:
| High acquirer size dummy | 0.110 (1.00) | 0.108 (0.94) | 0.142 (1.23) | 0.132 (1.20) |
| High acquirer equity to asset dummy | — | — | 0.054 (0.67) | 0.099 (1.50) |
| Acquirer ROA higher than target dummy | −0.118 (−1.80) | −0.094 (−0.99) | −0.099 (−1.56) | −0.124 (−1.74) |
Deal characteristics:
| Regime: post-Riegle dummy | 0.668 (8.14) | 0.479 (1.63) | 0.688 (8.29) | 0.433 (1.45) |
| Payment: cash dummy | −0.271 (−3.73) | −0.219 (−3.02) | −0.265 (−3.68) | −0.204 (−2.89) |
| Payment: other dummy | −0.267 (−2.86) | −0.188 (−2.67) | −0.261 (−2.69) | −0.150 (−2.15) |
| Location: interstate dummy | 0.178 (2.66) | 0.176 (2.24) | 0.139 (2.21) | 0.137 (1.87) |
| Acquirer target distance (log) | 0.020 (0.91) | 0.016 (0.76) | 0.021 (0.97) | 0.018 (0.88) |
| Post-Riegle × interstate | −0.110 (−0.86) | −0.117 (−0.88) | −0.122 (−0.96) | −0.104 (−0.78) |
| Target private dummy | 0.258 (3.30) | 0.292 (3.52) | 0.281 (3.37) | 0.311 (3.46) |
| Constant | 2.029 (9.91) | 2.109 (10.46) | 2.121 (8.31) | 2.070 (8.08) |
Robust regression statistics:
| Year fixed effects | No | Yes | No | Yes |
| Number of obs | 1,041 | 1,041 | 1,041 | 1,041 |
| F(23, 1024) | 21.39 | 14.00 | 22.96 | 14.95 |
| Prob > F | 0.00 | 0.00 | 0.00 | 0.00 |
| R-squared | 0.2135 | 0.2599 | 0.2171 | 0.2672 |

All mergers with public and private bank targets from 1981 to 2005. Estimated coefficients and the corresponding robust t-statistics with significance levels are shown. The symbols *, **, *** denote statistical significance at the 10, 5, and 1% levels, respectively
consistent with the proposition that larger holdings of cash and securities reduce the level of loans in the firm's portfolio and have a negative impact on earning capacity. The regression includes two acquiring firm characteristics: Acquirer Size and a dummy variable that equals 1 when the acquirer's ROA is greater than the target's. The acquirer size dummy (large versus small) is included to capture the possibility that smaller firms may make better acquisitions (Moeller et al. 2004); the coefficient has the wrong sign but is not statistically significant. The ROA dummy has a negative and weakly significant coefficient, indicating that targets with a relatively high ROA receive a higher multiple. The third set of control variables consists of variables pertaining to deal characteristics. The regressions include two dummy variables to capture possible method of payment effects; the Cash (Other) dummy takes a value of 1 for cash (other) financing and is zero otherwise. Both these variables have negative and statistically significant coefficients, indicating that the acquisition multiple is larger in bids that use all-stock financing. Estimated coefficients are positive and statistically significant for bids in the post-Riegle-Neal period and for interstate relative to intrastate bids. The post-Riegle-Neal finding suggests that increasing industry consolidation in the late 1990s resulted in acquiring firms paying larger multiples for comparable targets. The finding on interstate versus intrastate bids suggests that acquiring firms that try to increase geographic reach by making acquisitions across state lines end up paying higher acquisition multiples. Since higher multiples imply lower acquirer gains, this finding is consistent with Houston and Ryngaert (1994), who report larger gains in "in-market" deals in the financial sector. The interaction variable Post-Riegle × Interstate is designed to examine whether the changes mandated by the Riegle-Neal Act had an incremental impact on multiples paid in interstate deals. The variable has a negative coefficient and is not statistically significant. The dummy variable for target listing status takes a value of 1 for listed targets and is zero for private targets. This variable has a negative and statistically significant coefficient, confirming the presence of a listing discount after controlling for a variety of firm and deal characteristics. We report regression results in column 2 of Table 97.6 using the same specification as in column 1, but including year fixed effects. Inclusion of a year fixed effect is likely to capture variations in multiples driven by other concerns, such as those in hot IPO markets. We report results that are broadly similar to those given in column 1, with the exception that target liquidity is no longer statistically significant. The coefficient for target listing status is negative and statistically significant, and has a magnitude similar to that reported in column 1 of the table. Our specification in column 3 of the table differs from that in column 1 in that we drop the target deposits-to-assets variable and include an equity-to-assets ratio for both the target and acquiring firm. This variable is a proxy for the firm's capital position, and yields an expected negative and statistically significant coefficient for the target, but is not statistically significant for the acquiring firm. Findings reported in the
last column of the table use the same specification as in column 3, but include year fixed effects. Our findings from this specification are qualitatively similar to those shown in column 3.
97.6.3 Acquisition Discounts

We summarize the predicted multiples from the regression reported in column 4 of Table 97.6 and present the implied acquisition discounts in Table 97.7. Findings are reported for the full sample of bids, and also for sub-samples sorted by interstate versus intrastate bids and by the method of payment. We expect the use of predicted multiples, as opposed to the actual reported multiples, to provide a better estimate of multiples for the various sub-samples by eliminating the idiosyncratic component of specific acquisition transactions. For the full sample of bids, the predicted equity multiple is 2.40 for listed targets and 2.07 for unlisted targets; the difference is statistically significant at the 1% level, confirming the presence of a listing discount in commercial bank acquisitions. The implied valuation discount for unlisted targets is −13.75%.
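For concreteness, the implied discount is simply the percentage shortfall of the private-target multiple relative to the public-target multiple; using the full-sample predicted multiples just cited,

$$\frac{EM_{\text{private}} - EM_{\text{public}}}{EM_{\text{public}}} = \frac{2.07 - 2.40}{2.40} \approx -13.75\%.$$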
Sorting the sample according to target location confirms that a listing discount is present in both interstate and intrastate bids; equity multiples are significantly smaller for unlisted relative to listed targets and the differences are statistically significant at the 1% level. Findings for sub-samples of bids sorted according to target location and the method of payment are broadly similar to those for the full sample of bids. Equity multiples are smaller for unlisted relative to listed targets in all sub-samples, and the difference is statistically significant in all cases except intrastate bids where the method of payment is "Other." Estimated values for the implied valuation discount range from a high of −28.76% (intrastate bids, cash payment) to a low of −8.33% (interstate bids, stock payment).

Table 97.7 Summary of equity multiples and private firm discounts according to payment modes and interstate acquisitions

| | Intrastate | Interstate | Full sample |
All bids:
| Equity multiple: public | 2.37 | 2.41 | 2.40 |
| Equity multiple: private | 1.97 | 2.18 | 2.07 |
| Two-sample means: t-statistic | 6.89 | 5.49 | 9.82 |
| Private firm discount (%) | −17.04 | −9.45 | −13.75 |
Cash payment:
| Equity multiple: public | 2.45 | 2.30 | 2.34 |
| Equity multiple: private | 1.74 | 1.94 | 1.81 |
| Two-sample means: t-statistic | 4.56 | 3.71 | 6.60 |
| Private firm discount (%) | −28.76 | −15.55 | −22.31 |
Other payment:
| Equity multiple: public | 2.02 | 2.19 | 2.11 |
| Equity multiple: private | 1.87 | 2.00 | 1.93 |
| Two-sample means: t-statistic | 1.62 | 2.09 | 2.91 |
| Private firm discount (%) | −7.28 | −8.68 | −8.75 |
Stock payment:
| Equity multiple: public | 2.55 | 2.47 | 2.49 |
| Equity multiple: private | 2.09 | 2.27 | 2.18 |
| Two-sample means: t-statistic | 5.95 | 4.11 | 7.45 |
| Private firm discount (%) | −18.24 | −8.33 | −12.40 |

Predicted equity multiples are computed based on regression model #4 in Table 97.6. A two-sample means test is conducted on the multiples, and the corresponding t-statistics and significance levels are reported. Average private firm discounts for sub-samples are computed from the average predicted multiples. The symbols *, **, *** denote statistical significance at the 10, 5, and 1% levels, respectively

97.7 Conclusions

Several recent papers document that while acquiring firms in the non-financial sector experience no wealth change in bids for listed targets, the wealth impact is positive and statistically significant when the target is an unlisted firm. Further, the observed listing effect is larger for stock-financed bids than when the acquirer pays with cash.
In this paper we document a similar listing effect for acquisition bids in the banking sector. Focusing on merger bids in the banking industry offers several advantages, including the existence of an active market for corporate control, the availability of financial data for unlisted targets for the pre-bid period, and a reduction in the extent to which observed findings for unlisted targets are driven by information asymmetry. Consistent with findings for the non-financial sector, we find that acquiring firms suffer significantly smaller losses when the target is an unlisted firm relative to bids for listed targets. We use multiples of deal value to target equity to estimate the magnitude of the acquisition discount for unlisted targets, and use multivariate regressions to control for differences in acquirer, target, and deal characteristics that could influence the magnitude of the discount. Our findings yield an average acquisition discount of 13.75% for unlisted targets, which is somewhat lower than the average discount of approximately 17% documented by Officer (2007) for acquisitions of unlisted targets in the non-financial sector. We also document that the acquisition discount varies with the method of payment and differs for interstate versus intrastate bids.
References

Becher, D. A. 2000. "The valuation effects of bank mergers." Journal of Corporate Finance 6, 189–214.
Berger, A. and G. Udell. 2002. "Small business credit availability and relationship lending: the importance of bank organizational structure." The Economic Journal 112, 32–53.
Chang, S. 1998. "Takeovers of privately held targets, methods of payment, and bidder returns." Journal of Finance 53, 773–784.
DeLong, G. L. 2001. "Stockholder gains from focusing versus diversifying bank mergers." Journal of Financial Economics 59, 221–252.
Faccio, M., J. McConnell, and D. Stolin. 2006. "Returns to acquirers of listed and unlisted targets." Journal of Financial and Quantitative Analysis 41, 197–220.
Fuller, K., J. Netter, and M. Stegemoller. 2002. "What do returns to acquiring firms tell us? Evidence from firms that make many acquisitions." Journal of Finance 57, 1763–1793.
Gupta, A., R. L. B. LeCompte, and L. Misra. 1997. "Acquisitions of solvent thrifts: wealth effects and managerial motivations." Journal of Banking and Finance 21, 1431–1450.
Hansen, R. and J. Lott. 1996. "Externalities and corporate objectives in a world with diversified shareholders/consumers." Journal of Financial and Quantitative Analysis 31, 43–68.
Houston, J. F. and M. D. Ryngaert. 1994. "The overall gains from large bank mergers." Journal of Banking and Finance 18, 1155–1176.
Houston, J. F., C. James, and M. Ryngaert. 2001. "Where do merger gains come from? Bank mergers from the perspective of insiders and outsiders." Journal of Financial Economics 60, 285–331.
Kaplan, S. and R. Ruback. 1995. "The valuation of cash flow forecasts: an empirical analysis." Journal of Finance 50, 1059–1093.
MacKinlay, A. C. 1997. "Event studies in economics and finance." Journal of Economic Literature 35, 13–39.
Maquieira, C., W. Megginson, and L. Nail. 1998. "Wealth creation versus wealth redistributions in pure stock-for-stock mergers." Journal of Financial Economics 48, 3–33.
Moeller, S., F. Schlingemann, and R. Stulz. 2004. "Firm size and the gains from acquisitions." Journal of Financial Economics 73, 201–228.
Officer, M. 2007. "The price of corporate liquidity: acquisition discounts for unlisted targets." Journal of Financial Economics 83(3), 571–598.
Shleifer, A. and R. Vishny. 1992. "Liquidation values and debt capacity." Journal of Finance 47, 1343–1366.
Travlos, N. 1987. "Corporate takeover bids, method of payment, and bidding firms' stock returns." Journal of Finance 42, 943–963.
Verbrugge, J. A. and J. S. Jahera. 1981. "Expense-preference behavior in the savings and loan industry." Journal of Money, Credit, and Banking 13, 465–476.
White, H. 1980. "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity." Econometrica 48, 817–830.
Chapter 98
An ODE Approach for the Expected Discounted Penalty at Ruin in Jump Diffusion Model (Reprint)* Yu-Ting Chen, Cheng-Few Lee, and Yuan-Chung Sheu
Abstract Under the assumption that the asset value follows a phase-type jump diffusion, we show the expected discounted penalty satisfies an ODE and obtain a general form for the expected discounted penalty. In particular, if only downward jumps are allowed, we get an explicit formula in terms of the penalty function and jump distribution. On the other hand, if the downward jump distribution is a mixture of exponential distributions (and upward jumps are determined by a general Lévy measure), we obtain closed form solutions for the expected discounted penalty. As an application, we work out an example in Leland's structural model with jumps. For earlier and related results, see Gerber and Landry [Insurance: Mathematics and Economics 22:263–276, 1998], Hilberink and Rogers [Finance Stoch 6:237–263, 2002], Asmussen et al. [Stoch. Proc. and their App. 109:79–111, 2004] and Kyprianou and Surya [Finance Stoch 11:131–152, 2007].

Keywords Jump diffusion · Expected discounted penalty · Phase-type distribution · Optimal capital structure
98.1 Introduction

In the classical model of ruin theory, the process

$$X_t = x + ct - Z_t, \qquad t \ge 0, \qquad (98.1)$$
stands for the surplus process of an insurance company. Here, $x > 0$ is the initial surplus, $c > 0$ is the rate at which the premiums are received, and $Z = (Z_t;\ t \in \mathbb{R}_+)$ is a
Y.-T. Chen Institute of Finance, National Chiao Tung University, Hsinchu, Taiwan, ROC C.-F. Lee Department of Finance, Rutgers University, New Brunswick, NJ, USA Y.-C. Sheu () Department of Applied Mathematics, National Chiao Tung University, Hsinchu, Taiwan, ROC e-mail: [email protected]
compound Poisson process which represents the aggregate claims between time 0 and $t$. (Note that in insurance $X$ has only downward jumps.) Ruin is the event that $X_t \le 0$ for some $t > 0$. Let $\tau$ be the time of ruin and $X_\tau$ the negative surplus when ruin occurs. Given a penalty scheme $g$, Gerber and Shiu (1998) considered the expected discounted penalty

$$\Phi(x) = \mathbf{E}_x\!\left[e^{-r\tau} g(X_\tau)\right]. \qquad (98.2)$$
(Here, $r \ge 0$ is the risk-free rate and we use the convention that $e^{-r(+\infty)} = 0$.) By taking $g \equiv 1$ and $r = 0$, the ruin probability is a special case of Equation (98.2). For a general penalty scheme $g$, this represents the amount payable at ruin, which depends on the deficit at ruin. For more results and related problems, see Asmussen (2000). Gerber (1970) extended the classical model (98.1) by adding an independent diffusion, and the surplus process takes the form

$$X_t = x + ct + \sigma W_t - Z_t, \qquad t \ge 0. \qquad (98.3)$$
Here $\sigma > 0$ and $W = (W_t;\ t \ge 0)$ is a standard Brownian motion independent of $Z$. In this case ruin may be caused by oscillation (that is, $X_\tau = 0$) or by a claim (that is, $X_\tau < 0$). Dufresne and Gerber (1989) studied the probability of ruin caused by oscillation and the probability of ruin caused by a claim. Moreover, as in Gerber and Shiu (1998), Gerber and Landry (1998) considered the expected discounted penalty (Equation 98.2). They heuristically derived the integro-differential equation for $\Phi$ and showed that it satisfies a renewal integral equation. As an application of the equation, they determined the optimal exercise strategy of a perpetual American put option under the assumptions that the log price of the stock is of the form of Equation (98.3) and only downside jumps of $X$ are allowed. On the other hand, by a different approach, Mordecki (2002) considered optimal exercise strategies and perpetual options for exponential Lévy models. In terms of the supremum and infimum
* This paper is reprinted from Finance and Stochastics, 11 (2007), Number 3, pp. 323–355.
processes of the Lévy process, the author derived closed formulas for the optimal exercise strategies and prices of the perpetual American call and put options. In particular, if $X$ is the independent difference of a spectrally nonnegative Lévy process and a compound Poisson process whose jump distribution is a mixture of exponential distributions, Mordecki (2002) gave explicit formulas for optimal exercise strategies and prices of perpetual American put options. (Similar results also hold for call options.) For related results on American option pricing, see also Asmussen et al. (2004), Boyarchenko and Levendorskiĭ (2002a, b), Chan (1999, 2005), and others. In addition to option theory, the expected discounted penalty also finds its importance in the structural form modeling of credit risk. Structural form modeling includes the classic Black–Scholes–Merton model of corporate debt pricing, based on (exogenous) default caused by an insufficiency of assets relative to liabilities, and Leland's structural model (Leland 1994a, b, 1996), for which (endogenous) default occurs when the issuer's assets reach a level so small that the issuer finds it optimal to declare bankruptcy. All the aforementioned models of credit risk have relied on diffusion processes to model the evolution of the assets. However, while the diffusion approach is mathematically tractable and has inputs and parameters of the models observable and estimable, it cannot capture the basic features of credit risk observed empirically. There are many extensions of the Black–Scholes–Merton model and Leland's model. One example is Hilberink and Rogers (2002), in which the model of Leland (1996) was extended by adding downward jumps in the dynamics of the firm's assets. If the log value of the assets follows a jump diffusion as in Equation (98.3) with only downside jumps, Hilberink and Rogers gave explicit formulas for the values of the debt and the value of the firm up to Fourier transforms. Without closed-form solutions for these two values, by imposing the smooth-pasting condition, they surprisingly determined the optimal default boundary in terms of the Wiener–Hopf factors of a Lévy process. For recent works and related results, see Boyarchenko and Levendorskiĭ (2000), Chen and Kou (preprint) and Dao and Jeanblanc (preprint). In this paper we study the function $\Phi$ defined in Equation (98.2). Here $X$ is a jump diffusion of the form of Equation (98.3) and the jump distribution for $X$ is a two-sided phase-type distribution. The main results of the paper are outlined as follows. We first show that $\Phi$ satisfies an integro-differential equation (Theorem 98.2) and then derive a differential equation for $\Phi$ (Theorem 98.3). Based on the ODE, we prove in Proposition 98.2 that the function $\Phi$ can be written as a linear combination of known exponential functions. Moreover, if only downside jumps are allowed, we calculate any higher order derivative of $\Phi$ at zero in terms of the penalty function $g$ and the jump distribution. As a consequence, we obtain an explicit formula for $\Phi$ when there are
only downward jumps for $X$ (Theorem 98.4). If the downward jumps have a mixed exponential distribution and the upward jumps have a general distribution, we also obtain an explicit formula for $\Phi$ in Theorem 98.7. As hints of possible finance and insurance applications of our results, we provide some examples. In particular, in the setup of Leland's model with jumps, we determine the optimal endogenous default and obtain the equity, debt and firm values in closed form in Examples 98.2 and 98.4.

The plan of the rest of the paper is as follows. In Sect. 98.2 we introduce the process $X$ and derive an integro-differential equation for $\Phi$. Section 98.3 recalls the definition of phase-type distributions and presents our ODE approach for $\Phi$. In Sect. 98.4 we consider a general Lévy process $X$ which is the difference of a spectrally nonnegative Lévy process and a compound Poisson process with only upward jumps. Moreover, if the jump distribution is a mixture of exponential distributions, based on the results in Sect. 98.3, we conjecture the solution form of $\Phi$, and by using the Feynman–Kac formula we verify our conjecture. Section 98.5 concludes the paper. Some proofs and technical results are relegated to the appendices.
98.2 Integro-Differential Equation

To start with, we specify the Lévy process that we consider in this paper unless otherwise stated. We are given a probability space $(\Omega,\mathcal F,P)$ on which there are a standard Brownian motion $W=(W_t;\,t\in\mathbb R_+)$ and a compound Poisson process $Z=(Z_t=\sum_{n=1}^{N_t}Y_n;\,t\in\mathbb R_+)$. Here the Poisson process $N=(N_t;\,t\in\mathbb R_+)$ has parameter $\lambda>0$ and the random variables $(Y_n;\,n\in\mathbb N)$ are independent and identically distributed. We assume further that the distribution $F$ of $Y_1$ has a bounded density $f$ that is continuous on $\mathbb R\setminus\{0\}$. In addition, $W$, $N$, $(Y_n)$ are assumed to be independent. For every $x\in\mathbb R$, let $P_x$ be the law of the process

$$X_t = X_0 + ct + \sigma W_t - Z_t, \qquad (98.1)$$

where $c\in\mathbb R$, $\sigma>0$ and $X_0=x$. Write $P$ for $P_0$ and $E_x[Z]=\int Z(\omega)\,dP_x(\omega)$ for a random variable $Z$. For every $\theta\in i\mathbb R$, we have

$$E_0\big[e^{\theta X_1}\big] = e^{\psi(\theta)}, \qquad (98.2)$$

where

$$\psi(\theta) = D\theta^2 + c\theta + \lambda\Big(\int e^{-\theta y}\,dF(y) - 1\Big) \qquad (98.3)$$

and $D = \sigma^2/2$.
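As an informal numerical companion to the model (98.1) (this code is our own sketch, not from the paper), the expected discounted penalty $\Phi(x)=E_x[e^{-r\tau}g(X_\tau)]$ can be approximated by Euler-scheme Monte Carlo. All parameter values below are illustrative assumptions, and paths that do not reach $(-\infty,0]$ by a large horizon are treated as contributing zero:

```python
# Minimal Monte Carlo sketch of X_t = x + c*t + sigma*W_t - Z_t (Eq. 98.1)
# with exponential(eta) claim sizes; estimates Phi(x) = E_x[e^{-r tau} g(X_tau)].
import numpy as np

rng = np.random.default_rng(0)

def discounted_penalty_mc(x, g, c=0.3, sigma=0.4, lam=1.0, eta=2.0,
                          r=0.05, T=50.0, dt=0.01, n_paths=500):
    """Crude Euler estimate of Phi(x); illustrative parameters only."""
    estimates = []
    for _ in range(n_paths):
        X, t = x, 0.0
        while t < T:
            dN = rng.poisson(lam * dt)                      # claims arriving in dt
            jump = rng.exponential(1.0 / eta, dN).sum() if dN else 0.0
            X += c * dt + sigma * np.sqrt(dt) * rng.standard_normal() - jump
            t += dt
            if X <= 0.0:                                    # ruin by oscillation or claim
                estimates.append(np.exp(-r * t) * g(X))
                break
        else:
            estimates.append(0.0)                           # no ruin before horizon T
    return float(np.mean(estimates))

# Ruin-probability-type penalty: g = 1 on (-inf, 0]
print(discounted_penalty_mc(1.0, g=lambda y: 1.0))
```

The time-discretization slightly biases the oscillation component of ruin; the sketch is intended only to make the objects in (98.1)–(98.3) concrete.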
($\psi$ is called the characteristic exponent of $X$.) Moreover, the infinitesimal generator $L$ of $X$ has a domain containing $C_0^2(\mathbb R)$ and, for any $h\in C_0^2(\mathbb R)$,

$$Lh(x) = Dh''(x) + ch'(x) + \lambda\int h(x-y)\,dF(y) - \lambda h(x). \qquad (98.4)$$

(For details, see Bertoin (1996) and Sato (1999).) On the other hand, let $(\mathcal F_t)$ be the usual augmentation of the natural filtration of $X$. Then for every Borel set $A$, the entry time of $A$ by $X$,

$$\tau_A = \inf\{t\ge 0:\ X_t\in A\}, \qquad (98.5)$$

is an $(\mathcal F_t)$-stopping time. (See Chapter III, Theorem 2.17 in Revuz and Yor (2005).) From now on, we fix a bounded Borel penalty function $g$ on $(-\infty,0]$ and let the function $\Phi$ be given by

$$\Phi(x) = E_x\big[e^{-r\tau} g(X_\tau)\big], \qquad x\in\mathbb R. \qquad (98.6)$$

Note that $\Phi(x)=g(x)$ for $x\le 0$; $\Phi$ is called the expected discounted penalty for the penalty function $g$ in the insurance literature.

Notation 98.1. We write $\mathbb R_+=\{x\in\mathbb R:\ x\ge 0\}$, $\mathbb R_{++}=\{x\in\mathbb R:\ x>0\}$, and analogously $\mathbb R_-$ and $\mathbb R_{--}$. Let $I$ be an interval in $\mathbb R$ and $n\in\mathbb N$. We introduce the following function spaces:
1. $C(I)$ is the space of real-valued continuous functions $f$ on $I$.
2. $C_b(I)$ is the space of bounded continuous functions $f$ on $I$.
3. $C_0(I)$ is the space of continuous functions $f$ on $I$ with $\lim_{x\to\infty}f(x)=0$ (resp. $\lim_{x\to-\infty}f(x)=0$) provided that $I$ is not bounded above (resp. below). Also, $C_{0,b}(I)=C_0(I)\cap C_b(I)$.
4. $C_c(I)$ is the space of continuous functions $f$ on $I$ with compact support.
5. $C^n(I)$ is the space consisting of $f\in C(I)$ with $f^{(n)}\in C(I)$. Here, on $I^\circ$, $f^{(n)}$ is the usual $n$-th derivative. If $x$ is the left (resp. right) boundary point of $I$ and is in $I$, then $f^{(n)}(x)$ is the $n$-th right (resp. left) derivative at $x$. $C_0^n(I)$, $C_b^n(I)$, $C_c^n(I)$ and $C_{0,b}^n(I)$ are similarly defined.
6. $C_c^\infty(I)=\bigcap_k C_c^k(I)$, $C_{0,b}^\infty(I)=\bigcap_k C_{0,b}^k(I)$, and $C_0^\infty(I)=\bigcap_k C_0^k(I)$.

In Gerber and Landry (1998) and Tsai and Willmot (2002), the following regularities were used implicitly. For a rigorous proof and related works, see Cai and Yang (2005) and Chen et al. (submitted).

Theorem 98.1. The function $\Phi$ in Equation (98.6) has the following properties:
1. For all $r\ge 0$, $\Phi\in C_b^1(\mathbb R_+)\cap C^2(\mathbb R_{++})$.
2. If $r>0$, or $r=0$ and $E[X_1]>0$, then $\Phi\in C_0^1(\mathbb R_+)$.

We define $Lh$ by the expression (98.4) for all functions $h$ on $\mathbb R$ such that $h'$, $h''$ and the integral in Equation (98.4) exist at $x$.

Lemma 98.1. The function $L\Phi$ is in $C(\mathbb R_{++})$.

Theorem 98.2. The function $\Phi$ satisfies the integro-differential equation

$$(L-r)\Phi(x) = 0, \qquad x>0. \qquad (98.7)$$

If $r>0$, or $r=0$ and $E[X_1]>0$, we have $\Phi\in C^2_{0,b}(\mathbb R_{++})$.
Proof. Assume that $(L-r)\Phi(x)=0$ for all $x>0$. Then we have

$$\Phi''(x) = -\frac{c}{D}\Phi'(x) - \frac{\lambda}{D}\int\Phi(x-y)\,dF(y) + \frac{\lambda+r}{D}\Phi(x).$$

The second statement follows from this and Theorem 98.1. To establish Equation (98.7), we fix $x>0$ and let $\epsilon\in(0,x)$. By Theorem 98.1, we have $\Phi\in C_b^1(\mathbb R_+)\cap C^2(\mathbb R_{++})$. Since the behavior of $\Phi''$ near $0+$ and $+\infty$ is unclear, we stop the process $X$ at the time $\tilde\tau$, where $\tilde\tau$ is the entry time of $(-\infty,\epsilon]\cup[x+1,\infty)$. Then $\tilde\tau$ is an $(\mathcal F_t)$-stopping time. Moreover, since $\tilde\tau\le\tau$, we see that $\tilde\tau\wedge t$ is an $(\mathcal F_{\tilde\tau\wedge t})$-stopping time. By the strong Markov property of $X$, we have $E_x\big[e^{-r(t\wedge\tilde\tau)}\Phi(X_{\tilde\tau\wedge t})\big]=\Phi(x)$. Also, in Appendix 98A we prove that

$$E_x\big[e^{-r(t\wedge\tilde\tau)}\Phi(X_{\tilde\tau\wedge t})\big] = E_x\Big[\int_0^{\tilde\tau\wedge t}e^{-ru}(L-r)\Phi(X_u)\,du\Big] + \Phi(x). \qquad (98.8)$$

From these, we get

$$E_x\Big[\int_0^{t\wedge\tilde\tau}e^{-ru}(L-r)\Phi(X_u)\,du\Big] = 0. \qquad (98.9)$$

On the other hand, by Lemma 98.1,

$$E_x\Big[\sup_{u\le t}\big|e^{-ru}(L-r)\Phi(X_u) - (L-r)\Phi(X_0)\big|\Big] \to 0, \qquad t\to 0+. \qquad (98.10)$$
Therefore, using Equation (98.9), we obtain

$$|(L-r)\Phi(x)| = \Big|\frac{1}{t}E_x\Big[\int_0^{t\wedge\tilde\tau}e^{-ru}(L-r)\Phi(X_u)\,du\Big] - (L-r)\Phi(x)\Big| \le E_x\Big[\sup_{u\le t}\big|e^{-ru}(L-r)\Phi(X_u)-(L-r)\Phi(X_0)\big|\Big] + |(L-r)\Phi(x)|\,\frac{E_x[(t-\tilde\tau\wedge t)]}{t}.$$

Note that $\tilde\tau>0$ $P_x$-a.s. and hence $t-\tilde\tau\wedge t=0$ for all sufficiently small $t$. Since further $0\le\frac{t-\tilde\tau\wedge t}{t}\le 1$, we have $\frac{E_x[t-\tilde\tau\wedge t]}{t}\to 0$ as $t\to 0+$. Together with Equation (98.10) and the last inequality, we establish that

$$|(L-r)\Phi(x)| \le \limsup_{t\to 0+}\Big|\frac{1}{t}E_x\Big[\int_0^{t\wedge\tilde\tau}e^{-ru}(L-r)\Phi(X_u)\,du\Big] - (L-r)\Phi(x)\Big| = 0. \qquad\square$$
98.3 Explicit Formula for $\Phi$ – ODE Method

In this section, we consider phase-type jump distributions and the boundary value problem

$$(L-r)\Phi = 0 \ \text{in}\ \mathbb R_{++}, \qquad \Phi = g \ \text{on}\ \mathbb R_-, \qquad (98.11)$$

where $L$ is defined by Equation (98.4) and $r\ge 0$. Our main purpose is to show that if $\Phi$ satisfies Equation (98.11), then it satisfies an ODE on $\mathbb R_{++}$. From this, we obtain a general form of $\Phi$ and derive an explicit formula for $\Phi$ under some technical conditions. Based on these results, we consider an example in credit risk modeling (Example 98.2).

To begin with, we recall the definition of phase-type distributions.

Definition 98.1. Assume $B$ is an $N\times N$ nonsingular subintensity matrix, that is, $b_{ij}\ge 0$ for $i\ne j$, $b_{ii}\le 0$ and $\mathbf b^\top = -Be^\top \in \mathbb R_+^N\setminus\{\mathbf 0\}$. Here, $e=[1\ 1\ \cdots\ 1]$ and $\mathbf 0=[0\ 0\ \cdots\ 0]$. Let $\alpha$ be an $N$-dimensional probability vector. The probability distribution function $F$ with density

$$f(x) = \begin{cases}\alpha e^{xB}\mathbf b^\top, & x>0,\\ 0, & x\le 0,\end{cases}$$

is called a phase-type distribution. We denote this distribution by PH$(\alpha,B)$. We say that the representation $(\alpha,B)$ is minimal if there do not exist $N_0<N$, a probability vector $\alpha_0$ of dimension $N_0$, and a nonsingular subintensity matrix $B_0$ of dimension $N_0\times N_0$ such that $f(x)=\alpha_0 e^{xB_0}\mathbf b_0^\top\,\mathbf 1_{x>0}$.

We consider the process $X$ defined in Equation (98.1) and assume that its jump distribution $F$ has a probability density function $f$ given by

$$f(x) = \begin{cases} p f_{(+)}(x), & x>0,\\ 0, & x=0,\\ q f_{(-)}(-x), & x<0,\end{cases} \qquad (98.12)$$

where $p+q=1$, $p,q\in\mathbb R_+$, and $f_{(\pm)}$ are the densities of PH$(\alpha_\pm,B_\pm)$. Here $B_+$ and $B_-$ are not necessarily of the same dimension. We will write $I$ for the identity matrices of the same dimensions as those of $B_+$ and $B_-$ if there is no confusion.

Remark 98.1. A summary of analytic facts of phase-type distributions is given in Appendix 98B. It is worth noting that phase-type distributions are dense in the set of all probability distributions on $\mathbb R_{++}$. Special cases of phase-type distributions include exponential distributions, Gamma distributions with integer parameter, and mixtures of exponential distributions. For details, see Asmussen (2000) or Rolski et al. (1999), Sect. 11.2.

In Asmussen et al. (2004), the authors considered a Lévy process $X$ as in Equation (98.1) except that $Z=(Z_t;\,t\ge 0)$ is given by

$$Z_t = -\sum_{n=1}^{N_t^+}Y_n^+ + \sum_{n=1}^{N_t^-}Y_n^-, \qquad (98.13)$$

where $N^\pm$ are both Poisson processes and $(Y_n^+)$ (resp. $(Y_n^-)$) are independently and identically distributed with distribution PH$(\alpha_+,B_+)$ (resp. PH$(\alpha_-,B_-)$). Also, they assumed that $W$, $N^+$, $N^-$, $(Y_n^+)$ and $(Y_n^-)$ are independent. Their processes have the same characteristic exponent as ours, and therefore the finite-dimensional distributions of their processes coincide with those of ours. Since the finite-dimensional distributions determine the law, our definition of $X$ is sufficient.

By assumption Equation (98.12) and Theorem 98.10, the characteristic exponent in Equation (98.2) is given by

$$\psi(\theta) = D\theta^2 + c\theta + \lambda(\psi_1(\theta)-1), \qquad \theta\in i\mathbb R, \qquad (98.14)$$

where $\psi_1(\theta)=\int_{\mathbb R}e^{-\theta y}f(y)\,dy$ is of the form

$$\psi_1(\theta) = p\,\alpha_+(\theta I-B_+)^{-1}\mathbf b_+^\top + q\,\alpha_-(-\theta I-B_-)^{-1}\mathbf b_-^\top. \qquad (98.15)$$

Since the right-hand side of Equation (98.15) is a rational function in $\theta$ on $\mathbb C$ (see Theorem 98.10), the right-hand side of Equation (98.14) is actually a rational function on $\mathbb C$ with a finite number of poles in $\mathbb C\setminus i\mathbb R$. Accordingly, we consider $\psi$ and $\psi_1$ on $\mathbb C$ as analytic functions except at the poles in $\mathbb C\setminus i\mathbb R$.
Let $P_0(\theta)$ be the "minimal" monic polynomial whose zeros coincide with the poles of $\psi_1(\theta)$, counting multiplicity. Write

$$P_1(\theta) = P_0(\theta)\,(\psi(\theta)-r). \qquad (98.16)$$

Then the zeros of $\psi(\theta)-r$ coincide with those of the polynomial $P_1(\theta)$, counting multiplicity. On the other hand, the infinitesimal generator of the process $X$ takes the form

$$Lh(x) = Dh''(x) + ch'(x) + \lambda p\,\alpha_+T_{B_+}^+h(x)\,\mathbf b_+^\top + \lambda q\,\alpha_-T_{B_-}^-h(x)\,\mathbf b_-^\top - \lambda h(x) \qquad (98.17)$$

for all $h\in C_0^2(\mathbb R)$. Here, for a nonsingular subintensity matrix $B$, the matrix-valued operators $T_B^\pm$ are defined on the set of bounded measurable functions $h$ by

$$T_B^\pm h(x) = \int_{\mathbb R_\pm} h(x-y)\,e^{\pm By}\,dy. \qquad (98.18)$$

(When we perform integration and differentiation with respect to a matrix of a continuous parameter, these operations are meant to be performed termwise.) The following is a further refinement of Theorem 98.2.

Proposition 98.1. Assume that the jump density $f$ is of the form (98.12), and $E[X_1]>0$ if $r=0$. Then $\Phi$ is in $C_{0,b}^\infty(\mathbb R_{++})$ and, for $k\ge 0$, we have the recursive formula

$$\Phi^{(k+2)}(x) = -\frac{c}{D}\Phi^{(k+1)}(x) + \frac{\lambda+r}{D}\Phi^{(k)}(x) - \frac{\lambda}{D}E_k(x), \qquad (98.19)$$

where

$$E_k(x) = p\,\alpha_+\Big(B_+^kT_{B_+}^+\Phi(x) + \sum_{j=0}^{k-1}B_+^j\Phi^{(k-1-j)}(x)\Big)\mathbf b_+^\top + q\,\alpha_-\Big((-B_-)^kT_{B_-}^-\Phi(x) + \sum_{j=0}^{k-1}(-1)^{j+1}B_-^j\Phi^{(k-1-j)}(x)\Big)\mathbf b_-^\top. \qquad (98.20)$$

Proof. Fix $x>0$. Since $(L-r)\Phi(x)=0$, Equation (98.17) shows that Equation (98.19) holds for $k=0$. Since $\Phi$ is continuous at $x$ and $\Phi\in C_{0,b}^2(\mathbb R_{++})$ (Theorem 98.1), we obtain by Theorem 98.11 that $T_{B_+}^+\Phi$ is differentiable at $x$ and

$$\frac{d}{dx}T_{B_+}^+\Phi(x) = B_+T_{B_+}^+\Phi(x) + \Phi(x)I.$$

Similarly, we have $\frac{d}{dx}T_{B_-}^-\Phi(x) = -B_-T_{B_-}^-\Phi(x) - \Phi(x)I$. So, by the differentiability of $T_{B_\pm}^\pm\Phi$ at $x$ and Equation (98.19) for $k=0$, $\Phi''$ is differentiable at $x$ and Equation (98.19) holds for $k=1$. Since $\Phi\in C_{0,b}^2(\mathbb R_{++})$ and $T_{B_\pm}^\pm\Phi\in C_{0,b}(\mathbb R_{++})$, we get $\Phi'''\in C_{0,b}(\mathbb R_{++})$ and hence $\Phi\in C_{0,b}^3(\mathbb R_{++})$. A similar argument holds for general $k$. $\square$

Next we transform the integro-differential equation into an ordinary differential equation when the two tails of the jump distribution are both phase type. Before giving the general result, we treat a simple case by direct calculation.

Example 98.1. We consider the model (98.1) with the jump distribution $dF(y)=\eta e^{-\eta y}\mathbf 1_{y>0}\,dy$ for some $\eta>0$. Then

$$\psi(\theta) = D\theta^2 + c\theta + \lambda\Big(\frac{\eta}{\theta+\eta}-1\Big)$$

and

$$Lh(x) = Dh''(x) + ch'(x) + \lambda\int_0^\infty h(x-y)\,\eta e^{-\eta y}\,dy - \lambda h(x).$$

Note that $\int_0^\infty h(x-y)\,\eta e^{-\eta y}\,dy = \eta e^{-\eta x}\int_{-\infty}^x h(y)e^{\eta y}\,dy$ and

$$\Big(\frac{d}{dx}+\eta\Big)\Big(e^{-\eta x}\int_{-\infty}^x h(y)e^{\eta y}\,dy\Big) = h(x).$$

Hence, by Theorem 98.2, $\Phi$ satisfies the following ODE:

$$0 = \Big(\frac{d}{dx}+\eta\Big)(L-r)\Phi(x) = D\Phi'''(x) + (D\eta+c)\Phi''(x) + (c\eta-\lambda-r)\Phi'(x) - \eta r\Phi(x).$$

Clearly we have $P_0(\theta)=\theta+\eta$ and

$$P_1(\theta) = D\theta^2(\theta+\eta) + c\theta(\theta+\eta) + \lambda\eta - (\lambda+r)(\theta+\eta) = D\theta^3 + (D\eta+c)\theta^2 + (c\eta-\lambda-r)\theta - \eta r.$$

Therefore $\Phi$ satisfies an ODE with the characteristic polynomial $P_1$.
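Numerically, the exponents appearing in $\Phi$ for Example 98.1 are just the roots of $P_1$. The short sketch below (ours, with illustrative parameter values) computes them:

```python
# Roots of P1(theta) = D th^3 + (D*eta + c) th^2 + (c*eta - lam - r) th - eta*r
# for Example 98.1; the negative-real-part roots are the exponents of Phi.
import numpy as np

D, c, lam, eta, r = 0.08, 0.3, 1.0, 2.0, 0.05      # illustrative parameters
roots = np.roots([D, D * eta + c, c * eta - lam - r, -eta * r])

neg = sorted((z for z in roots if z.real < 0), key=lambda z: z.real)
pos = [z for z in roots if z.real > 0]
print("negative roots:", neg)                      # two of them in this example
print("positive root :", pos)                      # the Lundberg-type root rho_r
```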
Theorem 98.3. Let $D_1$ be the differential operator with the characteristic polynomial $P_1$ given by Equation (98.16). Then $D_1\Phi\equiv 0$ on $\mathbb R_{++}$.

Proof. Let $L^2=L^2(\mathbb R)$ be the space of all square-integrable functions on $\mathbb R$ and set $\langle f_1,f_2\rangle=\int f_1(x)f_2(x)\,dx$. For an operator $A$ on $L^2$, write $A^*$ for its adjoint (i.e., $\langle Ak,h\rangle = \langle k,A^*h\rangle$ for all $k,h$ in $L^2$). By integration by parts, $\langle k',h\rangle = -\langle k,h'\rangle$ whenever $k$ or $h$ has compact support. Write $Th(x)=\int h(x-y)f(y)\,dy$. By a change of variables and Fubini's theorem, $\langle Th,k\rangle = \int k(x)\,dx\int h(x-y)f(y)\,dy = \int h(z)\,dz\int k(z+y)f(y)\,dy = \langle h,T^*k\rangle$, where $T^*h(x)=\int h(x+y)f(y)\,dy$.

Let $D_0$ be the differential operator with characteristic polynomial $P_0$, and let $\varphi\in C_c^\infty(\mathbb R_{++})$ be a test function. Recall that by Theorem 98.2, $(L-r)\Phi\equiv 0$ on $\mathbb R_{++}$. Therefore we have

$$0 = \langle D_0(L-r)\Phi,\varphi\rangle = \langle\Phi,(L-r)^*D_0^*\varphi\rangle. \qquad (98.21)$$

Here $L^*$ is given by $L^*h(x) = Dh''(x) - ch'(x) + \lambda T^*h(x) - \lambda h(x)$. Set $L_D = L - r - \lambda T$, the purely differential part of $L-r$. Then

$$(L-r)^*D_0^*\varphi = \lambda T^*D_0^*\varphi + L_D^*D_0^*\varphi = \lambda T^*D_0^*\varphi + (D_0L_D)^*\varphi. \qquad (98.22)$$

Write $D_2$ for the differential operator corresponding to the polynomial $P_2(\theta)=\lambda P_0(\theta)\psi_1(\theta)$. We prove that $\lambda T^*D_0^*\varphi = D_2^*\varphi$ a.e. by showing that both are in $L^1\cap L^2$ and have the same Fourier transforms.

First, we show that $\lambda T^*D_0^*\varphi$ and $D_2^*\varphi$ are in $L^1\cap L^2$. (Here $L^1=L^1(\mathbb R)$ is the space of all integrable functions on $\mathbb R$.) Since $\varphi\in C_c^\infty(\mathbb R_{++})$, we obtain $D_2^*\varphi\in L^1\cap L^2$. Also,

$$\int|T^*D_0^*\varphi(x)|\,dx \le \int\!\!\int|D_0^*\varphi(x+y)|\,dx\,f(y)\,dy \le \|D_0^*\varphi\|_{L^1}\|f\|_{L^1} < \infty,$$

and hence $T^*D_0^*\varphi\in L^1$. We next show that $T^*D_0^*\varphi\in L^2$. Since $D_0^*\varphi\in C_c^\infty(\mathbb R_{++})$, it suffices to show that $T^*k\in L^2$ for any $k\in C_c^\infty(\mathbb R_{++})$. Note, by Theorem 98.9, that $f$ has exponentially decaying tails and hence $\int f^2<\infty$. Let $k\in C_c^\infty(\mathbb R_{++})$ and write $H$ for the compact support of $k$. By the Cauchy–Schwarz inequality,

$$\int[T^*k(x)]^2\,dx = \int\Big(\int_H k(y)f(y-x)\,dy\Big)^2dx \le \|k\|_{L^2}^2\,\|f\|_{L^2}^2\int_H dx < \infty.$$

Next, we show that the Fourier transforms $\mathcal F(\lambda T^*D_0^*\varphi)$ and $\mathcal F(D_2^*\varphi)$ coincide, where $\mathcal Fh(\xi)=\int e^{2\pi i\xi x}h(x)\,dx$. Recall that $\psi_1(\theta)=\int e^{-\theta y}f(y)\,dy$, and notice that $\mathcal F(D_0^*\varphi)(\xi)=P_0^*(2\pi i\xi)\mathcal F(\varphi)(\xi)$, where $P_0^*$ is the characteristic polynomial of $D_0^*$ (see Stein and Shakarchi (2005), Sect. 8.3). Since $\lambda T^*D_0^*\varphi\in L^1\cap L^2$, we have, for all $\xi\in\mathbb R$,

$$\mathcal F(\lambda T^*D_0^*\varphi)(\xi) = \lambda\int\!\!\int D_0^*\varphi(x+y)e^{2\pi i\xi(x+y)}\,dx\,e^{-2\pi i\xi y}f(y)\,dy = \lambda\,\psi_1(2\pi i\xi)\,P_0^*(2\pi i\xi)\,\mathcal F(\varphi)(\xi) = P_2^*(2\pi i\xi)\,\mathcal F(\varphi)(\xi) = \mathcal F(D_2^*\varphi)(\xi).$$

By the Fourier inversion formula, we deduce that $\lambda T^*D_0^*\varphi = D_2^*\varphi$ almost everywhere. By Equation (98.22) and this fact, $(L-r)^*D_0^*\varphi = D_2^*\varphi + (D_0L_D)^*\varphi = D_1^*\varphi$ a.e. Hence, by Equation (98.21), we have $0 = \langle\Phi,(L-r)^*D_0^*\varphi\rangle = \langle\Phi,D_1^*\varphi\rangle = \langle D_1\Phi,\varphi\rangle$. Since $\varphi\in C_c^\infty(\mathbb R_{++})$ is arbitrary, $D_1\Phi=0$ a.e. on $(0,\infty)$. Since $\Phi\in C_{0,b}^\infty(\mathbb R_{++})$, $D_1\Phi\equiv 0$ on $\mathbb R_{++}$. This completes the proof. $\square$

Assume from now on that $r>0$. We denote by $Z_{(-)}^r$ the collection of zeros of $\psi(\theta)-r$ (counting multiplicity) with strictly negative real parts. We say that $Z_{(-)}^r$ is separable if its members are distinct. We write $Z_{(-)}=(\rho_i)_{i=1}^S$ for $Z_{(-)}^r$ and $\rho_i$ for $\rho_i^r$ if this causes no confusion.

Proposition 98.2. There exist polynomials $Q_i(x)$ such that $\Phi(x)=\sum_{i=1}^S Q_i(x)e^{\rho_i x}$ for $x\ge 0$.

Let $U$ be the column vector with entries $U_j=\Phi^{(j-1)}(0)$ for $1\le j\le S$, and let $V$ be the $S\times S$ Vandermonde matrix with $V_{ij}=\rho_j^{\,i-1}$.
Theorem 98.4. Suppose that $Z_{(-)}$ is separable. Then we have $\Phi(x)=\sum_{i=1}^S Q_ie^{\rho_i x}$, where $Q=(Q_1,Q_2,\dots,Q_S)^\top$ is the unique constant column vector satisfying the system of linear equations $VQ=U$. Moreover, if there are no upward jumps for $X$ (i.e., $q=0$), then $\Phi^{(k)}(0)$ can be obtained explicitly by the recursive formula

$$\Phi^{(k+2)}(0+) = \frac{1}{D}\Big[-c\,\Phi^{(k+1)}(0+) + (\lambda+r)\Phi^{(k)}(0+) - \lambda E_k(0)\Big], \qquad (98.23)$$

with $\Phi(0)=g(0)$,

$$E_k(0) = \alpha_+\Big(B_+^k\int_0^\infty g(-y)e^{B_+y}\,dy + \sum_{j=0}^{k-1}B_+^j\Phi^{(k-1-j)}(0)\Big)\mathbf b_+^\top \qquad (98.24)$$

and

$$\Phi'(0+) = -\Big(\frac{c}{D}+\rho_r\Big)g(0) + \frac{\lambda}{D}\int_0^\infty dv\int_v^\infty dF(y)\,e^{-\rho_r v}g(v-y). \qquad (98.25)$$

(Here $\rho_r$ is the positive real number satisfying $\psi(\rho_r)=r$.)

Proof. By Proposition 98.2, we know that $\Phi(x)=\sum_{i=1}^S Q_i(x)e^{\rho_i x}$, where the $Q_i(x)$ are polynomials. Since $Z_{(-)}$ is separable, standard theory of ordinary differential equations gives that each $Q_i(x)$ must be a constant. For details, see Tenenbaum and Pollard (1985), Lessons 20B and 20D. A simple calculation shows that the constant vector $Q$ satisfies the system of linear equations $VQ=U$. Since $Z_{(-)}$ is separable, the $\rho_i$'s are distinct and, hence, the Vandermonde matrix $V$ is invertible (see, e.g., Friedberg et al. (1997), p. 218). Since $V$ is invertible, $Q$ is the unique solution of the system of linear equations. By letting $x\to 0+$ in Equations (98.19) and (98.20), we obtain Equations (98.23) and (98.24). Formula (98.25) follows from Theorem 98.12. $\square$

Remark 98.2. It is interesting to compare our results with those of Asmussen et al. (2004). By martingale stopping and Wiener–Hopf factorization, they obtained results similar to, and more general than, ours.

As an application of Theorem 98.4, we consider Hilberink and Rogers' extension of Leland's model. We first recall the basic setting of Leland's model and then obtain the optimal default boundary in explicit form. Furthermore, we provide a procedure to calculate the values of the bond and the firm, and hence the equity.

Example 98.2 (Optimal Capital Structure). We shall assume that under a risk-neutral measure $Q$, the value of the firm's assets is given by $V_t=Ve^{ct+\sigma W_t-Z_t}$. The setting of debt issuance and coupon payments follows the convention in Leland (1994b). In the time interval $(t,t+dt)$, the firm issues new debt with face value $a\,dt$ and maturity profile $k(t)=me^{-mt}$. With the exponential maturity profile, the face value of the debt maturing in $(t,t+dt)$ is the same as the face value of the newly issued debt. Thus the face value of all pending debt is equal to the constant $A=a\int_0^\infty e^{-mt}\,dt=a/m$. All bondholders receive coupons at rate $b$ until default. All debt is of equal seniority and, once default occurs, the bondholders get the rest of the value, $\beta V_\tau$, after the bankruptcy cost $(1-\beta)V_\tau$. We assume that there is a corporate tax rate $\delta$, and the coupons paid can be offset against tax. We assume further that default occurs at time $\tau=\inf\{t\ge 0:\ V_t\le L\}$, where $L$ will be determined optimally later on. As in Hilberink and Rogers (2002), we assume that the market has a risk-free rate $r>0$ and that $\zeta>0$ is the proportional rate at which profit is disbursed to investors. Then the total value at time 0 of all debt is

$$D(V,L) = \frac{bA+mA}{m+r}\,E_Q\big[1-e^{-(m+r)\tau}\big] + \beta\,E_Q\big[V_\tau e^{-(m+r)\tau}\big] \qquad (98.26)$$

and the value of the firm is

$$v(V,L) = V + \frac{bA\delta}{r}\,E_Q\big[1-e^{-r\tau}\big] - (1-\beta)\,E_Q\big[V_\tau e^{-r\tau}\big]. \qquad (98.27)$$

The value of the equity of the firm is given by

$$S(V,L) = v(V,L) - D(V,L). \qquad (98.28)$$

(See Hilberink and Rogers (2002) or the remark after Example 98.5.) By using a smooth-pasting condition, we calculate the optimal default-triggering level $L^*$ that maximizes the value of the equity of the firm. Set $X_t=\log\frac{V_t}{L}$ and $x=\log\frac{V}{L}$. Note that $\tau=\inf\{t\ge 0:\ V_t\le L\}=\inf\{t\ge 0:\ X_t\le 0\}$ and, under the risk-neutral measure $Q$, $X$ is of the form Equation (98.1). Then we have

$$D(V,L) = \frac{A(m+b)}{m+r}\big(1-G_{m+r}(x)\big) + \beta L\,H_{m+r}(x) \qquad (98.29)$$

and

$$v(V,L) = V + \frac{A\delta b}{r}\big(1-G_r(x)\big) - (1-\beta)L\,H_r(x), \qquad (98.30)$$

where
$$G_s(x) = E_x\big[e^{-s\tau}\big] \quad\text{and}\quad H_s(x) = E_x\big[e^{-s\tau}e^{X_\tau}\big]. \qquad (98.31)$$

We compute

$$\frac{\partial S}{\partial V}(V,L) = \frac{\partial}{\partial V}\big(v(V,L)-D(V,L)\big) = 1 - \frac{\delta Ab}{r}G_r'(x)\frac{1}{V} - (1-\beta)L\,H_r'(x)\frac{1}{V} + \frac{A(m+b)}{m+r}G_{m+r}'(x)\frac{1}{V} - \beta L\,H_{m+r}'(x)\frac{1}{V}. \qquad (98.32)$$

As in Leland and Toft (1996) and Hilberink and Rogers (2002), by imposing the smooth-pasting condition (i.e., $\frac{\partial S}{\partial V}(L,L)=0$), we get the optimal default-triggering level

$$L^* = \frac{\dfrac{\delta Ab}{r}G_r'(0) - \dfrac{A(m+b)}{m+r}G_{m+r}'(0)}{1 - (1-\beta)H_r'(0) - \beta H_{m+r}'(0)}. \qquad (98.33)$$

Assume further that $X$ has only downward jumps. By taking $g(y)=e^y\mathbf 1_{y\le 0}$ in Equation (98.25), we get

$$H_s'(0) = \frac{d}{dx}H_s(x)\Big|_{x=0} = -\Big(\frac{c}{D}+\rho_s\Big) + \frac{\lambda}{D}\int_0^\infty dv\int_v^\infty dF(y)\,e^{-\rho_s v}e^{v-y} = -\Big(\frac{c}{D}+\rho_s\Big) + \frac{\lambda}{D(1-\rho_s)}\int_0^\infty\big(e^{-\rho_s y}-1+1-e^{-y}\big)\,dF(y) = \frac{s-\psi(1)}{D(1-\rho_s)} + 1. \qquad (98.34)$$

(In the last step we used $\psi(\rho_s)=s$.) Similarly, by taking $g(y)=1$ for all $y\le 0$ in Equation (98.25), we get

$$G_s'(0) = \frac{d}{dx}G_s(x)\Big|_{x=0} = -\Big(\frac{c}{D}+\rho_s\Big) + \frac{\lambda}{D}\int_0^\infty dv\int_v^\infty dF(y)\,e^{-\rho_s v} = -\Big(\frac{c}{D}+\rho_s\Big) + \frac{\lambda}{D\rho_s}\int_0^\infty\big(1-e^{-\rho_s y}\big)\,dF(y) = -\frac{s}{D\rho_s}. \qquad (98.35)$$

Plugging Equations (98.35) and (98.34) into Equation (98.33) and using the fact $\psi(1)=r-\zeta$ (since $V=E_Q[e^{-(r-\zeta)}V_1]=Ve^{-(r-\zeta)+\psi(1)}$), we obtain the optimal default-triggering level

$$L^* = \frac{\dfrac{A(m+b)}{\rho_{m+r}} - \dfrac{\delta Ab}{\rho_r}}{(1-\beta)\dfrac{\zeta}{\rho_r-1} + \beta\dfrac{m+\zeta}{\rho_{m+r}-1}}. \qquad (98.36)$$

Assume now that the jump distribution $F$ is a phase-type distribution PH$(\alpha,B)$ on $(0,\infty)$. By Equations (98.28)–(98.30), we compute $D(V,L)$ and $v(V,L)$ in terms of $G_s(x)$ and $H_s(x)$. Note that

$$G_s^{(k+2)}(0) = -\frac{c}{D}G_s^{(k+1)}(0) + \frac{\lambda+s}{D}G_s^{(k)}(0) - \frac{\lambda}{D}I_k(0) \qquad (98.37)$$

and

$$H_s^{(k+2)}(0) = -\frac{c}{D}H_s^{(k+1)}(0) + \frac{\lambda+s}{D}H_s^{(k)}(0) - \frac{\lambda}{D}J_k(0), \qquad (98.38)$$

where

$$I_k(0) = \alpha\Big(B^k\int_0^\infty e^{By}\,dy + \sum_{j=0}^{k-1}B^jG^{(k-1-j)}(0)\Big)\mathbf b^\top$$

and

$$J_k(0) = \alpha\Big(B^k\int_0^\infty e^{-y}e^{By}\,dy + \sum_{j=0}^{k-1}B^jH^{(k-1-j)}(0)\Big)\mathbf b^\top.$$

Using Equations (98.34), (98.35) and $G_s(0)=H_s(0)=1$ in Equations (98.37) and (98.38), we get all higher derivatives of $G_s$ and $H_s$ at zero. Then, by Theorem 98.4, we get solutions for $G_s(x)$ and $H_s(x)$.

In order to illustrate our method, we consider the simple case $dF(x)=\eta e^{-\eta x}\mathbf 1_{x>0}\,dx$. Then $Z_{(-)}=\{\rho_1,\rho_2\}$; recall that $G_s'(0)=-s/(D\rho_s)$. Hence $G_s(x)=Q_1e^{\rho_1x}+Q_2e^{\rho_2x}$, where $Q$ is the unique column vector satisfying the equation

$$\begin{bmatrix}1&1\\ \rho_1&\rho_2\end{bmatrix}\begin{bmatrix}Q_1\\ Q_2\end{bmatrix} = \begin{bmatrix}1\\ -\dfrac{s}{D\rho_s}\end{bmatrix}.$$

Simple algebra shows that

$$\begin{bmatrix}Q_1\\ Q_2\end{bmatrix} = \frac{1}{\rho_2-\rho_1}\begin{bmatrix}\rho_2+\dfrac{s}{D\rho_s}\\[4pt] -\rho_1-\dfrac{s}{D\rho_s}\end{bmatrix}.$$

Therefore we get

$$G_s(x) = \frac{\rho_2+\frac{s}{D\rho_s}}{\rho_2-\rho_1}\,e^{\rho_1x} - \frac{\rho_1+\frac{s}{D\rho_s}}{\rho_2-\rho_1}\,e^{\rho_2x}. \qquad (98.39)$$

Recall that $H_s'(0)=\frac{s-\psi(1)}{D(1-\rho_s)}+1$. Hence $H_s(x)=P_1e^{\rho_1x}+P_2e^{\rho_2x}$, where $P$ is the unique column vector satisfying the equation

$$\begin{bmatrix}1&1\\ \rho_1&\rho_2\end{bmatrix}\begin{bmatrix}P_1\\ P_2\end{bmatrix} = \begin{bmatrix}1\\ \dfrac{s-\psi(1)}{D(1-\rho_s)}+1\end{bmatrix}.$$

Simple algebra shows that

$$\begin{bmatrix}P_1\\ P_2\end{bmatrix} = \frac{1}{\rho_2-\rho_1}\begin{bmatrix}\rho_2-1-\dfrac{s-\psi(1)}{D(1-\rho_s)}\\[4pt] 1-\rho_1+\dfrac{s-\psi(1)}{D(1-\rho_s)}\end{bmatrix}.$$

Therefore we get

$$H_s(x) = \frac{\rho_2-1-\frac{s-\psi(1)}{D(1-\rho_s)}}{\rho_2-\rho_1}\,e^{\rho_1x} + \frac{1-\rho_1+\frac{s-\psi(1)}{D(1-\rho_s)}}{\rho_2-\rho_1}\,e^{\rho_2x}. \qquad (98.40)$$

Remark 98.3. The optimal level $L^*$ in Equation (98.36) coincides with that in Hilberink and Rogers (2002). Our ODE approach also works for the perpetual American put option. Indeed, we can determine the optimal stopping time when only downside jumps are allowed, and we can calculate exactly the price of the perpetual American put option.

For general two-sided jump distributions, we get a necessary condition for the coefficient vector $Q$. This necessary condition will play an important role in the next section.

Proposition 98.3. If $(\alpha_+,B_+)$ is minimal, $Z_{(-)}$ is separable and $p>0$, then the constant vector $Q$ for $\Phi$ satisfies $\sum_{i=1}^S Q_i = g(0)$ and

$$\alpha_+\Big(\int_0^\infty g(-y)e^{B_+y}\,dy\Big)e^{xB_+}\mathbf b_+^\top = \alpha_+\Big(\sum_{i=1}^S Q_i(\rho_iI-B_+)^{-1}\Big)e^{xB_+}\mathbf b_+^\top, \qquad \forall x>0. \qquad (98.41)$$
98.4 The Constant Vector Q: Second Method

In this section, we consider a more general process whose upward jumps are governed by a general Lévy measure and whose downward jumps are those of a compound Poisson process with a mixture-of-exponentials jump distribution. We show that if a vector $Q$ satisfies a certain system of linear equations, then a conjectured function must be the desired function $\Phi$. To find a solution to the system of equations, instead of inverting it directly, we exploit the technique in Dufresne and Gerber (1989) to obtain an explicit solution for $\Phi$. With this explicit formula we compute, in particular in Example 98.4, the optimal default-triggering level $L^*$. Example 98.5 shows that we can also calculate the price of an option with finite maturity up to a Fourier transform.

First, let $X^{(+)}=(X_t^{(+)};\,t\ge 0)$ be a Lévy process on $\mathbb R$ that starts at 0, has a nontrivial diffusion part, and has no downward jumps. Let $Z$ be a compound Poisson process, independent of $X^{(+)}$, whose jump distribution has the density $f_{(+)}$ of PH$(\alpha_+,B_+)$. We consider in this section the family of Lévy processes $(X,\{P_x\}_{x\in\mathbb R})$ given by

$$X_t = X_0 + X_t^{(+)} - Z_t, \qquad t\ge 0. \qquad (98.42)$$

Here, as before, $X_0=x$ $P_x$-a.s. Clearly, the process in Equation (98.42) is a generalization of Equation (98.1). The characteristic exponent of $X$ is given by

$$\psi(\theta) = D\theta^2 + c\theta + \int_0^\infty\big(e^{\theta z}-1-\theta z\mathbf 1_{|z|\le 1}\big)\,\nu(dz) + \lambda\Big(\int_0^\infty e^{-\theta y}f_{(+)}(y)\,dy - 1\Big),$$

where $\nu$ is an arbitrary Lévy measure on $(0,\infty)$ with $\int_0^\infty\min(1,z^2)\,\nu(dz)<\infty$. Moreover, the infinitesimal generator of $X$ has a domain containing $C_0^2(\mathbb R)$ and is given by

$$Lh(x) = Dh''(x) + ch'(x) + \int_0^\infty\big[h(x+z)-h(x)-h'(x)z\mathbf 1_{|z|\le 1}\big]\,\nu(dz) + \lambda\int_0^\infty h(x-z)f_{(+)}(z)\,dz - \lambda h(x).$$

Recall that $\Phi(x)=E_x[e^{-r\tau}g(X_\tau)]$, where $\tau=\inf\{t\ge 0:\ X_t\le 0\}$. The next result gives a converse of Theorem 98.2.

Theorem 98.5. If $\Psi=g$ on $(-\infty,0]$, $\Psi\in C_0^2(\mathbb R_+)$ and $(L-r)\Psi\equiv 0$ on $\mathbb R_{++}$, then $\Phi\equiv\Psi$ on $\mathbb R$.

We consider below the special case in which $f_{(+)}(y)$ is a mixture of exponential densities. Namely, there exist constants $(p_j)_{j=1}^m$ and $(\eta_j)_{j=1}^m$ such that $p_j>0$, $\eta_j>0$, $\sum_{j=1}^m p_j=1$ and

$$f_{(+)}(y) = \sum_{j=1}^m p_j\eta_je^{-\eta_jy} \qquad (98.43)$$

for $y>0$, and $f_{(+)}(y)=0$ otherwise. Without loss of generality, assume that the $\eta_j$'s are distinct. It is worth noting that $Z_{(-)}=(\rho_i)_{i=1}^S$ is separable and $S=m+1$ (see Asmussen et al. (2004), Lemma 1, or Mordecki (2002), Corollary 2). Now, based on Theorem 98.4, we conjecture that $\Phi(x)=E_x[e^{-r\tau}g(X_\tau)]=\sum_{i=1}^{m+1}Q_ie^{\rho_ix}$ for some $Q_i$. Moreover, by Proposition 98.3, we assume further that $Q$ satisfies Equation (98.41). More precisely, for $f_{(+)}$ given by Equation (98.43),

$$\sum_{j=1}^m p_j\eta_je^{-\eta_jx}\int_{-\infty}^0 g(y)e^{\eta_jy}\,dy = \sum_{i=1}^{m+1}Q_i\sum_{j=1}^m\frac{p_j\eta_j}{\rho_i+\eta_j}\,e^{-\eta_jx}. \qquad (98.44)$$

(Note that $f_{(+)}$ is PH$((p_1,\dots,p_m),\,\mathrm{diag}(-\eta_1,\dots,-\eta_m))$.) Recall that $\sum_{i=1}^{m+1}Q_i=g(0)$. Then, by comparing the coefficients of $e^{-\eta_jx}$ in Equation (98.44), we get a system of linear equations:

$$\begin{cases}\displaystyle\sum_{i=1}^{m+1}Q_i = g(0),\\[6pt] \displaystyle\sum_{i=1}^{m+1}Q_i\frac{\eta_j}{\rho_i+\eta_j} = \int_{-\infty}^0 g(y)\,\eta_je^{\eta_jy}\,dy, & 1\le j\le m. \end{cases} \qquad (98.45)$$

Theorem 98.6 confirms our conjecture.

Theorem 98.6. If $Q$ satisfies Equation (98.45) for some bounded Borel measurable function $g$, then

$$\Phi(x) = E_x\big[e^{-r\tau}g(X_\tau)\big] = \sum_{i=1}^{m+1}Q_ie^{\rho_ix} \quad\text{for } x\ge 0. \qquad (98.46)$$
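As a small illustration of Theorem 98.6 (again our own sketch, not the paper's code; the roots below are placeholders that would normally come from a root finder applied to $\psi-r$), the system (98.45) is a dense $(m+1)\times(m+1)$ linear system that is easy to assemble and solve directly:

```python
# Assemble and solve the linear system (98.45) for Q.
import numpy as np

def solve_Q(rho, eta, g0, g_tilde):
    """rho: the m+1 negative roots; eta: mixture rates (length m);
    g0 = g(0); g_tilde[j] = int_{-inf}^0 g(y) eta_j e^{eta_j y} dy."""
    m = len(eta)
    A = np.ones((m + 1, m + 1))                 # first row encodes sum Q_i = g(0)
    rhs = np.empty(m + 1)
    rhs[0] = g0
    for j in range(m):
        A[j + 1, :] = eta[j] / (np.array(rho) + eta[j])
        rhs[j + 1] = g_tilde[j]
    return np.linalg.solve(A, rhs)

# g = 1 gives g_tilde[j] = 1 and Phi = G_r; illustrative inputs with m = 1:
rho = [-0.8, -2.6]                              # pretend roots of psi - r
Q = solve_Q(rho, eta=[2.0], g0=1.0, g_tilde=[1.0])
print(Q, Q.sum())                               # the Q_i sum to g(0) = 1
```

Note that the mixture weights $p_j$ cancel out of (98.45), so they do not enter the solve.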
Proof. Set $\Psi(x)=\sum_{i=1}^{m+1}Q_ie^{\rho_ix}$ for $x\ge 0$, and $\Psi(x)=g(x)$ for $x\le 0$. Then, for every $x>0$,

$$(L-r)\Psi(x) = \sum_{i=1}^{m+1}Q_i\Big[D\rho_i^2+c\rho_i+\int_0^\infty\big(e^{\rho_iz}-1-\rho_iz\mathbf 1_{|z|\le 1}\big)\nu(dz)-(\lambda+r)\Big]e^{\rho_ix} + \lambda\Big[\int_0^x\sum_{i=1}^{m+1}Q_ie^{\rho_i(x-y)}f_{(+)}(y)\,dy + \int_x^\infty g(x-y)f_{(+)}(y)\,dy\Big]$$

$$= \sum_{i=1}^{m+1}Q_i\big(\psi(\rho_i)-r\big)e^{\rho_ix} + \lambda\sum_{j=1}^m p_j\eta_je^{-\eta_jx}\Big[\int_{-\infty}^0 g(y)e^{\eta_jy}\,dy - \sum_{i=1}^{m+1}\frac{Q_i}{\eta_j+\rho_i}\Big].$$

Since $\psi(\rho_i)-r=0$ for all $i$, and $Q$ satisfies Equation (98.45) and hence Equation (98.44), we get $(L-r)\Psi(x)=0$. By Theorem 98.5, $\Phi\equiv\Psi$ on $\mathbb R$. $\square$

In fact, the coefficient matrix of the system of equations (98.45) is invertible; hence we can write $Q$ in matrix form. Instead of the matrix form, we exploit the technique in Dufresne and Gerber (1989) to get an explicit formula for $Q$.
Theorem 98.7. Assume $X$ satisfies Equation (98.42) and $f_{(+)}$ is given by Equation (98.43). Then for any bounded Borel function $g:\mathbb R_-\to\mathbb R$, we have $\Phi(x)=E_x[e^{-r\tau}g(X_\tau)]=\sum_{i=1}^S Q_ie^{\rho_ix}$ on $\mathbb R_+$, where, for $1\le h\le m+1$,

$$Q_h = \frac{1}{\rho_h\prod_{l=1,l\ne h}^{m+1}(\rho_l-\rho_h)}\sum_{j=1}^{m+1}R_j\Big(\prod_{k=1}^{m+1}(\eta_j+\rho_k)\Big)\prod_{i=1,i\ne j}^{m+1}\frac{-\rho_h-\eta_i}{\eta_j-\eta_i} \qquad (98.47)$$

and

$$R_j = g(0) - \int_{-\infty}^0 g(y)\,\eta_je^{\eta_jy}\,dy. \qquad (98.48)$$

(Here we set $\eta_{m+1}=0$.)

Proof. Clearly, the system (98.45) is equivalent to the following system of linear equations:

$$\sum_{i=1}^{m+1}Q_i\frac{\rho_i}{\rho_i+\eta_j} = R_j, \qquad 1\le j\le m+1, \qquad (98.49)$$

where the $R_j$ are given by Equation (98.48). Since the system (98.45) admits at most one solution, so does Equation (98.49). Consider the rational function

$$H(x) = \sum_{j=1}^{m+1}R_j\prod_{k=1}^{m+1}\frac{\eta_j+\rho_k}{x+\rho_k}\prod_{i=1,i\ne j}^{m+1}\frac{x-\eta_i}{\eta_j-\eta_i}. \qquad (98.50)$$

Note that for each summand in Equation (98.50) the numerator is a polynomial of degree $m$ and the denominator is a polynomial of degree $m+1$. By the principle of partial fraction decomposition, there exist constants $\bar D_h$, $1\le h\le m+1$, such that

$$H(x) = \sum_{i=1}^{m+1}\bar D_i\frac{\rho_i}{\rho_i+x}. \qquad (98.51)$$

By multiplying both sides of Equation (98.51) by $(x+\rho_h)$ and then setting $x=-\rho_h$, we get that $\bar D_h$ is given by the right-hand side of Equation (98.47). On the other hand, since $\prod_{i\ne j}(x-\eta_i)$ vanishes at $x=\eta_h$ for all $j\ne h$,

$$\sum_{i=1}^{m+1}\bar D_i\frac{\rho_i}{\rho_i+\eta_h} = H(\eta_h) = R_h\prod_{k=1}^{m+1}\frac{\eta_h+\rho_k}{\eta_h+\rho_k} = R_h, \qquad 1\le h\le m+1.$$

That is, $\bar D$ is a solution of Equation (98.49). Since the solution of Equation (98.49) is unique and Equations (98.45) and (98.49) are equivalent, $Q=\bar D$ is the unique solution of Equation (98.45). By Theorem 98.6, $\Phi=\sum_{i=1}^S Q_ie^{\rho_ix}$ on $\mathbb R_+$. $\square$

Remark 98.4. In fact, by approximation in Equations (98.47) and (98.48), one sees that Equation (98.47) still holds for $\Phi(x)$ if $g$ is any Borel function on $\mathbb R_-$ such that $\int_{-\infty}^0|g(y)|e^{\eta_jy}\,dy<\infty$ for all $1\le j\le m$.

Example 98.3. As in Kou and Wang (2004), we consider some special penalties $g$:

1. Assume $g\equiv\mathbf 1_{(-\infty,-y)}$ for some $y\ge 0$. Then $R_j=-e^{-\eta_jy}$ for $1\le j\le m$ and $R_{m+1}=0$. Hence

$$E_x\big[e^{-r\tau}\mathbf 1_{X_\tau<-y}\big] = -\sum_{h=1}^{m+1}\frac{1}{\rho_h\prod_{l\ne h}(\rho_l-\rho_h)}\Bigg[\sum_{j=1}^m e^{-\eta_jy}\Big(\prod_{k=1}^{m+1}(\eta_j+\rho_k)\Big)\prod_{i\ne j}\frac{-\rho_h-\eta_i}{\eta_j-\eta_i}\Bigg]e^{\rho_hx}.$$

2. Assume $g\equiv\mathbf 1_{\{0\}}$. Then $R_j=1$ for all $j$. Hence

$$E_x\big[e^{-r\tau}\mathbf 1_{\{X_\tau=0\}}\big] = \sum_{h=1}^{m+1}\frac{1}{\rho_h\prod_{l\ne h}(\rho_l-\rho_h)}\Bigg[\sum_{j=1}^{m+1}\Big(\prod_{k=1}^{m+1}(\eta_j+\rho_k)\Big)\prod_{i\ne j}\frac{-\rho_h-\eta_i}{\eta_j-\eta_i}\Bigg]e^{\rho_hx}.$$
Example 98.4 (Optimal Capital Structure, continued). We use the notation and setting of Example 98.2, except that the process $X$ is of the form Equation (98.42) and $Z$ has the mixed exponential jump distribution (98.43). For every $s>0$, let $\rho_i^s$, $1\le i\le m+1$, be the negative real solutions of $\psi(z)-s=0$. As before, we write $\rho_i$ for $\rho_i^s$ if this causes no confusion. By taking $g\equiv 1$ in Equation (98.48), we get $R_j=0$ for $1\le j\le m$ and $R_{m+1}=1$. By Theorem 98.7, we get

$$G_s(x) = \sum_{h=1}^{m+1}\frac{\prod_{k\ne h}\rho_k}{\prod_{l\ne h}(\rho_l-\rho_h)}\prod_{i=1}^m\frac{\rho_h+\eta_i}{\eta_i}\,e^{\rho_hx} \qquad (98.52)$$

and, hence,

$$G_s'(0) = \sum_{h=1}^{m+1}\frac{\prod_{k=1}^{m+1}\rho_k}{\prod_{l\ne h}(\rho_l-\rho_h)}\prod_{i=1}^m\frac{\rho_h+\eta_i}{\eta_i}. \qquad (98.53)$$

Similarly, by taking $g(y)=e^y$ for $y\le 0$ in Equation (98.48), we get $R_j=\frac{1}{1+\eta_j}$ for $1\le j\le m+1$. By Theorem 98.7, we get

$$H_s(x) = \sum_{h=1}^{m+1}\frac{1}{\rho_h\prod_{l\ne h}(\rho_l-\rho_h)}\Bigg\{\sum_{j=1}^{m+1}\frac{1}{1+\eta_j}\Big(\prod_{k=1}^{m+1}(\eta_j+\rho_k)\Big)\prod_{i\ne j}\frac{-\rho_h-\eta_i}{\eta_j-\eta_i}\Bigg\}e^{\rho_hx} \qquad (98.54)$$

and, hence,

$$H_s'(0) = \sum_{h=1}^{m+1}\frac{1}{\prod_{l\ne h}(\rho_l-\rho_h)}\Bigg\{\sum_{j=1}^{m+1}\frac{1}{1+\eta_j}\Big(\prod_{k=1}^{m+1}(\eta_j+\rho_k)\Big)\prod_{i\ne j}\frac{-\rho_h-\eta_i}{\eta_j-\eta_i}\Bigg\}. \qquad (98.55)$$

Using formulas (98.53) and (98.55) in Equation (98.33), we get the explicit solution for the optimal default-triggering level $L^*$. Similarly, using formulas (98.52) and (98.54) in Equations (98.29), (98.30) and (98.28), respectively, we get closed formulas for the time-0 prices of all debts, the value of the firm and, hence, the value of the equity.

Remark 98.5. Assume $m=1$ and that there are no upward jumps. By using $\rho_1\rho_2=\eta s/(D\rho_s)$, we obtain that Equation (98.52) coincides with Equation (98.39). By using $(1+\eta)(\psi(1)-s)=D(1-\rho_1)(1-\rho_2)(1-\rho_s)$, we check that Equation (98.54) coincides with Equation (98.40).
Example 98.5 (Option Pricing). A first-touch digital is a digital contract paying \$1 when and if a prespecified event occurs. Consider the first-touch digital, or touch-and-out option, which pays \$1 whenever the stock price $S$ falls below the level $H$ from above. If the stock price stays above the level $H$ up to the expiration time $T>0$, the claim expires worthless. It can be interpreted as an American-put-like option with a digital payoff. Since the payoff is the same for all levels of the stock price below the barrier, it is clearly optimal to exercise the option once the level $H$ is crossed.

We consider a general framework. Let $S$ be the price process of a security satisfying $S_t=He^{X_t}$ under a risk-neutral probability measure, where $X$ is given by Equation (98.42) and $H>0$ is a given constant threshold. Assume the existence of a constant risk-free rate $r>0$. Consider a derivative with maturity $T$ on this security whose payoff structure is given by:

1. A bounded payment $k(S_{\tau_H})$ at the time $\tau_H$, where $\tau_H=\inf\{t:\ S_t\le H\}$;
2. A constant payment $A$ at the time $T$ if $S$ does not cross the boundary $H$ before maturity.

Let $x=\log(S_0/H)$, $g(y)=k(He^y)$ for $y\le 0$ and $\tau=\inf\{t:\ X_t\le 0\}$ as usual. Then, by the risk-neutral pricing formula, the risk-neutral price of this derivative is given by

$$V(S_0,T) = E_x\big[e^{-r\tau}g(X_\tau)\mathbf 1_{[\tau\le T]}\big] + E_x\big[e^{-rT}A\mathbf 1_{[\tau>T]}\big] = E_x\big[e^{-r\tau}g(X_\tau)\mathbf 1_{[\tau\le T]}\big] + e^{-rT}A\big(1-P_x[\tau\le T]\big). \qquad (98.56)$$

To find the unknown quantities $E_x[e^{-r\tau}g(X_\tau)\mathbf 1_{[\tau\le T]}]$ and $P_x[\tau\le T]$, we take their Laplace transforms with respect to the time $T$. By Fubini's theorem, for every $\beta>0$,

$$\int_0^\infty e^{-\beta T}E_x\big[e^{-r\tau}g(X_\tau)\mathbf 1_{[\tau\le T]}\big]\,dT = \frac{1}{\beta}E_x\big[e^{-(r+\beta)\tau}g(X_\tau)\big] \qquad (98.57)$$

and

$$\int_0^\infty e^{-\beta T}P_x[\tau\le T]\,dT = \frac{1}{\beta}E_x\big[e^{-\beta\tau}\big]. \qquad (98.58)$$

The right-hand sides of both equations can be written explicitly using Theorem 98.7. Hence we have an expression for the value $V(S_0,T)$ up to a Fourier transform. For the first-touch digital, we have $k\equiv 1$ and $A=0$. Hence, using Equations (98.57) and (98.52) with $s=r+\beta$, we deduce that the Laplace transform of the option price with respect to $T$ is given by

$$\int_0^\infty e^{-\beta T}V(S_0,T)\,dT = \frac{1}{\beta}E_x\big[e^{-(r+\beta)\tau}\big] = \frac{1}{\beta}\sum_{h=1}^{m+1}\frac{\prod_{k\ne h}\rho_k}{\prod_{l\ne h}(\rho_l-\rho_h)}\prod_{i=1}^m\frac{\rho_h+\eta_i}{\eta_i}\Big(\frac{S_0}{H}\Big)^{\rho_h}. \qquad (98.59)$$

Remark 98.6. It is worth noting that the time-zero value of all debts in Example 98.2 is of the form $A\int_0^\infty me^{-mT}B(V,L,T)\,dT$, where $B(V,L,T)$ is the time-zero value of a bond issued at time zero with face value 1 and maturity $T$. The price formula (98.26) follows by arguments similar to those in Example 98.5.
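Recovering finite-maturity quantities from (98.57)–(98.58) requires a numerical Laplace inversion, which the paper leaves implicit. A basic Gaver–Stehfest scheme (our own sketch, using the two-term formula (98.39) for $G_\beta$ in the $m=1$, no-upward-jump case with illustrative parameters) is enough for a first look:

```python
# Invert the Laplace transform (98.58) to get P_x[tau <= T] numerically.
import math
import numpy as np

D, c, lam, eta = 0.08, 0.075, 0.5, 3.0      # illustrative parameters

def G(s, x):
    """G_s(x) via Eq. (98.39); all roots of the cubic are real here."""
    roots = np.roots([D, D * eta + c, c * eta - lam - s, -eta * s])
    rho1, rho2 = sorted(z.real for z in roots if z.real < 0)
    rho_s = max(z.real for z in roots if z.real > 0)
    a = s / (D * rho_s)
    return ((rho2 + a) * math.exp(rho1 * x)
            - (rho1 + a) * math.exp(rho2 * x)) / (rho2 - rho1)

def gaver_stehfest(F, t, n=12):
    """Classical Gaver-Stehfest inversion of F(s) at time t (n even)."""
    ln2t = math.log(2.0) / t
    total = 0.0
    for k in range(1, n + 1):
        a_k = 0.0
        for j in range((k + 1) // 2, min(k, n // 2) + 1):
            a_k += (j ** (n // 2) * math.factorial(2 * j)
                    / (math.factorial(n // 2 - j) * math.factorial(j)
                       * math.factorial(j - 1) * math.factorial(k - j)
                       * math.factorial(2 * j - k)))
        total += (-1) ** (n // 2 + k) * a_k * F(k * ln2t)
    return ln2t * total

x, T = 1.0, 5.0
print("P_x[tau <= T] ~", gaver_stehfest(lambda beta: G(beta, x) / beta, T))
```

For the first-touch digital itself, one would instead invert $\beta\mapsto G_{r+\beta}(x)/\beta$ as in (98.59).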
98.5 Conclusion

The expected discounted penalty is a generalized notion of the ruin probability in the insurance literature, and it has been widely studied and generalized since Gerber and Shiu (1998). This function is also a central object in the pricing of perpetual financial securities and of the Laplace transforms of securities with finite maturity; examples include option theory and credit risk modelling. While empirical studies have pointed to the failure of the diffusion model, the seminal paper of Merton (1976) has received renewed attention in recent years. However, unlike the diffusion case, in which many functionals are available in closed form, this is no longer true for jump diffusion processes, especially in first-passage models. This is where the difficulty in pricing securities under a jump diffusion process arises.

In this paper, we consider the jump diffusion process as in Asmussen et al. (2004). In the case where the jump distribution is two-sided phase type, by a Fourier transform argument we transform the integro-differential equation for the expected discounted penalty into a homogeneous ordinary differential equation. The present method could possibly be extended to transform more integro-differential equations into ordinary differential equations and hence give an alternative approach to computing prices in jump diffusion models. Next, by ODE theory, we know that the solution for the expected discounted penalty is a linear combination of known exponential functions. Moreover, by using the limit behavior of the expected discounted penalty and the integro-differential equation, we can determine these coefficients (under some conditions). All this distinguishes our approach both from that of Asmussen et al. (2004) and from the classical method of solving differential equations, in which knowledge of the boundary values is required. The same strategy could be applied to other functionals once their integro-differential equations have been transformed into ordinary differential equations.

Our results in fact provide explicit solutions to a large number of existing pricing problems in finance. Using our closed-form solutions, especially Theorem 98.7, the Laplace
transforms of securities with finite maturity have explicit solutions, as mentioned in Example 98.5; we illustrate this by considering a touch-and-out option. In addition, the pricing problems of perpetual securities in jump diffusion models obtain improved answers. For example, both the value to shareholders and the optimal bankruptcy level (obtained by assuming the smooth-pasting condition) in the optimal capital structure problem have exact solutions. We worked these out in Example 98.2 under the setup of Hilberink and Rogers (2002), which is an extension of Leland (1994b). Moreover, in Example 98.4, we obtained closed-form solutions even in the two-sided jump case. Using our closed-form solutions as a benchmark, many structural-form models in credit risk can be reconsidered and modified, to see whether more of the phenomena found in empirical studies can be captured by jump diffusion models.
References

Asmussen, S. 2000. Ruin probabilities. World Scientific, Singapore.
Asmussen, S., F. Avram, and M. R. Pistorius. 2004. "Russian and American put options under exponential phase-type Lévy models." Stochastic Processes and their Applications 109, 79–111.
Bertoin, J. 1996. Lévy processes. Cambridge Tracts in Mathematics, Vol. 121. Cambridge University Press, Cambridge.
Boyarchenko, S. I. 2000. Endogenous default under Lévy processes. Working paper, University of Texas, Austin.
Boyarchenko, S. I. and S. Levendorskiǐ. 2002a. "Perpetual American options under Lévy processes." SIAM Journal on Control and Optimization 40, 1663–1696.
Boyarchenko, S. I. and S. Levendorskiǐ. 2002b. "Barrier options and touch-and-out options under regular Lévy processes of exponential type." Annals of Applied Probability 12, 1261–1298.
Cai, J. and H. Yang. 2005. "Ruin in the perturbed compound Poisson risk process under interest rate." Advances in Applied Probability 37, 819–835.
Chan, T. 1999. "Pricing contingent claims on stocks driven by Lévy processes." Annals of Applied Probability 9, 504–528.
Chan, T. 2005. "Pricing perpetual American options driven by spectrally one-sided Lévy processes," in Exotic option pricing and advanced Lévy models, A. Kyprianou (Ed.). Wiley, England, pp. 195–216.
Chen, N. and S. Kou. preprint. Credit spreads, optimal capital structure, and implied volatility with endogenous default and jump risk. Columbia University.
Chen, Y. T., C. F. Lee, and Y. C. Sheu. "On the discounted penalty at ruin in a jump-diffusion model." Taiwanese Journal of Mathematics (accepted).
Dao, B. and M. Jeanblanc. preprint. Double exponential jump diffusion process: a structural model of endogenous default barrier with rollover debt structure. Université d'Évry.
Dufresne, F. and H. U. Gerber. 1989. "Three methods to calculate the probability of ruin." ASTIN Bulletin 19, 71–90.
Friedberg, S. H., A. J. Insel, and L. E. Spence. 1997. Linear algebra, 3rd edition. Prentice Hall, Upper Saddle River, NJ.
Gerber, H. U. 1970. "An extension of the renewal equation and its application in the collective theory of risk." Scandinavian Actuarial Journal 53, 205–210.
Gerber, H. U. and B. Landry. 1998. "On the discounted penalty at ruin in a jump-diffusion and the perpetual put option." Insurance: Mathematics and Economics 22, 263–276.
Gerber, H. U. and E. S. W. Shiu. 1998. "On the time value of ruin." North American Actuarial Journal 2, 48–78.
Hilberink, B. and L. C. G. Rogers. 2002. "Optimal capital structure and endogenous default." Finance and Stochastics 6, 237–263.
Kou, S. G. and H. Wang. 2003. "First passage times of a jump diffusion process." Advances in Applied Probability 35, 504–531.
Kou, S. G. and H. Wang. 2004. "Option pricing under a double exponential jump diffusion model." Management Science 50, 1178–1192.
Leland, H. E. 1994a. "Corporate debt value, bond covenants, and optimal capital structure." Journal of Finance 49, 1213–1252.
Leland, H. E. 1994b. Bond prices, yield spreads, and optimal capital structure with default risk. Working paper no. 240, IBER, University of California, Berkeley, CA.
Leland, H. E. and K. Toft. 1996. "Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads." Journal of Finance 51, 987–1019.
Merton, R. 1976. "Option pricing when underlying stock returns are discontinuous." Journal of Financial Economics 3, 125–144.
Mordecki, E. 2002. "Optimal stopping and perpetual options for Lévy processes." Finance and Stochastics 6, 473–493.
Revuz, D. and M. Yor. 2005. Continuous martingales and Brownian motion, corrected 3rd printing of the 3rd edition. Springer, Berlin.
Rolski, T., H. Schmidli, V. Schmidt, and J. Teugels. 1999. Stochastic processes for insurance and finance. Wiley, England.
Sato, K. I. 1999. Lévy processes and infinitely divisible distributions. Cambridge Studies in Advanced Mathematics, Vol. 68. Cambridge University Press, Cambridge.
Stein, E. M. and R. Shakarchi. 2005. Real analysis. Princeton University Press, Princeton, NJ.
Tenenbaum, M. and H. Pollard. 1985. Ordinary differential equations. Dover Publications, New York.
Tsai, C. C.-L. and G. E. Willmot. 2002. "A generalized defective renewal equation for the surplus process perturbed by diffusion." Insurance: Mathematics and Economics 30, 51–66.
Appendix 98A Proofs

To prove Equation (98.8), we need the following lemmas and Dynkin's formula.

Lemma 98.2. For every bounded Borel measurable function $g$ on $\mathbb R_-$, there exists a sequence of uniformly bounded functions $(g_n)$ in $C_0^2(\mathbb R_-)$ such that $g_n\to g$ Lebesgue-a.e. on $\mathbb R_-$ and $g_n(0)=g(0)$ for all $n$.

Proof. Write $dm$ for the Lebesgue measure on $\mathbb R$. Let $h$ be a normal density and set $dG(y)=h(y)\,dm$. Then, by the strict positivity of $h$, $dG$ and $dm$ are equivalent. Given $\epsilon>0$, by the Dominated Convergence Theorem we can find $N$ large enough that $\int_{-\infty}^0|g\mathbf 1_{[-N,0]}-g|\,dG<\epsilon$. Also, since $g\mathbf 1_{[-N,0]}$ is bounded and is in $L^1(\mathbb R,dm)$, a modification of the proof of Theorem 2.4 in Stein and Shakarchi (2005), Chapter 2, shows that we can find a sequence of uniformly bounded $C_0^2(\mathbb R_-)$-functions $(g_n')$ such that $g_n'\to g\mathbf 1_{[-N,0]}$ $dm$-a.e. Since $dG$ and $dm$ are equivalent, $g_n'\to g\mathbf 1_{[-N,0]}$ $dG$-a.e. Hence, by the Dominated Convergence Theorem, we can pick $n$ large such that $\int_{-\infty}^0|g_n'-g\mathbf 1_{[-N,0]}|\,dG<\epsilon$. We have obtained $\int_{-\infty}^0|g_n'-g|\,dG<2\epsilon$, and this implies that there exists a sequence of $C_0^2(\mathbb R_-)$ functions $(g_n)$ such that $g_n\to g$ $dG$-a.e. on $\mathbb R_-$. Since $dG$ and $dm$ are equivalent, we see that $g_n\to g$ $dm$-a.e. on $\mathbb R_-$. In addition, it is clear that we can pick $(g_n)$ to be uniformly bounded. Now, for each $n$, define a $C_0^2(\mathbb R_-)$-function $h_n$ such that $h_n=g_n$ on $(-\infty,-1/n]$, $h_n(0)=g(0)$ and $(h_n)$ is uniformly bounded. Then $(h_n)$ is the desired sequence. This completes the proof. $\square$

Lemma 98.3. The following distributions are absolutely continuous with respect to Lebesgue measure: (1) the distribution $P_x[X_{\tilde\tau}<0,\ X_{\tilde\tau}\in\cdot\,]$, where $X$ is defined by Equation (98.1) and $\tilde\tau$ is defined as in the proof of Theorem 98.2; (2) the distribution $P_x[X_\tau<0,\ X_\tau\in\cdot\,]$ for $x>0$, where $X$ is defined by Equation (98.42).

Proof. We only show (2); the proof of (1) is similar. Let $N$ be a Borel set in $(-\infty,0)$ with Lebesgue measure zero. Write $J_n$ for the $n$-th jump time of the compound Poisson process $Z$. Observe that for $x>0$,

$$P_x[X_\tau<0,\ X_\tau\in N] = \sum_{n=1}^\infty P_x[X_\tau<0,\ X_\tau\in N,\ \tau=J_n].$$

We first show that $P_x[X_\tau\in N,\ X_\tau<0,\ \tau=J_1]=0$. Since $X^{(+)}$, $J_1$ and $Y_1$ are independent, so are $X_{J_1}^{(+)}$ and $Y_1$. Hence we have

$$P_x[X_\tau\in N,\ X_\tau<0,\ \tau=J_1] = P_x\big[X_{J_1}^{(+)}-Y_1\in N,\ X_{J_1}^{(+)}-Y_1<0,\ \tau=J_1\big] \le P_x\big[X_{J_1}^{(+)}-Y_1\in N\big] = \int P_x[z-Y_1\in N]\,dG(z),$$

where $G$ is the distribution of $X_{J_1}^{(+)}$. Since $Y_1$ has an absolutely continuous distribution, we deduce from the last inequality that $P_x[X_\tau\in N,\ X_\tau<0,\ \tau=J_1]=0$. In general, for each $n\ge 2$, we have by the strong Markov property that

$$P_x[X_\tau<0,\ X_\tau\in N,\ \tau=J_n] = E_x\big[\mathbf 1_{\tau>J_{n-1}}P_{X_{J_{n-1}}}[X_\tau<0,\ X_\tau\in N,\ \tau=J_1]\big] = 0,$$

since $J_{n-1}$ is an $(\mathcal F_t)$-stopping time and $X_{J_{n-1}}>0$ on $[\tau>J_{n-1}]$. This shows that $P_x[X_\tau<0,\ X_\tau\in\cdot\,]$ is absolutely continuous with respect to Lebesgue measure for every $x>0$. $\square$

Theorem 98.8 (Dynkin's formula). Let $X(t)\in\mathbb R^n$ be a jump diffusion and let $k\in C_0^2(\mathbb R^n)$. If $\tau$ is a stopping time such that $E_x(\tau)<\infty$, then

$$E_x[k(X_\tau)] = k(x) + E_x\Big[\int_0^\tau Ak(X(s))\,ds\Big],$$

where $A$ is the generator of $X$ defined by $Ak(x)=\lim_{t\to 0+}\frac{1}{t}\{E_x[k(X(t))]-k(x)\}$ (if the limit exists).

Proof of Equation (98.8). As in the proof of Theorem 98.2, let $\epsilon>0$ and let $\tilde\tau$ be the first exit time of $X$ from $(\epsilon,x+1)$. Pick a sequence of uniformly bounded functions $(g_n)\subset C_0^2((-\infty,0])$ that converges to $g$ except on a set $N$ of Lebesgue measure 0 in $\mathbb R_-$, with $g_n(0)=g(0)$ for all $n$ (see Lemma 98.2). Therefore, we may consider a sequence of uniformly bounded functions $(\Phi_n;\,n\ge 1)$ satisfying the following conditions:

(C1) $\Phi_n=g_n$ on $(-\infty,0]$;
(C2) $\Phi_n=\Phi$ on $(1/n,n)$;
(C3) $\Phi_n\in C_0^2(\mathbb R)$.

By (C1), (C2) and the fact that $g_n\to g$ except on $N$, we have $\Phi_n\to\Phi$ pointwise on $\mathbb R$ except on $N$. Now pick $n$ large such that $(\epsilon,x+1)\subset(1/n,n)$. Then $\Phi_n(x)=\Phi(x)$ by (C2). By Dynkin's formula, for all $t>0$,

$$E_x\big[e^{-r(\tilde\tau\wedge t)}\Phi_n(X_{\tilde\tau\wedge t})\big] = E_x\Big[\int_0^{\tilde\tau\wedge t}e^{-ru}(L-r)\Phi_n(X_u)\,du\Big] + \Phi(x). \qquad (98.60)$$

Since $(\Phi_n)$ is uniformly bounded, $\Phi_n\to\Phi$ pointwise on $\mathbb R_+$ and $\Phi_n\to\Phi$ a.e. on $(-\infty,0)$, by Lemma 98.3(1) and the Dominated Convergence Theorem,

$$\lim_{n\to\infty}E_x\big[e^{-r(t\wedge\tilde\tau)}\Phi_n(X_{t\wedge\tilde\tau})\big] = E_x\big[e^{-r(t\wedge\tilde\tau)}\Phi(X_{\tilde\tau\wedge t})\big]. \qquad (98.61)$$
On the other hand, for every $u<\tilde\tau\wedge t$, $X_u\in(\epsilon,x+1)$, and hence $\Phi_n(X_u)=\Phi(X_u)$ for all large $n$. These give

$$(L-r)\Phi(X_u) - (L-r)\Phi_n(X_u) = \lambda\int\big[\Phi(X_u-y)-\Phi_n(X_u-y)\big]\,dF(y) \qquad (98.62)$$

and hence

$$\big|(L-r)[\Phi(X_u)-\Phi_n(X_u)]\big| \le \lambda\Big(\sup_n\|\Phi_n\|_\infty + \|g\|_\infty\Big). \qquad (98.63)$$

By the Dominated Convergence Theorem, Equation (98.62) and the absolute continuity of $F$ imply, for all $u<t\wedge\tilde\tau$,

$$(L-r)\Phi_n(X_u) \to (L-r)\Phi(X_u), \qquad n\to\infty.$$

By Equation (98.63) and the Dominated Convergence Theorem, we have

$$\lim_{n\to\infty}E_x\Big[\int_0^{\tilde\tau\wedge t}e^{-ru}(L-r)\Phi_n(X_u)\,du\Big] = E_x\Big[\int_0^{t\wedge\tilde\tau}e^{-ru}(L-r)\Phi(X_u)\,du\Big]. \qquad (98.64)$$

By Equations (98.61) and (98.64), letting $n\to\infty$ on both sides of Equation (98.60), we get Equation (98.8). $\square$

Proof of Lemma 98.1. Since $\Phi\in C^2(\mathbb R_{++})$, it suffices to show that, as $y\to x>0$,

$$\int\Phi(y-z)\,dF(z) \to \int\Phi(x-z)\,dF(z).$$

For any $y>0$, write

$$\int\Phi(y-z)\,dF(z) = \int_y^\infty g(y-z)\,dF(z) + \int_{-\infty}^y\Phi(y-z)\,dF(z).$$

Let $\epsilon>0$ and find $M>x$ such that $\int_{(-\infty,x-M+1]\cup[x+M-1,\infty)}dF(z)<\epsilon$. Then, for all $y>0$ with $|x-y|\le 1/2$,

$$\Big|\int\Phi(y-z)\,dF(z) - \int\Phi(x-z)\,dF(z)\Big| \le 2\|g\|_\infty\epsilon + \Big|\int_y^{y+M}g(y-z)\,dF(z) - \int_x^{x+M}g(x-z)\,dF(z)\Big| + \Big|\int_{y-M}^y\Phi(y-z)\,dF(z) - \int_{x-M}^x\Phi(x-z)\,dF(z)\Big|$$

$$\le 2\|g\|_\infty\epsilon + \Big|\int g(z)\big[f(y-z)-f(x-z)\big]\,dz\Big| + \Big|\int\Phi(z)\big[f(y-z)-f(x-z)\big]\,dz\Big|,$$

where $f$ is the density function of $F$. Since $g$ and $\Phi$ are bounded and $f$ is continuous on $\mathbb R\setminus\{0\}$, by the Dominated Convergence Theorem we have

$$\limsup_{y\to x}\Big|\int\Phi(y-z)\,dF(z) - \int\Phi(x-z)\,dF(z)\Big| \le 2\|g\|_\infty\epsilon.$$

Since $\epsilon>0$ is arbitrary, the proof is complete. $\square$

Proof of Proposition 98.2. Denote by $Z_{(+)}$ the collection of all solutions of $\psi(\theta)-r=0$ with nonnegative real parts, counting multiplicity. Since $D_1\Phi\equiv 0$ on $\mathbb R_{++}$, and $\psi(\theta)-r$ and $P_1(\theta)$ have the same zero set, $\Phi$ is of the form

$$\Phi(x) = \sum_{\rho_i\in Z_{(-)}}Q_i(x)e^{\rho_ix} + \sum_{\rho'\in Z_{(+)}}R_{\rho'}(x)e^{\rho'x}$$

for some polynomials $Q_i$ and $R_{\rho'}$. (See Tenenbaum and Pollard (1985), Theorem 19.3 and Lesson 20.) We show that $R_{\rho'}\equiv 0$ for all $\rho'\in Z_{(+)}$. Assume that $R_{\rho'}(x)\not\equiv 0$ for some $\rho'\in Z_{(+)}$. Let $a\ge 0$ be the maximum of the real parts of those members of $Z_{(+)}$ with nonvanishing coefficient polynomials, and let $\{\rho_1',\dots,\rho_k'\}$ be the set of all such elements with $\Re(\rho_j')=a$. Now, let $m\ge 0$ be the largest integer appearing as the degree of $R_{\rho_j'}$ over these $j$, and select those $\rho_j'$ for which the degree of $R_{\rho_j'}$ equals $m$; still call the selected members $\{\rho_1',\dots,\rho_k'\}$. Let $a_j\ne 0$ be the coefficient of $x^m$ in $R_{\rho_j'}$ for all $j$. Then

$$\Phi(x)\,x^{-m}e^{-ax} = \sum_j a_je^{i\Im(\rho_j')x} + h(x),$$

where $h(x)\to 0$ as $x\to\infty$. However, $\sum_j a_je^{i\Im(\rho_j')x}$ is a nonzero trigonometric sum and hence is not identically zero, nor does it tend to zero along $x\to\infty$. So $\Phi(x)x^{-m}e^{-ax}\not\to 0$ as $x\to\infty$. Since $\Phi\in C_0(\mathbb R_+)$, this is impossible. Hence $R_{\rho'}\equiv 0$ for all $\rho'\in Z_{(+)}$. $\square$
Proof of Proposition 98.3. The first statement follows from the fact that $\sum_{i=1}^S Q_i=\Phi(0)=g(0)$. Consider the second statement. First, note that the minimality of the representation $(\alpha_+,B_+)$ guarantees that the collection of poles of the rational function $\psi_1$ contributed by the upward part coincides with the set of eigenvalues of $B_+$. (For details, see Asmussen et al. (2004), Lemma 1, and the paragraph above it.) Hence $\rho_i$ is not an eigenvalue of $B_+$ for each $i$, and $\rho_iI-B_+$ is invertible for all $i$. By Theorems 98.9(4) and 98.10(2), since $\Phi(x)=\sum_i Q_ie^{\rho_ix}$ on $\mathbb R_+$, we have

$$\int_{-\infty}^x\Phi(x-y)f(y)\,dy = \sum_{i=1}^S Q_ie^{\rho_ix}\Big[p\,\alpha_+(\rho_iI-B_+)^{-1}\big(I-e^{x(B_+-\rho_iI)}\big)\mathbf b_+^\top + q\,\alpha_-(-\rho_iI-B_-)^{-1}\mathbf b_-^\top\Big].$$

Therefore, by Equation (98.15),

$$\int_{-\infty}^x\Phi(x-y)f(y)\,dy = \sum_{i=1}^S Q_i\psi_1(\rho_i)e^{\rho_ix} - p\sum_{i=1}^S Q_i\,\alpha_+(\rho_iI-B_+)^{-1}e^{xB_+}\mathbf b_+^\top.$$

Recall that $\psi(\rho_i)=D\rho_i^2+c\rho_i+\lambda(\psi_1(\rho_i)-1)=r$ for all $i$. Since $\Phi(x)=\sum_i Q_ie^{\rho_ix}$, we have

$$0 = (L-r)\Phi(x) = D\Phi''(x)+c\Phi'(x)+\lambda\int_{-\infty}^\infty\Phi(x-y)f(y)\,dy-(\lambda+r)\Phi(x)$$

$$= \sum_{i=1}^S Q_i\big(\psi(\rho_i)-r\big)e^{\rho_ix} + \lambda\int_x^\infty g(x-y)f(y)\,dy - \lambda p\sum_{i=1}^S Q_i\,\alpha_+(\rho_iI-B_+)^{-1}e^{xB_+}\mathbf b_+^\top$$

$$= \lambda p\Big[\alpha_+\Big(\int_0^\infty g(-y)e^{B_+y}\,dy\Big)e^{xB_+}\mathbf b_+^\top - \sum_{i=1}^S Q_i\,\alpha_+(\rho_iI-B_+)^{-1}e^{xB_+}\mathbf b_+^\top\Big].$$

Since $p>0$, the conclusion of the proposition follows. $\square$
Proof of Theorem 98.5. Let $x>0$. Similarly to the proof of Equation (98.8), we can pick a sequence of uniformly bounded functions $(\Psi_n)\subset C_0^2(\mathbb R)$ such that $\Psi_n\equiv\Psi$ on $\mathbb R_+$ and $\Psi_n\to\Psi$ a.e. on $(-\infty,0)$. By Dynkin's formula, we have

$$E_x\big[e^{-r(t\wedge\tau)}\Psi_n(X_{\tau\wedge t})\big] = E_x\Big[\int_0^{t\wedge\tau}e^{-ru}(L-r)\Psi_n(X_u)\,du\Big] + \Psi(x). \qquad (98.65)$$

Note that $\Psi_n$ and $\Psi$ differ only on $(-\infty,0)$ and $X_u>0$ for all $u<t\wedge\tau$. Moreover, the jump distribution of $Z$ is absolutely continuous. Hence, similarly to the proof of Equation (98.64), we have

$$\lim_{n\to\infty}E_x\Big[\int_0^{t\wedge\tau}e^{-ru}(L-r)\Psi_n(X_u)\,du\Big] = E_x\Big[\int_0^{t\wedge\tau}e^{-ru}(L-r)\Psi(X_u)\,du\Big].$$

By the Dominated Convergence Theorem and Lemma 98.3(2), letting $n\to\infty$ on both sides of Equation (98.65) gives

$$E_x\big[e^{-r(t\wedge\tau)}\Psi(X_{\tau\wedge t})\big] = E_x\Big[\int_0^{t\wedge\tau}e^{-ru}(L-r)\Psi(X_u)\,du\Big] + \Psi(x) = \Psi(x). \qquad (98.66)$$

The last equality follows from the assumption that $(L-r)\Psi\equiv 0$ on $\mathbb R_{++}$. Let $t\to\infty$ on both sides of the last equality. Since $r>0$, the result follows from the fact that $e^{-rt}\mathbf 1_{[\tau>t]}\to 0$ as $t\to\infty$. This completes the proof. $\square$

Appendix 98B Toolbox for Phase-Type Distributions

We first give a summary of some facts of matrix algebra.

Theorem 98.9. (1) The matrix exponential function $e^{hA}$ is differentiable in $h$ on $\mathbb R$ and

$$\frac{de^{hA}}{dh} = Ae^{hA} = e^{hA}A.$$

(2) Let $\lambda_1,\dots,\lambda_k$ be the eigenvalues of a matrix $A$. Then for each $s>\max_i\Re(\lambda_i)$, $\lim_{t\to\infty}e^{-st}\exp(tA)=0$.
(3) Let $B$ be a subintensity matrix. Then $B$ is nonsingular if and only if every eigenvalue of $B$ has a strictly negative real part.
(4) If $A$ is nonsingular, then $\int_v^te^{xA}\,dx = A^{-1}(e^{tA}-e^{vA})$; if, further, all eigenvalues of $A$ have strictly negative real parts, then $\int_0^\infty e^{xA}\,dx=-A^{-1}$.

The following summarizes the main analytic properties of phase-type distributions.

Theorem 98.10. Suppose that $F$ is PH$(\alpha,B)$. Then:

(1) The $n$-th moment of $F$ is $(-1)^nn!\,\alpha B^{-n}e^\top$.
(2) For any $s\in\mathbb C$ with $\Re(s)\ge 0$, $\int_0^\infty e^{-sx}\,dF(x)=\alpha(sI-B)^{-1}\mathbf b^\top$, which is a rational function of $s$.

Assume that $B$ is a nonsingular subintensity matrix. Theorem 98.9(2) and (3) imply that $\lim_{y\to\infty}e^{By}e^{\delta y}=0$ for some $\delta>0$. Hence $T_B^\pm$ given by Equation (98.18) are well defined.

Theorem 98.11. Let $h$ be a bounded Borel measurable function. If $h$ is continuous at $x$, then $T_B^\pm h$ are both differentiable at $x$ and

$$\Big(\frac{d}{dx}-B\Big)T_B^+h(x) = h(x)I, \qquad \Big(B+\frac{d}{dx}\Big)T_B^-h(x) = -h(x)I.$$

Moreover, if $h\in C_{0,b}(\mathbb R_{++})$, each entry of $T_B^\pm h$ is in $C_{0,b}(\mathbb R_{++})$.

Proof. Assume $h$ is bounded and continuous at $x$. By Theorem 98.9(1), $\frac{d}{dx}e^{Bx}=Be^{Bx}$. On the other hand, since $h$ is continuous at $x$, $\frac{d}{dx}\int_{-\infty}^xh(y)e^{-By}\,dy = h(x)e^{-xB}$. Therefore,

$$\frac{d}{dx}T_B^+h(x) = \frac{d}{dx}\Big(e^{Bx}\int_{-\infty}^xh(y)e^{-By}\,dy\Big) = BT_B^+h(x) + h(x)I.$$

This shows that $(\frac{d}{dx}-B)T_B^+h(x)=h(x)I$. Similarly, $(B+\frac{d}{dx})T_B^-h(x)=-h(x)I$. Finally, if $h$ is bounded and $h\in C_{0,b}(\mathbb R_{++})$, then it is clear that $T_B^\pm h$ are bounded on $\mathbb R$. Moreover, by the Dominated Convergence Theorem, $T_B^\pm h\in C_{0,b}(\mathbb R_{++})$. This proves the last statement. $\square$

Appendix 98C First-Order Derivative of $\Phi$ at Zero

We consider the process (98.1) and set

$$\bar\theta = \sup\Big\{\theta\in\mathbb R_+:\ \int e^{-\theta'y}\,dF(y)<\infty,\ \forall\theta'\in[0,\theta]\Big\}. \qquad (98.67)$$

Throughout this section, we assume that $\bar\theta>0$ and $\psi(\bar\theta-)=+\infty$. Let $r>0$. Since $\psi(0)-r<0$ and $\psi$ is strictly convex on $[0,\bar\theta)$, there exists a unique number $\rho\in(0,\bar\theta)$ such that $\psi(\rho)-r=0$. ($\rho$ is called the Lundberg constant in the literature.) We write

$$\beta = \frac{c}{D}+\rho, \qquad \alpha = \beta+\rho, \qquad (98.68)$$

and $h_\rho(x)=e^{-\rho x}h(x)$ for any function $h$. Recall that $\Phi(x)=E_x[e^{-r\tau}g(X_\tau)]$ for some fixed bounded Borel measurable function $g:(-\infty,0]\to\mathbb R$ and $\tau=\inf\{t\ge 0:\ X_t\le 0\}$. Following arguments similar to those in Sect. 3 of Gerber and Landry (1998), we calculate the first-order derivative of $\Phi$ at 0. For details, see Gerber and Landry (1998) or Chen et al. (submitted).

Theorem 98.12. Suppose $r>0$. The derivative of $\Phi$ at 0 is given by

$$\Phi'(0+) = \vartheta_0 - \frac{\lambda}{D}\int_{-\infty}^0 dF(y)\,e^{-\rho y}\int_0^{-y}dv\,\Phi_\rho(v), \qquad (98.69)$$

where

$$\vartheta_0 = -\beta g(0) + \frac{\lambda}{D}\int_0^\infty dv\int_v^\infty dF(y)\,e^{-\rho y}g_\rho(v-y). \qquad (98.70)$$

In particular, if $\int_{(-\infty,0)}dF=0$, then $\Phi'(0+)=\vartheta_0$.
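The Lundberg constant $\rho$ of this appendix is easy to locate numerically, since $\psi(\theta)-r$ is negative near $0+$ and increases to $+\infty$ on $[0,\bar\theta)$. A small sketch (ours, with illustrative parameters and exponential downward jumps, for which $\bar\theta=\infty$):

```python
# Locate the Lundberg constant: the unique rho in (0, theta_bar) with psi(rho) = r.
from scipy.optimize import brentq

D, c, lam, r = 0.08, 0.075, 0.5, 0.05
eta = 3.0                                    # exponential downward jumps

def psi(theta):
    # psi(theta) = D theta^2 + c theta + lam*(E[e^{-theta Y}] - 1), cf. Eq. (98.3)
    return D * theta**2 + c * theta + lam * (eta / (theta + eta) - 1.0)

rho = brentq(lambda t: psi(t) - r, 1e-9, 50.0)
print("Lundberg constant rho =", rho, "; check psi(rho) =", psi(rho))
```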
Proof. By Theorem 98.2, we have $(L-r)\Phi(v)=0$ for all $v>0$. Multiplying both sides of this equation by $e^{-\rho v}$ gives

$$e^{-\rho v}\Big(D\frac{d^2}{dv^2}+c\frac{d}{dv}-(\lambda+r)\Big)\Phi(v) + \lambda\int_{-\infty}^\infty\Phi(v-y)e^{-\rho v}\,dF(y) = 0. \qquad (98.71)$$

Note that $\Phi(v)=e^{\rho v}\Phi_\rho(v)$ for $v\in\mathbb R$. Then

$$\Phi'(v) = \rho e^{\rho v}\Phi_\rho(v) + e^{\rho v}\Phi_\rho'(v)$$

and

$$\Phi''(v) = \rho^2e^{\rho v}\Phi_\rho(v) + 2\rho e^{\rho v}\Phi_\rho'(v) + e^{\rho v}\Phi_\rho''(v).$$

Hence, Equation (98.71) becomes

$$0 = D\Phi_\rho''(v) + (c+2D\rho)\Phi_\rho'(v) + \lambda\int_{-\infty}^\infty\Phi_\rho(v-y)e^{-\rho y}\,dF(y) + \big(D\rho^2+c\rho-(\lambda+r)\big)\Phi_\rho(v). \qquad (98.72)$$

Recall that $\psi(\rho)=r$. Then

$$0 = D\Phi_\rho''(v) + (c+2D\rho)\Phi_\rho'(v) + \lambda\int_{-\infty}^\infty\Phi_\rho(v-y)e^{-\rho y}\,dF(y) - \lambda\Phi_\rho(v)\int e^{-\rho y}\,dF(y). \qquad (98.73)$$

By Theorem 98.1, $\Phi_\rho'(0+)$ exists in the sense of a right-hand derivative. Therefore, integrating Equation (98.73) from $v=0$ to $v=z$ gives

$$0 = D\big[\Phi_\rho'(z)-\Phi_\rho'(0+)\big] + (c+2D\rho)\big[\Phi_\rho(z)-\Phi_\rho(0)\big] + \lambda\int_0^z dv\int dF(y)\,\Phi_\rho(v-y)e^{-\rho y} - \lambda\int_0^z dv\,\Phi_\rho(v)\int dF(y)\,e^{-\rho y}. \qquad (98.74)$$

By Fubini's theorem and a change of variable,

$$\int_0^z dv\int dF(y)\,\Phi_\rho(v-y)e^{-\rho y} = \int dF(y)\,e^{-\rho y}\int_{-y}^{z-y}\Phi_\rho(u)\,du,$$

so that the sum of the last two terms of Equation (98.74) equals

$$\lambda\int dF(y)\,e^{-\rho y}\Big[\int_{-y}^0\Phi_\rho(u)\,du - \int_{z-y}^z\Phi_\rho(u)\,du\Big]. \qquad (98.75)$$

For $y>0$ we have $\int_{-y}^0\Phi_\rho(u)\,du=\int_{-y}^0g_\rho(u)\,du=\int_0^yg_\rho(v-y)\,dv$, while for $y<0$ we have $\int_{-y}^0\Phi_\rho(u)\,du=-\int_0^{-y}\Phi_\rho(v)\,dv$. By Theorem 98.1, $\Phi_\rho'(z)$, $\Phi_\rho(z)$ and $\int_{z-y}^z\Phi_\rho(u)\,du$ all tend to zero as $z\to\infty$. Hence, letting $z\to\infty$ in Equation (98.74) and using Fubini's theorem once more, we obtain

$$\Phi_\rho'(0+) = \frac{1}{D}\Big[-(c+2D\rho)g(0) + \lambda\int_0^\infty dv\int_v^\infty dF(y)\,e^{-\rho y}g_\rho(v-y) - \lambda\int_{-\infty}^0 dF(y)\,e^{-\rho y}\int_0^{-y}dv\,\Phi_\rho(v)\Big]. \qquad (98.76)$$

Since $\Phi'(0+)=\rho g(0)+\Phi_\rho'(0+)$, we get Equation (98.69). $\square$

Proposition 98.4. Assume Equation (98.12) holds and $r>0$. Then the number $\bar\theta$ defined by Equation (98.67) is in $(0,\infty]$ and $\psi(\bar\theta-)=+\infty$.

Proof. First, we show that $\bar\theta>0$. Since all the eigenvalues of $B_-$ have strictly negative real parts by Theorem 98.9(3), so do the eigenvalues of $B_-+\theta I$ for all $\theta$ in an open right neighborhood of 0. For all these $\theta$, we obtain from Theorem 98.9(4) that

$$\int_0^\infty e^{\theta y}f_{(-)}(y)\,dy = \alpha_-(-\theta I-B_-)^{-1}\mathbf b_-^\top < \infty,$$

and hence $\int e^{-\theta y}\,dF(y)<\infty$. Next, we show that $\psi(\bar\theta-)=+\infty$. If $\bar\theta<\infty$, then $\bar\theta$ must be a pole of $\psi_1(\theta)$ on $\mathbb C$, so that $\psi(\bar\theta-)=+\infty$ or $-\infty$. Since $\psi$ is strictly convex on $[0,\bar\theta)$, we have $\psi(\bar\theta-)=+\infty$. If $\bar\theta=\infty$, we have $\psi(\bar\theta-)=+\infty$ by the fact that $D>0$. This completes the proof. $\square$
Chapter 99
Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers* Alice C. Lee and J. David Cummins
Abstract This paper estimates the cost of equity capital for property/casualty insurers by applying three alternative asset pricing models: the capital asset pricing model (CAPM), the arbitrage pricing theory (APT), and a unified CAPM/APT model (J Finance 43(4), 881–892, 1988). The in-sample forecast ability of the models is evaluated by applying the mean squared error method, the Theil U² statistic (Applied economic forecasting, North-Holland, Amsterdam, 1966), and the Granger and Newbold (Forecasting economic time series, Academic, New York, 1977) conditional efficiency evaluation. Based on these forecast evaluation procedures, the APT and Wei's unified CAPM/APT models perform better than the CAPM in estimating the cost of equity capital for the PC insurers, and a combined forecast may outperform the individual forecasts.

Keywords Property/casualty · Insurance · CAPM · APT · Cost of equity capital · Asset pricing
99.1 Introduction It is well known that the cost of capital for an industrial firm may include cost of debt, cost of preferred stock, cost of retained earnings, and cost of a new issue of common stock. Cost of equity capital is made up of cost of retained earnings and cost of a new issue of common stock. Financial theory suggests that returns of industrial firms should be a function of risk, but because the management principles of financial institutions are not necessarily identical to those of industrial firms, applications of financial theory to nonindustrial firms (such as banks, savings and loans, and insurance firms) must be theoretically and empirically investigated. What financial institutions must consider are exposure to different A.C. Lee () San Francisco State University, San Francisco, CA, USA e-mail: [email protected] J.D. Cummins Temple University, Philadelphia, PA, USA e-mail: [email protected]
types of risk, the relative importance of various risks, and the existence of market imperfections and constraints such as regulation. For industrial firms, their production function and the return from their production capabilities are the main indicators of their value to investors in the market. Investments made by industrial firms are generally for the purposes of increasing and improving production capabilities. However, for financial institutions such as insurance companies, there are no production outputs to serve as an indicator of firm value. Output measurements for financial institutions are more difficult to measure because service sector output is intangible. For instance, an insurance company is in the business to manage the risk of others, as well as the firm’s own risk in financial investments. Specific issues of property/casualty (PC) insurance companies that differ from industrial firms include (Lee and Forbes 1980) (1) different income and accounting measures, generally accepted accounting principles (GAAP) and statutory accounting principles (SAP), which affect reported earnings and retained earnings; (2) unique borrowinglending rate relationship; and (3) an asset portfolio comprised primarily of securities of industrial firms. An additional complicating issue for insurance companies is policyholder reserves. Reserves are the debt capital of the insurance industry, and policyholder reserves may be viewed as a source of investable funds. And there are also special issues associated with the cost of capital for regulated industries, which include public utility and insurance firms. Do the regulations impose price controls or constraints that affect the profitability and valuation of firms in the industry? Because public utility prices are regulated by a “fair rate of return,” there has been much research and discussion on cost of equity capital estimation for public utilities. More recent research includes Bower et al. (1984); Bubnys (1990); Elton et al. (1994); Myers and Borucki (1994); Schink and Bower (1994). In fact, there is an entire body of literature *
Any views or opinions presented in this publication are solely those of the authors and do not necessarily represent those of State Street Corporation. State Street Corporation is not associated in any way with this publication and accepts no liability for the contents of this publication.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_99,
1465
1466
devoted to this subject. Two books, Gordon (1974a) and Kolbe et al. (1984), discuss this area. Similarly, the insurance industry has regulatory agencies that affect the pricing of its product. But in regulating insurance prices, regulators often use book value rates of return, which are typically not accurate because cost of equity capital is a market-based value instead of a book value. Also, regulators usually do not vary the cost of capital by company or line, possibly creating excessive profits in some lines and disincentives to participate in others, and with better estimates of cost of capital for PC insurers, this would be possible. In their book, Cummins and Harrington (1987) present research on the issue of the fair rate of return in property-liability insurance. Fairley (1979), Hill (1979), Lee and Forbes (1980), and Hill and Modigliani (1987) apply the capital asset pricing model (CAPM) to derive risk-adjusted rates of return that the capital markets require of stock property-liability insurers. Based on Ross’ (1976) arbitrage pricing theory (APT) of capital asset pricing, Kraus and Ross (1982) develop a continuous time model under uncertainty to determine the “fair” (competitive) premium and underwriting profit for a property-liability insurance contract. An empirical application of the APT in determining the cost of equity capital for PC insurers has yet to be considered.1 In addition to regulatory considerations, the insurance industry is interested in cost of capital estimation from a management prospective. Cost of capital affects an insurance company’s resource allocation and investment decisions. In project decision-making, insurers have begun to use financial techniques such as financial pricing and capital budgeting but have found reliable costs of capital estimates difficult to obtain.2 Cost of capital is an important component in many insurance pricing models.3 Cummins (1990b) analyzes the two most prominent discounted cash flow (DCF) models in property-liability insurance pricing – the Myers-Cohn (MC) model (Myers and Cohn 1987) and the National Council on Compensation Insurance (NCCI) model. Cummins points out that the two models are based on the concepts of capital budgeting, and 1
Additional early academic research on cost of capital estimates for the property-liability insurance industry includes Cummins and Nye (1972); Forbes (1971); Forbes (1972); Haugen and Kroncke (1971); Launie (1971, 1972); Lee and Forbes (1980); and Quirin and Waters (1975). 2 A 1993 conference on “Key Issues in Financial and Risk Management in the Insurance Industry,” sponsored by the Wharton Financial Institutions Center, brought together consultants, executives, and regulators from the insurance industry, as well as academicians. Those from the insurance industry pointed out areas of concern for the industry and areas for research. Among the areas mentioned was the estimation of cost of capital and the applicability of models such as the CAPM and APT. 3 D’Arcy and Doherty (1988) review financial pricing models proposed for property-liability contracts. Cummins (1992) also provides a review of principal financial models that have been proposed for pricing insurance contracts and proposes some new alternatives.
A.C. Lee and J.D. Cummins
essentially, the insurance policy is viewed as a project under consideration by the firm. Thus, financial pricing models can be used to determine the price for the project (in this case, the premium) that will provide a fair rate of return to the insurer, taking into account the timing and risk of the cash flows from the policy as well as the market rate of interest. Both the MC model and the NCCI model rely on cost of capital estimates for insurance pricing. In this paper, the cost of equity capital for PC insurers, during the 1987–1992 period, is estimated by applying three alternative asset pricing models: (1) the capital asset pricing model (CAPM); (2) the arbitrage pricing theory (APT); and (3) a unified CAPM/APT model (Wei 1988). Adjustments for nonsynchronous trading are made by applying the Scholes and Williams (1977) beta adjustment, and the errors in variables (estimation risk) problem in the second-stage regression is adjusted by the Litzenberger and Ramaswamy (1979) generalized least squares procedure. Then the in-sample forecast ability of the three models is evaluated by applying the mean squared error method, the Theil (1966) U 2 statistic, and the Granger and Newbold (1977) method of testing for conditional efficiency of forecasts. Based on forecast evaluation procedures, the APT and Wei’s unified CAPM/APT models perform better than the CAPM in estimating the cost of equity capital for the PC insurers, and a combined forecast may outperform the individual forecasts. The next section discusses some of the theories and prior work in the area of cost of equity capital estimation. Section 99.3 describes the model specifications and basic estimation procedures for the CAPM, APT, and unified CAPM/APT model. In Sect. 99.4, the dataset is described, and the empirical details for the asset pricing model estimations are discussed. Section 99.5 explains and applies the procedures used to evaluate the forecast quality of the cost of equity capital estimation by the various asset pricing models. Finally, Sect. 99.6 contains the summary and conclusion.
99.2 Prior Work Elton and Gruber (1994) point out that the estimation of a company’s cost of equity capital is one of the most important tasks because the cost of capital affects the company’s project selection, borrowing rate, and allocation of resources. They categorize and briefly describe techniques used for estimating cost of capital: (1) comparable earnings, (2) valuation models, (3) risk premium, (4) capital asset pricing models, and (5) arbitrage pricing models. Comparable earnings estimations of cost of capital use earnings on book equity for “comparable companies.” Although this technique is rarely discussed in texts, it is used in practice by regulatory agencies. Problems with this technique include difficulty in defining “comparable companies”
99 Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers
and inaccuracy of using book values to estimate cost of capital, which is a market-based value. Valuation models (also referred to as dividend growth models) define the cost of capital as the discount rate that equates expected future dividends to the current price. If we assume the dividend per share, D1 , grows at a constant rate, g, forever, then the cost of capital .k/ can be defined as: Cost of equity .k/ D
D Cg P
1467
comparison, concluding that neither model dominates in both forecasting and simulation. By using the LitzenbergerRamaswamy method of adjustment, Bubnys corrects for the errors in variables problem of using firm betas in the crosssectional security market line equation. By doing so, use of large sample size in the second pass regression results in significant pricing of the riskless asset and market risk premium.
(99.1)
where P represents the stock price per share. This is also called the constant-growth discounted cash-flow (DCF) formula. The DCF formula is the most widely used approach to estimate the cost of equity capital of regulated firms in the United States (Myers and Borucki 1994) and is also used extensively in insurance regulation. Various assumptions can be made about the growth rate. The risk premium technique to estimate cost of capital requires an estimate of the extra return (a premium) on equity needed over and above the yield on some long-term bond. The estimate of the average cost of equity capital for all firms is then the premium plus the current yield to maturity on the long-term bond. A risk adjustment can then be done, on an ad hoc basis, for firms that may have a different risk from the average firm. The CAPM and APT models are related to the risk premium approach. But rather than adjusting for a firm’s risk on an ad hoc basis, the models use a set of theories and assumptions to determine the risk adjustment for firms. The CAPM states that the expected return on a security is a linear function of the riskless rate plus beta (the security’s sensitivity to the market) times the risk premium of the market portfolio over the riskless rate. The APT is similar to the CAPM in that the expected return on a security is a linear function of the riskless rate and the risk premium of different factors of systematic risk times the factor beta (the security’s sensitivity to the factor). Unlike the CAPM, the APT model allows for more than one source of systematic risk, and the factors are not predefined. While the APT model can be viewed as a generalization of the CAPM, the difficulty lies in identifying these factors. Applying a methodology similar to that used by Bubnys (1990) (which expands upon Bower et al. 1984), this paper compares the CAPM, APT, and unified CAPM/APT model estimates of the cost of equity capital for PC insurance companies traded on the NYSE and NASDAQ during the 1988–1992 period. Bower et al. compare the CAPM model with the APT model as estimates of expected returns for utility stocks. They find that APT may be the superior model in explaining and conditionally forecasting return variations through time and across assets for public utility companies. Bubnys expands on Bower et al.’s CAPM/APT
99.3 Model-Specification and Estimation In Sect. 99.3.1, first, the basic estimation procedure of the CAPM is discussed. Then, the procedure for estimating the cost of equity capital using the CAPM is explained. Similarly, in Sect. 99.3.2 the estimation procedure for the APT model and the APT cost of equity capital is discussed. And finally, in Sect. 99.3.3, the unified CAPM/APT model, estimation procedure, and estimation of cost of equity capital are presented.
99.3.1 CAPM The CAPM describes the expected return on an asset as a linear function of only one variable, the asset’s systematic risk: E.Rj / D Rf C ˇj ŒE.Rm / Rf
(99.2)
E.Rj / D expected return on asset j Rf D risk-free rate E.Rm / D expected return on the market portfolio “j D COV.Ri Rm /=Var.Rj /, the systematic risk of asset j Estimation of the CAPM consists of two stages. The first stage is the estimation of the systematic risk, “. This can be done by applying the time-series market model regression:4 Rjt D ˛j C ˇj Rmt C "jt Rjt Rmt ’j “j ©jt
D D D D D
(99.3)
return on asset j in timet return on the market portfolio in time t intercept systematic risk of assetj residual term
From regression Equation (99.3), “j determines how responsive the returns for the individual asset j are to the market portfolio. 4
Fama and French (1992) have questioned the validity of the CAPM. However, Kim (1995) and Jagannathan and Wang (1996) have shown that Fama and French results are misleading.
1468
A.C. Lee and J.D. Cummins
For each firm in the market, regressing the firm’s rate of return at time t on the market return at time t gives the estimate of the firm’s systematic risk, “j from Equation (99.3), which is then used in the second stage cross-sectional regression:5 Rj D a C b ˇj C j (99.4) Rj D mean return over all time periods for asset j “j D systematic risk for asset j from first stage a D intercept b D slope j D residual term If CAPM is valid, then E.a/ O D Rf or E.Rz /
(99.5)
O D E.Rm / E.Rf / or E.Rm / E.Rz / E.b/
(99.6)
where aO is the intercept estimate, bO is the slope estimate, and E.Rz / is the expected return for the zero-beta portfolio. The two-stage procedure described above results in an estimate of the risk-free rate and market risk premium. After the CAPM is estimated, the betas of the PC insurance companies are estimated with regression Equation (99.3). The CAPM cost of equity capital estimates, Kei , are computed using the following equation: Kei D a C b“i
(99.7)
where a and b are the risk-free rate and the market risk premium, respectively, estimated from regression (99.4), and bi is the beta of PC insurance firm i estimated with regression Equation (99.3).
99.3.2 APT Similar to the CAPM, the APT (Ross 1976, 1977) describes the expected return as a linear function of systematic risks. The CAPM predicts that security rates of return will be linearly related to a single common factor, the rate of return on the market portfolio. The APT allows the possibility of more than one systematic risk factor, and so the expected rate of return on a security can be explained by k independent influences or factors: E.Rj / D 0 C 1 bj 1 C 2 bj 2 C : : : C k bjk 5
(99.8)
Fama and French (1992) have questioned the validity of the CAPM. However, Kim (1995) and Jagannathan and Wang (1996) have shown that Fama and French results are misleading.
E.Rj / D expected rate of return on asset j bjk D sensitivity of asset j ’s return to an unit change in the kth factor (factor loading) 0 D common return on all zero-beta assets k D premium for risk associated with factor k If there is a riskless asset with a riskless rate of return, Rf , then all bjk ’s are zero and 0 is equal to Rf ; bjk , similar to the CAPM “j in Equation (99.2), measures how responsive returns from asset j are to the kth factor. The theory doesn’t say what factors should be used to explain the expected rates of return. These factors could be interest rates, oil prices, or other economic factors. The market portfolio, used in the CAPM, might be a factor. Thus, the APT counterpart to the CAPM beta are the sensitivity coefficients, or factor loadings, that characterize an asset. These factor loadings are estimated from the market model: Rjt D bj 0 C bj 1 I1t C : : : C bjk Ikt C ujt
(99.9)
Rjt D average return of portfolio j in time period t Ikt D mean zero factor score at time t for factor k common to the returns of all assets bj 0 D estimated return on asset j when all factor loadings are zero bjk D estimated factor loading (reaction coefficient) describing the change in asset j ’s return for a unit change in factor k ujt D residual term The first-pass regression Equation (99.9) is the APT analog to Equation (99.3) for the CAPM, and the factor loadings, bjk , are similar to the CAPM “j in Equation (99.3). For the CAPM Equation (99.3), the return on the market portfolio, Rmt , is the independent variable, while for the APT, the independent variables, Ikt , are generated from factor analysis on the non-insurance assets returns. The second-pass cross-sectional regression is: Rj D 0 C 1 bj1 C 2 bj 2 C : : : C k bjk C uj
(99.10)
Rj D average return of portfolio j 0 D common return on all zero-beta assets k D premium for risk associated with factor k bjk D factor loading of asset j to factor k estimated from Equation (99.9) uj D residual term Similar to the CAPM second-stage cross-sectional regression Equation (99.4), the APT regression Equation (99.10) estimates in the risk premium, k for each of the k factors, and the intercept term, 0 , resulting in an estimate of the APT model.
99 Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers
After the empirical APT is obtained, the APT cost of equity capital for the PC insurance companies can be estimated. First, each individual insurance company’s rate of returns is regressed against the factor scores used in Equation (99.9): Rit D bi 0 C bi1 I1t C : : : C bik Ikt C uit
(99.11)
1469
Equation (99.8) except for the additional term m bjm for the market factor. To apply Wei’s unification theory, the estimation procedure is similar to that used to estimate the APT. The factor loadings are estimated with a modified market model: Rjt D bj 0 C bj 1 I1t C : : : C bjk Ikt C bjm Imt C ujt
(99.14)
Rit D return on insurance firm i in time period t Ikt D mean zero factor score at time t for factor k common to the returns of all assets bi 0 D return on insurance firm i when all factor loadings are zero bik D factor loading (reaction coefficient) describing the change in insurance firm i ’s return for a unit change in factor k uit D residual term
Rjt D average return of portfolio j in time period t Ikt D mean zero factor score at time t for factor k common to the returns of all assets rmt D mean adjusted market return at time t bj 0 D return on asset j when all factor loadings are zero bjk D factor loading (reaction coefficient) describing the change in asset j ’s return for a unit change in factor k bjm D factor loading for the market factor ujt D residual term
Then, the cost of equity for insurance firm i; Kei , the following equation is estimated for the PC insurance firms with the following equation:
This first-stage regression is similar to that used for the APT estimation, regression Equation (99.8) but with the additional factor rmt . The second-stage cross-sectional regression for the unified CAPM/APT model is:
Kei D 0 C 1 bi 1 C : : : C k bik
(99.12)
Kei D cost of equity capital estimate for insurance asset i 0 D common return on all zero-beta assets k D premium for risk associated with factor k bik D factor loading of insurance asset i to factor k To compute Equation (99.12), 0 and .1 ; : : : ; k /, the risk premiums, are estimated from regression Equation (99.10) and, .bi1 ; : : : ; bik / are the factor loading of insurance firm i estimated from regression Equation (99.11).
99.3.3 Unification of the CAPM and APT Wei (1988) develops a model that extends and integrates the CAPM and APT theories. The Wei unifying theory indicates that one only need add the market portfolio as an extra factor to the factor model in order to obtain an exact asset-pricing relationship: E.Rj / D 0 C 1 bj1 C 2 bj 2 C : : : C k bjk C m bjm (99.13) where .1 ; : : : ; k / are the risk premiums associated with factor k; .bj1 ; : : : ; bjk / are the factor loadings of asset j to factor k; m is the risk premium associated with the market factor, and bjm is the market factor loading of asset j to the market factor. If there is a riskless asset with a riskless rate of return, Rf , then all bjk ’s and bjm are zero and 0 is equal to Rf . Equation (99.13) is almost identical to the APT
Rj D 0 C 1 bj1 C 2 bj 2 C : : : C k bjk C m bjm C uj (99.15) Rj D average return of portfolio j k D premium for risk associated with factor k m D premium for risk associated with market factor bjk D factor loading, estimated from Equation (99.14), of asset j to factor k bjm D factor loading, estimated from Equation (99.14), of asset j to market factor uj D residual term This is similar to the APT regression Equation (99.10) but with the additional term, m bim , for the market factor. The independent variables in regression Equation (99.15) are the risk premium for each of the k C 1 factors, and the intercept term, 0 , is the risk-free rate when all bjk ’s are zero. After the empirical unified CAPM/APT is obtained, the cost of equity capital for the PC insurance companies can be estimated. First, each individual insurance company’s rate of returns is regressed against the factor scores and market factor, which are the same as the ones used in Equation (99.14): Rit D bi 0 C bi 1 I1t C : : : C bik Ikt C bim rmt C uit
(99.16)
Rit D return on insurance asset i in time period t Ikt D mean zero factor score at time t for factor k common to the returns of all assets rmt D mean adjusted market return at time t bi 0 D return on insurance asset i when all factor loadings are zero
1470
A.C. Lee and J.D. Cummins
bik D factor loading (reaction coefficient) describing the change in insurance asset i ’s return for a unit change in factor k bim D factor loading for market factor ujt D residual term To estimate cost of equity capital, Kei , the following equation is computed for the PC insurance firms: Kei D 0 C 1 bi 1 C : : : C k bik C m bim
(99.17)
Kei D cost of equity capital estimate for insurance firm i k D premium for risk, from Equation (99.15), associated with factor k m D premium for risk, from Equation (99.15), associated with market factor bik D factor loading, from Equation (99.16), of insurance firm i to factor k from equation bim D factor loading, from Equation (99.16), of insurance firm i to market factor k For Equation (99.17), 0 and the risk premiums .1 ; : : : ; k ; m / are estimated from regression Equation (99.15), and factor loadings .bi1 ; : : : ; bik ; bim / are estimated from regression (99.16).
99.4 Data Description and Cost of Equity Capital Estimates In Sect. 99.4.1, the dataset is described. Then in Sect. 99.4.2, the details of the estimation procedures for the asset pricing models are presented. First, the procedure for estimating the CAPM risk-free rate and risk premium is discussed in detail, and the estimate results are presented. Similarly, the APT model estimates and the unified APT/CAPM model estimates are discussed.
99.4.1 Data Rate-of-return data for common stocks are collected from the Center for Research in Security Prices (CRSP) tapes: (1) NYSE/AMEX (New York Stock Exchange/American Stock Exchange) monthly files, and (2) NASDAQ (National Association of Securities Dealers’ Automated Quotation system) daily files. Data are collected for a 5-year period (December 1987–December 1992). The time period is short because of the limited number of PC insurance companies that trade continuously for an extended period. The NYSE/AMEX data consists of 1,671 companies, excluding PC insurance companies, having no missing
returns during our time period. Similarly, the NASDAQ data consists of 1,995 companies, excluding PC insurance companies, having no missing returns from December 1987 to December 1992. CRSP NASDAQ data are only available as daily rate-of-return data. So, all daily rate-of-return data collected from NASDAQ tapes are converted to monthly returns using the following calculation: Rit D Œ.1 C ri1 /.1 C ri 2 / : : : .1 C rin / 1
(99.18)
Rit D monthly return on stock i in month t rij D return on stock i on trading day j n D number of trading days in the month Also, because many of the NASDAQ companies are infrequently traded with many days of zero return (nonsynchronous trading problem), converting the data from daily to monthly reduces the errors in variables problem. Thus, there is a total of 3,666 firms in our sample, excluding PC insurance firms. Hereafter, this sample is to be referred to as noninsurance companies. Data are available from the NYSE/AMEX tapes for 23 PC insurance companies (hereafter, to be referred to as insurance companies) with no missing returns during the period of December 1987 through December 1992. Multiline insurers with at least 25% of its business in PC insurance and no missing returns during the 5-year period are also included in the insurance sample. The selection process for PC insurers from the NASDAQ tapes is the same as the selection process for NYSE/AMEX firms. For NASDAQ companies, there are 41 insurance companies with no missing data during the 5-year period. Again, since only daily data are available from the CRSP tapes, all data collected from the NASDAQ tapes were converted from daily to monthly returns using Equation (99.18). Data are collected for a total of 64 insurance companies. CRSP also provides whole market indices. The whole market index for the NYSE, AMEX, and NASDAQ markets combined, containing the returns and including all distributions on a value-weighted market portfolio, is collected for the 5-year time period. Table 99.1 shows the summary statistics for the dataset. The results show the cross-sectional statistics of the average monthly rate of return for the firms. The values are calculated from the average monthly rate of return for each firm during the sample period. Thus, for example, the 1,671 non-insurance firms from NYSE/AMEX have an average monthly return of 0.0135 (16.2% annualized). Also, for these 1,671 non-insurance firms, the maximum average monthly rate of return for a firm during the sample period is 0.1503 (180.36% annualized). Looking at Table 99.1 for the non-insurance firms, the average firm monthly rate of return is 0.0154 (18.52% annualized), with NASDAQ firms (20.52% annualized) averaging
99 Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers
Table 99.1 Summary statistics for average firm monthly returns N Mean Std Dev Min Max
CV
1471
Skewness
Kurtosis
Non-insurance NYSE/AMEX NASDAQ Total
1671 1995 3666
0:0135 0:0171 0:0154
0:0141 0:0503 0:0183 0:0519 0:0166 0:0519
0:1503 0:1909 0:1909
1:045292 1:070185 1:074968
0:7004 1:2846 1:2073
8:0904 7:9810 8:7420
23 41 64
0:0140 0:0168 0:0158
0:0063 0:0005 0:0130 0:0203 0:0111 0:0203
0:0237 0:0670 0:0466
0:446827 0:770938 0:699094
0:4010 0:7925 0:6331
0:2928 1:8601 2:5158
Insurance NYSE/AMEX NASDAQ Total
higher than the NYSE/AMEX firms (16.2% annualized). The range of the firm rates of returns for this period is quite wide. The minimum (62:28% annualized) and maximum (229.08% annualized) firm rates of return are for companies from NASDAQ. This broad range of firm rate of returns is reflected in the coefficient of variation measurements, which are all over 1. Skewness coefficients for both NYSE/AMEX (0.7004) and NASDAQ (1.2846) average firm monthly returns are positive, meaning that the mode and the median lie below the mean. Kurtosis coefficients for both NYSE/AMEX (8.0904) and NASDAQ (7.9810) are similar when the data are considered separately, but the kurtosis coefficient rises to 8.7420 when the all non-insurance firms are considered together. Also shown in Table 99.1 is the cross-sectional data analysis of average insurance firm monthly rate of returns, broken down into NYSE/AMEX firms, NASDAQ firms, and all insurance firms combined. NASDAQ insurance firms have an average monthly rate of return of 0.0168 (20.16% annualized), and NYSE/AMEX firms have an average monthly rate of return of 0.0140 (16.84% annualized). Both the standard deviation (0.0130) and the coefficient of variation (0.7709) of the NASDAQ insurance firms are considerably higher than that of the NYSE/AMEX insurance firms (0.0063 and 0.4468, respectively). Since smaller firms trade on NASDAQ, we expect that returns for NASDAQ companies to be more variable than those for NYSE/AMEX companies. The overall average monthly return for the insurance firms (19.00% annualized) is very close to that of the non-insurance firms (18.52% annualized) in the dataset. The range of average firm monthly rate of returns for the insurance companies (24:4% to 55.9% annualized) is not as wide as that of the non-insurance companies, and the coefficient of variation for the insurance firms is less than 1 (0.6991), which is considerably less than that of the noninsurance firms (1.075). Skewness coefficients of the average firm monthly returns of the insurance companies are negative, which indicate that the mode and the median lie above the mean. Kurtosis coefficients for the insurance companies are much lower than those for the non-insurance companies,
which could result from the fact that the insurance sample size is much smaller than the non-insurance sample size. Also, a lower kurtosis coefficient would indicate that there are fewer outliers in the sample.
99.4.2 Empirical Results In the next section, details of the estimation procedure for the three models (the CAPM, APT, and Wei’s unified CAPM/APT) are discussed. The risk premium and risk-free rate estimates of each model are given and also discussed.
99.4.2.1 CAPM Estimate For the CAPM, the first-pass time-series market model regression Equation (99.3) is run for the 1,995 non-insurance NYSE/AMEX firms to estimate individual firm betas for the 5-year period: Rjt D ˛j C ˇj Rmt C "jt
(99.3)
The dependent variable Rjt is the monthly rate of return for each firm .j D 1 : : : 1; 995/ in month t.t D 1 : : : 60/. The CRSP value-weighted index, including all distributions, is used as the market portfolio proxy for Rmt . This results in 1,995 individual firm beta estimates. For securities traded on NASDAQ, there is the concern that many securities listed are traded infrequently (nonsynchronous trading). Scholes and Williams (1977) discuss how nonsynchronous trading introduces the potentially serious econometric problem of errors in variables into the market model. With errors in variables in the market model, ordinary least squares (OLS) estimators of coefficients in the market model are both biased and inconsistent. Estimators are asymptotically biased upward for alphas and downward for betas. Thus for the 1,671 non-insurance NASDAQ companies, the Scholes-Williams beta adjustment is
1472
A.C. Lee and J.D. Cummins
applied. To compute the Scholes-Williams estimator, ˇOSW , the following OLS regressions are run in place of regression Equation (99.3): Rjt D a1 C b1 RM;t 1 C u1;t
(99.19)
Rjt D a0 C b0 RM;t C u0;t
(99.20)
Rjt D a1 C b1 RM;t C1 C u1;t
(99.21)
RM;t D ¡0 C ¡M RM;t 1 C vt
(99.22)
where Rjt is the rate of return of asset j for time t, and RM;t is the market rate of return for time t. The CRSP valueweighted index, including all distributions, is used as the market portfolio proxy for RM;t . The estimates from these regressions are then used to compute bO1 C bO0 C bO1 ˇOS W 1 C 2OM
(99.23)
ˇOSW are the estimates for systematic risk for NASDAQ firms. In Table 99.2, the cross-sectional statistics for firm beta estimates are shown. The table is broken down into noninsurance and insurance firms. Statistics are summarized for firms listed on NYSE/AMEX, firms listed on NASDAQ, and all firms combined. Betas for firms listed on NYSE/AMEX are estimated from the first-stage times-series regression Equation (99.3), and betas for firms listed on NASDAQ are calculated using Equation (99.23). The average beta estimate for non-insurance firms (NYSE/AMEX and NASDAQ firms combined) is 1.2826. NASDAQ betas have an average of 1.5673 with a standard deviation of 1.3526 while NYSE/AMEX betas average 0.9426 with a standard deviation of 0.5773. In general, betas of smaller companies tend to have more fluctuations in price and more measurement error (or estimation risk) due to nonsynchronous trading. This is evident in the high variance and wide range of NASDAQ betas. The largest beta estimated is 8.7331, and the smallest beta estimated is 4:4375. Both extreme betas are for companies listed on NASDAQ. Similarly, betas for the insurance firms are also estimated with Equation (99.3) for NYSE/AMEX firms and Equation (99.23) for NASDAQ firms. In Table 99.2, the cross-sectional summary statistics for insurance firm beta estimates are shown. For insurance firms, the average beta is 1.2751 with NYSE/AMEX betas averaging 0.9428 and NASDAQ betas averaging 1.4615. The standard deviation and coefficient of variation for NASDAQ insurance betas (0.8372 and 0.5728, respectively) are considerably higher than those of NYSE/AMEX insurance betas (0.3702 and 0.3926, respectively). Though the maximum and minimum betas of insurance betas (4.7459 and 0:1407, respectively) are not quite as extreme as those for non-insurance betas.
Next, the non-insurance beta estimates are then used in the second-stage cross-sectional regression Equation (99.4) to estimate the risk-free rate and risk premium for the CAPM. Regression Equation (99.4) is run, using the estimated betas from Equations (99.3) and (99.23) of all 3,666 non-insurance companies. The dependent variable .j D 1; : : : ; 3; 666/ is the mean monthly return over the 5-year period for each of the 3,666 non-insurance companies, and the independent variable, “j , is the individual firm beta estimate from Equations (99.3) and (99.23). This results in the CAPM estimates for the risk-free rate, a, and the market risk premium, b. The CAPM estimates from the above procedure will be referred to as the “OLS” version of the CAPM model because the second-stage regression is an ordinary least squares regression. When taking sample statistics, there is almost always some measurement error. For regressions, the problem of inaccuracy in estimating both independent and dependent variables results in biases in the regression estimates. Even if errors of measurement are assumed to be mutually independent, independent of the true values, and have constant variance, the slope estimate will tend to be biased downward, and thus the intercept will tend to be biased upward.6 The greater the measurement error, the greater the biases. In estimating and testing the CAPM, Black et al. (1972) show that using individual firm betas may introduce significant errors in estimating beta. In the second-stage regression Equation (99.4), the intercept (risk-free rate or zero-beta estimate) will tend to be upward biased, and the slope (the market risk premium estimate) will tend to be downward biased. A generally accepted technique for reducing the problem of errors in variables is grouping observations. When grouped, the errors of individual observations tend to be canceled out by their mutual independence, reducing measurement error effect. For CAPM formation, the grouping procedure of using portfolios betas and portfolios returns reduces the estimation error in beta. The tradeoff of using portfolios instead of individual firm betas is that the number of observations in each secondstage cross-sectional regression Equation (99.4) is greatly reduced. To resolve this dilemma, Bubnys estimates the beta of individual firms in the first-stage regression Equation (99.3), and then, he applies the Litzenberger and Ramaswamy (1979) generalized least squares procedure (GLS) for the second-stage Equation (99.4) run. This method takes into account possible correlation between betas and residuals in the second-stage regression Equation (99.4). The residual standard deviations of the first-stage time series regression
6
Detailed discussions of the measurement error issue can be found in Greene (1997) and Judge et al. (1980).
99 Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers
Table 99.2 Summary statistics for firm betas
1473
N
Mean
Std Dev Min
Max
CV
Skewness
Kurtosis
1671 1995 3666
0:9426 1:5673 1:2826
0:5773 1:6700 1:3526 4:4375 1:1154 4:4375
3:9581 8:7331 8:7331
0:6125 0:8630 0:8696
0:1182 0:5817 1:0692
0:8520 2:3468 4:3662
23 41 64
0:9428 1:4615 1:2751
0:3702 0:2760 0:8372 0:1407 0:7455 0:1407
1:5415 4:7459 4:7459
0:3926 0:5728 0:5847
0:3480 1:4212 1:7190
0:9177 4:7115 6:3220
Non-insurance NYSE/AMEX NASDAQ Total Insurance NYSE/AMEX NASDAQ Total
Equation (99.3) are used to transform all the cross-sectional variables of the second-stage regression Equation (99.4). The resulting equation is: (99.24)
computing Equation (99.23). Estimates for the risk-free rate, O are from Ibbotson Associates a, O and the risk premium, b, (1994) for the IS version, regression Equation (99.4) for the OLS version, and regression Equation (99.24) for the GLS version.
Rj D mean monthly return over all time periods for asset j Sej D standard deviation of the residuals of firm j ’s market model regression “j D systematic risk for asset j from first stage j D residual term ı For regression Equation (99.24), Sej1 and ˇj Sej are the independent variables, and the intercept term is constrained to be zero. The coefficient estimates aO and bO are still defined as before. aO is the estimate of the expected return of a riskless or zero-beta asset, and bO is the estimate of the expected excess return of the market portfolio over the riskless rate or zero-beta asset rate. This version of the empirical CAPM will be referred to as the “GLS” CAPM because a generalized least squares procedure is used for the second-stage regression. A third version of the CAPM is considered where the Ibbotson Associates (1994) data are used for estimates of the risk-free rate and risk premium. For the risk-free rate estimate, the 1926–1993 average annual return of long-term government bonds (5.4%) is used, and for the risk premium estimate, the 1926–1993 average annual return of large companies minus the average annual return of long-term government bonds .12:3 5:4% D 6:9%/ is used. This version of the CAPM model is referred to as the “IS” version. Ibbotson Associates estimates are also used as a comparison for the risk premium and risk-free rate estimates obtained by the OLS and GLS versions of the CAPM. To estimate the cost of equity capital for the insurance firms, Equation (99.7) is calculated. Beta estimates for the firms are obtained by running regression Equation (99.3) for NYSE/AMEX insurance firms. NASDAQ insurance firm betas are obtained using the Scholes-Williams beta adjustment by running regressions Equations (99.19)–(99.22) and
Risk Premium and Risk-Free Rate Estimates. Table 99.3 shows the annualized estimates of the market risk-free asset rate .Rf / and market risk premium (RP) for the three CAPM versions. For the CAPM IS version, a risk-free rate of 5.4% and a risk premium of 6.9% from Ibbotson Associates (1994) is used. The second CAPM version is the ordinary least squares (OLS) version of the second-stage regression. The rates shown for the OLS version are the estimates obtained from running the second-stage cross-sectional regression Equation (99.4) before applying the Litzenberger and Ramaswamy adjustment. Estimate values of 11.5% for the risk-free rate and 5.21% for the risk premium are both significant at the 0.5% level. The third CAPM version is the intercept constrained GLS regression (Equation (99.25)), using the Litzenberger and Ramaswamy adjustment. The estimates are 10.65% for the risk-free rate and 6.12% for the risk-premium, both significant the 0.5% level. Although the OLS estimate of the CAPM Rf for the 5-year period, 11.85%, is much higher than historical and IS estimates of riskless rate, the estimate is significant at the 0.5% level. One possible interpretation is that is the not the riskless rate but the zero-beta asset rate. Also to be considered is the fact that NASDAQ companies are used in the model estimates. As shown in Table 99.1, the mean monthly return of the non-insurance NASDAQ companies (20.52% annualized) is considerably higher than that of the noninsurance NYSE/AMEX companies (16.25% annualized), and the overall average (18.48% annualized) is also considerably higher than the Ibbotson Associates average annual return of large company stocks (12.3%). The NASDAQ mean increases the overall mean used to estimate the risk-free rate and directly affects the estimate of the regression intercept. This can be explained as follows. The dependent variables of regression Equation (99.4) used to estimate the risk-free rate
ı ı ı ı Rj Sej D a Sej C b ˇj Sej C uj Sej
1474
A.C. Lee and J.D. Cummins
Table 99.3 CAPM and APT second-stage annualized results of regression coefficients 1988–1992 (t -values in parentheses below coefficients) CAPM .n D 3666/ APT .n D 40/ RP 0 1 2 3 4 5 Rf IS OLS GLS
0.054 0.1185 (24.707)a 0.1065 (36.031)a
0.069 0.0521 (14.453)a 0.0612 (22.964)a
0.0957 (1.940)b 0.1062 (2.195)a
0.4962 (1.181) 0.6400 (1.526)
1.9776 (2.914)a 1.5911 (2.421)a
0.3635 (0.370) 0:0941 .0:098/
0:0777 .0:087/ 0:2070 .0:239/
2.1275 (2.511)a 2.5941 (3.916)a
a
Significant at the 1% level Significant at the 5% level IS D CAPM with Ibbotson Associates risk-rate .Rf / and risk premium (RP), OLS D CAPM or APT with ordinary least squares second stage regression, GLS D CAPM or APT with generalized least squares second stage regression b
(the intercept) are the monthly returns of the non-insurance firms. In general, estimation of the intercept, a, is a D y bx
(99.25)
where y is the mean of the dependent variable, b is the slope estimate, and x is the mean of the independent variable. The NASDAQ returns increase the overall mean of the dependent variable, y, which increases the intercept estimate. Another possible reason for high risk-free rate and low risk premium estimates, by historical standards, is that small company betas have more measurement error (or estimation risk). The standard deviation and coefficient of variation of the non-insurance NASDAQ betas (1.3526 and 0.8630, respectively) are much higher than that of NYSE/AMEX betas (0.5773 and 0.6125, respectively). With higher estimation risk from NASDAQ betas, the second-stage regression will tend to underestimate the slope (risk premium) and overestimate the intercept (risk-free rate). Also, another explanation for a high risk-free rate estimate is that there are missing explanatory variables. Fama and French (1992) find that two easily measured variables, firm size and book-to-market equity, need to be considered in the CAPM model in addition to the market betas. In the GLS version of the CAPM, the LitzenbergerRamaswamy method (regression Equation (99.24)) is applied to correct for overestimation of the intercept and underestimation of the slope due to measurement error. In Table 99.3, the GLS version estimate of Rf and RP are 10.65 and 6.12%, respectively. Both estimates are significant at the 0.5% level. It appears that the Litzenberger-Ramaswamy adjustment has corrected for some of the measurement error problems present in the OLS method by lowering the riskfree rate estimate, increasing the risk premium estimate, and increasing the significance in both estimates. The OLS and GLS CAPM estimates for the risk-free rate and risk premium differ from the historical and the IS estimates. This can be partly attributed to the fact that NASDAQ companies (i.e., small companies) are used in the sam-
ple to estimate the CAPM. Smaller companies tend to have higher average returns and more measurement error than larger firms, and both factors contribute to the overestimation of the risk-free rate and underestimation of the risk premium. Or, it could be that these estimates are consistent with this time period and these companies.
99.4.2.2 APT Estimate To estimate the APT model, first, factor analysis is used to estimate the factor scores for the market, the 3,666 noninsurance companies. Factor analysis is a body of multivariate data-analysis techniques that provide “analysis of interdependence” of a given set of variables or responses (Churchill 1983, Morrison 1990). For the APT model, we apply factor analysis to the monthly returns of the 3,666 noninsurance stocks to generate k common factor loadings for all the stocks. These factors, in a sense, summarize and capture the unobserved, underlying interdependencies of the monthly returns of the 3,666 non-insurance stocks. Because the number of variables (3,666 companies) is much greater than the number of observations (60 months for each company), we form portfolios from the 3,666 companies to reduce the number of variables. First, the 3,666 non-insurance companies are grouped alphabetically into 40 equally weighted portfolios (14 portfolios of 91 companies and 26 portfolios of 92 companies). The equal-weighted average returns for each month of each portfolio are then used in a maximum likelihood estimation (MLE) method of factor analysis, varimax rotation of factors, to generate the factor scores (Ikt in Equation (99.9)).7 A varimax rotation is basically an orthogonal (angle-preserving) rotation of the original factor axes for simplification of the columns or factors (Churchill 1983, p. 634; Morrison 1990, pp. 370–372). Rotations are done to facilitate the isolation and identification 7 Alternative methods for the selection of factors is discussed in detail by Campbell et al. (1997).
99 Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers
of the factors underlying a set of observed variables (the monthly returns of the 3,666 non-insurance stocks). A five factor .k D 1 : : : 5/ model is used (Roll and Ross 1980), and this results in five factor scores for each observed month t.t D 1 : : : 60/. These five factor scores are then input into regression Equation (99.9) as independent variables, resulting in the estimated factor loading, bjk , of asset j to factor k. The dependent variables of this regression, Rjt , are the monthly average return of each portfolio j.D 1 : : : 40/ for the t.D 1 : : : 60/ months. Rjt D bj 0 C bj 1 I1t C : : : C bjk Ikt C ujt
(99.9)
The market model (99.9) is the APT analog to market model Equation (99.3) for the CAPM. This results estimates of factor loadings .bj1 ; : : : ; bj 5 / for each of the 40 portfolios. Next, the second pass cross-sectional regression Equation (99.10) is run to estimate the risk premium associated with each factor k in the APT model (Equation (99.8)). The dependent variables are the average return for the 5-year period for each of the 40 portfolios, and the independent variables are the estimated five factor loadings, bik , for each portfolio from the first-stage regression Equation (99.9): Rj D 0 C 1 bj1 C 2 bj 2 C : : : C 5 bj 5 C uj
(99.10)
This results in the risk premium, œk , associated with each factor of the five factors and œ0 , which is the risk-free rate when all bjk ’s are zero. The APT model estimate from the above described procedure will be referred to as the “OLS APT” version. Like the CAPM analysis, to adjust for possible sampling error in the estimation of the APT factor loadings in Equation (99.10), the Litzenberger and Ramaswamy procedure is applied. The resulting regression is: ı ı ı ı Rj Sej D 0 Sej C 1 bj1 Sej C 2 bj 2 Sej ı ı C : : : C k bj k Sej C uj Sej (99.26)
1475
analysis. Regression Equation (99.11) is run to obtain the estimated factor loadings .bi1 ; : : : ; bi 5 / for the 64 insurance companies: Rit D bi 0 C bi 1 I1t C : : : C bik Ikt C uit
(99.11)
To estimate APT cost of equity capital, Kei , use the equation Kei D 0 C 1 bi 1 C 2 bi 2 C 3 bi 3 C 4 bi 4 C 5 bi 5 (99.12) Insurance factor loading estimates, bik , from (99.11), and 0 and the risk premium estimates, œk , from either Equation (99.10) or (99.26) are used in Equation (99.12) to compute the APT cost of equity capital. Risk Premium Estimates. Table 99.3 shows the risk premium estimates (lambdas) of the OLS version (regression Equation (99.10)) and the intercept constrained GLS version of APT (regression Equation (99.26)). The factors of the APT are difficult to identify. œ0 is the estimate of the zero-loading return. The OLS APT estimate of œ0 is 9.57% and is significant at the 5% level. For the GLS APT version, the estimate of œ0 is 10.62% and significant at the 1% level. The APT estimates of œ0 are similar to the risk-free rate estimates of the CAPM. The first factor of the APT, œ1 , is considered highly correlated with the market index of the CAPM. Estimates of the first factor are 0.4962 and 0.6400 for the OLS and GLS versions, respectively, but these estimates are not highly significant. Overall, for the APT OLS version, two factors are significant at the 1% level, and the zero-loading return is significant at the 5% level for the APT OLS version. And, for the APT GLS version, the Litzenberger-Ramaswamy adjustment improved the statistical significance of the some of the factors. œ0 ; œ2 , and œ5 are all significant at the 1% level. After applying the Litzenberger and Ramaswamy adjustment, the t-values of œ0 ; œ1 ; œ4 , and œ5 increase.
99.4.2.3 Unified CAPM/APT Estimate where all variables are defined as before in regression Equation (99.10) and Sej is the standard deviation of the residuals of firm j ’s market model regression Equation (99.9). The intercept term of regression Equation (99.26) is constrained to zero. .1=Sej/ and .bjk =Sej / are the independent variables. Equation (99.26) for the APT is analogous to Equation (99.9) for the CAPM. The APT estimates resulting from regression Equation (99.26) will be referred to as the GLS APT version. Next, to estimate the cost of equity capital, each individual insurance company’s rate of returns are regressed against the five factor scores obtained from the MLE factor
The procedure to estimate Wei’s unified CAPM/APT is almost identical to the procedure to estimate the APT model except that an additional factor, the market factor, is added. Factor scores are estimated for the market by using the monthly returns of the 3,666 non-insurance firms. The equalweighted average returns for each month of 40 equally weighted portfolios are used in a MLE method of factor analysis, varimax rotation of factors, to generate the factor scores (Ikt in Equation (99.14)). These factor scores are the same as those used in the APT (Ikt in Equation (99.9)). Five .k D 5/ factor scores are generated, and an additional factor,
1476
A.C. Lee and J.D. Cummins
rmt , is calculated for the Wei model. The market factor, rmt , is calculated is the mean adjusted market return and is calculated by (99.27) rmt D Rmt Em where Rmt is the market index and Em is the mean of Rmt . The first-stage times-series regression Equation (99.14) estimates the factor loadings .bj1 ; : : : ; bj 5 ; bjm /, which are then used as independent variables in the second-stage crosssectional regression Equation (99.15). This results in estimates for œ0 and the risk premiums .œ1 ; : : : ; œ5 ; œm / associated with each factor, which is similar to the APT secondstage cross-sectional regression Equation (99.10) but with the additional term, œm bjm , for the market factor. The resulting estimates for this model are referred to as the “WEI OLS” version. Similar to the CAPM and APT analysis, to adjust for possible sampling error in the estimation of the factor loadings in Equation (99.15), the Litzenberger and Ramaswamy procedure is applied. The resulting regression is ı ı ı ı Rj Sej D 0 Sej C 1 .bj1 Sej / C 2 .bj 2 Sej / ı ı ı C : : : C 5 .bj 5 Sej / C m .bj m Sej / C uj Sej (99.28) where the variables are defined are before in regression Equation (99.15), and Sej is the standard deviation of the residuals of firm j ’s market model regression Equation (99.14). The intercept term of regression Equation (99.28) is constrained to zero. .1=Sej / and .bjk =Sej / are the independent variables. The estimates resulting from regression Equation (99.28) will be referred to as the “WEI GLS” version. To obtain the estimated factor loadings, bi1 : : : bi 5 , for the insurance companies, each individual insurance company’s rate of return are regressed against the same five factor scores and the market factor, which are used in regression (99.14): Ri t D bi 0 C bi 1 I1t C : : : C bi k I5t C bi m rmt C ui t (99.16) To estimate Wei cost of equity capital, Kei , Equation (99.23) is calculated. Kei D 0 C 1 bi 1 C 2 bi 2 C 3 bi 3 C 4 bi 4 C 5 bi 5 C m bim : (99.17) Insurance factor loading estimates from Equation (99.16) and risk premium estimates from either Equation (99.15) (OLS version) or Equation (99.28) (GLS version) are used in Equation (99.17) to compute the cost of equity capital of Wei’s model. Risk Premium Estimates. Table 99.4 shows the results for the OLS and GLS versions of Wei’s unified CAPM/APT
model. For the OLS version, two factor premiums, œ2 and œ5 , are statistically significant at the 1% level, and œ0 and œm , are statistically significant at the 5% level. The same factor premiums are significant for the GLS version, and in addition, œ1 is significant at the 5% level. Comparing estimates of the Wei model with those of the APT model, the risk premium estimates are similar in magnitude for œ0 through œ5 . And basically the same factors are statistically significant for both the Wei model and the APT model except for œ1 , which is statistically significant at the 5% level for the GLS Wei version. The risk premium for the market model, œm , is significant at the 5% level for both the OLS and GLS versions of the Wei model and seems to improve the specification of the APT model without detracting from the significance of the other factors.
99.5 Evaluations of Simulations and Estimates In this study, the CAPM, APT, and Wei models are used to do in-sample forecasting of the cost of equity capital for nonlife insurance companies. In any forecasting study, it is important to evaluate the quality of predictions made. Lee et al. (1986) discusses procedures for evaluating forecasts. Some of the forecast evaluating procedures considered in Lee et al. are used to measure the quality of the cost of equity capital estimations from the various versions of the three asset pricing models. The Theil mean squared error decomposition (MSE) and the Theil U 2 statistic enables us to get measures of the forecast performance of the models. Also, to evaluate Granger and Newbold (1973) conditional efficiency of the asset pricing models, the Bates and Granger method (1969) is applied. Several models for evaluating forecasts are considered because no one model can accurately assess the performance of the cost of equity capital estimation models. Each evaluation procedure provides different information about forecast quality.
99.5.1 MSE Method Let Ft .t D 1 : : : N / be a set of forecasts in time t, and Xt .t D 1 : : : N / be the corresponding true values at time t. Calculation of forecast mean squared error (MSE) (Theil 1958) is computed by the following equation, which can be deconstructed into the terms in the right-hand side of the equation: N P
MSED
.Xt Ft /2
t D1
N
D.F X /2 C.SF rSx /2 C.1r 2 /Sx2 (99.29)
99 Alternative Models for Estimating the Cost of Equity Capital for Property/Casualty Insurers
1477
Table 99.4 WETs unified CAPM/APT second-stage annualized results of regression coefficients 1988–1992 (t -values in parentheses below coefficients) WETs Model .n D 40/ œ1 œ2 œ3 œ4 œ5 œm œ0 OLS GLS
0.0887 (1.789)a 0.0992 (2.090)a
0.5781 (1.360) 0.6750 (1.688)a
2.0047 (2.962)b 1.7068 (2.642)b
0.4471 (0.456) 0.02916 (0.031)
0.0188 (0.21) 0:1307 .0:154/
2.2453 (2.639)b 2.5950 (4.025)b
0.0885 (1.849)a 0.0870 (1.860)a
a
Significant at the 5% level Significant at the 1% level OLS D Wei model with ordinary least squares second stage regression, GLS D Wei model with generalized least squares second stage regression b
where F̄ and X̄ are the sample means, S_F and S_X are the sample standard deviations, and r is the sample correlation between actual and predicted values. Mincer and Zarnowitz (1969) refer to the three components on the right-hand side of Equation (99.29) as the contributions to forecast mean squared error from bias, inefficiency, and random error, respectively. Calculating the MSE, and its decomposition, for the CAPM, APT, and Wei model forecasts gives us measures with which the performance of the various forecasts can be compared. Applying Equation (99.29) to the cost of equity capital estimates:

$$\mathrm{MSE} = \frac{\sum_{i=1}^{64}(\bar{R}_i - K_{ei})^2}{64} = (\bar{K}_e-\bar{R})^2 + (S_K - rS_R)^2 + (1-r^2)S_K^2$$

where
R̄_i = average rate of return for insurance firm i
K_ei = cost of equity capital estimate for insurance firm i
R̄, K̄_e = sample means of R̄_i and K_ei, respectively
S_R, S_K = sample standard deviations of R̄_i and K_ei, respectively
r = sample correlation between the average rate of return and the cost of equity capital estimate
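To make the decomposition concrete, here is a minimal sketch in Python (our own illustration; the function name and the data are hypothetical, not from the original study). With population (ddof = 0) moments, the three components sum exactly to the total MSE, as in Equation (99.29).

```python
import numpy as np

def mse_decomposition(actual, forecast):
    """Theil MSE decomposition, Equation (99.29):
    MSE = bias + inefficiency + random error."""
    x = np.asarray(actual, dtype=float)    # true values X_t
    f = np.asarray(forecast, dtype=float)  # forecasts F_t
    mse = np.mean((x - f) ** 2)
    r = np.corrcoef(x, f)[0, 1]            # sample correlation
    s_x, s_f = x.std(), f.std()            # population standard deviations
    bias = (f.mean() - x.mean()) ** 2
    inefficiency = (s_f - r * s_x) ** 2
    random_error = (1.0 - r ** 2) * s_x ** 2
    return mse, bias, inefficiency, random_error

# Example with simulated data: the three components sum to the total MSE
rng = np.random.default_rng(1)
x = rng.normal(0.19, 0.05, 64)             # hypothetical average returns
f = 0.8 * x + rng.normal(0.02, 0.03, 64)   # hypothetical cost of equity estimates
mse, b, ineff, rand_err = mse_decomposition(x, f)
assert np.isclose(mse, b + ineff + rand_err)
```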
99.5.2 Theil U²

The quality of the cost of equity capital estimates can also be assessed by using the Theil (1966) measure, U²:

$$U^2 = \frac{\sum_{i=1}^{64}(\bar{R}_i - K_{ei})^2}{\sum_{i=1}^{64}(\bar{R}_i - \bar{R})^2} \qquad (99.30)$$

where
R̄_i = average return during the 1988–1992 period for insurance asset i
K_ei = cost of equity capital estimate for 1988–1992 for insurance asset i
R̄ = average return for all insurance companies

The smaller the U² ratio, the better the model estimate relative to the naive model. If the U² ratio is equal to 1, the model estimates are no better than using the sample average. If the U² ratio is greater (less) than 1, the model estimates are worse (better) than the naive model. Notice that the numerators of the MSE and the U² ratio are the same, but the denominator of the U² ratio turns it into an evaluation of the forecasts relative to the naive model. Thus, with the U² ratio, not only can we evaluate performance by comparing the ratios of various models, but the ratio also provides a benchmark (the naive model) against which a model's forecast performance can be judged.

99.5.3 Conditional Efficiency Method

99.5.3.1 Bates and Granger Model

Bates and Granger (1969) propose a model that allows for the possibility of a composite, or combined, forecast, which is a weighted average of two individual forecasts and may be superior to either forecast individually in terms of predictive ability. Consider the following regression:

$$X_t = kF_{1t} + (1-k)F_{2t} + U_t \qquad (99.31)$$

or, equivalently,

$$X_t - F_{2t} = k(F_{1t} - F_{2t}) + U_t \qquad (99.32)$$

where
X_t = true values in time t (t = 1, 2, ..., N)
F_1t = set of forecasts
F_2t = alternative set of forecasts
U_t = random error term

In terms of forecast evaluation, if the predictions F_2t contain no useful information about X_t that is not already incorporated in F_1t, then the optimal value of k is one. Granger and Newbold (1973) define F_1t as conditionally efficient with respect to F_2t if k, the weight on F_1t in the composite predictor of Equation (99.31), is one. A test for conditional efficiency is applied by fitting Equation (99.32) by ordinary least squares and testing the hypothesis that k = 1 with the usual t-test. Given a pair of forecasts, not only can we determine their conditional efficiency, but it is also possible to use this model to find a combined forecast that outperforms each individual forecast by the mean squared error criterion.
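Because Equation (99.32) is a regression through the origin, the test is easy to carry out directly. The following is a minimal sketch (our own illustration, with hypothetical names), returning k̂ and the t-statistics for k = 0 and k = 1.

```python
import numpy as np

def conditional_efficiency(x, f1, f2):
    """Fit Equation (99.32), (X_t - F_2t) = k (F_1t - F_2t) + U_t, by OLS
    through the origin; return k_hat and t-statistics for k = 0 and k = 1."""
    y = np.asarray(x, dtype=float) - np.asarray(f2, dtype=float)
    z = np.asarray(f1, dtype=float) - np.asarray(f2, dtype=float)
    k_hat = (z @ y) / (z @ z)            # OLS slope with no intercept
    resid = y - k_hat * z
    s2 = (resid @ resid) / (len(y) - 1)  # residual variance, one parameter
    se = np.sqrt(s2 / (z @ z))           # standard error of k_hat
    return k_hat, k_hat / se, (k_hat - 1.0) / se

# F1 is conditionally efficient with respect to F2 if k = 1 is not rejected.
```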
99.5.3.2 Davidson and Mackinnon's Model

Chen (1983) assumes that the CAPM and APT models are nonnested for the purposes of comparing and testing the two models. In attempting to compare the two models, Chen (1983) points out the problems with simply running a regression on the CAPM beta and the APT factor loadings together: (1) such a specification is not justified on theoretical grounds, because the two models are nonnested; and (2) because the CAPM beta and the APT factor loadings are both measures of risk, such a regression would suffer from a high degree of multicollinearity among the independent variables. Davidson and Mackinnon (1981) propose several procedures to test the validity of regression models under nonnested alternative hypotheses. Using these procedures, Chen suggests the following regression, in which α is estimated:

$$r_i = \alpha\,\hat{r}_{i,\mathrm{APT}} + (1-\alpha)\,\hat{r}_{i,\mathrm{CAPM}} + e_i \qquad (99.33)$$

where r̂_APT and r̂_CAPM are the expected returns predicted by the APT and the CAPM, respectively. From Equation (99.33), if the APT is the correct model relative to the CAPM, we would expect α to be close to 1. Thus, regression Equation (99.33), suggested by Chen (1983) on the basis of the Davidson and Mackinnon (1981) procedures, is essentially the same as the Granger and Newbold (1973) test for conditional efficiency, which uses the Bates and Granger (1969) model. We therefore apply the Granger and Newbold test of conditional efficiency to evaluate the performance of the CAPM and APT forecasts of the cost of equity capital. This test allows us to compare the two models' forecast performance simultaneously, unlike the MSE and U² ratio, under which each model's forecast performance is tested separately.
99.5.4 Comparison of Alternative Testing Results

To obtain the cost of equity capital estimates, K_ei, for the various versions of the CAPM, APT, and Wei models, Equations (99.7), (99.12), and (99.17) are used. The Theil mean squared error decomposition (MSE), the Theil U² ratio, and the conditional efficiency (Granger and Newbold) methods are applied to evaluate the forecast performance of the CAPM, APT, and Wei models in estimating the cost of equity capital for the 64 insurance companies. For all evaluation methods, the average return for the insurance company during the 5-year period is taken as the actual value, and K_ei is the forecasted (or predicted) value.

First, consider the MSE decomposition presented in Table 99.5. The terms are calculated from annualized instead of monthly rates of return so that the magnitudes of the MSE values are not too small. Equation (99.29) is applied to the average returns and cost of equity capital estimates of the insurance companies. The CAPM versions (OLS and GLS) have biases (0.00003) that are at least one order of magnitude smaller than the biases of all the other models. The APT OLS and Wei OLS models have bias values (0.00615 for both) that are by far the largest of any of the models. But a large bias value merely indicates that the true values and the predicted values do not have similar sample means; as long as the difference in sample means is not overwhelmingly large, this does not necessarily imply that a model provides poor overall forecasts. The largest inefficiency values are also those of the APT OLS and Wei OLS models (0.00068 and 0.00075, respectively). The inefficiency term is an indication of the difference between the sample and forecast standard deviations. The terms that have the most impact on the total MSE values are the random error terms, which are a function of the sample correlation between the predicted and actual values. Despite having the largest bias and inefficiency values, the APT OLS and the Wei OLS have the lowest random error values (0.00169 and 0.00201, respectively), which are one order of magnitude smaller than those of all the other models, resulting in the lowest total MSE values (0.00852 and 0.00891, respectively). The CAPM IS has the highest total MSE, 0.01996. Thus, by the MSE criterion, the APT OLS and the Wei OLS models are considerably better at estimating the cost of equity capital for the sample of PC insurance companies, and the CAPM IS model can be considered to have the worst forecast performance of all the models. Because there is no scale or basis against which the MSE values can be compared, it is difficult to determine exactly how much better the APT OLS and Wei OLS models are than the other models; the Theil decomposition of the MSE gives an indication of the relative strengths and weaknesses of a model's forecasts in terms of bias, inefficiency, and random error.

Next, consider the Theil U² statistic (Equation (99.30)) for the cost of equity capital forecasts shown in Table 99.6. Because the U² ratio and the MSE equation have the same
Table 99.5 Mean squared errorᵃ decompositions for forecastsᵇ

                 CAPM                            APT                   WEI
                 IS        OLS       GLS        OLS       GLS         OLS       GLS
Bias             0.00231   0.00003   0.00003    0.00615   0.00032     0.00615   0.0003
Inefficiency     0.00066   0.00017   0.00040    0.00068   0.00030     0.00075   0.00012
Random error     0.01670   0.01700   0.01700    0.00169   0.01662     0.00201   0.01615
Total MSE        0.01996   0.01720   0.01743    0.00852   0.01724     0.00891   0.01659

ᵃ MSE = Σ_{i=1}^{64}(R̄_i − K_ei)²/64 = (K̄_e − R̄)² + (S_K − rS_R)² + (1 − r²)S_K² = Bias + Inefficiency + Random error
ᵇ IS = CAPM with Ibbotson Associates risk-free rate (R_f) and risk premium (RP); OLS = CAPM, APT, or Wei model with ordinary least squares second-stage regression; GLS = CAPM, APT, or Wei model with generalized least squares second-stage regression
Table 99.6 Theil U² ratiosᵃ for the CAPM, APT, and Wei model versionsᵇ

        CAPM                        APT                         WEI
        Insurance    Market        Insurance    Market         Insurance    Market
        R̄ = 0.1900   R̄ = 0.1852   R̄ = 0.1900   R̄ = 0.1852    R̄ = 0.1900   R̄ = 0.1852
IS      1.1328       1.1313        –            –              –            –
OLS     0.9737       0.9724        0.4883       0.4876         0.6539       0.6530
GLS     0.9866       0.9853        0.9774       0.9761         0.9439       0.9427

ᵃ U² = Σ_{i=1}^{64}(R̄_i − K_ei)² / Σ_{i=1}^{64}(R̄_i − R̄)²
ᵇ IS = CAPM with Ibbotson Associates risk-free rate (R_f) and risk premium (RP); OLS = CAPM, APT, or Wei model with ordinary least squares second-stage regression; GLS = CAPM, APT, or Wei model with generalized least squares second-stage regression
numerators, it is likely that both methods will rank the models similarly in terms of forecast performance. The additional information that the U² statistic provides is a comparison of a model's forecast performance with that of the naive model (using the overall mean as the predictor). A U² ratio equal to 1 means that the forecast performance is equivalent to that of the naive model, and a ratio greater (less) than 1 means that the forecast performance is worse (better) than that of the naive model. The "grand mean," R̄, used in the denominator of U² can be either the average return of the insurance companies (19.00%, annualized) or the market average of 18.52% (for the 3,666 non-insurance companies). Because the difference between the insurance company average and the market average is not large, the U² ratios do not differ much; in all cases and for all model versions, the U² ratio falls when the market average is used in the denominator instead of the insurance average. All U² ratios are less than 1, except for those of the IS CAPM model, implying that almost all models forecast better than the naive model. The highest U² ratios, and the only ratios larger than 1, are those of the IS CAPM model, which is consistent with the MSE evaluation of the IS CAPM as having the poorest forecast performance. In fact, the U² rankings of model performance are the same as the MSE rankings, with the APT OLS model having both the lowest MSE value and the lowest U² ratio.
Although both the MSE decomposition and the Theil U² statistic can be easily calculated for comparison, they do not give a complete picture of forecast quality. With these two methods, it is not possible to compare the forecast quality of two models by considering them simultaneously. And, as Granger and Newbold (1973) and Lee et al. (1986) point out, simply because one set of forecasts, F_1t, outperforms another set of forecasts, F_2t, it does not necessarily follow that the first set is more efficient than the other; F_2t may contain useful information not captured by F_1t. Thus, an additional measure of forecast performance is applied. Under the Granger and Newbold (1973) definition, F_1t is conditionally efficient with respect to F_2t if k = 1 in regression Equation (99.32). Conditional efficiency is therefore tested by fitting the Bates and Granger regression Equation (99.32), or, equivalently, Equation (99.31), by ordinary least squares. Results of the regressions are shown in Table 99.7, comparing the APT versions with the CAPM versions and the Wei model versions with the CAPM versions.

Considering first the APT and CAPM comparisons (regressions 1, 2, and 3 in Table 99.7), it would be logical to expect the k̂ of the APT OLS model to be very close to 1 when evaluated against the other models because of its dramatically lower MSE and U² values compared to those of the
Table 99.7 Granger and Newbold's conditional efficiency: estimated weights of the expected returns from the CAPM, APT, and Wei models, R_t = kF_1t + (1 − k)F_2t + U_t

      F_1t       F_2t       k̂         t-Statistic (k = 0)   t-Statistic (k = 1)   R²
1.    APT OLS    CAPM IS    0.65196   15.666ᵃ               −8.363ᵇ               0.7925
2.    APT OLS    CAPM OLS   0.65138   12.086ᵇ               −6.468ᵇ               0.6939
3.    APT OLS    CAPM GLS   0.65541   12.069ᵇ               −6.345ᵇ               0.6933
4.    APT GLS    CAPM IS    0.83660   3.237ᵇ                −0.632                0.1290
5.    APT GLS    CAPM OLS   0.48762   1.549                 −1.627                0.0214
6.    APT GLS    CAPM GLS   0.52743   1.796ᵃ                −1.585                0.0322
7.    WEI OLS    CAPM IS    0.59992   14.178ᵃ               −9.455ᵃ               0.7576
8.    WEI OLS    CAPM OLS   0.58792   10.712ᵃ               −7.508ᵃ               0.6400
9.    WEI OLS    CAPM GLS   0.59514   10.688ᵃ               −7.380ᵃ               0.6389
10.   WEI GLS    CAPM IS    0.78071   3.731ᵃ                −1.048                0.1680
11.   WEI GLS    CAPM OLS   0.57418   2.142ᵇ                −1.589                0.0531
12.   WEI GLS    CAPM GLS   0.59837   2.320ᵇ                −1.557                0.0641

ᵃ Significant at the 1% level
ᵇ Significant at the 5% level
IS = CAPM with Ibbotson Associates risk-free rate (R_f) and risk premium (RP); OLS = CAPM, APT, or Wei model with ordinary least squares second-stage regression; GLS = CAPM, APT, or Wei model with generalized least squares second-stage regression
other models, but this is not the case. In fact, the k̂'s for the APT OLS regressions with each of the CAPM versions are all approximately 0.65 and statistically different from 1. Table 99.7 shows that the t-statistics (k = 1) of regressions 1, 2, and 3 are all statistically significant at the 1% level. This implies that the APT OLS model is not conditionally efficient with respect to the CAPM versions and that the CAPM versions contain useful information not present in the APT OLS forecasts. Although the k̂'s for regressions 1, 2, and 3 are statistically different from 1, the t-statistics for k = 0 are also all significant at the 1% level, and the R² values for regressions 1, 2, and 3 are all around 0.7. This suggests that a combined forecast of the APT OLS and a CAPM version would perhaps outperform each individual forecast by the MSE and/or U² criteria.

The highest k̂ for the APT and CAPM regressions is 0.8366, with a t-statistic (k = 1) of −0.632, for the APT GLS and CAPM IS regression (regression 4). Thus, the null hypothesis of conditional efficiency (k = 1) is not rejected. Also, the t-statistic (k = 0) for regression 4 is significant at the 1% level. This implies that k̂ is not statistically different from 1 and that the APT GLS forecasts may be considered conditionally efficient with respect to the CAPM IS forecasts. Regressions 5 and 6, which test the APT GLS with respect to the CAPM OLS and GLS, respectively, also do not reject the null hypothesis of k = 1; the corresponding t-statistics are −1.627 and −1.585. Although regressions 4, 5, and 6 do not reject the hypothesis of conditional efficiency of the APT GLS with respect to the CAPM versions, the R²'s of these regressions are quite low. Thus, a combined forecast of the APT GLS with any of the CAPM versions is unlikely to provide a forecast that is superior to each individual forecast.

The results for the conditional efficiency of the Wei forecasts with respect to the CAPM forecasts parallel those for the APT forecasts. Despite having low MSE and U² values compared to those of the CAPM forecasts, the WEI OLS forecasts are not conditionally efficient with respect to the CAPM forecasts (regressions 7, 8, and 9). The k̂ estimates for these regressions are approximately 0.6, with t-statistics that are significant at the 1% level for both the (k = 0) and the (k = 1) hypotheses. This implies that the CAPM forecasts may contain information not captured by the WEI OLS forecasts and that a combined forecast may outperform each individual forecast by the MSE and U² criteria. The high R² values of regressions 7, 8, and 9 also suggest that a combined forecast should be considered. Regression 10, which considers the WEI GLS forecasts with respect to the CAPM IS forecasts, has a k̂ value of 0.78071 and a t-statistic (k = 1) of −1.048, so the WEI GLS forecasts may be considered conditionally efficient with respect to the CAPM IS forecasts. The hypothesis of conditional efficiency of the WEI GLS with respect to the CAPM OLS or the CAPM GLS (regressions 11 and 12, respectively) also cannot be rejected. But, as with the APT GLS, the regressions of the WEI GLS on the CAPM versions have low R² values, which indicates that a combined forecast of the WEI GLS with any of the CAPM versions is unlikely to be superior to a combined forecast of either the APT OLS or the WEI OLS with the CAPM versions.
99.6 Summary and Conclusion

This paper applies three alternative asset pricing models to estimate the cost of equity capital for 64 non-life insurance companies during the 5-year period 1988–1992. A sample of 3,666 non-insurance companies was used to estimate the CAPM and APT characteristic line equations. Three versions of the CAPM cost of equity capital estimates (IS, OLS, GLS) were considered. The IS version used the Ibbotson and Sinquefield estimates of the risk-free rate and the risk premium for the CAPM. The OLS version used the risk-free rate and risk premium estimates from a second-stage ordinary least squares regression. To adjust for the errors-in-variables problem, the Litzenberger-Ramaswamy generalized least squares procedure was used for the second-stage regression, resulting in the GLS version. For both the OLS and GLS versions, the Scholes-Williams beta adjustment was applied to NASDAQ stocks to address the problem of nonsynchronous trading. Two versions of the APT cost of equity capital estimates were considered (OLS and GLS). As with the CAPM, the OLS version resulted from a second-stage ordinary least squares regression, and the GLS version resulted from a second-stage generalized least squares regression to adjust for the errors-in-variables problem. Finally, a model unifying the CAPM and APT, developed by Wei, was also used to estimate the cost of equity capital for the insurance companies; OLS and GLS versions of the Wei model, similar to those of the APT model, were considered.

For the CAPM, the OLS and GLS estimates of the risk-free rate and risk premium were, respectively, higher and lower than the historical and IS values. Although the GLS version seemed to partially correct the errors-in-variables problem of underestimating the slope and overestimating the intercept, other errors-in-variables estimation methods for adjusting the estimates should perhaps be considered in future research. In addition, the higher average returns and increased estimation risk associated with NASDAQ companies also contributed to the high risk-free rate estimates and the low risk premium estimates. But because many of the insurance companies trade on NASDAQ, using NASDAQ companies to estimate the asset pricing models should improve the models' ability to forecast the cost of equity capital for insurance firms. A misspecification of the CAPM (missing variables) could also cause the high intercept estimates and low slope estimates.

The APT and Wei's unified CAPM/APT model had similar factor risk premium estimates. The additional market factor of the Wei model was statistically significant in both the OLS and GLS versions. It is intuitively appealing to have a model that combines the APT and the CAPM, though the results of this research do not empirically show that the additional market factor significantly improves the cost of equity capital estimation.

The MSE and U² methods of evaluating forecast quality found the APT OLS and the Wei OLS models to be the best estimators of the cost of equity capital for the insurance companies during the 5-year period tested, but the tests of conditional efficiency suggested that a weighted combination of
either the APT OLS or the Wei OLS with the CAPM may provide forecasts that are better than each individual forecast. Overall, the results show that the APT and the Wei model, which is basically a modified APT, perform better than the CAPM in estimating the cost of equity capital for the insurance companies in our sample. The methods presented here are straightforward in their implementation and application, but additional techniques for adjusting for problems such as errors in variables should be explored.⁸ The insurance industry is very much interested in applying asset pricing models to estimate the cost of equity capital for project decision-making and regulation purposes. These results show that the APT and the Wei model, and perhaps a combination of either with the CAPM, can provide more reliable estimates of the cost of equity capital than the techniques currently being used.

Acknowledgements This is the first essay in the dissertation of the first author at the University of Pennsylvania, who is most grateful to the dissertation committee members: J. David Cummins (Chair), Franklin Allen, Randy Beatty, Neil Doherty, and Donald Morrison for their guidance and support. In addition, the author would like to thank seminar participants at the University of Pennsylvania, Indiana University, Northern Illinois University, University of Georgia, University of Illinois at Urbana-Champaign, and Syracuse University for their comments and suggestions. As always, any errors are the sole responsibility of the authors.
References

Bates, J. M. and C. W. Granger. 1969. "The combination of forecasts." Operational Research Quarterly 451–468.
Black, F., M. C. Jensen, and M. Scholes. 1972. "The capital asset pricing model: some empirical tests," in Studies in the theory of capital markets, M. C. Jensen (Ed.). Praeger, New York.
Bower, D. H., R. S. Bower, and D. E. Logue. 1984. "Arbitrage pricing theory and utility stock returns." Journal of Finance 39, 1041–1054.
Bubnys, E. L. 1990. "Simulating and forecasting utility stock returns: arbitrage pricing theory vs. capital asset pricing model." The Financial Review 29, 1–23.
Campbell, J. Y., A. W. Lo, and A. C. Mackinlay. 1997. The econometrics of financial markets, Princeton University Press, Princeton.
Chen, N.-F. 1983. "Some empirical tests of the theory of arbitrage pricing." Journal of Finance 38, 1393–1414.
Churchill, G. A. Jr. 1983. Marketing research: methodological foundations, 3rd edn. CBS College Publishing, New York.
Cummins, J. D. 1990b. "Multi-period discounted cash flow rate-making models in property-liability insurance." Journal of Risk and Insurance 57(1), 79–109.
Cummins, J. D. 1992. "Financial pricing of property-liability insurance," in Contributions to insurance economics, G. Dionne (Ed.). Kluwer Academic, Boston, 141–168.
Cummins, J. D. and S. E. Harrington. 1987. Fair rate of return in property-liability insurance, Kluwer Academic, Norwell, MA.
⁸ Kim (1995) uses Judge et al.'s (1980, pp. 521–531) errors-in-variables method to estimate the CAPM as defined in Equation (99.4). Kim's method can be used to estimate the cost of capital in terms of the CAPM.
Cummins, J. D. and D. J. Nye. 1972. "The cost of capital of insurance companies: comment." Journal of Risk and Insurance 39(3), 487–491.
D'Arcy, S. P. and N. A. Doherty. 1988. Financial theory of insurance pricing, S.S. Huebner Foundation, Philadelphia.
Davidson, R. and J. G. Mackinnon. 1981. "Several tests for model specification in the presence of alternative hypotheses." Econometrica 49(3), 781–793.
Elton, E. J. and M. J. Gruber. 1994. "Introduction." Financial Markets, Institutions and Instruments. Estimating Cost of Capital: Methods and Practice 3(3), 1–5.
Elton, E. J., M. J. Gruber, and J. Mei. 1994. "Cost of capital using arbitrage pricing theory: a case study of nine New York utilities." Financial Markets, Institutions and Instruments. Estimating Cost of Capital: Methods and Practice 3(3), 46–73.
Fairley, W. B. 1979. "Investment income and profit margins in property-liability insurance: theory and empirical results." Bell Journal of Economics 10(1), 191–210.
Fama, E. F. and K. R. French. 1992. "The cross-section of expected stock returns." Journal of Finance 47(2), 427–465.
Forbes, S. W. 1971. "Rates of return in the nonlife insurance industry." Journal of Risk and Insurance 38, 409–422.
Forbes, S. W. 1972. "The cost of capital of insurance companies: comment." Journal of Risk and Insurance 39(3), 491–492.
Gordon, M. J. 1974a. The cost of capital to a public utility, MSU Public Utilities Studies, East Lansing, MI.
Granger, C. W. J. and P. Newbold. 1973. "Some comments on the evaluation of economic forecasts." Applied Economics 5, 35–47.
Granger, C. W. J. and P. Newbold. 1977. Forecasting economic time series, Academic, New York.
Greene, W. H. 1997. Econometric analysis, 3rd edn. Macmillan, New York.
Haugen, R. A. and C. O. Kroncke. 1971. "Rate regulation and the cost of capital in the insurance industry." Journal of Financial and Quantitative Analysis 6(5), 1283–1305.
Hill, R. D. 1979. "Profit regulation in property-liability insurance." Bell Journal of Economics 10(1), 172–191.
Hill, R. D. and F. Modigliani. 1987. "The Massachusetts model of profit regulation in nonlife insurance: an appraisal and extension," in Fair rate of return in property-liability insurance, J. D. Cummins and S. E. Harrington (Eds.). Kluwer Academic Publishers, Norwell, MA.
Ibbotson Associates. 1994. Stocks, bonds, bills and inflation 1994 yearbook, Ibbotson Associates, Inc., Chicago.
Jagannathan, R. and Z. Y. Wang. 1996. "The conditional CAPM and the cross-section of expected returns." Journal of Finance 51(1), 3–53.
Judge, G., R. Hill, W. Griffiths, H. Lutkepohl, and T. C. Lee. 1980. The theory and practice of econometrics, John Wiley, New York.
Kim, D. 1995. "The errors in the variables problem in the cross-section of expected stock returns." Journal of Finance 50(5), 1605–1634.
Kolbe, A. L., J. A. Read, and G. R. Hall. 1984. The cost of capital: estimating the rate of return for public utilities, MIT Press, Cambridge, MA.
Kraus, A. and S. A. Ross. 1982. "The determination of fair profits for the property-liability insurance firm." Journal of Finance 37(4), 1015–1026.
Launie, J. J. 1971. "The cost of capital of insurance companies." Journal of Risk and Insurance 38(2), 263–268.
Launie, J. J. 1972. "The cost of capital of insurance companies: reply." Journal of Risk and Insurance 39(3), 292–495.
Lee, C. F. and S. W. Forbes. 1980. "Dividend policy, equity value, and cost of capital estimates for the property and liability insurance industry." Journal of Risk and Insurance 47, 205–222.
Lee, C. F., P. Newbold, J. E. Finnerty, and C.-C. Chu. 1986. "On accounting-based, market-based and composite-based beta predictions: method and implications." Financial Review 21(1), 51–68.
Litzenberger, R. H. and K. Ramaswamy. 1979. "The effects of personal taxes and dividends on capital asset prices: theory and empirical evidence." Journal of Financial Economics 7, 163–196.
Mincer, J. and V. Zarnowitz. 1969. "The evaluation of economic forecasts," in Economic forecasts and expectations, J. Mincer (Ed.). National Bureau of Economic Research, New York.
Morrison, D. F. 1990. Multivariate statistical methods, 3rd edn. McGraw-Hill, Inc., New York.
Myers, S. C. and L. S. Borucki. 1994. "Discounted cash flow estimates of the cost of equity capital – a case study." Financial Markets, Institutions and Instruments. Estimating Cost of Capital: Methods and Practice 3(3), 9–45.
Myers, S. C. and R. A. Cohn. 1987. "A discounted cash flow approach to property-liability insurance rate regulation," in Fair rate of return in property-liability insurance, J. D. Cummins and S. E. Harrington (Eds.). Kluwer Academic Publishers, Norwell, MA.
Quirin, G. D. and W. R. Waters. 1975. "Market efficiency and the cost of capital: the strange case of fire and casualty insurance companies." Journal of Finance 30(2), 427–224.
Roll, R. and S. A. Ross. 1980. "An empirical investigation of arbitrage pricing theory." Journal of Finance 35, 1073–1103.
Ross, S. A. 1976. "Arbitrage theory of capital asset pricing." Journal of Economic Theory 4, 341–360.
Ross, S. A. 1977. "Return, risk, and arbitrage," in Risk and return in finance, vol. 1, Ballinger, Cambridge, MA, 187–208.
Schink, G. R. and R. S. Bower. 1994. "Application of the Fama-French model to utility stocks." Financial Markets, Institutions and Instruments. Estimating Cost of Capital: Methods and Practice 3(3), 74–96.
Scholes, M. and J. Williams. 1977. "Estimating beta from nonsynchronous trading." Journal of Financial Economics 5, 309–327.
Theil, H. 1958. Economic forecasts and policy, North-Holland, Amsterdam.
Theil, H. 1966. Applied economic forecasting, North-Holland, Amsterdam.
Wei, K. C. J. 1988. "An asset-pricing theory unifying the CAPM and APT." Journal of Finance 43(4), 881–892.
Chapter 100
Implementing a Multifactor Term Structure Model Ren-Raw Chen and Louis O. Scott
Abstract In this paper, we describe how to implement a multifactor Cox–Ingersoll–Ross (CIR) model for the term structure of interest rates and its derivatives. We demonstrate how to calibrate the model to U.S. Treasuries and to options on Treasury bond futures.

Keywords Term structure of interest rates · Cox–Ingersoll–Ross model
100.1 Introduction
The early term structure models of Vasicek (1977) and Cox et al. (1985) are easy to use because they produce simple closed form solutions for bond prices, but they rely on only one factor to determine changes in the term structure of interest rates. By contrast, most studies of interest rate variability find that several factors are necessary to explain the variation of the term structure. Long- and short-term rates move together, but the correlations are not very high, and the slope and curvature of the term structure vary. This problem, as well as the need to fit a term structure model to the current term structure, has led to the development of many new models. One approach has been to retain the single factor structure and allow the fixed parameters to be time dependent.¹ Another approach has been to allow for multiple factors in the determination of the instantaneous interest rate.² In this article, we follow the second approach, and with the help of two mathematical tools, we show that a multifactor version of the CIR model is really quite easy to implement. The model produces simple closed form solutions for bond prices, and we show how to apply Fourier inversion methods to get fast solutions for European options, which include several popular interest rate options. In the first section we present the setup of the multifactor model, and we show how the problem of estimating parameters and random state variables can be handled in a Kalman filter model. In the second section, we show how to use the Fourier inversion formula to calculate the probability functions in the solutions for European options. In the last section, we present an application in which we calibrate the model to match the term structure in the U.S. Treasury market, and we make some small adjustments in the volatility parameters so that the model matches market prices for selected interest rate options.
100.2 A Multifactor Term Structure Model
R.-R. Chen () Graduate School of Business Administration, Fordham University, New York, NY 10019, USA e-mail: [email protected]
Chen is professor at the Department of Finance, Fordham University, New York, and Scott is a managing director in the market risk department at Morgan Stanley, New York.
¹ See Ho and Lee (1986), Black and Karasinski (1991), Hull and White (1993), and Babbs (1993).
² See Chen and Scott (1992) and Longstaff and Schwartz (1992).
L.O. Scott Fordham University, USA
The multifactor CIR model follows from a suggestion in Cox et al. (1985), in which the instantaneous risk-free rate is the sum of independent state variables:

$$r = \sum_{j=1}^{J} y_j \qquad (100.1)$$
Each state variable, y_j, is assumed to follow a square root process, and the risk-neutralized process for valuation is

$$dy_j = (\kappa_j\theta_j - \kappa_j y_j - \lambda_j y_j)\,dt + \sigma_j\sqrt{y_j}\,dz_j \qquad (100.2)$$
The instantaneous interest rate in this model is always nonnegative, and there is a closed form solution for the bond pricing function. The price at time t for a discount bond that pays one currency unit, say $1, at time s is

$$P(t,s) = \hat{E}\left[\exp\left(-\int_t^s r(u)\,du\right)\right] = A_1(t,s)\cdots A_J(t,s)\exp\{-B_1(t,s)y_1-\cdots-B_J(t,s)y_J\} \qquad (100.3)$$
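As a concrete illustration of Equation (100.3), here is a minimal Python sketch (our own, not from the chapter). The A_j and B_j expressions are the standard CIR (1985) closed forms, which the text references but does not reproduce; the parameter values in the usage example are hypothetical, loosely inspired by Table 100.1 with λ_j set to zero for illustration.

```python
import numpy as np

def cir_ab(tau, kappa, theta, sigma, lam):
    """CIR (1985) A and B functions for one square-root factor.
    kappa, theta, sigma are the process parameters; lam is the factor
    risk premium, so kappa + lam is the risk-neutral mean reversion."""
    gamma = np.sqrt((kappa + lam) ** 2 + 2.0 * sigma ** 2)
    denom = (gamma + kappa + lam) * (np.exp(gamma * tau) - 1.0) + 2.0 * gamma
    B = 2.0 * (np.exp(gamma * tau) - 1.0) / denom
    A = (2.0 * gamma * np.exp((kappa + lam + gamma) * tau / 2.0) / denom) ** (
        2.0 * kappa * theta / sigma ** 2)
    return A, B

def bond_price(tau, y, params):
    """Multifactor CIR discount bond price, Equation (100.3).
    y is the vector of state variables; params is a list of
    (kappa, theta, sigma, lam) tuples, one per factor."""
    price = 1.0
    for yj, (kappa, theta, sigma, lam) in zip(y, params):
        A, B = cir_ab(tau, kappa, theta, sigma, lam)
        price *= A * np.exp(-B * yj)
    return price

# Hypothetical three-factor example (illustrative values only)
params = [(1.18, 0.053, 0.16, 0.0),
          (0.051, 0.001, 0.105, 0.0),
          (0.048, 0.002, 0.05, 0.0)]
y = [0.018, 0.011, 0.012]
P = bond_price(5.0, y, params)
print(P, -np.log(P) / 5.0)  # price and continuously compounded 5-year yield
```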
where A_j(t,s) and B_j(t,s) are deterministic functions of time as defined in CIR (1985). The expectation, Ê[·], is the conditional expectation evaluated with the risk-neutralized processes. The continuously compounded yield for a discount bond is computed as R(t,s) = −ln P(t,s)/(s − t).

A major obstacle in the implementation of multifactor models has been the need to generate values for the fixed parameters in the interest rate processes. These parameters should be set to reflect potential interest rate variability, and a natural starting point is recent historical experience. The solutions for the bond rates, which are linear functions of the state variables, can be used to estimate the fixed parameters in the model. As noted in CIR, each state variable, or factor, has a conditional distribution that is noncentral chi-squared. The expectation of y_j(s), conditional on knowing y_j(t), is

$$\theta_j\left[1-\exp(-\kappa_j(s-t))\right] + \exp(-\kappa_j(s-t))\,y_j(t) \qquad (100.4)$$

and the conditional variance is

$$\frac{\sigma_j^2}{\kappa_j}\left[1-\exp(-\kappa_j(s-t))\right]\left\{\frac{\theta_j}{2}\left[1-\exp(-\kappa_j(s-t))\right] + \exp(-\kappa_j(s-t))\,y_j(t)\right\} \qquad (100.5)$$

If we consider the bond rates and the state variables taken at discrete points in time and we add measurement errors to the bond rates, we have a state space model:

$$y_j(t) = \theta_j\left[1-\exp(-\kappa_j\,\Delta t)\right] + \exp(-\kappa_j\,\Delta t)\,y_j(t-1) + \eta_j(t)$$
$$R(t,s_m) = \sum_{j=1}^{J}\left[-\frac{\ln A_j(t,s_m)}{s_m-t} + \frac{B_j(t,s_m)}{s_m-t}\,y_j(t)\right] + \varepsilon_m(t) \qquad (100.6)$$

where Δt is the size of the time interval between the discrete observations. The innovation for each state variable, η_j(t), has a conditional mean of zero and a conditional variance equal to the conditional variance for the state variable given in Equation (100.5). A Kalman filter can be used to compute estimates of the unobservable state variables at each time period, and quasi maximum likelihood can be used to estimate the fixed parameters.³ This statistical technique is useful for extracting estimates from historical data, but an analyst who is pricing interest rate derivatives needs a model that
describes the future potential variation. It will of course be necessary to adjust these parameter values to reflect changes in term structure variability. In the calibration section below, we present an example with an actual term structure, and we make small adjustments to some of the parameter values. Factor analysis has become a popular tool for analyzing term structure variability, and this Kalman filter model is a form of dynamic factor analysis that incorporates the structure of the multifactor CIR model. To implement the model, we need to set the number of factors. In previous factor analysis studies of the term structure, Litterman and Scheinkman (1991) found that three principal components explain most of the variation in bond returns, but Wilson (1994) has argued that a two factor model may have some advantages. We have used this Kalman filter model to estimate one, two, and three factor models from a weekly dataset of bond rates for the period 1980–1988. We reproduce here the results for the three factor model because it is more flexible in capturing the important characteristics of term structure variability. Table 100.1 contains the estimates of the parameter combinations that are relevant for pricing derivatives in the three factor model. The factor estimates computed from the Kalman filter are related to several familiar aspects of the term structure. Figures 100.1–100.3 contain graphs of these factors with term structure variables. The first factor in the three factor model is highly correlated with the slope of the term structure; the factor moves up and down with the difference between the short- and the long-term rate.⁴ This factor has strong mean reversion, which is also a characteristic of the slope of the term structure. The second factor in the three factor model is highly correlated with medium- and long-term rates; Fig. 100.2 is a graph of the second factor with the 5-year bond rate.⁵ The third factor, which displays less
Table 100.1 Parameter estimates for the three factor model. Estimates from weekly data, 1980–1988, U.S. Treasury bond market, sample size = 470

Parameter   State variable 1     State variable 2          State variable 3
κ + λ       1.1830 (0.07273)     0.05105 (0.01336)         0.04790 (0.01116)
κθ          0.06253 (0.004635)   0.00004286 (0.00057105)   0.0001126 (0.00008571)
σ           0.16049 (0.01047)    0.10539 (0.006794)        0.049599 (0.003148)

Note: The numbers in parentheses are standard errors.
³ See Chen and Scott (1993, 1994a).
⁴ This variable in the graphs, the short rate minus the long rate, is the negative of the slope of the term structure.
⁵ Although it is not shown in the graph, the 5-year bond rate and the long-term rate tend to move together.
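A minimal sketch of one filtering step for the state-space model in Equations (100.4)–(100.6) follows (our own illustration; the function and variable names are ours). Following common practice in quasi maximum likelihood estimation, the state-dependent innovation variance of Equation (100.5) is approximated by evaluating it at the previous filtered state.

```python
import numpy as np

def kalman_step(y_f, P_f, R_obs, kappa, theta, sigma, dt, H, d, eps_var):
    """One prediction/update step of the Kalman filter for the CIR
    state-space model, Equations (100.4)-(100.6).

    y_f, P_f : filtered factor vector (J,) and covariance (J, J) at t-1
    R_obs    : observed bond rates at t (M,)
    kappa, theta, sigma : physical-measure CIR parameters (J,)
    H        : (M, J) loadings B_j(t, s_m)/(s_m - t)
    d        : (M,) intercepts -sum_j ln A_j(t, s_m)/(s_m - t)
    eps_var  : (M,) measurement-error variances
    """
    phi = np.exp(-kappa * dt)
    # Transition mean, Equation (100.4)
    y_pred = theta * (1.0 - phi) + phi * y_f
    # Innovation variance, Equation (100.5), at the filtered state
    q = (sigma ** 2 / kappa) * (1.0 - phi) * (
        0.5 * theta * (1.0 - phi) + phi * y_f)
    P_pred = np.diag(phi) @ P_f @ np.diag(phi) + np.diag(q)
    # Measurement update for R_t = d + H y_t + eps_t
    nu = R_obs - (d + H @ y_pred)             # prediction error
    S = H @ P_pred @ H.T + np.diag(eps_var)   # prediction error covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    y_new = y_pred + K @ nu
    P_new = (np.eye(len(y_f)) - K @ H) @ P_pred
    # Gaussian quasi log likelihood contribution of this observation
    ll = -0.5 * (np.log(np.linalg.det(S)) + nu @ np.linalg.inv(S) @ nu
                 + len(nu) * np.log(2.0 * np.pi))
    return y_new, P_new, ll
```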
[Fig. 100.1 Y(1) in the three factor model. Panel title: "Y(1) in the Three Factor Model"; series plotted: Y(1) and the slope variable (short rate minus long rate), weekly, 01/1980–09/1988]

[Fig. 100.2 Y(2) in the three factor model. Panel title: "Y(2) in the Three Factor Model"; series plotted: Y(2) and the 5-year bond rate, weekly, 01/1980–09/1988]
variability, is correlated with the difference between the long-term rate and the 5-year rate.⁶ The second and third factors together determine medium- and long-term rates, but the third factor has a stronger effect on longer-term rates. The volatility is higher for the second factor, and the interaction of these two factors determines both the volatility and the curvature of the term structure. The second and third factors also display extremely slow mean reversion, which is consistent with the random walk characteristics of medium- and long-term bond rates.
⁶ The correlation coefficient between changes in these two variables is 0.60.
⁷ See Chen and Scott (1992) and Chen and Scott (1994b).

100.3 Pricing Options in the Multifactor Model

We have also developed fast, closed form solutions for a variety of European style options in these multifactor models.⁷
[Fig. 100.3 Y(3) in the three factor model. Panel title: "Y(3) in the Three Factor Model"; series plotted: Y(3) and the spread between the long-term rate and the 5-year rate, weekly, 01/1980–07/1988]
Here, we reproduce the solution for a European call option on a discount bond with a strike price K:

$$C(y,t;T,s) = P(t,s)\,\Pr\!\left[\sum_{j=1}^{J} B_j(T,s)\,y_j(T) \le Z;\ \nu,\delta^*\right] - P(t,T)\,K\,\Pr\!\left[\sum_{j=1}^{J} B_j(T,s)\,y_j(T) \le Z;\ \nu,\delta\right] \qquad (100.7)$$

where the option expires at time T and the bond matures at time s, y is a vector containing the state variables, $Z = -\ln K + \sum_{j=1}^{J}\ln A_j(T,s)$, and the elements of the vectors ν, δ, and δ* are defined as follows for j = 1, ..., J:

$$\nu_j = \frac{4\kappa_j\theta_j}{\sigma_j^2},\qquad \delta_j = \frac{2\varphi_j^2\,y_j(t)\exp\{\gamma_j(T-t)\}}{\psi_j+\varphi_j},\qquad \delta_j^* = \frac{2\varphi_j^2\,y_j(t)\exp\{\gamma_j(T-t)\}}{\psi_j+\varphi_j+B_j(T,s)} \qquad (100.8)$$

with

$$\psi_j = \frac{\kappa_j+\lambda_j+\gamma_j}{\sigma_j^2} \qquad\text{and}\qquad \varphi_j = \frac{2\gamma_j}{\sigma_j^2\left(\exp\{\gamma_j(T-t)\}-1\right)}$$

A useful tool for calculating these probability functions is the Fourier inversion formula:

$$\Pr(X \le x) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin ux}{u}\,\Phi(u)\,du \qquad (100.9)$$

where Φ is the characteristic function defined as $\Phi(u) = E[\exp(iuX)]$ and $i^2 = -1$. The characteristic functions for the two probability functions above are:

$$\Phi(u) = \prod_{j=1}^{J}\left(1-\frac{B_j(T,s)\,iu}{\psi_j+\varphi_j}\right)^{-\nu_j/2}\exp\left(\frac{\tfrac{1}{2}B_j(T,s)\,\delta_j\,iu}{\psi_j+\varphi_j-B_j(T,s)\,iu}\right)$$

and

$$\Phi^*(u) = \prod_{j=1}^{J}\left(1-\frac{B_j(T,s)\,iu}{\psi_j+\varphi_j+B_j(T,s)}\right)^{-\nu_j/2}\exp\left(\frac{\tfrac{1}{2}B_j(T,s)\,\delta_j^*\,iu}{\psi_j+\varphi_j+B_j(T,s)(1-iu)}\right) \qquad (100.10)$$

The solution for the European call option can then be rewritten as

$$C(y,t;T,s) = P(t,s)\left[\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin uZ}{u}\,\Phi^*(u)\,du\right] - P(t,T)\,K\left[\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin uZ}{u}\,\Phi(u)\,du\right] \qquad (100.11)$$

⁸ For the details, see Chen and Scott (1994b).

These Fourier integrals can be calculated to any desired level of accuracy very quickly.⁸ For example, we are able to calculate option prices in the three factor model in a fraction of a
second on a personal computer, with the approximation errors for the probability functions set at 10⁻⁹. European put options can be evaluated by applying the put-call parity relationship, and the model can be used to value interest rate caps on Treasury rates. We have also developed a model for valuing options and interest rate caps on LIBOR. The model is augmented by assuming that LIBOR is a function of the corresponding Treasury rate plus a variable that accounts for the spread between LIBOR and Treasury rates. The spread variable is also modeled as a square root process, and the model generates closed form solutions for LIBOR forward rates and futures, as well as European options on LIBOR.
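To illustrate how quickly Equation (100.11) can be evaluated, here is a minimal numerical sketch (our own illustration, not the authors' algorithm; see Chen and Scott (1994b) for their exact procedure). It assumes the characteristic functions as reconstructed in Equation (100.10), exploits the fact that the integrand is even in u so the integral can be taken over the positive half-line using the real part of Φ, and uses an ad hoc truncation limit and grid.

```python
import numpy as np

def char_fn(u, B, nu, delta, psi, phi, shift):
    """Characteristic function of sum_j B_j y_j(T), Equation (100.10).
    shift = 0 gives Phi(u); shift = B gives the Phi*(u) version."""
    val = np.ones_like(u, dtype=complex)
    for Bj, nj, dj, pj, fj, sj in zip(B, nu, delta, psi, phi, shift):
        base = pj + fj + sj
        val *= (1.0 - 1j * u * Bj / base) ** (-nj / 2.0) * \
               np.exp(0.5 * Bj * dj * 1j * u / (base - 1j * u * Bj))
    return val

def prob(Z, B, nu, delta, psi, phi, shift, umax=200.0, n=200001):
    """Pr(sum_j B_j y_j(T) <= Z) via the inversion formula (100.9)."""
    u = np.linspace(1e-10, umax, n)
    g = np.sin(u * Z) / u * char_fn(u, B, nu, delta, psi, phi, shift).real
    return (2.0 / np.pi) * np.trapz(g, u)  # even integrand: 2x half-line

def call_price(P_ts, P_tT, K, Z, B, nu, delta_star, delta, psi, phi):
    """European call on a discount bond, Equation (100.11)."""
    B = np.asarray(B, dtype=float)
    p1 = prob(Z, B, nu, delta_star, psi, phi, shift=B)             # Phi*
    p2 = prob(Z, B, nu, delta, psi, phi, shift=np.zeros_like(B))   # Phi
    return P_ts * p1 - P_tT * K * p2
```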
100.4 Calibrating a Multifactor Model

It is important that a model be able to match the current term structure when valuing interest rate derivatives. When the estimates of the fixed parameters and the estimates of the factors computed from the Kalman filter are used, the multifactor models match the yield curves at the short end and the long end, but there are misses on the intermediate points. With minor adjustments in the fixed parameter values, specifically the κ + λ parameters, we are able to bend the yield curves generated by the models so that they match the current term structure. As the parameter values are adjusted, we recompute the values for the factors so as to minimize the differences between the actual interest rates and the rates generated by the models. One can also make additional adjustments to the volatility parameters, the σ's, so that model option prices match those of selected interest rate options. Effectively, one can set the model parameters so that the model generates a desired set of implied volatilities. To demonstrate this flexibility of the models, we have calibrated the three factor model to the U.S. Treasury term structure and to prices for at-the-money options on Treasury bond futures on May 10, 1994.⁹ The continuously compounded yields for six different discount bonds were taken from quotes in the U.S. Treasury Strip market; the maturities included 3 months, 6 months, 1 year, 5 years, 10 years, and 20 years. The rates and the yields calculated from the model are given at the top of Table 100.2. The volatility parameters were adjusted so that the model also matched the prices of at-the-money call options on the September Treasury bond futures.¹⁰ We started with the parameter estimates

⁹ We have also calibrated the two factor term structure model to the same term structure, and the results are similar.
¹⁰ There is a delivery option in U.S. Treasury bond futures, so the futures price is almost always less than the theoretical forward price computed from the cheapest-to-deliver bond.
Table 100.2 Calibration for the three factor model, May 10, 1994

Treasury term structure (zero-coupon bonds):
Maturity    Actual yield   Yield from 3 factor model
3 months    4.31%          4.51%
6 months    4.87           4.87
1 year      5.42           5.42
5 years     6.87           6.87
10 years    7.37           7.38
20 years    7.71           7.71

Prices for call options on September Treasury bond futures:
Strike price   Market price   Price from 3 factor model
102            2.609375       2.602
104            1.6875         1.697
106            1.03125        1.031

Eurodollar futures rates:
Futures delivery   Market rate   Futures rate from 3 factor model
Jun 94             5.03%         5.03%
Sep 94             5.77          5.72
Dec 94             6.22          6.19
Mar 95             6.55          6.47
Jun 95             6.79          6.74
Sep 95             6.96          6.96
Dec 95             7.10          7.15
Mar 96             7.20          7.19
Jun 96             7.29          7.30
in Table 100.1 and made the following adjustments for the three factor model: κ1 + λ1 = 1.3198, κ2 + λ2 = 0.07146, σ2 = 0.08, κ3 + λ3 = 0.04790, and σ3 = 0.045. The values for y1, y2, and y3 were set at 0.018174, 0.01066, and 0.01173, respectively. The model was set to match all of the yields except the rate on 3-month T-bills. We have found that there is a persistent error in the 3-month T-bill rate, which seems to be unrelated to the factors that determine the rest of the term structure. We suspect that this error, typically plus or minus 20–40 basis points in the annualized rate, is due to activity in the banking system and the management of short-term funds. This small effect is also built into our models for 3-month LIBOR, which depends on the 3-month Treasury rate. We have applied the model for futures on U.S. LIBOR (3 months) to price Eurodollar futures, and we are able to adjust the parameters in the process for the spread variable so that the model rates for Eurodollar futures are close to the actual futures rates. The market rates and the model rates are given at the bottom of Table 100.2. The multi-factor models can be calibrated so that they match a given term structure and match selected option prices, or implied volatilities. To value the more complex exotic interest rate options, one must use a numerical method
such as Monte Carlo simulation or a finite difference method. The multi-factor CIR models do have an advantage in the pricing of complex options because of the closed form solutions. In a Monte Carlo simulation, the analyst needs to simulate the model for the life of the derivative only; additional simulations to determine bond prices or interest rates are not necessary. In a finite difference algorithm, the grid is built for the life of the derivative; additional grid points to compute bond prices or interest rates are not necessary. To hedge an interest rate option with the multi-factor model, one must compute the partial derivative of the price with respect to each state variable and design strategies for hedging these potential changes in the term structure. The hedging strategies would, of course, require additional hedging instruments. In pricing this futures option, we adjust the futures price in the model to match the price on the underlying futures contract. We have also ignored the early exercise premium on this American style option because we find that it is very small for the at-the-money options.
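As a sketch of the curve-fitting step described in the calibration above (re-solving for the factor values as the fixed parameters are adjusted), note that Equation (100.6) makes the model yields linear in the factors, so the factors that best match an observed zero curve solve an ordinary least squares problem. This illustration is our own and reuses the cir_ab() helper defined in the bond-pricing sketch earlier in this chapter.

```python
import numpy as np

def fit_factors(taus, yields, params):
    """Given fixed CIR parameters, solve for the factor values y_j that
    best reproduce an observed zero-coupon yield curve (least squares).
    params is a list of (kappa, theta, sigma, lam) tuples, one per factor;
    cir_ab() is the CIR A/B helper from the earlier sketch."""
    H = np.zeros((len(taus), len(params)))  # loadings B_j(tau)/tau
    d = np.zeros(len(taus))                 # intercepts -ln A_j(tau)/tau
    for m, tau in enumerate(taus):
        for j, (kappa, theta, sigma, lam) in enumerate(params):
            A, B = cir_ab(tau, kappa, theta, sigma, lam)
            d[m] += -np.log(A) / tau
            H[m, j] = B / tau
    y, *_ = np.linalg.lstsq(H, np.asarray(yields, dtype=float) - d, rcond=None)
    return y
```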
100.5 Conclusion

In this paper, we describe how to implement a multi-factor Cox–Ingersoll–Ross model for the term structure of interest rates and its derivatives. The calibration results of a three-factor model to the U.S. Treasury curve are satisfactory: the average pricing error for the derivatives is less than one-third of a percent.
References

Babbs, S. 1993. "Generalized Vasicek models of the term structure," in Applied stochastic models and data analysis: proceedings of the 6th international symposium, J. Janssen and C. H. Skiadas (Eds.). World Scientific, Singapore.
Black, F. and P. Karasinski. 1991. "Bond and option pricing when short rates are lognormal." Financial Analysts Journal 47, 52–59.
Chen, R. and L. Scott. 1992. "Pricing interest rate options in a two-factor Cox–Ingersoll–Ross model of the term structure." Review of Financial Studies 5, 613–636.
Chen, R. and L. Scott. 1993. "Maximum likelihood estimation for a multi-factor equilibrium model of the term structure of interest rates." Journal of Fixed Income 3, 14–31.
Chen, R. and L. Scott. 1994a. "Multi-factor Cox–Ingersoll–Ross models of the term structure of interest rates: estimates and tests using a Kalman filter," Working paper, Rutgers University and University of Georgia.
Chen, R. and L. Scott. 1994b. "Interest rate options in multi-factor CIR models of the term structure," Working paper, Rutgers University and University of Georgia.
Cox, J., J. Ingersoll, and S. Ross. 1985. "A theory of the term structure of interest rates." Econometrica 53, 385–408.
Ho, T. and S. Lee. 1986. "Term structure movements and pricing interest rate contingent claims." Journal of Finance 41, 1011–1029.
Hull, J. and A. White. 1993. "One-factor interest rate models and the valuation of interest rate derivative securities." Journal of Financial and Quantitative Analysis 28, 235–254.
Litterman, R. and J. Scheinkman. 1991. "Common factors affecting bond returns." Journal of Fixed Income 1, 54–61.
Longstaff, F. and E. Schwartz. 1992. "Interest rate volatility and the term structure: a two factor general equilibrium model." Journal of Finance 47, 1259–1282.
Vasicek, O. 1977. "An equilibrium characterization of the term structure." Journal of Financial Economics 5, 177–188.
Wilson, T. 1994. "Debunking the myths." Risk 4, 68–73.
Chapter 101
Taking Positive Interest Rates Seriously* Enlin Pan and Liuren Wu
Abstract We present a dynamic term structure model in which interest rates of all maturities are bounded from below at zero. Positivity and continuity, combined with no arbitrage, result in only one functional form for the term structure with three sources of risk. We cast the model into a state-space form and extract the three sources of systematic risk from both the US Treasury yields and the US dollar swap rates. We analyze the different dynamic behaviors of the two markets during credit crises and liquidity squeezes.

Keywords Term structure · Consistency · Positivity · Quadratic forms · Forecasting · State-space estimation
101.1 Introduction

Many term structure models have been proposed during the last two decades, yet most of these models imply positive probabilities of negative interest rates. Other models guarantee interest rate positivity, but very often imply that interest rates at certain maturities cannot go below a certain positive number. Asserting that an interest rate can be negative or cannot be lower than, say, 3%, is equally counterintuitive for academics and troublesome for practitioners. In this paper, we propose a dynamic term structure model where interest rates of all maturities are bounded below at exactly zero. We guarantee this bounded behavior by assuming that continuously compounded interest rates of all maturities take a quadratic function of the state vector. Such a reasonable and seemingly innocuous contention, together with the assumption of continuity and no arbitrage, generates several striking results. First, the term structure of interest rates collapses to one functional form, determined by the solution to a scalar Riccati equation. Second, the term
E. Pan and L. Wu () Baruch College, City University of New York, One Bernard Baruch Way, Box B10-225, New York, NY 10010, USA e-mail: [email protected]
structure is governed by exactly three sources of risk, only one of which is dynamic. This dynamic risk factor follows a special two-parameter square-root process under the risk-neutral measure, and the two parameters of the process determine the other two sources of risk. The model has no extra parameters in addition to these three risk sources. The most surprising result is the collapse of dimensionality. We obtain the three sources of risk without any a priori assumption on the exact dimensionality of the state space. The single dynamic factor controls the level of the interest rate curve. The two parameters control the slope and curvature of the yield curve. Although the two parameters can be time varying, their dynamics do not affect the pricing of the interest rates. Therefore, we regard them as static factors. Despite its simplicity, our model captures the observed yield curve very well. In particular, the model captures nicely the well-documented hump shape in the term structure of forward rates. By a simple transformation, we can represent the whole term structure by the maximum forward rate, the maturity of the maximum forward rate, and the curvature of the forward rate curve at the maximum. We can also use the instantaneous interest rate level, the forward-rate curve slope, and its curvature at the short end as the three factors. Such transformations not only comply with the empirical findings and intuition, but also simplify the daily fitting of the forward-rate curve. To investigate the empirical performance of the model in fitting the term structure of interest rates, we calibrate the model to the weekly data of both US Treasury yields and US dollar swap rates over the 8 years from December 14, 1994 to December 28, 2000. The model fits both markets well. The pricing errors are mostly within a few basis points. The estimation also generates a time series of the three risk sources from both markets. The intuitive explanation of the three sources further enhances our understanding of the two interest rate markets. We find that although the average level spread between the swap rates and the Treasury
* This paper is reprinted from Advances in Quantitative Analysis of Finance and Accounting, 4 (2006), Chapter 14, pp. 327–356.
yields is small, the spread can become exceptionally large during credit events such as the late 1998 hedge fund crisis and during the Treasury liquidity squeeze in 2000. The paper is organized as follows. Section 101.2 describes the relevant literature that forms the background of our study. Section 101.3 elaborates on how the contention of interest rate positivity and continuity collapses the dimensionality of the state space to three. In Sect. 101.4, we analyze the properties and different representations of the three sources of risk. In Sect. 101.5, we fit the model to both the US Treasury yields and the US dollar swap rates. In Sect. 101.6, we explore the possibility of adding jumps to such a model while maintaining positive interest rates. Section 101.7 concludes.
101.2 Background

Many term structure models allow positive probabilities of negative interest rates. The inconsistency in terms of negative interest rates in these models is often excused on the grounds of "good" empirical performance and a "small probability" of negative interest rates. Although this is true in many cases, the values of some derivatives are extremely sensitive to the possibility of negative rates (Rogers 1996). For such derivatives, the prices inferred from these "negative" interest-rate models can be far off. The literature has taken three approaches to generating positive interest rates. The first approach specifies the instantaneous interest rate as a general quadratic function of some Gaussian state variable. Examples of quadratic term structure models include Ahn et al. (2002), Beaglehole and Tenney (1991, 1992), Brandt and Chapman (2002), Brandt and Yaron (2001), Constantinides (1992), El Karoui et al. (1992), Jamshidian (1996), Leippold and Wu (2002, 2003), Longstaff (1989), and Rogers (1997). This approach can guarantee the positivity of the instantaneous interest rate by one parametric restriction. However, the underlying dynamics very often imply that interest rates at some other maturities can either become negative or cannot go below a certain positive number. Asserting that an interest rate can be negative or cannot be lower than, say, 3%, is equally troublesome. For example, no rational traders are willing to offer free floors at any strictly positive level of interest rates. Our model is the most closely related to this approach; however, instead of assuming a quadratic form for only the instantaneous interest rate, we require that interest rates at all maturities are quadratic functions of a finite-dimensional state vector. We further constrain the functions to have no linear or constant terms so that all interest rates are bounded from below at exactly zero. The second approach derives positive interest rates based on the specifications of the pricing kernel. For example,
Flesaker and Hughston (1996) derive a condition on the discount bond price that guarantees positive interest rates. However, the rational log-normal model they derive from this condition has several issues: the short rate implied by the model is bounded both above and below, and the model remains arbitrage-free only up to a certain point (Babbs 1997). Jin and Glasserman (2001) show how the framework of Heath et al. (1992) is related to the positive rate framework of Flesaker and Hughston (1996). The third approach treats nominal interest rates as options and hence guarantees positive interest rates. Examples include Black (1995), Gorovoi and Linetsky (2003), and Rogers (1995). In addition, Goldstein and Keirstead (1997) generate positive interest rates by modeling them as processes with reflecting or absorbing boundaries at zero. What these models lack is analytical tractability. The collapse of dimensionality to three is consistent with the empirical findings of the factor analyses in Litterman and Scheinkman (1991), Knez et al. (1994), and Heidari and Wu (2003). The dimension of three has also become the consensus choice in recent empirical work on model design, e.g., Backus et al. (2001), Balduzzi et al. (1996), Chen and Scott (1993), Dai and Singleton (2000, 2002, 2003), and Duffee (2002). However, these three-factor models have 10–20 free parameters, and the estimates of many of these parameters show large standard errors. Therefore, in applying these models, we not only need to control and price the risk of the three state variables (factors), but we must also be concerned with the uncertainty and risk associated with the many parameter estimates. Recently, Longstaff et al. (2001) address the issue of overfitting in pricing American swaptions. In contrast, under our model, the three risk sources capture all that is uncertain: we have no other parameters to estimate and hence no other risks to bear. Furthermore, we find that the empirical performance of our model in fitting the term structure of US swap rates and Treasury yields is comparable to that of much more complicated models. The collapse of dimensionality is also observed in the geometric analysis of Pan (1998). In this paper, we link the collapse of dimensionality to the risk-neutral dynamics of the interest rates. To guarantee that all interest rates are bounded below by zero, we start with the assumption that all continuously compounded spot rates are quadratic forms of a finite-dimensional state vector. This setup belongs to the quadratic class of Leippold and Wu (2002). Nevertheless, the resulting term structure behaves as if all spot rates are proportional to one dynamic factor, which follows a special two-parameter square root process. Thus, the final model falls within the affine class of Duffie and Kan (1996). In a way, our model illustrates the inherent link between the affine class and the quadratic class of term structure models. With some transformation, we can define the three interest rate factors in terms of the level, the maturity, and the
curvature of the maximum forward rate. Thus, the model can naturally generate a hump-shaped term structure. Recent evidence supports such a hump shape. For example, Brown and Schaefer (2000) find that, in nearly 10 years of daily data on US Treasury STRIPS from 1985 to 1994, the implied 2-year forward rate spanning years 24–26 is lower than the forward rate spanning years 14–16 on 98.4% of occasions, with an average difference of 138 basis points. A similar downward tilt also appears in estimates of forward rates derived from the prices of coupon bonds in the US Treasury market and in the UK market for both real and nominal government bonds. Given the initially upward-sloping term structure in most observations, the downward slopes at the very long end imply a hump-shaped term structure for the forward rates.
101.3 The Model

We fix a filtered complete probability space {Ω, F, P, (F_t)_{0≤t≤T̄}} that satisfies the usual technical conditions,¹ with T̄ being some finite, fixed time. We assume that the uncertainty of the economy is governed by a finite-dimensional state vector u.

Assumption 1 (Diffusive State Vector). Under the probability space {Ω, F, P, (F_t)_{0≤t≤T̄}}, the state vector u is a d-dimensional Markov process in some state space D ⊂ R^d, solving the stochastic differential equation

du_t = μ(u_t) dt + Σ(u_t) dz_t,   (101.1)

where z_t is a vector Wiener process in R^d, μ(u_t) is a d × 1 vector defining the drift, and Σ(u_t) is a d × d matrix defining the diffusion of the process. We further assume that μ(u_t) and Σ(u_t) satisfy the usual regularity conditions such that the above stochastic differential equation admits a strong solution. For ease of notation, we assume for now that the process is time homogeneous.

For any time t ∈ [0, T̄] and time-of-maturity T ∈ [t, T̄], we assume that the market value at time t of a zero-coupon bond with maturity τ ≡ T − t is fully characterized by P(u_t, τ). We further assume that there exists a risk-neutral measure, or a martingale measure, P*, under which the bond price can be written as

P(u_t, τ) = E_t*[ exp( −∫_t^T r_s ds ) ],   (101.2)

where E_t*[·] denotes expectation under measure P* conditional on the filtration F_t. Under certain regularity conditions, the existence of such a measure is guaranteed by no-arbitrage, and the measure is unique when the market is complete (Duffie 1992). Let μ*(u_t) denote the drift function of u_t under measure P*. The diffusion function Σ(u_t) remains the same under the two measures by virtue of Girsanov's theorem.

The spot rate of maturity τ is defined as

y(u_t, τ) ≡ −(1/τ) ln P(u_t, τ),   (101.3)

and the instantaneous interest rate, or the short rate, r, is defined by continuity:

r_t ≡ lim_{τ↓0} −(1/τ) ln P(u_t, τ).   (101.4)

The instantaneous forward rate is defined as

f(u_t, τ) ≡ −∂ ln P(u_t, τ)/∂τ.   (101.5)

Assumption 2 (Positive Interest Rates). The spot rates, y, take the following quadratic form in the state vector u:

y(u_t, τ) = (1/τ) u_t^⊤ A(τ) u_t,   (101.6)

where A(τ) is a positive definite matrix, so that all spot rates are bounded from below at zero. As the asymmetric part of A makes zero contribution to the spot rate, we also assume that A is symmetric with no loss of generality.

In principle, interest-rate positivity can be guaranteed either through a quadratic form or through an exponential function. However, the exponential family is not consistent with any diffusion dynamics for the state vector (Björk and Christensen 1999; Filipović 1999, 2000). Furthermore, the history of interest rates across the world (witness Switzerland and, in recent times, Japan) shows that we must allow an interest rate of zero to be reachable. Zero is not reachable if interest rates are specified as exponential functions of the state variable, but it can be reached under our quadratic specification by letting the state vector u approach the vector of zeros. The fact that u can be small argues against the inclusion of linear terms, since a linear term would dominate when the state vector is small, thus potentially allowing negative interest rates.

¹ For technical details, see, for example, Jacod and Shiryaev (1987).
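As a concrete illustration of Assumption 2, the following sketch (ours, not part of the original chapter; the matrix A(τ) and all numerical values are hypothetical) shows that a positive definite quadratic form keeps every spot rate nonnegative, while letting all rates approach zero as the state vector u shrinks toward the zero vector:

```python
import numpy as np

# A hypothetical positive definite A(tau) for illustration: A(tau) = tau*(I + 0.1*tau*C),
# with C symmetric positive semidefinite, so y(u, tau) = u' A(tau) u / tau >= 0.
rng = np.random.default_rng(0)
d = 3
M = rng.standard_normal((d, d))
C = M @ M.T                          # symmetric positive semidefinite

def A(tau):
    return tau * (np.eye(d) + 0.1 * tau * C)

def spot_rate(u, tau):
    # Equation (101.6): y(u, tau) = u' A(tau) u / tau
    return u @ A(tau) @ u / tau

u = rng.standard_normal(d)
for scale in (1.0, 0.1, 0.01):
    print(scale, spot_rate(scale * u, 5.0))   # rates shrink toward zero with u
print(spot_rate(u, 5.0) >= 0.0)               # positivity for any state u
```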
Proposition 1 (Bond Pricing). Under the assumption of diffusive state dynamics in Equation (101.1) and the quadratic form in Equation (101.6) for interest rates, the term structure of zero-coupon bonds is given by

P(r_t, τ) = exp(−c(τ) r_t),   (101.7)

where r_t is the instantaneous interest rate. The pricing is as if r_t followed a square-root process under the risk-neutral measure P*,

dr_t = −κ r_t dt + σ √r_t dw_t,   (101.8)

with κ ∈ R, σ ∈ R₊ being constant parameters and w_t being a newly defined scalar Wiener process. The maturity coefficient c(τ) is determined by the following Riccati equation:

c′(τ) = 1 − κ c(τ) − ½ σ² c(τ)²,   (101.9)

with the boundary condition c(0) = 0.

Although we start with a d-dimensional state vector, the dimension of the term structure collapses to one. The proof of the bond pricing formula follows a standard argument. We solve for the coefficient c(τ) by applying the Feynman–Kac formula and the principle of matching.

Proof. Applying the Feynman–Kac formula to the zero price function in Equation (101.2) yields

r(u) P(u, τ) = ∂P(u, τ)/∂t + L* P(u, τ),   (101.10)

where L* denotes the infinitesimal generator under the risk-neutral measure P* and is given by

L* P(u, τ) = (∂P/∂u)^⊤ μ*(u) + ½ tr[ (∂²P/∂u∂u^⊤) Σ(u)Σ(u)^⊤ ].

The quadratic specification for the spot rate in Equation (101.6) implies that the instantaneous interest rate also has a quadratic form:

r(u) = u^⊤ A′(0) u.   (101.11)

Plugging the quadratic specifications for the spot rate in Equation (101.6) and for the short rate in Equation (101.11) into Equation (101.10), we have

u^⊤ A′(0) u = u^⊤ A′(τ) u − 2u^⊤ A(τ) μ*(u) − tr[ A(τ) Σ(u)Σ(u)^⊤ ] + 2u^⊤ A(τ) Σ(u)Σ(u)^⊤ A(τ) u,   (101.12)

which should hold for all maturities τ and states u. To maintain the quadratic nature of Equation (101.12), we need the diffusion term Σ(u) to be independent of the state vector u. Let V ≡ ΣΣ^⊤ denote a positive definite symmetric constant matrix. Indeed, via a rotation of indices, we can set V = I with no loss of generality. Equation (101.12) becomes

u^⊤ A′(0) u = u^⊤ A′(τ) u − 2u^⊤ A(τ) μ*(u) − tr[ A(τ) V ] + 2u^⊤ A(τ)² V u.   (101.13)

Furthermore, to balance the powers of the equation, we decompose the drift function into two parts, μ*(u) = μ₁*(u) − Bu, where B denotes a constant matrix and is assumed to be symmetric with no loss of generality. The first part μ₁*(u) satisfies the equality

2u^⊤ A(τ) μ₁*(u) + tr[ A(τ) V ] = 0.   (101.14)

That is, the role of μ₁*(u) is to cancel out the constant term on the right-hand side of Equation (101.13). However, since the drift term μ₁*(u) cannot depend on maturity τ, for the equality (Equation (101.14)) to hold, we must be able to factor out the maturity dependence:

A(τ) = a(τ) D,   (101.15)

where a(τ) is a scalar and D is a positive definite symmetric matrix independent of τ. This maturity separation delivers the most important result of this article: the collapse of dimensionality. Given the maturity separation, Equation (101.14) becomes

2u^⊤ D μ₁*(u) + tr(DV) = 0,   (101.16)

and Equation (101.13) becomes

a′(0) u^⊤ D u = a′(τ) u^⊤ D u + 2a(τ) u^⊤ DB u + 2a(τ)² u^⊤ D²V u.   (101.17)

For this equation to hold for all states u ∈ R^d, we need

a′(0) D = a′(τ) D + 2a(τ) DB + 2a(τ)² D²V.   (101.18)

After rearrangement, we have

a′(τ) I = a′(0) I − 2a(τ) B − 2a(τ)² DV.   (101.19)

Since the equation needs to hold for all elements of the matrices, we must have

2B = κ I;  DV = ¼ v I.   (101.20)

We hence obtain the ordinary differential equation

a′(τ) = a′(0) − κ a(τ) − ½ v a(τ)².   (101.21)

Furthermore, letting x ≡ u^⊤ D u, the zero price can be written as

ln P = −u^⊤ A(τ) u = −a(τ) u^⊤ D u = −a(τ) x.   (101.22)

Next, given the state vector process

du = (μ₁*(u) − Bu) dt + √V dz,

Itô's lemma gives the process for x under P*:

dx = 2u^⊤ D (du) + tr(DV) dt
   = [2u^⊤ D μ₁*(u) + tr(DV) − 2u^⊤ DB u] dt + 2u^⊤ D √V dz
   = −κ x dt + √(v x) dw.

We obtain the last equality by applying Equation (101.16) and by defining a new Wiener process w:

dw = 2u^⊤ D √V dz / √(v x) = 2u^⊤ D √V dz / √(4 u^⊤ DVD u) = 2u^⊤ D √V dz / √(v u^⊤ D u).

The instantaneous interest rate is r_t = a′(0) x_t. A rescaling of indices,

c(τ) = a(τ)/a′(0),  σ = √(v a′(0)),   (101.23)

gives us

ln P = −c(τ) r_t,   (101.24)

with

c′(τ) = 1 − κ c(τ) − ½ σ² c(τ)²,   (101.25)
dr_t = −κ r_t dt + σ √r_t dw.   (101.26)

The initial condition c(0) = 0 is determined by the fact that P(r_t, 0) = 1.
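The proposition is easy to check numerically. The sketch below (our own; all parameter values are hypothetical) solves the Riccati equation (101.9) with an ODE solver and compares exp(−c(τ)r₀) against a Monte Carlo estimate of the risk-neutral expectation in Equation (101.2) under the square-root dynamics (101.8):

```python
import numpy as np
from scipy.integrate import solve_ivp

kappa, sigma, r0, tau = -0.05, 0.04, 0.05, 5.0   # hypothetical parameter values

# Riccati equation (101.9): c'(t) = 1 - kappa*c - 0.5*sigma^2*c^2, c(0) = 0
sol = solve_ivp(lambda t, c: [1.0 - kappa * c[0] - 0.5 * sigma**2 * c[0]**2],
                (0.0, tau), [0.0], rtol=1e-10, atol=1e-12)
analytic = np.exp(-sol.y[0, -1] * r0)            # P(r, tau) from Equation (101.7)

# Monte Carlo for E*[exp(-integral of r)] under dr = -kappa r dt + sigma sqrt(r) dw,
# simulated with a full-truncation Euler scheme so the square root stays defined
rng = np.random.default_rng(1)
n_paths, n_steps = 100_000, 500
dt = tau / n_steps
r = np.full(n_paths, r0)
integral = np.zeros(n_paths)
for _ in range(n_steps):
    rp = np.maximum(r, 0.0)
    integral += rp * dt
    r = r - kappa * rp * dt + sigma * np.sqrt(rp * dt) * rng.standard_normal(n_paths)
mc = np.exp(-integral).mean()

print(f"exp(-c(tau) r0) = {analytic:.6f}   Monte Carlo = {mc:.6f}")
```

The two numbers should agree to Monte Carlo accuracy, which is what the matching argument in the proof predicts.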
Under our model, due to the maturity separability, the dimension of the state space collapses to one: bonds are priced as if there were only one dynamic factor. Instantaneously, this one dynamic factor follows a two-parameter square-root process under the risk-neutral measure P*. We leave the dynamics of this factor under the physical measure P unspecified. The specification of the physical dynamics can be determined separately to match the time-series properties of interest rates, while satisfying the constraints implied by the Girsanov theorem.

The two parameters of the square-root process determine both the instantaneous risk-neutral dynamics of the single dynamic factor and the shape of the yield curve via the ordinary differential equation in Equation (101.25). Over time, these two parameters can and will change to accommodate changes in the shape of the yield curve. Nevertheless, bonds are priced as if the two parameters were constant. We hence label them static factors. One way to think of this is that there is only one source of market risk that is priced on the term structure; bond pricing is done as if the other factors were constant. The instantaneous market risk acts as a multiplicative factor for all rates.

We thus obtain a three-factor term structure model with one dynamic factor and two static factors. Yet this three-factor structure is not a result of exogenous specification, but of a collapse of dimensionality due to the seemingly innocuous contention that all rates are bounded from below by zero via the quadratic form in Equation (101.6). Our three-factor model also contrasts sharply with traditional three-factor models in that the three factors in our model summarize everything that is uncertain about the shape of the term structure. Traditional three-factor models often contain many parameters in addition to the three factors, and the estimates of these parameters often exhibit large standard errors; such models are therefore subject to parameter risk. Under our specification, there are no other parameters to be estimated and hence no other risks that affect the shape of the term structure.

Treating κ and σ as constants, we can solve for the term structure coefficient c(τ) analytically:

c(τ) = 2(1 − e^{−2λτ}) / [4λ − (2λ − κ)(1 − e^{−2λτ})],   (101.27)

with λ = ½√(κ² + 2σ²). We can see immediately that c(τ) > 0 for all τ > 0. Furthermore, since the short rate follows a square-root process under the risk-neutral measure, it is bounded from below by zero. By absolute continuity, the short rate is also bounded from below by zero under the statistical measure. Therefore, all spot rates are bounded from below by zero.
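As a quick numerical sanity check (again our own sketch, with hypothetical parameters), the closed form in Equation (101.27) can be differentiated numerically and tested against the Riccati equation (101.9), and its positivity verified on a grid of maturities:

```python
import numpy as np

def c_closed_form(tau, kappa, sigma):
    # Equation (101.27), with lam = 0.5 * sqrt(kappa^2 + 2 sigma^2)
    lam = 0.5 * np.sqrt(kappa**2 + 2.0 * sigma**2)
    e = 1.0 - np.exp(-2.0 * lam * tau)
    return 2.0 * e / (4.0 * lam - (2.0 * lam - kappa) * e)

kappa, sigma = -0.05, 0.04
tau = np.linspace(0.1, 30.0, 100)
h = 1e-5
c = c_closed_form(tau, kappa, sigma)
dc = (c_closed_form(tau + h, kappa, sigma) - c_closed_form(tau - h, kappa, sigma)) / (2 * h)
residual = dc - (1.0 - kappa * c - 0.5 * sigma**2 * c**2)   # Riccati equation (101.9)
print(np.max(np.abs(residual)))   # should be numerically negligible
print(np.all(c > 0.0))            # c(tau) > 0 for all tau > 0
```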
Although we start with a quadratic specification for the spot rates, the final bond pricing formula says that spot rates are proportional to one dynamic factor. The square-root risk-neutral dynamics of the short rate bring our model very close to the traditional term structure model of Cox et al. (1985). The key difference lies in the absence of a constant term in the drift of the risk-neutral dynamics and the absence of a constant term in the affine structure of the bond yields. A constant term in the affine structure drives the boundary away from zero and hence violates our assumption that all rates are bounded from below at exactly zero.

We solve for the coefficient c(τ) treating κ and σ as constants, yet in our application we allow the two parameters to vary every day to fit the current yield curve. There seems, then, to be an inconsistency between the two practices. The inconsistency is only an illusion, however, since we treat κ and σ not as time-inhomogeneous parameters but as static factors. We explicitly recognize the risk associated with the time variation of these factors and hedge the risk away by forming portfolios that are first-order neutral to their variation. Due to the low dimensionality of the factor structure, neutrality can be achieved with a maximum of only four instruments. In contrast, in a traditional three-factor model with more than ten parameters, making a portfolio first-order neutral to all parameters and state variables is impractical due to transaction costs.

Our practice is also decisively different from traditional time-inhomogeneous specifications as often applied under the framework of Heath et al. (1992). In those specifications, the model parameters are allowed to vary over time in such a way that the model can always fit the currently observed term structure perfectly; thus, such models have little to say about the fair pricing of the yield curve. Furthermore, accommodating the whole yield curve often necessitates accepting an infinite-dimensional state space, which creates difficulties for hedging in practice.
101.4 The Hump-Shaped Forward Rate Curve

The term structure of long forward rates has been persistently downward sloping (Brown and Schaefer 2000). Given the initially upward-sloping term structure in most observations, the downward slopes at the very long end imply a hump-shaped term structure for the forward rates. Our model captures the hump shape of the forward rate curve very nicely. We can rotate the system and redefine the three factors explicitly in terms of the hump of the forward rate curve. Formally, we let F denote the maximum of the instantaneous forward rate (the peak of the hump), M the maturity at which the forward rate reaches its maximum, and λ some measure of the curvature of the forward rate curve at the maximum. Then, the instantaneous forward rate at maturity τ is given by

f(τ) = F sech²[λ(τ − M)].   (101.28)

Appendix 1 provides a derivation for the transformation. The parameter λ is related to the curvature of the forward rate curve at the maximum by

δ(M) ≡ −f″(M)/f(M) = 2λ².   (101.29)

The new triplet [F, M, λ] defines the same term structure as the original triplet [r, κ, σ]. They are linked by

F = r(1 + κ²/(2σ²)),  M = −(1/λ) arctanh(κ/(2λ)),  λ = ½√(κ² + 2σ²);
r = F sech²(λM),  κ = −2λ tanh(λM),  σ² = 2λ² sech²(λM).   (101.30)
The new formulation defines the forward rate curve by controlling the exact shape of the curve at the hump. Thus, if we observe a forward rate curve, we can determine the values of the three factors very easily. In our estimation, we model T ≡ 1/λ instead of λ, because it has a natural interpretation as a time scale. In contrast, the original triplet of factors [r, κ, σ] defines the instantaneous risk-neutral dynamics of the interest rates. These factors also define the level, the slope, and the curvature of the forward rate curve at the short end (τ = 0):

f(0) = r,  f′(0)/f(0) = −κ,  δ(0) ≡ −f″(0)/f(0) = σ² − κ².   (101.31)

The relation shows clearly the interaction between the instantaneous risk-neutral dynamics of the short rate and the shape of the forward rate curve. The drift parameter κ controls the initial slope of the forward rate curve: the initial curve is upward sloping when κ is negative. The instantaneous volatility σ, on the other hand, contributes to the curvature of the forward rate curve: the larger the variance, the more concave the forward rate curve. Furthermore, the two points of the forward rate curve at τ = 0 and τ = M are linked by the unit-free quantity χ ≡ tanh(λM):

(f(0) − f(M))/f(M) = −χ²,  (δ(M) − δ(0))/δ(M) = 3χ².

These observations dramatically simplify the calibration of the forward rate curve, as the factors can be directly mapped to the level and shape of the forward rate curve.
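The factor rotation is a pair of closed-form maps, so the calibration logic described above takes only a few lines of code. The sketch below (ours; the numerical values are hypothetical) implements Equations (101.28) and (101.30) and verifies the round trip and the short-end relation f(0) = r from Equation (101.31):

```python
import numpy as np

def to_hump_factors(r, kappa, sigma):
    # [r, kappa, sigma] -> [F, M, lam], Equation (101.30)
    lam = 0.5 * np.sqrt(kappa**2 + 2.0 * sigma**2)
    F = r * (1.0 + kappa**2 / (2.0 * sigma**2))
    M = -np.arctanh(kappa / (2.0 * lam)) / lam
    return F, M, lam

def from_hump_factors(F, M, lam):
    # [F, M, lam] -> [r, kappa, sigma], Equation (101.30)
    r = F / np.cosh(lam * M)**2
    kappa = -2.0 * lam * np.tanh(lam * M)
    sigma = np.sqrt(2.0 * lam**2 / np.cosh(lam * M)**2)
    return r, kappa, sigma

def forward_rate(tau, F, M, lam):
    # Equation (101.28): f(tau) = F sech^2[lam (tau - M)]
    return F / np.cosh(lam * (tau - M))**2

r, kappa, sigma = 0.05, -0.05, 0.04           # hypothetical values
F, M, lam = to_hump_factors(r, kappa, sigma)
print(F, M, lam)                              # peak level, peak maturity, curvature
print(from_hump_factors(F, M, lam))           # recovers (r, kappa, sigma)
print(forward_rate(0.0, F, M, lam), r)        # f(0) = r, Equation (101.31)
```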
101.5 Fitting the US Treasury Yields and US Dollar Swap Rates

To investigate the model's performance, we calibrate the model to two sets of data: US Treasury constant-maturity par yields and US dollar swap rates of the same maturities. We investigate the goodness of fit of the model on the two data sets. We also extract the three factors from the two markets for each day and analyze the time-series variation of these risk sources.

101.5.1 Data and Estimation

We obtain both the swap rate data and the constant-maturity Treasury yields from Lehman Brothers. The maturities include 2, 3, 5, 7, 10, 15, and 30 years. The data are weekly (Wednesday) closing mid quotes from December 14th, 1994 to December 28th, 2000 (316 observations). Table 101.1 reports the summary statistics of the swap rates and Treasury par yields. We observe an upward-sloping mean term structure for both swaps and US Treasuries. The standard deviations of both the levels and the first differences exhibit a hump-shaped term structure, with the plateau coming at 3-year to 5-year maturities. Interest rates are highly persistent. The excess skewness and kurtosis estimates are small for both levels and first differences.

We are interested not only in the empirical fit of the model on the yield curves of the two markets, but also in the time-series behavior of the three factors X ≡ [F, M, T] at each date. The choice of [F, M, T] over [r, κ, σ] in the estimation is purely due to numerical considerations. If we can forecast the three factors, we will be able to forecast the yield curve. A natural way to capture both the daily fitting of the cross-section of the term structure and the forecasting of the time series of interest rates is to cast the model into a state-space system and estimate the system using the Kalman (1960) filter.

For the estimation, we assume that the three factors can be forecasted via a simple VAR(1) system:

X_t = A + Φ X_{t−1} + ε_t,   (101.32)

where ε denotes the forecasting residuals. We use this forecasting equation as the state propagation equation, with ε as the state propagation error with covariance matrix Q. We then construct the measurement equations based on the valuation of the par yields in the Treasury and swap markets, respectively:

S_t(τ) = h(X_t, τ) + e_t,   (101.33)

where h(X_t, τ) denotes the model-implied value of the par yield of maturity τ as a function of the factors X_t, and e_t denotes the measurement error, which we assume has covariance matrix R. In the estimation, we assume that the measurement errors on the series are independent but have distinct variances. Thus, R is a diagonal matrix, with each element measuring the goodness of fit on the corresponding series. Since the US Treasury par bond and the US dollar swap contract both have semiannual payment intervals, the model-implied par yield is given by

h(X_t, τ) = 200 (1 − P(τ)) / Σ_{i=1}^{2τ} P(i/2),   (101.34)

where P(τ) denotes the model-implied value of the zero-coupon bond (discount factor) and is given in Equation (101.7). Since the measurement equation is nonlinear in the state vector, we apply the extended Kalman filter, under which the conditional variance update is based on a first-order Taylor expansion. The parameters of the state-space system include those that control the forecasting time-series dynamics and the covariance matrices of the state propagation errors and measurement errors, Θ ≡ [A, Φ, Q, R]. We estimate these parameters using a quasi-likelihood method, assuming that the forecasting errors of the par yields are normally distributed. Appendix 2 provides more details on the estimation methodology. Table 101.2 reports the estimates (and standard errors in parentheses) of the state-space estimation on both the US dollar swap market and the US Treasury market.
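For readers who want to reproduce the measurement equation, the following minimal sketch (ours; the factor values are hypothetical) evaluates the par-yield function of Equation (101.34) from the bond pricing formula (101.7) and the coefficient c(τ) of Equation (101.27):

```python
import numpy as np

def c_fun(tau, kappa, sigma):
    # term structure coefficient, Equation (101.27)
    lam = 0.5 * np.sqrt(kappa**2 + 2.0 * sigma**2)
    e = 1.0 - np.exp(-2.0 * lam * tau)
    return 2.0 * e / (4.0 * lam - (2.0 * lam - kappa) * e)

def discount(tau, r, kappa, sigma):
    # zero-coupon bond price, Equation (101.7): P(r, tau) = exp(-c(tau) r)
    return np.exp(-c_fun(tau, kappa, sigma) * r)

def par_yield(tau, r, kappa, sigma):
    # Equation (101.34): semiannual par yield, expressed in percentage points
    coupon_dates = 0.5 * np.arange(1, int(round(2 * tau)) + 1)
    annuity = discount(coupon_dates, r, kappa, sigma).sum()
    return 200.0 * (1.0 - discount(tau, r, kappa, sigma)) / annuity

r, kappa, sigma = 0.05, -0.05, 0.04        # hypothetical dynamic and static factors
for tau in (2, 3, 5, 7, 10, 15, 30):
    print(tau, round(par_yield(tau, r, kappa, sigma), 3))
```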
101.5.2 Model Performance

Table 101.3 reports the summary properties of the pricing errors on the swap rates and Treasury par yields. We define the error as the difference between the market-observed rates and the model-implied rates, in basis points. The fit is good despite the simple model structure. Overall, the mean absolute error is within a few basis points. The maximum error is only 28 basis points for the swap rates and 41 basis points for the Treasury par yields. An inspection of the error properties across maturities indicates that the key difficulty of the model lies in fitting interest rates at short maturities (2 years). The mean error on the 2-year rates is −7.5 basis points for swaps and −4.5 basis points for Treasuries, implying that the observed 2-year rates are on average lower than those implied by the model. Figure 101.1 plots the time series of the pricing errors on the swap rates (left panel) and the Treasury par yields (right panel) at selected maturities: two, five, ten, and 30 years.
Table 101.1 Summary statistics of US dollar swap rates and US Treasury par yields

                         Swap                                    Treasury
Mat        Mean    Std     Skew    Kurt    Auto      Mean    Std     Skew    Kurt    Auto
Levels
2          6.190   0.656   0.343   0.293   0.971     5.799   0.631   0.050   0.858   0.969
3          6.303   0.657   0.266   0.224   0.971     5.857   0.654   0.071   0.798   0.969
5          6.454   0.642   0.133   0.004   0.971     5.947   0.681   0.143   0.672   0.969
7          6.560   0.629   0.061   0.147   0.971     5.994   0.672   0.081   0.522   0.970
10         6.681   0.615   0.022   0.300   0.971     6.059   0.669   0.008   0.238   0.971
15         6.817   0.591   0.056   0.388   0.969     6.111   0.653   0.052   0.152   0.971
30         6.889   0.576   0.050   0.247   0.971     6.266   0.624   0.243   0.207   0.974
Differences
2          0.007   0.120   0.328   0.977   0.027     0.008   0.119   0.201   0.642   0.001
3          0.007   0.123   0.349   0.834   0.020     0.008   0.121   0.316   0.746   0.013
5          0.007   0.123   0.214   0.627   0.013     0.009   0.124   0.178   0.600   0.008
7          0.007   0.122   0.264   0.593   0.019     0.009   0.120   0.282   0.592   0.013
10         0.007   0.119   0.188   0.410   0.043     0.009   0.118   0.233   0.353   0.040
15         0.007   0.117   0.304   0.393   0.024     0.008   0.114   0.348   0.319   0.015
30         0.007   0.106   0.445   0.545   0.018     0.008   0.102   0.415   0.528   0.033

The table presents summary statistics of US dollar swap rates and US Treasury par yields. Mean, Std, Skew, Kurt, and Auto denote, respectively, the sample estimates of the mean, standard deviation, skewness, kurtosis, and first-order autocorrelation. The data are weekly closing mid quotes from Lehman Brothers, from December 14th, 1994 to December 28th, 2000 (316 observations)
We observe that, except at short maturities, the pricing errors are generally within ten basis points. The magnitude of these pricing errors is comparable to that reported for much more complicated models.
101.5.3 The Time Series Behavior of the Interest-Rate Factors

Through the state-space estimation, we obtain daily estimates of the three interest-rate factors that fit the observed Treasury and swap rate curves, respectively. In this section, we analyze the time-series behavior of the three factors estimated from both markets, and compare how the three factors relate to one another and how they differ across the two markets.
101.5.3.1 The Dynamic Level Factor

Under our model structure, the level of the yield curve can be represented by the instantaneous short rate r. The left panel of Fig. 101.2 plots the extracted instantaneous interest rate from the swap market (solid line) and the Treasury market (dashed line). The right panel of Fig. 101.2 depicts the difference (swap spread) between the two short rates. The average spread between the two short rates over this sample period is 34.19 basis points. Overall, the two short rates move very closely with each other. However, the swap spread does change over time. Before 1998, the spread is in general within 40 basis points. The spike in the swap spread in late 1998 and early 1999 corresponds to the hedge fund crisis during that time. The swap spread during year 2000 is also unusually high, corresponding to the reduced supply of US Treasuries resulting from the budget surplus at that time. Thus, although the spread spike in early 1999 can be attributed to a credit event, the spread plateau in 2000 is mainly due to liquidity factors.
101.5.3.2 The Slope and Curvature Factors

The slope of the forward rate curve is closely related to the drift parameter κ of the short rate's risk-neutral dynamics: the slope is positive when κ is negative. The instantaneous volatility σ of the short rate dynamics, in contrast, is closely related to the curvature of the forward rate curve: the higher the volatility, the more concave the forward rate curve. Figure 101.3 plots the time series of κ (left panel) and σ (right panel) as an illustration of the slope and curvature dynamics of the yield curve. The solid lines depict the factors extracted from the swap market and the dashed lines the factors from the US Treasury market. The two markets move closely together, as the shapes (slope and curvature) of their forward rate curves also move together. Furthermore, comparing the time series of the short rate to those of the slope and curvature factors, we see that the slope and curvature factors tend to move in a direction opposite to the level factor. When the short rate is high, the forward rate curve tends to be flat. The two spikes in the slope and curvature time series correspond to the two dips in the short rate.
Table 101.2 Summary statistics of the three factors from swaps and US strips

State propagation equation: X_t = A + Φ X_{t−1} + ε_t;  S_ε S_ε^⊤ = Cov(ε)

Swap:
A   = [0.1831 (0.1233); 3.0572 (0.6830); 7.9858 (1.4326)]
Φ   = [0.9761 (0.0133)  0.0052 (0.0307)  0.0027 (0.0081)
       0.0163 (0.0081)  0.9005 (0.0164)  0.0110 (0.0046)
       0.0386 (0.0164)  0.1895 (0.0425)  0.9350 (0.0115)]
S_ε = [1.1173 (0.0552)  0                0
       0.0634 (0.0577)  0.7012 (0.0413)  0
       0.7486 (0.1723)  0.3839 (0.1862)  1.5043 (0.1305)]

Treasury:
A   = [0.2405 (1.4971); 4.0164 (2.2390); 4.4539 (3.9432)]
Φ   = [0.9660 (0.1497)  0.0018 (0.0057)  0.0046 (0.0051)
       0.0358 (0.0303)  0.9673 (0.0123)  0.0122 (0.0106)
       0.0136 (0.0490)  0.0286 (0.0233)  0.9299 (0.0202)]
S_ε = [1.1418 (0.0726)  0                0
       0.1898 (0.2869)  2.3490 (0.2455)  0
       1.5145 (0.6367)  2.3074 (0.6562)  3.6332 (0.3997)]

Measurement equation: S_t = h(X_t) + e_t;  Cov(e) = diag(σ_i²), i = 2, 3, 5, 7, 10, 15, 30

σ_i        Swap               Treasury
2          0.1106 (0.0115)    0.1468 (0.0180)
3          0.0512 (0.0051)    0.0928 (0.0112)
5          0.0114 (0.0012)    0.0314 (0.0016)
7          0.0130 (0.0007)    0.0003 (0.0785)
10         0.0188 (0.0007)    0.0409 (0.0021)
15         0.0295 (0.0017)    0.0498 (0.0039)
30         0.0127 (0.0026)    0.0285 (0.0027)

L(10³)     5.6517             4.7706

The table reports the parameter estimates (standard deviations in parentheses) of the state-space system. The state propagation captures the dynamics of the three factors X_t ≡ [F_t, M_t, T_t], where F_t is represented in thousandths, and M and T are in years. The standard deviation of the measurement error (σ_i) captures the model's performance in fitting the constant-maturity yields or swap rates of the denoted maturities and is measured in annual percentages. The model is calibrated to both the US dollar swap rates (left panel) and the US Treasury constant-maturity par yields, both of which are weekly data from December 14th, 1994 to December 28th, 2000 (316 observations)
Table 101.3 Summary statistics of pricing errors on US dollar swap rates and US Treasury par yields

                          Swap                                      Treasury
Mat      Mean    Std    Mae    Max     Auto      Mean    Std     Mae     Max     Auto
2       −7.524   7.641  8.611  28.290  0.893    −4.358   13.970  12.669  41.425  0.923
3        2.681   4.053  3.948  13.102  0.751     1.731    9.243   8.077  31.094  0.871
5        0.608   1.053  0.796   5.635  0.158     1.327    3.400   2.602  14.568  0.531
7        0.843   1.323  1.087   7.859  0.257     0.723    1.401   0.857   8.929  0.111
10       0.022   1.837  1.279  10.430  0.245     0.249    4.315   3.221  13.536  0.674
15       0.879   2.446  2.052   8.118  0.629     3.423    2.983   3.947  12.308  0.434
30       0.445   0.763  0.554   6.676  0.160     1.341    2.503   1.758  17.064  0.468

The table presents summary statistics of the pricing errors on US dollar swap rates and US Treasury par yields. We define the pricing error as the difference, in basis points, between the market-observed rates and the model-implied rates. Mean, Std, Mae, Max, and Auto denote, respectively, the sample estimates of the mean, standard deviation, mean absolute error, max absolute error, and first-order autocorrelation. The market-observed rates are weekly closing mid quotes from Lehman Brothers, from December 14th, 1994 to December 28th, 2000 (316 observations). We compute the model-implied rates based on the state-space system estimated in Table 101.2
[Figure 101.1: eight panels of pricing-error time series, January 1995 to January 2000, for 2-, 5-, 10-, and 30-year swaps (left column) and 2-, 5-, 10-, and 30-year Treasuries (right column); vertical axes in basis points.]

Fig. 101.1 Swap rate pricing errors. Lines report the time series of the pricing errors on swap rates and Treasury par yields. The pricing error is in basis points, defined as the difference between the market-observed rate and the model-implied rate
101.6 Extensions: Jumps in Interest Rates

Our model is derived under three important assumptions: the positivity of interest rates guaranteed via a quadratic form, a finite-dimensional state representation, and diffusive state dynamics. Interest rate positivity is a necessary condition for no-arbitrage, as long as we are allowed to hold cash for free. The zero-lower-bound assumption for rates of all maturities matches empirical observation. A finite state representation is also necessary for complete hedging to be feasible in practice in the presence of transaction costs. The assumption of pure-diffusion state dynamics, however, is made more for convenience and tractability than for realism: we do observe that interest rates move discontinuously (jump) every now and then. In this section, we explore whether incorporating a jump component by itself violates the assumptions of positive interest rates and finite-dimensional state dynamics and, if not, how jumps can be incorporated into the state dynamics.
[Figure 101.2: two panels, January 1995 to January 2000. Left panel: "The Level Factor" (short rate, %); right panel: "The Level Spread" (difference in short rates, basis points).]

Fig. 101.2 The short rates and swap spreads. The left panel depicts the instantaneous interest rate (in percentages) implied from the swap market (solid line) and the US Treasury market (dashed line). The right panel depicts the spread, in basis points, between the short rate from the swap market and the short rate from the US Treasury market

[Figure 101.3: two panels, January 1995 to January 2000. Left panel: "The Slope Factor" (−κ); right panel: "The Curvature Factor" (2λ², ×10⁻³).]

Fig. 101.3 The slope factor κ and the curvature factor σ. Lines depict the slope factor (left panel) and the curvature factor (right panel) extracted from the swap market (solid line) and the US Treasury market (dashed line)
We start with the degenerate case in which the jump component has zero weight in the state dynamics. Our previous analysis then indicates that zero prices can be written as

ln P(r_t, τ) = −c(τ) r_t,   (101.35)

where the short rate r_t follows square-root dynamics with zero mean,

dr_t = −κ r_t dt + σ √r_t dw_t,   (101.36)

and the coefficient c(τ) satisfies a Riccati equation. As discussed before, this model is a special example of a one-factor affine model. Duffie et al. (2000) incorporate Poisson jumps into the affine structure. Filipović (2001) incorporates more general jumps into a one-factor affine structure. Since we are dealing with a one-factor structure, we consider the more general jump specification of Filipović (2001).

Filipović (2001) proves that under the general affine framework, the positive short rate r_t is a CBI-process (conservative branching process with immigration), uniquely characterized by its generator

A f(x) = ½ σ² x f″(x) + (a′ − κx) f′(x) + ∫_{R₀₊} [ f(x + y) − f(x) − f′(x)(1 ∧ y) ] (m(dy) + x μ(dy)),   (101.37)

where a′ = a + ∫_{R₀₊} (1 ∧ y) m(dy), for some numbers σ, a ∈ R₊, κ ∈ R and nonnegative Borel measures m(dy) and μ(dy) on R₀₊ (the positive real line excluding zero) satisfying

∫_{R₀₊} (1 ∧ y) m(dy) + ∫_{R₀₊} (1 ∧ y²) μ(dy) < ∞.   (101.38)

We can obtain our current model by setting the jump part to zero and the constant part of the drift of the square-root process to zero (a = 0). The two Borel measures define two jump components. The jump component defined by m(dy) is a direct addition to the diffusion process; the jump component defined by μ(dy) is specified as proportional to x. Hence, we label the former a constant jump component and the latter a proportional jump component. In essence, the arrival rate of jumps in the "constant" component does not depend on the short rate level, but the arrival rate of the
"proportional" component is proportional to the short rate level. Condition (101.38) requires that the jump component defined by m(dy) exhibit finite variation and the jump component defined by μ(dy) exhibit finite quadratic variation.

Under the specification in Equation (101.37), the zero prices are given by

ln P(r_t, τ) = −A(τ) − B(τ) r_t,   (101.39)

where A(τ) and B(τ) solve uniquely the generalized Riccati equations

B′(τ) = R(B(τ)),  B(0) = 0,   (101.40)
A(τ) = ∫₀^τ F(B(s)) ds,   (101.41)

where R and F are defined as

R(ψ) ≡ 1 − κψ − ½ σ² ψ² + ∫_{R₀₊} [ 1 − e^{−ψy} − ψ(1 ∧ y) ] μ(dy);
F(ψ) ≡ aψ + ∫_{R₀₊} [ 1 − e^{−ψy} ] m(dy).   (101.42)

To guarantee that all rates are bounded from below by zero, we need to set A(τ) = 0 for all τ, which we obtain by setting a = 0 and m(dy) = 0. The condition a = 0 is already known. The second condition, m(dy) = 0, says that we cannot add a constant jump component while maintaining that all rates are bounded from below by zero. Nevertheless, we can incorporate a proportional jump component. Since B(τ) is positive for all τ, all interest rates remain bounded from below by zero. In the absence of the proportional jump component, R(ψ) reduces to our Riccati equation for the diffusion case. The last term in Equation (101.42) captures the contribution of the proportional jump component.
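The generalized Riccati equation is straightforward to handle numerically for a given proportional jump measure. The sketch below (our own illustration; the exponential form of μ(dy) and all parameter values are hypothetical choices, not from the chapter) integrates B′(τ) = R(B(τ)) from Equations (101.40) and (101.42) and confirms that B(τ) stays positive:

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

kappa, sigma = -0.05, 0.04        # hypothetical diffusion parameters
ell, nu = 0.5, 0.1                # hypothetical proportional jump measure:
                                  # mu(dy) = (ell/nu) * exp(-y/nu) dy on (0, inf)

def jump_term(psi):
    # integral of (1 - e^{-psi y} - psi * min(1, y)) mu(dy), as in Equation (101.42)
    integrand = lambda y: (1.0 - np.exp(-psi * y) - psi * min(1.0, y)) \
                          * (ell / nu) * np.exp(-y / nu)
    val, _ = quad(integrand, 0.0, np.inf)
    return val

def R(psi):
    return 1.0 - kappa * psi - 0.5 * sigma**2 * psi**2 + jump_term(psi)

# generalized Riccati equation (101.40): B'(tau) = R(B(tau)), B(0) = 0
sol = solve_ivp(lambda t, B: [R(B[0])], (0.0, 30.0), [0.0],
                t_eval=np.linspace(0.0, 30.0, 61))
print(np.all(sol.y[0, 1:] > 0.0))   # B(tau) > 0: all rates stay bounded below by zero
print(sol.y[0, -1])                 # long-maturity coefficient
```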
101.7 Conclusion

In this paper, we contend that all interest rates should be bounded from below at zero. Such a seemingly innocuous contention, together with the assumption of continuity, results in a dramatic collapse of dimensionality. The conditions lead to a term structure model that has only one dynamic factor and two static factors. Even more surprisingly, there are no other parameters in the model that affect the shape of the term structure. Therefore, model calibration becomes a
trivial problem, and there no longer exists a distinction between out-of-sample and in-sample performance. Furthermore, risks from the three factors can be hedged away easily with only a few instruments. Since there are no other parameters, the model is not subject to any parameter risk.

To put the model into practical application, we cast it in a state-space framework and estimate the three states via quasi-maximum likelihood together with an extended Kalman filter. We apply this estimation procedure to both the US Treasury market and the US dollar swap market. Despite its extreme simplicity, the model performs well in fitting the daily term structures of both markets. A time-series analysis of the extracted factors from the two markets provides some interesting insights into the evolution of the interest rate market.

A potential application of the model, which can be explored in future research, is to forecast the term structure of interest rates. Recently, Diebold and Li (2003) and Diebold et al. (2004) illustrate how the Nelson–Siegel framework can be applied successfully to forecasting the term structure of Treasury yields. Yet the inherent inconsistency of the Nelson–Siegel model is well documented in Björk and Christensen (1999) and Filipović (1999, 2000). Our model provides a parsimonious but consistent alternative to the Nelson–Siegel framework. Another line for future research is to explore the model's implications for option pricing.

Acknowledgment We thank Cheng F. Lee (the editor), Yacine Ait-Sahalia, Peter Carr, and Massoud Heidari for insightful comments. All remaining errors are ours.
References

Ahn, D.-H., R. F. Dittmar, and A. R. Gallant. 2002. "Quadratic term structure models: theory and evidence." Review of Financial Studies 15(1), 243–288.
Babbs, S. H. 1997. Rational bounds, Working paper, First National Bank of Chicago.
Backus, D., S. Foresi, A. Mozumdar, and L. Wu. 2001. "Predictable changes in yields and forward rates." Journal of Financial Economics 59(3), 281–311.
Balduzzi, P., S. Das, S. Foresi, and R. Sundaram. 1996. "A simple approach to three-factor affine term structure models." Journal of Fixed Income 6, 43–53.
Beaglehole, D. R. and M. Tenney. 1991. "General solution of some interest rate-contingent claim pricing equations." Journal of Fixed Income 1, 69–83.
Beaglehole, D. R. and M. Tenney. 1992. "A nonlinear equilibrium model of term structures of interest rates: corrections and additions." Journal of Financial Economics 32(3), 345–454.
Björk, T. and B. J. Christensen. 1999. "Interest rate dynamics and consistent forward rate curves." Mathematical Finance 9(4), 323–348.
Black, F. 1995. "Interest rates as options." Journal of Finance 50(5), 1371–1376.
Brandt, M. and D. A. Chapman. 2002. Comparing multifactor models of the term structure, Working paper, Duke University.
Brandt, M. and A. Yaron. 2001. Time-consistent no-arbitrage models of the term structure, Working paper, University of Pennsylvania.
Brown, R. H. and S. M. Schaefer. 2000. Why long term forward interest rates (almost) always slope downwards, Working paper, Warburg Dillon Read and London Business School, UK.
Chen, R.-R. and L. Scott. 1993. "Maximum likelihood estimation of a multifactor equilibrium model of the term structure of interest rates." Journal of Fixed Income 3, 14–31.
Constantinides, G. M. 1992. "A theory of the nominal term structure of interest rates." Review of Financial Studies 5(4), 531–552.
Cox, J. C., J. E. Ingersoll, and S. A. Ross. 1985. "A theory of the term structure of interest rates." Econometrica 53(2), 385–408.
Dai, Q. and K. Singleton. 2000. "Specification analysis of affine term structure models." Journal of Finance 55(5), 1943–1978.
Dai, Q. and K. Singleton. 2002. "Expectation puzzles, time-varying risk premia, and affine models of the term structure." Journal of Financial Economics 63(3), 415–441.
Dai, Q. and K. Singleton. 2003. "Term structure dynamics in theory and reality." Review of Financial Studies 16(3), 631–678.
Diebold, F. X. and C. Li. 2003. Forecasting the term structure of government bond yields, Working paper, University of Pennsylvania.
Diebold, F. X., L. Ji, and C. Li. 2004. "A three-factor yield curve model: non-affine structure, systematic risk sources, and generalized duration," in Memorial volume for Albert Ando, L. Klein (Ed.). Edward Elgar, Cheltenham, UK.
Duffee, G. R. 2002. "Term premia and interest rate forecasts in affine models." Journal of Finance 57(1), 405–443.
Duffie, D. 1992. Dynamic asset pricing theory, 2nd Edition, Princeton University Press, Princeton, NJ.
Duffie, D. and R. Kan. 1996. "A yield-factor model of interest rates." Mathematical Finance 6(4), 379–406.
Duffie, D., J. Pan, and K. Singleton. 2000. "Transform analysis and asset pricing for affine jump diffusions." Econometrica 68(6), 1343–1376.
El Karoui, N., R. Myneni, and R. Viswanathan. 1992. Arbitrage pricing and hedging of interest rate claims with state variables: I theory, Working paper, University of Paris.
Filipović, D. 1999. "A note on the Nelson–Siegel family." Mathematical Finance 9(4), 349–359.
Filipović, D. 2000. "Exponential–polynomial families and the term structure of interest rates." Bernoulli 6(1), 1–27.
Filipović, D. 2001. "A general characterization of one factor affine term structure models." Finance and Stochastics 5(3), 389–412.
Flesaker, B. and L. Hughston. 1996. "Positive interest." RISK 9(1), 46–49.
Goldstein, R. and W. Keirstead. 1997. On the term structure of interest rates in the presence of reflecting and absorbing boundaries, Working paper, Ohio State University.
Gorovoi, V. and V. Linetsky. 2004. "Black's model of interest rates as options, eigenfunction expansions and Japanese interest rates." Mathematical Finance 14(1), 49–78.
Heath, D., R. Jarrow, and A. Morton. 1992. "Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation." Econometrica 60(1), 77–105.
Heidari, M. and L. Wu. 2003. "Are interest rate derivatives spanned by the term structure of interest rates?" Journal of Fixed Income 13(1), 75–86.
Jacod, J. and A. N. Shiryaev. 1987. Limit theorems for stochastic processes, Springer, Berlin.
Jamshidian, F. 1996. "Bond, futures and option valuation in the quadratic interest rate model." Applied Mathematical Finance 3, 93–115.
Jin, Y. and P. Glasserman. 2001. "Equilibrium positive interest rates: a unified view." Review of Financial Studies 14(1), 187–214.
Kalman, R. E. 1960. "A new approach to linear filtering and prediction problems." Transactions of the ASME – Journal of Basic Engineering 82(Series D), 35–45.
Knez, P. J., R. Litterman, and J. Scheinkman. 1994. "Explorations into factors explaining money market returns." Journal of Finance 49(5), 1861–1882.
Leippold, M. and L. Wu. 2002. "Asset pricing under the quadratic class." Journal of Financial and Quantitative Analysis 37(2), 271–295.
Leippold, M. and L. Wu. 2003. "Design and estimation of quadratic term structure models." European Finance Review 7(1), 47–73.
Litterman, R. and J. Scheinkman. 1991. "Common factors affecting bond returns." Journal of Fixed Income 1(1), 54–61.
Longstaff, F. A. 1989. "A nonlinear general equilibrium model of the term structure of interest rates." Journal of Financial Economics 23, 195–224.
Longstaff, F. A., P. Santa-Clara, and E. S. Schwartz. 2001. "Throwing away a billion dollars: the cost of suboptimal exercise strategies in the swaptions market." Journal of Financial Economics 62(1), 39–66.
Norgaard, M., N. K. Poulsen, and O. Ravn. 2000. "New developments in state estimation for nonlinear systems." Automatica 36(11), 1627–1638.
Pan, E. 1998. "Collapse of detail." International Journal of Theoretical and Applied Finance 1(2), 247–282.
Rogers, L. C. G. 1995. Mathematical finance, IMA Volume 65, Springer, New York.
Rogers, L. C. G. 1996. "Gaussian errors." RISK 9, 42–45.
Rogers, L. C. G. 1997. "The potential approach to the term structure of interest rates and foreign exchange rates." Mathematical Finance 7, 157–176.
Appendix 101A Factor Representation

The term structure is determined by the following ordinary differential equation:

c′(τ) = 1 − κ c(τ) − ½ σ² c(τ)²,   (101.43)

with c(0) = 0. One solution of this Riccati equation is given in Equation (101.27). Another way of solving the equation is through the following change of variables:

ψ(τ) ≡ (c(τ)σ² + κ) / √(κ² + 2σ²),  λ = ½√(κ² + 2σ²),   (101.44)

where κ² + 2σ² defines the discriminant of the ordinary differential equation. The ordinary differential equation (101.43) is then transformed into the elementary problem

ψ′(τ) = λ(1 − ψ(τ)²),   (101.45)

with ψ(0) = κ/(2λ). The solution of Equation (101.45) is

ψ(τ) = tanh[λ(τ − M)],

where M is defined by the boundary condition ψ(0) = −tanh(λM) = κ/(2λ). That is,

M = −(1/λ) arctanh(κ/(2λ)).

Translating ψ(τ) back to the bond pricing coefficient c(τ) gives

c(τ) = (2λ/σ²)[tanh(λ(τ − M)) + tanh(λM)].   (101.46)

The instantaneous forward rate is given by

f(τ) = c′(τ) r = (2λ² r/σ²) sech²[λ(τ − M)] = F sech²[λ(τ − M)],   (101.47)

where

F = (2λ²/σ²) r = r(1 + κ²/(2σ²))

is the maximal forward rate and M is the corresponding maturity.

Appendix 101B Extended Kalman Filter and Quasi-Likelihood

The state-space estimation method is based on a pair of state propagation and measurement equations. In our application, the state vector X propagates according to the VAR(1) process specified in Equation (101.32). The measurement equation is given in Equation (101.33), which is based on the valuation of the par yields.

Let X̄_t denote the a priori forecast of the state vector at time t conditional on time t−1 information and V̄_t the corresponding conditional covariance matrix. Let X̂_t denote the a posteriori update of the time-t state vector based on the observations (S_t) at time t, and V̂_t the corresponding a posteriori covariance matrix. Based on the VAR(1) specification, the state propagation equation is linear and Gaussian. The a priori update equations are

X̄_t = A + Φ X̂_{t−1};  V̄_t = Φ V̂_{t−1} Φ^⊤ + Q.

The filtering problem then consists of establishing the conditional density of the state vector X_t, conditional on the observations up to and including time t. In the case of a linear measurement equation,

S_t = H X_t + e_t,

the Kalman filter provides the efficient a posteriori update of the conditional mean and variance of the state vector:

S̄_t = H X̄_t;  Ā_t = H V̄_t H^⊤ + R;  K_t = V̄_t H^⊤ Ā_t^{−1};
X̂_t = X̄_t + K_t (S_t − S̄_t);  V̂_t = (I − K_t H) V̄_t,   (101.48)

where S̄_t and Ā_t are the a priori forecasts of the conditional mean and variance of the observed series and R is the covariance matrix of the measurement errors. However, in our application the measurement equation in Equation (101.33) is nonlinear. We apply the extended Kalman filter (EKF), which approximates the nonlinear measurement equation with a linear expansion:

S_t ≈ h(X̄_t) + H (X_t − X̄_t) + e_t,   (101.49)

where

H ≡ ∂h(X_t)/∂X_t |_{X_t = X̄_t}.   (101.50)

Thus, although we still use the original pricing relation to update the conditional mean, we update the conditional variance based on this linearization. For this purpose, we numerically evaluate the derivative defined in Equation (101.50). We follow Norgaard et al. (2000) in updating the Cholesky factors of the covariance matrices directly. Using the state and measurement updates, we obtain the one-period-ahead forecasting error on the par yields,

e_t = S_t − S̄_t = S_t − h(X̄_t).

Assuming that the forecasting error is normally distributed, the quasi log-likelihood function is given by

L(S) = Σ_{t=1}^{T} l_t,   (101.51)

where

l_t = −½ log|Ā_t| − ½ e_t^⊤ Ā_t^{−1} e_t,

and the conditional mean S̄_t and variance Ā_t are given in the EKF updates in Equation (101.48).
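The following sketch (ours; the measurement function h and all numbers are hypothetical stand-ins for the par-yield valuation) condenses the EKF recursion of Equations (101.48)–(101.51), including the numerical Jacobian of Equation (101.50) and the per-period quasi-likelihood contribution:

```python
import numpy as np

def ekf_step(X_prev_hat, V_prev_hat, S_obs, h, A, Phi, Q, Rcov):
    """One extended Kalman filter update, following Equations (101.48)-(101.51).
    h is the nonlinear measurement function mapping factors to model par yields."""
    # a priori (time) update
    X_bar = A + Phi @ X_prev_hat
    V_bar = Phi @ V_prev_hat @ Phi.T + Q
    # numerical Jacobian H = dh/dX evaluated at X_bar, Equation (101.50)
    eps = 1e-6
    base = h(X_bar)
    H = np.column_stack([(h(X_bar + eps * e) - base) / eps
                         for e in np.eye(len(X_bar))])
    # a posteriori (measurement) update
    A_bar = H @ V_bar @ H.T + Rcov
    K = V_bar @ H.T @ np.linalg.inv(A_bar)
    e_t = S_obs - base                       # forecasting error on the par yields
    X_hat = X_bar + K @ e_t
    V_hat = (np.eye(len(X_bar)) - K @ H) @ V_bar
    # quasi log-likelihood contribution l_t, Equation (101.51)
    _, logdet = np.linalg.slogdet(A_bar)
    l_t = -0.5 * logdet - 0.5 * e_t @ np.linalg.solve(A_bar, e_t)
    return X_hat, V_hat, l_t

# toy usage with a hypothetical nonlinear measurement of 3 factors at 7 maturities
rng = np.random.default_rng(2)
W = rng.standard_normal((7, 3))
h = lambda X: np.tanh(W @ X)
X_hat, V_hat, l_t = ekf_step(np.zeros(3), 0.1 * np.eye(3),
                             h(np.array([0.1, -0.2, 0.3])), h,
                             np.zeros(3), 0.95 * np.eye(3),
                             0.01 * np.eye(3), 0.05 * np.eye(7))
print(l_t)
```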
Chapter 102
Positive Interest Rates and Yields: Additional Serious Considerations* Jonathan Ingersoll
Abstract Over the past quarter century, mathematical modeling of the behavior of the interest rate and the resulting yield curve has been a topic of considerable interest. In the continuous-time modeling of stock prices, one need only specify the diffusion term, because the assumption of risk-neutrality for pricing identifies the expected change. But this is not true for yield curve modeling. This paper explores what types of diffusion and drift terms forbid negative yields but nevertheless allow any yield to be arbitrarily close to zero. We show that several models have these characteristics; however, they may also have other odd properties. In particular, the square-root model of Cox–Ingersoll–Ross has such a solution, but only in a singular case. In other cases, bubbles will occur in bond prices, leading to oddly behaved solutions. Other models, such as the CIR three-halves power model, are free of such oddities.
Keywords Term structure · Interest rates · Price bubbles · Positivity

102.1 Introduction

In a recent paper, Pan and Wu (2006) have posited that (nominal) interest rates of all maturities should have a lower bound of exactly zero – that is, yields arbitrarily close to zero should be possible for bonds of all maturities. They derived the "unique" model in which this assumption, together with continuity and the absence of arbitrage, is satisfied. This paper questions that desideratum and highlights some curiosities that models possessing this property also have. In addition, it shows that, far from there being a unique model with this property, there are in fact countless other models that satisfy these conditions even if only one source of risk is assumed.

Section 102.2 of this paper briefly discusses the relation among yields and highlights the question of a zero lower bound. Section 102.3 reviews the Cox et al. (1985) and Pan and Wu (2006) models and shows that the zero lower bound in the latter is due to the interest rate process, which has an absorbing barrier at zero and no finite-variance steady-state distribution. Section 102.4 derives the bubble-free solution to the Pan–Wu model when the risk-neutral and true processes differ in their behavior at zero. Section 102.5 develops a two-state-variable extension of the Cox–Ingersoll–Ross model that permits a lower bound of zero for all yields with no price bubbles and no absorption of the interest rate at zero. Section 102.6 discusses non-affine models of yields and shows that it is possible to have interest rate models with a finite-variance steady-state distribution, no absorption at zero, and a lower bound of zero for all yields. Section 102.7 briefly discusses other constant lower bounds for yields.

* The author has benefited from his discussions with his colleagues. This paper is reprinted from Advances in Quantitative Analysis of Finance and Accounting, 7 (2009), pp. 219–252.

J. Ingersoll
Yale School of Management, New Haven, CT, USA
e-mail: [email protected]

102.2 A Non-Zero Bound for Interest Rates

Pan and Wu argue: "Asserting that an interest rate can be negative or cannot be lower than, say, 3%, is equally absurd. For example, no rational traders are willing to offer free floors at any strictly positive level of interest rates." We contend that these claims are vastly different. The presence of cash alone requires that nominal interest rates of all maturities be nonnegative. On the other hand, the lack of interest rate floors is just common sense. It is true that the Cox–Ingersoll–Ross interest rate model and other similar models absolutely prohibit yields to various maturities below certain non-zero levels, so that interest rate floors at some positive interest rate should be cost-free. But why would any trader offer a zero-cost floor at a rate he knew the interest rate could never reach? True, he could not lose on such a contract, but there would be no possibility of gain either if it were given away for free. Conversely, no buyer would be willing to pay any positive
price and would be completely indifferent about receiving such a floor at zero cost. There would simply be no market for such contracts.

In any case, we must not forget that interest rate models are models, that is, simplifications of the world. By their very nature, all models make absolute claims of one type or another. The model that Pan and Wu derived, for example, requires that the 5-year yield to maturity always be greater than (or less than, depending on the model's two parameters) the 3-month yield to maturity. But no trader would quote swap rates based on this guarantee. If a model that is based on reasonable assumptions makes a surprising prediction, that should be taken as a good sign. We want our models to tell us things we did not know or otherwise lead to new intuitions. The model's assumptions may turn out to be wrong, but the logical process that leads us to the surprising conclusion is valuable nonetheless.

So the appropriate question becomes: Is the conclusion that yields to maturity possess strictly positive bounds a surprising one? I do not think so, and believe just the opposite in fact. Historically, yield curves have been downward sloping when the spot rate is high and upward sloping when the spot rate is low. This is generally explained by saying that long rates are related to expected spot rates and that the spot rate has some long-run or steady-state distribution, so that the expected rate is always closer to the long-run average than is the current spot rate. This notion can be made precise as follows.

Continuous-time interest rate models generally assume that bonds can be priced using a risk-neutral (or equivalent martingale) process¹ and a form of the expectations hypothesis for discounting. In particular, the price of a zero-coupon bond is given by

P_t(r, τ) = Ê_t[ exp( −∫_t^{t+τ} r_s ds ) ].   (102.1)

Here Ê_t denotes the expectation at time t with respect to the equivalent martingale process, and r_s is the instantaneous spot rate prevailing at time s. When bonds are priced like this, the yield to maturity is the negative geometric expectation of the instantaneous rates over the time interval:

Y_t(r, τ) ≡ −(1/τ) ln P_t(r, τ) = −(1/τ) ln Ê_t[ exp( −∫_t^{t+τ} r_s ds ) ].   (102.2)

¹ When pricing bonds and other fixed-income assets in the presence of interest rate uncertainty, the equivalent martingale process that allows discounting at the interest rate does not result from assuming risk-neutrality on the part of investors as it does in the Black–Scholes model. Nevertheless, the term "risk-neutral" is still commonly applied. See Cox et al. (1981) for further discussion of this matter.

If the instantaneous rate is bounded below by zero but free to move above zero, the usual properties of averages would seem to guarantee that long-term yields would be bounded away from zero, with the exact bound depending on the properties of the stochastic process generating the evolution of the interest rate and the maturity of the bond in question. Apparently it would not be surprising to find models with yields bounded away from zero; quite the contrary, we should expect just that property. In fact, were the average in Equation (102.2) an arithmetic one, this property would be universally true, and long rates would be bounded away from zero whenever positive interest rates remained possible in the future. Geometric averages are different, though.² A geometric average is never greater than the corresponding arithmetic average, and if there is any variation in the random variable it is strictly less. So yields may have lower bounds of exactly zero even when the instantaneous rate is guaranteed to remain positive. Before going on to explore positive interest rates in more detail, we review the Cox et al. (1985) term structure model and a special case of it, the Pan and Wu (2006) model, in the next section.

² In discrete time, the yield in Equation (102.2) is 1 + Y = {Ê[((1 + r₁)···(1 + r_n))^{−1}]}^{−1/n}. By Jensen's inequality this is less than Ê[((1 + r₁)···(1 + r_n))^{1/n}]. So one plus the yield to maturity is less than the expectation of the geometric average of one plus the future prevailing spot rates. Since a geometric average is never larger than the corresponding arithmetic average, the yield to maturity must be less than the risk-neutral expected spot rate prevailing in the future.

102.3 The Cox–Ingersoll–Ross and Pan–Wu Term Structure Models

Though derived in a different fashion, the Pan–Wu (henceforth PW) model is a special case of the Cox–Ingersoll–Ross (henceforth CIR) affine term structure model in which the risk-neutral dynamics of the instantaneous interest rate are

dr_t =̂ κ(θ − r_t) dt + σ √r_t dω_t.   (102.3)

The symbol =̂ is used as a reminder that the dynamics in Equation (102.3) are the equivalent-martingale "risk-neutral" dynamics. The pricing equation for all zero-coupon bonds is

0 = ½ σ² r P_rr + κ(θ − r) P_r − r P − P_τ,   (102.4)

where τ ≡ T − t is the time until the maturity date T. The CIR solution is

P(r, τ) = A(τ) exp[−B(τ) r],   (102.5)

where

B(τ) ≡ 2(1 − e^{−γτ}) / [2γ + (κ − γ)(1 − e^{−γτ})],
A(τ) ≡ { 2γ e^{(κ−γ)τ/2} / [2γ + (κ − γ)(1 − e^{−γτ})] }^{2κθ/σ²},
γ ≡ √(κ² + 2σ²).

The PW derivation is substantially different from that used by CIR. They start with a general vector diffusion, u_t, and then, to ensure positive interest rates, assume that yields of all maturities have the quadratic form y(u_t, τ) = (1/τ) u_t′ W(τ) u_t, where W is a symmetric positive definite matrix. They go on to show that the only model satisfying these conditions is equivalent to the one-factor, two-parameter model given in Equations (102.3) and (102.5) with θ = 0. In particular, for the PW model, A(τ) ≡ 1.

In the CIR and PW models, the yield to maturity for a zero-coupon bond of maturity τ is

Y(r, τ) ≡ −(1/τ) ln P(r, τ) = −ln A(τ)/τ + [B(τ)/τ] r.   (102.6)

Since A(τ) ≤ 1 and B(τ) ≥ 0, and the interest rate cannot go below zero under the assumed dynamics, the τ-maturity yield is clearly bounded below by Y(τ) ≡ −ln(A(τ))/τ, which is strictly positive except when θ = 0, the PW case.

Fig. 102.1 Lower bound for CIR yield curves. In the CIR model, the yield to maturity for any bond is Y(r, τ) = −ln[A(τ)]/τ + B(τ)r/τ. The τ-period yield is increasing in the spot rate, r, and has a lower bound of Y(τ) = −ln[A(τ)]/τ, which is achieved when the spot rate is zero. For the PW model, A(τ) ≡ 1 and all yields have a lower bound of zero. The parameter σ = 0.078 gives a standard deviation of changes in the interest rate equal to 1.9 percentage points at the mean interest rate level of θ = 6%. The parameter κ measures the rate of reversion towards the mean level: after t years the interest rate will on average have moved all but the fraction e^{−κt} of the distance back toward θ
The limiting yield to maturity for an infinitely lived zerocoupon bond is 2›™=.› C ”/, which is somewhat less than ™. Since this quantity is independent of the current interest rate, it is also the lower bound for the infinite-maturity zerocoupon rate, which is constant in the CIR model. As shown in Fig. 102.1, the lower bounds for shorter maturity rates increase from zero (at £ D 0) to approach this level asymptotically. The rate of increase in the lower bounds is governed primarily by the parameter › with larger values producing a faster approach and generating higher lower bounds for yields of all maturities.3 The difference between the lower bounds in the two models is due to the behavior of the interest rate at zero. For the CIR model with ™ > 0, zero is a natural reflecting barrier of the interest rate process. If the interest rate hits zero, the uncertainty momentarily vanishes, and the then certain change in the interest rate is an increase – immediately moving it back to a positive value. For the PW model, r D 0 is an absorbing barrier. The uncertainty still vanishes, but now the expected future change is also zero, so the interest rate remains stuck at that level. Once the instantaneous interest rate reaches zero, it and the yields of all maturities are zero
3
The lower bounds for all yields also increase with › because the asymptotic value, Y1 D 2›™=.› C ”/, does. See Dybvig, et al. (1996) for an analysis of the asymptotic long rate.
1506
J. Ingersoll
forever. It is obvious, therefore, that no yield can have a positive lower bound. Furthermore, this property is true in any model of interest rates in which zero is an accessible absorbing barrier for the interest rate.4 A closely related feature of the PW model is that the values of zero-coupon bonds do not approach zero for long maturities. B.£/ is an increasing function but is bounded above by 2=.› C ”/, so no zero-coupon bond has a price less than expŒ2r=.› C ”/ regardless of its maturity. In other words, no zero-coupon bond has a value less than the value that a 2=.› C ”/ year bond would have were the yield curve flat at r. For the parameters estimated by PW this “maximal” maturity is about 50 years. This limit also means that the prices for annuities and coupon bonds with fixed coupons will become unbounded as the maturity increases. This aspect of the model is certainly problematic in many other contexts as well since transversality will be violated for many valuation problems. Since the interest rate can be trapped at zero, it is not surprising that long bond prices do not vanish. However, this problem does not follow directly from the interest rate being trapped at zero. It can be true in other models as well. For example, in Dothan’s (1978) model of interest rates, long-term bond prices also do not approach zero, although the interest rate has a lognormal distribution for which zero is inaccessible.5 It is also true in the two-factor version of the CIR model discussed in Sect. 102.4 below.
102.4 Bubble-Free Prices An interesting question remains. Can yields have a lower bound of zero, even when the interest rate cannot be trapped at zero and the values of zero-coupon bonds do become vanishingly small for long maturities? At first the answer to this question would seem obviously to be yes. Pricing is based on the risk-neutral process, but whether or not the interest rate is trapped at zero is a property of the true process. Apparently all that would be required is that the true process not have an absorbing state at zero while the risk-neutral process did. For example, the true and risk-neutral processes could be p drt D .r/dt C ¢ rt d¨t p ^ drt D ›rt dt C ¢ rt d¨t : 4
.0/ > 0
With .0/ > 0, zero will be a natural reflecting barrier for the interest rate process just as it is in the CIR model. Whenever the interest rate reaches zero, it will immediately become positive again. As always, the diffusion term in the risk-neutral process is identical to that in the true process, but the drift term is altered. Unfortunately, the true and risk-neutral processes in Equation (102.7) are not equivalent as is required for a proper risk-neutral process. The true process results in a continuous probability density for r defined over all nonnegative values.6 The risk-neutral process also has a continuous distribution over all positive r – in particular a non-central chi squared distribution, but there is an atom of probability at zero as well. As shown in the appendix, the risk-neutral probability that the interest rate will have reached zero and been trapped there at or before time T is ı PbrfrT D 0jrt D rg D exp 2›r Œ¢ 2 .e ›.T t / 1/ : (102.8) There is a corresponding positive state price atom (not simply a positive state price density) associated with the state rT D 0. The Arrow–Debreu price for the state (atom) rT D 0 when rt D r is7 Q.r; t/ D exp
˚
ı › ” cothŒ 12 ”.T t/ r ¢ 2 :
But under the true process, rT > 0 with probability one, so the true state price (like the true probability) cannot have a positive atom for rT D 0. As previously stated the problem here is a lack of equivalence between the true and risk-neutral measures. They do not possess the same probability-zero sets of states. There are, in fact, two opposite situations to consider. If zero is inaccessible under the true process, then the risk-neutral process clearly has a larger set of possibilities. On the other hand, if zero is accessible under both processes, then the
6
The drift term, ./, must satisfy mild regularity conditions. If is continuous and the stochastic process is not explosive so that r D 1 is inaccessible, then the density function for r 2 .0; 1/ will exist for all future t with a limiting steady state distribution of c
(102.7)
See, for example, Longstaff (1992), which solves the bond-pricing problem in the CIR square-root framework with r D 0, an absorbing barrier. 5 In Dothan’s 1978 model, interest prates.evolves
. as dr D ¢rd¨. The p 8r ¢ ¢ > 0, where K1 is the asymptotic bond price is 8r K1 modified Bessel function of the second kind of order one. See, in particular, Fig. 102.2 on p. 66.
(102.9)
¢ 2r
Z exp 2¢ 2
r
.x/x 1 dx
where c is chosen to ensure it integrates to unity. The density function may of course be zero for some values. 7 The function coth.x/ is the hyperbolic cotangent: coth x .e x C e x /=.e x e x /. The hyperbolic functions are related to the standard circular functions as: sinh x D i sin ix; cosh x D cos ix; tanh x D i tan ix; and coth x D i cot ix:
102 Positive Interest Rates and Yields: Additional Serious Considerations
true process has a large set of possibilities – namely those in which the interest rate reaches the origin and becomes positive again. The latter case, when zero is accessible but not an absorbing barrier under the true process, is irreconcilable with the assumed risk-neutral process. The risk-neutral process completely specifies the partial differential equation and this cannot be altered to modify the probabilities for “interior” states after the origin has been reached and left. In the former case, when zero is inaccessible, the risk-neutral process can be reconciled with the true process making the measures equivalent by assigning zero probability to the offending rT D 0 state. This can be done because the offending state is at a boundary of the distribution so the assignment can be handled by an appropriate boundary condition to the partial differential Equation (102.4) leaving it and therefore the risk-neutral diffusion process itself unchanged. In the remainder of this section we assume that zero is inaccessible under the true process. A sufficient condition for zero to be inaccessible is that .r/ 12 ¢ 2 for all r sufficiently small.8 The boundary condition applied at r D 0 to the bond pricing problem clearly has the form P .0; £/ D p.£/ for some function p.£/. In the PW model, p.£/ D 1 is used, but other assumptions can be made without changing the local no-arbitrage condition inherent in the partial differential equation. Furthermore, since zero is inaccessible, the condition will never actually apply. But what function is logically consistent? Fix a particular boundary condition, p.£/, by conjecture. Now consider a contract that grants 1=p.£/ zero-coupon bonds when the interest rate is zero and then terminates. Such a contract is clearly worthless, if r D 0 is truly inaccessible. However, the risk-neutral pricing procedure will assign a price equal to the expected present value of receiving Œ1=p.£/ p.£/ D 1 when r reaches zero. This is just the value of Q as given in Equation (102.9) above. Therefore, we must have p.£/ D 0. We shall refer to this boundary condition and the resulting solution as bubble-free for reasons to be explained below. As shown in Appendix 102A, the value of a zerocoupon bond under the true and risk-neutral processes in Equation (102.7) and the correct no-arbitrage boundary condition P .0; £/ D 0 when zero is inaccessible is P BF .r; £/ D expŒrB.£/ .1 expŒrŸ.£/ / where Ÿ.£/
2” 2 Œ” sinh.”£/ C › cosh.”£/ › 1 : ¢2 (102.10)
8 This condition is sufficient because zero is inaccessible for the CIR p process dr D k.r r/dt C ¢ rd¨ if 2kr ¢ 2 , and by assumption, the specified process with drift ./ 12 ¢ 2 for small r dominates the CIR process for sufficiently small r. This model is a special case of Heston, et al. (2007).
1507
The function B.£/ and the parameter ” are the same as in the CIR and PW models so this formula is just the PW price multiplied by the factor 1 expŒŸ.£/r . Since Ÿ.£/ > 0 for £ > 0, the value in Equation (102.10) is strictly less than the PW model price at any time before the bond matures. The yield to maturity computed from Equation (102.10) is 1 Y BF .r; £/ `n P BF .r; £/ £ 1 B.£/ r `n .1 expŒrŸ.£/ / : D £ £ (102.11) Since bond prices are lower than in the PW model, yields are correspondingly higher. However, as shown in Fig. 102.2, the yield curves are very similar for short maturities particularly when the spot rate is high. The PW and bubble-free yield curves match better at high rates because the boundary behavior then has less of an effect. As in the PW model, the yield curve is humped as a function of maturity when › < 0, but for realistic parameters, the peak occurs at longer maturities than in their model.9 Unlike the PW model, bubble-free yields are not monotonic in the short rate, but have a U-shape, and the lower bound is not zero. For the £-period yield, the lower bound is Y BF .£/ D
Ÿ.£/ B.£/ B.£/ C Ÿ.£/ 1 `n `n > 0; £Ÿ.£/ B.£/ £ B.£/ C Ÿ.£/ (102.12)
which is achieved when the interest rate is rmin .£/ D
B.£/ C Ÿ.£/ 1 `n : Ÿ.£/ B.£/
(102.13)
Figure 102.3 plots the yields for various maturities as a function of the spot rate. For interest rates above 4%, yields of all maturities through 20 years are nearly proportional to the spot rate just as they are (exactly) in the PW model. Below a spot rate of about 4%, yields have a strong U-shape in the spot rate. The minimum yield for each maturity is not zero, and for longer maturities can be quite high. A side-effect of the bubble-free structure is that it restores the property that the values of zero-coupon bonds go to zero as £ goes to infinity. This can be verified by Equation (102.10) but it must be true since yields for all positive maturities are bounded away from zero, and P .r; £/ expŒY .£/£ . Perhaps a more surprising result is that yields of all maturities become unboundedly large as the interest rate itself
9
In Figs. 102.2 through 102.6, the parameters used, › D 0:03; ¢ D 0:04, are the midpoints of the estimates by PW, though they were fitting their bond pricing function not the bubble-free function.
1508
J. Ingersoll
Fig. 102.2 Pan–Wu and bubble-free yield curves. The PW yield is given ı in Equation (102.6) as rB.£/=£ with B.£/ 2.1 e ”£ / Œ2” C .› ”/.1 e ”£ / . The bubble free yield is £1 ŒB.£/r `n.1 expŒrŸ.£/ with Ÿ.£/ 2” 2 ¢ 2 Œ” sinh.”£/ C
› cosh.”£/ › 1 as given in Equation (102.11). In each case the riskp ^ neutral evolution of the interest rate is dr D ›rdt C ¢ r d¨. The parameter choices › D 0:03 and ¢ D 0:04 correspond to the middle of the range estimated by Pan and Wu by fitting yield curves
Fig. 102.3 Bubble-free ble free yield is given expŒrŸ.£/ with Ÿ.£/ The parameter choices to the middle of the ting yield curves. All
proportional to the spot rate when the latter exceeds 4%. The lower bound for the £-period yield is given in Equation (102.12) D B.£/ =Œ£Ÿ.£/ .`nŒB.£/ C Ÿ.£/ `n B.£// as Y .£/ .`nŸ.£/ `nŒB.£/ C Ÿ.£/ /= £, which is achieved at a spot rate of `n Œ1 C Ÿ.£/=B.£/ = Ÿ.£/
yield as a function of the spot rate. The bubin Equation (102.11): £1 ŒB.£/r `n.1 2” 2 ¢ 2 Œ” sinh.”£/ C › cosh.”£/ › 1 . › D 0:03 and ¢ D 0:04 correspond range estimated by Pan and Wu by fityields through 20 years are approximately
102 Positive Interest Rates and Yields: Additional Serious Considerations
approaches zero. The intuition for this surprising result is found in the bond’s risk premium. The bond’s risk premium, .r; £/, can be determined by Ito’s Lemma
˜.r; £/
1509
B.£/ @P=@r D B.£/ C Ÿ.£/fexpŒrŸ.£/ 1g1 P
PW BF (102.17)
1 2 ¢ rPrr C .r/Pr P£ EŒdP D 2 dt: Œr C .r; £/ dt P P (102.14)
Comparing Equation (102.14) to the pricing equation, which uses the risk-neutral process, we have for the risk-neutral and true dynamics given in Equation (102.7) .r; £/ D
@P = @r Œ.r/ C ›r : P
(102.15)
The semi-elasticity, .@P =@r/=P , also determines each bond’s return risk. Again by Ito’s Lemma dP dP @P = @r p E D ¢ rd¨: P P P
(102.16)
This, of course, is no coincidence. The absence of arbitrage requires that assets whose returns are perfectly correlated have risk premiums proportional to their standard deviations. The relations Equation (102.15) and (102.16) are true for both the PW and BF prices, though the semi-elasticities differ. Under the PW and bubble-free solutions the semielasticities are
Fig. 102.4 Semi-elasticity of bubble-free bond prices. The semi=P D B.£/ C elasticity is given in Equation (102.17): ˜.r; £/ Pr ı Ÿ.£/fexpŒrŸ.£/ 1g1 with B.£/ 2.1 e ”£ / Œ2” C .› ”/ .1e ”£ / and Ÿ.£/ 2” 2 ¢ 2 Œ” sinh.”£/C› cosh.”£/› 1 . The parameter choices › D 0:03 and ¢ D 0:04 correspond to the middle of
Figure 102.4 shows the semi-elasticity function, ˜.r; £/ Pr =P , for the bubble-free prices. (In the PW model, the semi-elasticity is constant as a function of r at the asymptote shown.) As the interest rate increases, the semi-elasticity approaches B.£/, and both the risk and term premium are approximately independent of the interest rate. Where this flattening occurs depends on the bond’s maturity. For bonds with maturities less than 2 years, ˜ is nearly constant for all interest rates above 1.5%. For bonds with maturities in excess of 10 years, ˜ does not flatten out until the interest rate is above 5%. This leads to the unbounded risk premiums and a number of other unusual effects. Under the PW formula, the risk premium at low interest rates is negative since at r D 0 the premium is .0; £/ D B.£/.0/ < 0. Because the bond price can never exceed one and, for the PW model, the price approaches one as r nears zero, the expected change in the bond price, and therefore its risk premium, must be negative near r D 0. For the bubble-free price, the semi-elasticity and therefore the risk premium becomes unboundedly large driving the bond price to zero as r approaches zero.
the range estimated by Pan and Wu bypfitting yield curves. Any bond’s return standard deviation is j˜.r; £/j ¢ r with a negative value of ˜ indicating that bond’s price decreases with an increase in the interest rate. Any bond’s term premium is ˜.r; £/Œ.r/ C ›r
1510
J. Ingersoll
Fig. 102.5 Risk premiums of bubble-free bond prices. The risk premium is the product of the semi-elasticity in Equation (102.17): ˜.r; £/ Pr =P D B.£/ C Ÿ.£/fexpŒrŸ.£/ 1g1 and the difference between the true and risk-neutral expected changes, .r/ C ›r.
It is plotted for .r/ D 0:01.0:06 r/. The other parameter choices › D 0:03 and ¢ D 0:04 correspond to the middle of the range estimated by Pan and Wu by fitting yield curves
At some interest rate levels, longer maturity bonds are less risky and have smaller term premiums than shorter maturity bonds. For example, when the spot rate is 2%, 10-year bonds are less risky than 2-year bonds. At low enough interest rates levels, the risk changes sign and bond prices are increasing in the interest rate. For example, this reversal occurs around 1.5% for 5-year bonds. The change in sign of the semi-elasticity means that the risk premiums also change in sign. In fact in many cases the term premium changes in sign twice. This is true for example for all zero-coupon bonds if the true process mean is .r/ D k.r r/ with k > ›. Typical term premiums are illustrated in Fig. 102.5. Both the PW and BF prices are self-consistent. That is, if zero coupon bonds traded at either model’s prices, they would move in response to changes in interest rates just as the model predicts, and the risk premiums indicated in Equations (102.15) and (102.17) would be earned. However, this is always true of prices with bubbles. It is only in comparison to an alternate price that the bubble is obvious. The magnitude of the PW bubble is
arbitrage would exist by selling the bonds and replicating them at a cost equal to this lower value according to the replicating hedge inherent in the partial differential equation. Suppose bonds sold at the higher PW price. An investment matching the lower BF price could be achieved via a portfolio that holds n.r; £/ D PrBF =PrPW bonds and invests the residual, P BF nPPW , in instantaneous lending. The change in value of this portfolio is d Port D n.r; £/dPPW C ŒP BF n.r; £/P PW r dt p D n.r; £/ Œr C ˜PW .r/ P PW dt C PrPW ¢ rd¨ CŒP BF n.r; £/P PW r dt D
PrBF PrPW PW P .r/dt C P BF r dt PrPW P PW
PrBF PW p P ¢ rd¨ PrPW r p D r C ˜BF .r; £/.r/ P BF dt C PrBF ¢ rd¨ D dP BF C
(102.19) P P W P BF D e rŒB.£/CŸ.£/ :
(102.18)
Unlike most bubbles, this bubble disappears at a fixed point in time, when the bond matures and is always between 0 and 1 in magnitude. Therefore, this bubble is finite in duration and bounded in size. If bonds sold at the higher PW price, an
which exactly matches the change in the BF formula, so this portfolio would always be self-financing and always equal in value to P BF . A long position in this portfolio and a short position in the bond (at the PW price) would be an arbitrage. It would have a negative cost and be guaranteed to
102 Positive Interest Rates and Yields: Additional Serious Considerations
1511
Fig. 102.6 Price bubble in Pan–Wu bond prices. The difference between the Pan–Wu and Bubble-Free price is a price bubble. The magnitude of the price bubble is e rŒB.£/CŸ.£/ where B.£/
2.1 e ”£ / =Œ2” C .› ”/.1 e ”£ / and Ÿ.£/ 2” 2 ¢ 2 Œ” sinh.”£/ C › cosh.”£/ › 1 . Unlike most price bubbles this bubble is bounded in value between 0 and 1 and disappears after a fixed finite
be worth zero when the bond matured. Furthermore, since P PW > P BF , the arbitrage will never have a negative value at any point during the arbitrage. The PW model is valid and gives a lower bound of zero for all yields only when zero is an absorbing barrier under both the true and risk-neutral interest rate processes. The next obvious question is can we have a lower bound of zero for all yields even when the interest rate process does display mean reversion? The answer is yes as we show in the next two sections of this paper.
claim is not true. There are other continuous term structure models in which there is no arbitrage and all interest rates are bounded by zero exactly. In fact there are infinitely many other affine term structure models with a single (excluding parameter) source of risk that have a lower bound of zero for yields. The simplest such model extends the CIR dynamics so that the mean interest rate level, ™, is no longer a constant but is a weighted average of past interest rates. Specifically let xt be an exponentially smoothed average of past spot rates with an average lag of 1=•11 Z 1 xt D • e •s rt s ds: (102.20)
102.5 Multivariate Affine Term-Structure Models with Zero Bounds on Yields Pan and Wu (2006) claim: “Positivity and continuity, combined with no arbitrage, result in only one functional form for the term structure with three sources of risk.”10 But this
0
The dynamics of xt are locally deterministic, and the evolution of the state space is p drt D ›.xt rt /dt C ¢ rt d¨t dxt D •.rt xt /dt
10
Pan and Wu refer to their model as a three-factor (i.e., r; ›; ¢) model with a single dynamic factor, r. In fitting their model they allow the parameters to vary over time, hence adding two additional sources of risk. This “stochastic-parameter” method has been widely used in practice since being introduced to term-structure modeling by Black et al. (1990). As Pan and Wu point out, this is inconsistent with their derivation, which assumes the parameters to be constant. Were the parameters actually varying, then bond prices would not be given by Equation (102.5) or (102.10). A true multifactor model giving results similar to PW would be a special case of the Longstaff and Schwartz (1992) multifactor extension to the CIR model with thep constants in the drift terms set to zero. That is, dsi D ›i si dt C ¢i s i d¨i and r D s1 C s2 CPs3 . The zero-coupon yield to maturity in this model is Y .s; £/ D £1 Bi .£/si . This Longstaff–Schwartz model is an imme-
(102.21)
with a single source of risk, d¨.
diate counterexample to PW’s claim that their formula is unique, but as in the PW model each of the state variables can be trapped at zero, and the interest rate becomes trapped at zero once all three state variables are so trapped. R1 11 The average lag in the exponential average is • 0 se•s ds D •1 . An exponentially smoothed average is the continuous-time equivalent of a discrete-time geometrically smoothed average xt D .1 ˜/ P ˜s rts . Geometrically smoothed averages were first suggested in interest rate modeling by Malkiel (1966).
1512
J. Ingersoll
Unlike in the PW model, the interest rate cannot be trapped at zero under the process in Equation (102.21). Since x is a positively weighted average of past values of r, it clearly must remain positive. Even were r to reach zero,12 x would only decay towards zero at the rate •. But when r reaches zero, its diffusion term is zero so the immediate change in r is dr D ›x dt, and as x is still positive at this point, dr > 0; r immediately becomes positive again. Assuming the factor risk premium is linear in the state variables, .r; x; £/ D .§0 r C §1 x/Pr =P , then the bondpricing equation is13 1 0 D ¢ 2 rPrr C Œ.› §1 /x .› C §0 /r Pr 2 C •.r x/Px rP P£ :
(102.22)
The price of a zero-coupon bond has the form P .r; x; £/ D expŒb.£/r c.£/x :
(102.23)
This pricing formula can be easily verified by substituting the partial derivatives of P in Equation (102.23) into the pricing Equation (102.22). The terms proportional to r and x must separately sum to zero so the functions b and c are the solutions to the linked ordinary first-order differential equations
r terms W 0 D 12 ¢ 2 b 2 .£/ C .› C §0 /b.£/ •c.£/ 1 C b 0 .£/ with b.0/ D 0 x terms W 0 D .§1 ›/b.£/ C •c.£/ C c 0 .£/ with c.0/ D 0:
(102.24)
While a closed-form solution to Equation (102.24) is not known, b.£/ and c.£/ can be easily computed by numerically
12
It is irrelevant for this discussion whether or not zero is accessible; if zero is not accessible for r then clearly neither r nor x can become negative. By comparison to the CIR process, however, we can determine that 0 is accessible for r (though not x). Specifically compare the CIR process with ™ < 2¢ 2 =› to the process in Equation (102.21). The diffusion terms are identical and the excepted change under the bivariate process is smaller than for the CIR process whenever both r and x are less than ™. Since 0 is accessible for the CIR process, it must be accessible for the dominated bivariate process. The fact that r and x can be larger than ™ does not alter this conclusion as the accessibility of 0 depends only on the behavior of r and x near 0. 13 Only r is locally stochastic, so the risk premium is proportional to Pr =P and independent of Px =P . The risk-neutral and true processes are equivalent if and only if §1 < › so that r remains positive under the risk-neutral process as well.
integrating the two equations.14 Note that this calculation need only be done once for a given set of parameters; bond prices at each interest rate level and maturity need not be separately computed as in a finite difference or binomial model. For this model, the yield to maturity on a £-period zerocoupon bond is b.£/ c.£/ 1 rC x: Y .r; x; £/ `n P .r; £/ D £ £ £ (102.25) Since both r and x can be arbitrarily close to zero, yields to maturity have lower bounds of exactly zero for all maturities as in the PW model even though the interest rate exhibits mean reversion as in the CIR model. It should be noted, however, that the lower bound of zero cannot be approached immediately. In the PW model, yields are proportional to the interest rate, and since r can be arbitrarily close to zero at any time, so can the yields. In this two factor model r can also be arbitrarily close to zero at any time, but xt cannot be smaller than x0 e•t ; therefore, the £-period yield cannot be less than £1 c.£/x0 e•t at time t. Figures 102.7 and 102.8 display the yield curve for the two-factor model. Figure 102.7 compares it to the CIR yield curve for the same parameters and when xt D ™. The twofactor yield curve is less steeply sloped than the CIR yield curve because x and r both tend to move towards each other rather than r simply moving toward ™. This lessens future expected movements in the interest rate. Figure 102.8 illustrates the effect on the yield curve of different values of •. The larger is •, the less steeply sloped is the yield curve as x then moves more strongly towards r. Both figures also show that the long zero-coupon yields go to zero as the maturity lengthens regardless of the parameter values (provided • is not zero which is the CIR model). This is true even though the interest rate can never be trapped at zero as in the PW model verifying that the latter is not a prerequisite to the former. One problem with this model, apart from the lack of a simple closed-form expression for the solution, is that the longterm behavior of the interest rate is unrealistic. The model displays short-term mean reversion with r staying near x, but over long periods of time, the interest rate is likely to become
14
In particular, b.£ C £/ b.£/ 12 ¢ 2 b 2 .£/ C .› C §0 /b.£/ •c.£/ 1 £
b.0/ D 0
c.£ C £/ c.£/ Œ.§1 ›/b.£/ C •c.£/ £
c.0/ D 0:
102 Positive Interest Rates and Yields: Additional Serious Considerations
1513
Fig. 102.7 Cox–Ingersoll–Ross and two-factor affine yield curves. This figure shows the yields to maturity for the CIR and two-factor model in Equation (102.25). The parameters are › D 0:7; • D 0:5; ¢ D
0:078. The two factor yield curve is less steeply sloped than the CIR yield curve and is always downward sloping to zero for large maturities
Fig. 102.8 Two-factor yield curves illustrating dependence on •. This figure shows the yields to maturity for the two-factor model in Equation (102.25). The parameters are › D 0:7 and ¢ D 0:078. The
yield curve is plotted for various values of •, the reciprocal of the average lag in the central tendency mean
very large as there is no steady state distribution for the interest rate. In particular, the expected interest rate at time t,
does not reach a limit independent of the current state, and the variance of rt
› .r0 x0 / 1 e .›C•/t ›C• •r0 C ›x0 ; (102.26) ! t !1 •C›
EŒrt jr0 ; x0 D r0
VarŒrt2
¢ 2 •2 .•r0 C ›x0 /t C o.t/: 2›.› C •/3 (102.27)
1514
J. Ingersoll
becomes unbounded. So, in the limit, the distribution of rt is completely diffuse over all positive values. Of course, a similar criticism can also be leveled against the PW model. Those dynamics also have no steady state. The interest rate variance becomes unbounded, and the spot rate is either trapped at zero or has an infinite expectation.15 Also as in the PW model, the prices of very long zerocoupon bonds are bounded away from zero as £ becomes large. For given values of r and x, the smallest that a bond price can be is 8£ P .r; x; £/ > P expŒb1 r c1 x p .§0 C §1 /2 C 2¢ 2 §0 §1 where b1 ¢2 › §1 b1 : c1 (102.28) • This can be verified with Equation (102.24). The derivatives c 0 and b 0 are positive (zero) whenever c is less than (equal to) .› §1 /b=• and •c is greater than (equal to) 1 .› C §0 /b 1 2 2 0 2 ¢ b , respectively. Since b.0/ D c.0/ D c .0/ D 0 and 0 b .0/ > 0, both b and c increase monotonically staying in the range bounded by the b 0 D 0 and c 0 D 0 curves approaching b1 and c1 as £ ! 1. Similar properties are true for all affine models with multiple state variables. Consider the general multiple statevariable extension to the CIR model with a risk-neutral evolution of p ^ dr D .r; O x/dt C ¢ rd¨
X p ^ D KO C ›O 0 r C ›O i xi dt C ¢ rd¨
X •ji xi dtj D 1; : : : I: dxj D j .r; x/dt D j C •j 0 r C (102.29)
The same analysis leading to Equation (102.24) shows the yield curve will be affine Y .r; x; £/ D a.£/ C b.£/r C
X
ci .£/xi :
102.6 Non-Affine Term Structures with Yields Bounded at Zero Outside of the affine class, many models of the term structure will have zero lower bounds for all rates and still be wellbehaved with mean reversion and a finite-variance steadystate distribution with no atom of probability at r D 0. One such model that admits to a closed form solution for bond prices is the three-halves power model in which the interest rate’s evolution is 3=2
drt D ›rt .™ rt /dt C ¢rt d¨t ;
(102.31)
with ›; ™; ¢ > 0.17 This structure was introduced in the original CIR (1985) paper to model the rate of inflation. Like the CIR process, this diffusion displays mean reversion with a central tendency of ™, and the local variance vanishes when rt D 0 so negative rates are impossible.18 The interest rate has a finite-variance steady state distribution with a density function, mean, and variance of
(102.30)
so all yields will have a lower bound of exactly zero only if a.£/ 0 and all state variables remain nonnegative. The former requires that the constant terms, KO and all j , are zero. The latter requires additionally that all ›O i .i ¤ 0/ and all •ij .j ¤ i / are nonnegative and that all state variables are
15
nonnegative initially.16 The interest rate will never be trapped at zero so long as one of the parameters, ›O i .i ¤ 0/, is positive. However, as in the two-variable case, there will not be a finite-variance steady-state distribution, and the prices of all zero-coupon bonds will be bounded away from zero even as the maturity grows without bound. Our search for an affine model in which the interest rate is well behaved in the long term and yields of all maturities are bounded below exactly by zero has not been successful. But such models are possible outside of the affine structure. One such model has already appeared in the literature. It is the three-halves power model used in the two-factor CIR model. This model is discussed in the next section.
If › 0 in the PW model, then the interest rate is eventually trapped at zero with probability one. If › < 0, then the expected interest rate ›t ! 1; VarŒrt jr0 D and variance ›t become ı infinite, EŒrt jr0 D r0 e 2 2›t e › ! 1, and there is an atom of probability for r0 ¢ e r1 D 0 equal to exp.2›r0 =¢ 2 /.
16 In addition, the risk-neutral process must also be equivalent to the true process so all state variables must remain nonnegative under the latter as well. 17 The central tendency parameter, ™, and the adjustment parameter, ›, can be zero just as in the PW model. For the three-halves process, the origin remains inaccessible even in these cases, and yields are still bounded below by zero. There is, however, no finite-variance steadystate distribution. 18 Since the drift term is zero at an interest rate of zero, rt D 0 is technically an absorbing state. However, zero is inaccessible for all parameter values so r is never trapped there. To verify this define z D 1=r. pThen using Itô’s Lemma, the evolution of z is dz D .›C¢ 2 ›™z/dt¢ zd¨. Since z D 1 is inaccessible for the square root process with linear drift, zero is inaccessible for r D 1=z. Note also that 2.› C ¢ 2 / > ¢ 2 so zero is inaccessible for z guaranteeing that 1 is inaccessible for r in Equation (102.31).
102 Positive Interest Rates and Yields: Additional Serious Considerations
f .r; 1/ D r 1 EŒr1 D
Œ.“ 1/™ 1C“ “ .“1/™=r r e .“ C 1/
“1 ™ “
Var Œr1 D
The three-halves power model may fit the data better than the CIR model and other proposed models. Chan et al. (1992) tested the general specification for interest rate evolution EΩt D 0
2”
EŒ©2t D ¢ 2 rt : (102.33)
Using GMM they found a best unrestricted fit of ” D 1:4999 with a standard error of 0.252. This, of course, is not a direct test of Equation (102.31), which has a quadratic rather than linear form for the expected interest rate, but the estimated
P .r; £/ D
(102.32)
value of ” is almost exactly what this model calls for. The CIR-square-root, Merton (1975, 1990), and Vasicek (1977) models are all well outside the usual confidence intervals, and the Brennan and Schwartz (1982) and Dothan (1978) models are just at the 5% significance level.19 We assume that the risk premium is of the form .r; £/ r D .§1 r C §2 r 2 /Pr =P so that the risk-neutral process also has the form in Equation (102.31). The risk-neutral drift is O then ›O r.™r/ with ›O ›C§2 and ™O .›™§1 /=.›C§2 /.20 Zero-coupon bond prices for this model can be determined from Cox et al. (1985) as
. •/ Œc.£/=r • M.•; ; c.£/=r/ ./
1 ›™ §1 .›™§1 /£ e 1 ; “O 1 C 2.› C §2 /= ¢ 2 2 ¢ O • 12 .“O 2 C 8 =¢ 2 /1=2 12 “; 1 C .“O 2 C 8 =¢ 2 /1=2 ;
c.£/ 2
where
./ is the gamma function, and M./ is the confluent hypergeometric function.21
Y1
lim 1£ `n P .r; £/ £!1
(
• `n £
e .›™§1 /£ 1 C O.£1 / • £ o.£/
See Abramowitz and Stegum (1964) for the properties of the gamma and confluent hypergeometric functions.
(102.34)
The asymptotic long rate is
! £!1 ! £!1
19
21
¢ 2 .“ 1/2 ™2 ¢2 2 D r 2› “2 2› 1
“ 1 C 2›=¢ 2:
where
rt D ’ C “rt 1 C ©t
1515
O .›™ §1 /• D ›O ™• 0
for ›O > 0 for ›O 0: (102.35)
The volatility parameter is ” D 0 for Merton (1990) and Vasicek (1977), ” D 12 for CIR, and ” D 1 for Brennan and Schwartz (1982), Dothan (1978) and Merton (1975). Each of these models with the exception of Merton’s (1975) does have a linear form for the expected change in r. 20 We require that §2 .› C 12 ¢ 2 / so that the true and risk-neutral processes are equivalent. If this condition is not satisfied then the riskneutral process is explosive, and the interest rate can become infinite in finite time. As shown in footnote 18, the risk-neutral process for z O O Œdz D .O› C¢ 2 ›O ™z/dt. So if §2 violates the condition given, 1=r has E 2 2.O› C ¢ / < ¢ 2 , and 0 is accessible for z implying that 1 is accessible in finite time for r under the risk-neutral (though not true) process.
1516
J. Ingersoll
When §2 < › (i.e., ›O > 0), Y1 is a positive constant a bit less O 22 The asympthan the risk-neutral central tendency level ™. totic bond price, P .r; 1/, is also zero when §2 ›.23 For short maturities
P Substituting P .r; £/ D expŒ yi .£/r i and its derivatives into the bond pricing equation gives 0D
Y .r; £/ D r C 12 rŒ›™ §1 .› C §2 /r £ C O.£2 / D r C 12 ›O r.™O r/ £ C O.£2 /
(102.36)
so the yield curve is upward sloping at £ D 0 whenever the interest rate is below the risk-neutral central tendency point, O < ™O since ›O • < 1; therefore, O The long rate is Y1 D ›O ™• ™. the yield curve is humped shape whenever the spot interest O rate is between the values ›O •™O and ™. Even though the asymptotic yield to maturity is a positive constant (when ›O > 0), the yield to maturity on any finite maturity bond is bounded below exactly by zero. As shown in the Appendix, the £-period yield to maturity is an analytic function of r and can be expressed as a power series with no leading constant term, 2r 2.› C §2 /r 2 1 C : Y .r; £/ `n P .r; £/ D 2 £ ¢ £c.£/ ¢ 4 £c 2 .£/ (102.37) Therefore, for any ı finite maturity, the yield to that maturity is less than 2r Œ¢ 2 £c.£/ when r is sufficiently small and approaches its lower bound of zero as r does.24 These properties can also be verified for many models even when a closed-form solution to the bond pricing problem cannot be found. Suppose that the risk-neutral expected change and variance in the interest rate process and the yield to maturity are analytic functions of the interest rate at r D 0; that is, they can be expressed as the infinite power series ¢ 2 .r/ D
1 X
.r/ O D
si r i
i D0
Y .r; £/ D
1 X
mi r i
i D0 1 X
£1 yi .£/r i :
(102.38)
i D0
A yield to maturity will have the desired lower bound of exactly zero if and only if the lead term in its expansion is zero; that is, y0 .£/ 0. Holding ¢ constant, ›O • increases from 0 to 1 when “O ranges from 1, its lowest value, to 1. 23 Whenı §2 > ›, the asymptotic bond price is P .r; 1/ D . •/ ./Œ2O›=¢ 2 • M.•; ; 2O›=¢ 2 / > 0. 24 The lower bound for any yield is zero since for every finite £, there exists an interest rate r£ such that Y .r; £/ < © for all r < r£ . The bound is not a uniform one for all £, however, in that r£ depends on £. The bound cannot be uniform since the asymptotic long rate is a positive constant.
i2 1 X i hX si r .i C 1/yi C1 .£/r i 2
X .i C 1/.i C 2/yi C2 .£/r i
D
X
mi r i
hX
i hX i .i C 1/yi C1 .£/r i r C yi0 .£/r i
1 2 s0 y1 .£/ 2y2 .£/ m0 y1 .£/ C y00 .£/ 2 C r Œ C r 2 Œ C
(102.39)
The terms in each power of r must be identically zero, so y0 .£/ will be constant (and therefore 0 since yi .0/ D 0) if and only if s0 D m0 D 0. If s0 D 0, then zero is a natural barrier for the interest rate process and negative rates will be precluded if .0/ 0.25 If m0 also is zero, then rt D 0 is an absorbing barrier; however, the barrier might be inaccessible as in the three-halves power model. Therefore, models like CIR (1985) or Vasicek (1977) with a linear expected change in the interest rate, .r/ D ›.™ r/, will not have yields with a lower bound of zero. Conversely, a model like that in Merton (1975) with ¢.r/ D vr and .r/ D ar br2 will have all yields bounded below by exactly zero.
102.7 Non-Zero Bounds for Yields All of the results presented here apply to other constant bounds for yields that derive from restricted interest rate processes. Suppose the interest rate is restricted so that it can never go below r. Under what conditions will all yields have this same lower bound of r ? This question can be easily answered by defining the modified interest rate, ¡ r r. Since ¡ and r are linearly related, the dynamics for ¡ will be identical to those for r translated down by r, and the modified rate, ¡, can never be negative. All yields will have an identical lower bound of r whenever yields in the modified economy are all bounded below by zero. To verify this claim write the interest rate dynamics as dr D .r/dt C ¢.r/d¨. Now express the bond prices in the original economy as P .r; £/ D e r£ ‚.¡; £/. The bond pricing equation can then be reexpressed as
22
1 2 ¢ .r/Prr C .r/Pr rP P£ 2 1 D ¢ 2 .¡ C r/‚¡¡ C .¡ C r/‚¡ ¡‚ ‚£ 2
0D
25
(102.40)
Zero can also be an inaccessible natural barrier for the process if .0/ D 1 even if ¢.0/ ¤ 0.
102 Positive Interest Rates and Yields: Additional Serious Considerations
This is just the regular bond pricing equation for an interest rate ¡ with expected change and standard deviation of ¡ .¡/ .¡ C r/ and ¢¡ .¡ C r/. So all yields in the original economy will be larger than those in the modified economy by exactly r. In particular, all yields will have a lower bound of r if, and only if, they have a lower bound of 0 in the modified economy. For example, in Sundaresan (1984),26 the dynamics of the real interest rate are dr D .’ 1/.r C ¢ 2 / .r • C 12 ’¢ 2 /dt C ¢d¨ (102.41) with r > ¢ . Rewriting this in terms of ¡ r C¢ , we have 2
2
1 d¡ D .’ 1/¡ .¡ • C .’ 2/¢ 2 /dt C ¢d¨ : 2 (102.42) which is identical to Merton’s (1975) model. All yields have a lower bound of 0 in Merton’s model; therefore, all yields in Sundaresan’s model are bounded below by exactly ¢ 2 .
102.8 Conclusion This paper has established the properties of interest-rate models in which all yields have a lower bound equal to the lower bound of the interest rate itself. In particular, when the interest rate must remain positive, it is possible for all yields to have a lower bound of zero as well. Yields are not simple arithmetic expectations (or even risk-neutral expectations) of future short rates; therefore, they can be as low as the lowest possible interest rate even when the interest rate process displays mean reversion and has a finite-variance steady-state distribution. This paper also illustrates the problems that can arise when then true and risk-neutral stochastic processes for the interest rate have different boundary behaviors. Price bubbles can be introduced unless events that are impossible under the true distribution are assigned zero probability under the riskneutral process as well. In some cases, such as the Pan Wu model, alternate bubble-free prices can be derived.
26 Equation (102.41) fixes a typo in the second unnumbered equation on p. 84 of Sundaresan (1984).
1517
References Abramowitz, M. and I. Stegum. 1964. Handbook of mathematical functions, Dover Publications, New York. Black, F., E. Derman, and W. Toy. 1990. “A one-factor model of interest rates and its application to treasury bond options.” Financial Analysts Journal 46, 33–39. Brennan, M. J. and E. S Schwartz. 1982. “An equilibrium model of bond pricing and a test of market efficiency.” Journal of Financial and Quantitative Analysis 41, 301–329. Chan, K. C., G. Andrew Karolyi, F. Longstaff, and A. Sanders 1992. “An empirical comparison of alternative models of the short-term interest rate.” Journal of Finance 47, 1209–1227. Cox, J. C., J. E. Ingersoll, and S. A. Ross. 1981. “A re-examination of traditional hypotheses about the term structure of interest rates.” Journal of Finance 36, 769–799. Cox, J. C., J. E. Ingersoll, and S. A. Ross. 1985. “A theory of the term structure of interest rates.” Econometrica 53, 385–408. Dothan, U. 1978. “On the term structure of interest rates.” Journal of Financial Economics 6, 59–69. Dybvig, P. H., J. E. Ingersoll, and S. A. Ross. 1996. “Long forward and zero-coupon rates can never fall.” Journal of Business 69, 1–25. Heston, S. L., M. Lowenstein, and G. A. Willard. 2007. “Options and bubbles.” Review of Financial Studies 20, 359–390. Longstaff, F. A. 1992. “Multiple equilibria and term structure models.” Journal of Financial Economics 32, 333–344. Longstaff, F. A. and E. S. Schwartz. 1992. “Interest rate volatility and the term-structure of interest rates: a two-factor general equilibrium model.” Journal of Finance 47, 1259–1282. Malkiel, B. 1966. The term structure of interest rates: expectations and behavior patterns, Princeton University Press, Princeton. Merton, R. C. 1975. “An asymptotic theory of growth under uncertainty.” Review of Economic Studies 42, 375–393. Merton, R. C. 1990. “A dynamic general equilibrium model of the asset market and its application to the pricing of the capital structure of the firm,” chapter 11 in Continuous-time finance Basil Blackwell Cambridge, MA. Pan, E. and L. Wu. 2006. “Taking positive interest rates seriously.” chapter 14, Advances in Quantitative Analysis of Finance and Accounting 4, Reprinted as chapter 98 herein. Sundaresan, M. 1984. “Consumption and equilibrium interest rates in stochastic production economies.” Journal of Finance 39, 77–92. Vasicek, O. 1977. “An equilibrium characterization of the term structure.” Journal of Financial Economics 5, 177–188.
Appendix 102A 102A.1 Derivation of the Probability and State price for r T = 0 for the PW Model Let H.r; t/ be probability that rT D 0 conditional on rt D r for the square root stochastic process with no mean reversion: dr D ›rdt C ¢r 1=2 d¨. Let Q.r; £/ be the state price for the state rT D 0; that is, Q.r; £/ is the value at time t when the interest rate is r of receiving $1 if the interest rate is zero at time T D t C £. This section of the appendix verifies Equations (102.8) and (102.9) in the text that
1518
J. Ingersoll
ı where h.£/ 2› ¢ 2 .e ›£ 1/ ı where q.£/ Œ› ” coth. 12 ”£/ ¢ 2 :
H.r; tI T / D expŒh.£/r Q.r; £/ D expŒq.£/r
(102A.1)
The probability H satisfies the Kolmogorov backward equation and boundary conditions
0 D 12 ¢ 2 rHrr ›rHr H£
subject to H.0; tI T / D 1 and H.r; T I T / D 0:
The condition H.0; t/ D 1 must be satisfied because zero is an absorbing state so once r reaches zero before time T , it will be there at time T with probability one.
Hr D expŒrh.£/ h.£/
(102A.2)
For the solution given, H.0; t/ D 1 and H.r; T / D 0 are readily confirmed. The solution itself can be verified by differentiating and substituting into Equation (102A.2). The partial derivatives we need are
Hrr D expŒrh.£/ h2 .£/
H£ D expŒrh.£/ rh0 .£/
with h0 .£/ D 2›2 ¢ 2 .e ›£ 1/2 e ›£ D 12 ¢ 2 e ›£ h2 .£/:
(102A.3)
So we have 1 2 ›£ › 1 2 1 2 2 ¢ rHrr ›rHr H£ D expŒrh.£/ rh .£/ ¢ C ¢ e 2 2 h.£/ 2 1 2 1 2 ›£ 2 1 2 ›£ D expŒrh.£/ rh .£/ ¢ C 2 ¢ .e 1/ 2 ¢ e D0 2
as required establishing the first part of Equation (102A.1). The state price for the state rT D 0 has the same value as a contract that pays $1 the first time that the interest rate hits zero because the interest rate is then trapped at zero, and
0 D 12 ¢ 2 rQrr ›rQr rQ Q£
(102A.4)
the state rT D 0 will be realized for sure, and with a zero interest rate there is no further discounting. The value of this therefore asset satisfies the usual pricing partial differential equation
subject to Q.0; £/ D 1
and Q.r; 0/ D 0:
(102A.5)
The boundary conditions are confirmed since coth.0/ D 1. Again differentiating and substituting into Equation (102A.5) gives27
27
The derivative of the hyperbolic cotangent is the negative hyperbolic cosecant function d coth x=dx D csch2 x D 4.e x e x /2 :
The third equality in Equation (102A.6) uses the identity coth2 x csch2 x 1. The fourth equality follows from the definition of ”.
102 Positive Interest Rates and Yields: Additional Serious Considerations
1519
1 2 ¢ rQrr ›rQr rQ Q£ 2 rQ D 2 12 Œ› ” coth. 12 ”£/ 2 ›Œ› ” coth. 12 ”£/ ¢ 2 12 ” 2 csch2 . 12 ”£/ ¢
1 2 rQ rQ 2 1 2 2 1 2 1 D 2 › ¢ C 2 ” Œcoth . 2 ”£/ csch . 2 ”£/ D 2 12 ›2 ¢ 2 C 12 ” 2 D 0 ¢ 2 ¢
as required establishing Equation (102A.1).
the
second
part
(102A.6)
of setting the bond price to zero when the interest rate is zero. This boundary condition is required to make the risk-neutral and true processes equivalent.
102A.2 Bond Price When r t = 0 Is Accessible for Only the Risk-Neutral Process As discussed in the body of the paper, the pricing equation is the standard one with only an altered boundary condition
0 D 12 ¢ 2 rPrr ›rPr rPP£
P .r; 0/D1
P .0; £/ D 0: (102A.7)
The price of the bond is
P .r; £/ D expŒrB.£/ .1 expŒrŸ.£/ / where
Ÿ.£/
2” 2 ¢ 2 Œ” sinh.”£/ C › cosh.”£/ ›
B.£/
2 2.1 e ”£ / D : 1 2” C .› ”/.1 e ”£ / ” coth. 2 ”£/ C ›
B.£/ is the same function as found in the CIR solution. The maturity condition are satisfied since B.0/ D 0; Ÿ.0/ D 1, and the boundary condition at r D 0 is clearly
(102A.8)
satisfied. That the solution satisfies the pricing partial differential equation can be verified by substituting the derivatives
Pr D B.£/P .r; £/ C Ÿ.£/ exp .rŒB.£/ C Ÿ.£/ / Prr D B 2 .£/P .r; £/ Ÿ.£/ŒŸ.£/ C B.£/ exp .rŒB.£/ C Ÿ.£/ / P£ D rB 0 .£/P .r; £/ C rŸ0 .£/ exp .rŒB.£/ C Ÿ.£/ /
(102A.9)
into the partial differential equation and collecting terms
0 D rP .r; £/
1
2¢
2
B 2 C ›B 1 C B 0 r exp .rŒB.£/ C Ÿ.£/ / 12 ¢ 2 Ÿ2 C .› C 12 ¢ 2 B/Ÿ C Ÿ0 :
(102A.10)
1520
J. Ingersoll
The first term is zero because B.£/ is the same function as in the CIR model. So we need only verify that the final term in brackets is also zero. The derivative of Ÿ is
Ÿ0 .£/ D D
2” 2 Œ” sinh.”£/ C › cosh.”£/ › 2 Œ” 2 cosh.”£/ C ”› sinh.”£/ ¢2 ¢2 2 Ÿ .£/Œ” 2 cosh.”£/ C ”› sinh.”£/ : 2” 2
(102A.11)
So the final term in brackets in Equation (102A.10) is ı 1 2 2 1 1 ¢ Ÿ C .› C ¢ 2 B/Ÿ C Ÿ0 D ¢ 2 Ÿ2 1 C .2›= ¢ 2 C B/ Ÿ cosh.”£/ .›= ”/ sin.”£/ 2 2 2
1 2 1 2 2 2 ›C ¢ B Œ” sinh.”£/ C › cosh.”£/ › cosh.”£/ .›= ”/ sin.”£/ D ¢ Ÿ 1C” 2 2 1 1 D ¢ 2 Ÿ2 ¢ 2 ” 2 BŒ” sinh.”£/ C › cosh.”£/ › C .›2 = ” 2 1/Œcosh.”£/ 1 2 2 1 4 2 2 1 BŒ” sinh.”£/ C › cosh.”£/ › cosh.”£/ C 1 : (102A.12) D ¢ Ÿ ” 2 2
Substituting for B.£/ and using the “half-angle” identity,28 1 2 2 1 2 1 4 2 2 ” sinh.”£/ C › cosh.”£/ › 0 ¢ Ÿ C .› C ¢ B/Ÿ C Ÿ D ¢ Ÿ ” cosh.”£/ C 1 2 2 2 ”Œcosh.”£/ C 1 = sinh.”£/ C › " # 1 4 2 2 ” sinh2 .”£/ ” cosh2 .”£/ C ” D ¢ Ÿ ” D 0: 2 ”Œcosh.”£/ C 1 C › sinh.”£/
102A.3 Properties of the Affine Exponentially Smoothed Average Model The multivariate affine model with a exponentially smoothed average has a Kolmogorov backward equation of 0 D 12 ¢ 2 rFrr C›.x r/Fr C•.r x/Fx F£ :
The “half-angle” identity is coth 12 x D .cosh x C 1/=sinh x. The final equality in Equation (102A.13) follows from the identity cosh2 x sinh2 x D 1.
28
equation subject to various boundary conditions. In partic ular, the expected values EŒrt C£ and E rt2C£ are solutions with boundary conditions F .r; x; 0/ D r and r 2 , respectively. The expected value of the interest rate is EŒrt C£ D
(102A.14)
The joint probability distribution for r and x, and other probably functions are the solutions to this partial differential
(102A.13)
rt Œ• C ›e .›C•/£ C ›xt Œ1 e .›C•/£ ›C• (102A.15)
The expected value of the square of the interest rate has a quite messy formula, but for times far in the future, its asymptotic behavior is EŒrt2C£ D
¢ 2 •2 .•rt C ›xt /£ C o.£/: 2›.› C •/3
(102A.16)
102 Positive Interest Rates and Yields: Additional Serious Considerations
Since the mean value converges to EŒr1 D .•rt C ›xt /= .• C ›/, the variance of rt C£ also diverges at the rate £.
where C is the constant of integration required to ensure the density integrates to one. For the three-halves power process the steady state density is Z C f .r1 / D 2 3 exp 2 ¢ r
102A.4 Properties of the Three-Halves Power Interest Rate Process
f .x1 / D
C ¢ 2 .x/
x
.z/= ¢ 2 .z/d z
r
›z.™ z/= ¢ z d z 2 3
2
2
D .2›™ =¢ 2/2C2›=¢ r 32›=¢ exp.2›™ =¢ 2 r/:
The steady-state distribution for a diffusion on .0; 1/ with inaccessible boundaries and evolution dx D .x/dt C ¢.x/d¨ is Z exp 2
1521
(102A.18) For the risk-neutral dynamics 3
dr D rŒ›™ §1 .› C §2 /r dt C ¢r 2 d¨ (102A.17)
3
D ›O r.™O r/dt C ¢r 2 d¨;
(102A.19)
the bond price is given in Cox et al. (1985) as
P .r; £/ D
. •/ Œc.£/=r • M.•; ; c.£/=r/ ./
1 2O›™O .›™§1 /£ e 1 ; “O 1 C 2O› = ¢ 2 2 ¢ O • 12 .“O 2 C 8 =¢ 2 /1=2 12 “; 1 C .“O 2 C 8 =¢ 2 /1=2 ;
where
c.£/
To help analyze the three-halves power model, we first present some preliminary results. Define the function
(102A.20)
m.zI a; b/ za M.a; b; 1=z/ where M.a; b; x/ is the confluent hypergeometric function.29 Then
m0 .z/ D aza1 M.a; b; 1 =z/ C za2 .a =b/M.a C 1; b C 1; 1 =z/ D aza1 M.a C 1; b; 1 =z/:
(102A.21)
By repeated application of Equation (102A.21), the nth derivative is m.n/ .z/ D .1/n a.a C 1/ .a C n 1/zan M.a C n; b; 1 =z/ D .1/n with
.a C n/ an z M.a C n; b; 1=z/ .a/ m.n/.0/ D .1/n
.a C n/.b/ : .a/.b a n/
(102A.22)
Since derivatives of all orders exist, m is analytic and can be expressed as the power series m.z/ D
1 X nD0
1 X ı zn m.n/ .0/ nŠ D .z/n nD0
29 .a C n/.b/ The derivative of the confluent hypergeometric function is : @M.a; b; x/=@x D .a=b/M.a C 1; b C 1; x/. The asymptotic behavior nŠ.a/.b a n/ as x ! 1 is M.a; b; x/ D .b/=.b a/x a Œ1 C O.1=x/ . See (102A.23) Abramowitz and Stegum (1964).
1522
J. Ingersoll
The bond price in Equation (102A.20) therefore can be written as the power series30
. •/ X .• C n/ . •/ m .c.£/=rI •; / D P .r; £/ D Œr =c.£/ n .•/ .•/ nD0 nŠ. • n/ 1
D 1 Œr =c.£/ •. • 1/ C 12 Œr =c.£/ 2 •. • 1/.1 C •/. • 2/ D 1
2r ¢ 2 c.£/
C
2.1 C ›O /r 2 C ¢ 4 c 2 .£/
(102A.24)
And since `n.1 C x/ D x 12 x 2 C , the yield to maturity can be expressed as the power series 1 2r 2.O› C 1/r 2 2r 2 Y .r; £/ `n P .r; £/ D 2 4 2 C 4 2 C £ ¢ £c.£/ ¢ £c .£/ ¢ £c .£/ D
2O›r 2 2r C O.Œr =c.£/ 3 /: ¢ 2 £c.£/ ¢ 4 £c 2 .£/
(102A.25)
At short maturities Y .r; £/ D r C 12 ›O r.™O r/£ C O.£2 /:
30
Note that •. • 1/ D ¢ 4
h
›O C 12 ¢ 2
2
C 2¢ 2
(102A.26)
i1=2
C ›O C 12 ¢ 2
h
›O C 12 ¢ 2
2
C 2¢ 2
i1=2
›O 12 ¢ 2 D 2¢ 2 and 2• 2 D 2O›=¢ 2
Chapter 103
Functional Forms for Performance Evaluation: Evidence from Closed-End Country Funds* Cheng-Few Lee, Dilip K. Patro, and Bo Liu
Abstract This paper proposes a generalized functional form CAPM model for international closed-end country funds performance evaluation. It examines the effect of heterogeneous investment horizons on the portfolio choices in the global market. Empirical evidences suggest that there exist some empirical anomalies that are inconsistent with the traditional CAPM. These inconsistencies arise because the specification of the CAPM ignores the discrepancy between observed and true investment horizons. A comparison between the functional forms for share returns and NAV returns of closed-end country funds suggests that foreign investors may have more heterogeneous investment horizons compared to the U.S. counterparts. Market segmentation and government regulation does have some effect on the market efficiency. No matter which generalized functional model we use, the empirical evidence indicates that, on average, the risk-adjusted performance of international closed-end fund is negative even before the expenses. Keywords Closed-end country fund r Functional transformation r Performance evaluation
103.1 Introduction and Motivation A nonlinear functional form is more appropriate when returns are measured over an interval different from the “true” homogeneous investment horizon of investors that is assumed by the CAPM (Jensen 1969). Levy (1972) and D.K. Patro () OCC, 250 E Street SW, Washington, DC 20219, USA e-mail: [email protected] C.-F. Lee Department of Finance and Economics, Rutgers Business School, 94 Rockafeller Road, Piscataway, NJ 08854, USA e-mail: [email protected] B. Liu Citigroup Global Market Inc., 390 Greenwich Street, New York, NY 10013, USA e-mail: [email protected]
Levhari and Levy (1977) demonstrate that if the assumption of the holding period is different from the “true” investment horizon, there will be a systematic bias of the performance measurement index as well as the beta estimate. Lee (1976) has proposed a non-linear model to investigate the impact of investment horizon on the estimate of beta coefficients. Lee et al. (1990) have theoretically derived the relationship between heterogeneous investment horizons and capital asset pricing model. Empirically, Fabozzi et al. (1980) have used a generalized functional form approach developed by Zarembka (1968) to investigate the open-ended mutual fund return generation process. Chaudhury and Lee (1997) use the functional form model for international stock markets to investigate whether the international stock markets are integrated or segmented. The U.S. traded international closed-end funds (CEFs) provide a useful testing ground for the market efficiency in the framework of functional transformation (see Patro 2008). Mutual funds provide investors with heterogeneous investment horizons. As mentioned in Johnson (2004), investors’ liquidity needs are primarily revealed by their investment horizons. His empirical result shows that investors’ liquidity needs can vary significantly across individual investors, which leads to an obvious heterogeneous investment horizons phenomenon according to his duration model. The closed-end funds are unique in that they provide contemporaneous and observable market-based rates of returns for both funds and underlying asset portfolios. Moreover, for most funds the value of the underlying portfolio is known with considerable accuracy since the component assets are listed on the stock market. However, close-end funds typically trade at a substantial discount to the underlying value of their holdings (the net asset value (NAV) of the fund). The discount value is not constant, and varies considerably over time. Unlike domestic closed-end funds, international * A different version of the chapter was published in volume three of Advances in Financial Planning and Forecasting 2008, pp 235–278. The views expressed in this article are solely that of the authors and does not represent necessarily views of the OCC or Citigroup. We would like to thank Ren-Raw Chen, Ben Sopranzetti, Louis Scott and participants in ASSA 2005 Annual Meeting in Philadelphia for helpful comments.
Unlike domestic closed-end funds, international closed-end funds' shares and underlying assets are traded in different markets. Therefore, share returns and NAV returns may display totally different distribution characteristics, reflecting the corresponding markets. Investors from different markets will also have different consumption patterns and investment horizons, especially investors from emerging markets. Prior research by Chang et al. (1995) finds that, except for the Mexico Fund, the shares of a sample of 15 closed-end country funds did not outperform the Morgan Stanley Capital International (MSCI) world market index over the 1989–1990 period. Patro (2001) also provides evidence of inferior performance of the risk-adjusted share returns and NAV returns of 45 U.S.-based international closed-end funds over the 1991–1997 period.

The main purpose of this chapter is to propose a generalized functional form model for closed-end fund performance evaluation. First, we investigate whether the negative performance of international closed-end funds documented by previous studies is due to an incorrect specification of the return-generating process. Second, we test whether international closed-end fund investors have heterogeneous investment horizons. Third, we provide a comprehensive analysis of the relationship between the functional form of the returns and fund characteristics, with special attention to the differences between emerging market funds and developed market funds and between single-country funds and regional funds.

We also consider the effect of short sales on the return-generating process of international CEFs. Short selling plays an important role in determining market efficiency. On the one hand, short selling facilitates price discovery; on the other hand, it may also facilitate severe price declines in individual securities. A common conjecture by regulators is that short-sale restrictions can reduce the severity of price declines. Hong and Stein (2003) develop a model linking short-sale constraints to market crashes: investors with negative information who are constrained from short selling cannot reveal their information until the market begins to drop, and their trading then further aggravates the decline and can lead to a crash. Most research, however, suggests that short-sale constraints have an adverse effect on market efficiency. Empirical evidence from both U.S. and non-U.S. markets supports the theoretical view that constraining short sales hinders price discovery, particularly when the news is bad. Aitken et al. (1998) provide evidence that short sale trades reflect significant bad news about companies. Jones and Lamont (2002) show that stocks that are expensive to short have higher valuations and thus lower subsequent returns, consistent with their hypothesis that stocks that are difficult to short are overpriced. Bris et al. (2004) find significant cross-sectional variation in equity returns in markets where short selling is allowed or practiced, after
controlling for a host of other factors, while no such evidence appears in non-short-selling markets. Although we doubt that closed-end country fund managers undertake short sales in their portfolio investments, short selling does affect investors' behavior in the local markets, which in turn affects local stock prices. Examining the data-generating process of international CEFs, especially the movement of NAV returns, therefore provides a better understanding of the effect of government regulation on financial markets.

In this chapter, we use the generalized functional form approach to investigate the performance of 90 closed-end country funds. Both the linear and the loglinear assumptions on the return-generating process are rejected for nearly 25% of the closed-end country funds. The wide distribution of estimated transformation parameters indicates that investors from foreign countries (especially emerging countries) have more heterogeneous investment horizons than their U.S. counterparts. The government regulation (short sale) effect on the functional form appears only in the NAV returns. Further, we find that most of the international closed-end country funds are not integrated with the world market, even though the global risk factor is significant in several models. Consistent with previous research, after accounting for the functional form transformation of share returns, international closed-end country funds generate negative risk-adjusted returns, no matter which benchmark we use. Moreover, in the framework of the generalized CAPM, and consistent with previous research, all factors in the Carhart (1997) four-factor model remain important pricing factors. However, when we include the global return in the model, the momentum factor no longer plays an important role.

This chapter is organized as follows. In Sect. 103.2, we investigate the appropriateness of the functional forms used by previous research. In Sect. 103.3, a generalized international capital asset pricing model is used to test whether closed-end funds are integrated with the world market. Section 103.4 describes the data and the testing methodology used in the empirical analysis. The empirical results are summarized in Sect. 103.5, and the conclusion is given in Sect. 103.6.
103.2 Literature Review

103.2.1 CES Functional Form of the CAPM

The traditional CAPM assumes that all investors have the same single-period investment horizon. However, this assumption is unlikely to be true. In reality, individual investors have multiple investment horizons depending on their consumption patterns. The explicit consideration of
multiperiod investment horizons generates several important implications for the empirical estimation of systematic risk and the risk-return relationship. One strand of multiperiod investment analysis is the nonlinear functional form of the CAPM.

Tobin (1965) pioneered the study of the multiperiod investment CAPM. He analyzed the effect of heterogeneous investment horizons on portfolio choices and developed a relationship between the risk and return measures of the single-period investment horizon and those of the multiperiod investment horizon. Following Tobin's work, Jensen (1969) was the first to investigate the effect of the investment horizon on the estimation of systematic risk. Based on the instantaneous systematic risk concept, he concluded that the logarithmic-linear form of the CAPM could be used to estimate systematic risk; however, Jensen did not include an investment horizon parameter in his model. Lee (1976) extended Jensen's (1969) work and derived the CES functional form of the CAPM, which introduces the investment horizon parameter directly into the regression and then estimates the systematic risk beta in a homogeneous investment horizon framework. Lee (1976) showed that, based on a homogeneous mean-variance preference structure and an equilibrium market, the risk-return relationship can be written as

$E(1 + R_{pt}^{H}) = (1 - \beta_p^{H})(1 + R_{ft}^{H}) + \beta_p^{H}\,E(1 + R_{mt}^{H})$   (103.1)

where H is the investment horizon assumed by the CAPM; $1 + R_{pt}^{H}$ is the holding-period return on the pth portfolio; $1 + R_{mt}^{H}$ is the holding-period return on the market portfolio; $1 + R_{ft}^{H}$ is the holding-period return on the risk-free asset; and $\beta_p^{H}$ is the systematic risk. Equation (103.1) implies that the risk-return tradeoff is linear only when the investment horizon equals the one assumed by the CAPM. If the observed horizon N differs from H, Equation (103.1) can be rewritten as

$E(1 + R_{pt}^{N}) = \left\{ (1 - \beta_p^{H})(1 + R_{ft}^{N})^{\gamma} + \beta_p^{H}\,E(1 + R_{mt}^{N})^{\gamma} \right\}^{1/\gamma}$   (103.2)

where $\gamma = H/N$. To obtain the full nonlinear estimation of the above equation, using logarithms and an Euler expansion, Equation (103.2) can be rewritten as

$\ln E(1 + R_{pt}^{N}) - \ln(1 + R_{ft}^{N}) = \beta_p^{H}\left[\ln E(1 + R_{mt}^{N}) - \ln(1 + R_{ft}^{N})\right] + \lambda_p^{H}\left[\ln E(1 + R_{mt}^{N}) - \ln(1 + R_{ft}^{N})\right]^{2}$   (103.3)

where $\lambda_p^{H} = \tfrac{1}{2} H \beta_p^{H}\left(1 - \beta_p^{H}\right)$.

This implies that the CAPM should include a quadratic excess market return term whenever H is not trivial. Following the standard regression method, both $\beta_p^{H}$ and $\lambda_p^{H}$ can be estimated, and the adjusted coefficient of determination reflects whether the additional parameter $\lambda_p^{H}$ improves the explanatory power of the CAPM. If the quadratic market return term is significantly different from zero and is arbitrarily omitted, the estimated systematic risk may be subject to specification bias.
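Equation (103.3) suggests a simple empirical check: regress the log excess portfolio return on the log excess market return and its square, and test the quadratic coefficient. The following is a minimal sketch of such an estimation (not the authors' code; data and variable names are illustrative, and an intercept is included as a conventional assumption):

```python
# Sketch of the CES-form CAPM test implied by Eq. (103.3): a significant
# quadratic term indicates that the observed horizon N differs from the
# true horizon H.
import numpy as np
import statsmodels.api as sm

def ces_capm_quadratic(rp, rm, rf):
    """rp, rm, rf: equal-length arrays of simple monthly returns."""
    y = np.log1p(rp) - np.log1p(rf)            # log excess portfolio return
    x = np.log1p(rm) - np.log1p(rf)            # log excess market return
    X = sm.add_constant(np.column_stack([x, x ** 2]))
    res = sm.OLS(y, X).fit()
    beta_hat, lam_hat = res.params[1], res.params[2]   # beta_p^H, lambda_p^H
    return beta_hat, lam_hat, res.tvalues[2]           # t-stat of quadratic term

# Illustration on simulated data:
rng = np.random.default_rng(0)
rm = rng.normal(0.01, 0.05, 240)
rf = np.full(240, 0.003)
rp = 0.002 + 1.2 * rm + rng.normal(0.0, 0.03, 240)
print(ces_capm_quadratic(rp, rm, rf))
```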
103.2.2 Generalized Functional Form of the CAPM

Using the Box and Cox (1964) transformation technique, Lee (1976) and Fabozzi et al. (1980) developed a generalized model to describe the mutual fund return-generating process:

$(1 + R_{pt})^{(\lambda)} - (1 + R_{ft})^{(\lambda)} = \alpha_p + \beta_p\left[(1 + R_{mt})^{(\lambda)} - (1 + R_{ft})^{(\lambda)}\right] + \varepsilon_{pt}$   (103.4)

where $(1 + R)^{(\lambda)} = \dfrac{(1 + R)^{\lambda} - 1}{\lambda}$ for each of $R_{pt}$, $R_{ft}$, and $R_{mt}$; $\varepsilon_{pt} \sim N(0, \sigma_{\varepsilon}^{2})$; and $\lambda$ is the functional transformation parameter to be estimated from the pth mutual fund's time series of rates of return. Equation (103.4) includes both the linear and the loglinear functional forms as special cases:

When $\lambda = 1$: $R_{pt} - R_{ft} = \alpha_p + \beta_p (R_{mt} - R_{ft}) + \varepsilon_{pt}$

When $\lambda = 0$: $\ln(1 + R_{pt}) - \ln(1 + R_{ft}) = \alpha_p + \beta_p\left[\ln(1 + R_{mt}) - \ln(1 + R_{ft})\right] + \varepsilon_{pt}$

Box and Cox (1964) used the maximum likelihood method to determine the functional form parameter:

$\ln \mathrm{Max}(\lambda) = -N \ln \hat{\sigma}_{\varepsilon}(\lambda) + (\lambda - 1) \sum_{t=1}^{N} \ln(1 + R_{pt}) - \dfrac{N}{2}\left[\ln(2\pi) + 1\right]$   (103.5)

where N is the number of observations and $\hat{\sigma}_{\varepsilon}(\lambda)$ is the estimated residual standard error of the regression in Equation (103.4). After Equation (103.4) is estimated over a range of values of $\lambda$, Equation (103.5) is used to determine the optimum value of $\lambda$, the one that maximizes the logarithmic likelihood over the parameter space.

McDonald (1983) found a generalized model to be appropriate in a significant number of cases in his sample of 1,164 securities, although the bias of the CAPM beta did not appear to be material. Generalized functional forms have also been found in international stock markets: Chaudhury and Lee (1997) found that the linear (loglinear) empirical return model could be rejected for more than half of an international sample of 425 stocks from ten countries.
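The estimation procedure implied by Equations (103.4) and (103.5) is a one-dimensional grid search: for each candidate λ, run the transformed regression and evaluate the concentrated log-likelihood, then keep the λ with the highest value. A minimal sketch, with assumed variable names rather than the authors' implementation:

```python
# Grid-search MLE for the Box-Cox transformation parameter in Eq. (103.4),
# using the concentrated log-likelihood of Eq. (103.5).
import numpy as np
import statsmodels.api as sm

def bc(gross, lam):
    """Box-Cox transform of gross returns (1 + R)."""
    return np.log(gross) if abs(lam) < 1e-8 else (gross ** lam - 1.0) / lam

def concentrated_loglik(lam, rp, rm, rf):
    y = bc(1 + rp, lam) - bc(1 + rf, lam)
    x = bc(1 + rm, lam) - bc(1 + rf, lam)
    res = sm.OLS(y, sm.add_constant(x)).fit()
    n = len(y)
    sigma_hat = np.sqrt(res.ssr / n)           # MLE residual standard error
    # Eq. (103.5): -N ln(sigma) + (lam-1) * sum ln(1+Rpt) - N/2 [ln(2 pi) + 1]
    return (-n * np.log(sigma_hat)
            + (lam - 1.0) * np.log1p(rp).sum()
            - 0.5 * n * (np.log(2 * np.pi) + 1.0))

def mle_lambda(rp, rm, rf, grid=np.linspace(-5, 5, 201)):
    loglik = [concentrated_loglik(lam, rp, rm, rf) for lam in grid]
    best = int(np.argmax(loglik))
    return grid[best], loglik[best]
```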
103.2.3 Translog Functional Form of the CAPM

Jensen (1969), Lee (1976), Levhari and Levy (1977), and McDonald (1983) investigated the empirical implications of multiperiod investment, but none of them provided a generalized asset pricing model for the equilibrium risk-return relationship under heterogeneous investment horizons. In order to control for the systematic skewness induced by the squared market excess return term in the CES functional form of the CAPM, Lee et al. (1990) proposed the translog functional form of the CAPM in a heterogeneous investment horizon framework:
$E(1 + R_{jt}^{N})^{\theta_j} = (1 - \beta_j)(1 + R_{ft}^{N})^{\theta_{fj}} + \beta_j\,E(1 + R_{mt}^{N})^{\theta_{mj}}$   (103.6)

where $\theta_j = H_j/N$, $\theta_{fj} = H_{fj}/N$, and $\theta_{mj} = H_{mj}/N$; H and N are investors' weighted time horizon and the observed time horizon, respectively. Note that each portfolio has a different set of $\theta$'s. When $\theta_j = \theta_{fj} = \theta_{mj}$, the model reduces to the homogeneous generalized functional form CAPM.
The translog functional form provides a generalized functional form that is a local second-order approximation to any nonlinear relationship. For many production and investment frontiers employed in econometric studies, the translog function often provides accurate global approximations. Moreover, the translog model permits greater substitution among variables and thus provides a flexible functional form for risk estimation. Finally, the model can be estimated and tested by relatively straightforward regression methods under heterogeneous investment horizons. All three alternative functional-form CAPMs can reduce misspecification bias in the estimates of systematic risk and improve the explanatory power of the CAPM.
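One way to take Equation (103.6) to data is nonlinear least squares over the four free parameters, replacing expectations with realized returns; imposing θ_j = θ_fj = θ_mj recovers the homogeneous generalized form. A minimal sketch under these assumptions (not the authors' estimation code):

```python
# Nonlinear least-squares fit of the translog-form CAPM in Eq. (103.6);
# each return series carries its own horizon exponent theta.
import numpy as np
from scipy.optimize import least_squares

def translog_residuals(params, rj, rf, rm):
    beta, t_j, t_f, t_m = params
    return (1 + rj) ** t_j - ((1 - beta) * (1 + rf) ** t_f
                              + beta * (1 + rm) ** t_m)

def fit_translog(rj, rf, rm):
    x0 = np.array([1.0, 1.0, 1.0, 1.0])        # start from the linear case
    sol = least_squares(translog_residuals, x0, args=(rj, rf, rm))
    return dict(zip(["beta", "theta_j", "theta_f", "theta_m"], sol.x))
```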
103.3 Model Estimation

103.3.1 Generalized Functional Form Model for Closed-End Funds

Using the technique of Box and Cox (1964), Lee (1976) and Fabozzi et al. (1980) developed a generalized model to describe the mutual fund return-generating process. Following their approach, we define the generalized Box-Cox functional form of the international closed-end fund return as follows:

$(1 + R_{jk}(t))^{(\lambda)} - (1 + R_f(t))^{(\lambda)} = \alpha_{jk} + \beta_{jk}\left[(1 + R_{mk}(t))^{(\lambda)} - (1 + R_f(t))^{(\lambda)}\right] + \varepsilon_{jk}(t)$   (103.7)

where $(1 + R)^{(\lambda)} = \dfrac{(1 + R)^{\lambda} - 1}{\lambda}$ for each return series and $\varepsilon_{jk}(t) \sim N(0, \sigma_j^2)$. Here $\lambda$ is the functional form parameter to be estimated across the mutual funds; $R_{jk}(t)$ is the monthly rate of return (shares or NAV) of closed-end fund j, which invests in country k, in period t; $R_{mk}(t)$ is the monthly market rate of return of country k in period t; and $R_f(t)$ is the risk-free rate of interest in period t.

Equation (103.7) can be rewritten as

$(1 + R_{jk}(t))^{(\lambda)} = \alpha_{jk} + (1 - \beta_{jk})(1 + R_f(t))^{(\lambda)} + \beta_{jk}(1 + R_{mk}(t))^{(\lambda)} + \varepsilon_{jk}(t)$   (103.8)

Equation (103.8) is a constrained or restricted regression. Equation (103.7) includes both the linear and the loglinear functional forms as special cases. When $\lambda = 1$, it reduces to the linear case:

$R_{jk}(t) - R_f(t) = \alpha_{jk} + \beta_{jk}\left[R_{mk}(t) - R_f(t)\right] + \varepsilon_{jk}(t)$

When $\lambda$ approaches zero, it reduces to the loglinear case:

$\ln(1 + R_{jk}(t)) - \ln(1 + R_f(t)) = \alpha_{jk} + \beta_{jk}\left[\ln(1 + R_{mk}(t)) - \ln(1 + R_f(t))\right] + \varepsilon_{jk}(t)$
Following Box and Cox (1964), we use the maximum likelihood method to determine the functional form parameter:

$\max_{\{\alpha, \beta, \sigma, \lambda\}} l(\lambda) = -\dfrac{N}{2} \ln \hat{\sigma}_i^2(\lambda) + (\lambda - 1) \sum_{t=1}^{N} \ln(1 + R_{pt}) - \dfrac{N}{2} \ln(2\pi) - \dfrac{1}{2\sigma_i^2} \sum_{t=1}^{N} e_i^2$

where N is the number of observations, $\hat{\sigma}_i^2(\lambda)$ is the estimated residual variance of the regression, and $e_i = (1 + R_{jk}(t))^{(\lambda)} - \left[\alpha_{jk} + (1 - \beta_{jk})(1 + R_f(t))^{(\lambda)} + \beta_{jk}(1 + R_{mk}(t))^{(\lambda)}\right]$. Given the unrestricted estimates of the parameters $(\alpha, \beta, \sigma, \lambda)$, a model that is linear $(\lambda = 1)$ or loglinear $(\lambda = 0)$ is a simple parametric restriction that can be tested with a likelihood ratio statistic:

$\chi_1^2 = -2\left\{\ln L(\lambda = 1 \text{ or } \lambda = 0) - \ln L(\lambda = \lambda_{MLE})\right\}$

This statistic has a chi-squared distribution with one degree of freedom and can be referred to the standard table (5% critical value = 3.84). We can use the difference between the two likelihood values to test whether the $\lambda$ of each fund is significantly different from zero or from one.
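Using the grid-search helpers sketched in Sect. 103.2.2, the likelihood ratio test above might be coded as follows (a sketch with hypothetical function names, shown only to make the test concrete):

```python
# Likelihood ratio tests of the linear (lambda = 1) and loglinear (lambda = 0)
# restrictions against the unrestricted MLE of lambda; reuses the
# concentrated_loglik and mle_lambda sketches from Sect. 103.2.2.
from scipy.stats import chi2

def lr_tests(rp, rm, rf):
    lam_hat, loglik_max = mle_lambda(rp, rm, rf)
    results = {}
    for lam0 in (1.0, 0.0):                    # linear, then loglinear
        stat = 2.0 * (loglik_max - concentrated_loglik(lam0, rp, rm, rf))
        results[lam0] = (stat, stat > chi2.ppf(0.95, df=1))   # 3.84 cutoff
    return lam_hat, results
```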
103.3.2 Functional Form of the International Closed-End Country Fund Model

In the context of international asset pricing, if the world capital market is integrated, the ex ante risk premium on a security equals the ex ante risk premium of the global market portfolio times the security's systematic risk with respect to the global portfolio (Solnik 1974a). However, Solnik (1974b, c) and Lessard (1974) also report that strong national factors are present in the price-generating process of individual securities. Hence, Solnik (1974a) suggested using a two-factor model for individual securities. A direct extension of Solnik's model leads us to the generalized functional form of the international capital asset pricing model for closed-end funds:

$(1 + R_{jk}(t))^{(\lambda)} = \alpha_j + \beta_{jk}(1 + R_{mk}(t))^{(\lambda)} + \beta_{jg}(1 + R_g(t))^{(\lambda)} + \varepsilon_j(t)$   (103.9)

where $(1 + R)^{(\lambda)} = \dfrac{(1 + R)^{\lambda} - 1}{\lambda}$ for each return series and $\varepsilon_j(t) \sim N(0, \sigma_{\varepsilon}^2)$. $(1 + R_{jk}(t))^{(\lambda)}$ and $(1 + R_{mk}(t))^{(\lambda)}$ are generalized transformations of the monthly CEF return (both share returns and NAV returns) and the monthly market return of country k, and $(1 + R_g(t))^{(\lambda)}$ is the transformed monthly return on the global market portfolio. $R_{mk}(t)$ and $R_g(t)$ are commonly proxied by the returns on the country stock index and the global stock index.

In an integrated world market, the pure national risk $\beta_{jk}$ would not be priced, since it can be diversified away through international investment. On the other hand, in an extremely segmented market, the international risk $\beta_{jg}$ would not be priced (Errunza and Losq 1985; Jorion and Schwartz 1986). In a mildly segmented market (Errunza and Losq 1989), the pure national risk of securities in which foreign investment is not allowed would be priced as in the case of complete segmentation, but their global systematic risk would also be priced as in the case of complete integration.

In addition to the two-factor generalized global CAPM, we also consider a generalized single-factor model with only the global index as the pricing factor. Pioneering work by Carhart (1997) has shown that the four-factor model, which adds a momentum factor, is superior to both the CAPM and the Fama-French three-factor model in explaining the cross-sectional variation in fund returns. We therefore further extend our generalized CAPM into the framework of the Carhart four-factor model. In this chapter, we focus on the empirical specification of the return-generating process in the context of international asset pricing and explore whether it is worthwhile to include a global index in the generalized two-factor model. Specifically, we compare the generalized functional form of CEFs between developed markets and emerging markets, between single-country funds and regional funds, and between funds investing in countries where short sales are allowed and countries where they are not.
103.4 Data and Methodology

103.4.1 Data

The total sample includes 90 U.S.-traded international closed-end country funds with monthly data from July 1965 to December 2003.¹ Returns are adjusted for capital distributions, stock splits, dividends, and rights offerings. All data are in U.S. dollars. Table 103.1 lists each fund's name, code, IPO date, and total number of observations, along with the market indices of the corresponding countries and regions.

¹ Most funds report their NAVs weekly on Fridays. We use the data for the last Friday of the month for the monthly observations.
Table 103.1 Information of international closed-end country funds traded in the U.S.

No. | Code | Fund name | Index | Developed/emerging | Single country fund | Short sale allowed | IPO date | Observations (months)
1 | JPN | Japan Fund Inc | Japan | Dev | S | Y | 5-15-75 | 265
2 | ASA | ASA Limited | South Africa | Em | S | Y | 9-1-58 | 462
3 | MXF | Mexico Fund Inc | Mexico | Em | S | Y | 6-3-81 | 270
4 | KF | Korea Fund Inc | Korea | Em | S | N | 8-22-84 | 232
5 | ZSEV | Z-Seven Fund Inc | World | World | R | Y | 12-29-83 | 195
6 | IAF | First Australia Fund Inc | Australia | Dev | S | Y | 12-12-85 | 216
7 | ITA | Italy Fund Inc | Italy | Dev | S | Y | 2-26-86 | 203
8 | FRN | France Fund | France | Dev | S | Y | 5-30-86 | 42
9 | GER | Germany Fund | Germany | Dev | S | Y | 7-18-86 | 209
10 | VLU | Worldwide Value Fund | World | World | R | Y | 8-19-86 | 130
11 | TWN | Taiwan Fund Inc | Taiwan | Em | S | N | 12-16-86 | 204
12 | EMF | Templeton Emerging Mkts Fund | EM | Em | R | B | 2-26-87 | 202
13 | APB | Asia Pacific Fund Inc | AP | Em | R | N | 4-24-87 | 200
14 | MF | Malaysia Fund Inc | Malaysia | Em | S | G | 5-8-87 | 199
15 | SAF | Scudder New Asia Fund Inc | Asia | B | R | B | 6-18-87 | 198
16 | CLM | Clemente Global Growth Fund | World | World | R | Y | 6-23-87 | 198
17 | UKM | United Kingdom Fund | United Kingdom | Dev | S | Y | 8-6-87 | 139
18 | SWZ | Swiss Helvetia Fund | Europe | Dev | S | Y | 8-19-87 | 196
19 | TTF | Thai Fund Inc | Thailand | Em | S | G | 2-17-88 | 190
20 | BZF | Brazil Fund Inc | Brazil | Em | S | N | 3-31-88 | 189
21 | IBF | First Iberian Fund | Spain + Portugal | Dev | R | B | 4-13-88 | 130
22 | SNF | Spain Fund Inc | Spain | Dev | S | N | 6-21-88 | 186
23 | IGF | India Growth Fund Inc | India | Em | S | N | 8-12-88 | 176
24 | ROC | ROC Taiwan Fund | Taiwan | Em | S | N | 5-12-89 | 175
25 | OST | Austria Fund Inc | Austria | Dev | S | Y | 9-21-89 | 150
26 | CH | Chile Fund Inc | Chile | Em | S | N | 9-26-89 | 171
27 | PGF | Portugal Fund Inc | Portugal | Dev | S | Y | 11-1-89 | 169
28 | FPF | First Philippine Fund Inc | Philippines | Em | S | N | 11-8-89 | 162
29 | TKF | Turkish Investment Fund Inc | Turkey | Em | S | N | 12-5-89 | 168
30 | GF | New Germany Fund | Germany | Dev | S | Y | 1-24-90 | 167
31 | NEF | Scudder New Europe Fund | Europe | Dev | R | Y | 2-9-90 | 113
32 | GSP | Growth Fund of Spain Inc | Spain | Dev | S | N | 2-14-90 | 105
33 | CEE | Central European and Russia Fund | EE | Em | R | N | 2-27-90 | 166
34 | IF | Indonesia Fund Inc | Indonesia | Em | S | N | 3-1-90 | 141
35 | JOF | Japan OTC Equity Fund | Japan | Dev | S | Y | 3-14-90 | 165
36 | GTF | GT Greater Europe Fund | Europe | Dev | R | Y | 3-22-90 | 113
37 | FRG | Emerging Germany Fund Inc | Germany | Dev | S | Y | 3-29-90 | 108
38 | IRL | Irish Investment Fund | Ireland | Dev | S | Y | 3-30-90 | 165
39 | ANE | Alliance New Europe Fund Inc | Europe | Dev | R | Y | 3-27-90 | 10
40 | JGF | Jakarta Growth Fund | Indonesia | Em | S | N | 4-10-90 | 119

World = AC world index; EM = Emerging Market; AP = Asian Pacific; EE = EM Eastern Europe; EAFEEM = AC EAFE + EM; Latin = EM Latin American.
Table 103.1 (continued)

No. | Code | Fund name | Index | Developed/emerging | Single country fund | Short sale allowed | IPO date | Observations (months)
41 | EF | Europe Fund Inc | Europe | Dev | R | Y | 4-26-90 | 164
42 | PEF | Pacific-European Growth Fund | EAFEEM | Dev | R | Y | 4-20-90 | 27
43 | FRF | France Growth Fund Inc | France | Dev | S | Y | 5-10-90 | 163
44 | TC | Thai Capital Fund | Thailand | Em | S | G | 5-22-90 | 128
45 | EWF | European Warrant Fund Inc | Europe | Dev | R | Y | 7-17-90 | 161
46 | SGF | Singapore Fund | Singapore | Dev | S | N | 7-24-90 | 161
47 | LAM | Latin America Investment Fund | Latin | Em | R | N | 7-25-90 | 161
48 | MXE | Mexico Equity and Income Fund | Mexico | Em | S | Y | 8-14-90 | 160
49 | MEF | Emerging Mexico Fund | Mexico | Em | S | Y | 10-2-90 | 101
50 | AF | Argentina Fund Inc | Argentina | Em | S | N | 10-11-91 | 121
51 | LAQ | Latin America Equity Fund | Latin | Em | R | N | 10-22-91 | 108
52 | MSF | Morgan Stanley Emerging Mkts | EM | Em | R | N | 10-25-91 | 146
53 | KIF | Korean Investment Fund Inc | Korea | Em | S | N | 2-13-92 | 116
54 | BZL | Brazilian Equity Fund Inc | Brazil | Em | S | N | 4-3-92 | 140
55 | LDF | Latin American Discovery Fund | Latin | Em | R | B | 6-16-92 | 138
56 | CHN | China Fund Inc | China | Em | S | N | 7-10-92 | 137
57 | GCH | Greater China Fund Inc | China | Em | S | N | 7-15-92 | 137
58 | JFC | Jardine Fleming China Region | China | Em | S | N | 7-16-92 | 137
59 | JEQ | Japan Equity Fund Inc | Japan | Dev | S | Y | 7-17-92 | 136
60 | ISL | First Israel Fund Inc | Israel | Em | S | N | 10-22-92 | 134
61 | TCH | Templeton China World Fund Inc | China | Em | S | N | 9-9-93 | 118
62 | GSG | Global Small Cap Fund Inc | World | World | R | Y | 10-6-93 | 74
63 | GRR | Asia Tigers Fund Inc | Asia | Em | R | N | 11-18-93 | 121
64 | KEF | Korea Equity Fund | Korea | Em | S | N | 11-24-93 | 121
65 | PKF | Pakistan Investment Fund Inc | Pakistan | Em | S | N | 12-16-93 | 89
66 | SHF | Schroder Asian Growth Fund Inc | Asia | B | R | B | 12-22-93 | 50
67 | TGF | Emerging Tigers Fund Inc | EM | Em | R | N | 2-25-94 | 26
68 | SBW | Salomon Brothers Worldwide | World | World | R | Y | 12-22-93 | 120
69 | GTD | GT Global Developing Markets | EM | Em | R | N | 1-11-94 | 44
70 | AFF | Morgan Stanley Africa | EM | Em | R | N | 2-3-94 | 97
71 | IFN | India Fund Inc | India | Em | S | N | 2-14-94 | 118
72 | IIF | Morgan Stanley India Inv Fund | India | Em | S | N | 2-17-94 | 118
73 | SOA | Southern Africa Fund Inc | South Africa | Em | S | Y | 2-25-94 | 118
74 | JFI | Jardine Fleming India Fund Inc | India | Em | S | N | 3-3-94 | 117
75 | NSA | New South Africa Fund Inc | South Africa | Em | S | Y | 3-4-94 | 62
Table 103.1 (continued)

No. | Code | Fund name | Index | Developed/emerging | Single country fund | Short sale allowed | IPO date | Observations (months)
76 | FAE | Fidelity Advisor Emerging | EM | Em | R | N | 3-18-94 | 62
77 | EMO | TCW/DW Emerging Markets Opp Tr | EM | Em | R | N | 3-23-94 | 44
78 | TEA | Templeton Emerging Markets | EM | Em | R | N | 4-29-94 | 100
79 | CUBA | Herzfeld Caribb | Latin | B | R | Y | 4-22-94 | 115
80 | TYW | Taiwan Equity Fund Inc | Taiwan | Em | S | N | 7-18-94 | 69
81 | APF | Morgan Stanley Asia Pacific | AP | B | R | B | 4-16-96 | 113
82 | TVF | Templeton Vietnam Opportunity | Asia | Em | S | N | 9-15-94 | 95
83 | TDF | Templeton Dragon Fund Inc | Asia | B | R | N | 9-21-94 | 111
84 | CRF | Czech Republic Fund Inc | Czech Republic | Em | S | Y | 9-23-94 | 96
85 | EME | Foreign and Colonial Emerging | EM | Em | R | N | 10-28-94 | 67
86 | LLF | Lehman Bros Latin America Fund | Latin | Em | R | N | 10-28-94 | 50
87 | FAK | Fidelity Advisor Korea Fund | Korea | Em | S | N | 10-25-94 | 68
88 | TRF | Templeton Russia Fund Inc | Russia | Em | R | N | 6-15-95 | 99
89 | RNE | Morgan Stanley Russia and New Europe | EE | Em | R | N | 9-24-96 | 87
90 | DGE | Dessauer Global Equity Fund | World | World | R | Y | 5-30-97 | 22

This table reports the sample of closed-end country funds. Index is the MSCI country or regional index corresponding to the underlying assets of each fund; the index used for a regional fund is the regional index that best covers the fund's investing countries, and global funds use the MSCI global index. Developed/emerging indicates whether the fund invests in developed (Dev) or emerging (Em) countries; World = world fund; B = fund invests in both developed and emerging countries. Single country fund indicates whether the fund is a single-country fund (S) or a regional fund (R). Short sale allowed indicates whether short selling is allowed in the fund's investing country: Y = allowed; N = not allowed; B = fund invests both in countries where short selling is allowed and in countries where it is not; G = fund invests in a country whose short sale policy changed during our sample period.
Our sample includes 55 single-country closed-end funds and 35 regional closed-end funds, with underlying assets traded in 31 different countries. We divide the funds into developed market funds and emerging market funds according to the countries of their underlying assets: 19 funds invest in developed markets, 53 invest in emerging markets, and the other 18 invest in both. We also classify each fund according to the short sale regulations of the invested countries.² The first group comprises most developed markets, such as the UK, Australia, Austria, Belgium, Canada, the Czech Republic, Denmark, France, Germany, Ireland, Italy, Japan, Mexico, the Netherlands, Portugal, South Africa, and Switzerland, where short selling is allowed and practiced. The second group comprises countries where short selling is not allowed or not practiced: Colombia, Greece, Indonesia, Pakistan, Singapore, South Korea, Taiwan, Argentina, Brazil, Chile, Finland, India, Israel, New Zealand, the Philippines, China, and Spain. The last group comprises three countries (regions) whose short sale regulation or practice changed during our sample period: Hong Kong, Malaysia, and Thailand. Altogether, 35 funds invest in countries or regions where short selling is allowed or practiced, and another 46 funds invest in countries or regions where it is not.

The market indices for the different countries, regions, and the world market used in this study are from Morgan Stanley (MSCI). The U.S. market index used is the monthly return of the CRSP value-weighted composite index. The problem of assuming a riskless rate in an international context is well known, and it is especially troublesome for international capital markets, where a domestically determined short-term rate similar to the U.S. T-bill rate is generally not available. Since we have no domestic short-rate information, we use the 30-day U.S. T-bill rate as a proxy. The Fama-French three factors and the momentum factor are downloaded from Kenneth French's website.

² Information on short sale policies comes from Bris et al. (2004).

Table 103.2 reports the descriptive statistics for the share returns and the NAV returns of the international closed-end funds.
Table 103.2 Summary statistics of the U.S. international closed-end country funds. [For each of the 90 funds, identified by ticker, the table reports the minimum, maximum, mean, and standard deviation of monthly share returns and monthly NAV returns, followed by the cross-fund averages and by the minimum, maximum, mean, and standard deviation of the global index monthly return and the one-month T-bill return.]
The price return and the NAV return are calculated as follows:

$R_{jk}^{p}(t) = \dfrac{P_{jk}(t) - P_{jk}(t-1)}{P_{jk}(t-1)}, \qquad R_{jk}^{nav}(t) = \dfrac{NAV_{jk}(t) - NAV_{jk}(t-1)}{NAV_{jk}(t-1)}$

where $P_{jk}(t)$ and $NAV_{jk}(t)$ are the share price and the NAV of fund j, investing in country k, in period t. The average monthly share return is 0.73%, which is higher than the 0.53% average monthly NAV return. However, the average standard deviation of the share returns is only 9.93%, compared with 10.7% for the NAV returns. This partly reflects the higher risk of the underlying foreign assets of the CEFs, especially those in emerging markets.

103.4.2 Methodology

To determine the appropriate functional form of the CEF returns, for the share return we use both the local country index and the U.S. CRSP value-weighted composite index return as proxies for the market return; for the NAV return, we use only the monthly return of the local country index. The use of the respective local market indices as benchmarks is motivated by noting that the funds are constrained to invest most of their assets in equity securities of one country or region, so if the fund managers are market timers, they are most likely to time the respective local market; it is therefore of interest how the funds compare to the passive local market portfolios. Since all of these funds are traded in the U.S. market and their price movements are also affected by the U.S. market, using the CRSP value-weighted composite index as a benchmark for the share return is also reasonable.

The generalized functional models used for the share return and the NAV return of international closed-end funds are as follows.

Share return:

GCAPM: $(1 + R_{pt})^{(\lambda)} = \alpha_p + (1 - \beta_p)(1 + R_{ft})^{(\lambda)} + \beta_p(1 + R_{mt}^{k})^{(\lambda)} + \varepsilon_{pt}$

GCAPM-CRSP: $(1 + R_{pt})^{(\lambda)} = \alpha_p + (1 - \beta_p)(1 + R_{ft})^{(\lambda)} + \beta_p(1 + R_{mt}^{crsp})^{(\lambda)} + \varepsilon_{pt}$

Global Index: $(1 + R_{pt})^{(\lambda)} = \alpha_p + \beta_{pg}(1 + R_{gt})^{(\lambda)} + \varepsilon_{pt}$

Global GCAPM: $(1 + R_{pt})^{(\lambda)} = \alpha_p + \beta_{pm}(1 + R_{mt}^{k})^{(\lambda)} + \beta_{pg}(1 + R_{gt})^{(\lambda)} + \varepsilon_{pt}$

Global GCAPM-CRSP: $(1 + R_{pt})^{(\lambda)} = \alpha_p + \beta_{pm}(1 + R_{mt}^{crsp})^{(\lambda)} + \beta_{pg}(1 + R_{gt})^{(\lambda)} + \varepsilon_{pt}$

Global Carhart GCAPM: $(1 + R_{pt})^{(\lambda)} = \alpha_p + \beta_{pm}(1 + R_{mt}^{k})^{(\lambda)} + \beta_{pg}(1 + R_{gt})^{(\lambda)} + \beta_{p,mkrf}(1 + MKRF_t)^{(\lambda)} + \beta_{p,smb}(1 + SMB_t)^{(\lambda)} + \beta_{p,hml}(1 + HML_t)^{(\lambda)} + \beta_{p,umd}(1 + UMD_t)^{(\lambda)} + \varepsilon_{pt}$

NAV return:

GCAPM: $(1 + NAV_{pt})^{(\lambda)} = \alpha_p + (1 - \beta_p)(1 + R_{ft})^{(\lambda)} + \beta_p(1 + R_{mt}^{k})^{(\lambda)} + \varepsilon_{pt}$

Global Index: $(1 + NAV_{pt})^{(\lambda)} = \alpha_p + \beta_{pg}(1 + R_{gt})^{(\lambda)} + \varepsilon_{pt}$

Global GCAPM: $(1 + NAV_{pt})^{(\lambda)} = \alpha_p + \beta_{pm}(1 + R_{mt}^{k})^{(\lambda)} + \beta_{pg}(1 + R_{gt})^{(\lambda)} + \varepsilon_{pt}$

where $R_{mt}^{k}$ is the monthly return of country k's index, $R_{mt}^{crsp}$ is the monthly return of the CRSP value-weighted composite index, and $R_{gt}$ is the monthly return of the MSCI global market index.³ Altogether, we use six different generalized models for share returns and three for NAV returns.

The objective of this chapter is to determine whether the global index needs to be included in the empirical return model in addition to the national or domestic market index. MLE estimation of each generalized functional model provides the corresponding coefficient estimates and t-ratios. We wish to see whether the coefficient of the global index is significant and, more importantly, whether the choice of functional form matters for the significance of the global index.

Additionally, since the underlying assets and the closed-end funds are traded in different markets, investors in the different markets may have different investment horizons; even for the same closed-end fund, the functional form models of the share returns and the NAV returns may differ. We therefore investigate the difference in functional form between share returns and NAV returns of closed-end funds, as well as the differences between emerging market funds and developed market funds and between single-country funds and regional funds.

To test whether fund characteristics affect the functional form specification, we use two approaches. First, we run grouped regressions that pool across all CEFs based on different grouping criteria, dividing the CEFs into six groups: single-country funds versus regional funds, emerging market funds versus developed market funds, and short-sale-allowed funds versus short-sale-not-allowed funds. Second, we pool all data into one group across time and funds and add fund characteristics as dummy variables in the estimation. Pooling data across funds and time allows us to test the determinants of
³ Morgan Stanley started the emerging market index in 1988; therefore, our global index (including developed and emerging markets) starts in January 1988.
the cross-sectional variation in the functional transformation. Several specifications of the pooled model are estimated. To control for heteroskedasticity, we use White t-statistics based on heteroskedasticity-robust standard errors. For all of our empirical tests with pooled data for international CEF share returns and NAV returns, we use different specifications of the following models, respectively:
$(1 + R_{pt})^{(\lambda)} = \alpha_p + \beta_f(1 + R_f)^{(\lambda)} + \beta_m(1 + R_{mt}^{j})^{(\lambda)} + \beta_g(1 + R_{gt})^{(\lambda)} + \beta_{mkrf}(1 + MKRF_t)^{(\lambda)} + \beta_{smb}(1 + SMB_t)^{(\lambda)} + \beta_{hml}(1 + HML_t)^{(\lambda)} + \beta_{umd}(1 + UMD_t)^{(\lambda)} + \beta_e EM_p + \beta_r REG_p + \beta_s SS_p + \varepsilon_{pt}$

$(1 + NAV_{pt})^{(\lambda)} = \alpha_p + \beta_f(1 + R_f)^{(\lambda)} + \beta_m(1 + R_{mt}^{j})^{(\lambda)} + \beta_g(1 + R_{gt})^{(\lambda)} + \beta_e EM_p + \beta_r REG_p + \beta_s SS_p + \varepsilon_{pt}$

where $(\cdot)^{(\lambda)} = \dfrac{(\cdot)^{\lambda} - 1}{\lambda}$ and EM, REG, and SS are dummy variables for the funds' characteristics: emerging market versus developed market funds, regional versus single-country funds, and funds investing in countries where short sales are allowed versus not allowed. (A minimal estimation sketch for this pooled regression is given after the test list below.)

In general, we summarize the hypotheses of our empirical study as follows:

Test 1: The coefficients of the global index return as well as the national market index return are significant — to test market segmentation.
Test 2: The functional form of the CEF matters (the transformation parameter λ is significant) in the context of the global asset pricing model — to test heterogeneous investment horizons.
Test 3: There is a significant difference in functional form between the share returns and the NAV returns of international closed-end funds — to test the difference in behavior between U.S. investors and foreign investors.
Test 4: There is a significant difference in functional form between closed-end funds investing in emerging markets and those investing in developed markets — to test the effect of market maturity on the functional form.
Test 5: There is a significant difference in functional form between single-country closed-end funds and regional closed-end funds — to test whether there exists a cross-market effect.
Test 6: There is a significant difference in functional form between international CEFs investing in countries where short sales are allowed and those investing in countries where short sales are not allowed — to test the effect of government regulation on market efficiency.
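As promised above, here is a minimal sketch of the pooled estimation for a given transformation parameter λ, using statsmodels' formula interface with White (HC0) robust covariance; the column names are assumptions for illustration, not the authors' dataset layout:

```python
# Pooled cross-sectional regression of transformed share returns on the
# transformed factors plus the EM, REG, and SS fund-characteristic dummies,
# with White heteroskedasticity-robust t-statistics.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def bc(gross, lam):
    return np.log(gross) if abs(lam) < 1e-8 else (gross ** lam - 1.0) / lam

def pooled_dummy_regression(df, lam):
    """df columns (assumed): rp, rf, rm, rg, mkrf, smb, hml, umd, em, reg, ss."""
    cols = ["rp", "rf", "rm", "rg", "mkrf", "smb", "hml", "umd"]
    data = pd.DataFrame({c: bc(1.0 + df[c], lam) for c in cols})
    data[["em", "reg", "ss"]] = df[["em", "reg", "ss"]]   # 0/1 dummies, untransformed
    model = smf.ols("rp ~ rf + rm + rg + mkrf + smb + hml + umd + em + reg + ss",
                    data=data)
    return model.fit(cov_type="HC0")                      # White robust covariance

# Usage: res = pooled_dummy_regression(panel_df, lam=0.5); print(res.tvalues)
```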
103.5 Empirical Results

103.5.1 Generalized Functional Form for International Closed-End Funds

Unlike McDonald (1983) and Chaudhury and Lee (1997), we find the transformation parameter to be positive on average. In Table 103.3, the average λ across all share returns in the GCAPM is 0.218, which is smaller than the average for NAV returns, and this pattern persists no matter which model we use. The absolute magnitude of λ is somewhat higher for NAV returns than for share returns. Another noticeable feature is the dispersion of λ across the funds in our sample: the standard deviation of λ ranges from 1.248 to 2.717, and the minimum and maximum values of λ range from about −11 to 13. These dispersion measures are rather large relative to the mean and median values. This indicates the fund-specific nature of the transformation parameter and raises questions about the conventional use of the same transformation parameter (usually 1 for the linear and 0 for the loglinear specification) for all securities. Moreover, from Fig. 103.1 we find that the distribution of λ for the NAV returns is more dispersed than that for the share returns, and most of the distributions of λ are skewed. The large dispersion of λ for NAV returns indicates that foreign investors have more heterogeneous investment horizons than U.S. investors.

In Table 103.4 we report the rejection rates for the hypotheses of linear and loglinear returns. Irrespective of which generalized model we use, the loglinear hypothesis is rejected for at least 30% of our sample funds, and the linear hypothesis is rejected for more than 45% of the funds. These numbers increase to 58.4% and 55% for the NAV returns under the global GCAPM model. The GCAPM model of the share returns has the lowest rejection rate; neither the linear nor the loglinear hypothesis can be rejected for 60% of the funds. When we substitute the CRSP value-weighted index return for the local market return, more funds reject the linear or loglinear hypothesis (the rejection rate for both hypotheses increases to 20%). This suggests that the heterogeneous investment horizons of U.S. investors are more pronounced when we treat an international closed-end fund in a local environment rather than in the global environment. However, when we include the global market return in the model specification, the rejection rates also increase. This implies that, in order to explore heterogeneous investment horizons in
Table 103.3 Transformation parameter summary. [Summary statistics — mean, median, standard deviation, minimum, and maximum — of the estimated transformation parameter λ, its t-ratio t(λ), its absolute value |λ|, and t(|λ|) across funds, under each model: GCAPM, GCAPM-CRSP, Global Index, Global GCAPM, Global GCAPM-CRSP, and Global Carhart GCAPM for share returns; GCAPM, Global Index, and Global GCAPM for NAV returns.] This table reports the summary of the estimated transformation parameters for each model.
Table 103.4 Comparison of functional form. This table reports the number (percentage) of funds for which the linear (λ = 1) and loglinear (λ = 0) hypotheses are rejected for share returns and NAV returns under each model.

Share returns:

Outcome | GCAPM | GCAPM_CRSP | Global index | Global GCAPM | Global GCAPM_CRSP | Global Carhart GCAPM | Total
λ ≠ 0, λ ≠ 1 | 13 (14.4%) | 18 (20%) | 20 (22.5%) | 15 (16.9%) | 21 (23.6%) | 19 (21.3%) | 106 (19.8%)
λ ≠ 0, λ = 1 | 16 (17.8%) | 12 (13.3%) | 11 (12.4%) | 13 (14.6%) | 11 (12.4%) | 10 (11.2%) | 73 (13.6%)
λ = 0, λ ≠ 1 | 29 (32.2%) | 30 (33.3%) | 26 (29.2%) | 26 (29.2%) | 25 (28.1%) | 23 (25.8%) | 159 (29.7%)
λ = 0, λ = 1 | 32 (35.6%) | 30 (33.3%) | 32 (36.0%) | 35 (39.3%) | 32 (36%) | 37 (41.6%) | 198 (36.9%)
Sum | 90 | 90 | 89 | 89 | 89 | 89 | 536

NAV returns:

Outcome | GCAPM | GCAPM_CRSP | Global index | Global GCAPM | Global GCAPM_CRSP | Global Carhart GCAPM | Total
λ ≠ 0, λ ≠ 1 | 36 (40%) | – | 26 (29.2%) | 35 (39.3%) | – | – | 97 (36.2%)
λ ≠ 0, λ = 1 | 12 (13.3%) | – | 24 (27%) | 17 (19.1%) | – | – | 53 (19.8%)
λ = 0, λ ≠ 1 | 14 (15.6%) | – | 14 (15.7%) | 14 (15.7%) | – | – | 42 (15.7%)
λ = 0, λ = 1 | 28 (31.1%) | – | 25 (28.1%) | 23 (25.8%) | – | – | 76 (28.4%)
Sum | 90 | – | 89 | 89 | – | – | 268

Total (share returns + NAV returns): λ ≠ 0, λ ≠ 1: 203 (25.2%); λ ≠ 0, λ = 1: 126 (15.7%); λ = 0, λ ≠ 1: 201 (25%); λ = 0, λ = 1: 274 (34.1%); sum: 804.
Table 103.5 Correlation coefficients of the dummy variables. This table reports the correlation coefficients among the three fund-characteristic dummy variables.

     | Em       | Reg     | SS
Em   | 1        | 0.06675 | -0.69308
Reg  | 0.06675  | 1       | 0.02349
SS   | -0.69308 | 0.02349 | 1
the context of international asset pricing, the global market return should be included. In the next section we also test whether global risk is priced in the framework of the functional form CAPM.

From Table 103.4 we also find an obvious pattern: on average, the rejection rate for the NAV returns is higher than that for the share returns under the corresponding model. This is reasonable, since compared with investors in the U.S. market (where the fund shares are traded), investors in the foreign markets (where the NAVs of the funds are determined) have more heterogeneous investment horizons. The linear or loglinear specification of security returns may be more inappropriate for international financial markets, especially for emerging markets. We explore the difference between developed and emerging markets further in the following section.

When we compare the intercepts of the generalized models, the average excess return (alpha) after controlling for the market risk premium is only −0.11% for the share returns and −0.33% for the NAV returns, while the average excess return of the share returns under the conventional linear model is 0.05% and −0.13% under the loglinear model; the corresponding numbers for the NAV returns are −0.14% and −0.44%, respectively. This interesting result indicates that the functional form of returns does affect performance evaluation. Consistent with Patro (2001), the NAV returns appear to perform slightly worse than the share returns when performance is measured against the local market portfolios.
103.5.2 Generalized Global Model for International Closed-End Funds

Using a local market index to evaluate performance is appropriate only in fully segmented international capital markets. It is therefore important to use a benchmark that is likely to be mean-variance efficient in a global pricing context. Although the evidence in favor of the domestic CAPM is ambiguous, several authors, such as Cumby and Glen (1990), Harvey (1991), Ferson and Harvey (1994), and Chang et al. (1995), provide evidence in favor of the mean-variance efficiency of the MSCI world market index. When we include the global index in the pricing model, we find that the rejection rate of the conventional linear or loglinear model is higher than in the model with only the local market index. Consistent with previous research, global market risk is priced in most of the international closed-end funds. We find that the coefficient of the global return is significant for all developed country funds and most emerging country funds, whether we use share returns or NAV returns. This may partly provide evidence that all developed markets and most emerging markets are integrated markets, or at least mildly segmented markets.⁴

When we compare the one-factor generalized global index model and the two-factor generalized global model, we find an interesting feature: after we introduce the local market return into the global index model, the loading β_g on the global factor falls, and the β_g of the one-factor generalized global index model is nearly equal to the sum of the loadings on the local market return and the global return in the two-factor generalized global model.
103.5.3 Comparison of the Functional Form Model Between Developed Market Funds and Emerging Market Funds, Between Regional Funds and Single Country Funds, and Between Short Sale Allowed Country Funds and Short Sale Not Allowed Country Funds

As we expected, significantly different patterns of transformation parameters exist between emerging market funds and developed market funds, no matter which model we use. From Panel A of Table 103.6, we find that developed markets generally have negative transformation parameters, while emerging markets have positive ones. The only exception is the GCAPM model for the share return, where λ is positive but insignificantly different from 0. This implies that the conventional loglinear return can be used in the pricing model of share returns for the developed markets. However, this is not true for the emerging markets, which have a significantly positive λ and whose factor loading on the market return is also smaller than that of the developed markets. The NAV return results in Table 103.7 tell another story: the transformation parameters of the developed market funds turn positive, while the emerging market funds do not have a consistent λ. Significantly different patterns of transformation parameters also exist between

⁴ Although we admit that all international CEFs in our sample are U.S.-traded securities, the evidence from NAV returns still provides some support for the market integration test of foreign markets.
Fig. 103.1 Histogram of estimated functional form parameter. This figure shows the distribution of the estimated λ of each model for international closed-end fund share returns and NAV returns; the curve in each graph is the standard normal distribution with mean 0 and variance 1. [Panels: GCAPM(CRSP)_Share, GCAPM_Share, GlobalIndex_Share, GlobalGCAPM_Share, GlobalGCAPM(CRSP)_Share, GlobalCarhart_Share, GCAPM(ALL)_Share, GCAPM_NAV, GlobalIndex_NAV, GlobalGCAPM_NAV; x-axis: lambda, y-axis: frequency.]
Table 103.6 Cross-sectional regressions for share returns of international CEFs by group. [Panel A compares emerging market funds (Em = 1) and developed market funds (Em = 0); Panel B compares regional funds (Reg = 1) and single-country funds (Reg = 0); Panel C compares funds investing in countries where short sales are allowed (SS = 1) and where they are not (SS = 0). For each group, the table reports the group mean and standard deviation of share returns and, for each of six model specifications (Models 1–6), the intercept (×10³) and the coefficients on $R_f$, $R_m$, $R_g$, $R_{mkrf}$, $R_{smb}$, $R_{hml}$, and $R_{umd}$, each estimated at the MLE of λ and at the restricted values λ = 1 and λ = 0, with t-statistics in parentheses.]

The estimated model is

$(1 + R_{pt})^{(\lambda)} = \alpha_p + \beta_f(1 + R_f)^{(\lambda)} + \beta_m(1 + R_{mt}^{j})^{(\lambda)} + \beta_g(1 + R_{gt})^{(\lambda)} + \beta_{mkrf}(1 + MKRF_t)^{(\lambda)} + \beta_{smb}(1 + SMB_t)^{(\lambda)} + \beta_{hml}(1 + HML_t)^{(\lambda)} + \beta_{umd}(1 + UMD_t)^{(\lambda)} + \varepsilon_{pt}$

This table reports the grouped cross-sectional regressions of share returns of international closed-end country funds. $R_f$ is the one-month T-bill rate of the U.S. market; $R_m^j$ is the MSCI country or regional index monthly return corresponding to each international CEF; $R_g$ is the MSCI world index monthly return. EM is a dummy variable equal to one for emerging country funds and zero for developed country funds; Reg is a dummy variable equal to one for regional funds and zero for single-country funds; SS is a dummy variable equal to one for funds investing in countries where short sales are allowed and zero for funds investing in countries where short sales are not allowed. MKRF, SMB, HML, and UMD are the monthly returns of the Carhart four factors. The data span 1965 to 2003. White heteroskedasticity-consistent t-statistics are in parentheses. Significant at the 5% level; significant at the 10% level.
Rf
3:05.2:55/ 1:00.0:83/ 3:76.3:21/
Model 4
Table 103.6 (continued) Intercept .103 /
1544 C.-F. Lee et al.
103 Functional Forms for Performance Evaluation: Evidence from Closed-End Country Funds
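The model above must be estimated jointly in the transformation parameter λ and the regression coefficients. As a rough illustration of how such an estimate can be obtained, the following sketch fits a one-factor version of the model by maximizing the Box-Cox profile log-likelihood over a grid of λ values. The data are simulated and all names are hypothetical; this is a minimal sketch of the estimation idea under a normality assumption, not the authors' actual procedure.

import numpy as np

def boxcox(x, lam):
    # Box-Cox transform (x**lam - 1)/lam, with the log limit at lam = 0
    return np.log(x) if abs(lam) < 1e-8 else (x ** lam - 1.0) / lam

def profile_loglik(lam, rp, rm):
    # Profile log-likelihood of (1+rp)^(lam) = a + b*(1+rm)^(lam) + e,
    # e ~ N(0, s^2); a, b, and s^2 are concentrated out by OLS.
    y = boxcox(1.0 + rp, lam)
    X = np.column_stack([np.ones_like(rm), boxcox(1.0 + rm, lam)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = resid @ resid / n
    # The Jacobian of the transform contributes (lam - 1) * sum(log(1 + rp))
    return -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.log(1.0 + rp).sum()

# Simulated monthly returns (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
rm = rng.normal(0.008, 0.042, 468)                    # "market" return
rp = 0.001 + 0.9 * rm + rng.normal(0.0, 0.03, 468)    # "fund" return

grid = np.linspace(-2.0, 2.0, 81)
lam_hat = max(grid, key=lambda lam: profile_loglik(lam, rp, rm))
print(f"estimated lambda: {lam_hat:.2f}")  # 1 -> linear, near 0 -> loglinear

A finer grid, or a numerical optimizer started at the grid maximizer, can be used once the likelihood surface has been inspected.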
single country funds and regional funds. All single country funds' share returns produce significantly negative λ, while regional funds' share returns have significantly positive λ. Although the NAV returns do not display this clear positive/negative pattern, the value of λ is indeed smaller for single country funds than for regional funds. A comparison between short sale allowed and short sale not allowed country funds yields a pattern similar to that of regional funds: short sale allowed country funds have consistently positive λ for both share returns and NAV returns.

Results from the pooled regressions, reported in Tables 103.6 and 103.7, also support a significant global risk premium. No matter which model we use, for both share returns and NAV returns the factor loading on the global risk is significantly positive for developed market funds and emerging market funds, for regional funds and single country funds, and for short sale allowed and short sale not allowed funds. Moreover, in the framework of the generalized CAPM, and consistent with previous research, all factors in the Carhart (1997) four-factor model remain significant, including the size factor after controlling for the momentum factor, and the factor loadings of each model are consistent with our expectations. However, when we include the global return or the local market return in the model, the momentum factor no longer plays an important role.

As reported in Tables 103.8 and 103.9, the cross-sectional pooling regressions offer another way to compare the different model specifications for the functional form of international CEFs. First, all of the transformation parameters λ are significantly positive, with larger λ for share returns than for NAV returns. Consistent with the grouped regressions, the global risk is priced in each generalized model, even after controlling for local market returns. Although the transformation parameters are significant in every model, the factor loadings of the generalized model are similar to those of the linear and loglinear models, especially for NAV returns. The results from the generalized model are closer to those of the loglinear specification, which implies that the conventional use of loglinear returns is still reasonable. Controlling for the risk factors in the generalized model, the dummy variables do not produce significant results for share returns. However, the Em and SS dummies are significant for NAV returns. This implies that the differences between emerging and developed markets, and between short sale allowed and short sale not allowed country funds, are more pronounced for NAV returns. This is consistent with the structure of international closed-end funds: the underlying assets are traded in foreign markets, while share prices are set by U.S. investors. Hence NAV returns should reflect the differences among funds more strongly.
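For reference, the Box-Cox-type transformation that defines these specifications nests both conventional return definitions as limiting cases:

\[
R^{(\lambda)} \equiv \frac{(1+R)^{\lambda}-1}{\lambda}, \qquad R^{(1)} = R, \qquad \lim_{\lambda \to 0} R^{(\lambda)} = \ln(1+R),
\]

so λ = 1 corresponds to the linear model in simple returns and λ = 0 to the loglinear model in continuously compounded returns, which is why the estimated λ can be read as an implied investment-horizon parameter.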
103.5.4 Performance Evaluation

As motivated by Jensen (1969) and Lee (1976), among others, one of the important applications of the generalized CAPM is to give a less biased and more efficient performance evaluation of mutual funds. In Table 103.10 we report the risk-adjusted returns of each model for the share returns and NAV returns of individual funds. Consistent with previous research such as Patro (2001), the average performance of international CEFs is negative irrespective of the model used, even before deducting fund expenses. The grouped regressions and cross-sectional pooling regressions give similar results.
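As a concrete illustration of the risk-adjusted performance measure summarized in Table 103.10, the sketch below computes Jensen's alpha and its t-statistic for a single fund under the linear (λ = 1) benchmark. The data and names are hypothetical; the point is only to show the mechanics of the calculation, not to reproduce the chapter's estimates.

import numpy as np

def jensen_alpha(rp, rm, rf):
    # OLS of (rp - rf) on a constant and (rm - rf); returns alpha and t(alpha)
    y = rp - rf
    X = np.column_stack([np.ones_like(rm), rm - rf])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    s2 = e @ e / (len(y) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])
    return b[0], b[0] / se

rng = np.random.default_rng(1)
rf = np.full(240, 0.003)
rm = rng.normal(0.008, 0.045, 240)
rp = rf - 0.002 + 0.9 * (rm - rf) + rng.normal(0.0, 0.03, 240)  # built-in negative alpha

alpha, t_alpha = jensen_alpha(rp, rm, rf)
print(f"alpha = {100 * alpha:.2f}% per month, t = {t_alpha:.2f}")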
103.6 Conclusion

In this chapter, we investigate the functional form specifications for 90 U.S.-based international closed-end country funds. Our results suggest that the frequently used linear and loglinear specifications of security returns may be inappropriate for many funds. The use of a generalized functional form should be the rule rather than the exception. The evidence against the linear functional form in our sample indicates that the investment horizon is likely different from 1 month. For most funds, the true investment horizon is not instantaneous either. Similar to McDonald (1983) and Bubnys and Lee (1989), we encounter many negative estimates of the horizon parameter λ in our sample. Lee and Wei (1988) have shown that the estimated λ will be negative rather than positive if the security return and the market return have a bivariate lognormal distribution. Similarly, for most funds the NAV returns seem to perform slightly worse than the share returns.

Our empirical results imply that the functional transformation does matter for international CEFs. The global risk factor is priced in the framework of the generalized CAPM. The momentum factor is no longer an important factor in the asset pricing of share returns of international CEFs after controlling for the risk factors of the global market return and the local market return. No matter which benchmark we use, international CEFs have negative risk-adjusted performance. Moreover, there exist significantly different patterns of functional forms for emerging market versus developed market funds, for single country versus regional funds, and for funds investing where short sales are and are not allowed. A comparison of the functional forms of share returns and NAV returns of closed-end country funds suggests that foreign investors, especially those from emerging markets, may have more heterogeneous investment horizons vis-à-vis their U.S. counterparts.
Table 103.7 Cross-sectional regressions for NAV returns of international CEFs by group. For each group of funds (Em = 0/1, Reg = 0/1, SS = 0/1), the table reports the mean and standard deviation of NAV returns and, for Models 1-3, the intercept (x10^3), the loadings on R_f, R_m, and R_g, and the estimated transformation parameter λ, with White heteroscedasticity-consistent t-statistics in parentheses; estimates significant at the 5% and 10% levels are flagged in the table. R_f is the 1-month U.S. T-bill rate; R_m^j is the MSCI country or regional index monthly return corresponding to each international CEF; R_g is the MSCI world index monthly return. Em equals one for emerging country funds and zero for developed country funds; Reg equals one for regional funds and zero for single country funds; SS equals one for funds investing in countries where short sales are allowed and zero otherwise. The data span 1965 to 2003. The model estimated is

\[
(1+\mathit{NAV}_{pt})^{(\lambda)} = \alpha_p + \beta_f (1+R_{ft})^{(\lambda)} + \beta_m (1+R_{mt})^{(\lambda)} + \beta_g (1+R_{gt})^{(\lambda)} + \varepsilon_{pt},
\qquad x^{(\lambda)} = \frac{x^{\lambda}-1}{\lambda}.
\]
Table 103.8 Cross-sectional pooled regressions for share returns of international CEFs. Panel A reports the generalized specification (λ = λ̂), Panel B the linear specification (λ = 1), and Panel C the loglinear specification (λ = 0). For Models 1-12, the table reports the intercept (x10^3) and the loadings on R_f, R_m, R_g, R_mkrf, R_smb, R_hml, R_umd, and the dummy variables Em, Reg, and SS, with White heteroscedasticity-consistent t-statistics in parentheses; estimates significant at the 5% level are flagged in the table. Variable definitions are as in Table 103.6: Em equals one for emerging country funds and zero for developed country funds; Reg equals one for regional funds and zero for single country funds; SS equals one for funds investing in countries where short sales are allowed and zero otherwise; MKRF, SMB, HML, and UMD are the monthly returns on the Carhart four factors. The data span 1965 to 2003. The model estimated is

\[
(1+R_{pt})^{(\lambda)} = \alpha_p + \beta_f (1+R_{ft})^{(\lambda)} + \beta_m (1+R_{mt})^{(\lambda)} + \beta_g (1+R_{gt})^{(\lambda)} + \beta_{mkrf}(1+\mathit{MKRF}_t)^{(\lambda)} + \beta_{smb}(1+\mathit{SMB}_t)^{(\lambda)} + \beta_{hml}(1+\mathit{HML}_t)^{(\lambda)} + \beta_{umd}(1+\mathit{UMD}_t)^{(\lambda)} + \beta_e \mathit{Em}_p + \beta_r \mathit{Reg}_p + \beta_s \mathit{SS}_p + \varepsilon_{pt},
\]

where $x^{(\lambda)} = (x^{\lambda}-1)/\lambda$.
Table 103.9 Cross-sectional pooled regressions for NAV returns of international CEFs. Panel A reports the generalized specification (λ = λ̂), Panel B the linear specification (λ = 1), and Panel C the loglinear specification (λ = 0). For Models 1-6, the table reports the intercept (x10^3) and the loadings on R_f, R_m, R_g, and the dummy variables Em, Reg, and SS, with White heteroscedasticity-consistent t-statistics in parentheses; estimates significant at the 5% and 10% levels are flagged in the table. Variable definitions are as in Table 103.6. The data span 1965 to 2003. The model estimated is

\[
(1+\mathit{NAV}_{pt})^{(\lambda)} = \alpha_p + \beta_f (1+R_{ft})^{(\lambda)} + \beta_m (1+R_{mt})^{(\lambda)} + \beta_g (1+R_{gt})^{(\lambda)} + \beta_e \mathit{Em}_p + \beta_r \mathit{Reg}_p + \beta_s \mathit{SS}_p + \varepsilon_{pt},
\qquad x^{(\lambda)} = \frac{x^{\lambda}-1}{\lambda}.
\]
Table 103.10 Performance evaluation of international CEFs. The table gives summary statistics (mean, median, standard deviation, minimum, and maximum across funds) of the estimated alphas α (%), their t-statistics t(α), and the estimated betas β_m and β_g with their t-statistics t(β_m) and t(β_g), for the generalized (λ = λ̂), linear (λ = 1), and loglinear (λ = 0) specifications of each model: GCAPM and GCAPM_CRSP for share returns, GCAPM for NAV returns, the global index model for share returns and NAV returns, and the global GCAPM, global GCAPM_CRSP, and global GCAPM (NAV return) specifications.
References

Aitken, M., A. Frino, M. S. McCorry, and P. L. Swan. 1998. "Short sales are almost instantaneously bad news: evidence from the Australian Stock Exchange." Journal of Finance 53, 2205–2223.
Box, G. E. P. and D. R. Cox. 1964. "An analysis of transformations." Journal of the Royal Statistical Society Series B 26(2), 211–252.
Bris, A., W. N. Goetzmann, and N. Zhu. 2004. "Efficiency and the bear: short sales and markets around the world." Working paper, Yale School of Management.
Bubnys, E. L. and C. F. Lee. 1989. "Linear and generalized functional form market models for electric utility firms." Journal of Economics and Business 41, 213–223.
Carhart, M. 1997. "On persistence in mutual fund performance." Journal of Finance 52, 57–82.
Chang, E., C. Eun, and R. Kolodny. 1995. "International diversification through closed-end country funds." Journal of Banking and Finance 19, 1237–1263.
Chaudhury, M. M. and C. F. Lee. 1997. "Functional form of stock return model: some international evidence." The Quarterly Review of Economics and Finance 37(1), 151–183.
Cumby, R. and J. Glen. 1990. "Evaluating the performance of international mutual funds." Journal of Finance 45(2), 497–521.
Errunza, V. and E. Losq. 1985. "International asset pricing under mild segmentation: theory and test." Journal of Finance 40, 105–124.
Errunza, V. and E. Losq. 1989. "Capital flow controls, international asset pricing, and investors' welfare: a multi-country framework." Journal of Finance 44, 1025–1037.
Fabozzi, F., J. Francis, and C. F. Lee. 1980. "Generalized functional form for mutual fund returns." Journal of Financial and Quantitative Analysis 15, 1107–1119.
Ferson, W. and C. Harvey. 1994. "Sources of risk and expected returns in global equity markets." Journal of Banking and Finance 18, 775–803.
Harvey, C. 1991. "The world price of covariance risk." The Journal of Finance 46(1), 111–157.
Hong, H. and J. Stein. 2003. "Differences of opinion, short-sales constraints and market crashes." Review of Financial Studies 16(2), 487–525.
Jensen, M. C. 1969. "Risk, the pricing of capital assets, and the evaluation of investment portfolios." Journal of Business 42, 167–247.
Johnson, T. W. 2004. "Predictable investment horizons and wealth transfers among mutual fund shareholders." Journal of Finance 59(5), 1979–2012.
Jones, C. and O. Lamont. 2002. "Short sale constraints and stock returns." Journal of Financial Economics 66(2), 207–239.
Jorion, P. and E. Schwartz. 1986. "Integration vs. segmentation in the Canadian stock market." Journal of Finance 41, 603–613.
Lee, C. F. 1976. "Investment horizon and the functional form of the capital asset pricing model." Review of Economics and Statistics 58, 356–363.
Lee, C. F. and K. C. J. Wei. 1988. "Impacts of rates of return distributions on the functional form of CAPM." Working paper, Rutgers University.
Lee, C. F., C. Wu, and K. C. J. Wei. 1990. "The heterogeneous investment horizon and the capital asset pricing model: theory and implications." Journal of Financial and Quantitative Analysis 25, 361–376.
Lessard, D. 1974. "World, national, and industry factors in equity returns." The Journal of Finance 29(2), 379–391.
Levhari, D. and H. Levy. 1977. "The capital asset pricing model and the investment horizon." Review of Economics and Statistics 59, 92–104.
Levy, H. 1972. "Portfolio performance and the investment horizon." Management Science 36, 645–653.
McDonald, B. 1983. "Functional forms and the capital asset pricing model." Journal of Financial and Quantitative Analysis 18, 319–329.
Patro, D. K. 2001. "Measuring performance of international closed-end funds." Journal of Banking and Finance 25, 1741–1767.
Patro, D. K. 2008. "Stock market liberalization and emerging market country fund premiums." Journal of Business 78(1), 135–168.
Solnik, B. 1974a. "An equilibrium model of the international capital market." Journal of Economic Theory 36, 500–524.
Solnik, B. 1974b. "An international market model of security price behavior." Journal of Financial and Quantitative Analysis 10, 537–554.
Solnik, B. 1974c. "The international pricing of risk: an empirical investigation of the world capital market structure." Journal of Finance 29, 365–377.
Tobin, J. 1965. "The theory of portfolio selection," in The theory of interest rates, F. Hain and F. Breechling (Eds.). MacMillan, London, 3–51.
Zarembka, P. 1968. "Functional form in the demand for money." Journal of the American Statistical Association 63, 502–511.
Chapter 104
A Semimartingale BSDE Related to the Minimal Entropy Martingale Measure* Michael Mania, Marina Santacroce, and Revaz Tevzadze
Abstract An incomplete financial market model is considered, where the dynamics of the asset prices is described by an $\mathbb{R}^d$-valued continuous semimartingale. We express the density of the minimal entropy martingale measure in terms of the value process of the related optimization problem and show that this value process is determined as the unique solution of a semimartingale backward equation. We consider some extreme cases when this equation admits an explicit solution.

Keywords Semimartingale backward equation · Contingent claim pricing · Minimal entropy martingale measure · Incomplete markets
104.1 Introduction

Using the dynamic programming technique, we study the structure of the minimal entropy martingale measure, which is related to the problem of derivative pricing in incomplete markets. We assume that the dynamics of the price process of the assets traded on the market is described by an $\mathbb{R}^d$-valued continuous semimartingale $X=(X_t,\ t\in[0,T])$ defined on a filtered probability space $(\Omega,\mathcal{F},F=(\mathcal{F}_t,\ t\in[0,T]),P)$ satisfying the usual conditions of right-continuity and completeness, where $\mathcal{F}=\mathcal{F}_T$ and $T<\infty$ is the fixed time horizon. Suppose that the market also contains a riskless bond with discounted price equal to 1 at all times. Denote by $\mathcal{M}^e$ the set of equivalent martingale measures of $X$, i.e., the set of measures $Q$, equivalent to the basic measure $P$, such that $X$ is a local martingale under $Q$.

In complete markets every contingent claim is replicable and the price of the claim is determined using the uniquely defined martingale measure. It is also well known that if the market is incomplete the set of martingale measures is not in general a singleton and there is an interval of arbitrage-free prices associated with different pricing (martingale) measures [see El Karoui and Quenez (1995)]. The choice of the pricing measure is no longer preference-free and depends on the utility functions of investors or on the criterion relative to which the hedging error is measured.

The most popular choices of pricing measures are the minimal martingale measure, introduced in Föllmer and Schweizer (1991), and the variance-optimal martingale measure (Delbaen and Schachermayer 1996; Schweizer 1996). The latter is determined by minimizing the $L^2$-norm of the densities of the martingale measures with respect to the basic measure $P$ among all signed martingale measures and represents the best choice when the quality of the hedging strategies is measured by the quadratic criterion [see Gourieroux et al. (1998); Pham et al. (1998)]. Another possibility is the minimal entropy martingale measure, which minimizes the relative entropy of a martingale measure with respect to the measure $P$ and is known to be closely related to the exponential hedging problem (Delbaen et al. 2002; Rheinländer 1999; Rouge and El Karoui 2000). For the economic interpretation of the minimal, variance-optimal, and minimal entropy martingale measures see Hoffman et al. (1992), Delbaen et al. (1997), and Frittelli (2000), respectively.

It is known [see Frittelli (2000); Miyahara (1996)] that for a locally bounded process $X$ the minimal entropy martingale measure always exists and is unique, and that if there is a martingale measure with finite relative entropy then the minimal entropy martingale measure is equivalent to $P$.

M. Mania: A. Razmadze Mathematical Institute, M. Aleksidze St. 1, Tbilisi, Georgia, and Georgian-American University, 3, Alleyway II, Chavchavadze Ave. 17, A, Tbilisi, Georgia. e-mail: [email protected]
M. Santacroce: Politecnico di Torino, Department of Mathematics, C.so Duca degli Abruzzi 24, 10129 Torino, Italy.
R. Tevzadze: Institute of Cybernetics, S. Euli St. 5, Tbilisi, Georgia, and Georgian-American University, 3, Alleyway II, Chavchavadze Ave. 17, A, Tbilisi, Georgia.
* This paper is reprinted from Finance and Stochastics, 7 (2003), No. 3, pp. 385–402.
Our aim is to give the construction of the minimal entropy martingale measure when the dynamics of the discounted asset price process is governed by a continuous semimartingale. We obtain a description of the minimal entropy martingale measure in terms of the value function of a suitable problem of an optimal equivalent change of measure and show that this value process uniquely solves the corresponding semimartingale backward stochastic differential equation (BSDE). We show that in two specific extreme cases [already studied in Biagini et al. (2000), Laurent and Pham (1999), and Pham et al. (1998) in relation to the variance-optimal martingale measures] this semimartingale BSDE admits an explicit solution, which gives an explicit construction of the minimal entropy martingale measure. In particular, we give a necessary and sufficient condition for the minimal entropy martingale measure to coincide with the minimal martingale measure, as well as with the martingale measure appearing in the second extreme case mentioned above.

BSDEs were introduced in Bismuth (1973) for the linear case as the equations for the adjoint process in the stochastic maximum principle. In Chitashvili (1983) and Pardoux and Peng (1990) the well-posedness results for BSDEs with more general generators were obtained [see also El Karoui et al. (1997) for references and related results]. The dynamic programming approach in relation to mean-variance and exponential hedging was first used in Laurent and Pham (1999) and Rouge and El Karoui (2000), respectively, for diffusion models. The dynamic programming method was also applied in Mania and Tevzadze (2000) and Mania et al. (2002) to determine the variance-optimal and p-optimal martingale measures in a semimartingale setting. The semimartingale backward equation, as a stochastic version of the Bellman equation in an optimal control problem, was first derived in Chitashvili (1983) [see also Chitashvili and Mania (1987, 1996)].

For all unexplained notations concerning the martingale theory used below we refer the reader to Dellacherie and Meyer (1980), Jacod (1979), and Liptser and Shiryayev (1986).
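Before the formal construction, it may help to see the object of study in the simplest possible setting. In a one-period market with finitely many states, the minimal entropy martingale measure solves a finite-dimensional convex program, and the Lagrangian argument gives it an explicit Esscher-type form. The sketch below uses made-up numbers and illustrates only the definition, not the paper's semimartingale construction.

import numpy as np
from scipy.optimize import brentq

# One-period market: physical probabilities p and price increments dX
p = np.array([0.3, 0.4, 0.3])
dX = np.array([-0.10, 0.02, 0.12])

# Minimizing the relative entropy I(Q, P) = sum_i q_i ln(q_i / p_i) subject
# to the martingale constraint E_Q[dX] = 0 yields q_i proportional to
# p_i * exp(theta * dX_i), with theta fixed by the constraint.
def q(theta):
    w = p * np.exp(theta * dX)
    return w / w.sum()

theta = brentq(lambda t: q(t) @ dX, -200.0, 200.0)
q_star = q(theta)
print("Q* =", q_star.round(4))
print("I(Q*, P) =", float(q_star @ np.log(q_star / p)))

The continuous-time analogue of this exponential structure is the representation of the optimal density derived in Equation (104.45) below.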
104.2 Some Basic Definitions, Conditions, and Auxiliary Facts

For any equivalent martingale measure $Q$ we denote by $Z^Q$ the density process of $Q$ relative to the measure $P$, and let $M^Q$ be a $P$-local martingale such that $Z^Q=\mathcal{E}(M^Q)=(\mathcal{E}_t(M^Q),\ 0\le t\le T)$, where $\mathcal{E}(M)$ is the Doleans-Dade exponential of $M$. Let
\[ \mathcal{M}^e_{Ent}=\big\{Q\in\mathcal{M}^e:\ E Z^Q_T\ln Z^Q_T<\infty\big\}. \]
We assume the following conditions to be satisfied:

(a) all $(F,P)$-local martingales are continuous;

(b) there is an equivalent martingale measure $Q$ such that $E Z^Q_T\ln Z^Q_T<\infty$, i.e.,
\[ \mathcal{M}^e_{Ent}\neq\emptyset. \tag{104.1} \]

Note that conditions (a) and (b) imply that $X$ is a continuous semimartingale satisfying the structure condition. This means that $X$ admits the decomposition
\[ X_t=X_0+\Lambda_t+M_t, \tag{104.2} \]
where $M$ is a continuous local martingale and there exists a predictable $\mathbb{R}^d$-valued process $\lambda$ such that $d\Lambda=d\langle M\rangle\lambda$ with $K_T=\int_0^T\lambda_s'\,d\langle M\rangle_s\lambda_s<\infty$, where $'$ denotes transposition. The process $K$ is called the mean-variance tradeoff process of $X$ [see Schweizer (1994) for the interpretation of the process $K$].

Since $X$ is continuous, any element $Q$ of $\mathcal{M}^e$ is given by a density $Z^Q_t$ expressed as an exponential martingale of the form
\[ \mathcal{E}_t(-\lambda\cdot M+N), \tag{104.3} \]
where $N$ is a local martingale strongly orthogonal to $M$ and the notation $\lambda\cdot M$ stands for the stochastic integral $\int_0^\cdot\lambda_s'\,dM_s$.

If the local martingale $\widehat Z=\mathcal{E}(-\lambda\cdot M)$ is a true martingale, $d\widehat P/dP=\widehat Z_T$ defines an equivalent probability measure called the minimal martingale measure for $X$.

We denote by $\mathcal{N}_{Ent}(X)$ the class of local martingales $N$ strongly orthogonal to $M$ such that the process $(\mathcal{E}_t(-\lambda\cdot M+N),\ t\in[0,T])$ is a strictly positive $P$-martingale with $E\,\mathcal{E}_T(-\lambda\cdot M+N)\ln\mathcal{E}_T(-\lambda\cdot M+N)<\infty$. Then
\[ \mathcal{M}^e_{Ent}=\Big\{Q\sim P:\ \frac{dQ}{dP}\Big|_{\mathcal{F}_T}=\mathcal{E}_T(-\lambda\cdot M+N),\ N\in\mathcal{N}_{Ent}(X)\Big\}. \tag{104.4} \]

We recall the definition of BMO-martingales and the reverse Hölder condition. The square integrable continuous martingale $M$ belongs to the class BMO iff there is a constant $C>0$ such that
\[ E\big(\langle M\rangle_T-\langle M\rangle_\tau\,\big|\,\mathcal{F}_\tau\big)\le C^2 \tag{104.5} \]
for every stopping time $\tau$. The smallest constant with this property is called the BMO norm of $M$ and is denoted by $\|M\|_{BMO}$.

Let $Z$ be a strictly positive uniformly integrable martingale.

Definition 1. The process $Z$ satisfies the $R_{Ent}(P)$ inequality if there is a constant $C_1$ such that
\[ E\Big(\frac{Z_T}{Z_\tau}\ln\frac{Z_T}{Z_\tau}\,\Big|\,\mathcal{F}_\tau\Big)\le C_1 \tag{104.6} \]
for every stopping time $\tau$.

For the proof of the following assertion the reader is referred to Rheinländer (1999) [see Doleans-Dade and Meyer (1979) or Kazamaki (1994) for the case $x^p$, $p>1$].

Proposition 1. Let $\mathcal{E}(M)$ be an exponential martingale associated with the continuous local martingale $M$. If $\mathcal{E}(M)$ is a uniformly integrable martingale and satisfies the $R_{Ent}(P)$ inequality, then $M$ belongs to the class BMO.

Let us recall also the concept of relative entropy [see Csiszar (1975) for the basic properties of relative entropy]. The relative entropy, or Kullback-Leibler distance, $I(Q,R)$ of the probability measure $Q$ with respect to the measure $R$ is defined as
\[ I(Q,R)=E^R\Big(\frac{dQ}{dR}\ln\frac{dQ}{dR}\Big) \tag{104.7} \]
if $Q\ll R$, and $I(Q,R)=\infty$ otherwise. Note that $I(Q,R)\ge 0$ and that $I(Q,R)=0$ iff $Q=R$. The minimal entropy martingale measure $Q^*$ is a solution of the optimization problem
\[ \inf_{Q\in\mathcal{M}_{abs}}I(Q,P)=I(Q^*,P), \]
where $\mathcal{M}_{abs}$ is the set of measures $Q$ absolutely continuous with respect to $P$ such that $X$ is a local martingale under $Q$.

Proposition 2. If $X$ is locally bounded and there exists $Q\in\mathcal{M}_{abs}$ such that $I(Q,P)<\infty$, then the minimal entropy martingale measure exists and is unique. Moreover, if $I(Q,P)<\infty$ for some $Q\in\mathcal{M}^e$, then the minimal entropy martingale measure is equivalent to $P$.

Remark 1. This assertion is proved in Frittelli (2000) assuming that $X$ is bounded, defining the class $\mathcal{M}^e$ as a set of equivalent measures $Q$ such that $X$ is a martingale (and not a local martingale) under $Q$. The proof is the same if $X$ is locally bounded and $\mathcal{M}^e$ is defined as in the Introduction.

Since any continuous process is locally bounded, under assumptions (a) and (b) the minimal entropy martingale measure always exists and is equivalent to the basic measure $P$. Therefore, hereafter we shall consider only equivalent martingale measures and focus our attention on the construction and properties of optimal martingale measures. Thus, we consider the optimization problem
\[ \inf_{Q\in\mathcal{M}^e_{Ent}}E\,\mathcal{E}_T(M^Q)\ln\mathcal{E}_T(M^Q). \tag{104.8} \]

Let us introduce the notations
\[ \mathcal{E}_{tT}(M^Q)=\frac{\mathcal{E}_T(M^Q)}{\mathcal{E}_t(M^Q)},\qquad \langle M^Q\rangle_{tT}=\langle M^Q\rangle_T-\langle M^Q\rangle_t, \]
and let
\[ V_t=\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}E\big(\mathcal{E}_{tT}(M^Q)\ln\mathcal{E}_{tT}(M^Q)\,\big|\,\mathcal{F}_t\big)=\operatorname*{ess\,inf}_{N\in\mathcal{N}_{Ent}(X)}E^Q\big(\ln\mathcal{E}_{tT}(-\lambda\cdot M+N)\,\big|\,\mathcal{F}_t\big) \tag{104.9} \]
be the value process corresponding to the problem (104.8). Let us introduce also the process
\[ \bar V_t=\frac12\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}E^Q\big(\langle M^Q\rangle_{tT}\,\big|\,\mathcal{F}_t\big). \tag{104.10} \]

Remark 2. We shall see later that $V_t=\bar V_t$ if there exists an equivalent martingale measure satisfying the $R_{Ent}$ inequality.

The optimality principle, which is proved in a standard manner [see, e.g., El Karoui and Quenez (1995), Elliott (1982), Laurent and Pham (1999)], takes in this case the following form.

Proposition 3. (A) There exists an RCLL semimartingale, still denoted by $V_t$, such that for each $t\in[0,T]$
\[ V_t=\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}E^Q\big(\ln\mathcal{E}_{tT}(M^Q)\,\big|\,\mathcal{F}_t\big). \]
$V_t$ is the largest RCLL process equal to 0 at time $T$ such that $V_t+\ln\mathcal{E}_t(M^Q)$ is a $Q$-submartingale for every $Q\in\mathcal{M}^e_{Ent}$.
(B) The following properties are equivalent:
(i) $Q^*$ is optimal, i.e., $V_0=\inf_{Q\in\mathcal{M}^e_{Ent}}E^Q\ln\mathcal{E}_T(M^Q)=E^{Q^*}\ln\mathcal{E}_T(M^{Q^*})$;
(ii) $Q^*$ is optimal for all conditional criteria, i.e., for each $t\in[0,T]$
\[ V_t=E^{Q^*}\big(\ln\mathcal{E}_{tT}(M^{Q^*})\,\big|\,\mathcal{F}_t\big)\quad\text{a.s.;} \]
(iii) $V_t+\ln\mathcal{E}_t(M^{Q^*})$ is a $Q^*$-martingale.

The following statement, proved in Delbaen et al. (2002), is a consequence of Proposition 3(B).

Corollary 1. If there exists an equivalent martingale measure $\tilde Q$ whose density satisfies the $R_{Ent}(P)$ inequality, then the density of the minimal entropy martingale measure also satisfies the $R_{Ent}(P)$ inequality.

Proof. It follows immediately, since for any stopping time $\tau$
\[ E\big(\mathcal{E}_{\tau T}(M^{Q^*})\ln\mathcal{E}_{\tau T}(M^{Q^*})\,\big|\,\mathcal{F}_\tau\big)=\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}E\big(\mathcal{E}_{\tau T}(M^{Q})\ln\mathcal{E}_{\tau T}(M^{Q})\,\big|\,\mathcal{F}_\tau\big)\le E\big(\mathcal{E}_{\tau T}(M^{\tilde Q})\ln\mathcal{E}_{\tau T}(M^{\tilde Q})\,\big|\,\mathcal{F}_\tau\big)\le C.\qquad\square \]

104.3 Backward Semimartingale Equation for the Value Process

We say that the process $B$ strongly dominates the process $A$, and we shall write $A\preceq B$, if the difference $B-A\in\mathcal{A}^+_{loc}$, i.e., is a locally integrable increasing process. Let $(A^Q,\ Q\in\mathcal{Q})$ be a family of processes of bounded variation, zero at time zero. Denote by $\operatorname*{ess\,inf}_{Q\in\mathcal{Q}}(A^Q)$ the largest process of finite variation, zero at time zero, which is strongly dominated by the process $A^Q$ for every $Q\in\mathcal{Q}$; this is an "ess inf" of the family $(A^Q,\ Q\in\mathcal{Q})$ relative to the partial order $\preceq$.

Let us consider the following semimartingale backward equation
\[ Y_t=Y_0-\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}\Big(\frac12\langle M^Q\rangle_t+\langle M^Q,L\rangle_t\Big)+L_t,\qquad t<T, \tag{104.11} \]
with the boundary condition
\[ Y_T=0. \tag{104.12} \]

We say that the process $Y$ is a solution of Equations (104.11) and (104.12) if $Y$ is a special semimartingale with respect to the measure $P$ with canonical decomposition
\[ Y_t=Y_0+B_t+L_t,\qquad B\in\mathcal{A}_{loc},\ L\in\mathcal{M}^2_{loc}, \tag{104.13} \]
such that $Y_T=0$ and
\[ B_t=-\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}\Big(\frac12\langle M^Q\rangle_t+\langle M^Q,L\rangle_t\Big). \tag{104.14} \]
Let
\[ L_t=\int_0^t\psi_s'\,dM_s+\tilde L_t,\qquad\langle\tilde L,M\rangle=0, \tag{104.15} \]
be the Galtchouk-Kunita-Watanabe (G-K-W) decomposition of $L$ with respect to the martingale $M$.

Lemma 1. If there exists $Q\in\mathcal{M}^e_{Ent}$ such that $M^Q\in BMO$, then the martingale part $L$ of any bounded solution $Y$ of Equations (104.11) and (104.12) belongs to the class BMO, and
\[ \|L\|_{BMO}\le(2C+1)\,\|M^Q\|_{BMO}, \tag{104.16} \]
where $C$ is an upper bound of the process $Y$.

Proof. Using the Itô formula for $Y_T^2-Y_\tau^2$ and the boundary condition $Y_T=0$, we have that
\[ \langle L\rangle_T-\langle L\rangle_\tau+2\int_\tau^T Y_s\,d(B_s+L_s)\le 0 \tag{104.17} \]
for any stopping time $\tau$. Since $Y$ satisfies Equation (104.11),
\[ B_t+\frac12\langle M^Q\rangle_t+\langle M^Q,L\rangle_t\in\mathcal{A}^+_{loc} \tag{104.18} \]
and, therefore, Equation (104.17) implies that
\[ \langle L\rangle_T-\langle L\rangle_\tau+2\int_\tau^T Y_s\,dL_s-\int_\tau^T Y_s\,d\langle M^Q\rangle_s-2\int_\tau^T Y_s\,d\langle M^Q,L\rangle_s\le 0. \tag{104.19} \]
Without loss of generality we may assume that $L$ is a square integrable martingale; otherwise one can use localization arguments. Therefore, if we take conditional expectations in Equation (104.19), keeping in mind the inequality $|Y_t|\le C$, we obtain
\[ E\big(\langle L\rangle_T-\langle L\rangle_\tau\,\big|\,\mathcal{F}_\tau\big)\le C\,E\big(\langle M^Q\rangle_T-\langle M^Q\rangle_\tau\,\big|\,\mathcal{F}_\tau\big)+2C\,E\Big(\int_\tau^T\big|d\langle M^Q,L\rangle_s\big|\ \Big|\ \mathcal{F}_\tau\Big). \tag{104.20} \]
Now using the conditional Kunita-Watanabe inequality, from Equation (104.20) we have
\[ E\big(\langle L\rangle_T-\langle L\rangle_\tau\,\big|\,\mathcal{F}_\tau\big)\le 2C\,\|M^Q\|_{BMO}\,E^{1/2}\big(\langle L\rangle_T-\langle L\rangle_\tau\,\big|\,\mathcal{F}_\tau\big)+C\,\|M^Q\|^2_{BMO}. \tag{104.21} \]
Solving this quadratic inequality with respect to $x=E^{1/2}(\langle L\rangle_T-\langle L\rangle_\tau\,|\,\mathcal{F}_\tau)$, we obtain the estimate
\[ E\big(\langle L\rangle_T-\langle L\rangle_\tau\,\big|\,\mathcal{F}_\tau\big)\le(2C+1)^2\,\|M^Q\|^2_{BMO}. \]
Since the right-hand side does not depend on $\tau$, the estimate (104.16) also holds and $L$ belongs to the space BMO. $\square$

The value process of the problem (104.8) defined by Equation (104.9) is a special semimartingale with respect to the measure $P$ with the canonical decomposition
\[ V_t=V_0+m_t+A_t,\qquad m\in\mathcal{M}^2_{loc},\ A\in\mathcal{A}_{loc}. \tag{104.22} \]
Let
\[ m_t=\int_0^t\varphi_s'\,dM_s+\tilde m_t,\qquad\langle\tilde m,M\rangle=0, \tag{104.23} \]
be the G-K-W decomposition of $m$ with respect to $M$.

Now we formulate the main statement of the paper.

Theorem 1. Let conditions (a) and (b) be satisfied. Then:
(a) the value process $V$ is a solution of the semimartingale backward equations (104.11) and (104.12). Moreover, a martingale measure $Q^*$ is the minimal entropy martingale measure if and only if it is given by the density $dQ^*=\mathcal{E}_T(M^{Q^*})\,dP$, where
\[ M^{Q^*}_t=-\int_0^t\lambda_s'\,dM_s-\tilde m_t. \tag{104.24} \]
(b) If, in addition, the minimal martingale measure exists and satisfies the reverse Hölder $R_{Ent}$-inequality, then the value process $V$ is the unique bounded solution of Equations (104.11) and (104.12).

Proof. (a) By condition (b) there exists $\tilde Q\in\mathcal{M}^e_{Ent}$ and, according to Proposition 3, the process $Z_t=V_t+\ln\mathcal{E}_t(M^{\tilde Q})$ is a $\tilde Q$-submartingale, hence a $P$-semimartingale by Girsanov's theorem. Since $\mathcal{E}_T(M^{\tilde Q})$ is strictly positive and continuous, the process $\ln\mathcal{E}_t(M^{\tilde Q})$ is a semimartingale and, consequently, the value process $V$ is also a semimartingale under $P$. Condition (a) implies that any adapted RCLL process is predictable [see Revuz and Yor (1999)], hence any semimartingale is special. So $V$ is a $P$-special semimartingale admitting the decomposition (104.22).

The processes $m-\langle m,M^Q\rangle$ and $M^Q-\langle M^Q\rangle$ are $Q$-local martingales by Girsanov's theorem. Therefore, since
\[ Z_t=V_t+\ln\mathcal{E}_t(M^Q)=V_0+m_t+A_t+M^Q_t-\frac12\langle M^Q\rangle_t \tag{104.25} \]
\[ =V_0+\big(m_t-\langle m,M^Q\rangle_t\big)+\big(M^Q_t-\langle M^Q\rangle_t\big)+A_t+\frac12\langle M^Q\rangle_t+\langle m,M^Q\rangle_t, \]
and $V_t+\ln\mathcal{E}_t(M^Q)$ is a $Q$-submartingale for every $Q\in\mathcal{M}^e_{Ent}$, we have that
\[ A_t+\frac12\langle M^Q\rangle_t+\langle m,M^Q\rangle_t\in\mathcal{A}^+_{loc} \tag{104.26} \]
for every $Q\in\mathcal{M}^e_{Ent}$. On the other hand, according to Proposition 2 the optimal martingale measure $Q^*$ exists and is equivalent to $P$. Therefore, by the optimality principle the process $V_t+\ln\mathcal{E}_t(M^{Q^*})$ is a $Q^*$-martingale and, using again Girsanov's theorem, we obtain that
\[ A_t+\frac12\langle M^{Q^*}\rangle_t+\langle m,M^{Q^*}\rangle_t=0. \tag{104.27} \]
Relations (104.26) and (104.27) imply that
\[ A_t=-\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}\Big(\frac12\langle M^Q\rangle_t+\langle m,M^Q\rangle_t\Big); \tag{104.28} \]
hence the value process $V$ satisfies Equation (104.11), and it is evident that $V_T=0$. Equality (104.27) implies that the process $A_t$ and, hence, $V_t$ are continuous.

Let us show now that the optimal martingale measure $Q^*$ is given by Equation (104.24). From Equation (104.28) we have
\[ -A_t=\operatorname*{ess\,inf}_{N\in\mathcal{N}_{Ent}(X)}\Big(\frac12\langle -\lambda\cdot M+N\rangle_t+\langle -\lambda\cdot M+N,m\rangle_t\Big) \]
\[ =\frac12\langle\lambda\cdot M\rangle_t-\langle\lambda\cdot M,m\rangle_t-\frac12\langle\tilde m\rangle_t+\frac12\operatorname*{ess\,inf}_{N\in\mathcal{N}_{Ent}(X)}\langle N+\tilde m\rangle_t=\frac12\langle\lambda\cdot M\rangle_t-\langle\lambda\cdot M,m\rangle_t-\frac12\langle\tilde m\rangle_t, \tag{104.29} \]
since
\[ \operatorname*{ess\,inf}_{N\in\mathcal{N}_{Ent}(X)}\langle N+\tilde m\rangle_t=0. \tag{104.30} \]
To prove relation (104.30), let us define a sequence of stopping times
\[ \tau_n=\inf\Big\{t:\ \mathcal{E}_t(\tilde N)\le\frac1n\ \text{or}\ \mathcal{E}_t(-\lambda\cdot M-\tilde m)\ge n\Big\}\wedge T, \]
where $\tilde N$ is a local martingale from the class $\mathcal{N}_{Ent}(X)$, which exists by condition (b). It is not difficult to see that the local martingale $N^n=-\tilde m^n+\tilde N-\tilde N^n$ belongs to the class $\mathcal{N}_{Ent}(X)$ and $\tau_n\uparrow T$. Therefore
\[ \operatorname*{ess\,inf}_{N\in\mathcal{N}_{Ent}(X)}\langle N+\tilde m\rangle_t\le\langle N^n+\tilde m\rangle_t=\langle\tilde m-\tilde m^n+\tilde N-\tilde N^n\rangle_t\le 2\big(\langle\tilde m\rangle_t-\langle\tilde m\rangle_{t\wedge\tau_n}+\langle\tilde N\rangle_t-\langle\tilde N\rangle_{t\wedge\tau_n}\big) \]
for each $n\ge 1$, and Equation (104.30) holds, since the right-hand side of the latter inequality tends to zero as $n\to\infty$. Here, as previously, $\tilde m$ is the orthogonal martingale part of $m$ in the G-K-W decomposition (104.23) and $\tilde m^n=(\tilde m_{\tau_n\wedge t},\ t\in[0,T])$ is the stopped martingale.

By the optimality principle $V_t+\ln\mathcal{E}_t(M^{Q^*})$ is a $Q^*$-martingale. Since $V$ solves Equation (104.11), this implies that
\[ \operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}\Big(\frac12\langle M^Q\rangle_t+\langle M^Q,m\rangle_t\Big)=\frac12\langle M^{Q^*}\rangle_t+\langle M^{Q^*},m\rangle_t. \tag{104.31} \]
Since $M^{Q^*}$ is represented in the form $-\lambda\cdot M+N$ for some $N\in\mathcal{N}_{Ent}(X)$, it follows from Equations (104.29) and (104.31) that the processes $N$ and $-\tilde m$ and, hence, the processes $M^{Q^*}$ and $-\lambda\cdot M-\tilde m$ are indistinguishable. So the minimal entropy martingale measure is unique and admits the representation (104.24).

(b) It is easy to see that the value process satisfies the two-sided inequality
\[ 0\le V_t\le C\quad\text{a.s. for all }t\in[0,T]. \tag{104.32} \]
The positivity of $V$ follows from the Jensen inequality. On the other hand, if there exists a martingale measure $\tilde Q$ satisfying the reverse Hölder $R_{Ent}$ inequality, then $V$ is bounded above, since
\[ V_t=\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}E\big(\mathcal{E}_{tT}(M^Q)\ln\mathcal{E}_{tT}(M^Q)\,\big|\,\mathcal{F}_t\big)\le E\big(\mathcal{E}_{tT}(M^{\tilde Q})\ln\mathcal{E}_{tT}(M^{\tilde Q})\,\big|\,\mathcal{F}_t\big)\le C. \tag{104.33} \]
Thus, $V$ is a bounded solution of Equations (104.11) and (104.12).

Uniqueness. Let $Y$ be a bounded solution of Equations (104.11) and (104.12). Let us show that the processes $Y$ and $V$ are indistinguishable. Since $Y$ solves Equation (104.11), we have
\[ Y_t+\ln\mathcal{E}_t(M^Q)=Y_0+L_t+B_t+M^Q_t-\frac12\langle M^Q\rangle_t \]
\[ =Y_0+\big(L_t-\langle L,M^Q\rangle_t\big)+\big(M^Q_t-\langle M^Q\rangle_t\big)+\Big(\frac12\langle M^Q\rangle_t+\langle L,M^Q\rangle_t-\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}\Big(\frac12\langle M^Q\rangle_t+\langle L,M^Q\rangle_t\Big)\Big). \]
Therefore, the Girsanov theorem implies that $Y_t+\ln\mathcal{E}_t(M^Q)$ is a $Q$-local submartingale for every $Q\in\mathcal{M}^e_{Ent}$. Thus, the process $Y_t\,\mathcal{E}_t(M^Q)+\mathcal{E}_t(M^Q)\ln\mathcal{E}_t(M^Q)$ is a local $P$-submartingale. Since $(\mathcal{E}_t(M^Q),\ t\in[0,T])$ is a martingale with $E\,\mathcal{E}_T(M^Q)\ln\mathcal{E}_T(M^Q)<\infty$, the process $\mathcal{E}_t(M^Q)\ln\mathcal{E}_t(M^Q)$ belongs to the class D, being a submartingale bounded from below (by the constant $-1/e$). On the other hand, the process $Y_t\,\mathcal{E}_t(M^Q)$ is also from the class D, since $Y$ is bounded and $\mathcal{E}_t(M^Q)$ is a martingale [see, e.g., Dellacherie and Meyer (1980)]. Thus, $Y_t\,\mathcal{E}_t(M^Q)+\mathcal{E}_t(M^Q)\ln\mathcal{E}_t(M^Q)$ is a submartingale from the class D; hence, from the boundary condition we have that
\[ Y_t\,\mathcal{E}_t(M^Q)+\mathcal{E}_t(M^Q)\ln\mathcal{E}_t(M^Q)\le E\big(\mathcal{E}_T(M^Q)\ln\mathcal{E}_T(M^Q)\,\big|\,\mathcal{F}_t\big) \]
for all $Q\in\mathcal{M}^e_{Ent}$, and
\[ Y_t\le\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}E\big[\mathcal{E}_{tT}(M^Q)\ln\mathcal{E}_{tT}(M^Q)\,\big|\,\mathcal{F}_t\big]=V_t. \tag{104.34} \]
Let us show the reverse inequality. Similarly to Equation (104.29) we have
\[ B_t=-\frac12\langle\lambda\cdot M\rangle_t+\langle\lambda\cdot M,L\rangle_t+\frac12\langle\tilde L\rangle_t, \tag{104.35} \]
and the infimum is attained for the martingale
\[ N_t=-\tilde L_t, \tag{104.36} \]
where $\tilde L$ is the orthogonal martingale part of $L$ in the G-K-W decomposition (104.15). Let $M^{Q^0}=-\lambda\cdot M-\tilde L$. Since the minimal martingale measure satisfies the $R_{Ent}(P)$ condition, Proposition 1 implies that $\lambda\cdot M\in BMO$. On the other hand, for any $s\le t$,
\[ \langle\tilde L\rangle_t-\langle\tilde L\rangle_s\le\langle L\rangle_t-\langle L\rangle_s; \]
hence Lemma 1 implies that $M^{Q^0}\in BMO$. Therefore, from Kazamaki (1994) it follows that the process $(\mathcal{E}_t(M^{Q^0}),\ t\in[0,T])$ is a martingale, hence $dQ^0=\mathcal{E}_T(M^{Q^0})\,dP$ defines an absolutely continuous martingale measure.

It is easy to see that $Y_t+\ln\mathcal{E}_t(M^{Q^0})$ is a local martingale under $Q^0$. Indeed, Equations (104.35) and (104.15) imply that
\[ Y_t+\ln\mathcal{E}_t(M^{Q^0})=Y_0+L_t-\frac12\langle\lambda\cdot M\rangle_t+\langle\lambda\cdot M,L\rangle_t+\frac12\langle\tilde L\rangle_t-(\lambda\cdot M)_t-\tilde L_t-\frac12\langle\lambda\cdot M\rangle_t-\frac12\langle\tilde L\rangle_t=Y_0+\big((\psi-\lambda)\cdot X\big)_t, \tag{104.37} \]
which is a $Q^0$-local martingale by Girsanov's theorem. Therefore, $Z_t=Y_t\,\mathcal{E}_t(M^{Q^0})+\mathcal{E}_t(M^{Q^0})\ln\mathcal{E}_t(M^{Q^0})$ is a $P$-local martingale.

Let us show that $Q^0\in\mathcal{M}^e_{Ent}$ and that the process $Z$ is a martingale. It is easy to see that $Z_t\ge-C\,\mathcal{E}_t(M^{Q^0})-\frac1e$. Thus, $Z$ is a local martingale majorizing a uniformly integrable martingale, hence it is a supermartingale, and we have that
\[ Y_t\,\mathcal{E}_t(M^{Q^0})+\mathcal{E}_t(M^{Q^0})\ln\mathcal{E}_t(M^{Q^0})\ge E\big(Y_T\,\mathcal{E}_T(M^{Q^0})+\mathcal{E}_T(M^{Q^0})\ln\mathcal{E}_T(M^{Q^0})\,\big|\,\mathcal{F}_t\big). \]
Therefore, from Equations (104.12) and (104.34) we obtain that
\[ E\big(\mathcal{E}_{tT}(M^{Q^0})\ln\mathcal{E}_{tT}(M^{Q^0})\,\big|\,\mathcal{F}_t\big)\le Y_t\le V_t\le C. \tag{104.38} \]
The latter inequality implies that $E\,\mathcal{E}_T(M^{Q^0})\ln\mathcal{E}_T(M^{Q^0})<\infty$ and that $Q^0$ is optimal; hence, by Proposition 2, $Q^0$ is equivalent to $P$ and $Q^0\in\mathcal{M}^e_{Ent}$. Using the same arguments as before, we have that $Z$ is a local martingale of class D, therefore it is a martingale [see, e.g., Dellacherie and Meyer (1980)]. Now the martingale property and the boundary condition imply that
\[ Y_t=E\big(\mathcal{E}_{tT}(M^{Q^0})\ln\mathcal{E}_{tT}(M^{Q^0})\,\big|\,\mathcal{F}_t\big). \tag{104.39} \]
Since $Q^0\in\mathcal{M}^e_{Ent}$, the equality $Y_t=V_t$ a.s. for all $t\in[0,T]$ results from Equations (104.34) and (104.39); hence $V$ is the unique bounded solution of Equations (104.11) and (104.12). $\square$

Now we formulate Theorem 1(b) in the following equivalent martingale form.

Proposition 4. Let the conditions of Theorem 1(b) be satisfied and let
\[ V_t=V_0+A_t+\int_0^t\varphi_s'\,dM_s+\tilde m_t,\qquad\langle M,\tilde m\rangle=0, \tag{104.40} \]
be the decomposition of the value process. Then the triple $(V_0,\varphi,\tilde m)$ is a solution of the martingale equation
\[ c+\int_0^T\psi_s'\,dM_s+\tilde L_T=\frac12\langle\lambda\cdot M\rangle_T-\langle\lambda\cdot M,\psi\cdot M\rangle_T-\frac12\langle\tilde L\rangle_T \tag{104.41} \]
such that $c\in\mathbb{R}_+$ and $\psi\cdot M,\tilde L\in BMO$. Conversely, if a triple $(c,\psi,\tilde L)$ such that $c\in\mathbb{R}_+$ and $\psi\cdot M,\tilde L\in BMO$ solves Equation (104.41), then the process $Y$ defined by
\[ Y_t=E\Big(\frac12\langle\lambda\cdot M\rangle_{tT}-\langle\lambda\cdot M,\psi\cdot M\rangle_{tT}-\frac12\langle\tilde L\rangle_{tT}\,\Big|\,\mathcal{F}_t\Big) \tag{104.42} \]
is a bounded solution of Equations (104.11) and (104.12) and coincides with the value process.

Proof. Relation (104.29) implies that Equations (104.11) and (104.12) are equivalent to the backward semimartingale equation
\[ Y_t=Y_0-\frac12\langle\lambda\cdot M\rangle_t+\langle\lambda\cdot M,\psi\cdot M\rangle_t+\frac12\langle\tilde L\rangle_t+\int_0^t\psi_s'\,dM_s+\tilde L_t, \tag{104.43} \]
\[ Y_T=0. \tag{104.44} \]
Since $V$ solves Equation (104.43), using the boundary condition (104.12) we obtain from Equation (104.43) that the triple $(V_0,\varphi,\tilde m)$ satisfies Equation (104.41). Besides, it follows from Lemma 1 that $\varphi\cdot M,\tilde m\in BMO$.

Conversely, let the triple $(c,\psi,\tilde L)$ solve Equation (104.41) and let $Y$ be the process defined by Equation (104.42). Using the martingale properties of the BMO-martingales $\psi\cdot M$ and $\tilde L$, we see that the martingale part of $Y$ [defined by Equation (104.42)] coincides with $c+\int_0^t\psi_s'\,dM_s+\tilde L_t$; hence $Y$ satisfies Equations (104.11) and (104.12). As $\psi\cdot M,\tilde L\in BMO$, the conditional Kunita-Watanabe inequality and Equation (104.42) imply that $Y$ is bounded; therefore $Y$ coincides with the value process by Theorem 1(b). $\square$

It is well known (Frittelli 2000; Rheinländer 1999) that $Q^*$ is the minimal entropy martingale measure if and only if
(i) $\mathcal{E}_T(M^{Q^*})=e^{c+\int_0^T h_s'\,dX_s}$ for some constant $c$ and an $X$-integrable $h$, and
(ii) $E^{Q^*}\int_0^T h_s'\,dX_s=0$ and $E^Q\int_0^T h_s'\,dX_s\ge 0$ for any $Q\in\mathcal{M}^e_{Ent}$.
The sufficiency part of this assertion is hard to verify, since condition (ii) involves the optimal martingale measure. The following corollary of Theorem 1 shows that the integrand $h$ of the minimal entropy martingale measure can be expressed in terms of the value process $V$, and since $V$ solves Equations (104.11) and (104.12), condition (ii) is automatically satisfied.

Corollary 2. A martingale measure $Q^*$ is the minimal entropy martingale measure if and only if the corresponding density admits the representation
\[ \mathcal{E}_T(M^{Q^*})=e^{V_0+\int_0^T(\varphi_s-\lambda_s)'\,dX_s}, \tag{104.45} \]
where $\varphi$ is the integrand in the G-K-W decomposition of the martingale part $m$ of the value process.

Proof. It follows from Theorem 1 and relation (104.29) that $V$ satisfies the equation
\[ V_t=V_0-\frac12\langle\lambda\cdot M\rangle_t+\langle\lambda\cdot M,\varphi\cdot M\rangle_t+\frac12\langle\tilde m\rangle_t+(\varphi\cdot M)_t+\tilde m_t. \]
Taking exponentials of both sides of the latter equation and using the definitions of the process $X$ and of the Doleans-Dade exponential, we obtain that
\[ e^{V_t}=\mathcal{E}_t^{-1}(-\lambda\cdot M-\tilde m)\,e^{V_0+\int_0^t(\varphi_s-\lambda_s)'\,dX_s}, \tag{104.46} \]
and from the boundary condition (104.12) we have
\[ \mathcal{E}_T(-\lambda\cdot M-\tilde m)=e^{V_0+\int_0^T(\varphi_s-\lambda_s)'\,dX_s}. \tag{104.47} \]
Now, since by Theorem 1 $Q^*$ is the minimal entropy martingale measure if and only if it satisfies Equation (104.24), the representation (104.45) follows from Equations (104.24) and (104.47).

Note that for the process $\varphi-\lambda$ condition (ii) is satisfied. Indeed, Equation (104.46) implies that
\[ V_t+\ln\mathcal{E}_t(-\lambda\cdot M-\tilde m)=V_0+\int_0^t(\varphi_s-\lambda_s)'\,dX_s, \tag{104.48} \]
and by the optimality principle $(\int_0^t(\varphi_s-\lambda_s)'\,dX_s,\ t\in[0,T])$ is a $Q^*$-martingale; hence $E^{Q^*}\int_0^T(\varphi_s-\lambda_s)'\,dX_s=0$.

For any $Q\in\mathcal{M}^e_{Ent}$ and $x\in[0,1]$ let us define $Q^x=xQ+(1-x)Q^*$. Then $Z_T^x=x\,\mathcal{E}_T(M^Q)+(1-x)\,\mathcal{E}_T(M^{Q^*})$ is the corresponding density and, according to Lemma 2.1 of Frittelli (2000), the function $f(x)=E\,Z_T^x\ln Z_T^x$ is differentiable at $x$ and
\[ \frac{d}{dx}E\,Z_T^x\ln Z_T^x\Big|_{x=0}=E\,\ln\mathcal{E}_T(M^{Q^*})\big(\mathcal{E}_T(M^Q)-\mathcal{E}_T(M^{Q^*})\big). \]
Moreover, $Q^*$ is optimal iff $\frac{d}{dx}f\big|_{x=0}\ge 0$. Therefore, from Equation (104.48) and the latter inequality we obtain
\[ E^Q\int_0^T(\varphi_s-\lambda_s)'\,dX_s=E^Q\ln\mathcal{E}_T(-\lambda\cdot M-\tilde m)-V_0=E\,\ln\mathcal{E}_T(M^{Q^*})\big(\mathcal{E}_T(M^Q)-\mathcal{E}_T(M^{Q^*})\big)\ge 0 \tag{104.49} \]
for any $Q\in\mathcal{M}^e_{Ent}$; hence (ii) is satisfied. $\square$

Corollary 3. If there exists a martingale measure $\tilde Q$ whose density satisfies the reverse Hölder inequality $R_{Ent}(P)$, then
\[ V_t=\bar V_t. \tag{104.50} \]

Proof. We denote by $\mathcal{R}_{Ent}(X)$ the set of martingale measures $Q$ whose densities $Z^Q$ satisfy the $R_{Ent}(P)$ inequality. By Corollary 1 the minimal entropy martingale measure $Q^*$ is in $\mathcal{R}_{Ent}(X)$. Therefore
\[ V_t=\operatorname*{ess\,inf}_{Q\in\mathcal{M}^e_{Ent}}E^Q\big(\ln\mathcal{E}_{tT}(M^Q)\,\big|\,\mathcal{F}_t\big)=\operatorname*{ess\,inf}_{Q\in\mathcal{R}_{Ent}(X)}E^Q\big(\ln\mathcal{E}_{tT}(M^Q)\,\big|\,\mathcal{F}_t\big) \]
\[ =\operatorname*{ess\,inf}_{Q\in\mathcal{R}_{Ent}(X)}E^Q\Big(M^Q_{tT}-\langle M^Q\rangle_{tT}+\frac12\langle M^Q\rangle_{tT}\,\Big|\,\mathcal{F}_t\Big)=\frac12\operatorname*{ess\,inf}_{Q\in\mathcal{R}_{Ent}(X)}E^Q\big(\langle M^Q\rangle_{tT}\,\big|\,\mathcal{F}_t\big)=\bar V_t, \]
since $Q\in\mathcal{R}_{Ent}(X)$ implies that $M^Q\in BMO$ (Proposition 1) and, according to Proposition 7 of Doleans-Dade and Meyer (1979), $M^Q\in BMO(P)$ implies that the process $M^Q-\langle M^Q\rangle$ is a BMO martingale with respect to the measure $Q$; hence $E^Q(M^Q_{tT}-\langle M^Q\rangle_{tT}\,|\,\mathcal{F}_t)=0$. We recall that $M^Q_{tT}=M^Q_T-M^Q_t$ and $\langle M^Q\rangle_{tT}=\langle M^Q\rangle_T-\langle M^Q\rangle_t$. $\square$

This expression of the value process enables us to determine easily the minimal entropy martingale measure in some particular cases.

Proposition 5. Assume that the minimal martingale measure $Q^{min}$ belongs to the class $\mathcal{M}^e_{Ent}$ and $X$ is a martingale with respect to any $Q\in\mathcal{M}^e_{Ent}$. Then the following assertions are equivalent:
(1) the minimal entropy martingale measure $Q^*$ coincides with the minimal martingale measure $Q^{min}$;
(2) the mean-variance tradeoff admits the representation
\[ \langle\lambda\cdot M\rangle_T=c+\int_0^T\zeta_s'\,dX_s \tag{104.51} \]
for some constant $c$ and an $X$-integrable process $\zeta$ such that $E^{min}\int_0^T\zeta_s'\,dX_s=0$ and $E^Q\int_0^T\zeta_s'\,dX_s\ge 0$ for any $Q\in\mathcal{M}^e_{Ent}$.

Proof. (1)$\Rightarrow$(2). Let $Q^*=Q^{min}$. Then by Corollary 2
\[ \mathcal{E}_T(-\lambda\cdot M)=e^{V_0+\int_0^T(\varphi_s-\lambda_s)'\,dX_s}, \tag{104.52} \]
where $\varphi$ is defined by Equation (104.40). It follows from Equation (104.52) that
\[ \exp\Big(-\int_0^T\lambda_s'\,dM_s-\frac12\langle\lambda\cdot M\rangle_T\Big)=\exp\Big(V_0+\int_0^T\varphi_s'\,dX_s-\int_0^T\lambda_s'\,dM_s-\langle\lambda\cdot M\rangle_T\Big), \]
which implies
\[ \frac12\langle\lambda\cdot M\rangle_T=V_0+\int_0^T\varphi_s'\,dX_s, \tag{104.53} \]
and, hence, Equation (104.51) is satisfied with $\zeta=2\varphi$ and $c=2V_0$.

Since $E^Q\int_0^T\lambda_s'\,dX_s=0$ for any $Q\in\mathcal{M}^e_{Ent}$, it follows from Equation (104.49) that $E^Q\int_0^T\varphi_s'\,dX_s\ge E^Q\int_0^T\lambda_s'\,dX_s=0$ for any $Q\in\mathcal{M}^e_{Ent}$. Besides, Equation (104.46) implies that
\[ V_t+\ln\mathcal{E}_t(-\lambda\cdot M)=V_0+\int_0^t(\varphi_s-\lambda_s)'\,dX_s, \tag{104.54} \]
and by the optimality principle $(\int_0^t(\varphi_s-\lambda_s)'\,dX_s,\ t\in[0,T])$ is a $Q^{min}$-martingale; hence $E^{min}\int_0^T\varphi_s'\,dX_s=E^{min}\int_0^T\lambda_s'\,dX_s=0$.

(2)$\Rightarrow$(1). If Equation (104.51) is satisfied, then
\[ \mathcal{E}_T(-\lambda\cdot M)=\exp\Big(-\int_0^T\lambda_s'\,dX_s+\frac12\langle\lambda\cdot M\rangle_T\Big)=\exp\Big(\frac{c}{2}+\int_0^T\Big(\frac12\zeta_s-\lambda_s\Big)'\,dX_s\Big), \]
and it is evident that $E^Q\int_0^T(\frac12\zeta_s-\lambda_s)'\,dX_s\ge 0$ for any $Q\in\mathcal{M}^e_{Ent}$ and $E^{min}\int_0^T(\frac12\zeta_s-\lambda_s)'\,dX_s=0$; hence $Q^*=Q^{min}$ by the sufficiency condition of Theorem 2.3 in Frittelli (2000), corresponding to the previously given condition (ii). $\square$

Corollary 4. Assume that the mean-variance tradeoff $\langle\lambda\cdot M\rangle_T$ is bounded. Then $Q^*=Q^{min}$ if and only if Equation (104.51) is satisfied for some constant $c$ and an $X$-integrable process $\zeta$ such that $(\int_0^t\zeta_s'\,dX_s,\ t\in[0,T])$ is a $Q^{min}$-martingale.

The proof follows from Proposition 5, since the boundedness of $\langle\lambda\cdot M\rangle_T$ implies that $X$ is a martingale with respect to any $Q\in\mathcal{M}^e_{Ent}$. Besides, if equality (104.51) is satisfied and if $(\int_0^t\zeta_s'\,dX_s,\ t\in[0,T])$ is a martingale with respect to some $Q\in\mathcal{M}^e_{Ent}$, then this process is bounded and will be a martingale with respect to any $Q\in\mathcal{M}^e_{Ent}$.

Remark 3. Condition (104.51) is satisfied in the case of "almost complete" diffusion models [see, e.g., Pham et al. (1998)], where the market price of risk is measurable with respect to the filtration generated by the asset price process.

Corollary 5. The mean-variance tradeoff $\langle\lambda\cdot M\rangle_T$ is deterministic if and only if the minimal entropy martingale measure coincides with the minimal martingale measure and $\varphi=0$ $\mu^{\langle M\rangle}$-a.e., where $\varphi$ is defined by Equation (104.23) and $\mu^{\langle M\rangle}$ is the Dolean measure of $\langle M\rangle$.

The proof immediately follows from Proposition 5.

Proposition 6. Assume that the minimal martingale measure exists and satisfies the reverse Hölder $R_{Ent}$-inequality. Then the density of the minimal entropy martingale measure is of the form
\[ Z_T^{Q^*}=\frac{\exp\big(-\int_0^T\lambda_s'\,dX_s\big)}{E\exp\big(-\int_0^T\lambda_s'\,dX_s\big)} \tag{104.55} \]
if and only if
\[ \exp\Big(-\frac12\langle\lambda\cdot M\rangle_T\Big)=c+\hat m_T \tag{104.56} \]
for some constant $c$ and a martingale $\hat m$ strongly orthogonal to $M$.

Proof. Let $Z_T^{Q^*}$ be of the form (104.55). By Equation (104.24) we have that
\[ \mathcal{E}_T(-\lambda\cdot M-\tilde m)=\frac{\exp\big(-\int_0^T\lambda_s'\,dX_s\big)}{E\exp\big(-\int_0^T\lambda_s'\,dX_s\big)}, \]
which implies that
\[ \exp\Big(-\frac12\langle\lambda\cdot M\rangle_T\Big)=c\,\exp\Big(-\tilde m_T-\frac12\langle\tilde m\rangle_T\Big)=c\,\mathcal{E}_T(-\tilde m)=c+c\int_0^T\mathcal{E}_s(-\tilde m)\,d(-\tilde m)_s, \]
where the martingale $\tilde m$, orthogonal to $M$, is defined by Equation (104.23) and belongs to the class BMO according to Lemma 1. Therefore, Equation (104.56) is satisfied with $\hat m_t=c\int_0^t\mathcal{E}_s(-\tilde m)\,d(-\tilde m)_s$, which is a martingale according to Kazamaki (1994).

Conversely, let Equation (104.56) be satisfied. Then, using the Itô formula for $\ln(c+\hat m_t)$, from Equation (104.56) we have
\[ \ln c+\int_0^T\frac{1}{c+\hat m_s}\,d\hat m_s=-\frac12\langle\lambda\cdot M\rangle_T+\frac12\int_0^T\frac{1}{(c+\hat m_s)^2}\,d\langle\hat m\rangle_s. \]
lnc;
Z
0
(2) By Corollary 2, Q is the minimal entropy martingale measure if and only if the corresponding density admits representation
1 dm Os c Cm Os
is a solution of the martingale equation (104.41). The martingale cC1 mO m O belongs to the class BMO, since by Equation (104.56) c C m O t 1 and Proposition 1 with the Jensen inequality imply that 1
cCm O t D E.e 2 hM iT =Ft / E.e
12 hM it T
=Ft / e
Q where M Q D M L:
dQ D ET .M Q /dP;
which implies that the triple D 0; LQ D
LQ is a real-valued local martingale strongly orthogonal to M . We give two equivalent characterizations of the minimal entropy martingale measure in terms of the solution of Equation (104.57): (1) By Theorem 1, Q is the minimal entropy martingale measure if and only if
ET .M Q / D e Y0 C
RT
e
1 2C
s s /
.
RT 0
s0 dXs
:
Since solution of Equation (104.41) is unique in the class RC BMO BMO, we obtain that ' D 0. Therefore, it follows from Corollary 2 that ZT .Q / is of the form, Equation (104.55). t u
In an incomplete market model where the dynamics of the discounted assets price process is described by an Rd -valued continuous semimartingale, we characterize the minimal entropy martingale measure in terms of the value function of a suitable problem of an optimal equivalent change of measure. This value process has been expressed (under certain conditions) as the unique bounded solution of the BSDE 1 1 Q Yt D Y0 h M it C h M; M it C hLi tC 2 2 Z t 0 Q C YT D 0; (104.57) s dMs C Lt ; 0
which is equivalent to Equations (104.11) and (104.12) according to Equation (104.29). We say that the solution of this equation is a special semimartingale Y , keeping in mind that M C LQ is the martingale part of Y . One can say Q also that a solution of Equation (104.57) is a triple .Y; ; L/, d where is R -valued predictable M -integrable process and
s
:
(104.58)
over all 2 ˘; (104.59)
where ˘ is the class of all Rd -valued predictable M -integrable processes such that X is a martingale with respect to all Q 2 MeEnt . The optimal strategy of problem (104.59) can be expressed in terms of the BSDE (Equation (104.57)) as D
104.4 Conclusions
0 dX
We point out that, by duality, one can also find the optimal portfolio related to the exponential utility maximization problem to maximize EŒe ˛
12 E.hM it T =Ft /
0
1 . ˛
/:
This can be shown using Equation (104.58) in a standard way, since i h h RT 0 RT 0 RT E e ˛ 0 s dXs D E Q e ˛ 0 s dXs Y0 0 .
s s /
0 dX
i s
e Y0 by Jensen’s inequality and this value is attained for D 1 ˛ . /. We also consider two specific extreme cases for which the BSDE admits an explicit solution and in such cases we explicitly construct the minimal entropy martingale measure. In particular, we give necessary and sufficient conditions for the minimal entropy martingale measure to coincide with the minimal martingale measure as well as to coincide with the martingale measure (expressed by Equation (104.55)) appearing in the second above mentioned extreme case. Acknowledgment The authors are grateful to an anonymous referee for valuable remarks and suggestions. This research is supported by INTAS Grant 97-30204.
104 A Semimartingale BSDE Related to the Minimal Entropy Martingale Measure
References Biagini, F., P. Guasoni and M. Pratelli. 2000. “Mean variance hedging for stochastic volatility models.” Mathematical Finance 10, 109–129. Bismut, J. M. 1973. “Conjugate convex functions in optimal stochastic control.” Journal of Mathematical Analysis and Applications 44, 384–404. Chitashvili, R. 1983. “Martingale ideology in the theory of controlled stochastic processes,” in Lecture Notes in Mathematics, Vol. 1021, Springer, Berlin, pp. 73–92. Chitashvili, R. and M. Mania. 1987. “Optimal locally absolutely continuous change of measure: finite set of decisions.” Stochastics Stochastics Reports 21, 131–185 (part 1), 187–229 (part 2). Chitashvili, R. and M. Mania. 1996. “Generalized Ito’s formula and derivation of Bellman’s equation,” in Stochastic processes and related topics, Stochastics Monograph Vol. 10, H. J. Engelbert et al. (Eds.). Gordon & Breach, London, pp. 1–21. Csiszar, I. 1975. “I divergence geometry of probability distributions and minimization problems.” Annals of Probability 3, 146–158. Delbaen, F. and W. Schachermayer. 1996. “The variance-optimal martingale measure for continuous processes.” Bernoulli 2, 81–105. Delbaen, F., P. Monat, W. Schachermayer, M. Schweizer, and Ch. Stricker. 1997. “Weighted norm inequalities and hedging in incomplete markets.” Finance and Stochastics 1, 181–227. Delbaen, F., P. Grandits, Th. Rheinländer, D. Samperi, M. Schweizer, and Ch. Stricker. 2002. “Exponential hedging and entropic penalties.” Mathematical Finance 12, 99–123. Dellacherie, C. and P. A. Meyer. 1980. Probabilités et potentiel. Chapitres V a VIII. Théorie des martingales, Hermann, Paris. Doleans-Dade, K. and P. A. Meyer. 1979. “Inégalités de normes avec poinds,” in Séminaire de Probabilités XIII, Lecture Notes in Mathematics, Vol. 721, Springer, Berlin, pp. 204–215. El Karoui, N. and M. C. Quenez. 1995. “Dynamic programming and pricing of contingent claims in an incomplete market.” SIAM Journal on Control and Optimization 33, 29–66. El Karoui, N., S. Peng, and M. C. Quenez. 1997. “Backward stochastic differential equations in finance.” Mathematical Finance 75, 1–71. Elliott, R. J. 1982. Stochastic calculus and applications, Springer, New York. Föllmer, H. and M. Schweizer. 1991. “Hedging of contingent claims under incomplete information,” in Applied stochastic analysis, Stochastics monographs Vol. 5, M. H. A. Davis and R. J. Elliott (Eds.). Gordon & Breach, London, pp. 389–414.
1565
Frittelli, M. 2000. “The minimal entropy martingale measure and the evaluation problem in incomplete markets.” Mathematical Finance 10, 39–52. Gourieroux, C., J. P. Laurent, and H. Pham. 1998. “Mean–variance hedging and numeraire.” Mathematical Finance 8, 179–200. Hoffman, N., E. Platen, and M. Schweizer. 1992. “Option pricing under incompleteness and stochastic volatility.” Mathematical Finance 2, 153–187. Jacod, J. 1979. Calcule Stochastique et problèmes des martingales, Lecture Notes in Mathematics, Vol. 714, Springer, Berlin. Kazamaki, N. 1994. Continuous exponential martingales and BMO, Lecture Notes in Mathematics, Vol. 1579, Springer, Berlin. Laurent, J. P. and H. Pham. 1999. “Dynamic programming and mean– variance hedging.” Finance and Stochastics 3, 83–110. Liptser, R. Sh. and A. N. Shiryayev. 1986. Martingale theory, Nauka, Moscow. Mania, M. and R. Tevzadze. 2000. “A semimartingale Bellman equation and the variance-optimal martingale measure.” Georgian Mathematical Journal 7, 765–792. Mania, M., M. Santacroce, and R. Tevzadze. 2002. “A semimartingale backward equation related to the p-optimal martingale measure and the lower price of a contingent claim,” in Stochastic processes and related topics, Stochastics Monograph Vol. 12, Taylor & Francis, London, pp. 189–212. Miyahara, Y. 1996. “Canonical martingale measures of incomplete assets markets,” in Probability theory and mathematical statistics: proceedings of the seventh Japan–Russia symposium, World Scientific, Tokyo, pp. 343–352. Pardoux, E. and S. G. Peng. 1990. “Adapted solution of a backward stochastic differential equation.” Systems & Control Letters 14, 55–61. Pham, H., Th. Rheinländer, and M. Schweizer. 1998. “Mean–variance hedging for continuous processes: new proofs and examples.” Finance and Stochastics 2, 173–198. Revuz, D. and M. Yor. 1999. Continuous Martingales and Brownian motion, Springer, Berlin. Rheinländer, Th. 1999. Optimal martingale measures and their applications in mathematical finance, Phd Thesis, Berlin. Rouge, R. and N. El Karoui. 2000. “Pricing via utility maximization and entropy.” Mathematical Finance 10, 259–276. Schweizer, M. 1994. “Approximating random variables by stochastic integrals.” Annals of Probability 22, 1536–1575. Schweizer, M. 1996. “Approximation pricing and the variance optimal martingale measure.” Annals of Probability 24, 206–236.
Chapter 105
The Density Process of the Minimal Entropy Martingale Measure in a Stochastic Volatility Model with Jumps (Reprint)* Fred Espen Benth and Thilo Meyer-Brandis
Abstract We derive the density process of the minimal entropy martingale measure in the stochastic volatility model proposed by Barndorff-Nielsen and Shephard (Journal of the Royal Statistical Society, Series B 63:167–241, 2001). The density is represented by the logarithm of the value function for an investor with exponential utility and no claim issued, and a Feynman–Kac representation of this function is provided. The dynamics of the processes determining the price and volatility are explicitly given under the minimal entropy martingale measure, and we derive a Black and Scholes equation with integral term for the price dynamics of derivatives. It turns out that the price is the solution of a coupled system of two integro-partial differential equations. Keywords Stochastic volatility r Lévy processes r Subordinators r Minimal entropy martingale measure r Density process r Incomplete market r Indifference pricing of derivatives r Integro-partial differential equations
105.1 Introduction In this paper we derive the density process of the minimal entropy martingale measure in a Black and Scholes market with a stochastic volatility model given by Barndorff-Nielsen and Shephard (2001). We apply our results to find the minimal entropy price of derivatives in this market, and present a system of integro-partial differential equations (integro-PDEs) that determines the price. The knowledge of the density process also enables us to describe the price dynamics of the market under the minimal entropy measure. Barndorff-Nielsen and Shephard (2001) propose a geometric Brownian motion where the squared volatility is modeled by a non-Gaussian Ornstein–Uhlenbeck process as the price dynamics for a financial asset. In their model, the volatility level will revert toward zero, with random upward
F.E. Benth () and T. Meyer-Brandis University of Oslo, Oslo, Norway e-mail: [email protected]; [email protected]
shifts given by the jumps of a subordinator process (being an increasing Lévy process). Due to the stochastic volatility, this asset price model leads to an incomplete market, and the arbitrage theory does not provide a unique price for derivatives written on the asset due to the existence of a continuum of equivalent martingale measures. Thus, the risk preferences of the market participants need to be included in the price formation of the derivatives. Utility indifference pricing (see Hodges and Neuberger Hodges and Neuberger (1989)) gives an alternative to the arbitrage theory to derive the fair premium of derivatives in incomplete markets. One considers an investor trying to maximize his exponential utility by either entering into the market by his own account, or issuing a derivative and investing his incremental wealth after collecting the premium. The indifference price of the claim is then defined as the premium for which the investor becomes indifferent between the two investment alternatives. It is well-known (see e.g., Fritelli (2000), Rouge and El Karoui (2000), and Delbaen et al. (2002)) that the zero risk aversion limit of the indifference price corresponds to the minimal entropy martingale measure price. The zero risk aversion limit is of particular interest, since this is the only price for which the buyer and seller agree on the indifference price. We state the density of the minimal entropy martingale measure by appealing to general results by Grandits and Rheinländer (2002) and verification results by Rheinländer (2005). The density process is introduced via a function H which is related to the solution of the portfolio optimization problem of the investor having an exponential utility function and not issuing any claim. In fact, it arises from the logarithmic transform of the value function in a similar fashion as demonstrated in Musiela and Zariphopoulou (2003). This function is represented as an expectation of the exponential of a ratio between the drift and squared volatility, and shown to be the Feynman–Kac solution of an integro-PDE.
*
This paper is reprinted from Finance and Stochastics, 9 (2005), No. 4, pp. 563–575.
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_105,
1567
1568
It provides us with the scaling of the jumps when considering the minimal entropy dynamics of the stochastic volatility model. We apply our results to find the minimal entropy price of a class of claims with payoff given by a function of the underlying at maturity of the contract. The price is written as an expected value of the payoff, where we have complete knowledge of the dynamics of the asset and volatility processes. Furthermore, we state the integro-PDE for the pricing equation, which will become a Black and Scholes partial differential equation with an additional integral term arising from the stochastic volatility. This integral term will also include the function H , and thus to solve it we need to consider a coupled system of two integro-PDEs. Related papers studying the minimal entropy martingale measure for stochastic volatility markets are Hobson (2004), Becherer (2003, 2004), and Benth and Karlsen (2005). There is a huge interest in applying entropy measures in pricing of derivatives in incomplete markets. We list here some related works. Fujiwara and Miyahara (2003) investigate the minimal entropy martingale measure for geometric Lévy processes, while Kim and Lee (2007) study it for the specific CGMY model. A general approach to minimal entropy martingale measures in incomplete markets based on a semimartingale backward stochastic differential equation is found in Mania et al. (2003). Monoyios (2007) studies the minimal entropy measure and the Esscher transform in an incomplete market generated by two Brownian motions. The relations between the minimal martingale measure and the minimal entropy martingale measure in incomplete markets are studied by Arai (2001). Finally, we mention Branger (2004) where cross-entropy is analyzed in connection with derivatives pricing. Entropy-based methods have been applied in numerous papers for inferring implied probability densities from market price data. We only mention a few papers here, including Avellaneda (1998), Buchen and Kelly (1996), Guo (2001), Matsuba and Takahashi (2003), Mogedey and Ebeling (2000), and Palatella (2005). Other interesting financial applications of entropy based ideas can be found in Tang et al. (2006) where decision making problems and financial management are studied. Further, McCauley (2003) provides a comparison of ideas in thermodynamics and finance where entropy plays a role. The paper is organized as follows: In Sect. 105.2 we define our financial market, and in Sect. 105.3 we study the density of the minimal entropy martingale measure. Section 105.4 is devoted to the density process and the analysis of the function H , while in Sect. 105.5 we apply our results to the minimal entropy pricing of claims.
F.E. Benth and T. Meyer-Brandis
105.2 The Market Given a probability space .; F ; P / and a time horizon T , consider a financial market consisting of a bond and a risky asset with prices at time 0 t T denoted by Rt and St , respectively. Assume without loss of generality that the bond yields a risk-free rate of return equal to zero, i.e., dRt D 0;
(105.1)
together with the convention that R0 D 1. In this paper we will consider the stochastic volatility model introduced by Barndorff-Nielsen and Shephard (2001), but let us mention that our results can be achieved for more general stochastic volatility models (under appropriate integrability conditions) as long as the volatility driving process Lt is independent from Bt (see Equations (105.2) and (105.3) below). In the Barndorff-Nielsen and Shephard model the price of the risky asset is evolving according to the following dynamics dSt D ˛.Yt /St dt C .Yt /St dBt ; d Yt D Yt dt C dLt ;
S0 D s > 0 (105.2)
Y0 D y > 0;
(105.3)
where Bt is a Brownian motion and Lt a pure jump subordinator (that is, an increasing pure jump Lévy process with no drift) with Poisson random measure denoted by N.dt; d z/. In this paper we assume Bt and Lt to be independent. The Lévy R1 measure .d z/ of Lt satisfies 0 min.1; z/ .d z/ < 1. Further, we denote by fFt gt 0 the completion of the filtration .Bs ; Ls I s t/ generated by the Brownian motion and the subordinator such that .; F ; Ft ; P / becomes a complete filtered probability space. In this paper we will assume the following specification of the parameter functions ˛ and : ˛.y/ D C ˇy;
.y/ D
p
y;
(105.4)
with and ˇ being constants. The process Yt models the squared volatility, and will be an Ornstein–Uhlenbeck process reverting toward zero, and having positive jumps given by the subordinator. An explicit representation of the squared volatility is Z
t
Yt D y exp .t/ C
exp ..t u// dLu :
(105.5)
0
The scaling of time by in the subordinator is to decouple the modeling of the marginal distribution of the (log)returns of S and their autocorrelation structure. We note that in Barndorff-Nielsen and Shephard (2001) it is proposed to use a superposition of processes Yt with different speeds of
105 The Density Process of the Minimal Entropy Martingale Measure in a Stochastic Volatility Model
mean-reversion. However, in this paper we will stick to only one process Yt , but remark that there are no essential difficulties in generalizing to the case of a superposition of Y ’s. The modeling idea is to specify a stationary distribution of Y that implies (at least approximately) a desirable distribution for the returns of S . Given this stationary distribution, one needs to derive a subordinator L. In Barndorff-Nielsen and Shephard (2001), several examples of such distributions and their associated subordinators are given in the context of financial applications. We denote by ./ the cumulant function of Lt , which is defined as the logarithm of the characteristic function ./ D ln E Œexp .iL1 / ;
2 R:
1569
länder (2002) the density of QME for general stochastic volatility models driven by an independent noise process is shown to be of the form Z ZT D c exp Z D c exp
T 0 T 0
˛.Yt / 1 dS S t 2 .Yt / t
Z T 2 ˛.Yt / ˛ .Yt / dBt dt : 2 .Yt / 0 .Yt / (105.12)
Here, c is the normalizing constant given by Z c 1 D E exp
(105.6)
T
0
From the Lévy–Kintchine Formula we have Z
1
./ D
˚ i z e 1 .d z/:
(105.7)
0
We suppose that the Lévy measure satisfies an exponential integrability condition, that is, there exists a constant k > 0 such that Z 1 ekz .d z/ < 1: (105.8)
E Œexp .L1 / D exp . .//
(105.9)
with
./ D
.i/:
(105.10)
Note that Lt is also a subordinator, and the cumulant function of this is ./. The process Lt has the decomposition Z tZ
Z tZ
1
Lt D 0
0
Z E exp
T 0
˛ 2 .Ys / ds 2 .Ys /
< 1:
(105.13)
Then ZT as defined in (105.12) is the density of the minimal entropy martingale measure QME . Proof. Referring to the results in Rheinländer (2005), it is enough to verify the following three statements: (i) (ii) (iii)
The expectation EŒZT is equal to one. The measure induced by ZT , denoted by QME , has finite entropy. We have Z
z.N.d z; dt/.d z/dt/; 0
:
Proposition 1 Suppose we have
T
1
z .d z/dt C
However, their boundedness assumptions on ˛ and are not covering our assumptions in model (Equations (105.2) and (105.3)). Thus, we give a short proof that ZT in Equation (105.12) is indeed the density we are looking for by appealing to the sufficient conditions developed by Rheinländer (2005).
1
Later we will be more precise about the size of k (see Proposition 2 and the examples following), and relate it to parameters in the specification of the Lévy measure. Under condition (105.8), the moment generating function is defined for all jj k, and
˛ 2 .Yt / dt 2 2 .Yt /
0
˛.Yt / 1 S 2 .Yt / t
2 d ŒS t 2 Lexp .P /;
(105.14)
0
(105.11) where the second integral on the right-hand side is a martingale. The reader is referred to Applebaum (2004), Bertoin (1996), Protter (2003), and Sato (1999) for more information about Lévy processes and subordinators.
where ŒS t is the quadratic variation process of St and Lexp .P / is the Orlicz space generated by the Young function exp./. (i) Define Z
105.3 The Minimal Entropy Martingale Measure In this section we derive the density of the minimal entropy martingale measure of the model (Equations (105.2) and (105.3)), which we denote by QME . In Grandits and Rhein-
ZT0
D exp 0
T
˛.Yt / dBt .Yt /
Z
T 0
! 1 ˛ 2 .Yt / dt : 2 2 .Yt / (105.15)
Then by assumption (105.13) and the Novikov condition, we know that Zt0 is a true martingale. We denote its corresponding probability measure by Q0 and note that Yt has the same dynamics under P and Q0 . Hence, we get
1570
F.E. Benth and T. Meyer-Brandis
EŒZT D cE
ZT0
Z exp 0
Z D cEQ0 exp
T 0
˛ 2 .Yt / dt 2 2 .Yt /
˛ 2 .Yt / dt D 1: (105.16) 2 2 .Yt /
T
(ii) Using the same arguments as in i), we see that Z E ŒZT j ln ZT j D EQ0 exp Z D EQ0 exp
T 0 T 0
ˇ Z T
ˇ Z T 2 ˇ ˇ ˛ 2 .Yt / ˛.Yt / ˛ .Yt / ˇ dt ˇ dBt C dt ˇˇ 2 .Y / 2 2 .Yt / .Y / t t 0 0 ˇ
ˇZ T 2 ˇ ˛ .Yt / ˛.Yt / e ˇˇ dt ˇˇ d B t ˇ < 1; 2 2 .Yt / .Y t/ 0 Proposition 2 If
et is the Brownian motion under Q0 . where B (iii) Since we have Z exp 0
T
˛.Yt / 1 S 2 .Yt / t
!
2
ˇ2 .1 exp.T // z 1 .d z/ < 1; exp 0 (105.18) then ZT defined in (105.12) is the density of the minimal entropy martingale measure QME . Z
Z
T
D exp
d ŒS t
0
˛ 2 .Yt / dt ; 2 .Yt /
assumption (105.13) implies condition (105.14).
1
Proof. Since Yt y exp.T /, we have
Proposition 2 gives a sufficient condition for assumption (105.13) stated in terms of the Lévy measure of L1 . Moreover, it determines an exact constant k in the exponential integrability condition (105.8):
2 ˛ 2 .Yt / D C 2ˇ C ˇ 2 Yt C C ˇ 2 Yt 2 .Yt / Yt for a positive constant C . But this gives
Z T
˛ 2 .Yt / 0 2 dt C E exp ˇ Y dt t 2 0 .Yt / 0
2 Z T ˇ y.1 exp.T // C D C 0 E exp .1 exp..T t/// dLt 0 Z TZ 1
D C 00 exp .exp.f .t/z/ 1/ .d z/ dt ;
Z E exp
(105.17)
T
0
(105.19)
0
where C 0 , C 00 are positive constants, and f .t/ ˇ 2 .1 exp..T t// =.
D t u
Let us consider some examples of the process Lt that are relevant in finance, and state sufficient conditions for the density of the minimal entropy martingale measure. If we choose the stationary distribution of Yt to be an inverse Gaussian law with parameters ı and , that is Yt IG.ı; /, the Lévy measure of L becomes
1 ı 3=2 .1 C z/ exp z d z: .d z/ D p z 2 2 2
Hence, the exponential integrability condition in Prop. 2 is satisfied whenever ˇ 2 .1 exp.T // <
1 : 2
When Yt IG.ı; /, the log-returns of St will be approximately normal inverse Gaussian distributed, a family of laws that has been successfully fitted to log-returns of stock prices (see e.g., Barndorff-Nielsen (1998) and Rydberg (1999)). Another popular distribution in finance is the variance gamma law (see Madan and Seneta (1990)). If the stationary
105 The Density Process of the Minimal Entropy Martingale Measure in a Stochastic Volatility Model
distribution of Yt is a gamma law with parameters ı and ˛, that is Yt .ı; ˛/, the marginal distribution of the logreturns of St is approximately following a variance gamma law. The Lévy measure of L becomes .d z/ D ı˛ exp.˛z/ d z; for which the integrability condition in Proposition 2 is satisfied whenever ˇ 2 .1 exp.T // < ˛: Note that the case ˇ D 12 corresponds to symmetrically distributed log-returns. When the log-returns are symmetric,
it is sufficient for the integrability condition in Proposition 2 that > 1=2 (inverse Gaussian law) or ˛ > 1=2 (the variance gamma law).
105.4 The Density Process Section 105.3 determines the minimal entropy martingale measure QME for the model (Equations (105.2) and (105.3)). It is of interest (for example in pricing of derivatives) to know the dynamics of the processes St and Yt under QME . In this section we identify the density process of QME as a certain stochastic exponential, which then by means of the Girsanov theorem gives us the dynamics of St and Yt under QME . A key ingredient in the description of the density process of the minimal entropy martingale measure is the function H.t; y/ defined as follows:
Z ˇ 1 T ˛ 2 .Yu / ˇ d u Yt D y ; H.t; y/ D E exp 2 t 2 .Yu /
where RC D .0; 1/. We remark that our motivation for considering the function H comes from portfolio optimization with an exponential utility function. It turns out that the difference between the value function of the utility maximization problem and the utility function itself can be represented as H . This can be seen by, e.g., considering the Hamilton–
Lemma 3 For all .t; y/ 2 Œ0; T RC it holds that exp a.t/y 1 C b.t/y C c.t/ H.t; y/ 1; (105.22) where a.t/ D
.exp..T t// 1/ ; 2
ˇ2 b.t/ D .1 exp..T t/// ; 2 Z T c.t/ D ˇ.T t/ C
.b.u// d u: t
Proof. The upper bound of 1 is clear (which is reached for t D T ). Denote
.t; y/ 2 Œ0; T RC ;
(105.20)
Jacobi–Bellman of the stochastic control problem. We refer to Musiela and Zariphopoulou Musiela and Zariphopoulou (2003) for more on this in a different market context than ours. Using the time-homogeneity of the Lévy process, we can rewrite H.t; y/ as
Z ˇ 1 T t ˛ 2 .Yu / ˇY 0 D y ; d u H.t; y/ D E exp 2 0 2 .Yu / This function will describe the change of the jump measure of the Lévy process under the minimal entropy martingale measure. However, before considering this in more detail, we study some simple but useful properties of H.t; y/: The function satisfies the following bounds:
1571
.t; y/ 2 Œ0; T RC :
(105.21)
2 2 C 2ˇ C ˇ y : y (105.23)
1 1 ˛ 2 .y/ D g.y/ WD 2 2 .y/ 2
Using the explicit representation of Yu in (105.5), its lower bound Yu y exp..u t// and the fact that Z
T
Yu d u D YT Yt .LT Lt /;
t
it is straightforward to derive H.t; y/ exp a.t/y 1 C b.t/y ˇ.T t/
Z ˇ2 T .1 exp..T u/// dLu :
E exp 2 t (105.24) The lower bound follows.
1572
F.E. Benth and T. Meyer-Brandis
Later we shall make explicit use of the differentiability of H.t; y/ with respect to t and y, proved in the following proposition: Proposition 4 The function H.t; y/ is continuously differentiable in t and y, i.e., H 2 C 1;1 .Œ0; T RC /. Proof. The random variable Z
T t
Xt;T WD exp
g.Yu / d u 0
with g.y/ as in (105.23) and Y0 D y, is obviously differentiable with respect to y. The derivative is given by 1 2
Z
T t
0
2 2 eu d u Xt;T ˇ Yu2
which can be bounded by C C C 0 =y 2 for two positive constants C; C 0 after appealing to the inequality Yu y exp.u/. Hence, the dominated convergence theorem implies that H.t; y/ is differentiable with respect to y. Moreover, by similar arguments we find that @H=@y is continuous, which proves the first part of the Proposition. Concerning the differentiation with respect to t, we rewrite H.t; y/ as Z H.t; y/ D E T t
D
T s
Z
0 T s
E exp
0
@H LY H D y C @y
g.Yu /d u g.Ys / ds C 1;
0
(105.25) where we have appealed to the Fubini–Tonelli theorem together with the exponential integrability conditions on L to interchange integration and expectation. Because Yu is a Lévy diffusion and the compensating measure of a Lévy process is diffuse with respect to time, the integrand in (105.25) is continuous in s. Hence H.t; y/ is continuously differentiable in t.
(105.27)
Z
1
fH.t; y C z/ H.t; y/g .d z/: 0
(105.28)
The function H in (105.20) solving Equations (105.26) and (105.27) plays a crucial role in the derivation of the density of the minimal entropy martingale measure. In general, Equation (105.20) is rather difficult to calculate explicitly. However, if we consider the special case ˛.y/ D ˇy, i.e., D 0 in Equation (105.2), a direct calculation using the moment generating function of L1 gives the following explicit solution of the integro-PDE (Equations (105.26) and (105.27)): Corollary 5 Suppose ˛.y/ D ˇy. Then the solution of Equations (105.26) and (105.27) is given as H.t; y/ D exp .b.t/y C c.t// ;
(105.29)
where b and c are defined as ˇ2 .1 exp..T t/// ; 2 Z T c.t/ D
.b.u// d u;
b.t/ D
y 2 RC :
Here
g.Yu /d u g.Ys /ds C 1
exp 0
Z
Z
T t
H.T; y/ D 1;
t
We recall that is the log moment generating function of L1 defined in Equation (105.10). Setting D 0 in (105.2) corresponds to an expected log-return of .ˇ 12 /y of the risky asset St . If we, for instance, specify the stationary distribution of Y to be inverse Gaussian, then the log-returns will be approximately normal inverse Gaussian distributed (see Barndorff-Nielsen and Shephard (2001)), and choosing this to be symmetric corresponds to ˇ D 12 , that is, with D 0 we have zero expected log-return. Now we introduce the notation
We get from the theory of Markov processes that H.t; y/ is the Feynman–Kac representation of the solution of the following integro-PDE:
ı.y; z; t/ WD
H.t; y C z/ H.t; y/
(105.30)
and define the following stochastic exponentials
˛ 2 .y/ @H 2 H C LY H D 0; @t 2 .y/
.t; y/ 2 Œ0; T / RC ; (105.26)
with terminal data Z Zt0 WD exp Z tZ Zt00
D exp
t 0
˛.Ys / dBs .Ys / Z tZ
1
1
ln ı.Ys ; z; s/N.d z; ds/ C 0
0
Z
t 0
1 ˛ 2 .Ys / ds 2 2 .Ys /
! (105.31) !
.1 ı.Ys ; z; s// .d z/ds : 0
0
(105.32)
105 The Density Process of the Minimal Entropy Martingale Measure in a Stochastic Volatility Model
We identify the density process in question as follows.
where the right hand side of the above equation is the density in Equation (105.12). Since we have
Theorem 6 Suppose condition (105.13) is fulfilled. Then
dSt D ˛.Yt /dt C .Yt /dBt ; St
Zt WD Zt0 Zt00 is the density process of the minimal entropy martingale measure QME .
T
0
we get Z ln.ZT0 / D
Proof. We want to show that Z 0 00 ZT ZT D c exp
1573
˛.Yt / dBt .Yt /
Z
T 0
Z ln.ZT0 / C ln.ZT00 / D
2
˛ .Yt / dt ; 2 .Yt /
T 0
˛.Yt / 1 1 St dSt C 2 .Yt / 2
Z 0
T
˛ 2 .Yt / dt: 2 .Yt / (105.33) 2
t/ Now, substituting in Equation (105.33) for 12 ˛ 2 .Y the ex.Yt / pression we get from the integro-PDE Equation (105.26), we end up with
Z T @y H.t; Yt / @t H.t; Yt / ˛.Yt / 1 St dSt C Yt dt H.t; Yt / H.t; Yt / 0 .Yt / 0 Z TZ 1 C .ln H.t; Yt C z/ ln H.t; Yt // N.d z; dt/: T
0
(105.34)
0
Note that we have used the short-hand notation @t H for @H=@t and @y H for @H=@y. Since H 2 C 1;1 from Proposition 4, we can apply Itô’s formula on h.t; Yt / D ln H.t; Yt / to derive Z
T
h.T; YT / D h.0; Y0 / C Z
T
C 0
Finally, substitution of Equation (105.34) yields ZT0 ZT00
Equation
T 0
1
.ln H.t; Yt C z/ ln H.t; Yt // N.d z; dt/:
(105.35)
0
(105.35)
in
˛.Yt / 1 St dSt 2 0 .Yt /
Z T 2 ˛.Yt / ˛ .Yt / dBt dt ; 2 .Yt / 0 .Yt /
Z D exp ln H.0; y/ Z D c exp
0
Z
@y H.t; Yt / @t H.t; Yt / dt Yt H.t; Yt / H.t; Yt /
T
(105.36) such that ZT0 ZT00 is indeed the density of QME . Finally, the orthogonality of Zt0 and Zt00 together with the fact EŒZT0 ZT00 D 1 (point i) in the proof of Proposition 1) yields that Zt D Zt0 Zt00 is a martingale.
105.5 The Entropy Price of Derivatives and Integro-Partial Differential Equations As an application of our results, we consider the price of derivatives written on the asset S under the minimal entropy martingale measure. We derive the corresponding integroPDE for claims having a payoff given by the asset price ST at maturity of the contract, a typical example being a European call option on S . Consider a contingent claim with payoff f .ST / at maturity time T , where we suppose that f is of linear growth and f .ST / 2 L1 .QME /. Then the entropy price of the claim at time t given St D s and Yt D y, denoted by ƒ.t; y; s/, is
1574
F.E. Benth and T. Meyer-Brandis
defined as the conditional expectation under the minimal entropy martingale measure. We thus get ˇ ƒ.t; y; s/ D EQME f .ST / ˇ Yt D y; St D s : (105.37) Knowing the density process of QME we are now able to determine the dynamics of St and Yt under QME . Define the et by two processes e S t and Y et e et ; St dB de St D Y
(105.38)
et D Y et dt C d e dY Lt ;
(105.39)
et is Brownian motion and e Lt is a pure jump Markov where B process with the predictable compensating measure e .!; d z; dt/ D
et .!/ C z/ H.t; Y .d z/dt: et .!// H.t; Y
(105.40)
Observe that the state-dependent jump measure e .d z/ becomes deterministic when D 0: Indeed, from Corollary 5 we find that e .!; d z; dt/ D eb.t /z .d z/ dt; where b.t/ is given in Corollary 5. Hence, for D 0, e L is an independent increment process (see e.g., Sato (1999)). et and e Lt and the Girsanov By using the independence of B theorem for Brownian motion and random measures (see Jacod and Shiryaev (1987)), respectively, we see that ˇ ƒ.t; y; s/ D EQME f .ST / ˇ Yt D y; St D s ˇ et D y; e DE f e ST ˇ Y St D s : This representation allows us to set up an integro-PDE for the entropy price. Like in Bensoussan and Lions (1982), Chapter 3, Theorem. 8.1, using the bounds of H.t; y/, we get that ƒ.t; y; s/ is the Feynman–Kac representation of the solution of the following integro-PDE
1 @ƒ @2 ƒ @ƒ C 2 .y/s 2 2 y @t 2 @s @y Z 1 H.t; y C z/ .d z/ D 0; .t; y; s/ 2 Œ0; T / R2C ; C .ƒ.t; y C z; s/ ƒ.t; y; s// H.t; y/ 0
with terminal condition ƒ.T; y; s/ D f .s/; .y; s/ 2 R2C :
(105.42)
Note that in order to solve this integro-PDE, we need to consider the Equations (105.26) and (105.27) for H as well. Thus, the minimal entropy price of a claim in the BarndorffNielsen and Shephard model is given as the solution of a coupled system of two integro-PDEs.
105.6 Conclusions In incomplete markets, there exist many risk neutral pricing measures, and tools to single out one for the purpose of derivatives pricing is called for. A natural choice seems to be given via utility indifference pricing, where the limiting pricing measure for zero risk aversion plays a crucial role, namely the minimal entropy martingale measure. This is the measure minimizing the relative entropy, and can be viewed financially as the measure yielding a price which both the buyer and the seller “agree” on.
(105.41)
We have studied the minimal entropy martingale measure for the stochastic volatility model of Barndorff-Nielsen and Shephard (2001). The novel feature of their model is that the squared volatility process follows an Ornstein–Uhlenbeck process driven by a subordinator, ensuring a positive volatility and a very flexible modeling of observed financial time series. We have found the density process of the Radon– Nikodym derivative of the minimal entropy martingale measure in terms of an expectation functional solving a firstorder integro-partial differential equation. The proof is based on a general characterization result by Rheinländer (2005). We also present the characteristics of the subordinator under the minimal entropy martingale measure, which turns out to have state-dependent jumps and thereby loosing the independent increment feature under the objective probability. The price under the minimal entropy martingale measure of a derivative with payoff depending on the underlying asset at some exercise time will solve a Black–Scholes type integropartial differential equation. We finally remark that Rheinländer and Steiger (2006) recently generalized our results to the case of an asset price dynamics with BNS-volatility and leverage.
105 The Density Process of the Minimal Entropy Martingale Measure in a Stochastic Volatility Model
The characterization of the minimal entropy martingale measure goes via the solution of a very involved partial differential equation. Acknowledgments We are grateful to Kenneth Hvistendahl Karlsen and Thorsten Rheinländer for interesting and fruitful discussions. This chapter is a reprint of the paper Benth and Meyer-Brandis (2005) (with a slightly extended reference list and amended conclusions). Two anonymous referees and an associate editor are thanked for their careful reading of an earlier version of that paper, leading to a significant improvement of the presentation.
References Applebaum, D. 2004. Lévy processes and stochastic calculus. Cambridge University Press, Cambridge. Arai, T. 2001. “The relations between minimal martingale measure and minimal entropy martingale measure.” Asia-Pacific Financial Markets, 8, 167–177. Avellaneda, M. 1998. “Minimum relative entropy calibration of asset pricing models.” International Journal of Theoretical and Applied Finance 1(4), 447–472. Barndorff-Nielsen, O. E. 1998. “Processes of normal inverse Gaussian type.” Finance and Stochastics 2, 41–68. Barndorff-Nielsen, O. E. and N. Shephard. 2001. “Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics.” Journal of the Royal Statistical Society, Series B, 63, 167–241 (with discussion). Becherer, D. 2003. “Rational hedging and valuation of integrated risks under constant absolute risk aversion.” Insurance Mathematics and Economics 33(1), 1–28. Becherer, D. 2004. “Utility-indifference hedging and valuation via reaction-diffusion systems.” Proceedings of the Royal Society of London, Series A Mathematical Physical and Engineering Science, 460(2041) 27–51. Bensoussan, A. and J.-L. Lions. 1982. Impulse control and QuasiVariational Inequalities. Gauthiers-Villars, Bordas Benth, F. E. and K. H. Karlsen. 2005. “A PDE representation of the density of the minimal entropy martingale measure in stochastic volatility markets.” Stochastics and Stochastic Reports, 77(2), 109–137. Benth, F. E. and T. Meyer-Brandis. 2005. “The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps.” Finance and Stochastics 9(4), 564–575. Bertoin, J. 1996. Lévy processes. Cambridge University Press, Cambridge. Branger, N. 2004. “Pricing derivative securities using cross-entropy: an economic analysis.” International Journal of Theoretical and Applied Finance 7(1), 63–81. Buchen P. W. and M. Kelly. 1996. “The maximum entropy distribution of an asset inferred from option prices.” Journal of Financial and Quantitative Analysis 31, 143–159. Delbaen, F., P. Grandits, T. Rheinländer, D. Samperi, M. Schweizer, and C. Stricker. 2002. “Exponential hedging and entropic penalties.” Mathematical Finance 12(2), 99–123. Fritelli, M. 2000. The minimal entropy martingale measure and the valuation problem in incomplete markets. Mathematical Finance 10, 39–52.
1575
Fujiwara, T. and Y. Miyahara. 2003. The minimal entropy martingale measures for geometric Lévy processes. Finance and Stochastics 7, 509–531. Grandits, P. and T. Rheinländer. 2002. “On the minimal entropy martingale measure.” Annals of Probability 30, 1003–1038. Guo, W. 2001. “Maximum entropy in option pricing: a convex-spline smoothing method.” Journal of Futures Markets, 21, 819–832. Hobson, D. 2004. “Stochastic volatility models, correlation and the q-optimal measure.” Mathematical Finance 14(4), 537–556. Hodges, S. D. and A. Neuberger. 1989. Optimal replication of contingent claims under transaction costs. Review of Future Markets 8, 222–239. Jacod, J. and A. N. Shiryaev. 1987. Limit Theorems for Stochastic Processes, 2nd Edition, Springer, New York. Kim, Y. S. and J. H. Lee. 2007. “The relative entropy in CGMY processes and its applications to finance.” Mathematical Methods of Operations Research 66, 327–338. Madan, D. B. and E. Seneta. 1990. “The variance gamma model for share market returns.” Journal of Business 63, 511–524. Mania, M., M. Santacroce, and R. Tevzadze. 2003. A semimartingale BSDE related to the minimal entropy martingale measure. Finance and Stochastics 7, 385–402. Matsuba, I. and H. Takahashi. 2003. “Generalized entropy approach to stable Lévy processes with financial application.” Physica A 319, 458–468. McCauley, J. L. 2003. “Thermodynamic analogies in economics and finance: instability of markets.” Physica A 329(1-2), 199–212. Mogedey, L. and W. Ebeling. 2000. “Local order, entropy and predictability of financial time series.” The European Physical Journal B 5, 733–737. Monoyios, M. 2007. “The minimal entropy measure and an Esscher transform in an incomplete market model.” Statistics and Probability Letters 77, 1070–1076. Musiela, M. and T. Zariphopoulou. 2003. “An example of indifference prices under exponential preferences.” Finance and Stochastics 8(2), 229–239. Palatella, L., J. Perello, M. Montero, and J. Masoliver. 2005. “Diffusion entropy technique applied to the study of the market acitivity.” Physica A 355, 131–137. Protter, P. 2003. Stochastic integration and differential equations, Second Edition, Springer, New York. Rheinländer, T. 2005. “An entropy approach to the stein/stein model with correlation.” Finance and Stochastics 9, 399–412. Rheinländer, T. and G. Steiger. 2006. “The minimal entropy martingale measure for general Barndorff–Nielsen/Shephard models.” Annals of Applied Probability 16(3), 1319–1351. Rouge, R. and N. El Karoui. 2000. “Pricing via utility maximization and entropy.” Mathematical Finance 10, 259–276. Rydberg, T. 1999. “Generalized hyperbolic diffusion processes with applications towards finance.” Mathematical Finance 9, 183–201. Sato, K. 1999. Lévy processes and infinitely divisible distributions, Cambridge University Studies in Advanced Mathematics, Vol. 68, Cambridge University Press, Cambridge. Tang, C. M., A. Y. T. Leung, and K. C. Lam. 2006. “Entropy application to improve construction finance decisions.” Journal of Construction Engineering and Management October issue, 1099–1113.
Chapter 106
Arbitrage Detection from Stock Data: An Empirical Study Cheng-Der Fuh and Szu-Yu Pai
Abstract In this paper, we discuss the problems of arbitrage detection, which is known as change point detection in statistics. There are some classical methods for change point detection, such as the cumulative sum (CUSUM) procedure. However, when utilizing CUSUM, we must be sure about the model of the data before detecting. We introduce a new method to detect the change points by using Hilbert–Huang transformation (HHT) to devise a new algorithm. This new method (called the HHT test in this paper) has the advantage in that no model assumptions are required. Moreover, in some cases, the HHT test performs better than the CUSUM test, and has better simulation results. In the end, an empirical study of the volatility change based on the S&P 500 is also given for illustration. Keywords Arbitrage detection r Hilbert–Huang transformation r Volatility r CUSUM
106.1 Introduction 106.1.1 Background In the stock market, investors always pursue the goal of finding change points promptly. These changes may come from the alteration of company policies or from an economic recession. We hope to detect the changes as soon as they appear. There is already a term to describe these problems: change point detection or arbitrage detection. Many results, such as the CUSUM test, exist in the change point detection of previous studies. The first result can be found in Page (1954) who constructed classical CUSUM tests. This test provides a widely accepted procedure to detect change points. C.-D. Fuh () National Central University and Academia Sinica, Taipei, Taiwan e-mail: [email protected] S.-Y. Pai National Taiwan University, Taipei, Taiwan e-mail: [email protected]
However, there are still some problems to be found in the CUSUM test in Pollak and Siegmund (1985) and Siegmund (1985). For example, it is necessary to ascertain the distribution of the data. Without the distribution of the data, the CUSUM test cannot be implemented. Therefore, we introduce a new method of arbitrage detection: the Hilbert– Huang transformation (HHT) test. The HHT was discussed in Huang et al. (2003), Huang and Shen (2005) and Wu and Huang (2004) and has already been used extensively in other fields. Such as, studies in geophysics, structural safety, and operating research that make use of it. However, the HHT is rarely applied in finance. In this respect we try to design a test using HHT on some finance problems due to the fact that many financial data are nonlinear, nonstationary time series with unknown models, which is exactly what the HHT can analyze. This paper is organized as follows. In this section, we introduce a new method (called the HHT test in this paper) to detect the change points. In Sect. 106.2, we describe some cases where the HHT test performs better than the CUSUM test and has better simulation results. In Sect. 106.3, we find some weakness of the HHT test in the change of the mean. In Sect. 106.4, an empirical study of the volatility change based on the S&P 500 is also given for illustration.
106.1.2 Previous Studies in Arbitrage Detection Consider an infinite sequence of observation: x1 ; x2 ; x3 ; : : : ; xi ; : : :. These variables represent the stock price at time i . At some unknown time v, either the company altered its policy or it was the beginning of a period of economic recession, such as the dot-com bubble in 2000. These types of events have a common characteristic, in that they transfer the whole structure of the market. In other words, the distribution of the time series changes parameters, like the mean or the variance. Therefore, what we seek is a stopping rule T , which detects the change promptly. When mentioning change point detection, the CUSUM test cannot be left unnoticed. We will use the above
C.-F. Lee et al. (eds.), Handbook of Quantitative Finance and Risk Management, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-77117-5_106,
1577
1578
C.-D. Fuh and S.-Y. Pai
assumption to explain the idea of CUSUM. First, x1 ; x2 ; x3 ; : : : ; xv1 are independent and identically distributed with the probability density function f0 , whereas xv ; xvC1 ; : : : are independent and identically distributed with the probability density function f1 , for some v > 1. Let Pi denote the probability as the change from f0 to f1 occurs at the i th observation; Ei .T / denotes the expectation of a stopping rule T when the change occurs at time i . If i D 0, there is no change. A stopping rule can be described as: Minimize sup Ev . v C 1j v/
(106.1)
v1
subject to E0 B
(106.2)
for some given (large) constant B. A special method to solve the above problem is the following. Assume x1 ; : : : ; xn have been observed. Consider for 1 v n Hv W x1 ; : : : ; xv1 f0 I
xv ; xvC1 ; : : : ; xn f1
against Then the log likelihood ratio statistic can be written down: max .sn sk / D sn min sk ; 0kn
where Sn D
n X j D1
log
f1 .xj / : f0 .xj /
(106.3)
(106.4)
Then we can get a stopping rule:
D inf n W sn min sk b :
0kn
106.1.3.1 Hilbert–Huang Transformation A new instrument on arbitrage detection, HHT, has already been extensively used in engineering. We choose to use the HHT instead of other spectrum analysis methods, such as Fourier transformation and Wavelet transformation, for the following reasons: First, the HHT can be utilized on nonlinear, nonstationary
H0 W x1 ; : : : ; xn f0 :
0kn
HHT. In this way, we focus on the high frequency part while detecting volatility change because high frequency means short period. Hence the long period part, which is useless in detecting volatility change, can be omitted. By using the above methods we have several simulation studies. The first simulation study analyzes the time series from normal distribution, which represents the basic model. Following, we analyze the time series from Brownian motion models and geometric Brownian motion models, which are close to real financial data. The last simulation study analyzes the time series from Markov switch models and unlike the previous models, in this one, we do not know when the distribution of the data will change. Therefore, the HHT test can detect the change points, although the change timing is unknown.
(106.5)
106.1.3 The Use of Hilbert–Huang Transformation on Arbitrage Detection The HHT was used to analyze data in two ways. The first one is using the whole time–frequency–amplitude plot. This is the classical way to utilize the HHT. Its advantage is that we can collect the information from all of the data; however, the disadvantage of the classical method is it is less sensitive than method two. The second way is using a part of the time–frequency– amplitude plot. This method is a new application for the
time series. Although Fourier transformation has a wide application on spectrum analysis, it can be only used on stationary time data; Wavelet transformation can be applied on nonstationary data, but it is ineffective on nonlinear data. However, financial data are usually nonlinear and nonstationary time series. Therefore, the HHT becomes the first choice. Second, although Fourier transformation is useful on spectrum analysis, it can only transform a function from time domain into frequency domain. In other words, we do not know the frequency at the specific timing (Table 106.1). Third, compared with Wavelet transformation, the HHT is more precise. In Huang et al. (1998), the time–frequency– amplitude plots from HHT show more details than the Wavelet transformation.
106.1.3.2 The Process of Hilbert–Huang Transformation Empirical Mode Decomposition (EMD) 1. Let X.t/ denote a time series, identify local maxima and local minima of X.t/, and then connect all the local maxima by a cubic spline line named the upper envelope.
106 Arbitrage Detection from Stock Data: An Empirical Study
Table 106.1 The comparison between Fourier transformation and Hilbert–Huang transformation
Fourier transformation
Hilbert–Huang transformation
Symbol
F ./
H./
Transformation of domain
From time domain to frequency domain
From time domain to time and frequency domain
Function
F .X.t // D A.!/
H.X.t // D A.t; !/
Repeat the above process for local minima to produce the lower envelope. max.t/ denotes the upper envelope and min.t/ denotes the lower envelope. Let their mean be M1 .t/ D
1579
Œmax.t/ C min.t/ ; 2
(106.6)
2. The analytic signal is defined as Zj .t/ D Cj .t/ C iYj .t/ D Aj .t/e ij .t / ; where Aj .t/ D
and the difference between the data and M1 .t/ is X.t/ M1 .t/ D H1 .t/:
(106.7)
2. Repeat step 1 to obtain Mj .t/, Hj .t/: H1 .t/ M2 .t/ D H2 .t/;
(106.8)
Hk1 .t/ Mk .t/ D Hk .t/;
(106.9)
and
q
ŒCj2 .t/ C Yj2 .t/ ;
Yj .t/ : j .t/ D arctan Cj .t/
Hk .t/ D C1 .t/;
(106.10)
T X ŒHk1 .t/ Hk .t/ 2 t D1
2 Hk1 .t/
(106.11)
is smaller than a predetermined value (about 0.2–0.3 in Huang et al. 1998). 3. Let (106.12) X.t/ C1 .t/ D R1 .t/; and do the above procedure again. We can obtain C2 .t/; C3 .t/; : : : and so on. 4. Let Cj .t/ be the intrinsic mode functions (IMFs) (Fig. 106.1).
dj .t/ : dt
(106.17)
3. The original time series can be expressed as following: X.t/ D <
8 n <X :
j D1
9 Z = Aj .t/ exp i !j .t/dt ; ; (106.18)
where < denotes the real part of the number. 4. The above equation represents the amplitudes to be contoured on the frequency–time plane. This frequency–time distribution of the amplitude is designated as “Hilbert amplitude spectrum,” A.t; !/, or simply “Hilbert spectrum.”
106.2 Arbitrage Detection: Volatility Change
Hilbert Transformation
106.2.1 Volatility Change in Normal Distribution Models
1. Apply the Hilbert transformation to each IMFs Hilbert transformation: Z 1 Cj .s/ 1 ds: (106.13) Yj .t/ D P V 1 t s
The essential model is described as the following. Our model:
Here, PV indicates the principal value of the singular integral.
(106.16)
Here, Aj .t/ is the instantaneous amplitude, and j .t/ is the phase function, so the instantaneous frequency is
when SDk D
(106.15)
!j .t/ D
and let
(106.14)
x1 ; x2 ; x3 ; : : : ; x100 N.0; 12 /;
(106.19)
x101 ; x102 ; x103 ; : : : ; xn N.0; 22 /:
(106.20)
1580
C.-D. Fuh and S.-Y. Pai
Fig. 106.1 An example of IMFs
Fig. 106.2 (1)–(14) Hilbert spectra. (15) The plot of mean of maxima in Hilbert spectra. (16) The original data
After executing the HHT, the difference between “time from 1 to 100” and “from t D 1 to 110” can be observed in Fig. 106.2. There is a “peak” in the plot of “from t D 1 to 110.” The appearance of the peak is rational because of the change of the magnified volatility. In engineering, the volatility change can be treated as amplitude change. Therefore,
if the maximum amplitudes of the plots are “large enough,” then the change is called to be detected. Now the problem is how to define “large enough.” Here we adopt the idea of arbitrage detection problem. In this idea, the null hypothesis is that xi N.0; 12 / for all i , and the expectation of stopping rule in our detection method can be as large as B in
106 Arbitrage Detection from Stock Data: An Empirical Study
Table 106.2 The comparison between CUSUM test and HHT test in volatility change of Normal Distribution Models
1581
Change of volatility
Stopping time of the CUSUM test
Stopping time of the HHT test
sigma from 0.1 to 0.3 sigma from 0.1 to 0.2 sigma from 0.1 to 0.15 sigma from 0.1 to 0.125 sigma from 0.1 to 0.11 sigma from 0.1 to 0.105 sigma from 0.1 to 0.1
3.85 10.17 37.64 182.64 598.43 723.8 810.87
56.99 170.4 328.77 490.51 582.22 639.08 823.32
CUSUM. In practice, we (can) find that B is approximately 800. In other words, we can define our stopping rule as following: 0 D inffM > bg; (106.21) where M is the mean of the three maximum amplitudes in the Hilbert spectra (to avoid the extreme value), subject to (106.22) E0 Œ 0 > B: Compared with CUSUM, we obtain Table 106.2. It is apparent that CUSUM yields better results when the change is large. However, the HHT test performs better with slight change, and the average stopping time in the HHT test grows slower compared with the CUSUM test where the change becomes minor.
106.2.2 Volatility Change in Geometric Brownian Motion Models

After detecting change points in normal distribution, we apply the HHT test to geometric Brownian motion (GBM) models, which are often used to describe stock prices. In option pricing, modeling stock prices by GBM models has a well-known problem: the implied volatility is not constant. In practice, stock prices also exhibit the volatility clustering property. Here we do not seek to reduce this inaccuracy; on the contrary, we aim to detect volatility clustering. First of all, assume

x_1, x_2, x_3, \ldots, x_{100} \sim GBM(r = 0.08, \sigma = \sigma_1),   (106.23)

and

x_{101}, x_{102}, x_{103}, \ldots, x_n \sim GBM(r = 0.08, \sigma = \sigma_2).   (106.24)

In other words,

x_i = x_{i-1} e^{(r - 0.5\sigma^2)\Delta t + \sigma W(\Delta t)},   (106.25)

where W(\Delta t) is standard Brownian motion. Let x_1 = 100, and then use the above model to generate x_1, x_2, x_3, \ldots, and so on. After generating all x_i, we let y_i = x_i - 100 for all i to reduce the influence of the starting point. In this model we focus on the high-frequency parts of the time–frequency–amplitude plots, because high frequency means short period: a larger \sigma implies a larger amplitude in the short-period part. Therefore, we focus on frequencies from 0.375 to 0.5, in other words, on periods from 2 to 2.7 days. In Fig. 106.3, we cannot see the difference directly from the Hilbert amplitude spectrum, but in Fig. 106.3(15) we can see the mean of the maximum amplitude touch 0.5 after t > 100. We use the same idea to construct a criterion, and we obtain Table 106.3. The result is similar to Sect. 106.2.1: the HHT test performs better than CUSUM when \sigma changes slightly.
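A minimal simulation sketch of this data-generating process with Δt = 1; the function name, the default σ values, and the seed handling are ours, while r = 0.08 and x_1 = 100 follow the text.

```python
import numpy as np

def simulate_gbm_vol_change(n=200, change_at=100, r=0.08,
                            sigma1=0.1, sigma2=0.3, dt=1.0, seed=0):
    """x_i = x_{i-1} * exp((r - 0.5*sigma^2)*dt + sigma*W(dt)), Eq. (106.25),
    with sigma switching from sigma1 to sigma2 at index `change_at`."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = 100.0                             # x_1 = 100 as in the text
    for i in range(1, n):
        sigma = sigma1 if i < change_at else sigma2
        w = rng.normal(0.0, np.sqrt(dt))     # W(dt) ~ N(0, dt)
        x[i] = x[i - 1] * np.exp((r - 0.5 * sigma**2) * dt + sigma * w)
    return x - 100.0                         # y_i = x_i - 100 removes the start level
```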
106.2.3 Volatility Change in Markov Switch Models

The simulation study here analyzes time series from Markov switch models; unlike the previous models, in this one we do not know when the distribution of the data will change. Therefore, the HHT test can detect the change points although the change timing is unknown. The Markov switch model can be described as

Y_t = \alpha_{S_t} Y_{t-1} + \varepsilon_t,   (106.26)

\varepsilon_t \sim N(u = 0, \sigma_{S_t}^2),   (106.27)

S_t \in \{1, 2\},   (106.28)

where S_t represents the two states. In state 1, let

\alpha_1 = 0.1,   (106.29)

\sigma_1 = 0.1.   (106.30)
Fig. 106.3 (1)–(14) Hilbert spectra. (15) The means of maxima in Hilbert spectra. (16) The original data
Table 106.3 The comparison between the CUSUM test and the HHT test in volatility change of Geometric Brownian Motion Models

Change of volatility       Stopping time of the CUSUM test   Stopping time of the HHT test
sigma from 0.1 to 0.3      3.59                              30.67
sigma from 0.1 to 0.2      10.51                             158.75
sigma from 0.1 to 0.15     39                                340.59
sigma from 0.1 to 0.125    214.32                            591.21
sigma from 0.1 to 0.11     642.7                             663.24
sigma from 0.1 to 0.105    766.38                            702.8
sigma from 0.1 to 0.1      853.46                            840.55
In state 2, let

\alpha_2 = 0.1,   (106.31)

\sigma_2 = \sigma.   (106.32)

Let the transition matrix be

p = \begin{pmatrix} 0.99 & 0.01 \\ 0.001 & 0.999 \end{pmatrix}.   (106.33)
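A minimal simulation sketch of the model in Eqs. (106.26)–(106.33); the function name and the illustrative value of σ₂ are assumptions, while the α's, σ₁, and the transition matrix follow the text.

```python
import numpy as np

def simulate_markov_switch(n=500, alphas=(0.1, 0.1), sigmas=(0.1, 0.3),
                           p=((0.99, 0.01), (0.001, 0.999)), seed=0):
    """Y_t = alpha_{S_t} * Y_{t-1} + eps_t with eps_t ~ N(0, sigma_{S_t}^2),
    S_t a two-state Markov chain with transition matrix `p`."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    states = np.zeros(n, dtype=int)          # index 0 plays the role of state 1
    for t in range(1, n):
        stay = p[states[t - 1]][states[t - 1]]
        states[t] = states[t - 1] if rng.random() < stay else 1 - states[t - 1]
        y[t] = alphas[states[t]] * y[t - 1] + rng.normal(0.0, sigmas[states[t]])
    return y, states
```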
The result for the Markov switch models is shown in Table 106.4 and Fig. 106.4. From the table we can still see the good characteristics of the HHT test: although it cannot detect a large change quickly, the growth of the stopping time slows as the change becomes minor.

106.2.4 A Brief Summary
For volatility change, we can derive a detection rule from the ideas of the HHT. The HHT is a method to decompose a time series and produce Hilbert spectra, which are time–frequency–amplitude plots. The plots can be used to determine whether or not the volatility changes. In detecting volatility change, we focus on the amplitude change. The stopping rule is that the maxima of the amplitude are larger than a constant b, subject to a small Type I error; in other words, the expectation of the stopping time must be larger than a constant B under the null hypothesis. Here we adopt B = 800, a value often used in practice. The criterion b can then be produced subject to B = 800.
Fig. 106.4 (1)–(13) Hilbert spectra. (14) The means of maxima in Hilbert spectra. (15) The state of the data. (16) The original data
Table 106.4 The volatility changes of Markov switch models

Change of volatility       Stopping time of the HHT test
sigma from 0.1 to 0.3      38.8
sigma from 0.1 to 0.2      180.4
sigma from 0.1 to 0.15     431.0
sigma from 0.1 to 0.125    549.7
sigma from 0.1 to 0.11     666.3
sigma from 0.1 to 0.105    770.1
sigma from 0.1 to 0.1      889.5
In both the HHT and the CUSUM tests, we can find that as the change becomes minor, the stopping times of the HHT test and the CUSUM test increase. However, the increasing speed is different for the two tests. The stopping time of CUSUM test increases faster than that of the HHT test. Therefore, the HHT test performs better than the CUSUM test when the change is slight.
106.3 Arbitrage Detection: Mean Change

106.3.1 Mean Change in Normal Distribution Models

The change of the mean is an important topic in finance: it represents the trend of the stock prices. Let us start with the essential model, the normal distribution. Let

x_1, x_2, x_3, \ldots, x_{100} \sim N(u = u_1, \sigma^2 = 1),   (106.34)

x_{101}, x_{102}, x_{103}, \ldots, x_n \sim N(u = u_2, \sigma^2 = 1).   (106.35)

The stopping time of the HHT test increases faster than that of the CUSUM test, as shown in Fig. 106.5 and Table 106.5. This means that the HHT test is weak in detecting a change of the mean. Reviewing the procedure of the HHT may explain the reason: it treats the low-frequency IMFs as unimportant components, yet for a change of the mean the low-frequency IMFs represent the trend of the data. Therefore, we are not satisfied with the results of the HHT test for mean changes.

106.3.2 Mean Change in Brownian Motion Models

After the unsatisfying results in Sect. 106.3.1, we try another model, the Brownian motion model, which cumulates the value of u. The model we use is

x_1, x_2, x_3, \ldots, x_{100} \sim BM(u = u_1, \sigma^2 = 1),   (106.36)

x_{101}, x_{102}, x_{103}, \ldots, x_n \sim BM(u = u_2, \sigma^2 = 1).   (106.37)
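A minimal sketch of this Brownian motion data-generating process; the function name and the default drift values are illustrative assumptions.

```python
import numpy as np

def simulate_bm_mean_change(n=200, change_at=100, u1=-1.0, u2=1.0, seed=0):
    """x_t = x_{t-1} + u + e_t with e_t ~ N(0, 1): BM(u, 1) whose drift u
    switches from u1 to u2 at index `change_at`."""
    rng = np.random.default_rng(seed)
    drift = np.where(np.arange(n) < change_at, u1, u2)
    return np.cumsum(drift + rng.normal(0.0, 1.0, size=n))
```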
Fig. 106.5 (1)–(14) Hilbert spectra. (15) The means of maxima in Hilbert spectra. (16) The original data
Table 106.5 The comparison between the CUSUM test and the HHT test in mean change of Normal Distribution Models

Change of mean      Stopping time of the HHT test   Stopping time of the CUSUM test
mean from 0 to 2    14.7                            3.81
mean from 0 to 1    38.7                            12.54
mean from 0 to 0.5  136.8                           45.69
mean from 0 to 0.25 589.7                           174.25
In Fig. 106.6(15), we find that when the value of the time series decreases, the amplitude of the mean of maxima increases; when the time series starts to increase (the change point), the amplitude of the mean of maxima stops changing. We cannot find a good numerical criterion for this phenomenon, but we can detect it visually. In Fig. 106.7, we can observe that after about t = 110 the value stops increasing; conservatively, we choose t = 118 to be the stopping time. In Fig. 106.8, we find the same phenomenon, and we detect the change at t = 127. In Fig. 106.9, we can detect the change at t = 137, and we find another special phenomenon: when u_2 becomes smaller, the "shake" after the change point becomes larger. In Fig. 106.10, we can detect the change at t = 176, and the range of the special phenomenon, the "shake," increases from 70–80 to 90–120. In Fig. 106.11, we cannot see the change any more. In the case of u_1 u_2 > 0, the means of maxima continue increasing; we should analyze the differenced data x_i - x_{i-1} instead in this case. In Table 106.6 we can see that the HHT test performs well when u_1 u_2 \le 0. However, it is useless when u_1 u_2 > 0.
106.3.3 A Brief Summary

In this section, we discuss the problem of the change of the mean. Unfortunately, the HHT test does not perform well in this case. In normal distribution models, the HHT test loses the positive characteristic mentioned in Sect. 106.2, namely that the stopping time increases slowly when the change is slight. Therefore, the HHT test cannot perform better than the CUSUM test under such conditions.
Fig. 106.6 (1)–(14) Hilbert spectra. (15) The means of the maxima in Hilbert spectra. (16) The original data
Fig. 106.7 Change from BM(−1,1) to BM(2,1)
Fig. 106.8 Change from BM(−1,1) to BM(1,1)
Fig. 106.9 Change from BM(−1,1) to BM(0.5,1)
Fig. 106.10 Change from BM(−1,1) to BM(0,1)
Fig. 106.11 Change from BM(−1,1) to BM(−0.5,1)
In Brownian motion models, it is hard to find worthwhile criteria to detect the change, but we can still find some phenomena in the plots of the means of maxima. When the time series starts to change, the value of the means of maxima stays at the same level; when u_2 is small, the "shake" of the amplitude of the means of maxima increases. However, this detection rule is useless when u_1 u_2 > 0. The reason for the weakness of the HHT test in detecting the change of the mean may be the procedure of the HHT itself, which focuses on high-frequency parts but neglects trend changes.
106.4 Empirical Studies

This section discusses the empirical studies of the HHT test. The CUSUM test cannot be applied here because we do not have any model assumptions for the S&P 500 Index. However, the HHT test does not need any assumptions, and the empirical studies of the HHT test lead us to a positive conclusion.
Table 106.6 Stopping time of the HHT test in mean change of Brownian Motion Models

Change of mean       Stopping time of the HHT test
u from −1 to 2       18
u from −1 to 1       27
u from −1 to 0.5     37
u from −1 to 0       76
u from −1 to −0.5    Cannot detect

106.4.1 Volatility Change Data (Subprime Mortgage Crisis in 2007)
106.4.1.1 Using Stock Prices Directly

Let us apply our new method to empirical studies. In 2007, the global market faced a serious crisis and suffered unprecedented credit risk. At the same time, stock prices underwent acute vibration. This event is a good example with which to test our new method. First, using the S&P 500 Index from January 3, 2006 to April 9, 2008, we obtain 570 daily observations. Second, we use the VIX Index, the implied volatility of the S&P 500, to check our result. When we apply our new method to detect change points, is it efficient? If a change does exist, how quickly can we detect it?
In Fig. 106.12(16), the S&P 500 Index shows an acute vibration after t = 400. The same phenomenon can be found in Fig. 106.12(15), the VIX Index: after t = 400, the VIX Index values are all approximately larger than 20%.
Fig. 106.12 (1)–(13) Hilbert spectra. (14) The means of the maxima in Hilbert spectra. (15) The VIX Index. (16) The S&P 500 Index
Table 106.7 tells us that the change points can be detected efficiently, except for Event 1. The changes of volatility in Event 2 and Event 3 still continue for 20 days and 53 days, respectively, after detection. Only a minority of indexes have their own implied volatility index; the S&P 500 Index is one of them, and its implied volatility index is called the VIX. Some markets do not have large enough option trading volumes, so the new method can be utilized in such markets. We can regard trading options as trading volatility. Accordingly, in a market with low option trading volume, when we detect an increase in volatility, the prices of options are probably undervalued; conversely, when the volatility decreases, the prices of options are possibly overvalued.

106.4.1.2 Using Log Return of Stock Prices

In this section, the log returns of stock prices are analyzed. Using log returns allows us to focus on the volatility change without the influence of the trend. Therefore, analyzing the log returns of the data yields better results than analyzing the original data. Figure 106.13(16) is the log return of the S&P 500 Index from January 3, 2006 to April 9, 2008.
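The transformation used here is just the first difference of log prices. A minimal sketch, with a synthetic series standing in for the actual S&P 500 data:

```python
import numpy as np

# Illustrative: in practice `prices` would be the 570 daily S&P 500 closes.
prices = 100.0 * np.exp(np.cumsum(np.random.default_rng(0).normal(0, 0.01, 570)))
log_returns = np.diff(np.log(prices))   # r_t = log(P_t) - log(P_{t-1})
```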
Table 106.7 The stopping times of the HHT test in the subprime mortgage crisis (using raw data)

Volatility change event   Start date   End date   Date of the HHT test detection   Stopping time of the HHT test
Event 1                   393          430        436                              43
Event 2                   465          495        475                              10
Event 3                   501          570        517                              16
From Fig. 106.13(14) and Table 106.8, the detection results show that the method in this section can detect the changes faster than that in Sect. 106.4.1.1, so we can make investment decisions more promptly by applying the results in this section. All three events are detected better here than in Sect. 106.4.1.1. In Event 1, the stopping time = 43 when using the S&P 500 Index directly, whereas here the stopping time = 9. In Event 2, the stopping time = 10 in Sect. 106.4.1.1, which contrasts with a stopping time = 4 here. Only in Event 3 is the stopping time = 16 in Sect. 106.4.1.1 almost the same as the stopping time = 15 in this section. This means that by using these detection results we can react more quickly to the change of the volatility.
106.4.2 Volatility Change Data (Dot-Com Bubble in 2000)

Another economic recession, in 2000, is considered here. In 1998–2000, the stock prices of dot-com companies rose quickly. However, most dot-com companies had not even made a profit in that period. Hence, after irrational investors had spent all the wealth they had buying shares, the stock prices of dot-com companies began to fall, accompanied by a rise in volatility. We adopt the S&P 500 Index from July 3, 2000 to December 31, 2001, comprising 374 daily prices. As a result of the better performance obtained when analyzing log returns in Sect. 106.4.1, we prefer log returns of the data here (Fig. 106.14).
Here we try to find the change points by sight; the conclusions are in Table 106.9. In Event 1, because the change is about 16–30%, the stopping time is short. In Events 2 and 3, the change is about 20–30%, so the stopping time is longer than in Event 1, but still quite small compared to the duration of the change. In other words, the HHT test can detect the change points promptly, so we can make investment decisions before the changes finish in the dot-com bubble case.
106.4.3 A Brief Summary

In the empirical study, we obtain two conclusions. First, the new method, the HHT test, is useful on empirical data. In both the subprime mortgage crisis and the dot-com bubble, we can detect the change points correctly and quickly. In the subprime mortgage crisis, we can find three events in which the volatility changes from less than 20% to more than 20%, and we can detect all the changes promptly. In the dot-com bubble, there exist three events, and all of them can be detected quickly, too. Therefore, the results we get from the data of both the subprime mortgage crisis and the dot-com bubble are outstanding. Second, due to a shorter stopping time, using the log return of the data is better than using the original data. The reason is that with log returns we can focus on the volatility change without the influence of the trend.
Fig. 106.13 (1)–(13) Hilbert spectra. (14) The means of the maxima in Hilbert spectra. (15) The VIX Index. (16) The log return of the S&P 500 Index
Table 106.8 The stopping times of the HHT test in the subprime mortgage crisis (using log return of stock prices)

Volatility change event   Start date   End date   Date of the HHT test detection   Stopping time of the HHT test
Event 1                   393          430        402                              9
Event 2                   465          495        469                              4
Event 3                   501          570        516                              15

106.5 Conclusions and Further Research
106.5.1 Conclusions

In this paper we reach four conclusions.

First, we introduce a new method for arbitrage detection. This method can be utilized without any model assumptions. The HHT can produce Hilbert spectra, which represent the frequency–time distribution of the amplitude. By utilizing the high-frequency parts of the Hilbert spectra, we can devise the HHT test. Moreover, this test can be applied without model assumptions. In the classic method of change point detection, the CUSUM test, we need to know the distribution of the data before and after the change in order to compute the log likelihood ratio statistic. This information is needless in the HHT test.

Second, the HHT test performs better than the CUSUM test in some cases. In Sect. 106.2, although the CUSUM test performs well in large-change problems, the HHT test is good at handling cases with slight changes. Therefore, when dealing with slight changes, we can consider the HHT test first.

Third, in practice, we still obtain good results. In Sect. 106.4, we review the data of the dot-com bubble in 2000 and the subprime mortgage crisis in 2007, respectively. Both financial crises are well known and extensively influential. The HHT test can detect these two crises promptly and efficiently, and we could even make a profit from the successful detection.

Fourth, the HHT test has both an advantage and a disadvantage. In Sect. 106.3, we learn the characteristics of the HHT test: when detecting a change of the mean, it yields a mediocre result. Although the HHT test is sensitive to volatility change, it is insensitive to the change of the mean.
Fig. 106.14 (1)–(13) Hilbert spectra. (14) The means of the maxima in Hilbert spectra. (15) The VIX Index. (16) The log return of the S&P 500 Index
Table 106.9 The stopping times of the HHT test in the dot-com bubble

Volatility change event   Start date   End date   Date of the HHT test detection   Stopping time of the HHT test
Event 1                   63           158        76                               13
Event 2                   159          226        194                              35
Event 3                   291          373        309                              18
In this paper, we become familiar with some features of the new method, the HHT test. It is a good method for dealing with volatility change problems, but it has some difficulties when facing changes of the mean. These imperfect parts of the HHT test still need to be resolved.

106.5.2 Further Research

The evidence presented above indicates that there are some persuasive reasons for preferring the HHT test in some cases. Therefore, we indicate some open problems here. First, a strict proof that the HHT test surpasses the CUSUM test needs to be provided. Because the HHT does not have a theoretical basis, this proof may be the hardest part of further research. Second, why does the HHT test yield better detection results for volatility change? Why is it insensitive when dealing with the change of the mean? Third, can we utilize the HHT test in other fields? We have a good conclusion in finance, so the next step should be research on other kinds of data. Fourth, how can we settle the weakness of the HHT test in mean change? The HHT test may have some congenital limits in the mean change case. However, if there are means to overcome this weakness, then the HHT test could become a comprehensive method for arbitrage detection.

References
Huang, N. E. and S. P. Shen. 2005. Hilbert–Huang transform and its applications, World Scientific, London.
Huang, N. E., Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N.-C. Yen, C. C. Tung, and H. H. Liu. 1998. "The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis." Proceedings of the Royal Society of London A 454, 903–995.
Huang, N. E., M. C. Wu, S. R. Long, S. P. Shen, W. Qu, P. Gloersen, and K. L. Fan. 2003. "A confidence limit for the empirical mode decomposition and Hilbert spectral analysis." Proceedings of the Royal Society of London A 459(2037), 2317–2345.
Page, E. S. 1954. "Continuous inspection schemes." Biometrika 41, 100–115.
Pollak, M. and D. Siegmund. 1985. "A diffusion process and its applications to detecting a change in the drift of Brownian motion." Biometrika 72(2), 267–280.
Siegmund, D. 1985. Sequential analysis, Springer-Verlag, New York, NY.
Wu, Z. and N. E. Huang. 2004. "A study of characteristics of white noise using the empirical mode decomposition method." Proceedings of the Royal Society of London A 460, 1597–1611.
Chapter 107
Detecting Corporate Failure

Yanzhi Wang, Lin Lin, Hsien-Chang Kuo, and Jenifer Piesse
Abstract This article introduces definitions of the terms bankruptcy, corporate failure, and insolvency, as well as the methods of bankruptcy and popular economic failure prediction models. We show that a firm filing for corporate insolvency does not necessarily fail to pay off its financial obligations as they mature. Moreover, we suggest that an appropriate risk monitoring system, centered on well-developed failure prediction models, is crucial to various parties in the investment world as a means of looking after the financial future of their clients or themselves.

Keywords Corporate failure · Bankruptcy · Distress · Receivership · Liquidation · Failure prediction · Discriminant analysis · Conditional probability analysis · Hazard model · Misclassification cost model
Y. Wang, Department of Finance, Yuan Ze University, 135 Yuan-Tung Rd., Jung-Li, Taiwan 32003, R.O.C. e-mail: [email protected]
L. Lin, Department of Banking and Finance, National Chi-Nan University, 1 University Rd., Puli, Nantou Hsien, Taiwan 545, R.O.C. e-mail: [email protected]
H.-C. Kuo, Department of Banking and Finance, National Chi-Nan University and Takming University of Science and Technology, 1 University Rd., Puli, Nantou Hsien, Taiwan 545, R.O.C. e-mail: [email protected]
J. Piesse, Management Centre, School of Social Science and Public Policy, King's College, University of London, 150 Stamford St., London SE1 9NN, UK, and University of Stellenbosch, Stellenbosch, Western Cape, RSA. e-mail: [email protected]

107.1 Introduction

The financial stability of firms is always of concern to many agents in a society, including investors, bankers, governmental/regulatory bodies, and auditors. The credit rating of listed firms is an important indicator, not only for investors in the stock market who adjust the stock portfolios they hold, but also for lenders in the capital market who calculate the costs of loan default and set the borrowing terms for their clients. It is also the duty of governmental and regulatory organizations to monitor the general financial status of firms in order to make proper economic and industrial policies. Moreover, in the interest of the public, the auditors of firms need to maintain a watching brief over the going concern of their clients for the foreseeable future and fairly present their professional view in the audit report attached to each client's financial statements. Thus, it is understandable that the financial stability of firms attracts so much attention from so many different parties.

A single firm's bankruptcy will influence a chain of members of society, especially its debtors and employees. But if a group of firms in an economy simultaneously face financial failure, it will leave scars not only on this particular economy but also on its neighbors. The latest evidence is demonstrated by the financial storm clouds that gathered over Thailand in July 1997, which caused immediate damage to most Asia–Pacific countries. For these reasons, the development of bankruptcy theory and bankruptcy prediction models, which can protect the market from unnecessary losses, is essential. This can also help governmental organizations make appropriate policies in time to maintain industrial cohesion and minimize the damage caused by widespread corporate bankruptcy to the economy as a whole.

As popular definitions of "distress," the financial terms "near bankruptcy or corporate failure," "bankruptcy" (see Brealey et al. 2001: 621; Ross et al. 2002: 858), or "financial failure" can refer to a variety of circumstances, including:

1. the market value of the assets of the firm is less than its total liabilities;
2. the firm is unable to pay debts when they come due;
3. the firm continues trading under court protection (Pastena and Ruland 1986: 289).
Of these, the second condition, that is, the insolvency of the company, has been the main concern in the majority of early bankruptcy studies in the literature, because insolvency can not only be explicitly identified but has also served as a legal and normative definition of the term "bankruptcy" in many developed countries. However, the first definition is more complicated and subjective, given that different accounting treatments of asset valuation usually produce different market values of a company's assets. Meanwhile, court protection legislation varies among countries.
107.2 The Possible Causes of Bankruptcy

Insolvency problems can result from decisions taken within the company or from a change in the economic environment, essentially outside the firm. Some general causes of insolvency are noted in Table 107.1. Apart from those listed there, a new company is usually thought to be riskier than those with a longer history. Blum (1974: 7) confirmed that "other things being equal, younger firms are more likely to fail than older firms." Hudson (1987), studying a sample between 1978 and 1981, also pointed out that companies liquidated through a procedure of creditors' voluntary liquidation or compulsory liquidation during that period were mainly 2–4 years old, and three-quarters of them were less than 10 years old. Moreover, Walker (1992: 9) found that "many new companies fail within the first three years of their existence." This evidence suggests that the distribution of failure likelihood against company age is skewed to the right. However, a clear cutoff point in the age structure has so far not been identified to distinguish "new" from "young" firms in a business context, nor is there any convincing evidence with respect to the propensity to fail of firms of different ages. In consequence, the age characteristics of liquidated companies can only be treated as an "observation" or "suggestion," rather than a "theory."
Table 107.1 Some possible causes of insolvency (Rees 1990: 394)
1. Low and declining real profitability
2. Inappropriate diversification – into unfamiliar industries or not away from declining ones
3. Import penetration into the firm's home markets
4. Deteriorating financial structures
5. Difficulties controlling new or geographically dispersed operations
6. Overtrading in relation to the capital base
7. Inadequate financial control over contracts
8. Inadequate control over working capital
9. Failure to eliminate actual or potential loss-making activities
10. Adverse changes in contractual arrangements
Although the most common causes of bankruptcy have been noted above, they are not sufficient to explain or predict corporate failure. In other words, a company with any of these characteristics is not doomed to bankruptcy within a predictable period of time. This is because some factors, such as government intervention, may play an important part in the rescue of distressed firms. Therefore, as Bulow and Shoven (1978) noted, the conditions under which a firm goes through liquidation are rather complicated. On the whole, as Foster (1986: 535) put it, "there need not be a one-to-one correspondence between the non-distressed/distressed categories and the non-bankrupt/bankrupt categories." It is noticeable that this ambiguity is even more severe in the not-for-profit sector of the economy.
107.3 The Methods of Bankruptcy

As corporate failure is not only an issue for the people involved as company owners and creditors, but also influences the economy as a whole, many countries legislate for formal bankruptcy procedures to protect the public interest from avoidable bankruptcy, such as Chaps. 10 and 11 in the US and the Insolvency Act in the UK. The objectives of such legislation are to "[firstly] protect the rights of creditors... [secondly] provide time for the distressed business to improve its situation... [and finally] provide for the orderly liquidation of assets" (Pastena and Ruland 1986: 289). In the UK, where a strong rescue culture prevails, the Insolvency Act contains six separate procedures that can be applied in different circumstances to protect creditors, shareholders, or the firm as a whole from unnecessary loss, thereby reducing the degree of individual as well as social loss. They are briefly described in the following sections.
107.3.1 Company Voluntary Arrangements

A voluntary arrangement is usually submitted by the directors of the firm to an insolvency practitioner "who is authorized by a recognized professional body or by the Secretary of State" (Rees 1990: 394) when urgent liquidity problems have been identified. The company in distress then goes through its financial position in detail with the practitioner and discusses the practicability of a proposal for corporate restructuring. If the practitioner endorses the proposal, it will be put to the company's creditors at the creditors' meeting, requiring an approval rate of 75% of those attending. If this restructuring report is accepted, those notified will be bound by the agreement, and the practitioner becomes the supervisor of the agreement. It is worth emphasizing that a
voluntary arrangement need not pay all the creditors in full, but rather a proportion of their lending (30% in a typical voluntary agreement in the UK) on a regular basis over the following several months. The advantage of this procedure is that it is normally much cheaper than formal liquidation proceedings, and the creditors usually receive a better return.

107.3.2 Administration Order

It is usually the directors of the insolvent firm who petition the court for an administration order. The court will then assign an administrator who will be in charge of the daily affairs of the firm. However, before an administrator is appointed, the company must convince the court that the making of an order is crucial to company survival or to a better realization of company assets than would be the case if the firm were declared bankrupt. Once this is rationalized, the claims of all creditors are effectively frozen. The administrator will then submit recovery proposals to the creditors' meeting for approval within 3 months of the appointment being made. If this proposal is accepted, the administrator will take the necessary steps to put it into practice. An administration order can be seen as the UK's version of the American Chap. 11 in terms of the provision of a temporary legal shelter for companies in trouble to escape future failure without damaging their capacity to continue trading (Counsell 1989). This does sometimes lead to insolvency avoidance (Homan 1989).

107.3.3 Administrative Receivership

An administrative receiver has very similar powers and functions to an administrator, but is appointed by the debenture holder (the bank) secured by a floating or fixed charge after the directors of the insolvent company see no prospect of improving its ability to repay debts. In some cases, before the appointment of an administrative receiver, a group of investigating accountants will be empowered to examine the real state of the company. The investigation normally includes the estimation of the valuable assets and liabilities of the company. If this investigation team finds that the company has no choice but to be liquidated, an administrative receiver who works in partnership with the investigation team will be entitled to take over the management of the company. The principal aim is to raise money to pay debenture holders and other preferential creditors by selling the assets of the businesses at the best prices. The whole business may be sold as a going concern if it is worth it. As in an administration order, the receiver must advise creditors of any progress by way of a creditors' meeting, which will be convened within a short period of time after the initial appointment.

107.3.4 Creditors' Voluntary Liquidation

In a creditors' voluntary liquidation, the directors of the company take the initiative to send an insolvency practitioner an instruction that will lead to the convening of creditors' and shareholders' meetings. At a shareholders' meeting, a liquidator will be appointed, and ratification of this appointment will later be sought at a creditors' meeting. Creditors have the final say as to who acts as liquidator. The liquidator will start to find potential purchasers and realize the assets of the insolvent firm to clear its debts. Unlike receivers, who have wide-ranging powers in the management of the business, a liquidator's ability to continue trading is restricted to cases in which a beneficial winding up is promised. As Rees (1990) stated, it is the most common method used to terminate a company.

107.3.5 Members' Voluntary Liquidation

The procedure for a members' voluntary liquidation is rather similar to that of a creditors' voluntary liquidation. The only difference is that in a members' voluntary liquidation the directors of the firm must swear a declaration of solvency to clear debts with fair interest within 12 months, and creditors should not be involved in the appointment of a liquidator. In other words, a company's announcement of a members' voluntary liquidation by no means signals its insolvency, but only means a shutdown along with the diminishing necessity of its existence.

107.3.6 Compulsory Liquidation

A compulsory liquidation is ordered by the court to wind up a company directly. This order is usually initiated by the directors of the insolvent firm or its major creditors. Other possible petitioners include the Customs and Excise, the Inland
Revenue, and local government (Hudson 1987: 213). The whole procedure usually starts with a statutory demand made by creditors who wish to initiate a compulsory liquidation. If the firm fails to satisfy their request within a certain number of working days, this failure is sufficient grounds to petition the court to wind up the firm. Once the order is granted, the Official Receiver takes control of the company instantly, or a liquidator is appointed by the Official Receiver instead. The company must then cease trading, and the liquidation procedure starts. However, an interesting phenomenon is that many valuable assets may be removed or sold before the liquidator takes control, or even while the petition is being delivered to the court, which may leave nothing of value for the liquidator to deal with. In this sense, a company entering compulsory liquidation has in practice been terminated well before the court order is granted.
107.4 Prediction Model for Corporate Failure

Corporate failure is not simply the closure of a company; it has impacts on the society and economy where it occurs. Therefore, it makes good business and academic sense to model corporate failure for prediction purposes. If bankruptcy can be predicted properly, it may be avoided and the firm restructured. By doing so, not only the company itself but also employees, creditors, and shareholders can all benefit. Academics have long believed that corporate failure can be predicted through the design of legal procedure (Dahiya and Klapper 2007) or through financial ratio analysis. This is because using financial variables in the form of ratios can control for the systematic effects of firm size and industry on the variables under examination (Lev and Sunder 1979: 187–188), facilitating cross-sectional comparisons that attempt to objectively discover the "symptoms" of corporate failure. In addition to cross-sectional comparisons, Theodossiou (1993) builds a time series analysis to investigate the detection of corporate failure. In consequence, financial ratio analysis has for decades not only been preferred when the interpretation of financial accounts is required, but has also been extensively used as input to the explicit formulation of corporate bankruptcy prediction models. Other than financial ratio analysis, applications of the Merton (1974) model (e.g., the KMV model), widely used by practitioners, also predict corporate failure. Differing from financial ratio analysis, the Merton (1974) model examines the possibility of bankruptcy using the stock market valuation and the concept of "distance to default": the possibility of bankruptcy is negatively related to the distance to default, which measures how far the firm valuation is from the critical default value in the firm value distribution.
107.4.1 Financial Ratio Analysis and Discriminant Analysis

The use of ratio analysis for the purpose of predicting corporate failure dates back at least to Fitzpatrick (1932), but this method did not attract much attention until Beaver (1966) proposed his famous univariate studies. Beaver (1966) systematically categorized 30 popular ratios into six groups and found that some of the ratios under investigation, such as the cash flow/total debt ratio, demonstrated excellent predictive power in the corporate failure model. His results also showed the deterioration of the distressed firms prior to their financial failure, including a sharp drop in their net income, cash flow, and working capital, coupled with an increase in total debt.

Although useful in predicting bankruptcy, univariate analysis was later criticized for its failure to incorporate additional causes of corporate failure, measured by other ratios, into a single model. This criticism prompted interest in the multivariate distress prediction model, which simultaneously includes different financial ratios with better combined predictive power for corporate failure. In the 1980s, with the increasing attention placed on multiratio analysis, multivariate discriminant analysis (MDA) began to dominate the bankruptcy prediction literature. MDA determines the discriminant coefficient of each of the characteristics chosen in the model on the basis that these will discriminate efficiently between failed and nonfailed firms. A single score is then generated for each firm in the study, and a cutoff point is determined to minimize the dispersion of scores associated with firms in each category and the chance of overlap. An intuitive advantage of MDA techniques is that the entire profile of the characteristics under investigation, and their interaction, are considered. Another advantage lies in its convenience in application and interpretation (Altman 1983: 102–103).

One of the most popular MDA models is the Z-score model developed by Altman (1968). On the basis of their popularity and relevance to corporate failure, 22 carefully selected financial ratios were classified into five bankruptcy-related categories. Altman's sample included 33 bankrupt and 33 nonbankrupt manufacturing companies during the period 1946–1965. The best and final model in terms of failure prediction contained five variables that are still frequently used in the banking and business sectors to date. This linear function is

Z-score = 1.2 Z_1 + 1.4 Z_2 + 3.3 Z_3 + 0.6 Z_4 + 0.999 Z_5,   (107.1)

where Z-score is the overall index; Z_1 is working capital/total assets; Z_2 is retained earnings/total assets; Z_3 is earnings before interest and taxes/total assets; Z_4 is market value of equity/book value of total debt; and Z_5 is sales/total assets.
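Equation (107.1) is straightforward to evaluate. A minimal sketch (function names ours), with the 1.8 and 2.99 zone cutoffs taken from the discussion that follows:

```python
def altman_z(z1, z2, z3, z4, z5):
    """Z-score of Eq. (107.1); z1..z5 are the five ratios defined above."""
    return 1.2 * z1 + 1.4 * z2 + 3.3 * z3 + 0.6 * z4 + 0.999 * z5

def altman_zone(score):
    """Cutoffs from Altman (1968): < 1.8 likely bankrupt, > 2.99 safe."""
    if score < 1.8:
        return "distress"
    if score > 2.99:
        return "safe"
    return "grey zone"
```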
Altman (1968) also tested the cutoff point to balance the Type I and Type II errors, and found that, in general, a company with a Z-score smaller than 1.8 was highly likely to go bankrupt in the few years following the examination period, whereas one with a Z-score higher than 2.99 was comparatively much safer. This Z-score model is so well known that it is still a favorite indicator of credit risk among fund suppliers in the new millennium.

Although these statistical discrimination techniques are popular in predicting bankruptcy, they suffer from some methodological problems. Some of the problems identified stem directly from the use of financial ratios (see also Agarwal and Taffler 2008). For example, the proportionality and zero-intercept assumptions are the two main factors crucial to the credibility of ratio analysis; together they determine the form of any ratio. The basic form of a ratio is usually assumed to be y/x = c, where y and x are two accounting variables that are assumed to be different but linearly related, and c can be interpreted as the value of this specific ratio. This raises three questions. First, is there an error term in the relationship between the two accounting variables? Second, is it possible that an intercept term exists in this relationship? And finally, what if the numerator and denominator are not linearly related?

With regard to the first question, Lev and Sunder (1979) proved that if there is an additive error term in the relationship between y and x suggested by the underlying theory, that is, y = βx + e or y/x = β + e/x, the comparability of such ratios will be limited. This is because "the extent of deviation from perfect size control depends on the properties of the error term and its relation to the size variable, x" (Lev and Sunder 1979: 191). The logic behind their argument can be summarized as follows. If the error term is homoscedastic, e/x is smaller for large firms than for small firms, because x as a size variable will on average be greater for large firms than for small firms. In other words, the ratio y/x for large firms will be closer to the slope term β than that for small firms. Thus, because the variance of the ratio y/x is greater for smaller firms than for larger firms, the ratios y/x of the two groups (i.e., large and small firms) are statistically drawn from two different distributions. This certainly weakens the comparability of ratios with such an underlying relationship. It seems that placing an additive error term in the relationship between the denominator and numerator of a ratio is not adequate in terms of size control. However, if y is heteroscedastic, y/x may turn out to be homoscedastic, though it is also possible that the heteroscedasticity of y/x is unchanged. In fact, Lev and Sunder (1979) also pointed out that this problem may be ameliorated only when the error term is
multiplicative in the relationship, that is, y = βxe or y/x = βe, because the deviation of y/x then has no mathematical relationship with the size variable x. As a result, this form of the ratio is more appropriate for comparison purposes. The above discussion also applies to the case where an intercept term exists in the relationship between the two ratio variables, represented by y = α + βx or y/x = β + α/x. It is obvious that the variance of the ratio y/x will be relatively larger for smaller firms than for larger firms, under the influence of the term α/x. Again, suffice it to say that such an underlying formation of the ratio is not acceptable for use in comparisons of company performance. If two variables are needed to control for the market size of the variable y, such as y = α + βx + δz or y = α + βx + δx², and if the underlying relationship is nonlinear, considerable confusion will be caused in the interpretation of ratios, not to mention the results of the ratio analysis.

All these problems cast doubt on whether the normally used form of ratios is appropriate in all circumstances. Theoretically, it seems that the use of ratios is less problematic if and only if highly restrictive assumptions are satisfied. Empirically, Whittington (1980) claimed that the violation of the proportionality assumption of the ratio form is the problem researchers most frequently encounter in the use of financial data in practice, especially in a time series study of an individual firm. McDonald and Morris (1984: 96) found that the proportionality assumption is better satisfied when a group of firms in a simple homogeneous industry is analyzed; otherwise, some amendment of the form of ratios will be necessary. However, they did not suggest replacing the basic form of the ratio with a more sophisticated one. On the contrary, they commented that, on average, the basic form of the ratio empirically performed quite satisfactorily in its application to ratio analysis. Keasey and Watson (1991: 90) also suggested that possible violations of the proportionality assumption can be ignored. Since then, lacking further theoretical developments or improvements of ratio forms, the basic form of the ratio has continued to prevail in bankruptcy studies.

In addition to the flaws in the design of financial ratios, there are some methodological problems associated with the use of MDA. Of them, non-normality, inequality of dispersion matrices across groups, and nonrandom sampling are the three problems that haunt model users. The violation of the normality assumption of MDA has given rise to considerable discussion since the 1970s (Kshirsagar 1971; Deakin 1976; Eisenbeis 1977; Amemiya 1981; Frecka and Hopwood 1983; Zavgren 1985; Karels and Prakash 1987; Balcaena and Ooghe 2006). Violation of normality is the ultimate cause of biased tests of significance and estimated error rates. Studies on the univariate normality of financial ratios found that ratio distributions tend to be
skewed (Deakin 1976; Frecka and Hopwood 1983; Karels and Prakash 1987). If the ratios included in the model are not perfectly univariate normal, their joint distribution will, a priori, not be multivariate normal (Karels and Prakash 1987). Therefore, a good variable set for bankruptcy modeling should be able to minimize multivariate non-normality problems. A traditional but prevalent stepwise procedure apparently cannot satisfy this requirement. However, despite quite a few complementary studies on data transformation and outlier removal for ratio normality (Eisenbeis 1977; Ezzamel et al. 1987; Frecka and Hopwood 1983), their suggestion is rarely used in later research on the generation program of MDA models (Shailer 1989: 57). Because all these techniques are imperfect, McLeay (1986) advocated that selecting a better model is more straightforward than the removal of outliers or data transformations. In comparing the problems of non-normality to inequality of dispersion matrices across all groups, the latter does not seem crucial to the estimation procedure of MDA. In theory, the violation of the equal dispersion assumption will affect the appropriate form of the discriminating function. After testing the relationship between the inequality of dispersions and the efficiency of the various forms of classification models, a quadratic classification rule seems to outperform a linear one in terms of the overall probability of misclassification when the variance–covariance matrices of the mutually exclusive populations are not identical (Eisenbeis and Avery 1972; Marks and Dunn 1974; Eisenbeis 1977). More importantly, the larger the difference between dispersions across groups, the more the quadratic form of discriminating function is recommended. One of the strict MDA assumptions is random sampling. However, the sampling method that MDA users prefer for bankruptcy prediction studies is choice-based, or state-based, sampling, which prepares an equal or approximately equal draw of observations from each population group. Because corporate failure is not a frequent occurrence in an economy (Altman et al. 1977; Wood and Piesse 1988), such sampling techniques will cause a relatively lower probability of misclassifying distressed firms as nondistressed firms (Type I Error) but a higher rate of misclassifying nondistressed firms as distressed firm (Type II Error) (Lin and Piesse 2004; Kuo et al. 2002; Palepu 1986; Zmijewski 1984). Therefore, the high predictive power of MDA models claimed by many authors appears to be suspect. As Zavgren (1985: 20) commented, MDA models are “difficult to assess because they play fast and loose with the assumptions of discriminant analysis.” When doubt is cast on the validity of the results of MDA models, scholars begin to look at more defensible approaches, such as conditional probability analysis (CPA). The change in these mainstreams is also reviewed by Balcaena and Ooghe (2006).
107.4.2 Conditional Probability Analysis

Since the late 1970s, the use of discriminant analysis has gradually been replaced by the use of CPA, which differs from MDA in that CPA produces the "probability of occurrence of a result, rather than producing a dichotomous analysis of fail/survive as is the norm with basic discriminant techniques" (Rees 1990: 418). CPA primarily refers to logit and probit techniques and has been discussed and used rather widely in academia (Keasey and Watson 1987; Martin 1977; Mensah 1983; Ohlson 1980; Peel and Peel 1987; Storey et al. 1987; Zavgren 1985, 1988; Sun 2007). Its popularity highlights the various responses of model users to the risk of failure. The main value of using CPA is that its application does not depend on demanding assumptions, as MDA does (Kennedy 1991, 1992).

However, logit CPA is not always better than MDA under all conditions. If the multivariate normality assumption is met, the maximum likelihood estimator (MLE) of MDA is asymptotically more efficient than the MLE of logit models. In any other circumstance, the MLE of MDA may not remain consistent, but the MLE of logit models will (Amemiya 1981; Judge et al. 1985; Lo 1986). As rejection of normality is very common in the bankruptcy literature, the logit model is appealing. Methodologically speaking, suffice it to say that it seldom goes wrong if logit analysis is used in distress classification.

One of the most commonly cited and used CPA models was developed by Ohlson (1980). Different from Altman's (1968) sample with an equal number of bankrupts and nonbankrupts, his sample included 105 bankrupt and 2,058 nonbankrupt industrial companies during 1970–1976. Using logistic analysis, his failure prediction models, with accuracy rates above 92%, contained nine financial ratios and suggested that company size, capital structure, return on assets, and current liquidity were the four most powerful bankruptcy indicators. His model is

Y = -1.3 - 0.4 Y_1 + 6.0 Y_2 - 1.4 Y_3 + 0.1 Y_4 - 2.4 Y_5 - 1.8 Y_6 + 0.3 Y_7 - 1.7 Y_8 - 0.5 Y_9,   (107.2)

where Y is the overall index; Y_1 is log(total assets/GNP price-level index); Y_2 is total liabilities/total assets; Y_3 is working capital/total assets; Y_4 is current liabilities/current assets; Y_5 is one if total liabilities exceed total assets, zero otherwise; Y_6 is net income/total assets; Y_7 is funds provided by operations/total liabilities; Y_8 is one if net income was negative for the last 2 years, zero otherwise; and Y_9 is a measure of the change in net income.

It is interesting that Ohlson (1980) chose 0.5 as the cutoff point, implicitly assuming a symmetric loss function across the two types of classification error. He later
explained that the best cutoff point should be calculated using data beyond his examination period (i.e., 1976), but that this was unnecessary in his paper because the econometric characteristics of CPA and his large sample size would neutralize the problems (Ohlson 1980: 126). He also pointed out the difficulty of comparing his results with others due to differences in the design of the lead time, the selection of predictors and observation periods, and the sensitivity of the results to the choice of estimation procedures.

As far as the predictive accuracy rates of MDA and CPA are concerned, Ohlson (1980) found that the overall results of logit models were not an obvious improvement on those of MDA. Hamer (1983) tested the predictive power of MDA and logit CPA and concluded that both performed comparably in the prediction of business failure for a given variable set. However, with the knowledge that predictive accuracy rates were overstated in previous MDA papers, mainly due to the employment of the choice-based sampling technique, the above comparison may be biased and the inferences from it could favor CPA. Apart from this, there do exist some factors that vary among previous papers and may erode such comparisons, including differences in the selection of predictors, the firm matching criteria, the lead time, the estimation and test time periods, and the research methodologies. Unless these factors are deliberately controlled, any report on comparisons between CPA and MDA in terms of predictive ability will not be robust.

In conclusion, not only can CPA provide what any other technique can in terms of use and ease of interpretation, but, more importantly, it has no strict assumptions of the kind MDA has, from which biased results primarily stem (Keasey and Watson 1991: 91). Even more recently, the dynamics of firm-specific and macroeconomic covariates can be captured by the new CPA modeling proposed by Duffie et al. (2007). With these appealing advantages, CPA is believed to be superior and preferred to MDA for bankruptcy classification.
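Under the logit specification, the index Y of Eq. (107.2) maps into a failure probability through the logistic function. A minimal sketch, using the coefficients and signs as reconstructed above and Ohlson's 0.5 cutoff; the function name is ours:

```python
import math

def ohlson_failure_probability(y):
    """Map the nine inputs of Eq. (107.2) to P(failure) via the logistic CDF.
    `y` is a sequence (y1, ..., y9) of the variables defined in the text."""
    y1, y2, y3, y4, y5, y6, y7, y8, y9 = y
    index = (-1.3 - 0.4 * y1 + 6.0 * y2 - 1.4 * y3 + 0.1 * y4
             - 2.4 * y5 - 1.8 * y6 + 0.3 * y7 - 1.7 * y8 - 0.5 * y9)
    prob = 1.0 / (1.0 + math.exp(-index))
    return prob, prob > 0.5   # Ohlson's symmetric 0.5 cutoff
```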
107.4.3 Three CPA Models: LP, PM, and LM

There are three commonly cited CPA models: the linear probability model (LP), the probit model (PM), and the logit model (LM). Since CPA estimates the probability of the occurrence of a result, the general form of a CPA equation can be set as

Pr(y = 1) = F(x, \beta),   Pr(y = 0) = 1 - F(x, \beta),   (107.3)

where y is a dichotomous dummy variable that takes the value of 1 if the event occurs and 0 if it does not, and Pr(·) represents the probability of this event. F(·) is a function of a regressor vector x coupled with a vector β of parameters that governs the influence of x on the probability. The problem arises as to which distribution best fits the above equation. Derived from three different distributions, LP, PM, and LM are then chosen to fit the right-hand side of the equation.

LP is a linear regression model, which is easy to use but has two main problems as far as its application to generating the probability of an outcome is concerned. The first problem is the heteroscedastic nature of the error term. Recall the form of an ordinary LP, Y = X'β + ε, where Y is the probability of an outcome, X is a column of independent variables, β is the parameter vector, and ε is the error term. When an event occurs, Y = 1 and ε = 1 - X'β; when it does not occur, Y = 0 and ε = -(X'β). It is noticeable that this error term is not normally distributed, so a feasible generalized least squares estimation procedure should be used to correct this heteroscedasticity problem (Greene 1997: 87). The second problem, which may be the real difficulty with the LP model, is that LP cannot constrain Y to lie between 0 and 1 as a probability should. Amemiya (1981: 1486) suggested imposing the condition that Y = 1 if Y > 1 and Y = 0 if Y < 0, but this may produce unrealistic and nonsensical results. Therefore, it is not surprising that LP is now used less frequently; it will not be employed in this study either.

In the discussion of qualitative response models, academics seem to be more interested in comparisons between logit and probit models. Although logit models are derived from the logistic density and probit models from the Normal density, these two distributions are almost identical except that the logistic distribution has thicker tails and a higher peak in the middle (Cramer 1991: 15). In other words, the probability at each tail and in the middle of the logistic distribution curve will be larger than that of the Normal distribution. However, one of the advantages of the logit model is its computational simplicity. A look at the formulas of these two models will help one appreciate this merit:

Probit Model:   Prob(Y = 1) = \int_{-\infty}^{\beta'x} (2\pi)^{-1/2} e^{-t^2/2} \, dt = \Phi(\beta'x),   (107.4)

Logit Model:   Prob(Y = 1) = exp(\beta'x) / [1 + exp(\beta'x)] = 1 / [1 + exp(-\beta'x)],   (107.5)
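Both response functions are easy to evaluate directly, which makes the computational-simplicity point concrete: the logit needs only one exponential, while the probit requires the normal integral. A minimal sketch (function names ours):

```python
from math import erf, exp, sqrt

def probit_prob(xb):
    """Phi(beta'x) via the error function, Eq. (107.4)."""
    return 0.5 * (1.0 + erf(xb / sqrt(2.0)))

def logit_prob(xb):
    """exp(beta'x) / (1 + exp(beta'x)), Eq. (107.5)."""
    return 1.0 / (1.0 + exp(-xb))
```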
where the function Φ(·) is the standard normal cumulative distribution. This mathematical convenience of logit models is one of the reasons for their popularity in practice (Greene 1997: 874). As far as the classification accuracy of CPA models is concerned, comparisons of the results produced by these two models have been made and, as discussed above, generally suggest that they are actually indistinguishable in cases where the data are not heavily concentrated in the tails or the middle (Amemiya 1981; Cramer 1991; Greene 1997). This finding is consistent with the difference in the shapes of the two distributions from which PM and LM are derived. It has also been proven that the coefficients of the logit model are about π/√3 ≈ 1.8 times as large as those of the probit model, implying that the slopes of each variable in both models are very similar. In other words, "the logit and probit models results are nearly identical" (Greene 1997: 878).

The choice of sampling method is also a validity factor for CPA. The prevailing sampling method in the bankruptcy literature is to draw a sample with an approximately equal number of bankrupts and nonbankrupts, usually referred to as the state-based sampling technique, as an alternative to random sampling. Although econometric estimation procedures usually assume random sampling, the use of state-based sampling has an intuitive appeal. As far as bankruptcy classification models are concerned, corporate failure is an event with rather low probability. Hence, a random sampling method may result in the inclusion of a very small percentage of bankrupts but a very high percentage of nonbankrupts. Such a sample will not be an effective one for any econometric model to produce efficient estimators (Palepu 1986: 6). In contrast, state-based sampling is an "efficient sample design" (Cosslett 1981: 56) that can effectively reduce the required sample size without compromising the provision of efficient estimators, if an appropriate model and modification procedure are employed. In short, the information content of a state-based sample for model estimation is relatively better than that of a random sample.

A state-based sample for CPA results in the understatement of the Type I Error but the overstatement of the Type II Error (Palepu 1986; Lin and Piesse 2004). Manski and McFadden (1981) suggested several alternatives that can negate the drawbacks of using state-based sampling. They include the weighted exogenous sampling maximum likelihood estimator (WESMLE) and the modified version by Cosslett (1981), the nonclassical maximum likelihood estimator (NMLE), and the conditional maximum likelihood estimator (CMLE). Their comparisons of these estimation procedures can be summarized in the following four points:

1. All these estimators are computationally tractable, consistent, and asymptotically normal.
2. The weighted estimator and conditional estimator avoid the introduction of nuisance parameters.
3. The nonclassical maximum likelihood estimators are strictly more efficient than the others in large samples. 4. In the presence of computational constraints, WESMLE and CMLE are the best; otherwise, NMLE is the most desirable. Accordingly, by using any one of the above modification methods, the advantages of the state-based sampling technique can be maintained while its disadvantages are mostly removed. What can also be inferred from this comparison is that the selection of the modification method depends on two factors: the sample size and the computational complexity. Of these, CMLE is the most frequently used in the bankruptcy literature, for three reasons. First, the method has been extensively demonstrated in the Cosslett (1981) and Maddala (1983) studies, with applications to the logit model. Second, it has been used in the development of the acquisition prediction model by Palepu (1986), the merger/insolvency choice model by BarNiv and Hathorn (1997), and the bankruptcy classification models by Lin and Piesse (2004). Finally, because CMLE only changes the constant term of the model produced by normal MLE procedures and has no effect on the other parameters, the procedure is relatively simple and straightforward for CPA users. In short, the merits of using CMLE with CPA include computational simplicity and practical workability. Without the biases caused by the choice of sampling method, modified CPA corrects almost all of the methodological flaws that MDA can have.
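To make the CMLE-style intercept adjustment concrete, the sketch below fits an ordinary logit model on a state-based (choice-based) sample and then corrects the constant term using the sample and population failure rates, in the form of the prior correction discussed in Maddala (1983) and applied by Palepu (1986). This is a minimal illustration with simulated data; the variable names and the synthetic data-generating process are assumptions, not part of the original studies.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical state-based sample: 100 failed and 100 healthy firms,
# although failures are only ~2% of the underlying population.
n_fail, n_healthy = 100, 100
pop_failure_rate = 0.02

# Two illustrative financial ratios (assumed, for demonstration only):
# failed firms have lower profitability and higher leverage on average.
x_fail = rng.normal([-0.05, 0.70], 0.10, size=(n_fail, 2))
x_healthy = rng.normal([0.05, 0.45], 0.10, size=(n_healthy, 2))
X = sm.add_constant(np.vstack([x_fail, x_healthy]))
y = np.r_[np.ones(n_fail), np.zeros(n_healthy)]

# Ordinary MLE logit on the state-based sample.
logit_fit = sm.Logit(y, X).fit(disp=False)
beta = logit_fit.params.copy()

# CMLE-style correction: only the constant changes. The intercept is
# shifted by the log odds of the sample failure rate relative to the
# population failure rate (prior correction; see Maddala 1983).
q1 = n_fail / (n_fail + n_healthy)   # sample failure proportion
p1 = pop_failure_rate                # population failure proportion
beta[0] -= np.log((q1 / p1) / ((1 - q1) / (1 - p1)))

print("uncorrected intercept:", logit_fit.params[0])
print("corrected intercept:  ", beta[0])
```

The slope coefficients are left untouched, mirroring the point above that CMLE changes only the constant term of the fitted model.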
107.4.4 Time Series Analysis: CUSUM Model

One application of time series analysis to corporate distress is provided by Theodossiou (1993). He explores the idea that the difference between healthy and distressed firms, measured through changes in firm characteristics over time, could help detect corporate distress. On the path from normal status to distress, the financial ratios deteriorate over time, which provides the information for an early warning. In Theodossiou's (1993) framework, let X_{i,1}, ..., X_{i,T} be a sequence of important financial variables for firm i. The unconditional mean of X_{i,t} in the group of healthy companies is stationary over time; that is, E(X_{i,t} | h) = u_h. For the group of distressed firms, the mean of X_{i,t} diverges from u_h and gradually moves toward the mean u_f of the distressed firms; that is, u_h → u_{f,s} → ... → u_{f,1} → u_f = u_{f,0}, in which u_{f,m} ≡ E(X_{i,t} | f, m) for m = 0, 1, 2, ..., s is the mean of X_{i,t} m reporting periods before failure. Next, the deviations of a firm's indicator vectors from their means are expressed as a vector autoregressive moving average (VARMA) process of order p and q:
\[
X_{i,t} - u_h = \sum_{k=1}^{p} \Phi_k \left( X_{i,t-k} - u_h \right) + \varepsilon_{i,t} - \sum_{s=1}^{q} \Theta_s \varepsilon_{i,t-s}, \qquad (107.6)
\]

\[
X_{i,t} - u_{f,m} = \sum_{k=1}^{p} \Phi_k \left( X_{i,t-k} - u_{f,m+k} \right) + \varepsilon_{i,t} - \sum_{s=1}^{q} \Theta_s \varepsilon_{i,t-s}, \quad m = 0, 1, 2, \ldots, \qquad (107.7)
\]

where E(ε_{i,t}) = 0, E(ε_{i,t} ε′_{i,t}) = Σ, and E(ε_{i,t} ε′_{j,s}) = 0 for i ≠ j and/or t ≠ s, with i, j = 1, 2, ..., N and N = N_f + N_h. Based on these time series processes for the financial ratios, Theodossiou (1993) constructs a cumulative sums (CUSUM) model that conveys the information regarding the deterioration of a distressed firm's financial status:

\[
C_{i,t} = \min\left( C_{i,t-1} + Z_{i,t} - K,\; 0 \right) < -L, \quad \text{for } K, L > 0, \qquad (107.8)
\]

where C_{i,t} and Z_{i,t} are, respectively, a cumulative (dynamic) and a quarterly (static) time series performance score for the ith company at time t, and K and L are sensitivity parameters required to be positive. The score Z_{i,t} is a function of the attribute vector X_{i,t}, accounts for serial correlation in the data, and is estimated by

\[
Z_{i,t} = \beta_0 + \beta_1 \left[ X_{i,t} - \sum_{k=1}^{p} \Phi_k X_{i,t-k} + \sum_{s=1}^{q} \Theta_s \varepsilon_{i,t-s} \right], \qquad (107.9)
\]

where

\[
\beta_0 = -\frac{1}{2D} \left[ (u_h - u_f) - \sum_{k=1}^{p} \Phi_k (u_h - u_{f,k}) \right]' \Sigma^{-1} \left[ (u_h + u_f) - \sum_{k=1}^{p} \Phi_k (u_h + u_{f,k}) \right],
\]

\[
\beta_1 = \frac{1}{D} \left[ (u_h - u_f) - \sum_{k=1}^{p} \Phi_k (u_h - u_f) \right]' \Sigma^{-1},
\]

and

\[
D^2 = \left[ (u_h - u_f) - \sum_{k=1}^{p} \Phi_k (u_h - u_f) \right]' \Sigma^{-1} \left[ (u_h - u_f) - \sum_{k=1}^{p} \Phi_k (u_h - u_f) \right].
\]

Based on the CUSUM model, we are able to measure a firm's overall financial distress possibility by the cumulative score C_{i,t}. If a company is a typically healthy firm, its Z_{i,t} scores are positive and greater than K, and thus its C_{i,t} scores equal zero. In contrast, a distressed firm whose Z_{i,t} falls below K accumulates a negative C_{i,t} score. In particular, the two parameters K and L determine the detection ability of the CUSUM model and relate to the Type I and Type II Errors. Generally, the larger the value of K, the less likely a Type I Error and the more likely a Type II Error; the opposite is true for the parameter L. The misclassification probabilities

\[
P_f = \operatorname{prob}(C_{i,t} > -L \mid \text{distressed firm}), \qquad P_h = \operatorname{prob}(C_{i,t} \le -L \mid \text{healthy firm}) \qquad (107.10)
\]

are, respectively, the percentages of distressed and healthy firms in the population not classified accurately by the CUSUM model, also termed the Type I and Type II Errors. To determine the optimal values of K and L, we solve the optimization problem

\[
\min_{K,L} \; EC = w_f P_f(K, L) + (1 - w_f) P_h(K, L), \qquad (107.11)
\]

where w_f and w_h = 1 − w_f are the investor's specific weights attached to the probabilities P_f and P_h, and EC is the expected error rate.
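As a concrete illustration, the sketch below computes the cumulative CUSUM scores of Equation (107.8) for a series of quarterly performance scores Z_{i,t} and flags the first period at which C_{i,t} crosses the −L boundary. The example scores and the values of K and L are assumptions chosen for demonstration; in practice Z_{i,t} would come from the estimated discriminant function in Equation (107.9).

```python
import numpy as np

def cusum_scores(z, k):
    """Cumulative CUSUM scores: C_t = min(C_{t-1} + Z_t - K, 0)."""
    c = np.zeros(len(z))
    prev = 0.0
    for t, z_t in enumerate(z):
        prev = min(prev + z_t - k, 0.0)
        c[t] = prev
    return c

def first_distress_signal(c, l):
    """Index of the first period with C_t < -L, or None if no signal."""
    hits = np.where(c < -l)[0]
    return int(hits[0]) if hits.size else None

# Hypothetical quarterly scores: positive while healthy, deteriorating later.
z = np.array([0.8, 0.6, 0.9, 0.2, -0.4, -0.9, -1.3, -1.1])
K, L = 0.5, 1.0                       # assumed sensitivity parameters
c = cusum_scores(z, K)
print(np.round(c, 2))                 # cumulative scores, capped at zero
print(first_distress_signal(c, L))    # first quarter breaching -L
```

Raising K makes the cumulative score drift downward faster (more signals, fewer Type I Errors but more Type II Errors), while raising L delays the alarm, exactly the trade-off described above.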
107.4.5 Merton Model

In the Merton (1974) Model, the probability of a firm going into default is measured by the stock market valuation of firm value. Conceptually, a firm whose market value falls below its obligations is in distress. Yet the market valuation of the firm (equal to the sum of debt and equity) is not observed directly; only the stock market valuation of equity is available. Merton (1974) therefore estimates the mean and standard deviation of the firm value indirectly through an option formula. One simple way is to refer to the Black and Scholes (1973) option formula, from which the mean and volatility of the firm value can be estimated. The measure of "distance to default" serves as a proxy for the possibility of corporate financial distress, where distance to default is defined as the distance from the actual firm value to the critical value of a distressed firm, given the confidence level of the Type I Error in determining the firm's default. In Fig. 107.1, the distance to default is denoted DD. More precisely, the market valuation and standard deviation of the firm's equity can be obtained from the Merton (1974) Model as

\[
V_E = V_A N(d_1) - D e^{-r\tau} N(d_2), \qquad \sigma_E = \frac{N(d_1) V_A \sigma_A}{V_E}, \qquad (107.12)
\]
Fig. 107.1 Firm value distribution and distance to default. Note. This figure is from Fig. 8.1 in Crosbie and Bohn (2002): (1) the current asset value; (2) the distribution of the asset value at time H ; (3) the volatility of the future assets value at time H ; (4) the level of the default point, the book value of the liabilities; (5) the expected rate of growth in the asset value over the horizon; and (6) the length of the horizon, H
where

\[
d_1 = \frac{\ln(V_A / D) + (r + \sigma_A^2 / 2)\,\tau}{\sigma_A \sqrt{\tau}}, \qquad d_2 = d_1 - \sigma_A \sqrt{\tau},
\]

V_E is the market value of the firm's equity, σ_E is the standard deviation of the equity, V_A is the market valuation of the firm, σ_A is the standard deviation of the firm value, r is the risk-free rate, τ is the time interval between the current date and the option expiration date, D is the total liability, and N is the cumulative distribution function of the standard Normal. From the two equations in (107.12), we solve for the unknown parameters V_A and σ_A. Based on the estimates of V_A and σ_A, the distance to default is

\[
DD = \frac{\ln(V_A / D) + (\mu - \sigma_A^2 / 2)\,\tau}{\sigma_A \sqrt{\tau}}, \qquad (107.13)
\]

in which μ is the growth rate of assets and DD is the distance to default. Generally, the smaller the DD, the higher the likelihood that the firm will default in the near future. One vital question in the estimation of the distance to default is how to compute the standard deviation of the return on the equity price. One way is to estimate the standard deviation from daily returns with some autocorrelation adjustments. In the real world, KMV Corporation applies Merton (1974) to build the financial distress forecast model usually termed the KMV Model. In the estimation of the equity return standard deviation, KMV Corporation adopts estimates of V_A and σ_A other than those from the option approach. By and large, a different method of measuring the standard deviation of equity returns yields a different financial distress model. To model the equity return distribution (particularly the standard deviation), Bharath and Shumway (2008) compare the Merton-based KMV Model with a naïve alternative. They let the market value of each firm's debt equal the face value of its debt; that is,

naïve D = F.

Then, Bharath and Shumway (2008) approximate the volatility of each firm's debt by the following naïve equation:

naïve σ_D = 0.05 + 0.25 σ_E.

The 5% is a naïve estimate of term structure volatility, and the 25% multiplying the equity volatility represents the tendency toward default risk. Bharath and Shumway (2008) use this simple approximation to estimate the total volatility of the firm:

\[
\text{naïve } \sigma_A = \frac{V_E}{V_E + \text{naïve } D}\, \sigma_E + \frac{\text{naïve } D}{V_E + \text{naïve } D}\, \text{naïve } \sigma_D = \frac{V_E}{V_E + F}\, \sigma_E + \frac{F}{V_E + F}\, (0.05 + 0.25 \sigma_E). \qquad (107.14)
\]

Next, they set the expected return process to the firm's stock return over the previous year:

naïve μ = r_{i,t−1}.

Finally, the naïve distance to default is

\[
\text{naïve } DD = \frac{\ln\!\left[(V_E + F)/F\right] + (r_{i,t-1} - 0.5\, \text{naïve } \sigma_A^2)\,\tau}{\text{naïve } \sigma_A \sqrt{\tau}}. \qquad (107.15)
\]
Using this naïve DD, Bharath and Shumway (2008) argue that the alternative performs slightly better than the traditional Merton-based KMV Model.
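The naïve measure is simple enough to compute directly. The sketch below implements Equations (107.14) and (107.15); the input values for equity value, face value of debt, equity volatility, and the prior-year stock return are hypothetical placeholders.

```python
import numpy as np

def naive_distance_to_default(v_e, f, sigma_e, r_prev, tau=1.0):
    """Bharath and Shumway (2008) naive DD, Equations (107.14)-(107.15)."""
    naive_d = f                              # naive debt value = face value
    naive_sigma_d = 0.05 + 0.25 * sigma_e    # naive debt volatility
    w_e = v_e / (v_e + naive_d)              # equity weight in total value
    naive_sigma_a = w_e * sigma_e + (1 - w_e) * naive_sigma_d
    num = np.log((v_e + f) / f) + (r_prev - 0.5 * naive_sigma_a**2) * tau
    return num / (naive_sigma_a * np.sqrt(tau))

# Hypothetical firm: $800M of equity, $500M face value of debt,
# 35% equity volatility, 8% stock return over the previous year.
dd = naive_distance_to_default(v_e=800.0, f=500.0, sigma_e=0.35, r_prev=0.08)
print(round(float(dd), 3))
```

A smaller output corresponds to a higher default likelihood; under the normality interpretation of the Merton framework, N(−DD) would give the implied default probability.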
107.5 The Selection of Optimal Cutoff Point

The final issue with respect to the accuracy rate of a bankruptcy classification model is the selection of an optimal cutoff point, especially for a scoring model. As Palepu (1986) noted, the cutoff point in most early papers was an arbitrary cutoff probability, usually 0.5. This choice may be intuitive, but it lacks theoretical backing. Joy and Tollefson (1975), Altman and Eisenbeis (1978), and Altman et al. (1977) developed an equation to calculate the optimal cutoff point in their ZETA model. Two elements in the calculation of the optimal cutoff point can be identified: (1) the costs of Type I and Type II Errors and (2) the prior probabilities of failure and survival. These two essentials had been ignored in most previous studies. Kuo et al. (2002) addressed these points by using fuzzy theory to enhance the credit decision model. Although their efforts were a significant breakthrough, several problems remain unsolved. The first problem is the subjectivity in deciding the costs of Type I and Type II Errors. Altman et al. (1977: 46) claimed that bank loan decisions will be approximately 35 times more costly for Type I Errors than for Type II Errors. This figure certainly cannot be applied to other decision models used by different parties, such as investors, or in different periods of time. In this sense, such an investigation of the costs of Type I and Type II Errors may be a one-off case. Given real-world constraints, the subjectivity of selecting convenient cutoff figures in academic studies seems inevitable. The second problem is the subjectivity of selecting a prior bankruptcy probability. Wood and Piesse (1988) criticized Altman et al. (1977) for choosing a 2% failure rate, higher than the average annual failure rate of 0.5%, suggesting spurious results from Altman et al. and necessitating a correction that was taken up in later research. The final problem is that the optimal cutoff score produced may not be "optimal" in the light of the violation of the assumptions of multinormality and equal dispersion matrices (Altman et al. 1977: 43, footnote 17), which is apparently a common methodological problem in this data analysis. The optimal cutoff equation in Maddala (1983: 80) seems to be less problematic. It first develops the overall misclassification cost model as

\[
C = C_1 P_1 \int_{G_2} f_1(x)\, dx + C_2 P_2 \int_{G_1} f_2(x)\, dx, \qquad (107.16)
\]
where C is the total cost of misclassification; C_1 the cost of misclassifying a failed firm as a nonfailed one (Type I Error); C_2 the cost of misclassifying a nonfailed firm as a failed one (Type II Error); P_1 the proportion of failed firms in the total population of firms; P_2 the proportion of nonfailed firms in the total population of firms; G_1 the failed firm group; G_2 the nonfailed firm group; x a vector of characteristics x = (x_1, x_2, ..., x_k); f_1(x) the joint distribution of the characteristics x in the failed firm group; f_2(x) the joint distribution of x in the nonfailed firm group; and P_1 + P_2 = 1. However,
\[
\int_{G_1} f_1(x)\, dx + \int_{G_2} f_1(x)\, dx = 1. \qquad (107.17)
\]

From Equations (107.16) and (107.17), we then have

\[
C = C_1 P_1 \left[ 1 - \int_{G_1} f_1(x)\, dx \right] + C_2 P_2 \int_{G_1} f_2(x)\, dx = C_1 P_1 + \int_{G_1} \left[ C_2 P_2 f_2(x) - C_1 P_1 f_1(x) \right] dx. \qquad (107.18)
\]

To minimize the total cost of misclassification, min C, we have to let

\[
C_2 P_2 f_2(x) - C_1 P_1 f_1(x) \le 0, \qquad (107.19)
\]

or

\[
\frac{f_1(x)}{f_2(x)} \ge \frac{C_2 P_2}{C_1 P_1}. \qquad (107.20)
\]

If we assume the expected costs of the Type I Error and Type II Error are the same, C_2 P_2 = C_1 P_1, the condition that minimizes the total misclassification cost becomes

\[
\frac{f_1(x)}{f_2(x)} \ge 1. \qquad (107.21)
\]
This result is consistent with the one proposed by Palepu (1986) under the assumption of equal costs of Type I and Type II Errors. Therefore, the optimal cutoff point should be the probability value at which the two conditional marginal densities, f_1(x) and f_2(x), are equal. What can be seen from this equation is that there is no need to use the prior failure rate to calculate the optimal cutoff point; the ex post failure rate (i.e., the sample failure rate) is used instead. Palepu (1986) illustrates this convenience more clearly through an application of Bayes' formula. Although the costs of Type I and Type II Errors themselves drop out, the expected costs of these errors are still required in this formula. Unfortunately, the subjectivity of deciding the relationship between these two types of expected costs still cannot be removed. There is no theory suggesting they shall be the
same; that is, C_2 P_2 = C_1 P_1. However, compared to the traditionally arbitrary 50% cutoff point, this assumption is more neutral and acceptable in an economic sense. The application of this procedure for the determination of the bankruptcy cutoff probability can be found in Palepu (1986) and Lin and Piesse (2004).
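A hedged numerical sketch of this rule: estimate the two conditional score densities from the sample and take the cutoff where they cross, which under equal expected error costs minimizes the total misclassification cost of Equation (107.16). The score distributions below are synthetic placeholders, not data from any of the cited studies.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Hypothetical bankruptcy-probability scores from a fitted CPA model:
# failed firms cluster near 1, nonfailed firms near 0.
scores_failed = rng.beta(5, 2, size=100)       # sample from f1 (failed)
scores_nonfailed = rng.beta(2, 5, size=100)    # sample from f2 (nonfailed)

f1 = gaussian_kde(scores_failed)
f2 = gaussian_kde(scores_nonfailed)

# Optimal cutoff under equal expected error costs: f1(x) = f2(x).
grid = np.linspace(0.01, 0.99, 981)
diff = f1(grid) - f2(grid)
crossings = np.where(np.diff(np.sign(diff)) != 0)[0]
cutoff = grid[crossings[0]] if crossings.size else 0.5
print("optimal cutoff:", round(float(cutoff), 3))
```

With unequal expected costs, the same code generalizes by locating the point where f1(x)/f2(x) = C_2 P_2 / (C_1 P_1), per Equation (107.20).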
107.6 Recent Development

While MDA and CPA are static analyses, dynamic modeling is becoming the mainstream of the bankruptcy literature. Shumway (2001) criticized static bankruptcy models for observing only the status of each bankrupt company one year prior to its failure and for ignoring changes in firms from year to year. In contrast, Shumway (2001) proposed a simple dynamic hazard model, which contains a survivor function and a hazard function to determine a firm's failure probability at each point in time. Given the infrequency of corporate failure in reality, hazard model users avoid the small sample problem because this dynamic model uses all available time series information on firms. As the hazard model takes duration dependence, time-varying covariates, and data sufficiency problems into consideration, it is methodologically superior to the models of the MDA and CPA families, although more empirical evidence is needed to support its superiority in prediction power. Studies following similar concepts can also be found in Whalen (1991) and Helwege (1996).
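Shumway (2001) shows that his discrete-time hazard model can be estimated as a logit model on firm-year observations, with test statistics adjusted for the number of firms rather than firm-years. The sketch below sets up such a firm-year panel estimation on simulated data; the covariates, coefficients, and data layout are illustrative assumptions only.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)

# Hypothetical firm-year panel: 300 firms observed for up to 10 years.
rows = []
for firm in range(300):
    lev = rng.uniform(0.2, 0.8)                       # firm's leverage level
    for age in range(1, 11):
        ni_ta = rng.normal(0.04 - 0.05 * lev, 0.05)   # profitability ratio
        # Assumed hazard: failure risk rises with leverage, falls with profits.
        p_fail = 1.0 / (1.0 + np.exp(-(-6.0 + 4.0 * lev - 8.0 * ni_ta)))
        failed = rng.random() < p_fail
        rows.append((np.log(age), ni_ta, lev, int(failed)))
        if failed:
            break                                     # firm exits the panel
data = np.array(rows)

# Discrete-time hazard model: a logit on firm-year observations, with
# log(age) capturing a simple form of duration dependence (Shumway 2001).
X = sm.add_constant(data[:, :3])
fit = sm.Logit(data[:, 3], X).fit(disp=False)
print(fit.params)   # note: test statistics should be adjusted for the
                    # number of firms rather than the number of firm-years
```

Because every available firm-year enters the likelihood, the estimation exploits the full time series of each firm, which is precisely the advantage over one-observation-per-firm static models noted above.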
107.7 Conclusion

From all the frequently used methods of bankruptcy classification, it is clear that the reasons a firm may file for corporate insolvency do not necessarily include the inability to pay off its financial obligations when they mature. For example, a solvent company can also be wound up through the members' voluntary liquidation procedure to maximize shareholders' wealth when the realized value of its assets exceeds its present value in use. Bulow and Shoven (1978) modeled the potential conflicts among the various claimants to the assets and income flows of the company (e.g., bondholders, bank lenders, and equity holders) and found that a liquidation decision should be made when "the coalition of claimants with negotiating power can gain from immediate liquidation" (Bulow and Shoven 1978: 454). Their model also considered the existence of some asymmetrical claimants of the firm. These results confirm the complicated nature of the bankruptcy decision and justify the adoption of the members' voluntary liquidation procedure to determine a company's value (see Brealey et al. 2001: 622; Ross et al. 2002: 857).
As to the development of failure prediction models, new models are methodologically superior, but the gains in their prediction power do not appear to offset the increase in their modeling complexity, which casts doubt on their real value in practice. In addition, the costs of bankruptcy vary with the different bankruptcy codes in different countries (see Brealey et al. 2001: 439–443; Ross et al. 2002: 426). This implies that bankruptcy prediction models with universally applicable factors and cutoff probabilities do not exist.

Acknowledgments We would like to express our heartfelt thanks to Professor C. F. Lee, who provided valuable suggestions in structuring this article. We also owe thanks to many friends at the University of London (UK) and National Chi Nan University (Taiwan) for valuable comments. We also want to thank our research assistant Chiu-Mei Huang for preparing the manuscript and cheerfully proofreading several drafts of it. Last, but not least, special thanks go to the Executive Editorial Board of the Handbook of Quantitative Finance and Risk Management at Kluwer Academic Publishers, who encouraged us to write the article, expertly managed the development process, and superbly turned the final manuscript into a finished product.
References Agarwal, V. and R. Taffler. 2008. “Comparing the performance of market-based and accounting-based bankruptcy prediction models.” Journal of Banking and Finance 32, 1541–1551. Altman, E. I. 1968. “Financial ratios, discriminant analysis and the prediction of corporate bankruptcy.” Journal of Finance 23(4), 589–609. Altman, E. I. 1983. Corporate financial distress: a complete guide to predicting, avoiding, and dealing with bankruptcy, Wiley, New York, NY. Altman, E. I. and R. O. Eisenbeis. 1978. “Financial applications of discriminant analysis: a clarification.” Journal of Financial and Quantitative Analysis 13(1), 185–195. Altman, E. I., R. G. Haldeman, and P. Narayanan. 1977. “Zeta analysis: a new model to identify bankruptcy risk of corporations.” Journal of Banking and Finance 1, 29–54. Amemiya, T. 1981. “Qualitative response models: a survey.” Journal of Economic Literature 19(4), 1483–1536. Balcaena, S. and H. Ooghe. 2006. “35 years of studies on business failure: an overview of the classic statistical methodologies and their related problems.” British Accounting Review 38, 63–93. BarNiv, R. and J. Hathorn. 1997. “The merger or insolvency alternative in the insurance industry.” Journal of Risk and Insurance 64(1), 89–113. Beaver, W. 1966. “Financial ratios as predictors of failure.” Journal of Accounting Research 4(Suppl.), 71–111. Bharath, S. and T. Shumway. 2008. “Forecasting default with the Merton distance to default model.” Review of Financial Studies 21(3), 1339–1369. Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81(3), 637–654. Blum, M. 1974. “Failing company discriminant analysis.” Journal of Accounting Research 12(1), 1–25. Brealey, R. A., S. C. Myers, and A. J. Marcus. 2001. Fundamentals of corporate finance, 3rd Edition, McGraw-Hill, New York, NY. Bulow, J. and J. Shoven. 1978. “The bankruptcy decision.” Bell Journal of Economics 9(2), 437–456.
Cosslett, S. R. 1981. "Efficient estimation of discrete-choice models," in Structural analysis of discrete data with econometric applications, C. F. Manski and D. McFadden (Eds.). MIT Press, London, pp. 51–111. Counsell, G. 1989. "Focus on workings of insolvency act." The Independent, 4th April. Cramer, J. S. 1991. The logit model: an introduction for economists, Edward Arnold, London. Crosbie, P. and J. Bohn. 2002. Modeling default risk, KMV Corporation, San Francisco, CA. Dahiya, S. and L. Klapper. 2007. "Who survives? A cross-country comparison." Journal of Financial Stability 3, 261–278. Deakin, E. B. 1976. "Distributions of financial accounting ratios: some empirical evidence." Accounting Review 51(1), 90–96. Duffie, D., L. Saita, and K. Wang. 2007. "Multi-period corporate default prediction with stochastic covariates." Journal of Financial Economics 83, 635–665. Eisenbeis, R. A. 1977. "Pitfalls in the application of discriminant analysis in business, finance, and economics." Journal of Finance 32(3), 875–900. Eisenbeis, R. A. and R. B. Avery. 1972. Discriminant analysis and classification procedure: theory and applications, D.C. Heath, Lexington, MA. Ezzamel, M., C. Mar-Molinero, and A. Beecher. 1987. "On the distributional properties of financial ratios." Journal of Business Finance & Accounting 14(4), 463–481. Fitzpatrick, P. J. 1932. "A comparison of ratios of successful industrial enterprises with those of failed firms." Certified Public Accountant, October, November, and December, pp. 598–605, 656–662, and 727–731, respectively. Foster, G. 1986. Financial statement analysis, 2nd Edition, Prentice-Hall, Englewood Cliffs, NJ. Frecka, T. J. and W. S. Hopwood. 1983. "The effects of outliers on the cross-sectional distributional properties of financial ratios." Accounting Review 58(1), 115–128. Greene, W. H. 1997. Econometric analysis, 3rd Edition, Prentice-Hall, Englewood Cliffs, NJ. Hamer, M. M. 1983. "Failure prediction: sensitivity of classification accuracy to alternative statistical methods and variable sets." Journal of Accounting and Public Policy 2(4), 289–307. Helwege, J. 1996. "Determinants of saving and loan failures: estimates of a time-varying proportional hazard function." Journal of Financial Services Research 10, 373–392. Homan, M. 1989. A study of administrations under the insolvency act 1986: the results of administration orders made in 1987, ICAEW, London. Hudson, J. 1987. "The age, regional and industrial structure of company liquidations." Journal of Business Finance & Accounting 14(2), 199–213. Joy, M. O. and J. O. Tollefson. 1975. "On the financial applications of discriminant analysis." Journal of Financial and Quantitative Analysis 10(5), 723–739. Judge, G. G., W. E. Griffiths, R. Carter Hill, H. Lutkepohl, and T. C. Lee. 1985. The theory and practice of econometrics, Wiley, New York, NY. Karels, G. V. and A. J. Prakash. 1987. "Multivariate normality and forecasting of business bankruptcy." Journal of Business Finance & Accounting 14(4), 573–595. Keasey, K. and R. Watson. 1987. "Non-financial symptoms and the prediction of small company failure: a test of the Argenti's hypothesis." Journal of Business Finance & Accounting 14(3), 335–354. Keasey, K. and R. Watson. 1991. "Financial distress prediction models: a review of their usefulness." British Journal of Management 2(2), 89–102. Kennedy, P. 1991. "Comparing classification techniques." International Journal of Forecasting 7(3), 403–406.
Kennedy, P. 1992. A guide to econometrics, 3rd Edition, Blackwell, Oxford. Kshirsagar, A. M. 1971. Advanced theory of multivariate analysis, Dekker, New York, NY. Kuo, H. C., S. Wu, L. Wang, and M. Chang. 2002. "Contingent fuzzy approach for the development of banks' credit-granting evaluation model." International Journal of Business 7(2), 53–65. Lev, B. and S. Sunder. 1979. "Methodological issues in the use of financial ratios." Journal of Accounting and Economics 1(3), 187–210. Lin, L. and J. Piesse. 2004. "The identification of corporate distress in UK industrials: a conditional probability analysis approach." Applied Financial Economics 14, 73–82. Lo, A. W. 1986. "Logit versus discriminant analysis: a specification test and application to corporate bankruptcies." Journal of Econometrics 31(3), 151–178. Maddala, G. S. 1983. Limited-dependent and qualitative variables in econometrics, Cambridge University Press, Cambridge. Manski, C. F. and D. L. McFadden. 1981. Structural analysis of discrete data and econometric applications, MIT Press, Cambridge, MA. Marks, S. and O. J. Dunn. 1974. "Discriminant functions when covariance matrices are unequal." Journal of the American Statistical Association 69(346), 555–559. Martin, D. 1977. "Early warning of bank failure: a logit regression approach." Journal of Banking and Finance 1(3), 249–276. McDonald, B. and M. H. Morris. 1984. "The functional specification of financial ratios: an empirical examination." Accounting and Business Research 15(59), 223–228. McLeay, S. 1986. "Students and the distribution of financial ratios." Journal of Business Finance & Accounting 13(2), 209–222. Mensah, Y. 1983. "The differential bankruptcy predictive ability of specific price level adjustments: some empirical evidence." Accounting Review 58(2), 228–246. Merton, R. C. 1974. "On the pricing of corporate debt: the risk structure of interest rates." Journal of Finance 29(2), 449–470. Ohlson, J. A. 1980. "Financial ratios and the probabilistic prediction of bankruptcy." Journal of Accounting Research 18(1), 109–131. Palepu, K. G. 1986. "Predicting takeover targets: a methodological and empirical analysis." Journal of Accounting and Economics 8(1), 3–35. Pastena, V. and W. Ruland. 1986. "The merger bankruptcy alternative." Accounting Review 61(2), 288–301. Peel, M. J. and D. A. Peel. 1987. "Some further empirical evidence on predicting private company failure." Accounting and Business Research 18(69), 57–66. Rees, B. 1990. Financial analysis, Prentice-Hall, London. Ross, S. A., R. W. Westerfield, and J. F. Jaffe. 2002. Corporate finance, 6th Edition, McGraw-Hill, New York, NY. Shailer, G. 1989. "The predictability of small enterprise failures: evidence and issues." International Small Business Journal 7(4), 54–58. Shumway, T. 2001. "Forecasting bankruptcy more accurately: a simple hazard model." Journal of Business 74(1), 101–124. Storey, D. J., K. Keasey, R. Watson, and P. Wynarczyk. 1987. The performance of small firms, Croom Helm, Bromley. Sun, L. 2007. "A re-evaluation of auditors' opinions versus statistical models in bankruptcy prediction." Review of Quantitative Finance and Accounting 28, 55–78. Theodossiou, P. 1993. "Predicting shifts in the mean of a multivariate time series process: an application in predicting business failures." Journal of the American Statistical Association 88, 441–449. Walker, I. E. 1992. Buying a company in trouble: a practical guide, Gower, Hants. Whalen, G. 1991.
“A proportional hazard model of bank failure: an examination of its usefulness as an early warning tool.” Economic Review, First Quarter, 20–31.
Whittington, G. 1980. "Some basic properties of accounting ratios." Journal of Business Finance & Accounting 7(2), 219–232. Wood, D. and J. Piesse. 1988. "The information value of failure predictions in credit assessment." Journal of Banking and Finance 12, 275–292. Zavgren, C. V. 1985. "Assessing the vulnerability to failure of American industrial firms: a logistic analysis." Journal of Business Finance & Accounting 12(1), 19–45.
Zavgren, C. V. 1988. "The association between probabilities of bankruptcy and market responses – a test of market anticipation." Journal of Business Finance & Accounting 15(1), 27–45. Zmijewski, M. E. 1984. "Methodological issues related to the estimation of financial distress prediction models." Journal of Accounting Research 22(Suppl.), 59–82.
Chapter 108
Genetic Programming for Option Pricing N.K. Chidambaran
Abstract This chapter describes the Genetic Programming methodology and illustrates its application to the pricing of options. I describe the critical elements of a Genetic Program: population size, the complexity of the individual formulas in a population, and the fitness and selection criteria. As an example, I implement the Genetic Programming methodology for developing an option pricing model. Using Monte Carlo simulations, I generate a data set of stock prices that follow a Geometric Brownian motion and use the Black–Scholes model to price options off the simulated prices. The Black–Scholes model is a known solution and serves as the benchmark for measuring the accuracy of the Genetic Program. The Genetic Program developed for pricing options captures well the relationship between option prices, the terms of the option contract, and the properties of the underlying stock price.

Keywords Genetic programming · Option pricing · Non-parametric · Monte Carlo simulation
108.1 Introduction

Genetic Programming is a powerful computational tool that has seen widespread applications, from the determination of gene sequencing to the search for technical trading rules in financial futures markets. This chapter describes Genetic Programming and illustrates its application to developing an option pricing model. Genetic Programming is a non-parametric, data-driven methodology that requires minimal assumptions to implement and easily adapts to changing and uncertain economic environments.
N.K. Chidambaran () Graduate School of Business, Fordham University, 1790 Broadway, New York, NY 10023, USA e-mail: [email protected]
Theoretical option pricing models based on risk-neutral pricing theory, such as the seminal Black–Scholes model, rely on strict assumptions that do not hold in the real world. The Black and Scholes (1973) model has been shown, for example, to exhibit systematic biases relative to observed option prices (Rubinstein 1985; Macbeth and Merville 1979; Macbeth and Merville 1980), and researchers have attempted to explain these systematic biases as artifacts of its assumptions. The most often challenged assumption is the normality of stock returns. Merton (1976) and Ball and Torous (1985) propose Poisson jump-diffusion return processes. French et al. (1987) and Ballie and DeGennaro (1990) advocate GARCH (Bollerslev 1986) processes. While closed-form solutions for the option price cannot be obtained for all these models, pricing formulas can be obtained numerically. The difficulty in finding an analytical closed-form parametric solution has led to non-parametric approaches. Rubinstein (1997) suggests that we examine option data for the implied binomial tree to be used for pricing options. Chidambaran and Figlewski (1995) use a quasi-analytic approximation based on Monte Carlo simulation. Hutchinson et al. (1994) build a numerical pricing model using neural networks. Chidambaran et al. (1999) propose Genetic Programming to develop an adaptive, evolutionary model of option pricing that is also data driven and non-parametric. They show that this method offers some advantages over learning networks. In particular, it can operate on small data sets, circumventing the large data requirement of the neural network approach noted by Hutchinson et al. (1994). The philosophy underlying Genetic Programming is to replicate the process by which genetic traits evolve in offspring in the biological world, through a random combination of the genes of the parents. A randomly generated set of equations in the option contract terms and the basic statistical properties of the underlying stock price will contain some elements that ultimately make up the true option pricing formula. By selectively breeding the equations, these elements will presumably be passed on to future generations of equations that can price options more accurately. The essence of the method is the selection of equation components; that is, genetic traits, that parents pass on to the next
generation. Since it is impossible to determine ex ante which element is the best, the focus is on choosing parents that seem to be the fittest. The genes to be propagated to the next generation are thus selected on the basis of the pricing errors of the equations. An important advantage of the Genetic Programming approach over other numerical techniques is its ability to incorporate a known approximate solution as a starting point for the approximation. That is, the algorithm can seed the initial population of equations with a particular equation or individual. This has two effects on the efficiency of the program. One, it starts with an individual member of the population that gives a good fit to the data. Two, the elements of this equation add to the "gene pool" used in evolving future generations. When developing the option pricing model using option prices, for example, we can start with the Black–Scholes model in the initial gene pool. The Genetic Program should quickly converge to the included Black–Scholes model when the data are consistent with the assumptions of the Black–Scholes model. When the data are from an environment that violates the Black–Scholes assumptions, the included Black–Scholes model serves as a useful starting point. The program can converge to the true pricing model more efficiently as it begins the search from a locally optimum solution. There are many factors that determine the efficiency of Genetic Programming. Important specifications include the size of the population, the method of selecting equations with their embedded "genetic traits" to serve as parents, the number of mutations that are allowed, and the size of the data set used for training the program. I examine these parameters and explore the efficiency of the Genetic Program. The chapter proceeds as follows. In Sect. 108.2, I discuss the elements of a Genetic Program and highlight its advantages over other non-parametric methods. In Sect. 108.3, I assess the ability of Genetic Programming to learn the Black–Scholes model, given data that are simulated according to the assumptions of the Black–Scholes world. In Sect. 108.4, I discuss extensions of the Genetic Program to a non-Black–Scholes world and show how Genetic Programming can adapt the Black–Scholes model to its specifications. In Sect. 108.5, I conclude.
108.2 Genetic Program Elements

Genetic Programming is a technique that applies the Darwinian theory of evolution to develop efficient computer programs. In this section I describe the mechanics of the approach and the various ways to improve its efficiency.

108.2.1 Basic Approach

Genetic Programming is an offshoot of Genetic Algorithms. Genetic Algorithms have been used to successfully develop technical trading rules by Allen and Karjalainen (1999) for the S&P 500 Index and by Neely et al. (1997) for foreign exchange markets. Genetic Programming has also been used in heterogeneous multi-agent economies by Marimon et al. (1990), in multi-agent financial markets by Lettau (1997), and in multi-agent games by Ho (1996). I use a variant of Genetic Programming called Genetic Regression, where the desired program is a function that relates a set of inputs, such as the share price and the option exercise price, to one output, the option price. The set of data on which the program operates to determine the relationship between the input parameters and the option price is called the training set. The set of data on which the resulting formula is tested is called the test set. The procedure of the basic approach is as follows.

Individuals: Given a problem to be solved and a training set of matched inputs and outputs, an initial set of possible formulas is randomly generated. These formulas are functions of some or all of the independent variables and randomly generated constants. The programmer specifies the allowable operations on the variables and constants, for example, mathematical operations and the cumulative probability function for a standard normal distribution. Each formula is an individual, and the elements of the randomly generated formulas are the genes of the individual. A formula is represented as a binary tree. The nodes of a tree represent either a value or an operation. If the node represents an operation, then the operation is executed from the left branch to the right branch of the tree, as shown in Fig. 108.1. All formulas, even complex nonlinear formulas, can be specified in terms of these binary trees. Figure 108.2 shows the binary tree representation of the Black–Scholes formula.

Fig. 108.1 A binary tree representation of the function ln(S) X C T

Fig. 108.2 A binary tree representation of the Black–Scholes Model
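To make the binary tree representation concrete, here is a minimal sketch of how such formula trees can be encoded and evaluated. The node layout, variable names, and the zero-argument guard on the protected log are illustrative assumptions, not the chapter's actual implementation.

```python
import math

class Node:
    """A formula tree node: an operator with children, or a leaf."""
    def __init__(self, value, *children):
        self.value = value
        self.children = children

    def eval(self, env):
        if not self.children:             # leaf: variable name or constant
            return env[self.value] if isinstance(self.value, str) else self.value
        args = [c.eval(env) for c in self.children]
        if self.value == "+": return args[0] + args[1]
        if self.value == "-": return args[0] - args[1]
        if self.value == "*": return args[0] * args[1]
        if self.value == "%":             # protected division (Table 108.1)
            return args[0] / args[1] if args[1] != 0 else 1.0
        if self.value == "plog":          # protected natural log, ln(|x|);
            return math.log(abs(args[0])) if args[0] != 0 else 0.0
        raise ValueError(f"unknown operator {self.value}")

# Example: the tree for plog(S % X) + T, i.e., ln(S/X) + T.
tree = Node("+",
            Node("plog", Node("%", Node("S"), Node("X"))),
            Node("T"))
print(tree.eval({"S": 55.0, "X": 50.0, "T": 0.25}))   # ln(1.1) + 0.25
```

Any candidate pricing formula, including the Black–Scholes formula of Fig. 108.2, can be built from such nodes and evaluated against the training set.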
Population: A set of individuals is called the population.
In the initial step, all the individuals in the population are randomly generated. The larger the population, the larger the diversity of the gene pool, as a greater number of possible combinations is observed. The size of the population is usually kept constant throughout the programming run, and population size is a control variable for optimizing the modeling process. Fitness: Every individual in the population is evaluated to test whether it can accurately price the options in the training data set. A smaller mispricing on the training data set indicates a better fit. The program assigns a fitness measure that can be used to rank the performance of each individual formula in fitting the input data. The fitness measure and ranking are used to select the individual formulas that best fit the data. Selection: Based on the fitness measure, a subset of the population is selected to act as the parents for the next generation of the population of formulas. The logic behind the selection criterion is that the individual formula that best explains the data contains elements (the genes) that will ultimately constitute the elements of an option pricing formula. Procreation: Each pair of parents generates a pair of offspring. Components of the parent formulas are crossed to generate offspring formulas: a random point is selected in each parent tree, and the sub-trees below those random points are switched between the two parent formulas. This operation creates a new pair of individuals, the offspring. It is possible that no crossover is performed and the parents themselves are placed in the new population (clones).
The process of selection and crossover is repeated until the new generation is completely populated. Generations: Each successive population is referred to as a generation. Through the process of selection and procreation, each new generation has a set of individual formulas that are an overall better fit to the options data (the input data). Analogous to numerical methods, we have to set the maximum number of iteration steps that the Genetic Program is allowed to run, which is implemented by specifying a maximum limit on the number of generations. I usually use a large number, for example, 1,000 generations, but I find that the Genetic Program converges quickly, in about 100 generations, and improvements are few and far between in subsequent generations. Evolutionary pressure in the form of fitness-related selection combined with the crossover operation eventually produces populations of highly fit individuals. The program keeps track of the best-fit individual found in each generation step throughout this process. The researcher can control the accuracy of the Genetic Program by specifying the maximum permissible error in fitting the data. Once the Best Formula in a generation meets the specified error criterion, the Genetic Program has converged to a solution and the Best Formula in the final generation is the desired option pricing formula. Since Genetic Programming is a computational tool, it is common to run the program multiple times (i.e., multiple trials) and use the mean (median) of the several runs as the final result. The solution obtained via a Genetic Program is a formula, but it is up to the researcher to gather insight from the elements of the final formula obtained from a Genetic
Program. On the one hand, a Genetic Program can be viewed as a Black-Box that gives the value of an option given the inputs. On the other hand, the Genetic Program results in a program and it may be useful to examine the elements of the final equation obtained.
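The subtree crossover described above can be sketched in a few lines. Building on the Node class from the earlier sketch (an assumption of this illustration), the routine picks a random node in each parent and swaps the subtrees rooted at those points.

```python
import copy
import random

def all_nodes(tree):
    """Collect every node in the tree (preorder traversal)."""
    nodes = [tree]
    for child in tree.children:
        nodes.extend(all_nodes(child))
    return nodes

def crossover(parent_a, parent_b, rng=random):
    """Swap random subtrees between deep copies of two parent formulas."""
    child_a, child_b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    node_a = rng.choice(all_nodes(child_a))
    node_b = rng.choice(all_nodes(child_b))
    # Swapping the contents of the two chosen nodes swaps the subtrees
    # rooted at those points, leaving the parents themselves unchanged.
    node_a.value, node_b.value = node_b.value, node_a.value
    node_a.children, node_b.children = node_b.children, node_a.children
    return child_a, child_b
```

Cloning, the no-crossover case mentioned above, corresponds to returning the deep copies without the swap.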
108.2.2 Fitness and Selection Criteria

The method of selecting parents for the next generation can affect the efficiency of Genetic Programs. I examine six selection methods: Best, Fitness, Fitness-overselection, Random, Tournament with four individuals, and Tournament with seven individuals. These methods represent various attempts to preserve a degree of randomness in the evolutionary process. In the Best method, individuals are ranked in terms of their fitness, ascending in the order of magnitude of their errors; the individuals with the smallest errors are picked to serve as parents of the next generation. In the Fitness method, individuals are selected randomly with a probability that is proportional to their fitness. In the Fitness-overselection method, individuals are classified into two groups: Group 1 has the best-fit individuals and Group 2 has the remainder. Individuals are selected randomly with an 80% probability from Group 1 and a 20% probability from Group 2. In the Random method, the fitness of the individuals is completely ignored and parents are chosen at random from the existing population. Finally, in the Tournament method, n individuals are selected at random from the population and the best-fit individual among them is chosen to be a parent. I examine the Tournament method with n = 4 and n = 7.
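A hedged sketch of two of these schemes, fitness-proportional and tournament selection, operating on parallel lists of individuals and fitness values; the population representation is assumed for illustration.

```python
import random

def fitness_select(population, fitnesses, rng=random):
    """Fitness method: pick one parent with probability proportional
    to fitness (higher fitness corresponds to lower pricing error)."""
    total = sum(fitnesses)
    r = rng.uniform(0, total)
    acc = 0.0
    for individual, fit in zip(population, fitnesses):
        acc += fit
        if acc >= r:
            return individual
    return population[-1]

def tournament_select(population, fitnesses, n=7, rng=random):
    """Tournament method: draw n individuals at random and keep the
    fittest of the n (n = 4 and n = 7 are examined in the text)."""
    contenders = rng.sample(range(len(population)), n)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]
```

Larger tournament sizes apply stronger selection pressure while still leaving weaker individuals some chance of reproducing, which is the randomness the text emphasizes.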
108.2.3 Advantages of Genetic Programming

An important advantage of Genetic Programming is its capability of incorporating a known analytical approximation to the solution into the program. I experiment with including the Black–Scholes model as part of the initial gene pool for the algorithm. Since the method then begins with a known approximation, this increases the probability of finding the true pricing formula and reduces computing time. Genetic Programming also requires smaller training sets than Neural Networks, a popular alternative adaptive learning algorithm (see Hutchinson et al. 1994 and Koza 1992). Since most options, especially those that are deep-in-the-money and deep-out-of-the-money, are thinly traded, Genetic Programming is an ideal tool for option pricing.
The methodology can also be made robust to changing environmental conditions and can operate on data sets generated over a range of possible conditions. I make the population robust by stochastically changing the training sets in the middle of the evolution. Only individuals with the desirable characteristics that are well adapted to changing environments will survive. The problem of over-fitting, in particular, is easily resolved by this approach. Further, new formulas can evolve out of previously optimal solutions when the data set contains structural changes, rather than requiring retraining from scratch as in learning networks. Since Genetic Programs are self-learning and self-improving, they are an ideal tool for practitioners.
108.2.4 Convergence Characteristics of Genetic Algorithms and Programs

The implementation of Genetic Programming is effectively a search over the space of functions that can be constructed from a user-defined set of base variables and operations. This space of functions is generally infinite. However, Genetic Programming algorithms are aided by the fact that the search space can be limited and that the search is a parallel search. The complexity of the problem can be controlled by the maximum depth of the binary trees used to represent formulas. In my work on option pricing, I use a maximum depth of 17 for the binary trees, which mirrors the parameter commonly used to limit tree sizes (Koza 1992). Practically, I chose the maximum depth possible without running into excessive computer run times. Note that the Black–Scholes formula is represented by a tree of depth 12. A depth of 17, therefore, is large enough to accommodate complicated option pricing formulas and works in practice. The search space is, however, still very large and it is computationally inefficient to examine every possible tree. The implicit parallelism of Genetic Programming, however, ensures that the search is efficient. The central idea behind the parallelism of Genetic Programming is that each of the formula elements defines hyperplanes; that is, sub-regions of the search space. In the population of candidate formulas, all the elements are present, and the fitness of each formula is a function of how many of the elements of the true pricing formula are present in the individual being evaluated. All formulas that contain a particular element will have similar errors, and an evaluation of the formulas in the population is a parallel search for the hyperplanes containing the elements that make up the true option pricing model. For example, the Black–Scholes formula is

\[
C = S\, N(d_1) - X e^{-r\tau} N(d_2), \qquad (108.1)
\]
where

\[
d_1 = \frac{\ln(S/X) + (r + \sigma^2/2)\,\tau}{\sigma\sqrt{\tau}} \quad \text{and} \quad d_2 = d_1 - \sigma\sqrt{\tau}. \qquad (108.2)
\]

N(d_1) and N(d_2) are the cumulative standard normal values for d_1 and d_2, S is the current stock price, X is the exercise price, r is the risk-free rate, τ is the option time to maturity, and σ is the volatility of the underlying stock. We can treat the formula as the point at which the hyperplanes containing the terms S N(d_1) and −X e^{−rτ} N(d_2) intersect. Searching over a randomly generated set of formulas is, therefore, a parallel search over a set of hyperplanes. The true option pricing formula will consist of many different elements that form a set of hyperplanes, and this is called its schemata. The individual sub-regions formed by the hyperplanes are the schema. If an individual equation contains elements that represent a superior region of the search space, this will generally be reflected as better fitness for the equation, which increases the individual's chance to reproduce and pass on its schema to the next generation. When used to solve problems that involve a search for the sequence of elements that make up a gene, or any problem that involves a search for a sequence of numbers, Holland (1975) and Koza (1992) show that the schemata of the Genetic Algorithm search process are extremely efficient and the algorithm converges. In this chapter, I implicitly test whether such an approach also works when searching for a closed-form option pricing model.
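Since the Black–Scholes model of Equations (108.1) and (108.2) serves as the benchmark throughout this chapter, a direct implementation is useful for checking the Genetic Program's output. The formula itself is standard; only the example parameter values below are assumptions.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def black_scholes_call(s, x, r, sigma, tau):
    """Black-Scholes European call price, Equations (108.1)-(108.2)."""
    n = NormalDist().cdf
    d1 = (log(s / x) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return s * n(d1) - x * exp(-r * tau) * n(d2)

# Example with the chapter's simulation parameters: S0 = 50, r = 5%,
# sigma = 20%, an at-the-money strike, and three months to maturity.
print(round(black_scholes_call(50.0, 50.0, 0.05, 0.20, 0.25), 4))
```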
108.3 Black–Scholes Example

In this section, I present an example of how Genetic Programming can learn the Black–Scholes model, paralleling the studies by Hutchinson et al. (1994) and Chidambaran et al. (1999). Data to train the Genetic Program are generated through Monte Carlo simulation. For each data set, price paths of the underlying stock with initial value S_0 = 50 are simulated for 504 days (24 months × 21 days/month). Stock returns are assumed to follow a diffusion process dS(t)/S(t) = μ dt + σ dW(t) with annual continuously compounded expected return μ = 0.10, standard deviation σ = 0.20, and risk-free rate r = 0.05. The stock price at time t is calculated as

\[
S(t) = S(0)\, e^{\sum_{i=1}^{t} Z_i}; \quad t = 1, \ldots, 504. \qquad (108.3)
\]

I then generate a sample of call options for each stock price realization. CBOE rules (Hull 1993) were used to create call options with varying strikes and maturities for each day of the simulated price path. Option prices are derived for each simulated option using the Black–Scholes equation. I thus have a sample of simulated options data. I adopted many of the simplifications suggested by Hutchinson et al. (1994) in generating the data sample; for example, I hold the annual volatility σ and riskless rate r constant throughout. Figure 108.3 shows a stock price path generated by Geometric Brownian motion and Fig. 108.4 shows the distribution of associated option prices. Table 108.1 describes the specifications of the Genetic Programming model. I use the four basic mathematical operations, the log function, the exponential function, the square root function, and the cumulative Normal distribution. The basic division operation is protected against division by zero, and the log and square root functions are protected against negative arguments. The current stock price, option exercise price, option intrinsic value, and option time-to-maturity are input parameters. The functional representation of a formula is assumed to be 17 steps deep. I implement ten trial runs; that is, I generate ten different Genetic Programming option pricing formulas. Table 108.2 shows the genetic programming parameters used in each run.

Fig. 108.3 Sample Geometric Brownian stock price process using Monte Carlo simulation

Fig. 108.4 Distribution of option prices derived from the stock price path in Fig. 108.3
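A sketch of this data-generation step follows, simulating the daily GBM path of Equation (108.3) and pricing a call on each day with the black_scholes_call function from the previous sketch. The three-strike grid is a simplification of the CBOE listing rules used in the chapter.

```python
import numpy as np

rng = np.random.default_rng(3)

s0, mu, sigma, r = 50.0, 0.10, 0.20, 0.05
days, dt = 504, 1.0 / 252.0

# Daily log returns Z_i and the price path S(t) = S(0) * exp(sum of Z_i).
z = rng.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt), size=days)
path = s0 * np.exp(np.cumsum(z))

# Simplified option sample: on each day, price calls at a few strikes
# around the money, each with three months to expiration.
training_set = []
for s in path:
    for strike in (0.9 * s, s, 1.1 * s):
        c = black_scholes_call(s, strike, r, sigma, 0.25)
        training_set.append((s, strike, 0.25, c))
print(len(training_set), "option observations")
```

The resulting (inputs, price) pairs form the kind of training set on which the formula populations described below are evaluated.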
Table 108.1 Training variables and arithmetic operations

Name          Source                    Definition
S             Option contract           Stock price
X             Option contract           Exercise price
S/X           Part of Black–Scholes     Option moneyness
τ             Option contract           Time to maturity (years)
max(S−X)      Boundary condition        Option intrinsic value Max(S−X, 0)
+             Standard arithmetic       Addition
−             Standard arithmetic       Subtraction
*             Standard arithmetic       Multiplication
%             Standard arithmetic       Protected division: x%y = 1 if y = 0, x/y otherwise
Exp           Black–Scholes component   Exponent: exp(x) = e^x
Plog          Black–Scholes component   Protected natural log: plog(x) = ln(|x|)
Psqrt         Black–Scholes component   Protected square root: psqrt(x) = sqrt(|x|)
Ncdf          Black–Scholes component   Normal Cumulative Distribution Function
Table 108.2 Genetic programming parameters

Fitness criterion        Sum of absolute dollar errors and percentage errors
Population size          100–50,000
Sample size              5% – dynamically sampled
Number of generations    100–1,000
Mutations                10–50%
For each training set, the price path of a stock with starting value S_0 = 50 was simulated through twenty-four 21-day months as described earlier. Options were created according to CBOE rules and valued using the Black–Scholes formula. Each training set consisted of the daily values of these options. Formula populations were exposed to dynamically sampled subsets, about 5% of the entire sample. The data set is stochastically changed in the middle of the training run to prevent over-fitting. I find that evaluating the population formulas on such stochastic subsets of the data set results in reduced training times and better out-of-sample performance. Only robust formulas can survive the constantly changing environment and pass on their "traits" to the next generation. In my work I have experimented with population sizes from 100 to 50,000 individual formulas and varied the level of mutations between 10 and 50%. I have found that the larger the initial population (in biological terms, the larger the diversity of the gene pool), the greater the accuracy and efficiency of the genetic program. The criterion for selecting the surviving formulas is a linear combination of the absolute pricing errors and the percentage pricing errors. I found (Chidambaran 2003) that the formulas consistently made relatively small absolute errors when pricing out-of-the-money options and relatively large absolute errors when pricing in-the-money options. The pattern in the magnitudes of the percentage errors is just the opposite. A linear combination of these two error measurements leads to a more efficient selection rule. For example, if the true price of an option is $2.00 and one of the Genetic Programming formulas gives a price of $2.20, then the percentage error is small (10%) but the dollar error is $0.20, which is economically significant. On the other hand, if the true price is $0.10 and the formula gives a price of $0.07, the dollar error is small ($0.03) but the price is off by 30%. Our error measure is then 30 (10% + $0.20) in the first case and 33 (30% + $0.03) in the second case. In the classic Genetic Programming fashion, I define the fitness of a formula to be

\[
\text{fitness} = \frac{1}{1 + \sum_{i=1}^{\#\,\text{fitness cases}} \varepsilon_i}, \qquad (108.4)
\]
where ε_i is the training error for the ith case. This training error is defined as the sum of the percentage and dollar errors if the Black–Scholes value was greater than $0.01, and just the dollar error if the Black–Scholes value was less than $0.01. It should be noted that the restrictions on Genetic Programming are far fewer than those required for Neural Networks. Only the variables needed for pricing options have to be specified. I need not make assumptions on the
smoothness or complexity of the formulas beyond the maximum allowable depth (tree size) for representing a formula. I measure the performance of the Genetic Program on an out-of-sample two-dimensional grid of option maturities and strike prices. Each cell in the table is the average pricing error across ten different Genetic Programming formulas. My results indicate a pattern of performance of the Genetic Programming model consistent with Chidambaran et al. (1999). First, the dollar pricing errors are small for short-maturity options as opposed to long-maturity options; the percentage pricing errors show just the opposite pattern. Obviously this is because the magnitude of option prices varies substantially across option maturities. Second, the errors vary across option strikes. Once again this is because option prices are very small for out-of-the-money options and much higher for in-the-money options. The fitness criterion I use balances the two effects by minimizing a combination of the absolute and percentage pricing errors. I found that this allows me to better control the pricing errors for out-of-the-money and in-the-money options without adversely affecting the errors for at-the-money options. While all the parent selection methods yield similar qualitative results, they vary widely in efficiency. The results indicate that the Fitness-overselection method and the Tournament method (n = 7) provide the best results and that Genetic Programming gives a good numerical approximation to the Black–Scholes model.
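The combined error measure and the fitness of Equation (108.4) can be sketched as follows. The $0.01 threshold follows the text, while the unit convention (percentage points plus cents) is inferred from the worked example above, so both should be read as assumptions of this illustration.

```python
def training_error(model_price, true_price):
    """Per-case error: percentage points plus cents when the benchmark
    price exceeds $0.01, dollar error (in cents) alone otherwise."""
    dollar = abs(model_price - true_price)
    cents = 100.0 * dollar
    if true_price > 0.01:
        return 100.0 * dollar / true_price + cents
    return cents

def fitness(model_prices, true_prices):
    """Fitness of a formula, Equation (108.4): 1 / (1 + sum of errors)."""
    total = sum(training_error(m, t)
                for m, t in zip(model_prices, true_prices))
    return 1.0 / (1.0 + total)

# The worked example from the text: a $2.00 option priced at $2.20
# scores 30 (= 10 + 20); a $0.10 option priced at $0.07 scores 33.
print(training_error(2.20, 2.00))   # 30.0
print(training_error(0.07, 0.10))   # 33.0
```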
108.4 Extensions

An important advantage of the Genetic Programming approach over other numerical techniques is its ability to incorporate a known approximate solution as a starting point for the approximation. That is, we can seed the initial population of equations with a particular equation or individual. This has two effects on the efficiency of the program. One, I start with an individual member of the population that gives a good fit to the data. Two, the elements of this equation are added to the "gene pool" used in evolving future generations. This approach can reach the true pricing model more efficiently as it begins the search from a locally optimum solution. In the work above, using data simulated under a Geometric Brownian motion with Black–Scholes option prices, the Black–Scholes equation will exactly fit the data. I verify this by seeding the program with the Black–Scholes model and find that the program immediately identifies the Black–Scholes equation as the Best Formula. Chidambaran et al. (1999) extend the application of Genetic Programming and illustrate how the Genetic
Programming model can adapt to and outperform the Black–Scholes model in the jump-diffusion world described by Merton (1976). Since the closed-form solution for option prices in a jump-diffusion world is available, they can measure the pricing errors of the Genetic Programming model and the Black–Scholes model in such a world. They show that Genetic Programming performs very well, beating the Black–Scholes model and Neural Networks in a jump-diffusion world as well. The efficiency and accuracy of the routine improve even more when the initial population is seeded with the Black–Scholes model and the program operates to improve on the known elements that seem to work; that is, the elements of the Black–Scholes model.
108.5 Conclusion

In this chapter I have described a procedure for using Genetic Programming to develop an option pricing model using only data. The results, from a controlled simulation, suggest that Genetic Programming works well in practice. Genetic Programming has many advantages over other numerical techniques. First, it is a non-parametric, data-driven approach and requires minimal assumptions about the stock price process. Many researchers attribute the systematic biases in Black–Scholes prices to the assumption that stock prices follow a diffusion process and have developed alternatives to Black–Scholes by considering other stock price processes. However, no single model explains all of the Black–Scholes biases, and closed-form solutions are elusive (Rubinstein 1985). Genetic Programming uses options data and extracts the implied pricing equation without making specific assumptions about the stock price process. Second, the Genetic Programming method requires less data than other numerical techniques such as Neural Networks (Hutchinson et al. 1994). I show this in simulation studies (Chidambaran et al. 1999) that use smaller subsets of the data and by using both Genetic Programs and Neural Networks to price relatively thinly traded equity options. Indeed, in some cases the programs are run on as few as 50 data points. The time required to train and develop the Genetic Programming formulas is also relatively short. Third, Genetic Programming can incorporate known analytical approximations into the solution method. For example, we can use the Black–Scholes model as a parameter in the Genetic Program when developing extensions of the Black–Scholes model. The final solution can then be considered an adaptation of the Black–Scholes model to conditions that violate its underlying assumptions. The flexibility in adding terms to the parameter set used to develop the functional approximation can also
be used to examine whether factors beyond those used in this study, for example trading volume, skewness and kurtosis of returns, and inflation, are relevant to option pricing. Finally, since the Genetic Programming method is self-learning and self-improving, it is an ideal tool for practitioners. The self-learning and self-improving feature also makes the method robust to changing economic environments.
References

Allen, F. and R. Karjalainen. 1999. "Using genetic algorithms to find technical trading rules." Journal of Financial Economics 51(2), 245–271. Ball, C. A. and W. N. Torous. 1985. "On jumps in common stock prices and their impact on call option pricing." Journal of Finance 40(March), 155–173. Baillie, R. and R. DeGennaro. 1990. "Stock returns and volatility." Journal of Financial and Quantitative Analysis 25(June), 203–214. Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–654. Bollerslev, T. 1986. "Generalized autoregressive conditional heteroskedasticity." Journal of Econometrics 31(April), 307–327. Chidambaran, N. K. 2003. Genetic programming with Monte Carlo simulation for option pricing, Unpublished working paper, Rutgers University. Chidambaran, N. K. and S. Figlewski. 1995. "Streamlining Monte Carlo simulation with the quasi-analytic method: analysis of a path-dependent option strategy." Journal of Derivatives 3(2), 29–51. Chidambaran, N. K., C. H. Jevons Lee, and J. Trigueros. 1999. "An adaptive evolutionary approach to option pricing via genetic programming," in Computational Finance – Proceedings of the Sixth International Conference, Y. S. Abu-Mostafa, B. LeBaron, A. W. Lo, and A. S. Weigend (Eds.). MIT Press, Cambridge, MA.
French, K. R., G. W. Schwert, and R. F. Stambaugh. 1987. "Expected stock returns and volatility." Journal of Financial Economics 19(September), 3–29. Ho, T. H. 1996. "Finite automata play repeated prisoner's dilemma with information processing costs." Journal of Economic Dynamics and Control 20(January–March), 173–207. Holland, J. H. 1975. Adaptation in natural and artificial systems, The University of Michigan Press, Ann Arbor. Hull, J. 1993. Options, futures, and other derivative securities, 2nd Edition, Prentice-Hall, Englewood Cliffs, NJ. Hutchinson, J., A. Lo, and T. Poggio. 1994. "A nonparametric approach to the pricing and hedging of derivative securities via learning networks." Journal of Finance 49(June), 851–889. Koza, J. R. 1992. Genetic programming, MIT Press, Cambridge, MA. Lettau, M. 1997. "Explaining the facts with adaptive agents." Journal of Economic Dynamics and Control 21, 1117–1147. MacBeth, J. D. and L. J. Merville. 1979. "An empirical estimation of the Black-Scholes call option pricing model." Journal of Finance 34(December), 1173–1186. MacBeth, J. D. and L. J. Merville. 1980. "Tests of the Black-Scholes and Cox call option valuation models." Journal of Finance 35(May), 285–301. Marimon, R., E. McGrattan, and T. J. Sargent. 1990. "Money as a medium of exchange in an economy with artificially intelligent agents." Journal of Economic Dynamics and Control 14, 329–373. Merton, R. C. 1976. "Option pricing when underlying stock returns are discontinuous." Journal of Financial Economics 3(January–March), 125–144. Neely, C., P. Weller, and R. Dittmar. 1997. "Is technical analysis in the foreign exchange market profitable? A genetic programming approach." Journal of Financial and Quantitative Analysis 32(4), 405–426. Rubinstein, M. 1985. "Nonparametric tests of alternative option pricing models." Journal of Finance 40(June), 455–480. Rubinstein, M. 1994. "Implied binomial trees." Journal of Finance 49, 771–818.
Chapter 109
A Constant Elasticity of Variance (CEV) Family of Stock Price Distributions in Option Pricing, Review, and Integration

Ren-Raw Chen and Cheng-Few Lee*
Abstract One of the important issues in option pricing is to find a stock return distribution that allows the stock rate of return and its volatility to depend on each other. Cox's (Notes on option pricing I: constant elasticity of variance diffusions, unpublished draft, Stanford University, 1975) Constant Elasticity of Variance (CEV) diffusion generates a family of distributions for such a purpose. The main goal of this paper is to review and show the procedures by which such a process and its resulting option pricing formula are derived. First, we show how the density function of the CEV diffusion is identified, and we demonstrate the option formula using the Cox and Ross (Journal of Financial Economics 3, 145–166, 1976) methodology. Then, we transform the solution into a non-central chi-square distribution. Finally, a number of approximation formulas are provided.

Keywords Non-central chi-square · CEV diffusion · Lognormal process · Gamma function · Option pricing model
109.1 Introduction

Since the simultaneous arrival in 1973 of the world's first options exchange, the Chicago Board Options Exchange (CBOE), and the most powerful option pricing model to date, the Black–Scholes (BS) model, interest and research in the option area have grown explosively. People have found that options, so-called derivative (or redundant) securities, can serve as a perfect hedging tool. The BS model also provides a synthetic way of hedging if certain options are not traded. Furthermore, a number of corporate finance areas find option theories useful, either in pricing (e.g., convertible/callable bonds) or in capturing important characteristics of the problem (e.g., investment policies).
R.-R. Chen and C.-F. Lee (*) Rutgers University, New Brunswick, NJ, USA e-mail: [email protected]
The major drawback of the BS model is its assumption of constant volatility of the stock return distribution. Due to the dynamic nature of our economy, the volatility changes over time. This important property is not captured in the stock return assumptions of the BS model. To overcome this problem, people have come up with a number of suggestions. Hull and White (1987), Johnson and Shanno (1987), Scott (1987), and Wiggins (1987) allow the volatility to vary over time. This direct modeling of the randomness of the volatility gives flexibility to the stock return distribution but pays the high price of costly numerical calculations. The non-existence of a closed-form solution prevents their models from being popular. Geske's compound option model (1979) also allows the volatility to change over time. However, his model introduces randomness only when an option can be exercised early. If the option cannot expire prematurely, his model will not allow the volatility to change.¹ Cox (1975) proposes a constant elasticity of variance (CEV) diffusion that is complex enough to allow changing volatility and simple enough to generate a closed-form solution for options. The CEV diffusion not only has the flexibility to allow changing volatility but also preserves the property of non-negative values of the state variable, as does the log-normal diffusion used in the BS model. Beckers (1980), MacBeth and Merville (1980), Christie (1982), Emanuel and MacBeth (1982), Ang and Peterson (1984), Schaefer and Schwartz (1984), Choi and Longstaff (1985), Tucker et al. (1988), Jacklin and Gibbons (1989), and
This paper was previously published in the Journal of Financial Studies, 1 (1993), No. 1, pp. 29–51.
* Assistant Professor and Distinguished Professor and Chairperson, respectively. This paper is adapted from our previous paper titled "Distribution Family of Stock Prices for Option Pricing Models" presented at the 1991 Annual Conference for Accounting and Quantitative Finance in Buffalo, NY. We would like to thank Joe Ogden and Todd Patzel for their helpful comments.
¹ Bachelier (1900), a French mathematician, was believed to be the first person to use Brownian Motion to model stock prices dynamically. There are other static analyses of stock price distributions (e.g., Mandelbrot [1963] and Fama [1965]). For a detailed discussion of the distributions of stock returns, please refer to Fama (1976, Chapter 1).
Lauterbach and Schultz (1990) have successfully used CEV models in empirical studies. Due to its flexibility, the CEV diffusion can be extended to allow mean reversion, a necessary characteristic of interest rates. As a consequence, a modified CEV, the square root process, has led to one of the most used term structure models, by Cox et al. (1985). Their term structure model is better than the one from the log-normal diffusion used by Dothan (1978), which does not allow mean reversion, and the one from the Ornstein–Uhlenbeck diffusion used by Vasicek (1977), which permits negative interest rates. The main purpose of the paper is to review and show the procedure for deriving Cox's CEV models. First, the CEV diffusion is defined and all necessary derivations are shown. Then it is shown that the transition density of the CEV diffusion can be transformed into a non-central chi-square distribution. This paper is organized as follows: The next section shows the CEV diffusion and its transition density. Section 109.3 derives the resulting option pricing formula. The implementation of the non-central chi-square probabilities is introduced in Sect. 109.4. Finally, the last section summarizes the paper.
109.2 The CEV Diffusion and Its Transition Density

In this section, we show how the Constant Elasticity of Variance (CEV) model proposed by Cox (1975) can be used to determine the stock price distribution. A process, S(t), is said to follow a CEV diffusion if it satisfies the following stochastic differential equation:

$$dS = \mu S\,dt + \sigma S^{\beta/2}\,dW \tag{109.1}$$
where W(t) is a standard Wiener process, μ and σ are constants representing the instantaneous mean and standard deviation, and β (also constant) is known as the CEV parameter that determines the stock price distribution, which includes the normal and log-normal as special cases. It should be noted that the instantaneous volatility of the stock return, σS^(β/2−1), is a function of the stock price. The direction and size of the impact of the stock price on the volatility clearly depend on the magnitude of β. If β is less than two, the higher the stock price, the lower the volatility. On the other hand, if β is equal to two, the stock price has no influence on the volatility. In other words, the volatility will be a constant σ over time regardless of the stock price.
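The role of β is easy to visualize by simulation. The sketch below is our own illustration, not part of the original paper: a simple Euler–Maruyama discretization of Equation (109.1), with the path absorbed at zero (a property of the β < 2 case discussed below).

```python
import math, random

def simulate_cev_path(S0, mu, sigma, beta, T, n_steps, seed=None):
    """Euler-Maruyama simulation of dS = mu*S*dt + sigma*S**(beta/2)*dW."""
    rng = random.Random(seed)
    dt = T / n_steps
    S, path = S0, [S0]
    for _ in range(n_steps):
        if S <= 0.0:                 # zero is an absorbing barrier when beta < 2
            path.append(0.0)
            continue
        dW = rng.gauss(0.0, math.sqrt(dt))
        S = max(S + mu * S * dt + sigma * S ** (beta / 2) * dW, 0.0)
        path.append(S)
    return path
```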
Jacklin and Gibbons (1989) conduct studies on this CEV diffusion directly with stock prices. It has been well recognized that option markets are more efficient than stock markets. Furthermore, with the possibility of perfect hedging between stocks and options, given stock prices, option prices should be determined efficiently. Thus, focusing on option markets should provide less noisy results than focusing on stock markets. In the next section, we show the resulting option formulas.

It is clear that when β = 2, Equation (109.1) is nothing but the log-normal diffusion used by Black and Scholes. The transition density can be shown in the following theorem.

Theorem 1. The log-normal process, dS = μS dt + σS dW, has the following transition density function:

$$f(\ln S(T) \mid \ln S(t)) = N\!\left(\ln S(t) + \Big(\mu - \frac{\sigma^2}{2}\Big)(T-t),\; \sigma^2 (T-t)\right)$$

where N(m, V) is a normal density function with mean m and variance V.

Proof. Dividing the diffusion equation by S, we get:

$$\frac{dS}{S} = \mu\,dt + \sigma\,dW$$

Now both the drift and the diffusion coefficient are constants; therefore, dS/S is normally distributed. Define X = ln S. By Itô's Lemma,

$$dX = X_S\,dS + \frac{1}{2}X_{SS}(dS)^2 = \frac{1}{S}\left(\mu S\,dt + \sigma S\,dW\right) - \frac{1}{2}\frac{1}{S^2}\,\sigma^2 S^2\,dt = \Big(\mu - \frac{\sigma^2}{2}\Big)dt + \sigma\,dW$$

Since both μ − σ²/2 and σ are constants and dW is normally distributed with mean 0 and variance dt,

$$f(X(T) \mid X(t)) = N\!\left(X(t) + \Big(\mu - \frac{\sigma^2}{2}\Big)(T-t),\; \sigma^2 (T-t)\right)$$

Substituting back ln S completes the proof.

If β ≠ 2, the transition density is more complex. Cox has given the transition density for β < 2 in the following theorem:²

Theorem 2. If β in Equation (109.1) is strictly less than two, then the process has the following transition density:

$$f(S(T) \mid S(t)) = (2-\beta)\,k^{\frac{1}{2-\beta}} \left(x z^{1-2\beta}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right] \tag{109.2}$$
² Note that β < 2 is the usual case found empirically. Empirical evidence has suggested that the relationship between the volatility and the stock price is negative.
where

$$k = \frac{2\mu}{\sigma^2 (2-\beta)\left(e^{\mu(2-\beta)(T-t)} - 1\right)}, \qquad x = k\,S(t)^{2-\beta}\, e^{\mu(2-\beta)(T-t)}, \qquad z = k\,S(T)^{2-\beta}$$

and I_q(·) is the modified Bessel function of order q.

Proof. By the Chapman–Kolmogorov equation,³ we have:

$$f_{t\to T}(S) \equiv f(S(T)\mid S(t)) = \int_\Omega f_{t\to t+\Delta t}(S)\, f_{t+\Delta t\to T}(S)\, d\omega$$

where Ω is the set of all possible states (ω) at time t + Δt. Applying a Taylor series expansion to the first term of the density (from t to t + Δt) around t, we get:

$$f_{t\to T}(S) = \left[1 + 0 + o(\Delta t)\right]\int_\Omega f_{t+\Delta t\to T}(S)\, d\omega = \int_\Omega f_{t+\Delta t\to T}(S)\, d\omega$$

This is a single-dimensional expansion because S(t) is not stochastic. Next, we perform a two-dimensional expansion on the integrand: once on time (t) and twice on the underlying variable (S):

$$\begin{aligned}
f_{t\to T}(S) &= \int_\Omega \left[ f_{t\to T}(S) + \frac{\partial f_{t\to T}(S)}{\partial t}\Delta t + \frac{\partial f_{t\to T}(S)}{\partial S}\Delta S + \frac{\partial^2 f_{t\to T}(S)}{\partial S^2}\frac{(\Delta S)^2}{2} + o(\Delta t) \right] d\omega \\
&= \int_\Omega \left[ f_{t\to T}(S) + \frac{\partial f_{t\to T}(S)}{\partial t}\Delta t + \frac{\partial f_{t\to T}(S)}{\partial S}\left(\mu S\,\Delta t + \sigma S^{\beta/2}\,\Delta W\right) + \frac{\partial^2 f_{t\to T}(S)}{\partial S^2}\frac{\sigma^2 S^{\beta}}{2}\,\Delta t + o(\Delta t) \right] d\omega \\
&= f_{t\to T}(S) + \frac{\partial f_{t\to T}(S)}{\partial t}\Delta t + \frac{\partial f_{t\to T}(S)}{\partial S}\,\mu S\,\Delta t + \frac{\partial^2 f_{t\to T}(S)}{\partial S^2}\frac{\sigma^2 S^{\beta}}{2}\,\Delta t + o(\Delta t) + \frac{\partial f_{t\to T}(S)}{\partial S}\,\sigma S^{\beta/2}\int_\Omega \Delta W\, d\omega
\end{aligned}$$

The last integral vanishes because ΔW has mean zero. Dividing both sides by Δt and letting it go to zero, we have the partial differential equation that f has to satisfy:

$$0 = \frac{\partial f_{t\to T}(S)}{\partial t} + \mu S\,\frac{\partial f_{t\to T}(S)}{\partial S} + \frac{\sigma^2 S^{\beta}}{2}\,\frac{\partial^2 f_{t\to T}(S)}{\partial S^2} \tag{109.3}$$
Solving this PDE is not an easy task, but one can verify that Equation (109.2) satisfies it. A sketch of the verification is provided in Appendix 109A. Note that the proof of Theorem 1 cannot apply here because, no matter what transformation we make, we will never
have both the drift and the diffusion coefficient constant.⁴ Thus, we know that the transition density cannot be normal. The proof of Theorem 2 is a standard procedure to identify the transition density, if it exists.⁵ Equation (109.3) is called the forward equation of the density function f.⁶ It should be noted that Equation (109.3) does not have a unique solution unless relevant boundary conditions are specified. These conditions normally depend on how the stochastic process evolves. A full development of this process can be seen in Feller (1951). In Theorem 2, where β is less than 2, the process will hit zero, and once it hits zero it will stay there forever. Hence Theorem 2 only describes the part of the distribution where S is alive.

³ For example, see Breiman (1986, p. 199) for the Chapman–Kolmogorov equation.
⁴ For the transformation to be Gaussian, the drift term has to be linear in X (the transformed S) and the diffusion coefficient needs to be constant.
⁵ See Breiman (1986, pp. 205–207) for this standard procedure. There are other methods, but they all involve solving a partial differential equation.
⁶ See Karlin and Taylor (1981, p. 214) for details.
There is a nontrivial probability that S will be dead at some time s between t and T (i.e., S hits zero and stays there):

$$f(S(s) = 0 \mid S(t)) = G\!\left(\frac{1}{2-\beta},\, x\right) \quad \text{for some } s \in (t, T) \tag{109.4}$$

where

$$G(m, v) = \int_v^{\infty} \frac{u^{m-1}\, e^{-u}}{\Gamma(m)}\, du$$

and Γ(·) is the Gamma function. It is well understood that in the log-normal diffusion (β = 2), S will never hit zero. Hence, it is not difficult to conjecture that zero remains S's absorbing barrier until β = 2. When β > 2, the process will not hit zero and its transition density describes the complete behavior of S. Schroder (1989) shows that Theorem 2 is essentially a form of non-central chi-square density if we make an appropriate change of variable, as indicated in the following theorem.

Theorem 3. The density function in Theorem 2 is a non-central chi-square density if the changes of variables z = kS(T)^(2−β) and y = 2z are made.

Proof. The Jacobian of the transformation z = kS(T)^(2−β) is

$$|J| = \frac{S(T)^{\beta-1}}{(2-\beta)k}$$

Plugging the Jacobian into Equation (109.2), we obtain:

$$k^{\frac{\beta-1}{2-\beta}}\, S(T)^{\beta-1} \left(x z^{1-2\beta}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right]$$

Rewriting the middle term (xz^(1−2β))^(1/(4−2β)) and replacing S(T) with (z/k)^(1/(2−β)), the transformed density becomes:

$$\left(\frac{x}{z}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right]$$

Now make the second change of variable, y = 2z (with Jacobian |J| = 1/2). The density further becomes:

$$\frac{1}{2}\left(\frac{2x}{y}\right)^{\frac{1}{4-2\beta}} e^{-\frac{2x+y}{2}}\, I_{\frac{1}{2-\beta}}\!\left[\sqrt{2xy}\,\right] \quad \text{or} \quad \frac{1}{2}\left(\frac{y}{2x}\right)^{-\frac{1}{4-2\beta}} e^{-\frac{2x+y}{2}}\, I_{\frac{1}{2-\beta}}\!\left[\sqrt{2xy}\,\right]$$

Note that this expression is close to a non-central chi-square density. It differs from the non-central chi-square density in that the sign of the power of y/2x does not match that of the order of the Bessel function. Fortunately, if the order of the modified Bessel function is an integer, we have the following property of the Bessel function:

$$I_q(\cdot) = I_{-q}(\cdot),$$

which allows the above equation to be rewritten as:

$$\frac{1}{2}\left(\frac{y}{2x}\right)^{-\frac{1}{4-2\beta}} e^{-\frac{2x+y}{2}}\, I_{-\frac{1}{2-\beta}}\!\left[\sqrt{2xy}\,\right]$$

This is a non-central chi-square density function with 2 + 2/(β − 2) degrees of freedom and non-centrality 2x.⁷ We denote such a density function χ²(2 + 2/(β − 2), 2x). Its distribution function satisfies

$$\int_0^{\bar y} \frac{1}{2}\left(\frac{y}{2x}\right)^{-\frac{1}{4-2\beta}} e^{-\frac{2x+y}{2}}\, I_{\frac{1}{2-\beta}}\!\left[\sqrt{2xy}\,\right] dy = \chi^2\!\left(\bar y;\; 2 + \frac{2}{\beta-2},\; 2x\right)$$

If q is not an integer, the density will not be precisely a non-central chi-square density, but the distribution function can still be expressed as a non-central chi-square distribution. Comparing the integrand above with the original density function, one can see that the original density acts like⁸ a non-central chi-square density with 2x being the random variable, y contributing to the non-centrality, and 2 − 2/(β − 2) being the degrees of freedom. Thus, the same integral (without q becoming −q) is represented by another non-central chi-square distribution function:

$$\int_0^{\bar y} \frac{1}{2}\left(\frac{y}{2x}\right)^{-\frac{1}{4-2\beta}} e^{-\frac{2x+y}{2}}\, I_{\frac{1}{2-\beta}}\!\left[\sqrt{2xy}\,\right] dy = 1 - \chi^2\!\left(2x;\; \frac{2}{2-\beta},\; \bar y\right)$$

It is easy to see that for 1/(2 − β) to be a non-negative integer, β must be one. If β = 1, Equation (109.1) reduces to the square root process, a process used in Cox and Ross (1976) and Cox et al. (1985). This is a convenient choice for expressing the option pricing formulas in terms of two non-central chi-square distributions.

Although the empirical evidence has shown that the relationship between the stock price and its return volatility is negative, in order to allow flexibility in testing this hypothesis, the transition density⁹ for β > 2 is given by Emanuel and MacBeth (1982) in a similar manner. With the following theorem derived by Emanuel and MacBeth, the CEV process [Equation (109.1)] is fully characterized.

⁷ See Johnson and Kotz (1970) or Equation (109.12) later in the paper for the non-central chi-square density formula.
⁸ We use the term "acts like" because the density is never a non-central chi-square one. It can be represented as a non-central chi-square only in the form of distribution functions.
⁹ Not in the form of a non-central chi-square.
Theorem 4. If β in Equation (109.1) is strictly greater than two, then the process has the following transition density:

$$f(S(T)\mid S(t)) = (\beta-2)\,k^{\frac{1}{2-\beta}}\left(xz^{1-2\beta}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{\beta-2}}\!\left[2\sqrt{xz}\,\right]$$

where k, x, z, and I_q(·) are as defined before. Furthermore, for all values of β the transformed density (in y = 2z) is exactly the non-central chi-square density with 2 + 2/(β − 2) degrees of freedom and non-centrality 2x:

$$\frac{1}{2}\left(\frac{y}{2x}\right)^{\frac{1}{2(\beta-2)}} e^{-\frac{2x+y}{2}}\, I_{\frac{1}{\beta-2}}\!\left[\sqrt{2xy}\,\right]$$

Proof. The procedure is precisely identical to that of Theorem 3, except that 1/(β − 2) need not be an integer.¹⁰

It can be seen that Theorem 4 provides a more direct link between the transition density function of the process [Equation (109.1)] and the non-central chi-square density. Thus, the distribution function is directly a non-central chi-square distribution function. Note that the PDE that this transition density must satisfy is the same one,¹¹ but with a different boundary condition: when β < 2, S = 0 is an absorbing barrier, while here it is inaccessible.
109.3 The CEV Option Pricing Models
The European option formula can be derived by taking a risk-neutralized expectation of the option's end payoff at the maturity date:

$$C(t) = \hat{E}_t\!\left[e^{-r(T-t)}\, C(T)\right] \tag{109.5}$$

where Ê_t[·] is the conditional expectation (conditional on S(t)) taken under the risk-neutralized process of the stock price. The risk-neutralized process is the same as the original process except that the drift parameter, μ, is replaced by the instantaneous riskless rate, r (see Cox and Ross [1976]). In their original work, Black and Scholes used an arbitrage-free argument to price C(t). If the same argument applies here, the PDE that the option has to satisfy is:

$$rC = \frac{\partial C}{\partial t} + rS\,\frac{\partial C}{\partial S} + \frac{\sigma^2 S^{\beta}}{2}\,\frac{\partial^2 C}{\partial S^2} \tag{109.6}$$

Solving this PDE with the boundary condition max{S − K, 0} leads to the call option formula, and with the condition max{K − S, 0} it leads to the put option formula. Note that this PDE is similar to the PDE that the density function has to satisfy. The differences are (1) this PDE has an extra term on the left-hand side, rC; and (2) this PDE is "risk-neutralized," so that μ is replaced by the risk-free rate, r. The first difference is caused by the fact that Equation (109.6) is not a forward equation but a backward equation for the Kac functional [see Karlin and Taylor (1981, p. 224)]. The second difference results from an arbitrage argument used by Black and Scholes (1973) and Cox et al. (1979).¹² Therefore, one can understand that the Cox–Ross method is no easier than the traditional Black–Scholes method, since both have to solve essentially the same PDE. However, the Cox–Ross method is convenient if one changes the payoff slightly. Since the density function is already calculated, it is just a matter of finding the expected value, while in the BS method one has to solve another similar but different PDE.

Given the density function, we can find the European call option formula by replacing C(T) in Equation (109.5) with max{S(T) − K, 0}. The put formula can be obtained either from a similar replacement (max{K − S(T), 0}) or by imposing put-call parity. Cox shows the following call option pricing model under the density described by Theorem 2.

Theorem 5.

$$C(t) = S(t)\sum_{n=0}^{\infty} \frac{e^{-x}\, x^{n}\, G\!\left(n+1+\frac{1}{2-\beta},\, kK^{2-\beta}\right)}{\Gamma(n+1)} \;-\; e^{-r(T-t)} K \sum_{n=0}^{\infty} \frac{e^{-x}\, x^{\,n+\frac{1}{2-\beta}}\, G\!\left(n+1,\, kK^{2-\beta}\right)}{\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)} \tag{109.7}$$

where, under the risk-neutralized process, μ is replaced by r in k and x.

Proof.

$$C(t) = \hat{E}_t\!\left[e^{-r(T-t)}\max\{S(T)-K,\, 0\}\right] = e^{-r(T-t)}\!\int_K^{\infty}\! S(T)\, f(S(T)\mid S(t))\, dS(T) \;-\; e^{-r(T-t)} K\!\int_K^{\infty}\! f(S(T)\mid S(t))\, dS(T)$$

¹⁰ Note that the Jacobian for the first change of variable in the proof of Theorem 3 is −dz/dS. This is because z is now an inverse function of S. Details can be seen in Emanuel and MacBeth (1982).
¹¹ It can easily be seen that it is the same forward equation that it has to satisfy.
¹² A formal discussion of this so-called "risk-neutralized" expectation starts from Harrison and Kreps (1979), where they introduced the equivalent martingale measure. They show that no arbitrage is equivalent to the existence of an equivalent martingale measure, which can be reached by using Girsanov's theorem. In the case of Black and Scholes, the stock return (risky) is transformed into the riskless rate.
Using Equation (109.2) and writing out the Bessel function explicitly, we can write the integrand of the second integral as follows:

$$(2-\beta)\, k^{\frac{1}{2-\beta}} \left(x z^{1-2\beta}\right)^{\frac{1}{4-2\beta}} e^{-x-z} \sum_{n=0}^{\infty} \frac{(xz)^{\,n+\frac{1}{4-2\beta}}}{\Gamma(n+1)\,\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)}$$

Making the change of variable z = kS(T)^(2−β) and rearranging terms, we obtain the following expression:

$$\sum_{n=0}^{\infty} \frac{e^{-x-z}\, z^{n}\, x^{\,n+\frac{1}{2-\beta}}}{\Gamma(n+1)\,\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)}$$

Integrating this quantity leads to:

$$\int_{kK^{2-\beta}}^{\infty} \sum_{n=0}^{\infty} \frac{e^{-x-z}\, z^{n}\, x^{\,n+\frac{1}{2-\beta}}}{\Gamma(n+1)\,\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)}\, dz$$

The final expression is obtained by interchanging the integral sign and the summation sign:

$$\sum_{n=0}^{\infty} \frac{e^{-x}\, x^{\,n+\frac{1}{2-\beta}}}{\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)} \cdot \frac{\int_{kK^{2-\beta}}^{\infty} e^{-z} z^{n}\, dz}{\Gamma(n+1)} = \sum_{n=0}^{\infty} \frac{e^{-x}\, x^{\,n+\frac{1}{2-\beta}}\, G\!\left(n+1,\, kK^{2-\beta}\right)}{\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)}$$

The first integral can be done in a similar manner. Its integrand is:

$$\sum_{n=0}^{\infty} \frac{e^{-r(T-t)}\, S(T)\, e^{-x-z}\, z^{n}\, x^{\,n+\frac{1}{2-\beta}}}{\Gamma(n+1)\,\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)} = \sum_{n=0}^{\infty} \frac{e^{-r(T-t)}\, e^{-x-z}\, z^{\,n+\frac{1}{2-\beta}}\, k^{-\frac{1}{2-\beta}}\, x^{\,n+\frac{1}{2-\beta}}}{\Gamma(n+1)\,\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)} = S(t)\sum_{n=0}^{\infty} \frac{e^{-x-z}\, z^{\,n+\frac{1}{2-\beta}}\, x^{n}}{\Gamma(n+1)\,\Gamma\!\left(n+1+\frac{1}{2-\beta}\right)}$$

where the last equality uses S(T) = (z/k)^(1/(2−β)) and x^(1/(2−β)) = k^(1/(2−β)) S(t) e^(r(T−t)) in the risk-neutralized world. Similarly, integrating this integrand and interchanging integration and summation leads to:

$$S(t)\sum_{n=0}^{\infty} \frac{e^{-x}\, x^{n}\, G\!\left(n+1+\frac{1}{2-\beta},\, kK^{2-\beta}\right)}{\Gamma(n+1)}$$

This completes the proof.
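For reference, Equation (109.7) is straightforward to evaluate numerically. The following minimal sketch is our own implementation, not the authors': it truncates the two series after a fixed number of terms and uses the regularized upper incomplete gamma function, which equals G(m, v) in the text; the log-weight computation avoids overflow for large x.

```python
import math
from scipy.special import gammaincc   # gammaincc(m, v) equals G(m, v) in the text

def cev_call_series(S0, K, tau, r, sigma, beta, n_terms=200):
    """Cox's series formula (109.7) for a European call under CEV with beta < 2."""
    q = 1.0 / (2.0 - beta)
    k = 2 * r / (sigma ** 2 * (2 - beta) * (math.exp(r * (2 - beta) * tau) - 1))
    x = k * S0 ** (2 - beta) * math.exp(r * (2 - beta) * tau)
    v = k * K ** (2 - beta)
    term1 = term2 = 0.0
    for n in range(n_terms):
        # weights computed in logs to avoid overflow of x**n / Gamma(n+1)
        w1 = math.exp(-x + n * math.log(x) - math.lgamma(n + 1))
        w2 = math.exp(-x + (n + q) * math.log(x) - math.lgamma(n + 1 + q))
        term1 += w1 * gammaincc(n + 1 + q, v)
        term2 += w2 * gammaincc(n + 1, v)
    return S0 * term1 - K * math.exp(-r * tau) * term2
```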
Since the Cox–Ross method can be applied and the transition density can be expressed as a non-central chi-square density, Schroder shows that Equation (109.7) can be simplified into a formula in terms of non-central chi-square distribution functions.

Theorem 6. Equation (109.7), with proper transformations, can be expressed in terms of non-central chi-square distributions, and the option pricing formula takes the same form as the Black–Scholes formula:

$$C(t) = S(t)\left[1 - \chi^2\!\left(2z;\; 2 + \frac{2}{2-\beta},\; 2x\right)\right] - e^{-r(T-t)} K\, \chi^2\!\left(2x;\; \frac{2}{2-\beta},\; 2z\right) \tag{109.8}$$

where x is as defined before, z is evaluated at the strike (z = kK^(2−β)), and χ²(ω; ν, λ) is the cumulative non-central chi-square distribution function with ω, ν, and λ being the upper limit of the integral, the degrees of freedom, and the non-centrality, respectively.
Proof. As in Theorem 5, we have the call option formula expressed in terms of two integrals:

$$C(t) = \hat{E}_t\!\left[e^{-r(T-t)}\max\{S(T)-K,\, 0\}\right] = \int_K^{\infty} e^{-r(T-t)} S(T)\, f(S(T)\mid S(t))\, dS(T) - e^{-r(T-t)} K \int_K^{\infty} f(S(T)\mid S(t))\, dS(T)$$

The integrand of the first integral can be simplified as follows:

$$e^{-r(T-t)} S(T)\,(2-\beta)\, k^{\frac{1}{2-\beta}}\left(xz^{1-2\beta}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right] = e^{-r(T-t)} S(T)\,(2-\beta)\, k^{\frac{1}{2-\beta}}\left(\frac{z}{x}\right)^{\frac{1}{4-2\beta}} x^{\frac{1}{2-\beta}}\, z^{-\frac{\beta}{2-\beta}}\, e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right]$$

Recall that z = kS(T)^(2−β) and x = kS(t)^(2−β) e^(r(2−β)(T−t)) in the risk-neutralized world, so that x^(1/(2−β)) = k^(1/(2−β)) S(t) e^(r(T−t)) and z^(−β/(2−β)) = k^(−β/(2−β)) S(T)^(−β). Thus, the expression simplifies as follows:

$$S(t)\,(2-\beta)\,k\,S(T)^{1-\beta}\left(\frac{z}{x}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right] = S(t)\left(\frac{z}{x}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right]\frac{\partial z}{\partial S(T)}$$

Plugging this into the first integral, and noting that S(T): K → ∞ corresponds to z: z_K → ∞ with z_K = kK^(2−β), we obtain:

$$S(t)\int_{z_K}^{\infty}\left(\frac{z}{x}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right] dz$$

Letting y = 2z, we arrive at the non-central chi-square distribution we desire:

$$S(t)\left[1 - \chi^2\!\left(2z_K;\; 2+\frac{2}{2-\beta},\; 2x\right)\right]$$

where z_K = kK^(2−β) is the z of the theorem statement. From Theorem 3, the second integral is clearly a non-central chi-square distribution after proper transformations. Thus, the second term is

$$e^{-r(T-t)} K\left[1 - \chi^2\!\left(2z_K;\; 2+\frac{2}{\beta-2},\; 2x\right)\right]$$

if 1/(2 − β) is an integer, or

$$e^{-r(T-t)} K\, \chi^2\!\left(2x;\; \frac{2}{2-\beta},\; 2z_K\right)$$
if 1/(2 − β) is not an integer. This completes the proof.

Similar to Theorems 5 and 6, in the case where β > 2, we can show the solutions in a similar manner. Thus, we simply state the results in the following theorem without proof.

Theorem 7. When β > 2, we may adopt the density in Theorem 4 and recast the call option formula as follows:

$$C(t) = S(t)\sum_{n=0}^{\infty} \frac{e^{-x}\, x^{\,n+\frac{1}{\beta-2}}\, G\!\left(n+1,\, kK^{2-\beta}\right)}{\Gamma\!\left(n+1+\frac{1}{\beta-2}\right)} - e^{-r(T-t)} K \sum_{n=0}^{\infty} \frac{e^{-x}\, x^{n}\, G\!\left(n+1+\frac{1}{\beta-2},\, kK^{2-\beta}\right)}{\Gamma(n+1)} \tag{109.9}$$

in its original form, and

$$C(t) = S(t)\left[1 - \chi^2\!\left(2x;\; \frac{2}{\beta-2},\; 2z\right)\right] - e^{-r(T-t)} K\, \chi^2\!\left(2z;\; 2+\frac{2}{\beta-2},\; 2x\right) \tag{109.10}$$

if expressed in terms of non-central chi-square forms.
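Both closed forms are easy to evaluate with a modern non-central chi-square routine. The sketch below is our own code, not the authors': it implements Equations (109.8) and (109.10), together with the β = 2 limit given next as Equation (109.11), using SciPy's `ncx2.cdf` (argument order: value, degrees of freedom, non-centrality); the degrees-of-freedom arguments follow Schroder's (1989) form of the formulas, and k, x, z are computed with the risk-neutral drift r.

```python
import math
from statistics import NormalDist
from scipy.stats import ncx2

def cev_call(S0, K, tau, r, sigma, beta):
    """European call under CEV via Equations (109.8), (109.10), and (109.11)."""
    if beta == 2.0:                           # log-normal limit: Black-Scholes
        N = NormalDist().cdf
        d = (math.log(S0 / K) + (r + sigma ** 2 / 2) * tau) / (sigma * math.sqrt(tau))
        return S0 * N(d) - K * math.exp(-r * tau) * N(d - sigma * math.sqrt(tau))
    k = 2 * r / (sigma ** 2 * (2 - beta) * (math.exp(r * (2 - beta) * tau) - 1))
    x = k * S0 ** (2 - beta) * math.exp(r * (2 - beta) * tau)
    z = k * K ** (2 - beta)
    if beta < 2:                              # Equation (109.8)
        return (S0 * (1 - ncx2.cdf(2 * z, 2 + 2 / (2 - beta), 2 * x))
                - K * math.exp(-r * tau) * ncx2.cdf(2 * x, 2 / (2 - beta), 2 * z))
    # Equation (109.10), beta > 2
    return (S0 * (1 - ncx2.cdf(2 * x, 2 / (beta - 2), 2 * z))
            - K * math.exp(-r * tau) * ncx2.cdf(2 * z, 2 + 2 / (beta - 2), 2 * x))
```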
When β = 2, the density described in Theorem 1 yields the well-known Black–Scholes formula:

$$C(t) = S(t)\, N(d) - e^{-r(T-t)} K\, N\!\left(d - \sigma\sqrt{T-t}\right) \tag{109.11}$$

where

$$d = \frac{\ln\!\left(\frac{S(t)}{K}\right) + \left(r + \frac{\sigma^2}{2}\right)(T-t)}{\sigma\sqrt{T-t}}$$

It should be noted that although the option formulas can be expressed much more simply in the form of non-central chi-square distributions, which are easier to interpret, this does not make the computations any easier. Unlike other well-known distributions, a non-central chi-square probability involves an infinite sum whose convergence, in some cases, is very slow. In the following section, we discuss how to compute such probabilities.
109.4 Computing the Non-Central Chi-Square Probabilities

Equations (109.8) and (109.10) involve non-central chi-square probabilities. Unlike other known central distributions, each non-central chi-square probability involves an infinite sum of Poisson-weighted central chi-square probabilities. Thus, to obtain an accurate estimate, one has to make sure that enough terms have been summed. It is found that with a large value of the degrees of freedom or the non-centrality, the number of terms to be summed up is huge. In this section, we present the potential problems involved in computing non-central chi-square probabilities and how to overcome them.

109.4.1 A Formula for the Non-Central Chi-Square Distribution: An Infinite Sum of Poisson-Weighted Central Chi-Squares

Let the U_j's be independent unit normal variables and the δ_j's be constants. Then the distribution of

$$\sum_{j=1}^{\nu} (U_j + \delta_j)^2$$

is non-central χ² with ν degrees of freedom and parameter λ = Σ_{j=1}^{ν} δ_j². It is clear that if λ = 0, which means all the δ_j's have to be zero, the distribution is a central χ² with ν degrees of freedom; we denote the non-central distribution χ²(ν, λ). From the definition, we know that a χ²(ν, λ) can be regarded as the sum of a series of (ν − 1 terms) central χ²'s with 1 degree of freedom and one non-central χ² with 1 degree of freedom and non-centrality λ. The sum of the χ²(1)'s is nothing but a χ²(ν − 1), and the last term is a χ²(1, λ), which is nothing but a squared N(√λ, 1). It will become clear later that this separation is helpful, since calculating a non-central chi-square's tail probabilities can be very difficult.

Johnson and Kotz (1970) have shown the density of a non-central chi-square variable to be:

$$f(y) = \frac{1}{2}\left(\frac{y}{\lambda}\right)^{\frac{\nu-2}{4}} e^{-\frac{\lambda+y}{2}}\, I_{\frac{\nu-2}{2}}\!\left(\sqrt{\lambda y}\right) \tag{109.12}$$

The tail probability can be expressed as a weighted sum of central chi-squares, with Poisson probabilities being the weights:

$$\chi^2(x;\, \nu, \lambda) = \sum_{j=0}^{\infty} \left[\frac{e^{-\frac{\lambda}{2}}\left(\frac{\lambda}{2}\right)^{j}}{j!}\right]\left[\int_0^{x} \frac{y^{\frac{\nu}{2}+j-1}\, e^{-\frac{y}{2}}}{2^{\frac{\nu}{2}+j}\, \Gamma\!\left(\frac{\nu}{2}+j\right)}\, dy\right] \tag{109.13}$$

The first two cumulants of this distribution are ν + λ and 2(ν + 2λ). It is clear that the first part is a Poisson probability with mean λ/2 and the second part is a central χ²(ν + 2j) probability. Therefore, the non-central χ² probability depends on two probabilities. From Johnson and Kotz, we also know that if we increase either the degrees of freedom or the non-centrality, the distribution shifts to the right. This is because when either parameter gets large, both the Poisson and the central χ² distributions shift to the right. As a consequence, we have to add a large number of terms to get a non-central χ² probability. Summing up a large number of terms is not a problem with modern computer technology. The problem lies in the fact that each probability to be summed is extremely small. When the degrees of freedom and the non-centrality get very big (which is the case for some options in our sample), even double precision is not able to detect such a probability. Therefore, although a non-central distribution looks like all the other well-known distributions, it is much harder to compute its tail probabilities, because we not only have to compute the integral but also the infinite sum. Unfortunately, for most stock options, the degrees of freedom and the non-centrality tend to be very large. In such a case, we have to go to the approximation methods described below.
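As a concrete illustration of Equation (109.13) and of the numerical difficulty just described, the sketch below (our own helper, not from the original paper) sums Poisson-weighted central chi-square probabilities directly; for large ν or λ the individual weights underflow double precision long before the sum converges, which is exactly the problem noted above.

```python
import math
from scipy.stats import chi2

def ncx2_cdf_series(x, nu, lam, tol=1e-12, max_terms=100_000):
    """Equation (109.13): Poisson-weighted sum of central chi-square CDFs."""
    if lam == 0.0:
        return chi2.cdf(x, nu)
    total, j, log_w = 0.0, 0, -lam / 2.0      # log of e^{-lam/2} (lam/2)^j / j!
    while j < max_terms:
        w = math.exp(log_w)                   # underflows to 0 for very large lam
        total += w * chi2.cdf(x, nu + 2 * j)
        if w < tol and j > lam / 2:           # weights past their peak and negligible
            break
        j += 1
        log_w += math.log(lam / 2.0) - math.log(j)
    return total
```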
109.4.2 Various Approximation Formulas

Johnson and Kotz suggest that a normal approximation by Sankaran (1963) is a good approximation when ν and λ are large. This approximation is also recommended by Schroder (1989). Sankaran suggests that

$$\left(\frac{y}{\nu+\lambda}\right)^{h} \tag{109.14}$$

is approximately normal with expected value

$$1 + h(h-1)\frac{\nu+2\lambda}{(\nu+\lambda)^2} - h(h-1)(2-h)(1-3h)\frac{(\nu+2\lambda)^2}{2(\nu+\lambda)^4}$$

and variance

$$h^2\,\frac{2(\nu+2\lambda)}{(\nu+\lambda)^2}\left[1 - (1-h)(1-3h)\frac{\nu+2\lambda}{(\nu+\lambda)^2}\right]$$

where h = 1 − (2/3)(ν + λ)(ν + 3λ)(ν + 2λ)^(−2). There are two other approximations in Abramowitz and Stegun (1970):

$$P(Y < y) = N(y^*) \tag{109.15}$$

where

$$y^* = \sqrt{\frac{2y}{1+b}} - \sqrt{\frac{2a}{1+b} - 1}, \qquad a = \nu + \lambda, \qquad b = \frac{\lambda}{\nu+\lambda};$$

and

$$P(Y < y) = N(y^*) \tag{109.16}$$

where y* = √(2y/(1+b)) − √(2a/(1+b)) − 1 and a and b are defined as above. Since, when either ν or λ approaches infinity, the standardized variable has a unit normal distribution, we also have:

$$P(Y < y) = N(y^*) \tag{109.17}$$

where

$$y^* = \frac{y - (\nu + \lambda)}{\sqrt{2(\nu + 2\lambda)}}$$

Theoretically, when ν and λ are not so large, it is not difficult to sum up terms from Equation (109.13). The FORTRAN IMSL mathematical library has a routine to compute such probabilities following precisely Equation (109.13). When ν and λ are large, a non-central χ² variable tends to be normally distributed, and therefore a normal approximation will be accurate.¹³

Stock options tend to have large ν and possibly λ. Recall that in Cox's model λ is either 2z or 2x, each of which is an increasing function of k, which, in turn, is a sensitive function of β. As β gets closer to 2, k grows tremendously. Even if β = 0, k carries a value of 2μ/(2σ²(e^(2μ(T−t)) − 1)), which is already quite large.¹⁴ This leads us to use the approximations rather than direct calculation based on Equation (109.13).

¹³ However, we find very little support that either method is reliable. The IMSL routine will occasionally give probabilities greater than 1, and the approximations do not provide close answers among themselves. One needs to be very careful about which approximation to use.
¹⁴ For interest rate options, both ν and λ are quite small. Hence, Equation (109.13) is plausible to use.
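A minimal sketch of the Sankaran approximation in Equation (109.14) and the simple standardization in Equation (109.17) follows; this is our own code, with `nu` and `lam` standing for ν and λ.

```python
import math
from statistics import NormalDist

def ncx2_cdf_sankaran(y, nu, lam):
    """Sankaran's (1963) normal approximation, Equation (109.14)."""
    h = 1 - (2.0 / 3.0) * (nu + lam) * (nu + 3 * lam) / (nu + 2 * lam) ** 2
    p = (nu + 2 * lam) / (nu + lam) ** 2
    mean = 1 + h * (h - 1) * p - h * (h - 1) * (2 - h) * (1 - 3 * h) * p ** 2 / 2
    var = 2 * h ** 2 * p * (1 - (1 - h) * (1 - 3 * h) * p)
    return NormalDist().cdf(((y / (nu + lam)) ** h - mean) / math.sqrt(var))

def ncx2_cdf_clt(y, nu, lam):
    """Limiting normal approximation, Equation (109.17)."""
    return NormalDist().cdf((y - (nu + lam)) / math.sqrt(2 * (nu + 2 * lam)))
```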
109.5 Conclusion

The CEV process can generate a whole class of stock price distributions by changing the β parameter. It is known that this class of processes does not permit negative values for the state variable; it allows the volatility to change over time; and it allows the volatility to be either positively or negatively related to the stock price. Furthermore, it can be extended to allow mean reversion without changing its basic properties.¹⁵ This paper fully reviews and integrates the CEV diffusion and its resulting option models. All necessary theorems are proved. It is hoped that the reader can use this paper as a starting point to understand stochastic processes, their transition densities, stock price and interest rate dynamics, and further be able to price contingent claims. This line of study has much room for future research. Empirical studies on the process itself have so far been very limited. Due to the nonlinear nature of the process and nonstationary data, a direct test of the distribution has not yet been done. Empirical studies on the resulting option formula are still very limited. In its theoretical aspects, the CEV class of processes is useful in modeling other financial variables that by definition cannot be negative, such as volatility, interest rates, and other macro/micro economic variables. Given the growing popularity of the contingent claim pricing methodology, a number of innovative securities traded over the counter (e.g., swaps) and corporate finance areas (e.g., capital budgeting) are perfect candidates for the CEV process.

¹⁵ The mean reversion feature has been explicitly incorporated in the square root process (β = 1) in studying the term structure of interest rates. See Cox et al. (1985) for details.
Appendix 109A

In this appendix, we show that the forward equation [Equation (109.3)],

$$0 = f_t + \mu S f_S + \frac{\sigma^2 S^{\beta}}{2} f_{SS},$$
will be satisfied by the density function [Equation (109.2)],

$$f(S(T)\mid S(t)) = (2-\beta)\, k^{\frac{1}{2-\beta}} \left(x z^{1-2\beta}\right)^{\frac{1}{4-2\beta}} e^{-x-z}\, I_{\frac{1}{2-\beta}}\!\left[2\sqrt{xz}\,\right].$$

In what follows, f_q denotes this function with the order of the Bessel function (and the corresponding power of x/z) replaced by q, so that f_(1/(2−β)) is the density above. We first derive the following necessary partial derivatives:

$$f_S = f_x\, x_S, \qquad f_{SS} = f_{xx}\,(x_S)^2 + f_x\, x_{SS}, \qquad f_t = f_x\, x_t + f_k\, k_t + f_z\, z_t$$

where

$$x_S = (2-\beta)\,k\,S(t)^{1-\beta}\, e^{\mu(2-\beta)(T-t)}$$

$$x_{SS} = (1-\beta)(2-\beta)\,k\,S(t)^{-\beta}\, e^{\mu(2-\beta)(T-t)}$$

$$x_t = -\mu(2-\beta)\,k\,S(t)^{2-\beta}\, e^{\mu(2-\beta)(T-t)} + k_t\, S(t)^{2-\beta}\, e^{\mu(2-\beta)(T-t)}$$

$$k_t = \frac{2\mu^2\, e^{\mu(2-\beta)(T-t)}}{\sigma^2\left(e^{\mu(2-\beta)(T-t)} - 1\right)^2}$$

$$z_t = S(T)^{2-\beta}\, k_t = \frac{2\mu^2\, S(T)^{2-\beta}\, e^{\mu(2-\beta)(T-t)}}{\sigma^2\left(e^{\mu(2-\beta)(T-t)} - 1\right)^2}$$

The detailed derivation of the above result can be obtained from the authors on request. The simplification depends on the recursive properties of the Bessel function:

$$\frac{\partial I_q[\omega]}{\partial \omega} = I_{q+1}[\omega] + \frac{q}{\omega}\, I_q[\omega], \qquad I_{q-1}[\omega] = I_{q+1}[\omega] + \frac{2q}{\omega}\, I_q[\omega]$$

We now plug the partial derivative results into the forward equation:

$$f_t + \mu S f_S + \frac{\sigma^2 S^{\beta}}{2} f_{SS} = 0 \quad\text{or}\quad f_x x_t + f_k k_t + f_z z_t + \mu S f_x x_S + \frac{\sigma^2 S^{\beta}}{2}\left(f_{xx}(x_S)^2 + f_x x_{SS}\right) = 0$$

Note that μS f_S will cancel the first term of x_t (multiplied by f_x), and the first two terms in (σ²S^β/2) f_SS (multiplied by (x_S)²) will cancel the second term of x_t (multiplied by f_x). For the remaining terms, we first have to express all density functions in terms of Bessel functions of order 1/(2−β) or 1/(2−β) − 1. This involves the use of the second recursive formula of the Bessel function:

$$f_{\frac{1}{2-\beta}+1} = \frac{x}{z}\, f_{\frac{1}{2-\beta}-1} - \frac{1}{(2-\beta)z}\, f_{\frac{1}{2-\beta}}$$

$$f_{\frac{1}{2-\beta}-2} = \frac{z}{x}\, f_{\frac{1}{2-\beta}} + \frac{\beta-1}{(2-\beta)x}\, f_{\frac{1}{2-\beta}-1}$$

After all the expressions have been made consistent in this manner, we can cancel terms directly and are left with:

$$f_{\frac{1}{2-\beta}}\; \frac{\mu\, e^{\mu(2-\beta)(T-t)}}{e^{\mu(2-\beta)(T-t)} - 1}\left[(2-\beta) - 1 - (1-\beta)\right] = 0$$

which holds identically since (2 − β) − 1 − (1 − β) = 0, and this completes the verification.
References

Abramowitz, M. and I. Stegun (Eds.). 1970. Handbook of mathematical functions, NBS Applied Mathematical Series, vol. 55, 9th Printing, National Bureau of Standards, Washington, D.C. Ang, J. and D. Peterson. 1984. "Empirical properties of the elasticity coefficient in the constant elasticity of variance model." Financial Review 19(4), 372–380. Bachelier, L. 1964. Theorie de la Speculation (1900), reprinted in English in the Random Character of Stock Market Prices, MIT, Cambridge. Beckers, S. 1980. "The constant elasticity of variance model and its implications for option pricing." Journal of Finance 35(June), 661–673. Black, F. and M. Scholes. 1973. "The pricing of options and corporate liabilities." Journal of Political Economy 81, 637–654. Breiman, L. 1986. Probability and stochastic processes, 2nd Edition, Scientific Press, CA. Choi, J. and F. Longstaff. 1985. "Pricing options on agricultural futures: an application of the constant elasticity of variance option pricing model." Journal of Futures Markets 5(2), 247–258. Cox, J. 1975. Notes on option pricing I: constant elasticity of variance diffusions, unpublished draft, Stanford University. Cox, J. and S. Ross. 1976. "The valuation of options for alternative stochastic processes." Journal of Financial Economics 3(2), 145–166. Cox, J., J. Ingersoll, and S. Ross. 1985. "A theory of the term structure of interest rates." Econometrica 53(March), 385–407. Cox, J., S. Ross, and M. Rubinstein. 1979. "Option pricing: a simplified approach." Journal of Financial Economics 7, 229–263. Dothan, L. 1978. "On the term structure of interest rates." Journal of Financial Economics 6(1), 59–69. Emanuel, D. and J. MacBeth. 1982. "Further results on the constant elasticity of variance call option pricing formula." Journal of Financial and Quantitative Analysis 17(November), 533–554. Fama, E. 1965. "The behavior of stock market prices." Journal of Business 38(January), 34–105. Fama, E. 1976. Foundations of finance, Basic Books, New York. Feller, W. 1951. "Two singular diffusion problems." Annals of Mathematics 54(1), 173–182. Geske, R. 1979. "The valuation of compound options." Journal of Financial Economics 7, 63–82. Harrison, J. and D. Kreps. 1979. "Martingales and arbitrage in multi-period securities markets." Journal of Economic Theory 20, 381–408. Hull, J. and A. White. 1987. "The pricing of options on assets with stochastic volatilities." The Journal of Finance 42(2), June, 281–300. Jacklin, C. and M. Gibbons. 1989. Constant elasticity of variance diffusion estimation, Working paper, Stanford University. Johnson, N. and S. Kotz. 1970. Distributions in statistics: continuous univariate distributions-2, Houghton Mifflin, Boston. Johnson, H. and D. Shanno. 1987. "Option pricing when the variance is changing." The Journal of Financial and Quantitative Analysis 22(2), June, 143–151. Karlin, S. and H. Taylor. 1981. A second course in stochastic processes, Academic, San Diego, CA. Lauterbach, B. and P. Schultz. 1990. "Pricing warrants: an empirical study of the Black-Scholes model and its alternatives." Journal of Finance 45(September), 1081–1120. MacBeth, J. and L. Merville. 1980. "Tests of the Black-Scholes and Cox call option valuation models." Journal of Finance 35(May), 285–301. Mandelbrot, B. 1963. "The variation of certain speculative prices." Journal of Business 36(October), 394–419. Sankaran, M. 1963. "Approximations to the non-central chi-square distribution." Biometrika 50(June), 199–204. Schaefer, S. and E. Schwartz. 1984. "A two-factor model of the term structure: an approximate analytical solution." Journal of Financial and Quantitative Analysis 19(4), 413–424. Schroder, M. 1989. "Computing the constant elasticity of variance option pricing formula." Journal of Finance 44(March), 211–220. Scott, L. O. 1987. "Option pricing when the variance changes randomly: theory, estimation, and an application." The Journal of Financial and Quantitative Analysis 22(4), Dec., 419–438. Tucker, A., D. Peterson, and E. Scott. 1988. "Tests of the Black-Scholes and constant elasticity of variance currency call option valuation models." Journal of Financial Research 11(3), 201–214. Vasicek, O. 1977. "An equilibrium characterization of the term structure." Journal of Financial Economics 5, 177–188. Wiggins, J. B. 1987. "Options values under stochastic volatility: theory and empirical estimates." Journal of Financial Economics 19, 351–372.
References
Abdel-khalik, A. R. and E. J. Lusk. 1974. “Transfer pricing – a synthesis.” The Accounting Review 49, 8–24. Abraham, J. M. and P. H. Hendershott. 1996. “Bubbles.” Journal of Housing Research 7(2), 191–208. Abramowitz, M. and I. Stegum. 1964. Handbook of mathematical functions, Dover Publications, New York. Abramowitz, M. and I. Stegun (Eds.). 1970. Handbook of mathematical functions, NBS Applied Mathematical Series, vol. 55, 9th Printing, National Bureau of Standards, Washington, D.C. Acharya, V. and J. Carpenter. 2002. “Corporate bond valuation and hedging with stochastic interest rates and endogenous bankruptcy.” Review of Financial Studies 15, 1355–1383. Acharya, V. V. and T. C. Johnson. 2007. “Insider trading in credit derivatives.” Journal of Financial Economics 84, 110–141. Acharya, V., J.-Z. Huang, M. Subrahmanyamt, and R. Sundaram. 2006. When does strategic debt-service matter? Economic Theory 29, 363–378. Ackert, L., N. Charupat, B. Church and R. Deaves. 2002. “Bubbles in experimental asset markets: irrational exuberance no more.” Working paper 2002–2024. Ackert, L., N. Charupat, R. Deaves and B. Kluger. 2006. “The origins of bubbles in laboratory asset markets.” Working paper 2006–6, Federal Reserve Bank of Atlanta. Acs, Z. J. and S. C. Isberg. 1996. “Capital structure, asset specificity and firm size: a transaction cost analysis,” in Behavior norms, technological progress, and economic dynamics: studies in Schumpeterian economics, E. Helmstadter and M. Pertman (Eds.). University of Michigan Press, Ann Arbor, Michigan. Adams, J. and C. J. Montesi. 1995. Major issues related to hedge accounting, Financial Accounting Standard Board, Newark, CT. Adenstedt, R. K. 1974. “On large-sample estimation for the mean of a stationary sequence.” Annals of Statistics 2, 1095–1107. Adler, M. and B. Dumas. 1983. “International portfolio choices and corporation finance: A synthesis.” Journal of Finance 38, 925–984. Admati, A. R. and P. Pfleiderer. 1988. “A theory of intraday patterns: volume and price variability.” Review of Financial Studies 1, 3–40. Afonso, A. 2003. “Understanding the determinants of sovereign debt ratings: evidence for the two leading agencies.” Journal of Economics and Finance 27(1), 56–74. Afonso, A., P. Gomes, P. Rother. 2007. What “hides” behind sovereign debt ratings? European Central Bank Working Paper Series 711. Agarwal, V. and R. Taffler. 2008. “Comparing the performance of market-based and accounting-based bankruptcy prediction models.” Journal of Banking and Finance 32, 1541–1551. Aggarwal, R. 2000. “Stabilization activities of underwriters after initial public offerings.” Journal of Finance 55, 1075–1103. Aggarwal, R. and P. Conroy. 2003. “Price discovery in initial public offerings.” Journal of Finance 55, 2903–2922.
Aggarwal, R., C. Inclan, and R. Leal. 1999. “Volatility in emerging stock markets.” Journal of Financial and Quantitative Analysis 34, 33–55. Aggarwal, R., K. Purnanandam, and G. Wu. 2006. “Underwriter manipulation in initial public offerings.” Working Paper, University of Minnesota. Agrawal, V., M. Kothare, R. K. S. Rao, and P. Wadhwa. 2004. “Bidask spreads, informed investors, and firm’s financial condition.” The Quarterly Review of Economics and Finance 44, 58–76. Aguilar O. and M. West. 2000. “Bayesian dynamic factor models and variance matrix discounting for portfolio allocation.” Journal of Business and Economic Statistics 18, 338–357. Aharony, J. and I. Swary. 1980. “Quarterly dividends and earnings announcements and stockholder returns: evidence from capital markets.” Journal of Finance 35, 1–12. Ahn, D. H. and B. Gao. 1999. “A parametric nonlinear model of term structure dynamics.” Review of Financial Studies 12, 721–762. Ahn, C and H. Thompson. 1988. “Jump diffusion processes and the term structure of interest rates.” Journal of Finance 43, 155–174. Ahn, D.-H., J. Boudoukh, M. Richardson, and R. F. Whitelaw. 1999. “Optimal risk management using options.” Journal of Finance 54(1), 359–375. Ahn, D.-H., R. F. Dittmar, and A. R. Gallant. 2002. “Quadratic term structure models: theory and evidence.” Review of Financial Studies 15(1), 243–288. Ahn, D.-H., R. F. Dittmar, A. R. Gallant, and B. Gao. 2003. “Purebred or hybrid?: Reproducing the volatility in term structure dynamics.” Journal of Econometrics 116, 147–180. Aigner, D. J. 1974. “MSE dominance of least squares with errors-ofobservation” Journal of Econometrics 2, 365–372. Aitken, M., F. Alex, M. S. McCorry and P. L. Swan. 1998. “Short sales are almost instantaneously bad news: evidence from the Australian stock exchange.” Journal of Finance 53, 2205–2223. Ait-Sahalia, Y. 1996. “Nonparametric pricing of interest rate derivative securities.” Econometrica 64(3), 527–560. Ait-Sahalia, Y. 1996. “Testing continuous-time models of the spot interest rate.” Review of Financial Studies 9, 385–426. Ait-Sahalia, Y. 1999. “Transition densities for interest rate and other nonlinear diffusions.” Journal of Finance 54, 1361–1395. Ait-Sahalia, Y. and A. W. Lo. 1998. “Nonparametric estimation of stateprice densities implicit in financial asset prices.” The Journal of Finance 53(2), 499–547. Aït-Sahalia, Y. and A.W. Lo. 2000. “Nonparametric risk management and implied risk aversion.” Journal of Econometrics 94, 9–51. Akgiray, V. 1989. “Conditional heteroskedasticity in time series of stock returns: evidence and forecasts.” Journal of Business 62, 55–80. Al-Eryani, M. F., P. Alam, and S. H. Akhter. 1990. “Transfer pricing determinants of U.S. multinationals.” Journal of International Business Studies 21, 409–425.
1628 Alexander, G. J. and J. C. Francis. 1986. Portfolio analysis, PrenticeHall, Englewood Cliffs, NJ. Alexander, G. J. and B. J. Resnick. 1985. “More on estimation risk and single rule for optimal portfolio selection.” Journal of Finance 40, 125–134. Alexe, S. 2002. “Datascope – A new tool for logical analysis of data.” DIMACS Mixer Series. DIMACS, Rutgers University. Alexe, G., S. Alexe, T. O. Bonates, and A. Kogan. 2007. “Logical analysis of data – The vision of Peter L. Hammer.” Annals of Mathematics and Artificial Intelligence 49, 265–312. Alizadeh, S., M. Brandt, and F. Diebold. 2002. “Range-based estimation of stochastic volatility models.” Journal of Finance 57, 1047–1091. Allayannis, G. and J. P. Weston. 2001. “The use of foreign currency derivatives and financial market value.” Review of Financial Studies 14, 243–276. Allen, F. and D. Gale. 2000. “Bubbles and crises.” Economic Journal 110, 236–255. Allen, F. and G. Gorton. 1988. “Rational finite bubbles,” in L. Rodney (Ed.). Working paper no. 41–88, White Center for Financial Research. Allen, F. and R. Karjalainen. 1999. “Using genetic algorithms to find technical trading rules.” Journal of Financial Economics 51(2), 245–271. Allen, L. and A. Saunders. 1993. “Forbearance and valuation of deposit insurance as a callable put.” Journal of Banking and Finance 17, 629–643. Altiok, T. and B. Melamed. 2001. “The case for modeling correlation in manufacturing systems.” IIE Transactions 33(9), 779–791. Altman, E. I. 1968 “Financial ratios, discriminate analysis and the prediction of corporate bankruptcy.” Journal of Finance 23(4), 589– 609. Altman, E. I. 1983. Corporate financial distress: a complete guide to predicting, avoiding, and dealing with bankruptcy, Wiley, New York, NY. Altman, E. I. 1984. “A further empirical investigation of the bankruptcy cost question.” Journal of Finance 39, 1067–1089. Altman, E. I. and R. O. Eisenbeis. 1978. “Financial applications of discriminant analysis: a clarification.” Journal of Financial and Quantitative Analysis 13(1), 185–195. Altman, E. and V. Kishore. 1996. “Almost everything you wanted to know about recoveries on defaulted bonds.” Financial Analysts Journal 52, 57–64. Altman, E. I., and H. A. Rijken. 2004. “How rating agencies achieve rating stability.” Journal of Banking and Finance 28, 2679–2714. Altman, E. I. and A. Saunders. 1998. “Credit risk measurement: developments over the last 20 years.” Journal of Banking & Finance 21(11–12) 1721–1742. Altman, E. I., R. G. Haldeman, and P. Narayanan. 1977. “Zeta analysis: a new model to identify bankruptcy risk of corporations.” Journal of Banking and Finance 1, 29–54. Ambarish, R., K. John, and J. Williams. 1987. “Efficient signaling with dividend and investments.” Journal of Finance 42, 321–344. Amemiya, T. 1980. “Selection of regressors.” International Economic Review 21, 331–354. Amemiya, T. 1981. “Qualitative response models: a survey.” Journal of Economic Literature 19(4), 1483–1536. Amihud, Y. 2002. “Illiquidity and stock returns: cross-section and timeseries effects.” Journal of Financial Markets 5, 31–56. Amihud, Y. and C.M. Hurvich. 2004. “Predictive regressions: a reduced bias estimation method”. Journal of Financial and Quantitative Analysis 39, 813–41. Amihud, Y. and B. Lev. 1981. “Risk reduction as a managerial motive for conglomerate mergers.” Bell Journal of Economics 12, 605–617. Amihud, Y. and H. Mendelson. 1980. “Dealership market: marketmaking with inventory.” Journal of Financial Economics 8, 31–53.
References Amihud, Y. and H. Mendelson. 1986. “Asset pricing and the bid-ask spread.” Journal of Financial Economics 17, 223–249. Amihud, Y, B. J. Christensen, and H. Mendelson. 1993. Further evidence on the risk-return relationship. Working paper, New York University. Amin, K. I. 1993. “Jump diffusion option valuation in discrete time.” Journal of Finance 48(5), 1833–1863. Amin, K. and R. Jarrow. 1992. “Pricing options on risky assets in a stochastic interest rate economy.” Mathematical Finance 2, 217–237. Amin, K. and V. Ng. 1993. “Option valuation with systematic stochastic volatility.” Journal of Finance 48, 881–910. Amin, K. I. and V. K. Ng. 1997. “Inferring future volatility from the information in implied volatility in eurodollar options: a new approach.” Review of Financial Studies 10(2), 333–367. Amram, M. and N. Kulatilaka. 2001. Real options, Oxford University Press, USA. Anas, A. and S. J. Eum. 1984. “Hedonic analysis of a housing market in disequilibrium.” Journal of Urban Economics 15 87–106. Anderson, T. W. 1984. An introduction to multivariate statistical analysis, 2nd Edition, Wiley, New York. Andersen, T. 1996. “Return volatility and trading volume: an information flow interpretation.” Journal of Finance 51, 169–204. Andersen, T. G. and L. Benzoni. 2006. Do bonds span volatility risk in the U.S. Treasury market? A specification test of affine term structure models, Working paper, Northwestern University. Andersen, T. G. and T. Bollerslev. 1997. “Intraday periodicity and volatility persistence in financial markets.” Journal of Empirical Finance 4(2–3), 115–157. Andersen, T. G. and T. Bollerslev. 1998. “Answering the skeptics: yes standard volatility models do provide accurate forecasts.” International Economic Review 39, 885–905. Andersen, L. and R. Brotherton-Ratcliffe. 2001. Extended LIBOR market models with stochastic volatility, Working paper, Gen Re Securities. Andersen, T. and J. Lund. 1997. “Estimating continuous time stochastic volatility models of the short term interest rate.” Journal of Econometrics 77, 343–377. Anderson, J., and R. Narasimhan. 1979. “Assessing project implementation risk: a methodological approach.” Management Science 25(6), 512–521. Anderson, T. W. and H. Rubin. 1949. “Estimation of the parameters of a single equation in a complete system of stochastic equations.” Annuals of Mathematical Statistics 20, 46–63. Andersen, L. and J. Sidenius. 2004. “Extensions to the Gaussian copula: random recovery and random factor loadings.” Journal of Credit Risk 1(1), 29–70. Anderson, R. and S. Sundaresan. 1996. “Design and valuation of debt contract.” Review of Financial Studies 9, 37–68. Anderson, R. and S. Sundaresan. 2000. “A comparative study of structural models of corporate bond yields: an exploratory investigation.” Journal of Banking and Finance 24, 255–269. Anderson, R. W., S. Sundaresan, and P. Tychon. 1996. Strategic analysis of contingent claims. European Economic Review 40, 871–881. Andersen, T. G., T. Bollerslev, F. X. Diebold, and H. Ebens. 2001a. “The distribution of stock return volatility.” Journal of Financial Economics 61, 43–76. Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys. 2001b. “The distribution of exchange rate volatility.” Journal of the American Statistical Association 96, 42–55. Andersen, T., T. Bollerslev, F. Diebold and P. Labys. 2003. “Modeling and forecasting realized volatility.” Econometrica 71, 579–625. Andersen, T., T. Bollerslev, P. Christoffersen, and F. Diebold. 2006a. 
“Volatility and correlation forecasting,” in Handbook of economic forecasting, G. Elliott, C. W. J. Granger, and A. Timmermann (Eds.). North-Holland, Amsterdam, pp. 778–878.
Andersen, T., T. Bollerslev, P. Christoffersen, and F. Diebold. 2006b. “Practical volatility and correlation modeling for financial market risk management,” in Risks of financial institutions, M. Carey and R. Stulz (Eds.). University of Chicago Press for NBER, Chicago.
Andersen, T. G., T. Bollerslev, and F. X. Diebold. 2007. “Roughing it up: including jump components in the measurement, modeling and forecasting of return volatility.” Review of Economics and Statistics 89, 701–720.
Andrews, D. W. K. 1993. “Tests for parameter instability and structural change with unknown change point.” Econometrica 61, 821–856.
Ang, A. and G. Bekaert. 2002. “Regime switches in interest rates.” Journal of Business and Economic Statistics 20(2), 163–182.
Ang, A. and J. Liu. 2004. “How to discount cashflows with time-varying expected returns.” Journal of Finance 59(6), 2745–2783.
Ang, J. and D. Peterson. 1984. “Empirical properties of the elasticity coefficient in the constant elasticity of variance model.” Financial Review 19(4), 372–380.
Ang, A. and M. Piazzesi. 2003. “A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables.” Journal of Monetary Economics 50, 745–787.
Ang, J. and T. Schwarz. 1985. “Risk aversion and information structure: an experimental study of price variability in the securities market.” Journal of Finance 40, 824–844.
Angbazo, L. 1997. “Commercial bank net interest margins, default risk, interest-rate risk, and off-balance-sheet banking.” Journal of Banking and Finance 21, 55–87.
Angel, J. 1997. “Tick size, share prices, and stock splits.” Journal of Finance 52, 655–681.
Angrist, J. D., G. W. Imbens, and A. B. Krueger. 1999. “Jackknife instrumental variables estimation.” Journal of Applied Econometrics 14, 57–67.
Ansel, J. P. and C. Stricker. 1992. “Lois de martingale, densités et décompositions de Föllmer-Schweizer.” Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques 28, 375–392.
Aoki, M. 1990. “Toward an economic model of the Japanese firm.” Journal of Economic Literature 28, 1–27.
Aït-Sahalia, Y. and A. W. Lo. 1998. “Nonparametric estimation of state-price densities implicit in financial asset prices.” Journal of Finance 53(2), 499–547.
Applebaum, D. 2004. Lévy processes and stochastic calculus, Cambridge University Press, Cambridge.
Aragó, V. and L. Nieto. 2005. “Heteroskedasticity in the returns of the main world stock exchange indices: volume versus GARCH effects.” Journal of International Financial Markets, Institutions and Money 15, 271–284.
Arai, T. 2001. “The relations between minimal martingale measure and minimal entropy martingale measure.” Asia-Pacific Financial Markets 8, 167–177.
Arditti, F. D. and H. Levy. 1977. “The weighted average cost of capital as a cutoff rate: a critical examination of the classical textbook weighted average.” Financial Management (Fall 1977), 6, 24–34.
Armstrong, J. 1989. “Combining forecasts: the end of the beginning or the beginning of the end?” International Journal of Forecasting 5(4), 585–588.
Arnold, L. 1974. Stochastic differential equations: theory and applications, Wiley, New York.
Arnott, R. D. and P. L. Bernstein. 2002. “What risk premium is normal?” Financial Analysts Journal 58, 64–85.
Arnott, R., J. Hsu, and P. Moore. 2005. “Fundamental indexation.” Financial Analysts Journal 61(2), 83–99.
Arora, N., J. R. Bohn, and F. Zhu. 2005. “Reduced form vs. structural models of credit risk: a case study of three models.” Journal of Investment Management 3, 43–67.
Arzac, E. and L. Glosten. 2005. “A reconsideration of tax shield valuation.” European Financial Management (September 2005), 11, 453–461.
Asmussen, S. 2000. Ruin probabilities, World Scientific, Singapore.
Asmussen, S., F. Avram, and M. R. Pistorius. 2004. “Russian and American put options under exponential phase-type Lévy models.” Stochastic Processes and their Applications 109, 79–111.
Asquith, P. and D. Mullins. 1983. “The impact of initiating dividend payments on shareholders’ wealth.” Journal of Business 56, 77–96.
Asquith, P. and D. Mullins. 1986. “Equity issues and offering dilution.” Journal of Financial Economics 15, 61–89.
Atiya, A. 2001. “Bankruptcy prediction for credit risk using neural networks: a survey and new results.” IEEE Transactions on Neural Networks 12(4), 929–935.
Atkinson, A. C. 1982. “The simulation of generalized inverse Gaussian and hyperbolic random variables.” SIAM Journal on Scientific and Statistical Computing 3, 502–515.
Atkinson, A. C. 1985. Plots, transformations, and regression: an introduction to graphical methods of diagnostic regression analysis, Oxford University Press, New York.
Atkinson, A. C. 1994. “Fast very robust methods for the detection of multiple outliers.” Journal of the American Statistical Association 89(428), 1329–1339.
Atkinson, A. C. and T.-C. Cheng. 2000. “On robust linear regression with incomplete data.” Computational Statistics & Data Analysis 33(4), 361–380.
Atkinson, A. C. and M. Riani. 2001. “Regression diagnostics for binomial data from the forward search.” The Statistician 50(1), 63–78.
Atkinson, A. C. and M. Riani. 2006. “Distribution theory and simulations for tests of outliers in regression.” Journal of Computational and Graphical Statistics 15(2), 460–476.
Aumann, R. and S. Hart (Eds.). 1994. Handbook of game theory, Vol. 2, North-Holland, Amsterdam.
Ausin, M. C. and H. F. Lopes. 2006. “Time-varying joint distribution through copulas,” mimeograph.
Avellaneda, M. 1998. “Minimum relative entropy calibration of asset pricing models.” International Journal of Theoretical and Applied Finance 1(4), 447–472.
Avramov, D. 2000. “Stock return predictability and model uncertainty.” Journal of Financial Economics 64, 423–458.
Awh, R. V. and D. Waters. 1974. “A discriminant analysis of economic, demographic and attitudinal characteristics of bank charge-card holders: a case study.” Journal of Finance 29, 973–980.
Baba, Y., R. F. Engle, D. Kraft, and K. F. Kroner. 1990. “Multivariate simultaneous generalized ARCH.” Unpublished manuscript, Department of Economics, University of California, San Diego.
Babbs, S. 1993. “Generalized Vasicek models of the term structure,” in Applied stochastic models and data analysis: proceedings of the 6th international symposium, J. Janssen and C. H. Skiadas (Eds.). World Scientific, Vol. I, Singapore.
Babbs, S. H. 1997. Rational bounds, Working paper, First National Bank of Chicago.
Bachelier, L. 1900. “Théorie de la spéculation.” Annales Scientifiques de l’École Normale Supérieure, 3rd series, 17, 21–86; reprinted in English in P. Cootner (Ed.). 1964. The random character of stock market prices, MIT Press, Cambridge.
Back, K. 1993. “Asymmetric information and options.” The Review of Financial Studies 6(3), 435–472.
Back, K. and S. Baruch. 2007. “Working orders in limit-order markets and floor exchanges.” Journal of Finance 62, 1589–1621.
Backus, D., A. W. Gregory, and S. E. Zin. 1989. “Risk premiums in the term structure: evidence from artificial economies.” Journal of Monetary Economics 24, 371–399.
Backus, D., S. Foresi, A. Mozumdar, and L. Wu. 2001. “Predictable changes in yields and forward rates.” Journal of Financial Economics 59(3), 281–311.
Bae, K., K. Chan, and A. Ng. 2004. “Investibility and return volatility.” Journal of Financial Economics 71, 239–263.
Bagehot, W. (pseudonym). 1971. “The only game in town.” Financial Analysts Journal 27, 12–14, 22.
Bai, J. 1994. “Least squares estimation of a shift in linear processes.” Journal of Time Series Analysis 15, 453–472.
Bai, J. 1996. “Testing for parameter constancy in linear regressions: an empirical distribution function approach.” Econometrica 64, 597–622.
Bai, J. and P. Perron. 1998. “Estimating and testing linear models with multiple structural changes.” Econometrica 66, 47–78.
Bai, J., R. L. Lumsdaine, and J. H. Stock. 1998. “Testing for and dating common breaks in multivariate time series.” Review of Economic Studies 65, 395–432.
Bailey, A. D. and W. J. Boe. 1976. “Goal and resource transfers in the multigoal organization.” The Accounting Review 51, 559–573.
Bailey, W. and R. Stulz. 1989. “The pricing of stock index options in a general equilibrium model.” Journal of Financial and Quantitative Analysis 24, 1–12.
Bailey, W., A. Kumar, and D. Ng. 2008. “Foreign investments of U.S. individual investors: causes and consequences.” Management Science 54, 443–459.
Baillie, R. T. and T. Bollerslev. 1989. “The message in daily exchange rates: a conditional variance tale.” Journal of Business and Economic Statistics 7, 297–305.
Baillie, R. T., T. Bollerslev, and H. O. Mikkelsen. 1996. “Fractionally integrated generalized autoregressive conditional heteroskedasticity.” Journal of Econometrics 74, 3–30.
Bajari, P. and C. L. Benkard. 2005. “Demand estimation with heterogeneous consumers and unobserved product characteristics: a hedonic approach.” Journal of Political Economy 113(6), 1239–1276.
Baker, G. P. 1992. “Incentive contracts and performance measurement.” Journal of Political Economy 100, 598–614.
Bakshi, G. and Z. Chen. 1997. “An alternative valuation model for contingent claims.” Journal of Financial Economics 44, 123–165.
Bakshi, G., C. Cao, and Z. Chen. 1997. “Empirical performance of alternative option pricing models.” Journal of Finance 52, 2003–2049.
Bakshi, G., C. Cao, and Z. Chen. 2000a. “Do call prices and the underlying stock always move in the same direction?” Review of Financial Studies 13, 549–584.
Bakshi, G., C. Cao, and Z. Chen. 2000b. “Pricing and hedging long-term options.” Journal of Econometrics 94, 277–318.
Bakshi, G., D. Madan, and F. Zhang. 2006. “Investigating the role of systematic and firm-specific factors in default risk: lessons from empirically evaluating credit risk models.” Journal of Business 79, 1955–1987.
Balakrishnan, S. and I. Fox. 1993. “Asset specificity, firm heterogeneity and capital structure.” Strategic Management Journal 14(1), 3–16.
Balcaen, S. and H. Ooghe. 2006. “35 years of studies on business failure: an overview of the classic statistical methodologies and their related problems.” British Accounting Review 38, 63–93.
Balduzzi, P. and Y. Eom. 2000. Non-linearities in U.S. Treasury rates: a semi-nonparametric approach, Working paper, Boston College.
Balduzzi, P., S. Das, S. Foresi, and R. Sundaram. 1996. “A simple approach to three-factor affine term structure models.” Journal of Fixed Income 6, 43–53.
Bali, T. G. 2000. “Testing the empirical performance of stochastic volatility models of the short-term interest rate.” Journal of Financial and Quantitative Analysis 35, 191–215.
Ball, C. and W. Torous. 1983. “Bond price dynamics and options.” Journal of Financial and Quantitative Analysis 18, 517–532.
Ball, C. and W. Torous. 1984. “The maximum likelihood estimation of security price volatility: theory, evidence, and application to option pricing.” Journal of Business 57, 97–112.
Ball, C. A. and W. N. Torous. 1985. “On jumps in common stock prices and their impact on call option pricing.” Journal of Finance 40, 155–173.
Ball, C. and W. Torous. 1999. “The stochastic volatility of short-term interest rates: some international evidence.” Journal of Finance 54, 2339–2359.
Baillie, R. T. and R. P. DeGennaro. 1990. “Stock returns and volatility.” Journal of Financial and Quantitative Analysis 25, 203–214.
Bank, P. and D. Baum. 2004. “Hedging and portfolio optimization in illiquid financial markets with a large trader.” Mathematical Finance 14(1), 1–18.
Bansal, R. and H. Zhou. 2002. “Term structure of interest rates with regime shifts.” Journal of Finance 57, 1997–2043.
Bantwal, V. J. and H. C. Kunreuther. 2000. “A CAT bond premium puzzle?” Journal of Psychology and Financial Markets 1(1), 76–91.
Banz, R. W. 1981. “The relationship between return and market value of common stocks.” Journal of Financial Economics 9, 3–18.
Banz, R. and M. Miller. 1978. “Prices for state contingent claims: some estimates and applications.” Journal of Business 51, 653–672.
Bao, J., J. Pan, and J. Wang. 2008. Liquidity of corporate bonds, SSRN eLibrary, http://ssrn.com/paper=1106852.
Barber, B. M. and T. Odean. 2000. “Trading is hazardous to your wealth: the common stock investment performance of individual investors.” Journal of Finance 55, 773–806.
Barber, B. M., T. Odean, and L. Zheng. 2005. “Out of sight, out of mind: the effects of expenses on mutual fund flows.” Journal of Business 78, 2095–2119.
Barberis, N. 2000. “Investing for the long run when returns are predictable.” Journal of Finance 55, 225–264.
Barclay, M. J. and C. W. Smith Jr. 1995. “The maturity structure of corporate debt.” Journal of Finance 50, 609–631.
Barclay, M., T. Hendershott, and T. McCormick. 2003. “Competition among trading venues: information and trading on electronic communications networks.” Journal of Finance 58, 2637–2666.
Barclay, M. J., L. M. Marx, and C. W. Smith. 2003. “The joint determination of leverage and maturity.” Journal of Corporate Finance 9(2), 1–18.
Barclay, M. J., C. G. Holderness, and D. P. Sheehan. 2007. Private placements and managerial entrenchment, Unpublished working paper, Boston College.
Barle, S. and N. Cakici. 1998. “How to grow a smiling tree.” Journal of Financial Engineering 7, 127–146.
Barles, G. and H. Soner. 1998. “Option pricing with transaction costs and a nonlinear Black–Scholes equation.” Finance and Stochastics 2, 369–397.
Barndorff-Nielsen, O. E. 1977. “Exponentially decreasing distributions for the logarithm of particle size.” Proceedings of the Royal Society of London A 353, 401–419.
Barndorff-Nielsen, O. E. 1978. “Hyperbolic distributions and distributions on hyperbolae.” Scandinavian Journal of Statistics 5, 151–157.
Barndorff-Nielsen, O. E. 1995. “Normal inverse Gaussian distributions and the modeling of stock returns.” Research Report no. 300, Department of Theoretical Statistics, Aarhus University.
Barndorff-Nielsen, O. E. 1998. “Processes of normal inverse Gaussian type.” Finance and Stochastics 2, 41–68.
Barndorff-Nielsen, O. E. and N. Shephard. 2001. “Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics.” Journal of the Royal Statistical Society, Series B 63, 167–241 (with discussion).
Barndorff-Nielsen, O. and N. Shephard. 2003. “How accurate is the asymptotic approximation to the distribution of realised volatility?,” in Identification and inference for econometric models: a Festschrift for Tom Rothenberg, D. Andrews, J. Powell, P. Ruud, and J. Stock (Eds.). Econometric Society Monograph Series, Cambridge University Press, Cambridge.
Barndorff-Nielsen, O. and N. Shephard. 2006. “Econometrics of testing for jumps in financial economics using bipower variation.” Journal of Financial Econometrics 4, 1–30.
Barndorff-Nielsen, O., J. Kent, and M. Sorensen. 1982. “Normal variance-mean mixtures and z distributions.” International Statistical Review 50, 145–159.
Barnea, A., R. A. Haugen, and L. W. Senbet. 1980. “A rationale for debt maturity structure and call provisions in the agency theoretic framework.” Journal of Finance 35, 1223–1234.
Barnea, A., R. A. Haugen, and L. W. Senbet. 1981. “Market imperfections, agency problems, and capital structure: a review.” Financial Management 10, 7–22.
BarNiv, R. and J. Hathorn. 1997. “The merger or insolvency alternative in the insurance industry.” Journal of Risk and Insurance 64(1), 89–113.
Barone-Adesi, G. and P. P. Talwar. 1983. “Market models and heteroscedasticity of residual security returns.” Journal of Business and Economic Statistics 1, 163–168.
Barone-Adesi, G. and R. Whaley. 1987. “Efficient analytic approximation of American option values.” Journal of Finance 42, 301–320.
Barr, R. S. and T. F. Siems. 1994. “Predicting bank failure using DEA to quantify management quality.” Federal Reserve Bank of Dallas Financial Industry Studies 1, 1–31.
Barraquand, J. and T. Pudet. 1996. “Pricing of American path-dependent contingent claims.” Mathematical Finance 6, 17–51.
Barrett, B. E. and J. B. Gray. 1997. “Leverage, residual, and interaction diagnostics for subsets of cases in least squares regression.” Computational Statistics and Data Analysis 26(1), 39–52.
Barry, D. and J. A. Hartigan. 1993. “A Bayesian analysis for change point problems.” Journal of the American Statistical Association 88(421), 309–319.
Bartter, B. J. and R. J. Rendleman Jr. 1979. “Fee-based pricing of fixed rate bank loan commitments.” Financial Management 8.
Bartels, L. M. 1991. “Instrumental and ‘quasi-instrumental’ variables.” American Journal of Political Science 35, 777–800.
Baz, J. and S. R. Das. 1996. “Analytical approximation of the term structure for jump-diffusion processes: a numerical analysis.” Journal of Fixed Income 6, 78–86.
Basel Committee on Banking Supervision. 2001. “The internal ratings-based approach.” Basel Committee on Banking Supervision.
Basel Committee on Banking Supervision. 2004. Bank failures in mature economies, Working Paper 13, Basel Committee on Banking Supervision.
Basel Committee on Banking Supervision. 2006. “International convergence of capital measurement and capital standards: a revised framework (Basel II).” Basel Committee on Banking Supervision.
Baskin, J. 1987. “Corporate liquidity in games of monopoly power.” Review of Economics and Statistics 69, 312–319.
Basu, S. 1977. “Investment performance of common stocks in relation to their price-earnings ratios: a test of the efficient markets hypothesis.” Journal of Finance 32, 663–682.
Bates, D. 1991. “The crash of ’87: was it expected? The evidence from options markets.” Journal of Finance 46, 1009–1044.
Bates, D. 1996. “Jumps and stochastic volatility: exchange rate processes implicit in deutsche mark options.” Review of Financial Studies 9, 69–107.
Bates, D. 1996a. “Testing option pricing models,” in Statistical methods in finance (Handbook of statistics, Vol. 14), G. S. Maddala and C. R. Rao (Eds.). Elsevier, Amsterdam, pp. 567–611.
Bates, D. 2000. “Post-’87 crash fears in S&P 500 futures options.” Journal of Econometrics 94, 181–238.
Bates, D. 2006. “Maximum likelihood estimation of latent affine processes.” Review of Financial Studies 19, 909–965.
Bates, J. M. and C. W. Granger. 1969. “The combination of forecasts.” Operational Research Quarterly 20, 451–468.
Baufays, P. and J. P. Rasson. 1985. “Variance changes in autoregressive models,” in Time series analysis: theory and practice 7, O. D. Anderson (Ed.). North-Holland, New York, pp. 119–127.
Baumol, W. J. 1963. “An expected gain-confidence limit criterion for portfolio selection.” Management Science 10, 171–182.
Baumol, W. J. and R. E. Quandt. 1965. “Investment and discount rate under capital rationing – a programming approach.” Economic Journal 75, 317–329.
Bauwens, L., S. Laurent, and J. V. K. Rombouts. 2006. “Multivariate GARCH models: a survey.” Journal of Applied Econometrics 21, 79–109.
Bawa, V. S. 1975. “Optimal rules for ordering uncertain prospects.” Journal of Financial Economics 2, 95–121.
Bawa, V. S. 1978. “Safety-first, stochastic dominance, and optimal portfolio choice.” Journal of Financial and Quantitative Analysis 13, 255–271.
Bawa, V. and E. Lindenberg. 1977. “Capital market equilibrium in a mean-lower partial moment framework.” Journal of Financial Economics 5, 189–200.
Baxter, M. and A. Rennie. 1996. Financial calculus, Cambridge University Press, Cambridge.
Bayer, P., F. Ferreira, and R. McMillan. 2007. “A unified framework for measuring preferences for schools and neighborhoods.” Journal of Political Economy 115(4), 588–638.
Beaglehole, D. R. and M. Tenney. 1991. “General solution of some interest rate-contingent claim pricing equations.” Journal of Fixed Income 1, 69–83.
Beaglehole, D. R. and M. Tenney. 1992. “A nonlinear equilibrium model of term structures of interest rates: corrections and additions.” Journal of Financial Economics 32(3), 345–454.
Beatty, R., S. Riffe, and R. Thompson. 1999. “The method of comparables and tax court valuations of private firms: an empirical investigation.” Accounting Horizons 13(3), 177–199.
Beaver, W. 1966. “Financial ratios as predictors of failure.” Journal of Accounting Research 4(Suppl.), 71–111.
Beaver, W. H. and S. G. Ryan. 1993. “Accounting fundamentals of the book-to-market ratio.” Financial Analysts Journal (November–December 1993), 49, 50–56.
Beaver, W., P. Kettler, and M. Scholes. 1970. “The association between market determined and accounting determined risk measures.” Accounting Review 45, 654–682.
Becher, D. A. 2000. “The valuation effects of bank mergers.” Journal of Corporate Finance 6, 189–214.
Becherer, D. 2003. “Rational hedging and valuation of integrated risks under constant absolute risk aversion.” Insurance Mathematics and Economics 33(1), 1–28.
Becherer, D. 2004. “Utility-indifference hedging and valuation via reaction-diffusion systems.” Proceedings of the Royal Society of London, Series A: Mathematical, Physical and Engineering Sciences 460(2041), 27–51.
Beck, T., R. Levine, and N. Loayza. 2000. “Finance and the sources of growth.” Journal of Financial Economics 58, 261–300.
Beck, T., A. Demirgüç-Kunt, and R. Levine. 2003. “Law, endowments, and finance.” Journal of Financial Economics 70, 137–181.
Beckers, S. 1980. “The constant elasticity of variance model and its implications for option pricing.” Journal of Finance 35, 661–673.
Beckers, S. 1981. “Standard deviations implied in option prices as predictors of future stock price variability.” Journal of Banking and Finance 5, 363–381.
Beckers, S. 1983. “Variances of security price returns based on highest, lowest, and closing prices.” Journal of Business 56, 97–112.
Becker, B. E. and M. A. Huselid. 1992. “The incentive effects of tournament compensation systems.” Administrative Science Quarterly 37, 336–350.
Begley, J., J. Ming, and S. Watts. 1996. “Bankruptcy classification errors in the 1980s: an empirical analysis of Altman’s and Ohlson’s models.” Review of Accounting Studies 1, 267–284.
Beja, A. 1971. “The structure of the cost of capital under uncertainty.” Review of Economic Studies 38(3), 359–368.
Beja, A. and M. B. Goldman. 1980. “On the dynamic behavior of prices in disequilibrium.” Journal of Finance 35, 235–248.
Beja, A. and N. H. Hakansson. 1977. “Dynamic market processes and the rewards to up-to-date information.” Journal of Finance 32, 291–304.
Bekaert, G. and C. R. Harvey. 1995. “Time-varying world market integration.” Journal of Finance 50, 403–444.
Bekaert, G. and C. Harvey. 2000. “Capital flows and the behavior of emerging market equity returns,” in Capital inflows to emerging markets, S. Edwards (Ed.). NBER and University of Chicago Press, Chicago, pp. 159–194.
Bekaert, G. and C. R. Harvey. 2003. “Emerging markets finance.” Journal of Empirical Finance 10, 3–56.
Bekaert, G. and R. Hodrick. 1992. “Characterizing predictable components in excess returns on equity and foreign exchange markets.” Journal of Finance 47, 467–509.
Bekaert, G. and R. J. Hodrick. 2001. “Expectations hypotheses tests.” Journal of Finance 56, 1357–1394.
Bekaert, G. and M. Urias. 1996. “Diversification, integration and emerging market closed-end funds.” Journal of Finance 51, 835–869.
Bekaert, G., R. J. Hodrick, and D. A. Marshall. 1997. “On biases in tests of the expectations hypothesis of the term structure of interest rates.” Journal of Financial Economics 44, 309–348.
Bekaert, G., R. J. Hodrick, and D. A. Marshall. 2001. “‘Peso problem’ explanations for term structure anomalies.” Journal of Monetary Economics 48(2), 241–270.
Bekaert, G., C. Harvey, and A. Ng. 2005. “Market integration and contagion.” Journal of Business 78, 39–70.
Bekaert, G., R. Hodrick, and X. Zhang. 2005. International stock return comovements, NBER working paper, Columbia Business School, Columbia.
Belkaoui, A. 1977. “Canadian evidence of heteroscedasticity in the market model.” Journal of Finance 32, 1320–1324.
Ben-Ameur, H., M. Breton, and P. L’Ecuyer. 2002. “A dynamic programming procedure for pricing American-style Asian options.” Management Science 48, 625–643.
Beneish, M., C. Lee, and R. Tarpley. 2001. “Contextual fundamental analysis through the prediction of extreme returns.” Review of Accounting Studies 6(2/3), 165–189.
Ben-Horin, M. and H. Levy. 1980. “Total risk, diversifiable risk and non-diversifiable risk: a pedagogic note.” Journal of Financial and Quantitative Analysis 15, 289–295.
Benninga, S. 2000. Financial modeling, 2nd ed., MIT Press, Cambridge.
Benninga, S. 2008. Financial modeling, 3rd ed., MIT Press, Cambridge.
Benston, G. and R. Hagerman. 1974. “Determinants of bid-asked spreads in the over-the-counter market.” Journal of Financial Economics 1, 353–364.
Bensoussan, A. and J.-L. Lions. 1982. Impulse control and quasi-variational inequalities, Gauthier-Villars, Paris.
Benth, F. E. and K. H. Karlsen. 2005. “A PDE representation of the density of the minimal entropy martingale measure in stochastic volatility markets.” Stochastics and Stochastic Reports 77(2), 109–137.
Benth, F. E. and T. Meyer-Brandis. 2005. “The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps.” Finance and Stochastics 9(4), 564–575.
Benveniste, L. M. and P. A. Spindt. 1989. “How investment bankers determine the offer price and allocation of new issues.” Journal of Financial Economics 25, 343–361.
Bera, A. K. and M. L. Higgins. 1993. “ARCH models: properties, estimation and testing.” Journal of Economic Surveys 7, 305–362.
Beranek, W. 1977. “The weighted average cost of capital and shareholder wealth maximization.” Journal of Financial and Quantitative Analysis (March 1977), 12, 17–31.
Beranek, W. 1978. “Some new capital budgeting theorems.” Journal of Financial and Quantitative Analysis (December 1978), 809–823.
Beranek, W. 1981. “Research directions in finance.” Quarterly Journal of Economics and Business 21, 6–24.
Bergantino, S. and G. Sinha. 2005. Special report: subprime model update, Bear Stearns & Co. Inc., May 26.
Berger, A. and G. Udell. 2002. “Small business credit availability and relationship lending: the importance of bank organizational structure.” The Economic Journal 112, 32–53.
Berger, P. G., E. Ofek, and D. L. Yermack. 1997. “Managerial entrenchment and capital structure decisions.” Journal of Finance 52(4), 1411–1438.
Berger, A. N., R. S. Demsetz, and P. E. Strahan. 1999. “The consolidation of the financial services industry: causes, consequences and implications for the future.” Journal of Banking and Finance 23(2–4), 135–194.
Berger, A. N., N. H. Miller, M. A. Petersen, R. G. Rajan, and J. C. Stein. 2005. “Does function follow organizational form? Evidence from the lending practices of large and small banks.” Journal of Financial Economics 76, 237–269.
Berk, J. and P. DeMarzo. 2007. Corporate finance, Pearson Addison Wesley, Boston.
Berk, J. and R. C. Green. 2004. “Mutual fund flows and performance in rational markets.” Journal of Political Economy 112, 1269–1295.
Berkman, H. and M. E. Bradbury. 1996. “Empirical evidence on the corporate use of derivatives.” Financial Management 25, 5–13.
Berkowitz, J. 2001. “Testing density forecasts with applications to risk management.” Journal of Business and Economic Statistics 19, 466–474.
Bernard, R. H. 1969. “Mathematical programming models for capital budgeting – a survey, generalization, and critique.” Journal of Financial and Quantitative Analysis 4, 111–158.
Berndt, A., A. A. Lookman, and I. Obreja. 2006. “Default risk premia and asset returns.” AFA 2007 Chicago Meetings Paper.
Bernhard, R. 1978. “Some new capital budgeting theorems: comment.” Journal of Financial and Quantitative Analysis 13, 825–829.
Bernstein, P. 1987. “Liquidity, stock markets and market makers.” Financial Management 16, 54–62.
Bernstein, P. 2007. “The surprising bond between CAPM and the meaning of liquidity.” Journal of Portfolio Management 34, 11.
Berry, M. J. A. and G. Linoff. 1997. Data mining techniques for marketing, sales, and customer support, Wiley, New York.
Bertsekas, D. 1974. “Necessary and sufficient conditions for existence of an optimal portfolio.” Journal of Economic Theory 8, 235–247.
Bertoin, J. 1996. Lévy processes, Cambridge University Press, Cambridge.
Bertrand, P. and J. L. Prigent. 2001. Portfolio insurance: the extreme value approach to the CPPI method, Working paper, THEMA, University of Cergy.
Bertrand, P. and J. L. Prigent. 2005. “Portfolio insurance strategies: OBPI versus CPPI.” Finance 26(1), 5–32.
Bessembinder, H. 1999. “Trade execution costs on Nasdaq and the NYSE: a post-reform comparison.” The Journal of Financial and Quantitative Analysis 34, 387–407.
Bessembinder, H. 2003. “Quote-based competition and trade execution costs in NYSE-listed stocks.” Journal of Financial Economics 70, 385–422.
Bessembinder, H. and H. Kaufman. 1997a. “A comparison of trade execution costs for NYSE and Nasdaq-listed stocks.” The Journal of Financial and Quantitative Analysis 32, 287–310.
Bessembinder, H. and H. Kaufman. 1997b. “A cross-exchange comparison of execution costs and information flow for NYSE-listed stocks.” Journal of Financial Economics 46, 293–319.
Bessler, W. (Ed.). 2006. Börsen, Banken und Kapitalmärkte, Duncker & Humblot, Berlin.
Best, M. J. 1975. “A feasible conjugate direction method to solve linearly constrained optimization problems.” Journal of Optimization Theory and Applications 16, 25–38.
Bey, R. P. and G. E. Pinches. 1980. “Additional evidence of heteroscedasticity in the market model.” Journal of Financial and Quantitative Analysis 15, 299–322.
Bhagat, S. and I. Welch. 1995. “Corporate research and development investment: international comparisons.” Journal of Accounting and Economics 19(2–3), 443–470.
Bhamra, H. S., L.-A. Kuehn, and I. A. Strebulaev. 2008. The levered equity risk premium and credit spreads: a unified framework, SSRN eLibrary, http://ssrn.com/paper=1016891.
Bharath, S. and T. Shumway. 2008. “Forecasting default with the Merton distance to default model.” Review of Financial Studies 21(3), 1339–1369.
Bhaskar, K. A. 1979. “Multiple objective approach to capital budgeting.” Accounting and Business Research 10, 25–46.
Bhasker, S. 2008. “Financial market interdependence: the copula-GARCH model,” mimeograph.
Bhaskar, K. A. and P. McNamee. 1983. “Multiple objectives in accounting and finance.” Journal of Business Finance & Accounting 10, 595–621.
Bhatia, A. V. 2002. Sovereign credit risk ratings: an evaluation, IMF Working Paper WP/03/170.
Bhattacharya, S. 1979. “Imperfect information, dividend policy, and the ‘bird in the hand’ fallacy.” Bell Journal of Economics 10, 259–270.
Bhattacharya, M. 1980. “Empirical properties of the Black–Scholes formula under ideal conditions.” Journal of Financial and Quantitative Analysis 15, 1081–1105.
Bhattacharya, U. and P. Groznik. 2008. “Melting pot or salad bowl: some evidence from U.S. investments abroad.” Journal of Financial Markets 11(3), 228–258.
Bhattacharya, R. N. and R. R. Rao. 1976. Normal approximations and asymptotic expansions, Wiley, New York.
Bhojraj, S. and C. Lee. 2003. “Who is my peer? A valuation-based approach to the selection of comparable firms.” Journal of Accounting Research 41(5), 745–774.
Biagini, F., P. Guasoni, and M. Pratelli. 2000. “Mean variance hedging for stochastic volatility models.” Mathematical Finance 10, 109–129.
Biais, B. and P. Hillion. 1994. “Insider and liquidity trading in stock and options markets.” Review of Financial Studies 7, 743–780.
Biais, B., P. Hillion, and C. Spatt. 1995. “An empirical analysis of the limit order book and the order flow in the Paris Bourse.” Journal of Finance 50, 1655–1689.
Biais, B., L. Glosten, and C. Spatt. 2005. “Market microstructure: a survey of microfoundations, empirical results, and policy implications.” Journal of Financial Markets 8, 217–264.
Bibby, B. M. and M. Sørensen. 2003. “Hyperbolic processes in finance,” in Handbook of heavy tailed distributions in finance, S. T. Rachev (Ed.). Elsevier Science, Amsterdam, pp. 211–248.
Bielecki, T. and M. Rutkowski. 2000. “Multiple ratings model of defaultable term structure.” Mathematical Finance 10, 125–139.
Bielecki, T. and M. Rutkowski. 2001. Modeling of the defaultable term structure: conditional Markov approach, Working paper, Northeastern Illinois University.
Bielecki, T. R. and M. Rutkowski. 2002. Credit risk: modeling, valuation and hedging, Springer, New York.
Bielecki, T. R. and M. Rutkowski. 2004. “Modeling of the defaultable term structure: conditionally Markov approach.” IEEE Transactions on Automatic Control 49, 361–373.
Bierman, H., Jr. 1995. “Bubbles, theory and market timing.” Journal of Portfolio Management 22, 54–56.
Bikbov, R. and M. Chernov. 2004. Term structure and volatility: lessons from the Eurodollar markets, Working Paper, Columbia GSB.
Bingham, N. H. and R. Kiesel. 2001. “Modelling asset returns with hyperbolic distributions,” in Return distributions in finance, J. Knight and S. Satchell (Eds.). Butterworth-Heinemann, Oxford, pp. 1–20.
Bismut, J. M. 1973. “Conjugate convex functions in optimal stochastic control.” Journal of Mathematical Analysis and Applications 44, 384–404.
Bitler, M. P., T. J. Moskowitz, and A. Vissing-Jorgensen. 2005. “Testing agency theory with entrepreneur effort and wealth.” Journal of Finance 60, 539–576.
Bjerksund, P. and G. Stensland. 1993. “Closed-form approximation of American options.” Scandinavian Journal of Management 9, 87–99.
Bjork, T. 1998. Arbitrage theory in continuous time, Oxford University Press, Oxford.
Björk, T. and B. J. Christensen. 1999. “Interest rate dynamics and consistent forward rate curves.” Mathematical Finance 9(4), 323–348.
Black, F. 1972. “Capital market equilibrium with restricted borrowing.” Journal of Business 45, 444–455.
Black, F. 1975. “Fact and fantasy in the use of options.” Financial Analysts Journal 31, 36–41, 61–72.
Black, F. 1976. “The pricing of commodity contracts.” Journal of Financial Economics 3, 167–178.
Black, F. 1976. “Studies in stock price volatility changes,” Proceedings of the 1976 Business Meeting of the Business and Economic Statistics Section, American Statistical Association, pp. 177–181.
Black, F. 1986. “Noise.” Journal of Finance 41(3), 529–543.
Black, F. 1993. “Beta and return.” Journal of Portfolio Management 20, 8–18.
Black, F. 1995. “Interest rates as options.” Journal of Finance 50(5), 1371–1376.
Black, F. and J. C. Cox. 1976. “Valuing corporate securities: some effects of bond indenture provisions.” Journal of Finance 31, 351–367.
Black, F. and R. Jones. 1987. “Simplifying portfolio insurance.” Journal of Portfolio Management 14, 48–51.
Black, F. and P. Karasinski. 1991. “Bond and option pricing when short rates are lognormal.” Financial Analysts Journal 47, 52–59.
Black, F. and R. Litterman. 1991. “Asset allocation: combining investor views with market equilibrium.” Journal of Fixed Income 1(2), 7–18.
Black, F. and R. Litterman. 1992. “Global portfolio optimization.” Financial Analysts Journal 48 (September/October), 28–43.
Black, F. and M. Scholes. 1972. “The valuation of option contracts and a test of market efficiency.” Journal of Finance 27, 399–417.
Black, F. and M. Scholes. 1973. “The pricing of options and corporate liabilities.” Journal of Political Economy 81(3), 637–654.
Black, F., M. C. Jensen, and M. Scholes. 1972. “The capital asset pricing model: some empirical tests,” in Studies in the theory of capital markets, M. C. Jensen (Ed.). Praeger, New York, pp. 20–46.
Black, F., E. Derman, and W. Toy. 1990. “A one-factor model of interest rates and its application to treasury bond options.” Financial Analysts Journal 46, 33–39.
Blæsild, P. 1981. “The two-dimensional hyperbolic distribution and related distributions, with an application to Johannsen’s bean data.” Biometrika 68, 251–263.
Blair, R. D. and A. A. Heggestad. 1978. “Bank portfolio regulation and the probability of failure.” Journal of Money, Credit, and Banking 10, 88–93.
Blair, B. J., S. H. Poon, and S. J. Taylor. 2001. “Modelling S&P 100 volatility: the information content of stock returns.” Journal of Banking & Finance 25, 1665–1679.
Blair, B. J., S. H. Poon, and S. J. Taylor. 2002. “Asymmetric and crash effects in stock volatility.” Applied Financial Economics 12, 319–329.
Blalock, H. M. 1969. Theory construction: from verbal to mathematical formulations, Prentice-Hall, Englewood Cliffs, NJ.
Blanco, R., S. Brennan, and I. Marsh. 2005. “An empirical analysis of the dynamic relation between investment-grade bonds and credit default swaps.” Journal of Finance 60, 2255–2281.
Bliss, R. and N. Panigirtzoglou. 2002. “Testing the stability of implied probability density functions.” Journal of Banking and Finance 26, 381–422.
Blomquist, G. and L. Worley. 1982. “Specifying the demand for housing characteristics: the exogeneity issue,” in The economics of urban amenities, D. B. Diamond and G. Tolley (Eds.). Academic Press, New York.
Blum, M. 1974. “Failing company discriminant analysis.” Journal of Accounting Research 12(1), 1–25.
Blume, M. 1970. “Portfolio theory: a step toward its practical application.” Journal of Business 43, 152–173.
Blume, M. E. 1971. “On the assessment of risk.” Journal of Finance 26, 1–10.
Blume, M. 1975. “Betas and their regression tendencies.” Journal of Finance 30, 785–795.
Blume, M. E. and I. Friend. 1973. “A new look at the capital asset pricing model.” Journal of Finance 28, 19–34.
Blume, M. and I. Friend. 1975. “The asset structure of individual portfolios and some implications for utility functions.” Journal of Finance 30(2), 585–603.
Blume, M. E. and F. Husic. 1973. “Price, beta, and exchange listing.” Journal of Finance 28, 283–299.
Blume, M. E., D. B. Keim, and S. A. Patel. 1991. “Returns and volatility of low-grade bonds, 1977–1989.” Journal of Finance 46(1), 49–74.
Boardman, C. M., W. J. Reinhart, and S. E. Celec. 1982. “The role of the payback period in the theory and application of duration to capital budgeting.” Journal of Business Finance & Accounting 9, 511–522.
Bodurtha, J. and G. Courtadon. 1986. “Efficiency tests of the foreign currency options market.” Journal of Finance 41, 151–162.
Bodie, Z., A. Kane, and A. Marcus. 2006. Investments, 7th ed., McGraw-Hill, New York.
Bodie, Z., A. Kane, and A. J. Marcus. 2007. Essentials of investments, 6th ed., McGraw-Hill/Irwin, New York.
Bodie, Z., A. Kane, and A. J. Marcus. 2008. Investments, 7th ed., McGraw-Hill/Irwin, New York.
Bodnar, G. M., G. S. Hayt, and R. C. Marston. 1996. “1995 Wharton survey of derivatives usage by US non-financial firms.” Financial Management 25, 142.
Bodurtha, J. N. and G. R. Courtadon. 1987. “Tests of an American option pricing model on the foreign currency options market.” Journal of Financial and Quantitative Analysis 22, 153–168.
Boehmer, E. and J. M. Netter. 1997. “Management optimism and corporate acquisitions: evidence from insider trading.” Managerial and Decision Economics 18, 693–708.
Bogue, M. and R. Roll. 1974. “Capital budgeting of risky projects with ‘imperfect’ markets for physical capital.” Journal of Finance 29(2), 601–613.
Bohl, M. T. and H. Henke. 2003. “Trading volume and stock market volatility: the Polish case.” International Review of Financial Analysis 12, 513–525.
Bohn, H. and L. Tesar. 1996. “U.S. equity investment in foreign markets: portfolio rebalancing or return chasing?” American Economic Review 86, 77–81.
Bollerslev, T. 1986. “Generalized autoregressive conditional heteroskedasticity.” Journal of Econometrics 31(3), 307–328.
Bollerslev, T. 1987. “A conditional heteroskedastic time series model for speculative prices and rates of return.” Review of Economics and Statistics 69, 542–547.
Bollerslev, T. 1990. “Modeling the coherence in short-run nominal exchange rates: a multivariate generalized ARCH model.” Review of Economics and Statistics 72, 498–505.
Bollerslev, T. and J. M. Wooldridge. 1992. “Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances.” Econometric Reviews 11, 143–179.
Bollerslev, T. and J. H. Wright. 2000. “Semiparametric estimation of long-memory volatility dependencies: the role of high-frequency data.” Journal of Econometrics 98, 81–106.
Bollerslev, T., R. F. Engle, and J. M. Wooldridge. 1988. “A capital asset pricing model with time varying covariances.” Journal of Political Economy 96(1), 116–131.
Bollerslev, T., R. Y. Chou, and K. Kroner. 1992. “ARCH modeling in finance: a review of the theory and empirical evidence.” Journal of Econometrics 52, 5–59.
Bollerslev, T., R. F. Engle, and D. B. Nelson. 1994. “ARCH models,” in Handbook of econometrics IV, R. F. Engle and D. C. McFadden (Eds.). Elsevier, Amsterdam, pp. 2959–3038.
Baillie, R. T. and T. Bollerslev. 1989. “Common stochastic trends in a system of exchange rates.” Journal of Finance 44, 167–181.
Bondarenko, O. 2003. Why are put options so expensive?, Working paper, CalTech.
Bongaerts, D., F. de Jong, and J. Driessen. 2008. Derivative pricing with liquidity risk: theory and evidence from the credit default swap market, Working paper.
Bontemps, C., M. Simioni, and Y. Surry. 2008. “Semiparametric hedonic price models: assessing the effects of agricultural nonpoint source pollution.” Journal of Applied Econometrics 23(6), 825–842.
Bookstaber, R. M. 1981. Option pricing and strategies in investing, Addison-Wesley, Reading, MA.
Bookstaber, R. M. and R. Clarke. 1983. Option strategies for institutional investment management, Addison-Wesley, Reading, MA.
Booth, L. 2002. “How to find value when none exists: pitfalls in using APV and FTE.” Journal of Applied Corporate Finance (Spring 2002), 15, 8–17.
Booth, L. 2007. “Capital cash flows, APV and valuation.” European Financial Management (January 2007), 13, 29–48.
Booth, N. B. and A. F. M. Smith. 1982. “A Bayesian approach to retrospective identification of change-points.” Journal of Econometrics 19, 7–22.
Boros, E., P. L. Hammer, T. Ibaraki, and A. Kogan. 1997. “Logical analysis of numerical data.” Mathematical Programming 79, 163–190.
Boros, E., P. L. Hammer, T. Ibaraki, A. Kogan, E. Mayoraz, and I. Muchnik. 2000. “An implementation of logical analysis of data.” IEEE Transactions on Knowledge and Data Engineering 12(2), 292–306.
Bose, I. and R. K. Mahapatra. 2001. “Business data mining – a machine learning perspective.” Information and Management 39, 211–225.
Boser, B., I. Guyon, and V. Vapnik. 1992. “A training algorithm for optimal margin classifiers,” in Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory (COLT), ACM Press, Pittsburgh, PA, pp. 144–152.
Bossaerts, P. and P. Hillion. 1999. “Implementing statistical criteria to select return forecasting models: what do we learn?” Review of Financial Studies 12, 405–428.
Botteron, P., M. Chesney, and R. Gibson-Asner. 2003. “Analyzing firms’ strategic investment decisions in a real options framework.” Journal of International Financial Markets, Institutions and Money 13, 451–479.
Bouaziz, L., E. Briys, and M. Crouhy. 1994. “The pricing of forward-starting Asian options.” Journal of Banking and Finance 18, 823–839.
Bouchet, M. H., E. Clark, and B. Groslambert. 2003. Country risk assessment: a guide to global investment strategy, Wiley, Chichester, England.
Boudoukh, J., M. Richardson, and T. Smith. 1993. “Is the ex ante risk premium always positive? – A new approach to testing conditional asset pricing models.” Journal of Financial Economics 34, 387–408.
Boudoukh, J., M. Richardson, R. Stanton, and R. F. Whitelaw. 1998. The stochastic behavior of interest rates: implications from a multifactor, nonlinear continuous-time model, Working Paper, NBER.
Boudreaux, K. J. and R. W. Long. 1979. “The weighted average cost of capital as a cutoff rate: a further analysis.” Financial Management (Summer 1979), 8, 7–14.
Bourke, P. and B. Shanmugam. 1990. An introduction to bank lending, Addison-Wesley Business Series, Sydney.
Bouyé, E., V. Durrleman, A. Nikeghbali, G. Riboulet, and T. Roncalli. 2000. “Copulas for finance – a reading guide and some applications.” Working paper, Groupe de Recherche Opérationnelle, Crédit Lyonnais.
Bowden, R. J. 1978. The econometrics of disequilibrium, North-Holland, Amsterdam.
Bower, D. H., R. S. Bower, and D. E. Logue. 1984. “Arbitrage pricing theory and utility stock returns.” Journal of Finance 39, 1041–1054.
Bowers, N. L. et al. 1997. Actuarial mathematics, 2nd ed., Society of Actuaries, Itasca, IL.
Bowman, R. G. 1979. “The theoretical relationship between systematic risk and financial (accounting) variables.” Journal of Finance 34, 617–630.
Box, G. E. P. and D. R. Cox. 1964. “An analysis of transformations.” Journal of the Royal Statistical Society Series B 26(2), 211–252.
Box, G. E. P. and G. C. Tiao. 1973. Bayesian inference in statistical analysis, Addison-Wesley, Reading, MA.
Boyarchenko, S. I. 2000. Endogenous default under Lévy processes, Working paper, University of Texas, Austin.
Boyarchenko, S. I. and S. Levendorskiĭ. 2002a. “Perpetual American options under Lévy processes.” SIAM Journal on Control and Optimization 40, 1663–1696.
Boyarchenko, S. I. and S. Levendorskiĭ. 2002b. “Barrier options and touch-and-out options under regular Lévy processes of exponential type.” Annals of Applied Probability 12, 1261–1298.
Boyce, W. M. and A. J. Kalotay. 1979. “Tax differentials and callable bonds.” Journal of Finance 34, 825–838.
Boyd, J. H., J. N. Hu, and R. Jagannathan. 2005. “The stock market’s reaction to unemployment news: ‘why bad news is usually good for stocks.’” Journal of Finance 60(2), 649–672.
Boyle, P. 1977. “Options: a Monte Carlo approach.” Journal of Financial Economics 4, 323–338.
Boyle, P. 1988. “A lattice framework for option pricing with two state variables.” Journal of Financial and Quantitative Analysis 23, 1–12.
Boyle, P. P. 1989. “The quality option and the timing option in futures contracts.” Journal of Finance 44, 101–103.
Boyle, P. P. and D. Emanuel. 1980. “Discretely adjusted option hedges.” Journal of Financial Economics 8(3), 259–282.
Boyle, P. P. and A. L. Ananthanarayanan. 1977. “The impact of variance estimation in option valuation models.” Journal of Financial Economics 5, 375–387.
Brace, A., D. Gatarek, and M. Musiela. 1997. “The market model of interest rate dynamics.” Mathematical Finance 7, 127–155.
Bradley, M., G. A. Jarrell, and E. H. Kim. 1984. “On the existence of an optimal capital structure: theory and evidence.” Journal of Finance 39(3), 857–878.
Brailsford, T. J. and R. W. Faff. 1996. “An evaluation of volatility forecasting techniques.” Journal of Banking & Finance 20, 419–438.
Brancheau, J. C. and J. C. Wetherbe. 1987. “Key issues in information systems management.” MIS Quarterly 11, 23–45.
Brander, J. and T. Lewis. 1986. “Oligopoly and financial structure: the limited liability effect.” American Economic Review 76, 956–970.
Brander, J. and B. Spencer. 1989. “Moral hazard and limited liability: implications for the theory of the firm.” International Economic Review 30, 833–850.
Brandimarte, P. 2006. Numerical methods in finance and economics: a MATLAB-based introduction, 2nd ed., Wiley, New York.
Brandt, M. W. and D. Chapman. “Comparing multifactor models of the term structure,” Working paper.
Brandt, M. and F. Diebold. 2006. “A no-arbitrage approach to range-based estimation of return covariances and correlations.” Journal of Business 79, 61–74.
Brandt, M. and C. Jones. 2006. “Volatility forecasting with range-based EGARCH models.” Journal of Business and Economic Statistics 24, 470–486.
Brandt, M. W. and Q. Kang. 2004. “On the relationship between the conditional mean and volatility of stock returns: a latent VAR approach.” Journal of Financial Economics 72, 217–257.
Brandt, M. and A. Yaron. 2001. Time-consistent no-arbitrage models of the term structure, Working paper, University of Pennsylvania.
Branger, N. 2004. “Pricing derivative securities using cross-entropy: an economic analysis.” International Journal of Theoretical and Applied Finance 7(1), 63–81.
Bratley, P., B. L. Fox, and L. E. Schrage. 1987. A guide to simulation, Springer, New York.
Brav, A. and J. B. Heaton. 2002. “Competing theories of financial anomalies.” Review of Financial Studies 15, 575–606.
Brealey, R. A. and S. D. Hodges. 1975. “Playing with portfolios.” Journal of Finance 30, 125–134.
Brealey, R. and S. Myers. 1988. Principles of corporate finance, McGraw-Hill, New York.
Brealey, R. and S. Myers. 1996. Principles of corporate finance, 5th ed., McGraw-Hill, New York.
Brealey, R. A., S. C. Myers, and A. J. Marcus. 2001. Fundamentals of corporate finance, 3rd ed., McGraw-Hill, New York, NY.
Brealey, R., S. Myers, and F. Allen. 2007. Principles of corporate finance, 9th ed., McGraw-Hill, New York.
Breeden, D. T. 1979. “An intertemporal asset pricing model with stochastic consumption and investment opportunities.” Journal of Financial Economics 7, 265–296.
Breeden, D. and R. Litzenberger. 1978. “Prices of state-contingent claims implicit in option prices.” Journal of Business 51, 621–651.
Breen, W. and R. Jackson. 1971. “An efficient algorithm for solving large-scale portfolio problems.” Journal of Financial and Quantitative Analysis 6, 627–637.
Breen, W., L. R. Glosten, and R. Jagannathan. 1989. “Economic significance of predictable variations in stock index returns.” Journal of Finance 44, 1177–1190.
Breidt, F. J., N. Crato, and P. De Lima. 1998. “The detection and estimation of long memory in stochastic volatility.” Journal of Econometrics 83, 325–348.
Breiman, L. 1986. Probability and stochastic processes, 2nd ed., Scientific Press, CA.
Brennan, M. J. 1958. “The supply of storage.” American Economic Review 48, 50–72.
Brennan, M. J. 1971. “Capital market equilibrium with divergent borrowing and lending rates.” Journal of Financial and Quantitative Analysis 7, 1197–1205.
Brennan, M. J. 1975. “The optimal number of securities in a risky asset portfolio where there are fixed costs of transaction: theory and some empirical results.” Journal of Financial and Quantitative Analysis 10, 483–496.
Brennan, M. 1979. “The pricing of contingent claims in discrete time models.” Journal of Finance 34(1), 53–68.
Brennan, M. J. 1986. The cost of convenience and the pricing of commodity contingent claims, Working paper, University of British Columbia.
Brennan, M. J. 1991. “The price of convenience and the valuation of commodity contingent claims,” in Stochastic models and option values, D. Lund and B. Oksendal (Eds.). Elsevier, North-Holland, Amsterdam.
Brennan, M. and H. Cao. 1997. “International portfolio investment flows.” Journal of Finance 52, 1851–1880.
Brennan, M. J. and E. S. Schwartz. 1977. “The valuation of American put options.” Journal of Finance 32(2), 449–462.
Brennan, M. J. and E. S. Schwartz. 1978. “Corporate income taxes, valuation, and the problem of optimal capital structure.” Journal of Business 51, 103–114.
Brennan, M. J. and E. S. Schwartz. 1979. “A continuous time approach to the pricing of bonds.” Journal of Banking and Finance 3, 133–155.
Brennan, M. and E. Schwartz. 1980. “Analyzing convertible bonds.” Journal of Financial and Quantitative Analysis 15, 907–929.
Brennan, M. J. and E. S. Schwartz. 1982. “Consistent regulatory policy under uncertainty.” Bell Journal of Economics 13(2), 507–521.
Brennan, M. J. and E. S. Schwartz. 1982. “An equilibrium model of bond pricing and a test of market efficiency.” Journal of Financial and Quantitative Analysis 17, 301–329.
Brennan, M. J. and E. S. Schwartz. 1985. “Evaluating natural resource investments.” Journal of Business 58, 135–157.
Brennan, M. J. and W. N. Torous. 1999. “Individual decision making and investor welfare.” Economic Notes 28, 119–143.
Brennan, M. J. and Y. Xia. 2001. “Assessing asset pricing anomalies.” Review of Financial Studies 14, 905–942.
Brennan, M. J. and Y. Xia. 2001. “Stock price volatility and the equity premium.” Journal of Monetary Economics 47, 249–283.
Brennan, M. J. and Y. Xia. 2002. “Dynamic asset allocation under inflation.” The Journal of Finance 57, 1201–1238.
Brennan, M. J., E. S. Schwartz, and R. Lagnado. 1997. “Strategic asset allocation.” Journal of Economic Dynamics and Control 21, 1377–1403.
Brenner, M. 1974. “On the stability of the distribution of the market component in stock price changes.” Journal of Financial and Quantitative Analysis 9, 945–961.
Brenner, R., R. Harjes, and K. Kroner. 1996. “Another look at alternative models of the short-term interest rate.” Journal of Financial and Quantitative Analysis 31, 85–107.
Brewer, T. L. and P. Rivoli. 1990. “Politics and perceived country creditworthiness in international banking.” Journal of Money, Credit and Banking 22, 357–369.
Brewer, E., W. E. Jackson, and J. T. Moser. 1996. “Alligators in the swamp: the impact of derivatives on the financial performance of depository institutions.” Journal of Money, Credit, and Banking 28, 482–497.
Brick, I. E. and S. A. Ravid. 1991. “Interest rate uncertainty and optimal debt maturity structure.” Journal of Financial and Quantitative Analysis 28(1), 63–81.
Brick, I. E. and D. G. Weaver. 1984. “A comparison of capital budgeting techniques in identifying profitable investments.” Financial Management 13, 29–39.
Brick, I. E. and D. G. Weaver. 1997. “Calculating the cost of capital of an unlevered firm for use in project evaluation.” Review of Quantitative Finance and Accounting 9, 111–129.
Brigham, E. F. 1988. Financial management: theory and practice, 4th ed., Dryden Press, Hinsdale.
Bris, A., W. N. Goetzmann, and N. Zhu. 2004. “Efficiency and the bear: short sales and markets around the world.” Working paper, Yale School of Management.
Britten-Jones, M. and A. Neuberger. 1996. “Arbitrage pricing with incomplete markets.” Applied Mathematical Finance 3, 347–363.
Briys, E. and F. de Varenne. 1997. “Valuing risky fixed rate debt: an extension.” Journal of Financial and Quantitative Analysis 32, 239–248.
Broadie, M. and J. Detemple. 1996. “American option valuation: new bounds, approximations, and a comparison of existing methods.” Review of Financial Studies 9, 1211–1250.
Broadie, M. and Ö. Kaya. 2006. “Exact simulation of stochastic volatility and other affine jump diffusion processes.” Operations Research 54(2), 217–231.
Broadie, M. and Ö. Kaya. 2007. “A binomial lattice method for pricing corporate debt and modeling Chapter 11 proceedings.” Journal of Financial and Quantitative Analysis 42, 279–312.
Broadie, M., M. Chernov, and S. Sundaresan. 2007. “Optimal debt and equity values in the presence of Chapter 7 and Chapter 11.” Journal of Finance 62, 1341–1377.
Brock, W. A. and B. D. LeBaron. 1996. “A dynamic structural model for stock return volatility and trading volume.” Review of Economics and Statistics 78(1), 94–110.
Brockwell, P. J. and R. A. Davis. 1991. Time series: theory and methods, Springer, New York.
Broll, U., J. E. Wahl, and W. K. Wong. 2006. “Elasticity of risk aversion and international trade.” Economics Letters 92(1), 126–130.
Brooks, C., S. P. Burke, S. Heravi, and G. Persand. 2005. “Autoregressive conditional kurtosis.” Journal of Financial Econometrics 3, 399–421.
Broto, C. and E. Ruiz. 2004. “Estimation methods for stochastic volatility models: a survey.” Journal of Economic Surveys 18, 613–649.
Brown, S. J. 1977. “Heteroscedasticity in the market model: a comment.” Journal of Business 50, 80–83.
Brown, D. 2001. “An empirical analysis of credit spread innovations.” Journal of Fixed Income 11, 9–27.
Brown, S. J. and P. H. Dybvig. 1986. “The empirical implications of the Cox, Ingersoll, Ross theory of the term structure of interest rates.” Journal of Finance 41, 617–630.
Brown, S. J. and W. N. Goetzmann. 1995. “Performance persistence.” Journal of Finance 50, 679–698.
Brown, R. H. and S. M. Schaefer. 1994. “The term structure of real interest rates and the Cox, Ingersoll and Ross model.” Journal of Financial Economics 35, 3–42.
Brown, R. H. and S. M. Schaefer. 2000. Why long term forward interest rates (almost) always slope downwards, Working paper, Warburg Dillon Read and London Business School, UK.
Brown, K. C., W. V. Harlow, and D. J. Smith. 1994. “An empirical analysis of interest rate swap spreads.” Journal of Fixed Income 4, 61–78.
Brown, J. R., Z. Ivkovic, P. A. Smith, and S. Weisbenner. 2008. “Neighbors matter: causal community effects and stock market participation.” Journal of Finance 63, 1509–1531.
Bruche, M. 2005. Estimating structural bond pricing models via simulated maximum likelihood, Working paper, London School of Economics.
Bruner, R. F., K. M. Eades, R. S. Harris, and R. C. Higgins. 1998. “Best practices in estimating the cost of capital: survey and synthesis.” Financial Practice and Education 8(1), 13–28.
Brunner, B. and R. Hafner. 2003. “Arbitrage-free estimation of the risk-neutral density from the implied volatility smile.” Journal of Computational Finance 7, 76–106.
Brunnermeier, M. and S. Nagel. 2004. “Hedge funds and the technology bubble.” Journal of Finance 59, 2013–2040.
Bubnys, E. L. 1990. “Simulating and forecasting utility stock returns: arbitrage pricing theory vs. capital asset pricing model.” The Financial Review 29, 1–23.
Bubnys, E. L. and C. F. Lee. 1989. “Linear and generalized functional form market models for electric utility firms.” Journal of Economics and Business 41, 213–223.
Buchen, P. W. and M. Kelly. 1996. “The maximum entropy distribution of an asset inferred from option prices.” Journal of Financial and Quantitative Analysis 31, 143–159.
Bühler, W. and M. Trapp. 2007. Credit and liquidity risk in bond and CDS markets, Working paper, Universität Mannheim.
Bulirsch, R. and J. Stoer. 1991. “The conjugate-gradient method of Hestenes and Stiefel,” §8.7 in Introduction to numerical analysis, Springer-Verlag, New York.
Bulow, J. and J. Shoven. 1978. “The bankruptcy decision.” Bell Journal of Economics 9(2), 437–456.
Bulow, J., J. Geanakoplos, and P. Klemperer. 1985. “Multimarket oligopoly: strategic substitutes and complements.” Journal of Political Economy 93, 488–511.
Business Review Weekly. 1997. “Model of a competitive audit bureau.” July 21, 1997, 19(27), 88.
Buss, M. D. J. 1983. “How to rank computer projects.” Harvard Business Review 62, 118–125.
Caginalp, G., D. Porter, and V. Smith. 2001. “Financial bubbles: excess cash, momentum, and incomplete information.” The Journal of Psychology and Financial Markets 2(2), 80–99.
Cai, J. 1994. “A Markov model of switching-regime ARCH.” Journal of Business and Economic Statistics 12, 309–316.
Cai, J. and H. Yang. 2005. “Ruin in the perturbed compound Poisson risk process under interest force.” Advances in Applied Probability 37, 819–835.
Cakici, N. and K. R. Foster. 2002. “Risk-neutralized at-the-money consistent historical distributions in currency options pricing.” Journal of Computational Finance 6, 25–47.
Camerer, C. 1989. “Bubbles and fads in asset prices: a review of theory and evidence.” Journal of Economic Surveys 3, 3–41.
Campa, J. and K. H. Chang. 1995. “Testing the expectations hypothesis on the term structure of volatilities.” Journal of Finance 50, 529–547.
Campa, J. M. and S. Kedia. 2002. “Explaining the diversification discount.” Journal of Finance 57, 1731–1762.
Campbell, J. Y. 1986. “A defense of traditional hypotheses about the term structure of interest rates.” Journal of Finance 41, 183–193.
Campbell, J. Y. 1987. “Stock returns and the term structure.” Journal of Financial Economics 18, 373–400.
Campbell, J. Y. 1991. “A variance decomposition for stock returns.” Economic Journal 101, 157–179.
Campbell, J. Y. 1995. “Some lessons from the yield curve.” Journal of Economic Perspectives 9(3), 129–152.
Campbell, J. Y. 2001. “Why long horizons? A study of power against persistent alternatives.” Journal of Empirical Finance 8, 459–491.
Campbell, J. Y. and J. H. Cochrane. 1999. “By force of habit: a consumption-based explanation of aggregate stock market behavior.” Journal of Political Economy 107, 205–251.
Campbell, J. Y. and Y. Hamao. 1992. “Predictable stock returns in the United States and Japan: a study of long-term capital market integration.” Journal of Finance 47, 43–69.
Campbell, J. Y. and L. Hentschel. 1992. “No news is good news: an asymmetric model of changing volatility in stock returns.” Journal of Financial Economics 31(3), 281–318.
Campbell, T. S. and W. A. Kracaw. 1990a. “Corporate risk management and the incentive effects of debt.” Journal of Finance 45, 1673–1686.
Campbell, T. S. and W. A. Kracaw. 1990b. “A comment on bank funding risks, risk aversion, and the choice of futures hedging instrument.” Journal of Finance 45, 1705–1707.
Campbell, J. and R. Shiller. 1987. “Cointegration and tests of present value models.” Journal of Political Economy 95, 1062–1088.
Campbell, J. and R. Shiller. 1989. “The dividend-price ratio and expectations of future dividends and discount factors.” Review of Financial Studies 1, 195–228.
Campbell, J. Y. and R. J. Shiller. 1991. “Yield spreads and interest rate movements: a bird’s eye view.” Review of Economic Studies 58, 495–514.
Campbell, J. Y. and R. J. Shiller. 1998. “Valuation ratios and the long-run stock market outlook.” Journal of Portfolio Management 24, 11–26. Campbell, J. Y. and R. J. Shiller. 2001. “Valuation ratios and the long-run stock market outlook: an update,” in Advances in behavioral finance, vol II, N. Barberis and R. Thaler (Eds.). Russell Sage Foundation, 2004. Campbell, J. and G. Taksler. 2003. “Equity volatility and corporate bond yields.” Journal of Finance 58, 2321–2349. Campbell, J. Y. and S. B. Thompson. 2004. “Predicting the equity premium out of sample: can anything beat the historical average?” Working paper, Harvard University. Campbell, J. and L. Viceira. 1999. “Consumption and portfolio decisions when expected returns are time varying.” Quarterly Journal of Economics 114, 433–495. Campbell, J. Y. and L. Viceira. 2001. “Who should buy long-term bonds?” American Economic Review 91, 99–127. Campbell, J. Y. and M. Yogo. 2003. “Efficient tests of stock return predictability.” Working paper, Harvard University. Campbell, J., S. Grossman, and J. Wang. 1993. “Trading volume and serial correlation in stock returns.” Quarterly Journal of Economics 108, 905–939. Campbell, J. Y., A. W. Lo, and A. C. MacKinlay. 1997. The econometrics of financial markets, Princeton University Press, Princeton. Campbell, J., Y. L. Chan, and L. Viceira. 2003. “A multivariate model of strategic asset allocation.” Journal of Financial Economics 67, 41–80. Campello, M. 2006. “Debt financing: Does it boost or hurt firm performance in product markets?” Journal of Financial Economics 82, 135. Canina, L. and S. Figlewski. 1993. “The informational content of implied volatility.” Review of Financial Studies 6(3), 659–681. Cantor, R. and F. Packer. 1996. “Determinants and impact of sovereign credit ratings.” FRBNY Economic Policy Review 2, 37–53. Cao, C. and J. Huang. 2008. “Determinants of S&P 500 index option returns.” Review of Derivatives Research 10, 1–38. Cao, C., Z. Chen, and J. M. Griffin. 2003. “The informational content of option volume prior to takeovers.” Journal of Business 78, 1071–1109. Cao, C., F. Yu, and Z. Zhong. 2007. The information content of option-implied volatility for credit default swap valuation, Working paper. Capobianco, E. 1996. “State-space stochastic volatility models: a review of estimation algorithms.” Applied Stochastic Models and Data Analysis 12, 265–279. Capon, N. 1978. “Discrimination in screening of credit applicants.” Harvard Business Review 56(3), 8–12. Caporale, M. G., N. Pittis, and N. Spagnolo. 2002. “Testing for causality in variance: an application to the East Asian markets.” International Journal of Finance and Economics 7, 235–245. Cappiello, L., R. Engle, and K. Sheppard. 2006. “Asymmetric dynamics in the correlations of global equity and bond returns.” Journal of Financial Econometrics 4, 537–572. Carleton, W. T. 1970. “An analytical model for long-range financial planning.” Journal of Finance 25(2), 291–315. Carhart, M. M. 1997. “On persistence in mutual fund performance.” Journal of Finance 52, 57–82. Cario, M. C. and B. L. Nelson. 1996. “Autoregressive to anything: time-series input processes for simulation.” Operations Research Letters 19, 51–58. Carlsson, C. and R. Fullér. 2003. “A fuzzy approach to real option valuation.” Fuzzy Sets and Systems 139, 297–312. Carr, P. and D. Madan. 1999. “Option valuation using the fast Fourier transform.” Journal of Computational Finance 2(4), 61–73. Carr, P. and L. Wu. 2004.
“Time-changed Lévy processes and option pricing.” Journal of Financial Economics 71, 113–141. Carverhill, A. P. and L. J. Clewlow. 1990. “Valuing average rate (‘Asian’) options.” Risk 3, 25–29.
Casella, G. and E. I. George. 1992. “Explaining the Gibbs sampler.” The American Statistician 46(3), 167–174. Casassus, J. and P. Collin-Dufresne. 2005. “Stochastic convenience yield implied from commodity futures and interest rates.” Journal of Finance 60, 2283–2331. Casassus, J., P. Collin-Dufresne, and R. S. Goldstein. 2005. “Unspanned stochastic volatility and fixed income derivative pricing.” Journal of Banking and Finance 29, 2723–2749. Cason, T. and C. Noussair. 2001. “The experimental study of market behavior,” Advances in Experimental Markets, Springer, New York, pp. 1–14. Castellacci, G. M. and J. Siclari. 2003. “Asian basket spreads and other exotic options.” Power Risk Management. http://www.olf.com/news/articles/2005/articles_20050610.aspx. Cathcart, L. and L. El-Jahel. 1998. “Valuation of defaultable bonds.” Journal of Fixed Income 8, 65–78. Cebenoyan, A. S., E. S. Cooperman, and C. A. Register. 1999. “Ownership structure, charter value, and risk-taking behavior for thrifts.” Financial Management 28(1), 43–60. Cecchetti, S. G., P. Lam, and N. C. Mark. 1993. “The equity premium and the risk-free rate: matching the moments.” Journal of Monetary Economics 31, 21–46. Cesa-Bianchi, N. and G. Lugosi. 2006. Prediction, learning, and games, Cambridge University Press, Cambridge. Cesari, R. and D. Cremonini. 2003. “Benchmarking, portfolio insurance and technical analysis: a Monte Carlo comparison of dynamic strategies of asset allocation.” Journal of Economic Dynamics and Control 27, 987–1011. Çetin, U. 2003. Default and liquidity risk modeling, PhD thesis, Cornell University. Çetin, U., R. Jarrow, P. Protter, and M. Warachka. 2002. Option pricing with liquidity risk, Working paper, Cornell University. Çetin, U., R. Jarrow, P. Protter, and Y. Yildirim. 2004. “Modeling credit risk with partial information.” Annals of Applied Probability 14, 1167–1178. Chacko, G. and S. Das. 2002. “Pricing interest rate derivatives: a general approach.” Review of Financial Studies 15, 195–241. Chakravarty, S., H. Gulen, and S. Mayhew. 2004. Informed trading in stock and option markets, Working Paper, Purdue University, Virginia Tech, University of Georgia and SEC. Chalasani, P., S. Jha, and A. Varikooty. 1998. “Accurate approximations for European Asian options.” Journal of Computational Finance 1(4), 11–29. Chalasani, P., S. Jha, F. Egriboyun, and A. Varikooty. 1999. “A refined binomial lattice for pricing American Asian options.” Review of Derivatives Research 3(1), 85–105. Chambers, D. R., R. S. Harris, and J. J. Pringle. 1982. “Treatment of financing mix in analyzing investment opportunities.” Financial Management 11 (Summer 1982), 24–41. Chan, K. S. 1993. “Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model.” The Annals of Statistics 21, 520–533. Chan, T. 1999. “Pricing contingent claims on stocks driven by Lévy processes.” Annals of Applied Probability 9, 504–528. Chan, T. 2005. “Pricing perpetual American options driven by spectrally one-sided Lévy processes,” in Exotic option pricing and advanced Lévy models, A. Kyprianou (Ed.). Wiley, England, pp. 195–216. Chan, C. K. and N. F. Chen. 1988. “An unconditional asset pricing test and the role of size as an instrumental variable for risk.” Journal of Finance 43, 309–325. Chan, L. K. C. and J. Lakonishok. 1993. “Are the reports of beta’s death premature?” Journal of Portfolio Management 19 (Summer), 51–62. Chan, L. and J. Lakonishok. 2004.
“Value and growth investing: review and update.” Financial Analysts Journal 60(1), 71–87. Chan, C. and B. Lewis. 2002. “A basic primer on data mining.” Information Systems Management 19, 56–60.
Chan, S. H., J. Martin, and J. Kensinger. 1990. “Corporate research and development expenditures and share value.” Journal of Financial Economics 26(2), 255–276. Chan, K., A. Karolyi, F. Longstaff, and A. Sanders. 1992. “An empirical comparison of alternative models of the short-term interest rate.” Journal of Finance 47, 1209–1227. Chan, K., P. Chung, and H. Johnson. 1993. “Why option prices lag stock prices: a trading-based explanation.” Journal of Finance 48, 1957–1967. Chan, L. K., J. Karceski, and J. Lakonishok. 1999. “On portfolio optimization: forecasting covariances and choosing the risk model.” Review of Financial Studies 12, 937–974. Chan, L. K. C., J. Lakonishok, and T. Sougiannis. 2001. “The stock market valuation of research and development expenditures.” Journal of Finance 56(6), 2431–2456. Chan, K., Y. P. Chung, and W. M. Fong. 2002. “The informational role of stock and option volume.” Review of Financial Studies 15, 1049–1075. Chance, D. M. 1986. “Empirical tests of the pricing of index call options.” Advances in Futures and Options Research 1(A), 141–166. Chance, D. 2006. An introduction to derivatives and risk management, Thomson South-Western, Mason. Chang, S. 1998. “Takeovers of privately held targets, methods of payment, and bidder returns.” Journal of Finance 53, 773–784. Chang, C. C. and H. C. Fu. 2001. “A binomial options pricing model under stochastic volatility and jump.” CJAS 18(3), 192–203. Chang, E., C. Eun and R. Kolodny. 1995. “International diversification through closed-end country funds.” Journal of Banking and Finance 19, 1237–1263. Chang, C., J. S. Chang, and M.-T. Yu. 1996. “Pricing catastrophe insurance futures call spreads: a randomized operational time approach.” Journal of Risk and Insurance 63, 599–617. Chang, J., V. Errunza, K. Hogan, and M. Hung. 2005. “An intertemporal international asset pricing model: theory and empirical evidence.” European Financial Management 11, 218–233. Chang, C., A. C. Lee, and C. F. Lee. 2005. Reexamining the determinants of capital structure using structural equation modeling approach, Unpublished Paper, National Chengchi University, Taiwan. Chao, J. C. and N. R. Swanson. 2005. “Consistent estimation with a large number of weak instruments.” Econometrica 73, 1673. Chao, C. and E. Yu. 1996. “Are wholly foreign-owned enterprises better than joint ventures?” Journal of International Economics 40(1–2), 225–237. Chao, C. and E. Yu. 2000. “Domestic equity controls of multinational enterprises.” Manchester School 68(3), 321–330. Chao, C., W. Chou, and E. Yu. 2002. “Domestic equity control, capital accumulation and welfare: theory and China’s evidence.” World Economy 25(8), 1115–1127. Chapman, D. A. and N. D. Pearson. 2000. “Is the short rate drift actually nonlinear?” Journal of Finance 55, 355–388. Chapman, D. A. and N. D. Pearson. 2001. “Recent advances in estimating term-structure models.” Financial Analysts Journal 57, 77–95. Chaudhury, M. M. and C. F. Lee. 1997. “Functional form of stock return model: some international evidence.” The Quarterly Review of Economics and Finance 37(1), 151–183. Chauvin, K. W. and M. Hirschey. 1993. “Advertising, R&D expenditures and the market value of the firm.” Financial Management 22(4), 128–140. Chava, S. and R. A. Jarrow. 2004. “Bankruptcy prediction with industry effects.” Review of Finance 8, 537–569. Chava, S. and H. Tookes. 2005. Where (and how) does news impact trading? Unpublished working paper, Texas A&M University and Yale School of Management. Cheh, J. J. 2002.
“A heuristic approach to efficient production of detector sets for an artificial immune algorithm-based bankruptcy prediction system.” Proceedings of 2002 IEEE World Congress on Computational Intelligence, pp. 932–937. Chen, N.-F. 1983. “Some empirical tests of the theory of arbitrage pricing.” Journal of Finance 38, 1393–1414. Chen, H. 2007. Macroeconomic conditions and the puzzles of credit spreads and capital structure, Working paper, University of Chicago. Chen, S. N. and S. J. Brown. 1983. “Estimation risk and simple rules for optimal portfolio selection.” Journal of Finance 38, 1087–1093. Chen, J. and A. K. Gupta. 1997. “Testing and locating variance changepoints with application to stock prices.” Journal of the American Statistical Association 92, 739–747. Chen, N. and S. Kou. 2005. Credit spreads, optimal capital structure, and implied volatility with endogenous default and jump risk, Working paper, Columbia University. Chen, R. R. and C. F. Lee. 1993. “A constant elasticity of variance (CEV) family of stock price distributions in option pricing: review and integration.” Journal of Financial Studies 1, 29–51. Chen, R.-R. and O. Palmon. 2005. “A non-parametric option pricing model: theory and empirical evidence.” Review of Quantitative Finance and Accounting 24(2), 115–134. Chen, R. and L. Scott. 1992. “Pricing interest rate options in a two-factor Cox–Ingersoll–Ross model of the term structure.” Review of Financial Studies 5, 613–636. Chen, R. and L. Scott. 1993. “Maximum likelihood estimation for a multifactor equilibrium model of the term structure of interest rates.” Journal of Fixed Income 3, 14–32. Chen, R. and L. Scott. 1994a. “Multi-factor Cox–Ingersoll–Ross models of the term structure of interest rates: estimates and tests using a Kalman filter,” Working Paper, Rutgers University and University of Georgia. Chen, R. and L. Scott. 1994b. “Interest rate options in multi-factor CIR models of the term structure,” Rutgers University and University of Georgia. Chen, R. and L. Scott. 2001. Stochastic volatility and jumps in interest rates: an empirical analysis, Working paper, Rutgers University. Chen, S. S., C. F. Lee, and K. Shrestha. 2001. “On a mean-generalized semivariance approach to determining the hedge ratio.” Journal of Futures Markets 21, 581–598. Chen, S., K. W. Ho, K. H. Ik, and C. F. Lee. 2002. “How does strategic competition affect firm values? A study of new product announcements.” Financial Management 31 (Summer), 67–84. Chen, S. S., C. F. Lee, and K. Shrestha. 2003. “Futures hedge ratio: a review.” The Quarterly Review of Economics and Finance 43, 433–465. Chen, S.-S., K. W. Ho, C.-F. Lee, and K. Shrestha. 2004. “Nonlinear models in corporate finance research: review, critique, and extensions.” Review of Quantitative Finance and Accounting 22(2), 141–169. Chen, C. R., T. Steiner, and A. Whyte. 2006. “Does stock option-based executive compensation induce risk-taking? An analysis of the banking industry.” Journal of Banking and Finance 30, 915–946. Chen, R., S. Hu, and G. Pan. 2006. Default prediction of various structural models, Working paper, Rutgers University, National Taiwan University, and National Ping-Tung University of Sciences and Technologies. Chen, R. R., F. J. Fabozzi, G. G. Pan, and R. Sverdlove. 2006. “Sources of credit risk: evidence from credit default swaps.” The Journal of Fixed Income 16, 7–21. Chen, Y. T., C. F. Lee, and Y. C. Sheu. 2007. “An ODE approach for the expected discounted penalty at ruin in a jump-diffusion model.” Finance and Stochastics 11, 323–355. Chen, L., D. A. Lesmond, and J. Wei. 2007.
“Corporate yield spreads and bond liquidity.” Journal of Finance 62, 119.
Chen, F., K. Yee, and Y. Yoo. 2007. “Did adoption of forward-looking methods improve valuation accuracy in shareholder litigation?” Journal of Accounting, Auditing, and Finance 22(4), 573–598. Chen, R., C. Lee, and H. Lee. 2008. Default prediction of alternative structural credit risk models and implications of default barriers, Working paper, Rutgers University. Chen, R. R., X. Cheng, F. J. Fabozzi, and B. Liu. 2008. “An explicit, multi-factor credit default swap pricing model with correlated factors.” Journal of Financial and Quantitative Analysis 43(1), 123–160. Chen, S. S., C. F. Lee, and H. H. Lee. 2010. “Alternative methods to determine optimal capital structure: theory and application.” Handbook of Quantitative Finance and Risk Management (forthcoming). Chen, F., K. Yee, and Y. Yoo. 2010. “Robustness of judicial decisions to valuation-method innovation: an exploratory empirical study.” Journal of Business, Accounting, and Finance, forthcoming. Chen, L., P. Collin-Dufresne, and R. Goldstein. in press. On the relation between the credit spread puzzle and the equity premium puzzle. Review of Financial Studies. Chen, Y. T., C. F. Lee, and Y. C. Sheu. On the discounted penalty at ruin in a jump-diffusion model, Taiwanese Journal of Mathematics (accepted). Cheng, J. K. 1993. “Managerial flexibility in capital investment decision: insights from the real-options literature.” Journal of Accounting Literature 12, 29–66. Cheridito, P., D. Filipović, and R. L. Kimmel. 2007. “Market price of risk specifications for affine models: theory and evidence.” Journal of Financial Economics 83, 123–170. Chernov, M. and E. Ghysels. 2000. “A study towards a unified approach to the joint estimation of objective and risk neutral measures for the purpose of options valuation.” Journal of Financial Economics 56, 407–458. Chernov, M., R. Gallant, E. Ghysels, and G. Tauchen. 2003. “Alternative models for stock price dynamics.” Journal of Econometrics 116(1–2), 225–257. Cheung, Y. 2007. “An empirical model of daily highs and lows.” International Journal of Finance and Economics 12, 1–20. Cheung, C. S. and C. C. Y. Kwan. 1988. “A note on simple criteria for optimal portfolio selection.” Journal of Finance 43, 241–245. Chevalier, J. and G. Ellison. 1997. “Risk taking by mutual funds as a response to incentives.” Journal of Political Economy 105, 1167–1200. Chiang, R. C. and M. P. Narayanan. 1991. “Bond refunding in efficient markets: a dynamic analysis with tax effects.” Journal of Financial Research 14(4), 287–302. Chiappori, P.-A. and B. Salanie. 2000. “Testing for asymmetric information in insurance markets.” Journal of Political Economy 108, 56–78. Chib, S., F. Nardari, and N. Shephard. 2002. “Markov Chain Monte Carlo methods for stochastic volatility models.” Journal of Econometrics 108, 281–316. Chidambaran, N. K. 2003. Genetic programming with Monte Carlo simulation for option pricing, Unpublished working paper, Rutgers University. Chidambaran, N. K. and S. Figlewski. 1995. “Streamlining Monte Carlo simulation with the quasi-analytic method: analysis of a path-dependent option strategy.” Journal of Derivatives 3(2), 29–51. Chidambaran, N. K., C. H. Jevons Lee, and J. Trigueros. 1999. “An adaptive evolutionary approach to option pricing via genetic programming,” in Computational Finance – Proceedings of the Sixth International Conference, Y. S. Abu-Mostafa, B. LeBaron, A. W. Lo, and A. S. Weigend (Eds.). MIT Press, Cambridge, MA. Childs, P., D. Mauer and S. Ott. 2005.
“Interactions of corporate financing and investment decisions: the effects of agency conflicts.” Journal of Financial Economics 76, 667–690.
Chiou, W. 2008. “Who benefits more from international diversification?” Journal of International Financial Markets, Institutions, and Money 18, 466–482. Chiou, W., A. C. Lee, and C. Chang. 2009. “Do investors still benefit from international diversification with investment constraints?” Quarterly Review of Economics and Finance 49(2), 448–483. Chiras, D. and S. Manaster. 1978. “The information content of option prices and a test of market efficiency.” Journal of Financial Economics 6, 213–234. Chitashvili, R. 1983. “Martingale ideology in the theory of controlled stochastic processes,” in Lecture Notes in Mathematics, Vol. 1021, Springer, Berlin, pp. 73–92. Chitashvili, R. and M. Mania. 1987. “Optimal locally absolutely continuous change of measure: finite set of decisions.” Stochastics and Stochastics Reports 21, 131–185 (part 1), 187–229 (part 2). Chitashvili, R. and M. Mania. 1996. “Generalized Itô’s formula and derivation of Bellman’s equation,” in Stochastic processes and related topics, Stochastics Monograph Vol. 10, H. J. Engelbert et al. (Eds.). Gordon & Breach, London, pp. 1–21. Cho, M.-H. 1998. “Ownership structure, investment, and the corporate value: an empirical analysis.” Journal of Financial Economics 47, 103. Chod, J. and E. Lyandres. 2008. Strategic IPOs and product market competition, Working Paper, University of Rochester. Choi, T. S. and R. R. Levary. 1989. “Multi-national capital budgeting using chance-constrained goal programming.” International Journal of Systems Science 20, 395–414. Choi, J. and F. Longstaff. 1985. “Pricing options on agricultural futures: an application of the constant elasticity of variance option pricing model.” Journal of Futures Markets 5(2), 247–258. Choi, S. and A. C. Pritchard. 2004. “Should issuers be on the hook for laddering?” Working Paper, Law and Economics, Michigan University, No. 04-009/CLEO USC No. C04-3. Choo, E. U. and W. C. Wedley. 2004. “A common framework for deriving preference values from pairwise comparison matrices.” Computers and Operations Research 31, 893–908. Chordia, T., R. Roll, and A. Subrahmanyam. 2000. “Commonality in liquidity.” Journal of Financial Economics 56, 3–28. Chordia, T., R. Roll, and A. Subrahmanyam. 2008. “Liquidity and market efficiency.” Journal of Financial Economics 87, 249–268. Chou, R. 2005. “Forecasting financial volatilities with extreme values: the conditional autoregressive range (CARR) model.” Journal of Money, Credit and Banking 37, 561–582. Chou, R. 2006. “Modeling the asymmetry of stock movements using price ranges.” Advances in Econometrics 20A, 231–257. Chou, R. Y. and N. Liu. 2008a. The economic value of volatility timing using a range-based volatility model, Working paper, Academia Sinica. Chou, R. Y. and N. Liu. 2008b. Estimating time-varying hedge ratios with a range-based multivariate volatility model, Working paper, Academia Sinica. Chou, H. and D. Wang. 2007. “Forecasting volatility of UK stock market: a test of conditional autoregressive range (CARR) model.” International Research Journal of Finance and Economics 10, 7–13. Chou, R. Y., N. Liu, and C. Wu. 2007. Forecasting time-varying covariance with a range-based dynamic conditional correlation model, Working paper, Academia Sinica. Christie, W. and P. Schultz. 1994. “Why do NASDAQ market makers avoid odd-eighth quotes?” Journal of Finance 49, 1813–1840. Christensen, M. and C. Munk. 2004. Note on numerical solution of Black–Scholes–Merton PDE, Working paper, University of Southern Denmark. Christensen, B. J. and N. R. Prabhala.
1998. “The relation between implied and realized volatility.” Journal of Financial Economics 50, 125–150.
Christensen, K. and M. Podolskij. 2007. “Realized range-based estimation of integrated variance.” Journal of Econometrics 141, 323–349. Christie, W., J. Harris, and P. Schultz. 1994. “Why did NASDAQ market makers stop avoiding odd-eighth quotes?” Journal of Finance 49, 1841–1860. Christoffersen, P. 1998. “Evaluating interval forecasts.” International Economic Review 39, 841–862. Christoffersen, P. F. 2002. Elements of financial risk management, Academic Press, San Diego. Christoffersen, P. and K. Jacobs. 2004. “The importance of the loss function in option valuation.” Journal of Financial Economics 72, 291–318. Christopherson, J. A., W. E. Ferson and D. Glassman. 1998. “Conditioning manager alpha on economic information: another look at the persistence of performance.” Review of Financial Studies 11, 111–142. Chu, C.-S. J. 1995. “Detecting parameter shift in GARCH models.” Econometric Reviews 14, 241–266. Chu, C.-S. J., M. Stinchcombe, and H. White. 1996. “Monitoring structural change.” Econometrica 64, 1045–1065. Chui-Yu, C. and S. P. Chan. 1998. “Capital budgeting decisions with fuzzy projects.” The Engineering Economist 43(2), 125–150. Chung, S. L. and P. T. Shih. 2007. “Generalized Cox–Ross–Rubinstein binomial models.” Management Science 53, 508–520. Churchill, R. V. 1972. Operational mathematics, McGraw-Hill, New York. Churchill, G. A. Jr. 1983. Marketing research: methodological foundations, 3rd edn. CBS College Publishing, New York. Citron, J. T. and G. Nickelsburg. 1987. “Country risk and political instability.” Journal of Development Economics 25, 385–395. Claessens, S., S. Djankov, and L. Lang. 2000. “The separation of ownership and control in East Asian corporations.” Journal of Financial Economics 58, 81–112. Clark, P. K. 1973. “A subordinated stochastic process model with finite variance for speculative prices.” Econometrica 41, 135–155. Clarke, J., C. Dunbar, and K. M. Kahle. 2001. “Long-run performance and insider trading in completed and canceled seasoned equity offerings.” Journal of Financial and Quantitative Analysis 36, 415–430. Clemen, R. 1989. “Combining forecasts: a review and annotated bibliography.” International Journal of Forecasting 5, 559–583. Clemen, R. and R. Winkler. 1986. “Combining economic forecasts.” Journal of Economic and Business Statistics 4, 39–46. Cliff, M. T. and D. J. Denis. 2004. “Do initial public offering firms purchase analyst coverage with underpricing?” Journal of Finance 59, 2871. Cochrane, J. 1992. “Explaining the variance of price-dividend ratios.” Review of Financial Studies 5, 243–280. Cochrane, J. H. 1996. “A cross-sectional test of an investment-based asset pricing model.” Journal of Political Economy 104, 572–621. Cochrane, J. H. 1997. “Where is the market going? Uncertain facts and novel theories.” Economic Perspectives XXI(6) (Federal Reserve Bank of Chicago), 3–37. Cochrane, J. H. 1999. “New facts in finance.” Economic Perspectives XXIII(3) (Federal Reserve Bank of Chicago), 36–58. Cochrane, J. H. 1999. New facts in finance, Working Paper, University of Chicago, Chicago. Cochrane, J. H. 2001. Asset pricing, Princeton University Press, Princeton, NJ. Cochrane, J. H. and J. Saá-Requejo. 2000. “Beyond arbitrage: good-deal asset price bounds in incomplete markets.” Journal of Political Economy 108, 79–119. Cohen, K. and J. Pogue. 1967. “An empirical evaluation of alternative portfolio-selection models.” Journal of Business 40, 166–193. Cohen, R., W. Lewellen, R. Lease, and G. Schlarbaum. 1975.
“Individual investor risk aversion and investment portfolio composition.” Journal of Finance 30, 605–620.
Cohen, K., S. Maier, R. Schwartz, and D. Whitcomb. 1979. “Market makers and the market spread: a review of recent literature.” The Journal of Financial and Quantitative Analysis 14, 813–835. Cohen, K., S. Maier, R. Schwartz, and D. Whitcomb. 1981. “Transaction costs, order placement strategy, and existence of the bid-ask spread.” The Journal of Political Economy 89, 287–305. Collin-Dufresne, P. and R. S. Goldstein. 2001. “Do credit spreads reflect stationary leverage ratios?” The Journal of Finance 56(5), 1929–1958. Collin-Dufresne, P. and R. S. Goldstein. 2002. “Do bonds span the fixed income markets? Theory and evidence for unspanned stochastic volatility.” Journal of Finance 57, 1685–1729. Collin-Dufresne, P. and R. S. Goldstein. 2003. Stochastic correlation and the relative pricing of caps and swaptions in a generalized affine framework, Working paper, Carnegie Mellon University. Collin-Dufresne, P. and B. Solnik. 2001. “On the term structure of default premia in the swap and LIBOR markets.” Journal of Finance 56, 1095–1115. Collin-Dufresne, P., R. S. Goldstein, and J. S. Martin. 2001. “The determinants of credit spread changes.” Journal of Finance 56, 2177–2207. Collin-Dufresne, P., R. Goldstein, and J. Hugonnier. 2004. “A general formula for valuing defaultable securities.” Econometrica 72(5), 1377–1407. Collins, A., A. E. Scorcu, and R. Zanola. 2007. “Sample selection bias and time instability of hedonic art price indexes,” Quaderni, Working Papers DSE No. 610. Conley, T. G., L. P. Hansen, E. G. J. Luttmer, and J. A. Scheinkman. 1997. “Short-term interest rates as subordinated diffusions.” Review of Financial Studies 10, 525–577. Conrad, J. and G. Kaul. 1988. “Time-variation in expected returns.” Journal of Business 61, 409–425. Conrad, J., A. Hammeed, and C. Niden. 1994. “Volume and autocovariances in short-horizon individual security returns.” Journal of Finance 49, 1305–1329. Conrad, J., K. M. Johnson, and S. Wahal. 2003. “Institutional trading and alternative trading systems.” Journal of Financial Economics 70, 99. Constantinides, G. M. 1983. “Capital market equilibrium with personal tax.” Econometrica 51, 611–636. Constantinides, G. M. 1984. “Optimal stock trading with personal taxes: implications for prices and the abnormal January returns.” Journal of Financial Economics 13, 65–89. Constantinides, G. M. 1992. “A theory of the nominal term structure of interest rates.” Review of Financial Studies 5(4), 531–552. Constantinides, G. and T. Zariphopoulou. 1999. “Bounds on prices of contingent claims in an intertemporal economy with proportional transaction costs and general preferences.” Finance and Stochastics 3, 345–369. Cook, R. D. and S. Weisberg. 1982. Residuals and influence in regression, Chapman & Hall/CRC, London. Cooper, I. A. and A. S. Mello. 1991. “The default risk of swaps.” Journal of Finance 46, 597–620. Cooper, I. A. and K. G. Nyborg. 2006. “The value of tax shields is equal to the present value of the tax shields.” Journal of Financial Economics 81, 215–225. Copeland, T. 1976. “A model of asset trading under the assumption of sequential information arrival.” Journal of Finance 31, 1149–1168. Copeland, T. and M. Copeland. 1999. “Managing corporate FX risk: a value-maximizing approach.” Financial Management 28(3), 68–75. Copeland, T. and D. Galai. 1983. “Information effects on the bid-ask spread.” Journal of Finance 38, 1457–1469. Copeland, T. E. and J. F. Weston. 1988. Financial theory and corporate policy, 3rd ed., Addison-Wesley, New York. Cornell, B. and K.
Green. 1991. “The investment performance of low-grade bond funds.” The Journal of Finance 46(1), 29–48.
Corrado, J. C. and T. Su. 1997. “Implied volatility skews and stock index skewness and kurtosis implied by S&P 500 index option prices.” Journal of Derivatives 4(4), 8–19. Corrado, C. and C. Truong. 2007. “Forecasting stock index volatility: comparing implied volatility and the intraday high-low price range.” Journal of Financial Research 30, 201–215. Cortazar, G., E. S. Schwartz, and J. Casassus. 2001. “Optimal exploration investments under price and geological-technical uncertainty: a real options model.” R&D Management 31(2), 181–190. Cosset, J. and J. Suret. 1995. “Political risk and the benefit of international portfolio diversification.” Journal of International Business Studies 26, 301–318. Cossin, D. and T. Hricko. 2001. Exploring for the determinants of credit risk in credit default swap transaction data, Working Paper. Cosslett, S. R. 1981. “Efficient estimation of discrete-choice models,” in Structural analysis of discrete data with econometric applications, C. F. Manski and D. McFadden (Eds.). MIT Press, London, pp. 51–111. Costabile, M., I. Massabo, and E. Russo. 2006. “An adjusted binomial model for pricing Asian options.” Review of Quantitative Finance and Accounting 27(3), 285–296. Costanigro, M., J. J. McCluskey, and R. C. Mittelhammer. 2007. “Segmenting the wine market based on price: hedonic regression when different prices mean different products.” Journal of Agricultural Economics 58(3), 454–466. Costinot, A., T. Roncalli, and J. Teiletche. 2000. Revisiting the dependence between financial markets with copulas, Working paper, Groupe de Recherche Operationnelle, Credit Lyonnais. Cotton, P., J.-P. Fouque, G. Papanicolaou, and R. Sircar. 2004. “Stochastic volatility corrections for interest rate derivatives.” Mathematical Finance 14(2), 173–200. Coulson, N. and R. Robins. 1993. “Forecast combination in a dynamic setting.” Journal of Forecasting 12, 63–67. Counsell, G. 1989. “Focus on workings of insolvency act.” The Independent, 4th April. Court, A. T. 1939. “Hedonic price indexes with automotive examples.” The dynamics of automobile demand, General Motors, New York. Cox, J. 1975. Notes on option pricing I: constant elasticity of variance diffusion, Unpublished Note, Stanford University, Graduate School of Business. Also, Journal of Portfolio Management (1996) 23, 5–17. Cox, J. 1996. “Notes on option pricing I: constant elasticity of variance diffusion.” Journal of Portfolio Management 23, 15–17. Cox, D. R. and H. D. Miller. 1990. The theory of stochastic processes, Chapman & Hall, New York. Cox, J. and S. Ross. 1975. The pricing of options for jump processes, Wharton Working Paper #2–75, University of Pennsylvania. Cox, J. C. and S. A. Ross. 1975. Notes on option pricing I: constant elasticity of variance diffusion, Working paper, Stanford University. Cox, J. C. and S. A. Ross. 1976. “The valuation of options for alternative stochastic processes.” Journal of Financial Economics 3, 145–166. Cox, J. C. and S. A. Ross. 1976. “A survey of some new results in financial option pricing theory.” The Journal of Finance 31(2), 383–402. Cox, J. C. and M. Rubinstein. 1985. Options markets, Prentice-Hall, Englewood Cliffs, NJ. Cox, S. H. and R. G. Schwebach. 1992. “Insurance futures and hedging insurance price risk.” Journal of Risk and Insurance 59, 628–644. Cox, J. C., S. A. Ross, and M. Rubinstein. 1979.
“Option pricing: a simplified approach.” Journal of Financial Economics 7, 229–263. Cox, J., J. Ingersoll, and S. Ross. 1980. “An analysis of variable rate loan contracts.” Journal of Finance 35, 389–403.
Cox, J. C., J. E. Ingersoll, and S. A. Ross. 1981. “A re-examination of traditional hypotheses about the term structure of interest rates.” Journal of Finance 36, 769–799. Cox, J. C., J. E. Ingersoll Jr. and S. A. Ross. 1985. “An intertemporal general equilibrium model of asset prices.” Econometrica 53, 363–384. Cox, J. C., J. E. Ingersoll, and S. A. Ross. 1985. “A theory of the term structure of interest rates.” Econometrica 53(2), 385–407. Cox, S. H., J. R. Fairchild, and H. W. Pederson. 2004. “Valuation of structured risk management products.” Insurance: Mathematics and Economics 34, 259–272. Crabbe, L. and J. Helwege. 1994. “Alternative tests of agency theories of callable corporate bonds.” Financial Management 23(4), 3–20. Crack, T. 1999. “A classic case of “data snooping” for classroom discussion.” Journal of Financial Education 92–97. Cragg, J. G. and S. G. Donald. 1993. “Testing identifiability and specification in instrumental variable models.” Econometric Theory 9, 222. Crama, Y., P. L. Hammer, and T. Ibaraki. 1988. “Cause-effect relationships and partially defined boolean functions.” Annals of Operations Research 16, 299–326. Cramer, J. S. 1991. The logit model: an introduction for economists, Edward Arnold, London. Crawford, A. J., J. R. Ezzell, and J. A. Miles. 1995. “Bank CEO pay-performance relations and the effects of deregulation.” Journal of Business 68, 231–256. Cremers, K. 2002. “Stock return predictability: a Bayesian model selection perspective.” Review of Financial Studies 15(4), 1223–1249. Cremers, M., J. Driessen, P. Maenhout, and D. Weinbaum. in press. Explaining the level of credit spreads: option-implied jump risk premia in a firm value model. Review of Financial Studies. Crosbie, P. and J. Bohn. 2002. Modeling default risk, KMV Corporation, San Francisco, CA. Csiszar, I. 1975. “I-divergence geometry of probability distributions and minimization problems.” Annals of Probability 3, 146–158. Culp, C. L., D. Furbush, and B. T. Kavanagh. 1994. “Structured debt and corporate risk-management.” Journal of Applied Corporate Finance 7(3) (Fall), 73–84. Cumby, R. and J. Glen. 1990. “Evaluating the performance of international mutual funds.” Journal of Finance 45(2), 497–521. Cummins, J. D. 1990. “Multi-period discounted cash flow rate-making models in property-liability insurance.” Journal of Risk and Insurance 57(1), 79–109. Cummins, J. D. 1992. “Financial pricing of property-liability insurance,” in Contributions to insurance economics, G. Dionne (Ed.). Kluwer Academic, Boston, 141–168. Cummins, J. D. and H. Geman. 1994. “An Asian option approach to the valuation of insurance futures contracts.” Working paper, Financial Institutions Center University of Pennsylvania 03–94. Cummins, J. D. and H. Geman. 1995. “Pricing catastrophe futures and call spreads: an arbitrage approach.” Journal of Fixed Income (March), 46–57. Cummins, J. D. and S. E. Harrington. 1987. Fair rate of return in property-liability insurance, Kluwer Academic, Norwell, MA. Cummins, J. D. and D. J. Nye. 1972. “The cost of capital of insurance companies: comment.” Journal of Risk and Insurance 39(3), 487–491. Cummins, J. D., R. D. Phillips, and S. D. Smith. 1998. “The rise of risk-management.” Economic Review, Federal Reserve Bank of Atlanta 83, 30–40. Curran, M. 1994. “Valuing Asian and portfolio options by conditioning on the geometric mean.” Management Science 40, 1705–1711. Curry, T. and L. Shibut. 2000. “Cost of the S&L crisis.” FDIC Banking Review 13(2), 26–35. Cutler, D. M., J. M. Poterba and L. H. Summers.
1991. “Speculative dynamics.” Review of Economic Studies 58, 529–546.
Cvitanic, J. and I. Karatzas. 1996. “Hedging and portfolio optimization under transaction costs: a martingale approach.” Mathematical Finance 6, 133–165. Cvitanic, J. and J. Ma. 1996. “Hedging options for a large investor and forward-backward SDEs.” Annals of Applied Probability 6, 370–398. Cvitanic, J., H. Pham, and N. Touzi. 1999. “A closed-form solution to the problem of super-replication under transaction costs.” Finance and Stochastics 3, 35–54. Czyzyk, J., M. P. Mesnier, and J. J. Moré. 1998. “The NEOS server.” IEEE Computational Science and Engineering 5(3), 68–75. Dahiya, S. and L. Klapper. 2007. “Who survives? A cross-country comparison.” Journal of Financial Stability 3, 261–278. Dahya, J., J. J. McConnell, and N. G. Travlos. 2002. “The Cadbury Committee, corporate performance, and top management turnover.” Journal of Finance 57(1), 461–483. Dai, Q. and K. J. Singleton. 2000. “Specification analysis of affine term structure models.” Journal of Finance 55(5), 1943–1978. Dai, Q. and K. Singleton. 2002. “Expectation puzzles, time-varying risk premia, and affine models of the term structure.” Journal of Financial Economics 63(3), 415–441. Dai, Q. and K. J. Singleton. 2002. “Fixed income pricing,” in Handbook of the economics of finance: financial markets and asset pricing, G. Constantinides, M. Harris, and R. M. Stulz (Eds.). Elsevier, Amsterdam. Dai, Q. and K. Singleton. 2003. “Term structure dynamics in theory and reality.” Review of Financial Studies 16(3), 631–678. Dai, Q., A. Le, and K. Singleton. 2006. Discrete-time dynamic term structure models with generalized market prices of risk, working paper, Graduate School of Business, Stanford University. Dai, Q., K. Singleton, and W. Yang. 2007. “Regime shifts in dynamic term structure model of the U.S. Treasury bond yields.” Review of Financial Studies 20, 1669–1706. Daigler, R. T. and M. K. Wiley. 1999. “The impact of trader type on the futures volatility-volume relation.” Journal of Finance 54, 2297–2316. Daines, R. 2001. “Does Delaware law improve firm value?” Journal of Financial Economics 62, 525. Damodaran, A. 2002. Investment valuation: tools and techniques for determining the value of any asset, 2nd ed., Wiley, NY. Damodaran, A. and C. H. Liu. 1993. “Insider trading as a signal of private information.” Review of Financial Studies 6, 79–119. Daniel, K. and S. Titman. 1997. “Evidence on the characteristics of cross-sectional variation in stock returns.” Journal of Finance 52, 1–33. Daniel, K. and S. Titman. 2006. “Market reactions to tangible and intangible information.” Journal of Finance 61(4), 1605–1643. Daniel, K., M. Grinblatt, S. Titman and R. Wermers. 1997. “Measuring mutual fund performance with characteristic-based benchmarks.” Journal of Finance 52, 1035–1058. Daniel, K., D. Hirshleifer, and A. Subrahmanyam. 2001. “Overconfidence, arbitrage, and equilibrium asset pricing.” Journal of Finance 56(3), 921–965. Daniel, K., S. Titman and K. C. J. Wei. 2001. “Explaining the cross-section of stock returns in Japan: factors or characteristics?” Journal of Finance 56, 743–766. Dao, B. and M. Jeanblanc. 2006. Double exponential jump diffusion process: a structural model of endogenous default barrier with rollover debt structure, Working paper, The Université Paris Dauphine and the Université d’Évry. Daouk, H. 2001. “Explaining asymmetric volatility: a new approach,” Western Finance Association Meetings, Tucson, Arizona. D’Arcy, S. P. and N. A. Doherty. 1988. Financial theory of insurance pricing, S.S.
Huebner Foundation, Philadelphia. Das, S. R. 2002. “The surprise element: jumps in interest rates.” Journal of Econometrics 106, 27–65.
Das, S. R. and R. K. Sundaram. 1997. Of smiles and smirks: a term structure perspective, working paper, Harvard Business School and Stern School of Business, NYU, New York. Das, S. R. and R. K. Sundaram. 1999. “Of smiles and smirks: a term structure perspective.” Journal of Financial and Quantitative Analysis 34, 211–239. Das, S. and P. Tufano. 1996. “Pricing credit sensitive debt when interest rates, credit ratings and credit spreads are stochastic.” Journal of Financial Engineering 5, 161–198. Dassios, A. and J.-W. Jang. 2003. “Pricing of catastrophe reinsurance and derivatives using the Cox process with shot noise intensity.” Finance and Stochastics 7(1), 73–95. David, A. 2007. Inflation uncertainty, asset valuations, and the credit spreads puzzle. Review of Financial Studies 20. Davidson, R. and J. G. MacKinnon. 1981. “Several tests for model specification in the presence of alternative hypotheses.” Econometrica 49(3), 781–793. Davidson, R. and J. G. MacKinnon. 1993. Estimation and inference in econometrics. Oxford University Press, New York. Davidson, G. K., R. C. Nash, and R. R. Plumridge. 1998. Private placements, Practicing Law Institute, New York. Davis, M. and V. Lo. 2001. “Infectious defaults.” Quantitative Finance 1, 382–387. Davis, J. L., E. G. Fama and K. R. French. 2000. “Characteristics, covariances and average returns: 1929 to 1997.” Journal of Finance 55(1), 389–406. Davis, P., M. Pagano, and R. Schwartz. 2006. “Life after the big board goes electronic.” Financial Analysts Journal 62(5), 14–20. Davis, P., M. Pagano, and R. Schwartz. 2007. “Divergent expectation.” Journal of Portfolio Management 34(1), 84–95. Davydenko, S. A. and I. A. Strebulaev. 2007. “Strategic actions and credit spreads: an empirical investigation.” Journal of Finance 62, 2633–2671. Dawid, A. P. 1984. “Statistical theory: the prequential approach (with discussion).” Journal of the Royal Statistical Society Series A 147, 278–292. Dawid, A. P. 1986. “Probability forecasting,” in Encyclopedia of Statistical Sciences, S. Kotz, N. L. Johnson, and C. B. Read (Eds.), Vol. 7, Wiley, New York, pp. 210–218. Day, T. E. and C. M. Lewis. 1992. “Stock market volatility and the information content of stock index options.” Journal of Econometrics 52, 267–287. Day, T. and C. Lewis. 1997. “Initial margin policy and stochastic volatility in the crude oil futures market.” Review of Financial Studies 10, 303–332. De Jong, A., T. T. Nguyen, and M. A. Van Dijk. 2007. “Strategic debt: evidence from Bertrand and Cournot competition.” ERIM Report Series Reference No. ERS-2007-057-F&A. Deakin, E. B. 1976. “Distributions of financial accounting ratios: some empirical evidence.” Accounting Review 51(1), 90–96. DeAngelo, H. and R. W. Masulis. 1980. “Optimal capital structure under corporate and personal taxation.” Journal of Financial Economics 8, 3–29. Deaton, A. and G. Laroque. 1992. “On the behavior of commodity prices.” Review of Economic Studies 59, 1–23. Deaton, A. and G. Laroque. 2003. “A model of commodity prices after Sir Arthur Lewis.” Journal of Development Economics 71, 289–310. De Jong, F. and F. A. De Roon. 2005. “Time-varying market integration and expected returns in emerging markets.” Journal of Financial Economics 78, 583–613. De Jong, F. and J. Driessen. 2004. Liquidity risk premia in corporate bond and equity markets, Working paper, University of Amsterdam. De Jong, A., F. De Roon, and C. Veld. 1997.
“Out-of-sample hedging effectiveness of currency futures for alternative models and hedging strategies.” Journal of Futures Markets 17, 817–837.
Delianedis, G. and R. Geske. 2001. The components of corporate credit spreads: default, recovery, tax, jumps, liquidity, and market factors, Working paper, UCLA. SSRN eLibrary, http://ssrn.com/abstract=306479. De Long, B., A. Shleifer, L. Summers, and R. Waldmann. 1990. “Noise trader risk in financial markets.” Journal of Political Economy 98, 703–738. Delbaen, F. 1992. “Representing martingale measures when asset prices are continuous and bounded.” Mathematical Finance 2, 107–130. Delbaen, F. 2003. Personal communication. Delbaen, F. and W. Schachermayer. 1994. “A general version of the fundamental theorem of asset pricing.” Mathematische Annalen 300, 463–520. Delbaen, F. and W. Schachermayer. 1996. “The variance-optimal martingale measure for continuous processes.” Bernoulli 2, 81–105. Delbaen, F., P. Monat, W. Schachermayer, M. Schweizer, and Ch. Stricker. 1997. “Weighted norm inequalities and hedging in incomplete markets.” Finance and Stochastics 1, 181–227. Delbaen, F., P. Grandits, T. Rheinländer, D. Samperi, M. Schweizer, and C. Stricker. 2002. “Exponential hedging and entropic penalties.” Mathematical Finance 12(2), 99–123. Del Guercio, D. and J. Hawkins. 1999. “The motivation and impact of pension fund activism.” Journal of Financial Economics 52, 293–340. Dellacherie, C. and P. A. Meyer. 1980. Probabilités et potentiel. Chapitres V à VIII. Théorie des martingales, Hermann, Paris. DeLong, G. L. 2001. “Stockholder gains from focusing versus diversifying bank mergers.” Journal of Financial Economics 59, 221–252. DeLong, J. B., A. Shleifer, L. H. Summers and R. T. Waldmann. 1989. “The size and incidence of the losses from noise trading.” Journal of Finance 44, 681–696. DeMarzo, P. and D. Duffie. 1995. “Corporate incentives for hedging and hedge accounting.” Review of Financial Studies 8, 743–772. Demirgüç-Kunt, A. and V. Maksimovic. 1998. “Law, finance, and firm growth.” Journal of Finance 53, 2107–2137. Demmel, J., M. Heath, and H. van der Vorst. 1993. “Parallel numerical linear algebra,” in Acta numerica Vol. 2. Cambridge University Press, Cambridge, England. Demsetz, H. 1968. “The cost of transacting.” Quarterly Journal of Economics 82(1), 33–53. Demsetz, R. S. and P. E. Strahan. 1995. “Historical patterns and recent changes in the relationship between bank holding company size and risk.” FRBNY Economic Policy Review 1(2), 13–26. Dennis, P. and S. Mayhew. 2002. “Risk-neutral skewness: evidence from stock options.” Journal of Financial and Quantitative Analysis 37(3), 471–493. Dennis, S., D. Nandy, and I. G. Sharpe. 2000. “The determinants of contract terms in bank revolving credit agreements.” Journal of Financial and Quantitative Analysis 35, 87. Dentcheva, D. 2005. Optimization models with probabilistic constraints, in Probabilistic and randomized methods for design under uncertainty, G. Calafiore and F. Dabbene (Eds.). Springer, London, pp. 49–97. Dentcheva, D. and A. Ruszczyński. 2003a. “Optimization under linear stochastic dominance.” Comptes Rendus de l’Academie Bulgare des Sciences 56(6), 6–11. Dentcheva, D. and A. Ruszczyński. 2003b. “Optimization under nonlinear stochastic dominance.” Comptes Rendus de l’Academie Bulgare des Sciences 56(7), 19–25. Dentcheva, D. and A. Ruszczyński. 2003c. “Optimization with stochastic dominance constraints.” SIAM Journal on Optimization 14, 548–566.
Dentcheva, D. and A. Ruszczyński. 2004a. “Optimality and duality theory for stochastic optimization problems with nonlinear dominance constraints.” Mathematical Programming 99, 329–350. Dentcheva, D. and A. Ruszczyński. 2004b. “Convexification of stochastic ordering constraints.” Comptes Rendus de l’Academie Bulgare des Sciences 57(3), 5–10. Dentcheva, D. and A. Ruszczyński. 2004c. “Semi-infinite probabilistic optimization: first order stochastic dominance constraints.” Optimization 53, 583–601. Dentcheva, D. and A. Ruszczyński. 2006a. “Portfolio optimization with stochastic dominance constraints.” Journal of Banking and Finance 30(2), 433–451. Dentcheva, D. and A. Ruszczyński. 2006b. “Inverse stochastic dominance constraints and rank dependent expected utility theory.” Mathematical Programming 108, 297–311. Dentcheva, D. and A. Ruszczyński. 2008. “Duality between coherent risk measures and stochastic dominance constraints in risk-averse optimization.” Pacific Journal of Optimization 4, 433–446. Dentcheva, D. and A. Ruszczyński. 2010. Inverse cutting plane methods for optimization problems with second order stochastic dominance constraints, Optimization, in press. Deo, R. S. and C. M. Hurvich. 2001. “On the log periodogram regression estimator of the memory parameter in long memory stochastic volatility models.” Econometric Theory 17, 686–710. Derman, E., I. Kani, and J. Zhou. 1996. “The local volatility surface: unlocking the information in index option prices.” Financial Analysts Journal 52, 25–36. DeRoon, F. A. and T. E. Nijman. 2001. “Testing for mean-variance spanning: a survey.” Journal of Empirical Finance 8, 111–155. De Roon, F. A., T. E. Nijman, and B. J. Werker. 2001. “Testing for mean-variance spanning with short sales constraints and transaction costs: the case of emerging markets.” Journal of Finance 56, 721–742. Derrien, F. 2005. “IPO pricing in ‘hot’ market conditions: who leaves money on the table?” Journal of Finance 60, 487–521. Desai, M. A., C. F. Foley, and J. R. Hines Jr. 2004. “A multinational perspective on capital structure choice and internal capital markets.” The Journal of Finance 59, 2451–2487. De Santis, G. and B. Gerard. 1997. “International asset pricing and portfolio diversification with time-varying risk.” Journal of Finance 52, 1881–1912. de Servigny, A. and O. Renault. 2004. Measuring and managing credit risk, McGraw-Hill, New York, NY. Deshmukh, S. D., S. I. Greenbaum, and G. Kanatas. 1983. “Interest rate uncertainty and the financial intermediary’s choice of exposure.” Journal of Finance 38, 141–147. Deuskar, P., A. Gupta, and M. Subrahmanyam. 2003. Liquidity effects and volatility smiles in interest rate option markets, Working paper, New York University. Dewachter, H. and M. Lyrio. 2006. “Macro factors and the term structure of interest rates.” Journal of Money, Credit, and Banking 38, 119–140. Dewachter, H. and K. Maes. 2000. Fitting correlations within and between bond markets, Working Paper, KU Leuven CES. Dewachter, H., M. Lyrio, and K. Maes. 2006. “A joint model for the term structure of interest rates and the macroeconomy.” Journal of Applied Econometrics 21, 439–462. Dhaene, J., M. Denuit, M. J. Goovaerts, R. Kaas, and D. Vyncke. 2002. “The concept of comonotonicity in actuarial science and finance: theory.” Insurance: Mathematics and Economics 31, 3–33. Dhaene, J., M. Denuit, M. J. Goovaerts, R. Kaas, and D. Vyncke. 2002.
“The concept of comonotonicity in actuarial science and finance: applications.” Insurance: Mathematics and Economics 31, 133–161. D’haeseleer, P., S. Forrest, and P. Helman. 1996. “An immunological approach to change detection: algorithms, analysis and implications.”
Proceedings of the 1996 IEEE Symposium on Security and Privacy, pp. 100–119. Diaconis, P. 1988. Group representations in probability and statistics, Lecture Notes – Monograph Series 11, Institute of Mathematical Statistics, Hayward, CA. Diamandis, P. 2008. “Financial liberalization and changes in the dynamic behaviour of emerging market volatility: evidence from four Latin American equity markets.” Research in International Business and Finance 22(3), 362–377. Diamond, D. W. 1984. “Financial intermediation and delegated monitoring.” Review of Economic Studies 51, 393–414. Diamond, D. B. Jr. and B. Smith. 1985. “Simultaneity in the market for housing characteristics.” Journal of Urban Economics 17, 280–292. Diba, B. T. and H. Grossman. 1987. “On the inception of rational bubbles.” Quarterly Journal of Economics 102, 697–700. Diebold, F. X. 1986. “Modeling the persistence of conditional variances: a comment.” Econometric Reviews 5, 51–56. Diebold, F. 1988. “Serial correlation and the combination of forecasts.” Journal of Business and Economic Statistics 6, 105–111. Diebold, F. X. and C. Li. 2003. Forecasting the term structure of government bond yields, Working paper, University of Pennsylvania. Diebold, F. and J. Lopez. 1996. “Forecast evaluation and combination,” in Handbook of statistics, G. S. Maddala and C. R. Rao (Eds.). North Holland, Amsterdam. Diebold, F. X. and M. Nerlove. 1989. “The dynamics of exchange rate volatility: a multivariate latent factor ARCH model.” Journal of Applied Econometrics 4(1), 1–22. Diebold, F. and P. Pauly. 1987. “Structural change and the combination of forecasts.” Journal of Forecasting 6, 21–40. Diebold, F. and G. Rudebusch. 1996. “Measuring business cycles: a modern perspective.” Review of Economics and Statistics 78, 67–77. Diebold, F., J. Gardeazabal, and K. Yilmaz. 1994. “On cointegration and exchange rate dynamics.” Journal of Finance 49, 727–735. Diebold, F. X., L. Ji, and C. Li. 2004. “A three-factor yield curve model: non-affine structure, systematic risk sources, and generalized duration,” in Memorial volume for Albert Ando, L. Klein (Ed.). Edward Elgar, Cheltenham, UK. Diener, F. and M. Diener. 2004. “Asymptotics of the price oscillations of a European call option in a tree model.” Mathematical Finance 14, 271–293. Dillon, W., R. Calantone, and P. Worthing. 1979. “The new product problem: an approach for investigating product failures.” Management Science 25(12), 1184–1196. Dimitras, A. I., S. H. Zanakis and C. Zopounidis. 1996. “A survey of business failures with an emphasis on prediction methods and industrial application.” European Journal of Operational Research 90(3), 487–513. Dimson, E. and P. Marsh. 1990. “Volatility forecasting without data-snooping.” Journal of Banking & Finance 14, 399–421. Dimson, E. and M. Mussavian. 1999. “Three centuries of asset pricing.” Journal of Banking and Finance 23(12), 1745–1769. Ding, Z. and C. Granger. 1996. “Modeling volatility persistence of speculative returns: a new approach.” Journal of Econometrics 73, 185–215. Ding, Z., C. W. J. Granger, and R. F. Engle. 1993. “A long memory property of stock market returns and a new model.” Journal of Empirical Finance 1, 83–106. Dixit, A. 1980. “The role of investment in entry deterrence.” Economic Journal 90(357), 95–106. Dixit, A. K. 1990. Optimization in economic theory, 2nd ed., Oxford University Press, Oxford. Djankov, S., R. La Porta, F. Lopez-de-Silanes, and A. Shleifer. 2003. “Courts.” Quarterly Journal of Economics 118, 453–518.
Djankov, S., R. La Porta, F. Lopez-de-Silanes, and A. Shleifer. 2008. “The law and economics of self-dealing.” Journal of Financial Economics 88, 430–465. Do, B. H. 2002. “Relative performance of dynamic portfolio insurance strategies: Australian evidence.” Accounting and Finance 42, 279–296. Do, B. H. and R. W. Faff. 2004. “Do futures-based strategies enhance dynamic portfolio insurance?” The Journal of Futures Markets 24(6), 591–608. Doléans-Dade, C. and P. A. Meyer. 1979. “Inégalités de normes avec poids,” in Séminaire de Probabilités XIII, Lecture Notes in Mathematics, Vol. 721, Springer, Berlin, pp. 204–215. Domowitz, I. and B. Steil. 1999. “Automation, trading costs, and the structure of the securities trading industry.” Brookings-Wharton Papers on Financial Services, 33–92. Domowitz, I., J. Glen, and A. Madhavan. 2001. “Liquidity, volatility, and equity trading costs across countries and over time.” International Finance 4, 221–256. Dongarra, J., I. Duff, D. Sorensen, and H. van der Vorst. 1991. Solving linear systems on vector and shared memory computers, SIAM, Philadelphia, PA. Doob, J. L. 1953. Stochastic processes. Wiley, New York. Dothan, L. 1978. “On the term structure of interest rates.” Journal of Financial Economics 6(1), 59–69. Douglas, G. W. 1969. “Risk in the equity markets: an empirical appraisal of market efficiency.” Yale Economic Essays 9, 3–45. Douglas, A. V. S. 2002. “Capital structure and the control of managerial incentives.” Journal of Corporate Finance 8(4), 287–311. Doukas, J. 1995. “Overinvestment, Tobin’s Q and gains from foreign acquisitions.” Journal of Banking and Finance 19(7), 1285–1303. Doukas, J. and L. Switzer. 1992. “The stock market’s valuation of R&D spending and market concentration.” Journal of Economics and Business 44(2), 95–114. Doumpos, M. and C. Zopounidis. 2002. Multicriteria decision aid classification methods, Kluwer Academic Publishers, Inc., Boston, MA. Doumpos, M. and C. Zopounidis. 2002. “Multi-criteria classification and sorting methods: a literature review.” European Journal of Operational Research 138(2), 229–246. Doumpos, M., K. Kosmidou, G. Baourakis and C. Zopounidis. 2002. “Credit risk assessment using a multicriteria hierarchical discrimination approach: a comparative analysis.” European Journal of Operational Research 138(2), 392–412. Dowd, K. 2001. “Estimating VaR with order statistics.” Journal of Derivatives 8, 23–70. Downing, C., S. Underwood, and Y. Xing. 2007. “The relative informational efficiency of stocks and bonds: an intraday analysis.” Journal of Financial and Quantitative Analysis, forthcoming. Driessen, J. 2005. “Is default event risk priced in corporate bonds?” Review of Financial Studies 18, 165–195. Driessen, J. and L. Laeven. 2007. “International portfolio diversification benefits: cross-country evidence from a local perspective.” Journal of Banking and Finance 31, 1693–1712. Diewert, W. E. 2002. Hedonic producer price indexes and quality adjustment, Discussion Paper No. 02-14, University of British Columbia, Vancouver. Duan, J. C. 1994. “Maximum likelihood estimation using pricing data of the derivative contract.” Mathematical Finance 4, 155–167. Duan, J. C. 1995. “The GARCH option pricing model.” Mathematical Finance 5(1), 13–32. Duan, J. C. 1996. “Cracking the smile.” Risk 9, 55–56, 58, 60. Duan, J. C. and M. T. Yu. 1999. “Capital standard, forbearance and deposit insurance pricing under GARCH.” Journal of Banking and Finance 23, 1691–1706. Duan, J.-C. and M.-T. Yu. 2005.
“Fair insurance guaranty premia in the presence of risk-based capital regulations, stochastic interest
1645 rat and catastrophe risk.” Journal of Banking and Finance 29(10), 2435–2454. Duarte, J. 2004. “Evaluating an alternative risk preference in affine term structure models.” Review of Financial Studies 17, 379–404. Duarte, J. 2008. “Mortgage-backed securities refinancing and the arbitrage in the swaption market.” Review of Financial Studies 21, 1689–1731. Duarte, J. 2009. “The Causal Effect of Mortgage Refinancing on Interest Rate Volatility: Empirical Evidence and Theoretical Implications.” Review of Financial Studies 21, 1689–1731. Dubois D and H. Prade. 1988. Possibility theory, Plenum, New York. Dubitsky, R., J. Guo, L. Yang, R. Bhu, and S. Ivanov. 2006.“Subprime prepayment, default and severity models.” Credit Suisse, Fixed Income Research, May 17. Duffee, G. R. 1999. “Estimating the price of default risk.” Review of Financial Studies 12(1), 197–226. Duffee, G. R. 2002. “Term premia and interest rate forecasts in affine models.” Journal of Finance 57(1), 405–443. Duffie, D. 1992. Dynamic asset pricing theory, 2nd Edition, Princeton University Press, Princeton, NJ. Duffie, D. 1996. Dynamic asset pricing theory, 2nd edition, Princeton University Press, Princeton, NJ. Duffie, D. 1998a. Defaultable term structure models with fractional recovery of par, Working paper, Graduate School of Business, Stanford University. Duffie, D. 1998b. First to default valuation, Working paper, Graduate School of Business, Stanford University. Duffie, D. 1999. “Credit swap valuation.” Financial Analysts Journal 55, 73–87. Duffie, D. 2001. Dynamic asset pricing theory, Princeton University Press, Princeton, NJ. Duffie, D. and C. Huang. 1985. “Implementing Arrow Debreu equilibria by continuous trading of few long lived securities.” Econometrica 6, 1337–1356. Duffie, D. and M. Huang. 1996. “Swap rates and credit quality.” Journal of Finance 51, 921–949. Duffie, D. and R. Kan. 1996. “A yield-factor model of interest rates.” Mathematical Finance 6(4), 379–406. Duffie, D. and D. Lando, 2001. “Term Structures of Credit Spreads with Incomplete Accounting Information.” Econometrica 69(3), 633– 664. Duffie, D. and K. J. Singleton. 1997. “An econometric model of the term structure of interest-rate swap yields.” Journal of Finance 52, 1287–1321. Duffie, D. and K. J. Singleton. 1999a. “Modeling term structures of defaultable bonds.” Review of Financial Studies 12, 687–720. Duffie, D. and K. J. Singleton. 1999b. Simulating correlated defaults, Working paper, Graduate School of Business, Stanford University. Duffie, D. and K. J. Singleton. 2003. Credit risk: pricing, measurement, and management, Princeton University Press, Princeton, NJ. Duffie, D., J. Pan, and K. Singleton. 2000. “Transform analysis and asset pricing for affine jump diffusions.” Econometrica 68(6), 1343–1376. Duffie, D., D. Filipovi, and W. Schachermayer. 2003. “Affine processes and applications in finance.” The Annals of Applied Probability 13, 984–1053. Duffie, D., L. Pedersen, and N. Garleanu. 2007. “Valuation in over-thecounter markets.” Review of Financial Studies 20, 1865–1900. Duffie, D, L. Saita, and K. Wang. 2007. “Multi-period corporate default prediction with stochastic covariates.” Journal of Financial Economics 83, 635–665. Duffy, D. 2006. Finite difference methods in financial engineering: a partial differential equation approach, Wiley, New York. Dufour, J.-M. 2003. “Identification, weak instruments, and statistical inference in econometrics.” Canadian Journal of Economics 36, 767–808.
1646 Dufresne, F. and H. U. Gerber 1989. “Three methods to calculate the probability of ruin”. ASTIN Bulletin 19, 71–90. Dufwenberg, M., T. Lindqvist and E. Moore. 2005. “Bubbles and experience: an experiment.” The American Economic Review 95, 1731–1737. Duncan, J. and R. M. Isaac. 2000. “Asset markets: how they are affected by tournament incentives for individuals.” American Economic Review, American Economic Association 90(4), 995–1004. Duiffie, G. 2002. “Term premia and interest rate forecasts in affine models.” Journal of Finance 57, 405–443. Dumas, B., J. Fleming, R. Whaley. 1998. “Implied volatility functions: empirical tests.” Journal of Finance 53(6), 2059–2106. Dupire, B. 1994. “Pricing with a smile.” Risk 7, 18–20. Dupire, B. 1994. “Pricing with a smile.” Risk 7, 32–39. Durand, D. 1952. “Costs of debt and equity funds for business trends and problems of measurement,” Conference on Research in Business Finance, National Bureau of Economic Research, New York, pp. 215–247. Durand, R., R. Newby and J. Sanghani. 2006. “An intimate portrait of the individual investor.” Working paper. Durbin, J. 1954. “Errors in variables.” Review of the International Statistical Institute 22, 23–32. Durham, G. B. 2003. “Likelihood-based specification analysis of continuous-time models of the short-term interest rate.” Journal of Financial Economics 70, 463–487. Durham, G. B. and A. R. Gallant. 2002. “Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes.” Journal of Business and Economic Statistics 20, 297–338. Dybvig, P. H. and S. A. Ross. 1985. “The analytics of performance measurement using a security market line.” Journal of Finance 40, 401–416. Dybvig, P. H., J. E. Ingersoll, and S. A. Ross. 1996. “Long forward and zero-coupon rates can never fall.” Journal of Business 69, 1–25. Dyl, E. A. 1975. “Negative betas: the attractions of selling short.” Journal of Portfolio Management 1, 74–76. Easley, D. and M. O’Hara. 1987. “Price, trade size, and information in securities markets.” Journal of Financial Economics 19, 69–90. Easley, D. and M. O’Hara. 1991. “Order form and information in securities markets.” Journal of Finance 46, 905–928. Easley, D. and M. O’Hara. 1992. “Time and the process of security price adjustment.” Journal of Finance 47, 577–606. Easley, D., M. O’Hara, and P. S. Srinivas. 1998. “Option volume and stock prices: evidence on where informed traders trade.” Journal of Finance 53, 431–466. Ebens, H. 1999. Realized stock volatility, Working paper, Department of Economics, John Hopkins University. Eberhart, A. C. and L. A. Weiss. 1998. “The importance of deviations from the absolute priority rule in chapter 11 bankruptcy proceedings.” Financial Management 27, 106–110. Eberhart, A., W. Moore, and R. Roenfeldt. 1990. “Security pricing and deviations from the absolute priority rule in bankruptcy proceedings.” Journal of Finance, December, 1457–1469. Eberlein, E., U. Keller, and K. Prause. 1998. “New insights into smile, mispricing, and value at risk: the hyperbolic model.” Journal of Business 71, 371–405. Eberlein, E. and U. Keller. 1995. “Hyperbolic distributions in finance.” Bernoulli 1, 281–299. Eberlein, E., U. Keller and K. Prause. 1998. “New insights into smile, mispricing and value at risk: The hyperbolic model.” Journal of Business 71, 371–406. Eckbo, B. E. 1986. “Valuation effects of corporate debt offerings.” Journal of Financial Economics 15, 119–152. Eckardt, W. and S. Williams. 1984. 
“The complete options indexes.” Financial Analysts Journal 40, 48–57. Economides, N. and R. Schwartz. 1995. “Electronic call market trading.” Journal of Portfolio Management 21, 10–18.
References Ederington, L. and J. Lee. 1993. “How markets process information: news releases and volatility.” Journal of Finance 48, 1161–1191. Ederington, L. and J. Lee. 2001. “Intraday volatiltiy in interest rate and foreign exchange markets: ARCH, announcement, and seasonality effects.” Journal of Futures Markets 21, 517–522. Edwards, A. K. and K. W. Hanley. 2007. “Short selling and initial public offerings.” Working Paper, Office of Economic Analysis, U.S. Securities and Exchange Commission, April 2007; http://ssrn.com/abstract=981242. Edwards, S. and R. Susmel. 2001. “Volatility dependence and contagion in emerging equity markets.” Journal of Development Economics 66, 505–532. Eftekhari, B., C. S. Pedersen and S. E. Satchell. 2000. “On the volatility of measures of financial risk: an investigation using returns from European markets.” The European Journal of Finance 6, 18–38. Ehrenberg, R. G. and M. L. Bognanno. 1990. “Do tournaments have incentive effects?” Journal of Political Economy 98, 1307–1324. Eisenbeis, R. A. 1977. “Pitfalls in the application of discriminant analysis in business, finance, and economics.” Journal of Finance 32(3), 875–900. Eisenbeis, R. A. and R. B. Avery. 1972. Discriminant analysis and classification procedure: theory and applications, D.C. Heath, Lexington, MA. Elerian, O., S. Chib, and N. Shephard. 2001. “Likelihood Inference for discretely observed onlinear diffusions.” Econometrica 69, 959–993. Eliasson, A. 2002. Sovereign credit ratings, Working Papers 02-1, Deutsche Bank. Elizalde, A., 2003. Credit risk models I: default correlation in intensity models, MSc Thesis, King’s College, London. Elizalde, A. 2005. Credit risk models II: structural models, Working paper, CEMFI and UPNA. El Karoui, N. and M. C. Quenez. 1995. “Dynamic programming and pricing of contingent claims in an incomplete market.” SIAM Journal on Control and Optimization 33, 29–66. El Karoui, N., R. Myneni, and R. Viswanathan. 1992. Arbitrage pricing and hedging of interest rate claims with state variables: I theory, Working paper, University of Paris. El Karoui, N., S. Peng, and M. C. Quenez. 1997. “Backward stochastic differential equations in finance.” Mathematical Finance 75, 1–71. Elliott, R. J. 1982. Stochastic calculus and applications, Springer, New York. Elliott, R. J. and R. S. Mamon. 2001. “A complete yield curve descriptions of a Markov interest rate model.” International Journal of Theoretical and Applied Finance 6, 317–326. Elliott, R. J. et al. 1995. Hidden Markov models: estimation and control, New York, Springer. Elton, E. J. and M. Gruber. 1974. “Portfolio theory when investment relatives are log normally distributed.” Journal of Finance 29, 1265–1273. Elton, E. J. and M. Gruber. 1978. “Simple criteria for optimal portfolio selection: tracing out the efficient frontier.” Journal of Finance 13, 296–302. Elton, E. J. and M. J. Gruber. 1994. “Introduction.” Financial Markets, Institutions and Instruments. Estimating Cost of Capital: Methods and Practice 3(3), 1–5. Elton, E. J. and M. J. Gruber. 2007. Modern portfolio theory and investment analysis 7th ed., Wiley, New York. Elton, E. J., M. Gruber, and M. E. Padberg. 1976. “Simple criteria for optimal portfolio selection.” Journal of Finance 11, 1341–1357. Elton, E. J., M. J. Gruber, and T. Urich. 1978. “Are betas best?” Journal of Finance 23, 1375–1384. Elton, E. J., M. J. Gruber, S. Das, and M. Hlavka. 1993. 
“Efficiency with costly information: a reinterpretation of evidence from managed portfolios.” Review of Financial Studies 6, 1–22.
References Elton, E. J., M. J. Gruber, and J. Mei. 1994. “Cost of capital using arbitrage pricing theory: a case study of nine New York utilities.” Financial Markets, Institutions and Instruments. Estimating Cost of Capital: Methods and Practice 3(3), 46–73. Elton E. J., M. J. Gruber, D. Agrawal and C. Mann. 2001. “Explaining the rate spread on corporate bonds?” Journal of Finance 56(1), 247–277. Elton, E. J., M. J. Gruber, S. J. Brown, and W. N. Goetzmann. 2006. Modern portfolio theory and investment analysis, Wiley, New York. Elton, E.J., M.J. Gruber, S.J. Brown, and W.N. Goetzmann. 2007. Modern portfolio theory and investment analysis, Wiley, New Jersey. Emanuel, D. and J. D. MacBeth. 1982. “Further results on constant elasticity of variance call option models.” Journal of Financial and Quantitative Analysis 17, 533–554. Embrechts, P., A. McNeil, and D. Straumann. 1999. Correlation and dependence in risk management: properties and pitfalls,. Mimeo. ETHZ Zentrum, Zurich. Embrechts, P., F. Lindskog, and A. McNeil. 2001. Modelling dependence with copulas, Working paper, Department of Mathematics, ETHZ. Emery, W. G. 2001. “Cyclical demand and the choice of debt maturity.” Journal of Business 74(4), 557–590. Engle, R. F. 1982. “Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation.” Econometrica 50, 987–1008. Engle, R. F. 2001. “GARCH 101: the use of ARCH/GARCH models in applied econometrics.” Journal of Economic Perspectives 15(4), 157–168. Engle, R. 2002a. “Dynamic conditional correlation: a simple class of multivariate GARCH models.” Journal of Business and Economic Statistics 20, 339–350. Engle, R. 2002b. “New frontiers for ARCH models.” Journal of Applied Econometrics 17, 425–446. Engle, R. F. and T. Bollerslev. 1986. “Modeling the persistence of conditional variances.” Econometric Reviews 5, 1–50. Engle, R. and C. Granger. 1987. “Co-integration and error correction: representation, estimation and testing.” Econometrica 55, 251–276. Engle, R. F. and K. F. Kroner. 1995. “Multivariate simultaneous generalized ARCH.” Econometric Theory 11, 122–150. Engle, R. F. and G. G. J. Lee. 1993. “A permanent and transitory component model of stock return volatility.” Discussion Paper, University of California, San Diego. Engle, R. F. and G. G. J. Lee. 1999. “A permanent and transitory component model of stock return volatility,” in Cointegration, causality and forecasting: a festschrift in honour of Clive W.J. Granger, R. F. Engle, H. White (Eds.). Oxford University Press, Oxford. Engle, R. F. and A. J. Patton. 2001. “What good is a volatility model?” Quantitative finance 1, 237–245. Engle, R. F. and J. G. Rangel. 2008. “The spline-GARCH model for low-frequency volatility and its global macroeconomic causes.” The Review of Financial Studies 21, 1187–1222. Engle, R., and K. Sheppard. 2001. “Theoretical and empirical properties of dynamic conditional correlation multivariate GARCH.” NBER Working Paper 8554. Eom, Y. H., M. G. Subrahmanyam, and J. Uno. 2000. Credit risk and the Yen interest rate swap market, Working paper, Stern Business School, New York. Eom, Y. H., J. Helwege, and J. Huang. 2004. “Structural models of corporate bond pricing: an empirical analysis.” Review of Financial Studies 17, 499–544. Epple, D., R. Romano, and H. Sieg. 2006. “Admission, tuition, and financial aid policies in the market for higher education.” Econometrica 74(4), 885–928. Epps, T. W. 2000. Pricing derivatives securities, World Scientific, Singapore.
1647 Epps, T. and M. Epps. 1976. “The stochastic dependence of security price changes and transaction volumes: implications for the mixture of distributions hypothesis.” Econometrica 44, 305–321. Eraker, B. 2003. “Do stock prices and volatility jump? Reconciling evidence from spot and option prices.” Journal of Finance 59, 1367–1404. Ericsson, J. and O. Renault. 2006. “Liquidity and credit risk.” Journal of Finance 61, 2219–2250. Ericsson, J. and J. Reneby. 2004a. “A note on contingent claims pricing with non-traded assets.” Finance Letters 2(3). Ericsson, J. and J. Reneby. 2004b. “An empirical study of structural credit risk models using stock and bond prices.” Journal of Fixed Income 13, 38–49. Ericsson, J. and J. Reneby. 2005. “Estimating structural bond pricing models.” Journal of Business 78, 707–735. Ericsson, J., K. Jacobs, and R. Oviedo. 2005. The determinants of credit default swap premia, Working Paper, McGill University. Ericsson, J., J. Reneby, and H. Wang. 2006. Can structural models price default risk? Evidence from bond and credit derivative markets, Working paper, McGill University and Stockholm School of Economics. Errunza, V. and E. Losq. 1985. “International asset pricing under mild segmentation: theory and test.” Journal of Finance 40, 105–124. Errunza, V. R. and E. Losq. 1989. “Capital flow controls, international asset pricing, and investors’ welfare: a multi-country framework.” Journal of Finance 44, 1025–1037. Errunza, V., E. Losq, and P. Padmanabhan. 1992. “Tests of integration, mild segmentation and segmentation hypotheses.” Journal of Banking and Finance 16, 949–972. Errunza, V., K. Hogan, and M. Hung. 1999. “Can the gains from international diversification be achieved without trading abroad?” Journal of Finance 54, 2075–2107. Ervine, J. and A. Rudd. 1985. “Index options: the early evidence.” Journal of Finance 40, 743–756. Estep, T. and M. Kritzman. 1988. “TIPP: Insurance without complexity.” Journal of Portfolio Management 14(4), 38–42. Estrella, A. and G. A. Hardouvelis. 1991. “The term structure as a predictor of real economic activity.” Journal of Finance 46, 555–576. Estrella, A. and F. S. Mishkin. 1996. “The yield curve as a predictor of US recessions.” Current Issues in Economics and Finance 2, 1–6. Estrella, A. and F. S. Mishkin. 1997. “The predictive power of the term structure of interest rates in Europe and the United States: implications for the European central bank.” European Economic Review 41, 1375–1401. Estrella, A. and F. S. Mishkin. 1998. “Financial variables as leading indicators predicting U.S. recessions.” Review of Economics and Statistics 80, 45–61. Evans, J. L. and S. H. Archer. 1968. “Diversification and the reduction of dispersion: an empirical analysis.” Journal of Finance 23, 761–767. Evans, M. D. and K. Lewis. 1995. “Do expected shifts in in?ation affect estimates of the long-run fisher relation?” The Journal of Finance 50(1), 225–253. Ezzamel, M., C. Mar-Molinero, and A. Beecher. 1987. “On the distributional properties of financial ratios.” Journal of Business Finance & Accounting 14(4), 463–481. Fabozzi, F. 2000. Bond markets, analysis and strategies. 4th edn. Prentice Hall, Upper Saddle River, NJ. Fabozzi, F. J. 2007. Fixed income analysis, 2nd ed., Wiley, Hoboken, NJ. Fabozzi, F. J. and J. C. Francis. 1978. “Beta as a random coefficient.” Journal of Financial and Quantitative Analysis 13, 101–116. Fabozzi, F., J. Francis and C. F. Lee. 1980. 
“Generalized functional form for mutual fund returns.” Journal of Financial and Quantitative Analysis 15, 1107–1119.
1648 Faccio, M., J. McConnell, and D. Stolin. 2006. “Returns to acquirers of listed and unlisted targets.” Journal of Financial and Quantitative Analysis 41, 197–220. Fairley, W. B. 1979. “Investment income and profit margins in propertyliability insurance: theory and empirical results.” Bell Journal of Economics 10(1), 191–210. Fama, E. 1963. “Mandelbrot and the stable paretian hypothesis.” Journal of Business 36, 420–419. Fama, E. 1965. “The behavior of stock market prices.” Journal of Business 34(January), 34–105. Fama, E. F. 1968. “Risk, return and equilibrium: some clarifying comments.” Journal of Finance 23, 29–40. Fama, E. F. 1970. “Efficient capital markets: a review of theory and empirical work.” Journal of Finance 25, 383–417. Fama, E. 1970. “Multiperiod consumption-investment decisions.” American Economic Review 60(1), 163–174. Fama, E. 1976. Foundation of finance, New York: Basic Books, 1976. Fama, E. 1980. “Agency problems and the theory of the firm.” Journal of Political Economy 88, 288–307. Fama, E. F. 1984. “The information in the term structure.” Journal of Financial Economics 13, 509–528. Fama, E. F. 1990. “Stock returns, expected returns, and real activity.” Journal of Finance 45, 1089–1108. Fama, E. 1991. “Efficient capital markets: II.” Journal of Finance 46, 1575–1617. Fama, E. F. and R. R. Bliss. 1987. “The information in long-maturity forward rates.” American Economic Review 77, 680–692. Fama, E. F. and K. R. French 1987 “Commodity futures prices: some evidence of forecast power premiums, and the theory of storage” Journal of Business 60, 55–73. Fama, E. and K. French. 1988. “Dividend yields and expected stock returns.” Journal of Financial Economics 22, 3–26. Fama, E. F. and K. R. French. 1988b. “Permanent and temporary components of stock prices.” Journal of Political Economy 96, 246–273. Fama, E. F. and K. R. French 1988 “Business cycles and the behavior of metals prices” Journal of Finance 43, 1075–1093. Fama, E. F. and K. R. French. 1989. “Business conditions and expected returns on stocks and bonds.” Journal of Financial Economics 25, 23–49. Fama, E. F. and K. R. French. 1992. “The cross-section of expected stock returns.” Journal of Finance 47(2), 427–465. Fama, E. F. and K. R. French. 1993. “Common risk factors in the returns on stocks and bonds.” Journal of Financial Economics 33, 3–56. Fama, E. F. and K. R. French. 1996b. “Multifactor explanations of asset pricing anomalies.” Journal of Finance 51(1), 55–87. Fama, E. F. and K. R. French. 1996a. “The CAPM is wanted, dead or alive.” Journal of Finance 51(5), 1947–1958. Fama, E. F. and K. R. French. 2002. “The equity premium.” Journal of Finance 57, 637–659. Fama, E. F. and J. D. MacBeth. 1973. “Risk, return and equilibrium: empirical tests.” Journal of Political Economy 81(3), 607–636. Fama, E. F. and M. H. Miller. 1972. Theory of finance, Holt, Rinehart and Winston, New york. Fama, E. F. and G. W. Schwert. 1977. “Asset returns and inflation.” Journal of Financial Economics 5, 115–146. Fama, E. F., L. Fisher, M. C. Jensen and R. Roll. 1969. “The adjustment of stock prices to new information.” International Economic Review 10, 1–21. Fan, H. and S. Sundaresan. 2000. “Debt valuation, renegotiations and optimal dividend policy.” Review of Financial Studies 13, 1057–1099. Fan, R., A. Gupta, and P. Ritchken. 2003. “Hedging in the possible presence of unspanned stochastic volatility: evidence from swaption markets.” Journal of Finance 58, 2219–2248. Farber, A., R. Gillet, and A. Szafarz. 2006. 
“A General Formula for the WACC.” International Journal of Business 11, 211–218.
References Faulkender, M. and M. A. Petersen. 2006. “Does the source of capital affect capital structure?” Review of Financial Studies 19, 45. Feenstra, R. C. and C. R. Knittel. 2004. Re-Assessing the U.S. Quality Adjustment to Computer Prices: The Role of Durability and Changing Software (October 2004). NBER Working Paper No. W10857. Fehle, F. 2003. “The components of interest rate swap spreads: theory and international evidence.” Journal of Futures Markets 23, 347–387. Feller, W. 1951. “An Introduction to Probability Theory and its Applications”, John Wiley, New York. Feller, W. 1951. “Two singular diffusion problems.” Annals of Mathematics 54(1), 173–182. Feller, W. 1968. An introduction to probability theory and its application, Vol. 1, Wiley, New York. Feltham, G. and J. Ohlson. 1999. “Residual earnings valuation with risk and stochastic interest rates.” Accounting Review 74(2), 165–183. Fernandes, M., B. Mota, and G. Rocha. 2005. “A multivariate conditional autoregressive range model.” Economics Letters 86, 435–440. Fernandez, P. 2004. “The value of tax shields is not equal to the present value of tax shields.” Journal of Financial Economics 73, 145–165. Ferri, G., L.-G. Liu, and J. Stiglitz. 1999. “The procyclical role of rating agencies: evidence from the East Asian crisis.” Economic Notes 3, 335–355. Ferson, W. E. and C. R. Harvey. 1991. “Sources of predictability in portfolio returns.” Financial Analysts Journal 3, 49–56. Ferson, W. E. and C. R. Harvey. 1991. “The variation of economic risk premiums.” Journal of Political Economy 99, 385–415. Ferson, W. and C. Harvey. 1994. “Sources of risk and expected returns in global equity markets.” Journal of Banking and Finance 18, 775–803. Ferson, W. E. and C. R. Harvey. 1997. “The fundamental determinants of international equity returns: a perspective on country risk and asset pricing.” Journal of Banking and Finance 21, 1625–1665. Ferson, W. E. and C. R. Harvey. 1999. “Conditioning variables and the cross-section of stock returns.” Journal of Finance 54, 1325–1360. Ferson, W. E. and R. Korajczyk. 1995. “Do arbitrage pricing models explain the predictability of stock returns?” Journal of Business 68, 309–349. Ferson, W. E. and R. Schadt. 1996. “Measuring fund strategy and performance in changing economic conditions.” Journal of Finance 51, 425–462. Figlewski, S. 1989. “Options arbitrage in imperfect markets.” Journal of Finance 44(5), 1289–1311. Figlewski, S. 1997. “Forecasting volatility.” Financial Markets, Institutions and Instruments 6, 1–88. Figlewski, S. and B. Gao. 1999. “The adaptive mesh model: a new approach to efficient option pricing.” Journal of Financial Economics 53, 313–351. Filipovi´c, D. 1999. “A note on the Nelson–Siegel family.” Mathematical Finance 9(4), 349–359. Filipovi´c, D. 2000. “Exponential–polynomial families and the term structure of interest rates.” Bernoulli 6(1), 1–27. Filipovi´c, D. 2001. “A general characterization of one factor affine term structure models.” Finance and Stochastics 5(3), 389–412. Financial Toolbox 3. 2009. User’s Guide. The MathWorks, Inc. Finnerty, J. 1978. “The Chicago board options exchange and market efficiency.” Journal of Financial and Quantitative Analysis 13, 29–38. Fischer, E., R. Heinkel, and J. Zechner. 1989. “Dynamic capital structure choice: theory and tests.” Journal of Finance 44, 19–40. Fishburn, P. C. 1964. Decision and value theory, Wiley, New York. Fishburn, P.C. 1970. Utility theory for decision making, Wiley, New York. Fishburn, P. C. 1977. 
“Mean-risk analysis with risk associated with below-target returns.” American Economic Review 67, 116–126. Fisher, R. 1936. “The use of multiple measurements in taxonomic problems.” Annals of Eugenics 7, 179–188.
References Fisher, F. M. 1966. The identification problem in econometrics, McGraw-Hill, New York. Fisher, S. 1978 “Call option pricing when the exercise price is uncertain and the valuation of index bonds” Journal of Finance 33, 169–176. Fitch Ratings. 2001. “Fitch simplifies bank rating scales.” Technical Report. Fitch Ratings. 2006. “The role of support and joint probability analysis in bank ratings.” Fitch Special Report. Fitzpatrick, P. J. 1932. “A comparison of ratios of successful industrial enterprises with those of failed firms.” Certified Public Accountant, October, November, and December, pp. 598–605, 656–662, and 727–731, respectively. Flanagan, C. 2008. Subprime mortgage prepayment and credit modeling, J.P. Morgan Chase & Co., April 2. Flannery, M. J. 1991. “Pricing deposit insurance when the insurer measures bank risk with an error.” Journal of Banking and Finance 15, 975–998. Flannery, M. J. and C. M. James. 1984. “The effect of interest rate changes on the common stock returns of financial institutions.” Journal of Finance 39, 1141–1153. Flannery, M. and A. Protopapadakis. 2002. “Macroeconomic factors do influence aggregate stock returns.” Review of Financial Studies 15(3), 751–782. Fleming, J. 1998. “The quality of market volatility forecasts implied by S&P 100 Index option prices.” Journal of Empirical Finance 5, 317–345. Fleming, M. J. and E. M. Remolona. 1997. “What moves the bond market?” Federal Reserve Bank of New York Economic Policy Review 3(4), 31–50. Fleming, M. J. and E. M. Remolona. 1999a. “What moves bond prices?” Journal of Portfolio Management, Summer 25(4), 28–38. Fleming, M. J. and E. M. Remolona. 1999b. “Price formation and liquidity in the U.S. treasury market: the response to public information.” Journal of Finance 54(5), 1901–1915, (working paper version in Federal Reserve Bank of New York Staff Reports No. 27, July 1997). Fleming, J., B. Ostdiek, and R. E. Whaley. 1995. “Predicting stock market volatility: a new measure.” Journal of Futures Markets 15, 265–302. Fleming, J., C. Kirby, and B. Ostdiek. 2001. “The economic value of volatility timing.” Journal of Finance 56, 329–352. Fleming, J., C. Kirby and B. Ostdiek. 2001. “The economic value of volatility timing.” Journal of Finance 61, 329–352. Fletcher, J. and A. Marshall. 2005. “An empirical examination of the benefits of international diversification.” Journal of International Financial Markets, Institutions and Money 15, 455–468. Flesaker, B. and L. Hughston. 1996. “Positive interest.” RISK 9(1), 46–49. Flood, R. P. and R. J. Hodrick. 1990. “On testing for speculative bubbles.” Journal of Economic Perspectives 4, 85–102. Flood, R., R. Hodrick, and P. Kaplan. 1986. An evaluation of recent evidence on stock market bubbles, NBER working paper. Flores, E and J. Garrido 2001 “Robust logistic regression for insurance risk classification.” Business Economics Universidad Carlos III, Departamento de Economía de la Empresa Working Papers wb016413. Fluck, Z. 1998. “Optimal financial contracting: debt versus outside equity.” Review of Financial Studies 11(2), 383–418. Fogler, H. R. and S. Ganapathy. 1982. Financial econometrics, Prentice-Hall, Englewood Cliffs, NJ. Follain, J. R. and E. Jimenez. 1985. “Estimating the demand for housing characteristics: a survey and critique.” Regional Science and Urban Economics 15(1), 77–107. Föllmer, H. and M. Schweizer. 1991. “Hedging of contingent claims under incomplete information,” in Applied stochastic analysis, Stochastics monographs Vol. 5, M. H. A. Davis and R. J. 
Elliott (Eds.). Gordon & Breach, London, pp. 389–414.
1649 Fomby, T. B., R. C. Hill, and S. R. Johnson. ‘1984. Advanced econometric methods, Springer-Verlag, New York. Fong, W. M., H. H. Lean, and W. K. Wong. 2005. “Stochastic dominance and the rationality of the momentum effect across markets.” Journal of Financial Markets 8, 89–109. Fons, J. S. 1994. Using default rates to model the term structure of credit risk. Financial Analysts Journal 50, 25–32. Fontes, D. B. M. M. 2008. “Fixed versus flexible production systems: a real option analysis.” European Journal of Operational Research 188(1), 169–184. Forbes, S. W. 1971. “Rates of return in the Nonlife insurance industry.” Journal of Risk and Insurance 38, 409–422. Forbes, S. W. 1972. “The cost of capital of insurance companies: comment.” Journal of Risk and Insurance 39(3), 491–492. Forsyth, P. and K. Vetzal. 2006. Numerical PDE methods for pricing path dependent options, University of Waterloo, Ontario, Canada. Forsythe, R., T. Palfrey and C. Plott. 1982. “Asset valuation in an experimental model.” Econometrica 50, 537–563. Foster, G. 1986. Financial statement analysis, 2nd Edition, PrenticeHall, Englewood Cliffs, NJ. Foster, D. P. and R. V. Vohra. 1998. “Asymptotic calibration.” Biometrika 85, 379–390. Foster, F. D., T. Smith and R. E. Whaley. 1997. Assessing goodnessof-fit of asset pricing models: the distribution of the maximal R-squared.” Journal of Finance 52, 591–607. Foucault, T. 1999. “Order flow composition and trading costs in a dynamic order driven market.” Journal of Financial Markets 2, 99–134. Foucault, T., O. Kaden, and E. Kandel. 2005. “The limit order book as a market for liquidity.” Review of Financial Studies 18, 1171–1217. Foufas, G. and M. Larson. 2004. Valuing European, barrier and lookback options using finite element method and duality techniques, Chalmers University of Technology, Sweden. Fouque, J.-P. and C.-H. Han. 2004. “Variance reduction for Monte Carlo methods to evaluate option prices under multi-factor stochastic volatility models.” Quantitative Finance 4(5), 597–606. Fouque, J.-P. and C.-H. Han. 2007. “A martingale control variate method for option pricing with stochastic volatility.” ESAIM Probability & Statistics 11, 40–54. Fouque, J.-P. and C.-H. Han. 2009. “Asymmetric variance reduction for pricing American options,” Mathematical Modeling and Numerical Methods in Finance, A. Bensoussan, Q. Zhang, and Ph. Ciarlet (Eds.), Elsevier, Amsterdam. Fouque, J.-P., G. Papanicolaou, R. Sircar, and K. Sølna. 2003. “Multiscale stochastic volatility asymptotics.” SIAM Journal on Multiscale Modeling and Simulation 2(1), 22–42. Fouque, J.-P., R. Sircar, and K. Sølna. 2006. “Stochastic volatility effects on defaultable bonds.” Applied Mathematical Finance 13(3), 215–244. Fouse, W., W. Jahnke, and B. Rosenberg. 1974. “Is beta phlogiston?” Financial Analysts Journal 30, 70–80. Francioni, R., S. Hazarika, M. Reck and R. A. Schwartz. 2008. “Equity market microstructure: taking stock of what we know.” Journal of Portfolio Management 35, 57–71. Francis, J. C. 1986. Investments: analysis and management, 4th Edition, McGraw-Hill, New York. Francis, J. C. and G. Alexander. 1986. Portfolio analysis, 3rd ed. Prentice-Hall, Englewood cliffs, NJ. Francis, J. C. and S. H. Archer. 1979. Portfolio analysis, Prentice-Hall, Englewood Cliffs, NJ. Francis, J. C. and R. Ibbotson. 2002. Investments: a global perspective, Prentice-Hall, Upper Saddle River, New Jersey, ISBN: 0-13890740-4.
1650 Francois, P. and E. Morellec. 2004. “Capital structure and asset prices: some effects of bankruptcy procedures.” Journal of Business 77, 387–411. Frankfurter, G. and H. Phillips. 1977. “Alpha-beta theory: a word of caution.” Journal of Financial Management 3, 35–40. Frankfurter, G. and J. Seagle. 1976. “Performance of the sharpe portfolio selection model: a comparison.” Journal of Financial and Quantitative Analysis 11, 195–204. Franks, J. and W. Torous. 1989. “An empirical investigation of U.S. firms in reorganization.” Journal of Finance 44, 747–767. Franses, P. H. and D. Van Dijk. 1995. “Forecasting stock market volatility using (non-linear) GARCH models.” Journal of Forecasting 15, 229–235. Frecka, T. J. and W. S. Hopwood. 1983. “The effects of outliers on the cross-sectional distributional properties of financial ratios.” Accounting Review 58(1), 115–128. Frecka, T. J. and C. F. Lee. 1983. “Generalized financial ratio adjustment processes and their implications.” Journal of Accounting Research (Spring 1983), 21, 308–316. Freed, N. and F. Glover. 1981. “Simple but powerful goal programming models for discriminant problems.” European Journal of Operational Research 7, 44–66. Freed, N., and F. Glover 1986. “Evaluating alternative linear programming models to solve the two-group discriminant problem.” Decision Sciences 17, 151–162. Frees, E. W. and E. A. Valdez. 1998. “Understanding relationships using copulas.” North American Actuarial Journal 2, 1–25. French, K. R. and J. M. Poterba. 1991. “Investor diversification and international equity markets.” American Economic Review 81, 222–226. French, K. R., G. W. Schwert, and R. F. Stambaugh. 1987. “Expected stock returns and volatility.” Journal of Financial Economics 19(September), 3–29. Frey, R. and A. McNeil. 2001. Modelling dependent defaults, Working paper, ETH Zurich. Fridman, M. and L. Harris. 1998. “A maximum likelihood approach for non-Gaussian stochastic volatility models.” Journal of Business and Economic Statistics 16, 284–291. Friedberg, S. H., A. J. Insel, and L. E. Spence. 1997. Linear algebra,” 3rd edition. Prentice Hall, Upper Saddle River, New Jersey. Friedman, B. M. 1992. “Monetary economics and financial markets: retrospect and prospect.” NBER Reporter, 1–5. Friend, I. and L. H. P. Lang. 1988. “An empirical test of the impact of managerial self-interest on corporate capital structure.” Journal of Finance 47, 271–281. Fritelli, M. 2000. The minimal entropy martingale measure and the valuation problem in incomplete markets. Mathematical Finance 10, 39–52. Froot, K. A. 1989. “New hope for the expectations hypothesis of the term structure of interest rates.” Journal of Finance 44, 283–305. Froot, K. A. (Ed.). 1999. The Financing of Catastrophe Risk, University of Chicago Press, Chicago. Froot, K. A. 2001. “The market for catastrophe risk: a clinical examination.” Journal of Financial Economics 60(2), 529–571. Froot, K. A. and J. C. Stein. 1998. “Risk management, capital budgeting, and capital structure policy for financial institutions: An integrated approach.” Journal of Financial Economics 47, 55–82. Froot, K. A., D. S. Scharfstein and J. C. Stein. 1992. “Herd on the street: informational inefficiencies in a market with short-term speculation.” Journal of Finance 47, 1461–1484. Froot, K. A., D. S. Scharfstein, and J. C. Stein. 1993. Risk-management: Coordinating corporate investment and financing policies, Journal of Finance 48, 1629–1658. Froot, K. A., P. G. J. O’Connell, and M. S. Seasholes. 2001. 
“The portfolio flows of international investors.” Journal of Financial Economics 59, 151–194.
References Frost, P. A. and J. E. Savarino. 1988. “For better performance: constrain portfolio weights.” Journal of Portfolio Management, 29–34. Fu, M., D. Madan, and T. Wang. 1998/1999. “Pricing continuous Asian options: a comparison of Monte Carlo and Laplace transform inversion methods.” The Journal of Computational Finance 2, 49–74. Fudenberg, D. and J. Tirole. 1984. “The fat-cat effect, the puppy dog ploy, and the lean and hungry look.” American Economic Review Papers and Proceedings 74, 361–366. Fujiwara, T. and Y. Miyahara. 2003. The minimal entropy martingale measures for geometric Lévy processes. Finance and Stochastics 7, 509–531. Fuller, W. A. 1977. Some properties of a modification of the limited information estimator. Econometrica 45, 939. Fuller, W. A. 1987. Measurement error models, John Wiley & Sons, New York. Fuller, K., J. Netter, and M. Stegemoller. 2002. “What do returns to acquiring firms tell us? Evidence from firms that make many acquisitions.” Journal of Finance 57, 1763–1793. Fung, W. and D. Hsieh. 2001. “The risk in hedge fund strategies: theory and evidence from trend followers.” Review of Financial Studies 14, 313–341. Gács P. 2005. “Uniform test of algorithmic randomness over a general space.” Theoretical Computer Science 341, 91–137. Galai, D. 1983a. “The components of the return from hedging options against stocks.” Journal of Business 56, 45–54. Galai, D. 1983b. “A survey of empirical tests of option pricing models,” in Option pricing, M. Brenner (Ed.). Lexington, MA: Heath, pp. 45–80. Galai, D. and R. W. Masulis. 1976. “The option pricing model and the risk factor of stock.” Journal of Financial Economics 3, 53–81. Galai, D., R. W. Masulis, R. Geske, and S. Givots. 1988. Option markets, Addison-Wesley Publishing Company, Reading, MA. Galindo, J., and P. Tamayo. 2000. “Credit risk assessment using statistical and machine learning/basic methodology and risk modeling applications.” Computational Economics 15, 107–143. Gallant, S. 1988. “Connectionist expert systems.” Communications of the ACM 31(2), 152–169. Gallant, A. R. and G. Tauchen. 1998. “Reprojecting partially observed systems with applications to interest rate diffusions.” Journal of American Statistical Association 93, 10–24. Gallant, R., P. E. Rossi, and G. Tauchen. 1992. “Stock prices and volumes.” Review of Financial Studies 5, 199–244. Gallant, A. R., D. Hsieh, and G. Tauchen. 1997. “Estimation of stochastic volatility models with diagnostics.” Journal of Econometrics 81, 159–192. Gallant, R., C. Hsu, and G. Tauchen. 1999. “Using daily range data to calibrate volatility diffusions and extracting integrated volatility.” Review of Economics and Statistics 81, 617–31. Galloway, T. M., W. B. Lee, and D. M. Roden. 1997. “Banks’ changing incentives and opportunities for risk taking.” Journal of Banking and Finance 21, 509–527. Garber, P. 1990. “Famous first bubbles.” Journal of Economic Perspectives 4, 35–54. Garcia, R. and P. Perron. 1996. “An analysis of the real interest rate under regime shifts” The Review of Economics and Statistics 78(1), 111–125. Garmaise, M. J. 2008. “Production in entrepreneurial firms: the effects of financial constraints on labor and capital.” Review of Financial Studies 21, 544. Garman, M. 1976. “Market microstructure.” Journal of Financial Economics 3, 33–53. Garman, M. and M. Klass. 1980. “On the estimation of security price volatilities from historical data.” Journal of Business 53, 67–78. Garman, M. B. and S. W. Kohlhagen. 1983. 
“Foreign currency option values.” Journal of International Money and Finance 2, 231–237.
References Garven, J. R. 1986. “A pedagogic note on the derivation of the Black– Scholes option pricing formula.” The Financial Review 21, 337–344. Gastineau, G. 1979. The stock options manual, McGraw-Hill, New York. Gastwirth, J. L. 1971. “A general definition of the Lorenz curve.” Econometrica 39, 1037–1039. Geczy, C., B. M. Minton, and C. Schrand. 1997. “Why firms use currency derivatives.” Journal of Finance 52, 1323–1354. Gehrlein, W. V., and D. Lepelley. 1998. “The condorcet efficiency of approval voting and the probability of electing the condorcet loser.” Journal of Mathematical Economics 29, 271–283. Geman, S. and D. Geman. 1984. “Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images.” IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721–741. Geman, H. and M. Yor. 1993. “Bessel processes, Asian options, and perpetuities.” Mathematical Finance 3, 349–375. Geman, H., N. El Karoui, and J.-C. Rochet. 1995. “Changes of Numéraire, change of probability measure and option pricing.” Journal of Applied Probability 32, 443–458. George, T. and F. Longstaff. 1993. “Bid-ask spreads and trading activity in the S&P 100 index options market.” Journal of Financial and Quantitative Analysis 28, 381–397. Gerber, H. U. 1970. “An extension of the renewal equation and its application in the collective theory of risk.” Scandinavian Actuarial Journal 53, 205–210. Gerber, H. U., B. Landry. 1998. On the discounted penalty at ruin in a jump-diffusion and the perpetual put option. Insurances: Mathematics and Economics 22, 263–276. Gerber, H. U. and E. S. W. Shiu. 1998. “On the time value of ruin.” North American Actuarial Journal 2, 48–78. Geske, R. 1976. The pricing of options with stochastic dividend yield, unpublished manuscript. Geske, R. 1977. “The valuation of corporate liabilities as compound options.” Journal of Financial and Quantitative Analysis 12, 541–552. Geske, R. 1979. “The valuation of compound options.” Journal of Financial Economics 7, 63–82. Geske, R. 1979. “A note on an analytical valuation formula for unprotected American call options on stocks with known dividends.” Journal of Financial Economics 7, 375–380. Geske, R. and H. E. Johnson. 1984. “The American put valued analytically.” Journal of Finance 1511–1524. Geske, R. and R. Roll. 1984. “On valuing American call options with the Black-Scholes European formula.” Journal of Finance 39, 443–455. Geske, R. and K. Shastri. 1985. “Valuation by approximation: a comparison of alternative option valuation techniques.” Journal of Financial and Quantitative Analysis 20, 45–72. Geske, R., R. Roll, and K. Shastri. 1983. “Over-the-counter option market dividend protection and “biases” in the Black-Scholes model: a note.” Journal of Finance 38, 1271–1277. Ghosh, S. K. 1991. Econometrics: theory and applications, Prentice Hall, Englewood Cliffs, NJ. Ghysels, E. 1998. “On stable factor structures in the pricing of risk: Do time-varying betas help or hurt?” Journal of Finance 53, 549–573. Ghysels, E., A. C. Harvey, and E. Renault. 1996. “Stochastic volatility,” in Statistical methods in finance, G. S. Maddala and C. R. Rao (Eds.). North Holland, Amsterdam, pp. 119–191. Gibbons, M. R. 1982. “Multivariate tests of financial models, a new approach.” Journal of Financial Economics 10, 3–27. Gibbons, M. R. and W. E. Ferson. 1985. “Testing asset pricing models with changing expectations and an unobservable market portfolio.” Journal of Financial Economics 14, 216–236. Gibbons, M., S. Ross and J. Shanken. 1989. 
“A test of the efficiency of a given portfolio.” Econometrica 57, 1121–1152.
1651 Gibson, R. and E. S. Schwartz 1990 “Stochastic convenience yield and the pricing of oil contingent claims” Journal of Finance 45, 959– 976. Giesecke, K. 2006. Default and information. Journal of Economic Dynamics and Control 30, 2281–2303. Gikhman, I. and A. V. Skorokhod. 1969. Investment to the theory of random processes, Saunders, New York. Gillan, S. and L. Starks. 2000. “Corporate governance proposals and shareholder activism: the role of institutional investors.” Journal of Financial Economics 57, 275–305. Gil-Pelaez, J. 1951. “Note on the inversion theorem.” Biometrika 38, 481–482. Gilroy, B. M. and E. Lukas. 2006. “The choice between greenfield investment and cross-border acquisition: a real option approach.” The Quarterly Review of Economics and Finance 46(3), 447–465. Gilson, S., E. Hotchkiss, and R. Ruback. 2000. “Valuation of bankrupt firms.” Review of Financial Studies 13(1), 43–74. Gitman, L. 2007. Principles of managerial finance, 12th edn. Pearson Prentice Hall, Boston. Glasserman, P. and S. G. Kou. 2003.“The Term Structure of Simple Forward Rates with Jump Risk.” Mathematical Finance, Blackwell Publishing, 13(3), 383–410. Glosten, L. 1994. “Is the electronic open limit order book inevitable?” Journal of Finance 49, 1127–1161. Glosten, L. and P. Milgrom. 1985. “Bid, ask and transaction prices in a specialist market with heterogeneously informed traders.” Journal of Financial Economics 14, 71–100. Glostern, L., R. Jagannathan, and D. Runkle. 1993. “On the relation between the expected value and the volatility of the nominal excess returns on stocks.” Journal of Finance 48, 1779–1802. Gockenbach, M. 2002. Partial differential equations: analytical and numerical methods, SIAM Press. Goettler, R., C. Parlour, and U. Rajan. 2005. “Equilibrium in a dynamic limit order market.” Journal of Finance 60, 2149–2192. Goetzmann, W. N. and P. Jorion. 1993. “Testing the predictive power of dividend yields.” The Journal of Finance 48, 663–679. Goetzmann, W. and L. Peng. 2006. “Estimating house price indexes in the presence of seller reservation prices.” Review of Economics and Statistics 88(1), 100–112. Goh, J., M. J. Gombola, H. W. Lee, and Y. L. Feng. 1999. “Private placement of common equity and earnings expectations.” Financial Review 34, 19–32. Goicoechea, A., D. R. Hansen, and L. Duckstein. 1982. Multiobjective decision analysis with engineering and business applications, Wiley, New York. Goldman, E., E. Valieva, and H. Tsurumi. 2008. “Kolmogorov-Smirnov, fluctuation and Zg tests for convergence of Markov Chain Monte Carlo draws.” Communication in Statistics, Simulation and Computations 37(2), 368–379. Goldstein, R. S. 2000. “The term structure of interest rates as a random field.” Review of Financial Studies 13, 365–384. Goldstein, R. and W. Keirstead. 1997. On the term structure of interest rates in the presence of reflecting and absorbing boundaries, Working paper, Ohio-State University. Goldstein, R., N. Ju, and H. Leland. 2001. “An EBIT-based model of dynamic capital structure.” Journal of Business 74, 483–512. Gollinger, T. L. and J. B. Morgan. 1993. “Calculation of an efficient frontier for a commercial loan portfolio.” Journal of Portfolio Management 39–46. Gómez-Bezares, F. 1993. “Penalized present value: net present value penalization with normal and beta distributions.” in: Capital budgeting under uncertainty, Aggarwal (Ed.). Prentice-Hall, Englewood Cliffs, New Jersey, pp 91–102. Gómez-Bezares, F. 2006. Gestión de carteras, Desclée de Brouwer, Bilbao.
1652 Gómez-Bezares, F., J. A. Madariaga and J. Santibánez. 2004. “Performance ajustada al riesgo.” Análisis Financiero 93, 6–17. G’omez Sanchez-Mariano, E. G., E. G. G’omez-Villegas, and J. M. Marin. 1998. “A multivariate generalization of the power exponential family of distributions.” Communication in Statistics 27, 589–600. G’omez Sanchez-Mariano, E. G., E. G. G’omez-Villegas, and J. M. Marin. 2002. “A matrix variate generalization of the power exponential family of distributions.” Communication in Statistics 31, 2167– 2182. Gompers, P. and J. Lerner. 2000. “Money chasing deals? The impact of fund inflows on private equity valuations.” Journal of Financial Economics 55, 281. Goovaerts, M. J., J. Dhaene, and A. De Schepper. 2000. “Stochastic upper bounds for present value functions.” Journal of Risk and Insurance Theory 67, 1–14. Gordon, M. J. 1974a. The cost of capital to a public utility, MSU Public Utilities Studies, East Lansing, MI. Gordon, R. 2000. “Does the ‘new economy’ measure up to the great inventions of the past?” Journal of Economic Perspectives 14, 49–74. Gordon, M. J. and E. Shapiro. 1956. “Capital equipment analysis: the required rate of profit.” Management Science 3(1), 102–110. Gorovoi, V. and V. Linetsky. 2004. “Black’s model of interest rates as options, eigenfunction expansions and Japanese interest rates.” Mathematical Finance 14(1), 49–78. Gorton, G. and R. Rosen. 1995. “Corporate control, portfolio choice, and the decline of banking.” Journal of Finance 50, 1377–1420. Gorton, G., F. Hayashi, and K. Rouwenhorst. 2007. The fundamentals of commodity futures, Yale working paper. Goswami, G., T. Noe, and M. Rebello 1995. “Debt financing under asymmetric information.” Journal of Finance 50, 633–660. Gould, J. P. 1968. “Adjustment costs in the theory of investment of the firm.” The Review of Economic Studies 1, 47–55. Gouriéroux, C., A. Monfort, E. Renault, and A. Trognon. 1987. “Generalized residuals.” Journal of Econometrics 34, 5–32. Gourieroux, C., J. P. Laurent, and H. Pham. 1998. “Mean–variance hedging and numeraire.” Mathematical Finance 8, 179–200. Goyal, A. and I. Welch. 2003. “Predicting the equity premium with dividend ratios.” Management Science 49, 639–654. Goyal, A. and I. Welch. 2004. “A comprehensive look at the empirical performance of equity premium prediction.” Working paper, Yale University. Graham, J. R. and C. R. Harvey. 2002. “How do CFOs make capital budgeting and capital structure decisions?” Journal of Applied Corporate Finance (Spring 2002), 15, 8–23. Graham, J. R. and D. A. Rogers. 2002. “Do firms hedge in response to tax incentives?” Journal of Finance 57, 815–839. Graham, J. R. and C. W. Smith Jr. 1999. “Tax incentives to hedge.” Journal of Finance 54, 2241–2262. Grammatikos, T. and A. Saunders. 1986. “Futures price variability: a test of maturity and volume effects.” Journal of Business 59, 319–330. Grandits, P. and T. Rheinländer. 2002. “On the minimal entropy martingale measure.” Annals of Probability 30, 1003–1038. Granger, C. W. J. 1969. “Investigating causal relations by econometric models and cross spectral methods.” Econometrica 37, 424–438. Granger, C. W. J. and R. Joyeux. 1980. “An introduction to longmemory time series models and fractional differencing.” Journal of Time Series Analysis 1, 15–29. Granger, C. and O. Morgenstern. 1970. Predictability of stock market prices, Heath Lexington. Granger, C. and P. Newbold. 1973. “Some comments on the evaluation of economic forecasts.” Applied Economics 5, 35–47. Granger, C. W. J. and P. 
Newbold. 1974. “Sourious Regressions in Econometrics.” Journal of Econometrics 2, 111–120.
References Granger, C. W. J. and P. Newbold. 1977. Forecasting economic time series, Academic, New York. Granger, C. and R. Ramanathan. 1984. “Improved methods of forecasting.” Journal of Forecasting 3, 197–204. Grant, D., G. Vora, and D. Weeks. 1997. “Path-dependent options: extending the Monte Carlo simulation approach.” Management Science 43(11), 1589–1602. Grauer, R. R. 1991. “Further ambiguity when performance is measured by the security market line.” Financial Review 26, 569–585. Grauer, R. R. 2008a. “Benchmarking performance measures with perfect-foresight and bankrupt asset-allocation strategies.” Journal of Portfolio Management 34, 43–57. Grauer, R. R. 2008b. “On the predictability of stock market returns: Evidence from industry rotation strategies.” Journal of Business and Management 14, 47–71. Grauer, R. R. 2008c. “Timing the market with stocks, bonds and bills.” Working paper Simon Fraser University, Canada. Grauer, R. R. 2009. “Extreme mean-variance solutions: estimation error versus modeling error.” In Applications of Management Science on Financial Optimization, 13 Financial Modeling and Data Envelopment Applications, K. D. Lawrence and G. Kleinman (Eds.), Emerald Group Publishing Limited, United Kingdom. Grauer, R. R. and N. H. Hakansson. 1982. “Higher return, lower risk: historical returns on long-run, actively managed portfolios of stocks, bonds and bills, 1936–1978.” Financial Analysts Journal 38, 39–53. Grauer, R. R. and N. H. Hakansson. 1985. “Returns on levered, actively managed long-run portfolios of stocks, bonds, and bills, 1934–1984.” Financial Analysts Journal 41, 24–43. Grauer, R. R. and N. H. Hakansson. 1986. “A half-century of returns on levered and unlevered portfolios of stocks, bonds, and bills, with and without small stocks.” Journal of Business 59, 287–318. Grauer, R. R. and N. H. Hakansson. 1987. “Gains from international diversification: 1968–85 returns on portfolios of stocks and bonds.” Journal of Finance 42, 721–739. Grauer, R. R. and N. H. Hakansson. 1993. “On the use of mean-variance and quadratic approximations in implementing dynamic investment strategies: a comparison of the returns and investment policies.” Management Science 39, 856–871. Grauer, R. R. and N. H. Hakansson. 1995a. “Gains from diversifying into real estate: three decades of portfolio returns based on the dynamic investment model.” Real Estate Economics 23, 117–159. Grauer, R. R. and N. H. Hakansson. 1995b. “Stein and CAPM estimators of the means in asset allocation.” International Review of Financial Analysis 4, 35–66. Grauer, R. R. and N. H. Hakansson. 1998. “On timing the market: the empirical probability assessment approach with an inflation adapter.” in Worldwide asset and liability modeling, W. T. Ziemba and J. M. Mulvey (Eds.). Cambridge University Press, Cambridge, 149–181. Grauer, R. R. and N. H. Hakansson. 2001. “Applying portfolio change and conditional performance measures: the case of industry rotation via the dynamic investment model.” Review of Quantitative Finance and Accounting 17, 237–265. Grauer R. R. and F. C. Shen. 2000. “Do constraints improve portfolio performance?” Journal of Banking and Finance 24, 1253–1274. Grauer, R. R., N. H. Hakansson, and F. C. Shen. 1990. “Industry rotation in the U.S. stock market: 1934–1986 returns on passive, semipassive, and active strategies.” Journal of Banking and Finance 14, 513–535. Gray, S. F. 1996. 
“Modeling the conditional distribution of interest rates as a regime-switching process.” Journal of Financial Economics 42, 27–62. Green, R. C. 1986. “Benchmark portfolio inefficiency and deviations from the security market line.” Journal of Finance 41, 295–312. Green, R. C. 1993. “A simple model of the taxable and tax-exempt yield curves.” Review of Financial Studies 6(2), 233–264.
References Green, R. C. and B. Hollifield. 1992. “When will mean-variance efficient portfolios be well diversified?” Journal of Finance 47, 1785–1809. Greene, W. H. 1993. Econometric analysis, Macmillan, New York), pp. 370–381. Greene, W. H. 1997. Econometric analysis, 3nd edn. Macmillan, New York. Greene, W. H. 2003. Econometric analysis, Prentice Hall, Upper Saddle River, NJ. Gressis, N., G. Philiippatos, and J. Hayya. 1976. “Multiperiod portfolio analysis and the inefficiencies of the market portfolio.” Journal of Finance 31, 1115–1126. Grice, J. S. and M. T. Dugan. 2001. “The limitations of bankruptcy prediction models: some cautions for the researcher.” Review of Quantitative Finance and Accounting 17, 151–166. Grice and R. W. Ingram. 2001. “Tests of generalizability of Altman’s bankruptcy prediction model.” Journal of Business Research 54, 53–61. Griffin, J. M., J. H. Harris, and S. Topaloglu. 2007. “Why are investors net buyers through lead underwriters?” Journal of Finance Economics 85, 518–551. Grinblatt, M. 2001. “An analytic solution for interest rate swap spreads.” International Review of Finance 2, 113–149. Grinblatt, M. and M. Keloharju. 2000. “The investment behavior and performance of various investor types: a study of Finland’s unique data set.” Journal of Financial Economics 55, 43–67. Grinblatt, M. and S. Titman. 1993. “Performance measurement without benchmarks: an examination of mutual fund returns.” Journal of Business 66, 47–68. Grinold, R. 1989. “The fundamental law of active management.” Journal of Portfolio Management 15(3), 30–37. Grossman, S. 1992. “The information role of upstairs and downstairs markets.” Journal of Business 65, 509–529. Grossman, S. and O. Hart. 1980. “Takeover bids, the free rider problem, and the theory of the corporation.” Bell Journal of Economics 11, 42–64. Grossman, S. and M. Miller. 1988. “Liquidity and market structure.” Journal of Finance 43(3), 617–637. Grossman, S. and J. Stiglitz. 1980. “On the impossibility of informationally efficient markets.” American Economic Review 70(3), 393–408. Gruber, M. J. 1996. “Another puzzle: the growth in actively managed mutual funds.” Journal of Finance 51, 783–810. Grunbichler, A., F. A. Longstaff and E. S. Schwartz. 1994. “Electronic screen trading and the transmission of information: an empirical examination.” Journal of Financial Intermediation 3(2), 166–187. Grundy, K. and B. G. Malkiel. 1996. “Reports of beta’s death have been greatly exaggerated.” Journal of Portfolio Management 22(3), 36–44. Guedes, J. and T. Opler 1996. “The determinants of the maturity of corporate debt issues.” Journal of Finance 51(5), 1809–1833. Gugler, K. 2003 “Corporate governance, dividend payout policy, and the interrelation between dividends, R&D, and capital investment” Journal of Banking & Finance 27, 1297–1321. Guo, W. 2001. “Maximum entropy in option pricing: a convex-spline smoothing method.” Journal of Futures Markets, 21, 819–832. Guo, X. 2002. “Some risk management problems for firms with internal competition and debt” Journal of Applied Probability 39, 55–69. Guo, J. and M. Hung. 2007. “A note on the discontinuity problem in Heston’s stochastic volatility model.” Applied Mathematical Finance 14(4), 339–345. Guo, X., R. A. Jarrow, and Y. Zeng. 2008. Credit risk models with incomplete information, Working paper. Gupta, M. C. 1972. “Differential effects of tight money: an economic rationale.” Journal of Finance 27(3), 825–838.
Gupta, M. C. and R. Huefner. 1972. “A cluster analysis study of financial ratios and industry characteristics.” Journal of Accounting Research 10(1), 77–95.
Gupta, A. and M. G. Subrahmanyam. 2000. “An empirical examination of the convexity bias in the pricing of interest rate swaps.” Journal of Financial Economics 55, 239–279.
Gupta, A., R. L. B. LeCompte, and L. Misra. 1997. “Acquisitions of solvent thrifts: wealth effects and managerial motivations.” Journal of Banking and Finance 21, 1431–1450.
Gupta, M. C., A. C. Lee, and C. F. Lee. 2007. Effectiveness of dividend policy under the capital asset pricing model: a dynamic analysis, Working paper, Fox School of Business.
Gupton, G. M., C. C. Finger, and M. Bhatia. 1997. CreditMetrics™ – Technical document, Morgan Guaranty Trust Company, New York.
Gurkaynak, R. S. 2005. “Econometric tests of asset price bubbles: taking stock,” FEDS Working Paper No. 2005-04.
Gurun, U. G., R. Johnston, and S. Markov. 2008. “Sell-side debt analysts and market efficiency,” Working paper.
Gyetvan, F. and Y. Shi. 1992. “Weak duality theorem and complementary slackness theorem for linear matrix programming problems.” Operations Research Letters 11, 249–252.
Haas, G. C. 1922. “Sales prices as a basis for farm land appraisal.” Technical Bulletin, Vol. 9, The University of Minnesota Agricultural Experiment Station, St. Paul.
Haas, M., S. Mittnik, and B. Mizrach. 2006. “Assessing central bank credibility during the EMS crises: comparing option and spot market-based forecasts.” Journal of Financial Stability 2, 28–54.
Hackbarth, D. 2006. A real options model of debt, default, and investment, Working paper, Olin School of Business, Washington University.
Hackbarth, D., J. Miao, and E. Morellec. 2006. “Capital structure, credit risk, and macroeconomic conditions.” Journal of Financial Economics 82, 519–550.
Hadar, J. and W. Russell. 1969. “Rules for ordering uncertain prospects.” The American Economic Review 59, 25–34.
Hahn, W. F. and K. H. Mathews. 2007. “Characteristics and hedonic pricing of differentiated beef demands.” Agricultural Economics 36(3), 377–393.
Hahn, J., J. Hausman, and G. Kuersteiner. 2004. “Estimation with weak instruments: accuracy of higher-order bias and MSE approximations.” Econometrics Journal 7, 272.
Hai, W., N. Mark, and Y. Wu. 1997. “Understanding spot and forward exchange rate regressions.” Journal of Applied Econometrics 12, 715–734.
Hakanoglu, E., R. Kopprasch, and E. Roman. 1989. “Constant proportion portfolio insurance for fixed-income investment: a useful variation on CPPI.” Journal of Portfolio Management 15(4), 58–66.
Hakansson, N. H. 1970. “Optimal investment and consumption strategies under risk for a class of utility functions.” Econometrica 38, 587–607.
Hakansson, N. H. 1971. “On optimal myopic portfolio policies, with and without serial correlation of yields.” Journal of Business 44, 324–334.
Hakansson, N. 1971. “Optimal entrepreneurial decisions in a completely stochastic environment.” Management Science 17(7), 427–449.
Hakansson, N. H. 1974. “Convergence to isoelastic utility and policy in multiperiod portfolio choice.” Journal of Financial Economics 1, 201–224.
Hall, P., B.-Y. Jing, and S. N. Lahiri. 1998. “On the sampling window method for long-range dependent data.” Statistica Sinica 8, 1189–1204.
Haley, C. W. and L. D. Schall. 1979. The theory of financial decisions, 2nd ed., McGraw-Hill, New York.
Halvorsen, R. and H. O. Pollakowski. 1981. “Choice of functional form for hedonic price equations.” Journal of Urban Economics 10(1), 37–49.
Halvorsen, R. and R. Palmquist. 1980. “The interpretation of dummy variables in semilogarithmic regressions.” American Economic Review 70, 474–475.
Hamada, R. S. 1972. “The effect of the firm’s capital structure on the systematic risk of common stocks.” Journal of Finance 27, 435–452.
Hamao, Y. R., W. Masulis, and V. K. Ng. 1990. “Correlations in price changes and volatility across international stock markets.” Review of Financial Studies 3, 281–307.
Hambly, B. 2006. Lecture notes on financial derivatives, Oxford University, New York.
Hamer, M. M. 1983. “Failure prediction: sensitivity of classification accuracy to alternative statistical methods and variable sets.” Journal of Accounting and Public Policy 2(4), 289–307.
Hamidi, B., E. Jurczenko, and B. Maillet. 2007. An extended expected CAViaR approach for CPPI, Working paper, Variances and University of Paris-1.
Hamilton, J. D. 1988. “Rational-expectations econometric analysis of changes in regime: an investigation of the term structure of interest rates.” Journal of Economic Dynamics and Control 12, 385–423.
Hamilton, J. 1989. “A new approach to the economic analysis of nonstationary time series and the business cycle.” Econometrica 57, 357–384.
Hamilton, J. D. 1994. Time series analysis, Princeton University Press, Princeton, NJ.
Hamilton, J. D. and R. Susmel. 1994. “Autoregressive conditional heteroskedasticity and changes in regime.” Journal of Econometrics 64, 307–333.
Hammer, P. L. 1986. “Partially defined Boolean functions and cause-effect relationships.” International Conference on Multi-Attribute Decision Making Via OR-Based Expert Systems, University of Passau, Passau, Germany.
Hammer, P. L., A. Kogan, and M. A. Lejeune. 2006. “Modeling country risk ratings using partial orders.” European Journal of Operational Research 175(2), 836–859.
Hammer, P. L., A. Kogan, and M. A. Lejeune. 2007b. “Reverse-engineering banks’ financial strength ratings using logical analysis of data.” RUTCOR Research Report 10-2007, Rutgers University, New Brunswick, NJ.
Hammer, P. L., A. Kogan, and M. A. Lejeune. 2010. Reverse-engineering country risk ratings: a combinatorial non-recursive model, forthcoming in Annals of Operations Research, DOI 10.1007/s10479-009-0529-0.
Hamori, S. and Y. Imamura. 2000. “International transmission of stock prices among G7 countries: LA-VAR approach.” Applied Economics Letters 7, 613–618.
Han, B. 2007. “Stochastic volatilities and correlations of bond yields.” Journal of Finance 62(3), 1491–1524.
Han, C.-H. Submitted. “Estimating joint default probability by efficient importance sampling with applications from bottom up.”
Han, J. and M. Kamber. 2001. Data mining: concepts and techniques, Morgan Kaufmann Publishers, San Francisco, CA.
Han, C.-H. and Y. Lai. In press. “Generalized control variate methods for pricing Asian options.” Journal of Computational Finance.
Han, S. and H. Zhou. 2008. Effects of liquidity on the nondefault component of corporate yield spreads: evidence from intraday transactions data, SSRN eLibrary, http://ssrn.com/paper=946802.
Handa, P. and R. Schwartz. 1996a. “Limit order trading.” Journal of Finance 51, 1835–1861.
Handa, P. and R. Schwartz. 1996b. “How best to supply liquidity to a securities market.” Journal of Portfolio Management 22, 44–51.
Handa, P., R. Schwartz, and A. Tiwari. 2003. “Quote setting and price formation in an order driven market.” Journal of Financial Markets 6, 461–489.
Hanke, J. and D. Wichern. 2005. Business forecasting, 8th ed., Prentice Hall, New York.
Hanley, K. W. and W. J. Wilhelm. 1995. “Evidence on the strategic allocation of initial public offerings.” Journal of Financial Economics 37, 239–257.
Hanoch, G. and H. Levy. 1969. “The efficiency analysis of choices involving risk.” Review of Economic Studies 36, 335–346.
Hansen, L. 1982. “Large sample properties of generalized method of moments estimators.” Econometrica 50, 1029–1054.
Hansen, B. E. 1994. “Autoregressive conditional density estimation.” International Economic Review 35, 705–730.
Hansen, B. E. 1999. “Threshold effects in non-dynamic panels: estimation, testing, and inference.” Journal of Econometrics 93, 345–368.
Hansen, L. P. and R. Jagannathan. 1991. “Implications of security market data for models of dynamic economies.” Journal of Political Economy 99, 225–262.
Hansen, R. and J. Lott. 1996. “Externalities and corporate objectives in a world with diversified shareholders/consumers.” Journal of Financial and Quantitative Analysis 31, 43–68.
Hao, Q. 2007. “Laddering in initial public offerings.” Journal of Financial Economics 85, 102–122.
Hao, X. R. and Y. Shi. 1996. MC2 Program: version 1.0: a C++ program run on PC or Unix, College of Information Science and Technology, University of Nebraska at Omaha.
Haque, N. U., M. S. Kumar, N. Mark, and D. Mathieson. 1996. “The economic content of indicators of developing country creditworthiness.” International Monetary Fund Working Paper 43(4), 688–724.
Haque, N. U., M. S. Kumar, N. Mark, and D. Mathieson. 1998. “The relative importance of political and economic variables in creditworthiness ratings.” International Monetary Fund Working Paper 46, 1–13.
Härdle, W. 1990. Applied nonparametric regression, Cambridge University Press, Cambridge.
Hardy, G. H., J. E. Littlewood, and G. Polya. 1934. Inequalities, Cambridge University Press, Cambridge.
Harlow, W. V. 1991. “Asset allocation in a downside-risk framework.” Financial Analysts Journal 47, 28–40.
Harrington, S. E. and G. Niehaus. 2003. “Capital, corporate income taxes, and catastrophe insurance.” Journal of Financial Intermediation 12(4), 365–389.
Harris, L. 1986. “Cross-security tests of the mixture of distributions hypothesis.” Journal of Financial and Quantitative Analysis 21, 39–46.
Harris, L. 1987. “Transaction data tests of the mixture of distributions hypothesis.” Journal of Financial and Quantitative Analysis 22, 127–141.
Harris, L. 1991. “Stock price clustering and discreteness.” Review of Financial Studies 4, 389–415.
Harris, L. 1994. “Minimum price variations, discrete bid-ask spreads, and quotation sizes.” Review of Financial Studies 7, 149–178.
Harris, L. 2003. Trading and exchanges: market microstructure for practitioners, Oxford University Press, New York.
Harris, M. and A. Raviv. 1988. “Corporate control contests and capital structure.” Journal of Financial Economics 20, 55–86.
Harris, M. and A. Raviv. 1991. “The theory of capital structure.” The Journal of Finance 46(1), 297–355.
Harris, M. and A. Raviv. 1993. “Differences of opinion make a horse race.” Review of Financial Studies 6(3), 473–506.
Harrison, J. M. 1985. Brownian motion and stochastic flow systems, Wiley, New York.
Harrison, M. and D. Kreps. 1978. “Speculative investor behavior in a stock market with heterogeneous expectations.” Quarterly Journal of Economics 92, 323–336.
Harrison, J. M. and D. M. Kreps. 1979. “Martingales and arbitrage in multiperiod securities markets.” Journal of Economic Theory 20, 381–408.
Harrison, J. M. and S. Pliska. 1981. “Martingales and stochastic integrals in the theory of continuous trading.” Stochastic Processes and Their Applications 11, 215–260.
Hart, O. D. and D. M. Kreps. 1986. “Price destabilizing speculation.” Journal of Political Economy 94, 927–953.
Hartzell, J. C. and L. T. Starks. 2003. “Institutional investors and executive compensation.” Journal of Finance 58(6), 2351–2375.
Harvey, C. R. 1989. “Time-varying conditional covariances in tests of asset pricing models.” Journal of Financial Economics 24, 289–318.
Harvey, C. 1991. “The world price of covariance risk.” The Journal of Finance 46(1), 111–157.
Harvey, C. 1995. “Predictable risk and returns in emerging markets.” Review of Financial Studies 8, 773–816.
Harvey, C. and R. Huang. 1991. “Volatility in the foreign currency futures market.” Review of Financial Studies 4, 543–569.
Harvey, A. C. and N. Shephard. 1996. “Estimation of an asymmetric stochastic volatility model for asset returns.” Journal of Business and Economic Statistics 14, 429–434.
Harvey, C. and R. Whaley. 1992a. “Market volatility and the efficiency of the S&P 100 index option market.” Journal of Financial Economics 31, 43–73.
Harvey, C. and R. Whaley. 1992b. “Dividends and S&P 100 index option valuation.” Journal of Futures Markets 12, 123–137.
Harvey, A., E. Ruiz, and N. Shephard. 1994. “Multivariate stochastic variance models.” Review of Economic Studies 61, 247–264.
Harvey, C. R., K. V. Lins, and A. H. Roper. 2004. “The effect of capital structure when expected agency costs are extreme.” Journal of Financial Economics 74, 3.
Hasbrouck, J. 1991. “Measuring the information content of stock trades.” Journal of Finance 46, 179–207.
Hasbrouck, J. 1993. “Assessing the quality of a security market: a new approach to transaction-cost measurement.” Review of Financial Studies 6(1), 191–212.
Hasbrouck, J. 1995. “One security, many markets: determining the contribution to price discovery.” Journal of Finance 50, 1175–1199.
Hasbrouck, J. 2007. Empirical market microstructure, Oxford University Press, New York.
Hasbrouck, J. and G. Sofianos. 1993. “The trades of market makers: an empirical analysis of NYSE specialists.” Journal of Finance 48, 1565–1593.
Hasbrouck, J. and D. Seppi. 2001. “Common factors in prices, order flows and liquidity.” Journal of Financial Economics 59, 383–411.
Haslett, J. 1999. “A simple derivation of deletion diagnostic results for the general linear model with correlated errors.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61(3), 603–609.
Hastie, T., R. Tibshirani, and J. Friedman. 2001. The elements of statistical learning, Springer, New York.
Haug, E. G. 1998. Option pricing formulas, McGraw-Hill, New York.
Haugen, R. A. and C. O. Kroncke. 1971. “Rate regulation and the cost of capital in the insurance industry.” Journal of Financial and Quantitative Analysis 6(5), 1283–1305.
Haugen, R. and D. Wichern. 1975. “The intricate relationship between financial leverage and the stability of stock prices.” Journal of Finance 30, 1283–1292.
Haushalter, G. D. 2000. “Financing policy, basis risk, and corporate hedging: evidence from oil and gas producers.” Journal of Finance 55, 107–152.
Hausman, J. 1978. “Specification tests in econometrics.” Econometrica 46, 1251–1271.
Hayashi, F. 2000. Econometrics, Princeton University Press, Princeton and Oxford.
Hayre, L. S. and M. Saraf. 2008. A loss severity model for residential mortgages, Citigroup Global Markets, Inc., U.S. Fixed Income Strategy & Analysis – Mortgages, January 22.
Hayre, L. S., M. Saraf, R. Young, and J. D. Chen. 2008. Modeling of mortgage defaults, Citigroup Global Markets, Inc., U.S. Fixed Income Strategy & Analysis – Mortgages, January 22.
He, H. 2000. Modeling term structures of swap spreads, Working paper, Yale University.
He, G. and R. Litterman. 1999. The intuition behind Black-Litterman model portfolios, Investment Management Division, Goldman Sachs.
He, J., W. Hu, and L. H. P. Lang. 2000. Credit spread curves and credit ratings, SSRN eLibrary, http://ssrn.com/abstract=224393.
Healy, P. M. and K. G. Palepu. 1988. “Earnings information conveyed by dividend initiations and omissions.” Journal of Financial Economics 21, 149–176.
Heath, D., R. Jarrow, and A. Morton. 1992. “Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation.” Econometrica 60(1), 77–105.
Heckman, J. 1979. “Sample selection as a specification error.” Econometrica 47, 153–161.
Heidari, M. and L. Wu. 2003. “Are interest rate derivatives spanned by the term structure of interest rates?” Journal of Fixed Income 13(1), 75–86.
Heine, R. and F. Harbus. 2002. “Toward a more complete model of capital structure.” Journal of Applied Corporate Finance 15(1), 31–46.
Heinemann, F. 2003. “Risk aversion and incentive effects: comment,” Working paper, Ludwig-Maximilians-Universität München.
Heinkel, R., M. Howe, and J. S. Hughes. 1990. “Commodity convenience yields as an option profit.” Journal of Futures Markets 10, 519–533.
Helwege, J. 1996. “Determinants of saving and loan failures: estimates of a time-varying proportional hazard function.” Journal of Financial Services Research 10, 373–392.
Helwege, J. and C. M. Turner. 1999. “The slope of the credit yield curve for speculative-grade issuers.” Journal of Finance 54, 1869–1884.
Henderson, J. and R. Quandt. 1980. Microeconomic theory: a mathematical approach, 3rd Edition, McGraw-Hill, New York.
Henry, D. and D. Welch. 2002. “$100 billion goes poof.” Business Week 10, 131–133.
Hermalin, B. E. and M. S. Weisbach. 1998. “Endogenously chosen boards of directors and their monitoring of the CEO.” American Economic Review 88(1), 96–118.
Hermalin, B. E. and M. S. Weisbach. 2003. “Boards of directors as an endogenously determined institution: a survey of the economic literature.” Federal Reserve Bank of New York Economic Policy Review 9, 7–26.
Herold, U., R. Maurer, and N. Purschaker. 2005. “Total return fixed-income portfolio management: a risk-based dynamic strategy.” Journal of Portfolio Management 31, 32–43.
Hertzel, M. G. and L. Rees. 1998. “Earnings and risk changes around private placements of equity.” Journal of Accounting, Auditing and Finance 13, 21–35.
Hertzel, M. G. and R. L. Smith. 1993. “Market discounts and shareholder gains for placing equity privately.” Journal of Finance 48, 459–485.
Hertzel, M. G., M. Lemmon, J. Linck, and L. Rees. 2002. “Long-run performance following private placements of equity.” Journal of Finance 57, 2595–2617.
Heston, S. L. 1991. “A closed form solution for options with stochastic volatility with application to bond and currency options.” Review of Financial Studies 6, 327–343.
Heston, S. 1993. “A closed-form solution for options with stochastic volatility with application to bond and currency options.” The Review of Financial Studies 6(2), 327–343.
Heston, S. L. and S. Nandi. 2000. “A closed-form GARCH option valuation model.” Review of Financial Studies 13(3), 585–625.
Heston, S. and G. Zhou. 2000. “On the rate of convergence of discrete-time contingent claims.” Mathematical Finance 10, 53–75.
Heston, S. L., M. Lowenstein, and G. A. Willard. 2007. “Options and bubbles.” Review of Financial Studies 20, 359–390.
Heynen, R. C. and H. M. Kat. 1994. “Volatility prediction: a comparison of stochastic volatility, GARCH(1,1) and EGARCH(1,1) models.” Journal of Derivatives 2(2), 50–65.
Heynen, R., A. G. Z. Kemna, and T. Vorst. 1994. “Analysis of the term structure of implied volatilities.” Journal of Financial and Quantitative Analysis 29, 31–56.
Hiemstra, C. and J. Jones. 1994. “Testing for linear and nonlinear Granger causality in the stock price-volume relation.” Journal of Finance 49, 1639–1664.
Higgins, R. C. 1974. “Growth, dividend policy and cost of capital in the electric utility industry.” Journal of Finance 29, 1189–1201.
Higgins, R. C. 1984. Analysis for financial management, Richard D. Irwin, Homewood, IL.
Hilberink, B. and L. C. G. Rogers. 2002. “Optimal capital structure and endogenous default.” Finance and Stochastics 6, 237–263.
Hill, R. D. 1979. “Profit regulation in property-liability insurance.” Bell Journal of Economics 10(1), 172–191.
Hill, J. R. and B. Melamed. 1995. “TEStool: a visual interactive environment for modeling autocorrelated time series.” Performance Evaluation 24(1&2), 3–22.
Hill, R. D. and F. Modigliani. 1987. “The Massachusetts model of profit regulation in nonlife insurance: an appraisal and extension,” in Fair rate of return in property-liability insurance, J. D. Cummins and S. E. Harrington (Eds.), Kluwer Academic Publishers, Norwell, MA.
Hillebrand, E. 2005. “Neglecting parameter changes in GARCH models.” Journal of Econometrics 129, 121–138.
Hilliard, J. E. and A. L. Schwartz. 1996. “Binomial option pricing under stochastic volatility and correlated state variable.” Journal of Derivatives 4, 23–39.
Hillier, F. S. 1963. “The derivation of probabilistic information for the evaluation of risky investments.” Management Science 10, 443–457.
Hillier, F. S. 1969. The evaluation of risky interrelated investments, North-Holland Publishing Co., Amsterdam.
Hirota, S. and S. Sunder. 2005. “Price bubbles sans dividend anchors: evidence from laboratory stock markets,” Working paper.
Hirtle, B. 1997. “Derivatives, portfolio composition and bank holding company interest rate risk exposure.” Journal of Financial Services Research 12, 243–276.
Hjort, N. L. and D. Pollard. 1993. Asymptotics for minimisers of convex processes, Statistical Research Report, University of Oslo.
Ho, T. H. 1996. “Finite automata play repeated prisoner’s dilemma with information processing costs.” Journal of Economic Dynamics and Control 20(January–March), 173–207.
Ho, H. C. 2006. “Estimation errors of the Sharpe ratio for long-memory stochastic volatility models,” in Time series and related topics, IMS Lecture Notes-Monograph Series 52, pp. 165–172.
Ho, H. C. and T. Hsing. 1996. “On the asymptotic expansion of the empirical process of long-memory moving averages.” The Annals of Statistics 24, 992–1024.
Ho, H. C. and T. Hsing. 1997. “Limit theorems for functionals of moving averages.” The Annals of Probability 25, 1636–1669.
Ho, T. and S. Lee. 1986. “Term structure movements and pricing interest rate contingent claims.” Journal of Finance 41, 1011–1029.
Ho, T. S. Y. and S. B. Lee. 2004a. The Oxford guide to financial modeling, Oxford University Press, New York.
Ho, T. S. Y. and S. B. Lee. 2004b. “Valuing high yield bonds: a business modeling approach.” Journal of Investment Management 2(2), 1–12.
Ho, H. C. and F. I. Liu. 2008. Sample quantile analysis on long-memory stochastic volatility models and its applications in estimating VaR, Working paper.
Ho, T. S. Y. and R. Macris. 1984. “Dealer bid-ask quotes and transaction prices: an empirical study of some AMEX options.” Journal of Finance 39(1), 23–45.
Ho, T. S. Y. and R. F. Singer. 1982. “Bond indenture provisions and the risk of corporate debt.” Journal of Financial Economics 10, 375–406.
Ho, T. and H. Stoll. 1980. “On dealer markets under competition.” Journal of Finance 35, 259–267.
Ho, T. and H. Stoll. 1981. “Optimal dealer pricing under transactions and return uncertainty.” Journal of Financial Economics 9, 47–73.
Ho, T. and H. Stoll. 1983. “The dynamics of dealer markets under competition.” Journal of Finance 38, 1053–1074.
Ho, T., R. Schwartz, and D. Whitcomb. 1985. “The trading decision and market clearing under transaction price uncertainty.” Journal of Finance 40, 21–42.
Ho, Y. K., M. Tjahjapranata, and C. M. Yap. 2006. “Size, leverage, concentration, and R&D investment in generating growth opportunities.” Journal of Business 79(2), 851–876.
Ho, L. C., J. Cadle, and M. Theobald. 2008. An analysis of risk-based asset allocation and portfolio insurance strategies, Working paper, Central Bank of the Republic of China (Taiwan).
Ho, H. C., S. S. Yang, and F. I. Liu. 2008. A stochastic volatility model for reserving long-duration equity-linked insurance: long-memory, Working paper.
Hocking, R. R. 1976. “The analysis and selection of variables in linear regression.” Biometrics 32, 1–49.
Hodrick, R. 1992. “Dividend yields and expected stock returns: alternative procedures for inference and measurement.” Review of Financial Studies 5, 357–386.
Hoeting, J., D. Madigan, A. Raftery, and C. Volinsky. 1999. “Bayesian model averaging: a tutorial.” Statistical Science 14(4), 382–417.
Hoff, P. D. 2006. “Marginal set likelihood for semiparametric copula estimation,” Mimeograph.
Hoffman, C. 2000. Valuation of American options, Thesis, Oxford University.
Hoffman, N., E. Platen, and M. Schweizer. 1992. “Option pricing under incompleteness and stochastic volatility.” Mathematical Finance 2, 153–187.
Hobson, D. 2004. “Stochastic volatility models, correlation and the q-optimal measure.” Mathematical Finance 14(4), 537–556.
Hodges, S. D. and A. Neuberger. 1989. “Optimal replication of contingent claims under transaction costs.” Review of Futures Markets 8, 222–239.
Hofmeyr, S. A. and S. Forrest. 2000. “Architecture for an artificial immune system.” Evolutionary Computation 8(4), 443–473.
Hogg, R. V. and S. A. Klugman. 1983. “On the estimation of long tailed skewed distributions with actuarial applications.” Journal of Econometrics 23, 91–102.
Holden, C. W. 2005. Excel modeling in investments, 2nd ed., Prentice-Hall, Upper Saddle River, New Jersey, ISBN 0-13-142426-2.
Holden, C. W. 2009. Excel modeling and estimation, 3rd ed., Prentice-Hall, Upper Saddle River, New Jersey.
Holland, J. H. 1975. Adaptation in natural and artificial systems, The University of Michigan Press, Ann Arbor.
Holt, C. A. and S. K. Laury. 2002. “Risk aversion and incentive effects.” American Economic Review 92, 1644–1655.
Homan, M. 1989. A study of administrations under the Insolvency Act 1986: the results of administration orders made in 1987, ICAEW, London.
Hong, Y. and H. Li. 2005. “Nonparametric specification testing for continuous-time models with applications to term structure of interest rates.” Review of Financial Studies 18, 37–84.
Hong, H. and J. Stein. 2003. “Differences of opinion, short-sales constraints and market crashes.” Review of Financial Studies 16(2), 487–525.
Hong, H. G., J. A. Scheinkman, and W. Xiong. 2006. “Asset float and speculative bubbles.” Journal of Finance 61, 1073–1117.
Horvath, L., P. Kokoszka, and A. Zhang. 2006. “Monitoring constancy of variance in conditionally heteroskedastic time series.” Econometric Theory 22, 373–402.
Hoshi, T., A. Kashyap, and D. Scharfstein. 1991. “Corporate structure, liquidity, and investment: evidence from Japanese industrial groups.” Quarterly Journal of Economics 106, 33–60.
Hosking, J. R. M. 1981. “Fractional differencing.” Biometrika 68, 165–176.
Hosking, J. R. M. 1994. “The four-parameter kappa distribution.” IBM Journal of Research and Development 38, 251–258.
Hotchkiss, E. S. and T. Ronen. 2002. “The informational efficiency of the corporate bond market: an intraday analysis.” Review of Financial Studies 15, 1325–1354.
Houston, J. F. and C. James. 1995. “CEO compensation and bank risk: is compensation in banking structured to promote risk taking?” Journal of Monetary Economics 36, 405–431.
Houston, J. F. and C. James. 1996. “Banking relationships, financial constraints and investment: are bank dependent borrowers more financially constrained?” Journal of Financial Intermediation 5(2), 211–221.
Houston, J. F. and M. D. Ryngaert. 1994. “The overall gains from large bank mergers.” Journal of Banking and Finance 18, 1155–1176.
Houston, J. F. and S. Venkataraman. 1994. “Optimal maturity structure with multiple debt claims.” Journal of Financial and Quantitative Analysis 29, 179–201.
Houston, J. F., C. James, and M. Ryngaert. 2001. “Where do merger gains come from? Bank mergers from the perspective of insiders and outsiders.” Journal of Financial Economics 60, 285–331.
Houweling, P. and T. Vorst. 2005. “Pricing default swaps: empirical evidence.” Journal of International Money and Finance 24, 1200–1225.
Hovakimian, A., T. Opler, and S. Titman. 2001. “The debt-equity choice.” Journal of Financial and Quantitative Analysis 36(1), 1–25.
Howard, C. T. and L. J. D’Antonio. 1984. “A risk-return measure of hedging effectiveness.” Journal of Financial and Quantitative Analysis 19, 101–112.
Howe, K. M. and J. H. Patterson. 1985. “Capital investment decisions under economies of scale in flotation costs.” Financial Management 14, 61–69.
Howton, S. and S. Perfect. 1998. “Currency and interest-rate derivatives use in U.S. firms.” Financial Management 27(4), 111–121.
Hsia, C. C. 1981. “Coherence of the modern theories of finance.” The Financial Review 16, 27–42.
Hsieh, J. and R. A. Walkling. 2005. “Determinants and implications of arbitrage holdings in acquisitions.” Journal of Financial Economics 77, 605.
Hsu, D. A. 1977. “Tests for variance shift at an unknown time point.” Journal of the Royal Statistical Society, Series C 26(3), 279–284.
Hsu, D. A. 1979. “Detecting shifts of parameter in gamma sequences with applications to stock price and air traffic flow analysis.” Journal of the American Statistical Association 74(365), 31–40.
Hsu, D. A. 1982a. “A Bayesian robust detection of shift in the risk structure of stock market returns.” Journal of the American Statistical Association 77(377), 29–39.
Hsu, D. A. 1982b. “Robust inferences for structural shift in regression models.” Journal of Econometrics 19, 89–107.
Hsu, D. A. 1984. “The behavior of stock returns: is it stationary or evolutionary?” Journal of Financial and Quantitative Analysis 19(1), 11–28.
Hsu, D. A., R. B. Miller, and D. W. Wichern. 1974. “On the stable Paretian behavior of stock-market prices.” Journal of the American Statistical Association 69(345), 108–113.
Hsu, Y. L., T. I. Lin, and C. F. Lee. 2008. “Constant elasticity of variance (CEV) option pricing model: integration and detailed derivation.” Mathematics and Computers in Simulation 79(1), 60–71.
Hu, Y.-T., R. Kiesel, and W. Perraudin. 2002. “The estimation of transition matrices for sovereign credit ratings.” Journal of Banking and Finance 26(7), 1383–1406.
Huang, J.-Z. 1997. Default risk, renegotiation, and the valuation of corporate claims, PhD dissertation, New York University.
Huang, J.-Z. 2005. Affine structural models of corporate bond pricing, Working paper, Penn State University, presented at the 2005 FDIC Conference.
Huang, J.-Z. and M. Huang. 2003. How much of the corporate-Treasury yield spread is due to credit risk? SSRN eLibrary, http://ssrn.com/paper=307360.
Huang, J.-Z. and W. Kong. 2003. “Explaining credit spread changes: new evidence from option-adjusted bond indexes.” Journal of Derivatives 11, 30–44.
Huang, J.-Z. and W. Kong. 2005. Macroeconomic news announcements and corporate bond credit spreads, SSRN eLibrary, http://ssrn.com/paper=693341.
Huang, N. E. and S. P. Shen. 2005. Hilbert–Huang transform and its applications, World Scientific, London.
Huang, R. D. and H. R. Stoll. 1996. “Dealer versus auction markets: a paired comparison of execution costs on NASDAQ and the NYSE.” Journal of Financial Economics 41, 313–357.
Huang, J.-Z. and X. Zhang. 2008. “The slope of credit spread curves.” Journal of Fixed Income 18, 56–71.
Huang, J.-Z. and H. Zhou. 2008. Specification analysis of structural credit risk models, SSRN eLibrary, http://ssrn.com/paper=1105640.
Huang, N. E., Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N.-C. Yen, C. C. Tung, and H. H. Lin. 1998. “The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis.” Proceedings of the Royal Society of London A 454, 903–995.
Huang, J., N. Ju, and H. Ou-Yang. 2003. A model of optimal capital structure with stochastic interest rates, Working paper, New York University.
Huang, N. E., M. C. Wu, S. R. Long, S. P. Shen, W. Qu, P. Gloersen, and K. L. Fan. 2003. “A confidence limit for the empirical mode decomposition and Hilbert spectral analysis.” Proceedings of the Royal Society of London A 459(2037), 2317–2345.
Huang, Z., H. Chen, C.-J. Hsu, W.-H. Chen, and S. Wu. 2004. “Credit rating analysis with support vector machines and neural networks: a market comparative study.” Decision Support Systems 37, 543–558.
Hubbard, R. G. and D. Palia. 1995. “Executive pay and performance: evidence from the U.S. banking industry.” Journal of Financial Economics 39, 105–130.
Huberman, G. and S. Kandel. 1987. “Mean-variance spanning.” Journal of Finance 42, 873–888.
Huberman, G. and S. Kandel. 1990. “Market efficiency and Value Line’s record.” Journal of Business 63, 187–216.
Huberman, G. and S. Ross. 1983. “Portfolio turnpike theorems, risk aversion and regularly varying utility functions.” Econometrica 51, 1104–1119.
Huberman, G. and G. W. Schwert. 1985. “Information aggregation, inflation, and the pricing of indexed bonds.” Journal of Political Economy 93, 92–114.
Hudson, J. 1987. “The age, regional and industrial structure of company liquidations.” Journal of Business Finance & Accounting 14(2), 199–213.
Hueng, C. J. and R. Brooks. 2002. Forecasting asymmetries in aggregate stock market returns: evidence from higher moments and conditional densities, Working paper WP02-08-01, The University of Alabama.
Hueng, C. J. and J. B. McDonald. 2005. “Forecasting asymmetries in aggregate stock market returns: evidence from conditional skewness.” Journal of Empirical Finance 12, 666–685.
Hughes, J., W. Lang, L. J. Mester, and C.-G. Moon. 1996. “Efficient banking under interstate branching.” Journal of Money, Credit, and Banking 28, 1045–1071.
Hughes, J., W. Lang, C.-G. Moon, and M. S. Pagano. 2006. Managerial incentives and the efficiency of capital structure in U.S. commercial banking, Working paper, Rutgers University.
Hull, J. 1993. Options, futures, and other derivative securities, 2nd Edition, Prentice-Hall, Englewood Cliffs, NJ.
Hull, J. C. 2002. Options, futures, and other derivatives, 5th Edition, Prentice Hall, Hoboken, NJ.
Hull, J. 2003. Options, futures and other derivatives, 5th ed., Prentice Hall, Upper Saddle River, NJ.
Hull, J. 2005. Options, futures, and other derivatives, 6th Edition, Prentice Hall, Upper Saddle River, NJ.
Hull, J. 2006. Options, futures, and other derivatives, 6th ed., Pearson, Upper Saddle River, NJ.
Hull, J. 2009. Options, futures, and other derivatives, 7th ed., Prentice-Hall, Upper Saddle River, NJ.
Hull, J. and A. White. 1987. “The pricing of options on assets with stochastic volatilities.” Journal of Finance 42, 281–300.
Hull, J. and A. White. 1987. “Hedging the risks from writing foreign currency options.” Journal of International Money and Finance 6(2), 131–152.
Hull, J. and A. White. 1993. “Efficient procedures for valuing European and American path-dependent options.” The Journal of Derivatives 1, 21–31.
Hull, J. and A. White. 1993. “One-factor interest-rate models and the valuation of interest-rate derivative securities.” Journal of Financial and Quantitative Analysis 28, 235–254.
Hull, J. and A. White. 2000. “Valuing credit default swaps I: no counterparty default risk.” Journal of Derivatives 8, 29–40.
Hull, J. and A. White. 2001. “Valuing credit default swaps II: modeling default correlations.” Journal of Derivatives 8, 12–22.
Hull, J. and A. White. 2004. “Valuation of a CDO and an nth to default CDS without Monte Carlo simulation.” Journal of Derivatives 12(2), 8–48.
Hull, J., M. Predescu, and A. White. 2004. “The relationship between credit default swap spreads, bond yields, and credit rating announcements.” Journal of Banking and Finance 28, 2789–2811.
Hull, J., I. Nelken, and A. White. 2004. Merton’s model, credit risk, and volatility skews, Working paper, University of Toronto.
Huson, M. R., P. H. Malatesta, and R. Parrino. 2006. Capital market conditions and the volume and pricing of private equity sales, Unpublished working paper, University of Alberta, University of Washington, and University of Texas at Austin.
Hulten, C. and F. Wykoff. 1981. “The measurement of economic depreciation,” in Depreciation, inflation, and the taxation of income from capital, C. Hulten (Ed.), Urban Institute Book, Washington.
Hutchinson, J., A. Lo, and T. Poggio. 1994. “A nonparametric approach to the pricing and hedging of derivative securities via learning networks.” Journal of Finance 49(June), 851–889.
Hwang, D.-Y., F.-S. Shie, K. Wang, and J.-C. Lin. 2009. “The pricing of deposit insurance considering bankruptcy costs and closure policies.” Journal of Banking and Finance 33(10), 1909–1919.
Ibbotson Associates. 1994. Stocks, bonds, bills and inflation 1994 yearbook, Ibbotson Associates, Inc., Chicago.
Ibbotson, R. G. 1992. Stocks, bonds, bills, and inflation: 1991 yearbook, R. G. Ibbotson Associates, Chicago, IL.
Ibbotson, R. G. and R. A. Sinquefield. 1976. “Stocks, bonds, bills, and inflation: simulations of the future (1976–2000).” Journal of Business 49, 313–338.
Ilmanen, A. 2003. “Expected returns on stocks and bonds.” Journal of Portfolio Management, Winter 2003, 7–27.
Inclan, C. 1993. “Detection of multiple changes of variance using posterior odds.” Journal of Business and Economic Statistics 11, 189–200.
Inclan, C. and G. C. Tiao. 1994. “Use of cumulative sums of squares for retrospective detection of changes of variance.” Journal of the American Statistical Association 89(427), 913–923.
Ingersoll, J. E. 1976. “A theoretical and empirical investigation of the dual purpose funds: an application of contingent claims analysis.” Journal of Financial Economics 3, 83–123.
Ingersoll, J. 1977. “An examination of corporate call policies on convertible securities.” Journal of Finance 32(2), 463–478.
Ingersoll, J. 1987. Theory of financial decision making, Rowman & Littlefield Publishers.
Israel, R. 1992. “Capital and ownership structures, and the market for corporate control.” Review of Financial Studies 5(2), 181–198.
Itô, K. 1944. “Stochastic integral.” Proceedings of the Imperial Academy, Tokyo 20, 519–524.
Jacklin, C. and M. Gibbons. 1989. Constant elasticity of variance diffusion estimation, Working paper, Stanford University.
Jackson, D. 1976. Jackson personality inventory manual, Research Psychologists Press, Goshen, NY.
Jackson, D. 1977. “Reliability of the Jackson personality inventory.” Psychological Reports 40, 613–614.
Jackson, M. and M. Staunton. 2001. Advanced modeling in finance using Excel and VBA, Wiley, New York.
Jackson, D., D. Hourany, and N. Vidmar. 1972. “A four dimensional interpretation of risk taking.” Journal of Personality 40, 433–501.
Jackwerth, J. C. 2000. “Recovering risk aversion from option prices and realized returns.” Review of Financial Studies 13, 433–451.
Jackwerth, J. C. and M. Rubinstein. 1996. “Recovering probability distributions from option prices.” Journal of Finance 51, 1611–1631.
Jacob, N. 1974. “A limited-diversification portfolio selection model for the small investor.” Journal of Finance 29, 847–856.
Jacod, J. 1979. Calcul stochastique et problèmes de martingales, Lecture Notes in Mathematics, Vol. 714, Springer, Berlin.
Jacod, J. and A. N. Shiryaev. 1987. Limit theorems for stochastic processes, 2nd Edition, Springer, New York.
Jacques, M. 1996. “On the hedging portfolio of Asian options.” ASTIN Bulletin 26, 165–183.
Jacquier, E., N. G. Polson, and P. E. Rossi. 1994. “Bayesian analysis of stochastic volatility models (with discussion).” Journal of Business and Economic Statistics 12, 371–417.
Jacquier, E., N. G. Polson, and P. E. Rossi. 1998. Stochastic volatility: univariate and multivariate extensions, CIRANO Working paper.
Jacquier, E., N. G. Polson, and P. E. Rossi. 2004. “Bayesian analysis of stochastic volatility models with fat-tails and correlated errors.” Journal of Econometrics 122, 185–212.
Jaffe, J. F. 1974. “Special information and insider trading.” Journal of Business 47, 410–428.
Jaffee, D. 2003. “The interest rate risk of Fannie Mae and Freddie Mac.” Journal of Financial Services Research 24(1), 5–29.
Jagannathan, R. and T. S. Ma. 2003. “Risk reduction in large portfolios: why imposing the wrong constraints helps.” Journal of Finance 58, 1651–1683.
Jagannathan, R. and Z. Wang. 1996. “The conditional CAPM and the cross-section of expected returns.” Journal of Finance 51(1), 3–53.
Jagannathan, R., A. Kaplin, and S. Sun. 2003. “An evaluation of multifactor CIR models using LIBOR, swap rates, and cap and swaption prices.” Journal of Econometrics 116, 113–146.
Jagerman, D. L. and B. Melamed. 1992a. “The transition and autocorrelation structure of TES processes part I: general theory.” Stochastic Models 8(2), 193–219.
Jagerman, D. L. and B. Melamed. 1992b. “The transition and autocorrelation structure of TES processes part II: special cases.” Stochastic Models 8(3), 499–527.
Jagerman, D. L. and B. Melamed. 1994. “The spectral structure of TES processes.” Stochastic Models 10(3), 599–618.
Jagerman, D. L. and B. Melamed. 1995. “Bidirectional estimation and confidence regions for TES processes.” Proceedings of MASCOTS ’95, Durham, NC, pp. 94–98.
Jaimungal, S. and T. Wang. 2006. “Catastrophe options with stochastic interest rates and compound Poisson losses.” Insurance: Mathematics and Economics 38, 469–483.
Jain, P. C. 1992. “Equity issues and changes in expectations of earnings by financial analysts.” Review of Financial Studies 5, 669–683.
Jain, A. K., R. P. W. Duin, and J. Mao. 2000. “Statistical pattern recognition: a review.” IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 4–37.
James, D. and R. M. Isaac. 2000. “Asset markets: how they are affected by tournament incentives for individuals.” American Economic Review 90, 995–1004.
Jamshidian, F. 1996. “Bond, futures and option valuation in the quadratic interest rate model.” Applied Mathematical Finance 3, 93–115.
Janosi, T., R. Jarrow, and Y. Yildirim. 2002. “Estimating expected losses and liquidity discounts implicit in debt prices.” Journal of Risk 5, 1–38.
Jarrow, R. 1992. “Market manipulation, bubbles, corners and short squeezes.” Journal of Financial and Quantitative Analysis 27, 311–336.
Jarrow, R. 1994. “Derivative security markets, market manipulation and option pricing.” Journal of Financial and Quantitative Analysis 29(2), 241–261.
Jarrow, R. 2001. “Default parameter estimation using market prices.” Financial Analysts Journal, September/October, 75–92.
Jarrow, R. and A. Rudd. 1982. “Approximate option valuation for arbitrary stochastic processes.” Journal of Financial Economics 10, 347–369.
Jarrow, R. A. and A. Rudd. 1983. Option pricing, Richard D. Irwin, Homewood, IL.
Jarrow, R. A. and S. M. Turnbull. 1995. “Pricing derivatives on financial securities subject to credit risk.” Journal of Finance 50, 53–85.
Jarrow, R. and S. Turnbull. 1997. “An integrated approach to the hedging and pricing of Eurodollar derivatives.” Journal of Risk and Insurance 64(2), 271–299.
Jarrow, R. A. and S. Turnbull. 1999. Derivative securities, 2nd Edition, South-Western College Publishing, Cincinnati, OH.
Jarrow, R. A. and F. Yu. 2001. “Counterparty risk and the pricing of defaultable securities.” Journal of Finance 56, 1765–1799.
Jarrow, R. A., D. Lando, and S. M. Turnbull. 1997. “A Markov model for the term structure of credit risk spreads.” Review of Financial Studies 10, 481–523.
Jarrow, R. A., D. Lando, and F. Yu. 2005. “Default risk and diversification: theory and empirical implications.” Mathematical Finance 15, 1–26.
Jarrow, R., H. Li, and F. Zhao. 2007. “Interest rate caps ‘smile’ too! But can the LIBOR market models capture it?” Journal of Finance 62, 345–382.
Jegadeesh, N. 1992. “Does market risk really explain the size effect?” Journal of Financial and Quantitative Analysis 27, 337–351.
Jegadeesh, N. and S. Titman. 2001. “Profitability of momentum strategies: an evaluation of alternative explanations.” Journal of Finance 56, 699–720.
Jelenkovic, P. and B. Melamed. 1995. “Algorithmic modeling of TES processes.” IEEE Transactions on Automatic Control 40(7), 1305–1312.
Jensen, M. C. 1968. “The performance of mutual funds in the period 1945–1964.” Journal of Finance 23, 389–416.
Jensen, M. C. 1969. “Risk, the pricing of capital assets, and the evaluation of investment portfolios.” Journal of Business 42(2), 167–247.
Jensen, M. C. 1972. “Capital markets: theory and evidence.” The Bell Journal of Economics and Management Science 3, 357–398.
Jensen, M. C. 1986. “Agency costs of free cash flow, corporate finance and takeovers.” American Economic Review 76, 323–329.
Jensen, M. and W. Meckling. 1976. “Theory of the firm: managerial behavior, agency costs and ownership structure.” Journal of Financial Economics 3(4), 305–360.
Jensen, M. C. and W. H. Meckling. 1978. “Can the corporation survive?” Financial Analysts Journal 34, 31–37.
Jensen, P. H. and E. Webster. 2007. “Labelling characteristics and demand for retail grocery products in Australia.” Australian Economic Papers 47(2), 129–140.
Jeon, B. N. and B. S. Jang. 2004. “The linkage between the US and Korean stock markets: the case of NASDAQ, KOSDAQ, and the semiconductor stocks.” Research in International Business and Finance 18, 319–340.
Jeon, B. N. and G. M. von Furstenberg. 1990. “Growing international co-movement in stock price indexes.” Quarterly Review of Economics and Business 30, 15–30.
Jin, Y. and P. Glasserman. 2001. “Equilibrium positive interest rates: a unified view.” Review of Financial Studies 14(1), 187–214.
Joachims, T. 1999. Making large-scale SVM learning practical, MIT Press, Cambridge, MA.
Jobson, J. D. and B. Korkie. 1980. “Estimation for Markowitz efficient portfolios.” Journal of the American Statistical Association 75, 544–554.
Jobson, J. D. and B. Korkie. 1981. “Putting Markowitz theory to work.” Journal of Portfolio Management 7, 70–74.
Jobson, J. D. and B. Korkie. 1989. “A performance interpretation of multivariate tests of asset set intersection, spanning and mean-variance efficiency.” Journal of Financial and Quantitative Analysis 24, 185–204.
Jobson, J. D., B. Korkie, and V. Ratti. 1979. “Improved estimation for Markowitz portfolios using James-Stein type estimators.” Proceedings of the American Statistical Association, Business and Economics Statistics Section 41, 279–284.
Joe, H. 1997. Multivariate models and dependence concepts, Chapman & Hall, London.
Johannes, M. 2004. “The statistical and economic role of jumps in continuous-time interest rate models.” Journal of Finance 59, 227–260.
John, T. A. 1993. “Accounting measures of corporate liquidity, leverage and costs of financial distress.” Financial Management 22, 91–100.
John, K. and L. Lang. 1991. “Insider trading around dividend announcements: theory and evidence.” Journal of Finance 46, 1361–1390.
John, K. and B. Mishra. 1990. “Information content of insider trading around corporate announcements: the case of capital expenditures.” Journal of Finance 45, 835–855.
John, K. and J. Williams. 1985. “Dividends, dilution and taxes: a signaling equilibrium.” Journal of Finance 40, 1053–1070.
John, K., L. Lang, and F. Shih. 1992. Anti-takeover measures and insider trading: theory and evidence, Working paper, NYU Stern Graduate School of Business.
Johnson, L. L. 1960. “The theory of hedging and speculation in commodity futures.” Review of Economic Studies 27, 139–151.
Johnson, H. 1981. “The pricing of complex options,” Unpublished manuscript.
Johnson, H. 1987. “Options on the maximum or the minimum of several assets.” Journal of Financial and Quantitative Analysis 22, 277–283.
Johnson, S. A. 1997. “An empirical analysis of the determinants of corporate debt ownership structure.” Journal of Financial and Quantitative Analysis 32(1), 47–69.
Johnson, S. A. 1998. “The effect of bank debt on optimal capital structure.” Financial Management 27(1), 47–56.
Johnson, S. A. 2003. “Debt maturity and the effects of growth opportunities and liquidity risk on leverage.” Review of Financial Studies 16, 209.
Johnson, T. W. 2004. “Predictable investment horizons and wealth transfers among mutual fund shareholders.” Journal of Finance 59(5), 1979–2012.
Johnson, N. and S. Kotz. 1970. Distributions in statistics: continuous univariate distributions – 2, Houghton Mifflin, Boston.
Johnson, N. L. and S. Kotz. 1970. Distributions in statistics: continuous univariate distributions – 1, Wiley, New York.
Johnson, N. L. and S. Kotz. 1972. Distributions in statistics: continuous multivariate distributions, Wiley, New York.
Johnson, H. and D. Shanno. 1987. “Option pricing when the variance is changing.” The Journal of Financial and Quantitative Analysis 22(2), 143–151.
Johnson, N. L., S. Kotz, and N. Balakrishnan. 1994. Continuous univariate distributions, Vol. 1, 2nd ed., Wiley, New York.
Jones, E. P. 1984. “Option arbitrage and strategy with large price changes.” Journal of Financial Economics 13, 91–113.
Jones, C. S. 2003. “Nonlinear mean reversion in the short-term interest rate.” Review of Financial Studies 16, 793–843.
Jones, C. and O. Lamont. 2002. “Short sale constraints and stock returns.” Journal of Financial Economics 66(2), 207–239.
Jones, E. P., S. P. Mason, and E. Rosenfeld. 1984. “Contingent claims analysis of corporate capital structures: an empirical investigation.” Journal of Finance 39, 611–625.
Jones, C., G. Kaul, and M. Lipson. 1994. “Information, trading and volatility.” Journal of Financial Economics 36, 127–154.
Jones, C. M., G. Kaul, and M. L. Lipson. 1994. “Transactions, volume, and volatility.” Review of Financial Studies 7(4), 631–651.
Jones, C. M., O. Lamont, and R. L. Lumsdaine. 1998. “Macroeconomic news and bond market volatility.” Journal of Financial Economics 47(3), 315–337.
Jorge, C. H. and I. Iryna. 2002. Asian flu or Wall Street virus? Price and volatility spillovers of the tech and non-tech sectors in the United States and Asia, IMF Working Paper WP/02/154.
Jorion, P. 1985. “International portfolio diversification with estimation risk.” Journal of Business 58, 259–278.
Jorion, P. 1991. “Bayesian and CAPM estimators of the means: implications for portfolio selection.” Journal of Banking and Finance 15, 717–727.
Jorion, P. 1995. “Predicting volatility in the foreign exchange market.” Journal of Finance 50(2), 507–528.
Jorion, P. 2000. Value at risk, McGraw-Hill, New York.
Jorion, P. and E. Schwartz. 1986. “Integration vs. segmentation in the Canadian stock market.” Journal of Finance 41, 603–613.
Jorion, P. and G. Zhang. 2007. “Good and bad credit contagion: evidence from credit default swaps.” Journal of Financial Economics 84, 860–883.
Jouini, E. 2000. “Price functionals with bid-ask spreads: an axiomatic approach.” Journal of Mathematical Economics 34(4), 547–558.
Jouini, E. and H. Kallal. 1995. “Martingales and arbitrage in securities markets with transaction costs.” Journal of Economic Theory 66(1), 178–197.
Jouini, E., H. Kallal, and C. Napp. 2001. “Arbitrage in financial markets with fixed costs.” Journal of Mathematical Economics 35(2), 197–221.
Joy, M. O. and J. O. Tollefson. 1975. “On the financial applications of discriminant analysis.” Journal of Financial and Quantitative Analysis 10(5), 723–739.
Ju, N. 2002. “Pricing Asian and basket options via Taylor expansion.” Journal of Computational Finance 5(3), 1–33.
Ju, N. and H. Ou-Yang. 2006. “Capital structure, debt maturity, and stochastic interest rates.” Journal of Business 79, 2469–2502.
Ju, N., R. Parrino, A. Poteshman, and M. Weisbach. 2004. “Horses and rabbits? Trade-off theory and optimal capital structure.” Journal of Financial and Quantitative Analysis 40, 259–281.
Judge, G. G., W. E. Griffiths, R. C. Hill, and T.-C. Lee. 1980. The theory and practice of econometrics, 1st Edition, John Wiley & Sons, New York.
Judge, G. G., W. E. Griffiths, R. C. Hill, H. Lütkepohl, and T. Lee. 1985. The theory and practice of econometrics, Wiley, New York.
Jung, K., Y. C. Kim, and R. M. Stulz. 1996. “Timing, investment opportunities, managerial discretion, and the security issue decision.” Journal of Financial Economics 42, 159–185.
Kaas, R., J. Dhaene, and M. J. Goovaerts. 2000. “Upper and lower bounds for sums of random variables.” Insurance: Mathematics and Economics 27, 151–168.
Kahl, C. and P. Jäckel. 2005. “Not-so-complex logarithms in the Heston model.” Wilmott Magazine 19, 94–103.
Kahle, K. M. 2000. “Insider trading and the long-run performance of new security issues.” Journal of Corporate Finance: Contracting, Governance and Organization 6, 25–53.
Kahn, R. N. and A. Rudd. 1995. “Does historical performance predict future performance?” Financial Analysts Journal 51, 43–52.
Kahn, J., S. Landsburg, and A. Stockman. 1996. “The positive economics of methodology.” Journal of Economic Theory 68(1), 64–76.
Kakade, S. and D. Foster. 2004. “Deterministic calibration and Nash equilibrium,” in Proceedings of the seventeenth annual conference on learning theory, Lecture Notes in Computer Science, Vol. 3120, J. Shawe-Taylor and Y. Singer (Eds.), Springer, Heidelberg, pp. 33–48.
Kaldor, N. 1939. “Speculation and economic stability.” Review of Economic Studies 7, 1–27.
Kalev, P. S. and H. N. Duong. 2008. “A test of the Samuelson hypothesis using realized range.” Journal of Futures Markets 28, 680–696.
Kalman, R. E. 1960. “A new approach to linear filtering and prediction problems.” Transactions of the ASME – Journal of Basic Engineering 82(Series D), 35–45.
Kamien, M. I. and N. L. Schwartz. 1982. Market structure and innovation, Cambridge University Press, New York.
Kaminsky, G. and S. L. Schmukler. 2002. “Emerging market instability: do sovereign ratings affect country risk and stock returns?” World Bank Economic Review 16, 171–195.
Kan, R. and G. Zhou. 2008. “Tests of mean-variance spanning,” Working paper, University of Toronto and Washington University.
Kandel, E. and M. Pearson. 1995. “Differential interpretation of public signals and trade in speculative markets.” Journal of Political Economy 103(4), 831–872.
Kandel, S. and R. F. Stambaugh. 1996. “On the predictability of stock returns: an asset-allocation perspective.” The Journal of Finance 51, 385–424.
Kander, Z. and S. Zacks. 1966. “Test procedures for possible changes in parameters of statistical distributions occurring at unknown time points.” Annals of Mathematical Statistics 37, 1196–1210.
Kane, A., A. Marcus, and R. MacDonald. 1984. “How big is the tax advantage to debt?” Journal of Finance 39, 841–852.
Kane, A., A. Marcus, and R. MacDonald. 1985. “Debt policy and the rate of return premium to leverage.” Journal of Financial and Quantitative Analysis 20, 479–499.
Kaplan, S. and R. Ruback. 1995. “The valuation of cash flow forecasts: an empirical analysis.” Journal of Finance 50(4), 1059–1093.
Karatzas, I. and S. Shreve. 1991. Brownian motion and stochastic calculus, 2nd ed., Springer, New York.
Karatzas, I. and S. E. Shreve. 2000. Brownian motion and stochastic calculus, Springer, New York.
Karatzas, I. and S. Shreve. 2005. Brownian motion and stochastic calculus, corrected eighth printing, Springer, New York.
Karels, G. V. and A. J. Prakash. 1987. “Multivariate normality and forecasting of business bankruptcy.” Journal of Business Finance & Accounting 14(4), 573–595.
Karlin, S. and H. Taylor. 1981. A second course in stochastic processes, Academic Press, San Diego, CA.
Karolyi, A. G. 1995. “A multivariate GARCH model of international transmissions of stock returns and volatility: the case of the United States and Canada.” Journal of Business and Economic Statistics 13, 11–25.
Karpoff, J. M. and D. Lee. 1991. “Insider trading before new issue announcements.” Financial Management 20, 18–26.
Kashyap, A. and J. C. Stein. 1995. “The impact of monetary policy on bank balance sheets.” Carnegie-Rochester Conference Series on Public Policy 42, 151–195.
Kaufmann, D., A. Kraay, and P. Zoido-Lobaton. 1999a. Aggregating governance indicators, World Bank Policy Research Department Working Paper 2195.
Kaufmann, D., A. Kraay, and P. Zoido-Lobaton. 1999b. Governance matters, World Bank Policy Research Department Working Paper 2196.
Kavajecz, K. A. and E. R. Odders-White. 2001. “An examination of changes in specialists’ posted price schedules.” Review of Financial Studies 14, 681.
Kazamaki, N. 1994. Continuous exponential martingales and BMO, Lecture Notes in Mathematics, Vol. 1579, Springer, Berlin.
Kealhofer, S. and M. Kurbat. 2001. The default prediction power of the Merton approach, relative to debt ratings and accounting variables, Working paper, Moody’s KMV.
Keane, M. P. and D. E. Runkle. 1998. “Are financial analysts’ forecasts of corporate profits rational?” Journal of Political Economy 106, 768–805.
Keasey, K. and R. Watson. 1987. “Non-financial symptoms and the prediction of small company failure: a test of Argenti’s hypothesis.” Journal of Business Finance & Accounting 14(3), 335–354.
Keasey, K. and R. Watson. 1991. “Financial distress prediction models: a review of their usefulness.” British Journal of Management 2(2), 89–102.
Kedia, S. 2006. “Estimating product market competition: methodology and applications.” Journal of Banking and Finance 30(3), 875–894.
Keeley, M. C. 1990. “Deposit insurance, risk, and market power in banking.” American Economic Review 80, 1183–1200.
Keeton, W. R. and C. S. Morris. 1987. “Why do banks’ loan losses differ?” Federal Reserve Bank of Kansas City Economic Review, May, 3–21.
Keim, D. B. 1983. “Size-related anomalies and stock return seasonality.” Journal of Financial Economics 12, 13–32.
Keim, D. and A. Madhavan. 1996. “The upstairs market for large-block transactions: analysis and measurement of price effects.” Review of Financial Studies 9, 1–36.
Keim, D. B. and R. F. Stambaugh. 1986. “Predicting returns in the stock and bond markets.” Journal of Financial Economics 17, 357–390.
Kellison, S. G. 1991. The theory of interest, 2nd Edition, Irwin, Homewood, IL.
Kelly, F. P. 1979. Reversibility and stochastic networks, Wiley, New York.
Kemna, A. G. Z. and A. C. F. Vorst. 1990. “A pricing method for options based on average asset values.” Journal of Banking and Finance 14, 113–129.
Kendall, M. G. 1954. “A note on the bias in the estimation of autocorrelation.” Biometrika 41, 403–404.
Kennedy, P. 1991. “Comparing classification techniques.” International Journal of Forecasting 7(3), 403–406.
Kennedy, P. 1992. A guide to econometrics, 3rd Edition, Blackwell, Oxford.
Kennedy, D. P. 1994. “The term structure of interest rates as a Gaussian random field.” Mathematical Finance 4, 247–258.
Keown, A. J. and J. M. Pinkerton. 1981. “Merger announcements and insider trading activity: an empirical investigation.” Journal of Finance 36, 855–869. Kerfriden, C. and J. C. Rochet. 1993. “Actuarial pricing of deposit insurance.” Geneva Papers on Risk and Insurance Theory 18, 111–130. Keynes, J. M. 1930. A treatise on money, Vol. 2, Macmillan & Co., London. Khindanova, I., S. T. Rachev, and E. Schwartz. 2001. “Stable modeling of value at risk.” Mathematical and Computer Modelling 34, 1223–1259. Khorana, A. and H. Servaes. 1999. “The determinants of mutual fund starts.” Review of Financial Studies 12(5), 1043–1074. Kijima, M. and Y. Muromachi. 2000. “Credit events and the valuation of credit derivatives of basket type.” Review of Derivatives Research 4, 55–79. Killough, L. N. and T. L. Souders. 1973. “A goal programming model for public accounting firms.” The Accounting Review 48(2), 268–279. Kijima, M. 2000. “Valuation of a credit swap of the basket type.” Review of Derivatives Research 4, 81–97. Kim, I.-J. 1990. “The analytical valuation of American options.” Review of Financial Studies 3(4), 547–572. Kim, D. 1995. “The errors in the variables problem in the cross-section of expected stock returns.” Journal of Finance 50, 1605–1634. Kim, D. 1997. “A reexamination of firm size, book-to-market, and earnings-price in the cross-section of expected stock returns.” Journal of Financial and Quantitative Analysis 32, 463–489. Kim, S. H. and E. J. Farragher. 1981. “Current capital budgeting practices.” Management Accounting 6, 26–30. Kim, Y. S. and J. H. Lee. 2007. “The relative entropy in CGMY processes and its applications to finance.” Mathematical Methods of Operations Research 66, 327–338. Kim, C. J. and C. Nelson. 1999. “A Bayesian approach to testing for Markov switching in univariate and dynamic factor models,” Discussion Papers in Economics 0035, Department of Economics, University of Washington. Kim, T. S. and E. Omberg. 1996. “Dynamic nonmyopic portfolio behavior.” Review of Financial Studies 9, 141–161. Kim, T. H. and H. White. 2003. On more robust estimation of skewness and kurtosis: simulation and application to the S&P 500 Index, Working paper, Department of Economics, UC San Diego. Kim, I. J., K. Ramaswamy, and S. Sundaresan. 1993. “Does default risk in coupons affect the valuation of corporate bonds?: a contingent claims model.” Financial Management 22, 117–131. Kim, S., N. Shephard, and S. Chib. 1998. “Stochastic volatility: likelihood inference and comparison with ARCH models.” The Review of Economic Studies 65, 361–393. King, B. 1966. “Market and industry factors in stock price behavior.” Journal of Business 39, 139–190. King, D. and D. C. Mauer. 2000. “Corporate call policy for nonconvertible bonds.” Journal of Business 73, 403–444. King, R. R., V. L. Smith, A. Williams, and M. Van Boening. 1990. “The robustness of bubbles and crashes,” Working paper, University of Washington, Seattle, WA. Kish, R. J. and M. Livingston. 1992. “Determinants of the call option on corporate bonds.” Journal of Banking and Finance 16(4), 687–704. Klaoudatos, G., M. Devetsikiotis, and I. Lambadaris. 1999. “Automated modeling of broadband network data using the QTES methodology,” Proceedings IEEE ICC ’99, Vol. 1, Vancouver, Canada, pp. 397–403. Klarman, S. 1991. Margin of safety: risk-averse value investing strategies for the thoughtful investor, HarperBusiness, New York, NY. Klassen, T. R. 2001.
“Simple, fast, and flexible pricing of Asian options.” Journal of Computational Finance 4, 89–124. Klebaner, F. C. 2005. Introduction to stochastic calculus with applications, Imperial College Press, London.
Kleibergen, F. 2002. “Pivotal statistics for testing structural parameters in instrumental variables regression.” Econometrica 70, 1781. Kleidon, A. W. 1986. “Variance bounds tests and stock price valuation models.” Journal of Political Economy 94, 953–1001. Klein, R. and V. Bawa. 1976. “The effect of estimation risk on optimal portfolio choice.” Journal of Financial Economics 3, 215–231. Knez, P. J., R. Litterman, and J. Scheinkman. 1994. “Explorations into factors explaining money market returns.” Journal of Finance 49(5), 1861–1882. Knight, J. and S. Satchell (Eds.). 1998. Forecasting volatility in the financial markets, Butterworth Heinemann, London. Koenker, R. 2005. Quantile regression, Cambridge University Press, New York. Koenker, R. and G. Bassett. 1978. “Regression quantiles.” Econometrica 46, 33–50. Koenker, R. and K. Hallock. 2001. “Quantile regression.” Journal of Economic Perspectives 15, 143–156. Kolbe, A. L., J. A. Read, and G. R. Hall. 1984. The cost of capital: estimating the rate of return for public utilities, MIT Press, Cambridge, MA. Kolmogorov, A. N. 1929. “Sur la loi des grands nombres.” Atti della Reale Accademia Nazionale dei Lincei, Serie VI, Rendiconti 9, 470–474. Kolmogorov, A. N. 1930. “Sur la loi forte des grands nombres.” Comptes rendus hebdomadaires des séances de l’Académie des Sciences 191, 910–912. Konno, H. and H. Yamazaki. 1991. “Mean-absolute deviation portfolio optimization model and its application to Tokyo stock market.” Management Science 37, 519–531. Korn, G. A. and T. M. Korn. 2000. Mathematical handbook for scientists and engineers: definitions, theorems, and formulas for reference and review, Courier Dover, New York. Kothari, S. P. and J. Shanken. 1997. “Book-to-market time series analysis.” Journal of Financial Economics 44, 169–203. Kothari, S. P., J. Shanken, and R. G. Sloan. 1992. Another look at the cross-section of expected stock returns, Working Paper, Bradley Policy Research Center, University of Rochester, New York. Kothari, S. P., J. Shanken, and R. G. Sloan. 1995. “Another look at the cross-section of expected stock returns.” Journal of Finance 50, 185–224. Kou, G. and Y. Shi. 2002. Linux based multiple linear programming classification program: version 1.0, College of Information Science and Technology, University of Nebraska-Omaha, Omaha, NE. Kou, S. G. and H. Wang. 2003. “First passage times of a jump diffusion process.” Advances in Applied Probability 35, 504–531. Kou, S. G. and H. Wang. 2004. “Option pricing under a double exponential jump diffusion model.” Management Science 50, 1178–1192. Kou, G., X. Liu, Y. Peng, Y. Shi, M. Wise, and W. Xu. 2003. “Multiple criteria linear programming approach to data mining: models, algorithm designs and software development.” Optimization Methods and Software 18, 453–473. Koza, J. R. 1992. Genetic programming, MIT Press, Cambridge, MA. Kozicki, S. and P. A. Tinsley. 2001. “Shifting endpoints in the term structure of interest rates.” Journal of Monetary Economics 47, 613–652. Kozicki, S. and P. A. Tinsley. 2002. “Dynamic specifications in optimizing trend-deviation macro models.” Journal of Economic Dynamics and Control 26, 1585–1611. Kraus, A. and R. Litzenberger. 1976. “Market equilibrium in a multiperiod state preference model with logarithmic utility.” Journal of Finance (forthcoming). Kraus, A. and S. A. Ross. 1982. “The determination of fair profits for the property-liability insurance firm.” Journal of Finance 37(4), 1015–1026.
Kreps, D. M. 1981. “Arbitrage and equilibrium in economies with infinitely many commodities.” Journal of Mathematical Economics 8, 15–35. Kreps, D. M. 1982. “Multiperiod securities and the efficient allocation of risk: a comment on the Black–Scholes option pricing model,” in The economics of information and uncertainty, J. J. McCall (Ed.). University of Chicago Press, Chicago. Krishnamurthy, S., P. Spindt, V. Subramaniam, and T. Woidtke. 2005. “Does equity placement type matter? Evidence from long-term stock performance following private placements.” Journal of Financial Intermediation 14, 210–238. Kruse, S. and U. Nögel. 2005. “On the pricing of forward starting options in Heston’s model on stochastic volatility.” Finance and Stochastics 9, 223–250. Kshirsagar, A. M. 1971. Advanced theory of multivariate analysis, Dekker, New York, NY. Küchler, U., K. Neumann, M. Sørensen, and A. Streller. 1999. “Stock returns and hyperbolic distributions.” Mathematical and Computer Modelling 29, 1–15. Kumon, M. and A. Takemura. 2008. “On a simple strategy weakly forcing the strong law of large numbers in the bounded forecasting game.” Annals of the Institute of Statistical Mathematics 60, 801–812. Kunczik, M. 2000. “Globalization: news media, images of nations and the flow of international capital with special reference to the role of rating agencies.” Paper presented at the IAMCR Conference, Singapore, pp. 1–49. Kunitomo, N. 1992. “Improving the Parkinson method of estimating security price volatilities.” Journal of Business 65, 295–302. Kuo, H. C., S. Wu, L. Wang, and M. Chang. 2002. “Contingent fuzzy approach for the development of banks’ credit-granting evaluation model.” International Journal of Business 7(2), 53–65. Kwak, W. 2006. “An accounting information system selection using a fuzzy set approach.” The Academy of Information and Management Sciences Journal 9(2), 79–91. Kwak, W., Y. Shi, H. Lee, and C. F. Lee. 1996. “Capital budgeting with multiple criteria and multiple decision makers.” Review of Quantitative Finance and Accounting 7, 97–112. Kwak, W., Y. Shi, H. Lee, and C. F. Lee. 2002. “A fuzzy set approach in international transfer pricing problems.” Advances in Quantitative Analysis of Finance and Accounting Journal 10, 57–74. Kwak, W., Y. Shi, and K. Jung. 2003. “Human resource allocation for a CPA firm: a fuzzy set approach.” Review of Quantitative Finance and Accounting 20(3), 277–290. Kwak, W., Y. Shi, and J. Cheh. 2006a. “Firm bankruptcy prediction using multiple criteria linear programming data mining approach.” Advances in Investment Analysis and Portfolio Management 2, 27–49. Kwak, W., Y. Shi, S. W. Eldridge, and G. Kou. 2006b. “Bankruptcy prediction for Japanese firms: using multiple criteria linear programming data mining approach.” The International Journal of Business Intelligence and Data Mining 1(4), 401–416. Kwakernaak, H. 1978. “Fuzzy random variables I: definitions and theorems.” Information Sciences 15, 1–29. Kwan, C. C. Y. 1984. “Portfolio analysis using single index, multi-index, and constant correlation models: a unified treatment.” Journal of Finance 39, 1469–1483. Kwan, S. H. 1996. “Firm-specific information and the correlation between individual stocks and bonds.” Journal of Financial Economics 40(1), 63–80. Kyle, A. 1985. “Continuous auctions and insider trading.” Econometrica 53, 1315–1335. Labys, W. C. and C.-W. Yang. 1996. Le Chatelier principle and the flow sensitivity of spatial commodity models,
in Recent advances in spatial equilibrium modeling: methodology and applications, J. C. J. M. van den Bergh, P. Nijkamp, and P. Rietveld (Eds.). Springer, Berlin, pp. 96–110.
Lai, T. L. and H. Xing. 2006. “Structural change as an alternative to long memory in financial time series,” in Advances in econometrics, H. Carter and T. Fomby (Eds.), Vol. 20, Elsevier, Amsterdam, pp. 209–228. Lai, T. L. and H. Xing. 2008a. Statistical models and methods for financial markets, Springer, New York. Lai, T. L. and H. Xing. 2008b. Hidden Markov models of structural changes in econometric time series, Technical Report, Department of Statistics, Stanford University. Lakonishok, J. and I. Lee. 2001. “Are insider trades informative?” Review of Financial Studies 14, 79–111. Lakonishok, J. and S. Smidt. 1986. “Volume for winners and losers: taxation and other motives for stock trading.” Journal of Finance 41, 951–974. Lamberton, D. and B. Lapeyre. 2007. Introduction to stochastic calculus applied to finance, Chapman & Hall, New York. Lamoureux, C. G. and W. D. Lastrapes. 1990. “Heteroskedasticity in stock returns data: volume versus GARCH effects.” Journal of Finance 45(1), 221–229. Lamoureux, C. G. and W. D. Lastrapes. 1990. “Persistence in variance, structural change and the GARCH model.” Journal of Business and Economic Statistics 8(2), 225–234. Lamoureux, C. and W. Lastrapes. 1994. “Endogenous trading volume and momentum in stock-return volatility.” Journal of Business and Economic Statistics 12, 253–260. La Porta, R., F. Lopez-de-Silanes, A. Shleifer, and R. Vishny. 1998. “Law and finance.” Journal of Political Economy 106, 1113–1155. La Porta, R., F. Lopez-de-Silanes, A. Shleifer, and R. Vishny. 2000. “Investor protection and corporate governance.” Journal of Financial Economics 58, 3–27. Lackner, P. 1993. “Martingale measure for a class of right continuous processes.” Mathematical Finance 3, 43–53. Laderman, E. 2001. Subprime mortgage lending and the capital markets, Federal Reserve Bank of San Francisco, FRBSF Economic Letter (Number 2001–38), December 28. Lai, T. L., H. Liu, and H. Xing. 2005. “Autoregressive models with piecewise constant volatility and regression parameters.” Statistica Sinica 15, 279–301. Lai, T. L., H. Xing, and Z. Chen. 2008. Mean–variance portfolio optimization when means and covariances are unknown, Technical Report, Department of Statistics, Stanford University. Lam, K. F. and J. W. Moy. 2003. “A simple weighting scheme for classification in two-group discriminant problems.” Computers and Operations Research 30, 155–164. Lam, K., E. Choo, and J. Moy. 1996. “Improved linear programming formulations for the multi-group discriminant problem.” Journal of the Operational Research Society 47(12), 1526–1529. Lancaster, K. J. 1966. “A new approach to consumer theory.” Journal of Political Economy 74, 132–157. Landen, C. 2000. “Bond pricing in a hidden Markov model of the short rate.” Finance and Stochastics 4, 371–389. Lando, D. 1998. “On Cox processes and credit risky securities.” Review of Derivatives Research 2, 99–120. Lando, D. 2004. Credit risk modeling: theory and applications, Princeton University Press, Princeton, NJ. Lando, D. and A. Mortensen. 2005. “Revisiting the slope of the credit curve.” Journal of Investment Management 3, 6–32. Lang, L. and R. Litzenberger. 1989. “Dividend announcements: cash flow signaling versus free cash flow hypothesis?” Journal of Financial Economics 24, 181–192. Lang, L., R. Stulz, and R. Walkling. 1989.
“Managerial performance, Tobin’s Q, and the gains from successful tender offers.” Journal of Financial Economics 24, 137–154.
Lang, L., R. Stulz, and R. Walkling. 1991. “A test of the free cash flow hypothesis: the case of bidder returns.” Journal of Financial Economics 29, 315–335. Lang, L., E. Ofek, and R. M. Stulz. 1996. “Leverage, investment, and firm growth.” Journal of Financial Economics 40(1), 3–29. Lang, L. H. P., R. H. Litzenberger, and A. L. Liu. 1998. “Determinants of interest rate swap spreads.” Journal of Banking and Finance 22, 1507–1532. Larcker, D. F. and T. O. Rusticus. 2007. On the use of instrumental variables in accounting research, Working Paper. Larrain, G., H. Reisen, and J. von Maltzan. 1997. “Emerging market risk and sovereign credit ratings.” OECD Development Center, 1–30. Last, G. and A. Brandt. 1995. Marked point processes on the real line, Springer, New York. Latané, H. A. and R. J. Rendleman, Jr. 1976. “Standard deviations of stock price ratios implied in option prices.” Journal of Finance 31, 369–381. Latané, H., D. Tuttle, and A. Young. 1971. “How to choose a market index.” Financial Analysts Journal 27, 75–85. Launie, J. J. 1971. “The cost of capital of insurance companies.” Journal of Risk and Insurance 38(2), 263–268. Launie, J. J. 1972. “The cost of capital of insurance companies: reply.” Journal of Risk and Insurance 39(3), 292–495. Laurent, J. P. and H. Pham. 1999. “Dynamic programming and mean–variance hedging.” Finance and Stochastics 3, 83–110. Lauterbach, B. and P. Schultz. 1990. “Pricing warrants: an empirical study of the Black-Scholes model and its alternatives.” Journal of Finance 45, 1081–1120. Law, A. and D. W. Kelton. 1991. Simulation modeling and analysis, McGraw-Hill, New York. Leamer, E. E. 1978. Specification searches: ad hoc inference with nonexperimental data, Wiley, New York. Lease, R. C., R. W. Masulis, and J. R. Page. 1991. “An investigation of market microstructure impacts on event study returns.” Journal of Finance 46, 1523–1536. LeBaron, B. 2001. “Stochastic volatility as a simple generator of apparent financial power laws and long memory.” Quantitative Finance 1(6), 621–631. Lecraw, D. J. 1985. “Some evidence on transfer pricing by multinational corporations,” in Multinationals and transfer pricing, A. M. Rugman and L. Eden (Eds.). St. Martin’s Press, New York. Lee, C. F. 1976. “Investment horizon and the functional form of the capital asset pricing model.” Review of Economics and Statistics 58, 356–363. Lee, C. F. 1976. “Functional form and the dividend effect of the electric utility industry.” Journal of Finance 31(5), 1481–1486. Lee, C. F. 1983. Financial analysis and planning: theory and application. A book of readings, Addison-Wesley, Reading, MA. Lee, C. F. 1985. Financial analysis and planning: theory and application, Addison-Wesley, Reading, MA. Lee, C. F. 1991. “Markowitz, Miller, and Sharpe: the first Nobel laureates in finance.” Review of Quantitative Finance and Accounting 1, 209–228. Lee, C. F. 1993. Statistics for business and financial economics, Heath, Lexington, MA. Lee, B.-S. 1996. “Time-series implications of aggregate dividend behavior.” Review of Financial Studies 9, 589–618. Lee, I. 1997. “Do firms knowingly sell overvalued equity?” Journal of Finance 52, 1439–1466. Lee, B.-S. 1998. “Permanent, temporary and nonfundamental components of stock prices.” Journal of Financial and Quantitative Analysis 33, 1–32. Lee, J. C. 2001. “Using Microsoft Excel and decision trees to demonstrate the binomial option pricing model.” Advances in Investment Analysis and Portfolio Management 8, 303–329.
Lee, R. 2005. “Option pricing by transform methods: extensions, unification, and error control.” Journal of Computational Finance 7(3), 51–86. Lee, C. F. 2009. Handbook of quantitative finance and risk management, Springer, New York. Lee, C. F. and J. B. Kau. 1976. “The functional form in estimating the density gradient: an empirical investigation.” Journal of the American Statistical Association 71, 326–327. Lee, C. F. and S. N. Chen. 1980. “A random coefficient model for reexamining risk decomposition method and risk-return relationship test.” Quarterly Review of Economics and Business 20, 58–69. Lee, C. F. and S. N. Chen. 1981. “The sampling relationship between Sharpe’s performance measure and its risk proxy: sample size, investment horizon and market conditions.” Management Science 27(6), 607–618. Lee, C. F. and S. N. Chen. 1984. “On the measurement errors and ranking of composite performance measures.” Quarterly Review of Economics and Business 24, 6–17. Lee, C. F. and S. N. Chen. 1986. “The effects of the sample size, the investment horizon and market conditions on the validity of composite performance measures: a generalization.” Management Science 32(11), 1410–1421. Lee, C. F. and J. E. Finnerty. 1990. Corporate finance: theory, method, and applications, Harcourt Brace Jovanovich, Orlando, FL. Lee, C. F. and S. W. Forbes. 1980. “Dividend policy, equity value, and cost of capital estimates for the property and liability insurance industry.” Journal of Risk and Insurance 47, 205–222. Lee, C. F. and J. C. Junkus. 1983. “Financial analysis and planning: an overview.” Journal of Economics and Business 34, 257–283. Lee, C. F. and A. C. Lee. 2006. Encyclopedia of finance, Springer, New York. Lee, P. M. and S. Wahal. 2004. “Grandstanding, certification and the underpricing of venture capital backed IPOs.” Journal of Financial Economics 73, 375. Lee, C. F. and K. C. J. Wei. 1988. “Impacts of rates of return distributions on the functional form of CAPM.” Working paper, Rutgers University. Lee, C. F. and C. Wu. 1988. “Expectation formation and financial ratio adjustment processes.” The Accounting Review 63, 292–306. Lee, C. F. and Y. L. Wu. In press. “An econometric investigation of the information content of equity-selling mechanisms.” Review of Quantitative Finance and Accounting. Lee, J.-P. and M.-T. Yu. 2002. “Pricing default-risky CAT bonds with moral hazard and basis risk.” Journal of Risk and Insurance 69(1), 25–44. Lee, J.-P. and M.-T. Yu. 2007. “Valuation of catastrophe reinsurance with CAT bonds.” Insurance: Mathematics and Economics 41, 264–278. Lee, C. F., F. J. Fabozzi, and J. C. Francis. 1980. “Generalized functional form for mutual fund returns.” Journal of Financial and Quantitative Analysis 15(5), 1107–1120. Lee, C. F., P. Newbold, J. E. Finnerty, and C.-C. Chu. 1986. “On accounting-based, market-based and composite-based beta predictions: method and implications.” Financial Review 21(1), 51–68. Lee, C. F., J. E. Finnerty, and D. H. Wort. 1990. Security analysis and portfolio management, Scott, Foresman/Little Brown, Glenview, IL. Lee, C. F., C. Wu, and K. C. J. Wei. 1990. “The heterogeneous investment horizon and the capital asset pricing model: theory and implications.” Journal of Financial and Quantitative Analysis 25, 361–376. Lee, Y. R., Y. Shi, and P. L. Yu. 1990. “Linear optimal designs and optimal contingency plans.” Management Science 36, 1106–1119.
Lee, J. C., C. F. Lee, and K. C. J. Wei. 1991. “Binomial option pricing with stochastic parameters: a beta distribution approach.” Review of Quantitative Finance and Accounting 1(3), 435–448. Lee, C., J. Myers, and B. Swaminathan. 1999. “What is the intrinsic value of the Dow?” Journal of Finance 54, 1693–1742. Lee, C. F., J. C. Lee, and A. C. Lee. 2000. Statistics for business and financial economics, World Scientific Publishing Co., Singapore. Lee, J. C., C. F. Lee, R. S. Wang, and T. I. Lin. 2004. “On the limit properties of binomial and multinomial option pricing models: review and integration.” Advances in Quantitative Analysis of Finance and Accounting, New Series, Vol. 1, World Scientific, Singapore. Lee, S. C., J. P. Lee, and M. T. Yu. 2005. “Bank capital forbearance and valuation of deposit insurance.” Canadian Journal of Administrative Sciences 22, 220–229. Lee, C. F., G. H. Tzeng, and S. Y. Wang. 2005a. “A new application of fuzzy set theory to the Black-Scholes option pricing model.” Expert Systems with Applications 29(2), 330–342. Lee, C. F., G. H. Tzeng, and S. Y. Wang. 2005b. “A fuzzy set approach to generalize CRR model: an empirical analysis of S&P 500 index option.” Review of Quantitative Finance and Accounting 25(3), 255–275. Lee, C. F., J. C. Lee, and A. C. Lee. 2009. Financial analysis, planning and forecasting, 2nd Edition, World Scientific, Hackensack, NJ. Lehmann, E. 1955. “Ordered families of distributions.” Annals of Mathematical Statistics 26, 399–419. Lei, V., C. Noussair, and C. Plott. 2001. “Nonspeculative bubbles in experimental asset markets: lack of common knowledge of rationality vs. actual irrationality.” Econometrica 69(4), 831–859. Leippold, M. and L. Wu. 2002. “Asset pricing under the quadratic class.” Journal of Financial and Quantitative Analysis 37(2), 271–295. Leippold, M. and L. Wu. 2003. “Design and estimation of quadratic term structure models.” European Finance Review 7(1), 47–73. Leisen, D. P. J. and M. Reimer. 1996. “Binomial models for option valuation – examining and improving convergence.” Applied Mathematical Finance 3, 319–346. Leland, H. 1972. “On turnpike portfolios,” in Mathematical methods in investment and finance, K. Shell and G. P. Szego (Eds.). North-Holland, Amsterdam. Leland, H. E. 1985. “Option pricing and replication with transactions costs.” Journal of Finance 40(5), 1283–1301. Leland, H. E. 1994a. “Corporate debt value, bond covenants, and optimal capital structure.” Journal of Finance 49, 1213–1252. Leland, H. E. 1994b. Bond prices, yield spreads, and optimal capital structure with default risk, Working paper No. 240, IBER, University of California, Berkeley, CA. Leland, H. E. 1998. “Agency cost, risk management, and capital structure.” Journal of Finance 53, 1213–1243. Leland, H. E. 2004. “Prediction of default probabilities in structural models of debt.” Journal of Investment Management 2(2), 5–20. Leland, H. E. 2006. Structural models in corporate finance, Princeton University Bendheim Lecture Series in Finance. Leland, H. E. and D. H. Pyle. 1977. “Informational asymmetries, financial structure, and financial intermediation.” Journal of Finance 32, 371–387. Leland, H. E. and K. B. Toft. 1996. “Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads.” Journal of Finance 51, 987–1019. Lennox, C. 1999. “Identifying failing companies: a reevaluation of the logit, probit and DA approaches.” Journal of Economics and Business 51(4), 347–364. LeRoy, S. and R. Porter. 1981.
“The present-value relation: tests based on implied variance bounds.” Econometrica 49, 555–574. Lesmond, D. A., J. P. Ogden, and C. A. Trzcinka. 1999. “A new estimate of transaction costs.” Review of Financial Studies 12, 1113–1141.
Lessard, D. 1974. “World, national, and industry factors in equity returns.” The Journal of Finance 29(2), 379–391. Lettau, M. 1997. “Explaining the facts with adaptive agents.” Journal of Economic Dynamics and Control 21, 1117–1147. Lettau, M. and S. Ludvigson. 2001. “Consumption, aggregate wealth and expected stock returns.” Journal of Finance 56, 815–849. Lev, B. 1969. “Industry averages as targets for financial ratios.” Journal of Accounting Research 7, 290–299. Lev, B. and S. Sunder. 1979. “Methodological issues in the use of financial ratios.” Journal of Accounting and Economics 1(3), 187–210. Levhari, D. and H. Levy. 1977. “The capital asset pricing model and the investment horizon.” Review of Economics and Statistics 59, 92–104. Levin, L. A. 1976. “Uniform tests of randomness.” Soviet Mathematics Doklady 17, 337–340 (Russian original in Doklady AN SSSR 227(1), 1976). Levin, L. A. 1984. “Randomness conservation inequalities; information and independence in mathematical theories.” Information and Control 61, 15–37. Levine, R. and S. Zervos. 1998. “Stock markets, banks, and growth.” American Economic Review 88(3), 537–558. Lévy, P. 1925. Calcul des probabilités, Gauthier-Villars, Paris. Levy, H. 1972. “Portfolio performance and the investment horizon.” Management Science 18, 645–653. Levy, R. 1974. “Beta coefficients as predictors of return.” Financial Analysts Journal 30, 61–69. Lévy, E. 1992. “Pricing European average rate currency options.” Journal of International Money and Finance 11, 474–491. Levy, H. 2006. Stochastic dominance: investment decision making under uncertainty, 2nd Edition, Springer, New York. Levy, H. and M. Sarnat. 1970. “International diversification of investment portfolios.” American Economic Review 60, 668–675. Levy, H. and M. Sarnat. 1971. “A note on portfolio selection and investors’ wealth.” Journal of Financial and Quantitative Analysis 6, 639–642. Lévy, E. and S. Turnbull. 1992. “Average intelligence.” RISK 5, 53–58. Lewellen, J. W. 2004. “Predicting returns using financial ratios.” Journal of Financial Economics 74, 209–235. Lewellen, J. and S. Nagel. 2006. “The conditional CAPM does not explain asset-pricing anomalies.” Journal of Financial Economics 82, 289–314. Lewellen, J. and J. Shanken. 2002. “Learning, asset-pricing tests, and market efficiency.” Journal of Finance 57(3), 1113–1145. Lewis, A. L. 1988. “A simple algorithm for the portfolio selection problem.” Journal of Finance 43, 71–82. Lewis, C. M. 1990. “A multiperiod theory of corporate financial policy under taxation.” Journal of Financial and Quantitative Analysis 25(1), 25–43. Lewis, K. 1991. “Was there a ‘Peso problem’ in the U.S. term structure of interest rates: 1979–1982?” International Economic Review 32(1), 159–173. Lewis, A. 2000. Option valuation under stochastic volatility, Finance Press, CA. Li, H. 1998. “Pricing of swaps with default risk.” Review of Derivatives Research 2, 231–250. Li, D. 1998. Constructing a credit curve, Credit Risk: Risk Special Report, November, 40–44. Li, D. X. 2000. “On default correlation: a copula function approach.” Journal of Fixed Income 9(4), 43–54. Li, T. 2000b. A model of pricing defaultable bonds and credit ratings, Working paper, Olin School of Business, Washington University at St. Louis. Li, X. 2006. “An empirical study of the default risk and liquidity components of interest rate swap spreads.” Working paper, York University. Li, L. and R. F. Engle. 1998.
“Macroeconomic announcements and volatility of treasury futures,” EFMA meetings, Helsinki.
Li, D. and S. Li. 1996. “A theory of corporate scope and financial structure.” Journal of Finance 51, 691–709. Li, K. and N. R. Prabhala. 2005. Self-selection models in corporate finance, Working Paper. Li, H. and F. Zhao. 2006. “Unspanned stochastic volatility: evidence from hedging interest rate derivatives.” Journal of Finance 61, 341–378. Li, H. and F. Zhao. 2009. “Nonparametric estimation of state-price densities implicit in interest rate cap prices.” Review of Financial Studies 22(11), 4335–4376. Li, M., N. D. Pearson, and A. M. Poteshman. 2001. Facing up to conditioned diffusions, Working paper, University of Illinois at Urbana-Champaign. Li, K., A. Sarkar, and Z. Wang. 2003. “Diversification benefits of emerging markets subject to portfolio constraints.” Journal of Empirical Finance 10, 57–80. Liao, A. and J. Williams. 2004. “Volatility transmission and changes in stock market interdependence in the European Community.” European Review of Economics and Finance 3, 203–231. Liaw, K. T. and R. L. Moy. 2000. The Irwin guide to stocks, bonds, futures, and options, McGraw-Hill Companies, New York. Liberatore, M. J., T. F. Monahan, and D. E. Stout. 1992. “A framework for integrating capital budgeting analysis with strategy.” The Engineering Economist 38, 31–43. Lie, E. and H. Lie. 2002. “Multiples used to estimate corporate value.” Financial Analysts Journal 58(2), 44–53. Lien, D. and Y. K. Tse. 1998. “Hedging time-varying downside risk.” Journal of Futures Markets 18, 705–722. Lien, D. and Y. K. Tse. 2000. “Hedging downside risk with futures contracts.” Applied Financial Economics 10, 163–170. Lien, D. and Y. K. Tse. 2001. “Hedging downside risk: futures vs. options.” International Review of Economics and Finance 10, 159–169. Lin, T. and C. W. Duan. 2007. “Oil convenience yields estimated under demand/supply shock.” Review of Quantitative Finance and Accounting 28, 203–225. Lin, J. C. and J. S. Howe. 1990. “Insider trading in the OTC market.” Journal of Finance 45, 1273–1284. Lin, L. and J. Piesse. 2004. “The identification of corporate distress in UK industrials: a conditional probability analysis approach.” Applied Financial Economics 14, 73–82. Lin, C. M. and S. D. Smith. 2007. “Hedging, financing, and investment decisions: a simultaneous equations framework.” Financial Review 42, 191–209. Lin, L., C. Lefebvre, and J. Kantor. 1993. “Economic determinants of international transfer pricing and the related accounting issues, with particular reference to Asian Pacific countries.” The International Journal of Accounting 28, 49–70. Lin, H., S. Liu, and C. Wu. 2007. Nondefault components of corporate yield spreads: taxes or liquidity?, Working paper, University of Missouri-Columbia. Linetsky, V. 2002. “Exact pricing of Asian options: an application of spectral theory,” Working Paper, Northwestern University. Lintner, J. 1956. “Distribution of income of corporations among dividends, retained earnings, and taxes.” American Economic Review 46(2), 97–113. Lintner, J. 1965. “The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets.” Review of Economics and Statistics 47, 13–37. Liptser, R. Sh. and A. N. Shiryayev. 1986. Martingale theory, Nauka, Moscow. Litterman, R. and J. Scheinkman. 1991. “Common factors affecting bond returns.” Journal of Fixed Income 1(1), 54–61. Litzenberger, R. and N. Rabinowitz. 1995. “Backwardation in oil futures markets: theory and empirical evidence.” Journal of Finance 50(5), 1517–1545.
Litzenberger, R. H. and K. Ramaswamy. 1979. “The effects of personal taxes and dividends on capital asset prices: theory and empirical evidence.” Journal of Financial Economics 7, 163–196. Litzenberger, R. H., D. R. Beaglehole, and C. E. Reynolds. 1996. “Assessing catastrophe reinsurance-linked securities as a new asset class.” Journal of Portfolio Management Special Issue, 76–86. Liu, J. 2003. Profit taking, Dissertation thesis, Rutgers University. Liu, W. 2006. “A liquidity-augmented capital asset pricing model.” Journal of Financial Economics 82, 631–671. Liu, S. M. and B. Brorsen. 1995. “Maximum likelihood estimation of a GARCH-stable model.” Journal of Applied Econometrics 2, 273–285. Liu, Y. H. and Y. Shi. 1994. “A fuzzy programming approach for solving a multiple criteria and multiple constraint level linear programming problem.” Fuzzy Sets and Systems 65, 117–124. Liu, D., D. Nissim, and J. Thomas. 2001. “Equity valuation using multiples.” Journal of Accounting Research 40, 135–172. Liu, J., F. A. Longstaff, and R. E. Mandell. 2004. “The market price of risk in interest rate swaps: the roles of default and liquidity risks.” Journal of Business 79, 2337–2369. Liu, S., J. Shi, J. Wang, and C. Wu. 2007. “How much of the corporate bond spread is due to personal taxes?” Journal of Financial Economics 85, 599–636. Livny, M., B. Melamed, and A. K. Tsiolis. 1993. “The impact of autocorrelation on queueing systems.” Management Science 39(3), 322–339. Ljungqvist, A., F. Marston, and W. J. Wilhelm Jr. 2006. “Competing for securities underwriting mandates: banking relationships and analyst recommendations.” Journal of Finance 61, 301. Lo, A. W. 1986. “Logit versus discriminant analysis: a specification test and application to corporate bankruptcies.” Journal of Econometrics 31(3), 151–178. Lo, A. 2007. Where do alphas come from? A new measure of the value of active investment management, Working paper, Sloan School of Business. Lo, A. W. and A. C. MacKinlay. 1988. “Stock market prices do not follow random walks: evidence from a simple specification test.” Review of Financial Studies 1(1), 41–66. Lo, A. and A. C. MacKinlay. 1990. “When are contrarian profits due to stock market overreaction?” Review of Financial Studies 3, 175–205. Lo, A. and A. MacKinlay. 1990. “Data snooping biases in tests of financial asset pricing models.” Review of Financial Studies 3, 431–468. Lo, A. W. and J. Wang. 2000. “Trading volume: definitions, data analysis, and implications of portfolio theory.” Review of Financial Studies 13, 257–300. Lo, K., K. Wang, and M. Hsu. 2008. “Pricing European Asian options with skewness and kurtosis in the underlying distribution.” Journal of Futures Markets 28(6), 598–616. Lobato, I. N. and N. E. Savin. 1998. “Real and spurious long-memory properties of stock market data.” Journal of Business & Economic Statistics 16, 261–268. Locke, E. A., K. N. Shaw, and G. P. Latham. 1981. “Goal setting and task performance: 1969–1980.” Psychological Bulletin 89, 125–152. Long, J. 1974. “Stock prices, inflation and the term structure of interest rates.” Journal of Financial Economics 1, 131–170. Long, M. and L. Malitz. 1985. “The investment-financing nexus: some empirical evidence from UK.” Midland Corporate Finance Journal, 234–257. Longin, F. and B. Solnik. 1995. “Is the correlation in international equity returns constant: 1960–1990?” Journal of International Money and Finance 14, 3–23. Longstaff, F. A. 1989.
“A nonlinear general equilibrium model of the term structure of interest rates.” Journal of Financial Economics 23, 195–224.
Longstaff, F. A. 1992. “Multiple equilibria and term structure models.” Journal of Financial Economics 32, 333–344. Longstaff, F. 1995. “Option pricing and the Martingale restriction.” Review of Financial Studies 8(4), 1091–1124. Longstaff, F. A. 2000. “The term structure of very short-term rates: new evidence for the expectations hypothesis.” Journal of Financial Economics 58, 397–415. Longstaff, F. A. and E. S. Schwartz. 1992. “Interest rate volatility and the term-structure of interest rates: a two-factor general equilibrium model.” Journal of Finance 47, 1259–1282. Longstaff, F. A. and E. S. Schwartz. 1995. “A simple approach to valuing risky fixed and floating rate debt.” Journal of Finance 50(3), 789–819. Longstaff, F. A. and E. Schwartz. 2001. “Valuing American options by simulation: a simple least-squares approach.” Review of Financial Studies 14(1), 113–147. Longstaff, F. A., P. Santa-Clara, and E. S. Schwartz. 2001. “Throwing away a million dollars: the cost of suboptimal exercise strategies in the swaptions market.” Journal of Financial Economics 62(1), 39–66. Longstaff, F., P. Santa-Clara, and E. Schwartz. 2001. “The relative valuation of caps and swaptions: theory and evidence.” Journal of Finance 56, 2067–2109. Longstaff, F. A., S. Mithal, and E. Neis. 2003. The credit default swap market: is credit protection priced correctly?, Working paper, UCLA. Longstaff, F. A., S. Mithal, and E. Neis. 2005. “Corporate yield spreads: default risk or liquidity? New evidence from the credit default swap market.” Journal of Finance 60, 2213–2253. Lopes, H. 1999. PhD thesis, Institute of Statistics and Decision Sciences. Lopez, J. 2001. “Financial instruments for mitigating credit risk.” FRBSF Economic Letter, November 23, Federal Reserve Bank of San Francisco. Lorenz, M. O. 1905. “Methods of measuring concentration of wealth.” Journal of the American Statistical Association 9, 209–219. Loria, S., T. M. Pham, and A. B. Sim. 1991. “The performance of a stock index futures-based portfolio insurance scheme: Australian evidence.” Review of Futures Markets 10(3), 438–457. Lorie, J. H. and L. J. Savage. 1955. “Three problems in rationing capital.” Journal of Business 28, 229–239. Loubergé, H., E. Kellezi, and M. Gilli. 1999. “Using catastrophe-linked securities to diversify insurance risk: a financial analysis of cat bonds.” Journal of Insurance Issues 22, 125–146. Loughran, T. and J. R. Ritter. 1995. “The new issues puzzle.” Journal of Finance 50, 23–51. Loughran, T. and J. R. Ritter. 1997. “The operating performance of firms conducting seasoned equity offerings.” Journal of Finance 52, 1823–1850. Loughran, T. and J. R. Ritter. 2002. “Why don’t issuers get upset about leaving money on the table in IPOs?” Review of Financial Studies 15, 413–443. Loughran, T. and J. R. Ritter. 2004. “Why has IPO underpricing changed over time?” Financial Management 33 (Autumn), 5–37. Loviscek, A. L. and C.-W. Yang. 1997. “Assessing the impact of the investment company act’s diversification rule: a portfolio approach.” Midwest Review of Finance and Insurance 11(1), 57–66. Lowenstein, R. 2000. When genius failed: the rise and fall of Long-Term Capital Management, Random House Trade Paperbacks, New York. Lowry, M. and S. Shu. 2002. “Litigation risk and IPO underpricing.” Journal of Financial Economics 65, 309. Lucas, R. E. Jr. 1978. “Asset prices in an exchange economy.” Econometrica 46, 1429–1445. Lütkepohl, H. 1991. Introduction to multiple time series analysis, Springer, Berlin.
Lyandres, E. 2006. “Capital structure and interaction among firms in output markets: theory and evidence.” The Journal of Business 79(5), 2381–2422. Lyden, S. and D. Saraniti. 2000. An empirical examination of the classical theory of corporate security valuation, Working paper, Barclays Global Investors. MacBeth, J. D. and L. J. Merville. 1979. “An empirical examination of the Black-Scholes call option pricing model.” Journal of Finance 34, 1173–1186. MacBeth, J. D. and L. J. Merville. 1980. “Tests of the Black-Scholes and Cox call option valuation models.” Journal of Finance 35, 285–301. MacKinlay, A. C. 1995. “Multifactor models do not explain deviations from the CAPM.” Journal of Financial Economics 38, 3–28. MacKinlay, A. C. 1997. “Event studies in economics and finance.” Journal of Economic Literature 35, 13–39. Maclennan, D. 1977. “Some thoughts on the nature and purpose of hedonic price functions.” Urban Studies 14, 59–71. Maclennan, D. 1982. Housing economics, Longman, London. MacMillan, L. W. 1986. “Analytic approximation for the American put options.” Advances in Futures and Options Research 1, 119–139. Madan, D. B. and F. Milne. 1987. The multinomial option pricing model and its limits, Department of Econometrics working papers, University of Sydney, Sydney, NSW 2066. Madan, D. B. and F. Milne. 1988a. Arrow Debreu option pricing for long-tailed return distributions, Department of Econometrics working papers, University of Sydney, Sydney, NSW 2066. Madan, D. B. and F. Milne. 1988b. Option pricing when share prices are subject to predictable jumps, Department of Econometrics working papers, University of Sydney, Sydney, NSW 2066. Madan, D. B. and E. Seneta. 1990. “The variance gamma model for share market returns.” Journal of Business 63, 511–524. Madan, D. B. and H. Unal. 1998. “Pricing the risks of default.” Review of Derivatives Research 2, 121–160. Madan, D. B., F. Milne, and H. Shefrin. 1989. “The multinomial option pricing model and its Brownian and Poisson limits.” The Review of Financial Studies 2(2), 251–265. Madan, D., P. Carr, and E. Chang. 1998. “The variance gamma process and option pricing.” European Finance Review 2, 79–105. Maddala, G. S. 1977. Econometrics, McGraw-Hill, New York. Maddala, G. S. 1983. Limited-dependent and qualitative variables in econometrics, Cambridge University Press, Cambridge. Madhavan, A. 2000. “Market microstructure.” Journal of Financial Markets 3, 205–258. Madhavan, A. and M. Cheng. 1997. “In search of liquidity: an analysis of upstairs and downstairs trades.” Review of Financial Studies 10, 175–204. Maes, K. 2004. Modeling the term structure of interest rates: where do we stand?, Working paper, National Bank of Belgium. Maginn, J. L., D. L. Tuttle, J. E. Pinto, and D. W. McLeavey. 2007. Managing investment portfolios: a dynamic process, CFA Institute Investment Series, 3rd ed., Wiley, New Jersey. Mago, A. 2007. Subprime MBS: grappling with credit uncertainties, Lehman Brothers Inc., March 22. Mahanti, S., A. J. Nashikkar, and M. G. Subrahmanyam. 2007. “Latent liquidity and corporate bond yield spreads.” SSRN eLibrary, http://ssrn.com/paper=966695. Mahanti, S., A. Nashikkar, M. G. Subrahmanyam, G. Chacko, and G. Mallik. 2008. “Latent liquidity: a new measure of liquidity, with an application to corporate bonds.” Journal of Financial Economics 88(2), 272–298. Maich, S. 2003. “Analysts try to justify latest internet rise: a boom or bubble?” National Post, 9 September, 1. Maksimovic, V. 1990.
“Product market imperfections and loan commitments.” Journal of Finance 45, 1641–1653.
Maksimovic, V., A. Stomper, and J. Zechner. 1998. “Capital structure, information acquisition and investment decisions in industry equilibrium.” European Finance Review 2, 251–271. Malevergne, Y. and D. Sornette. 2005. “Higher-moment portfolio theory: capitalizing on behavioral anomalies of stock markets.” Journal of Portfolio Management 31, 49–55. Malik, F. 2003. “Sudden changes in variance and volatility persistence in foreign exchange markets.” Journal of Multinational Financial Management 13(3), 217–230. Malik, F., B. T. Ewing, and J. E. Payne. 2005. “Measuring volatility persistence in the presence of sudden changes in the variance of Canadian stock returns.” Canadian Journal of Economics 38(3), 1037–1056. Malkiel, B. 1966. The term structure of interest rates: expectations and behavior patterns, Princeton University Press, Princeton. Malkiel, B. G. 1995. “Returns from investing in equity mutual funds 1971 to 1991.” Journal of Finance 50, 549–572. Malkiel, B. G. and Y. Xu. 1997. “Risk and return revisited.” Journal of Portfolio Management 23, 9–14. Malliaris, A. G. 1983. “Itô’s calculus in financial decision making.” SIAM Review 25, 482–496. Malliaris, A. and W. A. Brock. 1982. Stochastic methods in economics and finance, North-Holland, Amsterdam. Malpezzi, S. 1998. “A simple error correction model of house prices,” Wisconsin-Madison CULER working papers 98–11, University of Wisconsin Center for Urban Land Economic Research. Manaster, S. and R. J. Rendleman. 1982. “Option prices as predictors of equilibrium stock prices.” Journal of Finance 37, 1043–1057. Mandelbrot, B. 1963. “The variation of certain speculative prices.” Journal of Business 36, 394–419. Mandelbrot, B. 1967. “The variation of some other speculative prices.” Journal of Business 40, 393–413. Mandelbrot, B. 1971. “When can price be arbitraged efficiently? A limit to validity of the random walk and martingale models.” Review of Economics and Statistics 53, 225–236. Mandelbrot, B. and J. W. van Ness. 1968. “Fractional Brownian motions, fractional noises and applications.” SIAM Review 10, 422–437. Mangasarian, O. 1965. “Linear and nonlinear separation of patterns by linear programming.” Operations Research 13, 444–452. Mania, M. and R. Tevzadze. 2000. “A semimartingale Bellman equation and the variance-optimal martingale measure.” Georgian Mathematical Journal 7, 765–792. Mania, M., M. Santacroce, and R. Tevzadze. 2002. “A semimartingale backward equation related to the p-optimal martingale measure and the lower price of a contingent claim,” in Stochastic processes and related topics, Stochastics Monograph Vol. 12, Taylor & Francis, London, pp. 189–212. Mania, M., M. Santacroce, and R. Tevzadze. 2003. “A semimartingale BSDE related to the minimal entropy martingale measure.” Finance and Stochastics 7, 385–402. Mankiw, N. G., D. Romer, and M. D. Shapiro. 1991. “Stock market forecastability and volatility: a statistical appraisal.” Review of Economic Studies 58, 455–477. Manski, C. F. and D. L. McFadden. 1981. Structural analysis of discrete data with econometric applications, MIT Press, Cambridge, MA. Mao, J. C. F. 1969. Quantitative analysis of financial decisions, Macmillan, New York. Maquieira, C., W. Megginson, and L. Nail. 1998. “Wealth creation versus wealth redistributions in pure stock-for-stock mergers.” Journal of Financial Economics 48, 3–33. Marcello, J. L. M., J. L. M. Quiros, and M. M. M. Quiros. 2008.
“Asymmetric variance and spillover effects: regime shifts in the Spanish stock market.” Journal of International Financial Markets, Institutions and Money 18(1), 1–15.
Marcus, A. J. and I. Shaked. 1984. “The valuation of FDIC deposit insurance using option-pricing estimates.” Journal of Money, Credit and Banking 16, 446–460. Marimon, R., E. McGrattan, and T. J. Sargent. 1990. “Money as a medium of exchange in an economy with artificially intelligent agents.” Journal of Economic Dynamics and Control 14, 329–373. Marjit, S. and A. Mukherjee. 1998. “Technology collaboration and foreign equity participation: a theoretical analysis.” Review of International Economics 6(1), 142–150. Marjit, S. and A. Mukherjee. 2001. “Technology transfer under asymmetric information: the role of equity participation.” Journal of Institutional and Theoretical Economics 157(2), 282–300. Mark, N. 1995. “Exchange rates and fundamentals: evidence on long-horizon predictability.” American Economic Review 85, 201–218. Mark, N. and D. Sul. 2001. “Nominal exchange rates and monetary fundamentals: evidence from a small post-Bretton Woods panel.” Journal of International Economics 53, 29–52. Mark, N. and Y. Wu. 1998. “Rethinking deviations from uncovered interest parity: the role of covariance risk and noise.” Economic Journal 108, 1686–1706. Markowitz, H. 1952. “Portfolio selection.” Journal of Finance 7, 77–91. Markowitz, H. M. 1956. “The optimization of a quadratic function subject to linear constraints.” Naval Research Logistics Quarterly 3, 111–133. Markowitz, H. M. 1959. Portfolio selection: efficient diversification of investments, Cowles Foundation Monograph 16, Wiley, New York. Markowitz, H. M. 1976. “Markowitz revisited.” Financial Analysts Journal 32, 47–52. Markowitz, H. M. 1987. Mean-variance analysis in portfolio choice and capital markets, Blackwell, Oxford. Markowitz, H. M. 1991. “Foundations of portfolio theory.” Journal of Finance 46(2), 469–477. Marks, S. and O. J. Dunn. 1974. “Discriminant functions when covariance matrices are unequal.” Journal of the American Statistical Association 69(346), 555–559. Marmol, F. 1998. “Spurious regression theory with non-stationary fractionally integrated processes.” Journal of Econometrics 84, 233–250. Marsh, P. 1982. “The choice between equity and debt: an empirical study.” Journal of Finance 37, 121–144. Marsh, T. A. and R. C. Merton. 1986. “Dividend variability and variance bounds tests for the rationality of stock market prices.” American Economic Review 76, 483–498. Martens, M. and D. van Dijk. 2007. “Measuring volatility with the realized range.” Journal of Econometrics 138, 181–207. Martin, A. D., Jr. 1955. “Mathematical programming of portfolio selections.” Management Science 1, 152–166. Martin, D. 1977. “Early warning of bank failure: a logit regression approach.” Journal of Banking and Finance 1(3), 249–276. Masih, R. and A. M. Masih. 2001. “Long and short term dynamic causal transmission amongst international stock markets.” Journal of International Money and Finance 20, 563–587. Mason, J. R. and J. Rosner. 2007. Where did the risk go? How misapplied bond ratings cause mortgage backed securities and collateralized debt obligation market disruptions, Mortgage-Backed Security Ratings, May 3. Masulis, R. 1983. “The impact of capital structure change on firm value: some estimates.” Journal of Finance 38, 107–126. Masulis, R. W. and A. N. Korwar. 1986. “Seasoned equity offerings: an empirical investigation.” Journal of Financial Economics 15, 91–118. Mathews, J. and K. Fink. 1999.
Numerical methods using MATLAB, Prentice Hall, Upper Saddle River, New Jersey.
Matsuba, I. and H. Takahashi. 2003. “Generalized entropy approach to stable Lévy processes with financial application.” Physica A 319, 458–468. Mauer, D. C. 1993. “Optimal bond call policies under transactions costs.” Journal of Financial Research 16(1), 23–37. Mauer, D. C. and A. J. Triantis. 1994. “Interactions of corporate financing and investment decisions: a dynamic framework.” Journal of Finance 49, 1253–1277. McAleer, M. and M. Medeiros. 2008. “Realized volatility: a review.” Econometric Reviews 27, 10–45. McAndrews, J. J. and L. I. Nakamura. 1992. “Entry-deterring debt.” Journal of Money, Credit and Banking 24(1), 98–110. McCallum, B. T. 1972. “Relative asymptotic bias from errors of omission and measurement.” Econometrica 40, 757–758. McCauley, J. L. 2003. “Thermodynamic analogies in economics and finance: instability of markets.” Physica A 329(1-2), 199–212. McConnell, J. J. and H. Servaes. 1990. “Additional evidence on equity ownership and corporate value.” Journal of Financial Economics 27(2), 595–612. McConnell, J. and H. Servaes. 1995. “Equity ownership and the two faces of debt.” Journal of Financial Economics 39(1), 131–157. McCulloch, R. E. and R. S. Tsay. 1993. “Bayesian inference and prediction for mean and variance shifts in autoregressive time series.” Journal of the American Statistical Association 88, 968–978. McDonald, B. 1983. “Functional forms and the capital asset pricing model.” Journal of Financial and Quantitative Analysis 18, 319–329. McDonald, J. B. 1991. “Parametric models for partially adaptive estimation with skewed and leptokurtic residuals.” Economics Letters 37, 273–278. McDonald, R. L. 2005. Derivatives markets, 2nd Edition, Addison Wesley, Boston, MA. McDonald, R. L. 2006. “The role of real options in capital budgeting: theory and practice.” Journal of Applied Corporate Finance 18(2), 28–39. McDonald, B. and M. H. Morris. 1984. “The functional specification of financial ratios: an empirical examination.” Accounting and Business Research 15(59), 223–228. McDonald, J. B. and Y. J. Xu. 1995. “A generalization of the beta distribution with applications.” Journal of Econometrics 66, 133–152. McElroy, M. B. and E. Burmeister. 1988. “Arbitrage pricing theory as a restricted nonlinear multivariate regression model.” Journal of Business and Economic Statistics 6, 29–42. McGuire, J. and S. Dow. 2002. “The Japanese Keiretsu system: an empirical analysis.” Journal of Business Research 55, 33–40. McLeay, S. 1986. “Students and the distribution of financial ratios.” Journal of Business Finance & Accounting 13(2), 209–222. McQueen, G. and V. V. Roley. 1993. “Stock prices, news, and business conditions.” Review of Financial Studies 6(3), 683–707. McQueen, G. and K. Vorkink. 2004. “Whence GARCH? A preference-based explanation for conditional volatility.” Review of Financial Studies 17(4), 915–949. Meese, R. A. and K. Rogoff. 1983. “Empirical exchange rate models of the seventies: do they fit out of sample?” Journal of International Economics 14, 3–24. Mehran, H. 1992. “Executive incentive plans, corporate control, and capital structure.” Journal of Financial and Quantitative Analysis 27, 539–560. Melamed, B. 1991. “TES: a class of methods for generating autocorrelated uniform variates.” ORSA Journal on Computing 3(4), 317–329. Melamed, B. 1993.
“An overview of TES processes and modeling methodology,” in Performance evaluation of computer and communications systems, L. Donatiello and R. Nelson (Eds.). Lecture Notes in Computer Science, Springer, New York, pp. 359–393. Melamed, B. 1997. “The empirical TES methodology: modeling empirical time series.” Journal of Applied Mathematics and Stochastic Analysis 10(4), 333–353. Melamed, B. 1999. “ARM processes and modeling methodology.” Stochastic Models 15(5), 903–929. Melamed, B., Q. Ren, and B. Sengupta. 1996. “The QTES/PH/1 queue.” Performance Evaluation 26, 1–20. Melick, W. and C. Thomas. 1997. “Recovering an asset’s implied PDF from option prices: an application to crude oil during the Gulf crisis.” Journal of Financial and Quantitative Analysis 32, 91–115. Melino, A. and S. Turnbull. 1990. “Pricing foreign currency options with stochastic volatility.” Journal of Econometrics 45, 239–265. Melino, A. and S. Turnbull. 1995. “Misspecification and the pricing and hedging of long-term foreign currency options.” Journal of International Money and Finance 45, 239–265. Mella-Barral, P. 1999. “The dynamics of default and debt reorganization.” Review of Financial Studies 12, 535–578. Mella-Barral, P. and W. Perraudin. 1997. “Strategic debt service.” Journal of Finance 52, 531–556. Mello, A. S. and J. E. Parsons. 1992. “Measuring the agency cost of debt.” Journal of Finance 47, 1887–1904. Mensah, Y. 1983. “The differential bankruptcy predictive ability of specific price level adjustments: some empirical evidence.” Accounting Review 58(2), 228–246. Mensah, Y. M. and P. J. Miranti, Jr. 1989. “Capital expenditure analysis and automated manufacturing systems: a review and synthesis.” Journal of Accounting Literature 8, 181–207. Mendelson, M., J. Peake, and T. Williams. 1979. “Toward a modern exchange: the Peake-Mendelson-Williams proposal for an electronically assisted auction market,” in Impending changes for securities markets: what role for the exchange?, E. Bloch and R. Schwartz (Eds.). JAI Press, Greenwich, CT. Menkveld, A., S. Koopman, and A. Lucas. 2007. “Modelling around-the-clock price discovery for cross-listed stocks using state space methods.” Journal of Business and Economic Statistics 25(2), 213–225. Mercer, J. 1909. “Functions of positive and negative type, and their connection with the theory of integral equations.” Philosophical Transactions of the Royal Society of London Series A, Containing Papers of a Mathematical or Physical Character 209, 415–446. Merton, R. C. 1971. “Optimum consumption and portfolio rules in a continuous-time model.” Journal of Economic Theory 3, 373–413. Merton, R. 1972. “An analytic derivation of the efficient portfolio frontier.” Journal of Financial and Quantitative Analysis 7, 1851–1872. Merton, R. 1973. “Theory of rational option pricing.” Bell Journal of Economics and Management Science 4, 141–183. Merton, R. C. 1973. “An intertemporal capital asset pricing model.” Econometrica 41, 867–887. Merton, R. C. 1974. “On the pricing of corporate debt: the risk structure of interest rates.” Journal of Finance 29(2), 449–470. Merton, R. C. 1975. “An asymptotic theory of growth under uncertainty.” Review of Economic Studies 42, 375–393. Merton, R. 1975. Option pricing when underlying stock returns are discontinuous, Working Paper #787–75, Massachusetts Institute of Technology. Merton, R. C. 1976. “Option pricing when underlying stock returns are discontinuous.” Journal of Financial Economics 3, 125–144. Merton, R. C. 1977. “An analytic derivation of the cost of deposit insurance and loan guarantees.” Journal of Banking and Finance 1, 3–11. Merton, R. C.
1977. “On the pricing of contingent claims and the Modigliani-Miller theorem.” Journal of Financial Economics 5, 241–250.
Merton, R. C. 1978. “On the cost of deposit insurance when there are surveillance costs.” Journal of Business 51, 439–452.
Merton, R. C. 1980. “On estimating the expected return on the market: an exploratory investigation.” Journal of Financial Economics 8, 323–361.
Merton, R. C. 1982. “On the mathematics and economics assumptions of continuous-time models,” in Financial economics: essays in honor of Paul Cootner, W. F. Sharpe and C. M. Cootner (Eds.). Prentice-Hall, New Jersey, pp. 19–51.
Merton, R. C. 1990. “A dynamic general equilibrium model of the asset market and its application to the pricing of the capital structure of the firm,” chapter 11 in Continuous-time finance, Basil Blackwell, Cambridge, MA.
Merton, R. C. 1992. Continuous-time finance, Wiley-Blackwell, New York.
Merville, L. J. and W. J. Petty. 1978. “Transfer pricing for the multinational firm.” The Accounting Review 53, 935–951.
Meulbroek, L. A. 1992. “An empirical analysis of illegal insider trading.” Journal of Finance 47, 1661–1699.
Meulbroek, L. K. 2002. “A senior manager’s guide to integrated risk management.” Journal of Applied Corporate Finance 14, 56–70.
Meyers, L., G. Gamst, and A. Guarino. 2006. Applied multivariate research, Sage Publications, Thousand Oaks, CA.
Mian, S. L. 1996. “Evidence on corporate hedging policy.” Journal of Financial and Quantitative Analysis 31, 419–439.
Mikkelson, W. H. and M. M. Partch. 1986. “Valuation effects of security offerings and the issuance process.” Journal of Financial Economics 15, 31–60.
Mildenstein, E. and H. Schleef. 1983. “The optimal pricing policy of a monopolistic marketmaker in the equity market.” Journal of Finance 38, 218–231.
Milgrom, P. and J. Roberts. 1990a. “The economics of modern manufacturing.” American Economic Review 80, 511–528.
Milgrom, P. and J. Roberts. 1990b. “Rationalizability, learning, and equilibrium in games with strategic complementarities.” Econometrica 58, 1255–1278.
Miles, J. and R. Ezzell. 1980. “The weighted average cost of capital, perfect capital markets, and project life: a clarification.” Journal of Financial and Quantitative Analysis 15, 719–730.
Milevsky, M. A. and S. E. Posner. 1998a. “A closed-form approximation for valuing basket options.” The Journal of Derivatives 5(4), 54–61.
Milevsky, M. A. and S. E. Posner. 1998b. “Asian options, the sum of lognormals, and the reciprocal Gamma distribution.” Journal of Financial and Quantitative Analysis 33, 409–422.
Milgrom, P. and N. Stokey. 1982. “Information, trade and common knowledge.” Journal of Economic Theory 26(1), 17–27.
Miller, M. H. 1977. “Debt and taxes.” Journal of Finance 32, 261–275.
Miller, E. 1977. “Risk, uncertainty and divergence of opinion.” Journal of Finance 32, 1151–1168.
Miller, M. H. 1988. “The Modigliani–Miller propositions after thirty years.” Journal of Economic Perspectives 2, 99–120.
Miller, M. H. and F. Modigliani. 1961. “Dividend policy, growth, and the valuation of shares.” Journal of Business 34, 411–433.
Miller, M. and K. Rock. 1985. “Dividend policy under asymmetric information.” Journal of Finance 40, 1031–1051.
Miller, M. and M. Scholes. 1972. “Rates of return in relation to risk: a reexamination of some recent findings,” in Studies in the theory of capital markets, M. C. Jensen (Ed.). Praeger, New York, pp. 47–78.
Miller, M. and M. Scholes. 1978. “Dividends and taxes.” Journal of Financial Economics 6, 333–364.
Milonas, N. T. and S. B. Thomadakis. 1997. “Convenience yields as call options: an empirical analysis.” Journal of Futures Markets 17, 1–15.
Miltersen, K., K. Sandmann, and D. Sondermann. 1997. “Closed-form solutions for term structure derivatives with lognormal interest rates.” Journal of Finance 52, 409–430.
Milne, F. and H. Shefrin. 1987. “A critical examination of limiting forms of the binomial option pricing theory,” Department of Economics, Australian National University, Canberra.
Mincer, J. and V. Zarnowitz. 1969. “The evaluation of economic forecasts,” in Economic forecasts and expectations, J. Mincer (Ed.). National Bureau of Economic Research, New York.
Minton, B. A. 1997. “An empirical examination of basic valuation models for plain vanilla US interest rate swaps.” Journal of Financial Economics 44, 251–277.
Mitchell, K. 1991. “The call, sinking fund, and term-to-maturity features of corporate bonds: an empirical investigation.” Journal of Financial and Quantitative Analysis 26(2), 201–222.
Mitchell, M. L. and E. Stafford. 2000. “Managerial decisions and long-term stock price performance.” Journal of Business 73, 287–329.
Mitton, T. 2006. “Stock market liberalization and operating performance at the firm level.” Journal of Financial Economics 81, 625.
Miyahara, Y. 1996. “Canonical martingale measures of incomplete assets markets,” in Probability theory and mathematical statistics: proceedings of the seventh Japan–Russia symposium, World Scientific, Tokyo, pp. 343–352.
Miyakoshi, T. 2002. “ARCH versus information-based variances: evidence from the Tokyo stock market.” Japan and the World Economy 2, 215–231.
Mizrach, B. 1995. “Target zone models with stochastic realignments: an econometric evaluation.” Journal of International Money and Finance 14, 641–657.
Mizrach, B. 2006. “The Enron bankruptcy: when did the options market lose its smirk?” Review of Quantitative Finance and Accounting 27, 365–382.
MMC Security. 2007. Market update: the catastrophe bond market at year-end 2006, Guy Carpenter & Company, Inc.
Modigliani, L. 1997. “Don’t pick your managers by their alphas,” U.S. Investment Research, U.S. Strategy, Morgan Stanley Dean Witter, June.
Modigliani, F. and M. Miller. 1958. “The cost of capital, corporation finance and the theory of investment.” American Economic Review 48, 261–297.
Modigliani, F. and M. Miller. 1963. “Corporate income taxes and the cost of capital: a correction.” American Economic Review 53, 433–443.
Modigliani, F. and L. Modigliani. 1997. “Risk-adjusted performance.” Journal of Portfolio Management 23(2), 45–54.
Modigliani, F. and G. A. Pogue. 1974. “An introduction to risk and return.” Financial Analysts Journal 30, 69–86.
Moeller, S., F. Schlingemann, and R. Stulz. 2004. “Firm size and the gains from acquisitions.” Journal of Financial Economics 73, 201–228.
Mogedey, L. and W. Ebeling. 2000. “Local order, entropy and predictability of financial time series.” The European Physical Journal B 5, 733–737.
Molina, C. A. 2005. “Are firms underleveraged? An examination of the effect of leverage on default probabilities.” Journal of Finance 60, 1427.
Monoyios, M. 2007. “The minimal entropy measure and an Esscher transform in an incomplete market model.” Statistics and Probability Letters 77, 1070–1076.
Moody’s. 2006. “Bank financial strength ratings: revised methodology.” Moody’s Global Credit Research Report.
Moore, T. and P. Wang. 2007. “Volatility in stock returns for new EU member states: Markov regime switching model.” International Review of Financial Analysis 16(3), 282–292.
Mora, N. 2006. “Sovereign credit ratings: guilty beyond reasonable doubt?” Journal of Banking and Finance 30(7), 2041–2062.
Morck, R., A. Shleifer, and R. W. Vishny. 1988. “Management ownership and market valuation: an empirical analysis.” Journal of Financial Economics 20, 293–315.
Mordecki, E. 2002. “Optimal stopping and perpetual options for Lévy processes.” Finance and Stochastics 6, 473–493.
Moreira, M. J. 2002. Tests with correct size in the simultaneous equations model, University of California, Berkeley.
Morgan, I. G. 1977. “Grouping procedures for portfolio formation.” Journal of Finance 32, 1759–1765.
Morgan, D. P. 2002. “Rating banks: risk and uncertainty in an opaque industry.” The American Economic Review 92(4), 874–888.
Morrison, D. G. 1969. “On the interpretation of discriminant analysis.” Journal of Marketing Research 6, 156–163.
Morrison, D. F. 1990. Multivariate statistical methods, 3rd edn. McGraw-Hill, Inc., New York.
Morse, D. 1980. “Asymmetric information in securities markets and trading volume.” Journal of Financial and Quantitative Analysis 15, 1129–1148.
Morton, K. W. and D. F. Mayers. 2005. Numerical solution of partial differential equations: an introduction, Cambridge University Press, Cambridge.
Mosler, K. and M. Scarsini (Eds.). 1991. Stochastic orders and decision under risk, Institute of Mathematical Statistics, Hayward, CA.
Mossin, J. 1966. “Equilibrium in a capital asset market.” Econometrica 34, 768–783.
Mossin, J. 1968. “Optimal multiperiod portfolio policies.” Journal of Business 41, 215–229.
Mosteller, F. 1946. “On some useful ‘inefficient’ statistics.” Annals of Mathematical Statistics 17, 377–408.
Mukherjee, A. 2002. “Subsidy and entry: the role of licensing.” Oxford Economic Papers 54(1), 160–171.
Muliere, P. and M. Scarsini. 1989. “A note on stochastic dominance and inequality measures.” Journal of Economic Theory 49, 314–323.
Müller, A. and D. Stoyan. 2002. Comparison methods for stochastic models and risks, Wiley, Chichester.
Mullins, H. M. and D. H. Pyle. 1994. “Liquidation costs and risk-based bank capital.” Journal of Banking and Finance 18, 113–138.
Musiela, M. and M. Rutkowski. 1997. Martingale methods in financial modeling, Springer, New York.
Musiela, M. and T. Zariphopoulou. 2003. “An example of indifference prices under exponential preferences.” Finance and Stochastics 8(2), 229–239.
Muth, J. F. 1961. “Rational expectations and the theory of price movements.” Econometrica 29, 315–335.
Myers, S. C. 1974. “Interactions of corporate financing and investment decisions – implications for capital budgeting.” Journal of Finance 29, 1–25.
Myers, S. C. 1977. “Determinants of corporate borrowing.” Journal of Financial Economics 5, 147–175.
Myers, S. C. 1984. “The capital structure puzzle.” Journal of Finance 39, 575–592.
Myers, S. C. 1984. “Finance theory and financial strategy.” Interfaces 14, 126–137; reprinted in Midland Corporate Finance Journal 5(1), 6–13 (1987).
Myers, S. C. and L. S. Borucki. 1994. “Discounted cash flow estimates of the cost of equity capital – a case study.” Financial Markets, Institutions and Instruments (Estimating Cost of Capital: Methods and Practice) 3(3), 9–45.
Myers, S. C. and R. A. Cohn. 1987. “A discounted cash flow approach to property-liability insurance rate regulation,” in Fair rate of return in property-liability insurance, J. D. Cummins and S. E. Harrington (Eds.). Kluwer Academic Publishers, Norwell, MA.
Myers, J. H. and E. W. Forgy. 1963. “The development of numerical credit evaluation systems.” Journal of the American Statistical Association 58, 799–806.
Myers, S. C. and N. S. Majluf. 1984. “Corporate financing and investment decisions when firms have information that investors do not have.” Journal of Financial Economics 13, 187–221.
Nakatsuma, T. 2000. “Bayesian analysis of ARMA-GARCH models: a Markov chain sampling approach.” Journal of Econometrics 95, 57–69.
Nance, D. R., C. W. Smith Jr., and C. W. Smithson. 1993. “On the determinants of corporate hedging.” Journal of Finance 48, 267–284.
Nanda, V., Z. Jay Wang, and L. Zheng. 2004. “Family values and the star phenomenon.” Review of Financial Studies 17, 667–698.
Nandi, S. 1996. “Pricing and hedging index options under stochastic volatility.” Working Paper, Federal Reserve Bank of Atlanta.
Neave, E. H. and G. L. Ye. 2003. “Pricing Asian options in the framework of the binomial model: a quick algorithm.” Derivative Use, Trading and Regulation 9(3), 203–216.
Neely, C., P. Weller, and R. Dittmar. 1997. “Is technical analysis in the foreign exchange market profitable? A genetic programming approach.” Journal of Financial and Quantitative Analysis 32(4), 405–426.
Neftci, S. 2000. Introduction to the mathematics of financial derivatives, Academic Press, New York.
Nelson, D. B. 1990. “Stationarity and persistence in the GARCH(1,1) model.” Econometric Theory 6, 318–334.
Nelson, D. 1990. “ARCH models as diffusion approximations.” Journal of Econometrics 45, 7–38.
Nelson, D. 1991. “Conditional heteroskedasticity in asset returns: a new approach.” Econometrica 59(2), 347–370.
Nelson, D. B. 1992. “Filtering and forecasting with misspecified ARCH models I: getting the right variance with the wrong model.” Journal of Econometrics 52, 61–90.
Nelson, D. B. and D. P. Foster. 1995. “Filtering and forecasting with misspecified ARCH models II: making the right forecast with the wrong model.” Journal of Econometrics 67, 303–335.
NERA Economic Consulting. 2007. At a glance: the chilling effects of the subprime meltdown, Marsh & McLennan Companies, September.
Nesbitt, S. L. 1994. “Long-term rewards from shareholder activism: a study of the CalPERS effect.” Journal of Applied Corporate Finance 6, 75–80.
Newey, W. and K. West. 1987. “A simple positive semi-definite, heteroscedasticity and autocorrelation consistent covariance matrix.” Econometrica 55, 703–708.
Ng, W.-Y., K.-W. Choi, and K.-H. Shum. 1996. “Arbitrated matching: formulation and protocol.” European Journal of Operational Research 88(2), 348–357.
Niehans, J. 1978. A theory of money, Johns Hopkins, Baltimore, MD, pp. 166–182.
Nielsen, J. and K. Sandmann. 1996. “The pricing of Asian options under stochastic interest rate.” Applied Mathematical Finance 3, 209–236.
Nielsen, L., J. Saá-Requejo, and P. Santa-Clara. 1993. Default risk and interest rate risk: the term structure of default spreads, Working Paper, INSEAD.
Nielsen, L. T., J. Saá-Requejo, and P. Santa-Clara. 2001. Default risk and interest rate risk: the term structure of default spreads, Working paper, INSEAD.
Nison, S. 1991. Japanese candlestick charting techniques, New York Institute of Finance, New York.
Norden, L. and M. Weber. 2004. “Informational efficiency of credit default swap and stock markets: the impact of credit rating announcements.” Journal of Banking and Finance 28, 2813–2843.
Nordman, D. J. and S. N. Lahiri. 2005. “Validity of the sampling window method for long-range dependent linear processes.” Econometric Theory 21, 1087–1111.
Norgaard, M., N. K. Poulsen, and O. Ravn. 2000. “New developments in state estimation for nonlinear systems.” Automatica 36(11), 1627–1638.
Novomestky, F. 1997. “A dynamic, globally diversified, index neutral synthetic asset allocation strategy.” Management Science 43, 998–1016.
O’Brien, J. 2003. “The capital structure implications of pursuing a strategy of innovation.” Strategic Management Journal 24(5), 415–431.
Obstfeld, M. 1994. “Risk-taking, global diversification, and growth.” American Economic Review 84, 1310–1329.
Officer, M. 2007. “The price of corporate liquidity: acquisition discounts for unlisted targets.” Journal of Financial Economics 83(3), 571–598.
Ogden, J. P. 1987. “Determinants of the ratings and yields on corporate bonds: tests of the contingent claims model.” Journal of Financial Research 10, 329–339.
Ogryczak, W. and A. Ruszczyński. 1999. “From stochastic dominance to mean-risk models: semideviations as risk measures.” European Journal of Operational Research 116, 33–50.
Ogryczak, W. and A. Ruszczyński. 2001. “On consistency of stochastic dominance and mean-semideviation models.” Mathematical Programming 89, 217–232.
Ogryczak, W. and A. Ruszczyński. 2002. “Dual stochastic dominance and related mean-risk models.” SIAM Journal on Optimization 13, 60–78.
O’Hara, M. 1997. Market microstructure theory, Basil Blackwell, Cambridge, MA.
Ohlson, J. A. 1980. “Financial ratios and the probabilistic prediction of bankruptcy.” Journal of Accounting Research 18(1), 109–131.
Ohlson, J. and Z. Gao. 2006. “Earnings, earnings growth and value.” Foundations and Trends in Accounting 1(1), 1–70.
Øksendal, B. 2007. Stochastic differential equations, 5th ed., Springer, New York.
Øksendal, B. 2007. Stochastic differential equations: an introduction with applications, 6th Edition, Springer-Verlag, Berlin Heidelberg.
Oliner, S. and D. Sichel. 2000. “The resurgence of growth in the late 1990s: is information technology the story?” Journal of Economic Perspectives 14, 3–22.
Omberg, E. 1988. “Efficient discrete time jump process models in option pricing.” Journal of Financial and Quantitative Analysis 23, 161–174.
Opler, T., L. Pinkowitz, R. Stulz, and R. Williamson. 1999. “The determinants and implications of corporate cash holdings.” Journal of Financial Economics 52(1), 3–46.
Ortega, J. M. 1988. Introduction to parallel and vector solution of linear systems, Plenum Press, New York.
Ortiz-Molina, H. 2006. “Top management incentives and the pricing of corporate public debt.” Journal of Financial and Quantitative Analysis 41, 317.
Ouederni, B. N. and W. G. Sullivan. 1991. “Semi-variance model for incorporating risk into capital investment analysis.” The Engineering Economist 36, 83–106.
Ozenbas, D., R. Schwartz, and R. Wood. 2002. “Volatility in U.S. and European equity markets: an assessment of market quality.” International Finance 5(3), 437–461.
Pagano, M. S. 1999. An empirical investigation of the rationales for risk-taking and risk-management behavior in the U.S. banking industry, Unpublished dissertation, Rutgers University.
Pagano, M. S. 2001. “How theories of financial intermediation and corporate risk-management influence bank risk-taking.” Financial Markets, Institutions, and Instruments 10, 277–323.
Pagano, M. and A. Röell. 1996. “Transparency and liquidity: a comparison of auction and dealer markets with informed trading.” Journal of Finance 51, 579–611.
Page, E. S. 1954. “Continuous inspection schemes.” Biometrika 41, 100–115.
Pain, D. and J. Rand. 2008. “Recent developments in portfolio insurance.” Bank of England Quarterly Bulletin Q1, 37–46.
Palatella, L., J. Perello, M. Montero, and J. Masoliver. 2005. “Diffusion entropy technique applied to the study of the market activity.” Physica A 355, 131–137.
Palepu, K. G. 1986. “Predicting takeover targets: a methodological and empirical analysis.” Journal of Accounting and Economics 8(1), 3–35.
Palia, D. 2001. “The endogeneity of managerial compensation in firm valuation: a solution.” Review of Financial Studies 14, 735.
Pan, E. 1998. “Collapse of detail.” International Journal of Theoretical and Applied Finance 1(2), 247–282.
Pan, J. 2002. “The jump-risk premia implicit in options: evidence from an integrated time-series study.” Journal of Financial Economics 63, 3–50.
Pan, J. and A. Poteshman. 2003. The information in option volume for stock prices, Working Paper, MIT and University of Illinois.
Pan, E. and L. Wu. 2006. “Taking positive interest rates seriously,” chapter 14, Advances in Quantitative Analysis of Finance and Accounting 4. Reprinted as chapter 98 herein.
Panayi, S. and L. Trigeorgis. 1998. “Multi-stage real options: the cases of information technology infrastructure and international bank expansion.” Quarterly Review of Economics and Finance 38 (Special Issue), 675–692.
Panjer, H. H. et al. 1998. Financial economics, The Actuarial Foundation, Schaumburg, IL.
Papageorgiou, E. and R. Sircar. Submitted. “Multiscale intensity models and name grouping for valuation of multi-name credit derivatives.”
Pardoux, E. and S. G. Peng. 1990. “Adapted solution of a backward stochastic differential equation.” Systems & Control Letters 14, 55–61.
Park, S., H. J. Jang, and M. P. Loeb. 1995. “Insider trading activity surrounding annual earnings announcements.” Journal of Business Finance & Accounting 22, 587–614.
Parkinson, M. 1977. “Option pricing: the American put.” Journal of Business 50, 21–36.
Parkinson, M. 1980. “The extreme value method for estimating the variance of the rate of return.” Journal of Business 53, 61–65.
Parlour, C. 1998. “Price dynamics in limit order markets.” Review of Financial Studies 11, 789–816.
Parlour, C. and D. Seppi. 2008. “Limit order markets: a survey,” in Handbook of financial intermediation & banking, A. W. A. Boot and A. V. Thakor (Eds.). Elsevier, Amsterdam.
Paroush, J., R. Schwartz, and A. Wolf. 2008. The dynamic process of price discovery in an equity market, Working paper, Baruch College, CUNY.
Parrino, R., R. W. Sias, and L. T. Starks. 2003. “Voting with their feet: institutional ownership changes around forced CEO turnover.” Journal of Financial Economics 68, 3–76.
Parulekar, R., U. Bishnoi, and T. Gang. 2008. ABS & mortgage credit strategy cross sector relative value snapshot, Citigroup Global Markets Inc., June 13.
Pastena, V. and W. Ruland. 1986. “The merger/bankruptcy alternative.” Accounting Review 61(2), 288–301.
Pástor, L. 2000. “Portfolio selection and asset pricing models.” Journal of Finance 55(1), 179–223.
Pástor, L. and R. Stambaugh. 1999. “Costs of equity capital and model mispricing.” Journal of Finance 54(1), 67–121.
Pástor, L. and R. Stambaugh. 2000. “Comparing asset pricing models: an investment perspective.” Journal of Financial Economics 56, 335–381.
Pástor, L. and R. Stambaugh. 2003. “Liquidity risk and expected stock returns.” Journal of Political Economy 111, 642–685.
Patel, J., R. J. Zeckhauser, and D. Hendricks. 1994. “Investment flows and performance: evidence from mutual funds, cross-border investments, and new issues,” in Japan, Europe and the international financial markets: analytical and empirical perspectives, R. Sato, R. Levich and R. Ramachandran (Eds.). Cambridge University Press, New York, pp. 51–72.
Patro, D. K. 2001. “Measuring performance of international closed-end funds.” Journal of Banking and Finance 25, 1741–1767.
Patro, D. K. 2008. “Stock market liberalization and emerging market country fund premiums.” Journal of Business 78(1), 135–168.
Patron, H. 2007. “The value of information about central bankers’ preferences.” International Review of Economics and Finance 16, 139–148.
Pearson, N. D. and T. S. Sun. 1994. “Exploiting the conditional density in estimating the term structure: an application to the Cox, Ingersoll, and Ross model.” Journal of Finance 49, 1279–1304.
Pedersen, C. S. and S. E. Satchell. 2002. “On the foundations of performance measures under asymmetric returns.” Quantitative Finance 3, 217–223.
Pedrosa, M. and R. Roll. 1998. “Systematic risk in corporate bond credit spreads.” Journal of Fixed Income 8(3), 7–26.
Peel, M. J. and D. A. Peel. 1987. “Some further empirical evidence on predicting private company failure.” Accounting and Business Research 18(69), 57–66.
Penman, S. 2001. Financial statement analysis and security valuation, McGraw-Hill, Boston.
Pennacchi, G. G. 1987. “A reexamination of the over- (or under-) pricing of deposit insurance.” Journal of Money, Credit and Banking 19, 340–360.
Perold, A. R. 1986. Constant proportion portfolio insurance, Working paper, Harvard Business School.
Perold, A. 2007. “Fundamentally flawed indexing.” Financial Analysts Journal 63(6), 31–37.
Perold, A. R. and W. F. Sharpe. 1988. “Dynamic strategies for asset allocation.” Financial Analysts Journal 44(1), 16–27.
Perron, P. 1989. “The great crash, the oil price shock and the unit root hypothesis.” Econometrica 57, 1361–1401.
Peterson, C. L. 2007. Subprime mortgage market turmoil: examining the role of securitization – a hearing before the U.S. Senate Committee on Banking, Housing, and Urban Affairs Subcommittee on Securities, Insurance, and Investment, University of Florida, April 17.
Peterson, S., R. C. Stapleton, and M. G. Subrahmanyam. 1998. A two-factor lognormal model of the term structure and the valuation of American-style options on bonds, Working paper, New York University.
Petkova, R. and L. Zhang. 2005. “Is value riskier than growth?” Journal of Financial Economics 78, 187–202.
Pham, H., Th. Rheinländer, and M. Schweizer. 1998. “Mean–variance hedging for continuous processes: new proofs and examples.” Finance and Stochastics 2, 173–198.
Phillips, P. C. B. 1986. “Understanding spurious regressions in econometrics.” Journal of Econometrics 33, 311–340.
Phillips, P. C. B. 1998. “New tools for understanding spurious regressions.” Econometrica 66, 1299–1326.
Philips, T. K., G. T. Rogers and R. E. Capaldi. 1996. “Tactical asset allocation: 1977–1994.” Journal of Portfolio Management 23, 57–64.
Piazzesi, M. 2001. An econometric model of the yield curve with macroeconomic jump effects, Working paper, UCLA.
Piazzesi, M. 2005. “Bond yields and the Federal Reserve.” Journal of Political Economy 113(2), 311–344.
Piazzesi, M. 2010. “Affine term structure models,” in Handbook of Financial Econometrics, North-Holland.
Pike, R. 1983. “The capital budgeting behavior and corporate characteristics of capital-constrained firms.” Journal of Business Finance & Accounting 10, 663–671.
Pike, R. 1988. “An empirical study of the adoption of sophisticated capital budgeting practices and decision-making effectiveness.” Accounting and Business Research 18, 341–351.
Piotroski, J. 2000. “Value investing: the use of historical financial statement information to separate winners from losers.” Journal of Accounting Research 38 (Supplement), 1–41.
Piotroski, J. D. and D. Roulstone. 2005. “Do insider trades reflect both contrarian beliefs and superior knowledge about future cash flow realizations?” Journal of Accounting and Economics 39, 55–82.
Piramuthu, S. 1999. “Financial credit-risk evaluation with neural and neurofuzzy systems.” European Journal of Operational Research 112(2), 310–321.
Pliska, S. R. 1997. Introduction to mathematical finance, Blackwell, Oxford.
Plott, C. and S. Sunder. 1982. “Efficiency of experimental markets with insider information: an application of rational expectations models.” Journal of Political Economy 90, 663–698.
Poitevin, M. 1989. “Financial signaling and the deep-pocket argument.” Rand Journal of Economics 20, 26–40.
Pogue, G. A. and K. Lull. 1974. “Corporate finance: an overview.” Sloan Management Review 15, 19–38.
Pollak, M. and D. Siegmund. 1985. “A diffusion process and its applications to detecting a change in the drift of Brownian motion.” Biometrika 72(2), 267–280.
Pontiff, J. and L. Schall. 1998. “Book-to-market as a predictor of market returns.” Journal of Financial Economics 49, 141–160.
Poon, S. and C. W. J. Granger. 2003. “Forecasting volatility in financial markets: a review.” Journal of Economic Literature 41, 478–539.
Poon, W. P. H., M. Firth, and H.-G. Fung. 1999. “A multivariate analysis of the determinants of Moody’s bank financial strength ratings.” Journal of International Financial Markets, Institutions & Money 9(3), 267–283.
Porter, D. P. and V. L. Smith. 2003. “Stock market bubbles in the laboratory.” Journal of Behavioral Finance 4(1), 7–20.
Porter, D. and D. Weaver. 1998. “Post-trade transparency on Nasdaq’s national market system.” Journal of Financial Economics 50(2), 231–252.
Poterba, J. M. and L. H. Summers. 1988. “Mean reversion in stock prices: evidence and implications.” Journal of Financial Economics 22, 27–59.
Posner, S. E. and M. A. Milevsky. 1998. “Valuing exotic options by approximating the SPD with higher moments.” The Journal of Financial Engineering 7(2), 109–125.
Predescu, M. 2005. The performance of structural models of default for firms with liquid CDS spreads, Working Paper, Rotman School of Management, University of Toronto.
Prékopa, A. 2003. “Probabilistic programming,” in Stochastic programming, A. Ruszczyński and A. Shapiro (Eds.). Elsevier, Amsterdam, pp. 257–352.
Press, W., S. Teukolsky, W. Vetterling, and B. Flannery. 2007. Numerical recipes: the art of scientific computing, 3rd edn. Cambridge University Press, Cambridge.
Price, K., B. Price, and T. J. Nantell. 1982. “Variance and lower partial moment measures of systematic risk: some analytical and empirical results.” Journal of Finance 37, 843–855.
Protter, P. 1990. Stochastic integration and differential equations: a new approach, Springer-Verlag, Berlin.
Protter, P. 2001. “A partial introduction to financial asset pricing theory.” Stochastic Processes and their Applications 91, 169–203.
Protter, P. 2003. Stochastic integration and differential equations, 2nd Edition, Springer, New York.
Protter, P. 2004. Stochastic integration and differential equations, 2nd edition, Springer, Heidelberg.
Puri, M. L. and D. A. Ralescu. 1986. “Fuzzy random variables.” Journal of Mathematical Analysis and Applications 114, 409–422.
Pye, G. 1966. “A Markov model of the term structure.” Quarterly Journal of Economics 80, 60–72.
Qian, X., R. Ashizawa, and H. Tsurumi. 2005. “Bayesian, MLE, and GMM estimation of spot rate model.” Communications in Statistics 34(12), 2221–2233.
Quenouille, M. 1949. “Approximate tests of correlation in time series.” Journal of the Royal Statistical Society, Series B 11, 68–84.
Quiggin, J. 1982. “A theory of anticipated utility.” Journal of Economic Behavior and Organization 3, 225–243.
Quiggin, J. 1993. Generalized expected utility theory – the rank-dependent expected utility model, Kluwer, Dordrecht.
Quirin, G. D. and W. R. Waters. 1975. “Market efficiency and the cost of capital: the strange case of fire and casualty insurance companies.” Journal of Finance 30(2), 427–445.
Quirk, J. P. and R. Saposnik. 1962. “Admissibility and measurable utility functions.” Review of Economic Studies 29, 140–146.
Rachev, S. T. and S. Mittnik. 2000. Stable Paretian models in finance, Wiley, Chichester.
Radner, R. and L. Shepp. 1996. “Risk vs. profit potential: a model for corporate strategy.” Journal of Economic Dynamics and Control 20, 1373–1393.
Ragunathan, V. and A. Peker. 1997. “Price variability, trading volume, and market depth: evidence from the Australian futures market.” Applied Financial Economics 5, 447–454.
Rajan, R. G. and L. Zingales. 1995. “What do we know about capital structure? Some evidence from international data.” Journal of Finance 50, 1421–1460.
Ramanathan, R. 1995. Introductory econometrics with application, Dryden, Fort Worth, TX.
Ramsden, R., L. B. Appelbaum, R. Ramos, and L. Pitt. 2007. The subprime issue: a global assessment of losses, contagion and strategic implications, Goldman Sachs Group, Inc. – Global: Banks, November 20.
Ray, B. K. and R. S. Tsay. 2000. “Long-range dependence in daily stock volatilities.” Journal of Business & Economic Statistics 18(2), 254–262.
Raymar, S. 1991. “A model of capital structure when earnings are mean-reverting.” Journal of Financial and Quantitative Analysis 26(3), 327–344.
Rebonato, R. 1998. Interest rate option models, 2nd ed., Wiley, New York.
Rebonato, R. 2002. Modern pricing of interest rate derivatives, Princeton University Press, New Jersey.
Rebonato, R. 2004. Volatility and correlation: the perfect hedger and the fox, 2nd ed., Wiley, New York.
Rees, B. 1990. Financial analysis, Prentice-Hall, London.
Reilly, F. K. 1985. Investment analysis and portfolio management, 2nd ed., Dryden Press, Hinsdale.
Reinganum, M. R. 1981. “Misspecification of capital asset pricing: empirical anomalies based on earnings yields and market values.” Journal of Financial Economics 9, 19–46.
Reinhart, C. M. 2002. “Default, currency crises, and sovereign credit ratings.” World Bank Economic Review 16, 151–170.
Rendleman, R. J., Jr. and B. J. Bartter. 1979. “Two-state option pricing.” Journal of Finance 34(5), 1093–1110.
Rendleman, R. J., Jr. and B. J. Bartter. 1980. “The pricing of options on debt securities.” Journal of Financial and Quantitative Analysis 15, 11–24.
Rendleman, R. J., Jr. and T. J. O’Brien. 1990. “The effects of volatility misestimation on option-replication portfolio insurance.” Financial Analysts Journal 46(3), 61–70.
Revuz, D. and M. Yor. 1999. Continuous martingales and Brownian motion, Springer, Berlin.
Revuz, D. and M. Yor. 2005. Continuous martingales and Brownian motion, corrected 3rd printing of the 3rd Edition, Springer, Berlin.
Rheinländer, Th. 1999. Optimal martingale measures and their applications in mathematical finance, Ph.D. thesis, Berlin.
Rheinländer, T. 2005. “An entropy approach to the Stein/Stein model with correlation.” Finance and Stochastics 9, 399–412.
Rheinländer, T. and G. Steiger. 2006. “The minimal entropy martingale measure for general Barndorff–Nielsen/Shephard models.” Annals of Applied Probability 16(3), 1319–1351.
Rhoades, S. A. 1993. “Efficiency effects of horizontal (in-market) bank mergers.” Journal of Banking and Finance 17, 411–422.
Rhoades, S. A. 1998. “The efficiency effects of bank mergers: an overview of case studies of nine mergers.” Journal of Banking and Finance 22, 273–291.
Riani, M. and A. Atkinson. 2007. “Fast calibrations of the forward search for testing multiple outliers in regression.” Advances in Data Analysis and Classification 1(2), 123–141.
Ribeiro, R. A., H. J. Zimmermann, R. R. Yager, and J. Kacprzyk (Eds.). 1999. Soft computing in financial engineering, Physica-Verlag, Heidelberg.
Richard, S. F. 1978. “An arbitrage model of the term structure of interest rates.” Journal of Financial Economics 6, 33–57.
Richardson, D. H. and De-Min Wu. 1970. “Least squares and grouping method estimation in the errors in variables model.” Journal of the American Statistical Association 65, 724–748.
Ricke, M. 2004. “What is the link between margin loans and stock market bubbles?” University of Muenster, Department of Banking No. 03–01.
Riley, J. 1979. “Informational equilibrium.” Econometrica 47, 331–359.
Risa, S. 2004. The New Lehman HEL OAS model, Lehman Brothers Inc., December 8.
Ritchey, R. 1990. “Call option valuation for discrete normal mixtures.” Journal of Financial Research 13, 285–295.
Ritchken, P. 1987. Options: theory, strategy, and applications, Scott, Foresman, Glenview, IL.
Ritchken, P. 1995. “On pricing barrier options.” The Journal of Derivatives 3, 19–28.
Ritchken, P. H. and R. Trevor. 1999. “Pricing options under generalized GARCH and stochastic volatility processes.” Journal of Finance 54, 377–402.
Rivoli, P. and T. L. Brewer. 1997. “Political instability and country risk.” Global Finance Journal 8(2), 309–321.
Robbins, H. 1970. “Statistical methods related to the law of the iterated logarithm.” Annals of Mathematical Statistics 41, 1397–1409.
Robert, C. and G. Casella. 1999. Monte Carlo statistical methods, Springer Texts in Statistics, Springer, New York.
Robert, C. P. and G. Casella. 2004. Monte Carlo statistical methods, 2nd ed., Springer, New York.
Robertson, T. and J. Kennedy. 1968. “Prediction of consumer innovators: multiple discriminant analysis.” Journal of Marketing Research 5, 64–69.
Robichek, A. A. and R. A. Cohn. 1974. “The economic determinants of systematic risk.” Journal of Finance 29, 439–447.
Robichek, A. and S. Myers. 1965. Optimal financing decisions, Prentice-Hall, Englewood Cliffs, NJ.
Robichek, A. and S. C. Myers. 1966. “Risk-adjusted discount rates.” Journal of Finance 21, 727–730.
Robin, S. and B. Ruffieux. 2001. “Price bubbles in laboratory asset markets with constant fundamental values.” Experimental Economics 4, 87–105.
Rockafellar, R. T. and S. Uryasev. 2000. “Optimization of conditional value-at-risk.” Journal of Risk 2, 21–41.
Rogers, L. C. G. 1995. Mathematical finance, IMA Volume 65, Springer, New York.
Rogers, L. C. G. 1996. “Gaussian errors.” RISK 9, 42–45.
Rogers, L. C. G. 1997. “The potential approach to the term structure of interest rates and foreign exchange rates.” Mathematical Finance 7, 157–176.
Rogers, L. and S. Satchell. 1991. “Estimating variance from high, low and closing prices.” Annals of Applied Probability 1, 504–512.
Rogers, L. and Z. Shi. 1995. “The value of an Asian option.” Journal of Applied Probability 32, 1077–1088.
Rogers, L., S. Satchell, and Y. Yoon. 1994. “Estimating the volatility of stock prices: a comparison of methods that use high and low prices.” Applied Financial Economics 4, 241–247.
Roll, R. 1969. “Bias in fitting the Sharpe model to time series data.” Journal of Financial and Quantitative Analysis 4, 271–289.
Roll, R. 1977. “An analytic valuation formula for unprotected American call options on stocks with known dividends.” Journal of Financial Economics 5, 251–281.
Roll, R. 1977. “A critique of the asset pricing theory’s tests, Part I: on past and potential testability of the theory.” Journal of Financial Economics 4, 129–176.
Roll, R. 1978. “Ambiguity when performance is measured by the securities market line.” Journal of Finance 33, 1051–1069.
Roll, R. 1989. “Price volatility, international market links, and their implications for regulatory policy.” Journal of Financial Services Research 3, 211–246.
Roll, R. and S. A. Ross. 1980. “An empirical investigation of the arbitrage pricing theory.” Journal of Finance 35, 1073–1103.
Roll, R. and S. A. Ross. 1994. “On the cross-sectional relation between expected returns and betas.” Journal of Finance 49, 101–121.
Rolski, T., H. Schmidli, V. Schmidt, and J. Teugels. 1999. Stochastic processes for insurance and finance, Wiley, England.
Ronen, T. and X. Zhou. 2008. “Where did all the information go? Trade in the corporate bond market,” Working paper, Rutgers University.
Ronn, E. I. and A. K. Verma. 1986. “Pricing risk-adjusted deposit insurance: an option-based model.” Journal of Finance 41, 871–895.
Rosen, S. 1974. “Hedonic prices and implicit markets: product differentiation in pure competition.” Journal of Political Economy 82(1), 34–55.
Rosenberg, B. 1972. The behavior of random variables with nonstationary variance and the distribution of security prices, Working Paper #11, Research Program in Finance, University of California, Berkeley.
Rosenberg, J. V. and R. F. Engle. 2002. “Empirical pricing kernels.” Journal of Financial Economics 64(3), 341–372.
Rosenberg, B. and J. Guy. 1976a. “Prediction of beta from investment fundamentals.” Financial Analysts Journal 32, 60–72.
Rosenberg, B. and J. Guy. 1976b. “Prediction of beta from investment fundamentals, Part II.” Financial Analysts Journal 32, 62–70.
Rosenberg, B. and V. Marathe. 1974. The prediction of investment risk: systematic and residual risk, Berkeley Working Paper Series.
Rosenberg, B. and V. Marathe. 1975. Tests of the capital asset pricing hypothesis, Working Paper No. 32 of the Research Program in Finance, Graduate School of Business and Public Administration, University of California, Berkeley.
Rosenberg, B. and W. McKibben. 1973. “The prediction of systematic and specific risk in common stocks.” Journal of Financial and Quantitative Analysis 8, 317–333.
Ross, S. A. 1970. “On the general validity of the mean-variance approach in large markets,” in Financial economics: essays in honor of Paul Cootner, W. F. Sharpe and C. M. Cootner (Eds.). Prentice Hall, Englewood Cliffs, NJ, pp. 52–84.
Ross, S. 1974. “Portfolio turnpike theorems for constant policies.” Journal of Financial Economics 1, 171–198.
Ross, S. 1976. “The arbitrage theory of capital asset pricing.” Journal of Economic Theory 13, 341–360.
Ross, S. A. 1977. “Return, risk and arbitrage,” in Risk and return in finance, Vol. 1, I. Friend and J. L. Bicksler (Eds.). Ballinger, Cambridge, pp. 187–208.
Ross, S. 1977. “The determination of financial structure: the incentive signaling approach.” Bell Journal of Economics 8, 23–40.
Ross, S. 1978. “Mutual fund separation in financial theory – the separating distributions.” Journal of Economic Theory 15, 254–286.
Ross, S. 1995. “Hedging long-run commitments: exercises in incomplete market pricing,” Working Paper, Yale School of Management.
Ross, S. A., R. W. Westerfield, and J. F. Jaffe. 2002. Corporate finance, 6th Edition, McGraw-Hill, New York, NY.
Ross, S., R. Westerfield, and B. Jordan. 2008. Fundamentals of corporate finance, 8th edn. McGraw-Hill Irwin, New York.
Rotemberg, J. and D. Scharfstein. 1990. “Shareholder value maximization and product market competition.” Review of Financial Studies 3, 367–393.
Rothenberg, T. J. 1984. “Approximating the distributions of econometric estimators and test statistics,” in Handbook of econometrics, Volume II, Z. Griliches and M. D. Intriligator (Eds.). North-Holland, Amsterdam.
Rothschild, M. and J. E. Stiglitz. 1969. “Increasing risk: I. A definition.” Journal of Economic Theory 2, 225–243.
Rouge, R. and N. El Karoui. 2000. “Pricing via utility maximization and entropy.” Mathematical Finance 10, 259–276.
Rousseeuw, P. J. 1983. “Regression techniques with high breakdown point.” The Institute of Mathematical Statistics Bulletin 12, 155.
Rousseeuw, P. J. 1984. “Least median of squares regression.” Journal of the American Statistical Association 79(388), 871–880.
Rousseeuw, P. J. and A. Christmann. 2003. “Robustness against separation and outliers in logistic regression.” Computational Statistics & Data Analysis 43(3), 315–332.
Rousseeuw, P. J. and V. J. Yohai. 1984. “Robust regression by means of S-estimators,” in Robust and nonlinear time series analysis, Vol. 26, W. H. Franke and R. D. Martin (Eds.). Springer, New York, pp. 256–272.
Royston, P. 1982. “An extension of Shapiro and Wilk’s W test for normality to large samples.” Applied Statistics 31, 115–124.
Rozeff, M. S. and M. A. Zaman. 1988. “Market efficiency and insider trading: new evidence.” Journal of Business 61, 25–44.
Rozeff, M. S. and M. A. Zaman. 1998. “Overreaction and insider trading: evidence from growth and value portfolios.” Journal of Finance 53, 701–716.
Ruback, R. 2002. “Capital cash flows: a simple approach to valuing risky cash flows.” Financial Management 31, 85–103.
Rubinstein, M. E. 1973. “A mean-variance synthesis of corporate financial theory.” Journal of Finance 28, 167–181.
Rubinstein, M. 1973. “A comparative statics analysis of risk premiums,” essay 2 in Four essays in the theory of competitive security markets, Ph.D. dissertation (UCLA 1971); appeared in Journal of Business.
Rubinstein, M. 1974a. “An aggregation theorem for security markets.” Journal of Financial Economics 1(3), 225–244.
Rubinstein, M. 1974b. A discrete-time synthesis of financial theory, Working Papers #20 and #21, Research Program in Finance, University of California, Berkeley.
Rubinstein, M. 1975. “Securities market efficiency in an Arrow-Debreu economy.” American Economic Review 65(5), 812–824.
Rubinstein, M. 1976. “The strong case for the generalized logarithmic utility model as the premier model of financial markets.” Journal of Finance 31, 551–571.
Rubinstein, M. 1976. “The valuation of uncertain income streams and the pricing of options.” Bell Journal of Economics 7, 407–425.
Rubinstein, M. 1985. “Nonparametric tests of alternative option pricing models using all reported trades and quotes on the 30 most active CBOE option classes from August 23, 1976 through August 31, 1978.” Journal of Finance 40, 455–480.
Rubinstein, M. 1994. “Implied binomial trees.” Journal of Finance 49, 771–818.
Rubinstein, M. 1998. “Edgeworth binomial trees.” The Journal of Derivatives 5(3), 20–27.
Rubinstein, M. and H. E. Leland. 1981. “Replicating options with positions in stock and cash.” Financial Analysts Journal 37, 63–72.
Rubinstein, M. and E. Reiner. 1991. “Breaking down the barriers.” Risk Magazine 4, 28–35.
Rubinstein, M., H. Leland, and J. Cox. 1985. Option markets, Prentice-Hall, Englewood Cliffs, NJ.
Ruszczyński, A. and R. J. Vanderbei. 2003. “Frontiers of stochastically nondominated portfolios.” Econometrica 71, 1287–1297.
Rutledge, D. J. S. 1972. “Hedgers’ demand for futures contracts: a theoretical framework with applications to the United States soybean complex.” Food Research Institute Studies 11, 237–256.
Rydberg, T. H. 1997. “The normal inverse Gaussian Lévy process: simulation and approximation.” Communications in Statistics: Stochastic Models 13, 887–910.
Rydberg, T. 1999. “Generalized hyperbolic diffusion processes with applications towards finance.” Mathematical Finance 9, 183–201.
Ryngaert, M. D. and J. F. Houston. 1994. “The overall gains from large bank mergers.” Journal of Banking and Finance 18, 1155–1176.
Salop, S. and D. Scheffman. 1983. “Raising rivals’ costs.” American Economic Review 73, 267–271.
Samuelson, P. A. 1948. Foundations of economic analysis, Harvard University Press, Cambridge, MA.
Samuelson, P. A. 1960. “An extension of the Le Chatelier Principle.” Econometrica 28, 368–379.
Samuelson, P. A. 1965. “Rational theory of warrant pricing.” Industrial Management Review 6, 13–39.
Samuelson, P. A. 1965. “Proof that properly anticipated prices fluctuate randomly.” Industrial Management Review 6, 41–49.
Samuelson, P. A. 1972. “Maximum principles in analytical economics.” American Economic Review 62(3), 249–262.
Samuelson, P. 1973. “Proof that properly discounted present values of assets vibrate randomly.” Bell Journal of Economics and Management Science 4(2), 369–374.
Samuelson, P. and R. Merton. 1969. “A complete model of warrant pricing that maximizes utility.” Industrial Management Review 10, 17–46.
Sánchez, J. de A. and A. T. Gómez. 2003a. “Applications of fuzzy regression in actuarial analysis.” Journal of Risk and Insurance 70(4), 665–699.
Sánchez, J. de A. and A. T. Gómez. 2003b. “Estimating a term structure of interest rates for fuzzy financial pricing by using fuzzy regression methods.” Fuzzy Sets and Systems 139(2), 313–331.
Sánchez, J. de A. and A. T. Gómez. 2004. “Estimating a fuzzy term structure of interest rates using fuzzy regression techniques.” European Journal of Operational Research 154, 804–818.
Sandas, P. 2001. “Adverse selection and competitive market making: empirical evidence from a limit order market.” Review of Financial Studies 14, 705–734.
Sanders, A. B. and H. Unal. 1988. “On the intertemporal behavior of the short-term rate of interest.” Journal of Financial and Quantitative Analysis 23, 417–423.
Sandroni, A., R. Smorodinsky, and R. Vohra. 2003. “Calibration with many checking rules.” Mathematics of Operations Research 28, 141–153.
Sankaran, M. 1963. “Approximations to the non-central chi-square distribution.” Biometrika 50, 199–204.
Santa-Clara, P. and D. Sornette. 2001. “The dynamics of the forward interest rate curve with stochastic string shocks.” Review of Financial Studies 14, 149–185.
Santhanam, R., K. Muralidhar, and M. Schniederjans. 1989. “A zero-one goal programming approach for information system project selection.” OMEGA: International Journal of Management Science 17(6), 583–593.
Sargan, J. D. 1958. “The estimation of economic relationships using instrumental variables.” Econometrica 26, 393–415.
Sarig, O. and A. Warga. 1989. “Some empirical estimates of the risk structure of interest rates.” Journal of Finance 44, 1351–1360.
Sarkar, S. and R. S. Sriram. 2001. “Bayesian models for early warning of bank failures.” Management Science 47(11), 1457–1475.
SAS/OR. 1990. User’s guide, SAS Institute Inc., Cary, NC.
Sato, K. 1999. Lévy processes and infinitely divisible distributions, Cambridge Studies in Advanced Mathematics, Vol. 68, Cambridge University Press, Cambridge.
Saunders, A. 1997. Financial institutions management: a modern perspective, Irwin/McGraw-Hill, New York, NY, pp. 330–331.
Saunders, A. and L. Allen. 2002. Credit risk measurement: new approaches to value at risk and other paradigms, Wiley, New York.
Saunders, A., E. Strock, and N. G. Travlos. 1990. “Ownership structure, deregulation, and bank risk taking.” Journal of Finance 45, 643–654.
Schaefer, S. and E. Schwartz. 1984. “A two-factor model of the term structure: an approximate analytical solution.” Journal of Financial and Quantitative Analysis 19(4), 413–424.
Schaefer, S. and I. A. Strebulaev. 2008. “Structural models of credit risk are useful: evidence from hedge ratios on corporate bonds.” Journal of Financial Economics 90(1), 1–19.
Schall, L. D. 1972. “Asset valuation, firm investment, and firm diversification.” Journal of Business 45, 11–28.
Scheinkman, J. A. and W. Xiong. 2003. “Overconfidence and speculative bubbles.” Journal of Political Economy 111, 1183–1219.
Schiff, J. L. 1999. The Laplace transform: theory and applications, Springer, New York.
Schink, G. R. and R. S. Bower. 1994. “Application of the Fama-French model to utility stocks.” Financial Markets, Institutions and Instruments (Estimating Cost of Capital: Methods and Practice) 3(3), 74–96.
Shleifer, A. and R. Vishny. 1992. “Liquidation values and debt capacity.” Journal of Finance 47, 1343–1366.
Schmidt, P. 1976. Econometrics, Marcel Dekker, New York.
Scholes, M. 1976. “Taxes and the pricing of options.” Journal of Finance 31, 319–332.
Scholes, M. and J. Williams. 1977. “Estimating betas from nonsynchronous data.” Journal of Financial Economics 5, 309–327.
Schölkopf, B. and A. J. Smola. 2002. Learning with kernels, MIT Press, Cambridge, MA.
Schonbucher, P. J. 2003. Credit derivatives pricing models: models, pricing and implementation, Wiley, UK.
Schonbucher, P. and D. Schubert. 2001. Copula-dependent default risk in intensity models, Working paper, Bonn University.
Schrage, L. 1991. User’s manual: linear, integer, and quadratic programming with LINDO, The Scientific Press, San Francisco, CA.
Schrage, L. 2000. Linear, integer and quadratic programming with LINDO, The Scientific Press, Palo Alto, CA.
Schrand, C. M. 1997. “The association between stock-price interest rate sensitivity and disclosures about derivatives instruments.” The Accounting Review 72, 87–109.
Schrand, C. M. and H. Unal. 1998. “Hedging and coordinated risk management: evidence from thrift conversions.” Journal of Finance 53, 979–1014.
Schroder, M. 1989. “Computing the constant elasticity of variance option pricing formula.” Journal of Finance 44, 211–220.
Schwartz, R. 1991. Reshaping the equity markets: a guide for the 1990s, Harper Business, New York.
Schwartz, E. S. 1997. “The stochastic behavior of commodity prices: implications for valuation and hedging.” Journal of Finance 52, 923–973.
Schwartz, R. and R. Francioni. 2004. Equity markets in action, Wiley, New York.
Schwartz, E. and M. Moon. 2000. “Rational pricing of internet companies.” Financial Analysts Journal 56, 62–75.
Schwartz, E. and M. Moon. 2001. “Rational pricing of internet companies revisited.” University of California at Los Angeles (revised April 2001).
Schweizer, M. 1992. “Martingale densities for general asset prices.” Journal of Mathematical Economics 21, 363–378.
Schweizer, M. 1994. “Approximating random variables by stochastic integrals.” Annals of Probability 22, 1536–1575.
Schweizer, M. 1996. “Approximation pricing and the variance-optimal martingale measure.” Annals of Probability 24, 206–236.
Schwert, G. W. 1989. “Why does stock market volatility change over time?” Journal of Finance 44(5), 1115–1153.
Schwert, G. W. 2003. “Anomalies and market efficiency,” in Handbook of the economics of finance, G. M. Constantinides, M. Harris, and R. M. Stulz (Eds.). North-Holland, Amsterdam.
Scott, D. W. 1979. “On optimal and data-based histograms.” Biometrika 66, 605–610.
Scott, L. 1987. “Option pricing when the variance changes randomly: theory, estimation, and an application.” Journal of Financial and Quantitative Analysis 22(4), 419–438.
Scott, L. 1997. “Pricing stock options in a jump-diffusion model with stochastic volatility and interest rates: application of Fourier inversion methods.” Mathematical Finance 7, 413–426.
Sears, S. and G. Trennepohl. 1982. “Measuring portfolio risk in options.” Journal of Financial and Quantitative Analysis 17, 391–410.
Seiford, L. and P. L. Yu. 1979. “Potential solutions of linear systems: the multi-criteria multiple constraint level program.” Journal of Mathematical Analysis and Applications 69, 283–303.
Seligman, J. 1984. “Reappraising the appraisal remedy.” George Washington Law Review 52, 829.
Seppi, D. 1990. “Equilibrium block trading and asymmetric information.” Journal of Finance 45, 73–94.
Seyhun, H. N. 1986. “Insiders’ profits, costs of trading, and market efficiency.” Journal of Financial Economics 16, 189–212.
Seyhun, H. N. 1992. “Why does aggregate insider trading predict future stock returns?” Quarterly Journal of Economics 107, 1303–1331.
Shafer, G. and V. Vovk. 2001. Probability and finance: it’s only a game, Wiley, New York.
Shailer, G. 1989. “The predictability of small enterprise failures: evidence and issues.” International Small Business Journal 7(4), 54–58.
Shalen, C. T. 1993. “Volume, volatility, and the dispersion of beliefs.” Review of Financial Studies 6, 405–434.
Shanken, J. 1990. “Intertemporal asset pricing: an empirical investigation.” Journal of Econometrics 45, 99–120.
Shanken, J. 1992. “On the estimation of beta-pricing models.” Review of Financial Studies 5, 1–33.
Shapiro, A. C. and S. Titman. 1985. “An integrated approach to corporate risk management.” Midland Corporate Finance Journal 3, 41–56.
Sharfman, K. 2003. “Valuation averaging: a new procedure for resolving valuation disputes.” Minnesota Law Review 88(2), 357–383.
Sharpe, W. F. 1961. Portfolio analysis based on a simplified model of the relationships among securities, Ph.D. dissertation, University of California, Los Angeles.
Sharpe, W. F. 1963. “A simplified model for portfolio analysis.” Management Science 9(2), 277–293.
Sharpe, W. 1964. “Capital asset prices: a theory of market equilibrium under conditions of risk.” Journal of Finance 19(3), 425–442.
Sharpe, W. F. 1966. “Mutual fund performance.” Journal of Business 39, 119–138.
Sharpe, W. F. 1967. “A linear programming algorithm for mutual fund portfolio selection.” Management Science 13, 499–510.
Sharpe, W. F. 1970. Portfolio theory and capital markets, McGraw-Hill, New York.
Sharpe, W. F. 1992. “Asset allocation: management style and performance measurement.” Journal of Portfolio Management 18, 7–19.
Sharpe, W. F. 1994. “The Sharpe ratio.” Journal of Portfolio Management 21, 49–58.
Shastri, K. and K. Tandon. 1986. “Valuation of foreign currency options: some empirical results.” Journal of Financial and Quantitative Analysis 21, 377–392.
Shaw, W. 2006. Stochastic volatility: models of Heston type, lecture notes. http://www.mth.kcl.ac.uk/~shaww/web_page/papers/StoVolLecture.pdf and http://www.mth.kcl.ac.uk/~shaww/web_page/papers/StoVolLecture.nb.
Shea, J. 1997. “Instrument relevance in multivariate linear models: a simple measure.” Review of Economics and Statistics 79, 348.
Sheard, P. 1994. “Interlocking shareholdings and corporate governance,” in The Japanese firm: the sources of competitive strength, M. Aoki and M. Dore (Eds.). Oxford University Press, Oxford, pp. 310–349.
Shen, F. C. 1988. Industry rotation in the U.S. stock market: an application of multiperiod portfolio theory, unpublished Ph.D. dissertation, Simon Fraser University, Canada.
Shen, C. H. 1994. “Testing efficiency of the Taiwan–US forward exchange market – a Markov switching model.” Asian Economic Journal 8, 205–215.
Shephard, N. 1996. “Statistical aspects of ARCH and stochastic volatility,” in Statistical models in econometrics, finance and other fields, O. E. Barndorff-Nielsen, D. R. Cox, and D. V. Hinkley (Eds.). Chapman & Hall, London, pp. 1–67.
Sheppard, S. 1999. “Hedonic analysis of housing markets,” in Handbook of regional and urban economics, Vol. 3, P. C. Chesire and E. S. Mills (Eds.). Elsevier, New York.
Shi, Y. 2001. Multiple criteria multiple constraint-levels linear programming: concepts, techniques and applications, World Scientific Publishing, River Edge, NJ.
Shi, Y. and H. Lee. 1997. “A binary integer linear programming with multicriteria and multiconstraint levels.” Computers and Operations Research 24, 259–273.
Shi, Y. and Y. H. Liu. 1993. “Fuzzy potential solutions in multicriteria and multiconstraints levels linear programming problems.” Fuzzy Sets and Systems 60, 163–179.
Shi, Y. and P. L. Yu. 1989. “Goal setting and compromise solutions,” in Multiple criteria decision making and risk analysis using microcomputers, B. Karpak and S. Zionts (Eds.). Springer, Berlin, pp. 165–204.
Shi, Y. and P. L. Yu. 1992. “Selecting optimal linear production systems in multiple criteria environments.” Computers and Operations Research 19, 585–608.
Shi, Y., M. Wise, M. Luo, and Y. Lin. 2001. “Data mining in credit card portfolio management: a multiple criteria decision making approach,” in Multiple criteria decision making in the new millennium, M. Koksalan and S. Zionts (Eds.). Springer, Berlin, pp. 427–436.
Shi, Y., Y. Peng, W. Xu, and X. Tang. 2002. “Data mining via multiple criteria linear programming: applications in credit card portfolio management.” International Journal of Information Technology & Decision Making 1(1), 131–151.
Shih, F. and D. Suk. 1992. Information content of insider trading and insider ownership around special dividend announcements, Working Paper, Rider College.
Shiller, R. 1981. “Do stock prices move too much to be justified by subsequent changes in dividends?” American Economic Review 71, 421–436.
Shiller, R. 1984. “Stock prices and social dynamics.” Brookings Papers on Economic Activity 2, 457–498.
Shiller, R. J. 1988. “Fashions, fads, and bubbles in financial markets,” in Knights, raiders, and targets, J. Coffee Jr., L. Lowenstein and S. Ross-Ackerman (Eds.). Oxford University Press, New York.
Shimko, D. 1993. "Bounds of probability." Risk 6, 33–37.
Shimko, D. C., N. Tejima, and D. Van Deventer. 1993. "The pricing of risky debt when interest rates are stochastic." The Journal of Fixed Income 3, 58–65.
Shiryaev, A. and L. Shepp. 1996. "Hiring and firing optimally in a large corporation." Journal of Economic Dynamics and Control 20, 1523–1539.
Shleifer, A. and R. Vishny. 1986. "Large shareholders and corporate control." Journal of Political Economy 94, 448–461.
Shleifer, A. and R. W. Vishny. 1990. "Equilibrium short horizons of investors and firms." American Economic Review 80, 148–153.
Shmueli, G., N. Patel, and P. Bruce. 2006. Data mining for business intelligence: concepts, techniques, and applications in Microsoft Office Excel with XLMiner, Wiley, New Jersey.
Shreve, S. E. 2004a. Stochastic calculus for finance I: the binomial asset pricing model, Springer Finance, New York.
Shreve, S. E. 2004b. Stochastic calculus for finance II: continuous-time models, Springer Finance, New York.
Shu, J. H. and J. E. Zhang. 2006. "Testing range estimators of historical volatility." Journal of Futures Markets 26, 297–313.
Shumway, T. 2001. "Forecasting bankruptcy more accurately: a simple hazard model." Journal of Business 74(1), 101–124.
Sick, G. A. 1990. "Tax adjusted discount rates." Management Science 36, 1432–1450.
Siegel, J. J. 2003. "What is an asset price bubble? An operational definition." European Financial Management 9, 11–24.
Siegel, J. 2006. "The noisy market hypothesis." Wall Street Journal, June 14, 2006, A14.
Siegmund, D. 1985. Sequential analysis, Springer-Verlag, New York, NY.
Silberberg, E. 1971. "The Le Chatelier principle as a corollary to a generalized envelope theorem." Journal of Economic Theory 3, 146–155.
Silberberg, E. 1974. "A revision of comparative statics methodology in economics, or how to do economics on the back of an envelope." Journal of Economic Theory 7, 159–172.
Silberberg, E. and W. Suen. 2000. The structure of economics: a mathematical analysis, McGraw-Hill, New York.
Silverman, B. W. 1986. Density estimation, Chapman and Hall, London.
Sims, C. A. 1972. "Money, income, and causality." American Economic Review 62, 540–552.
Sims, C. A. 1980. "Macroeconomics and reality." Econometrica 48, 1–48.
Sinkey, J. F., Jr. and D. Carter. 1994. "The derivatives activities of U.S. commercial banks," in The declining role of banking, papers and proceedings of the 30th annual conference on bank structure and regulation, Federal Reserve Bank of Chicago, pp. 165–185.
Sirri, E. R. and P. Tufano. 1998. "Costly search and mutual fund flows." Journal of Finance 53, 1589–1622.
Sklar, A. 1959. "Fonctions de répartition à n dimensions et leurs marges." Publications de l'Institut de Statistique de l'Université de Paris 8, 229–231.
Skouras, S. 2007. "Decisionmetrics: a decision-based approach to econometric modelling." Journal of Econometrics 127, 414–440.
Smidt, S. 1971. "Which road to an efficient stock market: free competition or regulated monopoly?" Financial Analysts Journal 27, 18–20, 64–69.
Smidt, S. 1990. "Long-run trends in equity turnover." Journal of Portfolio Management 17, 66–73.
Smirlock, M. and L. Starks. 1985. "A further examination of stock price changes and transaction volume." Journal of Financial Research 8, 217–225.
Smith, K. V. 1969. "Stock price and economic indexes for generating efficient portfolios." Journal of Business 42, 326–335.
Smith, C. 1976. "Option pricing: a review." Journal of Financial Economics 3, 3–51.
Smith, C. W., Jr. 1979. "Applications of option pricing analysis," in Handbook of financial economics, J. L. Bicksler (Ed.). North-Holland, Amsterdam, pp. 79–121.
Smith, C. W., Jr. 1995. "Corporate risk management: theory and practice." Journal of Derivatives 2, 21–30.
Smith, M. 1996. "Shareholder activism by institutional investors: evidence from CalPERS." Journal of Finance 51, 227–252.
Smith, R. and R. Sidel. 2006. "J.P. Morgan agrees to settle IPO cases for $425 million." Wall Street Journal, April 21, C4.
Smith, C. W., Jr. and R. M. Stulz. 1985. "The determinants of firms' hedging policies." Journal of Financial and Quantitative Analysis 20, 391–405.
Smith, C., Jr. and J. B. Warner. 1979. "On financial contracting: an analysis of bond covenants." Journal of Financial Economics 7, 117–161.
Smith, C. W., Jr. and R. L. Watts. 1992. "The investment opportunity set and corporate financing, dividend, and compensation policies." Journal of Financial Economics 32, 263–292.
Smith, V., G. Suchanek and A. Williams. 1988. "Bubbles, crashes, and endogenous expectations in experimental spot asset markets." Econometrica 56, 1119–1151.
Snedecor, G. W. and W. G. Cochran. 1989. Statistical methods, 8th ed., Iowa State Press, Ames, Iowa.
So, M. K. P., K. Lam, and W. K. Li. 1998. "A stochastic volatility model with Markov switching." Journal of Business and Economic Statistics 16, 244–253.
Sobehart, J. R. and S. C. Keenan. 2004. "Performance evaluation for credit spread and default risk models," in Credit risk: models and management, 2nd Edition, D. Shimko (Ed.). Risk Books, London, pp. 275–305.
Sobehart, J. R., S. C. Keenan, and R. M. Stein. 2000. "Benchmarking quantitative default risk models: a validation methodology." Moody's Investors Service, New York, NY.
Sobehart, J. R., S. C. Keenan, and R. M. Stein. 2000. "Rating methodology: benchmarking quantitative default risk models: a validation methodology." Moody's Investors Service, Global Credit Research, New York.
Sodal, S., S. Koekebakker, and R. Aadland. 2008. "Market switching in shipping: a real option model applied to the valuation of combination carriers." Review of Financial Economics 17(3), 183–203.
Sola, M. and J. Driffill. 1994. "Testing the term structure of interest rates using a vector autoregression with regime switching." Journal of Economic Dynamics and Control 18, 601–628.
Solnik, B. 1974a. "An equilibrium model of the international capital market." Journal of Economic Theory 8, 500–524.
Solnik, B. 1974b. "An international market model of security price behavior." Journal of Financial and Quantitative Analysis 10, 537–554.
Solnik, B. 1974c. "The international pricing of risk: an empirical investigation of the world capital market structure." Journal of Finance 29, 365–377.
Solnik, B. 1993. "The performance of international asset allocation strategies using conditioning information." Journal of Empirical Finance 1, 33–55.
Solnik, B. 2004. International investments, Pearson-Addison Wesley, Salt Lake City.
Soner, H. M., S. Shreve, and J. Cvitanic. 1995. "There is no nontrivial hedging portfolio for option pricing with transaction costs." Annals of Applied Probability 5, 327–355.
Sorensen, M. 2007. "How smart is smart money? A two-sided matching model of venture capital." The Journal of Finance 62, 2725–2762.
Sortino, F. and L. Price. 1994. "Performance measurement in a downside risk framework." Journal of Investing 3, 59–65.
Spence, M. 1977. "Entry, capacity, investment and oligopolistic pricing." Bell Journal of Economics 10, 1–19.
Spong, K. R. and R. J. Sullivan. 1998. "How does ownership structure and manager wealth influence risk?" Financial Industry Perspectives, Federal Reserve Bank of Kansas City, pp. 15–40.
Sprenkle, C. 1961. "Warrant prices as indicators of expectations and preferences." Yale Economic Essays 1(2), 178–231.
Staiger, D. and J. H. Stock. 1997. "Instrumental variables regression with weak instruments." Econometrica 65, 557–586.
Stambaugh, R. F. 1988. "The information in forward rates: implications for models of the term structure." Journal of Financial Economics 21, 41–70.
Stambaugh, R. F. 1999. "Predictive regressions." Journal of Financial Economics 54, 375–421.
Stanton, R. 1997. "A nonparametric model of term structure dynamics and the market price of interest rate risk." Journal of Finance 52, 1973–2002.
Stein, C. 1973. "Estimation of the mean of a multivariate normal distribution." Proceedings of the Prague Symposium on Asymptotic Statistics.
Stein, J. C. 1998. "An adverse-selection model of bank asset and liability management with implications for the transmission of monetary policy." RAND Journal of Economics 29, 466–486.
Stein, R. M. 2002. "Benchmarking default prediction models: pitfalls and remedies in model validation." Moody's KMV, Technical Report No. 030124.
Stein, E. M. and R. Shakarchi. 2005. Real analysis, Princeton University Press, Princeton, NJ.
Stein, E. and J. Stein. 1991. "Stock price distributions with stochastic volatility." Review of Financial Studies 4, 727–752.
Stephan, J. and R. Whaley. 1990. "Intraday price changes and trading volume relations in the stock and stock option markets." Journal of Finance 45, 191–220.
Steuer, R. 1986. Multiple criteria optimization: theory, computation and application, Wiley, New York.
Stevens, G. V. G. 1998. "On the inverse of the covariance matrix in portfolio analysis." Journal of Finance 53(5), 1821–1827.
Stevenson, B. G. and M. W. Fadil. 1995. "Modern portfolio theory: can it work for commercial loans?" Commercial Lending Review 10(2), 4–12.
Stickel, S. and R. Verrecchia. 1994. "Evidence that volume sustains price changes." Financial Analysts Journal 50(6), 57–67.
Stigler, G. 1964. "Public regulation of the securities markets." Journal of Business 37, 117–142.
Stiglitz, J. E. 1969. "A re-examination of the Modigliani-Miller theorem." The American Economic Review 59, 784–793.
Stiglitz, J. E. 1974. "On the irrelevance of corporate financial policy." The American Economic Review 64, 851–866.
Stiglitz, J. 1990. "Symposium on bubbles." Journal of Economic Perspectives 4, 13–18.
Stohs, M. H. and D. C. Mauer. 1996. "The determinants of corporate debt maturity structure." Journal of Business 69(3), 279–312.
Stock, J. H. 1987. "Measuring business cycle time." Journal of Political Economy 95, 1240–1261.
Stock, D. 1992. "The analytics of relative holding-period risk for bonds." Journal of Financial Research 15(3), 253–264.
Stock, J. and M. Watson. 2001. "Vector autoregressions." Journal of Economic Perspectives 15, 101–115.
Stock, J. H. and M. Yogo. 2002. Testing for weak instruments in linear IV regression, NBER Technical Working Paper 284, National Bureau of Economic Research.
Stock, J. H., J. H. Wright, and M. Yogo. 2002. "A survey of weak instruments and weak identification in generalized method of moments." Journal of Business and Economic Statistics 20, 518–529.
Stoll, H. R. 1969. "The relationship between put and call option prices." Journal of Finance 24, 801–824.
Stoll, H. R. 1976. "Dealer inventory behavior: an empirical investigation of NASDAQ stocks." Journal of Financial and Quantitative Analysis 11(3), 356–380.
Stoll, H. R. 1978a. "The supply of dealer services in securities markets." Journal of Finance 33, 1122–1151.
Stoll, H. R. 1978b. "The pricing of security dealer services: an empirical study of NASDAQ stocks." Journal of Finance 33, 1153–1172.
Stoll, H. 1989. "Inferring the components of the bid-ask spread: theory and empirical tests." Journal of Finance 44, 115–134.
Stoll, H. R. 2000. "Friction." Journal of Finance 55, 1479–1514.
Stone, B. 1973. "A linear programming formulation of the general portfolio selection problem." Journal of Financial and Quantitative Analysis 8, 621–636.
Storey, D. J., K. Keasey, R. Watson, and P. Wynarczyk. 1987. The performance of small firms, Croom Helm, Bromley.
Stricker, C. 1990. "Arbitrage et lois de martingale." Annales de l'Institut Henri Poincaré (B) Probabilités et Statistiques 26, 451–460.
Stulz, R. M. 1982. "Options on the minimum or maximum of two risky assets: analysis and applications." Journal of Financial Economics 10, 161–185.
Stulz, R. M. 1984. "Optimal hedging policies." Journal of Financial and Quantitative Analysis 19, 127–140.
Stulz, R. 1988. "Managerial control of voting rights: financing policies and the market for corporate control." Journal of Financial Economics 20, 25–54.
Stulz, R. M. 1990. "Managerial discretion and optimal financing policies." Journal of Financial Economics 26, 3–27.
Stulz, R. M. 1996. "Rethinking risk management." Journal of Applied Corporate Finance 9, 8–24.
Stulz, R. M. 1999. "International portfolio flows and security markets," in International capital flows, M. Feldstein (Ed.). University of Chicago Press, Chicago.
Stulz, R. M. and R. Williamson. 2003. "Culture, openness, and finance." Journal of Financial Economics 70, 313–349.
Stutzer, M. 1996. "A simple nonparametric approach to derivative security valuation." Journal of Finance 51(5), 1633–1652.
Sullivan, R., A. Timmermann, and H. White. 1999. "Data-snooping, technical trading rule performance, and the bootstrap." Journal of Finance 54(5), 1647–1691.
Summa, J. F. and J. W. Lubow. 2001. Options on futures, John Wiley & Sons, New York.
Sun, L. 2007. "A re-evaluation of auditors' opinions versus statistical models in bankruptcy prediction." Review of Quantitative Finance and Accounting 28, 55–78.
Sun, T. S., S. Sundaresan, and C. Wang. 1993. "Interest rate swaps: an empirical investigation." Journal of Financial Economics 34, 77–99.
Sundaram, A. K., T. A. John, and K. John. 1996. "An empirical analysis of strategic competition and firm values – the case of R&D competition." Journal of Financial Economics 40(3), 459–486.
Sundaresan, M. 1984. "Consumption and equilibrium interest rates in stochastic production economies." Journal of Finance 39, 77–92.
Sundaresan, S. 2003. Fixed income markets and their derivatives, 2nd Edition, South-Western Thomson, Cincinnati, OH.
Sunder, S. 1980. "Stationarity of market risk: random coefficients tests for individual stocks." Journal of Finance 35, 883–896.
Sunder, S. 1992. "Experimental asset markets: a survey," in Handbook of experimental economics, J. H. Kagel and A. E. Roth (Eds.). Princeton University Press, Princeton, NJ.
Sung, T. K., N. Chang, and G. Lee. 1999. "Dynamics of modeling in data mining: interpretive approach to bankruptcy prediction." Journal of Management Information Systems 16(1), 63–85.
Sy, A. N. R. 2004. "Rating the rating agencies: anticipating currency crises or debt crises?" Journal of Banking and Finance 28(11), 2845–2867.
Szego, G. 2002. "Measures of risk." Journal of Banking and Finance 26, 1253–1272.
Szewczyk, S. H., G. P. Tsetsekos, and Z. Zantout. 1996. "The valuation of corporate R&D expenditures: evidence from investment opportunities and free cash flow." Financial Management 25(1), 105–110.
Taggart, R. A., Jr. 1980. "Taxes and corporate capital structure in an incomplete market." Journal of Finance 35, 645–659.
Taggart, R. A. 1991. "Consistent valuation and cost of capital expressions with corporate and personal taxes." Financial Management 20, 8–20.
Taleb, N. 1997. Dynamic hedging, Wiley, New York.
Tam, K. and M. Kiang. 1992. "Managerial applications of neural networks: the case of bank failure predictions." Management Science 38(7), 926–947.
Tanaka, H., S. Uejima, and K. Asai. 1982. "Linear regression analysis with fuzzy model." IEEE Transactions on Systems, Man and Cybernetics 12(6), 903–907.
Tang, R. Y. W. 1992. "Transfer pricing in the 1990s." Management Accounting 73, 22–26.
Tang, D. Y. and H. Yan. 2006. "Liquidity, liquidity spillover, and credit default swap spreads." 2007 AFA Chicago Meetings Paper.
Tang, D. Y. and H. Yan. 2006. "Macroeconomic conditions, firm characteristics, and credit spreads." Journal of Financial Services Research 29, 177–210.
Tang, D. Y. and H. Yan. 2007. Liquidity and credit default swap spreads, Working paper.
Tang, C. M., A. Y. T. Leung, and K. C. Lam. 2006. "Entropy application to improve construction finance decisions." Journal of Construction Engineering and Management, October issue, 1099–1113.
Tansey, R., M. White, and R. Long. 1996. "A comparison of loglinear modeling and logistic regression in management research." Journal of Management 22(2), 339–358.
Tarjan, R. E. 1972. "Depth-first search and linear graph algorithms." SIAM Journal on Computing 1, 146–160.
Tauchen, G. E. and M. Pitts. 1983. "The price variability–volume relationship on speculative markets." Econometrica 51(2), 485–505.
Taurén, M. 1999. A model of corporate bond prices with dynamic capital structure, Working paper, Indiana University.
Taylor, S. J. 1986. Modelling financial time series, Wiley, Chichester.
Taylor, S. 1994. "Modeling stochastic volatility: a review and comparative study." Mathematical Finance 4, 183–204.
Taylor, S. J. and X. Xu. 1997. "The incremental volatility information in one million foreign exchange quotations." Journal of Empirical Finance 4, 317–340.
Ted Hong, C. H. 2005. Modeling fixed rate MBS prepayments, Beyondbond Inc, October.
Ted Hong, C. H. and M. Chang. 2006. Non-agency hybrid ARM prepayment model, Beyondbond Inc, July, 6–17.
Telser, L. G. 1958. "Futures trading and the storage of cotton and wheat." Journal of Political Economy 66, 233–255.
Tenenbaum, M. and H. Pollard. 1985. Ordinary differential equations, Dover Publications, New York.
ter Horst, E. 2003. A Lévy generalization of compound Poisson processes in finance: theory and applications, PhD thesis, Institute of Statistics and Decision Sciences.
Tesar, L. and I. Werner. 1994. "International equity transactions and U.S. portfolio choice," in The internationalization of equity markets, J. Frankel (Ed.). University of Chicago Press, Chicago, pp. 185–220.
Tesar, L. and I. Werner. 1995. "U.S. equity investment in emerging stock markets." World Bank Economic Review 9, 109–130.
Thanassoulis, E. 1985. "Selecting a suitable solution method for a multi objective programming capital budgeting problem." Journal of Business Finance & Accounting 12, 453–471.
The International Monetary Fund. 2001. World economic outlook, Washington, DC.
Theil, H. 1958. Economic forecasts and policy, North-Holland, Amsterdam.
Theil, H. 1966. Applied economic forecasting, North-Holland, Amsterdam.
Theodossiou, P. 1993. "Predicting shifts in the mean of a multivariate time series process: an application in predicting business failures." Journal of the American Statistical Association 88, 441–449.
Theodossiou, P. 1998. "Financial data and the skewed generalized t distribution." Management Science 44, 1650–1661.
Thomas, L., K. Jung, S. Thomas, and Y. Wu. 2006. "Modeling consumer acceptance probabilities." Expert Systems with Applications 30, 499–506.
Thompson, D. J. 1976. "Sources of systematic risk in common stocks." Journal of Business 49, 173–188.
Thompson, G. W. P. 2000. Fast narrow bounds on the value of Asian options, Working paper, Centre for Financial Research, Judge Institute of Management, University of Cambridge.
Thompson, S. 2004. "Identifying term structure volatility from the LIBOR-swap curve." Review of Financial Studies 21, 819–854.
Thompson, H. E. and W. K. Wong. 1991. "On the unavoidability of 'unscientific' judgment in estimating the cost of capital." Managerial and Decision Economics 12, 27–42.
Tian, Y. 1993. "A modified lattice approach to option pricing." Journal of Futures Markets 13, 563–577.
Tian, Y. 1999. "A flexible binomial option pricing model." Journal of Futures Markets 19, 817–843.
Tinic, S. 1972. "The economics of liquidity services." The Quarterly Journal of Economics 86(1), 79–93.
Tinic, S. and R. West. 1972. "Competition and the pricing of dealer services in the over-the-counter stock market." Journal of Financial and Quantitative Analysis 7, 1707–1727.
Tinic, S. and R. West. 1980. "The securities industry under negotiated brokerage commissions: changes in the structure and performance of New York Stock Exchange member firms." Bell Journal of Economics 11, 29–41.
Tirole, J. 1988. Theory of industrial organization, MIT Press, Cambridge, MA.
Titman, S. 1984. "The effect of capital structure on a firm's liquidation decision." Journal of Financial Economics 13, 137–151.
Titman, S. and W. Torous. 1989. "Valuing commercial mortgages: an empirical investigation of the contingent-claims approach to pricing risky debt." Journal of Finance 44, 345–373.
Titman, S. and R. Wessels. 1988. "The determinants of capital structure choice." Journal of Finance 43(1), 1–19.
Tkac, P. 1996. A trading volume benchmark: theory and evidence, Working Paper, University of Notre Dame.
Tobin, J. 1958. "Liquidity preference as behavior towards risk." Review of Economic Studies 25, 65–86.
Tobin, J. 1965. "The theory of portfolio selection," in The theory of interest rates, F. Hahn and F. Brechling (Eds.). MacMillan, London, pp. 3–51.
Tompkins, R. 2001. "Implied volatility surfaces: uncovering regularities for options on financial futures." The European Journal of Finance 7, 198–230.
Topkis, D. 1978. "Minimizing a submodular function on a lattice." Operations Research 26, 305–321.
Topkis, D. 1979. "Equilibrium points in non-zero sum n-person submodular games." SIAM Journal on Control and Optimization 17, 773–787.
Topper, J. 2005. Financial engineering with finite elements, Wiley, New York.
Travlos, N. 1986. "Corporate takeover bids, method of payment, and bidding firms' stock returns." Journal of Finance 42, 943–963.
Treacy, W. F. and M. S. Carey. 2000. "Credit risk rating systems at large US banks." Journal of Banking & Finance 24(1–2), 167–201.
Trennepohl, G. 1981. "A comparison of listed option premium and Black-Scholes model prices: 1973–1979." Journal of Financial Research 4, 11–20.
Treynor, J. L. 1965. "How to rate management of investment funds." Harvard Business Review 43, 63–75.
Treynor, J. and F. Black. 1973. "How to use security analysis to improve portfolio selection." Journal of Business 46, 66–86.
Triantis, A. and A. Borison. 2001. "Real options: state of the practice." Journal of Applied Corporate Finance 14(2), 8–24.
Trigeorgis, L. 1993. "The nature of option interactions and the valuation of investments with multiple real options." Journal of Financial and Quantitative Analysis 28(1), 1–20.
Trolle, A. B. and E. S. Schwartz. 2009. "A general stochastic volatility model for the pricing of interest rate derivatives." Review of Financial Studies 22(5), 2007–2057.
Tsai, C. C.-L. and G. E. Willmot. 2002. "A generalized defective renewal equation for the surplus process perturbed by diffusion." Insurance: Mathematics and Economics 30, 51–66.
Tsurumi, H. and H. Shimizu. 2008. "Bayesian inference of the regression model with error terms following the exponential power distribution and unbiasedness of the LAD estimator." Communications in Statistics 37(1), 221–246.
Tucker, A., D. Peterson, and E. Scott. 1988. "Tests of the Black-Scholes and constant elasticity of variance currency call option valuation models." Journal of Financial Research 11(3), 201–214.
Tuckman, B. 2002. Fixed income securities: tools for today's markets, 2nd ed., Wiley, Hoboken, NJ.
Tufano, P. 1996a. "Who manages risk? An empirical examination of risk-management practices in the gold mining industry." Journal of Finance 51, 1097–1137.
Tufano, P. 1996b. "How financial engineering can advance corporate strategy." Harvard Business Review 74, 136–140.
Tufano, P. 1998a. "The determinants of stock price exposure: financial engineering and the gold mining industry." Journal of Finance 53, 1015–1052.
Tufano, P. 1998b. "Agency costs of corporate risk management." Financial Management 27, 67–77.
Turnbull, S. M. and L. M. Wakeman. 1991. "A quick algorithm for pricing European average options." Journal of Financial and Quantitative Analysis 26, 377–389.
Turney, S. T. 1990. "Deterministic and stochastic dynamic adjustment of capital investment budgets." Mathematical Computation & Modeling 13, 1–9.
Uhlenbeck, G. E. and L. S. Ornstein. 1930. "On the theory of Brownian motion." Physical Review 36, 823–841.
Unal, H. and E. J. Kane. 1988. "Two approaches to assessing the interest rate sensitivity of deposit institution equity returns." Research in Finance 7, 113–137.
Value Line. Mutual fund survey, Value Line, New York, 19.
Van Horne, J. C. 1985. Financial management and policy, 6th ed., Prentice-Hall, Englewood Cliffs, NJ.
Van Horne, J. C. 1989. Financial management and policy, Prentice-Hall, Englewood Cliffs, NJ.
Van Horne, J. C. 1998. Financial markets rates and flows, Prentice-Hall, Upper Saddle River, NJ.
Vapnik, V. 1995. The nature of statistical learning theory, Springer, New York.
Vargas, M. 2006. "Fondos de inversión españoles: análisis empírico de eficiencia y persistencia en la gestión." PhD Dissertation, University of Zaragoza, Spain.
Varian, H. 1985. "Divergence of opinion in complete markets: a note." Journal of Finance 40, 309–317.
Vasicek, O. A. 1973. "A note on using cross-sectional information in Bayesian estimation of security betas." Journal of Finance 28, 1233–1239.
Vasicek, O. A. 1977. "An equilibrium characterization of the term structure." Journal of Financial Economics 5, 177–188.
Vassalou, M. and Y. Xing. 2004. "Default risk in equity returns." Journal of Finance 59, 831–868.
Vecer, J. 2001. "A new PDE approach for pricing arithmetic average Asian options." The Journal of Computational Finance 4, 105–113.
Verbrugge, J. A. and J. S. Jahera. 1981. "Expense-preference behavior in the savings and loan industry." Journal of Money, Credit, and Banking 13, 465–476.
Viceira, L. 2001. "Optimal portfolio choice for long-horizon investors with nontradable labor income." Journal of Finance 56, 433–470.
Vijh, A. M. 1988. "Potential biases from using only trade prices of related securities on different exchanges." Journal of Finance 43, 1049–1055.
Villani, G. 2008. "An R&D investment game under uncertainty in real option analysis." Computational Economics 32(1/2), 199–219.
Ville, J. A. 1939. Étude critique de la notion de collectif, Gauthier-Villars, Paris. This differs from Ville's dissertation, which was defended in March 1939, only in that a 17-page introductory chapter replaces the dissertation's one-page introduction. For a translation into English of the passages where Ville spells out an example of a sequence that satisfies von Mises's conditions but can still be beat by a gambler who varies the amount he bets, see http://www.probabilityandfinance.com/misc/ville1939.pdf.
Ville, J. A. 1955. Notice sur les travaux scientifiques de M. Jean Ville. Prepared by Ville when he was a candidate for a position at the University of Paris and archived by the Institut Henri Poincaré in Paris.
Vicente Lorente, J. D. 2001. "Specificity and opacity as resource-based determinants of capital structure: evidence for Spanish manufacturing firms." Strategic Management Journal 22(2), 57–77.
Vishwasrao, S., S. Gupta, and H. Benchekroun. 2007. "Optimum tariffs and patent length in a model of north-south technology transfer." International Review of Economics and Finance 16, 1–14.
von Neumann, J. and O. Morgenstern. 1944. Theory of games and economic behavior, Princeton University Press, Princeton.
von Neumann, J. and O. Morgenstern. 1947. Theory of games and economic behavior, 2nd Edition, Princeton University Press, Princeton, NJ.
Voth, H.-J. and P. Temin. 2003. "Riding the South Sea Bubble," MIT Economics Working Paper No. 04–02.
Vovk, V. 2004. Non-asymptotic calibration and resolution, Working Paper 13, November 2004, http://www.probabilityandfinance.com, revised July 2006.
Vovk, V. 2005a. Competing with wild prediction rules, Working Paper 16, December 2005, http://www.probabilityandfinance.com, revised January 2006.
Vovk, V. 2005b. Competitive on-line learning with a convex loss function, Working Paper 14, May 2005, http://www.probabilityandfinance.com, revised September 2005.
Vovk, V. 2005c. On-line regression competitive with reproducing kernel Hilbert spaces, Working Paper 11, November 2005, http://www.probabilityandfinance.com, revised January 2006.
Vovk, V. 2006. Predictions as statements and decisions, Working Paper 17, http://www.probabilityandfinance.com.
Vovk, V. 2007a. Continuous and randomized defensive forecasting: unified view, Working Paper 21, http://www.probabilityandfinance.com.
Vovk, V. 2007b. Defensive forecasting for optimal prediction with expert advice, Working Paper 20, http://www.probabilityandfinance.com.
Vovk, V. 2007c. Leading strategies in competitive on-line prediction, Working Paper 18, http://www.probabilityandfinance.com.
Vovk, V. and G. Shafer. 2003. A game-theoretic explanation of the √dt effect, Working Paper 5, January, http://www.probabilityandfinance.com.
Vovk, V. and G. Shafer. 2008. "The game-theoretic capital asset pricing model." International Journal of Approximate Reasoning 49, 175–197; see also Working Paper 1 at http://www.probabilityandfinance.com.
Vovk, V., A. Takemura, and G. Shafer. 2004. Defensive forecasting, Working Paper 8, September 2004, http://www.probabilityandfinance.com, revised January 2005. An abridged version appeared in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, www.gatsby.ucl.ac.uk/aistats/.
Vovk, V., A. Gammerman, and G. Shafer. 2005a. Algorithmic learning in a random world, Springer, New York.
Vovk, V., I. Nouretdinov, A. Takemura, and G. Shafer. 2005b. Defensive forecasting for linear protocols, Working Paper 10, http://www.probabilityandfinance.com. An abridged version appeared in Algorithmic Learning Theory: Proceedings of the 16th International Conference, ALT 2005, Singapore, October 8–11, 2005, edited by Sanjay Jain, Hans Ulrich Simon, and Etsuji Tomita, http://www-alg.ist.hokudai.ac.jp/thomas/ALT05/alt05.jhtml, on pp. 459–473 of Lecture Notes in Computer Science, Vol. 3734, Springer, New York, 2005.
Wachter, J. 2002. "Portfolio and consumption decisions under mean-reverting returns: an exact solution for complete markets." Journal of Financial and Quantitative Analysis 37, 63–91.
Wachter, J. A. 2006. "A consumption-based model of the term structure of interest rates." Journal of Financial Economics 79, 365–399.
Wackerly, D., W. Mendenhall, and R. L. Scheaffer. 2007. Mathematical statistics with applications, 7th Edition, Duxbury Press, California.
Walker, I. E. 1992. Buying a company in trouble: a practical guide, Gower, Hants.
Wall, L. D. and J. J. Pringle. 1989. "Alternative explanations of interest rate swaps: a theoretical and empirical analysis." Financial Management 18, 59–73.
Wallace, H. A. 1926. "Comparative farmland values in Iowa." Journal of Land and Public Utility Economics 2, 385–392.
Wang, Z. 1998. "Efficiency loss and constraints on portfolio holdings." Journal of Financial Economics 48, 359–375.
Wang, W. 2006. "Loss severity measurement and analysis." The MarketPulse, LoanPerformance, Issue 1, 2–19.
Wang, P. and T. Moore. 2009. "Sudden changes in volatility: the case of five central European stock markets." Journal of International Financial Markets, Institutions and Money 19(1), 33–46.
Wang, P. and M. Theobald. 2008. "Regime-switching volatility of six east Asian emerging markets." Research in International Business and Finance 22(3), 267–283.
Wang, S. S. and V. R. Young. 1998. "Ordering risks: expected utility versus Yaari's dual theory of risk." Insurance: Mathematics and Economics 22, 145–161.
Wang, J. and E. Zivot. 2000. "A Bayesian time series model of multiple structural changes in level, trend and variance." Journal of Business and Economic Statistics 18, 374–386.
Wang, P., P. J. Wang, and A. Y. Liu. 2005. "Stock return volatility and trading volume: evidence from the Chinese stock market." Journal of Chinese Economic and Business Studies 1, 39–54.
Wang, S. S., V. R. Young, and H. H. Panjer. 1997. "Axiomatic characterization of insurance prices." Insurance: Mathematics and Economics 21, 173–183.
Wang, K.-L., C. Fawson, C. B. Barrett and J. B. McDonald. 2001. "A flexible parametric GARCH model with an application to exchange rates." Journal of Applied Econometrics 16, 521–536.
Warner, J. B. 1977. "Bankruptcy costs: some evidence." Journal of Finance 32, 337–347.
Watson, M. 1994. "Vector autoregressions and cointegration," in Handbook of econometrics, Vol. IV, R. F. Engle and D. McFadden (Eds.). Elsevier, Amsterdam.
Watson, R. D. 2008. "Subprime mortgages, market impact, and safety nets: the good, the bad, and the ugly." Review of Pacific Basin Financial Markets and Policies 11, 465–492.
Watson, D. J. H. and J. V. Baumler. 1975. "Transfer pricing: a behavioral context." The Accounting Review 50, 466–474.
Wedel, M. and W. Kamakura. 1998. Market segmentation: conceptual and methodological foundations, Kluwer Academic Publishers, Boston, MA.
Wei, K. C. J. 1988. "An asset-pricing theory unifying the CAPM and APT." Journal of Finance 43(4), 881–892.
Wei, D. G. and D. Guo. 1997. "Pricing risky debt: an empirical comparison of Longstaff and Schwartz and Merton models." Journal of Fixed Income 7, 8–28.
Wei, Z., F. Xie, and S. Zhang. 2005. "Ownership structure and firm value in China's privatized firms: 1991–2001." Journal of Financial and Quantitative Analysis 40, 87.
Weingartner, H. M. 1963. Mathematical programming and the analysis of capital budgeting problems, Prentice-Hall, Englewood Cliffs, NJ.
Weinstein, M. I. 1983. "Bond systematic risk and the option pricing model." Journal of Finance 38(5), 1415–1429.
Weinstein, M. 1985. "The equity component of corporate bonds." Journal of Portfolio Management 11(3), 37–41.
Weiss, L. A. 1990. "Bankruptcy resolution: direct costs and violation of priority of claims." Journal of Financial Economics 27, 285–314.
Welch, W. 1982. Strategies for put and call option trading, Winthrop, Cambridge, MA.
Wells, E. and S. Harshbarger. 1997. Microsoft Excel 97 Developer's Handbook, Microsoft Press, Redmond.
Wermers, R. 2000. "Mutual fund performance: an empirical decomposition into stock-picking talent, style, transaction costs, and expenses." Journal of Finance 55, 1655–1695.
West, K. 1988. "Bubbles, fads, and stock price volatility tests: a partial evaluation." Journal of Finance 43, 639–655.
West, K. D. 2001. "Encompassing tests when no model is encompassing." Journal of Econometrics 105, 287–308.
West, K. and D. Cho. 1995. "The predictive ability of several models of exchange rate volatility." Journal of Econometrics 69, 367–391.
West, M. and J. Harrison. 1999. Bayesian forecasting and dynamic models, Springer, Berlin.
West, P. M., P. L. Brockett, and L. L. Golden. 1997. "A comparative analysis of neural networks and statistical methods for predicting consumer choice." Marketing Science 16(4), 370–391.
Westgaard, S. and N. Wijst. 2001. "Default probabilities in a corporate bank portfolio: a logistic model approach." European Journal of Operational Research 135(2), 338–349.
Westin, R. 1973. "Predictions from binary choice models." Discussion Paper No. 37, Northwestern University, Evanston, IL.
Weston, J. F. 1981. "Developments in finance theory." Financial Management 10 (Tenth Anniversary Issue), 5–22.
Weston, J. F. and T. E. Copeland. 1986. Managerial finance, 8th ed., Dryden Press, Hinsdale.
Whalen, G. 1991. "A proportional hazard model of bank failure: an examination of its usefulness as an early warning tool." Economic Review, First Quarter, 20–31.
Whaley, R. E. 1981. "On the valuation of American call options on stocks with known dividends." Journal of Financial Economics 9, 207–211.
Whaley, R. E. 1982. "Valuation of American call options on dividend-paying stocks: empirical tests." Journal of Financial Economics 10, 29–58.
Whaley, R. E. 1986. "Valuation of American futures options: theory and empirical tests." Journal of Finance 41, 127–150.
White, H. 1980. "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity." Econometrica 48(4), 817–838.
White, E. 1990. "The stock market boom and crash of 1929 revisited." Journal of Economic Perspectives 4, 67–84.
White, H. 2000. "A reality check for data snooping." Econometrica 68, 1097–1126.
Whitmore, G. A. and M. C. Findlay (Eds.). 1978. Stochastic dominance: an approach to decision-making under risk, D.C. Heath, Lexington, MA.
Whittington, G. 1980. "Some basic properties of accounting ratios." Journal of Business Finance & Accounting 7(2), 219–232.
Wichern, D. W., R. B. Miller, and D. A. Hsu. 1976. "Changes of variance in first-order autoregressive time series models – with an application." Journal of the Royal Statistical Society, Series C, 25(3), 248–256.
Widdicks, M., A. D. Andricopoulos, D. P. Newton, and P. W. Duck. 2002. "On the enhanced convergence of standard lattice methods for option pricing." Journal of Futures Markets 22, 315–338.
Wiggins, J. B. 1987. "Option values under stochastic volatility: theory and empirical evidence." Journal of Financial Economics 19(2), 351–372.
Wiggins, J. 1990. "The relation between risk and optimal debt maturity and the value of leverage." Journal of Financial and Quantitative Analysis 25, 377–386.
Wiggins, J. 1991. "Empirical tests of the bias and efficiency of the extreme-value variance estimator for common stocks." Journal of Business 64, 417–432.
Wiginton, J. C. 1980. "A note on the comparison of logit and discriminant models of consumer credit behavior." The Journal of Financial and Quantitative Analysis 15(3), 757–770.
Wilks, S. S. 1963. Mathematical statistics, 2nd Printing, Wiley, New York.
Williams, J. 1938. Theory of investment value, North-Holland, Amsterdam.
Williamson, O. 1988. "Corporate finance and corporate governance." Journal of Finance 43(3), 567–591.
Wilmott, P. 2000. Paul Wilmott on quantitative finance, Wiley, New York.
Wilmott, P. 2006. Paul Wilmott on quantitative finance, Wiley, New York, pp. 1213–1215.
Wilson, T. 1994. "Debunking the myths." Risk 4, 68–73.
Wilson, R. L. and R. Sharda. 1994. "Bankruptcy prediction using neural networks." Decision Support Systems 11, 545–557.
Wilson, B., R. Aggarwal, and C. Inclan. 1996. "Detecting volatility changes across the oil sector." Journal of Futures Markets 16(3), 313–330.
Wirl, F. 2008. "Optimal maintenance and scrapping versus the value of back ups." Computational Management Science 5(4), 379–392.
Wishart, J. 1947. "The cumulants of the Z and of the logarithmic χ² and t distributions." Biometrika 34, 170–178.
Witten, I. H. and E. Frank. 2000. Data mining: practical machine learning tools and techniques with Java implementations, Morgan Kaufmann Publishers, San Francisco, CA.
Wolf, R. 2000. "Is there a place for balance-sheet ratios in the management of interest rate risk?" Bank Accounting and Finance 14(1), 21–26.
Womack, K. 1996. "Do brokerage analysts' recommendations have investment value?" Journal of Finance 51, 137–167.
Wong, W. K. 2007. "Stochastic dominance and mean-variance measures of profit and loss for business planning and investment." European Journal of Operational Research 182, 829–843.
Wong, W. K. and R. Chan. 2004. "The estimation of the cost of capital and its reliability." Quantitative Finance 4(3), 365–372.
Wong, W. K. and R. Chan. 2008. "Markowitz and prospect stochastic dominances." Annals of Finance 4(1), 105–129.
Wong, W. K. and C. K. Li. 1999. "A note on convex stochastic dominance theory." Economics Letters 62, 293–300.
Wong, W. K. and C. Ma. 2008. "Preferences over location-scale family." Economic Theory 37(1), 119–146.
Wong, W. K., B. K. Chew, and D. Sikorski. 2001. "Can P/E ratio and bond yield be used to beat stock markets?" Multinational Finance Journal 5(1), 59–86.
Wong, W. K., M. Manzur, and B. K. Chew. 2003. "How rewarding is technical analysis? Evidence from Singapore stock market." Applied Financial Economics 13(7), 543–551.
Wong, W. K., H. E. Thompson, S. Wei, and Y. F. Chow. 2006. "Do winners perform better than losers? A stochastic dominance approach." Advances in Quantitative Analysis of Finance and Accounting 4, 219–254.
Wong, W. K., K. F. Phoon, and H. H. Lean. 2008. "Stochastic dominance analysis of Asian hedge funds." Pacific-Basin Finance Journal 16(3), 204–223.
Wood, D. and J. Piesse. 1988. "The information value of failure predictions in credit assessment." Journal of Banking and Finance 12, 275–292.
Wooldridge, J. M. 2002. Econometric analysis of cross section and panel data, MIT Press, Cambridge, MA.
Woolridge, J. R. 1988. "Competitive decline and corporate restructuring: is a myopic stock market to blame?" Journal of Applied Corporate Finance 1, 26–36.
Working, H. 1948. "Theory of the inverse carrying charge in futures markets." Journal of Farm Economics 30, 1–28.
Working, H. 1949. "The theory of the price of storage." American Economic Review 39, 1254–1262.
Working, H. 1953. "Hedging reconsidered." Journal of Farm Economics 35, 544–561.
Wruck, K. H. 1989. "Equity ownership concentration and firm value: evidence from private equity financings." Journal of Financial Economics 23, 3–28.
Wruck, K. H. 1990. "Financial distress, reorganization, and organizational efficiency." Journal of Financial Economics 27, 419–444.
Wu, D.-M. 1973. "Alternative tests of independence between stochastic regressors and disturbances." Econometrica 41, 733–750.
Wu, H.-C. 2004. "Pricing European options based on the fuzzy pattern of Black–Scholes formula." Computers & Operations Research 31, 1069–1081.
Wu, Y. L. 2004. "The choice of equity-selling mechanisms." Journal of Financial Economics 74, 93–119.
Wu, T. 2006. "Macro factors and the affine term structure of interest rates." Journal of Money, Credit, and Banking 38, 1847–1875.
Wu, Z. and N. E. Huang. 2004. "A study of characteristics of white noise using the empirical mode decomposition method." Proceedings of the Royal Society of London 460, 1597–1611.
Wu, W. and G. Shafer. 2007. Testing lead-lag effects under game-theoretic efficient market hypotheses, Working Paper 23, http://www.probabilityandfinance.com.
Wu, C. and X. M. Wang. 2000. "A neural network approach for analyzing small business lending decisions." Review of Quantitative Finance and Accounting 15(3), 259–276.
Wu, S. and Y. Zeng. 2004. "Affine regime-switching models for interest rate term structure," in Mathematics of finance, G. Yin and Q. Zhang (Eds.). American Mathematical Society Series – Contemporary Mathematics, Vol. 351, pp. 375–386.
Wu, S. and Y. Zeng. 2005. "A general equilibrium model of the term structure of interest rates under regime-switching risk." International Journal of Theoretical and Applied Finance 8, 839–869.
Wu, S. and Y. Zeng. 2006. "The term structure of interest rates under regime shifts and jumps." Economics Letters 93, 215–221.
Wu, S. and Y. Zeng. 2007. "An exact solution of the term structure of interest rate under regime-switching risk," in Hidden Markov models in finance, R. S. Mamon and R. J. Elliott (Eds.). International Series in Operations Research and Management Science, Springer, New York, NY, USA.
Wu, J.-W., W.-L. Hung, and H.-M. Lee. 2000. "Some moments and limit behaviors of the generalized logistic distribution with applications." Proceedings of the National Science Council, ROC 24, 7–14.
Wurgler, J. and E. Zhuravskaya. 2002. "Does arbitrage flatten demand curves for stocks?" The Journal of Business 75, 583–609.
Xia, Y. 2001. "Learning about predictability: the effect of parameter uncertainty on dynamic asset allocation." The Journal of Finance 56, 205–246.
Xu, X. and S. J. Taylor. 1994. "The term structure of volatility implied by foreign exchange options." Journal of Financial and Quantitative Analysis 29, 57–74.
Xu, X. and S. J. Taylor. 1995. "Conditional volatility and the informational efficiency of the PHLX currency options markets." Journal of Banking & Finance 19, 803–821.
Yaari, M. E. 1987. "The dual theory of choice under risk." Econometrica 55, 95–115.
Yan, A. 2006. "Leasing and debt financing: substitutes or complements?" Journal of Financial and Quantitative Analysis 41, 709.
Yang, T.-C. 2005. The pricing of credit risk of portfolios based on listed corporations in the Taiwan market, Master's Thesis, National Tsing Hua University, Taiwan.
Yang, D. and Q. Zhang. 2000. "Drift-independent volatility estimation based on high, low, open, and closing prices." Journal of Business 73, 477–491.
Yang, C. W., K. Hung, and J. A. Fox. 2006. "The Le Châtelier principle of the capital market equilibrium," in Encyclopedia of finance, C. F. Lee and A. C. Lee (Eds.). Springer, New York, pp. 724–728.
Yawitz, J. and J. Anderson. 1977. "The effect of bond refunding on shareholder wealth." Journal of Finance 32(5), 1738–1746.
Yawitz, J. B., K. J. Maloney, and L. H. Ederington. 1985. "Taxes, default risk, and yield spreads." Journal of Finance 40, 1127–1140.
Yee, K. 2002. "Judicial valuation and the rise of DCF." Public Fund Digest 76, 76–84.
Yee, K. 2004a. "Perspectives: combining value estimates to increase accuracy." Financial Analysts Journal 60(4), 23–28.
Yee, K. 2004b. "Forward versus trailing earnings in equity valuation." Review of Accounting Studies 9(2–3), 301–329.
Yee, K. 2005a. "Aggregation, dividend irrelevancy, and earnings-value relations." Contemporary Accounting Research 22(2), 453–480.
Yee, K. 2005b. "Control premiums, minority discounts, and optimal judicial valuation." Journal of Law and Economics 48(2), 517–548.
Yee, K. 2007. "Using accounting information for consumption planning and equity valuation." Review of Accounting Studies 12(2–3), 227–256.
Yee, K. 2008a. "A Bayesian framework for combining value estimates." Review of Quantitative Finance and Accounting 30(3), 339–354.
Yee, K. 2008b. "Deep-value investing, fundamental risks, and the margin of safety." Journal of Investing 17(3), 35–46.
Yee, K. 2008c. "Dueling experts and imperfect verification." International Review of Law and Economics 28(4), 246–255.
Yin, G. G. and Q. Zhang. 1998. Continuous-time Markov chains and applications: a singular perturbation approach, Springer, Berlin.
Yoo, Y. 2006. "The valuation accuracy of equity valuation using a combination of multiples." Review of Accounting and Finance 5(2), 108–123.
Yoshihara, K. 1995. "The Bahadur representation of sample quantiles for sequences of strongly mixing random variables." Statistics and Probability Letters 24, 299–304.
Yu, P. L. 1985. Multiple criteria decision making: concepts, techniques and extensions, Plenum, New York.
Yu, F. 2002. "Modeling expected return on defaultable bonds." Journal of Fixed Income 12, 69–81.
Yu, F. 2003. Correlated defaults in reduced-form models, Working paper, University of California, Irvine.
Yule, G. U. 1926. "Why do we sometimes get nonsense correlations between time series? A study in sampling and the nature of time series." Journal of the Royal Statistical Society 89, 1–64.
Yunker, P. J. 1983. "Survey study of subsidiary autonomy, performance evaluation and transfer pricing in multinational corporations." Columbia Journal of World Business 19, 51–63.
Zadeh, L. A. 1965. "Fuzzy sets." Information and Control 8, 338–353.
Zadeh, L. A. 1978. "Fuzzy sets as a basis for a theory of possibility." Fuzzy Sets and Systems 1, 3–28.
Zahavi, J. and N. Levin. 1997. "Applying neural computing to target marketing." Journal of Direct Marketing 11(4), 76–93.
Zajdenweber, D. 1998. The valuation of catastrophe-reinsurance-linked securities, American Risk and Insurance Association meeting, Conference Paper.
Zantout, Z. Z. 1997. "A test of the debt-monitoring hypothesis: the case of corporate R&D expenditures." Financial Review 32(1), 21–48.
Zantout, Z. and G. Tsetsekos. 1994. "The wealth effects of announcements of R&D expenditure increase." Journal of Financial Research 17, 205–216.
Zarembka, P. 1968. "Functional form in the demand for money." Journal of the American Statistical Association 63, 502–511.
Zavgren, C. V. 1985. "Assessing the vulnerability to failure of American industrial firms: a logistic analysis." Journal of Business Finance & Accounting 12(1), 19–45.
Zavgren, C. V. 1988. "The association between probabilities of bankruptcy and market responses – a test of market anticipation." Journal of Business Finance & Accounting 15(1), 27–45.
Zeleny, M. 1974. Linear multiobjective programming, Springer, Berlin.
Zellner, A. 1962. "An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias." Journal of the American Statistical Association 57, 348–368.
Zellner, A. and V. Chetty. 1965. "Prediction and decision problems in regression models from the Bayesian point of view." Journal of the American Statistical Association 60, 608–616.
Zhang, P. 1995. "Flexible arithmetic Asian options." Journal of Derivatives 2, 53–63.
Zhang, P. G. 1998. Exotic options: a guide to second generation options, 2nd Edition, World Scientific Publishing, Singapore.
Zhang, J. E. 2001. "A semi-analytical method for pricing and hedging continuously sampled arithmetic average rate options." Journal of Computational Finance 5, 59–79.
Zhang, J. E. 2003. "Pricing continuously sampled Asian options with perturbation method." Journal of Futures Markets 23(6), 535–560.
Zhang, D. 2004. "Why do IPO underwriters allocate extra shares when they expect to buy them back?" Journal of Financial and Quantitative Analysis 39, 571–594.
Zhang, Y., H. Zhou, and H. Zhu. 2005. "Explaining credit default swap spreads with equity volatility and jump risks of individual firms." Bank for International Settlements.
Zhang, B. Y., H. Zhou, and H. Zhu. In press. "Explaining credit default swap spreads with equity volatility and jump risks of individual firms." Review of Financial Studies.
Zhao, Y. and W. T. Ziemba. 2000. "A dynamic asset allocation model with downside risk control." Journal of Risk 3(1), 91–113.
Zhou, C. 1997. "A jump-diffusion approach to modeling credit risk and valuing defaultable securities." Federal Reserve Board.
Zhou, C. 2001. "The term structure of credit spreads with jump risk." Journal of Banking and Finance 25, 2015–2040.
Zhou, X. 2006. Information-based trading in the junk bond market, Working paper, Cornell University.
Zhu, H. 2006. "An empirical comparison of credit spreads between the bond market and the credit default swap market." Journal of Financial Services Research 29, 211–235.
Zhu, Y. and R. C. Kavee. 1988. "Performance of portfolio insurance strategies." Journal of Portfolio Management 14(3), 48–54.
Zhu, A., M. Ash, and R. Pollin. 2002. "Stock market liquidity and economic growth: a critical appraisal of the Levine/Zervos model." International Review of Applied Economics 18(1), 1–8.
Zitzewitz, E. 2000. "Arbitrage proofing mutual funds." The Journal of Law, Economics, and Organization 19(2), 245–280.
Zmijewski, M. E. 1984. "Methodological issues related to the estimation of financial distress prediction models." Journal of Accounting Research 22(Suppl.), 59–82.
Zwiebel, J. 1996. "Dynamic capital structure under managerial entrenchment." American Economic Review 86(5), 1197–1215.
Accounting Today. 2001. February 12–25, 15(3), 8.
Author Index
A Aadland, R., 1042 Abdel-khalik, A.R., 1307 Abraham, J.M., 1206 Abramowitz, M., 1057, 1059, 1515, 1521, 1623 Acharya, V.V., 667, 942, 988, 1000, 1001 Ackert, L., 137, 140 Acs, Z.J., 829–831, 839, 840 Adams, J., 873 Adenstedt, R.K., 1410 Admati, A.R., 1173 Afonso, A., 641, 659 Agarwal, V., 1597 Aggarwal, R., 843, 1349, 1351 Agrawal, D., 884, 885, 991, 995 Agrawal, V., 807 Aguilar, O., 1109 Aharony, J., 1406 Ahn, C.M., 984, 1121, 1128 Ahn, D.-H., 587, 717, 718, 983, 985–987, 1121, 1128, 1490 Aigner, D.J., 1093 Aitken, M., 1524 Ait-Sahalia, Y., 532, 985–987 Akgiray, V., 1333 Akhter, S.H., 1308 Alam, P., 1308 Al-Eryani, M.F., 1308 Alex, F., 1524 Alexander, G.J., 73, 265 Alexe, G., 641, 642 Alexe, S., 641 Alizadeh, S., 1109, 1118, 1273, 1274, 1276, 1279 Allayannis, G., 679 Allen, F., 1223, 1608 Allen, L., 138, 139, 671, 759, 819–822, 824, 825 Altiok, T., 1136 Altman, E.I., 193, 649, 665, 807, 965, 1061, 1270, 1327, 1328, 1596–1598, 1603 Ambarish, R., 1400, 1401 Amemiya, T., 1203, 1597–1600 Amihud, Y., 293, 294, 340, 342, 686, 1093 Amin, K.I., 533, 547, 587, 1191 Amram, M., 375, 392, 418 Ananthanarayanan, A.L., 458 Anas, A., 1206 Andersen, L., 703, 726, 727
Andersen, T.G., 526, 548, 715, 725, 737, 904, 985, 987, 1174, 1278, 1334, 1335, 1338, 1341, 1344 Anderson, D.W., 988 Anderson, J., 1036 Anderson, R., 668, 941, 949 Andricopoulos, A.D., 506 Ang, A., 185, 984, 985, 1122 Ang, J.S., 38, 137–163, 1615 Angbazo, L., 686 Angel, J., 342 Angrist, J.D., 1365, 1366 Ansel, J.P., 465 Aoki, M., 830 Applebaum, D., 1569 Aragó, V., 1174 Arai, T., 1568 Archer, S.H., 79, 83 Arditti, F.D., 38–40, 1224 Arnold, L., 451 Arnott, R.D., 185, 193, 305–307 Arora, N., 668, 670 Arzac, E., 1223 Asai, K., 1197 Ash, M., 965 Ashizawa, R., 1374 Asmussen, S., 47, 1445, 1446, 1448, 1451, 1454, 1457, 1461 Asquith, P., 1383, 1406 Atiya, A., 965 Atkinson, A., 966 Atkinson, A.C., 880, 965, 966 Aumann, R., 1382 Ausin, M.C., 1376, 1378 Avellaneda, M., 1568 Avery, R.B., 1598 Avram, F., 47, 1445, 1446, 1448, 1451, 1454, 1457, 1461 Avramov, D., 192 B Baba, Y., 1283–1285 Babbs, S.H., 1483, 1490 Bachelier, L., 1055, 1615 Back, K., 347, 1017 Backus, D., 986, 1490 Bae, K., 221 Bagehot, W., 340 Bai, J., 1347, 1348 Bailey, A.D., 1307
Baillie, R.T., 1175, 1394, 1395, 1420 Bajari, P., 1206 Baker, G.P., 139, 143 Bakshi, C., 524 Bakshi, G.S., 533, 537, 547–571, 575, 987 Balakrishnan, N., 854 Balakrishnan, S., 831 Balcaena, S., 1597, 1598 Balduzzi, P., 986, 987, 1490 Bali, T.G., 985 Ball, C.A., 725, 737, 985, 1273, 1607 Ballie, R., 1607 Bank, P., 1007, 1008 Bansal, R., 717, 984, 1121, 1126, 1128 Bantwal, V.J., 756, 757 Banz, R.W., 109, 378, 1091, 1093 Bao, J., 670 Baourakis, G., 965 Barber, B.M., 1235, 1236, 1244 Barberis, N., 204, 289, 293, 301 Barclay, E.I., 1025 Barclay, M., 342 Barclay, M.J., 1028, 1153 Barle, S., 515, 520 Barles, G., 1007 Barndorff-Nielsen, O.E., 49, 515, 527, 528, 854, 873, 874, 876, 877, 1273, 1278, 1420, 1567–1570, 1572, 1574 Barnea, A., 27, 36, 1025, 1036 BarNiv, R., 1600 Barone-Adesi, G., 523, 552, 762, 1098, 1191 Barr, R.S., 643 Barraquand, J., 587 Barrett, B.E., 966 Barrett, C.B., 854, 855 Barry, D., 1347 Bartels, L.M., 1359, 1362 Bartter, B.J., 395, 409, 418, 505 Baruch, S., 341 Bas, J., 984 Baskin, J., 1036 Bassett, G., 836, 1411 Basu, S., 109 Bates, D., 526, 528, 532, 533, 547, 549, 550, 552, 555–558, 575, 1109 Bates, J.M., 42, 193, 199, 1476–1479 Baufays, P., 1347 Baum, D., 1007, 1008 Baumler, J.V., 1308 Baumol, W.J., 82, 1319 Bauwens, L., 1278, 1424 Bawa, V.S., 87, 192, 286 Baxter, M., 451 Bayer, P., 1206 Beaglehole, D.R., 754, 984, 1490 Beatty, R., 186 Beaver, W.H., 102, 117, 1227, 1596 Becher, D.A., 1427 Becherer, D., 1568 Beck, T., 226, 1007, 1367 Becker, B.E., 140, 144 Beckers, S., 1273, 1279, 1397, 1615 Beecher, A., 1598 Begley, J., 1328 Beja, A., 339, 606 Bekaert, G., 223, 226, 233, 984, 986, 1122, 1236, 1237, 1394 Beladi, H., xlv Belkaoui, A., 1098
Ben-Ameur, H., 584 Benchekroun, H., 1381 Beneish, M., 193 Ben-Horin, M., 68 Benkard, C.L., 1206 Benninga, S., 182, 619, 620, 623, 626 Bensoussan, A., 1574 Benston, G., 810 Benth, F.E., 1567–1575 Benveniste, L.M., 844 Benzoni, L., 715 Bera, A.K., 1333 Beranek, W., 1224 Bergantino, S., 802 Berger, A.N., 767, 770, 775, 1366, 1438 Berger, P.G., 863, 864, 869 Berk, J., 1223, 1237 Berkman, H., 676, 679 Berkowitz, J., 515, 525, 528 Bernard, R.H., 1319 Bernhard, R., 1224 Bernstein, P.L., 305–307, 342 Berndt, A., 996 Berry, M.J.A., 1324 Bertoin, J., 1447, 1569 Bertrand, P., 321, 324 Bertsekas, D., 91 Bessembinder, H., 342 Bessler, W., 342 Best, M.J., 205 Bey, R.P., 1098 Bhagat, S., 831 Bhamra, H.S., 670 Bharath, S., 669, 1602, 1603 Bhaskar, K.A., 1319 Bhatia, A.V., 649 Bhatia, M., 697, 699 Bhattacharya, M., 458 Bhattacharya, R. N., 401 Bhattacharya, S., 1400 Bhattacharya, U., 1236 Bhojraj, S., 186 Bhu, R., 802 Biagini, F., 1556 Biais, B., 339, 342, 1153 Bibby, B.M., 874 Bielecki, T.R., 671, 990, 1128, 1129 Bierman, H., Jr., 138 Bikbov, R., 715 Bingham, N.H., 874 Bishnoi, U., 802 Bismuth, J.M., 1556 Bitler, M.P., 1366 Bjerksund, P., 523, 524 Bjork, T., 448, 826, 1491, 1500 Black, F., 25–27, 79, 107, 188, 191, 199, 204, 267, 273, 274, 321, 324, 327, 355, 378, 385, 447, 454, 464, 465, 532, 576, 606, 611, 612, 665–667, 670, 728, 729, 763, 768, 809, 810, 910, 933, 934, 937, 985, 988, 1091, 1110, 1191, 1396, 1472, 1483, 1490, 1511, 1567, 1568, 1601, 1616, 1619 Blæsid, P., 877 Blair, B.J., 1333–1344 Blair, R.D., 682 Blalock, H.M., 1303
Blomquist, G., 1206 Blum, M., 1594 Blume, M.E., 102, 108, 117, 610, 884, 886 Boardman, C.M., 1319 Bodhurta, J., 375 Bodie, Z., 172 Bodnar, G.M., 679 Bodurtha, J.N., 465 Boe, W.J., 1307 Boehmer, E., 1153 Bognanno, M.L., 138, 140, 143 Bogue, M., 605 Bohl, M.T., 1174 Bohn, H., 1236, 1237, 1247 Bohn, J.R., 668, 670 Bohn, J., 1602 Bollerslev, T., 526, 853, 899, 904, 1109, 1175, 1180, 1181, 1273, 1278, 1279, 1290, 1333–1336, 1341, 1344, 1347–1349, 1409, 1418–1421, 1607 Bollwerslev, T., 1394, 1395 Bonates, T.O., 641 Bondarenko, O., 534 Bongaerts, D., 670 Bontemps, C., 1206 Bookstaber, R.M., 375, 392, 418 Booth, L., 1223, 1224 Booth, N.B., 1347 Borison, A., 1042 Borucki, L.S., 1465, 1467 Bose, I., 1324 Boser, B., 1269 Bossaerts, P., 290, 294 Botteron, P., 1041 Bouaziz, L., 585 Bouchet, M.H., 664 Boudoukh, J., 292, 587, 986, 987 Boudreaux, K.J., 1224 Bourke, P., 648 Bouye, E., 698 Bowden, R.J., 1206 Bower, D.H., 1465, 1467 Bower, R.S., 1465, 1467 Bowers, N.L., 958 Bowman, R.G., 68 Box, G.E.P., 1205, 1347, 1525–1527 Boyarchenko, S.I., 1446 Boyce, W.M., 1027 Boyd, J.H., 885 Boyle, P., 402, 405, 584 Bradbury, M.E., 676, 679 Bradley, M., 831, 836 Brailsford, T.J., 1333 Brancheau, J.C., 1316 Brander, J., 1381, 1399, 1403 Brandimarte, P., 182 Brandt, A., 1129 Brandt, M.W., 471, 718, 1109, 1118, 1273, 1274, 1276–1279, 1490 Branger, N., 1568 Bratley, P., 1137, 1139 Brav, A., 289 Brealey, R., 464, 1123 Brealey, R.A., 1593, 1604 Breeden, D.T., 283, 518, 737 Breen, W., 1068, 1075 Breidt, F.J., 1409, 1410
1687 Breiman, L., 1617 Brennan, M.J., 204, 289–318, 534, 667, 769, 778, 915, 916, 921, 937, 987, 1041, 1191, 1236, 1237, 1247, 1515 Brennan, S., 1000 Brenner, R., 737 Breton, M., 584 Brewer, E., 678, 686 Brewer, T.L., 648, 659 Brick, I.E., 1025–1027, 1223–1232 Bris, A., 1524, 1530 Britten-Jones, M., 458 Briys, E., 585, 666, 934, 988 Broadie, M., 505, 506, 667, 942, 947, 948, 1168, 1170 Brock, W.A., 447, 885 Brockett, P.L., 1065 Brockwell, P.J., 1410 Broll, U., 1179 Brooks, C., 853, 854 Brooks, R., 854 Brorsen, B., 853 Brotherton-Ratcliffe, R., 726, 727 Broto, C., 1421 Brown, D., 995 Brown, J.R., 1366 Brown, K.C., 999 Brown, R.H., 987, 1494 Brown, S.J., 46, 125, 127, 227, 249, 278, 283–287, 987, 1098 Bruce, P., 1062 Bruche, M., 515–528 Bruner, R.F., 273, 1223 Brunner, B., 522 Bubnys, E.L., 1465, 1545 Buchen, P.W., 1568 Bühler, W., 999 Bulirsch, R., 1033 Bulow, J., 1403, 1594, 1604 Burke, S.P., 853, 854 Burmeister, E., 1092 Buss, M.D.J., 1316 C Cadle, J., 319–332 Caginalp, G., 138 Cai, J., 1421, 1447 Cakici, N., 515, 520 Calantone, R., 1061, 1062 Campbell, J.Y., 289, 290, 292, 299, 896, 910, 986, 1068, 1080, 1087, 1283, 1474 Camerer, C., 137 Campa, J.M., 533, 1366 Campbell, J., 471, 668, 1174, 1176, 1393, 1394 Campbell, T.S., 677 Campello, M., 1366 Canina, L., 533, 1333, 1335 Cantor, R., 648 Cao, C., 524, 533, 537, 547–574, 670, 987, 1397 Cao, H., 1236, 1237, 1247 Capaldi, R.E., 290 Capobianco, E., 1420 Caporale, M.G., 1284 Cappiello, L., 1278 Carelton, W.T., 1025 Carey, M., 1280 Carey, M.S., 639, 640 Carhart, M.M., 1237, 1244, 1524, 1527, 1545
1688 Cario, M.C., 1136 Carlsson, C., 1196 Carpenter, J.N., 667, 942, 988 Carr, P., 526, 547, 556, 565, 575, 667 Carter, D., 695 Carverhill, A.P., 585 Casassus, J., 717, 916, 917, 1041 Casella, G., 1109, 1372 Cason, T., 138 Castellacci, G.M., 584 Cathcart, L., 988 Cebenoyan, A.S., 677, 679, 686 Cecchetti, S.J., 1122 Celec, S.E., 1319 Cesa-Bianchi, N., 1257 Cesari, R., 328–329 Cetin, U., 667 Chacko, G., 670, 714, 1128 Chakravarty, S., 1397 Chalamandaris, G., 447–470 Chalasani, P., 49, 587, 588, 590, 592, 593, 596–602 Chan, C.K., 1072, 1324 Chan, K., 221, 548, 557, 1374, 1396, 1397 Chan, L.K.C., 185, 193, 273, 832 Chan, R., 1179 Chan, S.H., 829, 831, 834 Chan, S.P., 1190 Chan, T., 1446 Chan, Y.L., 1394 Chance, D.M., 458, 465 Chang, C.-C., 226, 587, 762, 763, 767–778, 1028 Chang, E., 547, 556, 565, 1524, 1537 Chang, J.S., 233, 762, 763 Chang, J.-R., 697–711 Chang, K.H., 533 Chang, N., 1325 Chang, S., 1427, 1428 Chao, C., 1381 Chao, J.C., 1365, 1366 Chapman, D.A., 718, 986, 1490 Charupat, N., 137, 140 Chaudhury, M.M., 1523, 1526, 1534 Chauvin, K.W., 835 Chava, S., 996, 1397 Cheh, J., 1324 Cheh, J.J., 1325 Chen, A.-C., 697–711 Chen, C.R., 1301–1306 Chen, C.-S., 829–840 Chen, C.-J., 953–963 Chen, F., 186, 189 Chen, H., 641, 647, 670 Chen, H.-Y., 69–91, 125–135, 491–503 Chen, J., 1347 Chen, J.D., 802 Chen, L., 670, 1366 Chen, N.-F., 667, 1072, 1446, 1478 Chen, R., 725, 737, 950, 982, 1483–1486 Chen, R.-R., 439–446, 471, 473, 477, 531–545, 670, 990, 1055–1060, 1483–1488, 1490, 1615–1625 Chen, S., 1407 Chen, S.N., 98 Chen, S.-S., 873, 877, 933–950, 1160 Chen, W.-H., 641, 647 Chen, W.-P., 165–184, 807–818 Chen, Y.-T., 47, 1293, 1297–1300, 1445–1464
Author Index Chen, Y.-K., 965–976 Chen, Y.-D., 807–818 Chen, Z., 524, 533, 537, 547–575, 987, 1397, 1424, 1425 Cheng, J.K., 1319 Cheng, M., 341 Cheng, T.-C., 965 Cheng, X., 990 Cheng-Few, L., 1151–1162 Cheridito, P., 987, 1128 Chernov, M., 667, 715, 725, 947, 1109 Chesney, M., 1041 Chetty, V., 192 Cheung, C.S., 132 Cheung, Y., 1273 Chevalier, J., 1235, 1237, 1244, 1247 Chew, B.K., 1179 Chiang, R.C., 1025, 1036 Chiang, T.C., 853–862 Chiappori, P.-A., 1156, 1161 Chib, S., 986, 1109–1113 Chidambaran, N.K., 1607–1614 Childs, P., 942 Chiou, W., 226 Chiou, W.-J.P., 221–233 Chiras, D.P., 458, 1333 Chitashvili, R., 1556 Cho, D., 1281 Cho, M.-H., 1366 Chod, J., 1407 Choi, K.-W., 654 Choi, S., 843 Choi, T.S., 1319 Choo, E.U., 653, 1063 Chordia, T., 342 Chou, H., 1273–1280 Chou, R., 1273, 1274, 1277–1279 Chou, R. Y., 1273–1280, 1333 Chou, W., 1381 Chow, Y.F., 1179 Christensen, B.J., 1093, 1333–1335, 1343, 1491, 1500 Christensen, K., 1274, 1279 Christensen, M., 1211 Christie, W., 342 Christmann, A., 965 Christoffersen, P., 515, 524, 525, 528, 1276, 1278 Christopherson, J.A., 1068, 1070, 1085 Chu, C.C., 103–105, 1476, 1479 Chu, C.-S.J., 1347, 1349 Chuang, H.W., 575–581 Chui-Yu, C., 1190 Chung, H., 165–184, 807–815 Chung, S.L., 42, 505–513 Chung, Y.P., 1396 Church, B., 137 Churchill, G.A. Jr., 1474 Citron, J.T., 648, 659 Claessens, S., 863, 864 Clark, E., 660 Clark, P.K., 1175 Clarke, J., 1152 Clemen, R., 193, 194 Clewlow, L.J., 585 Cliff, M.T., 1366 Cochran, W.G., 859 Cochrane, J.H., 283–285, 290–292, 294, 533, 981, 986, 1067, 1069, 1070, 1082, 1086, 1394
Author Index Cohen, K.J., 339–341 Cohen, R., 610 Cohn, R.A., 1466 Collin-Dufresne, P., 666, 668, 670, 714–717, 721, 722, 884, 885, 910, 916, 917, 941, 949, 985, 988, 995, 996 Collins, A., 1206 Conley, T.G., 986 Conrad, J., 1071, 1174, 1366 Conroy, P., 843 Constantinides, G.M., 984, 1017, 1229, 1490 Cook, R.D., 966 Cooper, I.A., 996, 1225 Cooperman, E.S., 677, 679, 686 Copeland, M., 677 Copeland, T.E., 339, 340, 677, 1173, 1176 Cornell, B., 886 Corrado, C., 1279 Corrado, J.C., 587 Cortazar, G., 1041 Cosset, J., 221, 226 Cossin, D., 670 Cosslett, S.R., 1600 Costabile, M., 587 Costanigro, M., 1206 Costinot, A., 994 Cotton, P., 1118 Coulson, N., 194 Counsell, G., 1595 Court, A.T., 1201 Courtadon, G.R., 465 Cox, D., 1205 Cox, D.R., 453, 1525–1527 Cox, J., 15, 397, 471–473, 531, 547, 549, 550, 575, 576, 612, 617, 666, 667, 670, 754, 764, 1128, 1274, 1374, 1483, 1615, 1616, 1618, 1619, 1623 Cox, J.C., 399, 409, 415, 418, 425, 427, 453, 458, 464, 466, 483, 505, 579, 769, 826, 933, 934, 937, 982, 988, 1071, 1190, 1191, 1494, 1503, 1504, 1515, 1521 Cox, S.H., 757–759, 761 Crabbe, L., 886 Crack, T., 195 Cragg, J.G., 1364, 1367, 1368 Crama, Y., 641 Cramer, J.S., 1599, 1600 Crato, N., 1409, 1410 Crawford, A.J., 686 Cremers, K., 42, 187, 192, 199 Cremers, M., 668, 669 Cremonini, D., 328–329 Crosbie, P., 1602 Crouhy, M., 585 Csiszar, I., 1557 Cui, J., 235–245 Culp, C.L., 676 Cumby, R., 1537 Cumming, J.D., xlvi Curran, M., 584, 587 Curry, T., 643 Cutler, D.M., 138 Cvitanic, J., 1007, 1008 Czyzyk, J., 646 D D’Antonio, L.J., 21, 873, 878 D’Arcy, S.P., 1466 D’haeseleer, P., 1325
1689 Dahiya, S., 1596 Dahya, J., 866 Dai, Q., 671, 713, 717, 718, 912, 979, 981, 983, 984, 986, 987, 1121, 1126–1129, 1490 Daigler, R.T., 1173 Daines, R., 1366 Damodaran, A., 186, 192, 198, 1152 Daniel, K., 199, 273, 274, 831 Dao, B., 942, 1446 Daouk, H., 912 Das, S., 208, 665, 714, 1121, 1128, 1490 Dassios, A., 763 David, A., 670 Davidson, G.K., 1162 Davidson, R., 1363, 1365, 1478 Davis, J.L., 273 Davis, M., 994 Davis, P., 334, 343–345 Davis, R.A., 1410 Davydenko, S.A., 668 Dawid, A.P., 1260, 1264 Day, T., 548, 555 De Jong, A., 873 De Jong, F., 226, 670 De Lima, P., 1409, 1410 De Long, B., 1395 De Roon, F., 873 De Santis, G., 221, 1235 De Schepper, A., 586 de Servigny, A., 641, 644 de Varenne, F., 666, 934, 988 Deakin, E.B., 1597, 1598 DeAngelo, H., 1230 Deaton, A., 42, 197, 199 Deaves, R., 137, 140 DeGennaro, R., 1607 Del Guercio, D., 865 Delbaen, F., 465, 1012, 1018, 1555, 1557, 1567 Delianedis, G., 668, 949 Dellacherie, C., 1556, 1560, 1561 DeLong, G.L., 1427, 1439 DeLong, J.B., 138 DeMarzo, P., 677, 684, 1223 Demirgüç-Kunt, A., 226 Demmel, J., 1033 Demsetz, H., 340, 1041 Demsetz, R.S., 683, 767, 770, 775 Denis, D.J., 1366 Dennis, P., 533 Dennis, S., 1366 Dentcheva, D., 247–257 Denuit, M., 586, 593, 594 Deo, R.S., 1414 Derman, E., 458, 464, 985, 1511 DeRoon, F.A., 173 Derrien, F., 844, 847 Desai, M.A., 1366 Deshmukh, S.D., 678, 686 Detemple, J., 505, 506 Deuskar, P., 734 Devetsikiotis, M., 1136 Dewachter, H., 985, 987 Dhaene, J., 586, 593, 594 Diaconis, P., 1137 Diamandis, P., 1352 Diamond, D.W., 678, 686
1690 Diamond, D.B.Jr., 1206 Diavatopoulos, D., 137–163 Diba, B.T., 139 Diebold, F.X., 526, 885, 1175, 1273, 1274, 1278, 1279, 1421, 1500 Diebold, F., 193, 194, 1395 Diener, F., 505 Diener, M., 505 Diewert, W.E., 1206 Dillon, W., 1061, 1062 Dimitras, A.I., 965 Dimson, E., 273, 1333 Ding, Z., 1409, 1411 Dittmar, R., 1614 Dittmar, R.F., 746, 1001, 1133, 1500 Dixit, A., 1381 Djankov, S., 226 Do, B.H., 321, 328 Doherty, N.A., 1466 Doleans-Dade, K., 1556, 1557, 1562 Domowitz, I., 340 Donald, S.G., 1364, 1367, 1368 Dong, G.N., 1209–1220 Dongarra, J., 1033 Doob, J.L., 1262 Dothan, U., 1506, 1515 Douglas, A.V.S., 1028 Douglas, G.W., 107 Doukas, J., 829, 831, 834 Doumpos, M., 965, 1061 Dow, S., 830 Dowd, K., 1411 Downing, C., 1397 Dreher, A., 1368 Dreiman, M., 1206 Driessen, J., 221, 226, 993, 995 Driffill, J., 1122 Duan, C.W., 915–931 Duan, J.C., 533, 670, 754, 819, 823, 825, 949 Duarte, J., 715, 717, 729, 743, 744, 746, 987, 1128 Dubitsky, R., 802 Dubois, D., 1184 Duck, P.W., 513 Duckstein, L., 1330 Duff, I., 1033 Duffee, G.R., 714, 815, 949, 987, 993, 995, 1490 Duffie, D., 342, 447, 464, 665, 667, 671, 677, 684, 714, 716–718, 723, 726, 734, 748, 981–985, 988, 990, 993, 995–997, 999, 1000, 1012, 1016, 1121, 1128, 1270, 1490, 1491, 1499, 1599 Dufresne, F., 1445, 1453, 1454 Dufwenberg, M., 138 Dugan, M.T., 1328 Duiffie, G., 1128 Duin, R., 40 Dumas, B., 515, 522, 532, 533, 542, 547, 555, 556, 1237, 1244 Dunbar, C., 1162 Dunn, O.J., 1598 Duong, H.N., 1279 Dupire, B., 458, 518 Durand, D., 144 Durand, R., 934 Durbin, J., 1363 Durham, G.B., 986 Durrleman, V., 711 Dybvig, P.H., 208, 987, 1505 Dyl, E.A., 80
Author Index E Eades, K.M., 273, 1223 Easley, D., 340, 1397 Ebeling, W., 1568 Ebens, H., 1278, 1334, 1335, 1338 Eberhart, A., 941 Eberlein, E., 533, 874 Eckardt, W., 375 Eckbo, B.E., 1400, 1406 Economides, N., 341 Ederington, L.H., 995 Ederington, L., 883 Edwards, A.K., 844 Edwards, S., 1283 Eftekhari, B., 274 Egriboyun, F., 49, 587, 588, 590, 592, 593, 599–602 Ehrenberg, R.G., 138, 140, 143 Eisenbeis, Robert A., 1597, 1598 Eisenbeis, Robert O., 1603 El Karoui, N., 1490, 1555–1557 Eldridge, S.W., 1324, 1326, 1328, 1329 Elerian, O., 986 Eliasson, A., 641, 648 El-Jahel, L., 988 Elliott, G., 1280 Elliott, R.J., 1128, 1129, 1557 Ellison, G., 1235, 1237, 1244, 1247 Elton, E.J., 46, 82, 125, 127, 128, 130, 132, 133, 208, 227, 235, 236, 249, 285, 884, 885, 991, 995, 1465, 1466 Emanuel, D., 471, 477, 532, 1615, 1618, 1619 Embrechts, P., 698, 994 Emery, W.G., 1025 Engle, R., 342, 738, 1274, 1277, 1278 Engle, R.F., 532, 885, 901, 903, 909, 1109, 1122, 1176, 1180, 1181, 1273, 1284, 1285, 1291, 1347, 1348, 1409, 1417–1419, 1421, 1422, 1424 Eom, Y., 986, 987 Eom, Y.H., 667, 668, 670, 949, 995, 999 Epple, D., 1206 Epps, M., 1173–1175 Epps, T., 1173–1175 Eraker, B., 725 Ericsson, J., 668, 670, 943, 949, 950, 995 Errunza, V., 226, 1235, 1527 Ervine, J., 375 Estep, T., 330 Estrella, A., 985 Eum, S.J., 1206 Eun, C., 1524, 1537 Evans, M.D., 1122 Evans, J.L., 79 Ewing, B.T., 1352 Ezzamel, M., 1598 Ezzell, J.R., 686 Ezzell, R., 1223–1227, 1229, 1231 F Fabozzi, F., 912 Fabozzi, F.J., 98, 1523, 1525, 1526 Faccio, M., 1427, 1428 Fadil, M.W., 700 Faff, R.W., 321, 328, 1333 Fairchild, J.R., 757–759 Fairley, William B., 1466 Fama, E., 193, 448, 453, 853, 864, 1088, 1394, 1615 Fama, E.G., 273
Author Index Fan, K.L., 1577 Fan, R., 714 Farber, A., 1223 Farragher, E.J., 1319 Faulkender, M., 1366 Fawson, C., 854, 855 Feenstra, R.C., 1206 Fehle, F., 999 Feller, W., 42, 297, 471–473, 478, 479, 1617 Feltham, G., 185 Fernandes, M., 1278 Fernandez, P., 1225 Ferreira, F., 1206 Ferri, G., 645, 648, 654, 664 Ferruz, L., 267–280 Ferson, W.E., 680, 1067–1089, 1072, 1082, 1085, 1537 Figlewski, S., 505, 533, 558, 1333, 1335, 1607 Filipovi, D., 983, 985, 987 Filipovic, D., 987, 1128, 1499 Findlay, M.C., 247 Finger, C.C., 697, 699 Fink, K., 1221 Finnerty, J., 375, 392, 418 Finnerty, J.E., 69–91, 93–105, 111–122, 377–392 Firoozi, F., 1381–1389 Firth, M., 644 Fischer, E., 941, 943 Fishburn, P.C., 247, 249, 873 Fisher, F.M., 1303 Fisher, L., 1067 Fisher, R., 1062 Fisher, S., 915, 916, 920 Fitzpatrick, P.J., 1596 Flanagan, C., 802 Flannery, B., 1271 Flannery, M.J., 676–680, 686, 694, 819, 823, 825 Flannery, M., 884 Fleming, M.J., 884, 885, 912 Fleming, J., 233, 515, 522, 1070, 1333–1335, 1343, 1529 Flesaker, B., 1490 Fletcher, J., 221 Flood, R.P., 138 Flood, R., 1393 Flores, E., 965 Fluck, Z., 1028 Fogler, H.R., 1303 Foley, C.F., 1366 Follain, J.R., 1202, 1206 Follmer, H., 1555 Fomby, T.B., 1095 Fong, W.M., 1179, 1396 Fons, J.S., 669 Fontes, D.B.M.M., 1041 Forbes, S.W., 1465, 1466 Foresi, S., 986, 987, 1490 Forrest, S., 1325 Forsyth, P., 1218 Forsythe, R., 140, 145 Foster, D.P., 1265, 1333 Foster, D., 1265 Foster, F.D., 1068, 1069, 1076, 1079 Foster, G., 1594 Foster, K.R., 515, 520 Foucault, T., 341 Foufas, G., 1219 Fouque, J.-P., 1109–1119
1691 Fouse, W., 122 Fox, B.L., 1137, 1139 Fox, I., 831 Francioni, R., 333–347 Francis, J.C., 73, 83, 98, 259–266 Francois, P., 667, 941, 947 Frank, E., 1324 Frankfurter, G., 122 Franks, J., 941 Franses, P.H., 1333 Frecka, T.J., 1224, 1597, 1598 Freed, N., 1062, 1063, 1325 Frees, E.W., 994 French, K., 193, 1088, 1247, 1394 French, K.R., 221, 273, 289, 290, 298, 915, 916, 920, 921, 930, 931, 988, 1068, 1071, 1073, 1075, 1091, 1092, 1094, 1095, 1100, 1102, 1105, 1107, 1467, 1468, 1474, 1607 Frey, R., 995 Fridman, M., 1109 Friedberg, S.H., 1451 Friedman, B.M., 151 Friedman, J., 1069 Friend, I., 108, 610, 863 Fritelli, M., 1567 Froot, K.A., 137, 140, 677, 678, 680, 686, 763, 986, 1236, 1237 Fu, H.C., 587 Fu, M., 585 Fudenberg, D., 1403 Fuh, Cheng-Der, 1577–1591 Fujiwara, T., 1568 Fuller, K., 1427, 1428 Fuller, R., 1196 Fuller, W.A., 1095, 1365 Fung, H.-G., 644 Fung, W., 196 Furbush, D., 676 G G’omez Sanchez-Mariano, E.G., 1377 G’omez-Villegas, E.G., 1377 Gacs, P., 1265 Galai, D., 340, 465, 549, 565 Gale, D., 139 Galindo, J., 641 Gallant, A.R., 717, 718, 983, 986, 1121, 1128, 1421, 1490 Gallant, R., 1109, 1174, 1279 Gallant, S., 1061 Galloway, T.M., 677, 679, 686 Gammerman, A., 1257 Gamst, G., 1063 Ganapathy, S., 1303 Gang, T., 802 Gao, B., 505, 717, 718, 986, 987 Gao, Z., 185 Garber, P., 137 Garcia, R., 984, 1122, 1123 Gardeazabai, J., 1395 Garleanu, N., 1599 Garmaise, M.J., 1366 Garman, M.B., 321, 339, 1273, 1275, 1279 Garrido, J., 965 Garven, J.R., 457 Gastineau, G., 375 Gastwirth, J.L., 250 Geanakoplos, J., 1403 Geczy, C., 679
1692 Gehrlein, W.V., 654 Geman, D., 1371 Geman, H., 585, 586, 761, 762, 1023 Geman, S., 1371 George, C., 447–465 George, E.I., 1372 George, T., 565 Gerard, B., 221, 1235 Gerber, H.U., 47, 1445, 1447, 1453, 1454, 1457, 1462 Geske, R., 41, 439–442, 532, 613, 666–668, 670, 934, 949, 988, 1191, 1615 Ghosh, S.K., 1305 Ghysels, E., 273, 725, 1109, 1420 Gibbons, M., 286, 1615, 1616 Gibbons, M.R., 1069, 1091, 1092 Giesecke, K., 667 Gikhman, I., 451 Gillan, S., 864, 865 Gillet, R., 1223 Gilli, M., 754 Gil-Pelaez, J., 485, 489 Gilroy, B.M., 1041 Gilson, S., 198 Gitman, L., 1223 Givots, S., 375 Glasserman, P., 727, 1490 Glassman, D., 1068, 1070, 1085 Glen, J., 342, 1537 Gloersen, P., 1577 Glosten, L., 340, 341, 471, 854, 1007, 1009, 1223, 1333, 1336, 1340, 1348 Glover, F., 1062, 1063, 1325 Gockenbach, M., 1219 Goettler, R., 341 Goetzmann, W.N., 46, 127, 227, 249, 278, 285, 289, 1524, 1530 Goetzmann, W., 46, 227, 1206 Goh, J., 1151 Goicoechea, A., 1308 Golden, L.L., 1065 Goldman, E., 1373, 1377 Goldman, M.B., 339 Goldstein, R.S., 666, 668, 670, 715, 716, 721, 722, 885, 910, 941–943, 945–949, 985, 986, 988, 1490 Gollinger, T.L., 700 Gombola, M.J., 1151 Gomes, P., 641 Gómez, A.T., 1190, 1197, 1198 Gómez-Bezares, F., 267–280 Gompers, P., 1366 Goovaerts, M.J., 586, 593, 594 Gordon, M.J., 606, 1466 Gordon, R., 1283 Gorovoi, V., 1490 Gorton, G., 138, 139, 197, 676, 677, 683, 686 Goswami, G., 1025, 1036 Gould, J.P., 47, 1293, 1296 Gourieroux, C., 1156, 1555 Goyal, A., 290, 294, 295, 301, 1089 Graham, J., 675–677, 680, 682, 1028, 1224 Grammatikos, T., 1275 Grandits, P., 1555, 1557, 1567, 1569 Granger, C., 42, 193, 199, 342, 607, 1409, 1476–1479 Granger, C.W.J., 44, 45, 1067, 1068, 1072, 1075, 1076, 1279, 1392, 1409–1411, 1417, 1465–1480 Grant, D., 587 Grauer, R.R., 203–218
Author Index Gray, J.B., 966 Gray, S.F., 984, 1122, 1123, 1421 Green, K., 886 Green, R.C., 208, 226, 1030, 1237 Greenbaum, S.I., 678, 686 Greene, W.H., 680, 1303, 1305, 1472, 1599, 1600 Gregory, A.W., 986 Gressis, N., 92 Grice, J.S., 1328 Griffin, J.M., 139, 844 Griffiths, W., 1095, 1472, 1481 Griliches, Z., 1364 Grinblatt, M., 208–210, 273, 274, 996, 997, 1000, 1236 Grinold, R., 187 Groslambert, B., 664 Grossman, H., 139 Grossman, S., 340, 341, 865, 1007, 1009, 1174, 1176 Groznik, Peter., 1236 Gruber, M.J., 46, 82, 125, 127, 128, 130, 133, 135, 208, 227, 235, 236, 249, 285, 884, 885, 991, 995, 1235, 1237, 1244, 1465, 1466 Grunbichler, A., 913 Grundy, K., 273 Guarino, A., 1063 Guasoni, P., 1556 Guedes, J., 1036 Gugler, K., 836 Gulen, Huseyin., 1397 Guo, D., 668, 949 Guo, J., 1168, 1169, 1171 Guo, Jia-Hau, 1165–1171 Guo, W., 1568 Guo, X., 667, 1056 Gupta, A., 714, 996, 1427–1443 Gupta, A.K., 1347 Gupta, M.C., 1025–1037, 1297 Gupta, S., 1381 Guy, J., 117 Guyon, I., 1269 Gyetvan, F., 1330 H Haas, G.C., 515, 517, 523, 524 Haas, M., 1201 Hackbarth, D., 667, 942 Hadar, J., 247, 249 Hafner, R., 522 Hagerman, R., 810 Hahn, J., 1365, 1366 Hahn, W.F., 1206 Hai, W., 1395 Hakanoglu, E., 321 Hakansson, N.H., 203–205, 289, 339, 609 Haldeman, R.G., 1598, 1603 Hall, G.R., 1466 Hall, P., 1414 Hallock, K., 836 Halverson, R., 1205 Hamada, R.S., 35, 683, 684 Hamao, Y.R., 1283 Hambly, B., 517 Hamer, M.M., 1599 Hamidi, B., 324, 329 Hamilton, J.D., 1122–1125, 1397, 1421 Hammeed, A., 1174 Hammer, P.L., 640, 641, 643, 644, 648–650, 652, 654, 655, 657, 658 Hamori, S., 1283
Author Index Han, B., 716, 724, 725, 727, 728 Han, C.-H., 1109–1120 Han, J., 1324 Han, S., 670 Handa, P., 341, 346 Hanke, J., 1276 Hanley, K.W., 843, 844 Hanoch, G., 39, 249 Hansen, B.E., 808, 811, 814, 854 Hansen, L.P., 284–286, 555, 1362 Hao, Q., 844, 845, 847, 849 Hao, X.R., 1308 Haque, N.U., 641, 659 Harbus, F., 1025, 1028, 1036 Härdle, W., 543 Hardouvelis, G.A., 985 Hardy, G.H., 247 Harjes, R., 737 Harlow, W.V., 879, 999 Harrington, S.E., 763, 1466 Harris, C., 1381, 1382 Harris, J.H., 139, 844 Harris, J., 342 Harris, L., 344, 1109, 1173, 1175 Harris, M., 863, 864, 885, 936 Harris, R.S., 30 Harrison, J., 458, 1619 Harshbarger, S., 627 Hart, O., 865 Hart, S., 1382 Hartigan, J.A., 1347 Hartzell, J.C., 864, 865 Harvey, A., 1109, 1277, 1409, 1410 Harvey, C., 342, 556, 1028, 1236, 1237, 1537 Harvey, C.R., 680, 1069, 1070, 1082, 1224, 1334, 1366 Hasbrouck, J., 339, 342, 1395, 1396 Haslett, J., 966 Hastie, T., 1069 Hathorn, J., 1600 Haug, E.G., 465 Haugen, R., 122 Haugen, R.A., 27, 1466 Haushalter, G.D., 677, 679 Hausman, J., 692, 1361, 1363 Hawkins, J., 865 Hayashi, F., 197, 1361 Hayre, L.S., 802 Hayt, G.S., 679 Hayya, J., 92 Hazarika, S., 333–347 He, G., 191 He, H., 999 He, J., 669 Healy, P.M., 1406 Heath, D., 716, 981, 1008, 1490, 1494 Heath, M., 1033 Heaton, J.B., 289 Heckman, J., 1358 Heggestad, A.A., 682 Heidari, M., 714, 1490 Heine, R., 1025, 1028, 1036 Heinemann, F., 144 Heinkel, R., 916 Helman, P., 1325 Helmstadter, E., 841 Helwege, J., 669, 886, 995, 1604
1693 Hendershott, T., 342, 1028, 1153 Henderson, J., 71 Hendricks, D., 1237 Henke, H., 1174 Henry, D., 1028 Hentschel, L., 910 Heravi, S., 853, 854 Hermalin, B.E., 866, 1357 Herold, U., 323, 324, 329 Hertzel, M.G., 1151, 1152 Heston, S., 47, 458, 481, 485, 487, 489, 505, 506, 510, 531, 547, 549–551, 575, 577, 725, 884, 1165 Heynen, R., 533 Hiemstra, C., 1174 Higgins, M.L., 1333 Higgins, R.C., 24, 25, 100 Hilberink, B., 47, 667, 1445, 1446, 1451–1453, 1458 Hill, J.R., 1140 Hill, R., 1095, 1472, 1481 Hill, R.C., 1303, 1305, 1598 Hill, R.D., 1466 Hillebrand, E., 1421 Hilliard, J.E., 587 Hillier, F.S., 1319, 1320, 1323 Hillion, P., 290, 294, 1153 Hines, J.R. Jr., 1366 Hirschey, M., 836 Hirshleifer, D., 199 Hirtle, B., 686 Hjort, N.L., 1412 Hlavka, M., 208 Ho, H.C., 1412, 1413 Ho, K.W., 1160 Ho, K.Y., 165–184 Ho, L.C., 319–332 Ho, T., 324, 329, 335, 1483 Ho, T.S.Y., 988, 1041 Ho, Y.K., 832, 840 Hobson, D., 1568 Hocking, R.R., 1203 Hodges, S.D., 1567 Hodrick, R., 1393, 1394 Hodrick, R.J., 138, 986 Hoeting, J., 42, 192, 199 Hoff, P.D., 1376 Hoffman, C., 523 Hoffman, N., 1555 Hofmeyr, S.A., 1325 Hogan, K., 226, 233, 1235 Hogg, R.V., 854 Holden, C.W., 266 Holderness, C.G., 1153 Holland, J.H., 1611 Hollifield, B., 226 Holt, C.A., 144 Homan, M., 1595 Hong, C.H. Ted, 779–805 Hong, H.G., 138 Hong, H., 1524 Hong, Y., 987 Hopwood, W.S., 1597, 1598 Horvath, L., 1349 Hoshi, T., 834 Hosking, J.R.M., 854, 1410 Hotchkiss, E., 198 Hourany, D., 144
1694 Houston, J.F., 686, 836, 1025, 1427, 1434, 1439, 1441 Houweling, P., 670, 990, 999, 1000 Hovakimian, A., 1028 Howard, C.T., 21, 873, 878 Howe, J.S., 1153, 1154 Howe, K.M., 1319 Howe, M., 916 Howton, S., 679 Hricko, T., 670 Hsieh, D., 196, 1421 Hsieh, D.T., 829–840 Hsieh, J., 1366 Hsieh, P.F., 767–778 Hsing, T., 1414 Hsu, C., 1279 Hsu, C.-J., 641, 647 Hsu, D.A., 1345, 1347, 1349 Hsu, D., 1345–1354 Hsu, J., 185, 193 Hsu, M., 593 Hsu, M.F., 587–602 Hsu, T.L., 165–184 Hsu, Y.L., 471–480, 575–581 Hu, J.N., 885 Hu, W., 669 Hu, Y.-T., 654 Hu, S., 950 Huang, B.-Y., 965–976 Huang, C., 447, 464 Huang, Dar-Yeh Huang, J., 550, 949, 995 Huang, J.-Z., 665–671 Huang, M., 666–670, 949, 988, 995–997 Huang, Ming Huang, N.E., 1577, 1578 Huang, R.D., 808 Huang, R., 342 Huang, Z., 641, 647 Hubbard, R.G., 686 Huberman, G., 172, 173, 203, 883, 1071 Hudson, J., 1594, 1596 Hueng, C.J., 854 Hughes, J., 685 Hughes, J.S., 916 Hughston, L., 1490 Hull, J., 49, 458, 464, 531, 547, 549, 551, 584, 586, 587, 593, 599–602, 670, 698, 727, 982, 996, 999, 1000, 1145, 1191, 1210, 1269, 1483, 1611, 1615 Hulton, C., 1205 Hung, K., 235–245 Hung, Ken Hung, M., 1168, 1169, 1171 Hung, M.-W., 1165–1171 Hung, W.-L., 854 Hurvich, C.M., 293, 294, 1414 Huselid, M.A., 140, 144 Husick, F., 106 Huson, M.R., 1151, 1152 Hutchinson, J., 1607, 1610, 1611, 1613 Hwang, D.Y., 819–826 I Ibaraki, T., 641, 642 Ibbotson, R.G., 68 Ibbotson, R.C., 1100 Ik, K.H., 1407
Author Index Ilmanen, A., 305, 307 Imamura, Y., 1283 Imbens, G.W., 1365, 1366 Inclan, C., 1349, 1351 Ingersoll, J.E., Jr., 483, 549, 754, 764, 1128, 1483, 1503, 1504, 1515, 1521, 1522, 1618, 1623 Ingram, R.W., 1328 Insel, A.J., 1451 Iryna, I., 1284 Isaac, R.M., 138 Isberg, S.C., 829–831, 839, 840 Israel, R., 864 Itô, K., 578 Ivanov, S., 802 Ivkovic, Z., 1366 J Jäckel, P., 1168, 1171 Jacklin, C., 1615, 1616 Jackson, D., 144 Jackson, M., 182 Jackson, R., 92 Jackson, W.E., 686 Jackwerth, J.C., 520, 532, 533, 738 Jacob, N., 122 Jacobs, K., 524, 528 Jacod, J., 1556 Jacques, M., 584–586 Jacquier, E., 1109, 1110, 1421 Jaffe, J.F., 1154 Jagannathan, R., 226, 273, 283–285, 471, 714, 721, 854, 885, 1067, 1068, 1070, 1072, 1075, 1082, 1086, 1333, 1336, 1340, 1348, 1467, 1468 Jagerman, D.L., 1136, 1139, 1141 Jahera, J.S., 1438 Jahnke, W., 122 Jaimungal, S., 758, 759 Jain, K., 641 Jain, P.C., 1151 James, C., 686, 836, 1427, 1439 James, C.M., 676–680, 686, 694 Jamshidian, F., 1490 Jang, B.S., 1284 Jang, H.J., 1153 Jang, J.-W., 763 Janosi, T., 996 Jarrell, G., 836 Jarrow, R.A., 361, 362, 505, 507, 547, 665, 667, 713, 716, 734, 744, 746, 755, 961, 981, 996, 1007–1017, 1490, 1494 Jeanblanc, M., 942, 1446 Jegadeesh, N., 274, 1092 Jelenkovic, P., 1140 Jensen, M., 36, 611, 835, 836, 936, 947 Jensen, P.H., 1206 Jeon, B.N., 1283, 1284 Jha, S., 49, 587, 588, 590, 592, 593, 596–602 Ji, L., 1500 Jimenez, E., 1202 Jin, Y., 1490 Jing, B.-Y., 1414 Joachims, T., 1269 Jobson, J.D., 173, 204 Joe, H., 994 Johannes, M., 984, 987 John, K., 1153, 1154, 1399–1407 John, T.A., 829, 1036, 1400, 1407
Author Index Johnson, H.E., 1191 Johnson, K.M., 1366 Johnson, L.L., 20, 21, 873 Johnson, N., 1618, 1622 Johnson, S.A., 832, 836 Johnson, S.R., 1095 Johnson, T.W., 1523 Johnson, T.C., 1000, 1001 Jones, C., 342, 1273, 1274, 1277, 1279, 1524 Jones, C.M., 884, 885, 1173, 1174, 1176 Jones, C.S., 986 Jones, E.P., 668, 995 Jones, J., 1174 Jones, R., 321 Jordan, B., 1223 Jorge, C.H., 1284 Jorion, P., 204, 289, 533, 703, 1000, 1334, 1527 Jouini, E., 1007 Joy, M.O., 1603 Joyeux, R., 1410 Ju, N., 600, 941, 942, 948 Judge, G.G., 1095, 1303, 1305, 1472, 1481, 1598 Jung, K., 864, 869, 1061, 1312 Jurczenko, E., 332 K Kücher, U., 874 Kaas, R., 586, 593, 594 Kacprzyk, J., 1190 Kaden, O., 341 Kahl, C., 1168, 1171 Kahle, K.M., 1152 Kahn, J., 195, 196 Kahn, R.N., 278–280 Kakade, S., 1265 Kaldor, N., 197, 915 Kalev, P.S., 1279 Kallal, H., 1007 Kalman, R.E., 1495 Kalotay, A.J., 1027 Kamakura, W., 1061 Kamber, M., 1324 Kamien, M.I., 831 Kaminsky, G., 648, 664 Kan, R., 172, 174, 717, 983, 985, 1121, 1128, 1490 Kanatas, G., 678, 686 Kandel, E., 341, 885 Kandel, S., 172, 173, 204, 209, 289, 1070, 1071 Kander, Z., 1347 Kane, A., 172, 941 Kane, E.J., 679, 686 Kang, Q., 471 Kani, I., 458, 464 Kantor, J., 1307 Kaplan, S.N., 185, 198, 769, 1224, 1430 Kaplan, P., 1393 Kaplin, A., 714, 721 Karasinski, P., 985, 1483 Karatzas, I., 447, 448, 460, 517, 1017 Karceski, J., 233, 1446 Karels, G.V., 1597, 1598 Karjalainen, R., 1608 Karlin, S., 483, 1617 Karlsen, K.H., 1568 Karolyi, A.G., 1284, 1379 Karolyi, G.A., 548, 557, 1122, 1515
1695 Karpoff, J.M., 1152 Kashyap, A., 685, 834 Kat, H.M., 1333 Kau, J.B., 1205 Kaufman, H., 342 Kaufmann, D., 649 Kaul, G., 342, 1071, 1173, 1174, 1176 Kavajecz, K.A., 1366 Kavanagh, B.T., 676 Kavee, R.C., 324–327 Kaya, Ö., 1168, 1170 Kazamaki, N., 1557, 1560, 1563 Kealhofer, S., 669 Keane, M.P., 1161 Keasey, K., 1597–1599 Kedia, S., 1366, 1407 Keeley, M.C., 695 Keenan, S.C., 656, 658, 967 Keeton, W.R., 684 Keim, D.B., 109, 289, 884, 886, 1068, 1075, 1091 Keirstead, W., 1490 Keller, U., 533, 874 Kellezi, E., 754 Kellison, S.G., 955 Kelly, F.P., 1141 Kelly, M., 1568 Keloharju, M., 1236 Kelton, D.W., 1137 Kemna, A.G.Z., 533, 584, 585 Kendall, M.G., 1072 Kennedy, D.P., 986 Kennedy, P., 1598 Kensinger, J., 829, 831, 834 Kent, J., 854 Keown, A.J., 1153 Kerfriden, C., 819, 822 Kettler, P., 102, 117 Keynes, J.M., 18, 915, 919 Khindanova, I., 853 Khorana, A., 1237, 1247 Kiang, M., 1061, 1062 Kiesel, R., 654, 874 Kijima, M., 993 Killough, L.N., 1313 Kim, C.J., 1123 Kim, D., 273, 1091–1108, 1467, 1468, 1481 Kim, E.H., 831, 836 Kim, I., 940, 942, 988 Kim, I.-J., 552 Kim, S., 1110–1113 Kim, S.H., 1319 Kim, T.H., 587 Kim, T.S., 299 Kim, Y.C., 864, 869 Kim, Y.S., 1568 Kimmel, R., 987, 1128 King, B., 122 King, D., 1025 King, R.R., 148 Kirby, C., 233, 1070 Kish, R.J., 913 Kishore, V., 1270 Klaoudatos, G., 1136 Klapper, L., 1596 Klarman, S., 185, 198 Klass, M., 1273–1275, 1279
1696 Klassen, T.R., 587 Klebaner, F.C., 503 Kleibergen, F., 1365 Kleidon, A.W., 289 Klein, R., 192 Klemperer, P., 1403, 1594, 1604 Klimberg, R., 1061–1066 Kluger, B., 137, 140 Klugman, S.A., 854 Knez, P.J., 1490 Knight, J., 1417 Knittel, C.R., 1206 Koekebakker, S., 1042 Koenker, R., 836, 1411 Kogan, A., 639–664 Kohlhagen, S.W., 321 Kokoszka, P., 1349 Kolbe, A.L., 1466 Kolmogorov, A.N., 1264, 1267 Kolodny, R., 533, 1524, 1537 Kong, W., 668, 669 Konno, H., 248 Koopman, S., 342 Kopprasch, R., 321 Korajczyk, R., 1072 Korkie, B., 173, 204 Korwar, A.N., 1151 Kosmidou, K., 965, 1066 Kosturov, N., 883–913 Kothare, M., 807 Kothari, S.P., 273, 1068, 1092 Kotz, S., 421, 425, 1260, 1618, 1622 Kou, G., 1326 Kou, S., 666, 667, 1455 Kou, S.G., 47, 727 Koza, J.R., 1610, 1611 Kozicki, S., 985 Kracaw, W.A., 677 Krafet, D., 1283–1285 Kraus, A., 611, 1466 Kreps, D., 139, 550, 826, 981, 1126, 1130, 1619 Krishnamurthy, S., 1152 Kritzman, M., 330 Kroncke, C.O., 1466 Kroner, K., 737, 1181, 1279, 1333, 1349 Kroner, K.F., 1176, 1283–1285 Krueger, A.B., 1365, 1366 Kruse, S., 1168 Kshirsagar, A.M., 1597 Kudbya, S., 1061–1066 Kuehn, L.-A., 670 Kuersteiner, G., 1365, 1366 Kulatilaka, N., 375, 392, 418 Kumar, M.S., 641, 659 Kumar, A., 1244 Kumon, M., 1264 Kunczik, M., 640 Kunitomo, N., 1273, 1275 Kunreuther, H.C., 756, 757 Kuo, H.-C., 1593–1604 Kurbat, M., 669 Kwak, W., 1307–1329 Kwakernaak, H., 1184 Kwan, C.C.Y., 132 Kwan, S.H., 886, 1397 Kyle, A.S., 340, 343, 1007, 1009, 1173
Author Index L La Porta, R., 226 Labys, P., 1273, 1274, 1278 Labys, W.C., 236 Lackner, P., 465 Laderman, E., 802 Laeven, L., 221, 226 Lagnado, R., 204, 289 Lahiri, S.N., 1414 Lai, H.-N., 767–777 Lai, T.L., 1417–1425 Lai, Y., 1118 Lakonishok, J., 185, 193, 233, 1153, 1174, 1446 Lam, K., 1063, 1421 Lam, K.F., 1065 Lam, P., 1122 Lambadaris, I., 1136 Lamberton, D., 447, 460 Lamont, O., 1524 Lamoureux, C., 1180 Lancaster, K.J., 1202 Landen, C., 1128, 1129 Lando, D., 665, 667, 669, 671, 988–990 Landry, B., 47, 1445, 1447, 1462 Landsburg, S., 195, 196, 278–280 Lang, L., 831, 834, 863, 864, 1026, 1028, 1036, 1400, 1401, 1406 Lang, L.H.P., 669, 863, 999, 1153, 1154 Lang, W., 685 Lapeyre, B., 447, 460 Larcker, D.F., 1359, 1366 Laroque, G., 42, 197, 199 Larrain, G., 648 Larson, M., 1219 Last, G., 1129 Lastrapes, W., 1180 Latane, H.A., 1333, 1397 Lathom, G.P., 1324 Launie, J.J., 1466 Laurent, J.P., 1556, 1557 Laurent, S., 1278, 1424 Laury, S.K., 144 Lauterbach, B., 1616 Law, A., 1137 Lawrence, K., 1061–1066 Lawrence, S., 1061–1066 Le, A., 984, 1126 Leal, R., 1349 Leamer, E.E., 1203 Lean, H.H., 1179 Lease, R., 610 Lease, R.C., 1161 LeBaron, B., 1109 LeBaron, B.D., 885 LeCompte, R.L.B., 1430 Lecraw, D.J., 1307 L’Ecuyer, P., 584 Lee, A.C., 53–68, 233, 393–397, 421–438, 619, 623, 1025–1037, 1297, 1465–1481 Lee, B.-S., 1394 Lee, C., 186, 950, 1089 Lee, C.H. Jevons, 1607, 1611, 1613 Lee, C.F., 10, 35, 47, 53–91, 93–109, 111–135, 172, 186, 221–233, 355–375, 377–392, 399–418, 421–446, 471–478, 481–503, 575–581, 619, 623, 670, 873–882, 933–950, 1025, 1151–1162, 1183–1198, 1205, 1224, 1293–1329, 1445–1458, 1465, 1466, 1476, 1479, 1523–1552, 1615–1625
Author Index Lee, D., 1152 Lee, G., 1325 Lee, H., 950, 1307, 1308, 1320, 1322–1324 Lee, H.-H., 233, 933–950, 1028 Lee, H.W., 1151 Lee, I., 1152, 1153 Lee, J., 53–68, 355–375, 393–397, 429–438, 617–636, 883, 903 Lee, J.C., 399–408, 421–428, 481–490, 619, 623, 1191 Lee, J.H., 1568 Lee, J.-P., 753–765 Lee, J.-Y, 873–880 Lee, K.W., 863–872 Lee, P.M., 1366 Lee, R., 1165 Lee, S.S., 877, 1483 Lee, S.C., 819, 821, 822, 825 Lee, S.B., 1041–1054 Lee, T.C., 1095, 1303, 1305, 1472, 1481, 1598 Lee, W.B., 677, 679, 686 Lee, Y.R., 1308 Lefebvre, C., 1307 Lehmann, E., 249 Lei, V., 137 Leippold, M., 714, 718, 748, 1490 Leisen, D.P.J., 505, 506 Lejeune, M.A., 639–664 Leland, H.E., 9, 203, 321, 328, 461, 531, 533, 667–670, 835, 934, 937–959, 988, 995, 1299, 1446, 1451, 1452, 1458 Lemmon, M., 1151, 1152 Lennox, C., 965 Lerner, J., 1366 Lesmond, D.A., 47, 186, 670, 1293, 1297–1300, 1366 Lettau, M., 1067–1070, 1089, 1608 Leung, A.Y.T., 1568 Lev, B., 686, 1224, 1596, 1597 Levary, R.R., 1319 Levendorskii, S., 1446 Levhari, D., 1523, 1526 Levin, L.A., 1265 Levin, N., 1062 Levine, R., 965, 1366 Lévy, E., 587, 596, 597, 600, 601 Levy, H., 39, 79, 249, 1224, 1235, 1236, 1523, 1526 Levy, P., 485 Levy, R., 122 Lewellen, J.W., 190, 273, 289, 610 Lewis, A., 1171 Lewis, B., 1324 Lewis, C., 548, 555 Lewis, C.M., 533, 1025, 1028, 1333, 1335, 1338 Lewis, K., 1122 Li, C., 1500 Li, C.K., 1179 Li, D., 698, 701, 961, 963, 1381, 1382 Li, H., 713–716, 718, 719, 746, 987, 996 Li, J., 853–861 Li, K., 221, 226, 1358 Li, L., 885, 901, 909, 1371-1380 Li, M., 986 Li, S., 1381, 1382 Li, T., 990 Li, W.K., 1421 Li, X., 996, 997, 999, 1000 Liao, A., 1284 Liaw, K.T., 35 Liberatore, M.J., 1319
1697 Lie, E., 187, 198 Lie, H., 187, 198 Lien, D., 873, 879, 882, 1381–1389 Liew, V.K.-S., 1283–1289 Lin, C.S.-M., 409–418 Lin, C.M., 677 Lin, H., 979–1001 Lin, J C., 819, 821, 822, 825, 1153, 1154 Lin, L., 1307, 1593–1604 Lin, T., 926 Lin, T.I., 471–478, 575 Lin, W.T., 915–931 Lin, Y., 1325 Linck, J., 1151, 1152 Lindenberg, E., 286 Lindqvist, T., 138 Lindskog, F., 994 Linetsky, V., 585, 1490 Linoff, G., 1324 Lins, K.V., 1366 Lintner, J., 89, 93, 125, 126, 129, 208, 270, 607, 1091 Lions, J.-L., 1574 Lipson, M.L., 1173, 1174, 1176 Liptser, R.Sh., 1556 Litterman, R., 42, 187, 191, 199, 204, 1484, 1490 Littlewood, J.E., 247 Litzenberger, R., 197, 518, 611, 1400, 1406 Liu, A.L., 999 Liu, A.Y., 1174 Liu, B., 950, 1523–1552 Liu, C.H., 1152 Liu, D., 198, 199 Liu, F.-I., 1409–1414 Liu, H., 1422, 1423 Liu, J., 185, 996, 997, 1056 Liu, L.-G., 645, 648, 654 Liu, N., 1273–1280 Liu, S., 990–992, 995–997 Liu, S.M., 853 Liu, W., 274 Liu, X., 1325, 1371–1380 Liu, Y.H., 1309, 1324 Livingston, M., 913 Livny, M., 1136 Ljungqvist, A., 1366, 1367 Lo, A., 185, 195, 290, 1092, 1607, 1610, 1611, 1613 Lo, A.W., 292, 532, 738, 886, 1068, 1071, 1284, 1474, 1598 Lo, K., 593 Lo, V., 994 Lo, W.A., 1174, 1177 Loayza, N., 1366 Lobato, I.N., 1409 Locke, E.A., 1324 Loeb, M.P., 1153 Logan, B., 1055–1060 Logue, D.E., 1465, 1467 Long, J., 605 Long, M., 831, 836 Long, R., 1062 Long, R.W., 1224 Long, S.R., 1578, 1579 Longin, F., 1283 Longstaff, F., 518, 522, 533, 548, 556, 557, 565, 666, 668, 716, 743, 940, 949, 987, 1122, 1483, 1515, 1615 Longstaff, F.A., 548, 557, 584, 670, 775, 885, 886, 910, 982–988, 990, 991, 993, 995, 999–1001, 1122, 1490, 1506, 1511, 1515
1698 Lookman, A.A., 996 Lopes, H., 1109 Lopes, H.F., 1376, 1378 Lopez, J., 193, 884 Lopez-de-Silanes, F., 226 Lorenz, M.O., 250 Loria, S., 327–328 Lorie, J.H., 1321 Losq, E., 226, 1235, 1236, 1527 Lott, J., 1427, 1429 Loubergé, H., 754 Loughran, T., 843, 1152 Loviscek, A.L., 236 Lowenstein, M., 1507 Lowenstein, R., 203, 204 Lowry, M., 1366 Lubow, J.W., 375, 392, 419 Lucas, A., 342 Lucas, R.E. Jr., 1071 Ludvigson, S., 1068–1070, 1089 Lugosi, G., 1257 Lukas, E., 1041 Lumsdaine, R.L., 885, 1348 Lund, J., 548, 725, 737, 985, 987 Luo, M., 1325 Lusk, E.J., 1307 Lutkepohl, H., 1095, 1303, 1305, 1392, 1472, 1481, 1598 Luttmer, E.G.J., 986 Lyanders, E., 1407 Lyden, S., 668, 995 Lyrio, M., 985 M Ma, C., 1179 Ma, J., 1007, 1008 Ma, T.S., 226 MacBeth, J., 471, 1073, 1615, 1618, 1619 MacDonald, R., 941 Mackinlay, A.C., 195, 273, 292, 886, 1068, 1071, 1092, 1284, 1431, 1474 MacKinnon, J.G., 1365, 1478 Maclennan, D., 1206 MacMillan, L.W., 763 Madan, D., 547, 556, 565, 575, 585, 996 Madan, D.B., 399, 401, 988, 1570 Madariaga, J.A., 276 Maddala, G., 1119 Maddala, G.S., 1067, 1600, 1603 Madhavan, A., 339, 341, 342 Madigan, D., 42, 192, 199 Maenhout, P., 671 Maes, K., 979, 985, 987 Maginn, J.L., 68, 92, 106, 122, 133 Mago, A., 802 Mahanti, S., 670 Mahapatra, R.K., 1324 Mai, J.S.-Y., 125–133, 393–397 Maich, S., 1283 Maier, S., 339–341 Maillet, B., 324, 329 Majluf, N.S., 1383, 1151, 1152 Maksimovic, V., 226, 1028, 1399, 1403, 1405–1406 Malatesta, P.H. 1151, 1152 Malevergne, Y., 260 Malik, F., 1352 Malitz, L., 831, 836
Author Index Malkiel, B.G., 273, 278 Malliaris, A., 447 Malliaris, A.G., 447–470 Mallik, G., 670 Maloney, K.J., 995 Malpezzi, S., 1206 Mamon, R.S., 1128, 1129 Manaster, S., 458, 1333, 1397 Mandelbrot, B., 1273, 1345, 1410, 1615 Mandell, R.E., 996, 997 Mangasarian, O., 1062 Mania, M., 1555–1564 Mankiw, N.G., 1394 Mann, C., 46, 227, 285 Manzur, M., 1179 Mao, J.C.F., 71 Maquieira, C., 1429 Marathe, V., 102, 118 Marcello, J.L.M., 1352 Marcis, R., 1041 Marcus, A., 172, 941 Marimon, R., 1608 Marin, J.M., 1377 Marjit, S., 1381 Mark, N., 641, 659, 1395 Mark, N.C., 1122 Markowitz, H., 20, 111, 125, 221, 226, 236, 259, 261 Marks, S., 1598 Marmol, F., 1075 Mar-Molinero, C., 1598 Marsh, I., 1000 Marsh, P., 1224, 1333 Marsh, T.A., 289 Marshall, A., 221 Marshall, D.A., 986, 1122 Marston, F., 1366 Marston, R.C., 679 Martens, M., 1274, 1278, 1279 Martin, A.D., Jr., 92 Martin, D., 1598 Martin, J., 829, 831, 834 Martin, J.S., 884, 885, 995 Marx, L.M., 342, 1028 Masih, A.M., 1283 Masih, R., 1283 Masoliver, J., 1568 Mason, J.R., 802 Mason, S.P., 668, 995 Massabo, I., 587 Masulis, R., 1400 Masulis, R.W., 1151, 1161, 1230 Masulis, W., 1283 Mathews, J., 1221 Mathews, K.H., 1206 Mathieson, D., 641, 659 Matsuba, I., 1568 Mauer, D., 942 Maurer, R., 329 Mayers, D.F., 1211 Mayhew, S., 533, 1397 Mayo, J., 641 Mayoraz, E., 641–643 McAleer, M., 1274, 1278 McAndrews, J.J., 1028 McBeth, J., 558 McCallum, B.T., 1093
Author Index McCauley, J.L., 1568 McCluskey, J.J., 1206 McConnell, J., 1028, 1427, 1428 McConnell, J.J., 865, 866 McCormick, T., 342, 1028 McCorry, M.S., 1524 McCulloch, R.E., 1422 McDonald, B., 1525, 1526, 1534, 1545 McDonald, J.B., 854, 855 McDonald, R., 458 McElroy, M.B., 1092 McGrattan, E., 1608 McGuire, J., 830 McKibben, W., 108, 118 McLeavey, D.W., 68, 92, 106, 122, 133 McLeay, S., 1598 McMillan, R., 1206 McNamee, P., 1319 McNeil, A., 698, 994, 995 McQueen, G., 883, 884, 1352 Meckling, W., 36, 835, 936, 947 Meckling, W.H., 22 Medeiros, M., 1274, 1278 Meese, R.A., 1394 Megginson, W., 1429 Mehran, H., 863 Mei, J., 1465 Melamed, B., 1135–1149 Melick, W., 515 Melino, A., 549, 555 Mella-Barral, P., 667, 941, 988 Mello, A.S., 667, 996 Mendelson, H., 342, 1093 Mendelson, M., 340 Mendenhall, W., 68, 92, 106, 122, 133 Menkveld, A., 342 Mensah, Y.M., 1319 Mensah, Y., 1598 Mercer, J., 1267 Merton, R., 360, 526, 547, 605, 611–613, 1044, 1457 Merton, R.C., 27, 204, 268, 269, 289, 387, 447, 450, 454, 458, 464, 665–670, 807–809, 819–825, 933, 940, 941, 949, 988, 996, 1071, 1087, 1269, 1515–1517, 1596, 1601, 1602, 1607, 1613 Merton, R.A., 1055 Merville, L., 558, 1615 Merville, L.J., 458, 532, 1307, 1607 Mesnier, M.P., 646 Mester, L., 685 Meulbroek, L.K., 676 Meyer, P.A., 1556, 1557, 1560–1562 Meyer-Brandis, T., 1567–1575 Meyers, L., 1063 Mian, S.L., 676, 679 Miao, J., 667, 942 Mikkelsen, H.O., 1420 Mikkelson, W.R., 1400, 1406 Mildenstein, E., 340 Miles, J., 1223–1227, 1229, 1231 Miles, J.A., 686 Milevsky, M.A., 587, 600, 601 Milgrom, P., 340, 1007, 1009, 1399, 1400 Miller, M., 6, 8, 378, 934, 1007, 1009, 1028, 1223–1225, 1229–1231, 1399–1401 Miller, E., 343 Miller, H.D., 453 Miller, N.H., 1366
1699 Miller, R.B., 1345, 1347, 1349 Milne, F., 399, 401 Milonas, N.T., 915, 916, 920, 921, 930 Miltersen, M., 716, 727, 734 Mincer, J., 103 Ming, J., 1328 Minton, B.A., 999 Minton, B.M., 679 Miranti, P.J. Jr., 1319 Mishkin, F.S., 985 Mishra, B., 1152, 1153, 1400, 1401 Misra, L., 1427–1443 Mitchell, K., 1025, 1036 Mitchell, M.L., 1161 Mithal, S., 670, 991, 995, 999–1001 Mittelhammer, R.C., 1206 Mittnik, S., 515, 517, 523, 524, 853 Mitton, T., 1366 Miyahara, Y., 1555, 1568 Miyakoshi, T., 1174 Mizrach, B., 515–528 Modigliani, F., 6, 8, 9, 26, 27, 33, 677, 934, 1028, 1223–1225, 1230, 1231, 1466 Modigliani, L., 275 Moeller, S., 1427, 1428, 1434, 1439, 1441 Mogedey, L., 1568 Mohnhan, T.F., 1319 Molina, C.A., 1366 Molina, G., 1109–1120 Monat, P., 1555 Monfort, A., 1156 Monoyios, M., 1568 Montero, M., 1568 Montesi, C.J., 873 Moon, C.-G., 685 Moon, M., 768, 769, 772, 775 Moore, E., 138 Moore, P., 185, 193 Moore, T., 1352 Moore, W., 941 Mora, N., 648 Morck, R., 686 Mordecki, E., 47, 1445, 1446, 1454 Moré, J.J., 646 Moreira, M.J., 1365 Morellec, E., 667, 941, 947 Morgan, J.B., 700 Morgan, D.P., 643 Morgan, I.G., 122 Morgenstern, O., 71, 247, 607 Morris, C.S., 684 Morris, M.H., 1597 Morrison, D.G., 1064 Morrison, D.F., 1474 Morse, D. 1174 Mortensen, A., 669 Morton, A., 716, 981, 1008, 1490, 1494 Morton, K.W., 1211 Moser, J.T., 678, 686 Moskowitz, T.J., 1366 Mosler, K., 247 Mossin, J., 93, 203 Mosteller, F., 1411 Mota, B., 1278 Moy, J.W, 1065 Moy, R.L., 375, 392, 419
1700 Mozumdar, A., 986, 1490 Muchnik, I., 641–643 Mukherjee, A., 1381 Muliere, P., 250 Müller, A., 249 Mullins, D., 1383, 1406 Mullins, H.M., 819–821, 824, 825 Munk, C., 1211 Muralidhar, K., 1317, 1319 Muromachi, Y., 993 Musiela, M., 460, 465, 1571 Mussavian, M., 273 Muth, J.F., 1161 Myers, J., 1089 Myers, S., 464, 605, 1383 Myers, S.C., 28, 30, 935, 941, 1042, 1047, 1151, 1152, 1223, 1224, 1226, 1465, 1467 Myneni, R., 1490 N Nagel, S., 139, 273 Nail, L., 1429 Nakamura, L.I., 1028 Nakatsuma, T., 1375 Nance, D.R., 676, 677, 679 Nanda, V., 1237 Nandi, S., 547, 555, 556, 563, 884 Nandy, D., 1366 Nantell, T.J., 879 Napp, C., 1007 Narayanan, M.P., 1025 Narayanan, P., 193, 649, 1598, 1603 Nardari, F., 1109 Nash, R.C., 1162 Nashikkar, A.J., 670 Neave, E.H., 587 Neckelburg, G., 648, 659 Neely, C., 1608 Neftci, S., 447 Neis, E., 670, 990, 991, 993, 995, 999, 1000 Nelken, I., 670, 1000 Nelson, M., Nelson, B.L., 1136 Nelson, C., 1123 Nelson, D., 899, 1109, 1274 Nelson, D.B., 853, 1181, 1333, 1347, 1348, 1409, 1419 Nerlove, M., 885 Nesbitt, S.L., 865 Netter, J., 1427, 1428 Netter, J.M., 1153 Neuberger, A., 458, 1567 Neumann, K., 874 Newbold, P., 103–105, 193, 1068, 1072, 1075, 1076, 1466, 1476–1479 Newby, R., 144 Newey, W., 693, 1068, 1070, 1072, 1075, 1086 Newton, D.P., 506 Ng, A., 221, 226, 233 Ng, D., 1244 Ng, V., 547 Ng, V.K., 533, 1283 Ng, W.-Y., 654 Nguyen, T.T., 1407 Niden, C., 1174 Nieh, C.-C., 1121–1133 Niehans, J., 678 Niehaus, G., 763
Author Index Nielsen, J., 766 Nielsen, L.T, 666, 988 Nieto, L., 1174 Nijman, T.E., 173, 226 Nikeghbali, A., 698 Nison, S., 1273 Nissim, D., 198, 199 Noe, T., 1025, 1036 Nögel, U., 1168 Norden, L., 1000 Norgaard, M., 1502 Nouretdinov, I., 1269 Noussair, C., 137, 138 Novomestky, F., 221 Nyborg, K.G., 1225 Nye, D.J., 1466 O O’Brien, J., 831 O’Brien, T.J., 321, 327 Obreja, I., 996 Obstfeld, M., 221 Odders-White, E.R., 1366 Odean, T., 1235, 1244 Ofek, E., 767, 770, 775, 831, 834, 863, 864, 869, 999, 1026, 1028, 1036 Officer, M., 1427–1429, 1439, 1443 Ogden, J.P., 668, 995 Ogryczak, W., 249, 251, 254 O’Hara, M., 339, 340, 1397 Ohlson, J., 185, 193, 965, 1326–1329, 1598, 1599 Oksendal, B., 447, 448 Oliner, S., 1283 Omberg, E., 299, 316, 505 Ooghe, H., 1597, 1598 Opler, T., 1026, 1028, 1036 Ornstein, L.S., 578 Ortega, J.M., 1033 Ortiz-Molina, H., 1366 Ostdiek, B., 233, 1070, 1333, 1335 Ott, S., 942 Ouederni, B.N., 1323 Ou-Yang, H., 941, 942, 948 Oviedo, R., 670, 949 Ozenbas, D., 342 P Packer, F., 648 Padberg, M.E., 46, 82, 125, 127, 128, 130, 133, 135, 208, 227, 249, 285, 884, 885, 991, 995, 1465 Padmanabhan, P., 226, 1235 Pagano, M., 273, 334, 342–345 Pagano, M.S., 675–694 Page, E.S., 1577 Page, J., 1161 Pai, D., 1061–1066 Pai, Szu-Yu, 1577–1591 Pain, D., 329–332 Palatella, L., 1568 Palepu, K.G., 1406, 1598, 1600, 1603, 1604 Palfrey, T., 140, 145 Palia, D., 686, 1366 Palmon, O., 531–545, 1055–1060 Palmquist, R., 1205 Pan, E., 725, 1489–1502 Pan, G., 950
Author Index Pan, G.G., 670, 950, 990, 1301, 1305 Pan, J., 670, 716, 723, 725, 726, 734, 1397, 1499 Panayi, S., 767, 768 Panigirtzoglou, N., 522 Panjer, H.H., 248, 854, 855, 953 Papageorgiou, E., 1118 Papanicolaou, G., 1110, 1118 Pardoux, E., 1556 Park, S., 1153 Parkinson, M., 1273–1275, 1280 Parlour, C., 339, 341 Paroush, J., 342, 343 Parrino, R., 864, 865, 941, 943, 946, 1151, 1152 Parsons, J.E., 667 Partch, M.M., 1400, 1406 Parulekar, R., 802 Pastena, V., 1593, 1594 Pastor, L., 42, 187, 190–192, 199, 342 Patel, J., 1237 Patel, N., 1062 Patel, S.A., 884, 886 Patro, D.K., 1235–1255, 1523–1552 Patron, H., 1381 Patterson, J.H., 1319 Patton, A.J., 1417 Paul, S., 1152 Pauly, P., 193 Payne, J.E., 1352 Peake, J., 340 Pearson, M., 885 Pearson, N.D., 982, 986 Pedersen, C.S., 274 Pedersen, L., 1599 Pederson, H.W., 757–759 Pedrosa, M., 884 Peel, D.A., 1598 Peel, M.J., 1598 Peker, A., 1174 Peng, L., 1206 Peng, S., 1490, 1556 Peng, Y., 1324, 1325, 1329 Penman, S., 185, 186 Pennacchi, G.G., 686, 819, 823–825 Perello, J., 1568 Perfect, S., 679 Perold, A., 185 Perraudin, W., 667, 941, 988 Perron, P., 984, 1122, 1123, 1347, 1421 Persand, G., 853, 854 Pertman, M., 841 Petersen, M.A., 1366 Peterson, C.L., 802 Peterson, D., 1615 Peterson, S., 985 Petkova, R., 1069, 1070 Petty, W.J., 1307 Pfleiderer, P., 1173 Pham, H., 1555–1557, 1563 Pham, T.M., 327, 328 Philiippatos, G., 92 Philips, T.K., 290 Phillips, P.C.B., 1072, 1075 Phillips, R.D., 676 Phoon, K.F., 1179 Piazzesi, M., 717, 984, 985, 1121 Piesse, J., 1593–1604
1701 Pike, R., 1319 Pinches, G.E., 1098 Pinkerton, J.M., 1153 Pinkowitz, L., 1026 Pinto, J.E., 68, 92, 106, 122, 133 Piotroski, J., 193 Piotroski, J.D., 1153 Piramuthu, S., 965 Pistorius, M.R., 47, 1445, 1446, 1448, 1451, 1454, 1457, 1461 Pitt, L., 802 Pitts, M., 1173, 1175, 1176 Platen, E., 1555 Pliska, S.R., 458, 1015 Plott, C., 140, 143 Plumridge, R.R., 1162 Podolskij, M., 1274, 1278, 1279 Poggio, T., 1607, 1610, 1611, 1613 Pogue, G.A., 22, 68 Pogue, J.A., 92 Poitevin, M., 1381 Pollak, M., 1577 Pollakowski, H.O., 1207 Pollard, D., 1412 Pollard, H., 1451, 1460 Pollin, R., 965 Polson, N.G., 1109, 1110, 1421 Polya, G., 247 Pontiff, J., 1068, 1073 Poon, S.H., 1279, 1333–1344 Poon, W.P.H., 644 Porter, D., 342 Porter, D.P., 137 Posner, S.E., 587, 600, 601 Poterba, J.M. 221, 290, 293, 294, 1247 Poteshman, A., 941 Poulsen, N.K., 1502 Prabhala, N.R., 1333–1335, 1343, 1358 Prade, H., 1184 Prakash, A.J., 1597, 1598 Pratelli, M., 1556 Prause, K., 533, 874 Predescu, M., 670, 1000 Prékopa, A. 250 Press, W. 1271 Price, B., 879 Price, K., 879 Price, L. 274 Prigent, J.L., 321, 324 Pringle, J.J., 30, 676, 679 Pritchard, A.C., 843 Protopapadakis, A., 884 Protter, P., 717, 1007–1024, 1569 Ptittas, N., 1284 Pudet, T., 587 Puri, M.L., 1184 Purnanandam, K., 843 Purschaker, N., 329 Pye, G., 1030 Pyle, D.H, 9, 819–821, 824, 825, 835, 1400 Q Qian, X., 1374 Qiao, Z., 1173–1179, 1283–1289 Qu, W., 1577 Quandt, R., 71 Quandt, R.E., 1319
1702 Quenez, M.C., 1555, 1557 Quenouille, M., 651 Quiggin, J., 247, 248 Quirin, G.D., 1466 Quirk, J.P., 247, 249 Quiros, J.L.M., 1353 Quiros, M.M.M., 1353 R Rabinowitz, N., 197 Rachev, S.T., 853 Radner, R., 1055, 1056 Raftery, A., 42, 199 Ragunathan, V., 1174 Rajan, R.G., 867, 1366 Rajan, U., 341 Ralescu, D.A., 1184 Ramanathan, R., 193, 1303 Ramaswamy, K., 25, 26, 1091, 1466, 1472, 1476 Ramos, R., 802 Ramsden, R., 802 Rand, J., 329–332 Rangel, J.G., 1422 Rao, C., 401 Rao, R.R., 401 Rao, R.K.S., 815 Rasson, J.P., 1347 Ratings, F., 644, 649 Ratti, V., 204 Raven, O., 1501 Ravid, S.A., 1025–1027 Raviv, A., 836, 863, 864, 885 Raviv, J., 1381, 1382 Ray, B.K., 1409 Raymar, S., 1028 Read, J.A., 1466 Rebello, M., 1025, 1036 Rebonato, R., 460, 465 Reck, M., 333–347 Rees, B., 1594, 1595, 1598 Rees, L., 1151 Register, C.A., 677, 679, 686 Reimer, M., 505, 506 Reiner, E., 939 Reinganum, M.R., 109, 1091, 1093 Reinhart, C.M., 648 Reinhart, W.J., 1319 Reisen, H., 648 Remolona, E.M., 884, 885, 912 Ren, Q., 1136 Renault, E., 1156, 1420 Renault, O., 641, 642, 670, 995 Rendleman, R.J., 395, 1397 Reneby, J., 668, 943, 949, 950 Reneby, Joel Rennie, A., 451 Resnick, B.J., 133 Revuz, D., 1447, 1559 Reynolds, C.E., 766 Rheinländer, T., 1555, 1567, 1569, 1574 Rheinländer, Th., 1557, 1561 Rhoades, S.A., 767, 770 Riani, M., 965, 966 Ribeiro, R.A., 1190 Riboulet, G., 698 Richard, S.F., 987
Author Index Richardson, D.H., 1093 Richardson, M., 292 Ricke, M., 138 Riffe, S., 186 Rijken, H.A., 649 Riley, J., 1401, 1402 Ring, S., 1257–1271 Risa, S., 802 Ritchey, R., 515 Ritchken, P., 505, 884 Ritter, J.R., 843, 1152 Rivoli, P., 648, 659 Robbins, H., 1263 Robert, C.P., 1109, 1372 Roberts, J., 1399, 1400 Robicheck, A.A., 68 Robichek, A., 605, 1226 Robin, S., 138 Robins, R., 194 Rocha, G., 1278 Rochet, J.C., 819, 822, 825 Rock, K., 1399–1403 Rockafellar, R.T., 252 Roden, D.M., 677, 679, 687 Röell, A., 342 Roenfeldt, R., 941 Rogers, D.A., 675–678, 680 Rogers, G.T., 290 Rogers, L., 585, 1273, 1275, 1279 Rogers, L.C.G., 47, 587, 591, 593, 596, 597, 667, 1445, 1446, 1451, 1452, 1458, 1490 Rogoff, K., 1394 Roley, V.V., 883, 884 Roll, R., 41, 108, 109, 138, 208, 269, 273, 439, 442, 464, 532, 605, 884, 1475 Rolski, T., 1448 Roman, E., 321 Romano, R., 1207 Rombouts, J.V.K., 1278, 1424 Romer, D., 1394 Roncalli, T., 698, 994 Ronen, J., 843–851 Ronen, T., 1397 Ronn, E.I., 819, 821–825 Roper, A.H., 1366 Rosen, R., 676, 677, 683, 686 Rosen, S., 1202, 1206 Rosenberg, B., 102, 108, 117, 118, 612 Rosenberg, J.V., 532, 738 Rosenfeld, E., 668, 986, 995 Ross, S., 9, 12, 118, 203, 273, 427, 458, 464, 466, 471, 472, 483, 489, 531, 547, 549, 565, 612, 769, 826, 941, 1190, 1191, 1223, 1400, 1466, 1468, 1475, 1593, 1604, 1619, 1620 Rossi, P.E., 1109, 1110, 1174, 1421 Rotemberg, J., 1403 Rothenberg, T.J., 1364 Rother, P., 641 Rothschild, M., 258 Rouge, R., 1555, 1556, 1567 Roulstone, D., 1153 Rousseeuw, P.J., 965, 966 Rouwenhorst, K., 197 Royston, P., 1177 Rozeff, M.S., 1153, 1154 Ruback, R., 185, 198, 769, 1224, 1231, 1430 Rubin, H., 1365
Author Index Rubinstein, M., 35, 104, 283, 320, 324, 415–417, 427, 453, 466, 515, 518, 520, 522, 532, 533, 547, 548, 550, 556, 558, 567, 587–589, 605–616, 939, 1190, 1274, 1607, 1613 Rudd, A., 278–280, 361, 362, 457, 505, 507, 755 Rudebusch, G., 1134 Ruffieux, B., 138 Ruiz, E., 1109, 1277, 1409, 1410, 1421 Ruland, W., 1593, 1594 Runkle, D., 471, 854, 1161, 1333, 1336, 1340, 1348 Russell, W., 247, 249, 668 Russo, E., 587 Rusticus, T.O., 1359, 1366 Ruszczynski, A., 247–257 Rutkowski, M., 460, 465, 671, 990, 1128, 1129 Ryan, S.G., 1227 Rydberg, T., 874, 1570 Ryngaert, M., 777, 1427, 1434, 1439, 1441 S Saá-Requejo, J., 533, 666, 988 Saita, L., 1599 Salanie, B., 1156, 1161 Salop, S., 1406 Samperi, D., 1557, 1567 Samuelson, P., 204, 236, 607, 611, 613, 915, 916, 921, 929–931, 1055 Sánchez, J. de A., 1190, 1197, 1198 Sandas, P., 342 Sanders, A.B., 984, 1122, 1374, 1515 Sandmann, K., 755 Sandroni, A., 1265 Sanghani, J., 144 Sankaran, M., 478, 1622, 1623 Santa-Clara, P., 716, 721, 743, 986 Santacroce, M., 1555–1564 Santhanam, R., 1317, 1319 Santibanez, J., 276 Saposnik, R., 247 Saraf, M., 802 Sarath, B., 843–851 Sargan, J.D., 1362 Sargent, T.J., 1608 Sarig, O., 669, 995 Sarkar, S., 226 Sarkar, A., 647 Sarkissian, S., 1067–1089 Sarnat, M., 38, 39, 79, 1235, 1236 Satchell, S., 1273, 1275, 1279, 1417 Satchell, S.E., 274 Sato, K., 1447, 1569, 1574 Saunders, A., 671, 675, 676, 678, 679, 686, 759, 819–822, 824, 825, 965, 1275 Savage, L.J., 1321 Savarino, J.E., 204 Savin, N.E., 1409 Scarsini, M., 247, 250 Schachermayer, W., 465, 1012, 1018, 1555 Schadt, R., 1067, 1069, 1070, 1072, 1082 Schaefer, S., 668, 987 Schaefer, S.M., 1491, 1494, 1615 Schall, L., 1068, 1073 SchÄonbucher, P.J., 993, 995 Scharfstein, D., 1403 Scheaffer, R.L., 68, 92, 106, 122, 133 Scheffman, D., 1406 Scheinkman, J., 1484, 1490 Scheinkman, J.A., 138
1703 Schink, G.R., 1465 Schlarbaum, G., 610 Schleef, H., 340 Schleifer, A., 138 Schmidli, H., 1448 Schmidt, V., 1095 Schmukler, S.L., 648, 664 Schniederjans, M., 1317, 1319 Scholes, M., 26, 107, 267, 327, 378, 454, 464, 516, 532, 576, 606, 611, 665, 768, 809, 933, 988, 1055, 1098, 1110, 1230, 1466, 1472, 1601, 1619 Schölkopf, B., 1267 Schonbucher, P., 995 Schonbucher, P.J., 963, 993 Schrage, L., 1137, 1139 Schrage, L.E., 1310, 1319, 1322 Schrand, C., 676, 677, 679, 686 Schroder, M., 471, 473, 474, 477, 575, 1618, 1620, 1623 Schubert, D., 995 Schultz, P., 342, 1616 Schwartz, A.L., 587 Schwartz, E.S., 584, 666–668, 670, 716, 768, 769, 772, 775, 777, 885, 886, 910, 916, 934, 937, 940, 949, 982, 985, 987, 988, 995, 1041, 1515, 1527, 1615 Schwartz, N.L., 778, 831 Schwartz, R., 333–352 Schwarz, T., 137–163 Schwebach, R.G., 761 Schweizer, M., 465, 1555, 1556 Schwert, G.W., 883, 1075, 1089, 1352 Scorcu, A.E., 1206 Scott, E., 1615 Scott, L.O., 47, 481, 485, 489, 533, 547–551, 575, 725, 737, 982, 1191, 1483–1488, 1490, 1523, 1615 Scott, W., 694, 879 Seagle, J.122 Sears, S., 375 Seiford, L., 1308 Seligman, J., 190, 191 Senbet, L.W., 27, 36, 1025, 1036 Seneta, E., 1570 Sengupta, B., 1136 Seppi, D., 339, 341, 342 Servaes, H., 865, 1028, 1237, 1247 Seyhun, H.N., 1152, 1153 Shafer, G., 43, 1257–1271 Shailer, G., 1598 Shakarchi, R., 1450, 1458 Shaked, I., 819, 823–825 Shalen, C.T., 1173 Shanken, J., 190, 289, 1067, 1068, 1070, 1085, 1086 Shanmugam, B., 648 Shapiro, A.C., 676, 677 Shapiro, M.D., 1394 Sharda, R., 1325 Sharfman, K., 186 Sharpe, I.G., 111, 112 Sharpe, W.F., 93, 96, 106, 122, 123, 126, 208, 271, 275, 283, 327, 1091, 1366 Shastri, K., 465 Shaw, K.N., 1324 Shaw, W., 1168, 1171 Shea, J., 1364, 1367 Sheard, P., 830 Sheehan, D.P., 1153 Shefrin, H., 399, 401
1704 Shen, C.-H., 965–976, 1124 Shen, F.C., 203–218 Shen, S.P., 1577 Shen, Z., 1578, 1579 Shephard, N., 49, 515, 527, 1273, 1420, 1421, 1567–1569, 1574 Shepp, L., 1055–1060 Sheppard, K., 1278 Sheppard, S., 1202 Sheu, Y.-C., 873–882, 1445–1464 Shi, J., 1293–1300 Shi, Y., 1307–1329 Shi, Z., 585, 587, 591, 593, 596, 597 Shibut, L., 643 Shie, F.S. 819–827 Shih, F., 1400 Shih, H.H., 1579 Shih, P.-T., 42, 505–513 Shih, W.-K., 355–375, 377–392, 491–503 Shiller, R., 289, 1393 Shiller, R.J., 137, 290, 298, 299, 986 Shimizu, H., 1375 Shimko, D., 515, 522, 523, 532, 667 Shiryaev, A.N., 1056, 1491, 1574 Shiu, E.S.W., 1445, 1457 Shleifer, A., 140, 865 Shmueli, G., 1062 Shoven, J., 1595, 1604 Shrestha, K., 873 Shreve, S.E., 447, 448, 460, 517, 981, 982 Shu, J.H., 1274, 1275 Shu, S., 1366 Shum, K.-H., 654 Shumway, T., 669, 1602–1604 Sias, R.W., 864, 865 Sichel, D., 1283 Sick, G.A., 1230 Siclari, J., 584 Sidenius, J., 703 Sieg, H., 1206 Siegel, J., 138, 185 Siegmund, D., 1577 Siems, T.F., 643 Sikorski, D., 1179 Silberberg, E., 236 Silverman, B.W., 543 Sim, A.B., 327, 328 Simin, T., 1067–1089 Simioni, M., 1206 Sims, C.A., 1391, 1392 Singer, R.F., 988 Singleton, K.J., 665, 671, 713, 717, 718, 912, 979, 981–984, 986, 987, 990, 993, 995–997, 999, 1121, 1127–1129, 1270, 1490 Sinha, G., 802 Sinkey, J.F.Jr., 695 Sinquefield, R.A., 1481 Sircar, R., 1118 Sirri, E.R., 1235, 1237, 1238, 1241, 1244, 1247 Sklar, A., 698, 701 Skorokhod, A.V., 451 Skouras, S., 528 Sloan, R.G., 273, 1108 Smidt, S., 339, 1174, 1176 Smirlock, M., 1173, 1176 Smith, A.F.M., 1347 Smith, B., 1206 Smith, C., 676, 677, 1227
Author Index Smith, C.W., 463, 464, 677, 680, 682, 836, 1025 Smith, D.J., 999 Smith, K.V., 122 Smith, M., 865 Smith, P.A., 1366 Smith, R.L., 843, 1151, 1152 Smith, S.D., 676 Smith, T., 1079 Smith, V., 139, 141 Smith, V.L., 137 Smithson, C.W., 676, 677, 679 Smola, A.J., 1267 Smorodinsky, R., 1263 Snedecor, G.W., 859 So, M.K.P., 1421 Sobehart, J.R., 656, 658, 967 Sodal, S., 1042 Sofianos, G., 342 Sola, M., 1122 Sølna, K., 1110, 1118 Solnik, B. Solnik, B., 204, 234, 996, 1235, 1236, 1283, 1527 Sondermann, D., 716, 727, 734 Soner, H.M., 1007 Sopranzetti, B.J., 1201–1207, 1523 Sorensen, D., 1033 Sorensen, M., 1366 Sornette, D., 260, 716, 721, 986 Sortino, F., 274, 328 Souders.T.L., 1313 Spagnolo, N., 1284 Spatt, C., 339, 342 Spence, L.E., 1451 Spence, M., 1381 Spencer, B., 1403 Spindt, P.A., 844 Spong, K.R., 676, 683, 694 Sprenkle, C., 611 Srinivas, P.S., 1397 Sriram, R.S., 647 Stafford, E., 1161 Staiger, D., 1364 Stambaugh, R., 191, 204, 210, 226, 342, 1070 Stambaugh, R.F., 289, 293, 294, 301, 982, 1068, 1072 Stambaugh, R.S., 1072 Stanton, R., 986 Stapleton, R.C., 985 Starks, L.T., 864, 865, 1173, 1176 Staunton, M., 182 Stegemoller, M., 1427, 1428 Stegun, I.A., 1057, 1059, 1623 Steiger, G., 1574 Steil, B., 340, 342 Stein, C., 614 Stein, E.M., 547, 551 Stein, J., 547, 551 Stein, J.C., 678, 680, 684–686 Stein, R.M., 656, 658, 967, 968 Steiner, T., 835 Stensland, G., 523 Stephan, J., 1397 Steuer, R., 1308 Stevens, G.V.G., 266 Stevenson, B.G., 700 Stickel, S., 1174 Stigler, G., 339
Author Index Stiglitz, J.E., 9, 36, 137, 247, 249, 340 Stinchcombe, M., 1347 Stochs, M.H., 1025 Stock, D., 883–912 Stock, J.H., 1175, 1364, 1365, 1367, 1368 Stockman, A., 195, 196 Stoer, J., 1033 Stokey, N., 340 Stolin, D., 1427, 1428 Stoll, H.R., 340, 431, 808, 810, 1041 Stomper, A., 1028 Stone, B., 122 Storey, D.J., 1598 Stout, D.E., 1319 Stoyan, D., 249 Strahan, P.E., 683 Straumann, D., 698 Strebulaev, I.A., 668 Streller, A., 874 Stricker, C., 465 Strock, E., 676, 678, 679, 686 Stulz, R.M., 226, 405, 547, 675, 676, 864, 865, 1236, 1381, 1382 Stutzer, M., 532 Su, T., 587 Subrahmanyam, A., 199, 342 Subrahmanyam, M., 667, 734 Subrahmanyam, M.G., 670, 996 Suchanek, G., 141 Suen, W., 236 Suk, D., 1400 Sul, D., 1395 Sullivan, R.J., 195, 676 Sullivan, W.G., 1323 Summa, J.F., 375, 392, 419 Summers, L.H., 290, 293, 294 Summers, L., 1398 Sun, L., 1598 Sun, T.S., 982, 999 Sundaram, A., 1400, 1407 Sundaram, R., 1490 Sundaram, R.K., 458, 531, 533, 537 Sundaresan, M., 1517 Sundaresan, S., 667, 668, 670, 941–943, 947, 949, 988, 999, 1517 Sunder, S., 98, 137, 138, 140, 143, 1399, 1596, 1597 Sundaram, A., 1399–1407 Sung, T.K., 1325 Suret, J., 221, 226 Surry, Y., 1206 Susmel, R., 1284, 1421 Sverdlove, R., 670 Swaminathan, B., 1089 Swan, P.L., 1524 Swanson, N.R., 1365, 1366 Swary, I., 1406 Switzer, L., 829 Sy, A.N.R., 645, 664 Szafarz, A., 1223 Szego, G., 260 Szewczyk, S.H., 831, 834 T Taffler, R., 1597 Taggart, R.A., 1226, 1230 Takahashi, H., 1568 Takemura, A., 1264 Taksler, G., 668
1705 Taleb, N., 458 Talwar, P.P., 1098 Tam, K., 1061, 1062 Tamayo, P., 641 Tanaka, H., 1197 Tandon, K., 465 Tang, C.M., 168 Tang, X., 1329 Tansey, R., 1062 Tarjan, R.E., 653 Tarpley, R., 193 Tauchen, G., 718, 1173, 1175, 1176 Taurén, M., 666, 941 Tayler, H., 1617, 1619 Taylor, S., 1409, 1410, 1420 Taylor, S.J., 533, 1273, 1333–1344 Teiletche, J., 994 Tejima, N., 667 Telser, L.G., 915 Temin, P., 137, 139 Tenenbaum, M., 1451, 1461 Tenney, M., 984, 1490 ter Horst, E., 1109 Tesar, L., 1236, 1237, 1247 Teugels, J., 1448 Teukolsky, S.A., 1271 Tevzadze, R., 1555–1564 Thanassoulis, E., 1319 Theil, H., 44, 1466, 1476 Theobald, M., 319–332, 1352 Theodossiou, P., 854, 1596, 1600, 1601 Thomadakis, S.B., 915, 916, 920, 921, 930 Thomas, C., 515 Thomas, J., 199 Thompson, D.J., 68 Thompson, G.W.P., 587 Thompson, H.E., 984, 1121, 1127, 1179 Thompson, R., 186 Thompson, S., 715 Thompson, S.B., 290 Tian, Y., 505, 506, 510 Tiao, G.C., 1347, 1350 Tibshirani, R., 1069 Timmermann, A., 195 Tinic, S., 339, 810 Tinsley, P.A., 985 Tirole, J., 139, 1403 Titman, S., 209, 274, 676, 677, 807, 831, 867, 988 Tiwari, A., 341 Tjahjapranata, M., 829, 832, 840 Tkac, P., 1174 Tobin, J., 267, 1525 Toft, K.B., 667, 668, 670, 934, 937, 939–941, 946, 948, 949, 988, 995, 1299 Tollefson, J.O., 1603 Tompkins, R., 517 Tookes, H., 1397 Topaloglu, S., 844 Topkis, D., 1399, 1400 Topper, J., 1221 Torous, W., 725, 737, 741, 988, 1273 Torous, W.N., 303, 985, 1607 Touze, N., 1007 Toy, W., 1002, 1511 Tracie, W., 1152 Trapp, M., 999
Travlos, N., 1427 Travlos, N.G., 676, 678, 679, 686, 866 Treacy, W.F., 639, 640 Trennepohl, G., 375, 392, 419 Trevor, R., 884 Treynor, J., 63, 185, 191 Triantis, A., 942, 1036, 1042 Trigeorgis, L., 767, 768, 1041 Trigueros, J., 1013 Trognon, A., 1156 Trolle, A.B., 716 Truong, C., 1279 Trzcinka, C.A., 670 Tsai, C.C.-L., 1447 Tsay, R.S., 1409, 1422 Tse, Y.K., 873, 879, 882 Tsetsekos, G., 829, 831 Tsiolis, A.K., 1136 Tsurumi, H., 1371–1380 Tucker, S., 1615 Tuckman, B., 503 Tufano, P., 665, 676, 679, 686, 1235, 1237, 1238, 1241, 1244, 1247 Tung, C.C., 1579 Turnbull, S., 549, 555, 584, 585, 587, 596, 600, 601 Turner, C.M., 669, 995 Turney, S.T., 1319 Tuttle, D., 37, 68, 92, 106, 122, 133 Tychon, P., 667 Tzeng, G.H., 1190, 1191 U Udell, G., 1438 Uejima, S., 1197 Uhlenbeck, G.E., 578 Unal, H., 665, 676, 677, 679, 686, 984, 988 Underwood, S., 1397 Uno, J., 999 Urias, M., 226 Urich, T., 122 Uryasev, S., 252 V Valdez, E.A., 994 Valieva, E. Van Boening, M., 148 van der Vorst, H., 1033 Van Deventer, D., 667 Van Dijk, D., 1274, 1278, 1279, 1333 Van Dijk, M.A., 1407 Van Horne, J.C., 25, 26, 273, 886 van Ness, J.W., 1410 Vanderbei, R.J., 248, 256 Vapnik, V., 1269, 1271 Vargas, M., 267–280 Varian, H., 1173 Varikooty, A., 49, 587, 588, 590, 592, 593, 596–602 Vasicek, O.A., 103, 667, 948, 982, 984, 988, 1483, 1515, 1516 Vassalou, M., 807, 809, 988 Vecer, J., 585 Veld, C., 873 Venezia, I., 583–586 Venkat, S., 1152 Venkataraman, S., 1025 Verbrugge, J.A., 1438 Verma, A.K., 819, 821–825 Verrecchia, R., 1174
Vetterling, W., 1271 Vetzal, K., 1218 Viceira, L., 289, 299, 1394 Vidmar, N., 144 Vijh, A.M., 1397 Villani, G., 1042 Ville, J.A., 1260, 1263 Vicente-Lorente, J.D., 831 Vishny, R., 140, 226, 686, 865, 1428 Vishwasrao, S., 1381 Vissing-Jorgensen, A., 1366 Viswanathan, R., 1490 Vohra, R., 1265 Volinsky, C., 42, 192, 199 von Furstenberg, G.M., 1283 von Maltzan, J., 648 von Neumann, J., 71, 247, 249, 254 Vora, G., 587 Vorkink, K., 1352 Vorst, A.C.F., 584, 585 Vorst, T., 533, 670, 990, 999, 1000 Voth, H.J., 137, 139 Vovk, V., 43, 1257–1266, 1269, 1271 Vyncke, D., 586, 593, 594 W Wachter, J., 298, 986 Wadhwa, P., 807 Wahal, S., 1366 Wahl, J.E., 1179 Wakeman, L., 584, 585, 587, 600, 601, 755 Wald, J., 531–541 Waldman, R., 1395 Waldmann, R.T., 138 Walker, I.E., 1594 Walkling, R., 831, 834 Walkling, R.A., 1366 Wall, L.D., 679 Wallace, H.A., 1201 Wang, C., 999 Wang, C.-J., 1357–1368 Wang, D., 1274 Wang, H., 47, 666, 949, 950, 1455 Wang, J., 670, 990–992, 995, 1174, 1176, 1177, 1422 Wang, K., 587–602, 819, 821, 822, 825, 873–880, 1599 Wang, K.-L., 854, 855 Wang, Kehluh, 873–882 Wang, L., 1598, 1603 Wang, P.J., 1174, 1352 Wang, R.S., 409 Wang, S.S., 248 Wang, S.Y., 1183–1198 Wang, T., 585, 758, 759 Wang, Wei, 802 Wang, X.M., 965 Wang, Y., 1593–1604 Wang, Z., 1067, 1070, 1072, 1082, 1086 Wang, Z. Jay, 1237 Warachka, M., 1015 Warga, A., 669, 995 Warner, J.B., 807, 1224 Waters, W.R., 1466 Watson, D.J.H., 1308 Watson, M., 1392 Watson, R.D., 759 Watson, R., 1597–1599
Watts, R., 836 Watts, S., 1328 Weaver, D.G., 1223–1232 Weaver, D., 342 Weber, M., 1000 Webster, E., 1206 Wedel, M., 1061 Weeks, D., 587 Wei, D., 949 Wei, Jason, 670, 1366 Wei, K.C.J., 44, 1191, 1466, 1469, 1545 Wei, S., 1179 Wei, Z., 1366 Weinbaum, D., 668, 669 Weingartner, H.M., 1319 Weinstein, M., 885 Weisbach, M., 866, 941, 1357 Weisbenner, S., 1366 Weisberg, S., 966 Weiss, L., 941 Weiss, L.A., 807, 941 Welch, D., 1028 Welch, I., 290, 294, 295, 301, 831, 1089 Weller, P., 1608 Wells, E., 627 Werker, B.J., 226 Wermers, R., 273, 1237 Werner, I., 1236, 1237 Wessels, R., 831, 867 West, K., 137, 138, 1343 West, M., 1109 West, P.M., 1065 West, R., 810 Westerfield, R., 1223, 1593, 1604 Westgaard, S., 965 Weston, J.P., 679 Wetherbe, J.C., 1316 Whalen, G., 1604 Whaley, R.E., 41, 430, 431, 439, 441, 515, 522, 523, 552, 556, 1191, 1334, 1397 Whitcomb, D., 335, 339–341 White, A., 49, 464, 531, 547, 549, 551, 586, 587, 593, 599–602, 698, 999, 1191, 1483, 1615 White, E., 137, 148 White, H., 195, 587, 1086, 1347, 1439 White, M., 1062 Whitelaw, R.F., 587, 986, 987 Whitmore, G.A., 247 Whittington, G., 1597 Whyte, A., 1305 Wichern, D., 1276 Wichern, D.W., 1345, 1347, 1349 Widdicks, M., 506 Wiggins, J., 547, 551, 1027, 1273, 1275 Wiggins, J.B., 1191, 1615 Wiginton, J.C., 1062 Wijst, N., 965 Wiley, M.K., 1173 Wilhelm, W.J., 843 Wilhelm, W.J. Jr., 1366 Wilks, S.S., 485 Willard, G.A., 1507 Williams, A., 139–141, 148 Williams, J., 606, 1284, 1400, 1401, 1466, 1471 Williams, T., 340 Williamson, O., 831
Williamson, R., 226, 1026 Willmot, G.E., 1447 Wilmott, J., 1211 Wilmott, P., 451 Wilson, B., 1349 Wilson, R.L., 1325 Wilson, T., 1484 Winkler, R., 194 Wirl, F., 1042 Wise, M., 1325 Wishart, J., 1411 Witten, I.H., 1324 Wolf, A., 342, 343 Wolf, R., 677 Womack, K., 1161 Wong, W.K., 1173–1179, 1283–1289 Wood, D., 1598, 1603 Wood, R., 342 Wooldridge, J., 899, 904 Wooldridge, J.M., 1241, 1336, 1358–1361, 1363 Woolridge, J.R., 829 Working, H., 915 Worley, L., 1026 Wort, D.H., 93–109, 111–122, 172 Worthing, P., 1061, 1062 Wright, J.H., 1365, 1409 Wruck, H.K., 1151, 1153, 1161 Wu, C., 965, 979–1001, 1224, 1526, 1577 Wu, De-Min, 1093 Wu, G., 843 Wu, H.-C., 1190, 1191 Wu, J.-W., 854 Wu, L., 526, 667, 714, 718, 748, 1489–1500, 1504, 1511 Wu, M.C., 1579 Wu, S., 641, 647, 1121–1133, 1603 Wu, T., 1005 Wu, Wei, 1258 Wu, W.-H., 819–827 Wu, Yangru, 1061, 1391–1397 Wu, YiLin, 1151–1162 Wu, Z., 1577 Wycoff, F., 1207 Wynarczyk, P., 1598 X Xia, Y., 289–314, 316 Xie, F., 1366 Xing, H., 1417–1425 Xing, Y., 807, 809, 988 Xiong, W., 138 Xu, W., 1324–1326, 1329 Xu, X., 533, 1334 Xu, Y., 273 Xu, Y.J., 854 Y Yaari, M.E., 247 Yager, R.R., 1190 Yamazaki, H., 248 Yan, A., 1366 Yan, H., 670, 1000, 1001 Yang, C.W., 235–245 Yang, D., 1273, 1275, 1279 Yang, H., 1447 Yang, L., 802 Yang, S.S., 1412, 1413
Yang, Tze-Chen, 703–705 Yang, W., 984 Yap, C.M., 829, 832, 840 Yaron, A., 1490 Yawitz, J., 995, 1036 Ye, G.L., 587 Yee, K., 42, 185–190, 193, 194, 198, 199 Yee, K.K., 185–200 Yen, N.-C., 1578, 1579 Yeo, G.H.H., 863–871 Yermack, D.L., 863, 864, 869 Yildirim, Y., 667, 996 Yilmaz, K., 1395 Yin, G.G., 1129 Yogo, M., 290, 1364, 1365, 1367, 1368 Yohai, V.J., 965 Yoo, Y., 198, 199 Yoon, Y., 1275 Yor, M., 585, 1447, 1559 Yoshihara, K., 1411 Young, R., 802 Young, V.R., 248 Yu, E., 1381 Yu, F., 670, 993, 994 Yu, H.-C., 829–840 Yu, M.T., 753–764, 819, 823, 825 Yu, P.L., 1308, 1325 Yule, G.U., 45, 1068, 1076 Yunker, P.J., 1307, 1308 Z Zacks, S., 1347 Zadeh, L.A., 1183, 1184, 1188, 1190, 1193 Zahavi, J., 1062 Zajdenweber, D., 754 Zaman, M.A., 1153, 1154 Zanakis, S.H., 965 Zanola, R., 1206 Zantout, Z., 829, 831 Zarembka, P., 1553 Zariphopoulou, T., 1007, 1567, 1571 Zarnowitz, V., 103, 1477 Zavgren, C.V., 1597, 1598
Zechner, J., 667, 941, 943 Zeckhauser, R.J., 1237 Zeleny, M., 1308 Zellner, A., 192, 1095, 1156 Zeng, Y., 1121, 1133 Zervos, S., 965 Zhang, A., 1349 Zhang, B.Y., 670 Zhang, D., 843 Zhang, G., 1000 Zhang, J.E., 587 Zhang, L., 1069, 1070 Zhang, P., 585 Zhang, S., 1366 Zhang, X., 226, 669 Zhang, Xiongfei Zhang, Y., 1000 Zhao, F., 713–746 Zhao, Y., 323 Zheng, L., 1237, 1244 Zheng, Q., 1578, 1579 Zhong, Z., 670 Zhou, C., 666, 950, 988 Zhou, G., 172, 174, 505, 506, 510 Zhou, H., 668, 670, 717, 984, 1121, 1126, 1128 Zhou, J., 458, 464 Zhou, X., 1391–1397 Zhu, A., 965 Zhu, F., 668, 670 Zhu, H., 668, 670, 1000 Zhu, N., 1524, 1530 Zhu, Y., 324–327 Ziemba, W.T., 323 Zimmermann, H.-J., 1190 Zin, S.E., 986 Zingales, L., 867 Zitzewitz, E., 886 Zivot, E., 1422 Zmijewski, M.E., 1598 Zoido-Lobaton, P., 649 Zopounidis, C., 965 Zwiebel, J., 865
Subject Index
A Accounting beta, 100–105, 117 Accounting information system selection, 48, 1307, 1316–1319, 1329 Acquisition discounts, 1427–1443 Actuarial reserve, 48, 959–961 Actuarial risks, 953 Added constraints, 236 Additive model, 154, 1055, 1056, 1059, 1060, 1201, 1411 Adjustable rate mortgage (ARM), 43, 716, 744–746, 784, 788, 790–793, 796, 797, 805, 854, 1135–1149 Adjusted present value (APV), 29, 30, 32–34, 36, 1223–1225, 1231 After-tax weighted average cost of capital (ATWACOC), 30–33, 36, 1223–1226, 1228–1232 Agency costs, 28, 677–679, 683, 686, 689, 692, 694, 807, 830, 831, 863–865, 867–871, 933, 936, 940, 941, 947, 1036 Alt-A, 782 American option, 14, 41, 356, 360–363, 378, 387, 405, 431–433, 438–446, 506, 510, 512, 517, 523, 533, 551, 552, 589, 780, 1209 Annuity, 32, 460, 954, 955, 958–960 APER. See Apparent error rates (APER) Apparent error rates (APER), 42, 1064, 1065 Approximately complete markets, 1008, 1017 Approximation of stochastic integrals, 1021–1024 APT. See Arbitrage pricing theory (APT) APV. See Adjusted present value (APV) Arbitrage, 12, 13, 17–19, 27, 47, 118, 140, 144, 149, 185, 217, 286, 339, 343, 360, 367, 395, 433, 436, 438, 447, 455, 459, 461, 464, 465, 483, 489, 490, 522, 523, 576, 585, 619, 620, 726, 748, 826, 886, 920, 943, 981, 1017, 1126, 1131, 1466, 1490, 1503, 1509–1511, 1555, 1567, 1577–1591, 1619 Arbitrage detection, 1577–1591 Arbitrage pricing theory (APT), 3, 12–14, 22, 44, 109, 185, 191, 273, 283, 286, 483, 489, 534, 1007–1024, 1465–1471, 1474–1481 ARCH models, 1174, 1181, 1333–1338, 1343, 1348, 1349, 1418, 1419, 1421 ARM 2/28, 779, 782–785, 788, 793–795 Asian options, 49, 583–602, 755 Asset allocation, 46, 69–91, 166, 171, 184, 185, 192, 203, 206, 210, 217, 221, 222, 224, 226, 248, 290, 304, 305, 323, 327, 328, 330, 1088, 1236, 1273, 1278, 1280, 1393–1394, 1424 Asset pricing, 42, 44, 45, 93, 137–140, 143, 151, 185, 187, 190, 191, 195, 233, 270, 272–274, 280, 283–286, 290, 314, 330, 332, 356–359, 365, 366, 369, 375, 447, 448, 464, 491, 493, 496, 506, 508, 509, 511, 531, 533, 534, 536, 584, 589–595, 600, 677, 853, 861, 1008, 1012, 1014, 1016, 1017, 1067–1089, 1091–1108, 1110, 1121, 1127, 1165, 1229, 1236, 1244, 1274, 1278, 1279,
1349, 1352, 1354, 1371, 1374, 1395, 1466, 1469, 1470, 1476, 1480, 1481, 1526, 1527, 1534, 1537, 1563, 1567, 1573 Asset return predictability, 1394 Autoregressive modular (ARM) forecasting methodology, 43, 1135–1149 modeling methodology, 43, 1135–1149 processes, 43, 1135–1149 B Bank acquisitions, 1427–1443 Bank merger, 767, 771, 775, 777, 1433, 1434, 1439 Bankruptcy, 9, 40, 48, 49, 55, 100, 189, 198, 204, 205, 272, 515, 659, 667, 682, 769, 775–777, 807, 808, 819, 821, 822, 825, 830, 934–942, 944–950, 969, 996, 998, 1000, 1027, 1028, 1045, 1055–1061, 1299, 1307, 1324–1329, 1354, 1381, 1389, 1446, 1451, 1458, 1593–1600, 1603, 1604 Banks, 16, 101, 264, 330, 332, 515, 639–641, 643–649, 659, 660, 675–680, 682–694, 697–699, 703, 705, 710, 713, 729, 767–777, 797, 819–826, 829–840, 883, 884, 953, 954, 961, 962, 965, 969, 984, 997, 1000, 1007, 1008, 1025–1029, 1033–1036, 1061, 1062, 1071, 1121, 1237, 1305, 1329, 1350, 1354, 1367, 1375, 1417, 1465, 1595, 1603, 1604 Bayesian analysis, 190, 192, 199 Bayesian estimation, 192, 1110, 1422, 1424 Behavioral finance, 165, 455, 1152, 1347, 1399 Beta, 11–13, 20, 34–36, 45, 54, 62–64, 93–105, 117–118, 120, 123, 129, 130, 132, 185, 189, 208, 209, 212, 271–277, 279, 280, 285, 286, 338, 366, 533, 534, 680, 689, 692, 854–856, 861, 909, 1067–1070, 1072, 1073, 1080, 1082–1089, 1091–1108, 1231, 1244, 1346, 1466–1469, 1471–1474, 1478, 1481, 1523, 1525, 1526, 1552 Binomial distribution, 41, 385, 393–397, 426, 589, 1161 Binomial model, 49, 395, 396, 441, 442, 505–513, 543, 585–593, 599, 601, 602, 625, 626, 1190, 1191, 1512 Binomial option pricing model, 15, 41, 393–397, 406, 409–418, 425, 426, 505–507, 617–636, 1190, 1194–1196 Bipower variation, 42, 526–528 Bivariate normal distribution, 41, 404, 424, 429–431, 614, 700, 882, 1285, 1373, 1420, 1421 Black-Scholes formula, 329, 385–388, 391, 402, 409, 415, 417, 418, 426, 428, 455, 457, 461–462, 468, 470, 484, 487, 492, 502, 505, 507, 516–518, 534, 548, 577, 579, 584, 606, 768, 809, 825, 1016, 1190, 1191, 1219, 1608, 1610, 1612, 1620, 1622 Black-Scholes model, 41, 47, 386–388, 392, 399–402, 406, 415, 418, 457, 458, 463, 464, 471, 481, 515, 518, 531–535, 537, 538, 540, 541, 543, 547, 548, 558, 576, 584, 715, 718, 809, 916, 1057, 1191, 1504, 1607–1611, 1613, 1615
Black-Scholes option pricing model, 15, 26, 355, 385–387, 397, 409–418, 447–465, 470, 481, 491, 526, 532, 611, 613, 617–627, 1613 Black-Scholes partial differential equation (PDE), 47, 502, 503, 1209, 1211, 1213, 1215, 1218, 1574 Business cycle, 56, 206, 686, 801–802, 915–931, 985, 1121, 1122, 1133 Business model, 1041–1051, 1349 C Calendar spread, 371–372 Call option, 14, 356–358, 360–363, 365, 367, 372, 374, 378–388, 390, 393–397, 405, 409, 425–431, 433, 435–438, 440, 454–458, 461–464, 477, 482, 483, 487, 491–496, 498–502, 506, 508, 510, 518, 526, 533, 536, 538, 548, 550, 552–554, 556, 559–564, 566, 569, 576, 577, 579, 584, 585, 588, 591, 594, 596, 598, 600, 611, 617–624, 626, 717, 749, 754, 761, 762, 780, 796, 809, 810, 821, 824, 886, 915, 916, 919, 920, 922, 933, 934, 1013, 1014, 1016, 1049, 1066, 1170, 1192, 1193, 1219, 1220, 1335, 1446, 1486, 1487, 1573, 1611, 1619, 1621 Capital allocation line (CAL), 171, 172, 175, 178, 180 Capital asset pricing model (CAPM), 3, 10–12, 22, 25, 44, 45, 63, 93–105, 125, 185, 187, 189, 191, 192, 204, 208, 209, 235, 267–280, 283–287, 335–340, 342, 343, 349, 355, 366, 383, 487, 532, 534, 677, 683, 920, 1067–1069, 1089, 1091–1093, 1098–1100, 1102, 1107, 1108, 1258, 1347, 1349, 1466–1481, 1523–1527, 1537, 1545 Capital budgeting, 23, 28–34, 36, 48, 102, 104, 378, 610, 613, 1028, 1041–1051, 1184, 1190, 1197, 1223–1232, 1307, 1319–1324, 1329, 1466, 1623 Capital labor ratio, 35, 101, 102, 1296 Capital market line (CML), 11, 12, 94–97, 105–107, 270, 272, 337–338 Capital standard, 685, 823 Capital structure, 8, 9, 23–25, 27, 28, 34–36, 46, 47, 99, 377, 387–390, 667, 668, 789, 809, 832, 863–871, 933–950, 995, 1025, 1028, 1036, 1043 Capital structure arbitrage, 1041 Capital structure model, 933, 934, 937, 941–950 CAPM. See Capital asset pricing model (CAPM) CARR. See Conditional autoregressive range (CARR) Change-point, 44, 1346–1348, 1350–1352, 1354, 1417, 1421–1424, 1577, 1578, 1581, 1584, 1587–1590 Change-point problem, 1347–1348 Characteristic function, 43, 47, 485, 488, 489, 528, 550, 551, 555, 573–581, 1486, 1569 Closed-end country fund, 44, 1523–1552 Closed-form option pricing model, 47, 485–489, 1611 Co-integration, 43, 1393 Collar, 374 Combinatorial optimization, 659 Combined LTV (CLTV), 45, 783, 784, 795 Complex logarithm, 1165, 1167–1170 Conditional autoregressive range (CARR), 1274, 1276–1278, 1280 Conditional correlation, 1152, 1155–1162, 1274, 1288 Conditional heteroskedasticity, 718, 880, 909, 1122, 1123, 1273, 1417–1422, 1425 Conditional probability analysis (CPA), 103, 1307, 1312–1316, 1329, 1598–1600, 1604 Conditional value at risk, 43, 48, 251–252 Confluent hypergeometric function, 1052, 1057, 1515, 1521 Consistency, 15, 144, 199, 210, 342, 351, 605, 641, 656, 915, 919, 921, 1096, 1098, 1211, 1362 Consolidation values, 767–777
Constant elasticity of variance (CEV), 41, 42, 471–478, 575–577, 1191, 1615–1625 diffusion, 471–473, 1615–1619, 1623 model, 474, 476–478, 547, 744, 1616 Constant proportion portfolio insurance (CPPI), 321–324, 327–332 Consumption-based CAPM, 283, 605 Contingent claim pricing, 460, 463, 1623 Continuous time, 45, 204, 290, 292, 385, 411–415, 447–449, 465, 506, 510, 588, 596, 611, 768, 771, 826, 979, 985, 1015, 1055–1060, 1110, 1122, 1128–1133, 1466, 1504, 1511 Convenience yield, 915–931, 997, 1007, 1010 Copula, 42, 43, 697–711, 994–995, 1371, 1376–1379 Corporate bonds, 5, 55, 206, 211, 213, 214, 216, 668–670, 883–912, 934, 949, 950, 988, 991, 995, 996, 999–1001, 1068, 1070, 1229, 1269, 1271, 1397 Corporate failure, 1593–1604 Corporate finance, 3, 23, 27, 28, 30, 46, 104, 144, 675, 677, 678, 684, 694, 950, 995, 1041, 1042, 1051, 1160, 1357, 1358, 1366–1368, 1601, 1615, 1623 Correlation, 8, 11, 20, 21, 42, 45, 48, 59, 61, 70, 77, 78, 81, 91, 96, 103, 106, 117, 132, 133, 139, 144, 166–169, 175, 188, 189, 194, 197, 221, 222, 224–226, 233, 235, 237, 259–264, 271, 277, 278, 290–292, 294, 297–301, 306, 332, 334, 342, 404, 425, 429, 431, 441, 462, 481–483, 486, 490, 550–552, 557, 561, 578, 579, 607, 609, 646, 652, 655–658, 660, 666, 679, 697, 698, 700–703, 705, 710, 714, 716, 718, 721, 723, 725, 728, 734, 737, 748, 756, 757, 760, 765, 770, 772, 774, 775, 789, 792, 797, 798, 812, 813, 856, 868, 870, 871, 877, 885, 886, 896, 917, 921, 926, 930, 948, 987, 990, 993–996, 1001, 1065, 1072, 1093, 1094, 1096, 1098, 1101, 1102, 1109, 1111, 1152, 1154–1162, 1166, 1235–1238, 1241, 1244, 1247, 1258, 1274, 1278, 1285, 1288, 1289, 1305, 1336, 1338, 1340, 1353, 1358, 1359, 1363, 1364, 1376, 1379, 1389, 1417, 1421, 1424, 1472, 1477, 1478, 1483, 1485, 1537, 1601 Cost of equity capital, 25–26, 33, 36, 44, 1465–1481 Cost of equity of the unlevered firm, 1223–1227, 1229, 1231 Counterparty risk, 347, 697, 980, 994, 999 Covered call, 365, 372–374, 933 Cox–Ingersoll–Ross model, 47, 565, 1488, 1503–1506 Cox–Ross–Rubinstein (CRR) lattice binomial approach, 402 Credit crunch, 332, 779, 788, 789, 969 Credit default swaps (CDS), 48, 665, 668–670, 702, 801, 933, 949, 962, 963, 979, 980, 995–1001 Credit derivatives, 330, 331, 670, 997, 998, 1000 Credit risk, 43–45, 47, 639–641, 643, 644, 648, 649, 659, 660, 665–671, 679, 697, 698, 700, 701, 703, 710, 753–757, 759–760, 765, 808, 811, 815, 818, 836, 884, 933, 934, 949, 953, 961, 963, 965, 980, 988, 996, 1001, 1062, 1271, 1446, 1448, 1457, 1458, 1587, 1597 Credit risk rating, 45, 639–660 Credit spread, 668–670, 699, 705, 941, 943, 947–949, 996, 998, 1000, 1001, 1118 Credit VaR, 697–711 Critical line, 87, 91 Crop year, 917–919, 923 Cross-section of stock returns, 1093 CRRA intertemporal CAPM, 605 Cumulative binomial function, 397 Cumulative sum (CUSUM), 1347, 1577, 1578, 1581–1584, 1586, 1590, 1591, 1600–1601 Curtailment, 864 CUSUM. See Cumulative sum (CUSUM) D Data-mining, 44, 45, 48, 194, 200, 274, 641, 1067–1089, 1307–1329 Data snooping, 42, 187, 192, 194–200, 1067, 1092, 1333 DCC. See Dynamic conditional correlation (DCC)
Debt financing, 24, 29, 31, 32, 34, 36, 39, 40, 56, 831, 863–865, 870, 938, 979, 1025, 1026, 1028, 1029, 1036, 1224–1226, 1321, 1381, 1382, 1389 Debt-to-income ratio (DTI), 780, 784–786, 800, 803, 804 Decision tree, 397, 617–627, 1031–1033, 1324, 1325 Default, 4, 27, 28, 30, 44, 45, 48, 55, 94, 179–181, 360, 389, 390, 607, 609–611, 613, 623, 639, 641, 644, 648, 664–670, 697–711, 756, 779–802, 807–810, 812–815, 853, 884, 885, 933, 934, 937, 939, 941–944, 947–950, 953, 961–963, 965–976, 979–1001, 1044, 1046, 1070, 1118, 1131–1133, 1224, 1226, 1231, 1232, 1299, 1324, 1353, 1446, 1451, 1593, 1596, 1601, 1602 Default correlation, 697–711, 980, 993–996 Default digital swap, 48, 963 Default intensity, 980, 988–990, 993, 994, 996, 997, 1001 Default risk, 5, 23, 25–28, 43, 44, 55, 640, 649, 753–757, 764, 787, 794, 807–815, 885, 933, 949, 991, 995–997, 999–1001, 1026, 1028, 1032, 1035, 1036, 1121, 1128, 1353, 1602 Default-triggering level, 47, 1293, 1299, 1300, 1451–1453, 1456 Defensive forecasting, 43, 44, 1257–1271 Delinquency, 45, 780, 783, 784, 789, 790, 797–801, 804, 1353 Delta, 47, 321, 327, 329, 455, 457, 491–494, 496–500, 502–503, 517, 522, 548, 549, 552–555, 561–567, 714, 719, 721, 855 Demand/supply shock, 915, 916, 918, 919, 921, 931 Density process, 49, 1556, 1567–1575 Deposit insurance, 683–686, 819–826, 1429 Discriminant analysis (DA), 965, 1061, 1062, 1065, 1066, 1325, 1328, 1596–1598 Displaced log normal, 41, 439–442 Distress, 676–680, 682, 688, 689, 692, 694, 779, 787, 789, 807, 808, 812, 813, 815, 830, 933, 937, 939, 947, 948, 996, 1026, 1036, 1397, 1429, 1593, 1594, 1596, 1598, 1600–1602 Dividend discount model (DDM), 303–306, 310, 314 Duality, 42, 236, 254–257, 286, 866, 1564 Dyl model, 80 Dynamic, 6, 44–47, 49, 138, 193, 194, 205, 221, 223, 233, 289, 290, 297, 299, 300, 302, 303, 314, 321, 323, 324, 327–329, 331, 332, 334, 341, 342, 350, 358, 455, 459, 460, 462, 463, 506, 533, 548–550, 554, 560–565, 567, 585, 605, 665, 666, 670, 713, 715, 717, 719, 723, 725–727, 737, 746, 748, 756, 759, 760, 768, 770, 779–802, 823, 844, 880, 909, 910, 934, 941, 943, 945–946, 948, 950, 979–988, 990, 991, 993, 997, 999–1001, 1009, 1025, 1032, 1036, 1069, 1071, 1089, 1109, 1121, 1122, 1126–1128, 1178, 1220, 1231, 1233, 1236, 1238, 1273, 1274, 1279, 1280, 1284, 1285, 1288–1291, 1312, 1316, 1352, 1382, 1385, 1391, 1392, 1394, 1396, 1417, 1420, 1422, 1424, 1446, 1484, 1489–1500, 1504, 1505, 1509, 1511, 1514, 1516, 1517, 1555, 1556, 1564, 1567–1569, 1571, 1574, 1601, 1604, 1612, 1615, 1623 Dynamic conditional correlation (DCC), 1274, 1278, 1280 E Early exercise, 361, 405, 431–433, 436–440, 552, 1333, 1334, 1488, 1615 Earnings multiples, 7, 199 Edgeworth binomial model, 49, 588–591, 593, 599, 601, 602 Efficient frontier, 46, 48, 60, 61, 78–83, 87–89, 91, 93–95, 111, 115, 120, 121, 123, 125, 126, 128, 165, 167–171, 175–182, 221, 222, 226–228, 233, 236–238, 245, 259, 260, 264, 265, 268–271, 273, 280, 1424 Efficient market hypothesis (EMH), 104, 137, 334, 459, 1258 Efficient portfolio, 46, 48, 60–62, 68, 69, 77–91, 94, 96, 111, 128, 166–172, 175, 180, 192, 245, 248, 259, 260, 265, 266, 272, 284–286 EGB2 distribution, 854–855, 859–861 Electronic markets, 345–347 Empirical analysis, 209, 277–280, 342, 535, 668, 682, 718, 723–725, 738, 844, 859, 917, 949, 1237, 1403, 1417, 1427, 1431, 1524
Empirical evidence, 44, 47, 93, 118, 195, 204, 226, 301, 453, 471, 665, 668–670, 727, 769, 844, 845, 856–859, 863, 945, 948–950, 979–1001, 1015, 1121, 1151, 1241, 1247, 1399, 1401, 1406–1407, 1411, 1421, 1524, 1604, 1616, 1618 Endogeneity, 46, 1357–1368 Equity liquidity, 44, 807–815 Equity risk premium puzzle, 605 Errors-in-variables (EIV) bias, 1092–1099, 1102, 1107 Estimation risk, 10, 46, 203–218, 1092, 1107, 1466, 1472, 1474, 1481 European option, 356, 360–361, 378, 383, 385, 387, 405, 431, 437, 506, 507, 517, 534, 550–552, 555, 556, 585, 589, 769, 1110, 1483, 1487, 1619 Evolution of risk, 1346 Ex-dividend, 8, 28, 41, 296, 430, 432, 433, 436, 437, 439–442, 533–535 Executive stock option, 377 Exercise price, 14, 15, 286, 320, 321, 356–372, 374, 375, 378–387, 390–393, 397, 399, 402, 403, 405, 415, 425–427, 431, 436–438, 454, 456, 457, 464, 467, 501–502, 557, 562, 583, 593, 618, 619, 621, 758, 761, 762, 920, 1042, 1195, 1196, 1211, 1335, 1608 Exotic options, 453, 465, 583, 584 Expected discounted penalty, 47, 1293, 1298, 1445–1458 Expected utility, 38, 71–74, 91, 208, 209, 217, 247–249, 251, 254–257, 286, 300, 301, 311, 317, 336, 337, 605, 607, 1384 Experimental asset markets, 138 F Failure prediction, 1596, 1598, 1604 Fat tails, 43, 698, 710, 718, 728, 792, 802, 823, 853–856, 861, 873, 896, 1109, 1111, 1285, 1421 Financial applications, 575–581, 961–963, 1183–1198, 1273–1280, 1353, 1568, 1569 Financial crisis, 832, 1122, 1349, 1354 Financial distress costs, 676–680, 682, 689, 692, 694, 807, 808, 812–813, 815, 933, 947, 1026, 1036 Financial markets, 16, 42, 44, 165, 166, 221, 222, 224, 226, 233, 285, 331–332, 342, 345, 450, 605, 606, 613, 639, 648, 738, 779, 782, 998, 1000, 1032, 1173, 1198, 1257, 1279, 1284, 1290, 1347, 1349, 1352–1354, 1383, 1409, 1524, 1537, 1608 Financial signaling, 1399–1407 Finite difference, 49, 1210–1217, 1219, 1220 Finite element, 1218–1219 Finite volume, 49, 1217–1218, 1220 Firm bankruptcy predictions, 1324–1329 Firm size, 835 Fixed operating cost, 1052 Forecast(ing), 5, 8, 11, 18, 23, 42–45, 65, 93–105, 117, 118, 120, 185, 187, 192–194, 198, 199, 217, 271, 290, 294, 295, 301, 303–314, 397, 458, 459, 524–526, 528, 684, 772, 780, 792, 797, 798, 800–802, 810, 831, 853, 855, 861, 907, 909, 910, 912, 967–969, 973–976, 987, 1041, 1135–1149, 1161, 1162, 1192, 1193, 1257–1271, 1274, 1276–1280, 1324, 1333–1344, 1348, 1349, 1378, 1379, 1391–1395, 1401, 1414, 1417–1425, 1432, 1466, 1467, 1476–1481, 1495, 1500, 1502, 1523, 1602 Foreclosure, 787, 797, 803, 1203, 1204, 1353 Foreign exchange volatility model, 686, 1110, 1111, 1113–1118, 1409 Forward measure, 715, 723, 725–728, 732, 734, 737, 738 FTSE100, 42, 175, 182, 533, 1376–1379 Functional transformation, 1523–1525, 1534, 1545 Fundamental analysis, 42, 198, 199, 343 Fundamental asset values, 137 Fundamental theorems of asset pricing, 1008, 1017 Futures option, 359, 362–363, 375, 557, 760, 763, 1488 Fuzziness, 1183, 1184, 1190, 1191, 1197, 1198 Fuzzy set, 41, 48, 1183–1198, 1307–1329
G Game-theoretic probability, 43, 1257, 1258, 1260–1265 Game theory, 1042 Gamma, 473, 496–500, 502–503, 533, 721, 855, 873, 910, 1091, 1373, 1378, 1448, 1570, 1571 Gamma function, 855, 1052, 1515, 1618 GARCH. See Generalized autoregressive conditional heteroscedasticity (GARCH) GARCH effect, 43, 912, 1173–1179, 1285, 1352 Gaussian copulas, 42, 702, 705, 1376, 1378, 1379 Generalized autoregressive conditional heteroscedasticity (GARCH), 43, 44, 329, 518, 533, 715, 823, 853–858, 861, 884, 886, 896, 899, 901, 903–911, 1109, 1122, 1173–1179, 1273, 1274, 1278, 1280, 1284, 1285, 1287, 1288, 1334, 1347–1353, 1374–1376, 1409–1411, 1417–1422, 1424, 1425, 1607 Generalized binomial option pricing model, 43, 395–397, 613 Generalized hyperbolic (GH) distribution, 42, 873, 874, 876, 877, 880 Genetic programming, 1607–1614 Gibbs sampler, 42, 1371, 1373–1375, 1379, 1421 Global recession, 779 Granger-causality test, 1392, 1397 Greek letters, 47, 93, 115, 491–503 H Hansen Jagannathan bounds, 285–286 Hazard model, 1604 Heath–Jarrow–Morton (HJM), 713, 714, 716, 721–723, 727, 746, 981 Hedge ratio, 15, 20–22, 42, 377, 384, 387, 394, 455, 547–549, 552–555, 558, 561, 571, 676, 678, 679, 719, 720, 873–880, 1195 Hedging, 3, 15–22, 47, 233, 301, 320, 328, 329, 331, 332, 365, 366, 377, 383, 384, 493, 535, 543, 547–549, 552–558, 561–568, 570, 571, 583, 585, 668, 675–680, 682–684, 686, 687, 694, 713, 715, 716, 718–724, 729, 744, 746, 753, 756, 765, 873, 877, 878, 884, 909, 1007, 1016, 1118, 1290, 1349, 1351, 1394, 1488, 1494, 1498, 1555, 1556, 1615, 1616 Hedging performance, 47, 329, 547–571, 714, 719–721 Hedonic models, 45, 1201–1206 Heston model, 487–489, 1165–1171 High-frequency returns, 670, 1334, 1344 High/low range, 1278, 1280 High yield (HY), 668, 669, 883–886, 893, 896, 897, 899, 901, 903, 904, 907, 909, 910, 912, 1397 Hilbert–Huang transformation (HHT), 1577–1584, 1586, 1587, 1589–1591 Housing price appreciation (HPA), 45, 780–783, 789–792, 795, 800–803, 805 Housing price index (HPI), 45, 788–792, 801, 803, 804 Human resource allocation, 48, 1307, 1312–1316, 1329 I Illiquid markets, 19, 221 Implied probability densities, 518, 1568 Implied volatility, 42, 328, 329, 457, 458, 517, 518, 522–524, 532–534, 536–540, 544, 545, 548, 549, 552, 555–558, 571, 668, 669, 714, 715, 721, 723, 728–730, 733, 734, 736, 737, 743, 746, 1049, 1110, 1279, 1333–1344, 1487, 1581, 1588 Impulse response, 1392–1393, 1396 Incomplete market, 458, 465, 715, 725, 1008, 1017, 1555, 1564, 1567, 1568, 1574 Indifference curve, 46, 71–77, 79, 81, 169, 171, 268, 270, 280 Indifference pricing of derivatives, 1567 Informational efficiency, 109, 334, 680, 1339, 1343, 1396–1397 Information content of trades, 1395–1396 Information flow, 1000, 1174, 1175, 1177, 1178, 1285, 1397 Information technology, 347, 763, 1048, 1061, 1283–1289, 1326 Instability principle, 1346, 1348
Instrumental variable estimation, 1357, 1360, 1361 Insurance, 27, 48, 49, 62, 319–332, 464, 584, 586, 643, 648, 683, 684, 686, 753, 756, 760, 761, 765, 771, 784, 819–826, 830, 953, 955–961, 963, 1086, 1087, 1156, 1184, 1410, 1414, 1445–1447, 1457, 1465–1479, 1481 Integration, 3, 16, 36, 222, 224, 226, 233, 347, 440, 449, 468, 469, 471–478, 491–503, 505, 577, 580, 615, 738, 767, 958, 1011, 1019, 1165, 1168–1171, 1218, 1283, 1324, 1371, 1372, 1379, 1419, 1420, 1449, 1450, 1521, 1527, 1537, 1572, 1615–1625 Integro-differential equation, 1293, 1295, 1298, 1445–1449, 1457 Integro-partial differential equations, 49, 1567, 1568, 1572–1574 Interest, 4, 5, 9, 14–17, 25–33, 36, 45–48, 55, 65, 68, 95, 99, 101, 102, 118, 144, 148, 186, 188, 206, 252, 270, 275, 289, 333, 336, 339, 341, 343, 357, 358, 360–362, 378, 382, 386–390, 418, 426, 432, 433, 436, 437, 485, 515, 516, 518, 522, 561, 605, 640, 641, 644, 660, 665, 678, 684, 700, 717, 753, 768–770, 772, 774–776, 779, 780, 783, 784, 792, 794, 805, 811, 834, 838, 843, 844, 850, 863–865, 878, 910, 916, 920, 930, 933, 934, 936, 938, 941, 943, 944, 953–956, 963, 969, 996–998, 1025, 1027, 1034, 1035, 1062–1064, 1069, 1110, 1113, 1130, 1153, 1192, 1195, 1224–1226, 1229, 1230, 1237, 1273, 1284, 1305, 1326, 1327, 1333, 1346, 1353, 1359, 1361, 1363, 1366, 1376, 1396, 1418, 1435, 1438, 1439, 1533, 1567, 1568, 1571, 1593–1596, 1615 Interest rates, 5, 9, 14–18, 45, 47, 55–57, 108, 178, 185, 205, 217, 226, 270, 273, 275, 276, 290, 292, 299, 303, 310, 314, 316, 321, 329–331, 361, 362, 382, 384–386, 389, 390, 392, 394, 400, 402, 409, 416, 456, 457, 464, 472, 490, 497, 500, 501, 503, 517, 521, 533, 547–571, 578, 588, 601, 607, 609, 611, 619–621, 624, 626, 639, 648, 665–667, 676–692, 703, 705, 713–746, 753–756, 758–760, 762–764, 769, 770, 772, 774, 780, 783, 788, 794, 795, 819, 821–823, 825, 878, 885, 886, 899, 916, 917, 921, 923, 926, 934, 935, 938, 939, 942, 948, 950, 953, 954, 963, 979–981, 983–991, 993, 995–1001, 1013, 1026, 1027, 1030–1032, 1036, 1043, 1067, 1071, 1073, 1075, 1118, 1121–1133, 1136, 1166, 1183, 1190–1197, 1209, 1211, 1224, 1226, 1228–1230, 1296, 1305, 1375, 1379, 1393, 1395, 1417, 1421, 1468, 1483, 1484, 1487–1500, 1503–1517, 1616, 1623 Interest rate volatility, 713, 715, 718, 737, 987, 1122 International capital flows, 1235–1237, 1247 International equity fund, 48, 235, 245, 1238 International mutual funds, 1235–1252 International portfolio management, 10, 46, 221–233 Interstate acquisitions, 1431, 1442 Intertemporal equilibrium, 10, 283–287 Investment constraints, 46, 221, 222, 226, 227, 229–231, 233 Investment model, 48, 203–206, 235–245, 1293, 1295, 1300 Iso-return line, 84, 85, 88 Iso-variance ellipse, 84–88 IT bubble, 1284, 1288, 1289 Iterated cumulative sums of squares algorithm, 1347, 1350 Itô’s Lemma, 451–452, 454, 460, 462, 473, 481, 490, 516, 517, 578, 579, 921, 1493, 1509, 1514, 1616
J Jensen measure (JM), 62–64, 209 Joint normality assumptions, 873, 880 Jump diffusion, 47, 526, 531, 533, 541, 666, 667, 669, 670, 716, 734, 949, 979, 984, 988, 1130–1133, 1191, 1293, 1445–1458, 1607, 1613 Jump risk, 42, 515, 527, 528, 666, 725, 726, 732, 1000, 1129, 1130
K Keiretsu, 829–840
L Lagrange multipliers, 81, 169, 173, 236, 237, 254, 255, 257, 269, 1347 Laplace transform, 47, 585, 1136, 1139, 1293–1295, 1298, 1300, 1457, 1458 Latent factors, 985, 1273 Lattice model, 403–406 Law of long time, 1346, 1347 Le Châtelier principle, 48, 235–245 Leverage, 8, 9, 14–16, 35, 46, 56, 99–102, 117, 162, 163, 185, 198, 215, 216, 275, 323, 331, 360, 363, 366, 643, 666, 667, 670, 677, 682–684, 689–692, 725, 734, 801, 823, 825, 829–840, 854, 863–871, 873, 885, 910, 934, 937, 938, 940, 941, 946–950, 996, 1042, 1043, 1048–1051, 1111, 1224, 1225, 1229–1232, 1270, 1277, 1320–1322, 1326, 1352, 1381–1383, 1385–1389, 1396, 1400, 1403–1407, 1417, 1419, 1421, 1433, 1438, 1439, 1574 Lévy processes, 526, 1009 Life insurance, 48, 955–958, 960, 1410, 1476 Linear programming approach, 48, 117, 123, 1323, 1324 Linguistic variables, 1190 Lintner's method of short sales, 129 Liquidation, 20, 39, 40, 186, 188–190, 199, 369, 387, 942, 943, 947, 948, 953, 969, 1008, 1010–1014, 1019, 1594–1596, 1604 Liquidity, 5, 14, 15, 21, 44, 47, 56, 117, 118, 138, 185, 190, 196, 221, 224, 226, 274, 334, 335, 340–343, 345–347, 358, 533, 556, 645, 670, 676, 678, 682, 684, 685, 688, 692, 746, 807–815, 818, 820, 854, 873, 949, 950, 965, 980, 986, 991–993, 995–1001, 1007–1017, 1041, 1279, 1353, 1396, 1397, 1409, 1428, 1429, 1439, 1441, 1490, 1496, 1523, 1594, 1598 Listing effect, 1427–1443 Loan performance, 780, 782, 784–791, 793–797, 800 Loan-to-value (LTV), 783, 788, 792, 795, 800 Local expectations hypothesis, 605 Logarithmic utility, 210, 215, 607, 609 Logical analysis of data (LAD), 45, 640–647, 649–660 Logistic regression, 45, 646, 965–976, 1061–1063, 1065, 1066, 1325 Logit, 45, 965–967, 969, 973–975, 1062–1063, 1307, 1328, 1598–1600 Log normal, 41, 42, 49, 81, 82, 421–424, 426–428, 439–442, 453, 457, 467, 470, 477, 484, 515, 517, 518, 522–524, 534–535, 537–541, 544, 545, 584, 587, 593, 595, 610–613, 728, 754, 762, 942, 985, 1043, 1112, 1490, 1615, 1616, 1618 Log normal distribution, 41, 82, 324, 402–404, 421–428, 440–442, 453, 457, 458, 467, 468, 470, 471, 532, 541, 576, 584, 585, 594–601, 734, 754–756, 761, 854, 855, 1030, 1506, 1545 Log-normal process, 41, 439–440, 1616 Long-memory stochastic volatility model, 48, 1409–1414 Long straddle, 367–369 Long vertical spread, 369–370, 372 Loss severity, 753, 780, 802 M Main bank, 829–840, 1329 Managerial entrenchment, 863, 865 Market beta, 45, 93, 104, 1073, 1091, 1092, 1097, 1107, 1108, 1474 Market efficiency, 109, 143, 195, 334, 999, 1258, 1523, 1524, 1534 Market entry, 1041, 1381 Market makers, 44, 56, 339, 340, 342–344, 346, 351, 360, 807, 808, 815, 1041, 1111, 1398 Market microstructure, 185, 333–347, 351–352, 807, 1007, 1161 Market model, 35, 45, 93, 97–100, 107, 111, 117–119, 128, 271, 350, 680, 727, 734, 737, 738, 1021, 1092, 1094, 1095, 1100–1102, 1104–1108, 1431–1433, 1467–1469, 1471, 1475, 1476, 1564 Market quality, 1395–1396 Market risk premium, 13, 53, 94, 107, 1467, 1468, 1472, 1473, 1537 Markov Chain Monte Carlo (MCMC) algorithms, 42, 49, 1109–1119, 1371–1380
Markowitz model, 69, 79, 81, 82, 90, 91, 112, 130, 132, 165, 166, 236, 239–245, 248, 271 Markowitz quadratic programming model, 235 Martingale property, 42, 880, 1561 Mathematical programming (MP), 42, 1061–1066, 1311, 1319, 1324 Maturity, 4, 5, 14, 15, 17, 18, 20, 27, 56, 323, 327, 331, 358, 360–362, 364, 365, 368, 369, 371, 375, 377, 378, 387, 388, 391, 393–396, 410, 415, 425, 426, 433, 441, 454, 456, 457, 460–462, 493, 494, 506, 508, 517, 523, 532–535, 537–541, 544, 545, 548, 549, 555–557, 559, 562, 565, 571, 585, 594, 596, 599, 600, 610, 666, 667, 669, 678, 679, 685, 698, 699, 703, 705, 714, 716, 717, 721, 723, 727, 728, 732, 734, 739–741, 744, 746, 754–758, 761, 763, 764, 780, 803, 809, 820, 884, 885, 896, 898, 899, 901, 904, 907, 909, 912, 916, 919, 920, 933, 934, 939, 940, 947, 948, 961, 962, 980, 981, 987–991, 995, 998, 1013, 1025–1037, 1128, 1133, 1166–1168, 1170, 1194, 1220, 1229, 1270, 1351, 1451, 1453, 1456–1458, 1467, 1490–1495, 1504–1507, 1509–1516, 1519, 1534, 1568, 1573, 1611, 1613 MCMC algorithms. See Markov Chain Monte Carlo (MCMC) algorithms Mean-variance spanning tests, 10, 46, 165–184 Membership function, 1184–1190, 1193 Mergers, 27, 104, 105, 185, 345, 464, 767, 768, 771, 775, 777, 1153, 1238, 1427–1430, 1433, 1434, 1439, 1443, 1600 Method of payment, 1428, 1430, 1433, 1434, 1439, 1441–1443 Microsoft Excel, 41, 175–178, 182, 433–435, 617–636 Minimal entropy martingale measure, 49, 1555–1564, 1567–1575 Minimum variance portfolio, 59, 78, 79, 82, 84, 168–170, 176, 177, 227 Misclassification cost model, 1593, 1603 Mixture of distributions hypothesis (MDH), 1173–1175 Model identification, 1301, 1303 Moment generating function, 404, 423, 481, 485, 575, 1569, 1572 Monotonic convergence, 506–507 Monte-Carlo simulation, 49, 249, 324, 328, 453, 584, 587, 596, 597, 601, 768, 771, 775, 986, 1077, 1098, 1107, 1110, 1118, 1170, 1274, 1488, 1607, 1611 Monty Hall problem, 1371, 1372, 1379 Mortgage, 16, 44, 332, 743, 744, 746, 759, 765, 772, 779, 780, 782, 783, 787, 788, 791–794, 796, 797, 800–803, 805, 1122, 1353, 1354, 1587–1589 Mortgage-backed securities (MBS), 715, 716, 729, 734, 746, 979, 1353 Mortgage market, 716, 743, 744, 746, 779, 783 Multifactor model, 1001, 1109, 1483, 1485, 1511 Multi-index model, 46, 122, 123 Multinomial process, 399 Multiperiod, 203–206, 216, 217, 283, 464, 605, 606, 1025, 1026, 1028, 1033, 1036, 1037, 1042, 1043, 1047, 1119, 1525, 1526 Multiple indexes, 46, 111, 118–121, 124 Multiplicative model, 1055, 1056, 1059 Multivariate GARCH (MGARCH), 1178, 1283, 1284, 1290, 1424, 1425 Multivariate lognormal distribution, 423–425 Multivariate normal distribution, 421, 424, 702, 877 Multivariate volatility, 1274, 1276, 1278, 1290, 1424–1425 N Noisy markets, 185 Non-central chi-square, 41, 1506, 1616–1622 Noncentral chi-square distribution, 473, 476–478 Nonclosed-form option pricing model, 481–485 Non-parametric, 42, 278, 329, 1061–1066, 1433, 1607, 1608, 1613 Nonparametric density, 532, 537, 539, 541 Nonparametric estimation, 42, 535, 734–746 Normal distribution, 38, 41, 66, 81, 82, 275, 276, 278, 321, 329, 385, 386, 397, 404, 416, 421–430, 448, 456–458, 461–463, 470, 484,
519, 522, 537, 587, 614, 615, 679, 697, 700, 702, 705, 706, 710, 725, 759, 778, 809, 820, 827, 853, 855–857, 859, 861, 873, 876, 877, 880, 882, 939, 1016, 1066, 1099, 1123, 1152, 1178, 1276, 1285, 1336, 1346, 1412, 1538, 1578, 1581, 1583, 1584, 1600, 1608, 1611 Numerical analysis, 292 Numerical solution, 29, 49, 405, 485, 1209–1220 O OBPI. See Option-based portfolio insurance Office of Federal Housing Enterprise Oversight (OFHEO), 789–791 On-line prediction, 1257–1258, 1269, 1271 Optimal capital structure, 8, 9, 27, 28, 34, 36, 38–40, 99, 667, 933–950, 1028, 1382, 1451, 1456 Optimal hedge ratio, 873–880 Optimal risky portfolio, 170–172, 175, 178, 180–182 Option, 14–15, 355–360, 377–392, 395–397, 461–462, 589–591, 767–778, 1190–1191 Option-based portfolio insurance (OBPI), 320–321, 327–329 Option pricing, 27, 30, 41–43, 357, 362, 402–406, 415, 416, 421, 439–446, 454–458, 470, 547–574, 584, 611, 612, 625–627, 822, 823, 884, 916, 920, 1007, 1156, 1165, 1192, 1581, 1607–1625 Option pricing model (OPM), 15, 26, 41, 43, 46, 47, 355, 358, 383, 385–387, 393, 395–397, 399–419, 421–440, 447–472, 481–491, 505, 506, 532, 548–555, 564, 571, 575, 576, 613, 617, 621–627, 683, 809, 825, 919, 987, 1190, 1192, 1193, 1581, 1607, 1611, 1613, 1615, 1619–1622 Options, 14, 15, 41, 42, 46, 47, 49, 144, 175, 177, 186, 229, 236, 332, 355–375, 377–397, 403–406, 425, 431–433, 435–439, 442, 453–458, 461, 464, 465, 482, 491–502, 506–508, 510, 512, 515, 517, 518, 520, 523, 524, 526, 531, 533, 534, 537, 538, 548–553, 556–571, 575, 577–580, 583–602, 617–618, 622–624, 667, 682, 713, 727, 729, 732, 738, 744, 746, 749, 753, 755, 760, 762–765, 767, 779, 780, 787, 791, 796, 800, 801, 809, 820, 821, 824, 831, 867, 884, 916, 922, 930, 934, 966, 979, 998, 1015, 1041, 1042, 1047–1049, 1153, 1170, 1175, 1180, 1191, 1196, 1209, 1220, 1229, 1232, 1333–1335, 1339, 1343, 1348, 1349, 1396, 1397, 1405, 1445, 1446, 1483, 1485–1488, 1490, 1588, 1607–1613, 1615, 1616, 1622, 1623 Options pricing, 46, 447, 605–616, 1191, 1349 Ordinary differential equation (ODE), 47, 297, 304, 316, 317, 449, 577, 580, 748, 937, 1167, 1293–1300, 1445–1464, 1493, 1501 P Panel data, 43, 44, 680, 686, 808, 810, 812–814, 1000, 1241, 1368 Penalized internal rate of return for beta index (PIRR for Beta), 276–277 Penalized internal rate of return index (PIRR), 276, 277, 279, 280 Performance evaluation, 38, 44, 208, 1008, 1070, 1072, 1089, 1307, 1523–1552 Performance measure, 46, 53, 62–64, 91, 125–135, 139, 208, 209, 217, 274–278, 280, 329, 567, 1049, 1241, 1245, 1246, 1341 Persistent lagged instruments, 1069, 1082 Phase-type distribution, 1446, 1448, 1449 Poisson distribution, 425–426, 428 Portfolio, 3, 6, 10–13, 16, 20–22, 35, 38, 42–46, 48, 53, 69–91, 93–99, 102, 105–109, 111–135, 138–140, 143, 144, 162, 187, 191, 192, 196, 197, 203–218, 221–233, 235–237, 245, 319–321, 323, 324, 327–329, 333–338, 344, 350, 355, 360, 361, 363, 372, 374, 383–387, 391, 394, 438, 439, 454–456, 458, 464–466, 478, 482, 483, 486, 490, 491, 493, 496–499, 502, 516, 531, 534, 535, 553–555, 562, 563, 566, 571, 576, 578, 585, 606–610, 612, 613, 679, 680, 683–685, 687, 697–703, 705–707, 710, 720, 727, 749, 761, 789, 853, 855, 873, 879, 884, 941, 968, 999, 1008, 1010–1012, 1019, 1030, 1067, 1070, 1083, 1084, 1087, 1088, 1091, 1092, 1094, 1098–1102, 1106, 1107, 1118, 1133, 1235,
1236, 1329, 1348, 1349, 1352, 1394, 1468, 1470, 1475, 1510, 1523–1527, 1593 Portfolio analysis, 48, 57–61, 69, 71, 88, 93, 114–117, 122, 131, 144, 170, 259–266 Portfolio choice, 38, 44, 75, 209, 299, 1525 Portfolio optimization, 42, 46, 91, 165–184, 191, 247–257, 289, 1417, 1424–1425, 1567, 1571 Portfolio planning, 289–318 Portfolio theory, 3, 10, 21, 46, 53–68, 106, 111, 217, 249, 266–280, 283–287, 289, 334, 342, 1008, 1346, 1417 Positivity, 47, 987, 1458, 1489–1491, 1498, 1511, 1560 Power utility, 10, 46, 203–218, 285 Predicting bond yields, 1257–1271 Predicting stock returns, 1069, 1394 Prepayment, 45, 716, 729, 734, 743, 744, 746, 779, 780, 791–797, 800–803, 805 Price bubbles, 137, 140, 148, 150, 156, 158, 162, 1511, 1517 Price discovery, 147, 334, 342, 343, 346, 347, 1000, 1001, 1396, 1397 Price volatility, 5, 14, 187, 289, 342, 350, 368, 382, 482, 508, 772, 1173, 1174, 1194, 1352, 1397, 1421 Pricing performance, 548, 559, 734, 949 Pricing uncertain income streams, 605–616 Prime, 42, 795, 936, 1061–1063, 1065, 1066, 1201–1207, 1223–1232 Principal component analysis, 703, 705, 710, 866, 867, 869 Private placement, 1151–1157, 1161, 1162 Private targets, 1430, 1431, 1433, 1441 Probability of bankruptcy, 205, 775–777, 935 Probability of default (PD), 27, 639, 665, 668, 669, 699, 810, 813, 961, 962 Probit models (PM), 556, 965, 1154–1156, 1161, 1307, 1599–1600 Product market games, 1399–1407 Property/casualty (PC), 43, 44, 1170, 1465–1481 Protective put, 372–374 Publicly traded option, 356, 377, 378 Public offering, 1151, 1152, 1154, 1157, 1162, 1366, 1406 Public targets, 1430, 1433 Put-call parity, 355, 360–363, 367, 375, 433, 437, 438, 584, 620, 625, 721, 962, 1190, 1191, 1487, 1619 Put option, 14, 47, 320, 321, 327, 355, 356, 358, 360–363, 366–368, 372, 374, 375, 379, 380, 385, 391, 393, 405, 427, 433, 436–438, 453, 491, 492, 494, 496, 498, 500–502, 508, 517, 534, 548, 556, 584, 585, 587, 617–625, 627, 631, 636, 749, 757–760, 779, 780, 783, 796, 800, 819–827, 1167, 1335, 1396, 1445, 1446, 1453, 1487, 1619 Q Quadratic forms, 85, 688, 692, 693, 718, 748, 1121, 1360, 1490–1493, 1505, 1598 Quantile, 48, 249–251, 254, 256, 294, 324, 329, 738–743, 745–746, 1241, 1247, 1280, 1409, 1411–1413 Quantile regression, 329, 829–840, 1241, 1247 R Random walk, 291, 294, 334, 342, 387, 448, 453, 481, 543, 588, 589, 605, 607, 609–613, 1068, 1137, 1139, 1174, 1179, 1276, 1375, 1394, 1396 Rank dependent utility, 248, 249 Rate of convergence, 505–507, 1263 Rate of return, 4–7, 9–11, 14, 35, 48, 53, 55–58, 65–68, 81, 83, 86–88, 90, 96, 98, 99, 104–107, 123, 125, 167, 168, 170, 171, 179, 181, 182, 206, 208–210, 236, 259–261, 276, 291, 296, 298, 299, 301, 304–306, 314, 336, 360, 384–386, 388, 389, 391, 394, 399, 402, 415–417, 425–427, 431, 455, 456, 464, 467, 490, 606, 607, 609–613, 821, 824, 920, 937, 939, 942, 943, 1043, 1055, 1059, 1110, 1175, 1319, 1321, 1394, 1466, 1468–1472, 1475–1477, 1481, 1505, 1526, 1568
Subject Index R&D investment, 829–840 Real estate, 45, 66, 77, 330, 355, 779, 1201–1207, 1353, 1438 Real estate own (REO), 787, 803 Real option, 30, 767–777, 1049, 1051, 1184, 1190, 1196, 1197, 1232, 14041–1043 Receivership, 1595 Recession, 99, 224, 643, 686, 701, 732, 743, 779, 1048, 1557, 1559 Recovery, 9, 639, 666, 686, 699, 711, 789, 802, 934, 945, 949, 961, 962, 979, 980, 989–991, 997, 1001, 1270, 1595 Recursive programming, 627 Refunding, 27, 1025–1037 Regimes shift, 45, 904, 984, 1121–1133, 1352 Regime-switching, 44, 45, 717, 904, 907, 909, 912, 979, 984, 987, 990, 1001, 1122–1133, 1352, 1417, 1421–1424 Regression, 20, 25, 42–45, 97, 103, 104, 111, 112, 114, 117–119, 129, 130, 156, 158, 159, 161, 172, 175, 189, 191–194, 199, 208, 217, 272, 273, 279, 284, 289–292, 294–297, 301, 303, 304, 310, 314, 329, 430, 540, 541, 544, 545, 564–568, 571, 610, 646, 654, 656, 658, 660, 670, 676, 679, 680, 687, 689, 691–693, 714, 720, 721, 738, 740–745, 791, 807, 808, 810–815, 817, 829–840, 866, 868–871, 879, 884, 897–899, 901, 917, 921, 923, 926–931, 949, 965–976, 986, 999, 1061–1063, 1065, 1067–1089, 1091–1094, 1097, 1099, 1125, 1135, 1141, 1152, 1154, 1155, 1159, 1161, 1190, 1197–1198, 1201–1207, 1238, 1241, 1244, 1247, 1277, 1284, 1325, 1333, 1338, 1340, 1342, 1343, 1347, 1348, 1358–1360, 1362–1364, 1366, 1368, 1373, 1376, 1379, 1391, 1394, 1417, 1422, 1424, 1425, 1427, 1428, 1430–1433, 1435, 1439–1443, 1467–1481, 1525–1527, 1539–1550, 1608 Residential mortgage-backed securities, (RMBS), 801 Return, 21–22, 57, 66–71, 106–109, 123–124, 166, 206–208, 261, 263–264, 284–285, 296–298, 303–304, 310, 363, 853–862, 886–899, 1041–1054, 1088, 1247, 1333–1344, 1393–1394, 1417–1418, 1588–1589 Return maximization, 115, 120 Return predictability, 289–314, 1393–1394 Reverse-engineering, 45, 190, 199, 641, 647–650, 659 Rho, 277–278, 434, 500–503, 1376 Risk averse, 15, 20, 38, 71, 73–77, 79, 95, 102, 144, 150, 166, 172, 204, 210, 216, 247–257, 260, 285, 319, 336, 337, 349, 466, 586, 605, 606, 610, 613, 683, 1352 Risk-based portfolio insurance (RBPI), 323–324, 329, 331 Risk management, 41–49, 675–694 Risk minimization, 20, 21, 265 Risk-neutral and physical measures, 1001 Risk premium, 4, 5, 11, 13, 18, 19, 27, 29, 30, 48, 53, 62, 66–68, 94, 96, 98, 105, 107, 147, 150, 159, 166, 172, 185, 273, 275, 284, 285, 292, 336–338, 349–350, 483, 533, 534, 551, 668, 669, 679, 725, 726, 728, 732, 759, 764, 819–827, 885, 886, 915, 920, 933, 934, 959, 986, 996, 999, 1094, 1098, 1110, 1126, 1127, 1395, 1466–1476, 1481, 1509, 1510, 1512, 1515, 1527, 1537, 1545 Risky bond price, 962 Robust Logit, 45, 965, 969, 973–975 S Second order complementarities/substitutabilities, 846, 1400, 1403–1405 Securitization, 779, 801, 802, 1353 Security market line (SML), 11, 12, 95–97, 106, 108, 208, 272, 338, 1467 Semi-analytic, 585 Semimartingale backward equation, 49, 1556 Severity, 548, 753, 761, 780, 802, 1203, 1524 Sharpe measure, 46, 62–64, 880 Sharpe performance measure approach, 125–128 Sharpe ratio, 173, 174, 222–224, 227, 229, 232, 274–279, 283, 285, 286, 324, 328, 329
Short sales, 79–83, 125–130, 132, 135, 138, 162, 168, 169, 175–177, 191, 204, 205, 226–230, 270, 286, 310, 316, 464, 844, 1007, 1009, 1524, 1527, 1530, 1533, 1534, 1537–1545, 1547, 1549 Short selling, 46, 79–83, 89–91, 130, 132, 140, 149, 150, 168–171, 176, 178, 179, 182, 221, 226, 231, 270, 323, 336, 360, 385, 490, 843, 844, 851, 1396, 1524 Short straddle, 368–369, 371 Short vertical spread, 370–371 Sigma, 626 Simple binomial option pricing model, 393–395 Simultaneous, 17, 20, 36, 43, 44, 84, 127, 192, 265, 266, 341, 367, 368, 371, 378, 669, 718, 763, 830, 846, 854, 901, 939, 966, 987, 1025, 1033, 1035–1037, 1042, 1091, 1148, 1171, 1308, 1325, 1353, 1357, 1358, 1384, 1396, 1400, 1478, 1479, 1596 Simultaneous equation, 36, 43, 44, 127, 1301–1306, 1366 Single-index model, 46, 111–115, 118, 120–122, 129, 130, 132, 133 Single-price law of markets, 606, 607, 609 2SLS. See Two-stage least squares (2SLS) SP500, 1378 Speculative bubbles, 137–163 Spillover effect, 1174, 1178, 1179, 1284, 1285, 1287–1289, 1352 Standard deviation, 10, 11, 21, 48, 53, 58–62, 65, 66, 70, 71, 77, 81, 89, 94, 96, 97, 100, 102, 103, 106–108, 125, 126, 129, 144, 166–169, 171, 172, 175, 178–182, 188, 190, 197, 208–211, 215–217, 224, 235, 260–262, 267, 268, 274–276, 280, 286, 295, 297, 298, 301, 328, 338, 388, 397, 430, 449, 450, 454, 457, 462, 482, 522, 525–527, 536, 537, 543, 553, 557, 558, 568, 584, 585, 606, 612, 613, 646, 653, 676, 679, 687, 689, 698–700, 710, 754, 757, 759–761, 764, 769, 772, 776, 802, 809, 810, 812–814, 824, 855, 857, 877, 878, 897, 1030, 1065, 1113, 1123, 1136, 1184, 1236, 1241, 1244, 1247, 1258, 1270, 1320, 1335, 1350, 1354, 1375, 1379, 1397, 1413, 1438, 1471–1474, 1477, 1478, 1505, 1509, 1517, 1533, 1534, 1602, 1611 Standard method of short sales, 129 State-prices, 1203, 1506, 1517–1519 State-space estimation, 1495, 1496, 1502 Stochastic calculus, 46, 447, 449, 451, 452, 465 Stochastic discount factor (SDF), 284–287, 986, 1126, 1127, 1130 Stochastic dominance, 38–40, 42, 247–257 Stochastic order, 247 Stochastic volatility, 47, 464, 481–490, 515, 526, 531, 533, 547–575, 577–580, 713–721, 723, 725–734, 744, 746, 979, 985, 1001, 1276, 1279 Stochastic volatility model, 47, 48, 526–528, 531, 533, 547, 548, 573, 718, 728, 730, 731, 985, 1001, 1109–1120, 1165–1171, 1409–1414, 1567–1575 Stock index volatility, 1333–1344 Stock market, 43, 67, 137, 195, 221–224, 262, 272, 274–275, 277, 290, 303, 328, 331, 333, 383, 526, 550, 732, 808–810, 854, 1000, 1067, 1080, 1122, 1174, 1175, 1179, 1190, 1235, 1283–1291, 1302, 1346, 1348–1350, 1352, 1354, 1393, 1394, 1396, 1397, 1523, 1526, 1577, 1596, 1601, 1616 Stock option pricing, 438 Stock return, 43–45, 109, 223, 289, 290, 292–295, 297–299, 301, 303, 314, 453, 478, 481, 483, 485, 531, 533, 536, 549–552, 554, 555, 676–680, 682–687, 809, 810, 812, 831, 853–862, 886, 1000, 1050, 1051, 1067–1073, 1075, 1076, 1086–1089, 1091–1094, 1098, 1099, 1102, 1106–1108, 1153, 1174, 1175, 1177–1179, 1209, 1284–1286, 1335–1336, 1345, 1346, 1349, 1350, 1352, 1393–1394, 1397, 1409, 1421, 1430, 1602, 1607, 1615, 1616, 1619 Strategic substitutes and complements, 1403–1406 Structural and reduced-form models, 980, 988 Structural credit risk models, 665–668, 671, 933 Structural instability, 44, 1345–1354
Structural model, 47, 297, 665, 667–670, 755, 764, 934, 941, 949, 950, 988, 995, 996, 1364 Subordinators, 1567–1569, 1574 Subprime, 45, 332, 765, 779–805, 1122, 1587–1590 Sudden shifts of variance, 1349 Support vector machines, 1257, 1271 Systematic risk, 11, 13, 23, 35, 57, 63, 66, 93, 95–100, 103–105, 107, 108, 112, 123, 133, 270–275, 280, 335, 338, 339, 534, 678, 683, 684, 884, 885, 1127, 1128, 1346, 1429, 1467, 1468, 1472, 1525–1527 T Term structure, 18, 27, 45, 331, 360, 461, 565, 566, 609, 610, 667–670, 705, 714–716, 718, 721, 729, 737–739, 741–744, 755, 764, 788, 922, 923, 962, 963, 979–1001, 1110, 1118, 1121–1133, 1190, 1197, 1337, 1393, 1489–1495, 1500, 1501, 1602, 1623 Term structure models(ing), 45, 47, 549, 565, 566, 667, 668, 713–723, 729, 734, 743, 746, 754, 979, 987, 991, 995, 996, 1001, 1121, 1128–1130, 1133, 1483–1485, 1487, 1490, 1493, 1494, 1511, 1616 Term structure of interest rates (TSIR), 18, 45, 609, 979, 985, 987, 1121–1133, 1190, 1197, 1198, 1489, 1623 Theory of storage, 197, 915–917, 920, 921, 923, 926, 930 Theta, 494–496, 502–503 Three-stage least squares (3SLS), 44, 1301, 1304–1305, 1365 Threshold regression, 44, 808, 811, 814–815 Time-additive utility, 298, 607, 608 Time scales in volatility, 1109–1111, 1113, 1118 Time to maturity, 14, 15, 358, 456, 493, 494, 506, 533, 587, 624, 714, 725, 727, 904, 980, 1029, 1032, 1167, 1170, 1220, 1270, 1277, 1611 Time-varying returns, 1067–1070, 1072, 1080, 1082–1089 Tournament, 138–141, 143, 148–150, 154, 156, 158–160, 1610, 1613 Trading costs, 330, 334, 335, 339, 809 Transfer pricing, 48, 185, 1307–1312, 1329 Treynor measure, 46, 63, 64, 128–132 Turnover, 43, 329, 792–793, 805, 863–866, 917, 919, 1048, 1174–1179, 1315 Two-pass methodology, 1091, 1241 Two-period model, 140, 143, 145, 396, 535 Two-stage least squares (2SLS), 44, 1301–1306, 1357–1366 Two-state option pricing model, 409–415 U Unbiased term structure, 610, 613 Uncollateralized call rate, 42, 1375 Unspanned stochastic volatilities, 713–746, 748–751 Unsystematic risk, 11, 12, 57, 66, 93, 99, 107, 108, 112, 123, 1429 Utility functions, 38, 40, 42, 43, 46, 69, 71–77, 79, 82, 171, 178, 180, 204, 205, 209, 210, 216, 247–249, 251, 254–257, 260, 285, 286, 300, 316, 317, 335, 336, 608, 846, 873, 1555, 1571 Utility theory, 46, 71–77, 180, 247, 250, 254, 255, 257
V Validation test, 45, 967–969, 973, 976, 1065 Valuation, 3–8, 10, 14–22, 25, 27, 29, 30, 32, 41, 49, 104, 109, 139, 144, 146–148, 186–190, 193, 194, 198–200, 289, 290, 296, 299, 305, 314, 327, 333, 343, 355, 357, 358, 360, 362, 375, 377–392, 394, 397, 403–406, 429–431, 458, 459, 464, 466, 487, 526, 535, 537, 550, 551, 556, 584–602, 605–616, 667, 668, 699, 723, 753, 756–760, 768, 769, 775, 776, 802, 846, 854, 884–886, 933, 934, 939, 941, 948, 950, 961, 963, 995–999, 1008, 1015, 1016, 1042–1045, 1047–1052, 1055, 1059, 1060, 1224, 1226–1228, 1230, 1334, 1335, 1383, 1400, 1405, 1427, 1428, 1430, 1439, 1442, 1465–1467, 1483, 1502, 1506, 1594, 1596, 1601, 1602 Value-at-risk (VaR), 43, 48, 251–252, 323, 324, 329, 515, 524, 525, 528, 697–711, 853, 861, 987, 1348, 1409–1414 Value investing, 185 VaR. See Value-at-risk Variance-covariance matrix, 61, 262, 657, 1291, 1304, 1347 Variance decomposition, 1393 VBA. See Visual Basic for Applications Vega, 141, 144, 498–500, 503 Visual Basic for Applications (VBA), 617, 627–636 Volatility, 481–490, 518, 522, 547–574, 713–746, 748–751, 883–912, 1109–1120, 1165–1171, 1273–1280, 1333–1344, 1349–1352, 1409–1414, 1417–1425, 1567–1574, 1581–1582, 1587–1589 Volatility clustering, 718, 823, 853, 856, 1180, 1181, 1273, 1352, 1409, 1410, 1417, 1581 Volatility smile, 458, 464, 531–533, 541, 548, 567, 727–729, 732, 734 Volume effect, 43, 1174, 1177 W Warrants, 14, 34, 236, 274, 377, 390–392, 583, 940, 979 Weak instruments, 1357, 1359, 1364–1368 Weight, 10, 11, 13, 30–33, 36, 46, 48, 57, 59, 61, 65, 70, 72, 75, 77–81, 83–91, 97, 103, 112–115, 117, 120, 121, 125–130, 132, 133, 135, 166–172, 175–182, 186–194, 196, 199–217, 221–224, 226, 227, 229–232, 237, 256, 261, 263–266, 268, 269, 285, 323, 324, 329, 431, 524, 528, 543, 589, 608, 643, 646, 679, 686, 697, 702, 703, 705, 706, 770, 774, 789, 790, 836, 939, 1044, 1064, 1066, 1083, 1091, 1095, 1096, 1099–1105, 1107, 1112, 1120, 1125, 1140, 1141, 1153, 1187, 1189, 1190, 1215, 1223–1225, 1227, 1229, 1230, 1236–1238, 1241, 1244, 1276, 1280, 1308, 1319, 1320, 1322, 1323, 1335, 1347, 1351, 1360–1362, 1401, 1418, 1421, 1423, 1424, 1470–1472, 1474, 1475, 1477, 1480, 1481, 1499, 1511, 1512, 1526, 1530, 1533, 1534, 1600, 1601, 1622 Whole loan, 803, 804 Y Yaari's dual utility, 43, 247, 251, 255, 256 Yield spreads, 55, 565, 566, 666, 668–670, 934, 941, 947–950, 980, 991, 995–996, 1001, 1067, 1070, 1075, 1393